Clustering

Servers can be clustered to increase availability and response times for users.

  • Clustering provides transparent backup and failover, and delivers redundancy of systems, peripherals and data, which is critical to sustaining round-the-clock (24/7) operation.
  • Each server resource can be tuned differently to serve a different purpose.
  • Clustering lets the different server resources access, share and update each other's data in real time.
  • Clustering solves the problem of integrating these different types of servers with centralized network and storage resources.
  • Separate servers can be managed as if they were one. Availability, performance and reliability are usually the main metrics measured.
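
As a rough illustration of why redundancy raises availability, the sketch below computes the availability of a cluster that stays up as long as at least one node is up. It assumes node failures are independent, which real clusters only approximate (shared power, network or storage can fail together). The function names and the figures used are illustrative assumptions, not taken from any particular product.

```python
def cluster_availability(node_availability: float, nodes: int) -> float:
    """Availability of a cluster that is up while at least one node is up.

    Assumption: node failures are independent of each other.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    # The cluster is down only when every node is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


def downtime_hours_per_year(availability: float) -> float:
    """Expected downtime in hours over a 365-day year."""
    return (1.0 - availability) * 365 * 24


if __name__ == "__main__":
    for count in (1, 2, 3):
        a = cluster_availability(0.99, count)
        print(count, round(a, 6), round(downtime_hours_per_year(a), 3))
```

Under this model, two nodes that are each available 99% of the time give roughly 99.99% combined availability, which cuts expected downtime from about 87.6 hours to under one hour per year.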

Architecture

  • Which data is (or needs to be) replicated?
  • Which storage solution to use for replication?
    • Performance capacity?
    • Storage capacity?
    • Are any of the replicated systems placed remotely?
  • Hardware?
    • UPS?
  • Connections?
    • Local area networks (LANs)?
    • Wide area networks (WANs)?
    • Volume(s) of transactions?
    • Response time requirements?
    • Distances between the nodes?
  • TCP/IP?
    • Addresses?
    • Attributes?
    • Cluster partitions?
    • Sandbox for testing (new or changed) configurations?
  • Which type of availability is required?
    • A coordinated failover of both the application and its associated data to the same node, to one or more other nodes in the cluster, or to a backup server?
    • Checkpoint-restart processing? On the node, on the backup server or across the servers?
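
The checkpoint-restart question above can be made concrete with a minimal sketch. It assumes a single job that sums a list of numbers and stores its progress in a JSON file on storage that survives the failure; the file layout, the function names and the `fail_at` parameter (used only to simulate a node failure) are hypothetical. A real cluster would place the checkpoint on shared or replicated storage so another node can pick it up.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    # os.replace swaps the file in one step on the same filesystem.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if none exists."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {"next_index": 0, "total": 0}


def process(items, path, fail_at=None):
    """Sum the items, checkpointing after each one and resuming on restart."""
    state = load_checkpoint(path)
    for index in range(state["next_index"], len(items)):
        if fail_at is not None and index == fail_at:
            raise RuntimeError("simulated node failure")
        state["total"] += items[index]
        state["next_index"] = index + 1
        save_checkpoint(path, state)
    return state["total"]
```

Running `process` once with `fail_at` set and then again without it resumes from the last saved index instead of starting over. Whether the restart happens on the same node, on a backup server or on another cluster node depends only on where the checkpoint file is reachable.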

Security

Standard considerations:

  • Access to the cluster API
    • Is all API traffic encrypted?
    • Do all API clients, including nodes, proxies, the scheduler and plugins, require authentication (certificate, static bearer token, OIDC or LDAP server)?
    • Are API calls expected to pass an authorization check? Is it role-based?
  • Capabilities of a workload or user at runtime
    • Resource quota?
    • Privileges VMs run with?
    • Loading unwanted kernel modules?
    • Network access? (Per-node firewalls, or something else?)
    • Access of VMs to nodes?
  • Compromised cluster components
    • Access to the backend?
    • Administrator credentials?
    • Logging? Audits?
    • Upgrading process? Sandbox? Security reviews of new application code?
    • Lifetime of secrets or credentials?
    • Databases? Other storage solutions?
    • Backup process?
    • Disk encryption?
    • Patching and security updates?
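
Several of the questions above (authentication of API clients, role-based authorization, lifetime of credentials) can be pictured with a small sketch. It assumes static bearer tokens held in memory and a hand-written role table; the `Credential` class, the role names and the action names are hypothetical and do not represent the API of any specific cluster manager. A production setup would normally rely on certificates or an external identity provider rather than this.

```python
import hmac
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    token: str
    role: str
    expires_at: float  # Unix timestamp after which the token is rejected


# Hypothetical role table: which actions each role may perform.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "drain-node"},
    "viewer": {"read"},
}


def authenticate(presented_token, credentials, now=None):
    """Return the matching, unexpired credential, or None."""
    now = time.time() if now is None else now
    for credential in credentials:
        # compare_digest avoids leaking match position through timing.
        if hmac.compare_digest(presented_token.encode(), credential.token.encode()):
            if now >= credential.expires_at:
                return None  # short lifetimes limit the damage of a leaked token
            return credential
    return None


def authorize(credential, action):
    """Role-based check: unauthenticated callers are always denied."""
    if credential is None:
        return False
    return action in ROLE_PERMISSIONS.get(credential.role, set())
```

Every API call would first go through `authenticate` and then `authorize`, so a request is refused when the token is unknown, when it has expired, or when its role does not include the requested action.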

Means and methods

en/security/computer/cluster/start.txt · Last modified: 2020/07/05 22:19 by Digital Dot