Deep Dive Into VMware High Availability Features

Posted on April 9, 2021 by Isaac Davis | Updated: December 22, 2023


Wondering whether to purchase VMware to achieve high availability (HA)?

All production environments require constant uptime. With this in mind, how confident are you in the redundancy of your current traditional or cloud environment? In short, redundancy gives you confidence in your environment, while efficiency and manageability take you the rest of the way.

VMware’s vSphere High Availability (HA), vSphere vMotion, and the Distributed Resource Scheduler™ (DRS) make up the backbone of a high availability system for VMware Private Cloud.

These three technologies work in unison with the ESXi servers to allow you to restart virtual machines (VMs) automatically or move live workloads when:

Loads spike due to traffic.
Problems occur due to hardware failure or security incidents.
Routine maintenance is performed.

What Is a Virtual Machine?

A virtual machine is a software-defined computer whose virtual hardware components provide computing resources. The physical hardware is not directly connected to the virtual machine. Instead, a hypervisor installed on a physical computer sits between the two and serves as the VM's host. This setup allows you to allocate resources flexibly and to isolate each VM from other workloads and from the outside world. Because you select the operating system for each VM, it also gives you more varied and simpler arrangements.

ESXi Hosts

For VMware High Availability, you need at least two physical servers that serve as ESXi (ESX Integrated) hosts. All the servers in the cluster connect to each other as peers in a primary-primary relationship. This means that each member can run workloads and is equally important.

Several nodes together form a cluster. Because storage is shared, any member of the cluster can work with the data of VMs hosted on other nodes. The cluster pools storage, CPU, memory, and network resources so that they can be shared across the environment.

In contrast, when using a dedicated server solution, resources are constrained to a single physical server. Expanding resources, performing maintenance, and recovering from unexpected hardware failures all cause downtime. Without redundancy, the server has to shut down for the duration of the hardware swap.

What Is High Availability?

High availability is a server architecture designed for maximum uptime. It removes single points of failure and keeps mission-critical applications and websites online during spikes in traffic, malicious attacks, or hardware failure. To ensure that everything functions in concert to create redundancy, every high-availability environment needs more than one of each component.

The Distributed Resource Scheduler (DRS) is an important part of vSphere. When a VM needs to be placed or restarted, DRS determines which host in the cluster is best suited to run it. For your VM's storage, Liquid Web uses a NetApp storage area network (SAN) rather than the host's local drives. This configuration provides you with more redundancy.
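To see why removing single points of failure matters, a rough back-of-the-envelope calculation helps. The sketch below assumes that component failures are independent, which real clusters only approximate, and the 99% figure is an illustrative assumption rather than a measured value. The function name is hypothetical.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant components, assuming independent failures.

    The service is unavailable only when every copy is down at the same time.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one component")
    return 1.0 - (1.0 - single) ** copies


# Illustrative assumption: each host on its own is available 99% of the time.
for n in (1, 2, 3):
    print(f"{n} host(s): {combined_availability(0.99, n):.6f}")
```

Under these assumptions, each additional redundant host shrinks the expected unavailability by another factor of one hundred.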

VMware High Availability Features

Automatic Detection of Server Failures

VMware High Availability detects failed hosts and VMs and restarts the affected VMs on different physical servers without the need for human intervention. This works because Virtual Machine Disk (VMDK) files are stored on shared storage that is accessible to all physical servers in the HA cluster.

Automatic Restart of Virtual Machines

You can safeguard any application with an automatic restart on a different physical server in the resource pool.

Intelligent Choice of Servers

To balance workloads that need to be restarted on different hosts, VMware vSphere High Availability is often paired with VMware Distributed Resource Scheduler (DRS). When an organization combines vSphere HA and DRS, restarted VMs are far less likely to degrade the performance of other VMs on the failover host.
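The idea of choosing a failover host can be sketched in a few lines. This is a simplified illustration, not VMware's placement algorithm: the `Host` and `VM` classes, their fields, and the "most headroom wins" rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    name: str
    free_cpu_mhz: int
    free_mem_mb: int


@dataclass
class VM:
    name: str
    cpu_mhz: int
    mem_mb: int


def pick_failover_host(vm: VM, hosts: list[Host]) -> Optional[Host]:
    """Return the surviving host with the most headroom left after placing vm."""
    candidates = [
        h for h in hosts
        if h.free_cpu_mhz >= vm.cpu_mhz and h.free_mem_mb >= vm.mem_mb
    ]
    if not candidates:
        return None  # no surviving host can fit this VM
    # Hypothetical rule: prefer the most memory left over, then the most CPU.
    return max(
        candidates,
        key=lambda h: (h.free_mem_mb - vm.mem_mb, h.free_cpu_mhz - vm.cpu_mhz),
    )


survivors = [
    Host("esxi-a", free_cpu_mhz=2000, free_mem_mb=4096),
    Host("esxi-b", free_cpu_mhz=8000, free_mem_mb=16384),
    Host("esxi-c", free_cpu_mhz=500, free_mem_mb=32768),
]
choice = pick_failover_host(VM("web-01", cpu_mhz=1000, mem_mb=2048), survivors)
print(choice.name if choice else "no host has enough capacity")
```

Here the third host has the most free memory but too little CPU, so it is filtered out before the comparison.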

Resource Checks

Make sure there is always enough capacity to restart all virtual machines affected by a server failure. VMware High Availability continuously monitors capacity utilization and reserves spare capacity so that those virtual machines can be restarted on the remaining hosts.

Fault Tolerance

Fault tolerance is a type of redundancy that lets visitors keep reaching a site or application when one or more components fail. If any component fails, visitors still receive the requested site or application, possibly with limited functionality. In addition, fault-tolerant systems experience almost no downtime in the event of a system failure because the workload does not need to be restarted on another server.

VMware vSphere Fault Tolerance (FT) creates a live shadow instance of a virtual machine (VM) that mirrors the primary VM to prevent data loss and downtime during outages. In this way, vSphere FT delivers continuous availability for applications with up to four virtual CPUs.

If a hardware failure or outage happens, vSphere FT automatically initiates a failover to eliminate downtime and prevent data loss. After failover, vSphere FT automatically creates a new secondary VM to deliver continuous protection for the application.

How vSphere Handles Planned Maintenance

You can migrate your virtual machine between hosts with vMotion without any downtime. The migration changes which host supplies the VM's CPU and memory, but the VM keeps running and does not notice any change to its environment, so hardware-related maintenance on a host does not cause downtime.

The vMotion use case is not limited to maintenance. If one of the hosts becomes under- or overutilized, DRS can use vMotion to move VMs to another host to maintain cluster balance. Storage vMotion adds another layer of VM management. In addition to moving a VM's CPU and memory workload, you can move the virtual machine's disk files from one storage device to another, which allows maintenance on the storage device without affecting the VM.
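The balancing idea behind DRS can be illustrated with a small sketch. It is not VMware's algorithm: it looks at memory only, and the function name, the data layout, and the 20% imbalance threshold are assumptions chosen for the example.

```python
from typing import Optional


def propose_migration(
    host_vms: dict[str, dict[str, int]],
    capacity_mb: dict[str, int],
    threshold: float = 0.2,
) -> Optional[tuple[str, str, str]]:
    """Suggest one (vm, source, target) move that narrows the utilization gap."""

    def used(host: str) -> int:
        return sum(host_vms[host].values())

    def util(host: str) -> float:
        return used(host) / capacity_mb[host]

    busiest = max(host_vms, key=util)
    idlest = min(host_vms, key=util)
    gap = util(busiest) - util(idlest)
    if gap <= threshold:
        return None  # cluster is balanced enough, leave it alone

    best_vm, best_gap = None, gap
    for vm, mem in host_vms[busiest].items():
        new_busy = (used(busiest) - mem) / capacity_mb[busiest]
        new_idle = (used(idlest) + mem) / capacity_mb[idlest]
        if new_idle > 1.0:
            continue  # VM would not fit on the target host
        new_gap = abs(new_busy - new_idle)
        if new_gap < best_gap:
            best_vm, best_gap = vm, new_gap
    return (best_vm, busiest, idlest) if best_vm else None


cluster = {
    "esxi-1": {"db": 8192, "web": 6144, "cache": 1024},
    "esxi-2": {"mail": 2048},
}
capacity = {"esxi-1": 16384, "esxi-2": 16384}
print(propose_migration(cluster, capacity))
```

In a real cluster the move itself would be carried out by vMotion, so the VM keeps running while it changes hosts.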

How vSphere Handles Unanticipated Issues

To understand how vSphere HA handles unexpected failures, it helps to know how the cluster keeps track of which VMs are running on which host. When HA is enabled, the Fault Domain Manager (FDM) agent is installed on each host, and the agents elect one host as the primary that monitors the others. FDM is an agent that facilitates communication between hosts on topics such as resource allocation, VM placement, and host states. Protection against host failures, VM failures, and application-level failures is the responsibility of the FDM agent.

How the System Recognizes Failures

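vSphere HA recognizes failures through heartbeats. The FDM agents exchange network heartbeats over the management network, which gives the primary host a steady stream of status reports on the hosts in the cluster. When a host stops sending network heartbeats, the primary host checks whether that host is still writing heartbeats to shared datastores. This tells a failed host apart from one that is only isolated or partitioned from the network, and the affected VMs are then restarted where needed.

The approach resembles Heartbeat and Pacemaker from Linux high-availability clusters, which are separate projects and not part of vSphere. There, Heartbeat reports node status and takes care of starting and stopping services, and it requires a cluster resource manager (CRM), most commonly Pacemaker. Pacemaker has the logic needed to guarantee that services run in just one place: you specify what should be running on your infrastructure and how it should keep running in a changing environment.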
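Heartbeat-based failure detection can be sketched as follows. This is a simplified illustration and not VMware's implementation: it assumes two signals per host, a network heartbeat and a heartbeat written to shared storage, and the names and the 15-second timeout are hypothetical values chosen for the example.

```python
from enum import Enum


class HostState(Enum):
    HEALTHY = "healthy"
    UNREACHABLE_BUT_ALIVE = "isolated or partitioned"
    FAILED = "failed"


def classify_host(
    now: float,
    last_network_hb: float,
    last_storage_hb: float,
    timeout_s: float = 15.0,  # illustrative value, not a vSphere default
) -> HostState:
    """Classify a host from the age of its two most recent heartbeats."""
    network_alive = (now - last_network_hb) <= timeout_s
    storage_alive = (now - last_storage_hb) <= timeout_s
    if network_alive:
        return HostState.HEALTHY
    if storage_alive:
        # Host still writes to shared storage, so it is running but cut off.
        return HostState.UNREACHABLE_BUT_ALIVE
    return HostState.FAILED


now = 100.0
print(classify_host(now, last_network_hb=98.0, last_storage_hb=99.0).value)
print(classify_host(now, last_network_hb=60.0, last_storage_hb=97.0).value)
print(classify_host(now, last_network_hb=60.0, last_storage_hb=55.0).value)
```

The second signal matters because restarting VMs that are still running on a merely unreachable host could leave two copies active at once.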

How VMware vSphere Creates High Availability

With the help of the vCenter Server application, the vSphere infrastructure can control the cluster from a single location. vCenter Server is a central hub for managing hosts and virtual machines. It runs as a single central service rather than on each host, although it pushes management agents to the hosts it manages. If vSphere could be divided into two components, they would be ESXi (the operating system running on the hosts) and vCenter Server. ESXi functions as the hypervisor and virtualization layer, enabling us to create our VMs. vCenter Server functions as a virtual data center where we can manage our virtual environment. To simplify access, VMware provides the vSphere Web Client, which reaches our infrastructure through a web browser. It is a web-based application, and most web browsers and operating systems can support its requirements.

VMware High Availability Requirements

VMware has a few requirements for creating a VMware cluster with HA enabled. Those requirements are as follows:

Each host in the HA cluster needs a valid vSphere HA license. Apply a VMware vSphere Standard or Enterprise Plus license, which includes vCenter Standard rights.
HA must be enabled on at least two hosts. However, most experts advise using three or more.
Each host must have a static IP address configured.
All the hosts must share at least one management network.
The hosts must be configured with the same networks and datastores so that VMs can run on any host in the cluster they are moved to.
HA requires shared storage.
VMware Tools must be installed on HA-monitored VMs.
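A pre-flight check of these requirements could look like the sketch below. It is only an illustration: `HostConfig` and its fields are hypothetical, the data would have to come from your own inventory, and the check is simplified to "at least one network and one datastore common to all hosts".

```python
from dataclasses import dataclass


@dataclass
class HostConfig:
    name: str
    licensed_for_ha: bool
    static_ip: bool
    networks: frozenset[str]
    datastores: frozenset[str]


def check_ha_prerequisites(hosts: list[HostConfig]) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if len(hosts) < 2:
        problems.append("cluster needs at least two hosts")
    for h in hosts:
        if not h.licensed_for_ha:
            problems.append(f"{h.name}: no license that includes HA")
        if not h.static_ip:
            problems.append(f"{h.name}: no static IP configured")
    if hosts:
        # Networks and datastores that every host can see.
        shared_nets = frozenset.intersection(*(h.networks for h in hosts))
        shared_stores = frozenset.intersection(*(h.datastores for h in hosts))
        if not shared_nets:
            problems.append("no network common to all hosts")
        if not shared_stores:
            problems.append("no datastore shared by all hosts")
    return problems


inventory = [
    HostConfig("esxi-1", True, True, frozenset({"mgmt", "vm"}), frozenset({"san-01"})),
    HostConfig("esxi-2", True, False, frozenset({"mgmt"}), frozenset({"san-01"})),
]
for problem in check_ha_prerequisites(inventory):
    print(problem)
```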

VMware High Availability Best Practices

Number of VMs per Host

Use the Distributed Resource Scheduler (DRS) features in your clusters to the fullest extent possible. Using DRS, you can ensure workloads are distributed evenly among all of the hosts in your cluster.

Admission Control

Using Admission Control when configuring HA is generally a good idea. With Admission Control enabled, you cannot power on new virtual machines if doing so would leave too little spare capacity to cover the configured "Number of host failures to tolerate" setting.
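The reasoning behind Admission Control can be sketched with simple arithmetic. The function below is an assumption-laden illustration and not how vSphere computes it: it considers memory only, ignores whether an individual VM fits on a single host, and uses hypothetical names throughout. vSphere's actual admission control policies are more detailed.

```python
def can_power_on(
    host_capacity_mb: list[int],
    reserved_mb: int,
    new_vm_mb: int,
    host_failures_to_tolerate: int = 1,
) -> bool:
    """Allow the power-on only if all VMs still fit after the worst-case failures."""
    if not 0 <= host_failures_to_tolerate < len(host_capacity_mb):
        raise ValueError("must tolerate fewer failures than there are hosts")
    # Worst case: the largest hosts are the ones that fail.
    survivors = sorted(host_capacity_mb)[: len(host_capacity_mb) - host_failures_to_tolerate]
    return reserved_mb + new_vm_mb <= sum(survivors)


hosts_mb = [65536, 65536, 65536]  # three hosts with 64 GB each
print(can_power_on(hosts_mb, reserved_mb=120000, new_vm_mb=8192))   # fits
print(can_power_on(hosts_mb, reserved_mb=120000, new_vm_mb=16384))  # refused
```

In this example the cluster has 192 GB in total, but only the 128 GB that would survive one host failure counts toward the decision.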

Large Hosts vs Small Hosts

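Ensuring your hosts are just the right size requires striking a delicate balance between hardware costs and cluster resilience. A few large hosts mean less hardware to buy and manage, but each failure removes a larger share of the cluster's capacity. Many small hosts limit the impact of a single failure but add hardware to buy and manage.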
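The resilience side of this trade-off is easy to quantify under one simplifying assumption: all hosts in the cluster are the same size. The helper below is a hypothetical illustration of how much capacity must sit idle to absorb host failures.

```python
def reserved_fraction(host_count: int, host_failures_to_tolerate: int = 1) -> float:
    """Share of cluster capacity kept free for failover, assuming equal-sized hosts."""
    if host_count < 1 or not 0 <= host_failures_to_tolerate <= host_count:
        raise ValueError("invalid host count or failure tolerance")
    return host_failures_to_tolerate / host_count


# Same 512 GB of total memory, split across different numbers of hosts.
for count in (2, 4, 8):
    per_host_gb = 512 // count
    share = reserved_fraction(count)
    print(f"{count} x {per_host_gb} GB hosts: {share:.1%} held in reserve")
```

The calculation leaves out cost, which usually pulls in the other direction, so it is one input to the sizing decision rather than the answer.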

What Is the Difference Between Virtual Machines and Dedicated Servers?

In typical use cases, a dedicated server is comparable to a single virtual machine in a private cloud, except that the server's hardware limits its ability to scale. Something like a memory upgrade on a dedicated server requires downtime, whereas making the same change with vSphere is much simpler and faster. With a dedicated server, any unscheduled maintenance, such as a hardware failure, directly impacts your production environment and results in downtime. The impact is much smaller with an HA private cloud because your VM is automatically brought back online on a different host.

Using vSphere, Liquid Web can efficiently distribute resources within the cluster, and you can choose the number of virtual machines and the VM operating system that best meet your needs. Adding more hosts to your cluster to expand its resources is another way to think about scalability.

Invest in High Availability for Your Business With VMware Private Cloud

VMware Private Cloud is a highly adaptable, dependable, and affordable solution that simplifies scaling. Whether you need a way to start up a cluster or a customized solution, a VMware Private Cloud is a long-term investment for your company.

Contact the Most Helpful Humans In Hosting® today to get started.
