Tuesday, February 20, 2007

VMware: Error in HA

VMware Infrastructure v3.0 (ESX Server)

Error: Insufficient resources to satisfy HA failover level on cluster in data center.
Error: Internal AAM error. Agent did not start.

Solution:

- Check that the HOSTNAME entry in /etc/sysconfig/network is set to the short name.

- Check whether your FQDN is longer than 30 characters; if it is, HA will not configure properly. This is a known bug in VC 2.0 (see KB article 2259).
- Check IP, routing, and DNS for each host.
- Make sure that storage and network are available across the cluster.
- Ensure that the hosts are not managed directly: perform all host management through VC.
- Consider adding the nodes to /etc/hosts on each ESX Server AND to the hosts file on the VC Server. A better plan is to use primary and secondary DNS servers.
- Check that the Service Console has a default gateway defined.
- Check the logs: /opt/LGTOaam512/* and /opt/LGTOaam512/vmsupport/*.
- Check /etc/hosts and /etc/resolv.conf.
- In ESX 3.x the default memory reservation is zero, and the default limit is "unlimited". To see the current values, edit the settings of a virtual machine, click the Resources tab, and select Memory on the left. To conform to the ESX 3.x defaults, set the reservation to 0 and check the Unlimited box under Limit. After doing this for all virtual machines, edit the settings for the cluster and disable HA, then edit the cluster settings again to re-enable HA. The current failover capacity should now match the configured capacity.
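
The two naming checks in the list above (short HOSTNAME, FQDN no longer than 30 characters) are easy to script. What follows is only a rough sketch, assuming Python 3 on an admin workstation and file contents copied off the host; the function names and the sample host name are made up for illustration and are not part of any VMware tool.

```python
# Assumption: run on an admin workstation with Python 3, not on the ESX host itself.
MAX_FQDN_LENGTH = 30  # limit quoted in the checklist above


def parse_hostname(network_file_text):
    """Return the HOSTNAME value from /etc/sysconfig/network-style text, or None."""
    for line in network_file_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "HOSTNAME":
            return value.strip().strip('"')
    return None


def naming_warnings(hostname_entry, fqdn):
    """Collect warnings for the short-name and FQDN-length checks."""
    warnings = []
    if hostname_entry is None:
        warnings.append("HOSTNAME entry not found")
    elif "." in hostname_entry:
        warnings.append("HOSTNAME is not a short name: " + hostname_entry)
    if len(fqdn) > MAX_FQDN_LENGTH:
        warnings.append(
            "FQDN is longer than %d characters: %s" % (MAX_FQDN_LENGTH, fqdn)
        )
    return warnings


if __name__ == "__main__":
    # Hypothetical sample content; replace with the text of the real file.
    sample = "NETWORKING=yes\nHOSTNAME=esx01.lab.example.com\n"
    for warning in naming_warnings(parse_hostname(sample), "esx01.lab.example.com"):
        print(warning)
```

An empty list from naming_warnings means both checks passed; it says nothing about DNS, routing, or the gateway, which still need to be checked by hand.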

1 comment:

David said...

Also confirm that the host's IP address is correct in the entries in the following files:
/etc/hosts
/etc/vmware/esx.conf

http://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&externalId=1004965&sliceId=1&docTypeID=DT_KB_1_1&dialogID=27696258&stateId=1%200%2027692856
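
The hosts-file part of the check in this comment could be sketched roughly as below. It assumes Python 3 and the standard hosts layout (an IP address followed by one or more names per line); the function names, host names, and addresses are hypothetical. It does not look at /etc/vmware/esx.conf, which still needs to be checked by hand.

```python
def hosts_entries(hosts_text):
    """Map each name in hosts-file text to the first IP address listed for it."""
    mapping = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name.lower(), ip)
    return mapping


def ip_mismatches(hosts_text, expected):
    """Return {name: (expected_ip, found_ip)} for entries that are missing or wrong."""
    found = hosts_entries(hosts_text)
    problems = {}
    for name, expected_ip in expected.items():
        actual = found.get(name.lower())
        if actual != expected_ip:
            problems[name] = (expected_ip, actual)
    return problems


if __name__ == "__main__":
    # Hypothetical sample; paste in the real hosts file and the addresses you expect.
    sample = "127.0.0.1 localhost\n192.168.1.10 esx01.example.com esx01\n"
    print(ip_mismatches(sample, {"esx01": "192.168.1.10", "esx02": "192.168.1.11"}))
```

A found address of None means the name has no entry in the file at all.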