Avamar: Backup Fails After vCenter Crash or HA Event

Avamar Error Code: 10059

2014-08-05 11:13:27 avvcbimage Error <17780>: Snapshot cannot be performed because Host 'esx01.domain.com' is currently in Maintenance Mode (Log #1)
2014-08-05 11:13:27 avvcbimage FATAL <0000>: [IMG0009] The Host 'esx01.domain.com' is in a state that cannot perform snapshot operations. (Log #1)
2014-08-05 11:13:27 avvcbimage Error <0000>: [IMG0009] createSnapshot: snapshot creation failed (Log #1)

You may also notice these details in the session drill-down:

2014-08-05 11:13:23 avvcbimage Info <16005>: Login(https://vcenter.domain.com:443/sdk) problem with reused sessionID='52e2b2cb-225f-0229-9b25-929a652617fb' contacting data center 'Datacenter'.
2014-08-05 11:13:23 avvcbimage Warning <0000>: [IMG0014] Problem logging into URL 'https://vcenter.domain.com:443/sdk' with session cookie.

At first, the list of failed VM backups seemed to have no correlation: multiple hosts, various OSes, different policy groups. But the session details above revealed the root cause. Avamar thought the VMs were on a host that was in maintenance mode (or, in a previous case, powered off). It’s a bit hard to snapshot a VM on a host that isn’t running the VM, or isn’t running at all.
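When many sessions fail at once, tallying the hosts named in the errors makes the pattern easier to spot. Below is a minimal Python sketch that scans log text for the two messages quoted above. The regular expressions are written against those sample lines only; the function name `summarize` and the `SAMPLE` text are my own for illustration, and other Avamar versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns written against the log lines quoted in this post.
# Assumption: other Avamar versions may phrase these messages differently.
HOST_ERROR_RE = re.compile(
    r"Snapshot cannot be performed because Host '([^']+)'"
)
SESSION_REUSE_RE = re.compile(r"problem with reused sessionID='([^']+)'")


def summarize(log_text):
    """Return (Counter of hosts named in snapshot errors, set of reused session IDs)."""
    hosts = Counter()
    sessions = set()
    for line in log_text.splitlines():
        match = HOST_ERROR_RE.search(line)
        if match:
            hosts[match.group(1)] += 1
        match = SESSION_REUSE_RE.search(line)
        if match:
            sessions.add(match.group(1))
    return hosts, sessions


# Two of the lines from this post, used as sample input.
SAMPLE = (
    "2014-08-05 11:13:23 avvcbimage Info <16005>: "
    "Login(https://vcenter.domain.com:443/sdk) problem with reused "
    "sessionID='52e2b2cb-225f-0229-9b25-929a652617fb' contacting data center 'Datacenter'.\n"
    "2014-08-05 11:13:27 avvcbimage Error <17780>: Snapshot cannot be performed "
    "because Host 'esx01.domain.com' is currently in Maintenance Mode (Log #1)\n"
)

if __name__ == "__main__":
    hosts, sessions = summarize(SAMPLE)
    for host, count in hosts.most_common():
        print(f"{host}: {count} failed snapshot attempt(s)")
    for session_id in sorted(sessions):
        print(f"reused session with login problem: {session_id}")
```

If most failures name the same one or two hosts, and those hosts are in maintenance mode or powered off, a stale host mapping is a likely suspect.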

The precipitating events in my situation were host HA failovers and vCenter going offline after connectivity to our EMC XtremIO array was lost (somewhat intentionally, in order to reproduce an unexpected loss of access in June). It was a lose-lose situation: only hosts with decent load showed the issue, but showing the issue meant everything crashed hard and dirty.

[Screenshot: Avamar Administration > Services pane]

It seems that when vCenter becomes unavailable like that, Avamar doesn’t fully reconnect. Most VMs were still backing up fine, but Avamar seemed to have cached the locations of a few (15 in this case). In the Administration > Services pane above, all appears well. However, Avamar continued to try to snapshot those VMs on the wrong host until the vCenter connection service was restarted there (right-click > Restart).
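To confirm that stale locations are the problem, you can compare the host named in each failed session with the host vCenter currently reports for that VM. The Python sketch below assumes you have already collected both mappings yourself (for example, from the session logs and from the vSphere client). The function name, VM names, and extra host names are hypothetical.

```python
def find_stale_placements(avamar_hosts, vcenter_hosts):
    """Return {vm: (avamar_host, vcenter_host)} for VMs where the two disagree.

    avamar_hosts:  VM name -> host named in the failed Avamar session log
    vcenter_hosts: VM name -> host vCenter currently reports for the VM
    VMs missing from vcenter_hosts are skipped rather than flagged.
    """
    stale = {}
    for vm, avamar_host in avamar_hosts.items():
        actual_host = vcenter_hosts.get(vm)
        # Host names are compared case-insensitively.
        if actual_host is not None and actual_host.lower() != avamar_host.lower():
            stale[vm] = (avamar_host, actual_host)
    return stale


if __name__ == "__main__":
    # Hypothetical example data.
    from_avamar = {"vm-a": "esx01.domain.com", "vm-b": "esx02.domain.com"}
    from_vcenter = {"vm-a": "esx03.domain.com", "vm-b": "ESX02.domain.com"}
    for vm, (cached, actual) in find_stale_placements(from_avamar, from_vcenter).items():
        print(f"{vm}: Avamar tried {cached}, vCenter reports {actual}")
```

Any VM this flags is one where Avamar is working from an outdated location, which matches the symptom described above.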

The other things I tried, which worked fine themselves but did not fix the Avamar backups, include vMotions between hosts, snapshots with and without memory, and leaving and deleting the snapshots. The point was that Avamar wasn’t getting that far. It didn’t know where to start.

Moral of the story: after a vCenter crash, make sure to restart the vCenter connection service in your Avamar MCS.

2 Comments

  1. Alex said:

    You may need to restart the MCS completely, as this option doesn’t completely resync Avamar with vCenter.

    May 19, 2015
    • Chris said:

      Yeah, I’ve since realized that. Otherwise I’ve had to remove the problem host from the cluster for Avamar to try elsewhere. Thanks!

      May 19, 2015
