Wednesday, 25 November 2020

Windows Hyper-V 2019 failed merge of snapshot with error 0x8007054F

This is not a good fix and I am not happy with it. It seems a little too much like the nuclear option, but it works, and at the moment Microsoft are not being too forthcoming with answers. We seem to be stuck between the backup company and Microsoft pointing fingers at each other.

I will update if we get a better fix.

But for now, this applies if you are seeing phantom snapshots which, when you restart the Hyper-V Virtual Machine Management service, try to merge and then fail with the following error code:

0x8007054F

The issue appears to be that the root parent cannot merge with the last checkpoint. You can roll all the AVHDX files back to the first one, but that last one will not merge back into the primary disk.
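To see which file is the root parent and which is the last checkpoint before it, it helps to write the chain out in order. The following is only a minimal Python sketch: it assumes you have already collected each disk's parent path yourself (for example from the ParentPath property that Get-VHD reports in PowerShell), and the file names and the `checkpoint_chain` helper are hypothetical, made up for illustration.

```python
from typing import Dict, List, Optional


def checkpoint_chain(leaf: str, parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Walk from the attached (leaf) disk back to the root parent.

    parent_of maps each disk file to its parent, or None for the root VHDX.
    """
    chain: List[str] = []
    seen = set()
    current: Optional[str] = leaf
    while current is not None:
        if current in seen:
            # A loop in the parent links means the collected data is wrong.
            raise ValueError(f"cycle in parent links at {current}")
        seen.add(current)
        chain.append(current)
        current = parent_of.get(current)
    return chain


# Hypothetical file names, not taken from a real host.
parent_of = {
    "disk_C3.avhdx": "disk_B2.avhdx",
    "disk_B2.avhdx": "disk_A1.avhdx",
    "disk_A1.avhdx": "disk.vhdx",
    "disk.vhdx": None,
}

chain = checkpoint_chain("disk_C3.avhdx", parent_of)
print(" -> ".join(chain))
# prints: disk_C3.avhdx -> disk_B2.avhdx -> disk_A1.avhdx -> disk.vhdx
```

In this made-up chain, the merge that fails in the scenario described above would be the final one, from disk_A1.avhdx into disk.vhdx.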

Update #1 (02/12/2020)
After manually merging a few servers, it is clear the issue is only with the system drive / first drive in the VM config.

Not sure if it is because it's the system drive or because it's the first one in the list in the VM config, but removing it from the VM config allows it to start merging the other drives as soon as the config is saved. This has allowed us to save a lot of time and cut down the work we need to do in the short term.

Update #2 (07/12/2020)
Seems update 1 was false. Although all of them up to that date had been the system drive / first one in the VM config, the last two I cleaned up were data drives, both at the end of the VM config list.

Still waiting on an update from Microsoft on this.

Update #3 (14/01/2021)
Microsoft are asking for the VM to be deleted and remade so that it gets a new VM ID. We are testing this and will update if this has an impact.

Update #4 (09/02/2021)
The VMs we have remade have not had issues with checkpointing, but the others have not either, so it's still up in the air at the moment.

Update #5 (10/03/2021)
Confirmation this week that the issue has returned on guests where we have not remade the VM config, but the ones we have remade are still running fine and merging back, even on the same host.

Short term fix
The quickest way I have found so far to resolve this is to do the following:

  1. Shut down the virtual machine
  2. Break and remove replication if it's enabled
  3. Open the virtual machine's settings
  4. Take a note of the disk and the location the VM has set; it should end in .avhdx
  5. Remove the disk
  6. Open Hyper-V and select "Edit Disk", point it to the AVHDX you noted in step 4 and select "Merge"
  7. Once on the Merge options, select new disk and give it a name and a location
  8. Attach this new disk to the virtual machine on the same controller as the one you removed
  9. Boot the VM and confirm it's working
  10. Remove the old VHDX and AVHDX files
  11. Enable replication if you disabled it.
Doing it this way can be risky, so always check your backups, but it does work.
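
Before removing the old files in step 10, it can help to list exactly what is sitting in the VM's disk folder and how much space it takes. This is a rough Python sketch rather than part of the fix itself: the `list_disk_files` and `total_bytes` helpers are hypothetical names, the folder is whatever location you noted in step 4, and it only reads file names and sizes, it does not delete or merge anything.

```python
import sys
from pathlib import Path
from typing import List, Tuple

DISK_SUFFIXES = {".vhdx", ".avhdx"}


def list_disk_files(folder: str) -> List[Tuple[str, int]]:
    """Return (file name, size in bytes) for each VHDX/AVHDX file, largest first."""
    results = []
    for entry in Path(folder).iterdir():
        # Match the extension case-insensitively; skip folders and other files.
        if entry.is_file() and entry.suffix.lower() in DISK_SUFFIXES:
            results.append((entry.name, entry.stat().st_size))
    return sorted(results, key=lambda item: item[1], reverse=True)


def total_bytes(files: List[Tuple[str, int]]) -> int:
    """Sum the sizes returned by list_disk_files."""
    return sum(size for _, size in files)


if __name__ == "__main__":
    # Folder to inspect; falls back to the current folder if none is given.
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    files = list_disk_files(folder)
    for name, size in files:
        print(f"{size / 1024 ** 3:10.2f} GiB  {name}")
    print(f"{len(files)} disk files, {total_bytes(files) / 1024 ** 3:.2f} GiB in total")
```

Comparing this list against the new merged disk attached in step 8 makes it easier to be sure which files are the old ones before anything is deleted.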

6 comments:

  1. I'm having the same problem. Unfortunately it's been going on since November and I have close to 3.000 avhdx files now, 3.5TB in size. Not looking forward to manually merging them.

    Replies
    1. Yeah, it's a pain; we have had it happen at a few sites on 2019 hosts. But once we have merged the data back, we have had good mileage from deleting the VM config and making a new one in Hyper-V.

      The issue has not returned on those VMs but has returned on the VMs we have not done it to on the same host. It seems odd, but it's working by the look of it.

      Out of curiosity, what do you run for backups? The sites where we have the issue run a 3rd-party backup solution but then use Hyper-V Replica to an offsite location, and we are thinking it could be VSS getting in a knot when the 3rd-party backup is running and Hyper-V Replica takes a checkpoint at the same time.

  2. Thanks! This post saved my day

  3. We are running into this issue after attempting to enable Azure Site Replication on one of our SQL servers and it failed in Azure. We are using Veeam as our backup and every time Veeam runs, it leaves behind another avhdx file. We have a 3-node AzureStack HCI cluster. I was able to remove replication and live-migrate the VM last week and it merged all the snapshots. Unfortunately the issue came back the next day and the live-migration/quick-migration options are not working now. I'm not sure what we are going to do to fix this but it's not good!

  4. Please read https://www.altaro.com/hyper-v/clean-up-hyper-v-checkpoint/ - method #2 worked for me in our case

    Replies
    1. Yeah, we tried that. We had a case logged with Altaro; this was the last option that fixed it.
