Blah, Cloud.

Adventures in architectures

Fix for CBT bug in VMware Products

02/12/2014 by Myles Gray 23 Comments

VMware, at the time of writing, has a nasty bug affecting backups that use Changed Block Tracking (CBT). (Hint: basically any enterprise backup product worth its salt has CBT enabled.) CBT loses track of the changed blocks when a VMDK reaches any power-of-2 multiple of 128GB (128, 256, 512, 1024, etc.), which may make your backups unrecoverable.

The bug is documented in the following VMware KB article:

kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2090639

The remedy is to disable and re-enable (reset) CBT on the affected machines. This can be done with the machine powered off, or with it powered on by running PowerCLI commands and taking a snapshot. We will be doing the latter; no one likes downtime.

Download and install VMware PowerCLI, then run the following command:

Connect-VIServer -Server {VC-Address}

Enter your username and password when prompted. You should see output like the below:

Name            Port  User
----            ----  ----
vcsa.domain.com 443   username

The following collects the VMs that have a VMDK of 128GB or larger and CBT enabled into the array $vms:

[System.Collections.ArrayList]$vms = Get-VM| ?{$_.ExtensionData.Config.Hardware.Device.CapacityInKB -ge 128000000} | ?{$_.ExtensionData.Config.ChangeTrackingEnabled -eq $true}
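
A note on the 128000000 figure in the filter: it is a round number of KB rather than exactly 128GB in binary units, so the filter errs on the cautious side and also matches disks a little under the boundary. A minimal sketch of the arithmetic, assuming disk sizes are counted in binary units (1KB = 1024 bytes); the variable names are mine, for illustration only:

```powershell
# PowerShell's KB/GB suffixes are binary multipliers (1KB = 1024 bytes).
$exactThresholdKB = 128GB / 1KB      # 134217728
$postThresholdKB  = 128000000        # the round figure used in the filter above

# The round figure is lower, so it matches everything the exact one would,
# plus disks from roughly 122GB upwards.
$postThresholdGB = [math]::Round(($postThresholdKB * 1KB) / 1GB, 2)   # 122.07

"Exact 128GB in KB : $exactThresholdKB"
"Filter threshold  : $postThresholdKB KB (~$postThresholdGB GB)"
```

Catching a few extra VMs is harmless here, since resetting CBT only costs one longer backup.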

To view the list of VMs run the following:

echo $vms

You should get a nice list of VMs that match the conditions and likely need CBT reset:

Name                 PowerState Num CPUs MemoryGB
----                 ---------- -------- --------
Machine1.domain...   PoweredOn  4        8.000
Machine2.domain...   PoweredOn  4        8.000
Machine3.domain...   PoweredOn  2        6.000

To reset CBT on these machines while they are live, you need to create a VM config spec that disables CBT and apply it to the affected machines:

$spec = New-Object VMware.Vim.VirtualMachineConfigSpec; $spec.ChangeTrackingEnabled = $false;

To disable CBT on all affected VMs, we then apply the $spec to each VM in the $vms array:

foreach($vm in $vms){$vm.ExtensionData.ReconfigVM($spec);$snap=$vm | New-Snapshot -Name 'Disable CBT';$snap | Remove-Snapshot -confirm:$false;}

This applies the $spec to each affected VM, then takes a snapshot and removes it to commit the CBT setting change.
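
The one-liner above is dense, so here is the same sequence laid out as a function, as a sketch only. `Set-CbtState` is a hypothetical name of mine, not a PowerCLI cmdlet; it assumes PowerCLI is loaded, a Connect-VIServer session is active, and that the reconfigure-then-snapshot sequence used in this post is what you want.

```powershell
# Hypothetical helper (not part of PowerCLI): same steps as the one-liner,
# one VM at a time. Requires PowerCLI and a Connect-VIServer session.
function Set-CbtState {
    param(
        [Parameter(Mandatory=$true, ValueFromPipeline=$true)] $VM,
        [Parameter(Mandatory=$true)] [bool] $Enabled
    )
    process {
        # Build a config spec that only changes the CBT flag.
        $spec = New-Object VMware.Vim.VirtualMachineConfigSpec
        $spec.ChangeTrackingEnabled = $Enabled

        # Apply the spec, then create and remove a snapshot so the
        # change is committed while the VM stays powered on.
        $VM.ExtensionData.ReconfigVM($spec)
        $snap = New-Snapshot -VM $VM -Name "CBT set to $Enabled"
        Remove-Snapshot -Snapshot $snap -Confirm:$false
    }
}
```

With that defined, `$vms | Set-CbtState -Enabled $false` would do the same job as the foreach loop.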

To check that your command ran successfully, run:

get-vm | ?{$_.ExtensionData.Config.ChangeTrackingEnabled -eq $false}

This outputs a list of VMs with CBT disabled; you should see your full list of VMs from above here. If you are using a backup product that forces CBT on, like Veeam, you can stop here: Veeam will re-enable CBT and run a full backup next time (because we have lost our CBT history).

However, if you run a product that doesn’t do this, you will need to let your backup run once, then run the following commands to enable CBT in the spec again and apply it to the VMs:

[System.Collections.ArrayList]$vms = Get-VM| ?{$_.ExtensionData.Config.Hardware.Device.CapacityInKB -ge 128000000} | ?{$_.ExtensionData.Config.ChangeTrackingEnabled -eq $false}
$spec = New-Object VMware.Vim.VirtualMachineConfigSpec; $spec.ChangeTrackingEnabled = $true;
foreach($vm in $vms){$vm.ExtensionData.ReconfigVM($spec);$snap=$vm | New-Snapshot -Name 'Enable CBT';$snap | Remove-Snapshot -confirm:$false;}

This is subtly different from the first set of commands. Of note are:

.ChangeTrackingEnabled -eq $false

This pulls only VMs with CBT disabled into the $vms array.

$spec.ChangeTrackingEnabled = $true;

This enables CBT on the machines rather than disabling it.

This will resolve the problem until a VMDK crosses another power-of-2 multiple of 128GB, at which point this will need to be run again.
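
To make those borders concrete, here is a small sketch that lists them and checks whether growing a disk would reach or pass one. The function names are hypothetical and it works in whole GB; it is purely illustrative arithmetic based on the description in this post, not anything taken from the KB.

```powershell
# Hypothetical helpers for illustration only.
# Returns the boundary sizes in GB: 128, 256, 512, ... up to $MaxGB.
function Get-CbtBoundaryGB {
    param([int]$MaxGB = 2048)
    $boundary = 128
    while ($boundary -le $MaxGB) {
        $boundary
        $boundary *= 2
    }
}

# True if growing a disk from $OldGB to $NewGB reaches or passes a boundary.
function Test-CbtBoundaryCrossed {
    param([int]$OldGB, [int]$NewGB)
    foreach ($b in (Get-CbtBoundaryGB -MaxGB $NewGB)) {
        if ($OldGB -lt $b -and $NewGB -ge $b) { return $true }
    }
    return $false
}

Get-CbtBoundaryGB -MaxGB 1024          # 128, 256, 512, 1024
Test-CbtBoundaryCrossed 100 200        # True  (passes 128)
Test-CbtBoundaryCrossed 130 200        # False (stays between 128 and 256)
Test-CbtBoundaryCrossed 200 256        # True  (reaches 256)
```

A check like this could be run before extending a disk, to know in advance whether a CBT reset will be needed afterwards.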

This bug is currently under investigation by VMware, and I am keeping an eye on the KB for news of a hotfix. The PowerShell code was adapted from: http://www.veeam.com/kb1940

Filed Under: Infrastructure, Software, Virtualisation Tagged With: CBT, vmdk, vmware

About Myles Gray

Hi! I'm Myles, and I'm a Dev Advocate at VMware. Focused primarily on content generation, product enablement and feedback from customers and field to engineering.

Comments

  1. Ashley Milne says

    06/12/2014 at 06:09

    Hi, I am following your steps and get stuck on the third step, perhaps I am a bit thick, but I am cutting and pasting your script and I get this error:

    The term ‘Get-VM ‘ is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
    At line:1 char:45
    + [System.Collections.ArrayList]$vms = Get-VM <<<< | ?{$_.ExtensionData.Config.Hardware.Device.CapacityInKB -ge 128000000} | ?{$_.ExtensionData.Config.ChangeTrackingEnabled -eq $true}
    + CategoryInfo : ObjectNotFound: (Get-VM :String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException

    Are there variables I need to change specific to my environment?

    • Myles Gray says

      06/12/2014 at 14:25

      Hi Ashley,

      You’ve made sure to install PowerCLI as listed above and launched it instead of straight PowerShell?

      Myles

  2. Ashley Milne says

    06/12/2014 at 14:52

    Yes that is correct.

    • Myles Gray says

      06/12/2014 at 15:38

      Can you try running the below just on its own, after you’ve made a successful connection to the vCenter Server with Connect-VIServer:

      `Get-VM`

      You should get a list of all the VMs in your vCenter?

  3. Ashley Milne says

    06/12/2014 at 15:47

    Correct, that command returns a list of vm’s in my Vcenter.

    • Myles Gray says

      06/12/2014 at 16:03

      Ashley,

      Got to the bottom of it, apologies. The error you’re seeing is: The term ‘Get-VM ‘ is not recognized. Note the extra whitespace after Get-VM. I have adjusted my code above by removing the space after the Get-VM command; it should work as expected now.

      Myles

      • Ashley Milne says

        06/12/2014 at 16:52

        Thanks, it works now as expected. Cheers!

  4. Ashley Milne says

    06/12/2014 at 15:48

    If it’s helpful, the version of vCenter I am running is 5.1 1473063 and the host I am running the script against is running ESXi 5.0 441354.

  5. gbkhor says

    16/12/2014 at 08:48

    Folks,

    So, after applying the above script, we don’t need to power cycle the VM, and this will fix the CBT issue temporarily while we wait for VMware to come out with a permanent fix?

    Note: I’m using NBU 6.0.2, which makes use of CBT for backup.

    • Myles Gray says

      16/12/2014 at 08:51

      Correct, no power cycle is needed. Your next backup will take a lot longer, as it is a full rather than a differential, but it should be okay after that.

      • gbkhor says

        18/12/2014 at 05:34

        Myles,

        How do we verify that CBT was recreated after running the above script? Do we just look at the size of it?

        Thanks

        • gbkhor says

          18/12/2014 at 10:27

          I got the answer already, CBT file is deleted once the script is run

          • Myles Gray says

            18/12/2014 at 10:29

            Yep, that’s correct; it is then recreated with zero size.

      • gbkhor says

        18/12/2014 at 10:25

        Myles

        I noticed that after running the script below

        $spec = New-Object VMware.Vim.VirtualMachineConfigSpec
        $spec.ChangeTrackingEnabled = $false

        it changed `scsi0:1.deviceType = "disk"` to `scsi0:1.deviceType = "scsi-hardDisk"` in the vmx file, which basically changes the SCSI controller from “LSI Logic SAS” to “LSI Logic Parallel”.

        Can we maintain the SCSI controller type? I have other disks using “VMware Paravirtual” and would like to keep all the SCSI controller types as they are now.

        Thanks

        • Myles Gray says

          18/12/2014 at 10:58

          I’ve never seen that behaviour or that deviceType (scsi-hardDisk) before. As far as I am aware, it doesn’t determine the SCSI controller type; that is set with the `scsi*.virtualDev` param in the vmx:

          http://faq.sanbarrow.com/index.php?action=artikel&cat=7&id=53&artlang=en

          Did you upgrade your ESX instance from a very old version to a newer one or has the machine in question been imported from an OpenStack / Fusion / Workstation / Xen / OtherVirtSolutionHere instance?

          More reference on vmx props:

          http://sanbarrow.com/vmx/vmx-scsi.html

          • gbkhor says

            18/12/2014 at 11:13

            Myles

            Yes, it was upgraded from ESX 4 to 5 and to 5.5

            Thanks

            • Myles Gray says

              18/12/2014 at 11:40

              The only references to deviceType=”disk” are in relation to IDE controllers.

              Where have you found info suggesting deviceType converts from SAS to Parallel?

              The SCSI controller should be unaffected by this operation.

              • gbkhor says

                18/12/2014 at 12:48

                From the VM guest properties. The SCSI controller type changed from LSI Logic SAS to Parallel. This guest was not imported from any platform; it was built on VMware itself.

                Thanks

              • gbkhor says

                18/12/2014 at 13:41

                it change the “scsi0:1.deviceType = “disk”” TO “scsi0:1.deviceType = “scsi-hardDisk”” at vmx file

              • gbkhor says

                18/12/2014 at 13:52

                See below

                http://faq.sanbarrow.com/index.php?action=artikel&cat=7&id=54&artlang=en&highlight=deviceType

                It changed “disk” to “scsi-hardDisk”, and this caused the VM to change from LSI Logic SAS to LSI Logic Parallel. This change is not reversible: once you try to change it back to LSI Logic SAS, the OS drive fails to boot.

                Thanks

            • Myles Gray says

              19/12/2014 at 11:34

              Sounds like one to take up with VMware support – this isn’t an expected behaviour.

              • gbkhor says

                19/12/2014 at 13:00

                Thanks Myles!

              • Myles Gray says

                19/12/2014 at 15:42

                Any time :)
