Azure managed disk

Xin Cheng
Oct 6, 2021


Surprise (bad or good)

I have been using Azure managed disks with Azure VMs for a long time, but I recently found a hidden gem in this technology. It started when I was testing how fast we could migrate an on-premises server hosting a large amount of data. The data lives in binary files (a single file can be 100GB and the total can be several tens of TB). There are 2 scenarios:

  1. Migrate these files as a snapshot from on-premises to Azure to spin up a server
  2. Take a quick snapshot in the cloud to spin up another server in Azure

We use azcopy to upload from on-premises to an Azure file share (CIFS). The upload speed ended up at about 2Gb/s, which means it takes roughly 1h to upload 1TB of data (we could achieve better speed with a larger Azure file share capacity).
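To double-check the "1h per TB" figure, here is a minimal sketch of the arithmetic. The function name transfer_hours is my own, and it assumes decimal units (1TB = 8000Gb) and a perfectly sustained rate, which a real transfer will not quite reach.

```shell
# transfer_hours SIZE_TB RATE_GBPS
# Prints the hours needed to move SIZE_TB terabytes at RATE_GBPS gigabits per second.
# Assumption: decimal units, so 1 TB = 8000 Gb, and the rate is sustained throughout.
transfer_hours() {
  awk -v tb="$1" -v gbps="$2" 'BEGIN { printf "%.2f\n", tb * 8000 / gbps / 3600 }'
}

# 1 TB at 2 Gb/s
transfer_hours 1 2   # prints 1.11
```

So 1TB at 2Gb/s is a bit over an hour, which matches what we observed.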

Of course, to use these files we need to copy them from the Azure file share to a managed disk attached to an Azure VM. We tested 2 options:

  1. Mount the Azure file share via CIFS, then use Linux cp to copy 4TB of data. This ended up at 0.5Gb/s. However, the Azure file share should provide 2Gb/s according to MS guidance, the provisioned Premium SSD managed disk should also provide 2Gb/s, and the VM disk and network bandwidth should be much higher according to 1, 2.
  2. Use azcopy, which ended up at 1.5Gb/s.

It turns out that the first option is limited by running single-threaded (cp is not multi-threaded, and Azure file share disables SMB Multichannel by default; see “Slow copy performance on Linux clients” in my Azure file share CIFS).

Although the second option provides good throughput, migrating 4TB of data would still take roughly 8 to 10h (first the upload, then the download).

We expected that once the initial data is in the cloud, spinning up a second instance would take much less time. Therefore, we looked at cloning the managed disk, something like below:

# Create a managed disk by copying an existing disk or snapshot.
# The new disk needs its own name (MyDisk2); MyDisk is the source.
az disk create -g MyResourceGroup -n MyDisk2 --source MyDisk

I expected it to be faster, but the result still surprised me: the new managed disk showed up in less than 2 minutes. What happened?
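If you want to measure this yourself, a minimal sketch could look like the following. The function name clone_and_time and the resource names are placeholders of mine; the only thing taken from the command above is az disk create with --source. Note that it only measures how long the CLI call takes to return, not what happens inside the storage backend.

```shell
# clone_and_time RESOURCE_GROUP SOURCE_DISK NEW_DISK
# Clones SOURCE_DISK into NEW_DISK and prints the wall-clock seconds the call took.
# Hypothetical helper: all names passed in are placeholders.
clone_and_time() {
  rg="$1"; src="$2"; dst="$3"
  start=$(date +%s)
  # --query/-o are global az options; this prints only the provisioning state.
  az disk create -g "$rg" -n "$dst" --source "$src" \
    --query provisioningState -o tsv || return 1
  end=$(date +%s)
  echo "elapsed_seconds=$((end - start))"
}

# Usage (requires a logged-in Azure CLI):
#   clone_and_time MyResourceGroup MyDisk MyDisk2
```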

Internals

I think it is due to the internals of Azure managed disks. I could not find any explanation online, but my guess is that it does not really copy 4TB of data on the fly when a new managed disk is created (that would imply about 266Gb/s).
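The throughput that a real 4TB copy in 2 minutes would require can be sketched like this. The function name implied_gbps is my own, and it assumes decimal units (1TB = 8000Gb).

```shell
# implied_gbps SIZE_TB SECONDS
# Prints the throughput in Gb/s needed to move SIZE_TB terabytes in SECONDS seconds.
# Assumption: decimal units, so 1 TB = 8000 Gb.
implied_gbps() {
  awk -v tb="$1" -v s="$2" 'BEGIN { printf "%.1f\n", tb * 8000 / s }'
}

# 4 TB in 120 seconds
implied_gbps 4 120   # prints 266.7
```

That is more than 100 times the 2Gb/s we measured with azcopy, which is why I doubt a full physical copy is happening.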

  1. An Azure disk is backed by a page blob. Disks were previously stored as page blobs in a storage account you created yourself; managed disks abstract that step away.
  2. A page blob is a collection of 512-byte pages. It is backed by the storage account architecture 1, 2. The pages are distributed across the backend storage infrastructure.
  3. When you copy a managed disk, does it really copy all the data to another disk, or is there some lazy-copy/copy-on-write technology?

In the following articles, we always see the same pattern (getting the disk SAS URI, which is an Azure Storage SAS URI):

targetSASURI=$(az disk grant-access -n "$targetDiskName" -g "$targetRG" --access-level Write --duration-in-seconds 86400 -o tsv)
sourceSASURI=$(az disk grant-access -n "$sourceDiskName" -g "$sourceRG" --duration-in-seconds 86400 --query "[accessSas]" -o tsv)
azcopy copy "$sourceSASURI" "$targetSASURI" --blob-type PageBlob
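The same pattern can be wrapped into a function that also revokes the SAS access afterwards. This is only a sketch under assumptions: the function name copy_disk_via_sas is mine, the 3600-second lifetime is arbitrary, it assumes the target disk already exists in a state that accepts a write SAS, and it relies on -o tsv printing the single SAS value returned by grant-access.

```shell
# copy_disk_via_sas SRC_RG SRC_DISK DST_RG DST_DISK
# Hypothetical helper: grants temporary SAS access on both disks, copies the
# source page blob to the target with azcopy, then revokes access on both.
copy_disk_via_sas() {
  src_rg="$1"; src_disk="$2"; dst_rg="$3"; dst_disk="$4"

  # Read SAS on the source disk (assumption: -o tsv yields just the SAS URI).
  src_sas=$(az disk grant-access -g "$src_rg" -n "$src_disk" \
    --access-level Read --duration-in-seconds 3600 -o tsv) || return 1
  # Write SAS on the target disk.
  dst_sas=$(az disk grant-access -g "$dst_rg" -n "$dst_disk" \
    --access-level Write --duration-in-seconds 3600 -o tsv) || return 1

  azcopy copy "$src_sas" "$dst_sas" --blob-type PageBlob
  status=$?

  # Revoke access even if the copy failed, so no SAS is left active.
  az disk revoke-access -g "$src_rg" -n "$src_disk"
  az disk revoke-access -g "$dst_rg" -n "$dst_disk"
  return "$status"
}

# Usage (requires a logged-in Azure CLI and azcopy on PATH):
#   copy_disk_via_sas "$sourceRG" "$sourceDiskName" "$targetRG" "$targetDiskName"
```

Revoking at the end matters because a disk with an active SAS cannot be attached to a VM.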

A disk snapshot is also stored in a storage account

Copying a blob also involves a SAS URI

Appendix

Bonus video when researching Azure managed disk
