Azure SDK for Python: Retrieve Virtual Machine Image Details
holger
Posted on January 13, 2023
The Azure SDK for Python includes capabilities that allow users to retrieve Virtual Machine image details, such as publisher names, offers, SKUs and the image versions themselves. [4] I was reading a thread on Stack Overflow and got interested in this functionality, so here is the corresponding write-up.
The corresponding library is the Azure Management Compute library, which includes the ComputeManagementClient class, which in turn has a function called virtual_machine_images.list. [1][2] However, this function requires further parameters:
- location
- publisher_name
- offer
- skus
Except for location, we may not have all the required details at hand to retrieve the image details. Further relevant functions include:
- list_publishers, which has location as a required parameter
- list_offers, which requires location and publisher_name
- list_skus, which requires location, publisher_name and offer
That looks like a few nested for-loops would be needed before we could finally list all VM images. But would we even want this? Maybe. Given that there are thousands of images available in each region, probably not, as it would take a significant amount of time. So, maybe build our own function to stay flexible?
Let's try and go through this step by step and take a look at each of those functions.
First, we would need the appropriate Azure SDK for Python libraries for authentication and compute management:
- azure-identity [3]
- azure-mgmt-compute [1]
from azure.identity import AzureCliCredential
from azure.mgmt.compute import ComputeManagementClient
We'd need to define our Credentials, Subscription ID and Azure Location.
my_credential = AzureCliCredential()
my_subscription_id = "{subscription-id}"
my_location = "{azure-location}"
Now we could create the client and collect the list of Publishers.
compute_client = ComputeManagementClient(credential=my_credential, subscription_id=my_subscription_id)
img_publishers = compute_client.virtual_machine_images.list_publishers(location=my_location)
At this point, if we were to display the content of the img_publishers variable, it would show loads of objects:
[...<azure.mgmt.compute.v2022_08_01.models._models_py3.VirtualMachineImageResource object at 0x7fb8d85c0790>, <azure.mgmt.compute.v2022_08_01.models._models_py3.VirtualMachineImageResource object at 0x7fb8d85c07c0>, <azure.mgmt.compute.v2022_08_01.models._models_py3.VirtualMachineImageResource object at 0x7fb8d85c07f0>, <azure.mgmt.compute.v2022_08_01.models._models_py3.VirtualMachineImageResource object at 0x7fb8d85c0820>, <azure.mgmt.compute.v2022_08_01.models._models_py3.VirtualMachineImageResource object at 0x7fb8d85c0850>, <azure.mgmt.compute.v2022_08_01.models._models_py3.VirtualMachineImageResource object at 0x7fb8d85c0880>...]
Just to get an idea of how many publishers there are (I used northeurope as Azure location):
>>> len(img_publishers)
1871
Let's just pick one randomly, to see what details are contained within these objects:
>>> print(img_publishers[1200])
{'additional_properties': {}, 'id': '/Subscriptions/{subscription-id}/Providers/Microsoft.Compute/Locations/northeurope/Publishers/MicrosoftSQLServer', 'name': 'MicrosoftSQLServer', 'location': 'northeurope', 'tags': None, 'extended_location': None}
The item of interest seems to be the name. We can now use that to go further and query the offers that belong to this publisher. Let's stick to the example above in order to keep it simple for the moment.
offers = compute_client.virtual_machine_images.list_offers(location=my_location, publisher_name=img_publishers[1200].name)
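As a side note, the index 1200 only points at MicrosoftSQLServer for this particular response; the position may differ between regions or over time. A minimal sketch of a lookup by name, assuming the list items expose a name attribute as shown in the output above (find_by_name is a hypothetical helper, not part of the SDK):

```python
def find_by_name(resources, name):
    """Return the first resource whose .name equals `name`, or None if absent."""
    return next((r for r in resources if r.name == name), None)


# Hypothetical usage with the list retrieved earlier:
# publisher = find_by_name(img_publishers, "MicrosoftSQLServer")
```

The same helper would work for the offers and SKUs lists, since those items carry a name attribute as well.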
Let's see how many offers there are from MicrosoftSQLServer:
>>> len(offers)
37
We'll again pick one of them to see what properties might be important, and it's, again, name.
>>> print(offers[36])
{'additional_properties': {}, 'id': '/Subscriptions/{subscription-id}/Providers/Microsoft.Compute/Locations/northeurope/Publishers/MicrosoftSQLServer/ArtifactTypes/VMImage/Offers/sql2022-ws2022', 'name': 'sql2022-ws2022', 'location': 'northeurope', 'tags': None, 'extended_location': None}
Next stop: SKUs
skus = compute_client.virtual_machine_images.list_skus(location=my_location,publisher_name=img_publishers[1200].name, offer=offers[36].name)
Telling by the length of the list, there are only four SKUs for this particular offer:
>>> len(skus)
4
And picking one of them reveals that, again, the name property is what we need.
>>> print(skus[0])
{'additional_properties': {'properties': {'automaticOSUpgradeProperties': {'automaticOSUpgradeSupported': False}}}, 'id': '/Subscriptions/{subscription-id}/Providers/Microsoft.Compute/Locations/northeurope/Publishers/MicrosoftSQLServer/ArtifactTypes/VMImage/Offers/sql2022-ws2022/Skus/enterprise-gen2', 'name': 'enterprise-gen2', 'location': 'northeurope', 'tags': None, 'extended_location': None}
Since we now have everything together, we can finally call the list() function.
images = compute_client.virtual_machine_images.list(location=my_location, publisher_name=img_publishers[1200].name, offer=offers[36].name, skus=skus[0].name)
We now have identified 2 images:
>>> len(images)
2
And from that we can get an idea about what the outcome will be:
>>> print(images[0])
{'additional_properties': {}, 'id': '/Subscriptions/{subscription-id}/Providers/Microsoft.Compute/Locations/northeurope/Publishers/MicrosoftSQLServer/ArtifactTypes/VMImage/Offers/sql2022-ws2022/Skus/enterprise-gen2/Versions/16.0.221025', 'name': '16.0.221025', 'location': 'northeurope', 'tags': None, 'extended_location': None}
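The id string already contains publisher, offer, SKU and version, so these could also be read back from it. Below is a small sketch that assumes the id keeps the alternating key/value layout and the casing shown in the output above; parse_image_id and image_urn are hypothetical helper names. The Publisher:Offer:Sku:Version form is the URN notation the Azure CLI uses for images.

```python
def parse_image_id(resource_id):
    """Split an image resource id into a dict of its key/value path segments."""
    parts = resource_id.strip("/").split("/")
    # Segments alternate: key, value, key, value, ...
    return dict(zip(parts[0::2], parts[1::2]))


def image_urn(resource_id):
    """Build a Publisher:Offer:Sku:Version string from an image resource id."""
    fields = parse_image_id(resource_id)
    return ":".join(
        [fields["Publishers"], fields["Offers"], fields["Skus"], fields["Versions"]]
    )
```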
Picking items from the individual lists a) is not much fun and b) does not scale very well. As a next step, we might want to wrap some for-loops around these calls.
So, let's take a step back and assume we were back at a point where we imported our libraries and defined the my_credential, my_subscription_id and my_location variables. We would also have created the ComputeManagementClient and dumped the list of publishers into the img_publishers variable.
from azure.identity import AzureCliCredential
from azure.mgmt.compute import ComputeManagementClient
my_credential = AzureCliCredential()
my_subscription_id = "{subscription-id}"
my_location = "{azure-location}"
compute_client = ComputeManagementClient(credential=my_credential, subscription_id=my_subscription_id)
img_publishers = compute_client.virtual_machine_images.list_publishers(location=my_location)
From here on we could iterate through the Publishers, Offers, SKUs and finally Images.
for publisher in img_publishers:
    offers = compute_client.virtual_machine_images.list_offers(location=my_location, publisher_name=publisher.name)
    for offer in offers:
        skus = compute_client.virtual_machine_images.list_skus(location=my_location, publisher_name=publisher.name, offer=offer.name)
        for sku in skus:
            images = compute_client.virtual_machine_images.list(location=my_location, publisher_name=publisher.name, offer=offer.name, skus=sku.name)
            for image in images:
                print(image)
It shouldn't take long for us to see some results on screen (I cancelled the processing using CTRL-C, otherwise it would take ages to get all the images):
...
{'additional_properties': {}, 'id': '/Subscriptions/{subscription-id}/Providers/Microsoft.Compute/Locations/northeurope/Publishers/aod/ArtifactTypes/VMImage/Offers/win2019azpolicy/Skus/win2019azpolicy/Versions/0.0.1', 'name': '0.0.1', 'location': 'northeurope', 'tags': None, 'extended_location': None}
{'additional_properties': {}, 'id': '/Subscriptions/{subscription-id}/Providers/Microsoft.Compute/Locations/northeurope/Publishers/aod/ArtifactTypes/VMImage/Offers/win2019azpolicy/Skus/win2019azpolicy/Versions/0.0.3', 'name': '0.0.3', 'location': 'northeurope', 'tags': None, 'extended_location': None}
{'additional_properties': {}, 'id': '/Subscriptions/{subscription-id}/Providers/Microsoft.Compute/Locations/northeurope/Publishers/apigee/ArtifactTypes/VMImage/Offers/apigee-edge/Skus/apigee-edge-4-15-07/Versions/1.0.0', 'name': '1.0.0', 'location': 'northeurope', 'tags': None, 'extended_location': None}
{'additional_properties': {}, 'id': '/Subscriptions/{subscription-id}/Providers/Microsoft.Compute/Locations/northeurope/Publishers/apigee/ArtifactTypes/VMImage/Offers/apigee-edge/Skus/apigee-edge-private-cloud/Versions/4.16.05', 'name': '4.16.05', 'location': 'northeurope', 'tags': None, 'extended_location': None}
...
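Instead of interrupting the run with CTRL-C, the same nested loops could be wrapped in a generator and cut off after a fixed number of results. The following is only a sketch: it takes an already created client as a parameter (so the snippet itself does not import the SDK), and iter_vm_images and first_images are hypothetical names.

```python
from itertools import islice


def iter_vm_images(client, location):
    """Lazily yield (publisher, offer, sku, version) tuples for one location."""
    api = client.virtual_machine_images
    for publisher in api.list_publishers(location=location):
        for offer in api.list_offers(location=location, publisher_name=publisher.name):
            for sku in api.list_skus(location=location, publisher_name=publisher.name,
                                     offer=offer.name):
                for image in api.list(location=location, publisher_name=publisher.name,
                                      offer=offer.name, skus=sku.name):
                    yield publisher.name, offer.name, sku.name, image.name


def first_images(client, location, limit=10):
    """Return at most `limit` results; no further API calls are made afterwards."""
    return list(islice(iter_vm_images(client, location), limit))


# Hypothetical usage with the client created earlier:
# for row in first_images(compute_client, my_location, limit=5):
#     print(row)
```

Because the generator only advances when a result is requested, stopping after a handful of images also stops the remaining list calls.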
While this works, it might not be very helpful for further processing. We might want to add the details to a custom dictionary. If so, the loops could be laid out like this:
for publisher in img_publishers:
    offers = compute_client.virtual_machine_images.list_offers(location=my_location, publisher_name=publisher.name)
    for offer in offers:
        skus = compute_client.virtual_machine_images.list_skus(location=my_location, publisher_name=publisher.name, offer=offer.name)
        for sku in skus:
            images = compute_client.virtual_machine_images.list(location=my_location, publisher_name=publisher.name, offer=offer.name, skus=sku.name)
            for image in images:
                image_dict = dict({
                    'publisherName': publisher.name,
                    'offerName': offer.name,
                    'skuName': sku.name,
                    'imageName': image.name
                })
                print(image_dict)
The dictionary would look something like this (again CTRL-C'd after a few seconds to stop the processing as it would take too long):
...
{'publisherName': 'aod', 'offerName': 'win2019azpolicy', 'skuName': 'win2019azpolicy', 'imageName': '0.0.1'}
{'publisherName': 'aod', 'offerName': 'win2019azpolicy', 'skuName': 'win2019azpolicy', 'imageName': '0.0.3'}
{'publisherName': 'apigee', 'offerName': 'apigee-edge', 'skuName': 'apigee-edge-4-15-07', 'imageName': '1.0.0'}
{'publisherName': 'apigee', 'offerName': 'apigee-edge', 'skuName': 'apigee-edge-private-cloud', 'imageName': '4.16.05'}
...
If we wanted to collect those individual dictionaries in one list to process in the end, we could append the image details rather than printing them out on screen.
results = []
for publisher in img_publishers:
    offers = compute_client.virtual_machine_images.list_offers(location=my_location, publisher_name=publisher.name)
    for offer in offers:
        skus = compute_client.virtual_machine_images.list_skus(location=my_location, publisher_name=publisher.name, offer=offer.name)
        for sku in skus:
            images = compute_client.virtual_machine_images.list(location=my_location, publisher_name=publisher.name, offer=offer.name, skus=sku.name)
            for image in images:
                image_dict = dict({
                    'publisherName': publisher.name,
                    'offerName': offer.name,
                    'skuName': sku.name,
                    'imageName': image.name
                })
                results.append(image_dict)
This would still run a very long time though. So maybe we would want to add some query functionality? Let's create a function.
def get_vm_images(credential, subscription_id, location, image_publisher):
    results = []
    compute_client = ComputeManagementClient(credential=credential, subscription_id=subscription_id)
    img_publishers = compute_client.virtual_machine_images.list_publishers(location=location)
    for publisher in img_publishers:
        if image_publisher in publisher.name:
            offers = compute_client.virtual_machine_images.list_offers(location=location, publisher_name=publisher.name)
            for offer in offers:
                skus = compute_client.virtual_machine_images.list_skus(location=location, publisher_name=publisher.name, offer=offer.name)
                for sku in skus:
                    images = compute_client.virtual_machine_images.list(location=location, publisher_name=publisher.name, offer=offer.name, skus=sku.name)
                    for image in images:
                        image_dict = dict({
                            'publisherName': publisher.name,
                            'offerName': offer.name,
                            'skuName': sku.name,
                            'imageName': image.name
                        })
                        results.append(image_dict)
        else:
            pass
    return results
By checking whether the value of image_publisher (which we pass to the function) is part of the publisher.name value, we make sure that images are only queried and added to the results list if there is a match: if image_publisher in publisher.name.
We could try this:
>>> get_vm_images(credential=my_credential, subscription_id=my_subscription_id, location=my_location, image_publisher="canonical")
Hm, we get nothing:
[]
So, maybe capitalization matters.
>>> get_vm_images(credential=my_credential, subscription_id=my_subscription_id, location=my_location, image_publisher="Canonical")
And indeed, this seems to work better:
[...{'publisherName': 'Canonical', 'offerName': 'UbuntuServer', 'skuName': '19_10-daily-gen2', 'imageName': '19.10.202006110'}, {'publisherName': 'Canonical', 'offerName': 'UbuntuServer', 'skuName': '19_10-daily-gen2', 'imageName': '19.10.202007020'}, {'publisherName': 'Canonical', 'offerName': 'UbuntuServer', 'skuName': '19_10-daily-gen2', 'imageName': '19.10.202007030'}, {'publisherName': 'Canonical', 'offerName': 'UbuntuServer', 'skuName': '19_10-daily-gen2', 'imageName': '19.10.202007070'}, {'publisherName': 'Canonical', 'offerName': 'UbuntuServer', 'skuName': '19_10-daily-gen2', 'imageName': '19.10.202007080'}, {'publisherName': 'Canonical', 'offerName': 'UbuntuServer', 'skuName': '19_10-daily-gen2', 'imageName': '19.10.202007090'}, {'publisherName': 'Canonical', 'offerName': 'UbuntuServer', 'skuName': '19_10-daily-gen2', 'imageName': '19.10.202007100'}]
So, something else to account for, I presume, since there might be multiple variations. Maybe some publishers are typed all lowercase while others require capitalization. We could replace the if-clause above (I'm sure there are more elegant ways though) by: if image_publisher.lower() in publisher.name or image_publisher.capitalize() in publisher.name:
The function code would now look like this:
def get_vm_images(credential, subscription_id, location, image_publisher):
    results = []
    compute_client = ComputeManagementClient(credential=credential, subscription_id=subscription_id)
    img_publishers = compute_client.virtual_machine_images.list_publishers(location=location)
    for publisher in img_publishers:
        if image_publisher.lower() in publisher.name or image_publisher.capitalize() in publisher.name:
            offers = compute_client.virtual_machine_images.list_offers(location=location, publisher_name=publisher.name)
            for offer in offers:
                skus = compute_client.virtual_machine_images.list_skus(location=location, publisher_name=publisher.name, offer=offer.name)
                for sku in skus:
                    images = compute_client.virtual_machine_images.list(location=location, publisher_name=publisher.name, offer=offer.name, skus=sku.name)
                    for image in images:
                        image_dict = dict({
                            'publisherName': publisher.name,
                            'offerName': offer.name,
                            'skuName': sku.name,
                            'imageName': image.name
                        })
                        results.append(image_dict)
        else:
            pass
    return results
With this in place, we should be able to get results even when passing something like canonical instead of Canonical.
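One caveat with the if-clause above: str.capitalize() lowercases everything after the first character, so a query such as "microsoftsql" becomes "Microsoftsql" and still would not match MicrosoftSQLServer. A possible alternative, sketched here under the assumption that a plain case-insensitive substring match is what we want (publisher_matches is a hypothetical helper), is to normalize both sides before comparing:

```python
def publisher_matches(query, publisher_name):
    """Case-insensitive substring match between a query and a publisher name."""
    return query.casefold() in publisher_name.casefold()


# Hypothetical drop-in for the condition inside the loop:
# if publisher_matches(image_publisher, publisher.name):
```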
Now we have this very basic function in place. It could be used in various ways, e.g. to produce JSON output...
import json
canonical_images = get_vm_images(credential=my_credential, subscription_id=my_subscription_id, location=my_location, image_publisher="canonical")
print(json.dumps(canonical_images))
[ ... {"publisherName": "Canonical", "offerName": "UbuntuServer", "skuName": "19_10-daily-gen2", "imageName": "19.10.202006110"}, {"publisherName": "Canonical", "offerName": "UbuntuServer", "skuName": "19_10-daily-gen2", "imageName": "19.10.202007020"}, {"publisherName": "Canonical", "offerName": "UbuntuServer", "skuName": "19_10-daily-gen2", "imageName": "19.10.202007030"}, {"publisherName": "Canonical", "offerName": "UbuntuServer", "skuName": "19_10-daily-gen2", "imageName": "19.10.202007070"}, {"publisherName": "Canonical", "offerName": "UbuntuServer", "skuName": "19_10-daily-gen2", "imageName": "19.10.202007080"}, {"publisherName": "Canonical", "offerName": "UbuntuServer", "skuName": "19_10-daily-gen2", "imageName": "19.10.202007090"}, {"publisherName": "Canonical", "offerName": "UbuntuServer", "skuName": "19_10-daily-gen2", "imageName": "19.10.202007100"}]
...or to load into a Pandas DataFrame:
import pandas as pd
canonical_images = get_vm_images(credential=my_credential, subscription_id=my_subscription_id, location=my_location, image_publisher="canonical")
df = pd.DataFrame(canonical_images)
print(df)
>>> print(df)
publisherName offerName skuName imageName
0 Canonical 0001-com-ubuntu-confidential-vm-experimental 18_04 18.04.20210309
1 Canonical 0001-com-ubuntu-confidential-vm-experimental 18_04-gen2 18.04.20210309
2 Canonical 0001-com-ubuntu-confidential-vm-experimental 20_04 20.04.20210309
3 Canonical 0001-com-ubuntu-confidential-vm-experimental 20_04-gen2 20.04.20210309
4 Canonical 0001-com-ubuntu-confidential-vm-focal 20_04-lts-cvm 20.04.202111100
... ... ... ... ...
4546 Canonical UbuntuServer 19_10-daily-gen2 19.10.202007030
4547 Canonical UbuntuServer 19_10-daily-gen2 19.10.202007070
4548 Canonical UbuntuServer 19_10-daily-gen2 19.10.202007080
4549 Canonical UbuntuServer 19_10-daily-gen2 19.10.202007090
4550 Canonical UbuntuServer 19_10-daily-gen2 19.10.202007100
[4551 rows x 4 columns]
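If pandas is not at hand, a quick overview of the result list could also be produced with the standard library. This is just a sketch that assumes the list-of-dictionaries shape returned by get_vm_images; count_by is a hypothetical helper name.

```python
from collections import Counter


def count_by(results, key):
    """Count how many result dictionaries share each value of `key`."""
    return Counter(row[key] for row in results)


# Hypothetical usage with the list from above:
# for offer, count in count_by(canonical_images, "offerName").most_common(5):
#     print(offer, count)
```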
Finally, here is the full script for testing. I've added an additional if-clause so that the function would also accept None as image publisher. However, this is not really recommended as it takes a lot of time.
from azure.identity import AzureCliCredential
from azure.mgmt.compute import ComputeManagementClient

def get_vm_images(credential, subscription_id, location, image_publisher=None):
    results = []
    compute_client = ComputeManagementClient(credential=credential, subscription_id=subscription_id)
    img_publishers = compute_client.virtual_machine_images.list_publishers(location=location)
    if image_publisher:
        for publisher in img_publishers:
            if image_publisher.lower() in publisher.name or image_publisher.capitalize() in publisher.name:
                offers = compute_client.virtual_machine_images.list_offers(location=location, publisher_name=publisher.name)
                for offer in offers:
                    skus = compute_client.virtual_machine_images.list_skus(location=location, publisher_name=publisher.name, offer=offer.name)
                    for sku in skus:
                        images = compute_client.virtual_machine_images.list(location=location, publisher_name=publisher.name, offer=offer.name, skus=sku.name)
                        for image in images:
                            image_dict = dict({
                                'publisherName': publisher.name,
                                'offerName': offer.name,
                                'skuName': sku.name,
                                'imageName': image.name
                            })
                            results.append(image_dict)
            else:
                pass
    else:
        for publisher in img_publishers:
            offers = compute_client.virtual_machine_images.list_offers(location=location, publisher_name=publisher.name)
            for offer in offers:
                skus = compute_client.virtual_machine_images.list_skus(location=location, publisher_name=publisher.name, offer=offer.name)
                for sku in skus:
                    images = compute_client.virtual_machine_images.list(location=location, publisher_name=publisher.name, offer=offer.name, skus=sku.name)
                    for image in images:
                        image_dict = dict({
                            'publisherName': publisher.name,
                            'offerName': offer.name,
                            'skuName': sku.name,
                            'imageName': image.name
                        })
                        results.append(image_dict)
    return results

my_credential = AzureCliCredential()
my_subscription_id = "{subscription-id}"
get_vm_images(credential=my_credential, subscription_id=my_subscription_id, location="eastus", image_publisher="Canonical")
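The two branches in the full script differ only in the publisher check, so they could probably be folded into one loop. The sketch below is one possible variant, not a drop-in replacement: it takes an already created ComputeManagementClient instead of credential and subscription ID, uses a case-insensitive match, and collect_vm_images is a hypothetical name.

```python
def collect_vm_images(compute_client, location, image_publisher=None):
    """Collect image details, optionally filtered by a publisher substring."""
    api = compute_client.virtual_machine_images
    wanted = image_publisher.casefold() if image_publisher else None
    results = []
    for publisher in api.list_publishers(location=location):
        # With no filter every publisher passes; otherwise skip non-matches.
        if wanted and wanted not in publisher.name.casefold():
            continue
        for offer in api.list_offers(location=location, publisher_name=publisher.name):
            for sku in api.list_skus(location=location, publisher_name=publisher.name,
                                     offer=offer.name):
                for image in api.list(location=location, publisher_name=publisher.name,
                                      offer=offer.name, skus=sku.name):
                    results.append({
                        'publisherName': publisher.name,
                        'offerName': offer.name,
                        'skuName': sku.name,
                        'imageName': image.name,
                    })
    return results


# Hypothetical usage:
# client = ComputeManagementClient(credential=my_credential, subscription_id=my_subscription_id)
# collect_vm_images(client, "eastus", image_publisher="canonical")
```

Passing the client in also makes the function easier to try out against a stub object without touching Azure.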
I hope this might be helpful for someone who wants to test and try the Azure SDK for Python - I certainly enjoyed looking into it. Let me just add a few bullet points:
- Above code samples are just for fun and learning and should not be used in production since there are a lot of important things missing such as error-handling and logging.
- There are surely more elegant ways of doing the same with Python.
- The Azure CLI is capable of listing all images out of the box:
az vm image list --all
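Regarding the missing error handling and logging mentioned above, a very small starting point might look like the following. It is only a sketch: list_or_empty is a hypothetical wrapper, and it catches a generic Exception by default so that the snippet runs without the SDK installed. With the Azure SDK one would more likely pass errors=(HttpResponseError,) from azure.core.exceptions.

```python
import logging

logger = logging.getLogger("vm_images")


def list_or_empty(list_call, *, errors=(Exception,), **kwargs):
    """Call a list function; log a warning and return [] if it raises."""
    try:
        return list(list_call(**kwargs))
    except errors as exc:
        logger.warning("Listing failed for %s: %s", kwargs, exc)
        return []


# Hypothetical usage inside the loops:
# offers = list_or_empty(compute_client.virtual_machine_images.list_offers,
#                        location=my_location, publisher_name=publisher.name)
```

Whether skipping a failed call is acceptable depends on the use case; for a complete inventory, re-raising or retrying may be the better choice.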
That's it for the moment - thanks for reading.
References