Learn to Add AI to your apps with Cognitive Services
Chris Noring
Posted on August 24, 2019
Follow me on Twitter, happy to take your suggestions on topics or improvements /Chris
An AI villain against humanity! As a kid growing up in the 80s, some of the coolest movies to come out were Terminator and Terminator 2, starring Arnold Schwarzenegger. ICYMI - an AI robot (Arnold) was sent back from the future to wipe out any chance of future human resistance.
Back then, a robot that could move like that felt like the distant future for us humans, until this clip hit the internet: https://www.youtube.com/watch?v=LikxFZZO2sk
It shows a robot constructed by Boston Dynamics. A lot of people choked on their coffee that day.
If that thing ever becomes smart and hostile to humans, we need to join Elon Musk's Tesla in space 😉
One really cutting-edge scene in Terminator etched itself into my mind. The Terminator enters the motorcycle bar, scans the people and objects around the room, and correctly classifies what the objects are, their color, their size and whether they are his target! https://www.youtube.com/watch?v=zzcdPA6qYAU
Back then it was amazing, science fiction at its best.
Here's the thing though, it's not science fiction anymore. So many things have happened in the area of Machine Learning. The Machine Learning industry employs an army of data scientists who construct algorithms that, given training data, are able to correctly identify what they are looking at.
A quite famous example is the pug-or-muffin training data, which gives us a peek at how these algorithms are trained on countless images like this one:
I know some of you are probably chuckling by now, thinking we don't need to worry about machines overtaking us any time soon 😉.
I mentioned it wasn't science fiction anymore and it isn't. Microsoft offers a whole suite of services called Azure Cognitive Services, centering on:
- vision, Image-processing algorithms that can identify, caption, index and moderate pictures and videos
- speech, Can convert spoken audio into text, use voice for verification or add speech recognition to your app
- language, Allow your apps to process natural language with pre-built scripts, evaluate sentiment and learn how to recognize what users want
- knowledge, Map complex information and data in order to solve tasks such as intelligent recommendations and semantic search.
- search, Enable apps and services to harness the power of a web-scale, ad-free search engine with Search. Use search services to find exactly what you're looking for across billions of web pages, images, videos and news search results
As you'll notice, were you to click on any of the above categories, each area leads to a ton of services, and they are free to try. I don't know about you, but I feel like a kid in a candy store when someone hands me a ton of APIs to use, especially if they make Machine Learning usable for me as a developer.
To stick with the narrative we started with, let's dive into the vision category, cause we want to see like a Terminator, right? ;)
Let's click on Celebrity and landmark recognition in images. Oh cool, we get a demo page where we can see the algorithms at work; try it before you buy it :)
Above we can see it requires us to input a URL for an image and it seems to respond with JSON. Ok, let's give it something easy, a picture of Abe Lincoln:
And the winner is…. Abe Lincoln. Ok, that was easy let's try something else:
I have to admit, I'm a bit nervous about this one ;). Ok, let's see the results:
Ok, it recognized Arnold Schwarzenegger from the movie Terminator 2, good. I swear if it had mentioned John Connor I would have run for the hills, just kidding :)
Using Azure Cognitive Services
To start using the Cognitive Services API we need an API key. We need to take a few steps to acquire said key, but it really isn't that much work. Cognitive Services resides on Azure. To get a free Azure account, head to this link:
Once you are signed up you can use either the Azure portal or the Azure CLI. The Azure CLI enables us to talk to Azure from the command line, which is usually way quicker, and cooler, than clicking around in a UI.
Once we have come this far there are only four steps left, so stay with me and we will soon see the world like Arnold 😃
What remains is the following:
- create a resource group, this is like a directory where you put all the things that belong together, like accounts, databases and apps; it takes only a second to create
- create a cognitive services account, that's also just a one-liner of code, creating this will give us our API key
- make a POST call to the API, it's a very simple REST API call given the API key we get from creating our cognitive services account
- parse the JSON response, we will get a JSON back and we will have a look at the different parts it gives us to see what we can show to our user
Create a resource group
First thing we will need to do is to log in to Azure using the Azure CLI. To use the Azure CLI we first need to install it. Head over to this link for installation instructions; they differ between operating systems, so make sure you pick the right one:
https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest
Let's login to Azure using the Azure CLI:
az login
This will open up a window in the browser where we login to our Azure account. Thereafter the terminal will have access to Azure.
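If you want to double-check that the login worked, a quick optional sanity check is to ask the CLI which subscription it is currently using:
# shows the subscription the CLI is currently logged in to
az account show --output table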
Let's now create the resource group:
az group create \
--name resourceforcogservices \
--location westeurope
The command here is az group create and we are giving it the following arguments:
- name, this is a name we choose
- location, we can select between a number of locations here depending on where we are in the world
For location we have chosen westeurope, cause that's where I am writing this article. So choose a region depending on where you are located. Here is the full list of supported regions:
- westus2
- southcentralus
- centralus
- eastus
- westeurope
- southeastasia
- japaneast
- brazilsouth
- australiasoutheast
- centralindia
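If you'd rather not trust a hard-coded list, the CLI can also show you the locations available to your subscription; the values in the Name column are what the --location argument expects. An optional helper:
# list the locations your subscription can deploy to
az account list-locations --output table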
Create an Azure Cognitive Services account
It's quite easy to create this account. It's done with the following command:
az cognitiveservices account create \
--kind ComputerVision \
--name ComputerVisionService \
--sku S1 \
--resource-group resourceforcogservices \
--location westeurope
Ok, our basic command is az cognitiveservices account create, and we have added some arguments to said command:
- kind, here we need to type what kind of Cognitive Services we will use, our value here needs to be ComputerVision
- name, the name is simply the name of the service, which is ComputerVisionService
- sku, means the pricing tier and is fixed for the lifetime of the service; we choose S1, which is really cheap
- resource-group, we created this one previously and, as stated before, it is like a folder under which everything that is related should be organized
- location, we keep going with westeurope here cause that's what we started with; you are welcome to continue with the location you went with
You can read more about pricing tiers here: https://docs.microsoft.com/en-us/azure/search/search-sku-tier
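If you are curious which tiers are actually on offer for the Computer Vision kind in your region, the CLI can list them as well. A small optional check (flag support may vary a bit between CLI versions):
# list the pricing tiers (SKUs) available for Computer Vision in westeurope
az cognitiveservices account list-skus \
  --kind ComputerVision \
  --location westeurope \
  --output table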
Once the Cognitive Services account is created we can have a closer look at it. The following command will show the details of our cognitive services account (the API key itself is fetched with a separate command, which we will get to in the next step):
az cognitiveservices account show \
--name ComputerVisionService \
--resource-group resourceforcogservices
Our command for inspecting the account is az cognitiveservices account show, and we need to give said command some arguments:
- name, this is the name of our service, ComputerVisionService
- resource-group, we keep using the resource group resourceforcogservices that we created initially
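One handy thing the show output does contain is the endpoint URL of the service. If you want just that value, something like the following should work (a sketch; the exact JSON path can differ between CLI versions):
# print only the endpoint URL of the account (JSON path may vary by CLI version)
az cognitiveservices account show \
  --name ComputerVisionService \
  --resource-group resourceforcogservices \
  --query properties.endpoint \
  --output tsv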
Make a POST call to the API
Now, to make the API key easy to use, we will assign it to a shell variable that we can refer to when we later make our REST call. Let's do the assignment:
key=$(az cognitiveservices account keys list \
--name ComputerVisionService \
--resource-group resourceforcogservices \
--query key1 -o tsv)
The above lists all the keys on the account, picks out the key called key1 and assigns it to the variable key. Now we are all set up and ready to make our REST call.
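If you want to make sure the assignment actually worked without printing the whole secret to your terminal, a small bash-only sanity check could look like this:
# confirm the key variable is set, showing only its length and first few characters
echo "key is ${#key} characters long and starts with ${key:0:4}..."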
Let's have a look at our API and see what the URL looks like generally:
https://[region].api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=<...>&details=<...>&language=<...>
We see that we need to replace [region] with whatever region we created our resource group and account with, in our case westeurope. Furthermore, we see the API is using a method called analyze and the parameters visualFeatures, details and language.
- details, this can have the value Landmarks or Celebrities
- visualFeatures, this is about what kind of information you want back. The Categories option will categorize the content of the images, like trees, buildings and more. Faces will identify people's faces and give you their gender and age
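Before we make the call, here is a small sketch of how the region and the parameters slot into that URL for our setup, just to make the next command easier to read (the actual call below spells the URL out in full):
# assemble the analyze URL for our region and the features we want back
region="westeurope"
analyze_url="https://${region}.api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=Categories,Description&details=Landmarks"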
Ok, let's see what the actual call looks like:
curl "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=Categories,Description&details=Landmarks" \
-H "Ocp-Apim-Subscription-Key: $key" \
-H "Content-Type: application/json" \
-d "{'url' : 'https://raw.githubusercontent.com/MicrosoftDocs/mslearn-process-images-with-the-computer-vision-service/master/images/mountains.jpg'}" \
| jq '.'
Above we call cURL and set the header Ocp-Apim-Subscription-Key to our API key, or more specifically to the variable key that contains our API key. We also create a request body with the property url, set to the image we want to analyze.
Looking at the response
Ok, we make the call, we were told there would be JSON. And there is, a whole lot of it :)
{
"categories": [{
"name": "outdoor_mountain",
"score": 0.99609375,
"detail": {
"landmarks": []
}
}],
"description": {
"tags": [
"snow",
"outdoor",
"mountain",
"nature",
"covered",
"skiing",
"man",
"flying",
"standing",
"wearing",
"side",
"air",
"slope",
"jumping",
"plane",
"red",
"hill",
"riding",
"people",
"group",
"yellow",
"board",
"doing",
"airplane"
],
"captions": [{
"text": "a snow covered mountain",
"confidence": 0.956279380622841
}]
},
"requestId": "<undisclosed>",
"metadata": {
"width": 600,
"height": 462,
"format": "Jpeg"
}
}
The score is an indication of how certain the service is of the results. With a value of 0.99609375 (max is 1.0), I would say it's pretty darn certain. The captions are the algorithm trying to give us a normal sentence describing what it sees, and it says: a snow-covered mountain. Let's see for ourselves with the URL we provided to the service call:
Yep, looks like a Mountain to me, good Skynet ;)
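By the way, if you only care about the caption rather than the whole JSON blob, jq can pick it out for you. It's the same call as before, just with a more specific filter (a small optional variation):
# same request, but only print the caption text and its confidence
curl -s "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/analyze?visualFeatures=Categories,Description&details=Landmarks" \
  -H "Ocp-Apim-Subscription-Key: $key" \
  -H "Content-Type: application/json" \
  -d "{'url' : 'https://raw.githubusercontent.com/MicrosoftDocs/mslearn-process-images-with-the-computer-vision-service/master/images/mountains.jpg'}" \
  | jq -r '.description.captions[0] | "\(.text) (confidence: \(.confidence))"'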
Summary
I've taken you through my childhood, and by now you know I'm a movie nerd and a bit of a skeptic about where all this AI and Machine Learning research is taking us, while at the same time being excited about all the cool apps I can build with Cognitive Services.
Here is also some food for thought. It's easy to joke about killer robots, especially when they come from the world of movies. With all great tech, we have a responsibility to do something useful with it, to serve mankind. Imagine algorithms like this mounted on drones or helicopters. Imagine further that a catastrophe has happened, you are looking for survivors, and you have great algorithms that can quickly help you find people. That can make a real difference and save lives.
I hope you are as excited as me and give it a try. The best way to get started is hopefully this blog post, but it's also worth checking out the LEARN platform, and especially this course. Good luck :)
If you found this article useful/hilarious/amusing/anything, please give me a clap :)