The Google Cloud Vision API allows developers to easily integrate vision detection features within applications, including image labeling, face and landmark detection, optical character recognition (OCR), and tagging of explicit content.

In this codelab you will focus on using the Vision API with C#. You will learn how to perform text detection, landmark detection, and face detection!

What you'll learn

What you'll need

Codelab-at-a-conference setup

By using a kiosk at Google I/O, a test project has been created and can be accessed by using going to: https://console.cloud.google.com/.

These temporary accounts have existing projects that are set up with billing so that there are no costs associated for you with running this codelab.

Note that all these accounts will be disabled soon after the codelab is over.

Use these credentials to log into the machine or to open a new Google Cloud Console window https://console.cloud.google.com/. Accept the new account Terms of Service and any updates to Terms of Service.

When presented with this console landing page, please select the only project available. Alternatively, from the console home page, click on "Select a Project" :

Start Cloud Shell

While Google Cloud can be operated remotely from your laptop, in this codelab you will be using Google Cloud Shell, a command line environment running in the Cloud.

Activate Cloud Shell

  1. From the Cloud Console, click Activate Cloud Shell .

If you've never started Cloud Shell before, you'll be presented with an intermediate screen (below the fold) describing what it is. If that's the case, click Continue (and you won't ever see it again). Here's what that one-time screen looks like:

It should only take a few moments to provision and connect to Cloud Shell.

This virtual machine is loaded with all the development tools you'll need. It offers a persistent 5GB home directory and runs in Google Cloud, greatly enhancing network performance and authentication. Much, if not all, of your work in this codelab can be done with simply a browser or your Chromebook.

Once connected to Cloud Shell, you should see that you are already authenticated and that the project is already set to your project ID.

  1. Run the following command in Cloud Shell to confirm that you are authenticated:
gcloud auth list

Command output

 Credentialed Accounts
ACTIVE  ACCOUNT
*       <my_account>@<my_domain.com>

To set the active account, run:
    $ gcloud config set account `ACCOUNT`
gcloud config list project

Command output

[core]
project = <PROJECT_ID>

If it is not, you can set it with this command:

gcloud config set project <PROJECT_ID>

Command output

Updated property [core/project].

Before you can begin using the Vision API you must enable the API. Using the Cloud Shell, you can enable the API by using the following command:

gcloud services enable vision.googleapis.com

In order to make requests to the Vision API, you need to use a Service Account. A Service Account belongs to your project and it is used by the Google Client C# library to make Vision API requests. Like any other user account, a service account is represented by an email address. In this section, you will use the gcloud tool to create a service account and then create credentials to authenticate as the service account.

GOOGLE_CLOUD_PROJECT environment variable should already be set to your project id in Cloud Shell. If not, make sure it is set as follows:

export GOOGLE_CLOUD_PROJECT=$(gcloud config get-value core/project)

Next, create a new service account to access the Vision API:

gcloud iam service-accounts create my-vision-sa \
  --display-name "my vision service account"

Then, create credentials that your C# code will use to log in as your new service account. Create these credentials and save it as a JSON file "~/key.json" by using the following command:

gcloud iam service-accounts keys create ~/key.json \
  --iam-account my-vision-sa@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com

Finally, set the GOOGLE_APPLICATION_CREDENTIALS environment variable. This is used by the Vision API C# library, covered in the next step, to find your credentials. The environment variable should be set to the full path of the credentials JSON file you created:

export GOOGLE_APPLICATION_CREDENTIALS=~/key.json

You can read more about authenticating the Google Cloud Vision API.

First, create a simple C# console application that you will use to run Vision API samples:

dotnet new console -n VisionApiDemo

You should see the application created and dependencies resolved:

The template "Console Application" was created successfully.
Processing post-creation actions...
...
Restore succeeded.

Next, navigate to VisionApiDemo folder:

cd VisionApiDemo/

And add Google.Cloud.Vision.V1 NuGet package to the project:

dotnet add package Google.Cloud.Vision.V1
info : Adding PackageReference for package 'Google.Cloud.Vision.V1' into project '/home/atameldev/VisionApiDemo/VisionApiDemo.csproj'.
log  : Restoring packages for /home/atameldev/VisionApiDemo/VisionApiDemo.csproj...
...
info : PackageReference for package 'Google.Cloud.Vision.V1' version '1.2.0' added to file '/home/atameldev/VisionApiDemo/VisionApiDemo.csproj'.

Now, you're ready to use Vision API!

One of the Vision API's basic features is to identify objects or entities in an image, known as label annotation. Label detection identifies general objects, locations, activities, animal species, products, and more. The Vision API takes an input image and returns the most likely labels which apply to that image. It returns the top-matching labels along with a confidence score of a match to the image.

In this example, you will perform label detection on an image of a street scene in Shanghai. Open the code editor from the top right side of the Cloud Shell:

Navigate to the Program.cs file inside the VisionApiDemo folder and replace the code with the following:

using Google.Cloud.Vision.V1;
using System;

namespace VisionApiDemo
{
    class Program
    {   
        static void Main(string[] args)
        {
            var client = ImageAnnotatorClient.Create();
            var image = Image.FromUri("gs://cloud-samples-data/vision/using_curl/shanghai.jpeg");
            var labels = client.DetectLabels(image);

            Console.WriteLine("Labels (and confidence score):");
            Console.WriteLine(new String('=', 30));

            foreach (var label in labels)
            {
                Console.WriteLine($"{label.Description} ({(int)(label.Score * 100)}%)");
            }
        }
    }
}

Take a minute or two to study the code and see how the Vision API C# library is used to perform label detection.

Back in Cloud Shell, run the app:

dotnet run 

You should see the following output:

Labels (and confidence score):
==============================
People (95%)
Street (89%)
Mode of transport (89%)
Transport (85%)
Vehicle (84%)
Snapshot (84%)
Urban area (80%)
Infrastructure (73%)
Road (72%)
Pedestrian (68%)

Summary

In this step, you were able to perform label detection on an image of a street scene in China and display the most likely labels associated with that image. Read more about Label Detection.

Vision API's Text Detection performs Optical Character Recognition. It detects and extracts text within an image with support for a broad range of languages. It also features automatic language identification.

In this example, you will perform text detection on an image of an Otter Crossing.

Navigate to the Program.cs file inside the VisionApiDemo folder and replace the code with the following:

using Google.Cloud.Vision.V1;
using System;

namespace VisionApiDemo
{
    class Program
    {   
        static void Main(string[] args)
        {
            var client = ImageAnnotatorClient.Create();
            var image = Image.FromUri("gs://cloud-vision-codelab/otter_crossing.jpg");
            var response = client.DetectText(image);
            foreach (var annotation in response)
            {
                if (annotation.Description != null)
                {
                    Console.WriteLine(annotation.Description);
                }
            }
        }
    }
}

Take a minute or two to study the code and see how the Vision API C# library is used to perform text detection.

Back in Cloud Shell, run the app:

dotnet run 

You should see the following output:

CAUTION
Otters crossing
for next 6 miles

Summary

In this step, you were able to perform text detection on an image of an Otter Crossing and print recognized text from the image. Read more about Text Detection.

Vision API's Landmark Detection detects popular natural and man-made structures within an image.

In this example, you will perform landmark detection on an image of the Eiffel Tower.

Navigate to the Program.cs file inside the VisionApiDemo folder and replace the code with the following:

using Google.Cloud.Vision.V1;
using System;

namespace VisionApiDemo
{
    class Program
    {   
        static void Main(string[] args)
        {
            var client = ImageAnnotatorClient.Create();
            var image = Image.FromUri("gs://cloud-vision-codelab/eiffel_tower.jpg");
            var response = client.DetectLandmarks(image);
            foreach (var annotation in response)
            {
                if (annotation.Description != null)
                {
                    Console.WriteLine(annotation.Description);
                }
            }
        }
    }
}

Take a minute or two to study the code and see how the Vision API C# library is used to perform landmark detection.

Back in Cloud Shell, run the app:

dotnet run

You should see the following output:

Eiffel Tower

Summary

In this step, you were able to perform landmark detection on image of the Eiffel Tower. Read more about Landmark Detection.

Face Detection detects multiple faces within an image along with the associated key facial attributes such as emotional state or wearing headwear.

In this example, you will detect the likelihood of emotional state from four different emotional likelihoods including: joy, anger, sorrow, and surprise.

Navigate to the Program.cs file inside the VisionApiDemo folder and replace the code with the following:

using Google.Cloud.Vision.V1;
using System;

namespace VisionApiDemo
{
    class Program
    {   
        static void Main(string[] args)
        {
            var client = ImageAnnotatorClient.Create();
            string[] pictures = {"face_surprise.jpg", "face_no_surprise.png"};
  
            foreach (var picture in pictures)
            {
                var image = Image.FromUri("gs://cloud-vision-codelab/" + picture);
                var response = client.DetectFaces(image);
                foreach (var annotation in response)
                {
                    Console.WriteLine($"Picture: {picture}");
                    Console.WriteLine($" Surprise: {annotation.SurpriseLikelihood}");
                }
            }
        }
    }
}

Take a minute or two to study the code and see how the Vision API C# library is used to perform emotional face detection.

Run the app;

dotnet run

You should see the following output for our face_surprise and face_no_surprise examples:

Picture: face_surprise.jpg
 Surprise: Likely
Picture: face_no_surprise.png
 Surprise: VeryUnlikely

Summary

In this step, you were able to perform emotional face detection. Read more about Face Detection.

You learned how to use the Vision API using C# to perform different detection on images!

Clean up

To avoid incurring charges to your Google Cloud Platform account for the resources used in this quickstart:

Learn More

License

This work is licensed under a Creative Commons Attribution 2.0 Generic License.