TensorFlow is an open source library for numerical computation, specializing in machine learning applications.

What you will build

In this codelab, you will learn how to run TensorFlow on a single machine, and will train a simple classifier to classify images of flowers.

Image CC-BY by Retinafunk

daisy (score = 0.99071)
sunflowers (score = 0.00595)
dandelion (score = 0.00252)
roses (score = 0.00049)
tulips (score = 0.00032)

We will be using transfer learning, which means we are starting with a model that has already been trained on another problem. We will then retrain it on a similar problem. Training a deep network from scratch can take days, but transfer learning can be done in short order.

We are going to use a model trained on the ImageNet Large Scale Visual Recognition Challenge dataset. These models can differentiate between 1,000 different classes, like Dalmatian or dishwasher. You will have a choice of model architectures, so you can determine the right tradeoff between speed, size, and accuracy for your problem.

We will use this same model, but retrain it to tell apart a small number of classes based on our own examples.

What you'll Learn

What you need

Setup Docker

Launch Docker

Launch the Docker app. A Docker icon will appear in the menu bar, at the top right.

Wait a few seconds for its cargo to finish loading.

Test your Docker installation

To test your Docker installation, try running the following command in the terminal:

docker run hello-world

This should output some text starting with:

Hello from Docker!
This message shows that your installation appears to be working correctly.
...

Run and test the TensorFlow image

Now that you've confirmed that Docker is working, test out the TensorFlow 1.7.0 image:

docker pull tensorflow/tensorflow:1.7.0
docker run -it tensorflow/tensorflow:1.7.0 bash

After the download completes, your prompt should change to root@xxxxxxx:/notebooks#.

Next, confirm that your TensorFlow installation works by invoking Python from the container's command line:

# Your prompt should be "root@xxxxxxx:/notebooks" 
python

Once you have a Python prompt, >>>, run the following code:

# python

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session() # It will print some warnings here.
print(sess.run(hello))

This should print Hello, TensorFlow! (and possibly some warnings after the tf.Session line).

Exit Docker

Now press Ctrl-d on a blank line, once to exit Python, and a second time to exit the Docker container.

Give Docker access to more CPUs

Due to a known issue, Docker may crash during this codelab if it has too few CPUs allocated. To avoid this problem, select "Preferences..." from the Docker menu, then select the "Advanced" tab, and set the "CPUs" slider to 4 or more.

Now press the button to save your changes, and wait a few seconds for the whale to reload its cargo.

Now clone the git repository:

git clone https://github.com/googlecodelabs/tensorflow-for-poets-2

Then relaunch Docker with that directory shared as your working directory, and port number 6006 published for TensorBoard:

docker run -it \
  --publish 6006:6006 \
  --volume "$(pwd)/tensorflow-for-poets-2:/tfp2" \
  --workdir /tfp2 \
  tensorflow/tensorflow:1.7.0 bash

Your prompt will change to root@xxxxxxxxx:/tfp2#

Before you start any training, you'll need a set of images to teach the model about the new classes you want to recognize. We've created an archive of Creative Commons licensed flower photos to use initially. Download the photos (218 MB) by running the following command, which pipes the download straight into tar:

curl http://download.tensorflow.org/example_images/flower_photos.tgz \
    | tar xz -C tf_files

You should now have a copy of the flower photos. Confirm the contents of your working directory by issuing the following command:

ls tf_files/flower_photos

The preceding command should display the following objects:

daisy/
dandelion/
roses/
sunflowers/
tulips/
LICENSE.txt
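
If you would rather check the download from Python, a minimal sketch such as the following counts the files in each class folder. The helper name count_images_per_class is hypothetical (it is not part of the codelab repository), and the sketch assumes the tf_files/flower_photos layout shown above.

```python
import os


def count_images_per_class(image_dir):
    """Return {class_name: number_of_files} for each subdirectory of image_dir."""
    counts = {}
    for name in sorted(os.listdir(image_dir)):
        class_dir = os.path.join(image_dir, name)
        # Only subdirectories are treated as classes; LICENSE.txt is skipped.
        if os.path.isdir(class_dir):
            counts[name] = len([
                f for f in os.listdir(class_dir)
                if os.path.isfile(os.path.join(class_dir, f))
            ])
    return counts


if __name__ == "__main__":
    photos = "tf_files/flower_photos"  # path used in this codelab
    if os.path.isdir(photos):
        for label, n in count_images_per_class(photos).items():
            print("{}: {} files".format(label, n))
```

Each class should report a few hundred files; a class with zero files usually means the archive did not extract completely.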

Configure your MobileNet

In this exercise, we will retrain a MobileNet. MobileNet is a small, efficient convolutional neural network. "Convolutional" just means that the same calculations are performed at each location in the image.
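
To make "the same calculations at each location" concrete, here is a minimal pure-Python sketch that slides one small filter over a grayscale image stored as a list of lists. It is an illustration only: the function name convolve2d and the example filter are assumptions, and MobileNet itself uses many learned filters in a more efficient depthwise-separable arrangement.

```python
def convolve2d(image, kernel):
    """Apply `kernel` at every position where it fully fits inside `image`.

    As in most neural-network libraries, the kernel is not flipped.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # The same weighted sum is computed at each location (r, c).
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A 2x2 filter that responds to a left-to-right increase in brightness.
edge_filter = [[-1, 1],
               [-1, 1]]
# A tiny image: dark on the left, bright on the right.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
print(convolve2d(image, edge_filter))
```

The output is largest where the dark-to-bright edge sits and zero elsewhere, which is how a single filter "detects" one kind of local pattern anywhere in the image.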

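The MobileNet is configurable in two ways: the input image resolution (128, 160, 192, or 224px) and the relative size of the model as a fraction of the largest MobileNet (1.0, 0.75, 0.50, or 0.25).

We will use an image resolution of 224 and a relative model size of 0.5 for this codelab.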

With the recommended settings, it typically takes only a couple of minutes to retrain on a laptop. You will pass the settings inside Linux shell variables. Set those variables in your shell:

IMAGE_SIZE=224
ARCHITECTURE="mobilenet_0.50_${IMAGE_SIZE}"
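
The ARCHITECTURE value is just the two settings joined into one string. As a sketch, the hypothetical helper below builds the same string in Python. The allowed values (relative sizes 1.0, 0.75, 0.50, 0.25 and resolutions 128, 160, 192, 224) are an assumption about what the retraining script accepts; its help output is the authoritative list.

```python
# Assumed option sets; check `python -m scripts.retrain -h` to confirm.
ALLOWED_SIZES = ("1.0", "0.75", "0.50", "0.25")
ALLOWED_RESOLUTIONS = (128, 160, 192, 224)


def mobilenet_architecture(relative_size, image_size):
    """Build a string such as 'mobilenet_0.50_224' from the two settings."""
    if relative_size not in ALLOWED_SIZES:
        raise ValueError("unsupported relative size: {!r}".format(relative_size))
    if image_size not in ALLOWED_RESOLUTIONS:
        raise ValueError("unsupported image size: {!r}".format(image_size))
    return "mobilenet_{}_{}".format(relative_size, image_size)


print(mobilenet_architecture("0.50", 224))
```

The relative size is passed as a string so that "0.50" keeps its trailing zero, matching the shell variable above.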

More about MobileNet performance (optional)

The graph below shows the first-choice-accuracies of these configurations (y-axis), vs the number of calculations required (x-axis), and the size of the model (circle area).

16 points are shown for MobileNet. For each of the 4 model sizes (circle area in the figure) there is one point for each image resolution setting. The 128px image size models are represented by the lower-left point in each set, while the 224px models are in the upper right.

Other notable architectures are also included for reference. "GoogleNet" in this figure is "Inception V1" in this table. An extended version of this figure is available in slides 84-89 here.

Start TensorBoard

Before starting the training, launch TensorBoard in the background. TensorBoard is a monitoring and inspection tool included with TensorFlow. You will use it to monitor the training progress.

tensorboard --logdir tf_files/training_summaries &

Investigate the retraining script

The retrain script is from the TensorFlow Hub repo, but it is not installed as part of the pip package. So for simplicity I've included it in the codelab repository. You can run the script using the python command. Take a minute to skim its "help".

python -m scripts.retrain -h

Run the training

As noted in the introduction, ImageNet models are networks with millions of parameters that can differentiate a large number of classes. We're only training the final layer of that network, so training will end in a reasonable amount of time.

Start your retraining with one big command (note the --summaries_dir option, which sends training progress reports to the directory that TensorBoard is monitoring):

python -m scripts.retrain \
  --bottleneck_dir=tf_files/bottlenecks \
  --how_many_training_steps=500 \
  --model_dir=tf_files/models/ \
  --summaries_dir=tf_files/training_summaries/"${ARCHITECTURE}" \
  --output_graph=tf_files/retrained_graph.pb \
  --output_labels=tf_files/retrained_labels.txt \
  --architecture="${ARCHITECTURE}" \
  --image_dir=tf_files/flower_photos

Note that this step will take a while.

This script downloads the pre-trained model, adds a new final layer, and trains that layer on the flower photos you've downloaded.

More about Bottlenecks (Optional)

This section and the next provide background on how this retraining process works.

The first phase analyzes all the images on disk and calculates the bottleneck values for each of them. What's a bottleneck?

These ImageNet models are made up of many layers stacked on top of each other. A simplified picture of Inception V3 from TensorBoard is shown above (all the details are available in this paper, with a complete picture on page 6). These layers are pre-trained and are already very good at finding and summarizing information that will help classify most images. For this codelab, you are training only the last layer (final_training_ops in the figure below), while all the previous layers retain their already-trained state.

In the above figure, the node labeled "softmax", on the left side, is the output layer of the original model, while all the nodes to the right of the "softmax" were added by the retraining script.

A bottleneck is an informal term we often use for the layer just before the final output layer that actually does the classification. "Bottleneck" is not used to imply that the layer is slowing down the network. We use the term bottleneck because near the output, the representation is much more compact than in the main body of the network.

Every image is reused multiple times during training. Calculating the layers behind the bottleneck for each image takes a significant amount of time. Since these lower layers of the network are not being modified, their outputs can be cached and reused.

So the script is running the constant part of the network, everything below the node labeled Bottlene... above, and caching the results.

The command you ran saves these files to the tf_files/bottlenecks/ directory (set by the --bottleneck_dir option). If you rerun the script, they'll be reused, so you don't have to wait for this part again.
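
The caching idea can be sketched in a few lines of Python. This is not the retraining script's code: get_or_create_bottleneck, the one-text-file-per-image layout, and the compute_fn callback (standing in for running the frozen layers on an image) are all assumptions for illustration.

```python
import os


def get_or_create_bottleneck(cache_dir, image_path, compute_fn):
    """Return the bottleneck values for image_path, computing them at most once."""
    # Keep the class folder in the cache path so equal file names do not clash.
    label = os.path.basename(os.path.dirname(image_path))
    label_dir = os.path.join(cache_dir, label)
    if not os.path.isdir(label_dir):
        os.makedirs(label_dir)
    cache_path = os.path.join(label_dir, os.path.basename(image_path) + ".txt")

    if os.path.exists(cache_path):
        # Cache hit: read the stored values instead of re-running the network.
        with open(cache_path) as f:
            return [float(x) for x in f.read().split(",")]

    # Cache miss: run the expensive, unchanging part of the network once.
    values = [float(v) for v in compute_fn(image_path)]
    with open(cache_path, "w") as f:
        f.write(",".join(repr(v) for v in values))
    return values
```

Because the lower layers never change during retraining, the cached values stay valid for as long as the same architecture is used; switching to a different architecture would require a fresh cache.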

Once the script finishes generating all the bottleneck files, the actual training of the final layer of the network begins.

By default, this script runs 4,000 training steps; the command above lowers this to 500 with the --how_many_training_steps option. Each step chooses 10 images at random from the training set, finds their bottlenecks from the cache, and feeds them into the final layer to get predictions. Those predictions are then compared against the actual labels, and the results of this comparison are used to update the final layer's weights through a backpropagation process.

As it trains, you'll see a series of step outputs, each one showing training accuracy, validation accuracy, and the cross entropy.
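
The training step described above can be sketched in plain Python. This is a simplified stand-in, not the retraining script: the function names, the learning rate, and the tiny two-dimensional "bottlenecks" are assumptions chosen so the example runs in a fraction of a second.

```python
import math
import random


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, biases, features):
    """Final layer: one weighted sum per class, followed by softmax."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


def train_final_layer(bottlenecks, labels, num_classes,
                      steps=200, batch_size=10, learning_rate=0.5, seed=0):
    """Train only the final layer on cached bottleneck vectors."""
    rng = random.Random(seed)
    num_features = len(bottlenecks[0])
    weights = [[0.0] * num_features for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(steps):
        # Each step picks a random batch of cached bottlenecks.
        batch = [rng.randrange(len(bottlenecks)) for _ in range(batch_size)]
        grad_w = [[0.0] * num_features for _ in range(num_classes)]
        grad_b = [0.0] * num_classes
        for idx in batch:
            probs = predict(weights, biases, bottlenecks[idx])
            for c in range(num_classes):
                # Cross-entropy gradient w.r.t. each logit: prediction - target.
                error = probs[c] - (1.0 if labels[idx] == c else 0.0)
                grad_b[c] += error
                for j in range(num_features):
                    grad_w[c][j] += error * bottlenecks[idx][j]
        # Update the final layer with the batch-averaged gradient.
        for c in range(num_classes):
            biases[c] -= learning_rate * grad_b[c] / batch_size
            for j in range(num_features):
                weights[c][j] -= learning_rate * grad_w[c][j] / batch_size
    return weights, biases


# Toy "bottlenecks": two well-separated classes in a 2-D feature space.
toy_bottlenecks = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = [0, 0, 1, 1]
w, b = train_final_layer(toy_bottlenecks, toy_labels, num_classes=2)
probs = predict(w, b, [1.0, 0.0])
print("predicted class:", probs.index(max(probs)))
```

Only weights and biases change here; the bottleneck vectors are treated as fixed inputs, which mirrors why the earlier layers can be computed once and cached.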

The figures below show an example of the progress of the model's accuracy and cross entropy as it trains. If your model has finished generating the bottleneck files you can check your model's progress by opening TensorBoard, and clicking on the figure's name to show them. Ignore any warnings that TensorBoard prints to your command line.

The first figure shows accuracy (y-axis) as a function of training progress (x-axis):

Two lines are shown. The orange line shows the accuracy of the model on the training data, while the blue line shows the accuracy on the validation set (which was not used for training). The validation accuracy is a much better measure of the true performance of the network. If the training accuracy continues to rise while the validation accuracy decreases, then the model is said to be "overfitting". Overfitting is when the model begins to memorize the training set instead of understanding general patterns in the data.
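
The divergence described above can also be checked numerically. As a rough sketch, the hypothetical helper below flags possible overfitting when, over the last few recorded measurements, training accuracy went up while validation accuracy went down. The window size and the accuracy histories are illustrative assumptions, not output from the retraining script.

```python
def looks_overfit(train_acc, val_acc, window=3):
    """Return True if training accuracy rose while validation accuracy fell
    between the first and last of the most recent `window` measurements."""
    if len(train_acc) < window or len(val_acc) < window:
        return False  # not enough history to judge
    train_rising = train_acc[-1] > train_acc[-window]
    val_falling = val_acc[-1] < val_acc[-window]
    return train_rising and val_falling


# Made-up accuracy histories for illustration.
healthy = looks_overfit([0.70, 0.80, 0.88], [0.68, 0.77, 0.85])
diverging = looks_overfit([0.90, 0.95, 0.99], [0.88, 0.86, 0.83])
print("healthy run flagged: {}".format(healthy))
print("diverging run flagged: {}".format(diverging))
```

Accuracy values fluctuate from step to step, so a check like this is only a hint; the TensorBoard curves remain the better way to judge the trend.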

As the process continues, you should see the reported accuracy improve. After all the training steps are complete, the script runs a final test accuracy evaluation on a set of images that are kept separate from the training and validation pictures. This test evaluation provides the best estimate of how the trained model will perform on the classification task.

You should see an accuracy value of between 85% and 99%, though the exact value will vary from run to run since there's randomness in the training process. (If you are only training on two classes, you should expect higher accuracy.) This value indicates the percentage of the images in the test set that are given the correct label after the model is fully trained.

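The retraining script writes data to two files, set by the --output_graph and --output_labels options in the command above: tf_files/retrained_graph.pb, which contains the retrained model, and tf_files/retrained_labels.txt, a text file that lists the labels.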

Classifying an image

The codelab repo also contains a copy of TensorFlow's label_image.py example, which you can use to test your network. Take a minute to read the help for this script:

python -m scripts.label_image -h

Now, let's run the script on this image of a daisy:

flower_photos/daisy/21652746_cc379e0eea_m.jpg

Image CC-BY by Retinafunk

python -m scripts.label_image \
    --graph=tf_files/retrained_graph.pb  \
    --image=tf_files/flower_photos/daisy/21652746_cc379e0eea_m.jpg

Each execution will print a list of flower labels, in most cases with the correct flower on top (though each retrained model may be slightly different).
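
The ranked list is the model's per-class scores sorted from highest to lowest. Here is a minimal sketch of that last step, using the example scores shown at the start of this codelab rather than real model output; the function name format_ranked_scores is hypothetical.

```python
def format_ranked_scores(labels, scores):
    """Pair each label with its score and list them from most to least likely."""
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ["{} (score = {:.5f})".format(label, score) for label, score in ranked]


# Example values copied from the sample output in the introduction.
labels = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]
scores = [0.99071, 0.00252, 0.00049, 0.00595, 0.00032]
for line in format_ranked_scores(labels, scores):
    print(line)
```

The scores come from a softmax layer, so they sum to roughly 1 and can be read as the model's relative confidence in each class.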

Congratulations, you've taken your first steps into a larger world of deep learning!

You can see more about using TensorFlow at the TensorFlow website or the TensorFlow GitHub project. There are lots of other resources available for TensorFlow, including a discussion group and whitepaper.

If you're interested in running TensorFlow on mobile devices, try the second part of this tutorial. There are three versions:

  1. TFLite Android
  2. TFLite iOS
  3. TFMobile Android

Or just go have some fun in the TensorFlow Playground!

This codelab is based on Pete Warden's TensorFlow for Poets blog post and this retraining tutorial.