Retraining Inception


By Hannah Deen

What is retraining?

We can retrain the Inception v3 model by keeping all the trained layers intact bar the last one. The early layers are responsible for things like picking out edges and basic shapes. These are necessary, universal steps for CNNs used in image classification, and they take a long time and a lot of images to train. The final layer of the neural net is where you can train on the high-level categories relevant to you – like dogs vs. cats, or Javier Bardem vs. Jeffrey Dean Morgan.
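Conceptually, "keeping the other layers intact" just means the optimiser only ever sees the final layer's variables. A minimal TensorFlow sketch of the idea (the names and sizes here are illustrative, not the codelab's actual code):

import tensorflow as tf

# Illustrative only: a frozen feature extractor feeding one trainable
# softmax layer. 2048 is the size of Inception v3's penultimate layer.
bottlenecks = tf.placeholder(tf.float32, [None, 2048])
labels = tf.placeholder(tf.float32, [None, 3])

final_weights = tf.Variable(tf.truncated_normal([2048, 3], stddev=0.001))
final_biases = tf.Variable(tf.zeros([3]))
logits = tf.matmul(bottlenecks, final_weights) + final_biases

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))

# Only the final layer's variables go in var_list, so back-propagation
# never touches the pre-trained weights.
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(
    cross_entropy, var_list=[final_weights, final_biases])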

We used Google’s TensorFlow for Poets codelab as our guide.

tensorflow for poets steps

one: docker

Have Docker installed.
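If in doubt, a quick check from the terminal tells you whether it's ready (any reasonably recent version should be fine):

$ docker --version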


two: the image set

Have a dataset to retrain on. The codelab provides a dataset of different flowers if you just want to try it out but don’t have a specific idea. If you do, organise your image files so that the images belonging to each category sit in a folder named after that category. Like this:


Training set format
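Using the categories from our project (introduced just below), the layout ends up something like this; the exact folder and file names are just illustrative:

tf_files/
└── fire_smoke_photos/
    ├── fire/
    │   ├── fire-001.jpg
    │   └── ...
    ├── smoke/
    │   ├── smoke-001.jpg
    │   └── ...
    └── misc/
        ├── misc-001.jpg
        └── ...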


In the spirit of real-life, helpful applications, we decided it would be really cool if we could detect fire and smoke in images. Having a working model run on frames of CCTV footage could be a useful, perhaps life-saving, safety aid. We used ImageNet to get our dataset. The image set for ‘fire’ unsurprisingly contained a lot of fireplaces, and the ‘smoke’ set contained a lot of tanks. I suppose combat is a smoky affair… Of course we wanted to avoid the model deciding that any image containing a tank should score high for ‘smoke’, and any image with an empty fireplace should score high for ‘fire’. To try to counteract this, we added a third, miscellaneous category and included lots of images of empty fireplaces and non-smoking tanks to balance out the overall image set.


three: link image set to docker

In order for the retraining script running inside Docker to use the image files, we need to give the Docker container access to the relevant folder, in this case tf_files, by mounting it as a volume:

`$ docker run -it -v $HOME/tf_files:/tf_files gcr.io/tensorflow/tensorflow:latest-devel`
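Once inside the container, it's worth a quick sanity check (our habit, not a codelab step) that the folder really is mounted:

# In Docker
$ ls /tf_files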

four: get inception code

Retrieve the code for retraining Inception v3. The TensorFlow repository is already checked out at /tensorflow inside the codelab’s Docker image, so you just pull the latest version:

# In Docker
$ cd /tensorflow
$ git pull

five: retrain the model

# In Docker
python tensorflow/examples/image_retraining/retrain.py \
--bottleneck_dir=/tf_files/bottlenecks \
--model_dir=/tf_files/inception \
--output_graph=/tf_files/retrained_graph.pb \
--output_labels=/tf_files/retrained_labels.txt \
--image_dir /tf_files/<imageset folder>

This will set off the process. It’s that simple! The retraining itself takes a little while, maybe half an hour. It first generates ‘bottlenecks’. ‘Bottleneck’ is the term for the layer just before the final output layer, the one we retrain to do the classification. The values each image produces at that layer never change, so the script caches them in the folder given by --bottleneck_dir (here /tf_files/bottlenecks); if you ever need to retrain, those values don’t have to be calculated again.


By default, the script runs 4000 training steps, but this can be adjusted. At each step, 10 images are picked at random from the training set and fed through the final prediction layer. The predictions are checked against the actual labels, and the final layer’s weights are then updated accordingly through back-propagation. It’s quite fun to watch the accuracy scores change as they are printed during training (if you’re nerdy like us). There are two types of accuracy: the training accuracy and the validation accuracy. The validation accuracy is the important one, as it’s measured on labeled images that the model hasn’t been trained on. The training accuracy matters too, but a training accuracy much higher than the validation accuracy suggests the model is overfitting. Both should be high before the model can be considered reliable.
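The number of steps is controlled by a flag on retrain.py, so a shorter experimental run looks something like this (the other flags are the same as above, shortened here to “...”):

# In Docker
python tensorflow/examples/image_retraining/retrain.py \
--how_many_training_steps=500 \
--bottleneck_dir=/tf_files/bottlenecks \
...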


The training ran for 4000 steps, completing with a final test accuracy of 94.4%. We wanted to test it on some new images, so we downloaded a selection to run through the model and see what results it would give. Google’s codelab provides you with a little Python script that you can just `curl` neatly into a file (we’ve called it `label_image.py`, as the codelab does) and store in the tf_files folder.

import tensorflow as tf, sys

# change this as you see fit
image_path = sys.argv[1]

# Read in the image_data
image_data = tf.gfile.FastGFile(image_path, 'rb').read()

# Loads label file, strips off carriage return
label_lines = [line.rstrip() for line
               in tf.gfile.GFile("tf_files/retrained_labels.txt")]

# Unpersists graph from file
with tf.gfile.FastGFile("tf_files/retrained_graph.pb", 'rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())
    _ = tf.import_graph_def(graph_def, name='')

with tf.Session() as sess:
    # Feed the image_data as input to the graph and get first prediction
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')

    predictions = sess.run(softmax_tensor,
                           {'DecodeJpeg/contents:0': image_data})

    # Sort to show labels of first prediction in order of confidence
    top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]

    for node_id in top_k:
        human_string = label_lines[node_id]
        score = predictions[0][node_id]
        print('%s (score = %.5f)' % (human_string, score))

This is where you get the script:

# type 'exit' to exit Docker and then:
$ curl -L <script URL given in the codelab> > $HOME/tf_files/label_image.py


We created a folder in tf_files called test_images, where we could store any image that we wanted to run through the model. Then, after restarting Docker linked to tf_files,

$ docker run -it -v $HOME/tf_files:/tf_files gcr.io/tensorflow/tensorflow:latest-devel

and running the following command:

 # In Docker
$ python /tf_files/label_image.py /tf_files/test_images/Campfire.jpg

we could see how it did. Every time an image is passed through, the model returns a probability score for each category. So this calming beach campfire image:


Beach campfire

returned the scores:

fire (score = 0.86540)
smoke (score = 0.12902)
misc (score = 0.00558)

whereas a burning cigarette:


Burning cigarette

returned this:

smoke (score = 0.79379)
fire (score = 0.15999)
misc (score = 0.04622)

and a happy raccoon:


Happy raccoon

was not much of any of them (although that is some smoky fur…):

smoke (score = 0.48583)
misc (score = 0.28314)
fire (score = 0.23103)

We labeled a whole bunch of images to test the effectiveness of the model in different areas, and tweaked the code to run for all the images in the `test_images` folder by adding these lines inside the tf.Session() block (this also needs `import os` at the top; a fuller sketch follows below):

for filename in os.listdir(sys.argv[1]):
    if filename.endswith(".jpg") or filename.endswith(".jpeg"):
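Roughly, the end of the script then becomes something like the sketch below. This is our adaptation rather than codelab code, and it assumes `sys.argv[1]` is now the folder path rather than a single image:

import os

with tf.Session() as sess:
    softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')

    # sys.argv[1] is now a folder; score every jpeg inside it
    for filename in os.listdir(sys.argv[1]):
        if filename.endswith(".jpg") or filename.endswith(".jpeg"):
            image_data = tf.gfile.FastGFile(
                os.path.join(sys.argv[1], filename), 'rb').read()
            predictions = sess.run(softmax_tensor,
                                   {'DecodeJpeg/contents:0': image_data})

            print(filename)
            top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
            for node_id in top_k:
                print('%s (score = %.5f)' %
                      (label_lines[node_id], predictions[0][node_id]))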

The code worked great, but it also revealed some problems with the model. Images that contained a lot of white or light grey scored highly for ‘smoke’, and despite the ‘fire’ dataset containing the most images (almost 2000), fire was unlikely to be labeled correctly. Perhaps this was because the majority of fire images from ImageNet were close-ups of fires in hearths, so images containing fire like this one…


Burning car


…got a 60% probability that it contained ‘smoke’ and only a 35% probability that it contained ‘fire’. So it’s not an out-of-the-box solution, but with a bit of tweaking we can customise an existing machine learning model to make it relevant to us.