Using an Object Detection Model with a Classifier


(Screenshot: alwaysAI team)

Why use an object detection model with a classification model?

There are many situations where it is helpful to add a classification layer to an application that uses object detection. For instance, if you already have an app that detects people, you could add a model that classifies the gender of each detected individual. Or, as I will show in this tutorial, you could classify each detected individual by age range. Both applications would be useful in any situation where you want to track demographics.

In this tutorial I will demonstrate how, with just a few lines of code, you can add a classification model to a starter app from alwaysAI that already uses a detection model. All of the finished code from this tutorial is available on GitHub.

Set up

To complete this tutorial, you will need:

  1. An alwaysAI account (it’s free!)
  2. alwaysAI set up on your machine
  3. A text editor such as Sublime Text or an IDE such as PyCharm, both of which offer free versions, or whatever else you prefer to code in

Please see the alwaysAI blog for more background on computer vision, developing models, how to change models, and more.

Getting started

After you have your account and have set up your developer environment, you need to download the starter apps; do so by using this link before proceeding with the rest of the tutorial. 

With the starter applications downloaded, you can begin to modify an existing starter app to use both an object detection model and a classification model. The app modified for this tutorial was the object detector app, so cd into the starter apps folder and then into the realtime_object_detector folder:

$ cd ./alwaysai-starter-apps/realtime_object_detector

The object detection model used by default in the realtime_object_detector starter app is alwaysai/mobilenet_ssd, but we are going to change this to the detection model alwaysai/res10_300x300_ssd_iter_140000, which identifies human faces; since we will be classifying by age, this is a well-suited detection model. The classification model, alwaysai/agenet, will classify the people detected by alwaysai/res10_300x300_ssd_iter_140000 into the following age ranges: 0-2, 4-6, 8-12, 15-20, 25-32, 38-43, 48-53, and 60-100 years old.
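Because agenet is a classifier, its output is a ranked list of these age-range labels with confidences, and the code later in this tutorial always reads the top entry. As a quick standalone illustration (plain Python with dummy (label, confidence) pairs, not the edgeIQ API), picking the top label looks like this:

```python
# Dummy (label, confidence) pairs standing in for agenet's output.
# In edgeIQ, predictions arrive already ranked, so index [0] is the top result.
def top_age_range(predictions):
    """Return the highest-confidence age-range label, or None if empty."""
    if not predictions:
        return None
    return max(predictions, key=lambda p: p[1])[0]

preds = [("25-32", 0.81), ("15-20", 0.12), ("38-43", 0.05)]
print(top_age_range(preds))  # -> 25-32
```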

Since the app now uses two new models, we need to add both of them to the app environment. Make sure you are in the folder for the app being developed and then run these commands in your terminal:

$ aai app models add alwaysai/agenet

$ aai app models add alwaysai/res10_300x300_ssd_iter_140000

Now, since we are no longer using the default object detection model that was originally used with the realtime_object_detector, we are going to remove the model alwaysai/mobilenet_ssd from the app environment to reduce the overall app size. Do this by typing the following command into the command line:

$ aai app models remove alwaysai/mobilenet_ssd

NOTE: You can easily change any model by following the ‘Changing the Computer Vision Model’ documentation.

Now that you’ve set up your app environment, you can begin to modify the starter app code. We’ll do this in the following steps:

  1. First, alter the code that sets up the obj_detect variable on line #16 of the original code. 
    • 1a. We’ll replace the name with something more appropriate for our new app: facial_detector. Do this for all instances of obj_detect (in PyCharm you can highlight obj_detect and click Refactor → Rename in the toolbar). Check that the print statements and other instances were properly altered.
    • 1b. Change alwaysai/mobilenet_ssd to alwaysai/res10_300x300_ssd_iter_140000 on line #18. The code should now look like this: 
      facial_detector = edgeiq.ObjectDetection("alwaysai/res10_300x300_ssd_iter_140000")
  2. Now, we’re going to add a classifier object. We’ll use the same format as we did for the facial_detector, but instead of ObjectDetection as the class, we’ll instantiate a Classification object (this can also be seen in the image_classifier starter app; you can always check out the starter apps for inspiration!).
    • 2a. Add the following line of code after the facial_detector setup but before the print statements: 
      classifier = edgeiq.Classification("alwaysai/agenet") 
    • 2b. Add print statements to display the engine, accelerator, and model of the classifier to the terminal. You can copy the print statements for facial_detector and change facial_detector to classifier. At this stage, the first part of main should look like this:
      def main():
          # first make a detector to detect facial objects
          facial_detector = edgeiq.ObjectDetection(
                  "alwaysai/res10_300x300_ssd_iter_140000")
          facial_detector.load(engine=edgeiq.Engine.DNN)

          # then make a classifier to classify the age of the image
          classifier = edgeiq.Classification("alwaysai/agenet")
          classifier.load(engine=edgeiq.Engine.DNN)

          # descriptions printed to console
          print("Engine: {}".format(facial_detector.engine))
          print("Accelerator: {}".format(facial_detector.accelerator))
          print("Model: {}\n".format(facial_detector.model_id))

          print("Engine: {}".format(classifier.engine))
          print("Accelerator: {}".format(classifier.accelerator))
          print("Model: {}\n".format(classifier.model_id))

          fps = edgeiq.FPS()


  3. We’re going to add a counting variable to track the faces we detect. In the final code, I also removed some labels in order to make the markup less cluttered; see the code comments for these changes. Inside the while loop, before anything else, add the following:

    • 3a. Create a counting variable (I called mine ‘count’) and initialize it to 1.
    • 3b. Create a loop that alters the labels for each prediction, using the ‘count’ variable to track which face is which:
      for p in results.predictions:
          p.label = "Face " + str(count)
          count = count + 1
    • 3c. Alter the original frame markup code to show labels but not confidences, and modify the label text to display just the face number:
      frame = edgeiq.markup_image(frame, results.predictions, show_labels=True, show_confidences=False)
    • 3d. Modify the description of what is appended to the text field. Change the line that appends ‘Objects:’ to the ‘text’ variable to instead append ‘Faces:’:
      text.append("Faces:")

  4. Now, we just need to get the results from the classifier. We'll do this in the following steps:

    • 4a. Within the ‘while’ loop but just before the ‘for’ loop, create a variable to track the faces, much like we did with ‘count’, and set it to 1. I named mine ‘age_label’.
    • 4b. Now, inside the for loop, underneath ‘for prediction in results.predictions:’, append a label along with some text identifying each face, and then increment the variable:
      text.append("Face {} ".format(age_label))
      age_label = age_label + 1
    • 4c. We need to trim out each face from the prediction so that the classifier will work properly. The code to do this is:
      face_image = edgeiq.cutout_image(frame, prediction.box)
    • 4d. Create a variable ‘age_results’ and store the classification results from the face_image in this variable using the following code:
      age_results = classifier.classify_image(face_image)
    • 4e. Check if there were actually results from the classification model. Use an if/else statement, and append the results to the output text sent to the streamer if so. Add the following text for the ‘if’ part underneath your ‘age_results’ initialization:
      if age_results.predictions:
          text.append("Label: {}, {:.2f}".format(
                  age_results.predictions[0].label,
                  age_results.predictions[0].confidence))
    • 4f. If there are no results, we will display this fact on the output stream. Finish the ‘else’ part of the conditional with:
      else:
          text.append("No age prediction")

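Before looking at the full listing, note that the face numbering in steps 3a/3b is plain Python; here is a self-contained sketch (using a minimal stand-in class instead of edgeIQ's prediction objects, for illustration only) showing exactly what the relabeling loop produces:

```python
class FakePrediction:
    """Minimal stand-in for an edgeIQ prediction (illustration only)."""
    def __init__(self):
        self.label = "face"  # the raw label the detector would assign

predictions = [FakePrediction(), FakePrediction(), FakePrediction()]

# Steps 3a/3b: number each detected face, starting from 1
count = 1
for p in predictions:
    p.label = "Face " + str(count)
    count = count + 1

print([p.label for p in predictions])  # -> ['Face 1', 'Face 2', 'Face 3']
```

The same loop could also be written with `enumerate(predictions, start=1)`, which avoids managing `count` by hand.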
The final while loop code will look like this:

# loop detection
while True:

    # Step 3a: track how many faces are detected in a frame
    count = 1

    # read in the video stream
    frame = video_stream.read()

    # detect human faces
    results = facial_detector.detect_objects(
            frame, confidence_level=.5)

    # Step 3b: alter the labels to show which face was detected
    for p in results.predictions:
        p.label = "Face " + str(count)
        count = count + 1

    # Step 3c: alter the original frame markup to just show labels
    frame = edgeiq.markup_image(
            frame, results.predictions,
            show_labels=True, show_confidences=False)

    # generate labels to display the face detections on the streamer
    text = ["Model: {}".format(facial_detector.model_id)]
    text.append("Inference time: {:1.3f} s".format(results.duration))

    # Step 3d: append 'Faces:' instead of 'Objects:'
    text.append("Faces:")

    # Step 4a: add a counter for the face detection label
    age_label = 1

    # append each prediction to the text output
    for prediction in results.predictions:

        # Step 4b: append labels for face detection & classification
        text.append("Face {} ".format(age_label))
        age_label = age_label + 1

        ## to show confidence, use the following instead of the above:
        # text.append("Face {}: detected with {:2.2f}% confidence".format(
        #         count, prediction.confidence * 100))

        # Step 4c: cut out the face and use it for the classification
        face_image = edgeiq.cutout_image(frame, prediction.box)

        # Step 4d: attempt to classify the image in terms of age
        age_results = classifier.classify_image(face_image)

        # Step 4e: if there are predictions for age classification,
        # generate these labels for the output stream
        if age_results.predictions:
            text.append("is {}".format(age_results.predictions[0].label))
        else:
            text.append("No age prediction")

        ## to append classification confidence, use the following
        ## instead of the above if/else:
        # if age_results.predictions:
        #     text.append("age: {}, confidence: {:.2f}\n".format(
        #             age_results.predictions[0].label,
        #             age_results.predictions[0].confidence))
        # else:
        #     text.append("No age prediction")

    # send the image frame and the predictions to the output stream
    streamer.send_data(frame, text)

    fps.update()

    if streamer.check_exit():
        break
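The confidence strings in the commented-out alternatives are ordinary Python format specifiers; a quick standalone check (with dummy values in place of real model output) shows what they produce:

```python
# Dummy values standing in for a detection confidence and a
# classification result; only the string formatting is being demonstrated.
det_conf = 0.876
age_label, age_conf = "25-32", 0.62

detection_line = "Face {}: detected with {:2.2f}% confidence".format(
        1, det_conf * 100)
age_line = "age: {}, confidence: {:.2f}".format(age_label, age_conf)

print(detection_line)  # -> Face 1: detected with 87.60% confidence
print(age_line)        # -> age: 25-32, confidence: 0.62
```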

That’s it! Now you can build and start your app to see it in action. You may need to configure the app first, especially if you changed edge devices or created a new folder from scratch. Do this with the following command and enter the desired configuration input when prompted:

$ aai app configure

Now, to see your app in action, first build the app by typing into the command line:

$ aai app deploy

Once it's done building, type the following command to start the app:

$ aai app start

Now open any browser to localhost:5000 and you should see the output illustrated at the beginning of the article!

Get started now

We provide professional developers with a simple and easy-to-use platform to build and deploy computer vision applications on embedded devices. The alwaysAI beta program is open. Create your account now for free!
