Using a Computer Vision Classifier to Sort Images


If you have a large set of images that you’d like to sort based on the presence of particular things (like people, cars, buildings, etc.), a computer vision classifier can make this a quick and simple task.

Today I’m going to show you how to set up a Python app that detects animals in a set of images and then moves the images containing animals to an appropriate folder.

All code for this can be found on GitHub here.

 

Impetus

A doctor I met recently, who works with an ecological reserve in Murrieta, is trying to catalog all the animals that live on the reserve. He has access to thousands of images from trail cams, but roughly 80% of them are false positives. These need to be sorted out before he can start training a model on the images that actually contain wildlife.

 

Requirements

  • A free alwaysAI account and the alwaysAI CLI installed
  • Docker, which the alwaysAI CLI uses to build and run the app
  • A set of images you’d like to sort

Workflow

When complete, our application will:

  1. Loop through all images in the source_images/ folder
  2. Run a classifier on each image
  3. Move the source image to the correct output_images/ subfolder

 

Setup

Find a location on your computer and create a folder called classifier-image-sorter. In that folder, create:

  • An output_images/ folder where the sorted images will go. In this folder, create an animals/ folder and a no_animals/ folder
  • A source_images/ folder where all the starting images to be sorted will be placed
  • An app.py file that will contain the main application code
  • A JSON configuration file named config.json
  • A JSON configuration file named alwaysai.app.json
  • A Dockerfile, which the alwaysAI CLI will use to build a Docker container for the app

If you’re not using an IDE to build this app, you can use the following CLI/Terminal commands to create the above files and folders:

mkdir classifier-image-sorter && cd classifier-image-sorter
mkdir output_images
mkdir output_images/animals
mkdir output_images/no_animals
mkdir source_images
touch app.py
touch config.json
touch alwaysai.app.json
touch Dockerfile

NOTE: You can also use alwaysAI’s CLI to auto-generate some of the above files; see the docs for more information.

When complete, your project folder structure should look like this:

classifier-image-sorter/
├── alwaysai.app.json
├── app.py
├── config.json
├── Dockerfile
├── output_images/
│   ├── animals/
│   └── no_animals/
└── source_images/

Put the following line into the Dockerfile; the alwaysAI CLI uses it to determine which version of the Python edgeIQ library to build against:

FROM alwaysai/edgeiq:0.11.1

And put the following into the alwaysai.app.json file, which is also used by the alwaysAI CLI. Note that the model listed here must match the model_id we’ll reference in config.json below:

{
    "scripts": {
        "start": "python app.py"
    },
    "models": {
        "alwaysai/shufflenet_1x_g3": 1
    }
}

1. Configuration

The config.json file contains all the configuration data and argument variables used by the app. Keeping settings outside of the main app.py file makes them easy to edit from one location. Additionally, if you ever wanted to pull this configuration data from a remote location, say a REST server, doing so would be trivial because the app is already set up to read from a JSON source.
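
For example, here’s a minimal sketch of what that remote load could look like, assuming a hypothetical REST endpoint that serves the same JSON structure as config.json (this uses only the Python standard library):

import json
import urllib.request

# Hypothetical endpoint serving the same JSON structure as config.json
CONFIG_URL = 'https://example.com/api/classifier-config'

def load_remote_json(url):
    # Fetch and parse a JSON configuration from a REST endpoint
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode('utf-8'))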

Put the following into this .json file:

{
    "classifier": {
        "model_id": "alwaysai/shufflenet_1x_g3",
        "minimum_confidence_level": 0.2,
        "target_labels": [
            "lion",
            "tiger",
            "cheetah",
            "zebra",
            "hippopotamus",
            "water buffalo"
        ]
    },
    "found_folder": "output_images/animals",
    "empty_folder": "output_images/no_animals",
    "source_folder": "source_images"
}

The found_folder, empty_folder, and source_folder keys specify the output and input folders; they should match the folders created during setup. The classifier key contains a dictionary with the configuration we’ll use to set the classifier’s arguments.

The following list details each of these high-level key-values:

  • found_folder: the destination folder for images containing at least one target label
  • empty_folder: the destination folder for images in which no targets were found
  • source_folder: the folder containing the unsorted images to check
  • classifier: a dictionary of classifier settings, detailed below

And the below list details the key-values found in the classifier key:

  • model_id: the ID of the alwaysAI model the classifier will load
  • minimum_confidence_level: the minimum confidence, between 0 and 1, a prediction needs in order to count as a match
  • target_labels: an optional list of labels to look for; if omitted, any prediction above the confidence level counts as a find

Now that we have our configuration file ready, we can add the following code to the app.py file to read it:

import os
import json
# shutil and cv2 (OpenCV) are used later, in the sorting loop
import shutil
import cv2

# Static keys for extracting data from config JSON file
CONFIG_FILE = 'config.json'
CLASSIFIER = 'classifier'
FOUND_FOLDER = 'found_folder'
EMPTY_FOLDER = 'empty_folder'
SOURCE_FOLDER = 'source_folder'
MODEL_ID = 'model_id'
THRESHOLD = 'minimum_confidence_level'
TARGETS = 'target_labels'

def load_json(filepath):
    # Convenience to check and load a JSON file
    if not os.path.exists(filepath):
        raise Exception(
            'File at {} does not exist'.format(filepath))
    with open(filepath) as data:
        return json.load(data)

def main():
    # 1. Load configuration data from the config.json file
    config = load_json(CONFIG_FILE)
    found_folder = config.get(FOUND_FOLDER)
    empty_folder = config.get(EMPTY_FOLDER)
    source_folder = config.get(SOURCE_FOLDER)
    classifier_config = config.get(CLASSIFIER)

if __name__ == "__main__":
    main()
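
One thing to note: dict.get returns None for any missing key, so a typo in config.json would only surface later as a confusing error. If you’d prefer to fail fast, a small optional guard like the following (not part of the sample app, just a sketch) could go right after those reads in main():

    # Fail fast if a required top-level key is missing from config.json
    for name, value in [('found_folder', found_folder),
                        ('empty_folder', empty_folder),
                        ('source_folder', source_folder),
                        ('classifier', classifier_config)]:
        if value is None:
            raise Exception('Missing required config key: {}'.format(name))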

 

2. The Classifier

Now that we can read all the arguments we’d like to use when the app runs, let’s spin up a classifier using alwaysAI’s Python library. First add this import statement:

import edgeiq

Then add the following code block inside main(), after the configuration code above:

    # 2. Spin up just the classifier
    model_id = classifier_config.get(MODEL_ID)
    classifier = edgeiq.Classification(model_id)
    classifier.load(engine=edgeiq.Engine.DNN)

If you happen to have an Intel Movidius stick, you can change the last line to make use of it:

    classifier.load(engine=edgeiq.Engine.DNN_OPENVINO)
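
It can also be handy to confirm exactly which labels a model supports before picking your target_labels values. Here’s a quick optional check, assuming the loaded model exposes its label list via the labels attribute, as edgeIQ classification models do:

    # Optional: print the labels this model can report,
    # to help choose sensible target_labels values
    print('{} supports {} labels:'.format(model_id, len(classifier.labels)))
    print(sorted(classifier.labels))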

 

3. Looking & Sorting

Now we’ll read from the source folder and loop through every image, looking for the animals we’re interested in. If we don’t find any, we’ll move the image to the empty_folder; otherwise, we’ll move it to the found_folder. Add the following inside main(), after the classifier code:

    # 3. Loop through all source images
    image_paths = sorted(list(edgeiq.list_images(source_folder + '/')))
    image_count = len(image_paths)
    print('Checking {} images...'.format(image_count))
    for image_path in image_paths:
        # 3a. Load the image to check
        image = cv2.imread(image_path)
        empty_path = image_path.replace(source_folder, empty_folder)

        # 3b. Find all objects the classifier is capable of finding
        confidence = classifier_config.get(THRESHOLD)
        results = classifier.classify_image(image, confidence)
        if len(results.predictions) == 0:
            # Nothing found with given confidence level
            shutil.move(image_path, empty_path)
            continue
        predictions = results.predictions

        # 3c. Filter results by the target labels if specified
        targets = classifier_config.get(TARGETS, None)
        if targets is not None:
            predictions = edgeiq.filter_predictions_by_label(
                results.predictions, targets)
            if len(predictions) == 0:
                # No matching labels found
                shutil.move(image_path, empty_path)
                continue

        # 3d. At least one target found, move image to found folder
        found_path = image_path.replace(source_folder, found_folder)
        shutil.move(image_path, found_path)

And that’s it. To run this application using alwaysAI on any device that can run Docker, including edge devices like a Raspberry Pi or Jetson Nano, first designate where you’d like to deploy (locally or to a remote device):

aai app configure

Then build and deploy the app:

aai app deploy

And finally:

aai app start

You should see output similar to below:

(Screenshot: console output after running the app.py file from the sample app.)

Going Further

Rather than just sorting files, you could also easily modify the app to do other things, such as:

  • Processing images sent from a server via TCP
  • Recording images of interest from a camera stream
  • Uploading found data & images to a remote server (see the sketch below)
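
For example, the upload idea could look something like this sketch, assuming a hypothetical endpoint that accepts multipart image uploads (it uses the requests library, which you’d need to add to the app’s Docker image):

import requests

# Hypothetical endpoint that accepts multipart image uploads
UPLOAD_URL = 'https://example.com/api/animal-sightings'

def upload_image(image_path, labels):
    # POST the image along with the labels the classifier matched
    with open(image_path, 'rb') as image_file:
        response = requests.post(
            UPLOAD_URL,
            files={'image': image_file},
            data={'labels': ','.join(labels)})
    response.raise_for_status()

You could call upload_image(found_path, [p.label for p in predictions]) right after the shutil.move in step 3d.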

Join our beta now

We are providing professional developers with a simple and easy-to-use platform to build and deploy computer vision applications on embedded devices. The alwaysAI beta program is open. Create your account now for free.

Get started