How to Detect People Using alwaysAI

Detecting people can be an important part of applications across many industries. Common use cases include security applications that track who’s coming and going, as well as safety systems designed to keep people out of harm’s way.

In computer vision, we use a technique called object detection to detect the presence of people in an image. In many cases, people are just one of several classes an object detection model can detect. This technique also differs from facial recognition in that it doesn’t identify a specific person; it simply detects when a human is in the frame. For our purposes, then, an object detection model works quite well and makes it easy to get started with an app for detecting people.

Human Detection in Action

[Image: human detection using alwaysAI]

alwaysAI provides a set of open-source pre-trained models in the Model Catalog. The following example pairs one of these starter models with a simple tracking algorithm to detect people, track them between frames, and take a snapshot of each new person.

Snap a Picture of Each Person

In this guide, we’ll build an application that takes a picture of each new person it sees. The alwaysAI “realtime_object_detector” starter application closely fits our end goal, so it will be a great starting point. Taking a look at the model catalog, we can see that the default model of this starter app, alwaysai/mobilenet_ssd, has “person” as one of its labels, so let’s stick with that for now.

The source for this guide can be found at https://github.com/alwaysai/snapshot-security-camera.
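
For context, here’s a condensed sketch of what the starter app’s core loop looks like (details may differ slightly from the repo above):

import time
import edgeiq


def main():
    obj_detect = edgeiq.ObjectDetection("alwaysai/mobilenet_ssd")
    obj_detect.load(engine=edgeiq.Engine.DNN)

    try:
        with edgeiq.WebcamVideoStream(cam=0) as video_stream, \
                edgeiq.Streamer() as streamer:
            # Allow the camera to warm up
            time.sleep(2.0)

            while True:
                frame = video_stream.read()
                results = obj_detect.detect_objects(frame, confidence_level=.5)
                frame = edgeiq.markup_image(
                        frame, results.predictions, colors=obj_detect.colors)

                # Generate text to display on the streamer
                text = ["Model: {}".format(obj_detect.model_id)]
                for prediction in results.predictions:
                    text.append("{}: {:2.2f}%".format(
                        prediction.label, prediction.confidence * 100))

                streamer.send_data(frame, text)
                if streamer.check_exit():
                    break
    finally:
        print("Program Ending")


if __name__ == "__main__":
    main()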

Filter People in Detection Results

The first step is to filter the detection results so that our app only reacts to people.

To make this change, we call the filter_predictions_by_label() function after getting the results back from the object detector:

def main():
    ...
    try:
        with edgeiq.WebcamVideoStream(cam=0) as video_stream, \
                edgeiq.Streamer() as streamer:
            ...
            while True:
                ...
                results = obj_detect.detect_objects(frame, confidence_level=.5)
                people = edgeiq.filter_predictions_by_label(results.predictions, ['person'])
                ...


Next, replace all occurrences of results.predictions elsewhere in the code with the variable people.
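
In this app those occurrences are the markup_image() call and the streamer-text loop, which become:

frame = edgeiq.markup_image(
        frame, people, colors=obj_detect.colors)
...
for prediction in people:
    text.append("{}: {:2.2f}%".format(
        prediction.label, prediction.confidence * 100))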

Let’s run the app and see if our change worked:

$ aai app deploy
$ aai app start

Point the camera at yourself, and then try another item in the labels list, such as a potted plant. At this point, only the people in the frame should get a bounding box around them.
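
If you’re not sure what else the model can detect, you can print its labels after loading it (the starter apps do this at startup, assuming the loaded model exposes a labels attribute, as the catalog models do):

obj_detect = edgeiq.ObjectDetection("alwaysai/mobilenet_ssd")
obj_detect.load(engine=edgeiq.Engine.DNN)
print("Labels:\n{}\n".format(obj_detect.labels))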

Track Each Person in the Frame

[Image: tracking humans in the frame using alwaysAI]

Our current app can detect people in real-time, but it can’t tell us when a new person is detected. That’s where the centroid tracker comes in. The centroid tracker matches a detection on the current frame with a detection on the previous frame so that we can tell when a detection indicates the same person frame-to-frame.
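
The idea behind centroid tracking is simple: compute the center of each bounding box, and match boxes across frames by nearest distance. Here’s a minimal, self-contained sketch of that matching step (for illustration only; this is not edgeiq’s actual implementation):

import math

def centroid(box):
    # box is (start_x, start_y, end_x, end_y)
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)

def match_nearest(prev_boxes, curr_boxes):
    # prev_boxes: {object_id: box} from the previous frame
    # curr_boxes: list of boxes detected on the current frame
    matches = {}
    for i, curr in enumerate(curr_boxes):
        cx, cy = centroid(curr)
        best_id, best_dist = None, float('inf')
        for object_id, prev in prev_boxes.items():
            px, py = centroid(prev)
            dist = math.hypot(cx - px, cy - py)
            if dist < best_dist:
                best_id, best_dist = object_id, dist
        matches[i] = best_id  # None when there are no previous boxes to match
    return matches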

First, let’s add the centroid tracker to the initialization section of our app:

def main():
    obj_detect = edgeiq.ObjectDetection("alwaysai/mobilenet_ssd")
    obj_detect.load(engine=edgeiq.Engine.DNN)
    tracker = edgeiq.CentroidTracker(deregister_frames=30)
    ...

The deregister_frames parameter tells the centroid tracker when to stop tracking a person by specifying how many consecutive frames a person can go undetected before being dropped from the tracker. Our system runs at about 30 FPS, so we chose 30 frames so that a person is dropped roughly one second after leaving the frame. A smaller value might cause the same person to be counted as multiple detections, while a larger value might leave a stale bounding box lingering after the person has left.
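
If your hardware runs at a different frame rate, the same one-second rule of thumb is easy to scale. A small sketch of that arithmetic, assuming you’ve measured your own frame rate (measured_fps below is a stand-in for that measurement):

# Drop a person roughly one second after they leave the frame
measured_fps = 30  # replace with your measured frame rate
seconds_to_keep = 1.0
tracker = edgeiq.CentroidTracker(
    deregister_frames=int(measured_fps * seconds_to_keep))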

The next step is to feed the object detector’s outputs into the tracker. Add the tracker’s update() call after the filter:

def main():
    ...
    try:
        with edgeiq.WebcamVideoStream(cam=0) as video_stream, \
                edgeiq.Streamer() as streamer:
            ...
            while True:
                ...
                people = edgeiq.filter_predictions_by_label(results.predictions, ['person'])
                tracked_people = tracker.update(people)
                ...

Then construct a new people list from the tracker’s outputs, updating each prediction’s label so that the person’s ID shows up in the streamer:

def main():
    ...
    try:
        with edgeiq.WebcamVideoStream(cam=0) as video_stream, \
                edgeiq.Streamer() as streamer:
            ...
            while True:
                ...
                people = edgeiq.filter_predictions_by_label(results.predictions, ['person'])
                tracked_people = tracker.update(people)

                people = []
                for (object_id, prediction) in tracked_people.items():
                    new_label = 'Person {}'.format(object_id)
                    prediction.label = new_label
                    people.append(prediction)
                ...

When we run our app now, each detected person is given a unique index.

Notify When a Person Enters or Exits the Frame

Next, let’s add a notification every time someone enters or exits the frame. We’ll do this by storing the previous tracker results and comparing them against the latest results. First, we need to create some variables to store state across detections.

Add prev_tracked_people and logs to the app before the while loop starts:

def main():
    ...
    try:
        with edgeiq.WebcamVideoStream(cam=0) as video_stream, \
                edgeiq.Streamer() as streamer:
            ...
            prev_tracked_people = {}
            logs = []

            while True:
                ...

Now, check for people entering or exiting the frame, and add a log entry for each change. This is accomplished by comparing keys: IDs that appear in the latest tracker results but not the previous ones are new arrivals, and IDs that have disappeared from the latest results are exits. We’ll also need to update prev_tracked_people for the next iteration; constructing a new dictionary ensures that we store a copy rather than a reference to the tracker’s internal state.

The last step appends the entry and exit logs to the streamer text.

def main():
    ...
    try:
        with edgeiq.WebcamVideoStream(cam=0) as video_stream, \
                edgeiq.Streamer() as streamer:
            ...
            while True:
                ...
                frame = edgeiq.markup_image(
                        frame, people, colors=obj_detect.colors)

                new_entries = set(tracked_people) - set(prev_tracked_people)
                for entry in new_entries:
                    logs.append('Person {} entered'.format(entry))

                new_exits = set(prev_tracked_people) - set(tracked_people)
                for exit_id in new_exits:
                    logs.append('Person {} exited'.format(exit_id))

                prev_tracked_people = dict(tracked_people)
                ...
                for prediction in people:
                    text.append("{}: {:2.2f}%".format(
                        prediction.label, prediction.confidence * 100))

                text.append('Logs:')
                text += logs
                ...

Take a Picture

The last step to make our app complete is to take a picture of each person as they’re first detected.

We’ll write a simple function that saves an image using OpenCV:

import time
import cv2
...
def save_snapshot(image, person_id):
    # Name the file with the date, time, and tracker ID,
    # e.g. '2019-10-01-12:30:45-person-1.jpg'
    snap_date = time.strftime('%Y-%m-%d')
    snap_time = time.strftime('%H:%M:%S')
    filename = '{}-{}-person-{}.jpg'.format(snap_date, snap_time, person_id)
    cv2.imwrite(filename, image)

Next, we call it from the loop that checks for new entries, taking a picture for each one:

def main():
    ...
    try:
        with edgeiq.WebcamVideoStream(cam=0) as video_stream, \
                edgeiq.Streamer() as streamer:
            ...
            while True:
                ...
                frame = edgeiq.markup_image(
                        frame, people, colors=obj_detect.colors)

                new_entries = set(tracked_people) - set(prev_tracked_people)
                for entry in new_entries:
                    save_snapshot(frame, entry)
                    logs.append('Person {} entered'.format(entry))

The snapshot is saved after markup_image() so that the saved images include bounding boxes, person IDs, and detection confidences.
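
If you’d rather save snapshots without the overlay, one option is to keep a copy of the frame before marking it up. A sketch of that variation:

clean_frame = frame.copy()  # unmarked copy for snapshots
frame = edgeiq.markup_image(
        frame, people, colors=obj_detect.colors)

new_entries = set(tracked_people) - set(prev_tracked_people)
for entry in new_entries:
    save_snapshot(clean_frame, entry)  # no boxes or labels
    logs.append('Person {} entered'.format(entry))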

Next, go ahead and try out your app! You should get a new log entry each time someone enters or exits, as well as a saved image of each person that enters the frame.

Conclusion

This is just a simple example of applying person detection to a use case like a security camera, but these techniques are building blocks for far more complex applications. The same app could even serve a different use case, such as analytics, by tracking how long each person remains in the frame.
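
For instance, a rough dwell-time variation only needs timestamps recorded in the existing entry and exit loops (entry_times is a new dictionary you would initialize alongside logs):

entry_times = {}
...
for entry in new_entries:
    entry_times[entry] = time.time()
    logs.append('Person {} entered'.format(entry))

for exit_id in new_exits:
    # Fall back to "now" if we somehow never saw this ID enter
    dwell = time.time() - entry_times.pop(exit_id, time.time())
    logs.append('Person {} stayed {:.1f}s'.format(exit_id, dwell))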

The alwaysAI platform makes it easy to build, test, and deploy computer vision applications such as this person-detection security camera.

We can’t wait to see what you build with alwaysAI!

Join Our Beta Program

We are providing professional developers with a simple and easy-to-use platform to build and deploy computer vision applications on embedded devices. Our Beta program is open. Create your account now for free.
