This architectural guide walks you through one way of creating an app for monitoring when things (animals, cars, etc.) come and go.
Today we’ll go through such a system using alwaysAI’s computer vision platform. This particular example is intended to run on a single-board computer with an attached video camera, placed near an entrance to count how many people enter or exit the location. The final code for this example can be found at: https://github.com/alwaysai/capacity-detector
People don’t always walk in a straight line from A to B. They may linger for a while, take hard turns, or even go back the way they came. We want to exclude this kind of behavior from counting as valid entries and exits. To address these challenges, we’ll make some assumptions:
A) The video feed will come from a statically placed camera. In other words, the camera providing the video feed will not be expected to pan or otherwise move.
B) We’ll designate 2 kinds of zones on the video frame: entry zones, colored green, and exit zones, colored red. A person who first appears in an entry zone and then disappears in an exit zone is considered to have entered the location, and vice versa.
Finally, as this app is moderately complex, this tutorial will not walk through every line of code required to build a running version of it. Instead, we’ll go over the high-level concepts and how the different components communicate, should you want to modify or expand upon it. You are, of course, invited to look at the code in detail at the repo link above.
This sample app is broken up into the following 4 elements:
1. CONFIGURATION
Options are located in the config.json file and support the following features in the rest of the app:
- Ability to quickly change models and target label detection
- Ability to switch between a webcam and a file as the video source
- Ability to specify endpoints to send data to
- Ability to quickly designate different parts of the frame as an entry or exit zone
A JSON-based configuration was used to keep all user-adjustable settings out of the code and to make remote configuration easier to implement (though remote configuration is not illustrated in the code).
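As a rough illustration of the kind of settings listed above, here is a minimal sketch of loading such a configuration. Every key name and value below is a hypothetical placeholder (only deregister_frames is mentioned later in this article); the actual config.json in the repository may be laid out differently.

```python
import json

# Hypothetical layout for illustration only; check the repository's
# config.json for the real key names and values.
EXAMPLE_CONFIG = """
{
  "model_id": "alwaysai/mobilenet_ssd",
  "target_labels": ["person"],
  "video_source": {"type": "webcam", "camera_id": 0},
  "deregister_frames": 20,
  "entry_zones": [{"x1": 0, "y1": 0, "x2": 320, "y2": 480}],
  "exit_zones": [{"x1": 320, "y1": 0, "x2": 640, "y2": 480}],
  "endpoints": {
    "entry": "http://localhost:5000/entry",
    "exit": "http://localhost:5000/exit"
  }
}
"""


def load_config(text):
    """Parse the JSON text into a plain dictionary."""
    return json.loads(text)


config = load_config(EXAMPLE_CONFIG)
print(config["target_labels"], config["video_source"]["type"])
```

Keeping zones and endpoints in data rather than code means a deployment can be retargeted by editing one file.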
2. COMPUTER VISION DETECTION
The functions in the alwaysai.py file start the detection process based on the configuration options above.
The alwaysai_configs.py helper file checks that the configuration has the minimum required values and sets up dummy objects for any unneeded features.
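The idea of checking for minimum values and substituting dummy objects could look something like the sketch below. The required keys, the default label, and the NullServer and make_server names are assumptions made for illustration; they are not taken from alwaysai_configs.py.

```python
# Hypothetical required keys; the real helper may check different ones.
REQUIRED_KEYS = ("model_id", "entry_zones", "exit_zones")


class NullServer:
    """Dummy object used when no endpoints are configured: calls do nothing."""

    def track_entry(self, *args, **kwargs):
        pass

    def track_exit(self, *args, **kwargs):
        pass


def validate_config(config):
    """Raise if a required key is missing, then fill in optional defaults."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("config is missing required keys: " + ", ".join(missing))
    validated = dict(config)
    validated.setdefault("target_labels", ["person"])
    validated.setdefault("endpoints", None)
    return validated


def make_server(config, server_factory):
    """Return a real server only when endpoints were configured."""
    if not config.get("endpoints"):
        return NullServer()
    return server_factory(config)
```

With a dummy object in place, the rest of the app can call track_entry and track_exit unconditionally instead of checking whether reporting is enabled.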
3. DIRECTION FINDING
Direction finding is done within the DirectionManager class, since alwaysai.py only determines whether objects were detected within an entry zone or an exit zone. That information alone doesn’t definitively say whether someone or something has entered or exited a given location: an object might first appear in one of the zones, linger, and then leave the way it came, in which case we don’t want to count it as having traversed a direction.
The DirectionManager class maintains a record of all detections from the running instance of alwaysAI and attempts to determine the direction of travel of objects based on those records.
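To make the record-keeping idea concrete, here is a minimal sketch of a direction manager, assuming each detection arrives as an object id plus a zone name and that the manager is told when an object has left the frame. The method names record and object_left are hypothetical and the logic is deliberately simplified; the repository's DirectionManager may work differently.

```python
class DirectionManager:
    """Simplified sketch: decides direction from the first and last zone seen."""

    def __init__(self, on_entry, on_exit):
        self._on_entry = on_entry
        self._on_exit = on_exit
        self._history = {}  # object_id -> ordered list of zones visited

    def record(self, object_id, zone):
        """Store a zone sighting, skipping consecutive repeats of the same zone."""
        zones = self._history.setdefault(object_id, [])
        if not zones or zones[-1] != zone:
            zones.append(zone)

    def object_left(self, object_id):
        """Call when the tracker drops an object; fires a callback if confirmed."""
        zones = self._history.pop(object_id, [])
        if len(zones) < 2:
            return None  # only ever seen in one zone: lingered or turned back
        first, last = zones[0], zones[-1]
        if first == "entry" and last == "exit":
            self._on_entry(object_id)
            return "entered"
        if first == "exit" and last == "entry":
            self._on_exit(object_id)
            return "exited"
        return None  # started and finished in the same zone


events = []
manager = DirectionManager(
    on_entry=lambda object_id: events.append(("entry", object_id)),
    on_exit=lambda object_id: events.append(("exit", object_id)),
)
manager.record(1, "entry")
manager.record(1, "exit")
print(manager.object_left(1))
```

An object that appears in the entry zone, crosses into the exit zone, and then returns to the entry zone ends where it started, so this sketch does not count it.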
4. SERVER COMMUNICATION
Server communication is located within comm.py, which defines a Server class that maintains the endpoint addresses defined in the configuration and has 2 convenience functions for triggering calls to a post.py file, which is just a wrapper for POST URL calls. The track_entry and track_exit functions here could easily be replaced with alternate delivery mechanisms, such as sockets. See this article for more information in that regard: https://alwaysai.co/blog/how-to-integrate-alwaysai-with-external-applications-using-tcp-sockets
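A minimal sketch of this arrangement, using only the Python standard library, might look like the following. The endpoint key names, the payload fields, and the injectable post argument are assumptions for illustration; the repository's comm.py and post.py may differ.

```python
import json
import urllib.request


def post_json(url, payload, timeout=5):
    """Thin wrapper around an HTTP POST that sends a JSON body."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


class Server:
    """Holds the configured endpoints and exposes two convenience functions."""

    def __init__(self, config, post=post_json):
        endpoints = config.get("endpoints") or {}
        self.entry_url = endpoints.get("entry")  # hypothetical key name
        self.exit_url = endpoints.get("exit")    # hypothetical key name
        self._post = post  # swap in sockets or another transport here

    def track_entry(self, object_id=None):
        if self.entry_url:
            return self._post(self.entry_url, {"event": "entry", "object_id": object_id})
        return None

    def track_exit(self, object_id=None):
        if self.exit_url:
            return self._post(self.exit_url, {"event": "exit", "object_id": object_id})
        return None
```

Passing the transport in as a function keeps the delivery mechanism replaceable, which is the property the paragraph above points out.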
Though various Python mechanisms are available for inter-object communication, I decided to use function callbacks, as they require no special considerations and keep the app.py file short. The flow in app.py is:
1. A configuration object is loaded from the config.json file
2. The configuration is then passed into the Server class’s initializer to set the endpoints where event notifications should go
3. A direction manager object is created with references to 2 functions within the Server class. When the direction manager has confirmed that an object has entered or exited the location, it calls one of these 2 functions.
4. Computer vision detection is started with the alwaysai.py file’s start_detection() function. This function takes four arguments:
4a. the configuration object
4b. a callback for when the detection has finished spinning up
4c. a callback for when an object of interest is in an entry zone
4d. a callback for when an object of interest is in an exit zone
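The wiring described in the steps above can be sketched as follows. Everything here is a simplified stand-in written for illustration: the start_detection function below just replays scripted detections, and the callback signatures (an object id passed to the zone callbacks) are assumptions rather than the repository's actual interfaces.

```python
class Server:
    """Stand-in that records events instead of POSTing them."""

    def __init__(self, config):
        self.endpoints = config.get("endpoints", {})
        self.sent = []

    def track_entry(self):
        self.sent.append("entry")

    def track_exit(self):
        self.sent.append("exit")


class DirectionManager:
    """Stand-in: confirms a direction once an object has been seen in both zones."""

    def __init__(self, on_entry, on_exit):
        self._on_entry = on_entry
        self._on_exit = on_exit
        self._first_zone = {}  # object_id -> zone where it first appeared

    def _seen(self, object_id, zone):
        first = self._first_zone.setdefault(object_id, zone)
        if first != zone:
            del self._first_zone[object_id]
            # Entry zone first, then exit zone, counts as an entry (and vice versa).
            (self._on_entry if first == "entry" else self._on_exit)()

    def entry_zone_seen(self, object_id):
        self._seen(object_id, "entry")

    def exit_zone_seen(self, object_id):
        self._seen(object_id, "exit")


def start_detection(config, on_ready, on_entry_zone, on_exit_zone):
    """Stand-in for the detection loop: replays scripted (object_id, zone) pairs."""
    on_ready()
    for object_id, zone in config["scripted_detections"]:
        if zone == "entry":
            on_entry_zone(object_id)
        else:
            on_exit_zone(object_id)


def main(config):
    server = Server(config)
    manager = DirectionManager(server.track_entry, server.track_exit)
    start_detection(
        config,
        lambda: print("detection started"),
        manager.entry_zone_seen,
        manager.exit_zone_seen,
    )
    return server.sent


config = {
    "endpoints": {},
    "scripted_detections": [(1, "entry"), (1, "exit"), (2, "exit"), (2, "entry")],
}
result = main(config)
print(result)
```

Because each component only knows about the callables handed to it, the detection code never needs to import the server code, and either side can be replaced independently.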
The detection system is able to maintain a single id for a tracked object for the entirety of the time it’s in frame, based on the proximity of pixels between frames. But once that object has left the frame (or has been obstructed from view for more than the number of frames specified by the deregister_frames config option) it would get a new id if it were to enter back into frame.
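The id behavior described above can be illustrated with a toy centroid tracker. This is not alwaysAI's tracker; it is a simplified greedy nearest-neighbor sketch, and the class name and the max_distance parameter are assumptions introduced for the example.

```python
import math


class SimpleCentroidTracker:
    """Toy tracker: keeps an id while an object stays near its last position."""

    def __init__(self, deregister_frames=20, max_distance=50):
        self.deregister_frames = deregister_frames
        self.max_distance = max_distance
        self._next_id = 0
        self._objects = {}  # object_id -> last known (x, y) centroid
        self._missed = {}   # object_id -> consecutive frames without a match

    def update(self, centroids):
        """Match this frame's centroids to known ids and return id -> centroid."""
        unmatched = list(centroids)
        for object_id in list(self._objects):
            position = self._objects[object_id]
            nearest = min(unmatched, key=lambda c: math.dist(c, position), default=None)
            if nearest is not None and math.dist(nearest, position) <= self.max_distance:
                self._objects[object_id] = nearest
                self._missed[object_id] = 0
                unmatched.remove(nearest)
            else:
                self._missed[object_id] += 1
                if self._missed[object_id] > self.deregister_frames:
                    # Gone for too long: forget the id entirely.
                    del self._objects[object_id]
                    del self._missed[object_id]
        for centroid in unmatched:
            # Anything left over is treated as a new object with a fresh id.
            self._objects[self._next_id] = centroid
            self._missed[self._next_id] = 0
            self._next_id += 1
        return dict(self._objects)


tracker = SimpleCentroidTracker(deregister_frames=2, max_distance=50)
first = tracker.update([(10, 10)])
moved = tracker.update([(15, 12)])
for _ in range(3):
    tracker.update([])  # object hidden for more than deregister_frames frames
returned = tracker.update([(15, 12)])
print(first, moved, returned)
```

In this run the object keeps id 0 while it moves slightly, then comes back as id 1 after being missing for three frames, which is why the direction logic has to reason about zones rather than rely on ids surviving a disappearance.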
Because the app only sends confirmed entry and exit events, multiple instances of it could be deployed across a large location with many entry points, with the aggregated data handled by an independent server.
If you’re interested in experimenting with your own variation of this, sign up for alwaysAI today to try it out.