Touch-less Facial Attendance with Pi

Facial Attendance on Raspberry Pi, with Movidius NCS2 and RPLIDAR A1 for depth sensing to enable Door Access Control simulated with Pimoroni Blinkt

--

Touch-free interaction with public devices has become imperative post-COVID. No wonder facial recognition-based entry and attendance gadgets are in demand to replace attendance registers and biometric access control. These embedded devices can be used in large companies, institutions, apartment complexes, or even to take class attendance.

Touchless Attendance Gadget recognizes registered faces and pushes SMS for unrecognized faces

Face recognition identifies the person, while an approximate depth estimate is required so the door opens only for those near it. Face recognition based on deep learning gives better accuracy than Haar Cascade, LBPH, HOG, or SVM.

Depth can be estimated by LIDAR or by DepthAI platforms like the Luxonis OAK-D, OpenMV Cam, etc., which can combine depth perception, object detection, and tracking in one SoM. An ultrasonic sensor such as the HC-SR04 can also be used to detect persons near the door.


The four major steps involved in this project are:

1. Localize and identify face (RPi + Pi Cam + OpenVINO + Movidius)

2. Publish Identity to the server (MQTT Pub-Sub for IoT communication)

3. Persist identity and register attendance (MySQL or AWS)

4. Alert security if the person is unidentified (SMS), else open the door.

We can optionally enhance the alert mechanism by pushing footage of unidentified persons to security via FFmpeg. The SMS messages are pushed via Twilio integration.
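For instance, the Twilio push can be a short helper like the sketch below; the credentials and phone numbers are placeholders, not the actual ones used in the project.

```python
# Hypothetical sketch: SMS alert to security via Twilio.
from twilio.rest import Client

TWILIO_SID = "ACxxxxxxxxxxxxxxxx"   # placeholder account SID
TWILIO_TOKEN = "your_auth_token"    # placeholder auth token

def alert_security(snapshot_url):
    """Notify security about an unidentified person at the door."""
    client = Client(TWILIO_SID, TWILIO_TOKEN)
    client.messages.create(
        body=f"Unidentified person at the door. Footage: {snapshot_url}",
        from_="+1xxxxxxxxxx",       # Twilio number (placeholder)
        to="+91xxxxxxxxxx",         # security's mobile (placeholder)
    )
```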

During the initial setup, the system builds an image database of the persons to be permitted inside. During registration, the person’s facial landmarks are detected and an affine transformation is applied to obtain a frontal view. These images are saved and later compared against live frames to decide whether a person is known.
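A minimal sketch of the alignment idea, assuming the two eye centres have already been located by a facial-landmark detector; this only corrects in-plane rotation, while a full registration pipeline may warp more landmarks.

```python
# Sketch: level the eyes with a rotation (a special case of an
# affine transform) computed from the two eye centres.
import cv2
import numpy as np

def align_face(image, left_eye, right_eye):
    """left_eye / right_eye: (x, y) pixel coordinates of eye centres."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = np.degrees(np.arctan2(dy, dx))      # in-plane roll of the face
    centre = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    M = cv2.getRotationMatrix2D(centre, angle, 1.0)   # 2x3 affine matrix
    h, w = image.shape[:2]
    return cv2.warpAffine(image, M, (w, h))
```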

Facial Attendance with Biometric Fallback — High-Level Diagram

The face recognition models, optimized with OpenVINO, are deployed to the RPi, which is integrated with a Pi Cam and the LIDAR. If the person is identified and is near the door, the ‘door open’ event is triggered.

Assembled Gadget: RPi with LiDAR and NCS2 on battery

If someone is near the door but not recognized, a message is pushed to the security staff’s mobile. Both outcomes are simulated on a Pimoroni Blinkt! controlled using MQTT messages: if the person is unidentified, a ‘red’ light flashes with an appropriate voice message; otherwise, a ‘green’ light flashes along with a greeting message.
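A Pi-side sketch of the light control, assuming a paho-mqtt client; the broker IP and the topic name door/entry are illustrative.

```python
# Sketch: flash the Blinkt! LEDs based on MQTT messages.
# Broker address and topic name are illustrative.
import blinkt
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    if msg.payload.decode() == "known":
        blinkt.set_all(0, 255, 0)   # green: greet and open the door
    else:
        blinkt.set_all(255, 0, 0)   # red: alert security
    blinkt.show()

client = mqtt.Client()
client.on_message = on_message
client.connect("192.168.1.10")      # MQTT broker IP (placeholder)
client.subscribe("door/entry")
client.loop_forever()               # blocks, dispatching callbacks
```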

In order to avoid repeated triggers, the message is published only when the same person has not been seen in the last ‘n’ frames. This is efficiently implemented with a double-ended queue (deque in Python).
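A minimal sketch of this debounce logic; the window length n is a tunable assumption.

```python
# Sketch: publish only if the identity was absent from the last
# n frames, using a fixed-length deque as a sliding window.
from collections import deque

N_FRAMES = 30                        # tunable window length
recent = deque(maxlen=N_FRAMES)      # oldest entries drop off automatically

def should_publish(identity):
    seen = identity in recent
    recent.append(identity)
    return not seen
```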


If the person is identified, then the greeting message is played via the eSpeak text-to-speech synthesizer; just make sure the voice is properly set up on the Pi. You can see the system in action in the above video.
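The greeting itself can simply shell out to the espeak binary, as in this sketch (assuming espeak is installed and audio output is configured on the Pi).

```python
# Sketch: speak a greeting through eSpeak.
import subprocess

def greet(name):
    subprocess.run(["espeak", f"Welcome, {name}"], check=False)
```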

Note that the performance is about 20 FPS with no face in the frame and 10–12 FPS with faces, on a Raspberry Pi 4 with an Intel Movidius NCS2. The system is responsive not just because of the model performance but also because of its non-blocking architecture.
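One common way to get such non-blocking behaviour is to decouple frame capture from inference with a background thread; the sketch below illustrates the pattern and is not necessarily the exact implementation used here.

```python
# Sketch of the non-blocking pattern: a background thread keeps
# grabbing frames so inference never blocks on camera I/O.
import threading
import cv2

class FrameGrabber:
    def __init__(self, src=0):
        self.cap = cv2.VideoCapture(src)
        self.frame = None
        self.running = True
        threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self):
        while self.running:
            ok, frame = self.cap.read()
            if ok:
                self.frame = frame       # keep only the latest frame

    def read(self):
        return self.frame                # never waits on the camera
```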

The architecture of the system is shown below.

Architecture Diagram

In order to open the door, we first need to find out how near the person is to the door. This can be estimated by computing the median distance of the LIDAR points between two subtending angles, derived from the relative positions of the LIDAR and Pi Cam (triangulation).
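A sketch of this estimate using the rplidar Python package; the angular window below is an assumption that depends on how the LIDAR is mounted relative to the Pi Cam.

```python
# Sketch: median distance of LIDAR points inside the angular
# sector that the Pi Cam's view subtends (bounds are assumptions).
from statistics import median
from rplidar import RPLidar

ANGLE_MIN, ANGLE_MAX = 170, 190          # door-facing sector (placeholder)

def distance_to_person(port="/dev/ttyUSB0"):
    lidar = RPLidar(port)
    try:
        for scan in lidar.iter_scans():
            pts = [dist for _, angle, dist in scan
                   if ANGLE_MIN <= angle <= ANGLE_MAX and dist > 0]
            if pts:
                return median(pts)       # distance in millimetres
    finally:
        lidar.stop()
        lidar.disconnect()
```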

Once the message is received by the MQTT client on the remote server, the identification data can be logged into a database to maintain attendance records as well as entry and exit times. See the auto-generated entry register at the MQTT receiving side below.
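On the receiving side, the logging step can be as simple as the following sketch; the database, table, and column names are illustrative, not the actual schema.

```python
# Sketch: log an entry when the identification message arrives.
# Database, table, and column names below are illustrative.
import datetime
import mysql.connector

def log_entry(name):
    db = mysql.connector.connect(user="attend", password="secret",
                                 host="localhost", database="door")
    cursor = db.cursor()
    cursor.execute(
        "INSERT INTO entry_register (name, entry_time) VALUES (%s, %s)",
        (name, datetime.datetime.now()),
    )
    db.commit()
    db.close()
```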

Database View of Door Entry register (automatically generated)

We can easily extend the system by pushing the data to the cloud or by building a front end to track or analyze the above data to generate more consumable metrics.

Anti-Spoofing Methods:

Any access control or attendance system needs to be spoof-proof. It’s possible to cheat the above system by showing a photo of a registered person. How can we differentiate a real human from a photo?

We can treat “liveness detection” as a binary classification problem and train a CNN to distinguish real faces from fake ones, but that would be expensive on the edge. Alternatively, we can detect spoofing in:

1. 3D: Using light reflections on the face. Might be overkill on edge devices.

2. 2D: We can do eyewink detection in 2D images. Feasible on edge devices.

To detect eye winks, the most efficient method is to monitor the change in white pixel count around the eye region, but it’s not as reliable as monitoring the EAR (Eye Aspect Ratio). If the eye aspect ratio rises and falls periodically, then it’s a real human; otherwise, it’s a fake. The rise and fall can be detected by fitting a sigmoid or inverse sigmoid curve.
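For reference, here is a sketch of the EAR computation from the six standard eye landmarks, assuming the landmarks are already extracted as (x, y) points.

```python
# Sketch: Eye Aspect Ratio (EAR) from six eye landmarks p1..p6,
# ordered as in the usual 68-point facial-landmark convention.
import numpy as np

def eye_aspect_ratio(eye):
    """eye: (6, 2) array of landmark coordinates for one eye."""
    v1 = np.linalg.norm(eye[1] - eye[5])   # vertical distance p2-p6
    v2 = np.linalg.norm(eye[2] - eye[4])   # vertical distance p3-p5
    h = np.linalg.norm(eye[0] - eye[3])    # horizontal distance p1-p4
    return (v1 + v2) / (2.0 * h)           # drops sharply during a blink
```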

Identifying open vs closed eye based on the count of white pixels

Alternatively, we can use deep learning or ML techniques to classify an eye image as open or closed. But in the interest of efficiency, it’s advisable to use a numerical solution when coding for edge devices. Note how the spread of non-zero pixels in the histogram takes a sudden dip when the eye is closed.
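A sketch of that numerical solution: threshold the eye region and count the bright pixels; the threshold value is an assumption to be tuned for your camera and lighting.

```python
# Sketch: count bright (sclera) pixels in the eye region; the
# count dips when the eye closes. Threshold value is an assumption.
import cv2

def white_pixel_count(eye_roi):
    gray = cv2.cvtColor(eye_roi, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY)
    return cv2.countNonZero(binary)
```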

Histogram Spread during Left Eye Wink

We can use a parametric curve-fitting algorithm to fit a sigmoid or inverse sigmoid function at the tail end of the above curve, to detect eye open or close events. The person is a real human if any such event occurs.
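A sketch of such a fit with SciPy’s curve_fit; the exact function form and the acceptance tolerance are assumptions.

```python
# Sketch: detect an eye-close event by fitting an inverse sigmoid
# to the tail of the white-pixel (or EAR) time series.
import numpy as np
from scipy.optimize import curve_fit

def inv_sigmoid(t, a, b, t0, k):
    # Falls from a + b down to a as t passes t0 (for k > 0)
    return a + b / (1.0 + np.exp(k * (t - t0)))

def eye_close_detected(signal, rel_tol=0.1):
    y = np.asarray(signal, dtype=float)
    t = np.arange(len(y), dtype=float)
    try:
        popt, _ = curve_fit(inv_sigmoid, t, y, maxfev=5000)
    except RuntimeError:        # fit did not converge: no clean event
        return False
    rmse = np.sqrt(np.mean((y - inv_sigmoid(t, *popt)) ** 2))
    return rmse < rel_tol * np.ptp(y)   # a tight fit implies a wink event
```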

Inverse Sigmoid Curve Fitting: Left and Right Eye Wink

This project can be adapted to multiple scenarios driven by a person, event, or custom object. For example, a customized video doorbell could identify a person, event, or vehicle waiting at the door and invoke an SMS, send a video or audio message, or even give an automated response.

The complete source code of the above experiments can be found here.

If you have any queries or suggestions, you can reach me here.

References

1. Udacity Computer Vision Nanodegree: https://www.udacity.com/course/computer-vision-nanodegree--nd891

2. LIDAR data scan code stub by Adafruit: https://learn.adafruit.com/remote-iot-environmental-sensor/code

3. OpenVINO Face Recognition Module: https://docs.openvinotoolkit.org/latest/omz_demos_face_recognition_demo_python.html

4. LIDAR Distance Estimation: https://en.wikipedia.org/wiki/Lidar

