This week Fiona and I met with Professor Savvides’s staff member, Magesh, to discuss how we would develop the eye tracking using computer vision. Magesh gave us the following implementation plan:
- Start with an OpenCV video feed
- Send each frame to the MediaPipe Python library
- MediaPipe returns face landmarks
- From the landmarks, select the points that correspond to the eye region
- Determine whether the user is looking up, down, left, or right
- Draw a point on the video feed to show where the software thinks the user is looking, so there is live feedback
Drawing a point on the video feed will verify that the software is properly tracking the user’s irises and correctly mapping their gaze to the screen; a rough sketch of the direction-finding step is below.
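We have not written this part yet, but as a sketch, the gaze-direction logic could look something like the following Python. The landmark indices (468 for an iris center, 33/133 for the eye corners, 159/145 for the eyelids) are the commonly cited MediaPipe face mesh indices rather than anything from our code, and the 0.40/0.60 thresholds are placeholders we would still need to tune; depending on camera mirroring, the left/right labels may also need to be swapped.

```python
# Sketch only: classify gaze direction from MediaPipe's normalized landmarks.
# The indices (468 iris center, 33/133 eye corners, 159/145 eyelids) and the
# 0.40/0.60 thresholds are assumptions to be verified and tuned.
def classify_gaze(landmarks):
    iris = landmarks[468]                          # iris center
    outer, inner = landmarks[33], landmarks[133]   # eye corners
    upper, lower = landmarks[159], landmarks[145]  # eyelids

    # Where the iris sits between the corners (horizontal) and between the
    # lids (vertical), each expressed as a 0..1 ratio.
    h_ratio = (iris.x - outer.x) / (inner.x - outer.x)
    v_ratio = (iris.y - upper.y) / (lower.y - upper.y)

    if h_ratio < 0.40:
        return "left"   # may need swapping if the feed is mirrored
    if h_ratio > 0.60:
        return "right"
    if v_ratio < 0.40:
        return "up"
    if v_ratio > 0.60:
        return "down"
    return "center"

# Usage: direction = classify_gaze(result.face_landmarks[0])
```

The returned direction would then be mapped to screen coordinates so a feedback point can be drawn on the video feed.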
So far, I have succeeded in getting the OpenCV video feed to appear. I am currently debugging to get a face mesh to appear on the video feed, using MediaPipe, to verify that the software is tracking the irises properly. I am using Google’s Face Landmark Detection Guide for Python to help me implement this (Google AI for Developers, 2024). Once I can verify the tracking, I will move on to using the face landmarks to interpret the user’s gaze and return coordinates for drawing a point at the expected gaze location on the screen.
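For reference, a minimal sketch of the stage I am debugging, using the Tasks API from the guide, is below. This is not my exact code: it assumes the `face_landmarker.task` model bundle from the guide has been downloaded into the working directory, and it draws every landmark rather than just the eye region.

```python
# Minimal sketch: OpenCV webcam feed with MediaPipe face landmarks overlaid.
# Assumes face_landmarker.task (the model bundle from the guide) is in the
# working directory.
import time

import cv2
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

options = vision.FaceLandmarkerOptions(
    base_options=python.BaseOptions(model_asset_path="face_landmarker.task"),
    running_mode=vision.RunningMode.VIDEO,
    num_faces=1,
)
landmarker = vision.FaceLandmarker.create_from_options(options)

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break

    # MediaPipe expects RGB; OpenCV captures BGR.
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb)
    result = landmarker.detect_for_video(mp_image, int(time.monotonic() * 1000))

    # Draw each landmark as a small dot to visually confirm tracking.
    h, w = frame.shape[:2]
    for face in result.face_landmarks:
        for lm in face:
            cv2.circle(frame, (int(lm.x * w), int(lm.y * h)), 1, (0, 255, 0), -1)

    cv2.imshow("face mesh", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```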
Resources
Google AI for Developers. (2024, November 4). Face Landmark Detection Guide for Python. Google. https://ai.google.dev/edge/mediapipe/solutions/vision/face_landmarker/python