Team’s Status Report for October 18

Most Significant Risks and Management

The primary risk identified was Fingertip Positional Accuracy, specifically along the keyboard’s depth (Z-axis). Previous geometric methods yielded significant positional errors, which threatened the system’s ability to distinguish between adjacent keys (e.g., confusing Q, A, or Z) and thus made reliable typing impossible. To manage this risk, our contingency plan was the rapid implementation of the Pixel Centroid Method. This technique calculates the statistically stable Center of Mass (Centroid) of the actual finger pixels, providing a highly stable point of contact that successfully mitigates the positional ambiguity risk.
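
Concretely, if the segmented finger region contains N pixels at coordinates (x_i, y_i), the reported contact point is their mean position (sum of x_i / N, sum of y_i / N); averaging over hundreds of pixels suppresses the per-pixel jitter that destabilizes any single landmark.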

Changes to System Design

A necessary change was introduced to the Fingertip Tracking Module design. We transitioned from geometric projection methods to an Image Processing Refinement Pipeline (the Pixel Centroid Method). This was required because the original methods lacked the vertical (depth-axis) accuracy needed for key mapping. The cost was one additional week of schedule, which is offset by the substantial increase in tracking stability and accuracy, preventing larger integration costs down the line.

Updated Schedule

No significant changes have occurred to the overall project schedule.

Part A: Global Factors
Across developing regions, many users rely on smartphones or tablets as their only computing devices, yet struggle with slow or error-prone touchscreen typing due to small screens or limited familiarity with digital interfaces. By using the built-in camera and no additional hardware, our system provides a universally deployable typing interface that works on any flat surface, making it practical for students, remote workers, and multilingual users worldwide. For instance, an English learner in rural India could practice typing essays on a table without needing a Bluetooth keyboard, and a freelance translator in South America could work comfortably on a tablet during travel. Because all computation happens locally on-device, the system functions without internet access, which is essential in regions with limited connectivity, while also preserving user privacy. This design supports equitable access to digital productivity tools and aligns with sustainable technology trends by reducing electronic waste and dependence on specialized hardware.

Part B: Cultural Factors
HoloKeys is designed to fit how people learn and use technology in classrooms, libraries, community centers, and travel settings. Because QWERTY is the most widely used layout, the interface aligns with familiar motor patterns and reduces training time. Instructions and tutorials are written in plain, idiom-free text that can be easily translated into other languages. Visual overlays are adjustable (font size, key size, contrast), allowing users to tune the interface to their needs. Because expectations around camera use vary, HoloKeys defaults to privacy-forward behavior: a clear camera-active indicator, no recording or image retention by default, and concise explanations of how and why the camera is used.

Part C: Environmental Factors
Unlike traditional hardware keyboards, our solution requires minimal physical manufacturing, shipping, or disposal, thereby reducing material waste and overall carbon footprint. The system relies primarily on existing mobile devices, with only a small stand or holder as an optional accessory. This holder can also serve as a regular phone or tablet stand, further extending its lifespan and utility. By minimizing the need for new electronic components and leveraging devices users already own, our design helps reduce electronic waste and promotes more sustainable technology practices.

Part A was written by Hanning Wu, Part B by Yilei Huang, and Part C by Joyce Zhu.

Joyce’s Status Report for October 18

What I did this week:

This week, I successfully resolved critical stability issues in fingertip tracking by implementing a new and highly effective technique: Pixel Centroid analysis. This robust solution moves beyond relying on a single, unstable MediaPipe landmark. It works by isolating the fingertip area in the video frame, applying a grayscale threshold to identify the finger’s precise contour, and then calculating the statistically stable Center of Mass (Centroid) as the final contact point. This system, demonstrated in our multi-method testing environment, includes a crucial fallback mechanism to the previous proportional projection method, completing the core task of establishing reliable, high-precision fingertip tracking.
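
For reference, the sketch below condenses that pipeline into Python/OpenCV (our actual implementation runs in-browser with OpenCV.js; the ROI size, the Otsu thresholding choice, and the proportional_projection stub are illustrative assumptions, not final tuned values):

```python
import cv2
import numpy as np

def refine_tip(gray, rough_tip, half=24):
    """Pixel Centroid refinement: threshold a small ROI around the rough
    MediaPipe tip, find the finger contour, and return its center of mass."""
    x, y = int(rough_tip[0]), int(rough_tip[1])
    h, w = gray.shape
    x0, y0 = max(x - half, 0), max(y - half, 0)
    roi = gray[y0:min(y + half, h), x0:min(x + half, w)]

    # Grayscale threshold isolates finger pixels (Otsu picks the level).
    _, mask = cv2.threshold(roi, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Keep the largest contour; smaller blobs are usually noise.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return proportional_projection(rough_tip)  # fallback path
    finger = max(contours, key=cv2.contourArea)

    m = cv2.moments(finger)
    if m["m00"] == 0:
        return proportional_projection(rough_tip)  # degenerate contour
    # Centroid (center of mass) mapped back to frame coordinates.
    return (x0 + m["m10"] / m["m00"], y0 + m["m01"] / m["m00"])

def proportional_projection(rough_tip):
    """Stand-in for the previous proportional projection method; the real
    fallback recomputes the tip from hand geometry."""
    return rough_tip
```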

Scheduling:

I am currently on schedule. The stability provided by the Pixel Centroid method has successfully mitigated the primary technical risk related to keypress accuracy.

What I plan to do next week:

Next week’s focus is on Task 4.1: Tap Detection Logic. I will implement the core logic for detecting a keypress by analyzing the fingertip’s movement along the Z-axis (depth). This task involves setting a movement threshold, integrating necessary debouncing logic to ensure accurate single keypress events, and evaluating the results to determine if complementary tap detection methods are required.
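
As a rough sketch of what that logic might look like (Python; the threshold, debounce window, and the sign convention of the depth signal are placeholder assumptions to be tuned, and tip_z stands for whatever depth value the tracker provides):

```python
import time

class TapDetector:
    """Emits a single tap when the fingertip's depth crosses a threshold,
    with hysteresis and time-based debouncing so one press yields one event."""

    def __init__(self, press_delta=0.015, debounce_s=0.15):
        self.press_delta = press_delta  # Z movement that counts as a press (tunable)
        self.debounce_s = debounce_s    # ignore re-triggers inside this window
        self.baseline_z = None          # resting depth of the fingertip
        self.pressed = False
        self.last_tap = 0.0

    def update(self, tip_z, now=None):
        """Feed one depth sample per frame; returns True on a new tap.
        Assumes depth grows as the finger approaches the surface; flip the
        sign of dz if the tracker reports the opposite convention."""
        now = time.monotonic() if now is None else now
        if self.baseline_z is None:
            self.baseline_z = tip_z     # first sample defines the rest pose
            return False
        dz = tip_z - self.baseline_z

        if not self.pressed and dz > self.press_delta:
            self.pressed = True         # held state blocks repeat events
            if now - self.last_tap >= self.debounce_s:
                self.last_tap = now
                return True             # one keypress event
        elif self.pressed and dz < self.press_delta * 0.5:
            self.pressed = False        # hysteresis margin for release
            # Track slow drift in the resting depth between presses.
            self.baseline_z = 0.9 * self.baseline_z + 0.1 * tip_z
        return False
```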

Joyce’s Status Report for October 4

What I did this week:
This week, I worked on implementing the second fingertip tracking method for our virtual keyboard system. While our first method expands on the direct landmark detection of MediaPipe Hands to locate fingertips, this new approach applies OpenCV.js contour and convex hull analysis to identify fingertip points based on curvature and filtering. This method aims to improve robustness under varied lighting and in situations where the surface color is close to skin tone. The implementation is mostly complete, but more testing, filtering work, and parameter tuning are needed before a full comparison with the MediaPipe approach.
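
In outline, the contour/convex-hull method looks like the sketch below (Python/OpenCV standing in for OpenCV.js; the binary hand-mask input, the neighbor offset, and the curvature-angle cutoff are simplifying assumptions):

```python
import cv2
import numpy as np

def hull_fingertips(hand_mask, max_angle_deg=60):
    """Find fingertip candidates as sharp convex points on the hand contour.

    hand_mask: binary image of the hand (from color/background segmentation
    or a landmark-derived mask); the angle cutoff is a tunable assumption.
    """
    contours, _ = cv2.findContours(hand_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return []
    contour = max(contours, key=cv2.contourArea)  # largest blob = the hand

    hull_idx = cv2.convexHull(contour, returnPoints=False)
    tips = []
    n = len(contour)
    for idx in hull_idx.flatten():
        p = contour[idx][0]
        # Estimate local curvature from neighbors along the contour.
        a = contour[(idx - 10) % n][0]
        b = contour[(idx + 10) % n][0]
        v1, v2 = a - p, b - p
        cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-6)
        angle = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
        if angle < max_angle_deg:   # sharp convex points are tip candidates
            tips.append(tuple(p))
    # A full implementation would merge near-duplicate candidates per finger.
    return tips
```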

Scheduling:
I am slightly behind schedule because fingertip detection has taken longer than expected. I decided to explore multiple methods to ensure reliable tracking accuracy, since fingertip detection directly impacts keypress precision. To catch up, I plan to reduce the time spent on some minor tasks originally planned for the next few weeks and, if needed, ask teammates for help.

What I plan to do next week:
Next week, I will finish the second method, test and compare both fingertip tracking methods for accuracy and responsiveness, and then refine the better-performing one for integration into the main key detection pipeline.

Joyce’s Status Report for September 27

Accomplishments:
This week I transitioned from using MediaPipe Hands in Python to testing its JavaScript version with my computer webcam for real-time detection. I integrated my part into Hanning’s in-browser pipeline and verified that fingertip landmarks display correctly in live video. During testing, I noticed that when the palm is viewed nearly edge-on (appearing more like a line than a triangle), detection becomes unstable: landmark positions jitter significantly or the hand is not detected at all. To address this, we plan to tilt the phone or tablet so that the camera captures the palm from a more favorable angle.

After completing the initial hand landmark detection, I began work on fingertip detection. Since MediaPipe landmarks fall slightly short of the true fingertips, I researched three refinement methods:

  1. Axis-based local search: extend along the finger direction until leaving a hand mask to find the most distal pixel.
  2. Contour/convex hull: analyze the silhouette of the hand to locate fingertip extrema.
  3. CNN heatmap refinement: train a small model on fingertip patches to output sub-pixel tip locations.

I have started prototyping the first method using OpenCV.js and tested it live on my webcam to evaluate alignment between the refined points and the actual fingertips. This involved setting up OpenCV.js, building a convex hull mask from landmarks, and implementing an outward search routine.
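
A minimal sketch of that outward search routine (Python standing in for the OpenCV.js prototype; the step count is a placeholder, and the mask construction from landmarks is simplified away):

```python
import numpy as np

def axis_search_tip(mask, dip, tip, max_steps=30):
    """Walk outward from the landmark tip along the finger axis (DIP -> TIP
    direction) and return the most distal pixel still inside the hand mask.

    mask: binary hand mask (e.g. a filled convex hull of the 21 landmarks);
    dip, tip: (x, y) pixel coordinates of the DIP and TIP landmarks.
    """
    d = np.asarray(tip, float) - np.asarray(dip, float)
    d /= (np.linalg.norm(d) + 1e-6)        # unit vector along the finger
    h, w = mask.shape
    best = (int(tip[0]), int(tip[1]))
    for step in range(1, max_steps + 1):
        x, y = (np.asarray(tip, float) + step * d).round().astype(int)
        if not (0 <= x < w and 0 <= y < h) or mask[y, x] == 0:
            break                           # stepped off the hand silhouette
        best = (x, y)                       # still on the finger; keep going
    return best
```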

Next Week’s Goals:

  1. Complete testing and evaluation of the axis-based local search method.
  2. Implement the contour/convex hull approach for fingertip refinement.
  3. Collect comparison results between the two methods, and decide whether implementing the CNN heatmap method is necessary.

Joyce’s Status Report for September 20

This week I selected and validated a hand-detection model for our hardware-free keyboard prototype. I set up a Python 3.9 environment and integrated MediaPipe Hands, adding a script that processes static images and supports two-hand detection with annotated landmarks/bounding boxes. Using several test photos shot on an iPad under typical indoor lighting, the model consistently detected one or two hands and their fingertips; occasional failures still occur, and more testing is needed to understand their causes. Next week I’ll keep refining the script so the model consistently detects both hands, and then try to estimate the landing points of the fingertips.
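
The core of that script is only a few lines around the MediaPipe Hands API (a sketch; the file names are placeholders):

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils

# Static-image mode runs full detection on every image (no frame-to-frame tracking).
with mp_hands.Hands(static_image_mode=True, max_num_hands=2,
                    min_detection_confidence=0.5) as hands:
    img = cv2.imread("test_photo.jpg")      # placeholder file name
    # MediaPipe expects RGB; OpenCV loads BGR.
    results = hands.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        for hand in results.multi_hand_landmarks:
            mp_draw.draw_landmarks(img, hand, mp_hands.HAND_CONNECTIONS)
    cv2.imwrite("annotated.jpg", img)       # annotated output for review
```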

Team Status Report for September 20

This week (ending Sep 20) we aligned on a web application as the primary UI, with an optional server path only if heavier models truly require it. We’re prioritizing an in-browser pipeline to keep latency low and deployment simple, while keeping a small Python fallback available. We also validated hand detection on iPad photos using MediaPipe Hands / MediaPipe Tasks – Hand Landmarker and found it sufficient for early fingertip landmarking.

On implementation, we added a simple browser camera capture to grab frames and a Python 3.9 script using MediaPipe Hands to run landmark detection on those frames. The model reliably detected one or two hands in our test images and produced annotated outputs.