Yilei’s Status Report for December 6

This week, I wired the pressure sensors on each finger and their resistors to the Arduino. I also figured out how to fit all the parts on one hand: a tiny breadboard (with wires connecting the sensors to the Arduino) is glued onto an Arduino that can be strapped to the wrist with a rubber band. The wires, with the pressure sensors at their ends, wrap around each finger, and each sensor is secured to the fingertip with thin tape and a small sticker to reduce the impact on the stability of hand detection. I added debounce to the detection algorithm so that holding a finger down slightly too long still counts as a single tap event; the state is only reported once when it changes. We tested the implementation with the newly added pressure sensors. Although we now see more false positives, the false negatives are significantly reduced. The main reason for the new false positives is that the wires and tape on the finger interfere with hand recognition, and the sticker under the fingertip affects fingertip detection. I will replace the stickers with clear tape to see whether fingertip detection is less affected. I also ran tap performance tests across different lighting conditions and score thresholds.
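A minimal sketch of the debounce idea in JavaScript (the real logic may live on the Arduino side; the names and the debounce interval here are illustrative assumptions):

```javascript
// Debounce sketch: report a tap only on the idle -> pressed transition,
// and ignore further "pressed" readings until the finger is released again.
const DEBOUNCE_MS = 50;          // assumed minimum time between state changes
const fingerStates = new Map();  // fingerIndex -> { pressed, lastChange }

function updateFinger(fingerIndex, isPressedNow, emitTap) {
  const now = Date.now();
  const state = fingerStates.get(fingerIndex) ?? { pressed: false, lastChange: 0 };

  // Only act when the raw reading disagrees with the stored state
  // and the last change happened long enough ago.
  if (isPressedNow !== state.pressed && now - state.lastChange > DEBOUNCE_MS) {
    state.pressed = isPressedNow;
    state.lastChange = now;
    if (isPressedNow) emitTap(fingerIndex);  // fires once per press, not once per reading
  }
  fingerStates.set(fingerIndex, state);
}
```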

We are on schedule. Adding the physical sensors is the extra work we decided on last week (a last-minute fix for our insufficiently sensitive tap detection).

Next week, we need to finish testing the new system that uses pressure sensors for tap detection and complete the final demo video and report. We will demo both versions: one with the pressure sensors and one with vision-based motion thresholding.

Joyce’s Status Report for December 1st

What I did this week

Over Thanksgiving week, I wrote a script to log fingertip positions, manually labeled ground-truth fingertip/tap locations by visual inspection, and compared them against the computer-detected positions to understand our current accuracy and failure modes.
This week I focused on integrating the new pressure-sensor hardware into our virtual keyboard system. I designed and finalized a voltage-divider wiring diagram for the fingertip sensors, soldered the connectors and leads, and wrote the Arduino code to read and stream pressure data into our existing pipeline. Together with my teammates, I iterated on different fixed-resistor values to obtain a useful dynamic range from the sensors, then ran bench and on-keyboard tests to verify that taps were reliably detected under realistic typing motions and that the hardware tap signals lined up well with our vision-based tap events.
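For reference, the fixed resistor and the sensor's resistance form a standard voltage divider; assuming the force-sensitive resistor (FSR) sits between Vcc and the analog pin with the fixed resistor to ground, the analog reading follows

$$ V_{\text{out}} = V_{cc} \cdot \frac{R_{\text{fixed}}}{R_{\text{fixed}} + R_{\text{FSR}}} $$

Since the sensor's resistance drops as the fingertip presses harder, V_out rises with pressure; choosing a fixed resistor roughly comparable to the sensor's resistance around a light tap spreads that change across the ADC range, which is what the resistor-value iteration above was effectively tuning.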

Scheduling

Our progress is mostly on schedule, and the system is in a state that we are comfortable bringing to demo day. The main hardware integration risk has been addressed now that the pressure sensors are wired, calibrated, and feeding into the software stack.

Plans for next week

Next week, I plan to support the public demo, help finalize and record the demo video, and contribute to writing and revising the final report (especially the sections on tap detection, hardware integration, and testing/validation). If time permits, I also hope to rerun some of the fingertip and tap-detection tests using the new pressure-sensor input, so we can include updated quantitative results that better reflect the final system.

Team’s Status Report for December 6

Most Significant Risks and Mitigation
Our major risk this week continued to be tap detection accuracy. Despite several rounds of tuning thresholds, filtering sudden CV glitches, and improving motion heuristics, the camera-only method still failed to meet our accuracy requirement.

To mitigate this risk, we made a decisive design adjustment: adding external hardware support through pressure-sensitive fingertip sensors. Each sensor is attached to a fingertip and connected to an Arduino mounted on the back of the hand. We use two Arduinos in total (one per hand), each supporting four sensors. The Arduino performs simple edge detection ("tapped" vs. "idle") and sends these states to our web app, where they replace our existing tap module in the sensor-signal → key → text-editor pipeline. This hardware-assisted approach reduces false negatives, which were our biggest issue.
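A rough sketch of how a hardware tap can reuse the existing key lookup, with getFingertipLandmark, mapPointToKey, and insertIntoEditor as hypothetical stand-ins for pieces we already have:

```javascript
// When the Arduino reports a tap on a given finger, reuse the camera pipeline's
// fingertip position and key mapping instead of the camera-based tap decision.
function onHardwareTap(handIndex, fingerIndex) {
  const tip = getFingertipLandmark(handIndex, fingerIndex); // current Mediapipe fingertip
  if (!tip) return;                                         // hand not visible this frame
  const key = mapPointToKey(tip.x, tip.y);                  // existing overlay -> key lookup
  if (key) insertIntoEditor(key);                           // existing text-editor path
}
```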

Changes to System Design
Our system now supports two interchangeable tap-detection modes:

  1. Camera-based tap mode (our original pipeline).
  2. Pressure-sensor mode (hardware-assisted tap events from Arduino).

The rest of the system, including fingertip tracking, keyboard overlay mapping, and text-editor integration, remains unchanged. The new design preserves our AR keyboard’s interaction model while introducing a more robust and controllable input source. We are now testing both methods side by side to measure accuracy, latency, and overall usability, ensuring that we still meet our project requirements even if the pure CV solution remains unreliable.

Unit Tests (fingertip)
We evaluated fingertip accuracy by freezing frames, then manually clicking fingertip positions in a fixed left-to-right order and comparing them against our detected fingertip locations over 14 valid rounds (10 fingers each). The resulting mean error is only ~11 px (|dx| ≈ 7 px, |dy| ≈ 7 px), which corresponds to well under ¼ key-width in X and ½ key-height in Y. Thus, the fingertip localization subsystem meets our spatial accuracy requirement.
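A tiny sketch of how these error statistics can be computed from the logged pairs (the data layout here is an assumption):

```javascript
// pairs: [{ clicked: { x, y }, detected: { x, y } }, ...] over all rounds and fingers
function fingertipErrorStats(pairs) {
  let sumDx = 0, sumDy = 0, sumDist = 0;
  for (const { clicked, detected } of pairs) {
    const dx = Math.abs(detected.x - clicked.x);
    const dy = Math.abs(detected.y - clicked.y);
    sumDx += dx;
    sumDy += dy;
    sumDist += Math.hypot(dx, dy);  // Euclidean pixel error for this fingertip
  }
  const n = pairs.length;
  return { meanAbsDx: sumDx / n, meanAbsDy: sumDy / n, meanError: sumDist / n };
}
```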

We also conducted unit tests for calibration by timing 20 independent calibration runs and confirming the average time met our ≤15 s requirement.

System Tests 
We measured tap-event latency by instrumenting four timestamps (A–D) in our pipeline: tap detection (A), event reception in app.js (B), typing-logic execution (C), and character insertion in the text editor (D). The end-to-end result (A→D) is 7.31 ms, which is within expected timing bounds.
A→B: 7.09 ms
A→C: 7.13 ms
A→D: 7.31 ms
B→D: 0.22 ms
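A minimal sketch of how such timestamps can be collected with performance.now(); the stage-marking helper and its call sites are assumptions about where the hooks sit:

```javascript
// Timestamps A-D recorded as one tap event moves through the pipeline.
const marks = {};

function markStage(stage) {            // stage: 'A' | 'B' | 'C' | 'D'
  marks[stage] = performance.now();
}

function reportLatency() {
  console.log(`A->B: ${(marks.B - marks.A).toFixed(2)} ms`);
  console.log(`A->C: ${(marks.C - marks.A).toFixed(2)} ms`);
  console.log(`A->D: ${(marks.D - marks.A).toFixed(2)} ms`);
  console.log(`B->D: ${(marks.D - marks.B).toFixed(2)} ms`);
}

// e.g. markStage('A') at tap detection, markStage('B') when app.js receives the event,
// markStage('C') after typing logic, markStage('D') after insertion, then reportLatency().
```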

For accuracy, we performed tap-accuracy experiments by collecting ground-truth taps and measuring detection and false-positive rates across extended typing sequences under controlled illuminance values (146, 307, and 671 lux).

  • Tap detection rate = correct / (correct + undetected) = 19.4%
  • Mistap (false positive) rate = false positives / (correct + undetected) = 12.9%

Hanning’s Status Report for December 6

This week I explored a new hardware-assisted tap detection method for our virtual keyboard. Instead of relying solely on camera-based motion, we connected pressure-sensitive resistors to each fingertip and fed the signals into an Arduino. Initially, we could only see the pressure values in the Arduino Serial Monitor, and I wasn’t sure how to turn that serial output into something JavaScript could use. To solve this, I implemented a new module, pressure.js, that uses the Web Serial API to read the Arduino’s serial stream directly in the browser. The Arduino sends simple newline-delimited messages like “sensorIndex,state” (e.g., 1,tapped), and pressure.js parses each line into a structured event with handIndex, sensorIndex, fingerName, fingerIndex, and state.
Within pressure.js, I added helper logic to normalize different tap labels (tapped, tap, pressed, 1, etc.) into a single “tap” event and then invoke user-provided callbacks (onState and onTap). This lets the rest of our web framework treat Arduino pressure events exactly like our previous tap detection output: we can map fingerIndex to the existing Mediapipe fingertip landmarks and reuse the same downstream pipeline that maps fingertips to keys and updates the text editor. In other words, Arduino signals are now fully integrated as JavaScript events/variables instead of being stuck in the serial monitor.
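A condensed sketch of this approach, assuming the line format described above and a 9600-baud serial link (the real pressure.js also tracks handIndex/fingerName and does richer label normalization):

```javascript
// Read newline-delimited "sensorIndex,state" messages from the Arduino over Web Serial
// and normalize the various tap labels into a single "tap" state.
async function listenToArduino({ onState, onTap }) {
  const port = await navigator.serial.requestPort();  // must be triggered by a user gesture
  await port.open({ baudRate: 9600 });                // assumed baud rate

  const decoder = new TextDecoderStream();
  port.readable.pipeTo(decoder.writable);
  const reader = decoder.readable.getReader();

  const TAP_LABELS = new Set(['tapped', 'tap', 'pressed', '1']);
  let buffer = '';

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += value;

    let newline;
    while ((newline = buffer.indexOf('\n')) >= 0) {
      const line = buffer.slice(0, newline).trim();
      buffer = buffer.slice(newline + 1);
      if (!line) continue;

      const [indexStr, rawState] = line.split(',');
      const sensorIndex = Number(indexStr);
      const state = TAP_LABELS.has((rawState ?? '').trim().toLowerCase()) ? 'tap' : 'idle';

      onState?.({ sensorIndex, state });
      if (state === 'tap') onTap?.({ sensorIndex });
    }
  }
}
```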

The only schedule change is that my slack week is now going toward implementing this pressure-sensor solution.

Hanning’s Status Report for November 22

What I did this week
This week I restructured the dataflow pipeline for the entire system. Previously, the virtual keyboard and the physical keyboard behaved like two separate input paths feeding into the text editor. I rewrote the logic so that the editor only ever receives standard keystroke events, regardless of whether they originate from the physical keyboard or from our CV-driven virtual keyboard. This required reorganizing the flow of signals from the fingertip detection module and converting virtual taps into synthetic KeyboardEvent objects that mirror real hardware keyboard input. With this unified pipeline, the text editor no longer needs to know the difference between physical and virtual typing, simplifying integration and reducing bugs caused by mismatched interfaces.
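A small sketch of the conversion step, with editorElement as a hypothetical target; note that untrusted synthetic events do not insert text on their own, which is why the editor's own keydown handling, not the browser default, has to perform the actual insertion:

```javascript
// Turn a virtual tap on a key into a synthetic KeyboardEvent so the editor's
// key-handling code sees the same event shape as a physical keystroke.
function dispatchVirtualKey(editorElement, key, { shift = false } = {}) {
  const init = { key, shiftKey: shift, bubbles: true, cancelable: true };
  editorElement.dispatchEvent(new KeyboardEvent('keydown', init));
  editorElement.dispatchEvent(new KeyboardEvent('keyup', init));
}

// Example: dispatchVirtualKey(document.querySelector('#editor'), 'a');
```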

I also fixed several UI behavior issues. First, I implemented the Shift UI logic we planned last week: Shift presses now update the on-screen keyboard layout dynamically, showing the shifted characters while Shift is active and reverting afterward. Second, I made "Focus on Editor" trigger automatically when the user starts typing. Finally, I limited camera selection to the front camera only, since using the rear camera results in a mirrored video feed and breaks calibration symmetry.
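A minimal sketch of the front-camera restriction using the standard getUserMedia facingMode constraint; mirroring the preview with CSS is my assumption about how the display side could be handled:

```javascript
// Request the front-facing camera and mirror the displayed preview horizontally,
// so the on-screen hand moves the same way as the real hand.
async function startFrontCamera(videoElement) {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { facingMode: 'user' },               // front camera on mobile devices
    audio: false,
  });
  videoElement.srcObject = stream;
  videoElement.style.transform = 'scaleX(-1)';   // mirror only the visible preview
  await videoElement.play();
}
```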
I'm behind schedule since I haven't started testing. I will try to catch up next week.
What I plan to do next week
Next week I will begin testing the new unified dataflow and verifying that the text editor behaves correctly under virtual keyboard input.

Over the course of these tasks, I had to pick up several new skills on my own. Most of my learning came from studying GitHub repositories, reading browser and JavaScript API documentation, and searching Stack Overflow whenever I hit a problem. I learned how to debug JavaScript and HTML more effectively using the console (checking event flow, inspecting DOM elements, and printing intermediate fingertip and keystroke data). I also learned how mobile browsers handle camera usage, including permissions, mirroring behavior, and device-specific constraints. For local development, I learned how to run a local server properly to avoid camera-access issues. On the text editor side, I explored how keystrokes are handled, how autocorrection works, and how to integrate synthetic events cleanly.

Joyce’s Status Report for November 22nd

What I did this week

This week, I finished integrating my tap detection pipeline into the fully combined system so that detected taps now drive the actual keyboard mapping in the interface. Once everything was wired end-to-end, I spent a lot of time testing and found that tap detection itself is still not accurate or robust enough for real typing. In hindsight, the standalone tap detection version didn't get enough focused testing or tuning before integration. To make tuning easier, I added more parameters and tried different methods (for example, thresholds related to tap score, motion, and timing) and spent many hours trying different values under varied lighting and hand-motion conditions. I also began experimenting with simple improvements, such as motion/height-change–like cues from the landmarks, and started researching 3D solutions and shadow-based tap detection to better distinguish intentional taps from jittery finger movements.

Scheduling

My progress is slightly behind schedule. Tap detection is integrated and functional, but its reliability is not yet where we want it for final testing. To catch up, I plan to devote extra time to tuning and to improving our logging and visualization so that each iteration is more informative. My goal is to bring tap detection to a reasonably stable state before Thanksgiving break.

What I plan to do next week

Next week, I plan to push tap detection as far as possible toward a stable, usable configuration and then support the team through final testing. Concretely, I want to lock down a set of thresholds and conditions that give us the best balance between missed taps and false positives, and document those choices clearly. In parallel, I will try a few alternative tap detection ideas and go deeper on whichever one shows the most promise. With that in place, I’ll help run our planned tests: logging detected taps and key outputs, measuring latency where possible, and participating in accuracy and WPM comparisons against a baseline keyboard.

New tools and knowledge

Over the course of this project, I had to learn several new tools and concepts at the same time I was building the system. On the implementation side, I picked up practical HTML, CSS, and JavaScript skills so I could work with a browser-based UI, and connect the vision pipeline to the calibration and keyboard interfaces in real time. In parallel, I learned how to research and evaluate fingertip and tap detection methods—reading online docs, forum posts, and example projects about hand landmarks, gradient-based cues, and motion/height-change–style heuristics—then turn those ideas into simple, tunable algorithms. Most of this knowledge came from informal learning strategies: looking up small pieces of information as needed, experimenting directly in our code, adding visual overlays and logging, and iteratively testing and tuning until the behavior matched what we wanted.

Yilei’s Status Report for November 22

This week I shifted focus to tap detection. Building on our existing logic that marks a fingertip as a tap candidate when its vertical motion meets thresholds for downward speed, total travel distance, and a clear stop, I changed how we handle cases where multiple fingers on the same hand satisfy these conditions in the same time window. Because one finger tapping often drags neighboring fingers with similar motion, we now treat all of them as candidates but select only the finger whose fingertip ends up at the lowest vertical position as the actual tap. I also added a visual highlight on the overlaid keyboard so that whenever a tap is mapped to a key, that key briefly lights up. This makes it easier for users to see what was recognized without relying solely on the text editor output.
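A minimal sketch of this selection rule, assuming image coordinates where y grows downward so the lowest fingertip has the largest y:

```javascript
// Among fingers that passed the speed/travel/stop thresholds in the same time window,
// keep only the one whose fingertip ends lowest in the frame (largest y).
function pickTapFinger(candidates) {
  // candidates: [{ fingerIndex, tipY }, ...] for one hand and one time window
  if (candidates.length === 0) return null;
  return candidates.reduce((lowest, c) => (c.tipY > lowest.tipY ? c : lowest));
}
```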

We are currently behind schedule, as tap detection still needs significant improvement in reducing false positives. I am exploring additional thresholding and feature checks for correct taps, as well as the relationships between fingers during a tap, to further reduce errors. I will need to finish this and start testing early next week.

Next week, I plan to continue improving the tap detection algorithm, complete our testing, and integrate the parts of my subsystem that were developed after the last integration into the main system.

Over the course of the project, most of the new things I had to learn were about calibration and keyboard overlay geometry. For tap detection (which I only started exploring this past week), I briefly read about filtering techniques in a textbook, which was a bit beyond my current level but gave me intuition for possible future improvements. The main parts I worked on over the past two months start after we already have fingertip positions. I had to map our rectangular keyboard layout onto the skewed keyboard shape in the camera view and decide which key each fingertip position should correspond to. When the overlay was not drawing correctly or an error appeared, I relied on online resources, since I had little prior experience with these kinds of coordinate transforms or the canvas and rendering APIs. I then tweaked the mapping math and compared the overlay against the physical keyboard until the calibration and key mapping behaved the way we wanted. Overall, I learned these math and graphics concepts on my own, mainly through documentation, online examples, and trial and error.

Team’s Status Report for November 22

Most Significant Risks and Mitigation
This week, our main challenge came from tap detection instability. If the sensitivity is too high, the system picks up random glitches as taps; if we reduce sensitivity, normal taps get missed while glitches still sometimes pass through. Overall, it’s still difficult for the model to reliably distinguish between a real tap, natural hand movement, and a sudden CV glitch.
To mitigate this, we worked on two short-term fixes:
1. Filtering "teleport" motion: when fingertip coordinates jump too far between frames, we now label them as glitches and discard those frames (a minimal sketch of this filter follows the list).
2. Re-tuning tap sensitivity: we are testing a middle-ground threshold that keeps normal taps detectable without letting small jitters trigger fake keys.
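A rough sketch of the teleport filter, with the pixel threshold as an assumed, experimentally tuned value:

```javascript
// Discard a frame's fingertip reading if it "teleports" too far from the previous frame.
const MAX_JUMP_PX = 80;            // assumed threshold; tuned experimentally
const lastPositions = new Map();   // fingerIndex -> { x, y }

function isGlitch(fingerIndex, x, y) {
  const prev = lastPositions.get(fingerIndex);
  if (!prev) {
    lastPositions.set(fingerIndex, { x, y });
    return false;                                // first observation, accept it
  }
  const jump = Math.hypot(x - prev.x, y - prev.y);
  if (jump > MAX_JUMP_PX) return true;           // too fast to be real motion; drop this frame
  lastPositions.set(fingerIndex, { x, y });      // normal motion, remember it
  return false;
}
```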

Changes to System Design
While we continue tuning the glitch-filtering pipeline, we also started researching a new design direction: Reconstructing approximate 3D finger movement from the camera stream.
The idea is that true taps correspond to vertical motion toward the desk, whereas random movement is usually horizontal or diagonal. If we can estimate whether a finger is moving “downward” vs “across,” tap detection becomes much more robust.

Schedule Changes
We may need more time for testing and tuning, so we plan to convert our Week 11 slack week into a testing/verification week.

Hanning’s Status Report for November 15

What I did this week:
This week, I worked on additional features for the autocorrection system in the text editor. I focused on building the rule-based autocorrection engine, which handles common typos, capitalization fixes, and safe word replacements using a dictionary-style approach. In parallel, I explored possible machine learning–based autocorrection strategies. Additionally, I encountered an integration issue with the modifier keys (Shift, Control, Option, Command): after Joyce merged fingertip detection into the full pipeline, these modifiers stopped functioning correctly when tapped, so I have been debugging and tracing the event flow to restore proper toggle behavior.
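A minimal sketch of the dictionary-style rule approach (the example rules here are made up, not our actual table):

```javascript
// Dictionary-style autocorrection: exact typo -> replacement, applied per word.
const CORRECTIONS = new Map([
  ['teh', 'the'],       // illustrative rules; the real table is larger
  ['adn', 'and'],
  ['i', 'I'],           // simple capitalization fix
]);

function autocorrectWord(word) {
  // Preserve trailing punctuation so "teh." still becomes "the."
  const match = word.match(/^(\w+)(\W*)$/);
  if (!match) return word;
  const [, core, tail] = match;
  const fixed = CORRECTIONS.get(core.toLowerCase());
  return fixed !== undefined ? fixed + tail : word;
}

// Typically run when a word boundary (space or punctuation) is typed:
// autocorrectWord('teh') === 'the'
```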

Scheduling:
My progress is slightly behind schedule due to time spent diagnosing the modifier key integration issue.

What I plan to do next week:
Next week, I will add a visual signal for Shift presses and use that signal to dynamically update the on-screen keyboard layout, displaying the shifted key characters while Shift is active and reverting when it's toggled off. I will also fix and refine several user interface elements, including the "Focus on Editor" and "Select Camera" buttons (we should always use the front camera on mobile devices, and that camera view should be mirrored to fit our finger-detection module).

Verification (Web Framework + Text Editor):
For my part, most verification is functional rather than heavily quantitative, but I still have a few tests planned. For the web framework, I'll run basic end-to-end checks on different devices (laptop and phone) to make sure the camera loads, switching cameras works, calibration starts correctly, and the UI buttons (focus editor, select camera, black mode, etc.) behave consistently. I'll also measure simple responsiveness by logging the time between a simulated key event and the character showing up in the editor, just to confirm the typing pipeline isn't adding noticeable delay. For the text editor, I'll use small scripted tests that feed in sequences of pressKey/insertText calls (including toggling Shift/Control/Option/Command) and check that the final text matches what we expect; a sketch of this is below. I also prepared a small list of common typos to see whether the rule-based autocorrection fixes them correctly without breaking normal words. This gives me a quick accuracy snapshot and helps make sure nothing behaves unpredictably when we integrate everything together.
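One possible shape for such a scripted check, using the pressKey/insertText names above plus a hypothetical getEditorText helper, and assuming Shift toggles off after one character:

```javascript
// Feed a scripted sequence of key events through the editor and compare the result.
function runEditorScript(steps) {
  for (const step of steps) {
    if (step.type === 'press') pressKey(step.key);           // e.g. 'Shift', 'h', 'Backspace'
    else if (step.type === 'insert') insertText(step.text);
  }
  return getEditorText();   // hypothetical helper returning the editor's contents
}

// Example: Shift toggled on for one letter, then plain text.
const output = runEditorScript([
  { type: 'press', key: 'Shift' },
  { type: 'press', key: 'h' },
  { type: 'insert', text: 'ello world' },
]);
console.assert(output === 'Hello world', 'unexpected editor contents');
```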

Joyce's Status Report for November 15th

What I did this week
This week, I focused on integrating more of the system so that all major pieces can start working together. I updated the integrated version to use our most up-to-date fingertip detection method from the earlier prototype and made sure the fingertip and hand landmark visualization behaves consistently after integration. I also started wiring tap detection into the main pipeline so that taps are computed from the raw landmarks instead of being a standalone demo. A good portion of my time went into debugging integration issues (camera behavior, calibration alignment, display updates) and checking that the detection pipeline runs smoothly.

Scheduling
My progress is slightly behind schedule. While fingertip detection is now integrated and tap detection is partially connected, tap events are not yet fully linked to the keyboard mapping, and the logging system for recording taps and timing is still incomplete. These pieces were originally planned to be further along by the end of this week.

What I plan to do next week
Next week, I plan to complete the connection from tap detection to the keyboard mapping so that taps reliably generate the intended key outputs, and to implement the logging infrastructure needed for our upcoming accuracy and usability tests. After that, I aim to run initial internal dry runs to confirm that the integrated system and logging behave as expected, so the team can move smoothly into the revised testing plan.