Team’s Status Report for September 27

The main risk is the reliability of hand detection using MediaPipe. When the palm is viewed nearly edge-on (appearing more like a line than a triangle), the detected landmarks jitter significantly or the hand may not be detected at all, which threatens accurate fingertip tracking. To manage this, we are tilting the camera to improve hand visibility and applying temporal smoothing to stabilize landmark positions. We have also updated the design to incorporate vibration detection into the tap-detection pipeline, since relying on vision alone makes it difficult to distinguish hovering from actual keystrokes.
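As a rough sketch of the temporal smoothing we have in mind, an exponential moving average over each landmark is one simple option. The TypeScript below is illustrative only; the smoothing factor is an assumed placeholder, not a tuned value.

```ts
// Minimal exponential-moving-average smoother for hand landmarks.
// alpha closer to 1 trusts the new frame more; closer to 0 smooths harder.
// 0.4 is an assumed starting point, not a tuned value.
type Landmark = { x: number; y: number; z: number };

class LandmarkSmoother {
  private prev: Landmark[] | null = null;
  constructor(private alpha = 0.4) {}

  smooth(current: Landmark[]): Landmark[] {
    // First frame (or hand re-acquired with a different landmark count): no history yet.
    if (!this.prev || this.prev.length !== current.length) {
      this.prev = current.map(p => ({ ...p }));
      return this.prev;
    }
    // Blend each coordinate with the previous smoothed value.
    this.prev = current.map((p, i) => ({
      x: this.alpha * p.x + (1 - this.alpha) * this.prev![i].x,
      y: this.alpha * p.y + (1 - this.alpha) * this.prev![i].y,
      z: this.alpha * p.z + (1 - this.alpha) * this.prev![i].z,
    }));
    return this.prev;
  }

  // Call when the hand is lost so stale history is not blended into a new detection.
  reset(): void {
    this.prev = null;
  }
}
```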

Part A: Public Health, Safety or Welfare

Our product supports public welfare by improving accessibility and comfort in digital interaction. Because it operates entirely through a camera interface, it can benefit users who find it difficult to press down physical keys due to mobility, dexterity, or strength limitations. By requiring no additional hardware or forceful contact, the system provides a low-effort and inclusive way to input text. In terms of psychological well-being, all processing is performed locally on the user’s device, and no video frames or images are stored or transmitted. This protects personal privacy and reduces anxiety related to data security or surveillance. By combining accessibility with privacy-preserving design, the system enhances both the welfare and peace of mind of its users.

Part B: Social Factors

Our virtual keyboard system directly addresses the growing social need for inclusive, portable, and accessible computing. In many educational and professional settings—such as shared classrooms, libraries, and public workspaces—users must often type on the go without carrying physical hardware, which may be costly, impractical, or socially disruptive. By enabling natural typing on any flat surface, our design reduces barriers for mobile students and low-income users without access to external peripherals. For example, a commuter can take notes on a tray table during a train ride, or a student with limited finger dexterity can type with adaptive finger placement during lectures. Socially, this technology supports a more equitable digital experience by removing dependency on specialized devices, promoting inclusivity in both educational and workplace contexts. Moreover, it respects users’ privacy by running entirely on-device and not transmitting camera data to the cloud.

Part C: Economic Factors

As a web app, HoloKeys meets the need for portable, hardware-free typing while minimizing costs for both users and providers. Users don’t buy peripherals or install native software; they simply open a URL. This shifts the total cost of ownership away from hardware and toward a service with negligible marginal cost, which lowers adoption barriers for students, travelers, and anyone for whom carrying a keyboard is impractical. Additionally, HoloKeys may modestly substitute for portable Bluetooth keyboards but is largely complementary to laptops; its primary use cases are phone- and tablet-first contexts where a full laptop is unnecessary or inconvenient.

Part A was written by Joyce Zhu, Part B by Hanning Wu, and Part C by Yilei Huang.

Hanning’s Status Report for September 27

This week I focused on two coding tasks and one deliverable with the team. Early in the week I restructured the camera webpage code into three modules (an HTML framework plus camera.js and snapshot.js), which fixed local serving issues on a specific browser (Safari) and makes future integration easier. I then started implementing a built-in text editor below the video preview (a textarea plus helper APIs like insertText/pressKey) so that we can type into a real target. In parallel, I worked with my teammates to complete the design presentation slides (the web app and testing sections). Next week I plan to keep working on the text editor and begin a basic auto-correction implementation.
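For reference, here is a hypothetical sketch of how the insertText/pressKey helpers could sit on top of the textarea; the element id and the exact key handling are assumptions, not the final API.

```ts
// Hypothetical sketch of the editor helpers mentioned above, wired to a <textarea>.
// The "editor" element id and the key-handling choices are assumptions.
const editor = document.getElementById("editor") as HTMLTextAreaElement;

// Insert text at the current caret position and move the caret past it.
function insertText(text: string): void {
  editor.setRangeText(text, editor.selectionStart, editor.selectionEnd, "end");
  editor.dispatchEvent(new Event("input", { bubbles: true }));
}

// Handle non-printing keys; single printable characters fall through to insertText.
function pressKey(key: string): void {
  if (key === "Backspace") {
    const hasSelection = editor.selectionStart !== editor.selectionEnd;
    const start = hasSelection
      ? editor.selectionStart
      : Math.max(0, editor.selectionStart - 1);
    editor.setRangeText("", start, editor.selectionEnd, "end");
  } else if (key === "Enter") {
    insertText("\n");
  } else if (key.length === 1) {
    insertText(key);
  }
}
```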

Joyce’s Status Report for September 20

This week I selected and validated a hand-detection model for our hardware-free keyboard prototype. I set up a Python 3.9 environment and integrated MediaPipe Hands, adding a script that processes static images and supports two-hand detection with annotated landmarks and bounding boxes. Using several test photos shot on an iPad under typical indoor lighting, the model consistently detected one or two hands and their fingertips; failures occasionally occur, and more testing is needed to understand their causes. Next week I’ll keep refining the script so that the model consistently detects both hands, and then start estimating the landing points of the fingertips.

Yilei’s Status Report for September 20

This week I worked on Task 1.1, surface detection and plane fitting. I integrated OpenCV.js into our project and implemented the calibration flow where the user taps four corners of the typing surface. From these taps, I now compute the homography matrix that maps image pixels to the keyboard plane. My progress is on schedule. For next week, I plan to add a minimal overlay to visualize the calibration results and begin preparing the mapping function for Task 1.4.
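As an illustration of the pixel-to-plane mapping, the TypeScript sketch below uses OpenCV.js’s getPerspectiveTransform on the four tapped corners (with exactly four correspondences this yields the 3x3 homography directly); the corner ordering and plane dimensions are assumptions for the example, not the actual calibration parameters.

```ts
// Sketch of the pixel-to-plane mapping, assuming OpenCV.js is loaded as the global `cv`
// and the four tapped corners arrive as [x, y] pixel pairs in
// top-left, top-right, bottom-right, bottom-left order.
declare const cv: any;

// Keyboard-plane size in arbitrary units (placeholder values, not the real layout).
const PLANE_W = 300;
const PLANE_H = 100;

function computeHomography(corners: [number, number][]) {
  const src = cv.matFromArray(4, 1, cv.CV_32FC2, corners.flat());
  const dst = cv.matFromArray(4, 1, cv.CV_32FC2, [
    0, 0,             // top-left
    PLANE_W, 0,       // top-right
    PLANE_W, PLANE_H, // bottom-right
    0, PLANE_H,       // bottom-left
  ]);
  const H = cv.getPerspectiveTransform(src, dst); // 3x3 image-pixel -> plane transform
  src.delete();
  dst.delete();
  return H; // caller is responsible for H.delete()
}

// Map one fingertip pixel into keyboard-plane coordinates.
function toPlane(H: any, x: number, y: number): [number, number] {
  const pt = cv.matFromArray(1, 1, cv.CV_32FC2, [x, y]);
  const out = new cv.Mat();
  cv.perspectiveTransform(pt, out, H);
  const mapped: [number, number] = [out.data32F[0], out.data32F[1]];
  pt.delete();
  out.delete();
  return mapped;
}
```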

Team Status Report for September 20

This week (ending Sep 20) we aligned on a web application as the primary UI, with an optional server path only if heavier models truly require it. We’re prioritizing an in-browser pipeline to keep latency low and deployment simple, while keeping a small Python fallback available. We also validated hand detection on iPad photos using MediaPipe Hands / MediaPipe Tasks – Hand Landmarker and found it sufficient for early fingertip landmarking.

On the implementation side, we added a simple browser camera-capture page to grab frames and a Python 3.9 script that uses MediaPipe Hands to run landmark detection on those frames. The model reliably detected one or two hands in our test images and produced annotated outputs.
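For context, a minimal TypeScript sketch of the in-browser detection path using the MediaPipe Tasks Hand Landmarker is shown below (the validation script itself is Python; the WASM URL, model path, and element id here are placeholders):

```ts
// Minimal sketch of browser-side two-hand detection with the MediaPipe Tasks
// Hand Landmarker. The WASM URL, model path, and element id are placeholders.
import { FilesetResolver, HandLandmarker } from "@mediapipe/tasks-vision";

async function detectHandsInImage(): Promise<void> {
  const vision = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision/wasm" // assumed CDN location
  );
  const landmarker = await HandLandmarker.createFromOptions(vision, {
    baseOptions: { modelAssetPath: "hand_landmarker.task" }, // local copy of the model
    runningMode: "IMAGE",
    numHands: 2,
  });

  const photo = document.getElementById("testPhoto") as HTMLImageElement;
  const result = landmarker.detect(photo);

  // One array of 21 normalized landmarks per detected hand; index 8 is the index fingertip.
  for (const hand of result.landmarks) {
    const tip = hand[8];
    console.log(`index fingertip at (${tip.x.toFixed(3)}, ${tip.y.toFixed(3)})`);
  }

  landmarker.close();
}
```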

Hanning’s Status Report for September 20

This week I focused on building the camera input pipeline. Early in the week I set up a webpage that requests camera permission and streams live video using the MediaDevices API, with controls to start/stop and pick a camera (front/rear), and a frame loop that draws each frame to a hidden canvas for processing. Later in the week I added single-frame capture: I can now grab the current video frame and export it as a JPEG (via canvas, with optional ImageCapture when available). Next week I plan to write an API that wires these frames into the CV pipeline and begin basic keystroke-event prototyping.
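A condensed sketch of that capture path is included below for reference; the element id, facing-mode default, and JPEG quality are assumptions rather than the exact implementation.

```ts
// Condensed sketch of the capture path described above: stream the camera into a
// <video> element, draw the current frame to a canvas, and export it as a JPEG blob.
// The "preview" element id, facingMode default, and 0.9 JPEG quality are assumptions.
const video = document.getElementById("preview") as HTMLVideoElement;
const canvas = document.createElement("canvas"); // hidden canvas used for processing

async function startCamera(facingMode: "user" | "environment" = "environment") {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { facingMode },
    audio: false,
  });
  video.srcObject = stream;
  await video.play();
}

// Draw the current video frame to the canvas and export it as a JPEG blob.
function captureFrame(): Promise<Blob | null> {
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext("2d")!;
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  return new Promise(resolve => canvas.toBlob(resolve, "image/jpeg", 0.9));
}

function stopCamera(): void {
  const stream = video.srcObject as MediaStream | null;
  stream?.getTracks().forEach(t => t.stop());
  video.srcObject = null;
}
```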