Yilei’s Status Report for November 8

This week I focused on getting the keyboard overlay to closely match a physical Mac keyboard in perspective and on encoding that geometry into better user defaults. Using the phone stand, I set up a perspective test: I put a Mac keyboard in frame, mounted my phone with our app open at roughly a 60-degree angle, and rested my hands on the physical keyboard to run F/J calibration. I then tuned the Height and Top/Bottom sliders until the rendered overlay lined up with the real keyboard, set those tuned values as the new defaults, and adjusted the slider ranges so that the defaults sit roughly at the midpoint.

While doing this, I realized my previous model only squeezed the keyboard horizontally, so I added a per-row vertical scale derived from the same top-bottom ratio: with 5 rows there are 4 steps from bottom to top, so I define a factor a such that a^4 = topShrink, i.e. a = topShrink^(1/4), and the row heights become [1, a, a^2, a^3, a^4] for the number, QWERTY, ASDF, ZXCV, and modifier rows. This gives us a consistent perspective in both width and height. I tested this repeatedly with the physical keyboard and also refined the offsets between the letter key rows now that I have a physical reference. Below is a picture of the overlay on the physical keyboard after these adjustments; you can see the overlay (specifically the number and letter keys) very closely matching the physical reference.
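To make the row scaling concrete, here is a minimal TypeScript sketch of the factors just described; the identifiers (topShrink, rowHeightFactors) are illustrative, not the actual names in our overlay code.

function rowHeightFactors(topShrink: number): number[] {
  // 5 rows means 4 steps from the bottom row to the top row, so choose a
  // factor a with a^4 = topShrink, i.e. a = topShrink^(1/4).
  const a = Math.pow(topShrink, 1 / 4);
  // Factors for the number, QWERTY, ASDF, ZXCV, and modifier rows, in that order.
  return [1, a, a ** 2, a ** 3, a ** 4];
}

// Example: topShrink = 0.8 gives roughly [1, 0.946, 0.894, 0.846, 0.8].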

I am mostly on schedule for my part of the project in terms of overlay geometry and tap decision, since the perspective test and the new per-row vertical scaling are now working.

Next week, I want to hook my key-under-fingertip mapping up to Joyce’s tap detector so that instead of just labeling fingers, my module will take her tap events and return a concrete key label, which will then be fed into Hanning’s system so that component receives a stream of keypresses from my overlay. I am also considering adding a third slider that lets users adjust the rotation of the keyboard overlay (default 0 degrees, i.e. horizontal), even though calibration already encodes tilt via the index fingers. In practice, it is easy for the two fingertips to end up at slightly different depths, which leaves the keyboard a bit tilted, and right now the only fix is to recalibrate and carefully match fingertip depth. A small rotation slider would let users straighten the keyboard without redoing calibration once they are already happy with the size and position.
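If we add the slider, the underlying transform could be as simple as rotating each key corner about the overlay center by the slider angle. A hypothetical sketch (Point and rotatePoint are placeholder names, not existing code):

interface Point { x: number; y: number; }

function rotatePoint(p: Point, center: Point, degrees: number): Point {
  const rad = (degrees * Math.PI) / 180;
  const dx = p.x - center.x;
  const dy = p.y - center.y;
  // Standard 2D rotation about the overlay center; 0 degrees leaves the overlay horizontal.
  return {
    x: center.x + dx * Math.cos(rad) - dy * Math.sin(rad),
    y: center.y + dx * Math.sin(rad) + dy * Math.cos(rad),
  };
}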

Team’s Status Report for November 8

Our most significant current risk is inaccurate tap detection, which can lead to mis-taps. Right now, taps are inferred largely from the vertical displacement of a fingertip. This causes two main failure modes: when one finger taps, a neighboring finger may move slightly and be incorrectly interpreted as a second tap, and when the entire hand shifts forward, every fingertip shows a large vertical displacement, so a tap is detected even though no single finger has actually tapped. To manage the first failure mode, we added a per-hand cooldown: after a detected tap, each hand maintains a short cooldown window during which further candidate taps from the same hand are suppressed, which reduces false second taps caused by passive finger motion. We plan to introduce a user-adjustable tap sensitivity slider that controls the cooldown duration so users can tune the system to their own typing style and speed. To manage the second failure mode, we plan to monitor the landmarks on the back of the hand in addition to the fingertip. If both fingertip and back-of-hand landmarks move together, we will treat this as whole-hand motion and discard that tap candidate, whereas if the fingertip moves relative to a relatively stable back of the hand, we will accept it as a true tap.
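An illustrative TypeScript sketch of the two checks follows; the thresholds and names (COOLDOWN_MS, HAND_SHIFT_RATIO) are placeholders we would still tune, not values from the current code.

const COOLDOWN_MS = 250;        // per-hand refractory window after a detected tap
const HAND_SHIFT_RATIO = 0.6;   // reject if the back of the hand moved this large a
                                // fraction of the fingertip's motion

const lastTapTime: Record<"Left" | "Right", number> = { Left: 0, Right: 0 };

function acceptTap(
  hand: "Left" | "Right",
  fingertipDy: number,    // vertical displacement of the candidate fingertip
  backOfHandDy: number,   // vertical displacement of a wrist/knuckle landmark
  now: number,
): boolean {
  // 1) Per-hand cooldown: suppress further candidates shortly after a tap.
  if (now - lastTapTime[hand] < COOLDOWN_MS) return false;
  // 2) Whole-hand motion check: if the back of the hand moved almost as much as the
  //    fingertip, treat the motion as the hand shifting rather than a tap.
  if (Math.abs(backOfHandDy) > HAND_SHIFT_RATIO * Math.abs(fingertipDy)) return false;
  lastTapTime[hand] = now;
  return true;
}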

Previously, our Top/Bottom slider only horizontally compressed the top of the keyboard, which meant that perspective was approximated along one dimension only and the top rows could appear misaligned relative to a real keyboard. We now apply a per-row vertical scaling derived from the same top-bottom ratio so that both width and height follow a consistent perspective model.

We don’t have any schedule changes this week.

Yilei’s Status Report for November 1

This week I made the virtual keyboard geometry match the Mac layout horizontally. I changed the layout math so that the 10 main keys in each typing row (number row, QWERTY row, A row, and Z row) all share the same column widths instead of being scaled differently per row. Before this change, rows with large peripheral keys (like tab, caps lock, or shift) would cause the letter keys in that row to shrink, so diagonals like “qaz” and “wsx” didn’t line up exactly. Now the letter/number block is fixed and the peripheral keys flex to fill the leftover space, which keeps the total row width the same. I also pulled the number row into this system, so “1qaz,” “2wsx,” “3edc,” and so on are now consistent across all rows. Separately, I updated the camera setup so that on phones and tablets we use the front camera by default instead of the back camera.
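A rough TypeScript sketch of the shared-column idea, with illustrative names only: every typing row uses one fixed width for its ten main keys, and the row's peripheral keys absorb whatever width is left so the total row width stays constant.

function peripheralKeyWidth(rowWidth: number, mainKeyWidth: number, peripheralCount: number): number {
  // Width not used by the ten main keys is split evenly among the row's peripheral keys
  // (tab, caps lock, shift, return, ...), so the main columns stay aligned across rows.
  const leftover = rowWidth - 10 * mainKeyWidth;
  return leftover / peripheralCount;
}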

I am on schedule and finishing up the keyboard layout and key decision.

Next week I want to clean up the non-letter keys (the space row, including the arrows). I also want to use the phone/tablet stand that arrived to do a perspective test: place a printed Mac keyboard in frame, mount the device on the stand, and see what top-bottom ratio and height make our rendered keyboard match the real one. From that, I can pick better defaults for users so they don’t have to drag the sliders every time (they would only adjust them when they want something different, with the default matching the real keyboard). Finally, I want to start integrating with Joyce’s part so that her tap detector uses my key-under-fingertip mapping now that the horizontal columns are actually correct.

Joyce’s Status Report for November 1

What I did this week

Early this week, I finished testing and cleaning up the fingertip detection system. My primary focus for the rest of the week was implementing a robust version of the Tap Detection Algorithm. The resulting system detects tap events with relatively high accuracy and addresses several of our earlier noise and false-positive issues.

The Tap Detection Algorithm requires three conditions to register a tap. First, the motion must exceed a defined Start Velocity Threshold to enter the “tap in progress” state. Second, the finger must travel a Minimum Distance from its starting point, ensuring the event is intentional rather than incidental tremor. Finally, a Stop Condition must be met, where the motion slows down after a minimum duration, confirming a deliberate strike-and-rest action. To keep the input clean, I also implemented two filtering features: Pinky Suppression, which discards the pinky’s tap if it occurs simultaneously with the ring finger, and a Global Debounce enforced after any successful tap event so that motion overshoot does not register unwanted consecutive hits.
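A condensed TypeScript sketch of the three gates; the thresholds and field names are placeholders for illustration, not the tuned values in the detector, and it assumes normalized image coordinates where y grows downward.

interface TapState {
  inProgress: boolean;
  startY: number;       // fingertip y when the candidate tap started
  startTime: number;
}

const START_VELOCITY = 0.8;   // downward velocity needed to enter "tap in progress"
const MIN_DISTANCE = 0.02;    // minimum fingertip travel from the start point
const MIN_DURATION_MS = 40;   // the motion must last at least this long
const STOP_VELOCITY = 0.2;    // "strike-and-rest": velocity must drop below this

function updateTap(state: TapState, y: number, vy: number, now: number): boolean {
  if (!state.inProgress) {
    // Gate 1: the downward velocity must exceed the Start Velocity Threshold.
    if (vy > START_VELOCITY) {
      state.inProgress = true;
      state.startY = y;
      state.startTime = now;
    }
    return false;
  }
  // Gate 2: the fingertip must travel a Minimum Distance from its start point.
  const traveled = y - state.startY;
  // Gate 3: the Stop Condition, met when motion slows after a minimum duration.
  const stopped = vy < STOP_VELOCITY && now - state.startTime > MIN_DURATION_MS;
  if (stopped) {
    state.inProgress = false;
    // Register a tap only if all three gates passed; the caller then applies
    // Pinky Suppression and the Global Debounce before emitting a key event.
    return traveled > MIN_DISTANCE;
  }
  return false;
}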

Scheduling

I am currently on schedule. Although time was spent earlier in the week troubleshooting and implementing the new Gradient-Based detection method because the previous method was unstable, finalizing a robust tap detection algorithm has put us firmly back on track.

What I plan to do next week

Next week’s focus will be on two key areas: input state refinement and integration for a working demo. I plan to finalize the finger gesture-based state detection (e.g., “on surface” versus “in air” states). This refinement is essential for distinguishing intentional keyboard contact from hovering, which will be used to further optimize the tap detection process by reducing invalid taps and substantially increasing overall accuracy. Following the refinement of the state logic, I will integrate the stable tap detection output with the existing system architecture. This means collaborating with the team to ensure the full pipeline—from gesture processing to final application output—is fully functional. The ultimate deliverable for the week is finalizing a stable, functional demo version of the application, ready for presentation and initial user testing.

Team’s Status Report for November 1

Most Significant Risks and Management
This week, we identified a new risk concerning hover versus contact ambiguity: the system’s difficulty in determining whether a user’s fingertip is truly resting on the keyboard plane or merely hovering above it. This issue directly affects tap accuracy, as vertical finger movements in midair could be misinterpreted as valid keystrokes. To mitigate this, we refined our tap detection mechanism by incorporating gesture-based state validation. Specifically, the algorithm now verifies that every tap motion begins with an “in-air” finger gesture and ends with an “on-surface” gesture, as determined by the relative positions and flexion of the fingertips. Only if this air-to-surface transition coincides with a rapid downward motion is the tap event confirmed.
This approach reduces false positives from hovering fingers and improves robustness across users with different hand postures.

Changes to System Design
The system’s tap detection algorithm has been upgraded from a purely velocity-based method to a state-transition-driven model. The previous implementation relied solely on instantaneous speed, distance, and velocity drop thresholds to identify tap events, which worked well for clear, strong taps but struggled with subtle finger motions or resting gestures. The new design introduces two additional layers:

  1. Finger State Classification: Each fingertip is now labeled as either on-surface or in-air based on its relative position, curl, and height within the calibrated plane.

  2. State Transition Validation: A tap is recognized only when a downward motion sequence transitions from in-air → on-surface within a short temporal window.

By coupling spatial and temporal evidence, the system should be able to differentiate between deliberate keystrokes and incidental finger motion.
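A minimal TypeScript sketch of the transition check layered on top of the velocity gates; the state classifier, window length, and names here are stand-ins for the real ones.

type FingerState = "in-air" | "on-surface";

const TRANSITION_WINDOW_MS = 150;   // how recent the air-to-surface transition must be

function confirmTap(
  recentStates: { state: FingerState; time: number }[],  // oldest first
  rapidDownwardMotion: boolean,
): boolean {
  // Look for an in-air -> on-surface transition inside the temporal window.
  for (let i = 1; i < recentStates.length; i++) {
    const prev = recentStates[i - 1];
    const curr = recentStates[i];
    const withinWindow = curr.time - prev.time <= TRANSITION_WINDOW_MS;
    if (prev.state === "in-air" && curr.state === "on-surface" && withinWindow) {
      // Confirm only when the transition coincides with a rapid downward motion.
      return rapidDownwardMotion;
    }
  }
  return false;
}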

Updated Schedule
Hanning’s original plan for this week was to implement the keystroke event handling module. However, since fingertip output data is not yet fully stable, that task is postponed to next week. Instead, Hanning focused on developing the copy-paste function for the text editor and assisted in integrating existing components of the computer vision and calibration pipelines.

Hanning’s Status Report for November 1

What I did this week:
This week, I improved the built-in text editor module by adding new functionality, including copy, paste, and basic autocorrection features (rule- and dictionary-based rather than ML-based) to make typing and editing more seamless within our web interface. I also recorded several reference videos using our mobile device holder, capturing typing from multiple camera angles to better analyze fingertip visibility and calibration alignment under different lighting and perspectives. In addition, I worked on integrating these updates with our existing calibration and camera framework to ensure that the editor can properly receive text input once fingertip-triggered keystrokes are implemented.
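As a toy illustration of the rule/dictionary approach (the entries and the repeated-letter rule below are examples, not the actual correction set):

const corrections: Record<string, string> = {
  teh: "the",
  adn: "and",
  recieve: "receive",
};

function autocorrectWord(word: string): string {
  const lower = word.toLowerCase();
  // Dictionary pass: replace known misspellings.
  if (lower in corrections) return corrections[lower];
  // Rule pass: collapse runs of three or more repeated letters down to two.
  return word.replace(/(.)\1{2,}/g, "$1$1");
}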

Scheduling:
There was a slight change to the task plan: originally, this week’s task was focused on keystroke event implementation, while the copy-paste functionality was scheduled for next week. However, due to delays in the fingertip detection component, I handled the text editor functions this week instead, and the keystroke event logic will be postponed.

What I plan to do next week:
I’ll work with Joyce to integrate her fingertip detection module into the unified web framework and wire its output to actual keystroke events in the text editor. Hopefully, this will complete the data flow from fingertip detection to visible character input.