Hanning’s Status Report for December 6

This week I explored a new hardware-assisted tap detection method for our virtual keyboard. Instead of relying solely on camera-based motion, we connected pressure-sensitive resistors to each fingertip and fed the signals into an Arduino. Initially, we could only see the pressure values in the Arduino Serial Monitor, and I wasn’t sure how to turn that serial output into something JavaScript could use. To solve this, I implemented a new module, pressure.js, that uses the Web Serial API to read the Arduino’s serial stream directly in the browser. The Arduino sends simple newline-delimited messages like “sensorIndex,state” (e.g., 1,tapped), and pressure.js parses each line into a structured event with handIndex, sensorIndex, fingerName, fingerIndex, and state.
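A minimal sketch of that serial-reading path is below (the function name readPressureStream and the 9600 baud rate are illustrative assumptions, not the exact pressure.js code):

```javascript
// Minimal sketch of reading the Arduino's newline-delimited stream with the
// Web Serial API; readPressureStream and the baud rate are assumptions.
async function readPressureStream(onLine) {
  const port = await navigator.serial.requestPort(); // user selects the Arduino
  await port.open({ baudRate: 9600 });

  const decoder = new TextDecoderStream();
  port.readable.pipeTo(decoder.writable);
  const reader = decoder.readable.getReader();

  let buffer = "";
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += value;
    const lines = buffer.split("\n");
    buffer = lines.pop(); // keep any partial line for the next chunk
    for (const line of lines) onLine(line.trim()); // e.g. "1,tapped"
  }
}
```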
Within pressure.js, I added helper logic to normalize different tap labels (tapped, tap, pressed, 1, etc.) into a single “tap” event and then invoke user-provided callbacks (onState and onTap). This lets the rest of our web framework treat Arduino pressure events exactly like our previous tap detection output: we can map fingerIndex to the existing Mediapipe fingertip landmarks and reuse the same downstream pipeline that maps fingertips to keys and updates the text editor. In other words, Arduino signals are now fully integrated as JavaScript events/variables instead of being stuck in the serial monitor.
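Roughly, the parsing and normalization logic looks like the sketch below (the tap-label set, the five-sensors-per-hand mapping, and the helper name are illustrative assumptions):

```javascript
// Sketch of turning a raw "sensorIndex,state" line into a structured event and
// dispatching the onState/onTap callbacks; details here are assumptions.
const TAP_LABELS = new Set(["tapped", "tap", "pressed", "1"]);
const FINGER_NAMES = ["thumb", "index", "middle", "ring", "pinky"];

function handleSerialLine(line, { onState, onTap } = {}) {
  const [sensorStr, rawState] = line.split(",");
  const sensorIndex = Number(sensorStr);
  const state = TAP_LABELS.has(rawState.toLowerCase()) ? "tap" : rawState;

  const event = {
    handIndex: Math.floor(sensorIndex / 5), // assumes 5 sensors per hand
    sensorIndex,
    fingerIndex: sensorIndex % 5,
    fingerName: FINGER_NAMES[sensorIndex % 5],
    state,
  };

  if (onState) onState(event);
  if (state === "tap" && onTap) onTap(event);
}
```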

Schedule change: my slack week is now being used to implement this pressure-sensor solution.

Hanning’s Status Report for November 22

What I did this week
This week I restructured the dataflow pipeline for the entire system. Previously, the virtual keyboard and physical keyboard behaved like two separate input paths feeding into the text editor. I rewrote the logic so that the editor only ever receives standard keystroke events, regardless of whether they originate from the physical keyboard or from our CV-driven virtual keyboard. This required reorganizing the flow of signals from the fingertip detection module, converting virtual taps into synthetic KeyboardEvent objects that closely match real hardware keyboard input. With this unified pipeline, the text editor no longer needs to know the difference between physical and virtual typing, simplifying integration and reducing bugs caused by mismatched interfaces.
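A minimal sketch of the virtual-tap-to-keystroke conversion (the helper name dispatchVirtualKey is an assumption; the real module may differ):

```javascript
// Sketch: a virtual tap becomes a synthetic keydown/keyup pair plus an explicit
// text insertion, so the editor sees the same event shape as hardware typing.
// Note: synthetic KeyboardEvents are untrusted, so the browser will not insert
// the character on its own; we mirror the insertion for printable keys.
function dispatchVirtualKey(key, editorEl) {
  const init = { key, bubbles: true, cancelable: true };
  editorEl.dispatchEvent(new KeyboardEvent("keydown", init));
  if (key.length === 1) {
    editorEl.setRangeText(key, editorEl.selectionStart, editorEl.selectionEnd, "end");
  }
  editorEl.dispatchEvent(new KeyboardEvent("keyup", init));
}

// Example: a fingertip tap mapped to "a".
// dispatchVirtualKey("a", document.querySelector("textarea"));
```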

I also fixed several UI behavior issues. First, I implemented the Shift UI logic we planned last week: Shift presses now update the on-screen keyboard layout dynamically, showing the shifted characters while Shift is active and reverting afterward. Second, I made “Focus on Editor” trigger automatically when the user starts typing. Third, I limited camera selection to the front camera only, since using the rear camera results in a mirrored video feed and breaks calibration symmetry.
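The Shift relabeling works roughly like this sketch (the SHIFT_MAP contents and the .key-cap/data-key markup are assumptions for illustration):

```javascript
// Sketch of relabeling the on-screen keys when Shift toggles; markup and the
// symbol map are assumptions, not the exact implementation.
const SHIFT_MAP = { "1": "!", "2": "@", "3": "#", ",": "<", ".": ">", "/": "?" };

function renderKeyLabels(shiftActive) {
  document.querySelectorAll(".key-cap").forEach((cap) => {
    const base = cap.dataset.key; // unshifted character stored on the element
    cap.textContent = shiftActive
      ? (SHIFT_MAP[base] || base.toUpperCase())
      : base;
  });
}
```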
I’m behind schedule since I haven’t started testing. I will try to catch up next week.
What I plan to do next week
Next week I will begin testing the new unified dataflow and verifying that the text editor behaves correctly under virtual keyboard input.

Over the course of these tasks, I had to pick up several new skills on my own. Most of my learning came from studying GitHub repositories, reading browser and JavaScript API documentation, and searching StackOverflow whenever I hit a problem. I learned how to debug JavaScript and HTML more effectively using the console (checking event flow, inspecting DOM elements, and printing intermediate fingertip and keystroke data). I also learned how mobile browsers handle camera usage, including permissions, mirroring behavior, and device-specific constraints. For local development, I learned how to properly run a local server to avoid camera-access issues. On the text editor side, I explored how keystrokes are handled, how autocorrection works, and how to integrate synthetic events cleanly.

Hanning’s Status Report for November 15

What I did this week:
This week, I worked on additional features for the text editor’s autocorrection system. I focused on building the rule-based autocorrection engine, which handles common typos, capitalization fixes, and safe word replacements using a dictionary-style approach. In parallel, I also explored possible machine learning–based autocorrection strategies. Additionally, I encountered an integration issue with the modifier keys (Shift, Control, Option, Command): after Joyce merged fingertip detection into the full pipeline, these modifiers stopped functioning correctly when tapped, so I have been debugging and tracing the event flow to restore proper toggle behavior.
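As a rough illustration, a dictionary-style pass of this kind can be as simple as the sketch below (the word list and helper name are placeholders, not our actual rules):

```javascript
// Sketch of a rule/dictionary-based autocorrect pass; entries are placeholders.
const CORRECTIONS = { teh: "the", adn: "and", recieve: "receive" };

function autocorrectWord(word) {
  const fixed = CORRECTIONS[word.toLowerCase()];
  if (!fixed) return word;
  // Preserve leading capitalization, e.g. "Teh" -> "The".
  return word[0] === word[0].toUpperCase()
    ? fixed[0].toUpperCase() + fixed.slice(1)
    : fixed;
}
// Typically invoked when a word boundary (space or punctuation) is typed.
```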

Scheduling:
My progress is slightly behind schedule due to time spent diagnosing the modifier key integration issue.

What I plan to do next week:
Next week, I will add a visual signal for Shift presses and use that signal to dynamically update the on-screen keyboard layout, displaying the shifted key characters while Shift is active and reverting when it’s toggled off. I will also fix and refine several user interface elements, including the “Focus on Editor” and “Select Camera” buttons. (We should always use the front camera on mobile devices, and that camera view should be mirrored to fit our finger detection module.)

Verification (Web Framework + Text Editor):
For my part, most verification is functional rather than heavily quantitative, but I still have a few tests planned. For the web framework, I’ll run basic end-to-end checks on different devices (laptop + phone) to make sure the camera loads, switching cameras works, calibration starts correctly, and the UI buttons (focus editor, select camera, black mode, etc.) behave consistently. I’ll also measure simple responsiveness by logging the time between a simulated key event and the character showing up in the editor, just to confirm the typing pipeline isn’t adding noticeable delay. For the text editor, I’ll use small scripted tests that feed in sequences of pressKey/insertText calls (including toggling Shift/Control/Option/Command) and check whether the final text matches what we expect. I also prepared a small list of common typos to see whether the rule-based autocorrection fixes them correctly without breaking normal words. This gives me a quick accuracy snapshot and helps make sure nothing behaves unpredictably when we integrate everything together.
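The scripted editor tests would follow a pattern like the sketch below (runCase and the step format are assumptions; it relies on the existing insertText/pressKey helpers):

```javascript
// Sketch of a scripted test case: feed insertText/pressKey steps, then compare
// the editor's final value against the expected string.
function runCase(name, steps, expected, editor) {
  editor.value = "";
  for (const step of steps) {
    if (step.type === "text") insertText(step.value);
    else pressKey(step.value); // e.g. "Shift", "Backspace"
  }
  const pass = editor.value === expected;
  console.log(`${pass ? "PASS" : "FAIL"} ${name}: got "${editor.value}"`);
  return pass;
}

// Example (assuming the modifier manager uppercases the next insertion):
// runCase("shift toggle", [
//   { type: "key", value: "Shift" },
//   { type: "text", value: "h" },
//   { type: "key", value: "Shift" },
//   { type: "text", value: "i" },
// ], "Hi", document.querySelector("textarea"));
```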

Hanning’s Status Report for November 8

What I did this week:

This week, I converted all the modifier keys—Shift, Control, Option, and Command—from hold-based behavior to tap-based toggles. Because our fingertip detection system can only recognize discrete tap events rather than continuous presses, this redesign ensures that these keys can be activated or deactivated through single taps, behaving more naturally in our touchless environment. I implemented a new modifier state manager that tracks each key’s toggle state, allowing Shift to function like Caps Lock and ensuring that Control, Option, and Command can maintain their “pressed” states across multiple keystrokes. I am on schedule.
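A minimal sketch of that modifier state manager (key names follow this report; applyModifiers is an illustrative helper, not necessarily the real API):

```javascript
// Sketch: each tap flips a modifier's toggle state, so Shift acts like Caps
// Lock and Control/Option/Command stay "pressed" across keystrokes.
const modifierState = { Shift: false, Control: false, Option: false, Command: false };

function toggleModifier(key) {
  if (!(key in modifierState)) return false; // not a modifier; handle normally
  modifierState[key] = !modifierState[key];
  return true; // consumed as a toggle, not a regular keystroke
}

// Illustrative use of the toggle state when producing a character.
function applyModifiers(char) {
  return modifierState.Shift ? char.toUpperCase() : char;
}
```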

What I plan to do next week:
Next week, I will focus on preparing for the upcoming project showcase. My goal is to refine the demo experience by ensuring smooth integration between the fingertip detection, calibration, and typing modules, and to polish the user interface for a clear and stable demonstration.

Hanning’s Status Report for November 1

What I did this week:
This week, I improved the built-in text editor module by adding new functionality, including copy, paste, and basic autocorrection features (not ML-based, but built on simple rules and a dictionary) to make typing and editing more seamless within our web interface. I also recorded several reference videos using our mobile device holder, capturing typing from multiple camera angles to better analyze fingertip visibility and calibration alignment under different lighting and perspectives. In addition, I worked on integrating these updates with our existing calibration and camera framework to ensure that the editor can properly receive text input once fingertip-triggered keystrokes are implemented.
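The copy and paste features can be sketched with the async Clipboard API as below (requires a secure context; the helper names are illustrative, not the exact module code):

```javascript
// Sketch of copy/paste helpers for the <textarea>-based editor using the
// Clipboard API.
async function copySelection(editor) {
  const text = editor.value.slice(editor.selectionStart, editor.selectionEnd);
  await navigator.clipboard.writeText(text);
}

async function pasteAtCursor(editor) {
  const text = await navigator.clipboard.readText();
  editor.setRangeText(text, editor.selectionStart, editor.selectionEnd, "end");
}
```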

Scheduling:
There was a slight change to the task plan: originally, this week’s task was focused on keystroke event implementation, while the copy-paste functionality was scheduled for next week. However, due to delays in the fingertip detection component, I handled the text editor functions this week instead, and the keystroke event logic has been postponed.

What I plan to do next week:
I’ll work with Joyce to integrate Joyce’s fingertip detection module into the unified web framework and wire its output to actual keystroke events in the text editor. Hopefully, this will complete the data flow from fingertip detection to visible character input.

Hanning’s Status Report for October 25

What I did this week: I merged our previously separate modules into a single, working page so that Joyce’s and Yilei’s parts could run together. Specifically, I unified camera_setup.html, fingertip_detection.html, and the calibration app.js from index0927.html (Yilei’s calibration file) into one loop: a shared camera pipeline (device picker, mirrored preview, hidden frame buffer for pixel ops) feeds both fingertip detection and calibration; the F/J calibration computes a keyboard quad (variable-width rows, height/top-bottom shaping), and I render the QWERTY overlay on the same canvas. I added a method switcher for fingertip sourcing (M1 landmark tip, M2 projection, M5 gradient with threshold/extension knobs), normalized coordinates so the preview can remain mirrored while detection/overlay run in un-mirrored pixel space, and exposed simple text I/O hooks (insertText/pressKey) so detected points can drive keystrokes. I also cleaned up merge artifacts, centralized the run loop and status controls (live/landmarks/black screen), and kept the 10-second “freeze on stable F/J” behavior for predictable calibration. I’m on schedule this week.
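One detail worth a small illustration is the mirrored-preview vs. un-mirrored-processing split; a minimal sketch (assumed helper, the real normalization in the merged page may differ):

```javascript
// Sketch: detection and the keyboard overlay work in raw (un-mirrored) pixel
// coordinates; only points drawn onto the mirrored preview need x flipped once.
function toPreviewSpace(pt, canvasWidth) {
  return { x: canvasWidth - pt.x, y: pt.y };
}
```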

What I plan to do next week: I’ll pair with Joyce to fold her fingertip detector into this pipeline, add basic stabilization/debounce, and wire tip contacts to the keystroke path (tap FSM, modifiers, and key labeling). The goal is to land end-to-end typing from fingertip events and begin measuring latency/accuracy against our targets.

Hanning’s Status Report for October 18

This week I added a calibration instructor and a small finite-state machine (FSM) to the camera webpage. The FSM explicitly manages idle → calibrating → typing: when a handsDetected hook flips true, the UI enters calibrating for 10 s (driven by performance.now() inside requestAnimationFrame) and shows a banner with a live progress bar; on timeout it transitions to typing, where we’ll lock the keyboard pose. The module exposes setHandPresence(bool) for the real detector, is resilient to brief hand-detection dropouts, and keeps preview mirroring separate from processing so saved frames aren’t flipped. I also wired lifecycle guards (visibilitychange/pagehide) so tracks stop cleanly, and left stubs to bind the final homography commit at the typing entry.
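A condensed sketch of the FSM timing logic (state names are from this report; updateProgressBar and commitKeyboardPose stand in for the real UI hook and the homography-commit stub):

```javascript
// Sketch of the idle -> calibrating -> typing FSM; the 10 s window is timed
// with performance.now()-compatible timestamps inside requestAnimationFrame.
const CALIBRATION_MS = 10000;
let state = "idle";
let calibrationStart = 0;

function setHandPresence(present) {
  if (present && state === "idle") {
    state = "calibrating";
    calibrationStart = performance.now();
    requestAnimationFrame(tick);
  }
}

function tick(now) {
  if (state !== "calibrating") return;
  const progress = Math.min((now - calibrationStart) / CALIBRATION_MS, 1);
  updateProgressBar(progress);   // assumed hook for the banner's progress bar
  if (progress >= 1) {
    state = "typing";
    commitKeyboardPose();        // assumed stub where the pose/homography locks
  } else {
    requestAnimationFrame(tick);
  }
}
```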

I’m on schedule. Next week, I’ll integrate this web framework with Yilei’s calibration process: replace the simulated handsDetected with the real signal, feed Yilei’s pose/plane output into the FSM’s “commit” step to fix the keyboard layout, and run end-to-end tests on mobile over HTTPS (ngrok/Cloudflare Tunnel) to verify the calibration→typing flow works in the field.


Hanning’s Status Report for October 4

This week I found and fixed the “mobile can’t open HTML” issue by serving the page from a local server on my laptop and connecting to it from a phone on the same Wi-Fi (instead of file://). I verified that modules load correctly and that camera permission works when accessed via a real origin, documenting the steps (bind to 0.0.0.0, visit http://<LAN-IP>:<port>, or use HTTPS/tunnel when needed). I also completed a basic text editor under the camera preview: an independent <textarea> wired with helper APIs (insertText, pressKey, focusEditor). Finally, I began research on autocorrection methods, found lightweight approaches (rule-based edits, edit-distance/keyboard adjacency, and small n-gram/LM strategies), and noted how we can plug them into the editor’s input path.
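The editor helpers are small wrappers around the <textarea>; a sketch under the assumption of a single editor element (the real versions may handle more keys):

```javascript
// Sketch of the insertText/pressKey/focusEditor helpers.
const editor = document.querySelector("textarea");

function focusEditor() {
  editor.focus();
}

function insertText(text) {
  editor.setRangeText(text, editor.selectionStart, editor.selectionEnd, "end");
}

function pressKey(key) {
  if (key === "Backspace") {
    const end = editor.selectionEnd;
    const start = editor.selectionStart === end ? Math.max(end - 1, 0) : editor.selectionStart;
    editor.setRangeText("", start, end, "end");
  } else if (key === "Enter") {
    insertText("\n");
  } else if (key.length === 1) {
    insertText(key);
  }
}
```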
I’m on schedule. Next week, I plan to display a calibration instruction panel on the webpage and push the autocorrection prototype further. There is also a slight change of my schedule—originally calibration instructions were slated for this week and the editor for next week, but I swapped them to align with my teammates’ timelines.

Hanning’s Status Report for September 27

This week I focused on two coding tasks and one deliverable with the team. Early in the week I restructured the camera webpage code into three modules (HTML framework + camera.js + snapshot.js), which fixed local serving issues in a specific browser (Safari) and makes future integrations easier. I then started implementing a built-in text editor below the video preview (textarea + helper APIs like insertText/pressKey) so that we can type something into a real target. In parallel, I worked with my teammates to complete the design presentation slides (webapp part and testing part). Next week I plan to continue working on the text editor and begin basic autocorrection implementations.

Hanning’s Status Report for September 20

This week I focused on building the camera input pipeline. Early in the week I set up a webpage that requests camera permission and streams live video using the MediaDevices API, with controls to start/stop, pick a camera (front/rear), and a frame loop that draws each frame to a hidden canvas for processing. Later in the week I added single-frame capture: I can now grab the current video frame and export it as a JPEG (via canvas, with optional ImageCapture when available). Next week I plan to write an API to wire these frames into the CV part and begin basic keystroke event prototyping.
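A minimal sketch of that pipeline (element ids and the JPEG quality are assumptions):

```javascript
// Sketch: request a camera stream, draw frames to a hidden canvas in a loop,
// and export the current frame as a JPEG data URL.
const video = document.getElementById("preview");
const canvas = document.getElementById("frameBuffer"); // hidden canvas
const ctx = canvas.getContext("2d");

async function startCamera(deviceId) {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: deviceId ? { deviceId: { exact: deviceId } } : { facingMode: "user" },
  });
  video.srcObject = stream;
  await video.play();
  requestAnimationFrame(drawFrame);
}

function drawFrame() {
  if (video.readyState >= 2) {
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  }
  requestAnimationFrame(drawFrame);
}

function captureJpeg() {
  return canvas.toDataURL("image/jpeg", 0.9); // single-frame capture
}
```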