Team’s Status Report for October 18

Most Significant Risks and Management

The primary risk identified was fingertip positional accuracy, specifically along the keyboard's depth (Z-axis). Previous geometric methods yielded significant positional errors, which threatened the system's ability to distinguish between vertically adjacent keys (e.g., confusing Q, A, and Z) and thus made reliable typing impossible. To manage this risk, our contingency plan was the rapid implementation of the Pixel Centroid Method. This technique calculates the statistically stable center of mass (centroid) of the actual finger pixels, providing a highly stable contact point that mitigates the positional ambiguity.
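
For concreteness, a minimal sketch of the centroid computation follows. It assumes a binary finger mask (a 2D NumPy array with nonzero finger pixels) has already been produced by an earlier segmentation step, e.g., by thresholding a small region around the raw MediaPipe fingertip landmark; the function name and that segmentation step are illustrative, not our exact code.

```python
import numpy as np

def pixel_centroid(finger_mask):
    """Center of mass (x, y) of the nonzero pixels in a binary
    finger mask, or None if the mask is empty."""
    ys, xs = np.nonzero(finger_mask)  # row/col indices of finger pixels
    if xs.size == 0:
        return None
    # Averaging over all finger pixels is far more stable from frame
    # to frame than any single landmark, which is what damps the
    # Z-axis (depth) jitter described above.
    return float(xs.mean()), float(ys.mean())
```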

Changes to System Design

A necessary change was introduced to the fingertip tracking module design: we transitioned from geometric projection methods to an image-processing refinement pipeline (the Pixel Centroid Method described above). This was required because the original methods lacked the vertical accuracy needed for key mapping. The cost was one additional week of schedule, offset by a substantial increase in tracking stability and accuracy that avoids larger integration costs down the line.

Updated Schedule

No significant changes have occurred to the overall project schedule.

Part A global factors
Across developing regions, many users rely primarily on smartphones or tablets as their only computing devices, yet struggle with slow or error-prone touchscreen typing due to small screen sizes or limited literacy in digital interfaces. By using the built-in camera and no additional hardware, our system provides a universally deployable typing interface that works on any flat surface. This makes it more practical for students, remote workers, and multilingual users worldwide. For instance, an English learner in rural India could practice typing essays on a table without needing a Bluetooth keyboard, or a freelance translator in South America could work comfortably on a tablet during travel. Because all computation happens locally on-device, the system can function without internet access, which is essential for regions with limited connectivity, while also ensuring user privacy. This design supports equitable access to digital productivity tools and aligns with sustainable technology trends by reducing electronic waste and dependence on specialized hardware.

Part B cultural factors
HoloKeys is designed to fit how people learn and use technology in classrooms, libraries, community centers, and travel settings. Because QWERTY is the most widely used layout, the interface aligns with familiar motor patterns and reduces training time. Instructions and tutorials are written in plain, idiom-free text that can be easily translated into other languages. Visual overlays are adjustable (font size, key size, contrast), allowing users to tune the interface to their needs. Because expectations around camera use vary, HoloKeys defaults to privacy-forward behavior: clear camera-active indicators, no recording or image retention by default, and concise explanations of how and why the camera is used.

Part C environmental factors
Unlike traditional hardware keyboards, our solution requires minimal physical manufacturing, shipping, or disposal, thereby reducing material waste and overall carbon footprint. The system relies primarily on existing mobile devices, with only a small stand or holder as an optional accessory. This holder can also serve as a regular phone or tablet stand, further extending its lifespan and utility. By minimizing the need for new electronic components and leveraging devices users already own, our design helps reduce electronic waste and promotes more sustainable technology practices.

Part A was written by Hanning Wu, Part B by Yilei Huang, and Part C by Joyce Zhu.

Team’s Status Report for October 4

We made a design adjustment: the camera needs to sit at a different angle than originally planned. To find a suitable angle for our current gesture-recognition model, we will first use an adjustable-angle device holder to test several positions, then commit by purchasing or fabricating a fixed-angle holder once we identify the ideal angle.
Current risk: mobile devices could not open the HTML file directly. Loaded from the filesystem there is no real web origin; only Microsoft Edge would open the page at all, and it still blocked camera access.
Mitigation: we now host the page on a local server on the laptop and connect from the phone over the LAN (a real origin, so permission prompts work), with HTTPS or a tunnel available if needed; a sketch of this setup follows.
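
As a rough illustration of the hosting setup, the page can be served with Python's built-in http.server, wrapped in TLS so the phone gets a secure context (browsers only expose getUserMedia over HTTPS or localhost). The cert.pem / key.pem filenames are placeholders for a self-signed pair generated beforehand; this is a sketch of the approach, not our exact launch script.

```python
import http.server
import ssl

PORT = 8443  # arbitrary port; the phone loads https://<laptop-ip>:8443/

# Serve the demo page (and its assets) from the current directory.
handler = http.server.SimpleHTTPRequestHandler
httpd = http.server.HTTPServer(("0.0.0.0", PORT), handler)

# Wrap the socket in TLS so the phone's browser treats the page as a
# secure origin and allows camera access.
context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
context.load_cert_chain(certfile="cert.pem", keyfile="key.pem")
httpd.socket = context.wrap_socket(httpd.socket, server_side=True)

httpd.serve_forever()
```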

Schedule updates: Junyu scheduled an additional week for fingertip recognition, shifting subsequent tasks out, because this task is taking longer than expected: we want to test multiple methods to ensure accuracy, and fingertip detection is crucial to the accuracy of key detection. Hanning's tasks for weeks 3 and 4 switch to align with Yilei's calibration process. Yilei reports no schedule change.

Current schedule:

Team’s Status Report for September 27

The main risk is the reliability of hand detection using MediaPipe. When the palm is viewed nearly edge-on (appearing more like a line than a triangle), the detected landmark positions jitter significantly, or the hand may not be detected at all, which threatens accurate fingertip tracking. To manage this, we are tilting the camera to improve hand visibility and applying temporal smoothing to stabilize landmark positions. We have also recently updated the design to incorporate vibration detection into the tap-detection pipeline, since vision alone makes it difficult to distinguish hovering from actual keystrokes.
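
One simple form of the temporal smoothing mentioned above is an exponential moving average over landmark positions; the sketch below shows that form under the assumption of per-frame (x, y) landmark arrays. The class name and the alpha value are illustrative, not tuned constants from our pipeline.

```python
import numpy as np

class LandmarkSmoother:
    """Exponential moving average over per-frame landmark positions.

    Smaller alpha means steadier but laggier output; 0.4 is an
    illustrative default, not a tuned value.
    """

    def __init__(self, alpha=0.4):
        self.alpha = alpha
        self._state = None  # last smoothed (N, 2) array, or None

    def update(self, landmarks):
        """Blend this frame's (N, 2) landmark array (e.g., the 21
        MediaPipe hand landmarks) into the running average and
        return the smoothed positions."""
        pts = np.asarray(landmarks, dtype=float)
        if self._state is None:
            self._state = pts
        else:
            self._state = self.alpha * pts + (1.0 - self.alpha) * self._state
        return self._state

    def reset(self):
        """Drop state when the hand is lost so stale positions do not
        bleed into a re-detection."""
        self._state = None
```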

Part A public health, safety or welfare

Our product supports public welfare by improving accessibility and comfort in digital interaction. Because it operates entirely through a camera interface, it can benefit users who find it difficult to press down physical keys due to mobility, dexterity, or strength limitations. By requiring no additional hardware or forceful contact, the system provides a low-effort and inclusive way to input text. In terms of psychological well-being, all processing is performed locally on the user’s device, and no video frames or images are stored or transmitted. This protects personal privacy and reduces anxiety related to data security or surveillance. By combining accessibility with privacy-preserving design, the system enhances both the welfare and peace of mind of its users.

Part B social factors

Our virtual keyboard system directly addresses the growing social need for inclusive, portable, and accessible computing. In many educational and professional settings—such as shared classrooms, libraries, and public workspaces—users must often type on the go without carrying physical hardware, which may be costly, impractical, or socially disruptive. By enabling natural typing on any flat surface, our design reduces barriers for mobile students and low-income users without access to external peripherals. For example, a commuter can take notes on a tray table during a train ride, or a student with limited finger dexterity can type with adaptive finger placement during lectures. Socially, this technology supports a more equitable digital experience by removing dependency on specialized devices, promoting inclusivity in both educational and workplace contexts. Moreover, it respects users' privacy by running entirely on-device and not transmitting camera data to the cloud.

Part C economic factors
As a web app, HoloKeys meets the need for portable, hardware-free typing while minimizing costs for both users and providers. Users don't buy peripherals or install native software; they simply open a URL. This shifts total cost of ownership away from hardware toward a service with negligible marginal cost, lowering adoption barriers for students, travelers, and anyone for whom carrying a keyboard is impractical. Additionally, HoloKeys may modestly substitute for portable Bluetooth keyboards but is largely complementary to laptops; its primary use cases are phone-and-tablet-first contexts where a full laptop is unnecessary or inconvenient.

Part A was written by Joyce Zhu, Part B by Hanning Wu, and Part C by Yilei Huang.

Team's Status Report for September 20

This week (ending Sep 20) we aligned on a web application as the primary UI, with an optional server path only if heavier models truly require it. We’re prioritizing an in-browser pipeline to keep latency low and deployment simple, while keeping a small Python fallback available. We also validated hand detection on iPad photos using MediaPipe Hands / MediaPipe Tasks – Hand Landmarker and found it sufficient for early fingertip landmarking.

On the implementation side, we added a simple browser camera capture to grab frames and a Python 3.9 script that uses MediaPipe Hands to run landmark detection on those frames. The model reliably detected one or two hands in our test images and produced annotated outputs.
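
For reference, the script is shaped roughly like the sketch below, which runs the MediaPipe Hands solution on a saved frame and writes an annotated copy; the filenames are placeholders, not our actual paths.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_drawing = mp.solutions.drawing_utils

image = cv2.imread("frame.jpg")  # placeholder filename

# static_image_mode=True runs full detection on each image rather
# than tracking across a video stream.
with mp_hands.Hands(static_image_mode=True, max_num_hands=2,
                    min_detection_confidence=0.5) as hands:
    # MediaPipe expects RGB; OpenCV loads BGR.
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.multi_hand_landmarks:
    for hand_landmarks in results.multi_hand_landmarks:
        mp_drawing.draw_landmarks(image, hand_landmarks,
                                  mp_hands.HAND_CONNECTIONS)

cv2.imwrite("annotated.jpg", image)
```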