Team’s Status Report for November 15

Most Significant Risks and Management
The main risk we identified this week is that our original test plan may not be sufficient to convincingly demonstrate that the system meets its performance requirements. In particular, the earlier accuracy and usability tests did not clearly separate natural human typing errors from errors introduced by our system, and the single-key tap test was too basic to represent realistic typing behavior. To manage this, we reframed our evaluation around within-participant comparisons, where each user types comparable text using both our virtual keyboard and a standard keyboard. This paired design allows us to interpret performance differences as properties of our system, while retaining the single-key tap test only as a preliminary verification step before more comprehensive evaluations.

Design Changes, Rationale, and Cost Mitigation
No major changes were made to the core interaction or system architecture; instead, our design changes focus on verification and validation. We shifted from treating accuracy and usability as absolute metrics for our system alone to treating them as relative metrics benchmarked against a standard keyboard used by the same participants, making the results more interpretable and defensible. We also moved from a single basic accuracy test to a layered approach that combines the original single-key tap check with a more realistic continuous-typing evaluation supported by detailed logging. The primary cost is the additional effort to implement standardized logging and paired-data analysis, which we mitigate by reusing prompts, using a common logging format, and concentrating on a small number of carefully structured experiments.
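To illustrate what the common logging format might look like, here is a minimal sketch. The field names and CSV layout are our assumptions for this post rather than a finalized spec; the point is that both keyboards emit identical records so one set of analysis scripts can consume either condition's logs.

```python
import csv
from dataclasses import dataclass, asdict, fields

# Hypothetical shared log record: field names are placeholders, not a
# finalized spec. Both keyboards write the same fields so the paired
# analysis scripts can treat the two conditions uniformly.
@dataclass
class KeyEvent:
    timestamp_s: float   # seconds since the trial started
    participant: str     # anonymized participant ID
    condition: str       # "virtual" or "physical"
    key: str             # key reported by the keyboard
    is_correction: bool  # True for backspace / self-correction events

def write_log(path: str, events: list[KeyEvent]) -> None:
    """Write key events to a CSV file with a fixed header."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=[fld.name for fld in fields(KeyEvent)])
        writer.writeheader()
        for event in events:
            writer.writerow(asdict(event))
```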

Updated Schedule
Because these changes affect how we will test rather than what we are building, the overall scope and milestones are unchanged, but our near-term schedule has been adjusted. Our current priority is to complete integration of all subsystems and the logging infrastructure so that the system can generate the detailed event data required for the revised tests. Once logging is in place, we will run internal pilot trials to verify that prompts, logging, and analysis scripts work end to end, followed by full accuracy and usability studies in which participants use both our virtual keyboard and a baseline keyboard. The resulting paired data will then be used to assess whether we meet the performance requirements defined in the design report.

Validation Testing Plan
Accuracy testing: Each participant will type two similar paragraphs, one using our virtual keyboard and one using a standard physical keyboard. In each condition, they will type for one minute and may correct their own mistakes as they go. Because we know the reference paragraph, we can record the typing process, infer the intended key at each point in time, and compare it to the key the system recognized. We will then compute accuracy on both keyboards and compare the two values to separate natural user error from errors introduced by our keyboard. Our goal is for the virtual keyboard's accuracy to be within 5 percentage points of each participant's accuracy on the physical keyboard.
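
A minimal sketch of how this paired comparison could be computed is below. The edit-distance alignment via difflib is our assumption for this post; the real analysis may use a more careful keystroke-level alignment driven by the event logs.

```python
from difflib import SequenceMatcher

def keystroke_accuracy(reference: str, typed: str) -> float:
    """Fraction of reference characters produced correctly, using an
    edit-distance alignment to infer the intended key at each position."""
    matcher = SequenceMatcher(None, reference, typed, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(reference)

def accuracy_requirement_met(virtual_acc: float, physical_acc: float,
                             margin: float = 0.05) -> bool:
    """Design-report requirement: virtual-keyboard accuracy within
    5 percentage points of the same participant's physical accuracy."""
    return physical_acc - virtual_acc <= margin

# Example with one participant's paired trials (made-up transcripts):
ref = "the quick brown fox jumps over the lazy dog"
physical = keystroke_accuracy(ref, "the quick brown fox jumps over the lazy dog")
virtual = keystroke_accuracy(ref, "the quick brwn fox jumps ovr the lazy dog")
print(accuracy_requirement_met(virtual, physical))  # True: drop is ~4.7 points
```
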
Usability / speed testing: For usability, each participant will again type similar paragraphs on both the physical keyboard and our virtual keyboard. In both conditions, they will type for one minute, correcting mistakes as needed, and will be instructed to type as fast as they comfortably can. We will measure words per minute (WPM) on each keyboard. For users whose typing speed on the physical keyboard is at or below 40 WPM, we require that their speed on the virtual keyboard drop by no more than 10%. For users who naturally type faster than this threshold, we will still record and analyze their speed drop to understand how performance scales with higher baseline typing speeds.
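
A sketch of the speed check is below, assuming the standard convention that one "word" is five characters; the 40 WPM baseline cutoff and 10% allowed drop come directly from the plan above, while the function names and example numbers are illustrative.

```python
def wpm(char_count: int, seconds: float) -> float:
    """Words per minute, using the standard 5-characters-per-word convention."""
    return (char_count / 5.0) / (seconds / 60.0)

def speed_requirement_met(physical_wpm: float, virtual_wpm: float) -> bool | None:
    """Apply the 10% max-drop requirement to baselines at or below 40 WPM.
    Faster typists are recorded for scaling analysis but not held to the
    threshold, so return None for them."""
    if physical_wpm > 40.0:
        return None  # analyzed for scaling behavior, not pass/fail
    drop = (physical_wpm - virtual_wpm) / physical_wpm
    return drop <= 0.10

# Example: 190 characters in a one-minute physical trial (38 WPM) vs
# 175 characters on the virtual keyboard (35 WPM, a ~7.9% drop).
print(speed_requirement_met(wpm(190, 60), wpm(175, 60)))  # True
```
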
