Team Status Report for 12/06/2025

Accomplishments

  • Testbed coolant flush: Flushed the cooling loop after finding yellowed fluid and contaminants
  • Data collection: Re-collected the normal dataset because ambient temperature drift and coolant loop changes were affecting inference
  • ML model development: Retrained regression and autoencoder using updated normal data
    • Updated autoencoder architecture to use delta temperature features only to lessen effect of ambient temperature changes
  • Final deliverables: Started planning Final Report, Video, and Demo
    • Took video of testbed for Final Video
    • Split up Final Report sections

Significant Risks

  • Model sensitivity: Because the model is designed to detect small deviations, small drifts in ambient temperature, water level, and other factors can trigger false anomaly alerts
    • Switched to delta temperature only features (no absolute temperatures)
    • Tuned thresholds based on power levels
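
The switch to delta-only features can be illustrated with a minimal sketch. It assumes each delta is taken relative to an ambient reading; the report does not state the exact feature definition, and the sensor names and temperatures below are hypothetical.

```python
def delta_features(sample, reference="ambient"):
    """Return every temperature minus the reference temperature.

    `sample` maps hypothetical sensor names to readings in deg C.
    The reference sensor itself is dropped, so no absolute
    temperature reaches the model.
    """
    ref = sample[reference]
    return {
        f"d_{name}": round(temp - ref, 3)
        for name, temp in sample.items()
        if name != reference
    }


# Same thermal state observed at two different room temperatures
# (illustrative numbers only).
cool_room = {"ambient": 21.0, "cpu_block": 45.0, "coolant": 30.0}
warm_room = {"ambient": 24.0, "cpu_block": 48.0, "coolant": 33.0}

print(delta_features(cool_room))
print(delta_features(warm_room))
```

Both samples produce identical features, which is the property that makes a uniform ambient shift invisible to the model. A drift that affects sensors unequally would still show up in the deltas.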

Design Changes

  • No major design changes this week

Schedule Changes

  • Project remains on track
    • ML model tuning in progress

Testing and Results

  • Servo Valve Flow Restriction Test: Ran servo through different PWM duty values and measured flow rate to map duty value to flow rate reduction
    • Result: Mapped duty value to flow rate reduction with fit equation
  • Heater / Power Delivery Test: Applied PWM values to the SSR and confirmed heater power using RMS voltage measured on oscilloscope
    • Design Change: Found that the power supply rating was too low and purchased a higher-wattage power supply (300W)
    • Result: Commanded power agrees with the power derived from measured Vrms
  • Regression Model Prediction Accuracy Test: Compared predicted vs. measured CPU power on normal datasets
    • Design Change: Replaced original linear regression model with a Random Forest after benchmarking several models
    • Result: Random Forest achieved RMSE = 2.53 W
  • Autoencoder Model Latent Space Test: Examined latent space separation between normal and fault windows and measured FPR and FNR
    • Result: Model successfully differentiates normal from fault data
      • FPR: 0.38%
      • FNR: 0.64%
  • Classification Model Accuracy Test: Evaluated classifier performance on distinguishing flow faults vs. power faults
    • Result: Model successfully classifies flow fault and power fault
      • FPR: 0.00%
      • FNR: 1.06%
  • Latency Test: Measured timing from inference start to alert generation
    • Result: Model meets 1s latency requirement
      • Regression = 141 ms
      • Autoencoder = 363 ms
      • Classifier = 9 ms
      • End-to-end = 825 ms
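
For reference, the metrics reported above can be computed as in the sketch below. It assumes the standard definitions (RMSE over paired samples, FPR = FP / all normal samples, FNR = FN / all fault samples); the report does not spell out its formulas, and the data here is made up for illustration.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error between paired predictions and measurements."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def fpr_fnr(is_fault, alerted):
    """Return (false positive rate, false negative rate).

    `is_fault` holds ground-truth labels and `alerted` holds model
    decisions, both as booleans with True meaning fault / alert.
    """
    pairs = list(zip(is_fault, alerted))
    false_pos = sum(1 for truth, alert in pairs if not truth and alert)
    false_neg = sum(1 for truth, alert in pairs if truth and not alert)
    normals = sum(1 for truth, _ in pairs if not truth)
    faults = sum(1 for truth, _ in pairs if truth)
    return false_pos / normals, false_neg / faults


# Illustrative values only, not project data.
print(rmse([50.0, 60.0], [53.0, 57.0]))
print(fpr_fnr(
    [False, False, False, False, True, True],
    [False, True, False, False, True, False],
))
```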

Team Status Report for 11/22/2025

Accomplishments

  • Data Collection: Completed power-fault data collection
    • Started reduced flow data collection loop
  • Regression model: Finalized and validated model for predicting CPU power

  • Autoencoder anomaly detection model: Implemented AE using RF residuals
    • Completed initial tuning: window size, latent dimension, dropout, batch size
    • Implementing event-based fault detection instead of window-based detection to fix poor separation between normal and fault cases
      PCA After Tuning

      PCA Before Tuning

Significant Risks

  • Normal/fault overlap in AE latent space: PCA shows fault and normal windows are not cleanly separable even after tuning
    • Window level anomaly detection may be unreliable due to overlap
    • Risk mitigation: Switching to event-based detection, which only requires detecting a fault at least once per event
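
The difference between window-level and event-based detection can be sketched as follows. This is an illustration under assumptions, not the team's implementation: the event IDs, the per-window flags, and the `min_hits` parameter are all hypothetical.

```python
from collections import defaultdict


def detect_events(window_event_ids, window_flags, min_hits=1):
    """Collapse per-window anomaly flags into one decision per event.

    An event counts as detected when at least `min_hits` of its
    windows were flagged, so individual missed windows no longer
    count as misses.
    """
    hits = defaultdict(int)
    for event_id, flagged in zip(window_event_ids, window_flags):
        hits[event_id] += int(flagged)
    return {event_id: count >= min_hits for event_id, count in hits.items()}


# Hypothetical run: event "e1" has one flagged window out of four,
# event "e2" has none.
ids = ["e1", "e1", "e1", "e1", "e2", "e2", "e2"]
flags = [False, True, False, False, False, False, False]

print(detect_events(ids, flags))
```

At window level, "e1" would score only one hit in four windows, but at event level it counts as detected. Raising `min_hits` trades some sensitivity for robustness against a single spurious window.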

Design Changes

  • No major design changes this week

Schedule Changes

  • Project remains on track
    • ML model tuning in progress

Team Status Report for 11/15/2025

Accomplishments

  • Hardware Upgrades: Replaced previous PSU with higher-wattage power supply for full-load heater testing

    300W PSU on Testbed
  • Testing & Data Collection: Began running automated test patterns including ramp up, ramp down, spike, and multi-level hold profiles
    • Completed synthetic data collection patterns covering full power range (20-80W)
  • CPU Power Trace: Collected power trace from a laptop using HWiNFO to generate realistic workload profile for testbed
Updated Testbed Assembly
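
The automated test patterns listed above (ramp up, ramp down, spike, multi-level hold) could be generated along the lines of this sketch. The function names, step counts, and power levels are assumptions for illustration; only the 20-80W range comes from the report.

```python
def ramp(start_w, end_w, steps):
    """Linear power ramp in watts, including both endpoints."""
    if steps < 2:
        raise ValueError("a ramp needs at least two steps")
    step = (end_w - start_w) / (steps - 1)
    return [start_w + i * step for i in range(steps)]


def hold(level_w, steps):
    """Constant power level repeated for `steps` samples."""
    return [float(level_w)] * steps


def spike(base_w, peak_w, steps, spike_at):
    """Constant base level with a single-sample spike at index `spike_at`."""
    profile = hold(base_w, steps)
    profile[spike_at] = float(peak_w)
    return profile


def multi_level_hold(levels_w, steps_per_level):
    """Concatenate holds at each level in order."""
    profile = []
    for level in levels_w:
        profile.extend(hold(level, steps_per_level))
    return profile


print(ramp(20, 80, 4))                    # ramp up
print(ramp(80, 20, 4))                    # ramp down
print(spike(20, 80, 5, 2))
print(multi_level_hold([20, 50, 80], 2))
```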

Significant Risks

  • Interference from Other Classes: Embedded systems class is storing large boxes directly in front of testbed’s intake fan area
    • Obstructions may alter airflow, reduce heat dissipation, and affect temperature sensor readings
    • Risk mitigation: Hoping the embedded systems class moves the boxes soon
Amazon Boxes Blocking Testbed Fan

Design Changes

  • No major design changes this week

Schedule Changes

  • Project remains on track
    • Synthetic data collection now complete for normal operation
    • Running real traces

Team Status Report for 11/08/2025

Accomplishments

  • System Integration: Completed integration of all major components, including pump, heaters, fan, sensors, and servos
    • Verified basic operation of each subsystem
    • Integrated Pi 5 and Pico code for synchronized control commands for SSR, fan, and pump through UART
  • SSR Calibration: Used oscilloscope testing to characterize relationship between PWM duty cycle and power output

    Using Oscilloscope to Measure Vrms of SSR Output to Resistors
  • Data Collection Pipeline: Completed logging of temperature, RPM, and power data from sensors and Pi 5 commands into InfluxDB

    Grafana Dashboard of Temperature Readings
  • Started preliminary data collection using constant power hold tests (20-40W)
  • Flow Calibration: Mapped duty cycle values to flow reduction rate %
    • Used oscilloscope to debug flowmeter reading problems

      Using Oscilloscope to Measure Flowmeter Output to RPi5 Input GPIO Signal
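
The relationship between PWM duty cycle, Vrms, and heater power from the SSR calibration above can be worked through in a short sketch. It assumes the SSR switches a DC supply into a purely resistive load as an ideal rectangular waveform, in which case Vrms = V * sqrt(duty) and P = Vrms^2 / R. The 12 V supply and 2.4 ohm load are placeholder values, not figures from the report.

```python
import math


def vrms_rectangular(v_supply, duty):
    """RMS voltage of an ideal rectangular pulse train (0 V / v_supply)."""
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be between 0 and 1")
    return v_supply * math.sqrt(duty)


def heater_power(v_rms, resistance_ohm):
    """Power dissipated in a resistive load from its RMS voltage."""
    return v_rms ** 2 / resistance_ohm


def duty_for_power(target_w, v_supply, resistance_ohm):
    """Duty cycle needed for a target power under the same assumptions."""
    return target_w * resistance_ohm / v_supply ** 2


# Placeholder supply voltage and load resistance (assumed values).
V_SUPPLY = 12.0
R_LOAD = 2.4

v_rms = vrms_rectangular(V_SUPPLY, 0.5)
print(v_rms, heater_power(v_rms, R_LOAD))
print(duty_for_power(30.0, V_SUPPLY, R_LOAD))
```

Under these assumptions power scales linearly with duty cycle, so comparing the measured Vrms against this ideal curve is one way to see where a real SSR deviates from it.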

Significant Risks

  • Limited Data Collection Time: With only a few weeks left in the semester, there is limited time to gather a large enough dataset for ML model training
    • Risk mitigation: Run automated, continuous overnight tests and streamline test sequencing to maximize data volume
  • Power Supply Limitations: Current power supply does not provide enough wattage for full-load operation
    • Risk mitigation: Acquiring a second power supply with higher capacity to support higher power testing

Design Changes

  • No major design changes this week

Schedule Changes

  • Schedule adjusted to allow additional time for data collection
  • Project remains on track, with initial data collection underway

Team Status Report for 11/01/2025

Accomplishments

  • Testbed Assembly: Completed assembly of the basic testbed
  • Pump Mounting: Began mounting the pump but encountered alignment issues
    • Explored alternative mounting solutions to secure it properly
  • Aluminum Board Fabrication: Threaded and sanded the aluminum mounting board for the resistors to prepare for assembly
  • PWM Testing: Performed initial PWM testing on the fan
    • Confirmed PWM control
    • Plan to start pump PWM testing once full water loop is assembled
  • Sensor Integration: Completed sensor wiring and testing
    • Successfully read data from the TMP117 temperature sensors and the ADS1115 ADC used for coolant temperature sensor through the RPi5

Significant Risks

  • Pump Mounting Delay: Assembly progress was slowed by misalignment in the pump mount
    • Risk mitigation: Plan to either zip tie pump to acrylic board or use thread reducer to fix alignment issues

Design Changes

  • No major design changes this week

Schedule Changes

  • Schedule remains the same
  • Project remains on track to begin integration next week

Team Status Report for 10/25/2025

Accomplishments

  • UART Communication: Completed and validated UART connection between the Pi5 and Pico
  • Hardware Assembly: Assembled fan and radiator subassembly, mounted SSRs and Pico to 3D-printed mounts
  • Fabrication: Ordered custom cut parts for testbench
  • PWM Code: Started Pico control code for servos
  • Sensor Code: Successfully connected and read from TMP117 and ADS1115 sensors through Pi5
    • Resolved I2C address conflicts

Significant Risks

  • Delay in Assembly: Waiting on delivery of custom-cut boards
    • Risk mitigation: Complete full assembly next week once boards arrive

Design Changes

  • No major design changes this week

Schedule Changes

  • Schedule remains the same
  • Slight delay due to fabrication lead time, but full assembly and loop testing expected to begin next week

Team Status Report for 10/18/2025

Accomplishments

  • Design Report: Finalized and submitted the Design Report
  • Hardware Setup: Completed Pi5 SSH setup and connection to the on-campus network
  • UART Communication: Began developing and debugging UART communication between Pi5 and Pico
  • PWM & Sensor Code: Started implementing PWM control on the Pico and sensor code on the Pi5
  • All off-the-shelf components have arrived
    • Verified dimensions and adjusted CAD models accordingly

Significant Risks

  • Need to get custom-cut parts fabricated after Fall Break before full assembly can begin
  • UART communication and PWM development are behind schedule due to debugging and setup time for the Pico and Pi5
  • Risk mitigation: Continue code testing with partial assembly and move forward on database and sensor code setup in parallel

Design Changes

  • Flask + SSE alert system changed to a simplified alert system running locally on the Pi 5 terminal, as Flask integration is not necessary

Schedule Changes

  • Schedule remains the same
  • Team generally on track for post-break sprint code development and assembly

Global Factors

Aidan

Part A: Our system considers global factors by addressing the need to improve hardware reliability and longevity as demand for liquid-cooled PCs increases worldwide. By focusing on data-driven solutions, our product aims to support a diverse array of technology amid a global push towards sustainability. Because the product is geared towards reducing hardware replacements and decreasing energy waste caused by inefficient cooling, it aligns with broader environmental goals and global initiatives towards sustainable electronics.

Cultural Factors

Jacob

Part B: The product design carefully considers the diverse interactions between the world, PCs, and human-computer interaction. This is executed by utilizing user-friendly interfaces for interacting with our product and providing transparent and functional configuration. Through training the ML model at multiple CPU loads, the end product aims to provide adaptability, respecting the global culture of PC customization and performance tuning, given the endless unique pre-existing setups. Designing a system to perform reliably under varying usage supports user freedom to tailor their machines to their own specific use cases, while maintaining the same level of protection and oversight by our product.

Environmental Factors

Kristina

Part C: One environmental consideration of our system is reducing both energy and water waste in liquid-cooled computing systems. By detecting cooling inefficiencies early, the system prevents pumps and fans from running at unnecessarily high speeds and helps maintain proper coolant flow, lowering overall power consumption.

Early fault detection also prevents thermal stress over time that can cause leaks or excess coolant evaporation in water-cooled loops. This conserves water and reduces the need for maintenance refills. In large data centers, improving cooling efficiency in this way can reduce energy usage and water waste, making the system more sustainable overall.

Team Status Report for 10/04/2025

Accomplishments

  • Parts Procurement: Ordered all major off-the-shelf components from the BOM
    • Waiting for deliveries before starting fabrication of laser cut plates
  • Raspberry Pi Setup: Began setting up the Raspberry Pi 5 for UART communication with the Pi Pico and sensor data collection
    • Resolved hardware and network setup issues and completed initial Pi OS configuration
  • Design Report and Research: Continued progress on the Design Report, refining use case and design requirements and supporting them with relevant sources

Significant Risks

  • Dependency on off-the-shelf parts for fabrication: Same as last week, cannot finalize or fabricate custom mounting plates until we have real parts to verify mounting hole dimensions
    • Plan to begin fabrication immediately upon receiving parts to stay on track

Design Changes

  • No major design changes this week. Block diagram and CAD assembly finalized last week

Schedule Changes

  • Schedule remains on track at this point but may need adjustment next week depending on actual part arrival times

Team Status Report for 09/27/2025

Accomplishments

  • CAD modeling & assembly: Completed CAD assembly for testbed
  • Block diagram: Finalized block diagram with added power distribution components and software
  • Analysis & Research: Conducted thermal analysis and research to justify use case requirements
    • ML model research to choose specific model and plan inputs/outputs

Significant Risks

  • Dependency on off-the-shelf parts for fabrication: Same as last week, cannot finalize or fabricate custom mounting plates until we have real parts to verify mounting hole dimensions
    • Cannot start testing of code until fan/pump, SSRs, etc. arrive
    • Plan to start ordering parts on Monday

Design Changes

  • Block diagram finalized with added power components and sensor / control separation. No costs affiliated with this design change.
  • CAD assembly finished to also include power components. No costs affiliated with this design change.

Schedule Changes

  • The schedule was updated this week to account for delays in parts purchasing. Fabrication and code development all depend on receiving components.
  • Adjusted task split for parallel development
Gantt Chart

Public Health, Safety, or Welfare

Kristina

Part A: One consideration of public health could be the system’s effect on the operator’s mental wellbeing. Frequent false alerts may cause stress or alert fatigue while reliable detection can reduce anxiety and improve confidence in system performance.

In terms of welfare, maintaining uptime prevents data loss and disruption for users who rely on the server for work. This could help the public’s mental health if people do not have to worry about losing unsaved work.

Social Factors

Aidan

Part B: Our design and product consider social factors by allowing for versatile configuration and tuning towards a multitude of real-life applications. This allows the product to be used in different social and economic settings, given that each group has a different set of computing needs, and our product can be fine-tuned and adapted to fit each of these groups. Additionally, our product considers alert fatigue for the user and minimizes false positives so that the product works as intended across social groups. Lastly, the ML models and fine-tuning are exposed and customizable to the specific user depending on their setup to encourage transparency in the product and solidify trust in AI-based anomaly detection systems.

Economic Factors

Jacob

Part C: For economic considerations of the project, the first thing that comes to mind is components. Our project is intended to help extend the lifetime of the system. With a longer life, the components no longer need to be purchased and replaced as frequently. This in turn should save the user money.

Another consideration is the real-time cost of running the system. In our data center and server use case, the machines consume large amounts of energy. Running AnomAIy should use only a marginal amount of power compared to the heavy load of the data centers.

Team Status Report for 09/20/2025

Significant Risks

  • Dependency on off-the-shelf parts for fabrication: Cannot finalize or fabricate custom mounting plates (3D-printed or laser-cut) until we have real parts to verify mounting hole dimensions
    • Planning to finalize component choices at the next team meeting and order all required parts to resolve the mounting dependency
  • Software development blocked by missing hardware: SW arch design, data collection code, and PWM control code development require sensors and Pi to begin implementation
    • Select and order temperature sensors at the next team meeting so that schedule dependencies for sensor SW design and development are met
    • Brainstorm and finalize a high-level code skeleton to ensure smooth development and implementation once parts arrive

Design Changes

  • Initial block diagram created with Raspberry Pi 5, Pico, servo, and sensor connections to aid in the development of the system once parts are ordered and design finalized. No costs affiliated with this design change.
  • CAD assembly revised to accommodate sensor placements and control system mounts to aid in the development of the liquid cooling loop and ensure anticipated parts will function well with the current system design. No costs affiliated with this design change.

Schedule Changes

  • No changes to the schedule were made this week