Team Status Report for 4/27

This week, we mainly worked on the final presentation and slides, making sure that all of the information was presented well. We also made final adjustments to the overall integration of all three components and will continue this work for the rest of the week leading up to the final demo on Friday. The most significant risk at this time is ensuring that we can successfully demo our final product, and we will address this by doing several practice runs this week. We are confident that the demo will go well and are looking forward to completing our final product. We are also on schedule and have stayed on track with the Gantt chart we made earlier this semester. Next week, we will submit the final poster, video, and report.

Below are the unit tests and overall system tests carried out for the camera-based detection component of our system. For the EEG, the test results and procedure were outlined in Rohan's individual report last week.

  • Yawning, microsleep, gaze, phone, other people, user away
    • Methodology
      • Test among 5 different users
      • Engage in each behavior 10 times over 5 minutes, leaving opportunities for the models to make true positive, false positive, and false negative predictions
      • Record number of true positives, false positives, and false negatives
      • Behaviors
        • Yawning: user yawns
        • Sleep: user closes eyes for at least 5 seconds
        • Gaze: user looks to left of screen for at least 3 seconds
        • Phone: user picks up and uses phone for 10 seconds
        • Other people: another person enters frame for 10 seconds
        • User is away: user leaves for 30 seconds, replaced by another user
    • Results

  • Overall system usability and usefulness
    • Methodology
      • Test among 10 different users
      • Have them complete 10 minute work session using the app
      • Ask after if they think the app is easy to use and is useful for improving their focus
    • Results
      • 9 of 10 users agreed
  • Data capture and analysis latency
    • Methodology
      • Video processing
        • Measure time between processing of each frame
        • Get average over a 5-minute recording (see the sketch after this list)
    • Results
      • Video processing
        • 100 ms
      • Focus/flow detection
        • 0.004 ms
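
For reference, below is a minimal sketch of how the per-frame latency measurement can be done; the recording filename and the detection step are placeholders, not our actual pipeline.

        import time
        import cv2

        cap = cv2.VideoCapture("session_recording.mp4")  # placeholder 5-minute recording
        frame_gaps = []
        prev = time.time()
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # ... run the frame through the detection pipeline here ...
            now = time.time()
            frame_gaps.append(now - prev)  # time between processing of consecutive frames
            prev = now
        cap.release()

        if frame_gaps:
            print(f"Average per-frame latency: {1000 * sum(frame_gaps) / len(frame_gaps):.1f} ms")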

Team Status Report for 4/6

The most significant risk right now is the accuracy of our flow state detector. After reaching a good level of accuracy on our test data from Professor Dueck in a musician setting, we went on to perform validation in other settings this week. Rohan recorded EEG data for an activity intended to stimulate flow state (playing a video game) with noise-canceling headphones and music to encourage flow for 15 minutes. He then recorded a baseline of performing a regular work task with distractions (intermittent conversation with a friend) for another 15 minutes. On the intended flow recording, our model predicted flow 25.4% of the time, and on the intended not-in-flow recording, it predicted flow 54.4% of the time. We have a few ideas as to why the model may be performing poorly in this validation context. First, we think that Rohan did not enter a flow state in either setting because neither task was second nature or particularly enjoyable, and 15 minutes is likely too short a period to enter a flow state. To further validate the flow detection model, we plan to have Rohan's roommate, who is an avid video gamer, wear the headset while playing games for an extended period of time to see how the model performs at detecting flow in a gaming environment. Depending on how this goes, we plan to validate the model again in the music setting to see if our model has overfit to detecting flow specifically in a musical setting.

Rohan also implemented a custom focus state detector. We collected recordings of Arnav, Karen, and Rohan in a 15-minute focused setting followed by a 15-minute distracted setting while collecting EEG data from the headset. The model achieved high test accuracy on data it had not seen before and had strong precision, recall, and F1 scores. We then collected data with Karen wearing the headset again, this time for 20 minutes focused and 20 minutes distracted, to use for model validation. When we ran this data through the custom focus detector, we saw a disproportionately high number of distracted classifications and overall poor performance. We realized that the original training set contained only 28 high-quality focus data points for Karen compared to 932 high-quality distracted data points. We therefore attribute the poor performance to the skewed training data; we plan to incorporate this validation data into the training set and collect new validation data to confirm that the model is performing well. As a backup, we inspected the Emotiv Performance Metrics for focus detection and saw a clear distinction in the average focus Performance Metric between the focused recording and the distracted recording.
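
One standard mitigation we could also consider for this imbalance (not something already in the detector) is weighting the cross-entropy loss by inverse class frequency. A minimal PyTorch sketch using Karen's counts:

        import torch
        import torch.nn as nn

        counts = torch.tensor([28.0, 932.0])      # focus vs. distracted samples for Karen
        weights = counts.sum() / (2.0 * counts)   # inverse-frequency class weights
        criterion = nn.CrossEntropyLoss(weight=weights)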

Finally, as an attempt to further validate our custom models, Rohan applied Shapley values, a measure used in explainable AI, to understand which input features contribute most significantly to the flow vs. not-in-flow classification.
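
A minimal sketch of how Shapley values can be computed with the shap library's KernelExplainer; the predict function and the 10-feature band-power inputs below are placeholders, not our actual model.

        import numpy as np
        import shap

        def predict_flow(X):
            # placeholder for the trained classifier's probability of "flow"
            return np.random.rand(len(X))

        X_background = np.random.rand(50, 10)   # assumed: 10 band-power input features
        X_explain = np.random.rand(5, 10)

        explainer = shap.KernelExplainer(predict_flow, X_background)
        shap_values = explainer.shap_values(X_explain)
        print(np.abs(shap_values).mean(axis=0))  # mean |SHAP| per feature ~ importance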

Validation for video processing distraction detection:

  • Yawning, microsleep, gaze
    • Test among 3 different users
    • Have user engage in behavior 10 times
    • Record number of true positives, false positives, and false negatives
  • Phone pick-up
    • Test among 5 different phones (different colors, Android/iPhone)
    • Have user pick up and use phone 10 times (5-10 seconds of pick-up and use)
    • Record number of true positives, false positives, and false negatives
  • Other people
    • Test among 1 user and 3 other people
    • Have other person enter frame 10 times (5-10 second interaction)
    • Record number of true positives, false positives, and false negatives
  • Face recognition
    • Test among 3 different users
    • Register user’s face in calibration
    • Imposter takes place of user 3 times in a given work session
      • Imposter takes user’s place in 30 second intervals
    • Record number of true positives, false positives, and false negatives for imposter being recognized
  • Performance
    • Calculate average FPS over every 10 frames captured, logic below
    • Get average FPS over a 1 minute recording

        import time

        COUNTER = 0
        FPS_AVG_FRAME_COUNT = 10
        START_TIME = time.time()

        # Inside the per-frame loop: every FPS_AVG_FRAME_COUNT frames, compute
        # the average FPS over the elapsed window, then reset the window timer.
        if COUNTER % FPS_AVG_FRAME_COUNT == 0:
            FPS = FPS_AVG_FRAME_COUNT / (time.time() - START_TIME)
            START_TIME = time.time()
        COUNTER += 1

Overall, integration is going smoothly: all of the distraction types except face recognition are integrated into the frontend and backend of the web app. Aside from the hiccup in the accuracy of our flow state detector, our team is on schedule.

Besides improving the focus and flow state detector performance, Arnav and I will focus this coming week on refining the UI to improve the user experience.

Team Status Report for 3/30

We have made significant progress integrating both the camera-based distraction detection and our EEG focus and flow state classifier into a holistic web application. At this point, all of the signal processing for detecting distractions by camera and identifying focus and flow states via the EEG headset is working well locally. Almost all of these modules have been integrated into the backend, and by the time of our interim demo we expect these modules to show up on the frontend as well. Right now, the greatest risks mainly concern our presentation of the product not doing justice to the underlying technology we have built. Given that we have a few more weeks before the final demo, I think we will be able to comfortably iron out any kinks in the integration process and figure out how to present our project in a user-friendly way.

While focusing on integration, we also came up with some new ideas regarding the flow of the app as the user navigates through a work session. Here is one of the flows we have for when the user opens the app to start a new work session (a sketch of the calibration key handling follows the list):

  1. Open website
  2. Click the new session button on the website
  3. Click the start calibration button on the website
    1. This triggers calibrate.py
      1. OpenCV window pops up with a video stream for calibration
      2. Press the space key to start neutral face calibration
        1. Press the r key to restart the neutral face calibration
      3. Press the space key to start yawning calibration
        1. Press the r key to restart the yawning face calibration
      4. Save calibration metrics to a CSV file
      5. Press the space key to start the session
        1. This automatically closes the window
  4. Click the start session button on the website
    1. This triggers run.py
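
Below is a minimal sketch of the space/r key handling inside calibrate.py; the window text and stage names are assumptions rather than the exact implementation.

        import cv2

        STAGES = ["neutral face", "yawning"]   # assumed calibration stages
        stage = 0

        cap = cv2.VideoCapture(0)
        while cap.isOpened() and stage < len(STAGES):
            ok, frame = cap.read()
            if not ok:
                break
            cv2.putText(frame, f"SPACE: start {STAGES[stage]} calibration, R: restart",
                        (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
            cv2.imshow("Calibration", frame)
            key = cv2.waitKey(1) & 0xFF
            if key == ord(' '):      # space advances to the next calibration stage
                stage += 1
            elif key == ord('r'):    # r restarts the current stage (discard its metrics)
                pass
        cap.release()
        cv2.destroyAllWindows()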

Team Status Report for 3/23

This week we realized that while focus and flow state are closely related, they are distinct states of mind. While people have a shared understanding of focus, flow state is a more elusive term, which means people have their own internal mental models of what flow state is and looks like. Given that our ground truth data is based on Professor Dueck's labeling of flow states in her piano students, we are shifting from developing a model to measure focus to instead identifying flow states. To stay on track with our initial use case of identifying focused vs. distracted states in work settings, we plan to use Emotiv's Focus Performance Metric to monitor users' focus levels and develop our own model to detect flow states. By implementing flow state detection, our project will apply to many fields beyond traditional work settings, including music, sports, and research.

Rohan also discussed our project with his information theory professor, Pulkit Grover, who is extremely knowledgeable about neuroscience, to get feedback on the flow state detection portion of our project. He told us that achieving model test accuracy better than random chance would be a strong result, which we have achieved in our first iteration of the flow detection model.

We also began integration steps this week. Arnav and Karen collaborated on sending the yawn, gaze, and sleep detections to the backend, so these distractions are now displayed in the UI in a table format along with real-time snapshots of when each distraction occurs. Our team also met to get our code running locally on each of our machines. This led us to write a README listing the libraries that need to be installed and the steps to get the program running. This document will help us stay organized and make it easier for other users to use our application.

Regarding challenges and risks for the project this week, we were able to clear up the earlier confusion between the focused and flow states, and we are still prepared to add microphone-based detection if needed. Based on our progress this week, all three stages of the project (camera, EEG, and web app) are developing very well, and we look forward to continuing to integrate all the features.

Team Status Report for 3/16

This week, we ran some initial analysis of the EEG data. Rohan created data visualizations comparing the data during the focused vs. neutral vs. distracted states labeled by Professor Dueck. We looked at the averages and standard deviations of power values in the theta and alpha frequency bands, which typically correspond to focus states, to see if there was any clear threshold distinguishing focused from distracted states. The average and standard deviation values, as well as the data visualizations, made it clear that a linear classifier would not work to distinguish between focused and distracted states.
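
For concreteness, here is a minimal sketch of that comparison, assuming a CSV export with one row per EEG sample labeled with Professor Dueck's states; the file and column names are assumptions.

        import pandas as pd

        df = pd.read_csv("eeg_band_power_labeled.csv")   # hypothetical merged export
        bands = ["AF3_theta", "AF3_alpha", "AF4_theta", "AF4_alpha"]  # assumed column names

        for state in ["focused", "neutral", "distracted"]:
            subset = df[df["label"] == state]
            print(state, "mean:", subset[bands].mean().round(2).to_dict())
            print(state, "std: ", subset[bands].std().round(2).to_dict())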

After examining the data, we also realized that Professor Dueck labeled the data with very high granularity, noting immediately when her students exited a flow state; this could be for a period as short as one second as they turn a page. We realized that our initial hypothesis, that these flow states would correspond closely to focus states in a work setting, was incorrect. In fact, we determined that focus is a distinct concept from flow. Professor Dueck recognizes a deep flow state which can change with high granularity, whereas focus states are typically measured over longer periods of time.

Based on this newfound understanding, we plan to use the Emotiv performance metrics to determine a threshold value for focused vs. distracted states. To maintain the technical complexity of our project, we are also working on training a model to determine flow states based on the raw data we have collected and the flow state ground truth we have from Professor Dueck.

We were able to do some preliminary analysis of the accuracy of Emotiv's performance metrics, measuring the engagement and focus metrics of a user in a focused vs. a distracted setting. Rohan first read an article while wearing noise-canceling headphones with minimal environmental distractions. He then completed the same task with more ambient noise and frequent conversational interruptions. This led to some promising results: the metrics had a lower mean and higher standard deviation in the distracted setting compared to the focused setting. This gives us some confidence that we have a solid contingency plan.

There are still some challenges with using the Emotiv performance metrics directly. We will need to develop thresholding or calibration methods to determine what is considered a “focused state” based on the performance metrics. This will need to work universally across all users even though the actual performance metric values may vary between individuals.

In terms of flow state detection, Rohan trained a 4-layer neural network with ReLU activation functions and a cross-entropy loss function and was able to achieve validation loss significantly better than random chance. We plan to experiment with a variety of network configurations (changing the loss function, number of layers, etc.) to see if we can further improve the model's performance. This initial proof of concept is very promising and could allow us to detect elusive flow states using EEG data, which would have applications in music, sports, and traditional work settings.
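
A minimal PyTorch sketch of a network in this spirit; the input width of 10 band-power features and the hidden layer sizes are assumptions, not the exact architecture.

        import torch
        import torch.nn as nn

        model = nn.Sequential(
            nn.Linear(10, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 16), nn.ReLU(),
            nn.Linear(16, 2),                # logits: flow vs. not-in-flow
        )
        criterion = nn.CrossEntropyLoss()
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

        def train_step(x, y):
            # one gradient step on a batch of features x with integer labels y
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
            return loss.item()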

Our progress for the frontend and backend, as well as camera-based detections, is on track.

After working through the ethics assignment this week, we also decided it is important for our app to include features that promote mindfulness and to make sure the app does not contribute to an existing culture of overworking and burnout.

Team Status Report for 3/9

Part A was written by Rohan, Part B was written by Karen, and Part C was written by Arnav. 

Global Factors

People in the workforce, across a wide variety of disciplines and geographic regions, spend significant amounts of time working at a desk with a laptop or monitor setup. While the average work day lasts 8 hours, most people are only actually productive for 2-3 hours. Improved focus and longer-lasting productivity have many benefits for individuals including personal fulfillment, pride in one’s performance, and improved standing in the workplace. At a larger scale, improving individuals’ productivity also leads to a more rapidly advancing society where the workforce as a whole can innovate and execute more efficiently. Overall, our product will improve individuals’ quality of life and self-satisfaction while simultaneously improving the rate of global societal advancement.

Cultural Factors

In today’s digital age, there’s a growing trend among people, particularly the younger generation and students, to embrace technology as a tool to improve their daily lives. This demographic is highly interested in leveraging technology to improve productivity, efficiency, and overall well-being. Also within a culture that values innovation and efficiency, there is a strong desire to optimize workflows and streamline tasks to achieve better outcomes in less time. Moreover, there’s an increasing awareness of the importance of mindfulness and focus in achieving work satisfaction and personal fulfillment. As a result, individuals seek tools and solutions that help them cultivate mindfulness, enhance focus, and maintain a healthy work-life balance amidst the distractions of the digital world. Our product aligns with these cultural trends by providing users with a user-friendly platform to monitor their focus levels, identify distractions, and ultimately enhance their productivity and overall satisfaction with their work.

Environmental Factors

The Focus Tracker App takes into account the surrounding environment, like background motion/light, interruptions, and conversations, to help users stay focused. It uses sensors and machine learning to understand and react to these conditions. By optimizing work conditions, such as informing the user that the phone is being used too often or the light is too bright, it encourages a reduction in unnecessary energy consumption. Additionally, the app's emphasis on creating a focused environment helps minimize disruptions that could affect both the user and their surroundings.

Team Progress

The majority of our time this week was spent working on the design report.

This week, we sorted out the issues we were experiencing with putting together the data collection system last week. In the end, we settled on a two-pronged design: we will use the EmotivPRO application's built-in EEG recording system to record power readings within each of the frequency bands from the AF3 and AF4 sensors (the two sensors corresponding to the prefrontal cortex), while simultaneously running a simple Python program that takes in Professor Dueck's keyboard input: 'F' for focused, 'D' for distracted, and 'N' for neutral. While this system felt natural to us, we were not sure whether this type of stateful labeling would match Professor Dueck's mental model when observing her students. Furthermore, given that Professor Dueck would be deeply focused on observing her students, we were hoping the system would be easy enough for her to use without having to think much about it.

On Monday, we met with Professor Dueck after our weekly progress update with Professor Savvides and Jean for our first round of raw data collection and ground truth labeling. To our great relief, everything ran extremely smoothly, with the EEG quality coming through with minimal noise and Professor Dueck finding our data labeling system intuitive and natural to use. One of the significant risk factors for our project has been EEG-based focus detection: as with all types of signal processing and analysis, the quality of the raw data and ground truth labels is critical to training a highly performant model. This was a significant milestone because, while we had tested the data labeling system that Arnav and Rohan designed, it was the first time Professor Dueck was using it. We continued to collect data on Wednesday with a different one of Professor Dueck's students, and this session went equally smoothly.

Having secured some initial high-fidelity data with high-granularity ground truth labels, we feel that the EEG aspect of our project has been significantly de-risked. Going forward, we have to map the logged timestamps from the EEG readings to the timestamps of Professor Dueck's ground truth labels so we can begin feeding our labeled data into a model for training. This coming week, we hope to complete this linking of the raw data with the labels and train an initial CNN on the resulting dataset. From there, we can assess the performance of the model, verify that the data has a high signal-to-noise ratio, and begin to fine-tune the model to improve upon our base model's performance.
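
For reference, here is a minimal sketch of the keyboard labeling idea; the output filename and the console-input interface are assumptions, and the actual script Arnav and Rohan wrote may capture keys differently.

        import time

        LABELS = {"f": "focused", "d": "distracted", "n": "neutral"}

        with open("ground_truth_labels.csv", "w") as out:
            out.write("timestamp,state\n")
            while True:
                key = input("Enter F/D/N (or Q to quit): ").strip().lower()
                if key == "q":
                    break
                if key in LABELS:
                    # log the state change alongside a Unix timestamp for later alignment
                    out.write(f"{time.time()},{LABELS[key]}\n")
                    out.flush()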

A new risk that could jeopardize the progress of our project is the performance of the phone object detection model. The custom YOLOv8 model we trained does not currently meet the design requirement of mAP ≥95%. We may need to lower this threshold, improve the model with further training, or use a pre-trained object detection model. We have already found other datasets that we can further train the model on (like this one) and have also found a pre-trained model on Roboflow that performs better than the custom model we trained. This Roboflow model is something we can fall back on if we cannot get our custom model to perform sufficiently well.
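
A minimal sketch of the further-training and evaluation path using the ultralytics package; the weights path, dataset YAML, and epoch count are placeholders.

        from ultralytics import YOLO

        model = YOLO("runs/detect/train/weights/best.pt")   # assumed path to our custom weights
        model.train(data="phone_dataset.yaml", epochs=50, imgsz=640)  # placeholder dataset config
        metrics = model.val()
        print(metrics.box.map50)   # mAP@0.5, to compare against the >=95% requirement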

The schedule for camera-based detections was updated to break the work down by the implementation of each type of distraction to be detected. Unit testing and then combining each of the detectors into one module will begin on March 18.

To mitigate the risks associated with EEG data reliability in predicting focus states, we have developed 3 different plans:

Plan A involves leveraging EEG data collected from musicians, with Professor Dueck using her expertise and visual cues to label states of focus and distraction during music practice sessions. This method relies heavily on her understanding of individual focus patterns within a specific, skill-based activity.

Plan B broadens the data collection to include ourselves and other participants engaged in completing multiplication worksheets under time constraints. Here, focus states are identified in environments controlled for auditory distractions using noise-canceling headphones, while distracted states are simulated by introducing conversations during tasks. This strategy aims to diversify the conditions under which EEG data is collected. 

Plan C shifts towards using predefined performance metrics from the Emotiv EEG system, such as Attention and Engagement, setting thresholds to classify focus states. Recognizing the potential oversimplification in this method, we plan to correlate specific distractions or behaviors, such as phone pick-ups, with these metrics to draw more detailed insights into their impact on user focus and engagement. By using language model-generated suggestions, we can create personalized advice for improving focus and productivity based on observed patterns, such as recommending strategies for minimizing phone-induced distractions. This approach not only enhances the precision of focus state prediction through EEG data but also integrates behavioral insights to provide users with actionable feedback for optimizing their work environments and habits.

Additionally, we established a formula for the productivity score we will assign to users throughout the work session. The productivity score calculation in the Focus Tracker App quantifies an individual's work efficiency by evaluating both focus duration and distraction frequency. It establishes a distraction score (D) by comparing the actual number of distractions (A) encountered during a work session against an expected number (E), calculated from the session's length with an assumption of one distraction every 5 minutes. The baseline distraction score is 0.5 when the actual number of distractions equals the expected number. If A <= E, then D = 1 - 0.5 * A / E. If A > E, then D = 0.5 * E / A, so the distraction score keeps decreasing as distractions accumulate but never turns negative. The productivity score (P) is then determined by averaging the focus fraction and the distraction score: P = (focus fraction + D) / 2. This method ensures a comprehensive assessment, with half of the productivity score derived from focus duration and the other half reflecting the impact of distractions.
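
A minimal sketch of this calculation; the function and variable names are ours, and the A > E branch uses the form described above.

        def productivity_score(focus_seconds, session_seconds, actual_distractions):
            E = session_seconds / 300.0          # expected: one distraction per 5 minutes
            A = actual_distractions
            if A <= E:
                D = 1 - 0.5 * A / E              # distraction score at or below the expected rate
            else:
                D = 0.5 * E / A                  # assumed form: decays toward 0, never negative
            focus_fraction = focus_seconds / session_seconds
            return 0.5 * focus_fraction + 0.5 * D

        # Example: 45-minute session, 30 minutes focused, 12 distractions (9 expected)
        print(round(productivity_score(30 * 60, 45 * 60, 12), 3))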

Overall, our progress is on schedule.

 

Team Status Report for 2/24

This week we finalized our slides for the design presentation last Monday and incorporated the feedback received from the students and professors into our design report. We split up the work for the design report and plan to have it finalized by Wednesday so that we can get appropriate feedback before the due date on Friday. We are also building a data labeling platform for Professor Dueck and plan to meet with her this week so that we can begin the data-gathering process. No changes have been made to our schedule, and we are planning for risk mitigation by doing additional research on microphone input and LLMs in case the EEG headset does not provide the accuracy we are looking for. Overall, we are all on schedule and have completed our individual tasks. We are looking forward to implementing more features of our design this week.

Team Status Report for 2/17

Public Health, Social, and Economic Impacts

Concerning public health, our product addresses the growing concern with digital distractions and their impact on mental well-being. By helping users monitor their focus and productivity levels during work sessions, along with their correlation with various environmental distractions such as digital devices, our product will give users insight into their work and phone usage and potentially help improve their mental well-being in work environments and their relationship with digital devices.

For social factors, our product addresses an issue that affects almost everyone today. Social media bridges people across various social groups but is also a significant distraction designed to efficiently draw and maintain users’ attention. Our product aims to empower users to track their focus and understand what factors play into their ability to enter focus states for extended periods of time.

The development and implementation of the Focus Tracker App can have significant economic implications. Firstly, by helping individuals improve their focus and productivity, our product can contribute to overall efficiency in the workforce. Increased productivity often translates to higher output per unit of labor, which can lead to economic growth. Businesses will benefit from a more focused and productive workforce, resulting in improved profitability and competitiveness in the market. Additionally, our app’s ability to help users identify distractions can lead to a better understanding of time management and resource allocation, which are crucial economic factors in optimizing production. In summary, our product will have a strong impact on economic factors by enhancing workforce efficiency, improving productivity, and aiding businesses in better-managing distractions and resources.

Progress Update

The Emotiv headset outputs metrics for various performance states via the EmotivPRO API, including attention, relaxation, frustration, interest, cognitive stress, and more. We plan to compute correlations (perhaps inverse) between these performance metrics. A better understanding of how the performance metrics interact with one another, for example the effect of interest in a subject or of cognitive stress on attention, could prove extremely useful to users in evaluating which factors affect their ability to maintain focus on the task at hand. We also plan to look at this data in conjunction with Professor Dueck's focused vs. distracted labeling to understand what threshold of performance metric values denotes each state of mind.
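
A minimal sketch of the correlation computation we have in mind; the CSV export and column names are assumptions about the EmotivPRO output, not its actual schema.

        import pandas as pd

        pm = pd.read_csv("performance_metrics.csv")   # hypothetical EmotivPRO export
        cols = ["attention", "relaxation", "frustration", "interest", "stress"]  # assumed names
        print(pm[cols].corr())   # pairwise Pearson correlations, including any inverse ones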

On Monday, we met with Professor Dueck and her students to get some more background on how she works with her students and understands their flow states/focus levels. We discussed the best way for us to collaborate and collect data that would be useful for us. We plan to create a simple Python script that will record the start and end of focus and distracted states with timestamps using the laptop keyboard. This will give us a ground truth of focus states to compare with the EEG brainwave data provided by the Emotiv headset.

This week we also developed a concrete risk mitigation plan in case the EEG Headset does not produce accurate results. This plan integrates microphone data, PyAudioAnalysis/MediaPipe for audio analysis, and Meta’s LLaMA LLM for personalized feedback into the Focus Tracker App.

We will use the microphone on the user’s device to capture audio data during work sessions and implement real-time audio processing to analyze background sounds and detect potential distractions. The library PyAudioAnalysis will help us extract features from the audio data, such as speech, music, and background noise levels. MediaPipe will help us with real-time audio visualization, gesture recognition, and emotion detection from speech. PyAudioAnalysis/MediaPipe will help us categorize distractions based on audio cues and provide more insight into the user’s work environment. Next, we will integrate Meta’s LLaMA LLM to analyze the user’s focus patterns and distractions over time. We will train the LLM on a dataset of focus-related features, including audio data, task duration, and other relevant metrics. The LLM will generate personalized feedback and suggestions based on the user’s focus data.
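
A minimal sketch of the short-term audio feature extraction step with pyAudioAnalysis; the WAV file is a placeholder, and the function names follow the library's documented interface.

        from pyAudioAnalysis import audioBasicIO, ShortTermFeatures

        fs, signal = audioBasicIO.read_audio_file("work_session.wav")  # placeholder recording
        # 50 ms windows with 25 ms steps, as in the library's documentation
        features, feature_names = ShortTermFeatures.feature_extraction(
            signal, fs, 0.050 * fs, 0.025 * fs)
        print(len(feature_names), features.shape)  # e.g. feature count x number of windows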

In addition, we will provide actionable insights such as identifying common distractions, suggesting productivity techniques, or recommending changes to the work environment that will further help users improve their productivity. Lastly, we will display the real-time focus metrics and detected distractions on multiple dashboards, similar to the camera and EEG headset metrics we have planned.

To test the integration of microphone data, we will conduct controlled experiments where users perform focused tasks while the app records audio data. We will analyze the audio recordings to detect distractions such as background noise, speech, and device notifications. Specifically, we will measure the accuracy of distraction detection by comparing it against manually annotated data, aiming for a detection accuracy of at least 90%. Additionally, we will assess the app’s real-time performance by evaluating the latency between detecting a distraction and providing feedback, aiming for a latency of less than 3 seconds. 

Lastly, we prepared for our design review presentation and considered our product’s public health, social, and economic impacts. Overall, we made great progress this week and are on schedule.

Team Status Report for 2/10

This week, as a team, we incorporated all the feedback from the proposal presentation and started coming up with a more concrete, detailed plan for how we will implement each feature and the data collection for the camera and EEG headset.

We defined which behaviors and environmental factors we will detect via camera: drowsiness (a combination of eye and mouth/yawn tracking), off-screen gazing (eye and head tracking), background motion, phone pick-ups, lighting (research shows that bright blue light is better for promoting focus), and interacting with or being interrupted by other people. We were able to order and pick up the Emotiv headset from inventory and started researching the best way to utilize it. We also came up with a risk mitigation plan in case EEG focus level detection fails: the Focus Tracker App would shift toward behavior and environmental distraction detection, using a microphone as an additional input source. This will help us track overall ambient noise levels and instances of louder noises, such as construction, dog barking, and human conversation. There will also be a section for customized feedback and recommendations on ways to improve productivity, implemented via an LLM.

Lastly, we met with Dr. Jocelyn Dueck about the possibility of collaborating on our project. We will draw on her expertise in understanding the flow/focus states of her students. She will help us collect training data for EEG-based focus level detection, as she is very experienced at telling when her students are in a focused vs. unfocused state while practicing. She also proposed the idea of using anti-myopia pinhole glasses to artificially induce higher focus levels, which can be used for collecting training data and evaluating performance.

Overall, we made great progress this week and are on schedule. The main design of our project stayed the same, with only minor adjustments made to the content of our proposal/design following the feedback from our presentation last week. We look forward to continuing our progress into next week.