Rohan’s Status Report for 4/6

This week, I focused on validating the flow state detector, developing and validating a focus detector, and applying Shapley values to add explainability to our black-box flow state neural network.

For flow state detection validation, I wore noise-canceling headphones while playing Fireboy and Watergirl, a simple online video game, for 15 minutes. For the next 15 minutes, I worked on implementing the model validation script without the headphones, with ambient noise in the ECE workspace and frequent distractions from Karen. We then looked at the fraction of each recording the model classified as flow: 0.254 (25.4%) for the intended flow state recording and 0.544 (54.4%) for the intended not-in-flow recording. These results are clearly not what we expected, but we have a few initial thoughts as to why. First, Fireboy and Watergirl is a two-person game, and playing it alone was much harder than I expected and definitely not second nature to me, which is a necessary condition for entering a flow state. We therefore plan to test the flow state model on my roommate Ethan, who plays video games frequently and reports entering a flow state often while playing. By validating the model on an activity that is more likely to actually induce a flow state, we expect to see better results. If that does not work, we plan to return to the music setting, check how the model performs on the pianists we trained it on, and then test on pianists it has not seen before to better understand where the model may be overfitting.
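For reference, a minimal sketch of the per-recording computation the validation script performs; the model handle, window loader, and label encoding here are placeholders rather than our actual code.

import numpy as np

def percent_flow(model, windows):
    """Return the fraction of EEG windows in a recording that the model labels as flow.

    `model` is assumed to expose a scikit-learn-style predict(), `windows` is the
    per-window feature matrix, and 1 = flow / 0 = not flow is an assumed encoding.
    """
    preds = model.predict(windows)
    return float(np.mean(preds == 1))

# Example usage (load_windows is a hypothetical preprocessing helper):
# print(percent_flow(flow_model, load_windows("flow_recording.csv")))
# print(percent_flow(flow_model, load_windows("not_flow_recording.csv")))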

I also developed a focus detection system trained on focused and distracted recordings of Arnav, Karen, and myself. We validated it by collecting another set of focused and distracted data from Karen, but unfortunately this also produced poor results: the model predicted that 12.6% of the focused recording and 12.3% of the distracted recording was in focus. I realized soon after that the training set contained only 34 high-quality focus data points from Karen compared to 932 high-quality distracted data points, and this skew in the data is likely a strong contributor to the model's poor validation performance. We plan to incorporate this validation data into the training set and retry validation this coming week. As a backup, we inspected the Emotiv Focus Performance Metric values on this data and saw a clear distinction between the focused and distracted datasets, which had average values of 0.4 and 0.311 respectively on a scale from 0 to 1.
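As a rough illustration, the checks that surfaced the class imbalance and the Focus Performance Metric gap might look like the sketch below; the file paths and column names are hypothetical stand-ins, not our actual pipeline.

import pandas as pd

# Hypothetical files and columns standing in for our real training/validation data.
train = pd.read_csv("focus_training_data.csv")
print(train.groupby(["subject", "label"]).size())   # exposes the 34 vs. 932 skew for Karen

# Backup sanity check: compare the Emotiv Focus Performance Metric (0-1 scale)
focused = pd.read_csv("karen_focused_validation.csv")
distracted = pd.read_csv("karen_distracted_validation.csv")
print("mean focus metric (focused):   ", focused["focus_metric"].mean())
print("mean focus metric (distracted):", distracted["focus_metric"].mean())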

Finally, I applied Shapley values to our flow state model to ensure that it is picking up on meaningful features rather than noise. The SHAP package offers a wide range of functionality; I specifically explored the KernelExplainer Shapley value approximator and the summary_plot function to visualize the results. Because computing Shapley values over a large dataset, even with approximation methods, is extremely computationally intensive, I randomly selected 500 samples from the dataset to compute the Shapley values on. The basic summary plot shows the contribution each feature makes on each of the 500 points toward a classification of flow. The bar summary plot shows the mean absolute contribution of each feature to the output classification in either direction (i.e., flow or not in flow). We see that the High Beta and Gamma frequency bands from the AF3 sensor (prefrontal cortex), as well as the Theta frequency band from the Pz sensor (parietal lobe), have a high impact on the model's classification of flow vs. not-in-flow states. These plots allow us to better understand which parts of the brain, and more specifically which EEG frequency bands, correlate most strongly with flow states in the music setting. Because flow states are difficult to induce and to detect with high granularity, having access to high-quality flow state ground truth from Professor Dueck is extremely valuable. Given her extensive experience detecting flow states, and Shapley values' ability to explain the contribution of each input to the final output, we can make new progress in understanding what kinds of brain activity correspond with flow states.
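The plots shown below were generated along these lines; the model handle, feature matrix, and feature names are placeholders, and the class indexing assumes the list-per-class output that KernelExplainer returns in older SHAP versions.

import numpy as np
import shap

# `model`, `X_train`, `X`, and `feature_names` stand in for our trained flow
# classifier and its EEG band-power feature matrix.
background = shap.sample(X_train, 100)                 # background set for the kernel approximation
explainer = shap.KernelExplainer(model.predict_proba, background)

# Shapley values are expensive even when approximated, so use 500 random samples
idx = np.random.choice(len(X), size=500, replace=False)
X_sample = X[idx]
shap_values = explainer.shap_values(X_sample)          # one array per class (older SHAP versions)

# Basic summary plot: per-sample contribution of each feature toward "flow"
shap.summary_plot(shap_values[1], X_sample, feature_names=feature_names)

# Bar summary plot: mean absolute contribution of each feature
shap.summary_plot(shap_values[1], X_sample, feature_names=feature_names, plot_type="bar")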

Basic Summary Plot:

Bar Summary Plot:

Karen’s Status Report for 4/6

This week, I completed integration of phone pick-up and other-people distraction detection into the backend and frontend of our web application. The phone and other-people distraction types are now displayed on the current session page.

I also finished the facial recognition implementation. I decided on the Fast MT-CNN model for face detection and the SFace model for facial embeddings, which gave the best balance between accuracy and speed. These form the core of the facial recognition module, with the rest of the logic in the run.py and utils.py scripts. The program now recognizes when the user is no longer recognized or not in frame and reports how long the user was missing, as in the log below.

User not recognized:  08:54:02
User recognized:  08:54:20
User was away for 23.920616388320923 seconds
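A condensed sketch of the recognition check and away-time logging follows; it assumes the DeepFace library is what wraps the Fast MT-CNN detector and SFace embeddings (the report does not name the wrapper), and the reference image path and frame handling are hypothetical.

import time
from deepface import DeepFace

REFERENCE_IMG = "user_reference.jpg"   # hypothetical enrolled photo of the user
away_since = None                      # set when the user first goes missing

def check_user(frame):
    """Return True if the enrolled user is recognized in this frame."""
    global away_since
    try:
        result = DeepFace.verify(
            img1_path=frame,                  # DeepFace accepts an image path or numpy array
            img2_path=REFERENCE_IMG,
            model_name="SFace",               # embedding model
            detector_backend="fastmtcnn",     # Fast MT-CNN face detector
        )
        recognized = result["verified"]
    except ValueError:
        recognized = False                    # raised when no face is detected in the frame

    if not recognized and away_since is None:
        away_since = time.time()
        print("User not recognized: ", time.strftime("%H:%M:%S"))
    elif recognized and away_since is not None:
        print("User recognized: ", time.strftime("%H:%M:%S"))
        print(f"User was away for {time.time() - away_since} seconds")
        away_since = None
    return recognized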

I also recognized that adding facial recognition significantly slowed down the program, since facial recognition requires a large amount of processing time. Because of this, I implemented asynchronous distraction detection using threading so that consecutive frames can be processed simultaneously. I am using the concurrent.futures package to achieve this:

from concurrent.futures import ThreadPoolExecutor
executor = ThreadPoolExecutor(max_workers=8)
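A self-contained sketch of how such a pool might feed frames to the detectors; detect_distractions and the bounded capture loop below are placeholders for the real pipeline in run.py.

import cv2
from concurrent.futures import ThreadPoolExecutor

def detect_distractions(frame):
    # Placeholder for the real per-frame checks (gaze, yawning, phone,
    # other people, facial recognition).
    return {}

executor = ThreadPoolExecutor(max_workers=8)
futures = []

cap = cv2.VideoCapture(0)
while len(futures) < 100:                     # bounded loop just for the sketch
    ok, frame = cap.read()
    if not ok:
        break
    # Submitting keeps the capture loop from blocking on slow facial recognition
    futures.append(executor.submit(detect_distractions, frame))
cap.release()

results = [f.result() for f in futures]       # gather per-frame detections in order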

A next step would be distinguishing when the user is simply not in frame from when an impostor is taking the place of the user. After that, I will integrate the facial recognition data into the frontend and backend of the web app. In the following week, I will focus on facial recognition integration and on properly testing to verify my individual components.

I have done some initial testing of my distraction detection components. Arnav, Rohan, and I have all used yawning, sleeping, and gaze detection successfully, and from this initial testing these modules work well across different users and faces. Initial testing of other-people detection has also shown success and robustness for a variety of users. Phone pick-up detection needs more testing with different users and different-colored phones, but initial testing shows success on my phone. I also need to begin verifying that face recognition works for different users; so far it has worked well for me.

I have already performed some verification of individual components, such as the accuracy of the YOLOv8 phone object detector and the accuracy of MT-CNN and SFace. More thorough validation methods for the components integrated into the project as a whole are listed in our team progress report.
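For the phone detector, a spot-check along these lines is the kind of verification I have in mind; the weights file and image paths are placeholders, and COCO class 67 ("cell phone") is used to filter detections.

from ultralytics import YOLO

# Quick spot-check of the phone detector on a handful of hand-labeled frames.
model = YOLO("yolov8n.pt")          # pretrained COCO weights; class 67 = "cell phone"
frames = ["checks/phone_01.jpg", "checks/phone_02.jpg", "checks/no_phone_01.jpg"]

for path in frames:
    result = model(path, classes=[67], verbose=False)[0]
    detected = len(result.boxes) > 0
    print(f"{path}: phone detected = {detected}")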

In the coming week, I will work on the validation and verification methods. Now that all of the video-processing distraction detections are implemented, I will also work with Arnav on making the web application cleaner and more user-friendly.