Rohan’s Status Report for 4/27

This week I continued experimenting with the flow state model. I tested different configurations of training, validation, and test sets, and I experimented with adjusting the size of the model to see if it could learn the training data without overfitting, so that it would generalize to new data. Regardless of the data splits and model size, I was unable to improve performance on unseen data, which indicates we likely just need more recordings, both with new individuals and with the same people over multiple days. I also realized that the way I was normalizing the data when evaluating on the training, validation, and test sets differed from the normalization I implemented for inference. I have been working with Arnav and Karen to resolve this issue, which also introduces the need for an EEG calibration phase. We have discussed a few possible approaches for implementing this, and we also have a backup plan to mitigate the risk of the calibration not working out, which would let us make inferences without any calibration if necessary. My progress is on schedule; the remaining work is mainly last minute testing to ensure we are ready for our demo next Friday.
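
To make the fix concrete, here is a minimal sketch of the calibration-based normalization we are considering, assuming per-feature z-scoring with scikit-learn; the array shapes and the StandardScaler choice are my assumptions, not the final design.

```python
# Minimal sketch: fit normalization statistics once on a short calibration
# recording, then reuse those exact statistics for every live inference
# window, so training-time and inference-time normalization match.
import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in for band-power vectors recorded during a calibration phase
# (n_windows x n_features); in practice these come from the headset.
calibration_windows = np.random.rand(60, 18)

scaler = StandardScaler().fit(calibration_windows)

def normalize_window(window: np.ndarray) -> np.ndarray:
    """Apply the calibration statistics to one live inference window."""
    return scaler.transform(window.reshape(1, -1))
```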

Rohan’s Status Report for 3/30

This week, in preparation for our interim demo, I worked with Arnav to integrate the Emotiv Focus Performance Metric and the flow state detection from our custom neural network with the backend. Next week I plan to apply Shapley values to better understand which inputs contribute most significantly to the flow state classification. I will also test various model parameters, trying to determine the lower and upper bounds on model complexity in terms of the number of layers and neurons per layer. I also need to look into how the Emotiv software computes the FFT for the power values within the frequency bands, which are the inputs to our model. Finally, we will try training our own model for measuring focus, to see if we can detect focus using a model similar to our flow state classifier. My progress is on schedule, and I was able to test the live flow state classifier on myself while doing an online typing test, observing reasonable fluctuations in and out of flow states.
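
While I still need to confirm Emotiv's exact pipeline, below is one common way to compute band power from raw EEG (Welch's method), which I can use as a point of comparison; the sampling rate and band edges reflect Emotiv's published specs as I understand them, and the signal here is a synthetic stand-in.

```python
# Sketch: estimating frequency-band power from one channel of raw EEG,
# for comparison against however Emotiv computes its band-power values.
import numpy as np
from scipy.signal import welch
from scipy.integrate import simpson

FS = 128  # Emotiv Insight sampling rate in Hz
BANDS = {"theta": (4, 8), "alpha": (8, 12), "low beta": (12, 16),
         "high beta": (16, 25), "gamma": (25, 45)}

def band_powers(eeg: np.ndarray) -> dict:
    """Estimate the power in each frequency band for one raw EEG channel."""
    freqs, psd = welch(eeg, fs=FS, nperseg=FS * 2)  # PSD from 2 s segments
    powers = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        powers[name] = simpson(psd[mask], x=freqs[mask])  # integrate the PSD
    return powers

print(band_powers(np.random.randn(FS * 10)))  # synthetic stand-in signal
```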

Rohan’s Status Report for 3/23

In order to better understand how to characterize flow states, I had conversations with friends in various fields and synthesized insights from multiple experts in cognitive psychology and neuroscience, including Cal Newport and Andrew Huberman. Focus can be seen as a gateway to flow. A flow state can be thought of as a performance state: while training for sports or music can be quite difficult and requires conscious focus, one may enter a flow state once they have achieved mastery of a skill and are performing for an audience. A flow state also typically involves a loss of reflective self-consciousness (non-judgmental thinking). Interestingly, Prof. Dueck described this lack of self-judgment as a key factor in flow states in music, and a friend I spoke with this past week about his cryptography research described something strikingly similar. Flow states typically involve a task or activity that is both second nature and enjoyable, striking a balance between not being too easy or tedious and not being overwhelmingly difficult. When a person experiences a flow state, they may feel a more “energized” focus and complete absorption in the task at hand, and as a result they may lose track of time.

Given our new understanding of the distinction between focus and flow states, I made some structural changes to our detection model, which previously targeted focus and now targets flow. First, instead of classifying inputs as Focused, Neutral, or Distracted, I switched the outputs to just Flow or Not in Flow. Second, last week I was only filtering for a high quality EEG signal in the parietal lobe (the Pz sensor), which is relevant to focus. Here is the confusion matrix for classifying Flow vs. Not in Flow using only the Pz sensor:

Research has shown that increased theta activity in the frontal areas of the brain and moderate alpha activity in the frontal and central areas are characteristic of flow states. This week, I continued filtering on the parietal lobe sensor and now also require high quality signals from the two frontal sensors (AF3 and AF4). Here is the confusion matrix for classifying Flow vs. Not in Flow using the Pz, AF3, and AF4 sensors:

This model incorporates data from the Pz, AF3, and AF4 sensors and classifies input vectors, which include the overall power at each sensor and the power within each of the 5 frequency bands at each sensor, into either Flow or Not in Flow. It achieves a precision of 0.8644, a recall of 0.8571, and an F1 score of 0.8608. The overall accuracy of this model improved over the previous one, but the total amount of data is lower due to the additional conditions for filtering out low quality data.
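
For reference, these numbers follow directly from the confusion matrix counts; a small sketch, where tp, fp, and fn are the matrix cells for the Flow class:

```python
# Sketch: precision, recall, and F1 from 2x2 confusion-matrix counts.
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    precision = tp / (tp + fp)  # of predicted Flow, the fraction truly in Flow
    recall = tp / (tp + fn)     # of true Flow samples, the fraction caught
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1
```

With precision 0.8644 and recall 0.8571, the harmonic mean works out to about 0.861, consistent with the reported F1 up to rounding.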

I plan on applying Shapley values, a concept that originated in game theory but in recent years has been applied to explainable AI. This will give us a sense of which of our inputs are most relevant to the final classification. It will be interesting to see whether what our model is picking up on ties into the existing neuroscience research on flow states or whether it is seeing something new or different.
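
A minimal sketch of how this might look with the shap library's model-agnostic KernelExplainer; the data, feature count, and prediction function below are stand-ins for our real band-power features and trained network.

```python
# Sketch: Shapley-value feature attribution with the shap library.
import numpy as np
import shap

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 18))  # stand-in for band-power feature vectors
X_test = rng.normal(size=(20, 18))

def predict_flow(X: np.ndarray) -> np.ndarray:
    # Stand-in for the trained network's P(Flow) output; in practice this
    # would wrap the real model's forward pass.
    return 1 / (1 + np.exp(-X[:, 0]))

background = shap.sample(X_train, 50)        # background distribution
explainer = shap.KernelExplainer(predict_flow, background)
shap_values = explainer.shap_values(X_test)  # one attribution per feature
```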

My Information Theory professor, Pulkit Grover, introduced me to a researcher in his group this week who is working on a project to improve the equity of EEG headsets by making them interface well with different types of hair, specifically coarse Black hair, which often prevents standard EEG electrodes from getting a high quality signal. This is interesting to us because one of the biggest issues and highest risk factors in our project is getting a good EEG signal, since any kind of hair can interfere with the electrodes, which are meant to make skin contact. I also tested our headset on a bald friend to determine whether our signal quality issues stem from the headset itself or from hair interference. I found that the signal quality was much higher on my bald friend, which was very interesting. For our final demo, we are thinking of inviting this friend to wear the headset to make for a more compelling presentation: because we only run the model on high quality data, hair interference with non-bald participants would leave the model making very few predictions during our demo.

Rohan’s Status Report for 3/16

This week I tried to implement a very simple thresholding-based approach to detect flow state. Upon inspecting the average and standard deviation for the theta and alpha bands (the focus-related frequency bands), I saw that there was no clear distinction between the flow states and that the variance was very high. I went on to visualize the data to see if there was any visible linear distinction between flow states, and there was not. This told me that we would need to introduce some sort of non-linearity into our model, which led me to implement a simple 4-layer neural network with ReLU activation functions and cross-entropy loss. The visualizations are shown below. One uses the frontal lobe sensors AF3 and AF4, and the other uses the parietal lobe sensor Pz. The plots show the overall power for each sensor and the power values for the theta and alpha frequency bands at each sensor. The x-axis is time and the y-axis is power. The green dots represent focused, red is distracted, and blue is neutral.
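
Below is a sketch of what this simple network looks like, assuming PyTorch; the input size and hidden-layer widths are placeholders rather than the exact configuration we trained.

```python
# Sketch: a simple 4-layer ReLU network for 3-class focus classification.
import torch.nn as nn

N_FEATURES = 18  # placeholder: overall + per-band power for the sensors used
N_CLASSES = 3    # focused / neutral / distracted

model = nn.Sequential(
    nn.Linear(N_FEATURES, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, N_CLASSES),  # raw logits; CrossEntropyLoss applies softmax
)
loss_fn = nn.CrossEntropyLoss()
```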

When I implemented this model, I trained it on only Ishan's data, on only Justin's data, and then on all of the data. On Ishan's data I saw the lowest validation loss of 0.1681, on Justin's data the validation loss was a bit higher at 0.8485, and on all the data the validation loss was 0.8614, all of which are better than random chance, which for three classes yields a cross-entropy loss of ln(3) ≈ 1.098. I have attached the confusion matrices for each dataset below, in order. For next steps I will experiment with different learning rates, switch from Adam to AdamW with learning rate scheduling, try using more than 4 layers, try different activation functions, classify only flow vs. not in flow instead of treating neutral and distracted separately, and use a weighted loss function such as focal loss.
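
As a sketch of the planned optimizer changes, assuming PyTorch: AdamW with a cosine learning rate schedule and a class-weighted cross-entropy. The model, data, and class weights below are stand-ins.

```python
# Sketch: training loop with AdamW, LR scheduling, and weighted loss.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-ins for the real model and labeled band-power data.
model = nn.Sequential(nn.Linear(18, 64), nn.ReLU(), nn.Linear(64, 3))
X = torch.randn(256, 18)
y = torch.randint(0, 3, (256,))
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=50)
loss_fn = nn.CrossEntropyLoss(weight=torch.tensor([1.0, 2.0, 2.0]))  # placeholder weights

for epoch in range(50):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
    scheduler.step()  # decay the learning rate once per epoch
```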

Overall my progress is ahead of schedule, as I expected to have to add significantly more complexity to the model to see any promising results. I am happy to see performance much better than random chance with a very simple model, even before I have had a chance to tune any of the hyperparameters.

Rohan’s Status Report for 3/9

This week I spent a couple of hours working with Arnav to finalize our data collection and labeling system in preparation for our meeting with Professor Dueck. Once this system was implemented, I spent time with two different music students getting the headset calibrated and ready to record the raw EEG data. Finally, on Monday and Wednesday I brought it all together with the music students and Professor Dueck to orchestrate the data collection and labeling process. This involved getting the headset set up and calibrated on each student, helping Professor Dueck get the data labeling system running, and observing as the music students practiced and Professor Dueck labeled them as focused, distracted, or neutral. I watched Professor Dueck observe her students and tried to pick up on the kinds of things she was looking for, while also making sure that she was using the system correctly and not encountering any issues.

I also spent a significant amount of time working on the design report. This involved doing some simple analysis on the first set of data we collected on Monday and making some key design decisions. After Monday's session, I looked through the EEG quality of the readings and found that the overall EEG quality generally hovered between 63 and 100. Initially, I figured we would live with the variable EEG quality and go forward with our plan: pass the power readings from each of the EEG frequency bands from each of the 5 sensors in the headset into the model, and also add the overall EEG quality value as an input so that the model could account for the variability. However, on Wednesday when we collected data again, we realized that the EEG quality from the two sensors on the forehead (AF3 and AF4) tended to be at 100 for a significant portion of the readings in our dataset. We also learned that brain activity in the prefrontal cortex (located near the forehead) is highly relevant to focus levels. This led us to decide to work only with readings where the EEG quality for both the AF3 and AF4 sensors was 100, which avoids having to pass the EEG quality as an input and depending on the model to learn to account for variable levels of SNR in our training data. This was a key design decision: it means we can have much higher confidence in the quality of the data going into the model, because according to Emotiv, the contact quality and EEG quality are then as strong as possible.
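
A small sketch of this filtering step, assuming the session data lands in a pandas DataFrame; the file and column names are hypothetical placeholders for whatever the EmotivPRO export actually uses.

```python
# Sketch: keep only rows where both frontal sensors report maximum quality.
import pandas as pd

df = pd.read_csv("session.csv")  # hypothetical EmotivPRO export
clean = df[(df["EQ.AF3"] == 100) & (df["EQ.AF4"] == 100)]
print(f"kept {len(clean)} of {len(df)} rows after quality filtering")
```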

My progress is on schedule, and this week I plan to link the raw EEG data with the ground truth labels from Professor Dueck, as well as implement an initial CNN for focused, distracted, or neutral state detection based on EEG power values from the prefrontal cortex. From that point, I will continue to fine-tune the model and retrain as we accumulate more training data from our collaboration with Professor Dueck and her students in the School of Music.
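
For the linking step, one approach is to align each EEG row with the most recent label at or before its timestamp; a sketch assuming pandas, with hypothetical file and column names:

```python
# Sketch: attach the most recent ground-truth label to each EEG sample.
import pandas as pd

eeg = pd.read_csv("eeg.csv").sort_values("timestamp")        # hypothetical files
labels = pd.read_csv("labels.csv").sort_values("timestamp")  # focused/distracted/neutral
merged = pd.merge_asof(eeg, labels, on="timestamp", direction="backward")
```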

Rohan’s Status Report for 2/24

This week I secured the EmotivPRO subscription, which had been blocking our progress on EEG-based focus state detection. With the subscription, we can now build out the data labeling platform for Professor Dueck and begin implementing a basic detection model which takes in the EmotivPRO performance metrics and outputs a focus state: focused, distracted, or neutral. I was able to collect some initial readings wearing the headset myself while working at home. I began familiarizing myself with the Emotiv API, connected to the headset via Python code, and collected performance metric data from the headset. I am currently encountering an error when trying to download the performance metric data from the headset to a CSV on my laptop, which I suspect is an issue with the way the license is configured or with credentials not being passed properly somewhere in the script. I also spent a significant amount of time working on the design report, which is due next week. Finally, I began researching what kinds of detection models would lend themselves to our EEG-based focus level detection and settled on a 1D convolutional neural network (tailored to time series data), which I will begin experimenting with as soon as we finalize our data collection platform and determine the format in which we will read in the data. Overall, my progress is still on schedule. Looking forward to next week, I plan to implement the data collection platform with Arnav, do some further CNN research and testing, and finalize our design report for submission.
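
Below is a sketch of the style of 1D CNN I have in mind, assuming PyTorch; the channel count matches the headset's 5 sensors, while the window length, kernel sizes, and layer widths are placeholders.

```python
# Sketch: a 1D CNN over windowed multi-channel EEG time series.
import torch
import torch.nn as nn

N_SENSORS = 5  # one input channel per headset sensor
WINDOW = 256   # samples per classification window (placeholder)

cnn = nn.Sequential(
    nn.Conv1d(N_SENSORS, 16, kernel_size=7, padding=3), nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(32, 3),  # logits for focused / distracted / neutral
)

logits = cnn(torch.randn(8, N_SENSORS, WINDOW))  # batch of 8 windows
```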

Rohan’s Status Report for 2/17

This week I spent time understanding how to improve the contact quality of the EEG headset. I set the headset up on myself and made adjustments until I finally reached 100% contact quality. I met with Justin, one of the piano players Professor Dueck trains, to teach him how to wear the headset and introduce him to the EmotivPRO software. I have also continued to research methods for detecting focus via EEG, including training an SVM or CNN on the EEG frequency bands delta, theta, and alpha, which correspond closely to attention. We learned that EmotivPRO already provides detection of attention, interest, cognitive stress, and other brain states in the form of numerical performance metrics. We are thinking of doing some further processing on these numbers to show the user a binary indicator of whether they are focused or not, as well as providing insight into what factors are playing a role in their focus levels. My progress is on schedule, but I am currently blocked on prototyping with the EEG data until the EmotivPRO subscription is purchased. I will follow up with the ECE inventory/purchasing team to ensure that this does not become an issue given our schedule. In the next week, I hope to set up the EEG focus state data labeling system for Professor Dueck and begin researching and computing correlation metrics between the various performance metrics.
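
A minimal sketch of the kind of post-processing we are considering, assuming the performance metric arrives as a 0-1 time series; the smoothing window and threshold are placeholders to be tuned against labeled data.

```python
# Sketch: smooth the numeric attention metric, then threshold it into a
# binary focused / not-focused indicator.
import numpy as np

def focus_indicator(attention: np.ndarray, threshold: float = 0.6,
                    smooth: int = 5) -> np.ndarray:
    """Moving-average smoothing followed by a simple threshold."""
    kernel = np.ones(smooth) / smooth
    smoothed = np.convolve(attention, kernel, mode="same")
    return smoothed >= threshold
```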

Rohan’s Status Report for 2/10

This past week, I prepared and presented the proposal presentation. I acquired the Emotiv Insight headset from the ECE inventory and did some initial testing and investigation. I read research papers that studied attention using EEG data to get a better understanding of how to process the raw EEG data and what kinds of models people have had success with in the past. I set up the headset on myself and got it connected to my laptop via Bluetooth. At this point, I encountered some issues trying to get a high fidelity sensor connection to my head. I experimented with adjusting the headset on my head according to the specifications and applying saline solution to the sensor tips. Eventually, I was able to get the sensor contact quality to hold steady in the 60-70% range. I also realized that we will need an EmotivPRO subscription to export any of the raw EEG data off the headset, so I filled out the order form and reached out to Quinn about how to go about getting the license. My progress is on schedule. In the next week, I need to chat with Jean to get feedback on how to improve our sensor contact quality, or at least understand what range is acceptable for signal processing. I need to secure the EmotivPRO subscription so we can export the raw EEG data from the headset. At that point, I will work with Arnav to develop a training data labeling platform that Professor Dueck will use to label her students' raw EEG data as focused, distracted, or neutral. Finally, with the EmotivPRO subscription, I can also start setting up a simple signal processing pipeline to preprocess and interpret the EEG data from the headset to detect focus levels.