My primary focus this week was collecting data samples for our ML model. The layout of how I collected data can be found here. I sampled four individuals, recording baseline, blink, double blink, triple blink, wink left, and wink right signals. In total, I collected around 600 samples spanning the 5 different signals we hope to differentiate and sorted them all into our repository. This included writing Python scripts that automatically parse and transform the data into properly labeled samples. The first script uses Pandas to cut the continuous 5-10 minute EEG recordings exported from EmotivPro, where we did our data collection, into labeled three-second recordings. The second script automates feature extraction and builds a table of feature vectors for ML modeling. Finally, I used sklearn to sandbox a random forest classifier and a logistic regression model that distinguish blink signals from baseline signals. With the input data split into 70 percent training and 30 percent testing, both classifiers achieved above 98 percent accuracy in separating the two kinds of signals. I reran both models with each new individual’s data, and both show promising results for our classification task. Next week, I will work on the design report, collect more data samples if we can find subjects, and build out more infrastructure to ease model development and data processing.
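The workflow described above (windowing, feature extraction, then a 70/30 train/test comparison of the two classifiers) can be sketched roughly as follows. This is a minimal illustration and not the actual scripts in our repository. The sampling rate, the channel names `ch1` and `ch2`, the summary-statistic features, the stratified split, and every function name are assumptions made for the example. The data is synthetic noise, not EEG, so the printed accuracies say nothing about our real results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

SAMPLE_RATE_HZ = 128  # assumed for illustration; not stated in the report
WINDOW_SECONDS = 3    # window length from the report


def cut_windows(recording, label):
    """Split a continuous recording into non-overlapping labeled windows."""
    size = SAMPLE_RATE_HZ * WINDOW_SECONDS
    n_windows = len(recording) // size  # any trailing partial window is dropped
    return [(recording.iloc[i * size:(i + 1) * size], label)
            for i in range(n_windows)]


def extract_features(window):
    """Per-channel summary statistics (a hypothetical feature set)."""
    features = {}
    for channel in window.columns:
        values = window[channel].to_numpy()
        features[f"{channel}_mean"] = float(values.mean())
        features[f"{channel}_std"] = float(values.std())
        features[f"{channel}_ptp"] = float(np.ptp(values))  # peak-to-peak range
    return features


def build_feature_table(windows):
    """Build one row of features plus a label column per window."""
    rows = []
    for window, label in windows:
        row = extract_features(window)
        row["label"] = label
        rows.append(row)
    return pd.DataFrame(rows)


def synthetic_recording(rng, n_windows, blink):
    """Fake two-channel data: noise, plus one large deflection per window if blink."""
    size = SAMPLE_RATE_HZ * WINDOW_SECONDS
    data = rng.normal(0.0, 1.0, size=(n_windows * size, 2))
    if blink:
        for i in range(n_windows):
            start = i * size + size // 2
            data[start:start + 20] += 15.0
    return pd.DataFrame(data, columns=["ch1", "ch2"])


rng = np.random.default_rng(0)
windows = (cut_windows(synthetic_recording(rng, 40, blink=False), "baseline")
           + cut_windows(synthetic_recording(rng, 40, blink=True), "blink"))
table = build_feature_table(windows)

X = table.drop(columns="label")
y = table["label"]
# 70 percent training, 30 percent testing, as in the report; stratify is my addition
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # fraction of test windows correct
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

Wrapping the logistic regression in a pipeline with `StandardScaler` keeps the scaling fitted on the training split only, so the same two-model comparison can be rerun unchanged whenever a new subject's windows are added to the table.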