Status Reports – Team A2: Project LAKE – Logging of Acoustic Keyboard Emanations

in a hole in the ground there lived a hobbit. not a nasty, dirtym, wet hole, filled with the ends of worms and an oozy smell, nor yet a dry, bare, sandy hole with nothing in it to seit down on or to eat, it was a hobbit hole and that means comfort. it had a perfectly round door like a porthole, painted green, with a shiny yellow brass knob in the exact middle. Thdoor opened on to a tube shjaped hall like a tunnel, a very comfortable tunnel without smoke, with paneled walls, and floors tiled and carpeted, provided with polished chairs, and lots and lots of pegs for hats and coats the hobbit was fond of visitors. the tunnel wound on and on. going fairly but not quite straight in the side of the hill. the hill ass all the people for many miles round called it and many little round opened out of it, first on one side and then another. no going upstairsor the hobbitbedrooms bathroonms cellars, pantries lots of these, wardrobes he had whole rooms devoted toclothes. kjitchens dining rooms, all were on the same door, and nindeed on the same passagew. the best rooms were all on the left hand side going in for these were the only ones to have windows deep set round windows looking over his garden and meadows behond slopping to the river.

Status Report 11

Kevin

Accomplishments

- This week I helped James build the soundproof box he designed. The inside of the box is lined with two layers of a sound-deadening material, which is the silver layer in the images below. It is a dense material made to absorb sound in vehicles.
- There is an acrylic window on top, two layers thick, so the user can see the keyboard. There is an air gap between the layers to try to further increase sound isolation.
- The box is closed on all sides except where the user inserts their hands. The closed sides are tightly held together to improve sound isolation.
- The soundproofing material creates significant echoes inside the box since the sound cannot leave easily. To fix this, we padded the inside with sound-absorbing foam. This is a very light material that is often used in home studios to reduce reflected sound.
- The soundproofing box is quite effective. Using an app on my phone to test, we found it can reduce the noise level by 15 dB. We hope this box will help increase our accuracy by significantly reducing the noise in the loud demo environment.

Upcoming Work

Next week is the final demo and report. We will be spending our time finalizing our demo and writing the report.

James

Accomplishments

- This week, I designed a soundproofing box to help keep the noise level during the demo close to that of a moderately quiet office. The box is 26 inches x 18 inches x 6 inches and can house full-sized keyboards. An acrylic window was cut into the top to allow easy viewing of the keyboard.
- I wrote a simplified version of the code for the demo, allowing for quick input of a recording of an audience member’s typing.

Upcoming work

The only upcoming work is the demo and the final report. We will also record a demonstrative video in case the noise levels in the demo room make effective classification impossible.

Ronit

Accomplishments

Built the sound proof box and evaluated its effectiveness by collecting data and trying to infer passwords.
We found that the box reduces outside noise by about 20 dB.
Set up the ESP32 for the demo and temporarily disabled power-saving mode (the demo room will be too noisy for the ESP32 to go to sleep).
Tested the project against a mechanical keyboard. We found that our algorithm works better for membrane keyboards, but this might be because we tuned our parameters for a membrane keyboard.
Performed user tests (with people other than Kevin, James, and me).
Worked on a video for the demo, in case the live demo does not work in the noisy demo room.
Worked on final report.

Upcoming work

Work on the final report.

Team Status

Accomplishments

- This week, the team worked closely together to construct a soundproofing box to try to maintain a moderate sound level during the demo.
- Slightly simplified versions of the signal processing and ESP32 code were written to allow for a more demonstrable project.

Upcoming work

In the next week, we will be demoing our project and writing the final report.

Status Report 10

Kevin

Accomplishments

- The new PCBs and parts came in this week. We assembled and tested all of them. There appear to be no major issues with any of the assembled boards. We were able to flash all of them successfully. Additionally, we can power them via batteries and charge the batteries in the manner intended. Both microphones are functioning properly and we are receiving good audio quality.
- There does appear to be a minor issue with the battery management unit, however. On two of the boards, when powered only through the 5V pin, the output voltage sometimes drops below 3.3V, causing the ESP32 to brown out. The issue is likely caused by imperfect solder joints, but it is not severe enough to warrant rework.
- I tested the current draw of the board during the different power modes. The board pulls about 120 mA in normal use and 0.7 mA in deep sleep mode.
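The two measured currents can be combined into a rough duty-cycle estimate. The sketch below is only illustrative: the battery capacity and the fraction of time spent awake are assumptions (neither is stated in the report), and converter losses are ignored.

```python
def average_current_ma(active_ma, sleep_ma, active_fraction):
    """Time-weighted average current for a board that alternates between modes."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return active_ma * active_fraction + sleep_ma * (1.0 - active_fraction)


def ideal_runtime_hours(capacity_mah, average_ma):
    """Ideal runtime, ignoring converter losses and battery aging."""
    return capacity_mah / average_ma


ACTIVE_MA = 120.0       # measured: normal use
SLEEP_MA = 0.7          # measured: deep sleep
CAPACITY_MAH = 1000.0   # assumption: hypothetical battery, not from the report
ACTIVE_FRACTION = 0.25  # assumption: awake a quarter of the time

avg = average_current_ma(ACTIVE_MA, SLEEP_MA, ACTIVE_FRACTION)
print(f"average current: {avg:.1f} mA")
print(f"ideal runtime: {ideal_runtime_hours(CAPACITY_MAH, avg):.1f} h")
```

Because the sleep current is so much smaller than the active current, the estimate is dominated by how often the board is awake.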

Upcoming Work

- Next week James will be giving our final presentation. We are finishing up the Final Presentation and practicing the talk.
- We will mostly be focused on getting ready for the demo and practicing how everything will work live.

James

Accomplishments

- This week, I shifted the machine learning work toward labeled data, as we were unsuccessful in fully eliminating dropped data.
- By using labeled data, I was able to achieve a leave-one-out cross validation accuracy of 16%, with a sample size of 1107.
- Moving forward with this trained classifier, we were able to recover unlabeled 10-character random passwords with fairly good accuracy. We met our original requirement of guessing 80% of 10-character random passwords in 75 tries or fewer. We also achieved a successful guess rate of 50% in 5 tries or fewer, and 40% in 1 try.

Upcoming work

In this next week, we will need to work toward preparing for the demo. I will further tune the parameters of the classifier to attempt to improve accuracy. I will construct a soundproofing box to lower the loud background noise we expect in the gym during the final demo.

Ronit

Accomplishments

The ESP32 is still randomly dropping large chunks of data. As such, we have been unable to collect accurate TDoA data.
I tried switching to the auxiliary peripheral clock, an external, more accurate oscillator on the ESP32, but the issue still persists.
For a time we thought the issue might be that the DMA buffers were filling up, so we tried increasing the DMA buffer size and switching to UDP. However, this did not seem to fix the issue.
We have so far been unable to ascertain the reason for the dropped data.
I finally tested the ESP32's current draw in deep sleep mode. It comes to about 0.7 mA, which is very small compared to its nominal current draw of 0.1-0.3 A during normal data collection and transmission.
At this point we have to rely on frequency features alone. I worked with James to get a fresh, clean set of data. With leave-one-out cross validation, we were able to get about a 20-30% error rate.
We generated some confusion matrices and used a breadth-first approach to generate the top 75 possible guesses for the password.
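One way to turn per-character probabilities into a ranked list of password guesses is sketched below. It assumes each character position has a table of candidate characters with probabilities (for example, a normalized row of a confusion matrix) and that positions are independent. It uses a best-first search over a heap, which may differ from the breadth-first approach the team actually used; the function name `top_guesses` is hypothetical.

```python
import heapq
import math


def top_guesses(position_probs, k):
    """Return the k most probable strings, most probable first.

    position_probs: one dict per character position mapping a candidate
    character to its probability. Each dict needs at least one entry > 0.
    """
    # Rank each position's candidates by negative log-probability so that
    # the cost of a whole string is a simple sum.
    ranked = [
        sorted((-math.log(p), ch) for ch, p in probs.items() if p > 0)
        for probs in position_probs
    ]
    start = tuple(0 for _ in ranked)
    heap = [(sum(r[0][0] for r in ranked), start)]
    seen = {start}
    results = []
    while heap and len(results) < k:
        cost, idx = heapq.heappop(heap)
        results.append("".join(ranked[i][j][1] for i, j in enumerate(idx)))
        # Successors: demote exactly one position to its next-best candidate.
        for i, j in enumerate(idx):
            if j + 1 < len(ranked[i]):
                nxt = idx[:i] + (j + 1,) + idx[i + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    new_cost = cost - ranked[i][j][0] + ranked[i][j + 1][0]
                    heapq.heappush(heap, (new_cost, nxt))
    return results


# Toy two-character example with made-up probabilities.
probs = [{"a": 0.7, "s": 0.3}, {"b": 0.6, "n": 0.4}]
print(top_guesses(probs, 3))  # ['ab', 'an', 'sb']
```

Calling `top_guesses(probs, 75)` on ten position tables would give a 75-guess list comparable in size to the one described above.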

Upcoming work

I will continue to investigate the dropped data, but my major focus will be on integration and getting a working demo.

Team Status

Accomplishments

- This week, the team worked closely together to assemble the final revision of the PCB. Three working boards were assembled.
- Different options were explored to eliminate dropped packets to aid in TDoA localization and clustering. However, because we were unable to eliminate them, we have decided to move forward with labeled data.
- We have achieved good accuracy in classification using labeled data and have met our original requirement for password accuracy.

Upcoming work

In the next week, we will need to focus on preparing for the final demo.

Status Report 9

Kevin

Accomplishments

- This week I reviewed the last PCB revision. I placed the order for the PCB through PCBway, same as the previous boards. I also ordered more parts so we have enough to build out 3 of the new PCBs. This should be our last order. We won’t have enough time to buy new parts so I ordered extras of some components that could easily get damaged or lost.
- I worked on improving the keystroke detection using the delta method from earlier. I was able to tune the parameters to get a slightly improved result. However, we switched to using a thresholding method as it seems to perform better.
- I worked with James on collecting longer samples of data. In order to test our clustering performance, we collected about 30 samples from each key on the keyboard. We then experimented with different clustering techniques and features. K-means with Euclidean distance gave us the best results.
- While collecting the data, we used two boards so that we could also experiment with TDoA data. The TDoA appeared to be working until about one third of the way into the audio clip. At that point the same keystrokes appeared 85 ms apart on the two recordings, which does not make sense. Samples may be getting added or dropped during transmission.
- We believe using cepstral features and TDoA will give us decent clustering results.
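As an illustration of the thresholding style of keystroke detection mentioned above, here is a minimal sketch using short-time energy. The frame length, threshold ratio, and minimum gap are assumed values chosen for the example, not the team's tuned parameters, and `detect_keystrokes` is a hypothetical name.

```python
import numpy as np


def detect_keystrokes(signal, fs, frame_ms=10.0, threshold_ratio=5.0, min_gap_ms=100.0):
    """Return sample indices where short-time energy crosses a threshold.

    The threshold is threshold_ratio times the median frame energy, used here
    as a rough stand-in for the noise floor. Detections closer than
    min_gap_ms to the previous onset are merged into that onset.
    """
    frame = max(1, int(fs * frame_ms / 1000.0))
    n_frames = len(signal) // frame
    frames = np.asarray(signal[: n_frames * frame], dtype=float).reshape(n_frames, frame)
    energy = np.sum(frames ** 2, axis=1)
    threshold = threshold_ratio * np.median(energy)
    min_gap = int(fs * min_gap_ms / 1000.0)
    onsets = []
    for i in np.flatnonzero(energy > threshold):
        start = int(i * frame)
        if not onsets or start - onsets[-1] >= min_gap:
            onsets.append(start)
    return onsets
```

The returned onsets can then be used to cut fixed-length windows around each keystroke for feature extraction.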

Upcoming Work

- Next week I will be focusing on supporting the training and clustering effort. This will involve collecting data and tuning parameters.
- The new PCB should arrive this week, so once that arrives I will be putting it together and verifying its functionality.

James

Accomplishments

- This week, we worked on collecting long samples of 3-way TDoA data. TDoA between the PCBs was found to have high degrees of separation for non-adjacent keys.
- However, when moving to a much longer audio recording (6 minutes), we found that the audio signals from the two microphones were becoming misaligned, with the same keystroke appearing on one microphone over 80 ms before the other. This should not be possible, as that would require a distance difference of 27 meters based on the speed of sound through air. We suspect that one or both of the sensor packages is dropping samples.
- We found a faster, more noise-resistant method of cracking the substitution cipher problem, using quadgram probability data from http://practicalcryptography.com/. This method was able to decipher a 5500-word cipher within 10 minutes. Noise and unknown word boundaries had minimal effect.
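The 27-meter figure for the 80 ms misalignment follows directly from the speed of sound. A short check, assuming 343 m/s (dry air at roughly 20 °C); the 1-meter microphone spacing in the second call is a hypothetical value for illustration only:

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 degrees C


def path_difference_m(time_offset_s, speed=SPEED_OF_SOUND_M_S):
    """Extra distance sound would have to travel to explain a time offset."""
    return time_offset_s * speed


def max_plausible_offset_s(mic_spacing_m, speed=SPEED_OF_SOUND_M_S):
    """Largest physically possible TDoA for two microphones this far apart."""
    return mic_spacing_m / speed


print(f"{path_difference_m(0.080):.1f} m")  # 27.4 m
print(f"{max_plausible_offset_s(1.0) * 1000:.2f} ms")
```

Any measured offset larger than the bound from `max_plausible_offset_s` cannot be acoustic and points to a timing or dropped-sample problem instead.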

Upcoming work

I will need to fine tune the clustering parameters to improve clustering accuracy.

Ronit

Accomplishments

In order to increase the resolution of the TDoA data, we increased the sampling rate from 40 kHz to 60 kHz. There is an average separation of about 1 cm between each key switch on the keyboard. With a higher sampling rate, the number of samples of delay separating neighboring keys increases.
We were able to get good separation of keystrokes using TDoA and cepstral features.
We tried using 3-way TDoA, but we were not able to collect good data. The ESP32 dev board with the external mic was not properly impedance matched.
We found an efficient means of solving the substitution cipher using n-grams. We are now able to crack substitution ciphers in less than 10 minutes for passages under 400 words.
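To see why the sampling rate matters for TDoA resolution, the sketch below converts between path-length differences and sample counts. It assumes a speed of sound of 343 m/s, and it treats the 1 cm key separation as the path-length difference purely for illustration; the real difference depends on where the microphones sit relative to the keys.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 degrees C


def distance_per_sample_mm(fs_hz, speed=SPEED_OF_SOUND_M_S):
    """Path-length difference represented by a one-sample delay."""
    return speed / fs_hz * 1000.0


def samples_for_path_difference(distance_m, fs_hz, speed=SPEED_OF_SOUND_M_S):
    """Delay, in samples, produced by a given path-length difference."""
    return distance_m / speed * fs_hz


for fs in (40_000, 60_000):
    print(
        f"{fs} Hz: {distance_per_sample_mm(fs):.2f} mm per sample, "
        f"1 cm = {samples_for_path_difference(0.01, fs):.2f} samples"
    )
```

At 40 kHz one sample corresponds to about 8.6 mm of path difference, and at 60 kHz to about 5.7 mm, so the higher rate gives finer steps between neighboring keys.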

Upcoming work

We need to collect 3-way TDoA data to get better separation, so we need to build a third PCB.
We need to start integration.

Team Status

Accomplishments

- This week we placed the order for our final PCB.
- We found a way to efficiently decode the substitution cipher.
- We were able to get good clustering using cepstral and TDoA data.

Upcoming work

In the following weeks, we will need to divert more attention to the machine learning aspects of the project, as well as fine-tune many of the signal processing algorithms to make them more robust and effective.

Status Report 8

Kevin

Accomplishments

This week I finished the placement and routing of the new PCB. It is 2”x1.5”, which should allow for portability as well as easy integration into any package size. While routing, I used many pours this time to connect power between the two power management pins and from external power. This should help to produce cleaner power, leading to better signal integrity.
The top layer is a ground pour, similar to the last revision, to make routing much simpler. This time, however, the bottom layer is a 3.3V pour, which allowed me to route power much more easily to the devices that needed it.
I also continued researching the Power Level Difference method of noise reduction. I found several descriptive papers from which I can possibly recreate the method if we deem it necessary.

Upcoming Work

Next week I will be working on integrating our signal processing code together to make it much easier to use, and make it a complete system. Ronit, James and I will be spending most of our time working on tuning our design and code to get the results we need.
I will also be placing the order for our last PCB and any parts we need.

James

Accomplishments

This week, I rewrote much of the Matlab code in order to allow the different components (filtering, keystroke detection, clustering, etc.) to fit and interface together. This will allow us to test the full system and see how it performs at the current stage of development.
I optimized the keystroke separation and feature extraction algorithms to run much faster, allowing us to process 10-minute recordings within 30 seconds.
Lastly, I modified the current data receiving server and TDoA algorithm to accept timestamps taken from an NTP server, allowing for much more accurate determination of the start time of each audio clip.
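Once each clip carries a start timestamp on a shared clock, aligning recordings amounts to dropping leading samples from the clips that started earlier. The sketch below shows that idea in Python; it is not the team's Matlab or server code, the name `trim_to_common_start` is hypothetical, and it assumes every board samples at exactly the same rate.

```python
def trim_to_common_start(clips, start_times_s, fs):
    """Drop leading samples so every clip begins at the latest start time.

    clips: list of sample sequences, one per board.
    start_times_s: start timestamp of each clip in seconds on a shared clock.
    fs: sampling rate in Hz (assumed identical across boards).
    """
    latest = max(start_times_s)
    aligned = []
    for clip, t0 in zip(clips, start_times_s):
        skip = int(round((latest - t0) * fs))
        aligned.append(clip[skip:])
    # Cut every clip to the same length so samples can be compared index by index.
    shortest = min(len(c) for c in aligned)
    return [c[:shortest] for c in aligned]


# Toy example: board B started 0.3 s after board A at a 10 Hz sampling rate.
board_a = list(range(20))
board_b = list(range(3, 20))
a, b = trim_to_common_start([board_a, board_b], [0.0, 0.3], fs=10)
print(a == b)  # True
```

Any residual error in the timestamps shows up as a constant offset in the TDoA values, which is why the accuracy of the clock source matters.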

Upcoming work

This upcoming week, we will begin to fully integrate the different components of the system. We will need to work toward fine tuning the signal processing and machine learning portions of the project.

Ronit

Accomplishments

We had a bug whereby the ESP32 would collect data via DMA even when the device was not connected to the laptop. As a result, we were not starting the recordings of the two mics at the same time and were unable to collect accurate TDoA data.
The fix was making the mic sleep before collecting data and clearing the buffers via a soft reset before reconnecting to the laptop.
To further improve the TDoA data, we now have the nodes synchronise their time using the Network Time Protocol. The laptop then tells them to start collecting data at some time in the future, which ensures that the TDoA data is recorded over the same period of time.
Machine learning is still a challenge. I explored the naive Bayes approach further, as per Professor Mai’s advice.
We are also looking at existing substitution cipher solvers and seeing if we can leverage them.

Upcoming work

The next week will be more machine learning. We need to find an efficient method of cracking substitution ciphers.

Team Status

Accomplishments

This week, we have completed most of the work for the final revision of our PCB. We have also begun finalizing the software running on our ESP32, allowing for NTP synchronization and removing bugs like failing to clear buffers between clips.
We are in the process of organizing the code in order to allow for integration of the individual components of the system.

Upcoming work

In the following weeks, we will need to divert much more attention to the machine learning aspects of the project, as well as fine-tune many of the signal processing algorithms to make them more robust and effective.

Changes to schedule description

There are no major changes to the schedule.

Status Report 7

Kevin

Accomplishments

- This week I began the final revision of our PCB, named Snorlax.
- This revision has all of the header pins for I/O removed, as well as all the header pins for the Vesper wake-up mic. We decided to power the Vesper at 3.3V since this is within its capabilities and it allows us to remove a linear regulator and related peripherals. I fixed the issue from our previous board by connecting the feedback pin on the buck/boost converter to its Vout pin. Lastly, I removed all the LEDs except the one for main power to save space and energy.
- Right now I am aiming to have the design fit on a 2”x1.5” board, which is smaller than our previous 2”x2.5” board. Reducing the size is important because it increases portability.
- I have also been researching ways to reduce non-stationary noise. I have mainly been looking into a technique called Power Level Difference (PLD). PLD is based on having at least two microphones. The general idea is that the signal we want to record is closer to the two microphones than the sources of background noise are. This means a close audio source will produce a perceptible difference in power level between the two microphones, while background sources will arrive at roughly the same power level on both.
- The papers I am reading take the PLD concept and create Wiener filters based on the difference in power levels between the two signals.
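The core PLD statistic can be sketched in a few lines. The example below computes a normalized power level difference per frame for two microphones. It only illustrates the idea described above and is not the Wiener filter construction from the papers; the frame length and the names `frame_power` and `normalized_pld` are assumptions made for this example.

```python
import numpy as np


def frame_power(x, frame):
    """Mean power of each non-overlapping frame."""
    n = len(x) // frame
    frames = np.asarray(x[: n * frame], dtype=float).reshape(n, frame)
    return np.mean(frames ** 2, axis=1)


def normalized_pld(near, far, frame=256, eps=1e-12):
    """Per-frame (P_near - P_far) / (P_near + P_far).

    Values near 0 suggest a distant source that reaches both microphones at
    similar power; values near 1 suggest a source close to the near microphone.
    """
    p_near = frame_power(near, frame)
    p_far = frame_power(far, frame)
    return (p_near - p_far) / (p_near + p_far + eps)
```

Frames with a high value could be kept while frames near zero are attenuated, which is roughly the role the PLD plays when it drives a filter gain.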

Upcoming Work

- Next week I will be finishing the final PCB and reviewing the design. This needs to be completed in order to have the PCB ready for the final demo.
- I will also start looking into how to implement the PLD noise reduction algorithm if I have time after completing the PCB. This task is secondary because we can demo our project with a less noisy background if necessary. However, the better the noise reduction, the more widely applicable the final device will be.

Ronit

Accomplishments

- I worked with James to support multiple clients (listening devices) in the network stack.
- I worked on reducing the power consumption of the PCB by having the board go into deep sleep mode. In this state, the WiFi antenna is powered down and the oscillator is turned off. Unfortunately, this also means that the GPIOs are powered off.
- The mode pin on the Vesper microphone has to be set high in order for it to output the digital wakeup signal.
- This means that the PCB has to have a pull-up resistor on the mode pin so that it is set high even when the processor goes to sleep.
- I tried optimizing the naive Bayes approach for decoding the substitution cipher. It now terminates in around 1 hour 30 minutes for texts of ~500 words, but this time seems to go up linearly with the amount of noise. I will have to test more.

Upcoming work

The focus at this stage is purely on machine learning. I have some small tasks left in the esp32, but that should be manageable.

James

Accomplishments

- I made a few more modifications to the server responsible for receiving sound data from the sensor devices. Now, the connection can be terminated from the server side.
- We collected some recordings using two of our sensor boards placed about 3 ft apart, with the keyboard placed in the middle. Unfortunately, we discovered that data left in the buffer would pollute future transmissions. The image below shows this occurring. The data before the red line from microphone 1 corresponds to the previous recording. Everything after it is the new recording. We will need to correct this by clearing all DMA and TCP buffers at the end of a transmission.
- Using the data above, I also began working on extracting TDoA data from individual keystrokes. Each keystroke found in one recording is matched to a keystroke within a 40 ms window in the second recording. The two keystrokes are then cross correlated to determine the TDoA. The time differences are visualized in the plot below for each keystroke.
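The matching and cross-correlation steps described above can be sketched as follows. This is a NumPy illustration rather than the team's Matlab code; the function names are hypothetical, and the 40 ms window is the only parameter taken from the report.

```python
import numpy as np


def match_onsets(onsets_a, onsets_b, fs, window_ms=40.0):
    """Pair each onset in recording A with the nearest onset in B within the window."""
    window = fs * window_ms / 1000.0
    pairs = []
    for ta in onsets_a:
        nearest = min(onsets_b, key=lambda tb: abs(tb - ta), default=None)
        if nearest is not None and abs(nearest - ta) <= window:
            pairs.append((ta, nearest))
    return pairs


def tdoa_samples(a, b):
    """Lag in samples by which b trails a (negative if b leads)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    corr = np.correlate(b, a, mode="full")
    # Index 0 of the full correlation corresponds to a lag of -(len(a) - 1).
    return int(np.argmax(corr)) - (len(a) - 1)
```

Dividing the returned lag by the sampling rate gives the TDoA in seconds for that keystroke.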

Upcoming work

- In the next week, I will try to help Ronit ensure that buffers are completely cleared after each transaction.
- I will begin incorporating TDoA data into clustering the keystrokes as an additional feature. Notably, there is currently a risk that the sampling rates are not exactly the same between two boards. This would cause drift in the TDoA values. We may need to use the APLL clock to produce a more accurate clock signal if we find that this is an issue.

Team Status

Accomplishments

- The major accomplishment this week was beginning work on the final PCB design. Many of the debug features were removed, making for a smaller design. All of the final bugs have been worked out regarding the Vesper microphone and the power supply.
- We have begun work on incorporating TDoA data, including modifying the existing data collection server, and processing data collected on multiple microphones in Matlab.
- Lastly, we have worked on lowering power consumption of the device by incorporating the Vesper mic and sleeping.

Upcoming work

In the upcoming weeks, we will need to focus on fully refining the signal processing and machine learning portions of the project. We currently have many individual parts, but have not integrated each component together.

Changes to schedule description

We are still currently behind schedule. We will be making use of much of the slack time we originally allotted to work toward completing the project on time.

Week 6 update

Kevin:

Accomplishment #1 description & result
- This week I mainly focused on noise reduction. I was able to start using a noise reduction method similar to the algorithm used by Audacity. The technique is called spectral noise gating. Essentially, a clip of pure noise is given to the algorithm to analyze. The FFT is taken in windows and then the algorithm tries to reduce similar patterns in the main audio clip.
- This method of filtering is great for removing stationary noise. Stationary noise is sound that remains relatively constant, like hums, hisses, or even more complicated sounds that persist throughout the clip. However, it does not cope well with non-stationary noise such as voices or sporadic chirps.
- We collected a small sample of noisy data to test on. There are several voices in the background as well as some generic coffee shop sounds. For the reasons described above, some of the noise was suppressed, especially background hums and some general chatter, but the voices were mostly intact. As a result we had quite a few false positives for keystroke detection.
- For the test, I first ran our keystroke detection algorithm with high sensitivity on the clip with a simple bandpass filter. I then extracted the noise by randomly taking about ⅓ of the samples between detected keystrokes. I then applied the spectral noise gating algorithm to the bandpass-filtered audio clip. I re-ran the keystroke detection with lower sensitivity to reduce false positives.
- The result was that the noise was suppressed but of course not completely gone, especially the voices. The main goal is to see if this filtering can help us more accurately separate keystrokes. I sent the filtered data to James to see what kind of difference he can see in the features collected.
- We would like to be able to remove more noise. However, the voices and other non-stationary sources of noise are very complicated to remove. Some people have had great success in doing so with deep learning models, but those are trained to pick out everything except the human voice. Training our own model would not be feasible at the moment.
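A much simplified version of spectral noise gating is sketched below to make the idea concrete. It is not Audacity's implementation: it estimates a per-frequency noise level from a noise-only clip and attenuates any time-frequency bin that does not rise clearly above that level. The frame size, threshold, and attenuation are assumed values, and both inputs must be at least one frame long.

```python
import numpy as np


def spectral_gate(audio, noise_clip, frame=512, threshold=2.0, attenuation=0.1):
    """Attenuate time-frequency bins that look like the stationary noise profile."""
    hop = frame // 2
    # Periodic Hann window: copies overlapped by 50% sum to 1, so plain
    # overlap-add reconstructs the signal when no bins are modified.
    window = np.hanning(frame + 1)[:-1]

    def stft(x):
        n = 1 + (len(x) - frame) // hop
        frames = np.stack([x[i * hop: i * hop + frame] * window for i in range(n)])
        return np.fft.rfft(frames, axis=1)

    audio = np.asarray(audio, dtype=float)
    noise_clip = np.asarray(noise_clip, dtype=float)

    # Average magnitude per frequency bin over the noise-only clip.
    noise_profile = np.mean(np.abs(stft(noise_clip)), axis=0)

    spec = stft(audio)
    gain = np.where(np.abs(spec) > threshold * noise_profile, 1.0, attenuation)
    cleaned = np.fft.irfft(spec * gain, n=frame, axis=1)

    out = np.zeros(len(audio))
    for i, f in enumerate(cleaned):
        out[i * hop: i * hop + frame] += f
    return out
```

Because the gate only knows the average spectrum of the noise clip, sounds whose spectrum changes over time, such as voices, pass through largely untouched, which matches the behavior described above.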
Upcoming work #1 description & expectation
- This week I will continue to refine and study the effects of noise filtering. I will need to collect more data to help further tune the filtering process.
- I will also be researching how to remove non-stationary noise with the use of multiple microphones. With the added data from a second microphone, I may be able to have some modest success in determining what is noise and what is a keystroke.
- Lastly, I will begin planning our final revision of the PCB. We are aiming for a smaller package size without as many pinouts.

James:

Accomplishment #1 description & result
- In order to collect TDoA data, I made changes to the server to allow connections from multiple sensor boards. Upon connecting, each board is assigned a unique ID, and the timestamp of the connection is recorded. This will allow us to align the beginning of the recorded data in order to perform TDoA analysis.
- Along with Ronit, I worked on collecting noisy data in order to help improve our noise reduction scheme.
Upcoming work #1 description & expectation
- Using the three working boards we currently have, I will begin collecting keystroke recordings with TDoA data next week. I will then be able to perform TDoA analysis using the code written earlier in the semester using gunshot data. I will attempt to recluster the data with the TDoA as an additional feature.
- The board is responsible for ending the connection and allowing the server to write the data out to a .wav file. This is currently handled by resetting the board manually. In order to allow the board to automatically power down and sleep in the absence of noise, we are currently planning on utilizing the Vesper wakeup microphone to kick a watchdog timer. Ronit has already begun working on this portion of the system.

Ronit:

Accomplishment #1 description & result
- We narrowed down our trouble with our machine learning to having non-representative data. Our previous data was collected improperly.
- Additionally, we were not performing enough noise reduction.
- I worked with James to collect some noisy data for Kevin to test the new noise reduction technique
- In parallel, I got the ESP32 to work with the wakeup microphone.
- Now when there is no background noise, the wakeup microphone is switched on and the processor goes into deep sleep. The oscillator, execution units, and SRAM are clock gated and consume no energy.
- When noise above a preset threshold is detected in the environment, an interrupt is generated that brings the processor out of deep sleep to resume its normal actions.
- We will order a power analyser next week to measure more accurately how much power is saved, but preliminary experiments using the oscilloscope show that the power savings are around 20%.
- We now need to modify the network stack to recognize when the processor is in deep sleep mode.

Upcoming work #1 description & expectation
- There needs to be a last bit of tweaking to the network stack to support TDoA data and the sleep modes on both the processors.
- Next week will involve a small portion of work on the network stack, and the remaining time will go to the machine learning.

Team

Accomplishments
- After testing we verified the new PCB is working with the modifications discussed last week. We are able to obtain data from the main microphone over WiFi while powered by the battery. We also confirmed the Vesper mic is working. Everything is getting the proper voltage. We believe the battery charging system is working, but further testing may be required.
- More progress was made in the signal processing side. We are tuning the keystroke detector and exploring noise reduction options.
- The server is being updated to handle TDoA data.
Upcoming
- We will be collecting more data to aid in refinement and further development. We need data for TDoA, noise reduction, clustering, and machine learning.
- We have our demo on Wednesday and will be showing the data collection and networking portion of our project.
- We will be refining our feature extraction, clustering, and machine learning this week to prepare for integrating everything together.
Changes to schedule
- We are behind on integration since all of the parts are not fully complete yet. Most of this week will be focused on getting everything ready for integration.

Week 4 and 5 update

Kevin:

Update for week 4

Accomplishment #1 description & result
- This week our first revision of PCBs arrived along with the parts needed to assemble it.
- We assembled the PCB using solder paste placed on the exposed pads. The solder paste was placed using a stencil cut with a precise vinyl cutter. The board then went into an oven, where a specific heating profile is used to properly solder the components.
- Once out of the oven we fixed a few issues with the soldering, namely some bridges on the power ICs.
- Upon testing the PCB, we discovered the ESP32 was getting very hot, much hotter than it should be in normal operation. The issue was that 5V was being put into the 3.3V pin on the ESP32. I did not thoroughly read the datasheet for the ESP32 and thought 5V would not harm the device, which is incorrect. The boost converter used after the battery manager does not regulate higher voltages down to 3.3V; it only boosts lower voltages up to 3.3V. This resulted in the 5V coming from the USB going into the processor without being stepped down.
- We can’t simply supply 3.3V on the power-in line because the battery manager needs more than this to operate, so the output voltage is only about 2V. This also won’t charge the battery.
- We decided to order another PCB with changes to supply the proper 3.3V needed. We will use a Buck-Boost converter which will step both up and down to regulate the voltage to the processor at 3.3V.
  - I chose to use the TPS63001 chip for this task. It has voltage input range of 2.4V to 5.5V, which matches what we need quite well. The output is fixed to 3.3V with a ripple of about 10mV. The chip has a power saving mode, which we will be using, that lowers the switching frequency when possible.
- We also discovered that one of the ground pins on the ESP32 was not connected because the ground pour was slightly smaller than expected, preventing a connection.
- We were able to see some data from the vesper mic vout pin since the breakout section was working properly.

First revision pre-assembly

Upcoming work #1 description & expectation
- Next week I will be working on finishing the schematic for the new board as well as routing. I will also order the new parts needed for this board and the board itself.
- I will also begin to research more on background noise reduction
- If the boards arrive on time we will try to assemble and test those as well.

Update for week 5

Accomplishment #1 description & result
- This week I finished routing the new board, Rev 0.2. I kept the breakouts for the ESP32 and the Vesper mic. I fixed the ground issue for the one pin on the processor and of course added the new buck-boost converter and its peripherals.
- The board was ordered over spring break along with the new parts.
- While waiting for the board to arrive, I was able to research more removing background noise from our audio data. I have a matlab filter used for hearing aids I want to try as well as audacity’s noise reduction system. I need to collect noisy data to test these methods. Audacity is open source so the code for their noise reduction system is available in C++ which I may adapt to our needs if it works well. In preliminary tests it appears to work quite well. The user selects an area of the signal that is only background noise and the algorithm does its best to remove only similar sounds without destroying the good signal.
- The new boards and parts arrived by Wednesday and Ronit assembled one on Thursday morning. We discovered that the buck-boost converter was not regulating the output voltage to 3.3V. After careful inspection we found that the feedback pin had to be connected to the output voltage even on the fixed 3.3V version. I soldered a wire on the board between the two pins and it now works very well.
- The board can be powered from 5V USB or our LiPo battery without issue. At first glance it also appears to charge the battery correctly but more testing needs to be done. Importantly, the board connects to the computer and can be programmed.
- The Vesper mic appears to react to noise levels as well.
- We made a quick fix to the old board as well. We soldered a wire onto the 3.3V line so we can power the board directly with 3.3V. We need to test if the battery can power it, but we will not be able to charge it due to the issues discussed in last week’s report.

Rev 0.2 Board, main changes are centered around part U2, the new buck-boost converter.

Upcoming work #1 description & expectation
- I will be collecting noisy data and looking at the effectiveness of the background noise algorithms I researched this week.
- If the board bring-up goes well, I will start designing Rev 1.0 of the PCB, which will be more stripped down and as small as possible.
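
The noise-profile idea described above (pick a stretch of pure background noise, then remove similar content from the rest of the recording) can be sketched as basic spectral subtraction. This is only a minimal illustration under stated assumptions: it is not Audacity's actual algorithm or the MATLAB hearing-aid filter, and the function name, frame size, and hop size are hypothetical choices.

```python
import numpy as np

def spectral_subtract(signal, noise_clip, frame=512, hop=256):
    """Subtract an average noise magnitude spectrum from each frame of `signal`.

    `noise_clip` is a stretch of audio assumed to contain only background noise.
    Both inputs must be at least `frame` samples long. Samples past the last
    full frame are left at zero in the output.
    """
    window = np.hanning(frame)

    def frames(x):
        n = 1 + (len(x) - frame) // hop
        return np.stack([x[i * hop:i * hop + frame] * window for i in range(n)])

    # Noise profile: mean magnitude per frequency bin over the noise-only clip.
    noise_mag = np.abs(np.fft.rfft(frames(noise_clip), axis=1)).mean(axis=0)

    # Subtract the profile from every frame, keeping the original phase.
    spec = np.fft.rfft(frames(signal), axis=1)
    mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
    cleaned = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), n=frame, axis=1)

    # Overlap-add the cleaned frames back into one signal.
    out = np.zeros(len(signal))
    for i, f in enumerate(cleaned):
        out[i * hop:i * hop + frame] += f
    return out
```

Plain subtraction like this tends to leave audible artifacts, so it is best treated as a baseline to compare the other methods against once noisy data has been collected.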

James:

Update for week 4

Accomplishment #1 description & result
- This week, I worked with Ronit to begin collecting labeled sample data to work on over the break. I began researching different features used and methods to extract them from papers which accomplished similar attacks.
- Past work has shown that cepstrum features do yield better classification accuracy than FFT. However, we want to test this for ourselves and compare FFT, cepstrum, and a concatenation of both.
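
As a rough sketch of the three feature sets to be compared (FFT, cepstrum, and their concatenation), the following assumes each keystroke has already been cut into a fixed-length frame of samples. It uses the plain real cepstrum, which may differ from the cepstral features used in the papers; the function names and the number of cepstral coefficients kept are hypothetical.

```python
import numpy as np

def fft_features(frame):
    """Magnitude spectrum of one Hann-windowed keystroke frame."""
    windowed = frame * np.hanning(len(frame))
    return np.abs(np.fft.rfft(windowed))

def cepstrum_features(frame, n_coeffs=20):
    """Real cepstrum: inverse FFT of the log magnitude spectrum.

    Only the first `n_coeffs` coefficients are kept (an arbitrary choice).
    """
    log_mag = np.log(fft_features(frame) + 1e-10)  # small offset avoids log(0)
    return np.fft.irfft(log_mag)[:n_coeffs]

def combined_features(frame, n_coeffs=20):
    """Concatenation of the FFT and cepstrum feature vectors."""
    return np.concatenate([fft_features(frame), cepstrum_features(frame, n_coeffs)])
```

Because the FFT vector is much longer than the cepstrum vector, the concatenated features would probably need scaling before clustering so that one set does not dominate the distance metric.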

Upcoming work #1 description & expectation
- In the upcoming weeks, I will work on extracting different features from keystrokes in an automated manner and attempt to cluster different keys based on these features.
- I will also need to further explore the potential classification algorithms to use. During my research into which features to use, I found a comparison of different classification algorithms and their accuracy. Interestingly, a linear classification model yielded better accuracy on test data than both a neural network and a Gaussian mixture model. I will try to compare linear classification against the original strategy we planned on using, k-means.
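
For reference, the k-means strategy mentioned above can be sketched in a few lines. This is a minimal version of the standard Lloyd iteration, not the code used in the project; it assumes a hypothetical `features` array with one row per keystroke and that the number of clusters `k` is set to the number of distinct keys in the data.

```python
import numpy as np

def kmeans(features, k, n_iter=50, seed=0):
    """Cluster rows of `features` into `k` groups with Lloyd's algorithm.

    Returns (labels, centroids). Initial centroids are k distinct rows
    chosen at random, so results can depend on `seed`.
    """
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign every keystroke to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of the keystrokes assigned to it.
        for j in range(k):
            members = features[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return labels, centroids
```

Since k-means is unsupervised, the cluster indices it returns are arbitrary and still have to be matched to actual keys, for example by using the labeled samples.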

Changes to schedule description
- No major changes have been made to the schedule. However, we have been shifting responsibilities around as needed in order to allow Kevin to fully complete the next revision of our PCB.

Update for week 5

Accomplishment #1 description & result
- This week, I worked more on the feature extraction from the labeled keystroke data. I extracted both FFT and cepstrum features from the entire window of the push peak of the keystroke. I am currently ignoring the release peak as it is less intense in energy and sometimes difficult to pick up in the presence of background noise.
- Using the FFT and cepstrum features, I attempted to classify a few different letters using k-means. When classifying a combination of ‘q’, ‘a’, ‘k’ and the space bar, I found that both the ‘k’ and the space bar were clustered fairly accurately, yielding no more than 2-3 erroneous classifications out of 20. However, the letters ‘a’ and ‘q’ were misclustered into all four clusters.
- I thus attempted to increase the feature set. I began by taking FFTs and cepstrum features over sliding windows of 5ms with a 3.75ms overlap. Unfortunately, this did not alleviate the issue.
- I also helped Kevin troubleshoot the newly arrived second revision of the PCB. Unfortunately, there was a missing connection on one of the inputs to the buck-boost converter, and we manually soldered on a wire to bridge the connection. This corrected the issue.

Upcoming work #1 description & expectation
- I will further explore different options and features for classifying the keystrokes effectively. We are currently having some difficulty effectively clustering the data, and this is currently the biggest risk area in the project. I will take a closer look at the previous work done in this area and attempt to replicate their methods as described.
- We also hope that TDoA will serve as a highly distinguishing feature. Now that we have multiple working sensor packages (both breadboarded and PCB), we can begin to incorporate TDoA data into the clustering.
- Should we still find ourselves unable to find the correct features to cluster upon, we may need to explore different keyboards, typists, and typing styles.

Changes to schedule description
- We are falling behind schedule in terms of machine learning and signal processing. Fortunately, the PCB design has been mostly finalized. This will allow us to fully shift our focus to these areas. Because of the ample time for testing and integration we allotted to ourselves, we should still be on track to complete the project.

Ronit:

Update for week 4

Accomplishment description & result
- We collected data for each individual key. We recorded 30 keypresses pressed with the index finger.
- I worked with James to extract features from the sound. We got some features but decided to work on them further after spring break.
- I worked with James on keystroke separation. We read the paper "Keyboard Acoustic Emanations Revisited" by Zhuang et al. It describes a means of extracting keystrokes from a sound clip by creating 10ms windows and examining the change in their energies.
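
The 10ms energy-window approach to keystroke separation can be sketched as below. This is a simplified illustration rather than the paper's exact procedure: it flags a keystroke wherever the window energy first rises above a threshold, and the threshold rule (a multiple of the median window energy) and the function names are assumptions.

```python
import numpy as np

def window_energies(signal, sample_rate, window_ms=10):
    """Sum of squared samples in consecutive non-overlapping windows."""
    win = int(sample_rate * window_ms / 1000)
    n = len(signal) // win
    return np.sum(signal[:n * win].reshape(n, win) ** 2, axis=1)

def detect_keystrokes(signal, sample_rate, window_ms=10, threshold_ratio=5.0):
    """Return the sample index where each detected keystroke starts.

    A window counts as active when its energy exceeds `threshold_ratio` times
    the median window energy (a hypothetical rule that assumes the recording
    is mostly background). An onset is an active window whose predecessor
    was not active.
    """
    energies = window_energies(signal, sample_rate, window_ms)
    active = energies > threshold_ratio * np.median(energies)
    previous = np.concatenate(([False], active[:-1]))
    onset_windows = np.flatnonzero(active & ~previous)
    return onset_windows * int(sample_rate * window_ms / 1000)
```

Onsets are only resolved to the nearest window boundary, so a finer search inside the flagged window would be needed to align the push peak precisely.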

Upcoming work description & expectation
- After spring break, we will probably have the new batch of PCBs.
- Once we have a working remote listening device, I will make the processor sleep and only wake up when sound is detected.
- Once we have a working sensor package, we will need to work on adapting our network stack to support data from two devices, so that we can extract time difference of arrival.
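
One common way to estimate the time difference of arrival between two recordings is to find the lag that maximizes their cross-correlation. The sketch below assumes both devices sample at the same rate and that their recordings are already aligned to a common time base, which the network stack would have to provide; the function name is hypothetical.

```python
import numpy as np

def estimate_tdoa(sig_a, sig_b, sample_rate):
    """Estimate how much later a sound arrives in sig_b than in sig_a.

    Returns the delay in seconds. A positive value means the sound reached
    microphone B after microphone A.
    """
    corr = np.correlate(sig_b, sig_a, mode="full")
    # Index 0 of the full correlation corresponds to a lag of -(len(sig_a) - 1).
    lag = int(np.argmax(corr)) - (len(sig_a) - 1)
    return lag / sample_rate
```

The resolution is one sample period, so the achievable precision depends directly on the sampling rate and on how well the two devices' clocks are synchronized.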

Changes to schedule description
- I will be helping James with the machine learning from here onwards. We need to try to reach our MVP as soon as possible.

Update for week 5

Accomplishment description & result
- Kevin made changes to the PCB over spring break and ordered new ones.
- I assembled the PCB. There were some issues with the power management circuit, but we managed to fix them by soldering on a wire.
- I was able to flash our program onto the PCB and connect to the internet.
- In parallel, I have been working on the machine learning. The problem we are trying to solve is a substitution cipher. Brute force was fast, but noise threw it off very easily. We found a naive Bayes approach and attempted to use it. It was resistant to noise but exceedingly slow.

Upcoming work description & expectation
- My work going forward will be focused on machine learning and battery conservation on the ESP32. Now that we have a working remote listening device, we need to make sure it meets our target of running for 12 hours on one 200mAh battery.
- We may have to recollect the data. Collecting the data by recording individual key presses may not have been the best idea. Preliminary tests using the clustering algorithm show that letters that are located physically close to each other are being put in the same cluster. This is to be expected, but we were hoping to see more separation.
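
The 12-hour target on a 200mAh battery implies an average current budget of about 200 / 12 ≈ 16.7 mA. The sketch below works through that arithmetic and shows how a sleep/wake duty cycle would fit into the budget. It ignores converter losses and battery derating, and the active and sleep currents in the usage note are placeholder values, not measurements or datasheet figures.

```python
def max_average_current_ma(capacity_mah, runtime_hours):
    """Average current (mA) that drains the battery in exactly `runtime_hours`."""
    return capacity_mah / runtime_hours

def average_current_ma(active_ma, sleep_ma, duty_cycle):
    """Average current when awake for `duty_cycle` (0 to 1) of the time."""
    return duty_cycle * active_ma + (1.0 - duty_cycle) * sleep_ma

def max_duty_cycle(budget_ma, active_ma, sleep_ma):
    """Largest awake fraction that keeps the average current within budget."""
    return (budget_ma - sleep_ma) / (active_ma - sleep_ma)

budget = max_average_current_ma(200, 12)  # about 16.7 mA
```

For example, with placeholder currents of 100 mA awake and 1 mA asleep, `max_duty_cycle(budget, 100, 1)` comes out to roughly 0.16, meaning the device could be awake about 16% of the time. The real figures would have to come from measuring the assembled board.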

Changes to schedule description
- We are trying to reach our MVP as quickly as possible. We are seeing routes for optimization in the PCB as well as in our processing pipeline, and we are keeping a record of them, but right now our main focus is on achieving correctness. These optimizations, especially those for the processing pipeline, will come in handy when we try to reduce our compute time.
- We have fallen behind in the machine learning. We will be working on it collectively starting next week.

Team:

Accomplishment #1 description & result

We built the second revision of the PCB. Initially it seemed to draw too much voltage and had the same heating problem as last time. We found that this was because we had left floating one of the pins on the buck-boost converter that should have been tied to the output. We managed to fix this with a botch wire.
We also managed to program the board and run our program on it. Next week we will be testing the microphone and making sure the peripherals work as intended.
James managed to extract cepstral and FFT features. He ran k-means clustering on them, but it yielded poor separation for keys that are physically close to each other on the keyboard.

Upcoming work #1 description & expectation

After spring break, we need to try to get our new PCB working. It should work now that we use a buck-boost converter.
We need to extract features so that we can begin the machine learning. We will be using cepstral, FFT and TDoA features to cluster the keystrokes.
We also hope that TDoA will serve as a highly distinguishing feature. Now that we have multiple working sensor packages (both breadboarded and PCB), we can begin to incorporate TDoA data into the clustering.

Changes to schedule description

There will be a major shift from embedded coding and PCB design to machine learning and signal processing. We need to achieve our MVP as soon as possible so that we can begin refinement and testing.

Any other major change to your project on a high level

There are no major changes to the project at this moment.

Team:

Accomplishment #1 description & result

We built the first revision of our PCB. Upon testing it, we discovered the ESP32 was getting very hot, much hotter than it should in normal operation. The issue was that 5V was being put into the 3.3V pin on the ESP32. Fixing this required switching from a boost converter to a buck-boost converter, which can step the voltage both up and down. A new schematic has been created and the order will be placed soon.
James and Ronit collected labeled data for every single key.
James and Ronit implemented the keystroke separation described in the paper by Zhuang et al. After some noise reduction, we were able to get the keys separated rather well.

Upcoming work #1 description & expectation

After spring break, we need to try to get our new PCB working. It should work now that we use a buck-boost converter.
We need to extract features so that we can begin the machine learning. We will be using cepstral, FFT and TDoA features to cluster the keystrokes.

Changes to schedule description

We will try to get the machine learning working as soon as possible, as we think it will be the most challenging portion of our project.

Any other major change to your project on a high level

There are no major changes to the project at this moment.