Jenny’s Status Report for 4/10/2021

This past week, I fully integrated the ML model I built in Python into the C++ pipeline. It now runs as part of the pipeline and returns the classification as a boolean; I also print the classification to stdout for demo purposes.

To integrate it, I had to create a different model, since the Mat we get in C++ is a single-channel grayscale image. I therefore adjusted my Python model so that its input dimensions match the C++ image dimensions; the resulting model has 67% validation accuracy.
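As a rough sketch of the C++ side of this, the preprocessing amounts to forcing the frame down to one channel, resizing it to the model's input size, and normalizing. The 64x64 input size and 1/255 scaling below are placeholders for illustration, not our model's actual parameters.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

// Sketch: prepare a grayscale camera frame for the classifier.
// The 64x64 input size and 1/255 scaling are illustrative assumptions,
// not the model's actual parameters.
std::vector<float> prepare_input(const cv::Mat &frame) {
    cv::Mat gray;
    if (frame.channels() == 3) {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);  // ensure 1 channel
    } else {
        gray = frame;
    }

    cv::Mat resized;
    cv::resize(gray, resized, cv::Size(64, 64));        // match model input dims

    cv::Mat scaled;
    resized.convertTo(scaled, CV_32F, 1.0 / 255.0);     // normalize to [0, 1]

    // Flatten row-major into a contiguous float buffer.
    return std::vector<float>(scaled.begin<float>(), scaled.end<float>());
}
```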

My part is now demo-ready, and I will keep working on improving the model's accuracy in the upcoming weeks.

Sebastien’s Status Report for 4/3/2021

Earlier this week we got together and finally integrated everything in C++, which entailed working out some dependency-management troubles with my teammates, who use different Linux distributions. It's probably better to figure that stuff out now rather than later. Our meeting with Professor Kim earlier this week made clear that “draw on screen” is far more useful than “draw on camera input”. So I spent some time reading the X11 manual to see how best to get a feed of what’s being drawn onto the screen. At first glance the `XGetImage` function seemed to provide that functionality, but unfortunately it’s quite slow, as it blocks the rendering process while copying the entire frame, including transparency. It turns out it’s faster to instead create a secondary consumer of X11 input from the UI and read that directly into a CV matrix. I have something that *mostly* works: the images are still a bit messed up because the field ordering in X11’s RGB image format is different from OpenCV’s. Additionally, I abstracted away our input so we should be able to swap between different input sources while the application is running.
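For reference, once the raw X11 pixel buffer is wrapped in a Mat, the re-ordering fix is essentially a single cvtColor call. The sketch below uses `XGetImage` for brevity (which, as noted above, is too slow for the real pipeline) and assumes a typical 32-bit BGRA visual; on an RGBA visual the conversion would be COLOR_RGBA2BGR instead.

```cpp
#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <opencv2/opencv.hpp>

// Sketch only: grabs one screen frame with XGetImage (too slow for the real
// pipeline, but enough to show the channel-order fix) and converts the raw
// 32-bit buffer into the BGR order OpenCV expects.
cv::Mat grab_screen_frame() {
    Display *display = XOpenDisplay(nullptr);
    Window root = DefaultRootWindow(display);

    XWindowAttributes attrs;
    XGetWindowAttributes(display, root, &attrs);

    XImage *ximg = XGetImage(display, root, 0, 0, attrs.width, attrs.height,
                             AllPlanes, ZPixmap);

    // Wrap the raw buffer (4 bytes per pixel on a typical 24/32-bit visual),
    // respecting the server's row stride.
    cv::Mat raw(attrs.height, attrs.width, CV_8UC4, ximg->data,
                static_cast<size_t>(ximg->bytes_per_line));

    cv::Mat bgr;
    cv::cvtColor(raw, bgr, cv::COLOR_BGRA2BGR);  // drop alpha, fix field order
    // If the visual is actually RGBA, use cv::COLOR_RGBA2BGR here instead.

    XDestroyImage(ximg);
    XCloseDisplay(display);
    return bgr;
}
```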

Zacchaeus’ Status Report for 4/3/2021 (Week of 3/28) (Status Report 6)

This week the team got together and got the entire pipeline put together. We can now read camera input, modify it in software, and allow other applications to use that software camera as an input. We have it working on Arch Linux and Linux Mint (which is a fork of Ubuntu). Jenny is using Ubuntu 20.04, just got it up and running, and will be setting her part up this week as well. On my own front, I began looking for a more sophisticated way of doing background subtraction for the purpose of hand detection. So far in my research I found a good paper, which also points to other papers that survey many of the methods in use. Along with that, I started looking at MediaPipe, which seems even better suited to what we want to do. Unfortunately, I have not been able to test these new methods yet this week due to assignments being due.

Paper that gives a survey of background subtraction methods: https://ipsjcva.springeropen.com/articles/10.1186/s41074-017-0036-1
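While evaluating the methods from the survey, a convenient baseline is one of OpenCV's built-in subtractors (MOG2 or KNN). A minimal sketch of applying MOG2 to the camera feed, with default (untuned) parameters:

```cpp
#include <opencv2/opencv.hpp>

// Minimal sketch: OpenCV's built-in MOG2 background subtractor as a baseline
// to compare against methods from the survey paper. Parameters are defaults,
// not tuned values.
int main() {
    cv::VideoCapture cap(0);
    auto subtractor = cv::createBackgroundSubtractorMOG2(500, 16.0, false);

    cv::Mat frame, fg_mask;
    while (cap.read(frame)) {
        subtractor->apply(frame, fg_mask);          // foreground (hand) mask
        cv::erode(fg_mask, fg_mask, cv::Mat());     // knock out small speckle noise
        cv::dilate(fg_mask, fg_mask, cv::Mat());

        cv::imshow("foreground", fg_mask);
        if (cv::waitKey(1) == 'q') break;
    }
    return 0;
}
```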

Jenny’s Status Report for 4/3/2021

For this past week, I have been caught up with the housekeeping aspects of the project and with integration. Since Zacchaeus and Sebastien both currently do not run Ubuntu, and since Ubuntu is one of the most popular distributions, we wanted to ensure that our project runs on it. Thus, I installed Ubuntu on my Windows machine with dual boot enabled and set up Whiteboard Pal on that installation, which involved rebuilding all the libraries and making sure Whiteboard Pal could find them, OpenCV and TensorFlow being the big ones.

I have also abandoned the approach of using a third-party library to import a Keras model into C++, and instead looked into importing protobuf files (.pb, the file type TensorFlow uses when saving a model) into C++ through TensorFlow's own library functions. I will be testing the different ways to import over the next three days, and will hopefully have found one that works by Wednesday's meeting with Professor Kim.
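One of the routes I am testing is TensorFlow's C++ session API. The sketch below shows the general shape of loading a frozen .pb graph; the node names ("input", "output"), the file name, and the 1x64x64x1 tensor shape are placeholders, since the real ones come from the exported graph.

```cpp
#include <tensorflow/core/public/session.h>
#include <tensorflow/core/platform/env.h>
#include <memory>
#include <vector>

// Sketch of loading a frozen .pb graph with TensorFlow's C++ API.
// "input" / "output" node names, the file name, and the 1x64x64x1 shape are
// placeholders; the real values come from the exported graph.
int main() {
    tensorflow::GraphDef graph_def;
    TF_CHECK_OK(tensorflow::ReadBinaryProto(tensorflow::Env::Default(),
                                            "gesture_model.pb", &graph_def));

    std::unique_ptr<tensorflow::Session> session(
        tensorflow::NewSession(tensorflow::SessionOptions()));
    TF_CHECK_OK(session->Create(graph_def));

    // Dummy input tensor standing in for a preprocessed webcam frame.
    tensorflow::Tensor input(tensorflow::DT_FLOAT,
                             tensorflow::TensorShape({1, 64, 64, 1}));

    std::vector<tensorflow::Tensor> outputs;
    TF_CHECK_OK(session->Run({{"input", input}}, {"output"}, {}, &outputs));

    bool gesture = outputs[0].flat<float>()(0) > 0.5f;  // boolean classification
    return gesture ? 0 : 1;
}
```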

Sebastien’s Status Report for 3/27/2021

This week I finished the pipeline in Rust, but unfortunately I was having a lot of difficulty using existing wrappers around video4linux to output to the virtual camera device, so I ended up rewriting it all in C++. Once there, I had to learn about various color pixel formats and figure out how to configure the output device with the right format: the output from OpenCV is a matrix, but in the end the virtual camera device is a file descriptor, so I had to pick a format that OpenCV knew how to convert to but that also did not require re-ordering values in the matrix's data buffer before writing it to the file output. Now it works, so I can Zoom into class through a virtual camera called “Whiteboard Pal” 😀
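For anyone curious, the write path boils down to converting each BGR frame into a planar YUV420 buffer (one of the formats that can be written without manual re-ordering) and writing it straight to the loopback device's file descriptor. In this sketch the device node (/dev/video2) and 640x480 resolution are placeholders, and error handling is omitted.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>
#include <opencv2/opencv.hpp>

// Sketch of the virtual-camera write path. The device node and resolution are
// placeholders; the real values depend on how the loopback device is set up.
int main() {
    int fd = open("/dev/video2", O_WRONLY);

    v4l2_format fmt{};
    fmt.type = V4L2_BUF_TYPE_VIDEO_OUTPUT;
    fmt.fmt.pix.width = 640;
    fmt.fmt.pix.height = 480;
    fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_YUV420;   // planar, no re-ordering needed
    fmt.fmt.pix.field = V4L2_FIELD_NONE;
    fmt.fmt.pix.sizeimage = 640 * 480 * 3 / 2;       // 12 bits per pixel
    ioctl(fd, VIDIOC_S_FMT, &fmt);

    cv::VideoCapture cap(0);
    cv::Mat frame, yuv;
    while (cap.read(frame)) {
        cv::resize(frame, frame, cv::Size(640, 480));
        cv::cvtColor(frame, yuv, cv::COLOR_BGR2YUV_I420);    // BGR -> planar YUV420
        write(fd, yuv.data, yuv.total() * yuv.elemSize());   // straight to the fd
    }
    close(fd);
    return 0;
}
```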

Zacchaeus’ Status Report for 3/20/2021

This week I worked on changing the method for removing the background from the camera feed and making the hand detection much cleaner. I accomplished that by using a blue glove and doing contour detection on that color. I am going to use the recommendations from course staff to change this method so that I won't have to use a blue glove anymore. Along with that, this week I added code to respond to Jenny's model, to make integration of the whole thing easier. In doing this I also added code so that I can now draw on the screen and do writing, plus controls that let the user erase the screen. Right now everything is bound to the keyboard, but all that needs to change is replacing the key presses with signals from Jenny's model. For next week we should be able to fully integrate the pipeline after I port all my code to C++.
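A rough sketch of this glove-based approach is below. The HSV thresholds, the fingertip heuristic, and the key bindings are illustrative stand-ins rather than the exact values in my code.

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <vector>

// Sketch of the blue-glove approach: threshold the glove color in HSV, take
// the largest contour as the hand, use its topmost point as the "pen", and
// draw onto an overlay. Thresholds and key bindings are illustrative only.
int main() {
    cv::VideoCapture cap(0);
    cv::Mat frame, hsv, mask, canvas;
    bool drawing = false;

    while (cap.read(frame)) {
        if (canvas.empty()) canvas = cv::Mat::zeros(frame.size(), frame.type());

        cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);
        cv::inRange(hsv, cv::Scalar(100, 80, 50), cv::Scalar(130, 255, 255), mask);

        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

        if (!contours.empty()) {
            auto &hand = *std::max_element(contours.begin(), contours.end(),
                [](const auto &a, const auto &b) {
                    return cv::contourArea(a) < cv::contourArea(b);
                });
            // Topmost contour point as a crude fingertip estimate.
            cv::Point tip = *std::min_element(hand.begin(), hand.end(),
                [](const cv::Point &a, const cv::Point &b) { return a.y < b.y; });
            if (drawing) cv::circle(canvas, tip, 4, cv::Scalar(0, 0, 255), -1);
        }

        cv::imshow("Whiteboard Pal", frame + canvas);
        int key = cv::waitKey(1);
        if (key == 'd') drawing = !drawing;                 // toggle drawing
        else if (key == 'e') canvas.setTo(cv::Scalar(0));   // erase the screen
        else if (key == 'q') break;
    }
    return 0;
}
```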

Jenny’s Status Report for 3/27/2021

(I also wrote a status report last week because I didn’t realize one wasn’t needed. So, to stay up to date, also read the one from 3/20.)

Since last week, I trained a model with much better accuracy: 73% test accuracy and 91% training accuracy. This was the model I ended up demoing to Professor Kim on Wednesday. I then started researching how to import this model into C++ so I can integrate it into Sebastien’s pipeline, and found several libraries people have implemented (keras2cpp, frugally-deep). I was able to export the model file properly with the scripts provided, but I have not yet figured out how exactly to format the webcam image input in C++ into a format accepted by those libraries. I will strive to have that figured out by Wednesday and have my model fully integrated into the pipeline.

Team Status Report for 3/20/2021

For this week, our team completed the design review paper. In addition, we met with Professor Kim and took into account his advice that we should first ignore any optimizations and get a fully working version of our product together. So, despite some suboptimal performance in portions of the pipeline, we will aim to put it all together by the end of next week, then spend the rest of the course optimizing it to a much more usable level.

As for the exact details of each portion, they can be found in the status reports of each of our members.

Jenny’s Status Report for 3/20/2021

For this week, I worked with my group to complete our Design Review paper. Additionally, I made progress on gesture detection. As per Uzair’s advice, to ignore the face and background in webcam images, I wore a generic-brand blue surgical glove and extracted only that color to isolate the hands, as pictured below:

Next, I started generating both training and testing datasets using this method, with two labels: gesture vs no_gesture. I currently have about 300 images for each of the labels for training, and about 100 each for testing. The images below are of the gesture label in the datasets folder:

I then implemented the CNN model and training algorithm and began training. I adjusted parameters and tested which gave the best results; the highest testing accuracy (which varied from 65% to 69% validation accuracy, with about 80% training accuracy) came from a batch size of 15, a step count of 20, and 3 epochs, of which the printout of the last epoch is here:

For the next week, I will first work on incorporating this into the overall pipeline and completing a finished version of the product before moving on to improving the model and its accuracy. I suspect the accuracy will improve a lot if I do multi-category classification instead of binary classification, because there are a lot of nuances to the no_gesture label, since it incorporates everything that is not a gesture.

Zacchaeus’ Status Report for 3/13/2021

I spent the first part of this week working on my team’s design presentation, which I presented on Monday. The second part of the week I spent getting my portion of the project off the ground. My goal was to port all of the finger-tracking code from the method I found online into C++. So far I have not been able to test any of the code I have written, but I have gotten OpenCV set up and have been able to do some basic things with it. This portion has been going a bit slower for me, as the majority of my background is not in software.
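For reference, the basic OpenCV check I have working so far is roughly on the level of opening the camera and displaying frames, something like the sketch below.

```cpp
#include <opencv2/opencv.hpp>

// Basic sanity check that OpenCV builds and the webcam is readable:
// grab frames from the default camera and show them until 'q' is pressed.
int main() {
    cv::VideoCapture cap(0);
    if (!cap.isOpened()) return 1;

    cv::Mat frame;
    while (cap.read(frame)) {
        cv::imshow("camera", frame);
        if (cv::waitKey(1) == 'q') break;
    }
    return 0;
}
```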