Joon’s Status Report for 5/8

This week, which was the last week of classes, we gave our final presentation, covering what we accomplished during the semester and our remaining plans before the final demo. We also worked on the final poster and final video for next week's deliverables.

For the item recognition part, last week I decided to train for as many as 50 epochs. However, training for that many epochs turned out to be a bad idea because the model started to overfit. In other words, the model learns the details of the outliers and the noise in the training dataset, which lowers classification accuracy on new images. I went back to the range of 15-22 epochs, which is where the validation accuracy started to converge. After this test, I achieved an improved accuracy of 88.08% with 18 epochs.
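As a minimal sketch, this is roughly how that epoch choice can be automated in Keras with early stopping; the tiny stand-in model and random data below are illustrative assumptions, not our actual training setup:

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.callbacks import EarlyStopping

    # Toy stand-ins for our real 224 x 224 item images and 21 labels.
    x_train = np.random.rand(100, 224, 224, 3).astype("float32")
    y_train = np.random.randint(0, 21, size=100)
    x_val = np.random.rand(20, 224, 224, 3).astype("float32")
    y_val = np.random.randint(0, 21, size=20)

    # Placeholder classifier; the real model is the VGG16-based CNN.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(224, 224, 3)),
        tf.keras.layers.Dense(21, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Stop once validation accuracy stops improving, instead of running all
    # 50 epochs and overfitting; roll back to the best-performing epoch.
    early_stop = EarlyStopping(monitor="val_accuracy", patience=3,
                               restore_best_weights=True)
    model.fit(x_train, y_train,
              validation_data=(x_val, y_val),
              epochs=50,
              callbacks=[early_stop])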

I also tested the model with images of Aaron's student items because, for the final demo, Aaron is the one who will be physically demonstrating the overall system, including item registration. The most noticeable result was that, among the 21 student item labels, the model classified his water bottle with 100.0% confidence. The notebook was also classified correctly.

I also tested a picture of Aaron's pencil case taken with his hand in the frame and one taken without it, since either is possible when a user takes a photo while registering an item. The picture with the hand was correctly identified as a pencil case, but the picture without the hand was incorrectly labeled as a wallet. Misidentifying the pencil case as a wallet is acceptable, though, because both items are "pouch-like" objects, so it makes sense to classify a pencil case as a wallet and vice versa. To account for this issue (which also applies to rectangular objects like notebooks, laptops, etc.), we previously decided to show the top 3 classification results, and this result is shown at the top of the image as a Python list. Here the result is ['wallet', 'sports equipment', 'pencil case'], which contains the correct label, 'pencil case'. Thus, even though the image was misidentified if we only consider the top 1 result, the user can still benefit by simply choosing from the top 3 classification results.
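A small sketch of how a top-3 list like this can be pulled out of the model's softmax output; the helper function and the made-up probabilities are my illustration, not our exact code:

    import numpy as np

    def top3_labels(probs, labels):
        """Return the three most likely labels, highest first."""
        top_idx = np.argsort(probs)[::-1][:3]   # indices by descending probability
        return [labels[i] for i in top_idx]

    # Made-up probabilities over a few of the 21 student item labels.
    labels = ["wallet", "sports equipment", "pencil case", "notebook", "laptop"]
    probs = np.array([0.41, 0.22, 0.19, 0.10, 0.08])
    print(top3_labels(probs, labels))  # ['wallet', 'sports equipment', 'pencil case']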

Although I have increased the accuracy and made sure that my model works with Aaron's student item images, my progress on integrating the item recognition part into the web application is slightly behind because the TensorFlow and Keras modules throw errors when the web application is deployed. Thus, to leave enough time to integrate and test the deployed web application, I am leaning toward the backup plan, which is to run the trained model on the local machine (of the team member doing the final demo) and have it communicate with the web application locally. To do so, I am working extensively with Aaron, who is responsible for the deployment and the web application, and I asked Janet to enable the image form feature, where the user can upload an image for item recognition.
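To make the backup plan concrete, here is a rough sketch of what the local setup could look like as a Django view; the view name, form field, label list, and model file are all hypothetical placeholders rather than our actual code:

    # views.py -- hypothetical Django endpoint for the local backup plan.
    import numpy as np
    from django.http import JsonResponse
    from tensorflow.keras.models import load_model
    from tensorflow.keras.preprocessing import image

    LABELS = ["laptop", "notebook", "wallet", "pencil case"]  # plus the rest of the 21 labels
    MODEL = load_model("item_classifier.h5")  # model trained on the demo machine

    def classify_item(request):
        """Accept an uploaded image and return the top 3 label suggestions."""
        img_file = request.FILES["item_image"]             # from the image form
        img = image.load_img(img_file, target_size=(224, 224))
        arr = image.img_to_array(img)[np.newaxis] / 255.0  # batch of one, normalized
        probs = MODEL.predict(arr)[0]
        top3 = [LABELS[i] for i in np.argsort(probs)[::-1][:3]]
        return JsonResponse({"suggestions": top3})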

For next week, I plan to integrate this model into the web application. Then, we will test the deployed web application and see how it works with the overall system. Finally, our group will prepare for the final demo and work on the final report.

Joon’s Status Report for 5/1

This week, we had the last mandatory lab meeting with Professor Kim and Ryan. We discussed updates on our progress since the last meeting and our goals for the final presentation and report.

For the item recognition part, I was able to increase the accuracy of the model by training for more epochs. Instead of 15 epochs, where the validation accuracy started to converge, I trained for 50 epochs, which increased the classification accuracy to 84.36%. I believe that increasing the number of epochs is fine because the model is eventually trained on the AWS server; in the web application, the user can simply send an image to the server and get the classification results back from the model there. For reporting on the testing of the CNN model, I'm planning to present not only the accuracy percentage but also a confusion matrix, which gives an overview of how the predicted labels line up with the correct labels.
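A minimal sketch of building such a confusion matrix with scikit-learn; the label arrays below are placeholders for our real test labels and model predictions:

    import numpy as np
    from sklearn.metrics import confusion_matrix

    # Placeholder ground-truth and predicted label indices (0..20 for 21 items).
    y_true = np.array([0, 3, 3, 7, 20, 0])
    y_pred = np.array([0, 3, 5, 7, 20, 0])

    # Rows are true labels and columns are predicted labels; off-diagonal
    # entries show which items get confused with which.
    cm = confusion_matrix(y_true, y_pred, labels=np.arange(21))
    print(cm.shape)  # (21, 21)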

In order to integrate the image recognition component into Janet's web application, my main goal for the remainder of the semester is to provide an API endpoint for the model. I have been setting up the model (along with the datasets and the trained weights) on the AWS server using Amazon SageMaker.
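Once such an endpoint exists, the web application could call it with boto3's SageMaker runtime client, roughly as sketched below; the endpoint name, content type, and JSON response format are assumptions on my part:

    import json
    import boto3

    # Hypothetical call from the Django server to a deployed SageMaker endpoint.
    runtime = boto3.client("sagemaker-runtime")
    with open("item_photo.jpg", "rb") as f:   # placeholder filename
        response = runtime.invoke_endpoint(
            EndpointName="item-recognition-vgg16",  # assumed endpoint name
            ContentType="application/x-image",
            Body=f.read(),
        )

    # Assuming the endpoint returns the top 3 suggestions as a JSON list.
    suggestions = json.loads(response["Body"].read())
    print(suggestions)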

Although the development and testing of the item recognition part are done, my progress on integrating it is slightly behind because I still need to provide the API endpoint between the web application and the item recognition module. Since the web app and the ML item recognition run on two different servers, and getting the servers to communicate with each other may be difficult, my backup plan is to run the trained model on the local machine (of the team member doing the final demo) and have it communicate with the web application locally. The top 3 recommendations can then be transferred to and displayed on the web application. To catch up, I have to work concurrently on providing an API endpoint between the two servers (the ML server and the web application server) and on the backup plan.

For next week, I plan to integrate this model into Janet's web application and provide an API endpoint for the server. Finally, I have to check that the recommendations are displayed correctly on the web application. Our group will also work on the deliverables for the finals week of this course.

Joon’s Status Report for 4/24

This week, we had an additional meeting with Professor Kim because there weren’t any mandatory lab meetings due to the ethics discussion, and we also wanted to inform him about our progress after the interim demo.

For the item recognition part, I had to increase the item classification accuracy. For the interim demo, the previous model I implemented had 58% item classification accuracy. After the discussion during the interim demo, however, I realized that I needed to increase the recognition accuracy by a significant amount. While I initially thought that 60% accuracy was acceptable (and 50% accuracy for the MVP), since the user can manually type the item information whenever a wrong suggestion is given, I agreed that higher accuracy is desirable to reduce the user's burden. I had to completely change the model, and the better model I was able to find was the VGG16 CNN model (for more information on the VGG16 model, see the VGG16 paper and blog post). This model is provided through TensorFlow and Keras in Python, so I coded and trained the newly implemented VGG16 model. I also had to change the image dimensions from 256 x 256 to 224 x 224 because the initial convolution layer takes a 224 x 224 image, and that dimension is widely used in machine learning models for image classification. Another step I took to increase the accuracy was to use a validation set. With the validation set, I could find the best hyperparameter, which was the number of epochs to train the model: I stop training when the validation accuracy and validation loss converge. Out of 50 epochs, which took a very long time, I found that the accuracy converged around 15 epochs, so I stopped training there.
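As a condensed sketch, this is one common way to set up VGG16 in Keras (here with a pretrained ImageNet base, which is an assumption on my part; the classifier head and its sizes are also illustrative rather than our exact configuration):

    import tensorflow as tf
    from tensorflow.keras.applications import VGG16

    # Pretrained VGG16 convolutional base; its first layer expects 224 x 224 RGB.
    base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
    base.trainable = False  # keep the pretrained features fixed

    # New classifier head for the 21 student item labels (sizes illustrative).
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(21, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()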

After implementing the model, I was able to increase the accuracy to 80.21%, which was much better than the planned accuracy. For instance, whenever a user inputs an image of a laptop, the model correctly identifies that the image is a laptop, with 99.52% confidence. For visualization and demonstration purposes, I printed out all 21 labels and the classification likelihood for each label as a percentage. I also took a picture of my wallet, and the model correctly identified it as a wallet, with 78.09% confidence. This was good to see because it means my model works well on images taken with a user's smartphone. Also, among the 21 labels there are many rectangular objects, such as laptop, notebook, tablet, and textbook, but the model can still correctly classify the image as a wallet. Moreover, the item classification returns the top 3 suggestions so that the user can simply choose the item classification from among them. These suggestions are returned in the form of a Python list so that they can be easily transferred to the Django web application.

My progress on the item recognition part is slightly behind because I still need to integrate the item recognition module into the web application. To catch up, since Janet is done with her individual work on the web application, I need to work extensively and collaborate with her during the lab sessions to fully integrate the module and test that the integration is smooth.

For next week, I plan to integrate this model into Janet's web application. To do so, I have to move my trained model onto the AWS server and look into methods for integrating the CNN model into the Django framework. In the later part of next week, our group will also work on the final presentation slides.

Joon’s Status Report for 4/10

This week, for the item recognition part, I worked on finalizing the CNN model and improving the classification accuracy after testing with many more images (more than the 21 images from last week). However, after updating the algorithm, I found that while the current accuracy is fine for the interim demo, the image recognition accuracy is still lower than the accuracy we set in our requirements for the Design Proposal/Presentation. My current model presents only the top 1 item recognition result for a given student item image; making the model present the top 3 results will help increase the effective accuracy. This weekend, before the interim demo, I will devote most of my time to improving the image recognition model and algorithm.

My progress on the item recognition part is slightly behind because I was trying to get the item recognition algorithm above our accuracy requirement, so I was unable to integrate it into Janet's web application. Thus, for the demo, we will demonstrate the item recognition functionality separately from the web application, which tracks the tagged items inside/outside of the backpack. Our schedule has been modified to account for this delay in the item recognition work.

For next week, I hope to do more extensive testing, not only with the scraped images but also with "real" images taken from a user's smartphone camera. I will take some student item photos on my phone to test the model and ask my teammates to take some student item photos as well to test the item recognition module. I will also make the system present the top 3 item recognition results instead of only the current top 1. Then, I hope to integrate item recognition with Janet's web application.

Joon’s Status Report for 4/3

This week, we met with Professor Kim and Ryan during the regular lab meetings. As usual, we again discussed our current progress on the project and goals for the interim demo.

For the item recognition part, I completed the implementation of the CNN model using Python and PyTorch. First, training is done using the 1000 images per item that were obtained by scraping from the web and by applying an image processing algorithm to single images to augment the dataset. The training is done on my machine so that I can fix any bugs and errors in my CNN model implementation. For the implementation and training, much guidance was received from this blog post.
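For flavor, here is a minimal sketch of a PyTorch CNN of this kind; the layer sizes are illustrative assumptions, not the exact architecture I trained:

    import torch
    import torch.nn as nn

    class ItemCNN(nn.Module):
        """Small convolutional classifier for the 21 student item labels."""
        def __init__(self, num_classes=21):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.classifier = nn.Linear(32 * 64 * 64, num_classes)  # for 256 x 256 inputs

        def forward(self, x):
            x = self.features(x)
            return self.classifier(x.flatten(1))

    model = ItemCNN()
    dummy = torch.randn(1, 3, 256, 256)   # images were 256 x 256 at this stage
    print(model(dummy).shape)             # torch.Size([1, 21])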

Additionally, I worked on testing the CNN model using a few sample images from the test dataset. I found that the CNN model does a good job of identifying the student items. I tested it with 21 images, one image (not from the training dataset) for each of the 21 student item labels, and it showed 100% accuracy. However, I hope to test extensively with a much larger test dataset.
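The accuracy check itself can be sketched like this; the stand-in model and random tensors below are placeholders for the trained CNN and the real held-out images:

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset

    # Stand-ins: a trivial classifier and one random "image" per label.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 256 * 256, 21))
    test_images = torch.randn(21, 3, 256, 256)
    test_labels = torch.arange(21)
    loader = DataLoader(TensorDataset(test_images, test_labels), batch_size=4)

    model.eval()
    correct = 0
    with torch.no_grad():                        # no gradients needed for testing
        for images, labels in loader:
            preds = model(images).argmax(dim=1)  # most likely label per image
            correct += (preds == labels).sum().item()
    print(f"accuracy: {correct / len(test_labels):.2%}")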

My progress is on schedule according to the Gantt chart. I have also adjusted the schedule to account for the time spent training and finalizing the CNN model implementation, and pushed back the schedule for testing the CNN model. This testing should be done together with the integration with the web application prior to the interim demo.

Next week, I plan on extensively testing the CNN model. I will also work with Janet to integrate this feature into the web application. As a group, I will be working with my teammates to fully prepare for the interim demo.

Joon’s Status Report for 3/27

This week, we met with Professor Kim and Ryan during the regular lab meetings. We discussed our goals for the interim demo and our current progress on the project. We also briefly talked about the mid-semester grades and the feedback that came with them.

For the item recognition part, aside from the images already collected for the training dataset (250 images per item), I collected 150 images per item that are different from the training images. The IDT library was useful for this purpose as well, because it supports separating training data from test data, which prevents the model from training on test images and potentially skewing the test results. I also augmented the images by applying transformations, reflections, and distortions to generate, from a single image, many images that are real-world feasible and label preserving (see the sketch below). I also wrote Python and PyTorch code to build the CNN model.
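As a sketch of what such label-preserving augmentation can look like with torchvision; the specific transforms and their parameters are illustrative assumptions:

    from torchvision import transforms

    # Label-preserving augmentations: reflections, small rotations, mild
    # color distortions, and random crops (parameters are illustrative).
    augment = transforms.Compose([
        transforms.RandomHorizontalFlip(p=0.5),
        transforms.RandomRotation(degrees=15),
        transforms.ColorJitter(brightness=0.2, contrast=0.2),
        transforms.RandomResizedCrop(256, scale=(0.8, 1.0)),
        transforms.ToTensor(),
    ])

    # Applying `augment` repeatedly to one PIL image yields many distinct but
    # realistic training samples, e.g.:
    #   from PIL import Image
    #   img = Image.open("water_bottle.jpg")   # placeholder filename
    #   samples = [augment(img) for _ in range(10)]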

My progress is slightly behind because training the CNN model took longer than I expected, due to the large number of images. However, I am confident that I can integrate the item recognition part into the web application that Janet is working on so that our group can demonstrate the feature at the interim demo.

Next week, I plan on fully training the CNN model with the dataset and testing it on the test dataset I collected this week. I will also look into integrating the CNN model with the web application built on the Django framework, and into handling machine learning models on the AWS server.

Joon’s Status Report for 3/13

This week, I worked with my teammates on the Design Report. Aside from the regular lab meetings, where we spent all of the time listening to the section's design report presentations, we held our usual meetings outside of the lab sessions to discuss our current progress on the project and the design report. Based on the feedback received from the faculty, TAs, and peers in Section C, we look forward to improving our project.

For the item recognition part, I collected the data (a collection of images) for the items that students carry in their backpacks. For each of the 21 items, which include textbooks, notebooks, pencil cases, etc., I was able to collect 250 images. I initially planned to write a Python scraper to scrape images from Google. In order to scrape from Google Images, I had to download ChromeDriver, a web driver for Google Chrome, but I wasn't able to download it. However, I found a library called IDT (Image Dataset Tool), which enabled me to scrape 250 images for each student item. I decided to download and use the IDT library for a few main reasons (for more information about the library, refer to https://pypi.org/project/idt/ and https://github.com/deliton/idt ):

  • It is intended for creating image datasets for machine learning, which suits my task of using a CNN for image classification.
  • The library supports optimizing the image dataset by downscaling and compressing the images for optimal file size (illustrated in the sketch after this list).
  • The library allows multiple searches with different keywords. For example, when searching for images of a reusable water bottle, a user might type "water bottle", "reusable water bottle", or "plastic bottle" and expect the same results. IDT lets the user supply multiple keywords so that the collected images cover these variations and better train the machine learning model.
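To illustrate the downscaling and compression idea from the second bullet, here is my own rough Pillow sketch of what that kind of dataset optimization amounts to; this is not IDT's actual API, and the filenames and sizes are placeholders:

    from PIL import Image

    # Cap the resolution and recompress so the dataset stays small.
    img = Image.open("water_bottle_raw.jpg")        # placeholder filename
    img.thumbnail((512, 512))                       # downscale in place, keeping aspect ratio
    img.convert("RGB").save("water_bottle.jpg",
                            quality=85, optimize=True)  # JPEG compression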

Below is the collection of "reusable water bottle" images gathered by the IDT library and saved in my personal directory in my workspace.

My progress is currently on schedule. I have also made changes to the schedule for the item recognition part, because I was able to refine my schedule after the research and design process.

Next week, I plan on augmenting the collected images and finalizing the dataset to train the CNN image classifier. I will also develop the CNN for the image classification/recognition task.

Joon’s Status Report for 3/6

This week, I worked with my teammates on the Design Review presentation slides and document. During our lab session, we discussed the details of the user flow diagram and the wireframe of the phone application. We also decided on our design and implementation plans for the learning system of the application, which enables the application to recommend which items to bring at a certain time based on the user's past behavior.

For the item recognition part, I watched the Machine Learning Recipes playlist with Josh Gordon from Google Developers. This playlist taught me about building a suitable model and training it as an image classifier, which is what I want to accomplish for the item recognition part of the application. Training the model will be accomplished by scraping images of the items from Google or other search engines and learning the features of those images. I also watched the Udemy course Image Recognition with Neural Networks from Scratch. This course covered the mathematical concepts behind neural networks and how a neural network completes the task of recognizing an item. These two lectures and the blog post guided me in finalizing my design of the item recognition pipeline, which uses a Convolutional Neural Network (CNN) model.

My progress is currently on schedule.

Next week, I plan on starting the implementation of the item recognition classifier. From the list of items set up last week, I will begin collecting the data (a collection of images) for the items that students carry in their backpacks. Since our team has our GitHub repository set up, I hope to share my progress with my teammates!

Joon’s Status Report for 2/27

This week, I presented our team's project proposal to Section C and Professor Kim. Thanks to my group members' support, the presentation went really well. From the feedback I received from classmates in Section C, I was delighted to hear that the presentation was clear and very easy to understand. I am also looking forward to hearing feedback on the technical content of the presentation and discussing it with my teammates.

I did more research on computer vision and image recognition modules and datasets for the scope of our project. I read the OpenCV documentation to get familiar with the module. Aside from OpenCV, which I found last week, I learned that TensorFlow is another module for image classification. The TensorFlow website also provides a list of machine learning models (MobileNet, Inception, and NASNet) for image recognition in smartphone applications, along with their accuracy and processing time (link: https://www.tensorflow.org/lite/guide/hosted_models). I read some papers related to MobileNet, and hopefully next week I can determine which models to train and build for our project. Based on Professor Kim's comments this week, I developed a list of items that a student would carry in a backpack. Having this comprehensive list of items will help me determine the items to identify and classify when a student registers an item with the smartphone camera.

Our team and I are currently on schedule. 

Next week, I plan on finally deciding on the modules and software libraries to use for camera identification so that I have a finalized pipeline for image processing. Since I am in Korea right now and the other teammates are completely remote, we need to set up a version control system (Git) so that we can work together in an online/virtual environment. I also need to work on the Design Review document with my teammates next week.

Joon’s Status Report for 2/20

This week, I worked with my team to solidify and expand the overall idea of Backpack Buddy developed for our Project Abstract. Based on each team member's areas of expertise, we divided our tasks and planned our schedule for the entire semester using a Gantt chart. Since I have taken many computer science and machine learning courses, I decided to work on the camera feature of the system, where users can use their phone camera to register items into the database. I needed to get familiar with the concepts and applications of computer vision because that knowledge is instrumental to completing my task (links I referred to: Object Recognition with Deep Learning, OpenCV for Detecting Objects in Live Video).

I did some research related to computer vision and the most frequently used modules, and I decided that the OpenCV library is suitable for this project. I also learned the difference between object recognition and object detection. Although detecting an object in a camera image will not be too difficult with the OpenCV library, recognizing which object it is will be a much more challenging task. Therefore, my plan for next week is to do more research on using computer vision techniques in phone applications built in Kotlin and to code the object recognition algorithms to be applied to our project.

I also worked with my team members to set up our WordPress website and build dozens of slides for the Proposal Presentation next week. We're on schedule because we regularly meet outside our designated lab sections to catch up on our progress and work together on the slides. As the first presenter on my team, I will be delivering the Proposal Presentation, so I practiced it ahead of our class next week.