Author: aazhao

Alanis’ Status Report for 12/7

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).

This week, I finished the systematic testing of our classification models by testing how brightness, x/y translations, and rotations affect the accuracy of our classification models. I wrote a script to do this and also created some graphs.
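As an illustrative sketch of that testing loop (the function names are mine, and this simplified version rotates in 90-degree steps rather than the arbitrary angles the actual script tested):

```python
import numpy as np

def perturb(img, brightness=1.0, dx=0, dy=0, quarter_turns=0):
    """Apply a brightness factor, an integer (dx, dy) translation, and a
    rotation to one image. The real script used arbitrary rotation angles;
    this sketch keeps to 90-degree steps so it stays numpy-only."""
    out = np.clip(img.astype(np.float32) * brightness, 0.0, 255.0)
    out = np.roll(out, shift=(dy, dx), axis=(0, 1))
    out = np.rot90(out, k=quarter_turns)
    return out.astype(np.uint8)

def accuracy_under(predict, images, labels, **perturbation):
    """Fraction of perturbed images a classifier still labels correctly."""
    hits = sum(predict(perturb(im, **perturbation)) == y
               for im, y in zip(images, labels))
    return hits / len(images)
```

Sweeping `brightness` (e.g. 0.5, 0.67, 1.5, 2.0), `dx`/`dy`, and the rotation over the test set gives the accuracy-vs-perturbation curves for the graphs.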

I also evaluated which architecture out of ResNet50/101/152 would be best for the color and clothing type models by weighing correct predictions against close predictions. A close prediction for clothing type would be guessing jeans instead of trousers, or a dress instead of a jumpsuit. A close prediction for color would be guessing beige instead of white, or black instead of grey. In general, I counted a clothing type guess as close if it was in the same category (tops, overwear, bottoms) and had a similar shape/usage/weather suitability: jeans and trousers can both be casual or business casual, and a dress and a jumpsuit are both suitable for warm and neutral but not cold weather. I counted a color guess as close if the actual color could be reached by lightening or darkening the predicted color, since lighting can cause this: you can reach white by lightening beige, which is already quite light, and black by darkening grey, but you cannot reach blue by lightening or darkening red. The exact criteria for close guesses will be outlined in our final report; they are quite long, so I will not include them here.
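The scoring itself is simple once the close-pair criteria are fixed. A minimal sketch, using only the example pairs mentioned above (the full criteria tables are in the final report):

```python
# Hypothetical subset of the "close pair" criteria; the full tables
# are in the final report, not here.
CLOSE_TYPE_PAIRS = {frozenset({"jeans", "trousers"}), frozenset({"dress", "jumpsuit"})}
CLOSE_COLOR_PAIRS = {frozenset({"beige", "white"}), frozenset({"grey", "black"})}

def score(pred, actual, close_pairs):
    """Return 'correct', 'close', or 'wrong' for one prediction."""
    if pred == actual:
        return "correct"
    if frozenset({pred, actual}) in close_pairs:
        return "close"
    return "wrong"

def tally(preds, actuals, close_pairs):
    """Count correct/close/wrong over a batch, for comparing architectures."""
    counts = {"correct": 0, "close": 0, "wrong": 0}
    for p, a in zip(preds, actuals):
        counts[score(p, a, close_pairs)] += 1
    return counts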

Also, this comparison wasn’t done for the usage model, since only the ResNet50 architecture was able to converge during training.

I also debugged our .tflite color model. We have 3 classification models (clothing type, color, and usage) that we converted to .tflite for inference on the Jetson. They all use the ResNet50 architecture and were trained in the same way (mostly the same data, but with different labels), and they all perform at 60-70% accuracy before conversion to .tflite. After running inference with the converted .tflite models on the Jetson, the clothing type and usage models perform at the same accuracy; however, the color model’s accuracy took a very large hit, and it mostly predicts blue, brown, and black.

We verified that the camera is not causing the issue, since we are testing the models on pretaken images. I tested the problematic .tflite model on my computer, and its accuracy matched the original. I verified this with 2 different TensorFlow versions: the one I used to train and the one running on the Jetson. I also wrote some code for Riley to try the original model optimized for the Jetson with TensorRT; however, this was very slow. I also sent Riley .tflite versions of the ResNet101/152 models, which also performed with low accuracy. We also tried training a model on the same TensorFlow version as the Jetson before converting it to .tflite, which still did not fix the issue.
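One way to localize a conversion bug like this is to run the same images through the original and converted models and compare their outputs directly. A small sketch, assuming `probs_a` holds softmax outputs from the Keras model’s `predict` and `probs_b` the corresponding outputs collected from a `tf.lite.Interpreter` loop:

```python
import numpy as np

def agreement(probs_a: np.ndarray, probs_b: np.ndarray):
    """Compare two models' softmax outputs on the same images: top-1
    agreement rate and the largest per-class probability gap. A sharp drop
    in agreement (as with our color model) points at the conversion or the
    runtime, not the camera."""
    top_a = probs_a.argmax(axis=1)
    top_b = probs_b.argmax(axis=1)
    rate = float((top_a == top_b).mean())
    max_gap = float(np.abs(probs_a - probs_b).max())
    return rate, max_gap
```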

We decided to also pursue an alternative way of determining the color: cropping the clothing image around the largest object and finding the most prominent color in the cropped image. I wrote code to do this, which is located here. We will continue trying to fix the .tflite file and will see which option works better before the demo.
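A rough sketch of the idea, assuming a roughly uniform background (the corner-based background estimate and the channel quantization are my simplifications, not necessarily what the linked code does):

```python
import numpy as np

def crop_largest_object(img, bg_tol=30):
    """Crop to the bounding box of pixels that differ from the (assumed
    uniform) background, which is estimated from the image corners."""
    corners = np.stack([img[0, 0], img[0, -1], img[-1, 0], img[-1, -1]])
    bg = corners.astype(np.float32).mean(axis=0)
    mask = np.abs(img.astype(np.float32) - bg).sum(axis=2) > bg_tol
    ys, xs = np.where(mask)
    if len(ys) == 0:
        return img  # nothing stood out from the background
    return img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

def dominant_color(img, bins=8):
    """Most prominent color: quantize each channel into `bins` buckets and
    return the center of the most frequent bucket."""
    step = 256 // bins
    q = (img // step).reshape(-1, 3)
    keys, counts = np.unique(q, axis=0, return_counts=True)
    top = keys[counts.argmax()]
    return tuple(int(c) * step + step // 2 for c in top)
```

The quantized color returned here would then be mapped to the nearest of our 12 color classes.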

I also worked on our poster.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is on schedule.

What deliverables do you hope to complete in the next week?

I hope to fix the color model and fine-tune the pixel-based color prediction code by adding the step that determines the main color from the most prominent RGB value.

Alanis’ Status Report for 11/30

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).

I fixed our usage model by compiling a new dataset. I manually labeled about 5000 images of clothing items like dresses, tops, jackets, etc. which could be either casual or formal. I then trained a ResNet50 model on this dataset and was able to achieve an 84% validation accuracy and a 65% “sanity check” accuracy. The code for the manual labeling is here; it basically displayed an image and allowed me to mark it as casual or formal. The code for the usage training is here.

I also worked in person with Riley to debug the Jetson issues with TensorFlow. The issue ended up being an incorrect JetPack version. I then wrote some code on the Jetson to run inference with the 3 classification models. This involved converting my .keras models to the “saved model” format, which cannot be trained further, and then to the TFLite format, which allows much faster inference on the Jetson.
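A sketch of that conversion pipeline, assuming TensorFlow 2.x (the function name and paths are hypothetical; TensorFlow is imported lazily so the sketch can be read without it installed):

```python
def convert_to_tflite(keras_path, saved_model_dir, tflite_path):
    """keras model -> inference-only SavedModel -> .tflite flatbuffer,
    the pipeline described above. Paths are illustrative."""
    import tensorflow as tf  # lazy import; only needed when actually converting

    model = tf.keras.models.load_model(keras_path)
    tf.saved_model.save(model, saved_model_dir)  # no longer trainable
    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    tflite_model = converter.convert()
    with open(tflite_path, "wb") as f:
        f.write(tflite_model)
```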

I also added some frontend/backend features. I wrote a function which takes a city name as a string and returns whether the weather is cold, neutral, or hot. This utilizes the OpenWeatherMap API. I also changed the frontend code to display up to 4 pieces of clothing: top/dress/jumpsuit, bottoms, jacket, and overwear (hoodie, sweater, etc.). This needed to be done due to a change in our outfit generation algorithm. I also added a frontend page which displays each piece of clothing in a user’s closet (6 per page, with the option to scroll through pages). When a piece of clothing is clicked, it shows the user the current classification labels and allows them to be changed if necessary. I had to make some changes to the backend APIs as well for this feature. These features took a few days and various commits, but the linked commits are the ones with the finished features.
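The weather function can be sketched like this. The endpoint and `main.temp` field are the real OpenWeatherMap current-weather API, but the temperature thresholds here are illustrative guesses, not the ones our backend actually uses:

```python
import json
import urllib.parse
import urllib.request

COLD_MAX_C, HOT_MIN_C = 10.0, 24.0  # assumed thresholds, not the report's

def bucket(temp_c):
    """Map a Celsius temperature to the cold/neutral/hot buckets
    used by outfit generation."""
    if temp_c < COLD_MAX_C:
        return "cold"
    if temp_c > HOT_MIN_C:
        return "hot"
    return "neutral"

def weather_for_city(city, api_key):
    """Fetch the current temperature for `city` from OpenWeatherMap
    and bucket it."""
    url = ("https://api.openweathermap.org/data/2.5/weather"
           f"?q={urllib.parse.quote(city)}&appid={api_key}&units=metric")
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return bucket(data["main"]["temp"])
```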

I also worked on our final presentation by creating the slides and writing information for Riley to turn into a script.

I also began the systematic testing of our classification models by writing code to help determine how the accuracy changes when the brightness of the image is increased or decreased.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is on schedule.

What deliverables do you hope to complete in the next week?

I hope to fix the color model and finish the systematic testing for brightness (brightness increased by factors of 1.5 and 2 and decreased by the same factors).

As you’ve designed, implemented and debugged your project, what new tools or new knowledge did you find it necessary to learn to be able to accomplish these tasks? What learning strategies did you use to acquire this new knowledge?

We recognize that there are quite a few different methods (i.e. learning strategies) for gaining new knowledge — one doesn’t always need to take a class, or read a textbook to learn something new. Informal methods, such as watching an online video or reading a forum post are quite appropriate learning strategies for the acquisition of new knowledge.

I had to learn how to use the TensorFlow framework to train a model. I used the online documentation for TensorFlow, as the library is very well documented. I also had to learn how to use the pandas library to process datasets; I mainly relied on this article from W3Schools. I also had to determine the best model architectures to use. I relied on this paper, which discusses the performance of different architectures on the Jetson.

I also had to learn how to write backend APIs in the FastAPI format. I learned this from Gabriella. I also had to learn how to write Flutter code. I did this by googling “how to create _ in flutter” for each feature I wanted to make and following the forum posts or articles which came up. 

Whenever I hit roadblocks, I usually googled the error I was encountering and followed the advice of forum posts. For questions about the backend/frontend, I usually asked Gabriella first to see if she had encountered them.

Team Status Report for 11/30

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

The most significant risk at the moment is the classification accuracy of images from the camera. We are getting a much lower color classification accuracy on images taken with the camera than on our validation and “sanity check” datasets. We haven’t pinpointed the exact reason, since the clothing type and usage accuracies from the camera are similar to those on our validation and “sanity check” datasets. We are planning to retrain our color model on a larger dataset, which may require manually labeling more images. If this doesn’t fix the accuracy, we may try a different approach that does not require a classification model: determining the most common pixel values in the center of the image to find the base color of the clothing.

Another possible issue is our clothing type model, since its accuracy on camera images is a bit lower than on the validation and “sanity check” datasets for “long” clothing items like dresses, lounge pants, and trousers. We think this may have been due to a confusing background with a doorway and a bright pillow. This background was only used because Riley wasn’t able to find a large enough solid-color background at his home over break; once we return to Pittsburgh, we will have one and can rerun the tests for “long” clothing items.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc.)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

No changes were made.

Provide an updated schedule if changes have occurred.

Our schedule has not changed.

This is also the place to put some photos of your progress or to brag about a component you got working.

Photos of our progress are located in our individual status reports.

Alanis’ Status Report for 11/16

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).

I ended up being unable to use the EfficientNet architecture because of a bug which I discussed in our weekly meeting this week. However, I retrained our color and clothing type models using a balanced dataset and tried out the ResNet50, 101, and 152 architectures. The ResNet152 performed best for clothing type classification and the ResNet101 performed best for color classification, each reaching a validation accuracy of 65-75%. I also manually compiled a small dataset of 80 images (including at least 5 examples from each type and color class) which I used as a sanity check to ensure that our models perform well on the kinds of images that Style Sprout users would take. This spreadsheet includes the accuracy for the ResNet50 and ResNet152 versions of the models, and I will update it to include the accuracy for the ResNet101 after training is finished.

I also did some troubleshooting of our usage model, since it only output the “sports” usage (which has now been renamed to casual). My troubleshooting included running the 3 tests discussed in our weekly meeting, and I have determined that my next course of action is to determine the usage for fewer categories. This means determining only whether clothing types like tops and skirts are formal or casual, and ignoring clothing types like blazers and jeans, which are inherently one or the other.

I also worked with Gabriella in person on Monday to test the uses-until-dirty feature. We had to make a couple of changes to our backend and frontend, which are located on our Github, but we were able to successfully complete this feature. I also added the ability to display a variable number of pictures when an outfit is generated. Before, it only displayed a top and bottom; now it can show outfits with 3 pieces of clothing (jacket, top, bottom), 2 pieces (jacket and dress, or top and bottom), or 1 piece (dress). This is located on this branch on our Github.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is on schedule.

What deliverables do you hope to complete in the next week?

In the next week, I hope to finish the usage model and continue helping Gabriella with the backend/frontend. 

Verification Tests:

I tested the accuracy of our models in two ways. Our dataset was split into two parts, training and validation (training is 80% of the original, validation is the remaining 20%). When training each model, the validation dataset was used to measure the performance of the model. For the clothing type and color classification models, the validation accuracy ranged from 60-70% depending on the architecture that was used. Since each model is a 1-of-k classifier, where each image belongs to one of k classes, the validation accuracy is categorical accuracy: the number of correctly classified images divided by the total number of images. A correctly classified image is one where the predicted class is the same as the actual class.
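The two pieces described above, the 80/20 split and categorical accuracy, can be sketched as (function names are mine):

```python
import numpy as np

def categorical_accuracy(pred_classes, true_classes):
    """Number of correctly classified images / total number of images."""
    pred = np.asarray(pred_classes)
    true = np.asarray(true_classes)
    return float((pred == true).mean())

def train_val_split(items, val_frac=0.2, seed=0):
    """Shuffle and split a dataset into training (80%) and validation (20%)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(items))
    cut = int(len(items) * (1 - val_frac))
    return [items[i] for i in idx[:cut]], [items[i] for i in idx[cut:]]
```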

I also tested the accuracy of our models using a small dataset of 80 images containing 5 examples of each of our 16 possible clothing types and about 6-10 images for each of our 12 colors. I manually compiled this dataset. I evaluated the predictions made by our models using categorical accuracy and a spreadsheet linked earlier in my status report. I haven’t finished checking the accuracy for the ResNet101 clothing type and color models since they are still training.

However, it is highly likely that none of our models will meet the 80% classification accuracy use case requirement. We are supplementing this with the additional feature of having the user confirm or correct the labels produced for each piece of clothing, which should fix any incorrect labels and achieve 100% label accuracy in our database.

Team Status Report for 11/9

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

The most significant risk that could jeopardize the success of the project is the classification accuracy, same as last week. We are managing this risk by retraining our models, which have achieved higher accuracy than we were getting before. We are also managing this risk by working with a “default” model on the Jetson which has a low accuracy. By using this, we can continue to develop our code on the Jetson and integrate all parts of our system without being slowed down. Later, we will be able to switch out the default model with our actual models. 

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc.)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

If we have time at the end of our project, we will add functionality to our system that allows the user to change incorrect labels. This will be an additional page showing all scanned and classified items which have not yet been verified. If the user wants to, they can go through this page, update any incorrect classifications, and confirm correct ones. After pieces are confirmed, they will be removed from the page. Also, even if users do not confirm or change the classification labels of pieces, the pieces will still be in our database of the user’s closet, so this feature is optional to use.

We also added a push button to the design to limit the number of pictures that will be taken and to ensure that the user has the clothing ready and prepared when the picture is taken. The cost is minimal ($15) and the button will not dramatically change the architecture.

Provide an updated schedule if changes have occurred.

Our schedule has not changed. 

This is also the place to put some photos of your progress or to brag about a component you got working.

Photos of our progress are located in our individual status reports.

Alanis’ Status Report for 11/9

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).

This week, I worked on retraining our classification models, focusing on the clothing type classification model. I tried out the EfficientNet architecture, which, from my research, outperforms ResNet while maintaining a low classification time on edge devices. I reprocessed our DeepFashion training data with this code because it had previously been cropped incorrectly. I then applied image augmentation to boost the training accuracy; the code is located here. I experimented with various combinations of image augmentation and architecture, and improved the performance from last week by 20 percentage points to 70% by using EfficientNetB4.
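A numpy stand-in for the kind of augmentations involved (the actual training code used TensorFlow preprocessing; the specific jitter ranges here are illustrative, not the trained values):

```python
import numpy as np

def augment(img, rng):
    """One random training augmentation: horizontal flip, small brightness
    jitter, and a small wrap-around translation. A numpy stand-in for the
    TensorFlow preprocessing used in the actual training pipeline."""
    out = img.astype(np.float32)
    if rng.random() < 0.5:
        out = out[:, ::-1]                                 # horizontal flip
    out = np.clip(out * rng.uniform(0.8, 1.2), 0, 255)     # brightness jitter
    dy, dx = rng.integers(-4, 5, size=2)                   # small translation
    out = np.roll(out, (dy, dx), axis=(0, 1))
    return out.astype(np.uint8)
```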

While training models, I tried to find an alternative to training on my local device. I looked into CMU-provided machines; however, they reset every night, which would interfere with training. I also looked into using Google Colab; however, the free version would not let me upload a zip file of my training data. Also, the free version’s runtimes are limited to around 2 hours, and even sped-up training on a GPU would take longer than that. I also signed up for a free trial of Google Cloud to use Colab Enterprise; however, it still would not allow me to upload my training data, and runtimes were limited to 17 hours, which is shorter than the training time. I will ask about alternatives to local training during my meeting next week.

I also worked on the frontend. I added functionality to the Settings button which allows the user to input how many times a piece of clothing can be worn until it becomes dirty. This included adding a popup when the Settings button is pressed, adding an integer input field to the popup, and adding an API call when the input is entered. I also added a basic API outline to the backend code, which will be updated next week with the functionality to change the setting on the backend. The frontend code is located on this branch and the backend code is located here.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is on schedule.

What deliverables do you hope to complete in the next week?

Next week, I hope to start retraining the usage model. I also hope to finish the functionality for the settings button. 

Alanis’ Status Report for 11/2

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).

This week, I finished drawing the background image of our app for the frontend. It looks like this:

I also began testing our clothing type, color, and usage models with various pictures of pieces from my own closet, as well as from online secondhand stores like Depop. I chose to source some pictures from Depop because we want to test our models with pieces from all possible classes, but we don’t own enough clothing from certain categories (orange, blazers, rompers). I realized that we could find pieces from these categories on online secondhand shops, where people take well-lit pictures of their own clothes to sell, which helps supplement our testing data. It was during this testing that I realized that our usage and clothing type models struggled to achieve high accuracy in practice. For example, our model classified the below image as beige, casual, and a dress, which is correct.

However, clothing type and usage classifications were frequently incorrect. Our models classified a tank top as a blazer and pants as a top, while always outputting a casual usage type even for formal clothing. I realized that this was due to an incorrect training dataset. I had initially misunderstood what having a “balanced” dataset means and thought it only required having enough training data for each class past a threshold.

However, the amount of data present for each class should also be roughly equal. The accuracy of my clothing type and usage models was inflated because some classes, like tshirts and casual clothing, had much more data, while the color model was trained on a dataset where each class had a similar amount of data. Since the imbalance of my dataset was severe for clothing type, with categories like tshirts having thousands of examples and categories like blazers having just over 100, I had to source more data.
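A quick way to catch this kind of imbalance before training, sketched with pandas (function names are mine, and undersampling is just one possible fix):

```python
import pandas as pd

def balance_report(labels):
    """Per-class counts plus the max/min imbalance ratio; a ratio far above 1
    (e.g. thousands of tshirts vs ~100 blazers) inflates validation accuracy."""
    counts = pd.Series(labels).value_counts()
    return counts, counts.max() / counts.min()

def undersample(df, label_col, seed=0):
    """Equalize classes by sampling each one down to the rarest class's count."""
    n = df[label_col].value_counts().min()
    parts = [g.sample(n, random_state=seed) for _, g in df.groupby(label_col)]
    return pd.concat(parts)
```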

I returned to using the DeepFashion dataset. This dataset was initially insufficient for us because it didn’t label images with all the appropriate attributes (most images were missing color labels). However, it does have many labeled clothing images, which works for this retraining since we are not using it to classify color. The updated training and processing code is on our Github. I trained the ResNet50, which reached 50% accuracy. I am also training the ResNet101, which has more layers/parameters, to try to achieve a higher accuracy. If the accuracy of ResNet101 is still insufficient, I will pivot to a different architecture like EfficientNet.

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is on schedule.

What deliverables do you hope to complete in the next week?

In the next week, I hope to retrain the clothing type and usage models to be more accurate. I also hope to continue helping Gabriella with the frontend and helping Riley integrate the trained models onto the Jetson.

Team Status Report for 11/2

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

The most significant risk that could jeopardize the success of the project is the accuracy of our models. We are retraining them for reasons discussed in detail in Alanis’ individual status report. These risks are being managed by retraining, but we have also established the contingency plan of having the user confirm our classifications. This would mitigate insufficiently accurate classifications by allowing the user to fix incorrect labels, ensuring the accuracy of our outfit generation.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc.)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

No changes were made to the existing design of the system. Changes will be made to our backend/frontend to allow users to confirm the classification if we cannot achieve sufficient classification accuracy.

Provide an updated schedule if changes have occurred.

Our schedule has not changed. 

This is also the place to put some photos of your progress or to brag about a component you got working.

Photos of our progress are located in our individual status reports.

Team Status Report for 10/26

What are the most significant risks that could jeopardize the success of the project? How are these risks being managed? What contingency plans are ready?

A risk that we have is the transmission of data from the Jetson to the server. Currently, when set up in the computer labs, the Jetson is connected to CMU Device and thus has no issues communicating. However, during presentations this will no longer be the case, and the Jetson will be isolated. It could then become an issue to send classification data and images to the server to update the frontend and database. We are considering a couple of different solutions: USB/Ethernet cables to connect to the server, hotspotting, or simply connecting to CMU Secure if that is possible. We plan to test some of these solutions next week to see if our embedded device can communicate without being plugged in and connected to CMU Device.

Were any changes made to the existing design of the system (requirements, block diagram, system spec, etc.)? Why was this change necessary, what costs does the change incur, and how will these costs be mitigated going forward?

Yes, we decided against using AWS S3 for storing our images. This change was necessary because S3 pricing varies with the amount of storage you use, so it became complicated to pay for it with our Capstone fund. We will be using one of our computers as the server for our images instead. This is slightly inconvenient, as we will need storage for the images, but it should not cause any issues. Moving forward, if the amount of storage needed is more than we have available on our computers, we will look into different places to store the images and revisit the pricing for S3.

Provide an updated schedule if changes have occurred.

Our schedule has not changed. 

This is also the place to put some photos of your progress or to brag about a component you got working.

Photos of our progress are located in our individual status reports.

Alanis’ Status Report for 10/26

What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week (12+ hours).

This week, I made progress on the frontend by integrating my local Android emulator with the frontend repo that Gabriella has been working on. I then changed our main page to look more like what was proposed in our design report. This code is located on a separate branch of our frontend repo because Gabriella is currently using the main page to display debugging content from the backend. I am using a placeholder image for now, as I am still working on drawing the actual image that we will be using.

I also worked on our ethics assignment and discussed the group questions with Gabriella and Riley during our weekly meeting. During this weekly meeting, I also helped Riley debug why the import of OpenCV wasn’t working with Python. We initially thought the installation of OpenCV was incorrect, but the reason ended up being that the Python path was incorrect so Python could not find the installed libraries. I then wrote some code which would take an image provided by OpenCV from the camera and input it into our 3 models to output a one-hot encoded array which denotes which class it belongs to. We may have to change this array later depending on exactly how we will communicate the classes to our backend, but this code holds the general structure of the edge-device classification and allows for later output processing. The code is commented out for now to avoid conflicting with Riley’s work. I also wrote the code for testing our models by using pictures from our own closets to test our models on new data. 

predicted_classes is a one-hot encoding of the possible classes. I will use this code with pictures of our clothes to calculate the accuracy of our models on pictures that we take. 
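The conversion from a model’s softmax output to that one-hot array can be sketched as follows (the function name is mine; the real code feeds the camera image through our models first):

```python
import numpy as np

def to_one_hot(probs):
    """Collapse a batch of softmax outputs into one-hot class arrays,
    the format sent onward for output processing."""
    probs = np.asarray(probs)
    one_hot = np.zeros_like(probs, dtype=np.int64)
    one_hot[np.arange(len(probs)), probs.argmax(axis=1)] = 1
    return one_hot
```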

Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is on schedule.

What deliverables do you hope to complete in the next week?

I hope to complete the integration of our 3 classification models and the Jetson Xavier so that it is able to take an image with its attached camera, feed it into our models, and output a class for the clothing type, color, and usage which can then be transmitted to our backend. I also hope to continue testing our classification models with pictures of clothes from our own closets.