What did you personally accomplish this week on the project? Give files or photos that demonstrate your progress. Prove to the reader that you put sufficient effort into the project over the course of the week.
This week consisted mainly of four tasks:
- Setting up face detection and implementing it as a Python script
- Refining the search for a model architecture for the ML component
- Creating a data pipeline, training script, and wrapper class to be ready for training next week
- Refining the search for datasets and settling on an initial set
For face detection, I used Google's MediaPipe framework and wrote a script that detects and segments faces in a video feed, returning just the cropped image of the bounding box with the highest model confidence. I've included a video of the script running on stock footage; switching it to a live feed is a one-line change (my laptop doesn't have a built-in camera).
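The script's core logic can be sketched as below. The two helper functions are the actual selection/cropping steps; `run_on_video` shows one hypothetical way to wire them into MediaPipe's face-detection solution (the file name and confidence threshold are placeholders, not the real script's values).

```python
def best_detection(detections):
    """Pick the (score, bbox) pair with the highest model confidence.
    Each bbox is (xmin, ymin, width, height) in relative [0, 1] coordinates."""
    return max(detections, key=lambda d: d[0])

def crop_relative_bbox(frame, bbox):
    """Crop a relative bounding box out of an H x W x C frame, clamped to the image."""
    h, w = frame.shape[:2]
    xmin, ymin, bw, bh = bbox
    x0, y0 = max(int(xmin * w), 0), max(int(ymin * h), 0)
    x1, y1 = min(int((xmin + bw) * w), w), min(int((ymin + bh) * h), h)
    return frame[y0:y1, x0:x1]

def run_on_video(path="stock_video.mp4"):
    """Hypothetical wiring; requires `opencv-python` and `mediapipe`.
    Pass 0 instead of a file path to read from a live camera."""
    import cv2
    import mediapipe as mp
    cap = cv2.VideoCapture(path)
    with mp.solutions.face_detection.FaceDetection(min_detection_confidence=0.5) as fd:
        ok, frame = cap.read()
        while ok:
            results = fd.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if results.detections:
                dets = [(d.score[0],
                         (d.location_data.relative_bounding_box.xmin,
                          d.location_data.relative_bounding_box.ymin,
                          d.location_data.relative_bounding_box.width,
                          d.location_data.relative_bounding_box.height))
                        for d in results.detections]
                _, bbox = best_detection(dets)
                face = crop_relative_bbox(frame, bbox)  # image handed to the model
            ok, frame = cap.read()
    cap.release()
```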
For the model architecture search, I identified this week that MobileNetV2, while effective and an industry-standard architecture, is slightly dated, and newer models are available. The main contenders were ShuffleNetV2, Inception-Net, and MobileNetV3-Small. MobileNetV3-Small and ShuffleNetV2 are both successors to the techniques used in MobileNetV2. Between the two, ShuffleNetV2 offers slightly better speed at the cost of slightly lower accuracy than MobileNetV3-Small. However, since both models are lightweight and optimized for edge devices (< 100 MFLOPS), the accuracy gain is worth the small speed cost. Inception-Net likewise improves accuracy, but at a significant cost in speed and model size. MobileNetV3-Small was therefore chosen.
To be ready for pretraining next week, I created a data pipeline and a simple PyTorch training script, so that once the dataset classes are written I can download the datasets and start training immediately. To support this, I wrote a wrapper class around the MobileNet architecture that takes the feature embedding and adds my own classification head and reconstruction head. Reconstruction (and/or masked reconstruction) can then be used to pretrain the model on any extra data, and the classification head can be used for the supervised training.
I have also identified and browsed datasets for each factor the model needs to track:
Oiliness/Dryness:
- https://www.kaggle.com/datasets/shakyadissanayake/oily-dry-and-normal-skin-types-dataset
- https://www.kaggle.com/datasets/manithj/oily-and-dry-skin-dataset
Wrinkles:
Acne:
Sunburns/Redness:
- https://www.kaggle.com/datasets/trainingdatapro/skin-defects-acne-redness-and-bags-under-the-eyes
  - For skin redness
- https://www.kaggle.com/datasets/shubhambaid/skin-burn-dataset
  - For burns
For sunburns, we have a burns dataset, but since its data involves more serious burns across various body parts, images scraped from Google Images will be used to complement it.
Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?
What deliverables do you hope to complete in the next week?
Progress is on schedule. For next week's deliverables, I would like to have graphs from pretraining, and graphs from training the model on the different datasets. If those results are promising, I would then like to attach the face detection front-end to the model and analyze the end-to-end results.
