Joon’s Status Report for 3/13

This week, I worked with my teammates on the Design Report. In addition to the regular lab meetings, which we spent entirely listening to the section’s Design Report presentations, we held our regular meetings outside of the lab sessions to discuss the current progress of our project and the design report. We look forward to improving our project based on the feedback we received from the faculty, TAs, and peers in Section C.

For the item recognition part, I collected the data (a collection of images) for the items that students carry in their backpacks. For each of the 21 items, which include textbooks, notebooks, pencil cases, etc., I was able to collect 250 images. I initially planned to write a Python scraper to scrape images from Google. Scraping from Google Images required downloading ChromeDriver, a web driver for Google Chrome, but I wasn’t able to download it. Instead, I found a library called IDT (Image Dataset Tool), which enabled me to scrape 250 images for each student item. I decided to download and use the IDT library for a few main reasons. (For more information about the library, refer to https://pypi.org/project/idt/ and https://github.com/deliton/idt )

  • It is intended for creating image datasets for machine learning, which suits my task of using a CNN for image classification.
  • The library can optimize the image dataset by downscaling and compressing the images for optimal file size.
  • The library allows multiple searches with different keywords. For example, when searching for images of a reusable water bottle, a user might type different keywords such as “water bottle”, “reusable water bottle”, or “plastic bottle” to find the same kind of item. The IDT library lets the user supply multiple keywords, so that the user can collect a wider variety of images to better train the machine learning model.
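
One side effect of searching several keywords for the same item is that the same image can come back more than once. The following is a minimal sketch of how per-keyword results could be merged into a single class while dropping exact duplicates. It is not part of IDT and does not use IDT's API; the function name `merge_keyword_results` and the `(filename, image_bytes)` input format are assumptions made for illustration only.

```python
import hashlib


def merge_keyword_results(results_by_keyword):
    """Merge per-keyword image lists, dropping byte-identical duplicates.

    results_by_keyword maps a search keyword to a list of
    (filename, image_bytes) pairs. This input format is hypothetical.
    Returns a list of (keyword, filename) pairs for the images that are kept.
    """
    seen_digests = set()
    merged = []
    for keyword, images in results_by_keyword.items():
        for filename, data in images:
            # Hash the raw bytes so identical files match regardless of name.
            digest = hashlib.sha256(data).hexdigest()
            if digest in seen_digests:
                continue
            seen_digests.add(digest)
            merged.append((keyword, filename))
    return merged
```

Hashing the file contents only catches exact copies. A resized or recompressed version of the same picture would still be kept, so some manual review of the dataset is likely still needed.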

Below is the collection of “reusable water bottle” images collected by the IDT library and saved in a directory in my workspace.

 

My progress is currently on schedule. I made changes to the schedule for the item recognition part, because I was able to refine it after the research and design process.

Next week, I plan to augment the collected images and finalize the dataset for training the CNN image classifier. I will also develop the CNN for the image classification/recognition tasks.
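
To illustrate what augmentation means here, below is a minimal sketch that produces flipped and rotated variants of one image. It assumes the image is represented as a nested list of pixel values, which is a simplification for illustration; the actual pipeline would more likely use an image library, and which transformations are appropriate for backpack items is still to be decided. All function names are hypothetical.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [list(row) for row in image[::-1]]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    # Reversing the rows and then transposing gives a clockwise rotation.
    return [list(row) for row in zip(*image[::-1])]


def augment(image):
    """Return the original image plus three transformed variants."""
    return [
        [list(row) for row in image],
        flip_horizontal(image),
        flip_vertical(image),
        rotate_90_clockwise(image),
    ]
```

With these three transformations, each collected image yields four training samples, including the original.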
