Mingrui Yang Status 11/12

Ok, so this week I encountered some difficulties in hardware implementation: the belt not turning, the shaft not moving, and the shaft not fitting into the holes.

Yet I surmounted all of these obstacles. What I learned is that when I hit a problem, I should first try to figure it out myself, then look for high-level ideas on the internet. If that doesn't help, ask the people who specialize in hardware prototyping (the TA and the professor), and if that still doesn't help, ask the people who specialize in hardware manufacturing (the TechSpark staff).

Then my car battery died and I was unable to shop for hardware parts in the Pittsburgh area, so I am replacing the battery (which is also an obstacle for capstone). I would categorize this as a secondary obstacle.

I am on schedule. For next week, I have decided to test the leather belt to see if it would work. If it does, I plan to sew the leather belt together. I can't believe there are sewing machines on campus (https://resources.ideate.cmu.edu/equipment/sewing-machines/); I want to see if I can get access to them somehow.

Rachana’s Weekly Status Report (5th November)

This week, I spent time looking at our review from the design report. There are some things we need to figure out, including which aspects of the UI we want to display. Currently, we want to auto-add items to the list and give the user the ability to trash an item if need be. We also want to display this list of items with their individual costs and the cumulative cost. I found a grocery POS app that can help me with this; I need to customize this Django app to our needs. Furthermore, I have been expanding the CV system to more classes. I still want to improve the accuracy on the validation set. I think I am more on schedule than I was last week for my parts of the project.
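As a rough sketch of how the item list might look in Django (the model and field names here are placeholders I made up, not taken from the POS app we found):

```python
# models.py -- hypothetical sketch of the auto-added item list with per-item and cumulative cost
from django.db import models
from django.db.models import Sum

class CartItem(models.Model):
    name = models.CharField(max_length=100)                  # e.g. "Advil"
    price = models.DecimalField(max_digits=6, decimal_places=2)
    added_at = models.DateTimeField(auto_now_add=True)       # items are auto-added by the CV system

def cart_total():
    """Cumulative cost of every item currently on the list."""
    return CartItem.objects.aggregate(total=Sum("price"))["total"] or 0
```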

Rachana’s Weekly Status Report (12th November)

This week, I spent time exporting the model to a file, and I had some errors with the model because I was setting pretrained to False. From the demo, I realized I was training my model for the Coke class with both cans and bottles but only testing it with a bottle, which caused some train/test errors. I am able to display the number of records, and my predictions are all correct except for one. I am going to work a little more on fine-tuning the parameters of the model. I synced up the workflow so that predictions are made and the recognized items are output in the Django framework. I need to work on understanding the Raspberry Pi framework. I think I am a little behind schedule compared to last week for my part of the project.
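A rough sketch of the export-and-reload step in PyTorch (the architecture here, ResNet-18, the class count, and the file name are placeholders; the point is only the pretrained flag and the torch.save/torch.load usage):

```python
import torch
from torchvision import models

num_classes = 5  # placeholder: number of product classes

# Start from pretrained weights (the report notes errors when pretrained was set to False here).
model = models.resnet18(pretrained=True)
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)

# ... fine-tune on our product images ...

# Export only the trained weights to a file.
torch.save(model.state_dict(), "grocery_classifier.pt")

# In the Django prediction workflow: rebuild the same architecture and load the exported weights.
model = models.resnet18(pretrained=False)
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)
model.load_state_dict(torch.load("grocery_classifier.pt"))
model.eval()
```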

Team Status Report 11/12

This week, Jiyeon mainly worked on improving her text detection. She added a temporary dictionary that contains product names and their descriptions so that a product can be matched even if the brand name is not extracted from the image. For instance, the main difference between Advil and Tylenol is the active substance: Advil contains ibuprofen and Tylenol contains acetaminophen. If Advil or Tylenol is not recognized from the image, ibuprofen or acetaminophen may have been recognized instead, so with her method the product can still be matched. She also worked on correcting the text orientation to handle the case where the product is not ideally positioned (upside down or rotated); however, she is getting some errors with it, so she is hoping to fix it by next week. Additionally, she trained a model using AlexNet for logo detection, but she hasn't tested it with real-world images yet. For next week, she is hoping to fix the text orientation error described above and improve the efficiency of her code, which is currently poor because she included many tests within it.

 

Jiyeon’s Status Report 11/12

For this week, I mainly worked on improving my text detection. After Monday's interim demo with the professors, I realized that lots of important features were missing from my part of the demo, so I spent the next few days improving the flaws. Though it is not complete, I added a temporary dictionary that contains product names and their descriptions. For instance, the main difference between Advil and Tylenol is the active substance: Advil contains ibuprofen and Tylenol contains acetaminophen. Because I may not always extract the brand name, I added some distinct words that are specific to each product so that the product can be recognized even if the brand name is not. I also added a list of words extracted from the image after text preprocessing has been done.
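A rough sketch of how this dictionary matching works (the dictionary contents and function name below are simplified placeholders, not my exact code):

```python
# Hypothetical sketch: match OCR-extracted words against brand names and distinctive product words.
PRODUCT_KEYWORDS = {
    "Advil": {"advil", "ibuprofen"},
    "Tylenol": {"tylenol", "acetaminophen"},
}

def match_product(extracted_words):
    """Return the first product whose brand name or distinctive words appear in the OCR output."""
    words = {w.lower() for w in extracted_words}
    for product, keywords in PRODUCT_KEYWORDS.items():
        if words & keywords:          # any overlap counts as a match
            return product
    return None
```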

I also worked on correcting the text orientation. For testing purposes, I am working with ideally positioned images now, but I know the user won't always place the object in an ideal position. The text may be upside down or rotated a bit depending on how the user places it, so I used Tesseract's image_to_osd function to check whether the text is correctly oriented. However, I am getting an error that says the resolution of the image is 0 dpi, and I could not figure it out yet, so I think I am going to write code of my own to rotate the images if no text is detected in the original image.
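This is roughly what the orientation check looks like; the explicit --dpi value is only my guess at a workaround for the 0 dpi error, not something I have verified yet, and the file name is a placeholder:

```python
import re
import pytesseract
from PIL import Image

img = Image.open("advil_back.jpg")

# image_to_osd reports the detected rotation; passing an explicit DPI may avoid
# the "resolution 0 dpi" complaint (assumption, not yet verified).
osd = pytesseract.image_to_osd(img, config="--dpi 300")
rotation = int(re.search(r"Rotate: (\d+)", osd).group(1))

if rotation:
    # Rotate to undo the detected orientation (sign/direction may need adjusting).
    img = img.rotate(-rotation, expand=True)
```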

I worked on logo detection a little as well. As the professor said, this isn't my priority yet, but I tested some of the code. I trained a model using AlexNet and saved the best model as a .pt file, but I haven't tested it with real-world images yet.
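A rough sketch of the fine-tuning setup, assuming torchvision's AlexNet with its last layer swapped out (the class count and file name are placeholders):

```python
import torch
from torchvision import models

num_logo_classes = 10  # placeholder: number of logo classes in the training set

# Start from ImageNet-pretrained AlexNet and replace the final classifier layer.
model = models.alexnet(pretrained=True)
model.classifier[6] = torch.nn.Linear(model.classifier[6].in_features, num_logo_classes)

# ... training loop: track validation accuracy and keep the best-performing weights ...

torch.save(model.state_dict(), "best_logo_alexnet.pt")  # the .pt file mentioned above
```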

For next week, I am hoping to improve the efficiency of my code. Because I am testing many things within my code, the efficiency is not great. The current code extracts text from the original image, the grayscale image, the blurred image, and the grayscale-plus-blurred image, and then matches the product against all of those texts. It takes up to ~18 seconds depending on the size of the image and the amount of text in it. For example, when I run my code on an image of the front of an Advil box, it takes 2.5 s to run everything. However, on an image of the back of the box, it takes 17.4 s to run everything and get the result. The main difference is the inconsistency of the image sizes I am using for testing and the amount of text in each image. I am planning to stop my code once a product is matched.
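The early-exit change would look roughly like this (using the keyword matcher sketched earlier; the blur kernel size is a placeholder):

```python
import cv2
import pytesseract

def identify_product(path):
    """Try OCR on progressively preprocessed variants and stop at the first product match."""
    original = cv2.imread(path)
    gray = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)
    variants = [
        original,                                 # no preprocessing
        gray,                                     # grayscale
        cv2.GaussianBlur(original, (5, 5), 0),    # blur
        cv2.GaussianBlur(gray, (5, 5), 0),        # grayscale + blur
    ]
    for variant in variants:
        words = pytesseract.image_to_string(variant).split()
        product = match_product(words)            # keyword matcher sketched earlier
        if product is not None:
            return product                        # early exit: skip the remaining variants
    return None
```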

Frankly, I think I am still a bit behind on the schedule, but once I get the text orientation and efficiency figured out, I will be in good shape relative to my own schedule.

Team Status Report 11/5

This week, Raymond finally got his parts, so he worked on building the hardware. He went to TechSpark to drill holes in the wood, cut the metal shafts, and learn how to use the hand press. He also researched how to export images from the Raspberry Pi. He is planning to get the motor turning and integrate the belt with the shaft.

Jiyeon worked on the image preprocessing part of text detection and on logo detection. She has tried several image processing methods, including grayscale, blur, and dilation. While working on text detection, she thought it would be better to integrate logo detection as well, so she is working on that too. She will continue working on both throughout the weekend and is hoping to start collecting more relevant logo images and improve the accuracy next week.

Jiyeon’s Status Report 11/5

For this week, I worked on the image preprocessing part of text detection and on logo detection. To improve the accuracy of the text detection, I tried several different image preprocessing methods, including grayscale, blur, and dilation. However, the accuracy was about the same, and it was worst when the image was only blurred. The figure below shows the blurred image (left) and the grayscale-plus-blur combined image (right).
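A rough sketch of the preprocessing variants I tried with OpenCV (the kernel sizes are placeholders, not the exact values I used):

```python
import cv2

img = cv2.imread("product.jpg")

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)      # grayscale
blurred = cv2.GaussianBlur(img, (5, 5), 0)        # blur (worst accuracy on its own)
gray_blur = cv2.GaussianBlur(gray, (5, 5), 0)     # grayscale + blur
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
dilated = cv2.dilate(gray, kernel, iterations=1)  # dilation
```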

Also, I realized that text detection wouldn't work for some products with minimal text and a logo in a non-standard font, like the Coca-Cola bottle below.

So I decided to add logo detection as well and use text detection as a supplement. I am planning on training YOLO with a logo dataset. I am using the FlickrLogo-47 dataset for now, but I will have to collect other logo images of the products we will be detecting in the future. I think I am a bit behind since I added a new feature (logo detection) to my part, but I am hoping to work through it over the weekend and present it during the interim demo next week. If I get the logo detection done over the weekend, I am hoping to start collecting the logo images I need that are not in the FlickrLogo-47 dataset, and improve the accuracy of text detection as well.

Mingrui Yang 11/5/2022 status report

So for this week, I went to TechSpark, drilled holes in the wood, cut the metal, and chatted with the TechSpark people about how to use the hand press. I opened all the hardware materials and looked up on some websites how to connect the code to the Raspberry Pi and how to import the pictures from the camera into the code base.
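A rough sketch of what pulling a picture from the camera into the code base might look like, assuming the picamera library (we have not committed to this library yet, and the output path is a placeholder):

```python
from time import sleep
from picamera import PiCamera

camera = PiCamera()
camera.start_preview()
sleep(2)                                  # let the sensor adjust exposure
camera.capture("/home/pi/capture.jpg")    # image file handed off to the CV pipeline
camera.stop_preview()
```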

We are on schedule. For next week, I plan to get the motor turning and then integrate the belt with the shaft so that it turns.

10/29 Team Status Report

This week, Rachana spent time looking at our review from the design report. There are some things we need to figure out, including which aspects of the UI we want to display. Currently, we want to auto-add items to the list and give the user the ability to trash an item if need be. We also want to display this list of items with their individual costs and the cumulative cost. She found a grocery POS app that can help with this.

For last week, Raymond was primarily working on adding new functionality to the design: adding a control unit to the controller. He again used the skills he learned in 18220 to find the correct motor driver, proofread the specifications of the sizes and material properties of various mechanical components, and checked the voltage and current ratings of different electrical components.

For next week, Raymond is planning to:

  1. Drill 8 holes in the wood.
  2. Cut the metal of the shaft.
  3. Cut the belt to make it fit into our design.
  4. Glue the two wood planks onto the wood base.
  5. Order a coupler to connect the motor and the shaft.

 

For this week, Jiyeon worked on testing some images using Tesseract for text detection and planned the web app. We are planning to modify a web app framework found online with the features we want to use. For text detection, Jiyeon was able to extract text from the images, but to use it with retail product images, she will need some image preprocessing before text detection so that the text can be distinguished from the background. She is going to work on that next week.

Jiyeon’s Status Report 10/29

For this week, I worked on testing some images using Tesseract for text detection and reviewed the feedback from our design report. Also, Rachana and I met on Friday to discuss our web app plan, and we are planning to modify a web app framework found online with the features we want to use. For text detection, I was able to extract text from the images, but to use it with retail product images, I will need some image preprocessing before text detection so that I can distinguish the text from a background with multiple colors. So I think I am going to work on that next week. Since the interim demo is in about a week, I am planning to put more work into the project next week.
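The basic Tesseract call I am testing with looks roughly like this (the file name is a placeholder):

```python
import pytesseract
from PIL import Image

img = Image.open("product_front.jpg")
text = pytesseract.image_to_string(img)   # raw OCR output, before any preprocessing
print(text)
```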