Jiyeon’s Status Report 11/19

This week, I finished text orientation and improved the efficiency and readability of my code. I initially used Tesseract’s image_to_osd function to correct the text orientation, but it gave me an error that the resolution of the image is 0 dpi, even though I made sure to test with an image with a resolution of 300 dpi. I couldn’t fix the error, so instead of using image_to_osd, I wrote a function to check the skewness and correct the image. Additionally, my other main concern was efficiency. Depending on the image, my code was running in between 2.5 and 17.4 seconds. This was mainly because of my code’s efficiency and the inconsistent resolution of the images I used for testing. This week, I worked on improving the efficiency and readability of my code by breaking it down into multiple helper functions and files, and by making sure the code stops as soon as a matching text is found in our product info dictionary. I also wrote a helper function to save our product info to a JSON file, which will be useful for the web app later. I was able to improve the best-case image from 2.5 s to 1.2 s, and the worst-case image from 17.4 s to 5.9 s.


This definitely still needs some improvement, since the user would not want to wait ~6 seconds for each product to be scanned. For next week, I am hoping to improve efficiency further with a consistent set of test images, and with the camera I got from Raymond, I will take some pictures of the items and test them in my code. I am also hoping to improve my image preprocessing and start integration with the Raspberry Pi and the web app. I think I am more on track than last week since I have improved some of the code.

Team Status Report 11/19

Jiyeon:

This week, Jiyeon finished text orientation and improved the efficiency and readability of her code. She initially got an error from Tesseract’s image_to_osd function, which corrects the text orientation, but she couldn’t fix it. So instead of using image_to_osd, she wrote a function to check the skewness and correct the image. Additionally, her other main concern was efficiency. This week, she worked on improving the efficiency and readability of her code by breaking it down into multiple helper functions and files, and by making sure the code stops as soon as a matching text is found in our product info dictionary. She also wrote a helper function to save our product info to a JSON file, which will be useful for the web app later. She made a good amount of progress on efficiency, but it definitely still needs improvement, since the user would not want to wait ~6 seconds for each product to be scanned. For next week, she is hoping to improve efficiency further with a consistent set of test images.

Raymond:

Last week, Raymond was primarily working on a couple of design challenges. What he has observed is that there are similarities between hardware and software engineering: in both, there are challenges along the way to finishing a project, and sometimes, when a challenge comes up, he has to find a way past it. In software, that might mean switching to different APIs, different library vendors, etc. Sometimes one function call provided by one party works and the others don’t, and often we don’t know why (probably we didn’t understand the input, the output, or the overall function well enough). A lot of the time this happens in math as well: when we don’t know the exact size, dimension, and type of a variable, or we don’t know the meaning of a certain symbol, its abstract and geometric meaning, not understanding it well enough often leads us to misuse it, forget to use it, not understand the derivations, etc.

What he learned last week is that in hardware, we also need some basic understanding of the laws of physics and the properties of the materials. He thinks that last week, most of the knowledge he utilized was from Physics I, Newtonian physics, in terms of friction, surface area, and torque — some basic applications of physics (not numerical, but high-level concepts).

Anyway, he got the belt working at half of its length. Now the problem is that our design is not able to carry much weight because there are hollow areas on the belt, but since this is a proof of concept, we will just put lightweight items on it. For next week, he plans to focus on the Raspberry Pi. We are on schedule.


Overall, Raymond and Jiyeon made great progress this week. We will tackle problems together next week using our knowledge base.

Mingrui Yang Status Report 11/19

Last week, I was primarily working on a couple of design challenges. What I have observed is that there are similarities between hardware and software engineering: in both, there are challenges along the way to finishing a project, and sometimes, when a challenge comes up, I have to find a way past it. In software, that might mean switching to different APIs, different library vendors, etc. Sometimes one function call provided by one party works and the others don’t, and often we don’t know why (probably we didn’t understand the input, the output, or the overall function well enough). A lot of the time this happens in math as well: when we don’t know the exact size, dimension, and type of a variable, or we don’t know the meaning of a certain symbol, its abstract and geometric meaning, not understanding it well enough often leads us to misuse it, forget to use it, not understand the derivations, etc.

What I learned last week is that in hardware, we also need some basic understanding of the laws of physics and the properties of the materials. I think that last week, most of the knowledge I utilized was from Physics I, Newtonian physics, in terms of friction, surface area, and torque — some basic applications of physics (not numerical, but high-level concepts).

Anyway, I got the belt working at half of its length. Now the problem is that our design is not able to carry much weight because there are hollow areas on the belt, but since this is a proof of concept, we will just put lightweight items on it. For next week, I plan to focus on the Raspberry Pi. We are on schedule.

Mingrui Yang Status 11/12

Okay, so this week I encountered some difficulties in the hardware implementation, whether that be the belt not being able to turn, the shaft being unable to move, or the shaft not fitting into the holes.

Yet I surmounted all the obstacles. What I learned is that if I have problems, I should first try to figure them out myself, then try some high-level ideas from the internet. If that doesn’t help, ask the people who specialize in hardware prototyping (the TA and the professor), and if that doesn’t help, ask the people who specialize in hardware manufacturing (the TechSpark staff).

Then my car battery died, and I was unable to shop for some hardware parts in the Pittsburgh area, so I am fixing my car battery and replacing it with a new one (which is also an obstacle for capstone). I categorize this as a substitutory obstacle.

I am on schedule. For next week, I have decided to test the leather belt to see if it would work. If it works, I plan to sew the leather belt together. I can’t believe there are sewing machines on campus: https://resources.ideate.cmu.edu/equipment/sewing-machines/. I want to see if I can get access to them somehow.

Rachana’s Weekly Status Report (5th November)

This week, I spent time looking at our review from the design report. There are some things we need to figure out, including which aspects of the UI we want to display. Currently, we want to just auto-add items to the list and give the user the ability to trash an item if need be. We also want to display this list of items with their individual costs and the cumulative cost. I found a grocery POS app that can help me with this; I need to customize this Django app to our needs. Furthermore, I have been expanding the CV system to different classes. I still want to improve the accuracy rate on the validation set. I think I am more on schedule than I was last week for my parts of the project.

Rachana’s Weekly Status Report (12th November)

This week, I spent time exporting the model to a file, and I had some errors with the model because I was setting pretrained to false. From the demo, I realized I was training my model for the coke class with both cans and bottles, but I was only testing it with a bottle, causing some train and test errors. I am able to display the number of records, and my predictions are almost right except for one. I am going to work a little more on fine-tuning the parameters of the model. I synced up the workflow to make predictions and output the items that it recognizes in the Django framework. I need to work on understanding the Raspberry Pi framework. I think I am a little behind schedule compared to last week for my part of the project.

Team Status Report 11/12

This week, Jiyeon mainly worked on improving her text detection. She added a temporary dictionary that contains the names of products and their descriptions so that a product can be matched even if the brand name is not extracted from the image. For instance, the main difference between Advil and Tylenol is the active ingredient: Advil contains ibuprofen and Tylenol contains acetaminophen. In case Advil or Tylenol has not been recognized from the image, ibuprofen or acetaminophen may have been recognized instead, so with her method, the product can still be matched. She also worked on correcting the text orientation to handle the case where the product is not ideally positioned (upside down or rotated); however, she is getting some errors with it, so she’s hoping to fix them by next week. Additionally, she trained a model using AlexNet for logo detection, but she hasn’t tested it with real-world images yet. For next week, she is hoping to fix the text orientation error described above and improve the efficiency of her code, which isn’t great because she included a lot of testing within the code.


Jiyeon’s Status Report 11/12

For this week, I mainly worked on improving my text detection. After Monday’s interim demo with the professors, I realized that a lot of important features were missing from my part of the demo, so I spent the next few days improving the flaws. Though it is not complete, I added a temporary dictionary that contains the names of products and their descriptions. For instance, the main difference between Advil and Tylenol is the active ingredient: Advil contains ibuprofen and Tylenol contains acetaminophen. Because I may not always extract the brand name, I added some distinct words that are specific to each product so that the product can be recognized even if the brand name is not. I also added a list of the words extracted from the image after text preprocessing has been done.
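A minimal sketch of what that keyword-based matching can look like (the dictionary entries and structure here are hypothetical illustrations, not the project's actual data):

```python
# Hypothetical entries: each product maps to distinctive keywords
# (brand name plus ingredient/description words), so any keyword hit
# identifies the product even when the brand name itself is missed.
PRODUCT_INFO = {
    "Advil":   {"advil", "ibuprofen"},
    "Tylenol": {"tylenol", "acetaminophen"},
}

def match_product(extracted_words):
    """Return the first product whose keywords overlap the OCR output,
    or None if nothing matches."""
    words = {w.lower() for w in extracted_words}
    for name, keywords in PRODUCT_INFO.items():
        if words & keywords:
            return name
    return None
```

With this shape, `match_product(["Contains", "Ibuprofen", "200mg"])` still resolves to Advil even though the brand name never appears in the OCR output.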

I also worked on correcting the text orientation. For testing purposes, I am testing with ideally positioned images now, but I know the user won’t always place the object in an ideal position. The text may be upside down or rotated a bit depending on how the user places it, so I used Tesseract’s image_to_osd function to check whether the text is correctly positioned. However, I am getting an error that says the resolution of the image is 0 dpi, and I could not figure it out yet, so I think I am going to write code on my own to rotate the images if text has not been detected in the original image.
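For what it’s worth, a commonly suggested workaround for the 0 dpi complaint is to tell Tesseract the resolution explicitly via its `--dpi` option instead of relying on the image metadata, which is often missing. A sketch (untested against this project’s images) might look like:

```python
import re

def parse_rotation(osd_text):
    """Pull the 'Rotate: N' value out of Tesseract's OSD output."""
    match = re.search(r"Rotate:\s*(\d+)", osd_text)
    return int(match.group(1)) if match else 0

def detect_rotation(path, dpi=300):
    """Run orientation detection with an explicit DPI, sidestepping the
    '0 dpi' error when the file carries no resolution metadata."""
    # Imported here so parse_rotation stays usable without Tesseract.
    import pytesseract
    from PIL import Image
    osd = pytesseract.image_to_osd(Image.open(path), config=f"--dpi {dpi}")
    return parse_rotation(osd)
```

The returned value (0, 90, 180, or 270) can then drive a simple image rotation before OCR.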

I worked on logo detection a little as well. As the professor said, this isn’t my priority yet, but I tested some of the code. I trained a model using AlexNet and saved the best model as a .pt file, but I haven’t tested it with a real-world image yet.

For next week, I am hoping to improve the efficiency of my code. Because I am testing around within my code, the efficiency is not great. The current code extracts text from the original image, the grayscale image, the blurred image, and the combined grayscale-and-blurred image, and then matches the product against all of those texts. It takes up to ~18 seconds depending on the size of the image and the amount of text in it. For example, when I run my code with an image of the front of an Advil package, it takes 2.5 seconds to run everything. However, with an image of the back, it takes 17.4 seconds to run everything and get the result. The main difference is the inconsistency of the image sizes I am using for testing and the amount of text in the image. I am planning to stop my code once a product is matched.
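The planned early exit could be structured like this sketch, where the OCR engine is passed in as a callable. In the real pipeline, `ocr` would be `pytesseract.image_to_string` and `match_product` the dictionary lookup; the variant list mirrors the four versions described above:

```python
def preprocessing_variants(image):
    """Build the four versions described above, cheapest first.
    (cv2 is imported lazily so first_match is usable without it.)"""
    import cv2
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return [image,
            gray,
            cv2.GaussianBlur(image, (5, 5), 0),
            cv2.GaussianBlur(gray, (5, 5), 0)]

def first_match(variants, ocr, match_product):
    """Run OCR on each variant in turn and return the first product
    match, skipping the remaining (expensive) OCR passes entirely."""
    for variant in variants:
        product = match_product(ocr(variant).split())
        if product is not None:
            return product
    return None
```

Since most front-of-package images match on the first or second variant, this trims the common case without giving up the heavier variants as a fallback.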

Frankly, I think I am still a bit behind schedule, but once I get the text orientation and efficiency figured out, I think I will be in pretty good shape on my own schedule.

Team Status Report 11/5

This week, Raymond finally got his parts, so he worked on building the hardware. He went to TechSpark to drill holes in the wood, cut the metal shafts, and learn how to use the hand press. He also researched how to export images from the Raspberry Pi. He is planning to get the motor turning and integrate the belt with the shaft.

Jiyeon worked on the image preprocessing part of text detection and on logo detection. She has tried several image processing methods, including grayscale, blur, and dilation. While working on the text detection, she thought it would be better to integrate the logo detection as well, so she is working on that too. She will continue working on these throughout the weekend and hopes to start collecting more relevant logo images and improving the accuracy next week.