Team Status Report 04/06/24

To start the week, we made good progress showcasing our project’s individual components (RPi integration, Arducam, scale, physical box design, website ML functionality, and SQL database access and retrieval) for the interim demo. Each component was demonstrated separately.

Validation is a more system-centric evaluation that requires us to analyze how well the final product matches our problem statement and proposed approach. The system-wide checks we plan to perform include the following:

    1. Image forwarding: How well does the RPi forward information to and from the MySQL database? With a 30-second overall latency requirement, image capture and forwarding must happen in a timely manner. This also means the first-level classification performed on the RPi should be light and efficient, since an exact match is not required at that stage of the design.
    2. Concurrency: The ACID properties and two-phase locking implemented in the web application’s views are important for safe database access and retrieval. From our coursework, we have learned that the database is the component most vulnerable to system failure. Beyond data sanitization and validation (which are separate concerns), we have to regulate access to elements in the database that are constantly being deleted and re-inserted. Particular edge cases include handling duplicates (eating the same item multiple times in a day) and ensuring the database generates a new entry for each object.
    3. Tesseract OCR: Accuracy issues persist with text recognition depending on image quality, so we have to perform more testing under different conditions for real-time processing scenarios. A more diverse set of images can be used to calibrate how text is best extracted, varying factors such as grayscale conversion, image cropping, font size, and background. By systematically testing Tesseract’s performance on these images, we aim to find the settings that yield the most accurate text extraction for reading food labels in this real-time application.
    4. Web App User Interaction: How can we ensure all of the user-facing features work as expected? We will test for performance issues to assess the web application’s responsiveness and scalability; bottlenecks such as slow queries and inadequate indexing become even more important to identify once data forwarding from the camera and scale readings is included. We will also evaluate usability, particularly navigation and the transition between scanning and using the application, to ensure users can move easily between the different functionality tabs.
    5. ML and Camera Integration: How accurate and fast will ML classification on the RPi with the ResNet-18 model be? We will measure this using a test dataset along with the RPi’s timing functions, and compare the results against classification performed in the website backend to determine which method is more accurate and efficient. We also need to test image quality with both a USB camera and an Arducam to determine the frame rate that achieves maximum classification accuracy.
    6. Proper Database Retrieval and Storage: How will data (usernames, caloric intake, inventory items) be stored in the database once scanned, and how can we properly display it on the front end? We have pages that graph caloric intake and display it at a daily granularity in the form of a list. We need to make sure items and information are stored based on the logged-in user to prevent security issues, and that users can only modify the database by scanning objects or removing them directly from the inventory list. We will test extensively to prevent SQL injection and cross-site scripting (XSS) attacks, and use the browser console, error messages, and database retrieval statements to detect irregularities and potential security concerns.
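As a rough harness for item 1’s 30-second end-to-end latency requirement, something like the sketch below can time each stage of the capture–classify–forward pipeline. The stage functions here are hypothetical stand-ins (with sleeps simulating work), not our actual implementation:

```python
import time

# Hypothetical stand-ins for the real pipeline stages: camera capture on
# the RPi, the light first-level classification, and upload toward the
# MySQL-backed server. The sleeps only simulate work.
def capture_image():
    time.sleep(0.01)
    return b"jpeg-bytes"

def classify(image):
    time.sleep(0.01)
    return "apple"

def forward_to_database(image):
    time.sleep(0.01)
    return True

LATENCY_BUDGET_S = 30.0  # end-to-end requirement from the design

def run_pipeline():
    """Run one capture->classify->forward cycle, timing each stage."""
    timings = {}
    start = time.monotonic()

    img = capture_image()
    timings["capture"] = time.monotonic() - start

    t = time.monotonic()
    classify(img)
    timings["classify"] = time.monotonic() - t

    t = time.monotonic()
    forward_to_database(img)
    timings["forward"] = time.monotonic() - t

    timings["total"] = time.monotonic() - start
    return timings

timings = run_pipeline()
print(timings)
assert timings["total"] < LATENCY_BUDGET_S
```

Logging per-stage timings like this would also tell us which stage to optimize first if the total ever approaches the budget.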
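For item 6, the standard defense against SQL injection is parameterized queries rather than string-built SQL; Django’s ORM (which our views use over MySQL) parameterizes automatically. The sqlite3 sketch below only illustrates the principle, using a hypothetical per-user inventory table:

```python
import sqlite3

# In-memory database with a hypothetical per-user inventory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (user TEXT, item TEXT, calories INTEGER)")
conn.execute("INSERT INTO inventory VALUES ('grace', 'apple', 95)")

# Malicious input that would dump every row if naively concatenated
# into the SQL string.
user_input = "grace' OR '1'='1"

# Parameterized query: the driver treats the value as data, not SQL,
# so the injection attempt simply matches no user.
rows = conn.execute(
    "SELECT item, calories FROM inventory WHERE user = ?", (user_input,)
).fetchall()
print(rows)  # [] -- injection neutralized

# The legitimate per-user lookup still works.
rows = conn.execute(
    "SELECT item, calories FROM inventory WHERE user = ?", ("grace",)
).fetchall()
print(rows)  # [('apple', 95)]
```

Filtering every query by the logged-in user in this parameterized style also covers the requirement that users only see and modify their own inventory.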

Summary of individual work
Surya continued his work on RPi and camera integration. Despite challenges with operating the RPi in a headless setup, including slow SSH connections and lag in camera performance, he successfully implemented a rudimentary algorithm that captures an image once classification confidence reaches 80 percent. His next steps are to improve camera performance with a monitor attached, integrate image capture with the database, and explore options for better classification performance. He thanks the 18500 teaching staff for their interim-demo feedback on the project’s hardware, particularly regarding camera performance and options.
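The capture routine described above can be sketched as a loop that keeps grabbing frames until the classifier’s confidence clears the 80 percent threshold. The `grab_frame` and `classify_frame` callables below are hypothetical stand-ins for the camera and the on-device model, not Surya’s actual code:

```python
CONFIDENCE_THRESHOLD = 0.80  # capture threshold from the report

def capture_until_confident(grab_frame, classify_frame, max_attempts=10):
    """Grab frames until classification confidence clears the threshold.

    grab_frame: callable returning a frame (camera stand-in).
    classify_frame: callable returning a (label, confidence) pair.
    Returns (frame, label, confidence), or None if the threshold is
    never reached within max_attempts.
    """
    for _ in range(max_attempts):
        frame = grab_frame()
        label, confidence = classify_frame(frame)
        if confidence >= CONFIDENCE_THRESHOLD:
            return frame, label, confidence
    return None

# Deterministic stubs standing in for the camera and ResNet-18 model:
# confidence rises across successive frames.
_confidences = iter([0.41, 0.65, 0.92])
result = capture_until_confident(
    grab_frame=lambda: "frame",
    classify_frame=lambda f: ("banana", next(_confidences)),
)
print(result)  # ('frame', 'banana', 0.92)
```

Bounding the loop with `max_attempts` keeps the routine from spinning forever on a frame the model can never classify confidently.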

Grace added further functional pages to the web application in accordance with feedback, improving the general user experience. These pages include, but are not limited to, a chat function for users to discuss fitness-related goals, a calorie-consumption progress graph showing users’ daily improvement, a page for adding caloric entries by date and displaying them by day (eventually connected to the hardware classification components), and a page allowing users to set a daily calorie limit to measure necessary improvements. She plans to continue working on the frontend to ensure the pages match how users can use the application efficiently and beneficially, and to test every page that accepts user modifications or uploads that could be vulnerable to security attacks.

Steven continued working on his ML features and on integrating them into the web application with Grace. Tesseract OCR components were added to views.py to extract food-label text from the user’s uploaded images. Additional work is needed to improve accuracy, but converting images to grayscale and enlarging the originals improved text extraction results during individual testing. The next step is to incorporate Surya’s camera subcomponent for capturing these OCR readings and fully integrate it into a web application page. Steven also added a new model to store database entries where users input food items and their respective caloric values. Further improvements will add individual food inventories for the currently logged-in user.
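The grayscale-and-enlarge preprocessing described above can be sketched with Pillow as below. The 2x scale factor is an assumption to be tuned against the test images, and the pytesseract call is left as a comment since it requires a local Tesseract install:

```python
from PIL import Image

SCALE = 2  # assumed enlargement factor; tune against the test images

def preprocess_for_ocr(image: Image.Image) -> Image.Image:
    """Convert to grayscale and enlarge, mirroring the report's steps."""
    gray = image.convert("L")  # grayscale conversion
    w, h = gray.size
    # LANCZOS resampling keeps label-text edges sharp when upscaling.
    return gray.resize((w * SCALE, h * SCALE), Image.LANCZOS)

# Example with a synthetic image; in the app this is the user's upload.
label_photo = Image.new("RGB", (120, 80), "white")
processed = preprocess_for_ocr(label_photo)
print(processed.mode, processed.size)  # L (240, 160)

# With Tesseract installed, extraction would then be:
# import pytesseract
# text = pytesseract.image_to_string(processed)
```

Keeping the preprocessing in one function also makes it easy to A/B other factors from the validation plan (cropping, background handling) against the same image set.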
