Ran’s Status Report for 4/27/24

  • What did you personally accomplish this week on the project? 

I helped prepare our final presentation slides earlier this week. Then I finished the web application development and, together with Sejal, got the CV and ML functionalities working. I was also responsible for setting up AWS cloud deployment. However, I encountered some difficulties because a key video input function (the browser's getUserMedia camera capture) is blocked under plain HTTP; browsers only allow camera access in secure HTTPS contexts or on localhost. I am still searching for solutions.
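
A likely fix, which I am still evaluating, is to serve the site over HTTPS so the browser permits camera access. Below is a minimal sketch of the Django settings that would be involved, assuming TLS is terminated by a proxy or load balancer in front of the app (the domain name is a placeholder, not our real deployment):

    # settings.py -- sketch only; domain and proxy setup are assumptions
    ALLOWED_HOSTS = ["our-app.example.com"]
    # Trust the proxy's forwarded-protocol header so Django treats the
    # request as having arrived over HTTPS.
    SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")
    SECURE_SSL_REDIRECT = True    # redirect plain-HTTP requests to HTTPS
    CSRF_TRUSTED_ORIGINS = ["https://our-app.example.com"]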

  • Is your progress on schedule or behind?

Yes, my progress is on schedule. Given that we are entering the final week of the project, I will make sure I keep up with our team's pace and help my teammates out if difficulties arise.

  • What deliverables do you hope to complete in the next week?

Mobile-phone-compatible deployment

User experience survey distribution and collection

Final integration

Final poster design

Final report writing

Ran’s Status Report for 4/20/24

  • What did you personally accomplish this week on the project? 

In the first week, I discovered a mistake in my earlier way of integrating the CoreML model with the iOS app. I had inserted the model in the wrong function, so the model was never called and produced null results. After I moved it to the target function, multiple errors were raised and the integration was largely unsuccessful. Moreover, since MediaPipe only provides a hand landmarker package for iOS development but not a pose landmarker, my search for alternative packages, including QuickPose and the Xcode Vision library, did not indicate strong feasibility overall. So, after meeting with the professors and TAs and holding an internal group discussion, we officially decided to change our integration plan to a web application built on the Django framework.

Accordingly, I was responsible for the overall codebase setup, the JavaScript hand and pose real-time recognition functions, and the data transmission between the frontend and backend. Over five days starting this Monday, I set up the Django framework, converted the original MediaPipe CV module from Python to equivalent JavaScript, and enabled frontend-backend request and response.
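
For reference, here is a minimal sketch of the backend half of that exchange (the view name, payload shape, and classify hook are illustrative, not our exact code): the frontend posts detected landmark coordinates as JSON and receives the predicted sign back.

    # views.py -- illustrative sketch; names are placeholders
    import json

    from django.http import JsonResponse
    from django.views.decorators.csrf import csrf_exempt

    @csrf_exempt  # sketch only; real code should send the CSRF token instead
    def predict(request):
        payload = json.loads(request.body)       # {"landmarks": [[x, y, z], ...]}
        label = classify(payload["landmarks"])   # hypothetical hook into the ML module
        return JsonResponse({"prediction": label})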

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

Yes, my progress is on schedule. Given that we are entering the final weeks of the project, I still need to speed up as much as possible to leave time for further integration.

  • What deliverables do you hope to complete in the next week?

User experience survey distribution and collection

Improving accuracy

Improving UI

  • Additional question for new module learning

To accomplish our gesture and pose detection feature, we relied heavily on the MediaPipe library. While MediaPipe is a powerful library for generating landmarks on image, video, or live-stream input, it took us a couple of weeks to study its official Google Developers site, read its documentation, and follow its tutorials to build the dependencies and experiment with the example code. In addition, we watched YouTube videos to learn the overall pipeline for implementing the CV module for gesture and pose recognition.

Ran’s Status Report for 4/06/24

  • What did you personally accomplish this week on the project? 

I have mostly finished deploying the ML model in the mobile application with CoreML. While the current model shows predictions on the phone screen, it lacks accuracy compared to what we obtained on a laptop. The reason is most likely that I have not extracted pose landmarks in the mobile app, because MediaPipe does not offer a pose package for iOS. Among the alternative packages is QuickPose, which I am now experimenting with and plan to integrate next week.

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

Yes, my progress is on schedule. Given that we are entering the final weeks of the project, I still need to speed up as much as possible to leave time for validation.

  • What deliverables do you hope to complete in the next week?

CoreML full functionality

Translation accuracy and latency testing

Mobile app to hardware connection

Ran’s Status Report for 3/30/24

  • What did you personally accomplish this week on the project? 

Following the plan to migrate the CV and ML modules to a local iOS app, I was mainly responsible for getting the Swift version of the code to work this week. I successfully got the mobile app running, showing 21 landmarks per hand as expected. However, I am still debugging the CoreML interface that should connect the ML model we trained with the real-time MediaPipe landmarks.
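
For context, the Python side of this integration is the model conversion itself. A minimal coremltools sketch, assuming a tf.keras classifier over a flat 63-float landmark vector (the file names and input shape are assumptions, not our exact setup):

    # Convert the trained gesture classifier to Core ML (sketch)
    import coremltools as ct
    from tensorflow import keras

    keras_model = keras.models.load_model("gesture_model.h5")  # placeholder path
    mlmodel = ct.convert(
        keras_model,
        inputs=[ct.TensorType(shape=(1, 63))],  # 21 landmarks x (x, y, z)
        convert_to="neuralnetwork",             # emits a .mlmodel file
    )
    mlmodel.save("SignClassifier.mlmodel")      # then add to the Xcode project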

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

I devoted quite a lot of time to my task this week, but I am still a bit behind schedule. Since the milestone and project deadlines are approaching, I will make sure to seek help if I get really stuck.

  • What deliverables do you hope to complete in the next week?

CoreML integration

ML latency and accuracy testing

Ran’s Status Report for 3/23/24

  • What did you personally accomplish this week on the project? 

This week we decided to make a change to our implementation plan. Instead of incorporating cloud deployment, we switched to local processing on the phone. Since MediaPipe offers an interface with Swift, I started to set up the appropriate environment and migrate our original code for the CV module. Although I initially encountered integration issues, I managed to get the application compiled and running on my phone. Currently it captures video and shows landmarks in real time on the screen. The next task is to convert the ML module with CoreML, but all our team members happen to be busy with other courses and interviews, so we will invest in-person, collaborative working time next week to keep ourselves on schedule.

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

Since some of the tasks have changed, I am currently behind schedule. Next week, I will allocate at least 6 hours outside regular class time to the remaining work, partly individually and partly with my teammates.

  • What deliverables do you hope to complete in the next week?

iOS-based CV module refinement

CoreML integration

Ran’s Status Report for 3/16/24

  • What did you personally accomplish this week on the project? 

After our meeting with Professor Savvides and Neha on Monday, I explored the mobile app stream transmission resources they shared and experimented with several methods, including ffmpeg, Apple's HTTP Live Streaming (HLS), and some SDKs/open-source libraries. Ultimately, I found that SwiftVideo might be a suitable package to assist mobile (local) to cloud server video transmission. Meanwhile, my teammates suggested moving the CV and ML modules entirely onto the iPhone processor by implementing the MediaPipe features in Objective-C and using CoreML for ML prediction. At the current stage, I cannot tell which method will produce a better outcome, so I will run the two tasks in parallel and decide which one to adopt by the middle of next week. Moreover, I helped test and debug the ML module for dynamic signing.

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is mostly on schedule. I plan to devote more time to writing code for 1) the cloud transmission API and 2) porting the CV module from Python to Objective-C.

  • What deliverables do you hope to complete in the next week?

Mobile app development

Ran’s Status Report for 3/9/24

  • What did you personally accomplish this week on the project? 

I was mainly responsible for the iOS application programming and cloud environment setup this week. I improved the UI of the mobile app and added focus and saving features. After the user presses the stop-recording button, the video is automatically saved in .mov format (after obtaining the user's permission to access the local photo album). This feature could easily be integrated with cloud deployment, where the movie file would instead be stored in an S3 bucket. However, I have not implemented real-time transmission to the cloud, which could be a difficulty for future work. I also finished the design requirements, test & validation, and other subsections of the design review report.
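
For the cloud side, the upload itself is straightforward once the .mov file exists; below is a minimal boto3 sketch (the bucket and key names are placeholders):

    # Upload a recorded clip to S3 (sketch; names are placeholders)
    import boto3

    s3 = boto3.client("s3")
    s3.upload_file(
        "recording.mov",           # local file saved after "stop recording"
        "our-project-videos",      # S3 bucket name
        "uploads/recording.mov",   # object key within the bucket
    )

The harder part, which this sketch does not address, is streaming frames in real time rather than uploading a finished file.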

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is mostly on schedule. I plan to devote more time to CV-ML integration, as there seem to be accuracy issues in dynamic signing prediction. Moreover, I will spend more time researching cloud transmission technologies and working on the phone application code.

  • What deliverables do you hope to complete in the next week?

Cloud deployment with real-time video transmission

CV and ML integration

Ran’s Status Report for 2/24/24

  • What did you personally accomplish this week on the project? 

I was the speaker for the design review presentation, so I spent some time reviewing and practicing for it during the first couple of days of last week. Then I started setting up the backend codebase in Swift. Not being familiar with Xcode or the Swift language, I watched a few online tutorials to get started. I encountered various issues while implementing the actual app code, including basic syntax errors, program termination caused by the simulator's failure to detect a camera, and permission inconsistencies when I tried to export the app to my phone for testing. Those issues took me a while to fix, but thanks to help from my teammates, I was able to create a camera app that supports both photo and video capture. Although the current app is essentially a basic version of the iPhone's default camera app, it can be integrated with our own app.

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is mostly on schedule. However, I have some work that needs to be integrated with my teammates' on Monday (mainly the CV and ML phases). If the integration outcome fails to meet our expectations, we might devote some time over spring break to it. Moreover, although I completed my assigned task of setting up the mobile app backend, I anticipate that the upcoming workload for the mobile app might turn out to be heavier than we thought. Consequently, I will spend more time over spring break working on it.

  • What deliverables do you hope to complete in the next week?

Continue with mobile app backend programming

CV and ML integration

Recognition accuracy testing

Design review report writing


Ran’s Status Report for 2/17/24

  • What did you personally accomplish this week on the project?

After getting MediaPipe hand recognition working on loaded video, I continued working on enabling live video input and detecting dynamic poses in addition to hands. Notably, with guidance from Professor Savvides and Neha, I was able to answer the frame-processing-rate question from last week: the likely cause was that a loaded mp4 video plays at a frame rate too high for OpenCV to keep up with. This issue turned out to be irrelevant to our implementation plan, however, since our use case only involves a live feed from a webcam or phone camera. I added a live frame-rate monitoring function, which quantified our recognition frame rate at roughly 10-15 fps under real-time dynamic capture conditions. Thanks to Sejal's first working ML model, at this point I completed my assigned task of static alphabetical gesture recognition. Our team then decided to also detect the movement of the user's upper body, including arm and shoulder poses, as well as the intricate gestures of both hands. Accordingly, I experimented with the MediaPipe holistic model and successfully implemented this recognition feature (as shown in the screenshot below; code in GitHub). Lastly, as the presenter for the design review presentation, I prepared the slides and rehearsed the speech.
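
For reference, here is a condensed sketch of the holistic recognition loop with the live frame-rate monitor (simplified; details differ from the code in GitHub):

    # MediaPipe Holistic on a live webcam feed with an FPS counter (sketch)
    import time
    import cv2
    import mediapipe as mp

    holistic = mp.solutions.holistic.Holistic()
    drawing = mp.solutions.drawing_utils
    cap = cv2.VideoCapture(0)

    prev = time.time()
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV captures BGR
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        for lm in (results.pose_landmarks, results.left_hand_landmarks,
                   results.right_hand_landmarks):
            if lm:
                drawing.draw_landmarks(frame, lm)
        now = time.time()
        fps = 1.0 / (now - prev)   # live frame-rate monitor
        prev = now
        cv2.putText(frame, f"{fps:.1f} FPS", (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("holistic", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()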

[Screenshot: MediaPipe holistic recognition showing hand and upper-body landmarks]

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is on schedule.

  • What deliverables do you hope to complete in the next week?

Boost CV processing

Detect webcam feed-in of static alphabetical gestures

Design presentation slides and speech

Ran’s Status Report for 2/10/24

  • What did you personally accomplish this week on the project? 

After helping finish the proposal slides, I worked on getting OpenCV and MediaPipe set up on my laptop. I first experimented with sampling an mp4 video into frames stored as images in a local folder, and it worked. Later, after solving some compatibility issues when initializing MediaPipe, I successfully used MediaPipe's hand detection feature to generate landmarks dynamically on the video input. These landmarks, as we expected, exclusively cover the person's hands, marking the fingers and knuckles. This accomplishment marked the completion of my assigned video-processing task. However, I noticed that incorporating MediaPipe's hand detection model slowed down the frame processing rate, so I will continue researching ways to speed up the process.
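
In essence, the processing loop looks like the sketch below (simplified; the video path is a placeholder):

    # MediaPipe hand landmarks over a loaded video, frame by frame (sketch)
    import cv2
    import mediapipe as mp

    hands = mp.solutions.hands.Hands()
    drawing = mp.solutions.drawing_utils
    cap = cv2.VideoCapture("input.mp4")   # placeholder file name

    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV captures BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        for hand in results.multi_hand_landmarks or []:
            drawing.draw_landmarks(frame, hand,
                                   mp.solutions.hands.HAND_CONNECTIONS)
        cv2.imshow("hands", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()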

[Screenshot: hand detection with video feed]

  • Is your progress on schedule or behind? If you are behind, what actions will be taken to catch up to the project schedule?

My progress is on schedule.

  • What deliverables do you hope to complete in the next week?

Boost CV processing

Detect webcam feed-in of static alphabetical gestures

Design presentation slides and speech