Team Status Report for 4/6

Risks

One large risk our group currently faces is that Erin is dealing with a number of issues with the Jetson. This is a major blocker for the entire end-to-end system, as we cannot demonstrate whether the dirt tracking in the AR application works properly while the Jetson subsystem is offline. Without functioning dirt detection and the BLE connection, the AR system lacks the data it needs to determine whether the flooring is clean or dirty, and we have no way of validating whether our transformation of 3D data points on the AR side is accurate. Moreover, Erin is investigating whether it is even possible to speed up the Bluetooth data transmission. Currently, an async sleep of around ten seconds appears necessary to preserve functionality. This, along with the BLE data transmission limit, may force us to readjust our use-case requirements.

Nathalie and Harshul have been working on tracking the vacuum head in AR space and on getting the coordinate transformation correct. While the coordinate transformations depend on the Jetson subsystem (as mentioned above), the vacuum head tracking does not, and we have made significant progress on that front.

Nathalie has also been working on mounting the phone to the physical vacuum. We purchased a phone stand rather than designing our own mounting system, which saved us time. However, the angle the stand can accommodate may not be enough for the iPhone to get a satisfactory read, and this is something we plan to test more extensively so we can determine the best orientation and mounting process for the iPhone.

System Validation:

Drawing from our design report and incorporating Professor Kim's feedback, which outlined an end-to-end validation test he would like to see from our system, below is the test plan for formalizing and carrying out this test.

The goal is to have every subsystem operational and test connectivity and integration between each subsystem.

Subsystems:

  1. Bluetooth (Jetson)
  2. Dirt Detection (Jetson)
  3. Plane Projection + Image Detection (ARKit)
  4. Plane Detection + UI
  5. Bluetooth (ARKit)
  6. Time:position queue (ARKit)

With the initial room mapping complete and the plane frozen, place the phone into the mount and start drawing to track the vacuum position. Verify that the image is detected, that the drawn line is behind the vacuum, and that the queue is being populated with time:position points. (Tests: 3, 4, 5, 6)

Place a large, visible object behind the vacuum head, in view of the active illumination devices and the Jetson camera. Verify that the dirt detection script categorizes the image as “dirty”, validate that this message is sent from the Jetson to the iPhone and received as intended, and then verify that the AR application highlights the proper portion of flooring (the portion containing the large object). (Tests: 1, 2)
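For reference, the message the iPhone expects in this test is the simple timestamp-plus-flag payload described in our 3/30 reports. A minimal sketch of decoding it on the Swift side, assuming the field names match that agreed data model, could look like:

```swift
import Foundation

/// Data model sent from the Jetson over BLE, e.g. {"time": 1712345678, "dirty": 1}.
struct DirtMessage: Codable {
    let time: TimeInterval   // UNIX timestamp of the frame the Jetson classified
    let dirty: Int           // 1 if dirt was detected, 0 otherwise
}

/// Decode a received BLE payload into a DirtMessage, or nil if it is malformed.
func decodeDirtMessage(from data: Data) -> DirtMessage? {
    try? JSONDecoder().decode(DirtMessage.self, from: data)
}
```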

Schedule

Our schedule has been unexpectedly delayed by the Jetson malfunctions this week, which have hindered progress on the dirt detection front. Erin has been especially involved with this, and we are hoping to have it reliably resolved soon so that she can instead focus her energy on reducing the latency of the Bluetooth communication. Nathalie and Harshul have been making steady progress on the AR front, but it is crucial for each of our subsystems to have polished functionality so that we see more integration progress, especially with the hardware. We are mounting the Jetson this week(end) to measure the constant translational difference so we can add it to our code and run accompanying tests to ensure maximal precision. A challenge has been our differing free times during the day, but since we are working on integration testing between subsystems, it is important that we all meet together and with the hardware components. To mitigate this, we set aside chunks of time on our calendars allotted to specific integration tests.

Nathalie’s Status Report for 4/6

I spent this week combining multiple features of the augmented reality component, specifically freezing the plane and tracking coverage metrics. In addition, I did initial scoping, researching and solidifying approaches for object detection, where picking a reference object at a fixed distance from the rear Jetson camera would allow us to locate the world coordinates of the Jetson camera at any given point. I performed initial testing of our floor mapping technology so that I could plan how to verify and validate our augmented reality subsystem going forward, making sure that we are able to track coverage and integrate our individual functionalities without compromising on latency.

Floor mapping with tracking based on camera middle point and present objects in a non-rectangular space

Integrating with physical components

As I have been working on the augmented reality app, I am familiar with how it works, and I got the opportunity to experiment with the placement of our components because we finally received the phone mount that we ordered. I spent the early parts of this week playing with object detection in order to orient a specific object within world coordinates, and I needed some physical measurements to inform the approach I am going to take for object detection on the AR front. Essentially, Harshul's and my goal (and what we are currently working on) is to detect a specific object in the space which serves as a reference point on the front of the vacuum, from which we can map the constant translational difference from the front of the vacuum to the back, where the Jetson camera sits. Initially, I had thought (and we expected) that we would mount the phone on the actual rod of the vacuum. When I actually assembled the components and put the mount there, with the phone and AR app mapping the floor, I realized it was too close to the ground, so it wouldn't provide the user-friendly aerial view we had initially envisioned. From initial validation tests, the floor mapping technology works best when it has the most perspective, as it is able to interpret the space around it with context and meaning. Therefore, any rod positioning was not ideal due to the vacuum compartment; initially Erin was holding the vacuum for dirt detection functionality, so I hadn't realized just how big the compartment actually is. Another problem was that mounting the phone so low would put it out of the field of view of the user, which essentially renders the whole subsystem useless.

When experimenting with positioning, I then tried the phone mount on the handle. I had initially hesitated to do this because I didn't want the hand opening to become too small for someone to hold, but testing it out showed that this wasn't an issue. In fact, it was the perfect visual aid: within the user's field of vision without being too obstructive.

Ideal positioning of the phone mount on the vacuum. Compatible with multiple devices for maximal accessibility.

Object Detection

Handling the physical components led me to better understand what sorts of software challenges we might be dealing with. Due to the change in the height of the phone and mount, an unexpected challenge I realized is that we will need to detect objects at a greater distance than initially thought. We are looking at what sort of object/shape is distinct at a height of roughly 1.2 m, trying to find a compromise between a smaller object (for user friendliness and the least obstruction possible) and maintaining accuracy in our object detection models. In the video below, I detected a rectangle from a 2 m distance (further than we need), serving as a proof of concept that we can detect shapes/objects at distances greater than this project even requires. When watching it, you can observe the changing colors of the rectangle on the floor, which is the AR view modifying/adding art to the rectangle that it detects. Since the object is far away you might need to look closely, but you can see how the colors change, indicating that the application recognizes the rectangle.

[VIDEO] Demo example of object detection at a distance
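As we firm up this reference-object approach, one likely building block is ARKit's built-in image detection. The sketch below shows roughly how a world-tracking configuration could be set up to detect a known marker; the resource group name "VacuumMarkers" is a placeholder I made up, not something already in our project.

```swift
import ARKit

/// Sketch: configure ARKit to detect a known reference image (e.g. a marker on the
/// front of the vacuum) so its world position can anchor the constant translational
/// offset back to the Jetson camera.
func makeTrackingConfiguration() -> ARWorldTrackingConfiguration {
    let configuration = ARWorldTrackingConfiguration()
    configuration.planeDetection = [.horizontal]

    // "VacuumMarkers" is an assumed AR Resource Group in the asset catalog.
    if let markers = ARReferenceImage.referenceImages(inGroupNamed: "VacuumMarkers", bundle: nil) {
        configuration.detectionImages = markers
        configuration.maximumNumberOfTrackedImages = 1
    }
    return configuration
}
```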

Verification + Validation

In terms of subsystem testing, Harshul and I worked on designing tests for the AR side of our system. We designed our tests together, but decided to each act as project manager for different subsystems within the AR component. Specifically, I'm responsible for testing and validating the floor mapping classification, checking how the frozen map area corresponds to the real-world floor.

In a purely rectangular space, I am going to test the map's ability to cover the real area, including the corners. Specifically, I am going to test the accuracy of the mapping while changing the degrees of freedom the camera has access to. The use-case requirement I am testing is:

Accuracy in initial mapping boundary detection ±10 cm  (per our design report + presentations)

Measurements & Calculations

What I'm measuring is the mapped area (the mesh area is a property of the map itself and can be determined programmatically through ARKit) versus the area of the real floor, which I'll measure manually with a width x height calculation. I'll use the percentage difference between the two as a measure of accuracy, and perform calculations to make sure that the mapped area falls within ±10 cm of the outer borders. Specifically, I'm going to run three mapping tests and perform these calculations in each scenario. I will map the floor by pointing the camera for 5 seconds in each of the following ways:

  1. Parallel to the floor
  2. Perpendicular to the floor with 90º horizontal rotation allowed
  3. Perpendicular to the floor with 180º horizontal rotation

As such, I will have represented many of the user-mapping scenarios and will perform the accompanying calculations for each, making sure that the result fits within our use-case requirement; a sketch of how that check could be computed is included below.
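As a rough illustration of the comparison (the helper below uses the plane anchor's rectangular extent as a stand-in for the mesh area, and the function name and tolerance handling are my own assumptions rather than finalized code):

```swift
import ARKit

/// Compare the ARKit-mapped floor dimensions against a manually measured
/// ground truth (width x depth, in meters).
func evaluateMapping(planeAnchor: ARPlaneAnchor,
                     measuredWidth: Float,
                     measuredDepth: Float) -> (percentDifference: Float, withinTolerance: Bool) {
    // Treat the plane anchor's estimated extent as the mapped bounds.
    // (The finer-grained mesh area could instead be summed from ARPlaneGeometry triangles.)
    let mappedWidth = planeAnchor.extent.x
    let mappedDepth = planeAnchor.extent.z

    let mappedArea = mappedWidth * mappedDepth
    let measuredArea = measuredWidth * measuredDepth

    // Percentage difference between mapped and measured area.
    let percentDifference = abs(mappedArea - measuredArea) / measuredArea * 100

    // Use-case requirement: each boundary should land within ±10 cm of ground truth.
    let tolerance: Float = 0.10
    let withinTolerance = abs(mappedWidth - measuredWidth) <= tolerance
        && abs(mappedDepth - measuredDepth) <= tolerance

    return (percentDifference, withinTolerance)
}
```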

Schedule & Next Steps

At the beginning of this week, on the day of demos, our Jetson broke. Erin has been dealing with this issue extensively, and it has been an unexpected challenge that we have been trying really hard to work through: reflashing the OS, starting everything over again, and reinstalling absolutely everything, since we have been having issues installing initial packages due to all the dependencies. The broken Jetson put us behind schedule because I was not able to test the latency of the Bluetooth communication from the Jetson, so that part got pushed to this week; this was unexpected, but it is our #1 priority going forward. While continuing to improve the object detection technology with Harshul, I will also be working on two main aspects of our tracking system: (1) making sure that we cannot trace outside of the plane, and (2) enlarging the area of the drawn lines, whether by increasing the radius or by making the trail rectangular by adding planes instead. We don't have a lot of slack time left, so we need to be constantly making progress to allow for proper integration tests and validation as we look toward our final demo.

Erin’s Status Report for 3/30

This week, I worked primarily on trying to get our Jetson hardware components configured so that they could run the software that we have been developing. Per our design, the Jetson is meant to run a Python script with a continuous video feed coming from the camera which is attached to the device. The Python script’s input is the aforementioned video feed from the Jetson camera, and the output is a timestamp along with a single boolean, detailing whether dirt has been detected on the image. The defined inputs and outputs of this file have changed since last week; originally I had planned to output a list of coordinates where dirt has been detected, but upon additional thought, mapping the coordinates from the Jetson camera may be too difficult a task, let alone the specific pixels highlighted by the dirt detection script. Rather, we have opted to simply send a single boolean, and narrow the window of the image where the dirt detection script is concerned. This decision was made for two reasons: 1) mapping specific pixels from the Jetson to the AR application in Swift may not be feasible with the limited resources we have, and 2) there is a range for which the active illumination works best, and where the camera is able to collect the cleanest images. I have decided to focus on that region of the camera’s input when processing the images through the dirt detection script.

Getting the Jetson camera fully set up was not as seamless as it had originally seemed. I did not have any prior trouble with the camera until recently, when I started seeing errors in the Jetson’s terminal claiming that there was no camera connected. In addition, the device failed to show up when I listed the camera sources, and if I tried to run any command (or scripts) which queried from the camera’s input, the output would declare the image’s width and height dimensions were both zero. This confused me, as this issue had not previously presented itself in any earlier testing I had done. After some digging, I found out that this problem typically arises when the Jetson is booted up without the camera inserted. In other words, in order for the CSI camera to be properly read as a camera input, it must be connected before the Jetson has been turned on. If the Jetson is unable to detect the camera, a restart may be necessary.

Another roadblock I ran into this week was trying to develop on the Jetson without an external monitor. I have not been developing directly on the Jetson device, since it has not been strictly necessary for the dirt detection algorithms to work. However, I am currently trying to get the data serialization to work, which requires extensive interaction with the Jetson. Since I have been moving around, carrying a monitor with me is simply impractical. As such, I have been working with the Jetson connected to my laptop, rather than to an external monitor with a USB mouse and keyboard. I used the `sudo screen` command in order to see the terminal of the Jetson, which is technically enough to get our project working, but I encountered many setbacks. For one, I was unable to view image outputs via the command line. When I was on campus, the process of getting WiFi set up on the Jetson was also incredibly long and annoying, since I only had access to the command line. I ended up using the command line tools from `nmcli` to connect to CMU-SECURE, and only then was I able to fetch the necessary files from GitHub. Since getting back home, I have been able to get a lot more done, since I have access to the GUI.

I am currently working on trying to get the Jetson to connect to an iPhone via a Bluetooth connection. To start, getting Bluetooth set up on the Jetson was honestly a bit of a pain. We are using a Kinivo BT-400 dongle, which is compatible with Linux. However, the device was not locatable by the Jetson when I first tried plugging it in, and there were continued issues with Bluetooth setup. Eventually, I found out that there were issues with the drivers, and I had to completely wipe and restore the hardware configuration files on the Jetson. The Bluetooth dongle seems to have started working after the driver update and a restart. I have also found a command (`bluetooth-sendto --device=[MAC_ADDRESS] file_path`) which can be used to send files from the Jetson to another device. I have already written a bash script which runs this command, but sadly, it may not be usable. Apple seems to have placed certain security measures on their devices, and I am not sure that I will find a way to circumvent those within the remaining time we have in the semester (if at all). An alternative option which Apple does allow is BLE (Bluetooth Low Energy), which is exposed to Swift apps through the CoreBluetooth framework. The next step for me is to create a dummy app which uses the CoreBluetooth framework and show that I am able to receive data from the Jetson. Note that this communication does not have to be two-way; the Jetson does not need to receive any data from the iPhone. The iPhone simply needs to be able to read the serialized data from the Python script. If I am unable to get the Bluetooth connection established, the worst-case plan is to have the Jetson continuously push data to either some website or even a GitHub repository, which can then be read by the AR application. Obviously, doing this would incur higher delays and is not scalable, but it is a last resort. Regardless, the Bluetooth connection process is something I am still working on, and it is my priority for the foreseeable future.
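As a starting point for that dummy app, a minimal central that scans for the Jetson's advertised service and subscribes to a notify characteristic might look like the sketch below. The service and characteristic UUIDs are placeholders; the real values depend on how the BLE peripheral on the Jetson ends up being configured.

```swift
import CoreBluetooth

final class JetsonReceiver: NSObject, CBCentralManagerDelegate, CBPeripheralDelegate {
    // Placeholder UUIDs -- these must match whatever the Jetson peripheral advertises.
    private let serviceUUID = CBUUID(string: "FFE0")
    private let characteristicUUID = CBUUID(string: "FFE1")

    private var central: CBCentralManager!
    private var jetson: CBPeripheral?

    override init() {
        super.init()
        central = CBCentralManager(delegate: self, queue: nil)
    }

    func centralManagerDidUpdateState(_ central: CBCentralManager) {
        // Only scan once Bluetooth is powered on.
        guard central.state == .poweredOn else { return }
        central.scanForPeripherals(withServices: [serviceUUID], options: nil)
    }

    func centralManager(_ central: CBCentralManager, didDiscover peripheral: CBPeripheral,
                        advertisementData: [String: Any], rssi RSSI: NSNumber) {
        jetson = peripheral            // keep a strong reference to the peripheral
        central.stopScan()
        central.connect(peripheral, options: nil)
    }

    func centralManager(_ central: CBCentralManager, didConnect peripheral: CBPeripheral) {
        peripheral.delegate = self
        peripheral.discoverServices([serviceUUID])
    }

    func peripheral(_ peripheral: CBPeripheral, didDiscoverServices error: Error?) {
        guard let service = peripheral.services?.first(where: { $0.uuid == serviceUUID }) else { return }
        peripheral.discoverCharacteristics([characteristicUUID], for: service)
    }

    func peripheral(_ peripheral: CBPeripheral,
                    didDiscoverCharacteristicsFor service: CBService, error: Error?) {
        guard let characteristic = service.characteristics?.first(where: { $0.uuid == characteristicUUID }) else { return }
        // Ask to be notified whenever the Jetson pushes a new dirt-detection message.
        peripheral.setNotifyValue(true, for: characteristic)
    }

    func peripheral(_ peripheral: CBPeripheral,
                    didUpdateValueFor characteristic: CBCharacteristic, error: Error?) {
        guard let data = characteristic.value,
              let message = String(data: data, encoding: .utf8) else { return }
        print("Received from Jetson: \(message)")   // e.g. {"time": 1712345678, "dirty": 1}
    }
}
```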

Harshul’s Status Report for 3/30

This week I mainly worked on the core feature of being able to draw on the floor plane using the position of our phone in the environment as a reference point. SceneKit allows projecting elements from the 2D coordinate frame of the 'view' (in this case, the screen) into a 3D world plane using the 'unprojectOntoPlane' method. My first approach was to take the camera coordinates, convert them into 2D screen coordinates, and then perform the projection. However, there seemed to be a disconnect between the viewport and the camera's coordinates, so I pivoted to using the center of the screen as the base point. In order to project onto the plane, and specifically to ensure that I'm only projecting onto the desired floor plane, I updated my tap gesture to set a projectedPlane variable and a selectedPlane flag so that the projection logic is only enabled once a plane has been selected, and so that the projection renderer can access the plane object. I then performed a hit test, which sends out a ray from that 2D point into the 3D world and returns the AR objects it passes through. In order to make planes uniquely identifiable and comparable, I added a UUID field to the plane class, which I then check to ensure I'm only drawing on the correct plane. The result of a hit test includes the world coordinates at which the ray intersected the plane, which I then pass into a function to draw.
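A rough sketch of that screen-center hit test and draw step is below. The `Plane` stub, function name, and marker size are illustrative stand-ins for the actual implementation, not the code as written in our app.

```swift
import ARKit
import SceneKit
import UIKit

/// Stand-in for the project's plane node subclass, which carries a UUID for comparisons.
class Plane: SCNNode {
    let uuid = UUID()
}

/// Cast a ray from the screen center onto the previously selected floor plane
/// and draw a small marker at the intersection's world coordinates.
func drawAtScreenCenter(in sceneView: ARSCNView, selectedPlaneID: UUID) {
    let center = CGPoint(x: sceneView.bounds.midX, y: sceneView.bounds.midY)

    // The hit test returns every scene node the ray passes through.
    let hits = sceneView.hitTest(center, options: [.searchMode: SCNHitTestSearchMode.all.rawValue])

    // Keep only hits on the floor plane the user tapped (matched by UUID).
    guard let hit = hits.first(where: { ($0.node as? Plane)?.uuid == selectedPlaneID }) else { return }

    // The hit result gives world coordinates on that plane; drop a marker there.
    let sphere = SCNSphere(radius: 0.01)
    sphere.firstMaterial?.diffuse.contents = UIColor.white
    let marker = SCNNode(geometry: sphere)
    marker.position = hit.worldCoordinates
    sceneView.scene.rootNode.addChildNode(marker)
}
```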

Image highlighting the drawn points all being coplanar with the floor plane

Video of Drawing on plane with circles

Video of Drawing on plane with connected cylindrical segments

Functionally this worked, but there was a notable performance drop. I attempted to introduce a time interval to reduce the frequency of the hit tests and to draw spheres, which are more shader-optimized than cylinders, but there is absolutely more work to be done.

I then profiled the operation of the app; for some reason the debug symbols did not export, but the trace still outlined that there was a delay in "wait for drawable".

Image of Xcode Instruments profiling the AR app

Researching this led me to find out that drawing repeated unique shapes and adding them one by one is not as performant as stitching them together. Additionally, performing physics/math calculations in the renderer (non-GPU ones, at least) can cause this kind of performance impact. In discussing these findings with Nathalie, we identified two candidate approaches forward to cover our bases on the performance bottleneck, which we can work on in parallel. The first approach consists of two key changes. First, I'm going to try to move the hit test outside of the renderer. Second, Nathalie pointed out that the child nodes in our initial plane detection did not have this impact on performance, so I plan on performing a coordinate transform and drawing the shapes as child nodes to see if that reduces the load on the renderer, and on experimenting with flattenedClone to coalesce these shapes.
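As a rough illustration of the coalescing idea (the container-node setup is assumed, not taken from our current code), SceneKit's flattenedClone() can merge many accumulated child geometries into a single node, so the renderer deals with one combined node instead of hundreds:

```swift
import SceneKit

/// Sketch: periodically collapse the accumulated marker nodes into one flattened
/// node to reduce per-node rendering overhead.
func coalesceMarkers(in trailContainer: SCNNode) {
    guard let parent = trailContainer.parent else { return }

    // flattenedClone() merges the container's child geometries into a single node.
    let merged = trailContainer.flattenedClone()
    parent.addChildNode(merged)

    // Remove the original per-shape nodes now that the merged copy is in place.
    trailContainer.removeFromParentNode()
}
```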

When looking for native parametric line implementations in Xcode, I could only find Bezier curves in the UI library. We managed to find a package, SCNLine, that implements parametric lines in SceneKit, which Nathalie is going to install and experiment with to see if we unlock a performance speedup by having only one node that we update instead of a new node for every shape.

Next steps involve addressing this performance gap in time for our demo and syncing with Nathalie on our optimization findings. Additionally, integrating Nathalie's freeze-frame feature as well as the Bluetooth messages is our next highest priority post-demo, as Professor Kim outlined that integration would take more time than we anticipate. Eventually we will need the line itself to be invisible and instead place planes along its coordinates, oriented the correct way; that should be unlocked once we have the ability to draw in a performant way.

Team Status Report for 3/30

Risks

Our challenges currently lie with integrating the subsystems that we have all been working on in parallel. From our discussions this week, we have decided on data models for the output of Erin's dirt detection algorithms, which are the inputs to Nathalie and Harshul's AR mapping algorithms. Current risks lie in establishing Bluetooth communication between the Jetson and the iPhone: we have set up the connection for receiving/sending and can see the available device, but Apple's black-box security measures currently prevent us from sending files. There have been developers who were able to circumvent this in the past, so we are actively researching what methods they used. At the same time, we are actively exploring workarounds and have contingency plans in place. Options include employing web communication via HTTP requests or utilizing file read/write operations.

Other risks include potentially slow drawing functions when tracking the camera. Right now, there seems to be a lot of latency that impacts the usability of our system, so we are researching different ARKit methods that can be used in a faster way. To address this, Nathalie is exploring alternatives such as the SCNLine module to potentially enhance performance. Similarly, Harshul is working on creating child nodes in a plane to see which is faster. We can always use the GPU/CUDA if we need an additional speedup.

In addition, we have our main software components making progress but need to focus on how to design and mount hardware. This is a challenge because none of us have extensive experience in CAD or 3D printing, and we are in the process of deciding how to mount the hardware components (Jetson, camera, active illumination) such that it fits our ideal criteria (i.e. the camera needs to be mounted at the identified height and angle). Doing so earlier (immediately after the demos) will allow us to iterate through different hardware methods and try different mounts that we design to figure out what holds the most stability while not compromising image quality.

Schedule

In the coming week, we plan to flesh out a plan for how to mount our hardware on the vacuum. We have already set up the Jetson such that it will be easy to fasten to the existing system, but the camera and its positioning are more challenging to engineer mounts for. In addition, the AR iPhone application is nearing the end of its development cycle, as we are now working on optimizations rather than core features. We are considering options for how to mount the iPhone as well. Nathalie has been working on how to pinpoint the location of the rear camera view based on the timestamps received from the Jetson. This may still need to be tweaked after we get the Bluetooth connection fully functional, which is one of the main action items we have for the coming week.

Nathalie’s Status Report for 3/30

This week I worked on iterating on and adding to the mapping implementation that I worked on last week. After our meeting, we discussed working on the communication channels and integrating the components that we had been building in parallel. We discussed the types of information that we want to send from the Jetson to the iPhone, and determined that the only information we need is a UNIX timestamp and a binary classification as to whether the spot at that time is considered dirty. A data model sent from the Jetson via Bluetooth would be, for example, {"time": UNIX timestamp, "dirty": 1}. On the receiving end, the iPhone needs to map a UNIX timestamp to the camera coordinates at that time. The Jetson has no sense of orientation, while our AR code in Xcode is able to perform that mapping, so connecting the data through a timestamp made the most sense to our group. Specifically, I worked on the Swift side (the receiving end), where I grabbed the UNIX timestamps and mapped them to the coordinates in the plane where the camera was positioned at that time. I used a struct called SCNVector3 that holds the XYZ coordinates of our camera position. Then I created a queue with limited capacity (it currently holds 5 entries, but that can easily be changed in the future). This queue holds dictionaries that map {timestamp: SCNVector3 coordinates}. When the queue reaches capacity, it dequeues in a FIFO manner to make room for more timestamps. This is important to account for the latency it takes for the Jetson to transmit information via Bluetooth.

Below I've included a screenshot of some code showing how I initialized the camera-position SCNVector3, along with the output from printing the timeQueue over several UNIX timestamps. You can see that the queue dequeues the first element, so the order of the UNIX timestamps in the queue is always monotonically ascending.

Initialization of the Camera Position vector and queue progression over several timestamps
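A minimal sketch of the bounded queue described above is included here for reference; the type and method names are placeholders rather than the exact ones in our codebase.

```swift
import SceneKit

/// Sketch of a fixed-capacity FIFO queue mapping UNIX timestamps to the
/// camera's world position at that time.
struct TimePositionQueue {
    private(set) var entries: [(timestamp: TimeInterval, position: SCNVector3)] = []
    let capacity: Int

    init(capacity: Int = 5) {
        self.capacity = capacity
    }

    /// Enqueue a new timestamp:position pair, dropping the oldest entry when full.
    mutating func record(timestamp: TimeInterval, position: SCNVector3) {
        if entries.count >= capacity {
            entries.removeFirst()   // FIFO: discard the oldest sample
        }
        entries.append((timestamp, position))
    }

    /// Return the stored position whose timestamp is closest to the one the
    /// Jetson reported, to account for Bluetooth latency.
    func position(closestTo timestamp: TimeInterval) -> SCNVector3? {
        entries.min(by: { abs($0.timestamp - timestamp) < abs($1.timestamp - timestamp) })?.position
    }
}
```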

In addition, I did work to develop the initial floor mapping. Our mapping strategy happens in two steps: first the floor mapping, then annotating the floor based on covered area. For the initial floor mapping, I created a "Freeze Map" button, which makes the dynamic floor mapping freeze into a static floor mapping. I wrote accompanying functions for freezing the map and added this logic to the renderer functions, figuring out how to connect the frontend with the backend in Swift. To do this, I worked for the first time with the Main screen, and learned how to assign properties, names, and functionality to buttons on a screen. This is the first time I've ever worked with the user interface of an iOS app, so it was definitely a learning curve figuring out how to make that happen in Xcode. I had anticipated that this task would only take a day, but it ended up taking an additional day.
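A simplified sketch of the idea follows; the flag and function names here are illustrative, not the exact ones in the app. The button toggles a flag, and the ARKit update callback skips plane updates while the map is frozen.

```swift
import ARKit
import SceneKit
import UIKit

class FloorMappingViewController: UIViewController, ARSCNViewDelegate {
    @IBOutlet var sceneView: ARSCNView!
    private var isMapFrozen = false

    // Connected to the "Freeze Map" button on the Main storyboard.
    @IBAction func freezeMapTapped(_ sender: UIButton) {
        isMapFrozen = true
    }

    // ARKit calls this whenever a detected plane's geometry changes.
    func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
        guard !isMapFrozen,                                 // stop growing the map once frozen
              let planeAnchor = anchor as? ARPlaneAnchor,
              case .floor = planeAnchor.classification else { return }

        // ... update the floor overlay geometry here while the map is live ...
    }
}
```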

In the photo below, you can see the traced line that is drawn on the plane. This is testing SCNLine and the tracking of camera position, which is currently a work in progress. The "Freeze Map" button is also present (although some of the characters are cut off for some reason). The freeze button works to freeze the yellow plane that is pictured.

The white line indicates an SCNLine on the plane, and the Freeze Map button freezes the yellow plane as you move your phone through the air. It will stop the plane mapping functionality.

Overall, we are making good progress towards end-to-end integration and communication between our subsystems. I've established the data model that I need to receive from Erin's work with the Bluetooth connection and the Jetson camera, and created the mapping needed to translate that into coordinates. We are slightly behind when it comes to establishing the Bluetooth connection because we are blocked by Apple's security (i.e., as things stand, we cannot establish the actual Bluetooth connection from an iPhone to the Jetson). We are working on circumventing this, but have alternative approaches in place. For example, we could do the communication via the web (HTTP requests), or read/write to files.

I spent a good amount of time with Harshul also working on how to annotate and draw lines within planes to mark areas that have been covered by our vacuum. The current problem is that the drawings happen too slowly, bottlenecked by the latency of the ARKit functions that we are dealing with. My personal next steps are to try drawing lines with the SCNLine module to see if that shows a performance speedup, working in parallel with Harshul to see which approach is best. This is one of the major technical components left in our system, and what I will be working on for the coming week. We are preparing the connections between subsystems for our demo, and definitely want that to be the focus of the next few days!

Erin’s Status Report for 3/23

This week, I continued to try and improve our existing dirt detection model. I also tried to start designing the mounting units for the Jetson camera and computer, but I realized I was only able to draw basic sketches of what I imagined the components to look like, as I have no experience with CAD and 3D modeling. I have voiced my concerns about this with my group, and we have planned to sync on this topic. Regarding the dirt detection, I had a long discussion with a friend who specializes in machine learning, and we discussed the tradeoffs of my current approach. The old script, which produced the results shown in last week's status report, relies heavily on Canny edge detection when classifying pixels as either clean or dirty. The alternative approach my friend suggested was to use color to my advantage, something I hadn't realized our use-case constraints allow. Since our use case confines the flooring to be white and patternless, I can assume that anything "white" is clean, provided the camera is able to capture dirt which is visible to the human eye from a five-foot distance. Moreover, we are using active illumination. In theory, since our LED is green, I can simply threshold all values between certain shades of green as "clean", and any darker colors, or colors with grayer tones, as dirt. This is because, in theory, the dirt that we encounter will have a height component; when the light from the LED shines onto a particle, the particle casts a shadow behind it, which is picked up by the camera. With this approach, I would only need to look for shadows, rather than for the actual dirt in the image frame. Unfortunately, my first few attempts at this approach did not produce satisfactory results, but it would be a lot less computationally expensive than the current dirt detection script, which relies heavily on the CPU-intensive OpenCV package.

I also intend to get the AR development environment fully set up on my end soon. Since Nathalie and Harshul have been busy, they have not had the time to sync with me and get my development environment fully set up, although I have been brought up to speed on the capabilities and restrictions of the current working model.

My next steps as of this moment are to figure out the serialization of the data from the Jetson to the iPhone. While the dirt detection script is still imperfect, this is a non-blocking issue. I currently have most of our hardware in my possession, and I intend to get the data serialization via Bluetooth done within the next week. This will also allow me to start benchmarking the delay for data to get from the Jetson to the iPhone, which is one of the components of delay we are worried about with regard to our use-case requirements. We have shuffled the tasks around slightly, so I will not be integrating the dirt detection right now; I will simply be working on data serialization. The underlying idea is that even if I am serializing garbage data, that is fine; we simply need to be able to gauge how well the Bluetooth data transmission performs. If we need a more efficient method, I can look into removing image metadata, which would reduce the size of the packets during the data transfer.

Team Status Report for 3/23

Risks

With the augmented reality floor mapping base implementation done, we are about to move into marking/erasing the overlay. Nathalie and Harshul have discussed multiple implementation strategies for marking coverage, and are not entirely sure which approach will be most successful; this is something we will determine when working on it this week. Our initial thought is to combine the work that we have each done separately (Nathalie having mapped the floor and Harshul having created logic to change a plane's color on tap). Specifically, we want to add more nodes in a specific shape to the floor plane in a different color, like red, with the diameter of the shape equivalent to the width of the floor vacuum. Still, we first need to figure out how to do that, and once it works, what shape would best capture the vacuum's coverage dimensions. This is important because the visual representation of coverage is essential to our project actually working. As a fallback, we have experimented with the World Tracking Configuration logic, which is able to capture our location in space, and we are willing to explore how our alternative approaches might solve the problem of creating visual indicators on a frozen floor map.

The core challenge is that once we freeze map updates, we run the risk of odometry drift: as we move around the room, tracking information changes but no longer propagates to the planes already drawn in the scene. Keeping the map dynamic mitigates this, but then the dimensions of our plane are not consistent, which makes it difficult to measure and benchmark our coverage requirements. One mitigation method would be custom update renderers that avoid redefining plane boundaries while still allowing the anchor positions to change.

Another challenge that our group is currently facing is the accuracy of the mapping. While we have addressed this issue before, the problem still stands. At this time, we have not been able to get the ARKit mappings to reflect the error rates specified by our use-case requirements. This is due to the constraints of Apple's hardware and software, and tuning these models may not be a viable option given the time remaining in the semester. Our group has discussed readjusting the error bounds in our use-case requirements, and this is something we plan to flesh out within the week.

We also need to get started on designing and productionizing all the hardware components we need in order to assemble our product end to end. The mounts for the Jetson hardware as well as the active illumination LEDs need to be custom made, which means that we may need to go through multiple iterations of the product before we are able to find a configuration that works well with our existing hardware. Since the turnaround is tight considering our interim demo is quickly approaching, we may not be able to demonstrate our project as an end-to-end product; rather, we may have to show it in terms of the components that we have already tested. 

Scheduling 

We are now one week away from the interim demo. The last AR core feature we need to build is plane erasure. We have successfully tracked the phone's coordinates and drawn them in the scene; the next step is to project that data onto the floor plane, which would leave the AR subsystem ready to demo. Since our camera positioning has been finalized, we are beginning to move forward with designing and 3D printing the mounting hardware. The next milestones entail a user-friendly integration of our app features as well as work on communication between the Jetson and the iPhone.



Harshul’s Status Report for 3/23

This week I worked on more of the AR app's core functionality. Nathalie and I worked together to identify functionality and relevant API features, and then parallelized some of our prototyping by pursuing candidate approaches individually and reconvening. I worked on being able to recolor a tapped plane. This was achieved by modifying the plane class and creating a tap gesture callback that modifies the appropriate plane. This works because the nodes placed in the world function in a similar way to the DOM on a webpage, with parent and child nodes, and ARKit provides a relevant API for accessing and traversing these scene nodes. Since we can insert custom objects, we have the ability to update them.

handleTap callback that performs a hit test to extract the plane from the tapped SCNNode and change its color

This image shows the feature in action, providing a way for the user to interact with the world using the intuitive action of tapping the screen.
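A hedged sketch of what such a tap handler can look like is below; the Plane stand-in class and the color choice are illustrative, not the exact implementation.

```swift
import ARKit
import SceneKit
import UIKit

// Stand-in for the project's plane node subclass.
class Plane: SCNNode {}

/// Recolor whichever detected plane the user tapped.
func handleTap(_ gesture: UITapGestureRecognizer, in sceneView: ARSCNView) {
    let location = gesture.location(in: sceneView)

    // Hit test the tapped screen point against the scene's nodes.
    guard let hit = sceneView.hitTest(location, options: nil).first else { return }

    // Walk up from the tapped geometry node to the enclosing plane, if any.
    var node: SCNNode? = hit.node
    while let current = node, !(current is Plane) {
        node = current.parent
    }

    // Recolor the plane so the user can see that their tap registered.
    node?.geometry?.firstMaterial?.diffuse.contents = UIColor.systemTeal.withAlphaComponent(0.6)
}
```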

The next feature that I mocked up was the ability to track the position of our phone in the world. The image below shows a proof of concept: the white trail represents the camera's position in the world map and updates as we move the camera. More work needs to be done to optimize the number of nodes added (with some shortcutting to combine segments), to project these 3D coordinates into the perspective of the floor plane, to widen the footprint of the drawn trail to approximate the size of the vacuum, and to compute a transform to translate the origin of the drawn trail to the position of the actual vacuum mount. Fiducial markers might be a way of providing ground truth for the vacuum mount; detecting fiducials/reference 2D images is a candidate option to pursue next week.

In terms of scheduling, things are generally on track. I expect that we will have the core features of plane detection, tracking, and texture mapping all implemented for our interim demo, with work pending to calibrate things to comply with our V&V plans and to improve overall accuracy and system integration.

Nathalie’s Status Report for 3/23

This week I spent time with Swift and Xcode to create the functionality of mapping the floor (and only the floor) and also uploading and testing it with our phones. I worked with Harshul on the augmented reality component, with our base implementation being inspired by the AR Perception and AR Reality Kit code. This functionality creates plane overlays on the floor, updating the plane by adding children nodes that together draw the floor texture. Not only does it dynamically add nodes to the plane surface object, but it also is able to classify the surface as a floor. The classification logic allows for surfaces to be labeled as tables, walls, ceilings, etc. to ensure that the AR overlay accurately represents the floor surface, minimizing confusion that can be caused by other elements in the picture, like walls and/or furniture.

The technology has memory, meaning that it locates each frame within a World Tracking Configuration, verifying whether it has already seen the surface in view and re-rendering the previously created map. This memory is important because it is what will allow us to mark already-covered surface area on the plane and remember what has been covered as the camera pans in and out of view. It means that once a surface has been mapped, the code pulls what it already mapped rather than re-mapping the surface. As next steps, I plan on finding a way to "freeze" the map (that is, stop it from continuously adding child nodes), then mark the overlay to indicate what has been passed over, and remember that when moving in and out of the frame.

Mapping room corners with an object (plant) in the plane.
Mapping of a non-rectangular plane, demonstrating bounding box and overlay difference

There is a distinction between the bounding boxes (which are purely rectangular) and the actual overlay, which is able to dynamically mold to the real dimensions and outline of the floor surface. This overlay, rather than the rectangular boxes, is what we are relying on for coverage, because it captures the granularity needed for the vacuum-able surface. I tested the initial mapping on different areas and noticed some potential issues with the error bounds of the surface. I'm in the process of testing the conditions in which the mapping is more accurate, and seeing how I can tweak the code to make things more precise if possible. I found that the mapping has more of a tendency to over-map than to under-map, which is good because it fits our design requirement of wanting to clean the maximum surface possible. Our group discussed that we would rather map the floor surface area as larger than it actually is rather than smaller, because the objective of our whole project is to show spots that have been covered. We decided that it would be acceptable for a user to notice that a certain surface area (like an object) cannot be vacuumed and still be satisfied with the coverage depicted. Hence, the mapping does not remove objects that lie in the way, but it is able to mold to non-rectangular planes.

Demonstration of the overlay performance with objects. Mapping for our project will be measured with respect to the colored portions, not the bounding box.

With reference to the schedule, I continue to be on track with my tasks. I plan on working with Harshul on improving the augmented reality component, specifically on erasing/marking areas covered. I want to find ways to save an experience (i.e., freeze the frame mapping), and then be able to add/erase components on the frozen overlay. To do this, I will need to get the dimensions of the vacuum, see what area of coverage the vacuum actually gets, and determine how best to capture those markings in the overlay with augmented reality technologies.