Alice’s Status Report – 04/30/2022

It turns out the Chamfer distance numbers I had gotten earlier were off; the corrected results are even closer to the reference than I expected :). I got support for multiple spheres working, and I came up with a new visualization to confirm that the fluid simulation traces match as well.

My goal for this upcoming week is to re-generate all of the reference fluid simulations, record them and collect their log files, then collect log files from the FPGA fluid simulations and compute Chamfer distance numbers. I also want to use the new visualization method to compare the fluid simulations as point clouds for easier visual verification. This will also serve as a nice graphic for all of our presentation materials.
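For context on the metric itself, here is a minimal sketch of a symmetric Chamfer distance between a reference frame and an FPGA frame, written as plain nested loops over position vectors; the point type and the use of squared distances are illustrative choices, and our actual evaluation script may be organized differently.

    #include <algorithm>
    #include <array>
    #include <limits>
    #include <vector>

    using Point = std::array<float, 3>;

    // Squared Euclidean distance between two points.
    static float dist2(const Point& a, const Point& b) {
        float dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
        return dx * dx + dy * dy + dz * dz;
    }

    // Average squared distance from each point in `from` to its nearest neighbor in `to`.
    static float one_sided(const std::vector<Point>& from, const std::vector<Point>& to) {
        float total = 0.0f;
        for (const Point& p : from) {
            float best = std::numeric_limits<float>::max();
            for (const Point& q : to) best = std::min(best, dist2(p, q));
            total += best;
        }
        return total / from.size();
    }

    // Symmetric Chamfer distance: lower means the two point clouds match more closely.
    float chamfer(const std::vector<Point>& reference, const std::vector<Point>& fpga) {
        return one_sided(reference, fpga) + one_sided(fpga, reference);
    }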

We’ve more or less locked down our final code version; we just need to verify everything and document all of our hardware improvements with timing numbers. Feeling great that our project finally has something visual for results 🙂

Alice’s Status Report – 04/23/2022

The FPGA finally produces visually appealing fluid simulation output! I had to fix some nearest neighbors code, and then I spent the majority of this week getting collisions of fluid particles to work.

We constrained the voxel space to [-4,4) × [-6,2) × [-4,4). The original algorithm uses voxels the same size as the particles, but that would require an array of 65536 entries, which needs approximately 1500 BRAMs when the limit is 438. Therefore we decided to make the voxels 8 times bigger. I initially thought this would make the fluid explode due to an increased number of particles influencing each other, but we tested it and it was fine.
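To make the tradeoff concrete, here is a rough sketch of how a particle position maps to a flat voxel index within that window; the voxel edge length is a placeholder, but the point is that the voxel array (and therefore the BRAM footprint) shrinks cubically as the voxels grow.

    // Simulation window: [-4,4) x [-6,2) x [-4,4), i.e. 8 units per axis.
    const float X_MIN = -4.0f, Y_MIN = -6.0f, Z_MIN = -4.0f;
    const float EXTENT = 8.0f;

    // Placeholder voxel edge length; doubling it cuts the voxel count by 8x.
    const float VOXEL_EDGE = 0.5f;
    const int CELLS_PER_AXIS = static_cast<int>(EXTENT / VOXEL_EDGE);
    const int NUM_VOXELS = CELLS_PER_AXIS * CELLS_PER_AXIS * CELLS_PER_AXIS;

    // Map a particle position inside the window to a flat index into the voxel array.
    int voxel_index(float x, float y, float z) {
        int ix = static_cast<int>((x - X_MIN) / VOXEL_EDGE);
        int iy = static_cast<int>((y - Y_MIN) / VOXEL_EDGE);
        int iz = static_cast<int>((z - Z_MIN) / VOXEL_EDGE);
        return (ix * CELLS_PER_AXIS + iy) * CELLS_PER_AXIS + iz;
    }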

Currently Jeremy and Ziyi are working on the optimizations, so my primary goal for next week is to get numbers for quantitative accuracy.

Team Status Report for 04/16/2022

We were unfortunately set back this week since we could not meet in person and had to rely on remote work and communication, and Jeremy was busy recovering from Covid. Even so, it’s not a huge loss for us since we’ve allocated a good amount of slack and were able to piggyback on the work we got done last week. We’re confident we’re still on track to complete our project.

We still made good progress this week even though we didn’t accomplish everything we wanted to. Ziyi got the hardware/software interface working, got the existing build to run on the FPGA, and did some solid work on build configurations. Jeremy and Alice made progress on fixing bugs in the algorithm as well.

No changes were made to the schedule.

Alice’s Status Report for 03/26/2022

In my last update, the main goal I wanted to tackle was getting Scotty3D to read from a text file of particle positions and be able to step through the simulation frames. I was able to fully accomplish this goal for this week, and additionally gained a good understanding of scene loading code in the codebase. I anticipate this will be handy for nice-to-have features if we have the time.

The next major goal for me is to get the FPGA to send the particle position data over UART (serial USB) to Scotty3D and step through the simulation. I realized once I got the simulation working from the text file that a real-time demo will be significantly more compelling, since a pre-computed simulation does not give an accurate visual perception of how much the FPGA can accelerate the fluid simulation. To clarify, a real-time demo is not in our MVP, but since Ziyi and Jeremy are still working through build issues I believe this is the most important task that I can be working on right now.
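As a rough idea of what the host side of that UART link could look like, the sketch below opens a USB-serial device with POSIX termios and streams bytes from the FPGA; the device path and baud rate are placeholders rather than our actual settings, and a real reader would parse the stream into per-frame particle positions for Scotty3D.

    #include <fcntl.h>
    #include <termios.h>
    #include <unistd.h>
    #include <cstdio>

    // Open a USB-serial device and configure it for raw 115200-baud reads.
    // The path and baud rate are placeholders for illustration only.
    int open_serial(const char* path) {
        int fd = open(path, O_RDONLY | O_NOCTTY);
        if (fd < 0) return -1;

        termios tty{};
        if (tcgetattr(fd, &tty) != 0) { close(fd); return -1; }
        cfmakeraw(&tty);
        cfsetispeed(&tty, B115200);
        cfsetospeed(&tty, B115200);
        if (tcsetattr(fd, TCSANOW, &tty) != 0) { close(fd); return -1; }
        return fd;
    }

    int main() {
        int fd = open_serial("/dev/ttyUSB0");  // placeholder device path
        if (fd < 0) { std::perror("open_serial"); return 1; }

        char buf[256];
        ssize_t n;
        // Echo the incoming byte stream; a real host would decode it into frames.
        while ((n = read(fd, buf, sizeof(buf))) > 0) {
            std::fwrite(buf, 1, static_cast<size_t>(n), stdout);
        }
        close(fd);
        return 0;
    }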

However, since the Vitis HLS project is not working yet (which is necessary for me to do the above), my goal for next week will be to get the Chamfer Distance evaluation script working with the Scotty3D fluid simulation outputs.

I’ve had a really great two weeks of progress so far. I am a little concerned that build issues are still taking us this long to work through, but since we’ve stripped down Scotty3D to just the core simulation files on the FPGA, I’m feeling OK about us getting the Vitis HLS build completely working in the near future.

Alice’s Status Report for 03/19/2022

In my last update, the main goal I wanted to tackle was figuring out what to put on the FPGA, since we were running into crashes when loading the CPU too heavily, along with installation issues around OpenGL. So we decided to make an architecture change: the FPGA platform will only perform the fluid simulation algorithm. In other words, it will take an array of 3D positions and output an array of 3D positions for each frame, saving them to a file so that a host computer capable of running Scotty3D can load the simulation file and play the fluid simulation.
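As a sketch of that hand-off, each frame could be written as a particle count followed by one "x y z" line per particle, which the host-side loader turns back into per-frame arrays; this plain-text layout is only an assumption for illustration, and the format we actually settle on may differ.

    #include <cstddef>
    #include <fstream>
    #include <string>
    #include <utility>
    #include <vector>

    struct Vec3 { float x, y, z; };
    using Frame = std::vector<Vec3>;

    // FPGA side: one block per frame -- a particle count, then "x y z" lines.
    void write_frames(const std::string& path, const std::vector<Frame>& frames) {
        std::ofstream out(path);
        for (const Frame& f : frames) {
            out << f.size() << "\n";
            for (const Vec3& p : f) out << p.x << " " << p.y << " " << p.z << "\n";
        }
    }

    // Host side: read the file back into per-frame particle arrays for playback.
    std::vector<Frame> read_frames(const std::string& path) {
        std::ifstream in(path);
        std::vector<Frame> frames;
        std::size_t count;
        while (in >> count) {
            Frame f(count);
            for (std::size_t i = 0; i < count; i++) in >> f[i].x >> f[i].y >> f[i].z;
            frames.push_back(std::move(f));
        }
        return frames;
    }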

The major task I accomplished this week was stripping the fluid simulation code of extraneous dependencies so that the number of files loaded into Vitis HLS would be minimal. I also worked with Jeremy and Ziyi to rewrite portions of the code that were using modern C++ features, and am currently working on loading a text file of 3D positions into keyframes for the animation. I am making good progress already and am confident that I can finish this for next week.

Next week will be critical since we have a demo coming up, so in addition to getting the text file loaded into Scotty3D I will be working on figuring out the FPGA → host computer communication. Since we have finally gotten the Vitis HLS project to build, I am feeling great about our progress going forward.

Jeremy’s Status Report for 03/19/2022

This week I spent a lot of time wrestling with C++ compilation errors, which were particularly cryptic in the Vitis HLS build environment. The Scotty3D code that we are basing our project on takes advantage of a lot of modern C++ features, and although these are good practice for general programming, they make stripping the codebase down to what we need fairly tricky. But after resolving several build errors and restructuring the code, we are able to build and run in the Vitis HLS environment. I also began implementing some loop unrolling to allow for parallelism in hardware.
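For a flavor of what that unrolling looks like in Vitis HLS, here is a toy per-particle loop with unroll and array-partition pragmas; the loop body, particle count, and unroll factor are placeholders rather than our actual kernel.

    // Illustrative only: a per-particle update loop with Vitis HLS pragmas.
    const int NUM_PARTICLES = 512;

    void scale_positions(float positions[512][3], float s) {
        // Partition the array so unrolled iterations can access it in parallel.
    #pragma HLS ARRAY_PARTITION variable=positions cyclic factor=8 dim=1
        for (int i = 0; i < NUM_PARTICLES; i++) {
            // Unroll by 8 so eight particles are updated per loop iteration.
    #pragma HLS UNROLL factor=8
            positions[i][0] *= s;
            positions[i][1] *= s;
            positions[i][2] *= s;
        }
    }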

I think that we are working at a steady pace and are on track to complete our goals. Especially now that we have a working build flow, I think we will be able to make smooth progress. Next week I hope to make more progress on the acceleration, figure out a preliminary method for displaying our output, and work toward a final version of how we display output.

Jeremy’s Status Report for 02/26/2022

This week started with finishing up our design presentation and working on the slides for that over the weekend. Since both classes this week involved the design presentations, my productivity was lower than usual, as those are good times for getting work done. The work that I did get done this week was some troubleshooting and some design work. As a group, we finalized how we are going to store points and find neighbors. We figured that by partitioning our simulation space into cubes, we can look at the cube a point lies in, along with its adjacent cubes, to find that point’s neighbors. This means each neighbor lookup takes O(1) time, which is of particular importance since it is something each point needs to do every iteration. Our approach can also be done in parallel. I also worked on getting some of the build flow to work, which is getting to a stable place, but interfacing with hardware and getting programs to build for different architectures is never straightforward.
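To illustrate the lookup, the sketch below gathers a particle's candidate neighbors by scanning the 3x3x3 block of cells around the cell it lives in, which is a constant amount of work per particle; the grid dimensions and container types are simplified placeholders rather than our actual data layout.

    #include <vector>

    // Uniform grid: cells[ix][iy][iz] holds the indices of the particles in that cell.
    const int GRID = 16;  // placeholder cells per axis
    using Cell = std::vector<int>;

    // Gather candidate neighbors of a particle in cell (cx, cy, cz) by scanning the
    // surrounding 3x3x3 block of cells.
    std::vector<int> candidate_neighbors(const Cell cells[GRID][GRID][GRID],
                                         int cx, int cy, int cz) {
        std::vector<int> result;
        for (int dx = -1; dx <= 1; dx++)
            for (int dy = -1; dy <= 1; dy++)
                for (int dz = -1; dz <= 1; dz++) {
                    int x = cx + dx, y = cy + dy, z = cz + dz;
                    if (x < 0 || y < 0 || z < 0 || x >= GRID || y >= GRID || z >= GRID)
                        continue;
                    const Cell& c = cells[x][y][z];
                    result.insert(result.end(), c.begin(), c.end());
                }
        return result;
    }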

Next week I will be working on the design report, as that deliverable is due next week. Additionally, I will start taking benchmarks of the CPU implementation and figure out how to benchmark our HLS implementation once we get it working.

Team Status Report for 02/26/2022

Currently, the most significant risk we are facing is figuring out where to do the rendering, since the answer may force an architecture change. We have multiple contingency plans and are currently evaluating the risk, reward, and effort of each (we are meeting again tomorrow to finalize our decision).

The most important task for the project at this stage is to address the above and get some MVP form of Scotty3D running so we can complete other tasks.

No changes were made to the schedule, as we believe we can catch up during Spring Break and with any slack days that may be provided. No changes to the system design have been made for now.

Ziyi’s Status Report for 02/26/2022

This week was a bit confusing in terms of workflow for me. As of now, we are still trying to fully compile the project on Vitis. We had to deal with some troubleshooting around connectivity to the FPGA, but I would say that the build system is mostly stable (nothing that can’t be solved by the solutions we had from last week).

Other than this, most of my effort this week was spent on trying to figure out the best way to implement the nearest neighbor lookup for the fluid simulation algorithm. We realized that this issue was of primary concern, as how we choose to store these values will determine the distribution of BRAM usage between the pipeline stages, which will in turn determine the overall organization of pipelining and unrolling for our kernel. At the beginning of the week, we thought we would be using a three-level hashmap to implement the lookup from (X,Y,Z) address to the neighbor list.

The problem with this design is that while it would theoretically be great for a sparse tree (which is what our hashmap would likely represent), the implementation is not realistically realizable in hardware. Since you can’t “allocate” a BRAM at runtime, every pointer in each of the BRAMs must have hardware backing, so every pointer must point to a dedicated BRAM. A back-of-the-envelope calculation tells us that if we used integers to index, we would need 2^64 BRAMs to implement this design. Note: this would require roughly 8.5 × 10^16 Ultra96s to implement.

Instead, we realized that we would need to impose some restrictions in order to feasibly implement the kernel. Our first idea was to figure out the realistic range of both the (X,Y,Z) points and the number of neighbors within a hash bucket. For the first part, this matters because we can restrict the simulated fluid to a “simulation window,” where every particle that falls outside the window is removed from the simulation task. This also has the effect of reducing the address space we need to support for (X,Y,Z). If this range is small enough, then we can create a unique hash that fits within 32 bits just by bit-shifting the (X, Y, Z) coordinates. For the second part, we want to figure out a reasonable upper limit on the number of neighbors so that we can determine how wide the BRAM needs to be to store a list of all the neighbor pointers. As a side note, with a simulation size of 512 particles, we can uniquely address each particle using a 9-bit address. We will need to investigate how we can efficiently pack this as a struct in the neighbor array in a way that works with the C++ for Vitis HLS.
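As a concrete sketch of those two ideas, the snippet below packs quantized (X, Y, Z) cell coordinates into a single 32-bit key by shifting, and packs 9-bit particle indices into a wide word; the bit widths are illustrative assumptions, since the final simulation window and voxel size are still being settled.

    #include <cstdint>

    // Give each quantized cell coordinate 8 bits (so it must stay below 256) and
    // pack (X, Y, Z) into one 32-bit key by shifting. The real widths depend on
    // the final simulation window and voxel size.
    uint32_t cell_hash(uint32_t ix, uint32_t iy, uint32_t iz) {
        return (ix << 16) | (iy << 8) | iz;
    }

    // With 512 particles, an index fits in 9 bits, so several indices can be packed
    // into one wide BRAM word. Here, three 9-bit indices share a 32-bit word.
    uint32_t pack_neighbors(uint32_t a, uint32_t b, uint32_t c) {
        return (a & 0x1FF) | ((b & 0x1FF) << 9) | ((c & 0x1FF) << 18);
    }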

As for next week, we will continue rewriting the C++ to allow for Vitis HLS compilation of the fluid simulation kernel.

Alice’s Status Report for 02/26/2022

Finally got Scotty3D to build on the FPGA!!

It has been hard to get much progress done this week since I was preparing for the Design Review presentation and working on the paper. I feel we are behind schedule due to so much time being dedicated to presentations.

Currently we are trying to get Scotty3D to run on the fabric, so we can compare that to the FPGA version of Scotty3D.

Over the past week we’ve had issues with the FPGA crashing, presumably due to high CPU usage, and I realized that the probability of being able to render graphics on the CPU is low. We might pivot to having a non-graphical version of Scotty3D run on the FPGA, output the 3D position of each of the 512 particles per frame into some sort of text file, and let some host computer (such as my PC or my laptop) do the rendering instead. This is, after all, how big data companies structure their platforms. I am still investigating this with Jeremy and Ziyi for another day or two before making any decisions. This is of course linked to my task for running the simulation headless. Another option is to have the FPGA communicate with the laptop over USB.

Next week my goal is to address this issue. It seems a bit pressing since it might require an architecture change.