Alice’s Status Report for 03/26/2022

In my last update, the main goal I wanted to tackle was getting Scotty3D to read from a text file of particle positions and step through the simulation frames. I fully accomplished this goal this week, and additionally gained a good understanding of the scene-loading code in the codebase. I anticipate this will be handy for nice-to-have features if we have the time.
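As a rough illustration of the kind of loader this involves, here is a minimal sketch. The format and the function name `load_frames` are assumptions for the example, not the exact format our Scotty3D loader uses: each frame starts with a "frame N" header line, followed by one "x y z" line per particle.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct Vec3 { float x, y, z; };

// Parse a stream where each frame begins with a line "frame N",
// followed by one "x y z" line per particle. Hypothetical format,
// not the exact one used in our Scotty3D loader.
std::vector<std::vector<Vec3>> load_frames(std::istream& in) {
    std::vector<std::vector<Vec3>> frames;
    std::string line;
    while (std::getline(in, line)) {
        if (line.rfind("frame", 0) == 0) {
            frames.emplace_back();          // start a new frame
        } else if (!line.empty() && !frames.empty()) {
            std::istringstream ls(line);
            Vec3 p{};
            if (ls >> p.x >> p.y >> p.z)
                frames.back().push_back(p); // one particle position
        }
    }
    return frames;
}
```

Each inner vector can then be turned into a keyframe that Scotty3D steps through.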

The next major goal for me is to get the FPGA to send the particle position data over UART (serial over USB) to Scotty3D and step through the simulation. Once I got the simulation working from the text file, I realized that a real-time demo would be significantly more compelling, since a pre-computed simulation does not give an accurate visual sense of how much the FPGA can accelerate the fluid simulation. To clarify, a real-time demo is not in our MVP, but since Ziyi and Jeremy are still working through build issues, I believe this is the most important task I can be working on right now.
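We have not settled on a wire format yet, but the host side of the UART link will boil down to decoding byte buffers into particle positions. Here is one possible framing as a sketch; the layout (a particle count followed by packed floats) is an assumption for illustration, not a decided design:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Vec3 { float x, y, z; };

// Decode one simulation frame from a raw byte buffer read off the
// serial port. Assumed (hypothetical) wire format: a uint32_t particle
// count followed by count * 3 floats. Returns false if the buffer is
// too short to hold the advertised payload.
bool decode_frame(const uint8_t* buf, std::size_t len, std::vector<Vec3>& out) {
    if (len < sizeof(uint32_t)) return false;
    uint32_t count;
    std::memcpy(&count, buf, sizeof(count));
    std::size_t payload = std::size_t(count) * 3 * sizeof(float);
    if (len < sizeof(count) + payload) return false;
    out.resize(count);
    std::memcpy(out.data(), buf + sizeof(count), payload);
    return true;
}
```

A real implementation would also need a sync word and a checksum, since UART bytes can be dropped, but the shape of the problem is the same.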

However, since the Vitis HLS project is not working yet (which is necessary for me to do the above), my goal for next week will be to get the Chamfer Distance evaluation script working with the Scotty3D fluid simulation outputs.
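For the evaluation script, the core of the Chamfer Distance computation is small enough to sketch here. The names are my own for illustration; this is the standard symmetric form over two point sets, computed brute-force, which is fine at our scale of 512 particles:

```cpp
#include <algorithm>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };

static float dist2(const Vec3& a, const Vec3& b) {
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Symmetric Chamfer distance between two particle sets: for each point,
// take the squared distance to its nearest neighbor in the other set,
// average per direction, and sum both directions. O(n*m) brute force.
float chamfer_distance(const std::vector<Vec3>& a, const std::vector<Vec3>& b) {
    auto one_way = [](const std::vector<Vec3>& from, const std::vector<Vec3>& to) {
        if (from.empty()) return 0.0f;
        float sum = 0.0f;
        for (const Vec3& p : from) {
            float best = std::numeric_limits<float>::max();
            for (const Vec3& q : to) best = std::min(best, dist2(p, q));
            sum += best;
        }
        return sum / from.size();
    };
    return one_way(a, b) + one_way(b, a);
}
```

Running this per frame against a reference simulation gives us a per-frame error curve to report.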

I’ve had a really great two weeks of progress so far. I am a little concerned that build issues are still taking us this long to work through, but since we’ve stripped Scotty3D down to just the core simulation files on the FPGA, I feel OK about getting the Vitis HLS build fully working in the near future.

Ziyi’s Status Report for 03/26/2022

At the start of this week, Jeremy and I managed to get the C simulation build to pass in Vitis HLS. While we were quite happy at this initial success, we quickly realized that we still needed a lot more work to get the C synthesis working. One thing that significantly complicated this task was that the C synthesis build script would fail pre-synthesis without actually printing any errors. However, after poking around on the Vitis forums and ticking some specific compilation boxes, we were finally able to read the error logs. As it turns out, most of our problems were version and build conflicts around different C++ features. For instance, while we were able to get away with the auto type for our lambda functions in simulation, the synthesis compiler got a bit angry at us. Right now, we are still trying to resolve these issues and hopefully synthesize some version of the kernel soon.
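To give a flavor of the fix, here is a simplified illustration (not our actual kernel code) of the kind of rewrite involved: an auto-typed lambda becomes a plain free function with explicit types, which synthesis tools digest much more readily.

```cpp
#include <cstddef>

// Before (illustrative): a lambda captured into an `auto` variable and
// applied per particle -- fine in C simulation, but the kind of
// construct the synthesis build rejected.
//
//   auto scale = [k](float v) { return k * v; };
//   for (size_t i = 0; i < n; ++i) pos[i] = scale(pos[i]);
//
// After: a plain free function with explicit types.
float scale(float k, float v) { return k * v; }

void scale_all(float* pos, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i)
        pos[i] = scale(k, pos[i]);
}
```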

In the process of fixing these compilation errors, we also got started on our optimization work. One major task this week was getting rid of all instances of par_for, seq_for, and for_n, which iterated a function over the particles in parallel, iterated over the particles sequentially, and iterated over the neighbors of each particle, respectively. This is a major step, as it allows us to manually control the unroll parameters of the kernel on a per-loop basis, rather than applying a single pragma en masse.
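The shape of the refactor looks roughly like this sketch (function and array names are invented for the example). Each former helper call becomes an explicit, labeled loop, so an HLS pragma can be attached to it individually; the pragma is ignored by a regular C++ compiler, so the code still runs on the host.

```cpp
#include <cstddef>

// Illustrative only: instead of calling a generic par_for/for_n helper,
// each loop is written out explicitly so that an HLS unroll pragma can
// target it on a per-loop basis.
void accumulate_neighbor_forces(float* force, const float* contrib,
                                const int* neighbor, std::size_t n_particles,
                                std::size_t max_neighbors) {
particle_loop:
    for (std::size_t i = 0; i < n_particles; ++i) {
#pragma HLS UNROLL factor = 4
    neighbor_loop:
        for (std::size_t j = 0; j < max_neighbors; ++j) {
            int nb = neighbor[i * max_neighbors + j];
            if (nb >= 0) force[i] += contrib[nb]; // -1 marks an empty slot
        }
    }
}
```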

Other than slowly grinding through the myriad waves of compilation errors, there is not much else to report. Still, once we get a working build of the binary, we’ll throw it onto the FPGA and get some timing values reported. We’ll also need to survey the resource utilization of the base implementation. We envision that this will be done by next week.

Ziyi’s Status Report for 03/19/2022

Most of my effort this week was spent on getting a lightened version of the fluid simulation kernel to build in Vitis HLS. This reflects a shift in our priorities toward a more achievable MVP target for our project. Specifically, we were interested in implementing a “headless” version of the simulation kernel, where the only thing we would specifically need to run is the step2 function in the fluid simulation kernel. We decided that the rest of the rendering tasks are not as pressing and should be handled at a later point.

Much of the work this week was setting up the Vitis HLS project on the ECE number machines. We were running into some significant build issues on Alice’s computer, and the documentation online was frustratingly sparse. As such, in order to make some breakthroughs, we decided to transition to the existing, working setup that 643 uses. Setting up the project took a bit of work, but we were eventually able to resolve a lot of the build issues and get the C simulation task to build. A large part of this task was refactoring parts of the code to remove C++17 artefacts and using different versions of the libraries that Scotty3D depends on.

Right now, we are still trying to figure out how to get the C synthesis to fully build. It is a bit difficult at the moment, as the synthesis log does not actually highlight the error. However, I think it might have something to do with the tool inferring multiple TOP functions.

Obviously, the goal for next week is to get a working version of C synthesis running.

Alice’s Status Report for 03/19/2022

In my last update, the main goal I wanted to tackle was figuring out what to put on the FPGA, since we were running into crashes when the CPU was loaded too heavily, as well as installation issues with OpenGL. So we decided to make an architecture change: the FPGA platform will only perform the fluid simulation algorithm. In other words, it will take an array of 3D positions and output an array of 3D positions for each frame, saving that to a file so that a host computer that can run Scotty3D can load the simulation file and play back the fluid simulation.
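On the FPGA side, the file-producing half of this split is simple; here is a sketch of a per-frame writer. The text format shown (a "frame N" header followed by one "x y z" line per particle) is an assumption for the example, not a finalized design:

```cpp
#include <ostream>
#include <sstream>
#include <vector>

struct Vec3 { float x, y, z; };

// Write one frame of particle positions in a simple text format the
// host-side loader could parse. Format (hypothetical): a "frame N"
// header, then one "x y z" line per particle.
void write_frame(std::ostream& out, int frame_idx,
                 const std::vector<Vec3>& particles) {
    out << "frame " << frame_idx << "\n";
    for (const Vec3& p : particles)
        out << p.x << " " << p.y << " " << p.z << "\n";
}
```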

The major task I accomplished this week was stripping the fluid simulation code of extraneous dependencies so that the number of files loaded into Vitis HLS would be minimal. I also worked with Jeremy and Ziyi to rewrite portions of the code that were using modern C++ features, and am currently working on loading a text file of 3D positions into keyframes for the animation. I am making good progress already and am confident that I can finish this by next week.

Next week will be critical since we have a demo coming up, so in addition to getting the text file loaded into Scotty3D, I will be working on figuring out the FPGA-to-host-computer communication. Since we have finally gotten the Vitis HLS project to build, I am feeling great about our progress going forward.

Team Status Report for 03/19/2022

At the moment, the most serious risk we face is figuring out how to integrate everything into one package on the board, including a display output. As our efforts have been focused on building and accelerating the fluid simulator, we are still unsure about the display. We have a variety of backup plans for this issue, such as producing a file of the rendered output, or producing the display from somewhere else. This still aligns with our project goals, as we are not aiming for a real-time system and are just trying to accelerate the fluid simulation process.

We decided to make a system change: the FPGA platform will only perform the fluid simulation algorithm. In other words, it will take an array of 3D positions and output an array of 3D positions for each frame, saving that to a file so that a host computer that can run Scotty3D can load the simulation file and play back the fluid simulation. This system change falls into the category of “request scheduler”, which is shown below.

Jeremy and Alice also swapped responsibilities since Jeremy is more experienced at C++ and digital hardware, and Alice’s knowledge of the Scotty3D codebase is helpful for adding additional features necessary to load the simulation into Scotty3D and getting the FPGA to send over the necessary data to the host computer. Below is the updated Gantt Chart:

Jeremy’s Status Report for 03/19/2022

This week I spent a lot of time wrestling with C++ compilation errors, which were particularly cryptic in the Vitis HLS build environment. The Scotty3D code that we are basing our project on takes advantage of many modern C++ features, and although these are good practice for general programming, they make stripping the codebase down to what we need fairly tricky. But after resolving several build errors and restructuring the code, we are able to build and run in the Vitis HLS environment. I also began implementing some loop unrolling to allow for parallelism in hardware.

I think that we are working at a steady pace and are on track to complete our goals. Especially now that we have a stable build flow, I think we will be able to make smooth progress. Next week I hope to make more progress on the acceleration, figure out a preliminary method for displaying our output, and work towards a final version of the display.

Jeremy’s Status Report for 02/26/2022

This week started with finishing up our design presentation and working on the slides for that over the weekend. Since both classes this week involved the design presentations, my productivity was lower than usual, as those are usually good times for getting work done. But the work I did this week was some troubleshooting and some design work. As a group, we finalized how we are going to store points and find neighbors. We figured that by partitioning our simulation space into cubes, we can look in the cube a point occupies and the cubes adjacent to it to find that point’s neighbors. This means each neighbor lookup takes O(1) time, which is of particular importance since it is something each point must do every iteration. Our approach can also be done in parallel. I also worked on getting some of the build flow to work, which is getting to a stable place, but interfacing with hardware and getting programs to build for different architectures is never straightforward.
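A software sketch of this scheme looks like the following. The names and the hash packing are my own for illustration (the eventual hardware version will use BRAM-backed tables rather than a hashmap), but it shows the key idea: a particle’s neighbors can only live in its own cube or one of the 26 adjacent cubes, so every lookup inspects a constant 27 cells.

```cpp
#include <cmath>
#include <unordered_map>
#include <vector>

struct Vec3 { float x, y, z; };

// Uniform grid for neighbor search: particles are binned into cubes of
// side `cell`; neighbor candidates are gathered from the 3x3x3 block of
// cubes around a query point.
struct Grid {
    float cell;
    std::unordered_map<long long, std::vector<int>> cells;

    long long key(int cx, int cy, int cz) const {
        // pack three small signed cube coords into one 64-bit key
        return ((long long)(cx & 0x1FFFFF) << 42) |
               ((long long)(cy & 0x1FFFFF) << 21) |
               (long long)(cz & 0x1FFFFF);
    }
    void insert(int id, const Vec3& p) {
        cells[key((int)std::floor(p.x / cell),
                  (int)std::floor(p.y / cell),
                  (int)std::floor(p.z / cell))].push_back(id);
    }
    std::vector<int> neighbors(const Vec3& p) const {
        std::vector<int> out;
        int cx = (int)std::floor(p.x / cell);
        int cy = (int)std::floor(p.y / cell);
        int cz = (int)std::floor(p.z / cell);
        for (int dx = -1; dx <= 1; ++dx)         // scan the 27 adjacent cubes
            for (int dy = -1; dy <= 1; ++dy)
                for (int dz = -1; dz <= 1; ++dz) {
                    auto it = cells.find(key(cx + dx, cy + dy, cz + dz));
                    if (it != cells.end())
                        out.insert(out.end(), it->second.begin(), it->second.end());
                }
        return out;
    }
};
```

Since each of the 27 cell lookups is independent, this maps naturally onto parallel hardware.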

Next week I will be working on the design report, as that deliverable is due next week. Additionally, I will start taking benchmarks of the CPU implementation and finding out how to benchmark our HLS implementation once we get it working.

Team Status Report for 02/26/2022

Currently, the most significant risk we are facing is figuring out where to do the rendering, as it may force an architecture change. We have multiple contingency plans and are currently evaluating the risk/reward/effort of each (we are meeting again tomorrow to finalize our decision).

The most important task for the project at this stage is to address the above and get some MVP form of Scotty3D running so we can complete other tasks.

No changes were made to the schedule, as we believe we can catch up during Spring Break and with some hopefully provided slack days. Changes to the system design have not been made for now.

Ziyi’s Status Report for 02/26/2022

This week was a bit confusing in terms of workflow for me. As of now, we are still trying to fully compile the project in Vitis. We had to deal with some troubleshooting in terms of connectivity to the FPGA, but I would say that the build system is mostly stable (nothing that can’t be solved by the solutions we had from last week).

Other than this, most of my effort this week was spent trying to figure out the best way to implement the nearest-neighbor lookup for the fluid simulation algorithm. We realized that this issue was of primary concern, as how we choose to store these values will determine the distribution of BRAM usage between the pipeline stages, which in turn determines the overall organization of pipelining and unrolling for our kernel. At the beginning of the week, we thought we would use a three-level hashmap to implement the lookup from an (X, Y, Z) address to the neighbor list.

The problem with this design is that while it would theoretically be great for a sparse tree (which is what our hashmap would likely represent), the implementation is not realistically realizable in hardware. Since you can’t “allocate” a BRAM at runtime, every pointer in each of the BRAMs must have hardware backing, so every pointer must point to a dedicated BRAM. A back-of-the-envelope calculation tells us that if we used integers to index, we would need 2^64 BRAMs to implement this design. Note: this would require roughly 8.5 × 10^16 Ultra96s.

Instead, we realized that we would need to impose some restrictions in order to feasibly implement the kernel. Our first idea was to figure out the realistic range of both the (X, Y, Z) points and the number of neighbors within a hash bucket. For the first part, we figured this was important because we could restrict the simulated fluid to a “simulation window,” where every point that falls outside this window is removed from the simulation. This also has the effect of reducing the address space we need to support for (X, Y, Z). If this range is small enough, then we can create a unique hash that fits within 32 bits just by bit-shifting the (X, Y, Z) coordinates. For the second part, we want to figure out a reasonable upper limit on neighbors so that we can determine how wide the BRAM needs to be to store a list of all the neighbor pointers. As a side note, with a simulation size of 512 particles, we can uniquely address each particle using a 9-bit address. We will need to investigate how we can efficiently pack this as a struct in the neighbor array in a way that works with the C++ for Vitis HLS.
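The bit-shift hash can be sketched concretely. The 10-bit-per-axis quantization (a 1024-cell window per axis) is an assumption for the example; with it, three coordinates pack into 30 bits of a 32-bit word, and a 512-particle index fits in 9 bits:

```cpp
#include <cstdint>

// Hypothetical cell hash: assuming each quantized coordinate fits in
// 10 bits, three coordinates pack into one 30-bit key by shifting.
uint32_t cell_hash(uint32_t x, uint32_t y, uint32_t z) {
    return (x & 0x3FF) | ((y & 0x3FF) << 10) | ((z & 0x3FF) << 20);
}

// Pack a 9-bit particle index alongside the hash, e.g. for a combined
// record stored in one BRAM word.
uint64_t pack_record(uint32_t hash, uint32_t particle_id) {
    return ((uint64_t)hash << 9) | (particle_id & 0x1FF);
}
```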

As for next week, we will continue rewriting the C++ to allow for Vitis HLS compilation of the fluid simulation kernel.

Alice’s Status Report for 02/26/2022

Finally got Scotty3D to build on the FPGA!!

It has been hard to make much progress this week since I was reviewing for the Design Review presentation and working on the paper. I feel we are behind schedule due to so much time being dedicated to presentations.

Currently we are trying to get Scotty3D to run on the fabric, so we can compare that to the FPGA version of Scotty3D.

Over the past week we’ve had issues with the FPGA crashing, presumably due to high CPU usage, and I realized that the probability of being able to render graphics on the CPU is low. We might pivot to having a non-graphical version of Scotty3D run on the FPGA, outputting the 3D position of each of the 512 particles per frame to some sort of text file, and letting a host computer (such as my PC or my laptop) do the rendering instead. This is, after all, how big data companies structure their platforms. I am going to investigate this with Jeremy and Ziyi for another day or two before making any decisions. This is of course linked to my task of running the simulation headless. Another option is to have the FPGA communicate with the laptop over USB.

Next week my goal is to address this issue. It seems a bit pressing since it might require an architecture change.