Jeremy’s Status Report for 02/26/2022

This week started with finishing up our design presentation; I worked on the slides for it over the weekend. Since both class sessions this week were devoted to the design presentations, my productivity was lower than usual, as class time is normally a good time for getting work done. The work that I did do this week was some troubleshooting and some design work. As a group, we finalized how we are going to store points and find neighbors. We figured that by partitioning our simulation space into cubes, we could find any point’s neighbors by searching only the cube that the point is in and the cubes adjacent to it. This means that locating the cubes to search for each neighbor lookup takes O(1) time, which is of particular importance as that is something each point needs to do every iteration. Our approach can also be done in parallel. I was also working on getting some of the build flow to work, which is getting to a stable place, but interfacing with hardware and getting programs to build for different architectures is never straightforward.
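A minimal sketch of the cube partitioning idea, written in plain C++ rather than HLS-specific code. The grid dimension, the cube size, and every name here (`cell_of`, `neighbor_cells`, and so on) are assumptions made for illustration, not values or identifiers from our code.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical constants, chosen only for illustration.
constexpr int   kGridDim  = 16;    // cubes per axis
constexpr float kCellSize = 1.0f;  // cube edge, assumed equal to the neighbor search radius

struct Cell { int x, y, z; };

// Map a position to the cube that contains it.
inline Cell cell_of(float px, float py, float pz) {
    return { static_cast<int>(std::floor(px / kCellSize)),
             static_cast<int>(std::floor(py / kCellSize)),
             static_cast<int>(std::floor(pz / kCellSize)) };
}

// True if the cube lies inside the partitioned simulation space.
inline bool in_grid(const Cell& c) {
    return c.x >= 0 && c.x < kGridDim &&
           c.y >= 0 && c.y < kGridDim &&
           c.z >= 0 && c.z < kGridDim;
}

// Flatten (x, y, z) cube coordinates into a single array index.
inline int flat_index(const Cell& c) {
    assert(in_grid(c));
    return (c.z * kGridDim + c.y) * kGridDim + c.x;
}

// Collect the flat indices of a cube and its adjacent cubes (at most 27 total).
// Returns how many entries of `out` were filled.
inline int neighbor_cells(const Cell& c, std::array<int, 27>& out) {
    int n = 0;
    for (int dz = -1; dz <= 1; ++dz)
        for (int dy = -1; dy <= 1; ++dy)
            for (int dx = -1; dx <= 1; ++dx) {
                const Cell q{c.x + dx, c.y + dy, c.z + dz};
                if (in_grid(q)) out[n++] = flat_index(q);
            }
    return n;
}
```

The number of cubes visited is bounded by 27 no matter how many particles exist, which is where the constant-time claim comes from. The three loops have fixed trip counts, so they are also natural candidates for unrolling.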

Next week I will be working on the design report, as that deliverable is due next week. Additionally, I will start taking benchmarks of the CPU implementation and figure out how to benchmark our HLS implementation once we get it working.

Team Status Report for 02/26/2022

Currently, the most significant risk we are facing is deciding where to do the rendering, since the answer may force an architecture change. We have multiple contingency plans and are currently evaluating the risk, reward, and effort of each (we are meeting again tomorrow to finalize our decision).

The most important task for the project at this stage is to address the above and get some MVP form of Scotty3D running so we can complete other tasks.

No changes were made to the schedule, as we believe we can catch up during Spring Break and with some slack days that we hope will be provided. No changes have been made to the system design for now.

Ziyi’s Status Report for 02/26/2022

This week was a bit confusing in terms of workflow for me. As of now, we are still trying to fully compile the project on Vitis. We had to do some troubleshooting of our connectivity to the FPGA, but I would say that the build system is mostly stable (nothing that can’t be solved by the solutions we had from last week).

Other than this, most of my effort this week was spent on trying to figure out the best way to implement the nearest neighbor lookup for the fluid simulation algorithm. We realized that this issue was of primary concern, since how we choose to store these values determines the distribution of BRAM usage between the pipeline stages, which in turn determines the overall organization of pipelining and unrolling for our kernel. At the beginning of the week, we thought we would be using a three-level hashmap to implement the lookup from (X,Y,Z) address to the neighbor list.

The problem with this design is that while it would theoretically be great for a sparse tree (which is what our hashmap would likely represent), it is not realistically implementable in hardware. Since you can’t “allocate” a BRAM at run time, every pointer in each of the BRAMs must have hardware backing, so every pointer must point to a dedicated BRAM. A back-of-the-envelope calculation tells us that if we used integers to index, we would need 2^64 BRAMs to implement this design. Note: this would require about 8.5 × 10^16 Ultra96s.
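The arithmetic behind that estimate can be reproduced with a short sketch. It assumes 216 block RAMs per Ultra96, which is the per-board figure that yields the 8.5 × 10^16 number above; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Assumed number of block RAMs available on one Ultra96 board.
constexpr double kBramsPerBoard = 216.0;

// One dedicated BRAM per possible pointer value: 2^address_bits BRAMs.
inline double brams_needed(int address_bits) {
    assert(address_bits >= 0);
    return std::ldexp(1.0, address_bits);  // computes 1.0 * 2^address_bits
}

// Boards required to supply that many BRAMs.
inline double boards_needed(int address_bits) {
    return brams_needed(address_bits) / kBramsPerBoard;
}
```

With these assumptions, `boards_needed(64)` comes out between 8.5e16 and 8.6e16, matching the estimate.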

Instead, we realized that we would need to impose some restrictions in order to feasibly implement the kernel. Our first idea was to figure out the realistic range of both the (X,Y,Z) points and the number of neighbors within a hash bucket. The first part matters because we could choose to restrict the simulated fluid to a “simulation window,” where every point that falls out of this window is removed from the simulation. This also has the effect of reducing the address space we need to support for (X,Y,Z). If this range is small enough, then we could create a unique hash that fits within 32 bits just by bit-shifting the (X, Y, Z) values. For the second part, we want to figure out a reasonable upper limit for neighbors so that we can determine how wide the BRAM needs to be in order to store a list of all the neighbor pointers. As a side note, with a simulation size of 512 particles, we can uniquely address each of the particles using a 9-bit address. We will need to investigate how we can efficiently pack this as a struct in the neighbor array in a way that works with the C++ for Vitis HLS.
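A minimal sketch of both ideas in standard C++. The 10 bits per axis, the `NeighborSlot` layout, and all function names are assumptions made for illustration; the actual window size and neighbor limit are still undecided. In Vitis HLS the 9-bit field would more likely be an arbitrary-precision type such as `ap_uint<9>`, which is left out here so the sketch compiles anywhere.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical width: 10 bits per axis gives a 30-bit key that fits in 32 bits.
constexpr unsigned      kAxisBits = 10;
constexpr std::uint32_t kAxisMask = (1u << kAxisBits) - 1u;

// True if a quantized point lies inside the simulation window.
inline bool in_window(int x, int y, int z) {
    const int limit = static_cast<int>(kAxisMask);
    return x >= 0 && x <= limit &&
           y >= 0 && y <= limit &&
           z >= 0 && z <= limit;
}

// Build a unique key by shifting each coordinate into its own bit range.
inline std::uint32_t pack_key(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    assert(x <= kAxisMask && y <= kAxisMask && z <= kAxisMask);
    return (x << (2 * kAxisBits)) | (y << kAxisBits) | z;
}

struct CellCoord { std::uint32_t x, y, z; };

// Recover the coordinates from a key (inverse of pack_key).
inline CellCoord unpack_key(std::uint32_t key) {
    return { (key >> (2 * kAxisBits)) & kAxisMask,
             (key >> kAxisBits) & kAxisMask,
             key & kAxisMask };
}

// One entry of a neighbor list: a 9-bit particle index (0..511) plus a valid flag.
struct NeighborSlot {
    std::uint16_t particle : 9;
    std::uint16_t valid    : 1;
};
```

Because the key is only shifts and ORs, it needs no hashing logic and cannot collide as long as every coordinate stays inside the window.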

As for next week, we will continue rewriting the C++ to allow for Vitis HLS compilation of the fluid simulation kernel.

Alice’s Status Report for 02/26/2022

Finally got Scotty3D to build on the FPGA!!

This week it was hard to make progress, since I was preparing for the Design Review presentation and working on the paper. I feel we are behind schedule because so much time has been dedicated to presentations.

Currently we are trying to get Scotty3D to run on the board’s CPU, so we can compare that to the FPGA version of Scotty3D.

Over the past week we’ve had issues with the FPGA crashing, presumably due to high CPU usage, and I realized that the probability of being able to render graphics on the board’s CPU is low. We might pivot to having a non-graphical version of Scotty3D run on the FPGA, then output the 3D position of each of the 512 particles per frame into some sort of text file, and let a host computer (such as my PC or my laptop) do the rendering instead. This is, after all, how big data companies structure their platforms. I am still investigating this with Jeremy and Ziyi for another day or two before making any decisions. This is of course linked to my task of running the simulation headless. Another option is to have the FPGA communicate with the laptop through USB.
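A minimal sketch of what the text-file handoff could look like. The file layout (a `frame <index> <count>` header followed by one `x y z` line per particle) and all names are hypothetical; no format has been decided.

```cpp
#include <array>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

constexpr std::size_t kNumParticles = 512;

struct Vec3 { float x, y, z; };
using Frame = std::array<Vec3, kNumParticles>;

// Board side: write one frame as a header line followed by one line per particle.
inline void write_frame(std::ostream& out, std::size_t frame_index, const Frame& frame) {
    out << "frame " << frame_index << ' ' << frame.size() << '\n';
    for (const Vec3& p : frame)
        out << p.x << ' ' << p.y << ' ' << p.z << '\n';
}

// Host side: read one frame back. Returns false on a malformed or truncated frame.
inline bool read_frame(std::istream& in, std::size_t& frame_index, Frame& frame) {
    std::string tag;
    std::size_t count = 0;
    if (!(in >> tag >> frame_index >> count)) return false;
    if (tag != "frame" || count != frame.size()) return false;
    for (Vec3& p : frame)
        if (!(in >> p.x >> p.y >> p.z)) return false;
    return true;
}

// Convenience helper: render one frame to a string, e.g. for logging.
inline std::string frame_to_text(std::size_t frame_index, const Frame& frame) {
    std::ostringstream ss;
    write_frame(ss, frame_index, frame);
    return ss.str();
}
```

The same two functions work on any stream, so the output could go to a file now and to another transport later. Default stream formatting keeps only about six significant digits, so the precision would need to be raised if the host must reproduce positions exactly.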

Next week my goal is to address this issue. It seems a bit pressing since it might require an architecture change.

Team’s Status Report 02/19/2022

If you’ve already looked at our individual reports, you’d know that much of our week was spent on realizing the Y2k22 issue with Vitis, so we won’t go on too much about it here. Instead, we’ll focus on the positives. 😊

To start, we now have a local instance of Vitis running on Alice’s laptop! The point of this is that we will no longer be restricted to the build tools present on the ECE machines. As the current Scotty3D code requires a newer version of CMake, the ECE machines were unable to compile the program. Things should be more straightforward on a local machine, where we can easily install whatever packages we need. In any case, it seems like our build platform woes are hopefully coming to an end…

 

In terms of algorithmic improvements, we met a couple of times throughout the week to analyze the code in the fluid.cpp file itself, rather than just identifying the dependencies in the overall algorithm. Of primary importance for these meetings was determining how to represent each of the datatypes and determining what data we would be able to store on chip in the BRAMs, as well as how many copies of the data we would actually be able to store. One structure of particular importance was the neighbor map, and how we would actually implement it on the FPGA. In the regular implementation, the neighbor map is simply an unordered hash map that uses a nearest quantized point to index into a list of neighboring particles. While this implementation is fine in software, if we actually want a performant hardware implementation, we’ll likely need to manually implement the hashmap as a BRAM array of pointers to another BRAM array of particles.

Aside from that, we also took a look at the looping structures in the code and assessed their ability to benefit from pipelining. In fact, we found that a lot of the steps in the code could actually be pipelined. Steps 2 and 3 seem to have no interdependencies, so we can probably unroll and pipeline those two steps. Other than that, we also took note of the instances of multithreading code that we would need to strip from the fluids.cpp file due to deprecation.
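A minimal sketch of the two-array layout, using fixed-size arrays as stand-ins for the BRAMs. The cell count, the names, and the counting-sort build step are assumptions for illustration; this is one possible layout, not the one in Scotty3D.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kNumCells     = 64;   // hypothetical number of quantized cells
constexpr int kNumParticles = 512;

// Two fixed-size arrays stand in for the two BRAMs:
// cell_start[c] "points" at the first slot of cell c in particle_ids,
// and cell c owns slots [cell_start[c], cell_start[c + 1]).
struct NeighborMap {
    std::array<std::uint16_t, kNumCells + 1> cell_start{};
    std::array<std::uint16_t, kNumParticles> particle_ids{};
};

// Fill the map from each particle's cell index using a counting sort.
inline void build_neighbor_map(
        NeighborMap& m,
        const std::array<std::uint16_t, kNumParticles>& cell_of_particle) {
    std::array<std::uint16_t, kNumCells> count{};
    for (int i = 0; i < kNumParticles; ++i) {
        assert(cell_of_particle[i] < kNumCells);
        ++count[cell_of_particle[i]];
    }
    // A prefix sum turns per-cell counts into start offsets.
    m.cell_start[0] = 0;
    for (int c = 0; c < kNumCells; ++c)
        m.cell_start[c + 1] = static_cast<std::uint16_t>(m.cell_start[c] + count[c]);
    // Scatter particle indices into their cell's slot range.
    std::array<std::uint16_t, kNumCells> cursor{};
    for (int c = 0; c < kNumCells; ++c) cursor[c] = m.cell_start[c];
    for (int i = 0; i < kNumParticles; ++i)
        m.particle_ids[cursor[cell_of_particle[i]]++] = static_cast<std::uint16_t>(i);
}
```

Nothing is allocated dynamically and both array sizes are known at compile time, which is the property a BRAM-backed version needs. The cost is that the map must be rebuilt whenever particles move between cells.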

For the next week, our primary goal is to get a build of the fluids.cpp kernel working in Vitis HLS, as this will give us the baseline results we need.

Ziyi’s Status Report for 02/19/2022

Much of this week was spent realizing that Vitis has been broken on the ECE machines since the start of the new year. The very same Y2K22 bug that plagued Microsoft Exchange was affecting the build process for compiling to hardware on Vitis. Unfortunately, this took a bit of dredging to find on the Vitis forums, and so we lost a couple of days of progress, as we thought that it was an issue with our personal configuration rather than an issue with the system itself. Thankfully, after we pointed out the issue to Professor James Hoe, he was able to quickly apply the patch that fixes the build tools. Finally, we were able to compile a project and generate a PetaLinux image to flash onto the FPGA. But then we needed to actually interface with the FPGA. Unfortunately, due to some weird configurations, the FPGA’s internal WiFi was not automatically set up, so we needed to interface with it through mouse and keyboard (we were also missing the Mini DisplayPort cable, so we had to overnight that).

After finally gaining access to the FPGA interface, we were able to connect the Ultra96 to our local router. Now, we are able to remote into the FPGA whenever we are connected to the local router. We’ll still need to do some poking around in order to gain access to the FPGA when on campus, as our apartment network does not play nicely with port forwarding, but I’m sure we can figure something out. Either way, this is a good start for having a more streamlined development platform. We might decide to set up the board in 1307, so we can just VPN in, but it’s flexible.

Other than this, we did some specific code analysis on the fluids.cpp file, but we’ll talk more about this in the team report.

Next week, my main goal is to compile a baseline kernel of the step2 function (which consists of the main body of the fluid simulation computational kernel) using Vitis HLS. This will involve significant tinkering with the code and perhaps refactoring into more friendly datatypes. The best case scenario is that everything just compiles, but that’s likely quite the pipe dream.

 

 

Jeremy Dropkin Status 02/19/2022

This week, unfortunately, we spent a long time discovering that there was a bug in the version of Vitis installed on the CMU ECE number machines. For much of the week we were trying to work out why we were facing build errors on the ECE machines and were unable to compile any project in Vitis, even the basic example ones. Eventually we determined that the bug came from the way Vitis stored a date as an integer, in a format that broke on Jan 1, 2022.
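A minimal sketch of one encoding that fails in exactly this way: a timestamp written as the decimal digits YYMMDDHHMM and stored in a signed 32-bit integer. This is the layout commonly reported for the Y2K22 bug in general; whether Vitis uses this exact layout internally is an assumption here, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Encode a timestamp as the decimal number YYMMDDHHMM, computed in 64 bits
// so the result itself never overflows.
inline std::int64_t encode_yymmddhhmm(int yy, int mm, int dd, int hh, int minute) {
    assert(yy >= 0 && yy <= 99 && mm >= 1 && mm <= 12 && dd >= 1 && dd <= 31);
    assert(hh >= 0 && hh <= 23 && minute >= 0 && minute <= 59);
    std::int64_t v = yy;
    v = v * 100 + mm;
    v = v * 100 + dd;
    v = v * 100 + hh;
    v = v * 100 + minute;
    return v;
}

// True if the value can be stored in a signed 32-bit integer without overflow.
inline bool fits_in_int32(std::int64_t v) {
    return v >= std::numeric_limits<std::int32_t>::min() &&
           v <= std::numeric_limits<std::int32_t>::max();
}
```

The last minute of 2021 encodes to 2112312359, which still fits below the 32-bit limit of 2147483647. The first minute of 2022 encodes to 2201010001, which does not.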

In addition to dealing with compiler errors, we also figured out which portions of the fluid rendering algorithm we are able to pipeline and apply loop unrolling to. I personally was able to gain a much deeper understanding of how the fluid rendering algorithm works internally and structures its data. This helped me understand how we will structure the hardware to massively accelerate the algorithm. I also began thinking about how to schedule requests, and worked on the slides for the design presentation.

I think that we are slightly behind schedule due to the issue with Vitis, but because of the way we created our schedule, I think that we are in a very recoverable position.

Alice’s Status Report for 2/19/2022

This week I was able to get Vitis set up on my laptop. We are now using my laptop as the testing platform since getting Scotty3D to compile was near impossible on the Andrew Linux machines, so we decided to pivot to using a machine that already had Scotty3D working. This process unfortunately took about 12 hours this week due to trying out various machines and hard drives and having to clear up space for Vitis, Vitis HLS, and Vivado (the install required 200 GB).

I am currently working on getting Scotty3D to build in Vitis on my laptop. In particular, I am working on getting Vitis to respect the CMake build system of Scotty3D. Based on some reading, it seems promising that Vitis will be able to do this: there is a section on the Xilinx documentation website for Vitis and Makefiles. In addition to this, I’m also working with Ziyi and Jeremy on rewriting the Scotty3D code to be more hardware friendly, for instance, getting rid of recursion in the collision detection function and getting rid of unnecessary member variables in the Particle object (among other things).

We realized that all the rendering should be done on the board’s CPU (the processing system, as opposed to the FPGA fabric), so we don’t need OpenGL to work on the FPGA. OpenGL should behave like any other C++ library.

Next week I definitely want to get Scotty3D built through Vitis. Alongside the software rewrites, I will also start looking at making a lightweight version of Scotty3D so we can separate out only what we are aiming to accelerate from the rest of Scotty3D. I also want to follow through on my command-line interface for an easier workflow. I am *slightly* behind on my schedule, but not worried at all, as I estimate that I am only 1 or 2 days of work behind.

Team Report – 2/12/2022

Currently the most significant risk that we are facing is that the C++ code isn’t compiling. Many of the C++17 features used in the codebase are ones we are not familiar with. We are making good progress on learning about these features, however, and are in communication with the main developer of Scotty3D about our various compilation issues.

The most important task for the project at the moment is getting OpenGL to work on the FPGA. Alice and Ziyi worked on it briefly this week, but more work is necessary.

No changes were made to the system design or schedule.

Alice’s Status Report for 02/12/2022

This week I worked on getting rid of dead code and re-working some of the imports so that fluid.cpp would be reliant on as few other libraries as possible, and started to make the flowchart for how the different data structures work together. I generated some dependency graphs in order to visualize which parts of Scotty3D are critical to the fluid simulation and UI. 

 

I also worked heavily with Ziyi this week. I helped Ziyi debug some compilation issues that arose when trying to compile Scotty3D in Vitis. There are various issues with the CMake and C++ versions, since a large portion of the codebase is implemented with C++17 features. We also took on getting OpenGL to work on the Ultra96 together.

The initial steps for this project are slow and a bit confusing, but we are still making progress at the rate that we expected. Next week I hope to complete the flowchart diagram and work with Ziyi to get Scotty3D compiled in Vitis, and make significant progress on getting OpenGL and a simple graphics demo to run on the FPGA.