2/12/2018
Jen:
This week I fell asleep on a bus on my way back to my capstone group before I submitted a weekly update. I’m dead serious that capstone is the reason I spent money on a bus to get back 16 hours earlier than I otherwise would. This week I:
– Had some nice seg faults in the benchmark code as I added an option_id to the struct currently in the code (resolved)
– Had some more nice seg faults in the benchmark code as I added timing data to track when each individual pricing finished (resolved)
– Wrote some Python to create a hashed option ID out of the option data
– Wrote a function in Python to create the packets of data (a rough sketch of both follows below)
– Wrote a UART-to-packet parser module in SV
– Wrote a TB for the SV module
– Made them compile and then realized it wasn’t working for some reason (not resolved)
– Gave up and started an SV module for sending results out via UART
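Roughly, the two Python pieces look like the sketch below. The field names, the 32-bit hash truncation, and the `<I5f` packet layout are illustrative assumptions, not our final format:

```python
import hashlib
import struct

def option_id(ticker, strike, expiration, is_call):
    """Derive a compact, stable ID by hashing the option's identifying fields.
    (Field choice and 32-bit truncation are assumptions, not our exact scheme.)"""
    key = f"{ticker}:{strike}:{expiration}:{is_call}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")

def make_packet(opt_id, spot, strike, years_to_expiry, rate, volatility):
    """Pack one option into a fixed-size binary packet for the UART link.
    '<I5f' = little-endian uint32 ID followed by five 32-bit floats (24 bytes)."""
    return struct.pack("<I5f", opt_id, spot, strike, years_to_expiry, rate, volatility)
```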
Xinna:
I finished up with parsing sample result files in the GUI and managed to create a live representation of which stock is currently being operated on and the number of transactions finished. I am still trying to get the graph to generate live properly. I have also been stuck in lab trying to get UART to work. Turns out I accidentally had a ser.read() in the Python sending code. Oops..
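For reference, the send path is just a write loop with pyserial; the port name and baud rate below are placeholders for whatever the board enumerates as:

```python
import serial  # pyserial

# Port name and baud rate are placeholders, not our actual lab setup.
ser = serial.Serial("/dev/ttyUSB0", baudrate=9600, timeout=1)

def send_packets(packets):
    for pkt in packets:
        ser.write(pkt)
        ser.flush()
    # Note: no ser.read() here -- a stray blocking read in the send path
    # stalls the sender while it waits for bytes that never arrive.
```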
Dan: We have been working on everything. Wish I could be more specific than that, but time is not bountiful at the moment. We have been working to resolve UART issues and some weird inaccuracy bugs.
Group: Crunch time.
24/11/2018 (EU Again)
Individual:
Jen:
Amazingly enough, there was yet another bug because I had a typo in my code. I honestly forget what else happened on Monday because as far as I know, we were all braindead and tired, but I do remember Tze Meng finding a bug because I misspelled “parallel.” The C benchmark seg faults because there’s still an issue with the threads.
I’ve written the Python parsing such that it can output a file that the C benchmark can read once the C benchmark stops seg faulting. I also found the interest rate we’ll be using for the benchmarks.
I need to fix the seg faults in the C benchmark, do some unfortunate bit-processing in the Python code for the SV benchmark, and then work on the memory architecture.
Xinna:
I worked on debugging the UART on Monday and looked at the waveforms. I found a problem with when I was enabling the counter and fixed that; however, now it thinks it is receiving a new packet of data too often. I also need to edit how I am resetting the clock-sampling counter. I could not get to fixing that over this break because my internet at home is super bad, so it will have to wait until I return from break. In the meantime, however, I have begun working on the GUI in Visual Studio. I am currently still just setting things up in terms of UI and getting warmed up again with C#. This next week, I will work on finishing up debugging the UART and on getting the GUI to graph out results from a sample data set.
Dan: Happy Thanksgiving! I have been working on the individual memory controller and data feeder for the Black-Scholes modules, since we need to be able to feed data in faster than the maximum throughput of our Black-Scholes modules. This can be accomplished by running multiple clock domains, which lets the memory module and its logic run at a faster rate than the rest of the design.
Group:
As a group we have been working on dividing the final tasks more evenly. As of right now, Dan is the only one who knows how to make Vivado actually cooperate, so trying to split the RTL design tasks is a little more complicated than initially planned, but doing some of the algorithm and module design is still possible.
11/17/2018
Individual:
Jen:
This week I realized that the benchmarks weren’t working properly due to a mistake in the parsing in the C code. After I spent three days on that bug and moved back to the Python processing part, Dan found my bug, and that’s fixed now. I did work on the Python processing, and am working on parsing the data out of English (e.g., converting two dates like Oct. 13, 2018 and an expiration date into just a number in terms of years) and then packaging that into bits for the FPGA. Unfortunately, this has put me a bit behind schedule, but I aim to finish all the processing by tomorrow night. That way I can work on the memory architecture on Monday and still be able to ask Dan questions about the specifics / read up on it before the break.
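The date-to-years conversion is the simple part; a minimal sketch (the 365-day count convention is an assumption, since a 252-trading-day basis is also common) looks like:

```python
from datetime import date

DAY_COUNT_BASIS = 365.0  # assumption: calendar-day convention

def years_between(quote_date: date, expiration: date) -> float:
    """Convert two calendar dates into a time-to-expiration in years,
    ready to be fixed-point encoded for the FPGA."""
    return (expiration - quote_date).days / DAY_COUNT_BASIS

# Example: about 0.25 years between Oct. 13, 2018 and Jan. 12, 2019
t = years_between(date(2018, 10, 13), date(2019, 1, 12))
```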
Xinna:
This week I found a bug in the log function and worked on fixing that. It turns out I was accidentally calculating ln(x) instead of log(x). Initial testing showed that it was very accurate, to the precision of typing the values into Google’s calculator. I will have to check the speedups gained on Monday. I am also currently working on fixing the UART receiving module on the FPGA. I have run it in the waveform viewer and noticed some issues with the state machine being stuck in the data state. It could be an issue with the test module I wrote, but I am still investigating.
Dan:
This week I raged at Vivado for an excessive amount of time again. My blood pressure may be in bad shape by the end of this semester, but my doctor has proactively prescribed high blood pressure medication, so I should make it to graduation. Through hard work and determination, I figured out how to get memory modules for block RAM working in Vivado, which should greatly help with the speed at which we can process data.
Group:
As a group, we’ve been adjusting the schedule a lot. We’re still on track to have a deliverable project, but we definitely need to stick hard to the schedule this week and over winter break.
11/10/2018 (Now we’re NA again)
Individual:
Jen:
This week I fixed our C benchmarks for SIMD and for OpenMP threading. I also reformatted our code so that we can more easily test the accuracy of different optimizations. Next week I’ll be working more on the data packet formatting for the FPGA.
Xinna:
This week I finished up with optimizing exponentials and logarithms. I will continue on to find optimizations that can be used to improve division. I will also begin debugging why the UART is not working.
Dan:
This week I fought Xilinx some more and ended up figuring out that I cannot trust the documentation that Xilinx provides. It turns out that ready signals do not always mean ready: the ready signal requires that the start signal be asserted in order to assert, which means the user cannot tell whether the module is ready without starting it, in which case it starts and is no longer ready. The module also accepts inputs at times when doing so stalls it from completing its operation, so manual delays have to be inserted to prevent that.
I also finished the demo sample and have begun working on bringing up the memory interface for storing our round data.
Group:
This week we had the demo, so we finished up benchmarking on the CPU and the FPGA. Next week we will begin connecting all the pieces and optimizing a bit more.
03/11/2018 (That’s right. We’ve switched to European dates as well):
Individual:
Jen:
This week I worked on getting a working C benchmark for comparison. I thought that it was working with OpenMP, but I actually think I messed up and it’s sequential. The numbers I got were 0.18 ms per Black-Scholes calculation. I’m currently working on getting the OpenMP + SIMD instruction benchmark, and I want to get a graph of how the numbers compare for our demo on Wednesday.
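For reference, the math the benchmark is timing is just the closed-form Black-Scholes price. A minimal Python version of the sequential baseline (European call only; the put follows from put-call parity):

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(spot, strike, years, rate, vol):
    """Closed-form European call price."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt(years))
    d2 = d1 - vol * sqrt(years)
    return spot * norm_cdf(d1) - strike * exp(-rate * years) * norm_cdf(d2)
```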
Xinna:
This week I continued working on figuring out how to improve the accuracy of calculating exponentials fast. Using an existing implementation, we managed to significantly reduce the area required for the calculation; however, it was inaccurate (off by 0.01). I read up on this paper: https://www.researchgate.net/publication/272178514_Fast_Exponential_Computation_on_SIMD_Architectures, which involves breaking e^x into 2^(xi + xf), where xi is an integer and xf is a fractional part (between -1 and 1). I found a more accurate decomposition, e^x = 2^i * e^f, where f lies in the narrower interval [-ln(2)/2, ln(2)/2]. I am still working out the math for the coefficients needed to make the implementation more accurate.
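A quick sketch of that range reduction in Python. Here i = round(x / ln 2) and f = x - i·ln 2, so e^x = 2^i · e^f exactly; the plain Taylor polynomial below stands in for the fitted coefficients I am still working out:

```python
import math

def exp_range_reduced(x, degree=4):
    """Approximate e^x via e^x = 2^i * e^f with f in [-ln(2)/2, ln(2)/2]."""
    i = round(x / math.log(2))          # integer multiple of ln(2)
    f = x - i * math.log(2)             # small remainder
    # A low-degree polynomial is accurate on the narrow interval for f.
    ef = sum(f**k / math.factorial(k) for k in range(degree + 1))
    return math.ldexp(ef, i)            # multiply by 2**i via the exponent bits
```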
Dan:
This week I have been preparing for the mid-point demo and writing overall controllers for our implementation. The goal for this week has been to simply create a controller for feeding in information and distributing it to the individual Black-Scholes implementations. I have been designing this to be expanded quickly to other data interfaces. We have been migrating some initial optimizations over to the FPGA as well, which seems to have increased our area effectiveness.
Group:
This week we are on schedule to get a working implementation once we finish optimization, streaming, and a GUI. We need to work on both of our benchmarks and get a midterm demonstration set up. We will be getting our schedule together this weekend.
10/27/2018:
Individual:
Jen:
This week I committed to not doing any more on-site interviews, so I’ll be in mandatory lab from now on. This means that I will average 3 more hours of visible lab time per week.
This week I bashed together an easier-to-use Python testing script so that we can double-check the accuracy of our optimized C and Verilog code. Thanks to this, I can confirm that we caught at least one wrong Verilog implementation.
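The script is essentially a line-by-line comparison against reference output. A minimal sketch of the idea (the price-in-the-last-column layout and the tolerance are assumptions for illustration):

```python
import csv
import math

TOLERANCE = 1e-3  # hypothetical accuracy threshold

def check(reference_csv, candidate_csv):
    """Flag rows where the optimized C / Verilog output drifts from the reference."""
    with open(reference_csv, newline="") as ref, open(candidate_csv, newline="") as cand:
        for n, (r, c) in enumerate(zip(csv.reader(ref), csv.reader(cand)), 1):
            expected, actual = float(r[-1]), float(c[-1])  # assume price is last column
            if not math.isclose(expected, actual, abs_tol=TOLERANCE):
                print(f"row {n}: expected {expected}, got {actual}")
```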
I also wrote a small Makefile for Xinna’s ease of use.
Xinna:
This week I worked on optimizing the exponentials in our C code by implementing a LUT for the Taylor series. I also worked on testing the C and Verilog code, and I learned more about how to optimize HLS and initialize caches.
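Conceptually, the LUT trades arithmetic for memory: precompute the function on a grid and interpolate between entries. A toy Python version of the idea (the table size and linear interpolation are illustrative, not what’s in our C code):

```python
import math

# Hypothetical LUT for e^x on [0, 1): 256 entries plus one guard entry.
LUT_SIZE = 256
EXP_LUT = [math.exp(k / LUT_SIZE) for k in range(LUT_SIZE + 1)]

def exp_lut(x):
    """Approximate e^x for x in [0, 1) via table lookup + linear interpolation."""
    idx_f = x * LUT_SIZE
    idx = int(idx_f)
    frac = idx_f - idx
    return EXP_LUT[idx] * (1 - frac) + EXP_LUT[idx + 1] * frac
```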
Dan:
I put Black-Scholes onto the FPGA and did a test run. Unfortunately, it takes about 15 minutes to synthesize, so we need to be careful to make sure our code is correct before we put it onto the FPGA. The first test run failed due to some incorrect code being synthesized, but I fixed it, and the most recent synthesis works and produces an accurate result.
Group:
10/20/2018:
Individual:
Jen:
This week I worked with Quinn to get the order process started for the data. The data order is unfortunately considered a software purchase, so it’s been delayed while University Contracts approves it.
Xinna and I also talked about the process of making the simulator and GUI. We’ve decided to use C# for the GUI and I downloaded Visual Studio. We may switch all of our processing to C# instead of Python because working with bits and bytes in Python is somewhat frustrating. I’ll have to work on getting familiar with C#, but now that I’ve deleted Overwatch for Visual Studio, I should have plenty of time for that.
However, this week has been overall extremely busy for me due to on-site interviews combined with exams, and although I’m also out next Wednesday, I hope to be more productive next week.
Xinna:
This week I was out of town the majority of the time. On Monday I worked with Dan on getting the UART to stream onto the FPGA. We weren’t sure why no data appeared on the FPGA, much less the correct data. It was difficult for me to get much work done this week: I had an on-site interview on Wednesday, and I am currently on vacation starting Friday and will be back Sunday night. Due to exams this week as well, I was not able to make much progress; I hope to double my work next week.
Dan:
This week I mastered the art of Vivado synthesis and clocking. We began working on getting the UART bridge working. We didn’t have much success, as it appeared that no communications were going through on the UART pins. I then began planning out optimization paths for our Black-Scholes implementation and ways in which we can reduce space and clock cycles. Since this is such a large path, we will be planning the optimization goals in chunks: possibly function condensing first, then lookup tables, and finally some internal pipelining.
Group:
10/13/2018:
Individual:
Jen:
This past week a lot of my efforts have been on the design presentation and design paper. For the design paper, I did some reading on other work currently out there (a paper on using HLS in HFT, a Velvetech blog post on why HFT companies use FPGAs, and a paper on optimizing for latency in HFT) for the “Related Work and Competition” section of the paper. From these papers, I concluded that FPGAs are commonly used for processing network data, sending orders out, and managing order books (lists of buy and sell orders), but that they’ve also been used for other things, including more advanced trading-algorithm processing. Their advantage also comes from the fact that CPU latency can be unpredictable depending on interrupts, while an FPGA does exactly what the RTL says, without that randomness, so its latency is extremely predictable. The first paper, on using HLS in HFT, actually talks about implementing an entire end-to-end trading system on an FPGA, so I’m hopeful that we can replicate their experience.
I also submitted a purchase request form for data from IVolatility. I’m hoping that the purchase request was detailed enough!
Next week I aim to work more on the simulator and hopefully start optimizing the HLS produced model.
Xinna:
This week I worked on the presentation and the design documentation. I added in new diagrams and worked on finalizing the components that we would have to create. I plan to write the module that would take in UART data and then output all the inputs to the Black-Scholes module after completing the paper.
Dan:
This week my endless fight against Xilinx continued. I have been successful in getting bit streams to generate without a clock. I have been attempting to figure out how Xilinx does its clock linking and generation. From the research I have done, everything seems to point to the idea that Vivado makes it easy, but that does not seem to be the case. Most forum posts state that “Vivado will take care of the clock stuff for you,” but none of them actually tell you how to let Vivado do that. I have only recently been successful at generating a bit stream while a clock was in use; however, it has not been run on the board as of yet. I also did some pre-analysis on the Black-Scholes code and ran that through a bare pass of HLS to get raw numbers on expected size and latency. I began identifying a number of optimizations we can make to improve our throughput. In the next week, I plan to be able to consistently have Vivado generate bit streams which are runnable on hardware, because as of right now, it sometimes feels like it is just chance as to whether it succeeds or fails.
Group:
As a group, we’re a little bit blocked by the lack of success in getting Vivado to work. We understand now why that one group from Fall 2017 took 4 weeks getting the software to work. In the meantime, Xinna and Jen have been working on the design presentation and paper for most of this week, and will be working on the simulator / other parts of the system until we can figure out how to get Vivado to cooperate.
10/06/2018:
Individual:
Jen:
This week I wrote code to parse out data from the CSV files we’ll receive when we order data, came up with guidelines for exactly which stocks we want, and chose stocks according to those guidelines. I started the budget sheet and wrote up an explanation of which stocks we chose and why, as well as the ones we discarded.
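The parsing itself is straightforward; a minimal sketch of the shape (the column names are hypothetical, since we don’t have the vendor’s actual format yet):

```python
import csv

def load_options(path):
    """Read option rows from a vendor CSV into plain dicts.
    Column names here are placeholders for the real header."""
    rows = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            rows.append({
                "ticker": row["symbol"],
                "strike": float(row["strike"]),
                "spot": float(row["underlying_price"]),
                "expiration": row["expiration"],
                "volatility": float(row["iv"]),
            })
    return rows
```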
I also did some research on the theoretical maximums for the FPGA and the ideal size for the data packets we’ll send, given the information we need to send per option.
Xinna:
I finished up the UART receiving code and also added the UART transmitting code. Hopefully we can figure out how to upload it onto the Virtex-7 and have the data begin transmitting to and from the FPGA. For now I’ll use a baud rate of 9600, but this may be changed to a higher rate if it is too slow for the sake of the demo.
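For a sense of why 9600 baud may be too slow: under 8N1 framing, each byte costs 10 bits on the wire, so the arithmetic works out as below (the 24-byte packet size is an assumption, since our packet format isn’t final):

```python
# Back-of-the-envelope UART throughput check.
BAUD = 9600
BITS_PER_BYTE = 10          # 1 start + 8 data + 1 stop bit under 8N1
PACKET_BYTES = 24           # assumed packet size, not final

bytes_per_sec = BAUD / BITS_PER_BYTE            # 960 B/s
options_per_sec = bytes_per_sec / PACKET_BYTES  # 40 options/s at 9600 baud
print(f"{options_per_sec:.0f} options/s")       # motivates a higher baud rate
```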
Dan:
This week I managed to complete the impossible: I made Vivado segfault not once, not twice, but three times. In the middle of those segfaults, I figured out how Vivado generates its IP blocks, found the pin mapping assignments for GPIO pins, learned what causes bitstream generation failures, and learned that the Virtex-7 has a differential clock that is not allowed to be differential during bitstream generation. I also spent an ungodly number of hours attempting to get Vivado to recognize that we do, in fact, have a board that is indeed connected to the computer. Xilinx’s “help” topics and debugging steps for the programmer literally led me to blank pages, pages for purchasing new products, and software which has been out of date for 6 years. I always figured it would be a driver issue, since these programs weren’t technically installed on our computer, but attempting to locate any type of driver information through Xilinx was a Sisyphean task, which is incredibly frustrating when you learn that the solution to all of this was a set of scripts to actually set up and install the drivers, buried in a series of nested folders under names no one would ever guess.
But it worked.
Now we begin the UART integration and the conversion of black-scholes into the RTL design.
Group: As a group we’re making pretty decent progress on different fronts. Dan has found true belief in God after a prayer made Vivado work for him. If we continue like this next week, hopefully we’ll be able to get UART working on the board and start with the data processing within a week or two. We will need to submit the order form, but since it could take some back and forth, we might need to talk to someone about how to do it.
9/29/2018:
Individual:
Jen: Spoke with a professor about pricing European and American options and learned that, provided (1) interest rates are positive and (2) there are no dividends between the current date and the expiration date, the option prices will be the same. Xinna and I sent a preliminary email to a potential data source to get a quote on the price. I talked to a few different high-frequency trading firms at the TOC, and Optiver suggested I contact their recruiting if I want to get in touch with a hardware engineer to learn more about how they do things in industry.
Goals for next week: Will be writing a Python API for processing data by Monday and getting a final set of stocks to look at.
Dan: Began using Vivado. Downloaded and walked through the first couple of basic tutorials for Vivado. I was given a separate course’s first lab, which has a basic Vivado tutorial attached to it; this came with basic matrix-multiply-style code. I took a look at the HLS breakdown to see how it converts code to Verilog, to try to get a sense of how the directives would affect the actual code generation. I did not notice a difference in the code that was generated, but when I looked at the timing, the pipelined version of this code was running slower than the non-pipelined version. I am bringing this up on the Virtex-7 just to ensure that programming will work as expected; it will not have any output or monitoring attached to it. In the next week I plan on writing my own basic code and working on learning how to integrate the HLS output with regular .sv modules, or finding out whether that is even possible.
Xinna: I continued working on sending UART packets to the FPGA. I plan to finish this by Monday and have a working deliverable. I’ve also received replies from a few companies, and their data looks promising and within our budget. If things go well on Sunday, I hope to have the FPGA to Python/C# dataflow completed as well.
Group:
As a group, at the very least Xinna and Jen are behind on our schedule because of the TOC and interviews, but we’re going to be working on it on Sunday. Once we have the list of companies, we’ll do our order form.
9/22/2018:
Individual:
Jen: This week I helped look into data sources to figure out the viability of the currently available datasets. We have sent out a few requests for more information to different companies and have looked at many potential datasets.
I also worked on setting up the WordPress’s format and wrote up the Introduction.
Dan: Attempted to find historical data which provides all the information needed to properly implement our algorithms. This involved trying to work with terrible data parsers to figure out how to grab FAST data from an EMDI scheme, because the EOBI parser refuses to work on any Linux machine even though all the parsers were designed specifically for Linux. This is even more frustrating compounded with the lack of any documentation for these parsers. The sample data files I found to check whether these data streams were appropriate must be parsed before they can be used. Attempting to use the EMDI parser yielded some results, but the fields did not contain any information on the price of the option being moved. This took a number of hours to figure out, because the parsed files were over 1 million lines of metadata, and many of the fields included “price” in the name even though none of them actually contained the price of the option/security… We are planning to call the company on Monday to see if we can get more information on what is provided in the actual data if we were to purchase it.
Xinna: Set up dummy functions for parsing in Python. I added a way to send serial data in Python, but I have no way of testing it because I can’t find Vivado.
Group:
As a group, our main risks are finding, purchasing, and parsing the appropriate data. We may have found appropriate data, but ordering it and finding parsers, or building our own, may be a challenge. Additionally, we are having issues finding the Vivado files, so we’re behind schedule in terms of learning how to use Vivado and testing our Virtex-7, but we are on schedule for getting the data to create our simulation model. We have also set up a GitHub repository in anticipation of our future code.
As a group we are mostly on schedule, and we do not currently have individual deliverables. Next week we aim to submit a purchase order for data and to get some hands on experience with Vivado.
We have not needed to change the current design of the system.