This week, I focussed on two primary aspects of the project:
- Ethical considerations and how they will adjust the benchmark. For this, I have made some minor changes to the model so that it simply refuses to autocomplete certain types of text, e.g. medical content or requests for urgent action.
- Analysing the Microsoft BitNet paper to suggest performance improvements that we could target.
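The refusal behaviour in the first point can be pictured as a gate in front of the model. The following is only a minimal sketch of that idea: the category names, the keyword patterns, and the `complete` callback are all hypothetical placeholders, not the actual criteria or model interface used in the project.

```python
import re

# Hypothetical refusal categories and keywords (assumptions for illustration only).
REFUSAL_PATTERNS = {
    "medical": re.compile(r"\b(dosage|diagnosis|symptoms?|prescription)\b", re.IGNORECASE),
    "urgent_action": re.compile(r"\b(urgent(ly)?|immediately|emergency)\b", re.IGNORECASE),
}


def refusal_category(prompt):
    """Return the first matching refusal category, or None if the prompt is allowed."""
    for category, pattern in REFUSAL_PATTERNS.items():
        if pattern.search(prompt):
            return category
    return None


def guarded_autocomplete(prompt, complete):
    """Call the (hypothetical) completion function only for allowed prompts.

    Returns None when the gate refuses, so the caller can count refusals
    separately from real completions when scoring the benchmark.
    """
    if refusal_category(prompt) is not None:
        return None
    return complete(prompt)
```

Returning `None` rather than an empty string keeps refusals distinguishable from genuinely empty completions, which matters when the benchmark has to account for both.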
Overall, what I was able to achieve this week:
- Reduced the hallucination rate by over 6%, although this naturally came at the expense of the model simply refusing to provide an output.
- Identified the look-up-table implementation and the indexing system as the major speedups, which would provide 40% more throughput in the system.
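To make the look-up-table idea concrete, here is a small sketch of a table-based matrix-vector product for ternary weights. It is my own simplified illustration, not necessarily the exact scheme in the paper: the group size `GROUP` and the way weight groups are mapped to indices are assumptions. The point is that each group of weights is stored as one index, and the multiply-accumulate for that group becomes a single table read.

```python
from itertools import product

GROUP = 2  # assumed number of ternary weights packed into one index
PATTERNS = list(product((-1, 0, 1), repeat=GROUP))  # all 3**GROUP weight patterns
PATTERN_INDEX = {p: i for i, p in enumerate(PATTERNS)}


def pack_weights(row):
    """Offline step: replace each group of ternary weights with a table index."""
    assert len(row) % GROUP == 0
    return [PATTERN_INDEX[tuple(row[i:i + GROUP])] for i in range(0, len(row), GROUP)]


def build_tables(x):
    """Per activation vector: for each group, precompute every possible partial sum."""
    tables = []
    for i in range(0, len(x), GROUP):
        chunk = x[i:i + GROUP]
        tables.append([sum(w * a for w, a in zip(p, chunk)) for p in PATTERNS])
    return tables


def lut_matvec(packed_rows, x):
    """Matrix-vector product using only table lookups and additions."""
    tables = build_tables(x)
    return [sum(t[idx] for t, idx in zip(tables, row)) for row in packed_rows]
```

The tables are built once per activation vector and reused by every weight row, so the saving grows with the number of rows.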
My goals for next week are:
- Connect to the FPGA wirelessly and transmit the query onto the board. This should be simple once Linux is booted on the board's core, so I will probably do it before we start working on the synthesis flow.
- Prepare more on Vitis to see how I would synthesize a basic block that detects the query and pastes the exact same text into the output. This can be seen as a preliminary step: we would simply replace the short circuit with our model in order to complete the system.
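On the software side, the two goals above amount to a send-and-echo round trip. The sketch below is only a host-side stand-in under my own assumptions: it uses plain TCP, and the echo server plays the role that the short-circuit block would play on the board. The function names, the buffer size, and the use of localhost are hypothetical; the real block would be synthesized in Vitis.

```python
import socket
import threading


def serve_echo_once(server_sock):
    """Accept one connection and send back exactly what was received."""
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def send_query(host, port, query):
    """Send a query string and return whatever the other side sends back."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # tell the server the query is complete
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")


def loopback_demo(query):
    """Run the echo server locally on a free port and round-trip one query."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=serve_echo_once, args=(server,), daemon=True)
    worker.start()
    try:
        return send_query("127.0.0.1", port, query)
    finally:
        worker.join(timeout=5)
        server.close()
```

Once the board is reachable over the network, `send_query` could be pointed at its address instead of localhost, and the same check (output equals input) would confirm the short circuit works before the model replaces it.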
I wanted to keep the goals for the coming week fairly conservative, given that we are finally going to start interfacing with hardware, which always comes with a number of challenges around setup and use of the system. At the same time, I still think the goals listed above are reasonable.
We’re currently well ahead of schedule (approx. 2 weeks).