This week I explored a number of trained BitNets supported by Microsoft's BitNet inference framework (bitnet.cpp). The goal was to find a model small enough to fit on an FPGA, but capable enough to be repurposed into a viable product.
Initially, we wanted to work with a model that had only 70M parameters, in the hopes that it would fit on any FPGA we chose. However, after trying to chat with it, I found that the low parameter count led to very poor performance, as seen in my conversation with it below:
I tested a few larger models from this family (up to 10B parameters). While they perform significantly better, they are too large to fit on any FPGA we can afford (the 10B-parameter model is around 3 GB after quantization). I ultimately settled on this model, which has around 700 million parameters and comes to around 300 MB after quantization. This model is for text completion, as you can see below, so that is likely the direction we will take for our final project.
![](<http://course.ece.cmu.edu/~ece500/projects/s25-teame1/wp-content/uploads/sites/368/2025/01/Screenshot-2025-01-31-at-3.38.36 PM-300x32.png>)
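As a rough sanity check on those size figures, here is a minimal back-of-the-envelope sketch. It assumes the packed ternary format stores about 2 bits per weight (as in bitnet.cpp's i2_s packing); embedding tables, scale factors, and file metadata add overhead on top of this lower bound, which is why the real files come out somewhat larger.

```python
# Back-of-the-envelope estimate of quantized BitNet file sizes.
# Assumption: each ternary weight is packed into 2 bits (i2_s-style
# packing); this gives a lower bound, before embeddings, scales,
# and metadata overhead.

def quantized_size_mb(n_params: float, bits_per_weight: float = 2.0) -> float:
    """Approximate packed-weight size in MB for a quantized model."""
    return n_params * bits_per_weight / 8 / 1e6

for name, params in [("70M", 70e6), ("700M", 700e6), ("10B", 10e9)]:
    print(f"{name}: ~{quantized_size_mb(params):.0f} MB of packed weights")

# 70M: ~18 MB, 700M: ~175 MB, 10B: ~2500 MB (~2.5 GB) -- once
# overhead is included, this lines up with the ~300 MB and ~3 GB
# figures observed above.
```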