Team E1: Texcelerate – A hardware accelerator for private intelligence

We aim to address two current challenges in ML applications:

Models are too heavyweight to run on local machines.
Models consume excessive energy, making them environmentally unsustainable.

To address these challenges, we plan to develop an FPGA-based accelerator as a precursor to an ASIC capable of running smaller, lightweight “bitnets” locally.

Bitnets are highly quantized versions of their bulkier base models. Recent research by Microsoft, Tsinghua University, and the Chinese Academy of Sciences has shown that such models can be trained with minimal loss in output quality.
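
To illustrate what "highly quantized" can mean in practice, here is a minimal sketch that assumes ternary weights (-1, 0, +1) with absmean scaling, the scheme described for BitNet b1.58. The model we eventually select may use a different scheme, and the function names quantize_ternary and ternary_dot are hypothetical, chosen only for this example.

```python
def quantize_ternary(weights, eps=1e-8):
    """Map a flat list of float weights to {-1, 0, +1} plus one scale factor.

    Assumption: absmean scaling (scale = mean absolute weight), used here
    purely as an illustration of aggressive quantization.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    quantized = []
    for w in weights:
        # Round the scaled weight to the nearest integer, then clip to [-1, 1].
        q = round(w / (scale + eps))
        quantized.append(max(-1, min(1, q)))
    return quantized, scale


def ternary_dot(quantized, scale, activations):
    """Dot product that needs only additions and subtractions, then one rescale."""
    acc = 0.0
    for q, a in zip(quantized, activations):
        if q == 1:
            acc += a
        elif q == -1:
            acc -= a
        # q == 0 contributes nothing, so the term is skipped entirely.
    return acc * scale


if __name__ == "__main__":
    q, s = quantize_ternary([0.9, -0.05, -1.2, 0.4])
    print(q)  # [1, 0, -1, 1]
    print(ternary_dot(q, s, [1.0, 2.0, 3.0, 4.0]))
```

Because every weight is -1, 0, or +1, the inner loop contains no multiplications. That property is what makes this kind of model attractive for a small FPGA datapath, although a hardware design would differ substantially from this software sketch.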

Our proof of concept will demonstrate architectural improvements, achieving faster text generation than FPGA-based CPU/GPU systems of similar size and power class. We will validate our approach using a heavier text completion model.

We are currently identifying the bitnet model to accelerate, evaluating candidate models against the following criteria:

The models should be small enough to run on the FPGA’s limited hardware resources.
The models should produce outputs good enough for applications such as text or code completion, with predictive text completion as a future goal.
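
The first criterion above can be checked roughly with a back-of-the-envelope estimate of weight storage. The sketch below is only an illustration: the parameter count, bit widths, and memory budget are hypothetical placeholders rather than figures for any model or FPGA we have selected. It also counts weights only, ignoring activations, caches, and any layers kept at higher precision.

```python
def weight_footprint_bytes(num_params, bits_per_weight):
    """Bytes needed to store the weights alone, rounded up to a whole byte."""
    return (num_params * bits_per_weight + 7) // 8


def fits_in_budget(num_params, bits_per_weight, budget_bytes):
    """True if the packed weights fit in the given memory budget."""
    return weight_footprint_bytes(num_params, bits_per_weight) <= budget_bytes


if __name__ == "__main__":
    params = 100_000_000          # hypothetical model size
    budget = 32 * 1024 * 1024     # hypothetical 32 MiB memory budget

    # Ternary weights packed at 2 bits each versus 16-bit weights.
    print(weight_footprint_bytes(params, 2))   # 25000000
    print(fits_in_budget(params, 2, budget))   # True
    print(fits_in_budget(params, 16, budget))  # False
```

With these placeholder numbers, the same model fits when its weights are packed at 2 bits each but not at 16 bits, which is the kind of comparison we use when shortlisting candidates.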