Anirudhp_29thJan2025

I am currently working on recreating the 1.58-bit FLUX model announced by ByteDance Research.

However, at this time, the model they trained achieves a 7.7x size reduction relative to the existing 23.5 GB FLUX model released by Black Forest Labs. That still leaves a model in excess of 3 GB (23.5 GB / 7.7 ≈ 3.05 GB), which cannot be accommodated on the FPGAs we have access to (max size 2 GB).
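
As a back-of-envelope check (the ~11.9B parameter count and 2-bit packing below are my assumptions, not figures from the paper):

```python
# Rough size estimate for ternary FLUX weights. Assumption: ~11.9B
# transformer parameters, packed at 2 bits per weight (the practical
# encoding for {-1, 0, +1}, vs. the theoretical 1.58 bits).
params = 11.9e9
packed_gb = params * 2 / 8 / 1024**3
print(f"packed weights: {packed_gb:.2f} GB")   # ~2.77 GB
print(f"paper's ratio:  {23.5 / 7.7:.2f} GB")  # ~3.05 GB
```

Either way, the quantized model lands well above the 2 GB limit.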

As a result, I have replicated the quantization process for the FLUX model. However, even though the model weights were open-sourced by Black Forest Labs, the training code and training data were not released alongside them. I am therefore trying to adapt the quantization system to a fully open-source text-to-image system such as DALL-E Mini or the first FLUX.1 Dev model that was released; a minimal sketch of the core quantization step follows.
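
For reference, the weight-quantization step looks like this; a minimal sketch in the style of BitNet b1.58's absmean scheme (the function and per-tensor scaling are my reconstruction, since ByteDance's code is unreleased):

```python
import torch

def quantize_ternary(w: torch.Tensor, eps: float = 1e-5):
    """Ternary (1.58-bit) round-to-nearest quantization, BitNet-b1.58 style.

    Sketch only: scale weights by their mean absolute value, then round
    into {-1, 0, +1}. Dequantize as w_q * scale.
    """
    scale = w.abs().mean().clamp(min=eps)    # per-tensor absmean scale
    w_q = (w / scale).round().clamp(-1, 1)   # ternary weight values
    return w_q, scale
```

Applying this to each linear layer of the transformer yields the 1.58-bit weights; activations typically stay at higher precision.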

That said, the FLUX model, when quantized to 1.58 bits, does produce excellent outputs that are almost on par with the original model.

For example, the prompt “A man using a soldering iron to repair a broken electronic device” produces:

[output image from the 1.58-bit model]

My goal for the end of next week is to identify a way of using an FPGA setup that can accommodate the larger model (using either a DIMM slot or, in the extreme case, networking two FPGAs); a rough budget check for the two-FPGA option is sketched below.
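
To make the two-FPGA option concrete, here is a hypothetical budget check (partition_layers and the greedy split are my own illustration, not an existing tool):

```python
def partition_layers(layer_bytes, budget=2 * 1024**3):
    """Greedily check whether a model's layers split into two contiguous
    partitions that each fit a 2 GB FPGA.

    layer_bytes: per-layer parameter sizes in bytes, in model order.
    Returns the layer indices assigned to each device.
    """
    first, used = [], 0
    for i, size in enumerate(layer_bytes):
        if used + size > budget:
            if sum(layer_bytes[i:]) > budget:
                raise ValueError("model does not fit on two FPGAs")
            return first, list(range(i, len(layer_bytes)))
        first.append(i)
        used += size
    return first, []  # the whole model fits on one device
```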

If that is not possible, I will fall back to either distilling the FLUX model or recreating the quantization code for DALL-E Mini.
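
If it comes to distillation, a single training step would look roughly like this (a sketch only: the teacher/student denoiser modules and the match-the-teacher MSE objective are my assumptions for a FLUX-style model, not an established recipe):

```python
import torch
import torch.nn.functional as F

def distill_step(teacher, student, x_t, t, cond, optimizer):
    """One knowledge-distillation step: the smaller student learns to
    reproduce the frozen teacher's prediction on the same noised input.

    x_t: noised latent, t: timestep, cond: text conditioning.
    """
    with torch.no_grad():
        target = teacher(x_t, t, cond)   # frozen teacher prediction
    pred = student(x_t, t, cond)         # student prediction
    loss = F.mse_loss(pred, target)      # match the teacher's output
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```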
