FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation

By Liqun Ma and others at Zayed University and Carnegie Mellon University
This work presents a Fully BInarized Large Language Model (FBI-LLM), demonstrating for the first time how to train a large-scale binary language model from scratch (rather than a partially binarized or ternary LLM such as BitNet b1.58) to match the performance of its full-precision counterparts (e.g., FP16 or BF16) in transformer-based LLMs.
July 9, 2024
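The two ideas in the title can be sketched concretely: weights are binarized to {-1, +1} with a scaling factor, and the binary student is trained to match a full-precision teacher's next-token distributions (autoregressive distillation). The sketch below is illustrative only; the scaling choice (alpha = mean absolute weight, common in binarized networks) and the plain cross-entropy distillation loss are assumptions, not necessarily FBI-LLM's exact parameterization.

```python
import numpy as np

def binarize(W):
    """Binarize a weight matrix to alpha * {-1, +1}.

    alpha = mean(|W|) is a common scaling choice in binarized networks
    (an assumption here; the paper's exact formulation may differ).
    """
    alpha = np.abs(W).mean()
    return alpha * np.sign(W), alpha

def distill_loss(teacher_logits, student_logits):
    """Token-level autoregressive distillation: cross-entropy of the
    student's next-token distribution against the teacher's, averaged
    over sequence positions."""
    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    p = softmax(teacher_logits)                       # teacher targets
    log_q = np.log(softmax(student_logits) + 1e-12)   # student log-probs
    return -(p * log_q).sum(axis=-1).mean()

# Tiny demo: binarized weights keep each entry's sign and the
# matrix's average magnitude.
W = np.array([[0.3, -0.7], [1.2, -0.2]])
Wb, alpha = binarize(W)
```

In practice the binarization is applied in the forward pass while gradients flow to latent full-precision weights (e.g., via a straight-through estimator); the distillation loss replaces or augments the usual next-token cross-entropy against ground-truth tokens.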