
AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies

By Bo-Wen Zhang and others
In recent years, with the rapid adoption of large language models across various fields, model scale has steadily increased and the resources required for pre-training have grown exponentially. Training an LLM from scratch demands substantial computational resources, whereas scaling up from a smaller...
August 13, 2024