Frameworks
DeepSpeed ZeRO++: a framework for accelerating model pre-training, fine-tuning, and RLHF training
by minimizing communication overhead. Likely an essential concept to be very familiar with.
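As a rough illustration of where the communication savings are switched on, here is a minimal sketch of a DeepSpeed-style config written as a Python dict. The three `zero_*` keys are the ZeRO++ options as described in the DeepSpeed tutorial (quantized weights, hierarchical partitioning, quantized gradients). The batch size and the partition size of 8 (one node of 8 GPUs) are assumptions for illustration only, so check the key names and values against the DeepSpeed version in use.

```python
import json

# Sketch of a DeepSpeed config with the ZeRO++ options enabled.
# All numeric values are illustrative assumptions, not recommendations.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,  # assumed value
    "zero_optimization": {
        "stage": 3,  # ZeRO++ builds on ZeRO stage 3
        # qwZ: quantize weights before the all-gather to shrink the payload
        "zero_quantized_weights": True,
        # hpZ: keep a secondary weight partition inside each node
        # (assumed here: 8 GPUs per node)
        "zero_hpz_partition_size": 8,
        # qgZ: quantize gradients for the gradient reduction
        "zero_quantized_gradients": True,
    },
}

# The config is normally stored as JSON and passed to the DeepSpeed launcher.
print(json.dumps(ds_config, indent=2))
```

Each of the three switches targets one collective (weight all-gather in the forward pass, weight all-gather in the backward pass, gradient reduction), which is why they can be enabled independently.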
Levanter (not just LLMs): a codebase for training foundation models (FMs) with JAX.
Release: uses Haliax to refer to tensor axes by name (for example Batch, Feature) instead of by index. Supports full sharding and is distributable/parallelizable.
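To make the named-axis idea concrete, here is a toy sketch in plain Python. It is not Haliax's API: `Axis` and `NamedArray` are hypothetical helpers written for this note. It only shows why reducing over an axis by name (Feature) is safer than reducing over a position (axis 1), since the result does not depend on the order in which the axes are stored.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Axis:
    """A named dimension, e.g. Axis("batch", 2). Hypothetical helper."""
    name: str
    size: int


class NamedArray:
    """Toy named tensor: values are stored per index tuple, axes carry names."""

    def __init__(self, axes, data):
        self.axes = tuple(axes)
        self.data = dict(data)  # maps index tuple -> value

    @classmethod
    def from_function(cls, axes, fn):
        # Fill the array by calling fn on every index combination.
        axes = tuple(axes)
        indices = product(*(range(a.size) for a in axes))
        return cls(axes, {idx: fn(*idx) for idx in indices})

    def sum(self, axis):
        # Look up the position from the axis itself, so callers never pass an index.
        pos = self.axes.index(axis)
        kept = self.axes[:pos] + self.axes[pos + 1:]
        out = {}
        for idx, value in self.data.items():
            key = idx[:pos] + idx[pos + 1:]
            out[key] = out.get(key, 0) + value
        return NamedArray(kept, out)


Batch = Axis("batch", 2)
Feature = Axis("feature", 3)

# Same values, stored in two different axis orders.
x = NamedArray.from_function((Batch, Feature), lambda b, f: 10 * b + f)
x_t = NamedArray.from_function((Feature, Batch), lambda f, b: 10 * b + f)

# Summing over Feature gives one value per batch element in both layouts.
per_batch = x.sum(Feature)
per_batch_t = x_t.sum(Feature)
print(per_batch.axes, per_batch.data)
```

With positional indexing, the second layout would need `axis=0` instead of `axis=1`. With named axes the call is identical for both.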