

Yesterday, Microsoft announced the release of DeepSpeed-Chat, a low-cost, open-source solution for RLHF training that allows anyone to create high-quality ChatGPT-style models, even with a single GPU.

Microsoft claims that you can train a model of up to 13B parameters on a single GPU, at a cost as low as $300 on Azure Cloud, using DeepSpeed-Chat. You can train a 13B ChatGPT-like model in 1.25 hours, and a massive OPT-175B model in a day on 64 GPUs. DeepSpeed places no limit on the number of parameters: it can support models ranging from a few billion to hundreds of billions of parameters.

DeepSpeed-Chat makes complex RLHF training fast, affordable, and easily accessible to the AI community. The DeepSpeed-Chat RLHF training experience is made possible by combining DeepSpeed-Inference and DeepSpeed-Training, offering 15x faster throughput than the state of the art while also supporting model sizes up to 7.5x larger on the same hardware. The initial release of DeepSpeed-Chat includes the following three capabilities:

(i) Easy-to-use Training and Inference Experience for ChatGPT-like Models: A single script that takes a pre-trained Hugging Face model, runs it through all three steps of InstructGPT training using the DeepSpeed-RLHF system, and produces your very own ChatGPT-like model. In addition, it provides an inference API for testing conversation-style interactions after the model is trained.

(ii) DeepSpeed-RLHF Pipeline: The DeepSpeed-RLHF pipeline primarily replicates the training pipeline from the InstructGPT paper, with careful attention to completeness and one-to-one correspondence with its three steps: a) Supervised Fine-tuning (SFT), b) Reward Model Fine-tuning, and c) Reinforcement Learning with Human Feedback (RLHF).

(iii) DeepSpeed-RLHF System: A robust and sophisticated RLHF system that combines the training and inference prowess of DeepSpeed into a single unified Hybrid Engine (DeepSpeed-HE) for RLHF. Additionally, it offers data abstraction and blending capabilities to enable training with multiple data sources.
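The three-step InstructGPT-style pipeline described above can be sketched as follows. This is a minimal illustrative skeleton only, not DeepSpeed-Chat's actual API: the function names and model dictionaries are hypothetical stand-ins that show how the output of each stage feeds the next.

```python
# Illustrative sketch of the three-step RLHF pipeline (SFT -> reward model -> RLHF).
# All function names and data structures here are hypothetical, not DeepSpeed-Chat's API.

def supervised_finetune(base_model, demonstrations):
    """Step a) SFT: fine-tune a pre-trained base model on human demonstrations."""
    return {"name": base_model, "stage": "sft", "examples": len(demonstrations)}

def train_reward_model(base_model, ranked_comparisons):
    """Step b) fit a reward model on human preference comparisons."""
    return {"name": base_model, "stage": "reward", "examples": len(ranked_comparisons)}

def rlhf_finetune(sft_model, reward_model, prompts):
    """Step c) optimize the SFT model against the reward model (e.g. with PPO)."""
    return dict(sft_model, stage="rlhf", reward=reward_model["name"])

# Toy data standing in for real demonstration / comparison / prompt datasets.
demos = ["prompt-response pair"] * 3
comparisons = [("preferred response", "rejected response")] * 4
prompts = ["a user prompt"]

actor = supervised_finetune("facebook/opt-13b", demos)
rm = train_reward_model("facebook/opt-350m", comparisons)
chat_model = rlhf_finetune(actor, rm, prompts)
print(chat_model["stage"])  # -> rlhf
```

The key design point the sketch mirrors is that the stages are strictly sequential: the reward model trained in step b) is what supplies the training signal for the reinforcement-learning stage in step c).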
