
Megatron-LM


Megatron (1, 2, and 3) is a large, powerful transformer developed by the Applied Deep Learning Research team at NVIDIA. This repository is for ongoing research on training large transformer language models at scale. We developed efficient, model-parallel (tensor, sequence, and pipeline) and multi-node pre-training of transformer-based models such as GPT, BERT, and T5 using mixed precision.
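To make the tensor-parallel idea concrete, here is a minimal, single-process sketch of a column-parallel linear layer: the weight matrix is split column-wise across ranks, each rank computes its slice of the output, and the slices are gathered. This is not Megatron's actual code; the sizes and `world_size` are hypothetical, and the concatenation stands in for the cross-GPU all-gather Megatron performs with torch.distributed.

```python
# Single-process sketch of tensor (column) parallelism for a linear layer,
# the core idea behind Megatron's model-parallel transformer blocks.
# Illustration only: shapes and world_size are made up, and the concat
# below stands in for the cross-GPU all-gather done via torch.distributed.
import torch

torch.manual_seed(0)
batch, d_in, d_out, world_size = 4, 8, 16, 2  # hypothetical sizes

x = torch.randn(batch, d_in)
w = torch.randn(d_in, d_out)

# Reference: the full, non-parallel forward pass.
y_full = x @ w

# Column parallelism: each "rank" holds a slice of the output columns
# and computes its share of the matmul independently.
w_shards = torch.chunk(w, world_size, dim=1)
y_shards = [x @ shard for shard in w_shards]

# Gathering the per-rank slices reconstructs the full activation.
y_parallel = torch.cat(y_shards, dim=1)

assert torch.allclose(y_full, y_parallel)
print("column-parallel output matches the dense reference")
```

In Megatron's transformer blocks, a column-parallel layer is paired with a row-parallel one so that each MLP block needs only a single all-reduce in the forward pass rather than a synchronization after every matmul.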

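The mixed-precision side can likewise be sketched with PyTorch's stock AMP utilities. The toy below shows the general technique (reduced-precision compute guarded by gradient scaling), not Megatron's fused implementation; the tiny model, shapes, and learning rate are invented, and on a CPU-only machine autocast and loss scaling simply disable themselves.

```python
# Toy sketch of mixed-precision training with PyTorch's AMP utilities;
# Megatron applies the same reduced-precision-with-loss-scaling technique
# at scale with its own fused kernels. Model and hyperparameters invented.
import torch

use_cuda = torch.cuda.is_available()
device = "cuda" if use_cuda else "cpu"

model = torch.nn.Linear(8, 2).to(device)   # hypothetical tiny model
opt = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

x = torch.randn(4, 8, device=device)
target = torch.randn(4, 2, device=device)

# autocast runs the forward pass in reduced precision where it is safe;
# the gradient scaler keeps small fp16 gradients from underflowing.
with torch.autocast(device_type=device, enabled=use_cuda):
    loss = torch.nn.functional.mse_loss(model(x), target)

scaler.scale(loss).backward()
scaler.step(opt)
scaler.update()
print(f"loss: {loss.item():.4f}")
```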

Below are some of the projects where we have directly used Megatron:
