Fine-tuning a pretrained model from Hugging Face Transformers with Flax
Published:
Pre-trained models are great. They're trained on far more data than us normies could realistically compile ourselves, and they take a lot of compute to train from scratch. Ever since BERT was released, the NLP community has been fine-tuning pre-trained models on its own datasets. It's a great way to leverage the power of these models without having to train them from scratch.
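To give a flavour of what that looks like in Flax, here is a minimal sketch (not the full recipe from this post): it loads a pretrained BERT checkpoint through the Flax classes in transformers and computes a classification loss and its gradients with JAX. The checkpoint name, the two-label setup, and the toy batch are just illustrative assumptions.

```python
import jax
import jax.numpy as jnp
import optax
from transformers import AutoTokenizer, FlaxAutoModelForSequenceClassification

# Assumed example checkpoint and task: binary sentiment classification.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = FlaxAutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# A toy batch, tokenized to NumPy arrays so it can be fed straight into the Flax model.
batch = tokenizer(["a great movie", "a terrible movie"], padding=True, return_tensors="np")
labels = jnp.array([1, 0])

def loss_fn(params):
    # Run the model with the given parameters and compute cross-entropy against the labels.
    logits = model(**batch, params=params).logits
    return optax.softmax_cross_entropy_with_integer_labels(logits, labels).mean()

# One forward/backward pass; the gradients would then be fed to an optax
# optimizer (e.g. adamw) to update model.params in a training loop.
loss, grads = jax.value_and_grad(loss_fn)(model.params)
print(loss)
```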