Abstract
We introduce BitFit, a sparse fine-tuning method in which only the bias terms of the model (or a subset of them) are modified. We show that with small-to-medium training data, applying BitFit to pre-trained BERT models is competitive with (and sometimes better than) fine-tuning the entire model. For larger data, the method is competitive with other sparse fine-tuning methods. Besides their practical utility, these findings bear on the question of what the commonly used fine-tuning process actually does: they support the hypothesis that fine-tuning mainly exposes knowledge induced by language-modeling training, rather than teaching the model new task-specific linguistic knowledge.
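To make the idea concrete, the sketch below shows one way a BitFit-style setup could be wired up; it is not code from the paper. It assumes the HuggingFace Transformers library and a standard BERT checkpoint (`bert-base-cased`), and it also unfreezes the randomly initialized classification head, which has to be trained in any fine-tuning regime.

```python
# Minimal BitFit-style sketch: freeze all parameters except bias terms
# (and the task head). Assumes HuggingFace Transformers + PyTorch.
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased", num_labels=2
)

# In BERT checkpoints, bias parameters have names ending in ".bias";
# the sequence-classification head lives under "classifier".
for name, param in model.named_parameters():
    param.requires_grad = name.endswith(".bias") or name.startswith("classifier")

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable:,} / {total:,} ({100 * trainable / total:.2f}%)")
```

With this kind of masking, only a fraction of a percent of the model's parameters receive gradient updates, which is what makes the method parameter-efficient: the rest of the network stays identical to the pre-trained checkpoint and can be shared across tasks.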
| Original language | English |
|---|---|
| Title of host publication | ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Short Papers) |
| Editors | Smaranda Muresan, Preslav Nakov, Aline Villavicencio |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 1-9 |
| Number of pages | 9 |
| ISBN (Electronic) | 9781955917223 |
| DOIs | |
| State | Published - 1 Jan 2022 |
| Event | 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022, Dublin, Ireland. Duration: 22 May 2022 → 27 May 2022. https://aclanthology.org/2022.acl-long.0/ |
Publication series
| Name | Proceedings of the Annual Meeting of the Association for Computational Linguistics |
|---|---|
| Volume | 2 |
Conference
| Conference | 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022 |
|---|---|
| Country/Territory | Ireland |
| City | Dublin |
| Period | 22/05/22 → 27/05/22 |
| Internet address | https://aclanthology.org/2022.acl-long.0/ |
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Linguistics and Language
- Language and Linguistics