Abstract
We train neural networks to optimize a Minimum Description Length score, that is, to balance between the complexity of the network and its accuracy at a task. We show that networks optimizing this objective function master tasks involving memory challenges and go beyond context-free languages. These learners master languages such as aⁿbⁿ, aⁿbⁿcⁿ, aⁿb²ⁿ, aⁿbᵐcⁿ⁺ᵐ, and they perform addition. Moreover, they often do so with 100% accuracy. The networks are small, and their inner workings are transparent. We thus provide formal proofs that their perfect accuracy holds not only on a given test set, but for any input sequence. To our knowledge, no other connectionist model has been shown to capture the underlying grammars for these languages in full generality.
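The abstract describes a two-part objective: the description length of the network itself plus the description length of the data given the network. A minimal sketch of such a score, with illustrative names and encoding choices that are assumptions rather than the paper's actual encoding scheme:

```python
import math

def mdl_score(num_params: int, bits_per_param: float,
              data_log_likelihood: float) -> float:
    """Illustrative two-part MDL score (not the paper's exact encoding).

    num_params, bits_per_param: a crude stand-in for the cost of
        describing the network itself.
    data_log_likelihood: log-likelihood of the data under the network,
        in nats; converted to bits below.
    """
    model_bits = num_params * bits_per_param
    data_bits = -data_log_likelihood / math.log(2)  # nats -> bits
    return model_bits + data_bits

# A larger network that fits the data no better pays a pure complexity
# penalty, so the smaller network wins under this score.
small = mdl_score(num_params=10, bits_per_param=8,
                  data_log_likelihood=-100 * math.log(2))
large = mdl_score(num_params=1000, bits_per_param=8,
                  data_log_likelihood=-100 * math.log(2))
```

Minimizing such a score rewards networks that are both small and accurate, which is consistent with the abstract's claim that the resulting networks are compact enough for their inner workings to be inspected and proven correct.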
| Original language | English |
|---|---|
| Pages (from-to) | 785-799 |
| Number of pages | 15 |
| Journal | Transactions of the Association for Computational Linguistics |
| Volume | 10 |
| DOIs | |
| State | Published - 27 Jul 2022 |
All Science Journal Classification (ASJC) codes
- Communication
- Human-Computer Interaction
- Linguistics and Language
- Computer Science Applications
- Artificial Intelligence