Figure 1: Parameter counts of language models have grown exponentially over time [11]
Figure 3: T0 is trained on explicit task formulations for a wide range of linguistic tasks
Figure 4: Basic schema of an encoder-decoder architecture (English-to-German translation as an example)
Figure 5: Association strengths between language models and downstream tasks [12]
Table 1: Summary of the features of the most popular large language models