Date: September 2, 2025

Topic: Language Modeling

Recall

Language models are capable of a wide variety of tasks

Fluency is important for language models

Notes

Why Language Models?

What Makes a Good Language Model?


Our vocabulary might keep only the most frequently used words, with a special symbol such as OOV (out-of-vocabulary) standing in for any word that is not in it.
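A minimal Python sketch of this idea, assuming whitespace-tokenized text. The symbol `<OOV>`, the helper names `build_vocab` and `map_oov`, and the toy corpus are all hypothetical and chosen only for illustration:

```python
from collections import Counter

OOV = "<OOV>"  # hypothetical spelling; the notes only say "symbols like OOV"

def build_vocab(tokens, max_size):
    """Keep the max_size most frequent word types, plus the OOV symbol."""
    counts = Counter(tokens)
    vocab = {word for word, _ in counts.most_common(max_size)}
    vocab.add(OOV)
    return vocab

def map_oov(tokens, vocab):
    """Replace every token outside the vocabulary with the OOV symbol."""
    return [tok if tok in vocab else OOV for tok in tokens]

# Toy corpus (an assumption, not from the lecture).
corpus = "the cat sat on the mat the cat ran".split()
vocab = build_vocab(corpus, max_size=2)       # keeps "the" and "cat"
mapped = map_oov("the dog sat".split(), vocab)
print(mapped)
```

Capping the vocabulary this way trades coverage for size: rare words all share the single OOV symbol, so the model never has to assign probability to a word it has no entry for.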

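Treating each word position as a random variable, we can measure fluency as probability: a fluent sequence of text $W_1, W_2, \ldots, W_n$ is one that is assigned high probability. We therefore want to find the sequence that maximizes this probability.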

Modeling Fluency

Vocabulary

Modeling Fluent Language


To model fluency, we can use the preceding words to predict the next word, choosing the candidate with the highest probability given those words.
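The prediction step can be sketched in Python as follows. As a simplifying assumption, this sketch conditions on only the single preceding word (a bigram model) and estimates probabilities from raw counts. The function names and the toy corpus are hypothetical:

```python
from collections import Counter, defaultdict

def train_bigram_counts(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev):
    """Return (word, estimated probability) for the most likely next word,
    or None if prev was never seen with a following word."""
    if prev not in counts:
        return None
    following = counts[prev]
    total = sum(following.values())
    word, count = following.most_common(1)[0]
    return word, count / total

# Toy corpus (an assumption, not from the lecture).
corpus = "the cat sat on the mat the cat ran".split()
counts = train_bigram_counts(corpus)
prediction = predict_next(counts, "the")
print(prediction)
```

Here "the" is followed by "cat" twice and "mat" once, so the sketch picks "cat" with an estimated probability of 2/3. Using a longer history would follow the same pattern with tuples of preceding words as keys.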

Example of Modeling Fluency

History and Context

$$ P(W_1 = w_1, \ldots, W_n = w_n)= \prod_{t=1}^{n} P\!\left( W_t = w_t \,\middle|\, \underbrace{W_1 = w_1,\, W_2 = w_2,\, \ldots,\, W_{t-1} = w_{t-1}}_{\text{$w_1, w_2, \ldots, w_{t-1}$ is also called the "history" ("context") of the $t$-th word}}\right). $$
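
As a small worked instance of the chain rule above, take a hypothetical three-word sentence "the cat sat" (chosen here only for illustration), so $n = 3$:

$$ P(W_1 = \text{the},\, W_2 = \text{cat},\, W_3 = \text{sat}) = P(W_1 = \text{the}) \cdot P(W_2 = \text{cat} \mid W_1 = \text{the}) \cdot P(W_3 = \text{sat} \mid W_1 = \text{the},\, W_2 = \text{cat}) $$

The first factor has an empty history, the second has the history "the", and the third has the history "the cat". A bigram model would approximate the last factor by keeping only the most recent word of the history, giving $P(W_3 = \text{sat} \mid W_2 = \text{cat})$.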




<aside> 📌 SUMMARY: The main goal of language generation is modeling fluency, so that generated text reads like natural language. Fluency can be approximated using probability. Unigram, bigram, and general n-gram models give a reasonable approximation of fluency, but large values of $n$ are hard to manage.

</aside>


Date: September 7, 2025

Topic: Neural Language Models