Date: August 25, 2025
Topic: Introduction to NLP
Recall
NLP seeks to make natural language accessible to computers
Natural language is a system of communication naturally evolved in humans for conveying information
Natural language is lossy but efficient
Notes
Introduction to NLP
What is NLP
- Set of methods for making human language accessible to computers
- Analysis/understanding of what a text means
- Generation of fluent, meaningful, context-appropriate text
- Acquisition of the above from knowledge and data
What is Natural Language
- Structured system of communication evolved naturally in humans without conscious planning or premeditation
- Humans discovered sharing information could be advantageous
- Language is an agreed-upon protocol for moving info from one person’s mind to another’s
- Different groups had different systems (i.e., different languages)
- Each has its own structures and “rules” that humans learn
- Syntax: rules for composing language
- Semantics: meaning of the composition
Communicating with Each Other
- Natural language is lossy but efficient
- Syntax can be wrong but we can still be understood
- Details can be left out but complex communication can still be conveyed
Non-natural languages (like programming languages) are literal
Knowing a language may mean you can produce it to affect others, and receive it and act reasonably in response
Computers may not understand languages, but powerful algorithms help to imitate human language production and understanding
Computers and Natural Languages
Non-natural Languages
- Languages that are deliberately planned, with well-defined rules of composition
- E.g., computer languages (C++, Python, etc.)
- Syntax is structured to eliminate ambiguity — everything is literal
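A minimal sketch of how literal a non-natural language is, using Python's standard `ast` module to check whether a string parses. The helper name `is_valid_python` and the two sample statements are made up for illustration. A person can guess what the second statement was meant to say, but the parser rejects it outright.

```python
import ast


def is_valid_python(source: str) -> bool:
    """Return True if `source` parses as Python, False on a syntax error."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# Illustrative statements (not from the lecture).
well_formed = "total = price * quantity"
missing_operator = "total = price quantity"  # a human can guess the intent

print(is_valid_python(well_formed))
print(is_valid_python(missing_operator))
```

Compare with natural language, where an utterance like "me want cookie" breaks the syntax rules but is still understood.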
Understanding Language
- If I know English
- I can produce English and effect change in others
- I can receive English and act reasonably in response
- Does a 3YO know English?
- Typically understands with limited vocabulary
- Does a dictionary know English?
- No. It is just a knowledge source
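A rough sketch of the dictionary point, assuming a toy two-entry dictionary (the entries and the `look_up` helper are hypothetical). It can return a stored definition for a single word, but it can neither produce language nor act on a full sentence.

```python
from typing import Optional

# Hypothetical two-entry dictionary; definitions are illustrative only.
toy_dictionary = {
    "cookie": "a small sweet baked food",
    "want": "to desire something",
}


def look_up(word: str) -> Optional[str]:
    """Return the stored definition for an exact word, or None if absent."""
    return toy_dictionary.get(word.lower())


print(look_up("Cookie"))           # a stored entry is retrieved
print(look_up("I want a cookie"))  # a whole sentence is not an entry
```

The lookup only retrieves what was stored. Nothing here receives English and responds reasonably, which is the sense of "knowing" used in these notes.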
Computers need to understand Natural Language
- What to tell a computer so it understands English like a 3YO or an adult?
- How to tell a computer about language?
- How much do they need to “know” or “understand”?
Computers do not truly understand Language
- Humans also don’t have a proper definition of what it means to “understand” language
- However, powerful algorithms can imitate human language production and understanding
<aside>
📌 SUMMARY: NLP is a set of methods for making natural language accessible to computers, which traditionally handle only precise non-natural languages (like programming languages). This lets people communicate with computers in natural language and supports a wide variety of downstream tasks built on analyzing and generating natural language text, such as sentiment analysis and text generation.
</aside>