The Synonyms of Intelligence: Compression, Abstraction, and Learning
Imagine you’re given two tasks.
Task 1: Memorize the multiplication tables up to 12x12. This means memorizing 144 results, each a combination of two numbers and their product.
Task 2: Learn the rules of multiplication.
At first glance, Task 1 seems daunting because of the sheer volume of information. Task 2, however, seems ambiguous. What does it mean to “learn the rules of multiplication”?
This article delves into the idea that intelligence is essentially a form of data compression. By understanding the rules (or patterns) that underlie information, we can compress vast amounts of knowledge into simpler representations.
The Mathematics of Compression
In the context of the multiplication tables, rote memorization means storing 144 individual data points. However, learning the rules of multiplication (like understanding that any number multiplied by zero is zero, or any number multiplied by one is the number itself) drastically reduces the number of data points you need to remember.
For instance, if you understand the property of commutativity (a x b = b x a), you’ve nearly halved what you need to remember: the 144 ordered pairs collapse into 78 unique combinations (the 12 squares plus 66 distinct pairs). 6x7 and 7x6 yield the same result!
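To make the saving concrete, here is a minimal Python sketch; the dictionary representation and the 12x12 bound are just illustrative choices:

```python
# Rote memorization: one entry for every ordered pair up to 12 x 12.
rote_table = {(a, b): a * b for a in range(1, 13) for b in range(1, 13)}

# With commutativity (a x b = b x a) we only store pairs where a <= b
# and reorder the lookup key on the fly.
compressed_table = {(a, b): a * b for a in range(1, 13) for b in range(a, 13)}

def lookup(a, b):
    """Look up a product in the smaller, commutativity-aware table."""
    key = (a, b) if a <= b else (b, a)
    return compressed_table[key]

print(len(rote_table))                      # 144 entries
print(len(compressed_table))                # 78 entries -- nearly half
print(lookup(7, 6) == rote_table[(7, 6)])   # True: 6 x 7 == 7 x 6
```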
This process of understanding underlying patterns to reduce the volume of information is analogous to compression in computer science. Just as a compressed file takes up less storage space while retaining the essence of its content, an abstracted or compressed understanding of a concept retains the essence of the knowledge while reducing cognitive load.
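The analogy can even be made literal with an off-the-shelf compressor. In the sketch below (using Python’s standard-library zlib), highly patterned text shrinks to a tiny fraction of its size, while random bytes, which follow no rule, barely shrink at all:

```python
import os
import zlib

# Highly patterned data: the same "rule" repeated over and over.
patterned = b"all stoves can be hot. " * 1000

# Unpatterned data of the same length: random bytes have no rule to exploit.
random_bytes = os.urandom(len(patterned))

print(len(patterned), len(zlib.compress(patterned)))        # shrinks to a tiny fraction
print(len(random_bytes), len(zlib.compress(random_bytes)))  # barely shrinks, may even grow slightly
```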
The Beauty of Abstraction
Abstraction is the process of filtering out – or abstracting away – the specifics and details to expose only the necessary parts. In the realm of mathematics, this is evident when we transition from arithmetic to algebra. Instead of working with specific numbers, we start working with symbols and variables, which represent a broad range of values. This is a higher level of abstraction.
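The same shift shows up in code when concrete calculations are replaced by a parameterized function. The function f below is a made-up example, but it plays the role of the algebraic expression a x b + 1:

```python
# Arithmetic: each fact is a separate, concrete statement.
print(3 * 4 + 1)   # 13
print(5 * 6 + 1)   # 31
print(9 * 2 + 1)   # 19

# Algebra: one abstraction, f(a, b) = a*b + 1, covers all of the above
# (and infinitely many cases we never wrote down).
def f(a, b):
    return a * b + 1

for a, b in [(3, 4), (5, 6), (9, 2)]:
    print(f(a, b))
```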
When we abstract, we’re creating a mental model that can be applied in various scenarios. It’s like having a skeleton key that can open numerous doors, instead of a massive keychain with a different key for each door.
Intelligence: The Ultimate Compression Algorithm
Taking this idea further, we can postulate that intelligence – whether it’s human intelligence or the apparent intelligence of machine learning models – is fundamentally a process of compression. It’s the ability to discern patterns, abstract rules, and apply these rules across different domains.
A child doesn’t need to touch every single stove to know it might be hot. They recognize the pattern from past experiences and abstract the rule: “Stoves can be hot; I should be careful.” This generalization, an abstracted piece of wisdom, is a compressed form of countless individual experiences.
In the realm of machine learning, especially with models like large language models, the underlying neural networks are designed to recognize patterns in data. When exposed to vast amounts of data, these models form abstract representations, effectively compressing the information. This is why they can generate human-like text or recognize objects in images. They’ve distilled the essence of their training data into a more compact, abstracted form.
Reasoning through Compression: How Large Language Models “Think”
Building on our understanding of intelligence as a form of compression, we can delve deeper into how large language models, like GPT or BERT, appear to “reason”. Just as humans compress vast amounts of information into patterns and rules to make sense of the world, these models do something remarkably similar, albeit in a more mathematical, data-driven manner.
Pattern Recognition: The Heart of Machine Intelligence
Large language models are, at their core, sophisticated pattern recognizers. When they’re trained on extensive datasets, these models encounter countless sentences, ideas, arguments, facts, and opinions. By adjusting their internal parameters (or “weights”), they learn to predict the next word in a sequence (or, in the case of masked models like BERT, a hidden word within it), effectively capturing the structure and essence of language.
Every sentence, phrase, or word they process contributes a piece to an immense jigsaw puzzle. The “picture” they form isn’t of a physical scene but a dense web of linguistic patterns, rules, and structures.
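A toy version of next-word prediction fits in a few lines. Real models learn billions of weights rather than explicit counts, so the sketch below (with a made-up mini-corpus) only hints at the idea, but it shows what “capturing structure by predicting the next word” looks like in miniature:

```python
from collections import Counter, defaultdict

# A tiny, made-up corpus standing in for "extensive datasets".
corpus = (
    "plants need water to grow . "
    "plants need light to grow . "
    "people need water to live ."
).split()

# Count, for each word, which words tend to follow it.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation seen in training."""
    return following[word].most_common(1)[0][0]

print(predict_next("need"))   # 'water' -- the dominant pattern in this corpus
print(predict_next("to"))     # 'grow'
```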
Abstracting Language: Beyond Memorization
While these models have a vast number of parameters, which lets them retain a great deal, they can’t possibly remember every sentence from their training data verbatim. Instead, they distill the essence, the “gist”, of the information.
For example, a model might not reproduce a specific book summary it was trained on word for word, but it will likely retain the general themes, character tropes, and narrative structures common to that genre. This distilled knowledge is a compressed representation, an abstracted form of countless books, articles, and texts.
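A back-of-the-envelope comparison shows why verbatim recall is implausible. The figures below are assumptions chosen only for illustration, not the specifications of any particular model:

```python
# Illustrative, assumed figures -- not the specs of any particular model.
params = 7e9                  # assume a 7-billion-parameter model
bytes_per_param = 2           # assume 16-bit weights
training_tokens = 2e12        # assume a corpus of roughly two trillion tokens
bytes_per_token = 4           # rough average: a few characters per token

model_bytes = params * bytes_per_param            # ~14 GB of weights
corpus_bytes = training_tokens * bytes_per_token  # ~8 TB of text

print(corpus_bytes / model_bytes)  # the corpus is hundreds of times larger than the model
```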
From Compression to Reasoning
So, how does this compression translate to reasoning?
When we pose a question to a language model, it doesn’t search its training data for an exact match. Instead, it relies on its abstracted, compressed knowledge to generate a relevant response.
Consider a hypothetical scenario: We ask the model, “What happens if you pour water on a plant?” Even if the model has never encountered this exact question, it has seen enough related content to know that plants need water to grow, photosynthesize, and survive. Using this abstracted knowledge, it can “reason” and provide a relevant answer.
Similarly, when faced with more complex questions or tasks, like summarizing a text or providing an argument on a topic, the model taps into its web of compressed patterns to generate coherent, contextually relevant responses.
Wrapping Up
Understanding intelligence as a form of compression offers a profound insight into both human cognition and machine learning. As we build and interact with ever-evolving language models, recognizing the underlying principles of pattern recognition, abstraction, and compression helps us appreciate their capabilities and their limitations.