Driven by Compression Progress

February 2021

Reading time: 5 mins


Note from author: This is part of an experimental series, more-or-less based on “white papers” and academic literature, as applied to somewhat practical-ish domains.

These pages serve as a brief overview of a paper, and I’ll be able to link to this paper down the road when I want to, without having to repeat all of this information.

Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes

Paper on arxiv.org, PDF available for free therein, or here

Here’s the paper abstract, lightly reformatted for readability:

I argue that data becomes temporarily interesting by itself to some self-improving, but computationally limited, subjective observer once he learns to predict or compress the data in a better way, thus making it subjectively simpler and more beautiful.

Curiosity is the desire to create or discover more non-random, non-arbitrary, regular data that is novel and surprising not in the traditional sense of Boltzmann and Shannon but in the sense that it allows for compression progress because its regularity was not yet known.

This drive maximizes interestingness, the first derivative of subjective beauty or compressibility, that is, the steepness of the learning curve.

It motivates exploring infants, pure mathematicians, composers, artists, dancers, comedians, yourself, and (since 1990) artificial systems.

I read this mid-2019:

[Image: read in 2019]

And three times since then. Whenever I re-read a paper, I add the date to it. Here’s its current state:

[Image: read 4x]

The abstract was hard for me to read the first time. I had to re-read it a few times. Here it is, with my thoughts between sentences:

I argue that data becomes temporarily interesting by itself to some self-improving, but computationally limited, subjective observer once he learns to predict or compress the data in a better way, thus making it subjectively simpler and more beautiful.

A self-improving, computationally limited, subjective observer is a fancy way of saying “a normal person”. I’m a self-improving, computationally limited, subjective observer.

I feel a little dose of dopamine in my brain when I encounter novel information and find I have learned to predict or anticipate it better than I thought I would. A sign of “better prediction” or “compression” is when I can name a principle elucidated by an author, or a known phenomenon, that is a likely precipitating cause for the unfolding of the story/event.

Curiosity is the desire to create or discover more non-random, non-arbitrary, regular data that is novel and surprising not in the traditional sense of Boltzmann and Shannon but in the sense that it allows for compression progress because its regularity was not yet known.

Do you know that reference? I have no idea who Boltzmann is¹, but I happen to know who Shannon is, and because I know what is being implied, this entire paper takes on a new light.

Shannon is an incredible figure. I’d encourage you to read A Mind At Play: How Claude Shannon Invented the Information Age. It’s fascinating. His life, and his approach to life and his research, is interesting, in the same recursive way that this entire paper I’m discussing is interesting.²

This drive maximizes interestingness, the first derivative of subjective beauty or compressibility, that is, the steepness of the learning curve.

The “steepness of the learning curve” is meant in the same sense as the idea that the slope is more important than the y-intercept. (Click that link, read the talk. It’s important to understand the implications of having “the steepest learning curve possible”.)

Putting “steepness” next to “compressibility” suggests the two are inextricably linked. The person who finds the best means to compress information learns the fastest, and therefore outperforms others over medium-to-long time horizons.
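To make “the first derivative of compressibility” a bit more concrete, here is a minimal Python sketch. It is a toy illustration, not code from the paper. The “observer” is assumed to be nothing more than a byte-frequency counter, and the names `code_length_bits`, `learning_curve`, and `compression_progress` are hypothetical helpers invented for this example. The idea: measure how many bits the observer needs to encode some held-out data, let it learn a little, measure again, and call the drop in bits its compression progress.

```python
import math
import random
from collections import Counter

ALPHABET_SIZE = 256  # one symbol per possible byte value


def code_length_bits(data, counts):
    """Bits needed to encode `data` under a Laplace-smoothed byte-frequency model."""
    total = sum(counts.values()) + ALPHABET_SIZE
    return -sum(math.log2((counts[symbol] + 1) / total) for symbol in data)


def learning_curve(stream, probe, step):
    """Cost of encoding `probe` after the observer has seen 0, step, 2*step, ... bytes."""
    counts = Counter()
    costs = [code_length_bits(probe, counts)]
    for start in range(0, len(stream), step):
        counts.update(stream[start:start + step])  # the observer "learns"
        costs.append(code_length_bits(probe, counts))
    return costs


def compression_progress(costs):
    """Bits saved at each learning step: the discrete slope of the learning curve."""
    return [before - after for before, after in zip(costs, costs[1:])]


# Regular data: a pattern the observer can learn.
regular = b"ab" * 250
regular_costs = learning_curve(regular[:400], regular[400:], step=100)
regular_progress = compression_progress(regular_costs)

# Noise: random bytes with no regularity to discover.
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(500))
noise_costs = learning_curve(noise[:400], noise[400:], step=100)
noise_progress = compression_progress(noise_costs)

print("regular:", [round(p, 1) for p in regular_progress])
print("noise:  ", [round(p, 1) for p in noise_progress])
```

In this sketch the regular data yields a large early saving that shrinks with every step (it is interesting for a while, then boring), while the noise yields little or no saving at any point. That matches the abstract’s claim that data is only temporarily interesting, and only when there is regularity left to learn.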

It motivates exploring infants, pure mathematicians, composers, artists, dancers, comedians, yourself, and (since 1990) artificial systems.

The conclusion of the abstract ties this (admittedly dense) topic of “compression progress” back to all of us.

Anyone engaged in pure research should read this. Anyone who creates anything (words, jokes, code) should attend to the idea of “compression progress”. To attend to this compression is to engage in learning and exploration in the most effective way possible.

Incidentally, for examples of persons driven by compression progress, we can all turn to infants, and observe how they explore the world.³

Conclusion

I’ll probably pull out quotes from the paper at some point. Download the paper and skim the table of contents.

Footnotes

¹ After reading this article, my friend Stephen Pollard emailed me with:

I just read that you don’t know who Boltzmann is. He’s a baller. He’s associated with Entropy and information theory.

² An understanding of how Claude Shannon understood “information compression” sheds light on other forms of compression, as they compare/contrast, like: Jane Austen’s concept of information (Not Claude Shannon’s) (2013)

³ One of the best books I’ve read recently expands on this “compression progress” from a novel ethnographic perspective. The book title is: The Secret of Our Success: How Culture Is Driving Human Evolution, Domesticating Our Species, and Making Us Smarter.

Don’t read the book, though, read this book review by Scott Alexander: Book Review: The Secret Of Our Success
