Hacker News

>At that time many young programmers regarded compression […] as very important.

They still do regard it as important, just in a different sense. Shannon showed that compression is equivalent to statistical inference: the optimal code for a message is derived from a probability model of the source, so an efficient compression algorithm is, implicitly, a good predictive model. Compression is thus loosely congruent to a branch of statistical inference you may have heard a thing or two about called "artificial intelligence" [0], which many young programmers are very much engaged with.

Take the Hutter Prize [1], which awards money for compressing the entire English Wikipedia, or the Large Text Compression Benchmark [2], whose current top performer is a transformer-based neural network [3]. As the LTCB's rationale puts it [0],

>Given a probability distribution P over strings s, the Shannon capacity theorem states that the optimal (shortest average) code for s has length log 1/P(s) bits. If the distribution P is known, then such codes are in fact easy to find and compute. The problem is that for natural language, we do not know how to explicitly compute the model P(s) for a given string s.

>However humans must implicitly know P(s) because our speech and writing, by definition, follows this distribution. Now suppose that a machine can compute P(s) for all s. Then for any question q (or dialog ending in a question) and any answer a, the machine can compute P(a|q) = P(q,a)/P(q), and use this distribution to answer any question q posed by the judge. By definition, this distribution would be identical to the distribution of answers given by the human, and the two would be indistinguishable.
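A toy sketch of the quoted claim: the optimal code for a string s has length log2(1/P(s)) bits, so a model that assigns the string a higher probability encodes it in fewer bits. (The two probability values below are invented purely for illustration.)

```python
import math

def code_length_bits(p: float) -> float:
    """Optimal (Shannon) code length in bits for an event of probability p."""
    return math.log2(1.0 / p)

# Hypothetical probabilities two models assign to the same sentence:
p_good_model = 0.02     # a strong language model finds the sentence likely
p_weak_model = 0.00001  # a weak model is "surprised" by it

print(code_length_bits(p_good_model))  # ~5.64 bits
print(code_length_bits(p_weak_model))  # ~16.61 bits
```

The better the predictive model, the shorter the code, which is exactly why language modelling and text compression are the same problem.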

[0] http://www.mattmahoney.net/dc/rationale.html

[1] http://prize.hutter1.net/

[2] http://www.mattmahoney.net/dc/text.html

[3] https://bellard.org/nncp/



While compression is congruent with some cases of statistical inference, no one is doing statistical inference by means of compression. Even the example you gave from Fabrice Bellard (whom I highly admire) does the converse: it uses neural networks to compress.

Even realizing what you've said requires a bit of theory, basic computer science, and some math. And most people today have a "do" philosophy and are results oriented. They are not learning oriented; they hate CS and math, and they hate low-level programming.

I don't think compression is highly relevant today, but back when I started learning it was a good tool for sharpening one's mind: learn algorithms, learn about memory management, learn about hardware, learn low-level constructs, learn a bit of math.

Whenever a young person who wants to pursue a programming career asks me for advice, I ask what their motivation is. If they just want money, I advise them to learn the tools in fashion today: maybe JavaScript and React, maybe Python, maybe C#. But if they say they want to do it for both enjoyment and money, I put them on the hard path. I ask them to enroll in a BS CS program or, if that is not possible, I give them long lists of books, tutorials, and courses covering CS basics, the math behind those basics, hardware, low-level programming, and the hot topics du jour. In the latter case I also design their learning path around their goals, and I always take the time to answer their questions later if I can; if not, I at least point them to where they should look or to the right person to ask.


> no one is trying to do statistical inference using compression

Well, maybe no one sensible? I still think it's quite cool that you can take any general purpose compression algorithm, and abuse it to do e.g. image classification. (Just append the image you want to classify to each of the classes' sets in turn, and see which compresses best!)
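A minimal sketch of that trick in Python, using gzip as the general-purpose compressor. The idea: appending the sample to a class's corpus costs few extra compressed bytes when they share structure. (The corpora and labels here are made up for illustration.)

```python
import gzip

def clen(data: bytes) -> int:
    """Length of the gzip-compressed data in bytes."""
    return len(gzip.compress(data))

def classify(sample: str, classes: dict) -> str:
    """Pick the class whose corpus 'explains' the sample best.

    Score = C(corpus + sample) - C(corpus): the extra compressed bytes
    needed to encode the sample given the corpus. Lower is a better fit,
    because gzip can back-reference substrings shared with the corpus."""
    best, best_cost = None, float("inf")
    for label, corpus in classes.items():
        cost = clen((corpus + " " + sample).encode()) - clen(corpus.encode())
        if cost < best_cost:
            best, best_cost = label, cost
    return best

classes = {
    "python": "def foo(): return [x for x in range(10) if x % 2 == 0]",
    "english": "the quick brown fox jumps over the lazy dog again and again",
}
print(classify("def bar(): return range(5)", classes))
```

With real corpora per class (and a normalization like NCD), this crude scheme can classify surprisingly well for something with zero training code.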

And actually I do remember a paper that tried to use ideas from PAQ to do online learning. Gated Linear Networks, out of DeepMind, in 2019:

https://arxiv.org/abs/1910.01526

All compression is related to AI, but sample-efficient online learning in particular is basically what data compression is all about.
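To make that concrete, here is a minimal sketch of compression as online learning: an adaptive order-0 model (with Laplace smoothing, and the alphabet known up front for simplicity) that predicts each symbol, pays -log2 p bits for it, and then updates its counts. Prediction and learning are the same loop.

```python
import math
from collections import Counter

def adaptive_code_length(s: str) -> float:
    """Total bits an adaptive order-0 model needs to encode s.

    At each step the model predicts the next symbol from the counts of
    everything it has seen so far, is charged -log2 p(symbol) bits by
    the (implied) arithmetic coder, and then learns from the symbol."""
    alphabet = set(s)  # simplification: alphabet assumed known in advance
    counts = Counter()
    total_bits = 0.0
    for ch in s:
        p = (counts[ch] + 1) / (sum(counts.values()) + len(alphabet))
        total_bits += -math.log2(p)
        counts[ch] += 1
    return total_bits

# A skewed string is more predictable, hence cheaper, than an
# evenly mixed one with the same length and alphabet:
print(adaptive_code_length("aaaaaaaaab"))
print(adaptive_code_length("ababababab"))
```

The better the model's next-symbol predictions, the fewer bits it pays, which is exactly the objective of online learning; schemes like PAQ just push this loop much further with mixtures of context models.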


Well, stochastic processes are a much wider category than the mere algebraic operations over groups that lossless data compression uses.

I'd think that calculus, functional analysis, or category theory can find more bijections toward statistics, or even congruences, than mere arithmetic or algebra ever will. Sure, you can explain or derive any mathematical construct using only algebra, and there have been efforts to do so, but does it make sense?



