Introducing the Invariance Theorem

Last Updated on

This is from a series of lectures by by Hector Zenil and Narsis A. Kianib entitled From Networks to Cells. It is sponsored by the Sante Fe Institute.

Text Transcription below:

So we saw in the last lectures how we can quantify algorithmic randomness by using computer programs running on some reference universal Turing machine. But one could think that it is always possible to find some language in which a particular object had a short encoding, no matter how random. For example, Alice and Bob could agree to compress all the content of the Encyclopaedia Britannica in a few symbols, so when Alice presents Bob with those symbols, Bob would know Alice meant the Encyclopaedia Britannica.

Algorithmic Information Theory (AIT) would mean little if the complexity of something could be determined so arbitrarily by renaming things with any number of symbols. AIT requires the program to reconstruct the original object from scratch so there is no coding cheating. Bob would need a computer program to reconstruct the Encyclopaedia Britannica without any external help from Alice or anyone else. It would also look like one would need to specify the specific programming language and the particular universal Turing machine on which those computer programs would run for algorithmic complexity to work or make sense. Otherwise, one would be able “rename” things and make anything look random or simple by changing the underlying language and Turing machine. So let’s consider the following sequence that has 3 symbols, 0, 1, and 2: We can show that this sequence can be generated by a small computer program like this one: whose length is no more than 56 bytes no matter how long the sequence to produce the same pattern. However that is in the Wolfram Language running on Mathematica. What if instead we had used Lisp, Java or Visual basic? It seems that the result would then depend on the computer language chosen.

And even in a single computer language there may be different computer programs of different size can produce the same object. For example, the following two computer programs can generate any number of digits of the Thue-Morse sequence and both are different in length even though both are small and generate the same sequence. So how to deal with different computer programs in possibly different computer languages with different lengths? Fortunately, the Invariance Theorem as proven by Kolmogorov, Solomonoff and Chaitin, tells us that the difference between any two computer programs producing the same object is at most a constant from each other. More formally, the so-called invariance theorem establishes that the difference of the lengths of the minimal programs on any two computer languages L1 and L2 is always bounded by a constant that depends on L1 and L2 but not of s: The way to see this is by thinking in a translator program between two languages L1 and L2.

Because what the invariance theorem says is that one can always write a 3rd computer program of fixed length capable of translating between any two languages L1 and L2. This is actually very clear with the computer language Java because one of the innovations of Java when it was introduced was that there was a Java Virtual Machine or JVM that made Java multiplatform in the sense that you could run your Java program on any platform without changing any language code. That is because the many JVMs were acting as a translator, or officially called a compiler, between the Java programming language and each machine code for different platforms such as Unix and Windows. So if the translator is shown in blue, one has to only add the translator to every computer program written in some language to convert it into another program in another language, both languages producing the same sequence of interest.

The length of the compiler or the length of the blue part is the same for all input strings and so it is constant. So if L1 is the Wolfram Language and L2 is Java, then the invariance theorem tells that finding the shortest computer programs to measure the algorithmic complexity of a string is about the same length up to a constant. In another more practical example, what the “c” in the invariance theorem characterises in a biological context, for example, is that the DNA and a set of reading and writing chemical machines called ribosomes form a couple of language and compiler able to transcribe and translate DNA into proteins effectively from one language into another. So ribosomes can be thought as some sort of compilers between the DNA, RNA and the protein space with the ribosomes always of same complexity independent of what they are reading or writing. can be reproduced with a code no longer than 56 bytes no matter how long the sequence with this pattern is required.

Well, that is not entirely true. Actually the code has to grow by a little bit if we want to produce longer sequences with the same pattern. It does grow a little bit because one has to determine the end of the algorithm, that is when do we wish the sequence to come to an end. And that number does make the code to grow a bit. In fact, classical information theory tells us that we can encode an integer n in about logarithm of n bits, because remember we can always guess a number in about logarithm of n yes and no answers. You can see how running this code with very large numbers, the byte count does actually change for very large numbers: So despite how trivial this extremely little piece of information may look and how kind of little difference it makes, it is not only mathematically precise but it will turn out to be the most important little piece of information in the field of Algorithmic Information Dynamics, and you will see soon why.

Not to keep you waiting but not getting into details, the main idea is that, unlike a deterministic system that has a well-defined generating mechanism such as this sequence and its generating formula, other systems that may be subject to some source of randomness or are interacting with other systems, this logarithmic change will not be longer be respected and the more removed from the logarithmic term the more it will tell you some unvaluable about the system’s behaviour and its underlying causes. To understand this better, in the next unit we will cover the theory of dynamical systems for you to understand what a dynamical system is before we finally get into the subject of Algorithmic Information Dynamics. .

Additional Resources:
Some great Maths Lecture on British TV – to access them from outside the UK then you’ll need a VPN. This post shows how it’s done – ITV Hub abroad.

To test functions of random generation in tandem with online programs and devices then you’ll probably need to switch IP addresses randomly while conduction the tests by perhaps using rotating proxies –