ec2-18-234-236-14.compute-1.amazonaws.com | ToothyWiki | RecentChanges | Login | Webcomic Informal summary: "If all you know about two propositions is how complex they are and that they are both consistent with all currently known facts, then on average, as new facts emerge, the simpler of the two theories is more likely to remain consistent."
Consider the following thought experiment:
A universe U(x,y,z) contains X individuals each of which falls into Y dimensions of category (eg Species, Gender, Colour) each of which have Z possible values (eg Pink, Blue, Red).
A proposition P consists of a series of statements that may only be either true of false and that get applied only to individuals, connected by various brackets and logical conjunctions from the set of allowed operators O.
So for example "There are more cats than dogs" would not be a valid proposition, where as "((Cat OR Dog) AND (Black OR Brown)) OR (Green AND Pig AND Male)" would be valid.
A proposition is considered true for a particular U iff it holds true for X individuals in U.
C(P) is a measure of the complexity of a proposition (for instance the number of operators in it).
S is a set of propositions.
F(C,U) is the set of all propositions of a particular complexity.
T(S,U) is the fraction of S that are true for a particular U.
(the proportion of statements of a particular complexity that were true for a universe that continue to remain true as the universe expands by 1)
Reay's Lemma states: For two different levels of complexity C1 and C2, if C1 is larger than C2 then E(C1,U(x,y,z)) will tend to be smaller than E(C2,U(x,y,z)), and that this tendency will increase as x, y, z and (C1-C2/C2) increase. --DouglasReay
Isn't that just saying "as you add more things to the list of criteria you use to pick out objects from the universe, the number of objects you can find that fit all the criteria decreases"? - MoonShadow
I hope not. It is intended to say that if all you know about two propositions is how complex they are and that they are both consistent with all currently known facts, then on average, as new facts emerge, the simpler of the two theories is more likely to remain consistent. --DouglasReay
I was attempting to state it in a way that is exhaustively testable for small x,y,z and O. It is a prediction about what results you will get if you write a computer program to generate such. --DR
It's true that the simpler proposition is generally more probable, but it also has less explanatory power. If we observe particular, specific evidence, then in general the probability of that evidence given a complex hypothesis is higher than the probability of that evidence given a simpler hypothesis. Since Bayes' Theorem tells us that the posterior probability of a hypothesis is proportional to the product of the prior probability of a hypothesis (given, say, by Occam's razor) and the likelihood of the hypothesis (i.e., the probability of the data given the hypothesis), it's not obvious whether our posterior conclusion will give higher weight to the simpler or more complex hypothesis. But yes, as far as prior probabilities go, simplicity generally captures a larger portion of the sample space. --Utilitarian?
Another thing to consider is what would happen if, instead of randomly generating new individuals, you instead look at all rules under which the present set of individuals could have been generated, pick one of those rules at random, then generate the next individual according to that rule. A baysian approach "Given the possible universes I might have been in, how do I modify my best estimate of the probability of having been in each one, given this new data?" --DR
In 2005, Marcus Hutter mathematically proved that shorter computable theories have more weight when calculating the expected value of an action across all computable theories which perfectly describe previous observations (universal artificial intelligence).
Universal Artificial Intelligence: Sequential Decisions Based On Algorithmic Probability ISBN 3-540-22139-5
The connection between the simplicity of scientific theories and the credence attributed to their predictions seems to permeate the practice of scientific discovery. When a scientist succeeds in explaining a set of nobservations using a model M of complexity c then it is generally believed that the likelihood of finding another explanatory model with similar complexity but leading to opposite predictions decreases with increasing n and decreasing c. This paper derives formal relationships between n, c and the probability of ambiguous predictions by examining three modeling languages under binary classification tasks: perceptrons, Boolean formulae, and Boolean networks. Bounds are also derived for the probability of error associated with the policy of accepting only models of complexity not exceeding c. Human tendency to regard the simpler as the more trustworthy is given a qualified justification. CategoryMathsCategoryPhilosophy