A New Probability Model

Basic Model

If there are two pathways along which genes gain frequency, cellular gain within one population and molecular gain across all life, we should be able to combine both in one frequency model, and so compute the total gain in frequency for any gene, in any population, at any point in the history of life (say, when sex evolved). To do so, however, would contradict another paradigm of evolution, one that concerns probability theory.

To explain: a frequency is a probability, and one early triumph of the frequency method was to show that evolution follows a simple probability, as in a game of cards. However, if genes can gain frequency in two ways, then one probable event will have two probable outcomes, which is not how a game of cards works. No existing theory says how this could work, so let me clarify with Fig 0.1 below and Fig 0.2 after it.

See Fig 0.1

Usually, a probable outcome is a single number between 0 and 1, or 0% and 100%. For evolution it is the same, except that events unfold over time. Fig 0.1 shows a probable outcome, x_i, for a gene that begins at 0% frequency at 0% time and then gains frequency under natural selection. Growth accelerates up to 50%, but beyond 50% the population fills with the new variety, so growth slows. Finally, growth ceases at 100% frequency after 100% time, when the model must stop. This is a standard "growth" (or logistic) curve, but gain need not follow this curve exactly. For the evolution of sex, say, we might expect a curve that settles at 50% gain in 100% time, and so on.
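To make the shape of this curve concrete, here is a minimal sketch in Python. It is not the formula used to draw Fig 0.1; the function names, the steepness value and the rescaling that pins the curve to exactly 0% and 100% at the edges of the window are all my own assumptions for illustration.

```python
import math


def logistic(t, steepness=10.0, midpoint=0.5):
    """Plain logistic curve; t is the fraction of time elapsed (0.0 to 1.0)."""
    return 1.0 / (1.0 + math.exp(-steepness * (t - midpoint)))


def window_frequency(t, steepness=10.0, midpoint=0.5):
    """Logistic curve rescaled (an assumption) to start at exactly 0% and
    end at exactly 100% inside the frequency-time window."""
    low = logistic(0.0, steepness, midpoint)
    high = logistic(1.0, steepness, midpoint)
    return (logistic(t, steepness, midpoint) - low) / (high - low)


if __name__ == "__main__":
    # Print frequency against time in steps of 10% time.
    for step in range(11):
        t = step / 10
        print(f"time {t:4.0%}  frequency {window_frequency(t):7.2%}")
```

With these assumed settings the curve passes through 50% frequency at 50% time, and gain per step is largest around that midpoint, as described above.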

Notice that the curve in Fig 0.1 looks two-dimensional, but this is only an illusion of time. Each event has a single outcome, between 0% and 100% frequency per instant, but each instant is 'unfolded' over 0% to 100% time. In the paradigm, any probable event can be solved inside a 100% x 100% frequency-time 'window'. For how genes distribute, however, a 100% x 100% window might not be correct. If, say, a gene gains 100% frequency in one population, it can still migrate into other populations, beyond 100%. If a population exhausts 100% of its time on Earth, its genes can still live on in new populations that evolve from the parent one. No one denies that genes can distribute outside this 100% x 100% window, but experts insist that such gene distribution cannot be treated as a frequency, because no probability model exists to emulate such effects.

However, although genes distribute across millions of years in time, they also distribute in phylogenetic 'space', and this allows us to graph not a 100% x 100% 'window' but a 100% x 100% x 100% 'landscape'. Fig 0.2, next, shows (on a spreadsheet) how the result will look.

See Fig 0.2

Notice that whereas Fig 0.1 was a single growth curve on a flat plane, Fig 0.2 uses two growth curves to form not a flat plane but a landscape. In Fig 0.2, the heavy black front curve shows the standard cellular gain, x_i, as in Fig 0.1. As in Fig 0.1, gain increases from zero to peak at 100% frequency in 100% time, which is a standard 'window'. In reality, genes at 100% can migrate into new populations, and evolution does not stop at each 100% time interval. This is shown in Fig 0.2 by another growth curve, sloping away at 90° to the cellular gain.

The second curve shows the molecular gain. In the new theory, this has the symbol X_i ("big x", as against "little x" in x_i). Molecular gain, X_i, starts from the same zero point and increases via standard growth. The difference is that whereas cellular gain ends on a peak in one population, a gene as a molecule can descend into many populations, and some genes end up at 100% in every population. In reality, very few genes distribute across 100% of life, but this limit is still shown as a horizontal grey line on the far slope of Fig 0.2. Fig 0.2 does not show X_i directly; instead, a formula converts X_i into an x_i base, and this conversion projects the far grey line to infinity. As a result, even if a gene gains 100% on the first peak, it can gain frequency beyond that until it reaches the far slope. Yet if the far slope is at infinity, a gene can only reach it in infinite time, so the model avoids the paradox of time ever needing to stop.
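The conversion formula itself is not stated in this post. Purely to illustrate how a bounded share can be projected so that 100% lies at infinity, the sketch below uses the ordinary odds transform, X / (1 - X). This is an assumption chosen for illustration, not the formula behind Fig 0.2, and the function name is hypothetical.

```python
import math


def project_to_unbounded(big_x):
    """Map a molecular share X in [0, 1] onto [0, infinity].

    Uses the odds transform X / (1 - X), an illustrative assumption only:
    X = 0 stays at 0, and X = 1 (100% across all life) lands at infinity.
    """
    if not 0.0 <= big_x <= 1.0:
        raise ValueError("share must lie between 0 and 1")
    if big_x == 1.0:
        return math.inf
    return big_x / (1.0 - big_x)


if __name__ == "__main__":
    # The closer the share gets to 100%, the further out it is projected.
    for share in (0.0, 0.5, 0.9, 0.99, 1.0):
        print(f"share {share:6.2%} -> projected {project_to_unbounded(share)}")
```

Under any transform of this kind, a finite value always corresponds to a share below 100%, which matches the point that the far slope is reached only in infinite time.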

Objections to this Model

Now, let me remark on the objection by experts that my solutions do not conform to standard probability rules.

My thesis is that for life to evolve, two pathways of selection, one for the gene and one for the cell, must combine. The objection is that selection can only be emulated by a frequency, but a frequency is a probability, and it must obey strict rules. Probabilities, say, cannot exceed 100%, but in life, genes at 100% in one population can spread wider. Probabilities are also 'directionless'. Cellular migration might be 'vertical', whereas molecular migration can be 'horizontal', but a probability cannot have direction. Life is also unified, because genes in a unique cellular population can belong to a larger molecular population, but that is not how probabilities work. If a cellular frequency were for one species, and a molecular frequency were for all life, the two could not be combined. It is odd, though, that cellular and molecular pathways combine for life to evolve, yet mathematics cannot allow them to combine without breaking rules.

The solution to this paradox is as follows.

The rule says that a probability is a one-dimensional (1D) number between 0% and 100%, and that a probability cannot exist outside this limit. For example, the curve in Fig 0.1 looks like a 2D surface, so it seems as if two dimensions were used to draw it. Yet by convention, Fig 0.1 is a single 1D frequency, x_i, unfolded over 0% to 100% time. By convention too, this curve is within the rules of probability.

On the other hand, if you look at Fig 0.2, it no longer looks 2D but three-dimensional (3D), like a landscape, so it seems that more than one dimension was used to draw it. In fact, Fig 0.2 is a single frequency (call it z_i), now projected along two eras of time. The front face of this curve shows standard time, maybe twenty generations "here and now", which repeats Fig 0.1. The difference from Fig 0.1 is that the equation used to draw Fig 0.2 includes a formula that converts molecular distribution across all life into another axis of time, while still staying strictly within the rules for such problems. Experts can object, of course, that the formula for gene distribution across life is wrong, or that it is not true that genes can distribute in different eras of time, but that is not the mathematical dispute. If there were two eras of time, for whatever reason, it should not violate the rules for a single frequency to spread across both of them.

Besides, if this interpretation were wrong, it would be very easy to falsify any part of this theory.

For instance, just in this introduction, I have made a prediction of thermodynamic direction that no one has dared make before, and anyone could prove it wrong. I have offered a solution to the paradox of sex, and have predicted its results as molecular distribution, which would be easy to prove wrong. I have predicted a previously overlooked "valley" between the first peak, at 100% frequency, and the point where a gene distributes at 100% across all life; anyone can try to prove that the valley is not there. Then, contrary to the 'mutation-centric' paradigm of evolution, I predict that the most conserved (least mutated) genes spread widest across life, and that every transition that is a paradox resulted in a gain of conservation, not mutation, for core genes. Anyone could falsify this prediction merely by measuring wide-scale molecular distribution, quite independently of any mathematical theory of how it worked.

Friday, October 14, 2011

A New Two-Axis Probability Model