Bayes' Theorem: This has
many formulations, but essentially comes down to the observation that
one can learn from experience using probability theory while avoiding
the fallacy of affirming the consequent. In the end all of this rests
on the following theorem of elementary probability theory:
p(T|P)=p(P|T)*p(T):p(P).
In this equation, where ":" denotes division, the factors have standard names:
p(T|P) is the posterior probability (of the theory T given the data P)
p(P|T) is the likelihood (of the data P given the theory T)
p(T) is the prior probability (of the theory T)
p(P) is the probability of the data
1. Usefulness of Bayes' Theorem: The
strength, interest and usefulness of Bayes' Theorem can be explained by
noting that if T is a theory and P a prediction of the theory, one can
recalculate the probability of T given that P is true (or false) if one
has three things: the probability of P given T (which one derives from
the theory T itself, as the probability that P has if T is true), the
probability of one's theory T, and the probability of one's
prediction P.
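The recalculation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name update and its parameter names are invented here, and the case where P turns out false uses the standard identities p(~P|T)=1-p(P|T) and p(~P)=1-p(P), which the text implies but does not spell out.

```python
def update(prior: float, likelihood: float, evidence: float,
           observed: bool = True) -> float:
    """Return p(T|P) if observed is True, otherwise p(T|~P).

    prior      -- p(T), the probability of the theory before the test
    likelihood -- p(P|T), the probability of the prediction given the theory
    evidence   -- p(P), the probability of the prediction as such
    (All names are hypothetical; they are not taken from the text.)
    """
    for name, value in (("prior", prior), ("likelihood", likelihood),
                        ("evidence", evidence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1]")
    if observed:
        # Bayes' Theorem as stated: p(T|P) = p(P|T) * p(T) : p(P)
        numerator, denominator = likelihood * prior, evidence
    else:
        # Same theorem applied to ~P, using p(~P|T) = 1 - p(P|T)
        # and p(~P) = 1 - p(P)
        numerator, denominator = (1.0 - likelihood) * prior, 1.0 - evidence
    if denominator == 0.0:
        raise ValueError("cannot condition on an outcome of probability 0")
    return numerator / denominator


if __name__ == "__main__":
    # Illustrative numbers only: p(T)=0.2, p(P|T)=0.5, p(P)=0.25
    print(update(0.2, 0.5, 0.25))                  # theory gains if P is true
    print(update(0.2, 0.5, 0.25, observed=False))  # theory loses if P is false
```

With these illustrative numbers a verified prediction raises the probability of T above its prior, and a failed prediction lowers it below the prior.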
A classical example involves finding the orbit of
a comet from a few observations.
Suppose one has a theory T about that orbit that
may have a low initial probability p(T) (since there are many possible
orbits), and a prediction P that the comet will be at a certain
place at a certain time, of which the probability p(P) also will be low
apart from T (since the comet may in fact be at many possible places),
although p(P|T), i.e. the probability that the comet will be at a
certain place at a certain time if the theory is true, as a rule will
be high (since otherwise one wouldn't propose the theory).
Now it follows by Bayes' Theorem, i.e. the
above elementary formula of probability theory, that p(T|P), i.e. the
probability of the theory T if the prediction is true, will be much
higher than p(T) was before the prediction was verified, and indeed in
the ratio p(P|T):p(P). Thus, if p(P|T) was 90/100 and p(P) was 1/100,
then p(T|P)=90*p(T), which may make the new probability p(T|P)
appreciable even if p(T) was quite low to start with (say
also 1/100, e.g. because its prediction P is that improbable). In that
case the new p(T|P)=90/100, whereas the old p(T), before finding that
P is true, was 1/100.
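The arithmetic of the comet example can be checked exactly with Python's standard fractions module. The three input values are the ones given in the text; the variable names are invented here, and the last step (the value of p(P|~T) implied by those inputs) is an added consistency check based on the law of total probability, not something the text itself states.

```python
from fractions import Fraction

# Values taken from the comet example in the text
p_P_given_T = Fraction(90, 100)  # p(P|T): likelihood of the prediction
p_P = Fraction(1, 100)           # p(P): probability of the prediction as such
p_T = Fraction(1, 100)           # p(T): prior probability of the theory

# The factor by which a verified prediction multiplies p(T)
ratio = p_P_given_T / p_P

# Bayes' Theorem: p(T|P) = p(P|T) * p(T) : p(P)
p_T_given_P = ratio * p_T

# Added consistency check (an assumption of this sketch, not in the text):
# p(P) = p(P|T)*p(T) + p(P|~T)*p(~T), solved for p(P|~T)
p_P_given_not_T = (p_P - p_P_given_T * p_T) / (1 - p_T)

print(ratio)            # 90
print(p_T_given_P)      # 9/10
print(p_P_given_not_T)  # 1/990
```

The implied p(P|~T) of 1/990 lies between 0 and 1, so the three probabilities chosen in the example are mutually consistent.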
2. Problem of Bayes' Theorem: The main
problem involved in Bayes' Theorem is that it often is not clear how
one can establish the three probabilities one requires to use it,
namely p(P|T), p(P) and p(T).
This is especially so with p(T), in that one
often can make plausible cases for p(P|T) (it must be high if the
explanation is to be useful) and p(P) (one can often give evidence that
if T is not true, then P is not probable at all), but since theories
cannot be counted like blueberries or particular instances of kinds of
fact, there often seems to be no plausible way to fix the probability
of a theory.
There are several ways to circumvent the problem,
but the usual ones (such as so-called likelihood ratios, p(P|T):p(P|~T),
or odds, p(T):p(~T)) all seem to involve a considerable element of
subjectivity: In the end, it all comes down to one's subjective degree
of belief in T.
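How these ratios combine can be sketched as follows, using the standard odds form of Bayes' Theorem: posterior odds = prior odds * likelihood ratio. The function name is invented here. The inputs reuse the comet figures p(T)=1/100 and p(P|T)=90/100, together with p(P|~T)=1/990, which is the value implied by p(P)=1/100 under the law of total probability and is an inference of this sketch rather than a figure from the text.

```python
from fractions import Fraction


def posterior_from_odds(prior_odds: Fraction, likelihood_ratio: Fraction) -> Fraction:
    """Turn prior odds p(T):p(~T) and a likelihood ratio p(P|T):p(P|~T)
    into the posterior probability p(T|P). (Hypothetical helper name.)"""
    if prior_odds < 0 or likelihood_ratio < 0:
        raise ValueError("odds and likelihood ratios cannot be negative")
    posterior_odds = prior_odds * likelihood_ratio  # p(T|P):p(~T|P)
    return posterior_odds / (1 + posterior_odds)    # odds back to probability


# Prior odds p(T):p(~T) for p(T) = 1/100
prior_odds = Fraction(1, 100) / Fraction(99, 100)

# Likelihood ratio p(P|T):p(P|~T); 1/990 is the inferred value noted above
likelihood_ratio = Fraction(90, 100) / Fraction(1, 990)

print(posterior_from_odds(prior_odds, likelihood_ratio))  # 9/10
```

The result agrees with the direct calculation, but the prior odds still have to come from somewhere, which is exactly where the subjective element enters.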
For those who believe that probability is
subjective, this is no objection, and indeed believers in subjective
probability feel quite free in using Bayes' Theorem, while some
have even converted to a subjective interpretation of probability theory
precisely because it permits one to apply Bayes' Theorem.
The problem with this, apart from other
objections to subjective interpretations of probability theory, is that
in practice it won't help much, for example with fanatics.
Take Darwin's theory of evolution. This accounts
quite well for many otherwise problematic facts, and has quite a few
successful predictions to its credit that do not follow from other
theories, such as divine providence. Thus it can be seen as being quite
well confirmed by the evidence and by Bayesian reasoning, unless one
is both a believer in the subjective interpretation of probability and
in divine providence, and therefore fixes the probability of the
Darwinian theory as so small (say, of the order of 10^{-1000})
that no practical amount of evidence can much change this (except with
verified predictions of the same order of improbability).
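The force of such a prior can be made concrete with a rough calculation in log-odds. This is a sketch under loud assumptions: the function name is invented, each verified prediction is assumed to be independent of the others, and each is assumed to carry the same likelihood ratio of 90 (a figure borrowed from the comet example purely for illustration). Logarithms are used because a value of the order of 10^{-1000} cannot be represented as an ordinary floating-point number.

```python
import math


def confirmations_needed(prior_log10_odds: float,
                         likelihood_ratio: float,
                         target_log10_odds: float = 0.0) -> int:
    """Number of independent verified predictions, each with the given
    likelihood ratio, needed to lift the odds on T from the prior to the
    target. A target of 0.0 means even odds, i.e. p(T|evidence) = 1/2.
    (Hypothetical helper; independence is an assumption of the sketch.)"""
    if likelihood_ratio <= 1.0:
        raise ValueError("a confirmation needs a likelihood ratio above 1")
    gap = target_log10_odds - prior_log10_odds
    if gap <= 0.0:
        return 0
    # Each confirmation adds log10(likelihood_ratio) to the log-odds
    return math.ceil(gap / math.log10(likelihood_ratio))


# For p(T) of the order of 10^{-1000} the odds p(T):p(~T) are practically
# the same number, so the prior log10-odds are about -1000.
prior_log10_odds = -1000.0

print(confirmations_needed(prior_log10_odds, 90.0))
```

On these assumptions several hundred independent confirmations of comet-example strength would be needed merely to reach even odds, which illustrates why such a prior is practically immune to evidence.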
3. A possible solution: One possible
solution is to make a special assumption about the probability of a
theory. The assumption is stated below, after a definition of a term
that occurs in it: The proper consequences of a theory T
are those statements that follow from T but do not follow from ~T.
Now one may assume the following Theoretical Probability Postulate or
TPP:
The probability x of a theory T at any time t is the probability of the
least probable proper consequence that is known to follow from T at
time t.
The justification for this
assumption is that we know for certain that p(T) cannot be higher than
the probability of its least probable proper consequence, for that
follows from probability theory (if T entails Q, then p(T) cannot
exceed p(Q)), whereas the stated conventional assumption answers the
problem of how to attribute a probability to a theory, and indeed does
so uniquely and with empirical justification, namely by means of the
probability of the least probable known proper consequence of the theory.
Thus our assumption for
the probability of a theory T at time t is that it is the maximum of
what it may be at t, given the probabilities of the known proper
consequences of T. This is an assumption; it is consistent with
probability theory; it is based on the known facts about what T
entails; and it is a convention.
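The postulate amounts to taking a minimum over the known proper consequences, which can be sketched as follows. The function name, the consequence labels and all the probability values are invented for illustration; nothing in the text fixes them.

```python
from typing import Mapping


def tpp_probability(known_consequences: Mapping[str, float]) -> float:
    """Probability of a theory under the TPP: the probability of its least
    probable proper consequence known at the time.
    (Hypothetical helper; keys are labels, values are probabilities.)"""
    if not known_consequences:
        raise ValueError("the TPP needs at least one known proper consequence")
    for label, value in known_consequences.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"probability of {label!r} must lie in [0, 1]")
    return min(known_consequences.values())


# Invented proper consequences of a theory T, as known at time t
known_at_t = {
    "comet visible in spring": 0.30,
    "comet at place A at time 1": 0.05,
}
print(tpp_probability(known_at_t))  # 0.05

# At a later time a further, less probable consequence is derived from T
known_later = dict(known_at_t)
known_later["comet at place B at time 2"] = 0.01
print(tpp_probability(known_later))  # 0.01
```

The second call shows why the postulate is indexed to a time t: deriving a new and less probable proper consequence lowers the value assigned to the theory.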
