Problem of Induction:
There is a notion - first argued strongly by Francis Bacon - that one can learn
from experience by finding evidence that permits generalizations and thereby
allows one to find laws of nature. The problem is that these
generalizations are not deductively valid. A classical example is
that one may have seen quite a few white swans, and conclude therefrom that
'All swans are white', yet in Australia there are black swans, and thus the
generalization is false.
There are infinitely many other possible examples of the same form, that
all share the same feature: If one has only noted some of the cases that are A
and one has found that all the cases in one's experience of A also are B, it does
not follow with deductive necessity that all A's are B's i.e. that all
non-experienced A's also are B's.
Even so, it would seem that empirical science is founded on inductive
generalizations of the form: There are A's. All experienced A's are B's.
Therefore all A's are B's. At least, this is what Bacon argued, and many
scientists since believed, with more or less statistical sophistication, for
there have been developed many techniques in statistics to infer the
characteristics of a population based on a sample from it, and these
techniques often work in practice, indeed to the extent that insurance
companies can make profits based on them, by letting their customers pay more
in premiums than they have to pay out in restituted damages.
1. Hume's Problem
2. Newton's attempted solution
3. Goodman's problem
4. Goodman's attempted solution
5. A partial solution and an old problem
6. The inductive condition
7. Application to the problem of induction
1. Hume's Problem: The fact that a kind of inference is often used
and often (seems to) work in practice may be a good practical reason to rely
on it, but this does not constitute a good theoretical reason for it.
Indeed, using some formal logic, we can write the inductive schema of
inference as follows:
(1) (x)[(Ax & Ex) => Cx] |- (x)[Ax => Cx]
    Ad & ~Ed & ~Cd              Ad & ~Cd
This may be read as: If all experienced A's are C's, then all A's are C's.
Now consider its validity. As the line under it shows, it is easy to generate
counter-examples: any d that is an A, has not been experienced, and is not a C
makes the premise true and the conclusion false. So (1) is not deductively
valid.
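The counter-example to schema (1) can be checked mechanically on a small finite domain. The following is a minimal sketch in Python, not part of the original argument: the `Individual` record, the two checking functions, and the three-element domain are all hypothetical illustrations.

```python
from typing import NamedTuple


class Individual(NamedTuple):
    """A hypothetical individual described by the three predicates of (1)."""
    name: str
    A: bool  # is an A (e.g. a swan)
    E: bool  # has been experienced
    C: bool  # is a C (e.g. white)


def premise_holds(domain):
    """(x)[(Ax & Ex) => Cx]: every experienced A is a C."""
    return all(x.C for x in domain if x.A and x.E)


def conclusion_holds(domain):
    """(x)[Ax => Cx]: every A is a C."""
    return all(x.C for x in domain if x.A)


# Assumed domain: two experienced A's that are C's, plus one unexperienced d.
domain = [
    Individual("a", A=True, E=True, C=True),
    Individual("b", A=True, E=True, C=True),
    Individual("d", A=True, E=False, C=False),  # Ad & ~Ed & ~Cd
]

premise = premise_holds(domain)
conclusion = conclusion_holds(domain)
# A counter-example is a domain where the premise is true and the conclusion false.
is_counterexample = premise and not conclusion
print(premise, conclusion, is_counterexample)
```

The individual d leaves the premise untouched, because it is not experienced, while it falsifies the conclusion, because it is an A that is not a C.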
Several persons - Nicholas of Autrecourt, Newton, Leibniz - noted that inductive
generalizations are not deductively valid, but David Hume first put the
general problem with great clarity, and related it also to ordinary reasoning
in terms of causes and effects:
If reason determin'd us,
it wou'd proceed upon that principle, that instances, of which we
have had no experience, must resemble those, of which we have had
experience, and that the course of nature continues always uniformly
the same. (..) (p. 137)
Our foregoing method of
reasoning will easily convince us, that there can be no demonstrative
arguments to prove, that those instances, of which we had no
experience, resemble those, of which we had experience. (p.137)
The idea of cause and
effect is deriv'd from experience, which informs us, that such
particular objects, in all past instances, have been constantly
conjoin'd with each other: And as an object similar to one of those is
suppos'd to be immediately present in its impression, we thence
presume on the existence of one similar to its usual attendant.
According to this account of things, which is, I think, in every point
unquestionable, probability is founded upon the presumption of a
resemblance betwixt those objects, of which we have had experience,
and those of which we had none; and therefore it is impossible this
presumption can arise from probability. (...) (p. 138)
Shou'd
anyone think to elude this argument; and without determining whether
our reasoning on this subject be deriv'd from demonstration or
probability, pretend that all conclusions from causes and effects are
built on solid reasoning: I can only desire, that this reasoning may
be produc'd (...) (p. 138) (A Treatise of Human Nature)
Note the real problems here if indeed, as many have claimed, the
schema given by (1) is the schema humans use to learn from experience:
First, it is not deductively valid. Second, it would seem that all
evidence for it, such as a supposed uniformity of nature, itself requires the
same invalid schema of reasoning to be established. Third, to claim that
the schema is valid if qualified probabilistically, say in the form 'All
experienced A's are C's. Therefore probably all A's are C's' also does
not help, unless one knows there is a uniformity of nature, which we saw
to be problematic, or unless one knows that what was not experienced is
like what was experienced, which is begging the question at issue.
That is, in general terms, Hume's Problem of Induction is that schema
(1) is not deductively valid, and can be justified only by itself, or by
a presumption of the uniformity of nature, which again requires schema
(1) to be established, and anyway is known to be not true in all cases.
2. Newton's attempted solution: Newton certainly was aware of the
problem of induction, and indeed added a special section to the second edition
of the Principia, in which he proposed Rules of Reasoning specifically
addressed to the problem. These rules very probably were Hume's starting
point for his doubts about induction.
Newton's Rules of Reasoning, minus his comments (to some of which I will
return), are the following. It should be realised that in the following
quotation Newton meant by "experimental philosophy" what we call "natural
science", and that a shorter version of "which admit neither intensification
nor remission of degrees" is "which are invariant".
Rule I : We are to admit no more causes of natural things than such as
are both true and sufficient to explain their appearances.
Rule II : Therefore to the same natural effects we must, as
far as possible, assign the same reasons.
Rule III : The qualities of bodies, which admit neither
intensification nor remission of degrees, and which are found to
belong to all bodies within the reach of our experiments, are to be
esteemed the universal qualities of all bodies whatsoever.
Rule IV : In experimental philosophy we are to look upon
propositions inferred by general induction from phenomena as
accurately or very nearly true, notwithstanding any contrary
hypotheses that may be imagined, till such time as other phenomena
occur, by which they may be either made more accurate, or liable to
exception.
The problem is that, without further arguments to support them, these rules
are open to Hume's objections, since Rule III, which is the most important, is
merely an instance of schema (1).
3. Goodman's problem: Thus far, we have considered the problem as it was in
the early 18th Century. It was never satisfactorily answered, except by noting
that in practice inductions often worked.
In the 20th Century Nelson Goodman sharpened it as follows:
"Suppose that all
emeralds examined before a certain time t are green. At time t, then,
our observations support our hypothesis that all emeralds are green;
and this is in accord with our definition of confirmation. Our
evidence statements assert that emerald a is green, that emerald b is
green, and so on; and each confirms the general hypothesis that all
emeralds are green. So far, so good. Now let me introduce another
predicate less familiar than "green". It is the predicate
"grue" and it applies to all things examined before t just
in case they are green but to other things just in case they are blue.
Then at time t we have, for each evidence statement that a certain
emerald is green, a parallel evidence statement that the emerald is
grue. And the statements that emerald a is grue, that emerald b is
grue, and so on, will each confirm the general hypothesis that all
emeralds are grue. Thus according to our definition, the prediction
that all emeralds subsequently examined will be green and the
prediction that all emeralds will be grue are alike confirmed by the
evidence statements describing the same observations. But if an
emerald subsequently examined is grue, it is blue and hence not green.
Thus although we are well aware which of the two incompatible
predictions is genuinely confirmed, they are equally well confirmed
according to our present definition. Moreover, it is clear that if we
simply choose an appropriate predicate, then on the basis of these
same observations we shall have equal confirmation, by our definition,
for any prediction whatever about other emeralds - or indeed anything
else." (Problems and Projects, p. 381-2)
This gave rise to an enormous number of papers, most of
which take issue with irrelevancies, like the reference to time, the oddity of
predicates like "grue", or considerations about when a generalisation is a law
or law-like, and miss the principal point, which is the following.
What Goodman in fact considered was a case that may be written
in formal logic as follows:
(2) (x)[(Ax & Ex) => Cx] & (x)[(Ax & ~Ex) => ~Cx]
Namely: What if the real matter of the case is that all experienced A's are
C's (all experienced emeralds are green) but all non-experienced A's are not
C's (all non-experienced emeralds are blue)? (This is the most extreme
possibility, and in this way Goodman sharpened the problem. Otherwise, it is
the same as before.)
Again, one does not know what is the case, and all the evidence is at best
as the first conjunct of (2) renders it, while the conclusion one wishes to
draw by induction is in fact as follows:
(3) (x)[(Ax & Ex) => Cx] |- (x)[(Ax & ~Ex) => Cx].
This conforms to (1), and is merely a sharper form of it. But the
conclusion of it contradicts what may be the case, namely the second conjunct
of (2) i.e.
(x)[(Ax & ~Ex) => ~Cx].
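Goodman's point can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the cutoff `T_CUTOFF`, the observation list, and the helper `predicted_colour` are hypothetical, and "examined before t" is modelled simply by comparing an examination time with the cutoff.

```python
T_CUTOFF = 100  # hypothetical time t


def green(colour, examined_at):
    """'Green' applies regardless of when the thing is examined."""
    return colour == "green"


def grue(colour, examined_at):
    """'Grue' applies to things examined before t just in case they are green,
    and to other things just in case they are blue."""
    if examined_at < T_CUTOFF:
        return colour == "green"
    return colour == "blue"


# Assumed evidence: every emerald examined so far (before t) was green.
observations = [("green", 10), ("green", 42), ("green", 99)]

green_confirmed = all(green(c, t) for c, t in observations)
grue_confirmed = all(grue(c, t) for c, t in observations)


def predicted_colour(predicate, examined_at):
    """Colour an emerald must have at the given time to satisfy the predicate."""
    return next(c for c in ("green", "blue") if predicate(c, examined_at))


print(green_confirmed, grue_confirmed)
print(predicted_colour(green, 150), predicted_colour(grue, 150))
```

Both hypotheses fit every observation made before t, yet they make incompatible predictions for an emerald examined after t, which is the situation described by (2) and (3).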
4. Goodman's attempted solution: Goodman also came up with an attempted
solution, which is that some predicates (attributes, properties) are what he
called 'projectible'. However, nobody ever gave a clear definition of
what makes a predicate projectible or not, other than that the projectible
ones conform to (3) and the non-projectible ones do not. In extreme cases,
when all the evidence one has runs counter to all the cases one has no
evidence of, the non-projectible ones may conform to the second conjunct of
(2).
5. A partial solution and an old problem: The problem of induction is
especially important if one assumes that science is based on an inductive
schema like (1) or (3).
So part of the solution is to reject that assumption: Science is not based
on such a schema, but rather on abduction -
inference to the best explanation - and inductive confirmation and Bayesian
reasoning.
That is: One tries to account for the facts one wants to explain by guessing
a theoretical explanation for them, from which one can deduce those facts, and
from which one can also deduce further predictions, which may be used to test
the theory. If these further predictions turn out to be true, the theory is
confirmed, and more probable than before; if these further predictions turn
out to be false, the theory is refuted, and must be somehow changed or given
up.
This is a partial solution of the problem of induction, in that it is more
realistic about what science is and scientists
do, in that it gives up the notion that science is based on inductive
generalizations, and replaces it by the notion that it proceeds by a
combination of guessing, deducing, and testing. (See: Theory).
But this solution is partial only, in that in fact the old problem of
induction that Hume and Goodman noted reappears for confirmations and
refutations, also in a probabilistic form.
The reason it does is also similar, namely in general terms: How does one
know that such evidence as one has permits an inference about the cases about
which one does not have evidence? This can be clarified as follows.
In general terms, what one has is a theory T, from which one can deduce the
facts F that one wants to explain, and further predictions P that can be used
to test T. That is, one has:
(4) F & (T |= F) i.e. p(F|T)=1 - for theory T and facts F
(5) (T |= (p(P)=q)) i.e. p(P|T)=q - for theory T and predictions P
Now one wants to find p(T|P), especially if one finds that p(P)=1, that is,
in case one finds confirming evidence for T. Formally, one wants to argue then
by Bayes' schema: p(T|P) = p(P|T)*p(T)/p(P) = q*p(T)/p(P), and thus
recalculate the probability of T on the new evidence that P.
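As a worked instance of Bayes' schema, the following Python sketch recalculates the probability of T with exact fractions. The numbers are assumptions chosen for illustration only, and p(P) is here taken to be the probability of P before P was observed; the function name `posterior` is hypothetical.

```python
from fractions import Fraction


def posterior(prior_t, p_given_t, prob_p):
    """Bayes' schema: p(T|P) = p(P|T) * p(T) / p(P)."""
    if prob_p == 0:
        raise ValueError("p(P) must be positive to conditionalize on P")
    return p_given_t * prior_t / prob_p


# Assumed values, for illustration only.
prior_t = Fraction(1, 10)  # p(T): probability of the theory beforehand
q = Fraction(1)            # p(P|T): T entails the prediction P
prob_p = Fraction(1, 4)    # p(P): probability of P before it was observed

post = posterior(prior_t, q, prob_p)
print(post)
```

With these assumed values the confirmed prediction raises the probability of T from 1/10 to 2/5: the less probable the prediction was beforehand, the more its confirmation supports the theory.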
Incidentally, it may be argued (not quite correctly, but that may be left
out here, and anyway is also indicated by the point that follows) that in case
p(P)=0 then T simply is refuted. (Popper did so, but was mistaken for
the reason that follows.)
The fundamental problem is that one never has just p(T|P) - one always has
(6) p(T|P&X)
where X may be all manner of other facts that occur together with P,
that may or may not be relevant, and that may indeed also cover the
case considered above, namely that it concerns experienced P's - for in
the probabilistic form just given, and with 'E' for 'is experienced', the
problem is how to validate the move from p(T|P&E) to p(T|P), just like above
the problem was how to validate the move from (y)(Ty & Ey => Py) to
(y)(Ty => Py).
But the problem is also considerably more general: X may also include
references to the methodology of the experimental set-up; to the stars; to the
mole on the subject's face; to the number of days since the prophet Mohammed
died; or to anything else that happens to be also true when P is true - and
that may be relevant to P or to T or to both, or not.
6. The inductive condition: What one needs, it would seem, to deal with this
problem, is a postulate of the following form, that must be added to any
empirical theory one seriously proposes, and that I shall call IC (for
Inductive Condition).
It concerns a theory and its predictions, and is added to it as an
assumption. There are several possible equivalent statements of the
IC, one of which is:
(IC): For theories T, predictions P and circumstances Q:
T rel P => [P rel Q IFF (P rel Q | T)]
That is: If the theory T is relevant to P then the prediction P of
the theory is relevant to Q if and only if P is also relevant to Q if T is true.
Intuitively, a theory and a statement are irrelevant to anything they do
not imply anything about, while the irrelevancies in the hypothesis of
(IC) are defined as is usual in probability theory:
A irr B =def p(A&B)=p(A)*p(B), which in turn is equivalent to p(B|A)=p(B)
whenever p(A)>0.
The consequent of (IC) involves conditional irrelevance, which is defined
probabilistically as follows:
(7) P irr Q | T =def p(Q|P&T)=p(Q|P)
Having this, one can proceed as follows, noting that
[P rel Q IFF (P rel Q|T)] IFF [P irr Q IFF (P irr Q|T)], and supposing that
theory T makes prediction P which also is irrelevant to Q, while T satisfies
(IC):
(8) p(T|P&Q) = p(T&P&Q) / p(P&Q)              by def
             = p(T&P&Q) / (p(P)*p(Q))         by P irr Q
             = p(Q|T&P)*p(T&P) / (p(P)*p(Q))  by def
             = p(Q|P)*p(T&P) / (p(P)*p(Q))    by P irr Q | T
             = p(Q)*p(T&P) / (p(P)*p(Q))      by P irr Q
             = p(T|P)                         by def
And thus one has arrived where one wanted, only using (IC) plus probability
theory. And thus one can learn from experience, and confirm one's theories,
and those theories one does not need to infer from experience, but can merely
propose to explain one's experience, and then use further experience to
confirm or infirm one's theoretical guesses.
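The derivation in (8) can be checked numerically on a small joint distribution. The following Python sketch is an illustration under assumptions, not a proof: the probabilities are made up, and Q is constructed to be independent of T and P together, which is one simple way to satisfy both P irr Q and the conditional irrelevance defined in (7).

```python
from fractions import Fraction

# Assumed joint probabilities over (T, P); T entails P, so p(T & ~P) = 0.
p_tp = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(0),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(5, 10),
}
p_q = Fraction(1, 3)  # assumed probability of the circumstance Q

# Build the joint over (T, P, Q) with Q independent of (T, P).
joint = {}
for (t, p), pr in p_tp.items():
    joint[(t, p, True)] = pr * p_q
    joint[(t, p, False)] = pr * (1 - p_q)


def prob(event):
    """Probability of an event given as a predicate on (t, p, q)."""
    return sum(pr for world, pr in joint.items() if event(*world))


def cond(event, given):
    """Conditional probability p(event | given)."""
    return prob(lambda t, p, q: event(t, p, q) and given(t, p, q)) / prob(given)


def is_t(t, p, q): return t
def is_p(t, p, q): return p
def is_q(t, p, q): return q
def p_and_q(t, p, q): return p and q
def p_and_t(t, p, q): return p and t


# P irr Q: p(P&Q) = p(P)*p(Q)
p_irr_q = prob(p_and_q) == prob(is_p) * prob(is_q)
# (7) P irr Q | T: p(Q|P&T) = p(Q|P)
p_irr_q_given_t = cond(is_q, p_and_t) == cond(is_q, is_p)

lhs = cond(is_t, p_and_q)  # p(T|P&Q)
rhs = cond(is_t, is_p)     # p(T|P)
print(p_irr_q, p_irr_q_given_t, lhs, rhs)
```

In this constructed case both irrelevance conditions hold and p(T|P&Q) equals p(T|P), as (8) requires. If Q were instead made to depend on T, the two conditional probabilities would in general differ.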
Now, what are the reasons to adopt the inductive condition? It seems to me
there are four or five.
First, it works. It solves the problem we found.
This is a good basic reason, for in the end one does adopt hypotheses and
theories because they entail what one desires or knows to be true.
Second, it makes independent sense methodologically.
Namely: One must somehow, both when proposing and when testing an empirical
theory, abstract from many circumstances and one always does so in
fact, and it is much better to do this by an explicit assumption than
implicitly and ad hoc, and - as it were, or really - without knowing or
acknowledging this.
Third, it makes sense theoretically.
A good way to read and to defend the addition of (IC) to any empirical
theory is that it amounts to the following claim about the theory: All things
that are relevant to the theory and its predictions are in fact accounted for
by the theory, and
follow deductively from it. And thus, what is not accounted for by
the theory and does not follow from it, is in fact irrelevant to it.
This claim may be false, and indeed is false if one has
missed a factor that is relevant to what the theory claims (as may easily
happen, and indeed often happens), but that is precisely the reason why the
assumption should be added explicitly as an assumption.
And this claim can not be proved, since it is a
generalization about the whole universe and the theory, namely that the theory
contains all the relevant distinctions of relevance and irrelevance, and does
so logically: What is relevant should follow deductively from it, if it is a
good and true and testable theory.
However, this is not metaphysics, or else it is a minimal metaphysics
necessary for science, since the inductive condition, if true, explains why
science and empirical testing work, and how human beings can learn from
experience: By imaginative guessing, and by theories that imply all that is
relevant for their truth, and thus can be tested.
And otherwise, the proof of the pudding is in the eating, and the last 400
years of empirical science have shown incontrovertibly that the schema just
proposed works, and indeed also works if the inductive condition is not
explicitly assumed, for in actual practice all that needs to be true is that
human beings are capable of formulating empirical theories that can be tested
successfully and independently from most circumstantial facts that the theories
do not imply anything about.
Fourth, we can prove that the IC is necessary, if theories are
true, which they must be assumed to be at least when testing them:
What the IC amounts to logically speaking is this:
(9) ~(EQ)(T rel P & [~(P irr Q) & (P irr Q|T) V (P irr Q) & ~(P irr Q|T)])
And what the assumption that T is a theory amounts to, at least when
testing it, is
(10) (T)( T is a theory --> T is true )
Now since (P irr Q|T) IFF T |- P irr Q and ~(P irr Q|T) IFF (P rel Q|T)
(both by (7)) it follows from (10) that (9) must be true, for else T makes
predictions about (ir)relevance that are false in fact, and that (10)
forbids.
Thus the argument is in brief that:
To test theories, these must be assumed to be true. If true, they also
must be true about (ir)relevancies for their predictions. This can and often
must be assured to the best of one's knowledge methodologically, by the design
of experiments. And if indeed T is true about (ir)relevancies for its
predictions, one can abstract from irrelevancies when testing.
The problem is only that one can verify or assure that (9) is in fact true
by good methodology and experimental design only to a finite extent and
as far as one's knowledge goes, and not for everything in the universe.
So in general we must assume theories to be true when testing them; apart
from that we have usually a probability for them; and in fact we assume
tacitly or explicitly and with methodological care that the theory as stated
is true about anything whatsoever that is (ir)relevant to its predictions, and
we often take care in our experiments to control relevant factors in some way.
Fifthly, it may be remarked that the inductive condition is similar to
Ockham's Razor and to Newton's Rules of Reasoning, but distinct from both in
that it is explicitly probabilistic, and is concerned with irrelevance and
with learning from experience and the testing of theories.
It is simultaneously epistemological and ontological, that is, it concerns
both how we may come to know things and what reality is like, in that it
presumes that reality contains at least some features and facts that satisfy
the inductive condition in that those features and facts remain the same while
many other things may not, and thus one can abstract from those other things
when considering and indeed using these features and facts. (See also:
Invariance.)
Indeed, on a rather deep level the IC is related to compactness and
the notion that the universe is such that it can be at least partially
explained in terms of finite sets of hypotheses about possibly infinite
domains or possibilities.
7. Application to the problem of induction: Now to return to the classical
problem of induction as formulated by Hume and sharpened by Goodman.
What the above considerations show is that for the classical problems the
claim is that the actual experiences one has are irrelevant to the truth of
the theory: The evidence one has actually gathered is not special in the sense
that later evidence or other evidence would have a different import on what
the theory proposes in fact - or if it does, as in the case of supposedly grue
rather than green emeralds, this should explicitly follow from the theory.
Therefore also, the problem with Goodman's problem is that there is, in
fact, not one theory about the color of emeralds, but there are two: One in
which the time at which the evidence is gathered is not relevant to the color
of the emerald, and one in which it is. And incidentally: That 'grue' is a
curious predicate is not at all relevant - one has, for example, a similar
case in many practical situations with real things that have a limited
life-time. (The number of remaining days to live is rather strongly dependent
on age, and that one has lived one more day does not make it more probable
that one will live one more day, but less probable, ceteris paribus.)
In any case, as far as the problem whether emeralds are grue or green is
concerned, the answer is that the rational assumption on such evidence as one
has is that they are green and not grue, for so far there have been no blue
emeralds at all (and many that were green to start with when found and that
continued to be green ever since).
And the above solution also gives a general clue about Goodman's notion of
projectible predicates, and theories, for theories can also be construed
as predicates:
A theory is projectible if it is supposed to satisfy the
inductive condition, for then it is testable for the reason given above
in (8). And this amounts to an assumption that the theory already entails
all the facts and factors relevant to its truth and its testing - and
thereby also gives a clue to what is wrong if the theory gets refuted: The
theory failed to imply at least one relevant fact or factor to what it
proposes, whatever it may be. And this may be a simple factual falsehood that
it implies to be true, or it may be a factor that contributes to one of the
things it implies, that the theory itself does not imply, and that thus may
have been missed in the design of the experiment to test the theory.
Normally, the inductive condition is not explicitly added to an empirical
theory, even though it is in fact always tacitly assumed when testing the
theory, as has been explained, for one must and does abstract from very many
attending circumstances and facts when testing it.
It is better that one is at least consciously aware that one assumes the
inductive condition, since this may help to understand what may have gone
wrong, and also helps to prevent the ad hoc fallacy - for by the
inductive condition later ad hoc assumptions made in order to save a refuted
theory are forbidden, or count as new theories, and not as savings of an old
one. For one way to read the (IC) is as the claim that the theory itself
entails and should entail everything that is relevant to its truth or falsity,
and that a theory which does not do so is thereby no good as a theory, even if
it may be good as a stepping stone towards one.