Maarten Maartensz:    Philosophical Dictionary | Filosofisch Woordenboek                      

 I - Induction - Problem of


Problem of Induction: There is a notion - first argued strongly by Francis Bacon - that one can learn from experience by finding evidence that permits generalizations from the evidence, and thus allows one to find laws of nature. The problem is that these generalizations are not deductively valid.

A classical example is that one may have seen quite a few white swans, and conclude therefrom that 'All swans are white', yet in Australia there are black swans, and thus the generalization is false.

There are infinitely many other possible examples of the same form, that all share the same feature: If one has only noted some of the cases that are A and one has found that all the cases in one's experience of A also are B, it does not follow with deductive necessity that all A's are B's i.e. that all non-experienced A's also are B's.

Even so, it would seem that empirical science is founded on inductive generalizations of the form: There are A's. All experienced A's are B's. Therefore all A's are B's. At least, this is what Bacon argued, and many scientists since have believed, with more or less statistical sophistication, for many techniques have been developed in statistics to infer the characteristics of a population from a sample of it, and these techniques often work in practice - indeed to the extent that insurance companies can make profits based on them, by letting their customers pay more in premiums than the companies pay out in damages.
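The statistical point can be illustrated with a minimal sketch, using hypothetical numbers: one estimates the proportion of B's in a population from a sample, with a standard interval for the uncertainty.

```python
import random

random.seed(0)  # deterministic for the example

# A hypothetical population in which 30% of the members have
# property B; in practice this proportion is unknown.
population = [True] * 3000 + [False] * 7000
random.shuffle(population)

# One only inspects a sample, as an insurer or scientist must.
sample = population[:400]
p_hat = sum(sample) / len(sample)

# Standard error of a sample proportion: sqrt(p(1-p)/n),
# giving an approximate 95% interval for the true proportion.
se = (p_hat * (1 - p_hat) / len(sample)) ** 0.5
lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se

print(f"estimate: {p_hat:.3f}, 95% interval: ({lo:.3f}, {hi:.3f})")
```

Note that the interval itself rests on the assumption the text questions: that the unsampled members resemble the sampled ones.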

1. Hume's Problem
2. Newton's attempted solution
3. Goodman's problem
4. Goodman's attempted solution
5. A partial solution and an old problem
6. The inductive condition
7. Application to the problem of induction

1. Hume's Problem: The fact that a kind of inference is often used and often (seems to) work in practice may be a good practical reason to rely on it, but this does not constitute a good theoretical reason for it.

Indeed, using some formal logic, we can write the inductive schema of inference as follows:

(1) (x)[(Ax & Ex) => Cx] |- (x)[Ax => Cx]
           Ad & ~Ed & ~Cd         Ad & ~Cd

which may be read as: If all experienced A's are C's then all A's are C's. As the line under it shows, it is easy to generate counter-examples using any d that has not been experienced and that makes the antecedent of the conclusion true and its consequent false: (1) is not deductively valid.
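The invalidity of (1) can be checked mechanically on a small finite domain. A hypothetical sketch, with A = 'is a swan', E = 'is experienced', C = 'is white':

```python
# A tiny domain of individuals, each a record (name, A, E, C):
# A = 'is a swan', E = 'is experienced', C = 'is white'.
# d is a swan that was never experienced and is not white.
domain = [
    ("a", True, True, True),    # experienced white swan
    ("b", True, True, True),    # experienced white swan
    ("d", True, False, False),  # unexperienced black swan
]

# Premise of (1): every experienced A is a C.
premise = all(c for _, a, e, c in domain if a and e)

# Conclusion of (1): every A is a C.
conclusion = all(c for _, a, e, c in domain if a)

print(premise, conclusion)  # True False - premise holds, conclusion fails
```

Any such d suffices: the premise is vacuously true of everything unexperienced, while the conclusion is not.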

Several persons - Nicholas of Autrecourt, Newton, Leibniz - noted that inductive generalizations are not deductively valid, but David Hume first put the general problem with great clarity, and related it also to ordinary reasoning in terms of causes and effects:

If reason determin'd us, it wou'd proceed upon that principle, that instances, of which we have had no experience, must resemble those, of which we have had experience, and that the course of nature continues always uniformly the same. (..) (p. 137)

Our foregoing method of reasoning will easily convince us, that there can be no demonstrative arguments to prove, that those instances, of which we had no experience, resemble those, of which we had experience. (p.137)

The idea of cause and effect is deriv'd from experience, which informs us, that such particular objects, in all past instances, have been constantly conjoin'd with each other: And as an object similar to one of those is suppos'd to be immediately present in its impression, we thence presume on the existence of one similar to its usual attendant. According to this account of things, which is, I think, in every point unquestionable, probability is founded upon the presumption of a resemblance betwixt those objects, of which we have had experience, and those of which we had none; and therefore it is impossible this presumption can arise from probability. (...) (p. 138)

Shou'd anyone think to elude this argument; and without determining whether our reasoning on this subject be deriv'd from demonstration or probability, pretend that all conclusions from causes and effects are built on solid reasoning: I can only desire, that this reasoning may be produc'd (...) (p. 138) (A Treatise of Human Nature)

Note the real problems here if indeed, as many have claimed, the schema given by (1) is the schema humans use to learn from experience:

First, it is not deductively valid. Second, it would seem that all evidence for it, such as a supposed uniformity of nature, involves the same invalid schema of reasoning to be established. Third, to claim that the schema is valid if qualified probabilistically, say in the form 'All experienced A's are C's. Therefore probably all A's are C's', also does not help, unless one knows there is a uniformity of nature, which we saw to be problematic, or unless one knows that what was not experienced is like what was experienced, which is begging the question at issue.

That is, in general terms, Hume's Problem of Induction is that schema (1) is not deductively valid, and can be justified only by itself, or by a presumption of the uniformity of nature, which again requires schema (1) to be established, and anyway is known to be not true in all cases.

2. Newton's attempted solution: Newton certainly was aware of the problem of induction, and indeed added a special section to the second edition of the Principia, in which he proposed Rules of Reasoning specifically addressed to the problem; these rules were very probably Hume's starting point for his doubts about induction.

Newton's Rules of Reasoning minus his comments (to some of which I will return) are the following, where it should be realised that in the following quotation Newton meant by "experimental philosophy" what we call "natural science", and that a shorter version of "which admit neither intensification nor remission of degrees" is "which are invariant".

Rule I : We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances.
Rule II : Therefore to the same natural effects we must, as far as possible, assign the same reasons.
Rule III : The qualities of bodies, which admit neither intensification nor remission of degrees, and which are found to belong to all bodies within the reach of our experiments, are to be esteemed the universal qualities of all bodies whatsoever.
Rule IV : In experimental philosophy we are to look upon propositions inferred by general induction from phenomena as accurately or very nearly true, notwithstanding any contrary hypotheses that may be imagined, till such time as other phenomena occur, by which they may be either made more accurate, or liable to exception.

The problem is that, without further argument, Hume's objections hold against these rules, since Rule III, which is the most important, is merely an instance of schema (1).

3. Goodman's problem: Thus far, we have considered the problem as it was in the early 18th Century. It was never satisfactorily answered, except by noting that in practice inductions often worked.

In the 20th Century Nelson Goodman sharpened it as follows:

"Suppose that all emeralds examined before a certain time t are green. At time t, then, our observations support our hypothesis that all emeralds are green; and this is in accord with our definition of confirmation. Our evidence statements assert that emerald a is green, that emerald b is green, and so on; and each confirms the general hypothesis that all emeralds are green. So far, so good. Now let me introduce another predicate less familiar than "green". It is the predicate "grue" and it applies to all things examined before t just in case they are green but to other things just in case they are blue. Then at time t we have, for each evidence statement that a certain emerald is green, a parallel evidence statement that the emerald is grue. And the statements that emerald a is grue, that emerald b is grue, and so on, will each confirm the general hypothesis that all emeralds are grue. Thus according to our definition, the prediction that all emeralds subsequently examined will be green and the prediction that all emeralds will be grue are alike confirmed by the evidence statements describing the same observations. But if an emerald subsequently examined is grue, it is blue and hence not green. Thus although we are well aware which of the two incompatible predictions is genuinely confirmed, they are equally well confirmed according to our present definition. Moreover, it is clear that if we simply choose an appropriate predicate, then on the basis of these same observations we shall have equal confirmation, by our definition, for any prediction whatever about other emeralds - or indeed anything else." (Problems and Projects, p. 381-2)

This gave rise to an enormous number of papers, most of which take issue with irrelevancies, like the reference to time, the oddity of predicates like "grue", or considerations about when a generalisation is a law or law-like, and miss the principal point, which is the following.

What Goodman in fact considered was a case that may be written in formal logic as follows:

(2) (x)[(Ax & Ex) => Cx] & (x)[(Ax & ~Ex) => ~Cx]

Namely: What if the real matter of the case is that all experienced A's are C's (all experienced emeralds are green) but all non-experienced A's are not C's (all non-experienced emeralds are blue)? (This is the most extreme possibility, and in this way Goodman sharpened the problem. Otherwise, it is the same as before.)
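Schema (2) can be made concrete in a toy model. The following is a hypothetical sketch ('examined before t' plays the role of E, 'green' of C), which also exhibits Goodman's point that the evidence confirms 'all emeralds are green' and 'all emeralds are grue' equally:

```python
# Toy emerald records: (name, examined_before_t, green).
# In the extreme 'grue' world of (2), all examined emeralds are
# green and all unexamined ones are blue (i.e. not green).
emeralds = [
    ("a", True, True), ("b", True, True),      # examined, green
    ("x", False, False), ("y", False, False),  # unexamined, blue
]

def grue(examined, green):
    # Grue: green if examined before t, blue otherwise.
    return green if examined else not green

evidence = [e for e in emeralds if e[1]]  # only the examined ones

# On the evidence alone, both hypotheses are equally confirmed:
all_green_on_evidence = all(green for _, _, green in evidence)
all_grue_on_evidence = all(grue(ex, gr) for _, ex, gr in evidence)

# Yet over the whole domain they come apart:
all_green = all(green for _, _, green in emeralds)
all_grue = all(grue(ex, gr) for _, ex, gr in emeralds)

print(all_green_on_evidence, all_grue_on_evidence)  # True True
print(all_green, all_grue)  # False True
```

The evidence alone cannot distinguish the two hypotheses; only the unexamined cases do.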

Again, one does not know what is the case, and all the evidence is at best as the first conjunct of (2) renders it, while the conclusion one wishes to draw by induction is in fact as follows:

(3) (x)[(Ax & Ex) => Cx]  |- (x)[(Ax & ~Ex) => Cx].

This conforms to (1), and is merely a sharper form of it. But the conclusion of it contradicts what may be the case, namely the second conjunct of (2) i.e. (x)[(Ax & ~Ex) => ~Cx].

4. Goodman's attempted solution: Goodman also came with an attempted solution, which is that some predicates (attributes, properties) are what he called 'projectible'. However, nobody ever gave a clear definition of what makes a predicate projectible or not, other than that the projectible ones conform to (3), while the non-projectible ones do not, and may, in extreme cases, when all the evidence one has runs counter to all the cases one has no evidence of, conform to the second conjunct of (2).

5. A partial solution and an old problem: The problem of induction is especially important if one assumes that science is based on an inductive schema like (1) or (3).

So part of the solution is to reject that assumption: Science is not based on such a schema, but rather on abduction - inference to the best explanation - and inductive confirmation and Bayesian reasoning.

That is: One tries to account for the facts one wants to explain by guessing a theoretical explanation for them, from which one can deduce those facts, and from which one can also deduce further predictions that may be used to test the theory. If these further predictions turn out to be true, the theory is confirmed, and more probable than before; if they turn out to be false, the theory is refuted, and must somehow be changed or given up.

This is a partial solution of the problem of induction, in that it is more realistic about what science is and scientists do, in that it gives up the notion that science is based on inductive generalizations, and replaces it by the notion that it proceeds by a combination of guessing, deducing, and testing. (See: Theory).

But this solution is partial only, in that in fact the old problem of induction that Hume and Goodman noted reappears for confirmations and refutations, also in a probabilistic form.

The reason it does is also similar. In general terms it is this: How does one know that such evidence as one has permits an inference about the cases about which one does not have evidence? This can be clarified as follows.

In general terms, what one has is a theory T, from which one can deduce the facts F that one wants to explain, and further predictions P that can be used to test T. That is, one has:

(4) F & (T |= F)     i.e. p(F|T)=1 - for theory T and facts F
(5) T |= (p(P)=q)    i.e. p(P|T)=q - for theory T and predictions P

Now one wants to find p(T|P), especially if one finds that p(P)=1, that is, in case one finds confirming evidence for T. Formally, one wants to argue then by Bayes' schema: p(T|P) = p(P|T).p(T):p(P) = q.p(T):p(P), and thus recalculate the probability of T on the new evidence that P.
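Bayes' schema can be sketched numerically. The values below are hypothetical, chosen only to show how finding the prediction true raises the probability of the theory:

```python
def bayes_update(p_T, p_P_given_T, p_P):
    """Bayes' schema: p(T|P) = p(P|T) * p(T) / p(P)."""
    return p_P_given_T * p_T / p_P

# Hypothetical values: prior p(T) = 0.3; the theory makes its
# prediction probable, q = p(P|T) = 0.9; while the prediction
# is less probable overall, p(P) = 0.5.
p_T, q, p_P = 0.3, 0.9, 0.5

posterior = bayes_update(p_T, q, p_P)
print(posterior)  # roughly 0.54: finding P true raises p(T) from 0.3
```

Since p(P|T) > p(P), confirmation raises the probability of T; had the theory made P less probable than it is overall, finding P true would have lowered it.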

Incidentally, it may be argued (not quite correctly, but that may be left out here, and anyway is also indicated by the point that follows) that in case p(P)=0 then T simply is refuted. (Popper did so, but was mistaken for the reason that follows.)

The fundamental problem is that one never has just p(T|P) - one always has

(6) p(T|P&X)

where X may be all manner of other facts that occur together with P, that may or may not be relevant, and that may indeed also cover the case considered above, namely that it concerns experienced P's. For in the probabilistic form just given, and with 'E' for 'is experienced', the problem is how to validate the move from p(T|P&E) to p(T|P), just as above the problem was how to validate the move from (y)[(Ty & Ey) => Py] to (y)[Ty => Py].

But the problem is also considerably more general: X may also include references to the methodology of the experimental set-up; to the stars; to the mole on the subject's face; to the number of days since the prophet Mohammed died; or to anything else that happens to be also true when P is true - and that may be relevant to P or to T or to both, or not.

6. The inductive condition: What one needs, it would seem, to deal with this problem, is a postulate of the following form, that must be added to any empirical theory one seriously proposes, and that I shall call IC (for Inductive Condition). It concerns a theory and its predictions, and is added to it as an assumption. There are several possible equivalent statements of the IC, one of which is:

(IC): For theories T, predictions P and circumstances Q:
      T rel P => (P rel Q IFF (P rel Q|T))

That is: If the theory T is relevant to P then the prediction P of the theory is relevant to Q if and only if P is also relevant to Q if T is true.

Intuitively, a theory and a statement are irrelevant to anything they do not imply anything about, while the unconditional (ir)relevance in (IC) is defined as is usual in probability theory: A irr B =def p(A&B)=p(A).p(B), which in turn is equivalent to p(B|A)=p(B) in case p(A)>0; and A rel B is the negation of A irr B.

The consequent of (IC) - the claim of conditional (ir)relevance - is defined probabilistically as follows:

(7) (P irr Q|T) =def p(Q|P&T) = p(Q|P)

Having this, one can proceed as follows, noting that 'P rel Q IFF (P rel Q|T)' is equivalent to 'P irr Q IFF (P irr Q|T)', and supposing that theory T makes a prediction P that is irrelevant to Q, while T satisfies (IC):

(8) p(T|P&Q) = p(T&P&Q)/p(P&Q)               by def
             = p(T&P&Q)/(p(P).p(Q))          by P irr Q
             = p(Q|T&P).p(T&P)/(p(P).p(Q))   by def
             = p(Q|P).p(T&P)/(p(P).p(Q))     by (P irr Q|T)
             = p(Q).p(T&P)/(p(P).p(Q))       by P irr Q
             = p(T&P)/p(P)                   by cancellation
             = p(T|P)                        by def
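The derivation in (8) can be checked numerically on a joint distribution over T, P, Q that satisfies both irrelevance assumptions. The distribution below is hypothetical, built for the purpose by drawing Q independently of (T, P):

```python
from itertools import product

# A joint distribution over (T, P, Q) in which Q is drawn
# independently of (T, P), so that both 'P irr Q' and the
# conditional irrelevance of (7), p(Q|P&T) = p(Q|P), hold.
p_TP = {(1, 1): 0.24, (1, 0): 0.06, (0, 1): 0.28, (0, 0): 0.42}
p_Q = {1: 0.4, 0: 0.6}
joint = {(t, p, q): p_TP[(t, p)] * p_Q[q]
         for t, p, q in product((0, 1), repeat=3)}

def prob(pred):
    # Probability of the event picked out by pred(t, p, q).
    return sum(v for (t, p, q), v in joint.items() if pred(t, p, q))

# Check (7): p(Q|P&T) = p(Q|P).
p_Q_given_PT = prob(lambda t, p, q: q and p and t) / prob(lambda t, p, q: p and t)
p_Q_given_P = prob(lambda t, p, q: q and p) / prob(lambda t, p, q: p)

# The conclusion of (8): p(T|P&Q) = p(T|P).
p_T_given_PQ = prob(lambda t, p, q: t and p and q) / prob(lambda t, p, q: p and q)
p_T_given_P = prob(lambda t, p, q: t and p) / prob(lambda t, p, q: p)

print(round(p_T_given_PQ, 6), round(p_T_given_P, 6))  # equal, as (8) derives
```

Under these assumptions, observing Q alongside P neither raises nor lowers the probability of T, which is exactly what (8) claims.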

And thus one has arrived where one wanted, only using (IC) plus probability theory. And thus one can learn from experience, and confirm one's theories, and those theories one does not need to infer from experience, but can merely propose to explain one's experience, and then use further experience to confirm or infirm one's theoretical guesses.

Now, what are the reasons to adopt the inductive condition? It seems to me there are four or five.

First, it works. It solves the problem we found.

This is a good basic reason, for in the end one does adopt hypotheses and theories because they entail what one desires or knows to be true.

Second, it makes independent sense methodologically.

Namely: One must somehow, both when proposing and when testing an empirical theory, abstract from many circumstances, and one always does so in fact; and it is much better to do this by an explicit assumption than implicitly and ad hoc, and - as it were, or really - without knowing or acknowledging this.

Third, it makes sense theoretically.

A good way to read and to defend the addition of the (IC) to any empirical theory is that it amounts to the following claim about the theory: All things that are relevant to the theory and its predictions are in fact accounted for by the theory, and follow deductively from it. And thus, what is not accounted for by the theory and does not follow from it, is in fact irrelevant to it. 

This claim may be false, and indeed is false if one has missed a factor that is relevant to what the theory claims (as may easily happen, and indeed often happens), but that is precisely the reason why the assumption should be added explicitly as an assumption.

And this claim can not be proved, since it is a generalization about the whole universe and the theory, namely that the theory contains all the relevant distinctions of relevance and irrelevance, and does so logically: What is relevant should follow deductively from it, if it is a good and true and testable theory.

However, this is not metaphysics, or else it is a minimal metaphysics necessary for science, since the inductive condition, if true, explains why science and empirical testing works, and how human beings can learn from experience: By imaginative guessing, and by theories that imply all that is relevant for their truth, and thus can be tested.

And otherwise, the proof of the pudding is in the eating, and the last 400 years of empirical science have shown incontrovertibly that the schema just proposed works, and indeed also works if the inductive condition is not explicitly assumed, for in actual practice all that needs to be true is that human beings are capable of formulating empirical theories that can be tested successfully and independently from most circumstantial facts that the theories do not imply anything about.

Fourth, we can prove that the IC is necessary, if theories are true, which they must be assumed to be at least when testing them:

What the IC amounts to logically speaking is this:

(9) ~(EQ)(T rel P & [(~(P irr Q) & (P irr Q|T)) V ((P irr Q) & ~(P irr Q|T))])

And what the assumption that T is a theory amounts to, at least when testing it, is

(10) (T)( T is a theory --> T is true )

Now since (P irr Q|T) IFF T |- P irr Q, and ~(P irr Q|T) IFF (P rel Q|T) (both by (7)), it follows from (10) that (9) must be true, for else T makes predictions about (ir)relevance that are false in fact, which (10) forbids.

Thus the argument is in brief that:

To test theories, these must be assumed to be true. If true, they must also be true about the (ir)relevancies for their predictions. This can and often must be assured, to the best of one's knowledge, methodologically, by the design of experiments. And if indeed T is true about the (ir)relevancies for its predictions, one can abstract from irrelevancies when testing.

The problem is only that one can verify or assure that (9) is in fact true, by good methodology and experimental design, only to a finite extent and as far as one's knowledge goes, and not for everything in the universe.

So in general we must assume theories to be true when testing them; apart from that we have usually a probability for them; and in fact we assume tacitly or explicitly and with methodological care that the theory as stated is true about anything whatsoever that is (ir)relevant to its predictions, and we often take care in our experiments to control relevant factors in some way.

Fifth, it may be remarked that the inductive condition is similar to Ockham's Razor and to Newton's Rules of Reasoning, but distinct from both in that it is explicitly probabilistic, and is concerned with irrelevance and with learning from experience and the testing of theories.

It is simultaneously epistemological and ontological, that is, it concerns both how we may come to know things and what reality is like, in that it presumes that reality contains at least some features and facts that satisfy the inductive condition in that those features and facts remain the same while many other things may not, and thus one can abstract from those other things when considering and indeed using these features and facts. (See also: Invariance)

Indeed, on a rather deep level the IC is related to compactness and the notion that the universe is such that it can be at least partially explained in terms of finite sets of hypotheses about possibly infinite domains or possibilities.

7. Application to the problem of induction: Now to return to the classical problem of induction as formulated by Hume and sharpened by Goodman.

What the above considerations show is that, for the classical problems, the claim is that the actual experiences one has are irrelevant to the truth of the theory: The evidence one has actually gathered is not special, in the sense that later evidence or other evidence would not have a different import on what the theory proposes - or if it would, as in the case of supposedly grue rather than green emeralds, this should explicitly follow from the theory.

Therefore also, the problem with Goodman's problem is that there is, in fact, not one theory about the color of emeralds, but two: One in which the times at which the evidence was gathered are not relevant to the color of the emeralds, and one in which they are. And incidentally: That 'grue' is a curious predicate is not at all relevant - one has, for example, a similar case in many practical situations with real things that have a limited life-time. (The number of remaining days to live depends rather strongly on age, and that one has lived one more day does not make it more probable that one will live one more day, but less probable, ceteris paribus.)

In any case, as far as the problem whether emeralds are green or grue is concerned, the answer is that the rational assumption on such evidence as one has is that they are green and not grue, for so far there have been no grue emeralds at all (and many that were green to start with when found, and that continued to be green ever since).

And the above solution also gives a general clue about Goodman's notion of projectible predicates, and theories, for theories can also be construed as predicates:

A theory is projectible if it is supposed to satisfy the inductive condition, for then it is testable for the reason given above in (8). And this amounts to an assumption that the theory already entails all the facts and factors relevant to its truth and its testing - and thereby also gives a clue to what is wrong if the theory gets refuted: The theory failed to imply at least one fact or factor relevant to what it proposes, whatever that may be. This may be a simple falsehood the theory implies is true, or it may be a contributing factor to one of the things it implies, which the theory does not imply and which thus may have been missed in the design of the experimental test of the theory.

Normally, the inductive condition is not explicitly added to an empirical theory, even though it is in fact always tacitly assumed when testing the theory, as has been explained, for one must and does abstract from very many attending circumstances and facts when testing it.

It is better that one is at least consciously aware that one assumes the inductive condition, since this may help to understand what may have gone wrong, while it also helps to prevent the ad hoc fallacy - for by the inductive condition, later ad hoc assumptions made in order to save a refuted theory are forbidden, or count as new theories, not as savings of an old one. For one way to read the (IC) is as the claim that the theory itself entails, and should entail, everything that is relevant to its truth or falsity, and that a theory which does not do so is thereby no good as a theory, even if it may be good as a stepping stone towards one.


See also: Abduction, Induction, Deduction, Inference, Invariance, Logic, Probability, Theory


Ayer, Hume, Goodman, Howson & Urbach, Maartensz, Rescher, Russell, Stegmüller

Original: Mar 22, 2005 - Last edited: 12 December 2011.