Abstract: This paper clarifies the fundamental principle of valid reasoning by
dividing the principles of valid reasoning into deductions, abductions and
inductions, and providing formalizations for each kind in terms of
elementary probability theory and elementary set theory, both of which the
paper presumes. It introduces two new probabilistic assumptions, one for abductions and
one for inductions. If accepted, these assumptions make abductions and
inductions into valid deductions depending on these assumptions in
probability theory.
Sections:
(0) Introduction
(1) Deduction
(2) Abduction
(3) Induction
(4) Discussion of the abductive and inductive postulates
See also:
Fundamental principles of invalid reasoning
"Probability is the guide to
life".
Bishop Butler
(0) Introduction: In this paper I shall understand by
reasoning any verbal inferencing using assumptions and conclusions, and by
valid reasoning any instance of reasoning with a conclusion that is true if the assumptions given
for it are true.
It is true - I shall assume - that there is more to human
reasoning than verbal inferencing using assumptions and conclusions,
and it is also true that reasoning may be very useful when it is not valid.
The reason to restrict myself to valid
reasoning as defined is that most human reasoning can be cast in the form of
verbal inferencing using assumptions and conclusions, and that valid reasoning
has the very useful property of always leading from (presumptive) truths to
(presumptive) truths.
This property guarantees that as
long as one does use valid reasoning, one's conclusions are true whenever
one's assumptions are true, and therefore also that one's assumptions are
false if one's validly inferred conclusions are in fact false.
Thus valid reasoning gives us a
way to find further truths from true premisses, and to find that assumed
premisses are false if they validly imply conclusions that are false.
There are three basic kinds of reasoning, where
reasoning involves argumentation of any kind using assumptions and inferences
of conclusions:
1. Deductions: To find conclusions that
follow from given assumptions
2. Abductions: To find assumptions from which given conclusions follow
3. Inductions: To confirm or infirm assumptions by showing their conclusions
do (not) conform to the observable facts.
Normally in reasoning all three kinds are involved: We
explain supposed facts by abductions; we check the abduced assumptions by
deducing the facts they were to explain; and we test the assumptions arrived
at inductively by deducing consequences and then revising the probabilities
of the assumptions by probabilistic reasoning when these consequences are verified or falsified.
Here are some quotations of
C.S. Peirce on the subject of abduction:
"Abduction.
(..) "Hypothesis [or abduction] may be defined as an argument which
proceeds upon the assumption that a character which is known necessarily
to involve a certain number of others, may be probably predicated of any
object which has all the characteristics which this character is known
to involve." (5.276) "An abduction is [thus] a method of
forming a general prediction." (2.269) But this prediction is
always in reference to an observed fact; indeed, an abductive conclusion
"is only justified by its explaining an observed fact." (1.89)
If we enter a room containing a number of bags of beans and a table upon
which there is a handful of white beans, and if, after some searching,
we open a bag which contains white beans only, we may infer as a
probability, or fair guess, that the handful was taken from this bag.
This sort of inference is called making an hypothesis or abduction. (J.
Feibleman, "An Introduction to the Philosophy of Charles S. Peirce",
p. 121-2. The numbers referred to are to paragraphs in Peirce's
"Collected Papers".)
Now it is convenient to formalize the three
types of inference I have stated in words above in
the form of three general principles of inference. This can be done by
using the theory of probability and some basic standard set theory, with
both of which I assume familiarity in this paper. [1]
Since I assume familiarity with elementary probability theory, it should be
remarked at the start that in this paper I
have suppressed all conventional "pr(.)"-notation by relying on the fact that
all probabilities can be written with a conditional-probability sign when
absolute probabilities are taken as probabilities conditional on the whole
universe K. Thus, the probabilistic notation I use conforms to the following:
pr(s|K&T) = s|K&T and
pr(s) = pr(s|K) = s|K = s.
The reason to prefer the notation of the RHS (right-hand side) of the
equalities over that on their LHS is that using the format of the RHS avoids
many redundant occurrences of the functional operator "pr(.)". This device
makes it possible to write elementary probability-theory almost exactly like
propositional logic, where the formulas of probability-theory are
distinguished by containing the mark of conditional probability and by
being statements of (in)equalities.
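The notational convention can be illustrated with a minimal sketch in Python, assuming a small finite universe. The die universe, the helper name cond and the events even and high are hypothetical illustrations and not part of the formalism of this paper.

```python
from fractions import Fraction

# Hypothetical toy universe K: the six faces of one fair die, equally weighted.
K = {face: Fraction(1, 6) for face in range(1, 7)}

def cond(s, given=None, universe=K):
    """Return s|given: the probability of event s conditional on event `given`.

    Events are sets of outcomes. When `given` is omitted the condition is the
    whole universe K, so cond(s) plays the role of the absolute probability s.
    """
    given = set(universe) if given is None else set(given) & set(universe)
    total = sum(universe[w] for w in given)
    if total == 0:
        raise ValueError("conditioning event has probability 0")
    return sum(universe[w] for w in given if w in s) / total

even = {2, 4, 6}
high = {4, 5, 6}

absolute = cond(even)            # even|K, written simply as "even"
conditional = cond(even, high)   # even|K&high
```

Here cond(even) and cond(even, set(K)) are the same number, which is the sense in which an absolute probability is a probability conditional on the whole universe K.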
It is important to see that the resulting
probabilified version of propositional logic is considerably more subtle and
has far more possibilities for different types of logical analysis than does
standard propositional logic without probability. [2]
This last claim is supported by the following
formalizations of the three fundamental principles of reasoning, that are
given with the help of probability-theory. In each case I give a definition of
the general conditions that make the kind of reasoning valid in elementary
probability theory and use this definition to give a rule of inference that
conforms to the definition.
I start with deduction, because this is best
known and simplest, and because I shall make abduction and induction into
deductions that use special assumptions. Also I presume in the rest of this
note the following notational conventions:
K = knowledge assumed, T = specific theory
added to K, and s = specific statement.
K and T are supposed to be sets of statements, that can be rendered as a
conjunction, and s a statement.
The main reasons to include K are that in
fact in all reasoning (presumed) background knowledge is used, and that indeed
often the (in)validity of one's reasoning is due to one's presumed background
knowledge, and that this is very easily accommodated by the formalism
of probability theory, namely as part of one's conditional probabilities.
(1) Deduction: To find conclusions
that follow from given assumptions:
s|K=1 is a valid deduction from
T given K
=def
s|K&T=1 & T|K=1
Thus a valid deduction (using
probability theory) is a probabilistically
certain conclusion from a (presumptively) certain theory, assuming also
background knowledge K. Of
course, T|K=1 may be
withdrawn: All one often needs and can get is "if the theory is true,
then given the knowledge we presume, it must also be true that ...". The
general pattern of deductions is the familiar "If so-and-so is true, then
such-and-such is true, and so-and-so is true, therefore such-and-such is a valid
deduction."
And a deduction is the inference (where
"A |= B" = "B is a valid inference from premisses A") that conforms to:
s|K&T=1 & T|K=1 |= s|K=1
and is used in any case where one infers a
conclusion from premisses (indeed also where in fact s|K&T=1 is false, and
therefore the deduction invalid, which we shall not consider in this paper).
In a valid deduction we infer a fact s as
conclusion from a fact T, presuming K, and the fact s follows from K&T. The
deductive reason for s|K=1 is s|K&T=1 & T|K=1 i.e. s is true because it follows
from K&T and K&T is true, presuming K.
There is nothing new here (if you know some
probability theory and logic) apart from the facts that everything is stated in probabilistic terms and
that the statement is explicit about the presence of some
assumed background knowledge K. One advantage of including this reference to
assumed background knowledge in a probabilistic setting is that if one has
s|K&T=1, it is immediately obvious that if T|K=x then
s|K≥x, and thus one has a way of dealing explicitly with uncertainty of one's premisses.
It also makes sense to point out that the assumption [(s|K&T)=1] also allows for
the possibility that s is a probabilistic statement, as in [((This die
has a probability of 1/6 to fall with 3 facing)|K&(This die is fair))=1].
[3]
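How a deduction carries uncertainty from its premisses to its conclusion can be sketched in Python. The sketch rests on the law of total probability: if s|K&T=1, then s|K = T|K + s|K&~T*(1-T|K), so s|K is at least T|K. The function name conclusion_probability and the sample numbers are hypothetical.

```python
from fractions import Fraction

def conclusion_probability(t_given_k, s_given_k_not_t=Fraction(0)):
    """Return s|K for a conclusion s with s|K&T = 1.

    Law of total probability: s|K = s|K&T * T|K + s|K&~T * ~T|K.
    With s|K&T = 1 this is T|K + s|K&~T * (1 - T|K), which is never below T|K.
    """
    for p in (t_given_k, s_given_k_not_t):
        if not 0 <= p <= 1:
            raise ValueError("probabilities must lie between 0 and 1")
    return t_given_k + s_given_k_not_t * (1 - t_given_k)

certain = conclusion_probability(Fraction(1))                      # T|K = 1
worst_case = conclusion_probability(Fraction(4, 5))                # s|K&~T = 0
typical = conclusion_probability(Fraction(4, 5), Fraction(1, 2))   # s|K&~T = 1/2
```

When T|K=1 the conclusion is certain, as in the definition of a valid deduction; when T|K is lower, T|K is a lower bound on s|K.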
It also is important to see that one has in
fact very great liberties in deducing:
- one can choose whatever assumptions one
pleases and
- one can assume whatever principle of
inference one pleases
- provided only that one retains the
criterion that a deduction of a conclusion from assumptions is valid
precisely if in any case that the assumptions are true, so is the
conclusion, and chooses one's principles of inference such that they satisfy
this criterion.
An example of a deduction is: Let K be
sufficient plane geometry to include Pythagoras' Theorem, and T the statement
that X is a right-angled triangle in which the sides joined by the right
angle have the lengths 3 and 4, and s be the statement that the side in X that
joins the other two angles has length 5.
And though one has very great liberties in
deducing, part of the reason that this is so is that all deduction moves in the
hypothetical realm of if-then, with no deductive possibility of testing one's
assumptions other than by validly deriving a known falsehood from them. It is
this limitation that one seeks to escape by abduction and induction: Using only
deductions and bi-valent logic, one can only deduce conclusions, and make and
refute assumptions, but one cannot infer assumptions nor support or undermine
them.
(2) Abduction:
The definition of abduction that I give relies on the definition of "Tps|K" =def "T is positively
relevant to s given K" that follows, together with two related definitions that
will be useful below:
"Tps|K"
=def "T is positively relevant to s given K" =def "s|K&T > s|K&~T".
"Trs|K" =def "T is relevant
to s given K"
=def "s|K&T <> s|K&~T"
"Tis|K" =def "T is irrelevant
to s given K"
=def "s|K&T = s|K&~T".
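The three definitions of relevance can be tried out on a small case in the spirit of Peirce's bags of beans. This is only a sketch: the bag contents, the assumption that each bag is equally likely, and all function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical bags as (white, black) bean counts; a bag is picked at random
# and then a bean is picked at random from that bag.
BAGS = {"A": (10, 0), "B": (5, 5), "C": (0, 10)}

def white_given(bag_names):
    """Probability of a white bean, given the bean comes from one of bag_names."""
    fractions_white = [Fraction(BAGS[b][0], sum(BAGS[b])) for b in bag_names]
    return sum(fractions_white) / len(fractions_white)

def positively_relevant(s_given_t, s_given_not_t):
    """Tps|K: s|K&T > s|K&~T."""
    return s_given_t > s_given_not_t

def relevant(s_given_t, s_given_not_t):
    """Trs|K: s|K&T <> s|K&~T."""
    return s_given_t != s_given_not_t

def irrelevant(s_given_t, s_given_not_t):
    """Tis|K: s|K&T = s|K&~T."""
    return s_given_t == s_given_not_t

# T = "the bean came from bag A", s = "the bean is white".
s_given_T = white_given(["A"])            # s|K&T
s_given_not_T = white_given(["B", "C"])   # s|K&~T
```

With these numbers s|K&T=1 and s|K&~T=1/4, so T is positively relevant to s given K.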
Now abduction is used to
find assumptions from which
a given conclusion follows, and so defined as follows using the probabilistic concept of
(conditional) positive relevance:
T|K&s=x is a valid abduction from
s given K =def
Tps|K & s|K&T≥1/2 & s|K=1 & T|K=x
And so an abduction is an inference that conforms to:
s|K&T>s|K&~T & s|K&T≥~s|K&T & s|K=1 &
T|K=x |= T|K&s=x
and is generally resorted to when one seeks to
explain some puzzling fact s which one can not (one believes) derive by deducing it from one's
presumed background knowledge. The present definition and rule of abduction
claim that this is valid if in fact the theory T is relevant to and does support
the truth of the puzzling fact s, and if in fact one has a probability for T|K
that may be low. Indeed, often abductions start as theories that would explain
certain facts with a high probability if true, while the probability of the
theory is not high.
In a valid abduction we infer a theory T as
a possible explanation for a newly given fact s, presuming K, and we also infer
that T has a certain probability, namely equal to the minimum probability of its proper consequences presuming K. The
abductive reason for T|K&s=x is that Tps|K & s|K&T≥1/2 & s|K=1 &
T|K=x i.e. that it is true that T is a possible explanation for s precisely if it is
true that T is positively
relevant to s and that T makes s more probable than not and that s is true, all given K,
while the least probable proper consequence of T given K has probability x. The
general pattern of abductions is: "If theory T is positively relevant to s and
the probability of s is at least 1/2 if T, and s is true and the probability of
T is x, then T is a valid abduction for s, and T has probability x given s."
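The general pattern of abduction can be transcribed as a small check in Python. This is a sketch under a stated assumption: the condition s|K=1 is represented by a flag saying that s has been observed, while s|K&T and s|K&~T are read as what T and ~T lead one to expect of s. The function names and sample numbers are hypothetical.

```python
from fractions import Fraction

def is_valid_abduction(s_given_kt, s_given_k_not_t, s_is_observed):
    """Check the conditions on s in the definition of a valid abduction.

    s_given_kt       -- s|K&T
    s_given_k_not_t  -- s|K&~T
    s_is_observed    -- assumption of this sketch: stands in for s|K=1
    """
    positively_relevant = s_given_kt > s_given_k_not_t        # Tps|K
    at_least_one_half = s_given_kt >= Fraction(1, 2)          # s|K&T >= 1/2
    return positively_relevant and at_least_one_half and s_is_observed

def abduce(t_given_k, s_given_kt, s_given_k_not_t, s_is_observed=True):
    """Return T|K&s = x when the abduction is valid, else None."""
    if is_valid_abduction(s_given_kt, s_given_k_not_t, s_is_observed):
        return t_given_k
    return None

# T explains s well although T itself is improbable: a valid abduction.
improbable_but_valid = abduce(Fraction(1, 10), Fraction(1), Fraction(1, 4))
# T makes s less likely than ~T does: not a valid abduction.
rejected = abduce(Fraction(1, 10), Fraction(1, 4), Fraction(1, 2))
```

The first call shows that a low T|K does not prevent an abduction from being valid; the second fails because T is not positively relevant to s.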
Given the definition of valid abduction and the
premisses listed, it is an easy matter to verify that then it is deductively
true, in probability theory, that T is a possible explanation for a newly given
fact s presuming K. But to establish that T|K=x we need some principle to settle the
probability of a theory, since probability theory has no axioms sufficient to
assign probabilities to theories, and I use the following:
The abductive postulate:
T|K=x IFF
x=qi|K&T & qi∈{q: K&T|=q & ~(K&~T|=q) & (s)(K&T|=s
& ~(K&~T|=s) -->
s|K&T≥q|K&T)}
This principle says that the probability of
a theory T on presumed knowledge K equals x precisely if x is the minimum of
T's proper consequences given K, where q is a proper consequence of K&T if it
follows deductively from K&T but not from K&~T. (One needs proper consequences
to avoid problems with just any statement with a low probability that happens to
be true, for such a statement is a logical consequence of any theory.) It is a
postulate because while probability theory implies that in the stated conditions
T|K≤x
it does not imply that T|K=x. [4]
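The difference between what probability theory implies and what the postulate adds can be shown on a toy model in Python. This is a sketch only: the eight equally likely worlds, the sets standing for T and its candidate consequences, and all function names are hypothetical. The probability of a consequence is read here as its probability given K, as in note [4].

```python
from fractions import Fraction

# Hypothetical universe K of eight equally likely worlds, labelled 0..7.
WORLDS = frozenset(range(8))

def pr(event):
    """Probability of an event (a set of worlds) given K."""
    return Fraction(len(WORLDS & set(event)), len(WORLDS))

def entails(premise, conclusion):
    """premise |= conclusion: the conclusion holds wherever the premise does."""
    return set(premise) <= set(conclusion)

def proper_consequences(theory, candidates):
    """Keep the candidates that follow from K&T but not from K&~T."""
    not_theory = WORLDS - set(theory)
    return {name: q for name, q in candidates.items()
            if entails(theory, q) and not entails(not_theory, q)}

def abductive_probability(theory, candidates):
    """The postulate's x: the least probability of a proper consequence."""
    return min(pr(q) for q in proper_consequences(theory, candidates).values())

T = {0, 1}
candidates = {
    "q1": {0, 1, 2},
    "q2": {0, 1, 2, 3, 4},
    "tautology": set(WORLDS),   # follows from ~T as well, so not proper
    "unrelated": {5, 6},        # does not follow from T
}

x = abductive_probability(T, candidates)
```

In this toy model the worlds can be counted, so pr(T)=1/4 is known directly and lies below x=3/8, as probability theory requires. The postulate is meant for the case where no such count is available, and then adopts the upper bound x itself as T|K.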
What is new here apart from stating everything
probabilistically is that in fact two statements are inferred, of which one is
deduced and one assumed. First it is deduced that it is true that T is a
possible explanation of s, for which it is sufficient by the given definition
that s|K&T>s|K&~T & s|K&T≥~s|K&T & s|K=1. Next an assumption is made to settle
the probability of a theory, namely that T|K equals the minimum
probability of its proper consequences, all presuming K. This is motivated by
the facts that this minimum probability is the upper bound of T|K in probability theory; that probability theory by itself
provides no consequences other than this to settle the probability of a theory;
and that we need to assign a probability to theories to reason probabilistically
with them. [5]
It should be noted that T|K&s may be low, and
that it still may be a valid abduction, for this depends not on how low it
is, but on T being positively relevant to s given K and making s given K at
least 1/2. The possibility of 1/2 is included here to take care of whatever has
that probability 1/2. Thus, that this coin fell heads half of the times it was thrown is explained by
the valid abduction that this coin is fair (i.e. unloaded and with a heads and a
tails side). But often abductive inference is used to find a theory to account
for a surprising fact.
Also, if T|K&s=x is small this suggests doing one
(or more) of three things:
(1) Find another valid abduction for s with a
higher probability or
(2) deduce a consequence of T and test T, in the hope of inductively confirming T and
increase the probability of T or
(3) revise K so that the minimal proper consequence of T gets higher.
An example of an abduction is: Let K be
elementary physics; let s be the statement that a star S1 is not
precisely where it should be if elementary physics is true; and let T be the
theory that there is a hitherto unknown interstellar object S2 that
affects where S1 precisely is.
The basic weaknesses of abductions are that (1) the
only reason to infer their theoretical conclusions is that they are relevant to
some puzzling facts and that (2) the theoretical conclusions inferred by abduction
often have a low probability. This last weakness is addressed by induction.
(3) Induction: To confirm
(or infirm) assumptions by showing their conclusions
do (not) conform to the observable facts.
T|K&s=T|K*[s|K&T:s|K] is a valid induction
from s given K =def
s|K=1 & q|K=1 & s|K&q=s|K & (q)(s)(Trs|K --> s|K&q=s|K IFF s|K&T&q=s|K&T)
And the induction is the inference
s|K=1 & q|K=1 &
(q)(s)(Trs|K --> s|K&q=s|K IFF s|K&T&q=s|K&T)
|=
T|K&s=T|K*[s|K&T:s|K]
although it may be clearer to write it like so,
explicitly listing the probabilities that must be presumed:
s|K=1 & s|K&T=s1 & s|K=s2 &
T|K=t1 & q|K=1 & s|K&q=s|K &
(q)(s)(Trs|K --> s|K&q=s|K IFF s|K&T&q=s|K&T) |=
T|K&s=t1*s1:s2
In a valid induction we infer that the
probability of T given K&s equals T|K*[s|K&T:s|K] from a true fact s that
is relevant to T, while T satisfies the inductive postulate that any fact that
is in fact irrelevant to anything T is relevant to remains irrelevant if T is
true, all given K. [6]
What is new here is
The inductive postulate: (q)(s)(Trs|K
--> (q|K&s=q|K IFF q|K&T&s=q|K&s))
This principle enables one to abstract from any
fact that is not relevant when testing a theory. The way this works can be
easily shown when we abstract for a moment from K. Then the principle turns
into: [Trs --> (q|s=q IFF q|T&s=q|s)] and allows us to abstract from any q
not relevant to any s that T is relevant to and that happens to be the case when
T is tested, for now one can reason as follows: T|s&q = q|T&s*T&s : s&q = q|s*T&s : q|s*s =
T&s:s =
T|s.
Each step is deductively valid in probability theory, but the second step involves
the inductive postulate to abstract from q. [7]
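The chain of equalities can be written out step by step in conventional notation. This is a sketch that abstracts from K as in the text and assumes Pr(s) > 0 and Pr(q | s) > 0, so that the divisions are defined; these positivity conditions are assumptions of the sketch.

```latex
\begin{align*}
\Pr(T \mid s \wedge q)
  &= \frac{\Pr(q \mid T \wedge s)\,\Pr(T \wedge s)}{\Pr(s \wedge q)}
  && \text{(definition of conditional probability)} \\
  &= \frac{\Pr(q \mid s)\,\Pr(T \wedge s)}{\Pr(q \mid s)\,\Pr(s)}
  && \text{(inductive postulate: } \Pr(q \mid T \wedge s) = \Pr(q \mid s)\text{)} \\
  &= \frac{\Pr(T \wedge s)}{\Pr(s)}
  && \text{(cancel } \Pr(q \mid s)\text{)} \\
  &= \Pr(T \mid s)
  && \text{(definition of conditional probability)}
\end{align*}
```

The denominator of the second line also uses the product rule Pr(s ∧ q) = Pr(q | s) Pr(s), which holds in probability theory without any postulate.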
So in induction too there is an assumption
involved, namely that the theory one uses has the property that any fact that is
irrelevant in fact to any of the predictions of the theory is irrelevant also if
the theory is true and conversely, all given K. This can be seen equivalently as
the claim that the theory T is relevant in theory to everything it is relevant
to in fact and nothing else. Indeed, the equivalence in the inductive
postulate (q|K&s=q|K IFF q|K&s&T=q|K&s) can be conveniently read from left to
right as: what is irrelevant in fact, also is irrelevant according to theory T,
and from right to left as: what is irrelevant according to theory T is
irrelevant in fact, all presuming K (i.e. so far as we know).
It should be noted that T|K&s may be any
probability, including what T was before s came to be known, namely T|K, if T is irrelevant to s given
K. What matters for an inductive inference to be valid is the assumption that the
inductive assumption is satisfied by T, for it is this that allows one to make an
inductive inference about T given a new fact s.
An example of an induction is: Let K be standard medical
science and s be the statement that test S is positive and T be the theory that
you have cancer, and suppose s|K&T=s1=0.9 and s|K&~T=0.3
and T|K=t1=0.01. Then s|K=s2=0.9*0.01+0.3*0.99=0.306 and T|K&s=t1*s1:s2 is
about 0.029. If the test is positive, your chance of having cancer is almost trebled: if
it was 1/100 to start with, then a positive result makes this about 3/100. The
inductive postulate enters because to credit the reasoning one must in fact
assume that the test was properly done and that all manner of circumstances that
occurred when it was done either are irrelevant to the theory or are accounted
for by it. ("No madam, the lab assistant was not drunk, and no, it does not
matter the test was done on a Tuesday in a leap year and no, your meditation
exercises have nothing to do with the outcome of the test." Etc.)
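The arithmetic of this example can be checked with a short Python sketch. It assumes that s|K, the denominator of the inductive formula, is obtained by the law of total probability from s|K&T and s|K&~T. The function name posterior and the variable names are hypothetical.

```python
from fractions import Fraction

def posterior(t_prior, s_given_t, s_given_not_t):
    """Return T|K&s = T|K * [s|K&T : s|K].

    s|K is obtained from the law of total probability:
    s|K = s|K&T * T|K + s|K&~T * (1 - T|K).
    """
    s_total = s_given_t * t_prior + s_given_not_t * (1 - t_prior)
    if s_total == 0:
        raise ValueError("s has probability 0 given K")
    return t_prior * s_given_t / s_total

t1 = Fraction(1, 100)     # T|K: prior probability of the theory
s1 = Fraction(9, 10)      # s|K&T: positive test if the theory is true
s_not = Fraction(3, 10)   # s|K&~T: positive test if the theory is false

t_after = posterior(t1, s1, s_not)   # T|K&s
ratio = t_after / t1                 # factor by which the prior is multiplied
```

With the figures of the example the result is 1/34, about 0.029, so the prior of 1/100 is multiplied by a factor of roughly 2.9.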
(4) Discussion of the
abductive and inductive postulates:
The explanations I have given of abductive and
inductive reasoning make these valid deductions inside elementary probability
theory, given certain postulates and definitions, that I also have given. In this section I want to
consider briefly the postulates I made, and make a few remarks about the
interesting epistemological status they share, namely of being practically
necessary and corrigible.
A. The abductive postulate: The basic
reasons to assume the abductive postulate are that one needs to arrive somehow
at probabilities for theories, and that one has by probability-theory that T|K≤x,
which the postulate strengthens to an identity provided
x=qi|K&T & qi∈{q: K&T|=q & ~(K&~T|=q) & (s)(K&T|=s
& ~(K&~T|=s) -->
s|K&T≥q|K&T)}. For since whatever theories may denote (presumably:
the infinite set of logical consequences of the assumptions of the theory),
these cannot be counted in the same sense as e.g. occurrences of "heads up" in sequences of
throws of a coin can be counted, and so there is no evident way to
assign a probability to a theory.
The abductive postulate amounts to the assumptions
that, first, there is such a thing as the probability of a theory [8], and second, that it may be initially and conveniently
settled by supposing this probability equals the maximum of what it may be,
given probability theory and such knowledge as one presumes, while one also
fully expects
that this initial probability will be adjusted by further reasoning and by
new facts yet to be discovered.
So the abductive postulate seems not so much a
truth about nature as a truth about the ways and procedures human beings may use
to discover the truth about nature. And indeed the abductive postulate seems
safe and warranted in the sense that any probability it introduces can be - and
usually will be - rationally corrected and adjusted, and that indeed it may be
increased or decreased by inductions, and also by various
other means indicated above.
An abductive postulate is needed because
we need to have some factually based probability for theories that we decide are
good explanations, if only to have a start to test them inductively.
B. The inductive postulate: The basic
reason to assume the inductive postulate is that one needs some assumption to
deal with the very many facts that are true besides the theory one is interested
in testing, since each of these very many facts may be relevant to the truth of
the fact one is interested in, and so that it seems a good demand to make of a
theory to be true that it should truly and fully entail all it is relevant to.
Also, in fact this postulate seems to be
necessarily true if human beings can come to know nature by testing such
theories as they have, for all such tests must include knowledge of what is relevant to
what is tested and in what degree it is relevant and also of what is irrelevant to
it, for relevancies and irrelevancies are facts that are as real as the facts
they concern. In brief: one just cannot rely on any experimental evidence
if one cannot rely on one's abstraction from much of the surrounding factual
details as irrelevant, which is necessary in any experiment. [9]
On the other hand, one cannot normally prove in
complete or even considerable detail that any given theory that is to be tested
in fact does correctly entail all that is relevant to it and does not
entail as relevant anything that is in fact irrelevant. (Indeed, normally only a
few known relevant factors are listed in any report of a scientific experiment
together with an indication how these have been dealt with in the experimental
set-up. Yet any design of experiments must involve assumptions about factors
that are relevant and that are irrelevant to what is to be tested.)
But since true theories must properly entail
the true degrees of relevancies of their predictions, all one can do is to
assume that one's theories do so, and to take care of all relevancies one does
know.
So the inductive postulate seems not so much a
truth about nature as a truth about the ways and procedures human beings use to
discover the truth about nature, and one which is true to the extent human
beings have true theories about nature, for true theories must satisfy the
inductive postulate, even if no human being is able to survey all of the
universe and establish all its presumed relevancies and irrelevancies are
factually correct. And indeed the inductive postulate seems safe and warranted
in the sense that any probabilities it introduces can be - and usually will be -
rationally corrected and adjusted by later evidence. Also it suggests a reason for experiments
that fail or turn out unexpected results: One may have disregarded as irrelevant
some factor that is relevant i.e. one may have falsely assumed that one's theory
T satisfied the inductive postulate. Finally, the inductive postulate is needed
because in any experimental test of a theory we need to abstract from very many
accompanying circumstances.
Summing up: It seems to me that both the
abductive and inductive postulate are used tacitly in very much of human
reasoning; it seems to me that both - or at least: assumptions much like them -
are required by human beings if they want to arrive by reasoning at truths about
nature; and it seems to me that both postulates have the interesting property
that one must assume (something like) them in order to learn anything at all
about the natural facts, while one is able to correct such inaccuracies as they
may introduce, though this correction will take time and trouble, as does any
scientific advance.
And
the reason one must assume these principles is that in order to establish that one's
theories are about nature (and are not merely fantasy), one must test one's theories, and to test them one
needs both to have probabilities for theories and to make as sure as one can that all that
is relevant to what one tests is known, so that what one abstracts from as
irrelevant to what one tests in theory indeed is irrelevant in
fact.
Therefore it seems sensible to list these
assumptions of scientific procedure explicitly, and to use them consciously
wherever applicable. And finally it seems an interesting fact that there are, then,
corrigible presumptive truths of procedure, that one needs to assume in order to validly
infer intermediate conclusions that are required on the way towards new natural
knowledge.
Maarten Maartensz
Literature:
Ernest Adams: A Primer of Probability Logic
Arthur Burks: Chance, Cause, Reason
Paul Halmos: Naive Set Theory, and Measure Theory.
Charles Peirce: Collected Papers
G. Polya: Mathematics and Plausible Reasoning (2 volumes)
W.G. Wood & D.H. Martin: Experimental Method
Notes:
[1]: In fact I assume no more
than standard elementary probability theory (summarizable as: What follows from
Kolmogorov's axioms without any infinitary assumptions, or alternatively but
equivalently: measure theory without infinitary assumptions) and standard elementary
set theory. Good introductions to the former are by Adams and Burks, and good
introductions to set theory and to measure theory are by Halmos. Wood & Martin is a useful summary of
principles involved in scientific experimental designs.
[2]: This is well explained
by Adams and by Polya.
[3]: So while there is no
need to formalize deductions inside probability theory it may be helpful to do
so, especially when one wants to deal with uncertainties. Also, it is well to
stress that the notation I use (and explained at the beginning) is useful, and more
elegant and easier to read and use than the ordinary format using "pr(.)".
[4]: For in probability
theory we have for any theory T and any statement s logically implied by T that
the probability of T - assuming, as we do, that it exists and is consistent with
probability theory - cannot be larger than the probability of s. In more or less
standard notation: T |= s --> pr(T)≤pr(s).
[5]: These considerations do
not logically imply the abductive postulate, and more can be said about the
reasons to assume it, some of which is said in section (4).
[6]: That
T|K&s=T|K*[s|K&T:s|K] in fact does follow by standard elementary probability
theory and is well explained by Burks, with much more detail than is dealt with
in the present paper. In any case, the basis of it all is the elementary theorem
of probability theory to the effect that (abstracting from K for the moment) T|s
> T IFF s|T > s IFF T is positively relevant to s, while it also is a theorem
that T|s = s|T*T:s. So probability theory itself easily and elegantly enables
inductive confirmation of a theory by its verified predictions, and
falsification of a theory by its falsified predictions. But - though very
interesting in principle - this is all presumed as known in the present paper.
(See e.g. Burks and Polya.)
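The equivalences cited in this note can be spelled out in a few lines of conventional notation. This is a sketch that abstracts from K and assumes 0 < Pr(T) < 1 and Pr(s) > 0; these conditions are assumptions of the sketch.

```latex
\begin{align*}
\Pr(T \mid s) &= \frac{\Pr(s \mid T)\,\Pr(T)}{\Pr(s)}
  \quad\Longrightarrow\quad
  \bigl(\Pr(T \mid s) > \Pr(T) \iff \Pr(s \mid T) > \Pr(s)\bigr), \\
\Pr(s) &= \Pr(s \mid T)\,\Pr(T) + \Pr(s \mid \neg T)\,\bigl(1 - \Pr(T)\bigr), \\
\Pr(s \mid T) - \Pr(s) &= \bigl(1 - \Pr(T)\bigr)\,\bigl(\Pr(s \mid T) - \Pr(s \mid \neg T)\bigr).
\end{align*}
```

The first line gives the first equivalence after dividing both sides by Pr(T). The last line shows that Pr(s | T) exceeds Pr(s) exactly when it exceeds Pr(s | ¬T), which is the definition of positive relevance used in section (2).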
The inductive postulate is needed in any case where s
is an empirical fact, that could be used as confirmation or infirmation,
since such a fact s always will be simultaneously true with very many
other empirical facts, some of which may be relevant to it, and others which
may not be relevant to it.
[7]: There is a lot more that
can and should be said about the inductive postulate and its relations to the
problems of induction as raised by Hume and Goodman, but the present paper is
not the place for it, for it is not dedicated to the problems of induction, but
to a clear and formal statement of the fundamental principles of valid
reasoning.
[8]: It may be well to remark
here that I do not presume the subjective or personal approach to probabilities,
which makes probabilities depend on human betting quotients. These are
arbitrary, and what I want, rather, is a probability based on the facts,
in so far as these are (presumptively) known. These
probabilities are not available for theories directly, since one cannot count the cases in
which a theory is true and count the cases in which a theory is not true, and therefore I
assume a theory has the probability of its least probable proper consequence,
where this probability can be established experimentally by counting the cases
pro and con.
[9]: In fact this is one side
of the problem of induction that has been missed by many. One of the many was Karl
Popper, whose whole philosophy of science founders on the fact that he did not
see that to use experimental evidence requires an assumption about what is and
is not relevant to it. Any experiment involves the assumption to the
effect that whatever is happening at the time that is not considered by the
theory that is tested is in fact irrelevant to whatever the theory implies.
Without this assumption anything whatsoever that is happening may be quoted as
"reason" that the experiment has the outcome it does, whatever the outcome is.
And the assumption must be explicitly made because it may be mistaken and is
involved in all testing of theories by their empirically checkable implications:
Some of the factors that are not assumed to be relevant in fact may be relevant.