Maarten Maartensz:    Philosophical Dictionary | Filosofisch Woordenboek                      

 L - Learning from experience


Learning from experience: Explanation of how one can learn from theories and their logical consequences if these consequences are verified or falsified.

The explanation to be given is closely related to Bayesian Reasoning, but cast in such a way that certain problems with that are avoided, and all relevant assumptions are clearly stated.

The starting point is standard propositional logic (PL) based on & and ~, which suffice to define all standard connectives. This is extended to temporal propositional logic and temporal probability theory as follows, where "p(α,Q)" is read as "person α's probability for Q" ("α" = "alpha").

A1.     (P&Q).t                IFF (P).t & (Q).t
A2.     (p(α,P)=p(α,Q)).t   IFF p(α,P).t = p(α,Q).t

That is:
Propositions are temporally relativized by an operator ".t" attached to them, read as "at t", and this distributes as one expects over & and =, which together with ~ are the basic logical operators. The times t are supposed to be discrete and ordered by <=. This is needed to keep track of the truth-values and probabilities of propositions at various times.

A3.      (EP)(Et)(Ex)  ( 0 <= p(α,P).t=x <= 1)      
A4.      (EP)(EQ)(Et) ( (Ex)(p(α,P).t=x) & (Ey)(p(α,Q).t=y) & ~(Ez)(p(α,P&Q).t=z) )

For every person α there are propositions with some personal probability at some t - and so it may be that there are for α at t propositions which do not have a personal probability. And for every person α there are at some t pairs of propositions such that both propositions have some probability but their conjunction doesn't.

A3 and A4 entail that there may well be propositions for α that do not have α's personal probability for some reason, and there may well be conjunctions for which α does not have a probability even if α has probabilities for the conjuncts. This differs from non-personal probability, where every logical possibility has some probability always, even if it is unknown.

A5.     p(α,Q|T).t=x       IFF p(α,Q&T).t : p(α,T).t=x
A6.     p(α,~Q).t=1-y     IFF p(α,Q).t=y
A7.     p(α,~Q|T).t=1-x  IFF p(α,Q|T).t=x

This defines conditional probability in A5 and defines ~Q for unconditional and conditional probabilities in A6 and A7. Note this is a personal probability of α: It is up to α to decide what it is, and often though not necessarily it is what α believes that the real probability is. The reading of "p(α,Q|T).t" is "the probability for α of Q given T at t".
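A5 to A7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, a missing personal probability (as allowed by A3 and A4) is represented by None, and treating p(α,T).t=0 as giving no conditional probability is an assumption made here, not something the axioms state.

```python
from fractions import Fraction
from typing import Optional


def conditional(p_q_and_t: Optional[Fraction],
                p_t: Optional[Fraction]) -> Optional[Fraction]:
    """A5: p(Q|T) = p(Q&T) : p(T).

    Returns None when either input is missing (A3, A4) or when p(T) = 0
    (assumption of this sketch: the quotient is then left undefined).
    """
    if p_q_and_t is None or p_t is None or p_t == 0:
        return None
    return p_q_and_t / p_t


def negation(p: Optional[Fraction]) -> Optional[Fraction]:
    """A6 / A7: the probability of ~Q is 1 minus that of Q."""
    return None if p is None else 1 - p


p_q_given_t = conditional(Fraction(3, 10), Fraction(1, 2))
p_not_q_given_t = negation(p_q_given_t)
```

Exact fractions are used rather than floats so that equalities such as p(α,~Q|T).t = 1 - p(α,Q|T).t hold without rounding error.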

With two conditional probabilities plus p(α,T).t we can calculate all entries for a fundamental table, that lists the probabilities for all logical alternatives. Indeed here is the fundamental table in three different convenient forms, where the .t are left out as understood:

             I                      II                                  III
 t    T        ~T          T            ~T             T                    ~T
 Q    a        b           p(α,Q&T)     p(α,Q&~T)      p(α,Q|T)*p(α,T)      p(α,Q|~T)*p(α,~T)      p(α,QT)
~Q    c        d           p(α,~Q&T)    p(α,~Q&~T)     p(α,~Q|T)*p(α,T)     p(α,~Q|~T)*p(α,~T)     p(α,~QT)

      p(α,T)   p(α,~T)     p(α,T)       p(α,~T)        p(α,T)               p(α,~T)                1

From A1-A7 we get something much like standard probability theory for those propositions that α does have probabilities for, for we can easily prove:

T1.      p(α,T).t=p(α,Q&T).t+p(α,~Q&T).t

For p(α,Q&T).t+p(α,~Q&T).t  = p(α,Q|T).t*p(α,T).t+p(α,~Q|T).t*p(α,T).t  by A5
                                       = (p(α,Q|T).t+p(α,~Q|T).t)*p(α,T).t                by algebra
                                       = p(α,T).t                                                   by A7

T2.      p(α,QT).t = p(α,Q|T).t*p(α,T).t + p(α,Q|~T).t*p(α,~T).t

This holds since p(α,Q|T).t*p(α,T).t+p(α,Q|~T).t*p(α,~T).t = p(α,Q&T).t+p(α,Q&~T).t by A5. The sum is written as p(α,QT).t to indicate that p(α,Q).t is calculated with respect to p(α,T).t and two of α's conditional probabilities involving Q and T. This will become of some importance below.

Since also by A6

T3.      p(α,QT).t+p(α,~QT).t = p(α,T).t+p(α,~T).t = 1

the fundamental table has been justified at this point. (The a, b, c, d entries in it are convenient abbreviations of the four possible logical alternatives.)
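As a numerical illustration of T1 to T3, the four cells of the fundamental table can be computed from the three given probabilities. This is a minimal sketch: the function name and the example values h=9/10, i=1/5, j=1/2 are assumptions chosen for illustration, with h, i and j standing for p(α,Q|T), p(α,Q|~T) and p(α,T).

```python
from fractions import Fraction


def fundamental_table(h, i, j):
    """Return the cells (a, b, c, d) of the fundamental table (form III).

    h = p(Q|T), i = p(Q|~T), j = p(T); all taken at the same time t.
    """
    not_j = 1 - j          # A6: p(~T) = 1 - p(T)
    a = h * j              # p(Q&T)   = p(Q|T)  * p(T)
    b = i * not_j          # p(Q&~T)  = p(Q|~T) * p(~T)
    c = (1 - h) * j        # p(~Q&T)  = p(~Q|T) * p(T),   using A7
    d = (1 - i) * not_j    # p(~Q&~T) = p(~Q|~T) * p(~T), using A7
    return a, b, c, d


h, i, j = Fraction(9, 10), Fraction(1, 5), Fraction(1, 2)
a, b, c, d = fundamental_table(h, i, j)

p_qt = a + b               # T2: p(QT)
p_not_qt = c + d           # p(~QT)

assert a + c == j                  # T1: column T sums to p(T)
assert b + d == 1 - j              # same for column ~T
assert p_qt + p_not_qt == 1        # T3
```

With these values the cells are 9/20, 1/10, 1/20 and 2/5, so p(α,QT) = 11/20.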

A8.     (T |-α Q).t      IFF p(α,Q|T).t=1 V p(α,T).t=0
A9.     (|-α Q).t         IFF p(α,Q).t=1

This defines logical implication and verified formula for α in terms of α's personal probability. Note that in fact only 1 and 0 are used here, and that thus we have the basis for standard bivalent propositional logic.

T4.     (T |-α Q).t    --> p(α,T).t <= p(α,Q).t

For suppose (T |-α Q).t, i.e. by A8 p(α,Q|T).t=1 V p(α,T).t=0. In case p(α,T).t=0, we have p(α,T).t <= p(α,Q).t. So suppose p(α,T).t>0. Then p(α,Q|T).t=1, so p(α,~Q|T).t=0 by A7 and p(α,~Q&T).t=0 by A5. By T1 it follows that p(α,T).t=p(α,Q&T).t, and since p(α,Q&T).t <= p(α,Q).t we again have p(α,T).t <= p(α,Q).t. Thus T4.

Therefore also, defining (T -||-α Q).t =def (T |-α Q).t & (Q |-α T).t

T5.     (T -||-α Q).t  --> p(α,T).t = p(α,Q).t

which is to say that logical equivalents have equal probabilities. Again, this is like standard probability theory, but relativized to α's judgements.

Now we are going to say how one may learn from experience.

           For given p(α,Q|T).t=h, p(α,Q|~T).t=i and p(α,T).t=j, with h≠i:

A10.     p(α,Q|T).t+1=h              IFF   p(α,Q|T).t=h
A11.     p(α,Q|~T).t+1=i             IFF   p(α,Q|~T).t=i

A12.     p(α,T).t+1 = p(α,T).t       IFF    0 < p(α,Q).t < 1
A13.     p(α,T).t+1 = p(α,T|Q).t    IFF   p(α,Q).t=1
A14.     p(α,T).t+1 = p(α,T|~Q).t  IFF   p(α,~Q).t=1

Any given set p(α,Q|T).t, p(α,Q|~T).t, p(α,T).t where T is a theory is called a basic theory for α if p(α,Q|T).t≠p(α,Q|~T).t. A10 and A11 require that the conditional probabilities in a basic theory remain constant in time. A12 to A14 state how p(α,T).t in a basic theory changes, or does not, depending on what α verifies about the Q in the set: it remains constant if α verifies neither Q nor ~Q, and is replaced by p(α,T|Q).t or p(α,T|~Q).t if Q or ~Q, respectively, is verified for α.
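The update rules A12 to A14 can be sketched as a single function. This is an illustration only: the function name is hypothetical, h, i and j stand for p(α,Q|T).t, p(α,Q|~T).t and p(α,T).t, and the sketch assumes the relevant denominators p(α,QT).t and p(α,~QT).t are nonzero.

```python
from fractions import Fraction


def update_theory(h, i, j, p_q):
    """Return p(T) at t+1 for a basic theory (h, i, j), given p(Q) at t.

    h and i are kept fixed by A10 and A11, so only j is updated.
    Raises ZeroDivisionError if the needed p(QT) or p(~QT) is 0
    (a case this sketch does not handle).
    """
    if h == i:
        raise ValueError("a basic theory requires p(Q|T) != p(Q|~T)")
    p_qt = h * j + i * (1 - j)           # T2: p(QT)
    if p_q == 1:                         # A13: Q verified, use p(T|Q)
        return h * j / p_qt
    if p_q == 0:                         # A14: ~Q verified, use p(T|~Q)
        return (1 - h) * j / (1 - p_qt)
    return j                             # A12: nothing verified


h, i, j = Fraction(9, 10), Fraction(1, 5), Fraction(1, 2)
after_q = update_theory(h, i, j, 1)                   # Q verified
after_not_q = update_theory(h, i, j, 0)               # ~Q verified
unchanged = update_theory(h, i, j, Fraction(1, 2))    # neither verified
```

For these example values, verifying Q raises p(α,T) from 1/2 to 9/11, while verifying ~Q lowers it to 1/9.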

To show how this works put p(α,Q|T).t=h, p(α,Q|~T).t=i and p(α,T).t=j. Then suppose p(α,Q).t=1. We have, noting also that ~j=1-j

(*)      p(α,T).t+1 = p(α,T|Q).t                                 by A13
                        = (p(α,Q|T).t : p(α,QT).t) * p(α,T).t  by A5 and T5, for p(α,Q&T).t=p(α,T&Q).t
                        = (h : (hj+i~j)) * j                         by adopted conventions

Thus p(α,T).t+1 = (p(α,Q|T).t : p(α,QT).t) * p(α,T).t, and so the new probability p(α,T).t+1 differs from the old p(α,T).t by a multiplicative factor p(α,Q|T).t : p(α,QT).t. Now clearly

     p(α,Q|T).t:p(α,QT).t  >= 1   IFF h : hj+i~j >= 1
                                           IFF h >= hj+i~j
                                           IFF h~j >= i~j                         using ~j=1-j
                                           IFF h >= i                               supposing ~j>0
                                           IFF p(α,Q|T).t  >= p(α,Q|~T).t

Therefore the direction of the degree of confirmation depends only on the conditional probabilities: The multiplicative factor p(α,Q|T).t:p(α,QT).t equals or exceeds 1 iff p(α,Q|T).t equals or exceeds p(α,Q|~T).t . And as the conditional probabilities remain constant by A10 and A11, this direction of confirmation also remains constant over time.
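The behaviour of the multiplicative factor can be illustrated by applying (*) repeatedly. This is a sketch under assumed example values (h=9/10, i=1/5, starting j=1/2) with a hypothetical function name; it supposes Q is verified at three successive times.

```python
from fractions import Fraction


def confirmation_factor(h, i, j):
    """Factor p(Q|T) : p(QT) by which p(T) is multiplied when Q is verified."""
    return h / (h * j + i * (1 - j))


h, i = Fraction(9, 10), Fraction(1, 5)   # fixed over time by A10 and A11
j = Fraction(1, 2)                       # p(T) at the first time

history = [j]
for _ in range(3):                       # Q is verified three times in a row
    j = confirmation_factor(h, i, j) * j # (*): p(T).t+1 = factor * p(T).t
    history.append(j)

# Since h > i the factor never drops below 1, so p(T) never decreases.
assert all(earlier <= later for earlier, later in zip(history, history[1:]))
```

Here p(α,T) moves through 1/2, 9/11, 81/85 and 729/737. Swapping h and i gives a factor below 1, so the same evidence would then lower p(α,T) at every step.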

Next, the problem for Bayesian Reasoning that if p(α,Q).t=1, i.e. (|-α Q).t, then also, by probability theory, p(α,T|Q).t=p(α,T).t, is avoided by the following theorem, formulated for the same basic theory as before, and using the definitions (T α-rel Q) =def p(α,Q&T)≠p(α,Q)*p(α,T) and (T α-irr Q) =def ~(T α-rel Q).

T6.      (T α-rel Q) --> p(α,QT) < 1

Proof: p(α,QT) = p(α,Q|T)*p(α,T)+ p(α,Q|~T)*p(α,~T) by T2
                    = hj+i~j                                            by adopted conventions

Now  hj+i~j = 1             IFF
     hj+i-ij = 1            IFF          using ~j=1-j
     hj-ij = 1-i            IFF
     (h-i)*j = 1-i          IFF
     ((h-i):(1-i))*j = 1                 supposing i<1

Lemma: ((h-i):(1-i))*j = 1 IFF h=1 & j=1 & h>i, assuming h, i and j are probabilities.

Proof: Make the assumption about h, i and j. Suppose the RHS of the equivalence. Then ((h-i):(1-i))*j reduces to ((1-i):(1-i)). Since h=1 and h>i we have i<1, so ((1-i):(1-i))=1. Next suppose the LHS. Assume h<=i. Then h-i<=0 and so ((h-i):(1-i))*j <= 0 < 1. So h>i follows from the LHS. Assume h<1. Then ((h-i):(1-i)) < 1 and so ((h-i):(1-i))*j < 1. So h=1 follows from the LHS. Given h=1, ((h-i):(1-i))=1, so the LHS reduces to j=1. Thus the lemma has been proved.

Now if j=1 then ~j=0 and then (T α-irr Q) by T4.  So if (T α-rel Q) then j<1 and so hj+i~j < 1 by the lemma. Therefore indeed (T α-rel Q) --> p(α,QT)< 1. Qed.
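T6 and the lemma can also be checked by brute force on a finite grid of probabilities. This is a sanity check under assumptions, not a proof: the grid of tenths and the function name are choices made for this sketch, and p(α,Q) in the definition of relevance is taken to be p(α,QT) as given by T2.

```python
from fractions import Fraction
from itertools import product

grid = [Fraction(k, 10) for k in range(11)]   # 0, 1/10, ..., 1


def relevant(h, i, j):
    """(T α-rel Q): p(Q&T) differs from p(Q)*p(T), with p(Q) taken as p(QT)."""
    p_qt = h * j + i * (1 - j)
    return h * j != p_qt * j


# T6: no relevant triple (h, i, j) should give p(QT) >= 1.
t6_counterexamples = [
    (h, i, j)
    for h, i, j in product(grid, repeat=3)
    if relevant(h, i, j) and h * j + i * (1 - j) >= 1
]

# Lemma, for i < 1: ((h-i):(1-i))*j = 1 IFF h=1 & j=1 & h>i.
lemma_counterexamples = [
    (h, i, j)
    for h, i, j in product(grid, repeat=3)
    if i < 1
    and ((h - i) / (1 - i) * j == 1) != (h == 1 and j == 1 and h > i)
]
```

Both lists come out empty on this grid, which agrees with the theorem and the lemma for the values tested.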

Hence it is quite possible that p(α,QK).t =1 & p(α,QT).t < 1. The probabilities involved in a basic theory need not be the same as those of propositions that are not involved in it, but may be used to update the probabilities in a basic theory by (*). And indeed one may write also p(α,QTK).t to explicate K as well.


Also see: Bayesian Reasoning

Literature: Adams

 Original: Jul 20, 2006                                                Last edited: 12 December 2011.