Journal of Applied Logic 7 (2009) 155–176 www.elsevier.com/locate/jal

Probabilistic argumentation

Rolf Haenni
Bern University of Applied Sciences, Department of Engineering and Information Technology, CH-2501 Biel, Switzerland

Received 29 June 2007; received in revised form 8 November 2007; accepted 20 November 2007
Available online 9 January 2008

Abstract

Argumentation in the sense of a process of logical reasoning is a very intuitive and general methodology of establishing conclusions from defeasible premises. The core of any argumentative process is the systematic elaboration, exhibition, and weighting of possible arguments and counter-arguments. This paper presents the formal theory of probabilistic argumentation, which is conceived to deal with uncertain premises for which respective probabilities are known. With respect to possible arguments and counter-arguments of a hypothesis, this leads to probabilistic weights in the first place, and finally to an overall probabilistic judgment of the uncertain proposition in question. The resulting probabilistic measure is called degree of support and possesses the desired properties of non-monotonicity and non-additivity. Reasoning according to the proposed formalism is a simple and natural generalization of the two classical forms of probabilistic and logical reasoning, in which the two traditional questions of the probability and the logical deducibility of a hypothesis are replaced by the more general question of the probability of a hypothesis being logically deducible from the available knowledge base. From this perspective, probabilistic argumentation also contributes to the emerging area of probabilistic logics.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Argumentation; Reasoning; Logic; Probability theory; Uncertainty; Degrees of belief; Probabilistic logic; Dempster–Shafer theory

E-mail address: [email protected]
1570-8683/$ – see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.jal.2007.11.006

1. Introduction

Traditionally, argumentation has been a topic of interdisciplinary interest in areas like philosophy, psychology, communication science, linguistics, law, economy, sociology, or other non-technical sciences. With the emergence of autonomous intelligent agents and multi-agent systems, where the aim is to make a computer decide and act in place of a human being, argumentation has also become an important research topic in computer science, particularly within the AI community [75,81,82,91]. The meaning of the English word argumentation is twofold. Many AI oriented argumentation theories are focused on argumentation as a methodical process of logical reasoning, in which conclusions are drawn from credible reasons, see e.g. [2,3,7,13,21,72,90,107]. The dialectical meaning of argumentation, namely as a debate, negotiation, or discussion, in which reasons are advanced for and against some controversial proposition or proposal, is another primary concern of the AI view of argumentation [5,67,89,108], as most approaches are driven by the general idea of putting


R. Haenni / Journal of Applied Logic 7 (2009) 155–176

forward the pros and cons of the proposition in question. This is also the case in the theory proposed in this paper, but apart from that, it is primarily an argumentative theory of logical and probabilistic reasoning.

Ideally, the construction of logical reasons for a conclusion is based on facts or generally accepted premises or principles. But in the real world, in which we need to deal with unreliable, incomplete, or even contradictory information, this seems not to be a very typical situation. An adequate logical theory of argumentation should therefore be conceived to deal with assumptions, uncertain premises, or defeasible statements, from which more or less credible logical arguments are constructed.

In the proposed theory of probabilistic argumentation, the credibility (or the weight) of logical arguments is measured by probabilities. For this, we assume the uncertainty of possible premises for defeasible arguments to be adequately representable by a joint probability function. For maximal simplicity, we often suppose the premises to be stochastically independent, in which case it is sufficient to know individual marginal probabilities, but this is not a general restriction. One can think of uncertain premises as atomic sources of uncertainty, which are not further decomposable, and it depends on the concrete problem or model whether they are independent or not. Such premises will be represented by means of so-called probabilistic variables. Restricting the possible values of a probabilistic variable forms an assumption, and a combination of such assumptions, called a scenario, forms a possible logical argument. This type of argumentative reasoning (or argumentation-oriented reasoning), i.e. the process of constructing arguments from uncertain assumptions, is sometimes called assumption-based or hypothetical reasoning [10,17,54,62,64,71].

1.1. Motivation and general ideas

The ultimate goal of our approach is to quantitatively judge the truth or falsity of an uncertain proposition by an argumentative inference process. The qualitative part of the available information or evidence, on which this judgment is based, is called the knowledge base, and the proposition in question is called the hypothesis. Both the knowledge base and the hypothesis are supposed to be expressible by logical sentences in an appropriate logical language. The examples given in this paper are purely propositional, but the theory is applicable to many more general logics. Most formal concepts and definitions in this paper are not restricted to a specific logic.

For a given hypothesis, the first step is to construct logical arguments and counter-arguments. Each argument (counter-argument) delivers some sort of logical proof for the truth (falsity) of the hypothesis in question, in the sense that the hypothesis (complementary hypothesis) becomes a logical consequence of the knowledge base once the argument is added to it. Note that a combination of an argument with a counter-argument should obviously lead to a logical contradiction. The theory of probabilistic argumentation proposes a precise solution for dealing with such conflicts. This is an important component of the theory, which is responsible for its non-monotonic character. Determining all possible arguments, counter-arguments, and conflicts is the logical or qualitative part of the proposed evaluation process. It is evident that inference methods to accomplish this step depend strongly on the chosen logical language.

Once the qualitative part of the evaluation is completed, the question is how to derive respective probabilistic weights. As a general solution, the theory of probabilistic argumentation proposes a non-additive measure called degree of support and its dual counter-part called degree of possibility.
They measure the credibility of the hypothesis by the total probabilities that it is supported by arguments and that it is not defeated by counter-arguments, respectively. Degrees of support are sometimes called probabilities of provability (or probabilities of deducibility) [26,83], which do not necessarily sum up to 1 for a pair of complementary hypotheses. In [84], Pearl worries about the usefulness of such probabilities of provability: Why should we concern ourselves with the probability that the evidence implies h, rather than the probability that h is true, given the evidence? Of course, we would prefer having the latter, which is some sort of an ideal situation, but the imperfection of the available information does not always provide such an ideal situation. Moreover, by considering probabilities of events of the form “the hypothesis is deducible” rather than “the hypothesis is true”, we are not leaving the classical field of additive probabilities in the sense of Kolmogorov’s axioms. Computing such degrees of support and possibility is the probabilistic or quantitative part of the evaluation process. The entire evaluation process with the connections


Fig. 1. The conceptual connections between assumptions and arguments, arguments and hypotheses (solid arcs), counter-arguments and hypotheses (dotted arcs), hypotheses and degrees of support (solid arrows), and hypotheses and degrees of possibility (dotted arrows).

between the above-mentioned concepts of assumptions, arguments and counter-arguments, hypotheses, and degrees of support and possibility is depicted in Fig. 1.

As probabilistic argumentation is primarily conceived to deal with the uncertainty of possible arguments, one can also look at it as a theory of non-monotonic reasoning under uncertainty. In fact, probabilistic argumentation turns out to be a very general way of combining the two classical types of probabilistic and logical reasoning, in which the two traditional questions of the probability and the logical deducibility of a hypothesis are replaced by the more general question of the probability of a hypothesis being logically deducible from the available knowledge base [42]. Our theory is therefore a contribution to the broad area of attempts to combine logic and probability [1,27,29,31,49,78,80,115]. Since probabilities are not directly incorporated into the logical language, we propose an instance of an external probabilistic logic, but at the same time we depart from most other probabilistic logics by not considering sets of probabilities. The enrichment and further illumination of this connection is a principal goal of this paper.

Possible application areas of probabilistic argumentation exist in domains where classical logics or probability theory alone are unable to entirely capture the characteristics of reasoning with partial information. Some existing application areas are information retrieval [86–88], the authentication of public keys [37,50,63], model-based diagnostics and reliability analysis [4,60], Bayesian networks [110,111], statistics [65], and information fusion in the presence of unreliable sources [38]. The theory may also serve as a starting point for a more general decision theory, in which the amount and relevance of the available information is taken into account [45,77,106].1 The non-additivity of the proposed measure, i.e. the gap between respective degrees of support and possibility, naturally delivers such a degree of informativeness or conversely a degree of ignorance relative to the hypothesis in question. The study of decision theories with the option of further deliberation is a current research topic in the philosophy of economics [20,55,96].

1 It's like in real life: people do not like decisions under ignorance (they prefer betting on events they know about). This psychological phenomenon is called ambiguity aversion and has been experimentally demonstrated by Ellsberg [25]. His observations are rephrased in Ellsberg's paradox, which is often used as an argument against decision-making on the basis of subjective (additive) probabilities.

1.2. Introductory example

In the main part of this paper, we will present the theory of probabilistic argumentation in a very general and formally rigorous form. To further expose the underlying basic ideas and technical concepts, we do not need this generality and rigorousness here. Instead, let's play with a simple toy example, in which the knowledge base, denoted


Table 1. All 16 scenarios of the introductory example together with their conclusions with respect to z and ¬z (⊨: entailed, ⊭: not entailed).

Scenario   a1  a2  a3  a4    z    ¬z    Type
s0         0   0   0   0     ⊭    ⊭
s1         1   0   0   0     ⊭    ⊭
s2         0   1   0   0     ⊭    ⊭
s3         1   1   0   0     ⊨    ⊭     Argument
s4         0   0   1   0     ⊨    ⊭     Argument
s5         1   0   1   0     ⊨    ⊭     Argument
s6         0   1   1   0     ⊨    ⊭     Argument
s7         1   1   1   0     ⊨    ⊭     Argument
s8         0   0   0   1     ⊭    ⊨     Counter-argument
s9         1   0   0   1     ⊭    ⊨     Counter-argument
s10        0   1   0   1     ⊭    ⊨     Counter-argument
s11        1   1   0   1     ⊨    ⊨     Conflict
s12        0   0   1   1     ⊨    ⊨     Conflict
s13        1   0   1   1     ⊨    ⊨     Conflict
s14        0   1   1   1     ⊨    ⊨     Conflict
s15        1   1   1   1     ⊨    ⊨     Conflict

by Φ, is the following set of propositional sentences:

Φ = {a1 ∧ a2 → x, a3 → y, x ∨ y → z, a4 → ¬z}.

Some of the propositions involved, namely a1, a2, a3, and a4, are supposed to represent probabilistically independent uncertain events. These are the probabilistic variables in our model, which will later serve as building blocks for the construction of arguments and counter-arguments. The assumption of probabilistic independence will not be a necessary restriction in our theory, but if we suppose that their marginal probabilities are known, e.g.

P(a1) = 0.7, P(a2) = 0.2, P(a3) = 0.5, P(a4) = 0.1,

it allows us to easily compute the probability of each single configuration in the corresponding 4-dimensional Boolean space by multiplication. For the configuration s3 = (1, 1, 0, 0), for example, which sets a1 and a2 to true and a3 and a4 to false, we get

P({s3}) = P(a1 ∧ a2 ∧ ¬a3 ∧ ¬a4) = 0.7 · 0.2 · (1 − 0.5) · (1 − 0.1) = 0.063.

Such a configuration, which assigns a truth value to each probabilistic variable, is called a scenario, and each truth value assignment of a scenario is called an assumption.

Let us now look at the consequences of the possible scenarios. For this, we assume a particular scenario si to be the true configuration, the one that represents the state of the real world. This assumption allows us to instantiate each occurrence of a probabilistic variable in Φ according to its value in si. The resulting set of sentences is called the conditional knowledge base Φ|si, and we can use it to see whether the given hypothesis is entailed or not. Table 1 lists all 2^4 = 16 scenarios s0, ..., s15 of our example and indicates by ⊨ or ⊭ whether the hypotheses z and ¬z are logical consequences of the respective conditional knowledge bases Φ|si or not.

In the case of s3 = (1, 1, 0, 0), for example, we get Φ|s3 = {x, x ∨ y → z}, of which z (unlike ¬z) is obviously a logical consequence. In other words, s3 supports the truth of z and is therefore a logical argument for z. Its probabilistic weight is P({s3}) = 0.063. An example of a counter-argument of z (i.e. of an argument for ¬z) is scenario s8 = (0, 0, 0, 1). It leads to Φ|s8 = {x ∨ y → z, ¬z}, of which ¬z is an immediate logical consequence. In case a scenario supports both the hypothesis and its complement, we call it a conflict. In our example, s11 = (1, 1, 0, 1) with Φ|s11 = {x, x ∨ y → z, ¬z} is such a conflict. Conflicts arise naturally, especially when the available background knowledge is complemented with actual observations.
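The scenario analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`kb_holds`, `classify`, `probability`), the label "neutral" for scenarios that entail neither z nor ¬z, and the brute-force enumeration of the truth values of x, y, and z are ours and are not the inference method of the theory. The last lines normalise the probability mass of the arguments and counter-arguments by the mass of the non-conflicting scenarios, which is the ratio the text goes on to call the degree of support.

```python
from itertools import product

# Marginal probabilities of the probabilistic variables a1..a4 (values from the text).
P_TRUE = (0.7, 0.2, 0.5, 0.1)


def kb_holds(a1, a2, a3, a4, x, y, z):
    """True if the assignment satisfies all four sentences of the knowledge base."""
    def implies(p, q):
        return (not p) or q
    return (implies(a1 and a2, x) and implies(a3, y)
            and implies(x or y, z) and implies(a4, not z))


def classify(scenario):
    """Classify a scenario (a1, a2, a3, a4) with respect to the hypothesis z."""
    # Truth values that z takes in the models of the conditional knowledge base.
    z_values = {z for x, y, z in product((False, True), repeat=3)
                if kb_holds(*scenario, x, y, z)}
    entails_z = False not in z_values      # z holds in every model
    entails_not_z = True not in z_values   # z fails in every model
    if entails_z and entails_not_z:        # no model at all: inconsistent
        return "conflict"
    if entails_z:
        return "argument"
    if entails_not_z:
        return "counter-argument"
    return "neutral"                       # hypothetical label, not from the paper


def probability(scenario):
    """Probability of a scenario under the independence assumption."""
    p = 1.0
    for value, p_true in zip(scenario, P_TRUE):
        p *= p_true if value else 1.0 - p_true
    return p


# s_i sets a_(k+1) to bit k of i, so s3 = (1, 1, 0, 0) and s8 = (0, 0, 0, 1).
scenarios = [tuple(bool((i >> k) & 1) for k in range(4)) for i in range(16)]
classes = [classify(s) for s in scenarios]

mass = {"argument": 0.0, "counter-argument": 0.0, "conflict": 0.0, "neutral": 0.0}
for s, c in zip(scenarios, classes):
    mass[c] += probability(s)

# Condition on the true scenario not being a conflict.
dsp_z = mass["argument"] / (1.0 - mass["conflict"])
dsp_not_z = mass["counter-argument"] / (1.0 - mass["conflict"])
print(round(dsp_z, 3), round(dsp_not_z, 3))  # 0.544 0.046
```

Enumerating all 16 scenarios is only feasible for toy examples of this size; it is used here because it mirrors Table 1 row by row.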
As conflicts are incompatible with the given knowledge base, we will not consider them as proper arguments or counter-arguments. Note that the complement of the set of all conflicts can be viewed as the available evidence with respect to the probabilistic variables. With respect to the hypothesis z, we formally write ARGS(z) = {s3, ..., s7}, ARGS(¬z) = {s8, s9, s10}, and CONFS = {s11, ..., s15} to denote the sets of arguments, counter-arguments, and conflicts, respectively. This decomposition of the 4-dimensional Boolean space is the result of the qualitative part of the evaluation. Note that in real-world sized problems, we do not require the elements of these sets to be listed explicitly. We will rather use appropriate logical representations such as DNFs (disjunctive normal forms), BDDs (binary decision diagrams), NNFs (negation normal forms), or PDAGs (propositional directed acyclic graphs) [12,15,109]. DNF representations are the core of the computational theory as presented in [39], but we do not make such a commitment in this paper.

To get a quantitative judgment of the hypothesis z, we first consider the conditional probability P(ARGS(z) | CONFS^c) of the event ARGS(z) given that the true scenario is not conflicting, where CONFS^c denotes the complement of CONFS. This conditional probability is what we call the degree of support of z. For the particular values of our example, we obtain the following


result:

\[
\begin{aligned}
\mathrm{dsp}(z) &= P(\mathrm{ARGS}(z) \mid \mathrm{CONFS}^c) = \frac{P(\mathrm{ARGS}(z) \cap \mathrm{CONFS}^c)}{P(\mathrm{CONFS}^c)} = \frac{P(\mathrm{ARGS}(z))}{1 - P(\mathrm{CONFS})} \\
&= \frac{P(\{s_3, s_4, s_5, s_6, s_7\})}{1 - P(\{s_{11}, s_{12}, s_{13}, s_{14}, s_{15}\})} \\
&= \frac{P(\{s_3\}) + P(\{s_4\}) + P(\{s_5\}) + P(\{s_6\}) + P(\{s_7\})}{1 - [P(\{s_{11}\}) + P(\{s_{12}\}) + P(\{s_{13}\}) + P(\{s_{14}\}) + P(\{s_{15}\})]} \\
&= \frac{0.513}{0.943} = 0.544.
\end{aligned}
\]

In an analogous way, we can compute dsp(¬z) = 0.046 for the degree of support of ¬z. From this, we obtain z's degree of possibility by dps(z) = 1 − dsp(¬z) = 0.954. This is the second relevant measure when it comes to quantitatively judging the truth of the hypothesis z. The resulting values dsp(z) = 0.544 and dps(z) = 0.954 indicate the presence of some non-negligible arguments for z and the almost perfect absence of corresponding counter-arguments. In other words, we have some good reasons to accept, but almost no reason to reject z. We can furthermore measure the amount of available information by the difference dig(z) = dps(z) − dsp(z) = 0.41, our degree of ignorance with respect to z. As mentioned earlier, being aware of the degree of ignorance can be useful for decision making.

1.3. Goals and overview

The general purpose of this paper is to further develop, expose, and promote the theory of probabilistic argumentation. Another goal is to illuminate the connections to other theories of reasoning with uncertain information, particularly to the classical fields of logical and probabilistic reasoning. By surpassing previous publications on probabilistic argumentation in length, depth, and generality, we intend the paper to become the theory's main and richest reference. And by not being limited to Boolean logic and through a more rigorous mathematical treatment, the paper also differs significantly from the current main reference [39].
With respect to the information theoretical perspective proposed in [57,59], in which uncertainty is represented by random variables with values in information algebras [56], and where reasoning is embedded in the corresponding algebraic structure, our paper is technically less complex and should therefore be accessible to a broader audience.

In Section 2, we start by defending the position of non-additive degrees of belief and the resulting formal concept of an opinion with respect to an open question. This is a very general discussion, which we use to motivate and justify some of the major characteristics of our theory. Section 2 also includes an account of the theory's historical roots and its connection to more recent developments. Section 3, which is the main part of this paper, provides a general exposition of the theory's main mathematical concepts and properties. We will also show how logical and probabilistic reasoning are included as special cases and say a few words on the connection to Dempster–Shafer theory. The paper ends in Section 4 with some concluding remarks.

2. Degrees of belief and opinions

Argumentation as a process of logical reasoning is an entrenched part of the human intellect and therefore a very natural and general methodology for establishing conclusions on the basis of incomplete information. As such, argumentation plays a crucial role in the formation of rational beliefs and opinions, which themselves are the basis for rational decisions. In this respect, any formal theory of argumentation should also be conceived as (or at least be linked to) a theory of rational belief. As any theory of rationality must at some stage address questions like “What is the best way to represent an agent’s rational belief state?” [116] or “What laws should rational degrees of belief obey?”, we do not refrain from doing this here. We start from the position that belief is primarily quantitative and not categorical, i.e.
we generally assume the existence of various degrees of belief. This corresponds to the observation that most human beings experience belief as a matter of degree. We follow the usual convention that such degrees of belief are values in the [0, 1]-interval, including the two extreme cases 0 for “no belief ” and 1 for “full belief ”, where any intermediate value represents its own level of certitude, thus allowing the quantification of statements like “I strongly believe . . . ” or “I can hardly believe . . . ” over the full range of the spectrum. Degrees of belief depend at least on two factors, the epistemic state of the person or agent who holds the belief, and the proposition or statement under consideration. With epistemic state we mainly refer to the information that


is available to the agent at a particular point in time t. If we denote the available information by Φ_t or simply Φ, we may refer to an agent's degree of belief with respect to a hypothesis h by Bel_Φ(h) ∈ [0, 1]. Furthermore, if ¬h represents the complementary hypothesis of h, then Bel_Φ(¬h) is called the degree of disbelief of h. Typically, we expect Φ to include sentences of a formal logical language or the specification of a probability function, but we do not further specify this at this point. Whenever no confusion is possible, we will abbreviate Bel_Φ(h) by Bel(h). The process of building up degrees of belief and disbelief from the available information is what we mean by reasoning.

2.1. Desired properties of degrees of belief

Our goal here is to point out some of the properties most people would expect to encounter in a reasonable theory of rational degrees of belief. The properties of the following (non-exhaustive) list are the most important ones:

Uniformity. If two agents possess exactly the same amount of relevant information with respect to a hypothesis h, they should have equal degrees of belief. In other words, degrees of belief are supposed to primarily depend on the available information Φ, rather than on an agent's individual preferences or bias. This is what we express in our notation Bel_Φ(h).

Consistency. If a hypothesis h logically entails another hypothesis h′, then Bel(h) is expected to be smaller than or equal to Bel(h′). Formally, this means that h ⊨ h′ implies Bel(h) ≤ Bel(h′), where ⊨ denotes logical entailment. Furthermore, we expect Bel(⊥) = 0 for the inherently false hypothesis ⊥ (the one that entails any other hypothesis) and Bel(⊤) = 1 for the inherently true hypothesis ⊤ (the one that is entailed by any other hypothesis).

Non-Monotonicity. This property tells us that the expansion of the available information with some new information may result in lower, equal, or higher degrees of belief. Accordingly, the new information is called disconfirming, neutral, or confirming, respectively. Formally, non-monotonicity means that Bel_t(h) and Bel_t′(h), the agent's degrees of belief at two different points in time t ≤ t′, are in no pre-determined relationship. Consequently, somebody's belief of a particular hypothesis may repeatedly go up and down when evidence accumulates over time (see Fig. 2).

Non-Additivity. More controversial is the question whether degrees of belief should be additive or not. In this paper, we will assume the non-additivity property to hold, which states that degrees of belief of complementary hypotheses h and ¬h do not necessarily add up to 1, or more generally, that the degrees of belief of two exclusive hypotheses h1 and h2 do not necessarily add up to the degree of belief of their disjunction h1 ∨ h2. Formally, non-additivity with respect to degrees of belief means Bel(h) + Bel(¬h) ≤ 1, which is a special case of Bel(h1) + Bel(h2) ≤ Bel(h1 ∨ h2).

Note that non-additivity is a direct consequence of assuming Bel_∅(h) = 0 for all hypotheses h ≢ ⊤. This can be justified by arguing that the extreme case of total ignorance, represented by Φ = ∅, should not allow degrees of belief different from zero, except for h ≡ ⊤. This reflects a very cautious and skeptical attitude, according to which nothing is believed without being supported by evidence, and it implies Bel_∅(h) + Bel_∅(¬h) = 0 for all h ≢ ⊥, ⊤, a particular case of non-additive degrees of belief. Fig. 2 illustrates the non-monotonic behavior of non-additive degrees of belief, if the knowledge base Φ_t accumulates evidence over time.

Non-additive degrees of belief are also appealing as a proper way of distinguishing between aleatory uncertainty (or simply uncertainty) and epistemic uncertainty (or ignorance). Many different theories of uncertain reasoning are

Fig. 2. Non-monotone and non-additive degrees of belief when new evidence accumulates over time.


Fig. 3. The opinion triangle with its three dimensions: belief, disbelief, and ignorance.


Fig. 4. Special types of opinions.

motivated by this, e.g. the Dempster–Shafer theory [19,98], the Theory of Hints [62], the Transferable Belief Model [104], the Evidentiary Value Model [94], and many others (see Section 2.4).

Non-additivity is also a crucial property of classical logic. At first sight, since logic is usually not concerned with numbers, this is not very apparent. But if we consider logical entailment ⊨, we may often encounter cases of Φ ⊭ h and Φ ⊭ ¬h, which corresponds to Bel_Φ(h) = Bel_Φ(¬h) = 0, the above-mentioned case of total ignorance with respect to h. From the perspective of a non-additive belief measure, additivity appears as an ideal case in which degrees of belief coincide with long-run frequencies or subjective probabilities. There are numerous practical examples where the available evidence is such that this ideal case actually occurs, but this is certainly not always the case.

2.2. Opinions

Belief in general and degrees of belief in particular are undoubtedly closely connected to an agent's opinion. In a formal setting, opinions have been introduced by Jøsang as the fundamental concept of what he calls Subjective Logic [51–53]. Essentially the same concept has been studied before under different names, first by Ginsberg [33] and later by Hájek et al. [46,47] and Daniel [14]. In all cases, the starting point is a non-additive measure of belief. Here we follow Jøsang's original definition in [53], according to which an opinion with respect to a hypothesis h is a triple

ω_h = (b, d, i),    (1)

where b = Bel(h) is the agent's degree of belief in the hypothesis h, d = Bel(¬h) the degree of disbelief in h, and i = 1 − (b + d) the so-called degree of ignorance relative to h.2 Notice that b, d, and i sum up to one, i.e. any pair of values implies the third one. Opinions should therefore be regarded as a two-dimensional rather than three-dimensional concept. For illustrative purposes, it may be useful to represent the set of all possible opinions as shown in Fig. 3 by a 2-simplex, the so-called opinion triangle. Jøsang put forward this picture in [51–53]. To represent a ternary probabilistic space, Dempster used a similar picture in one of his most influential papers almost thirty years earlier [19]. Dempster did not give the picture a particular name; he referred to it by the correct mathematical term, barycentric coordinates.

The formal concept of an opinion is very intuitive and helpful when it comes to explaining or justifying the non-additivity assumption. As shown in Fig. 4, it nicely covers various types of epistemic states, including the extreme cases of full belief by (1, 0, 0), full disbelief by (0, 1, 0), and full ignorance by (0, 0, 1). Other particular opinions are the ones on the edges of the triangle. Notice that Bayesian opinions of the form (p, 1 − p, 0), which are unambiguously characterized by a single (additive) probability value p, do not allow the agent to have “no opinion” or to say “I don’t know”. This is why we postulate non-additivity as a fundamental requirement for a general quantitative model of belief.

2 In [51], Jøsang calls i degree of uncertainty rather than degree of ignorance (which is a bit misleading), and opinions are defined as quadruples (b, d, i, a) with an additional component a, the so-called relative atomicity (we do not need this here).
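As a small illustration of the opinion triple in (1), the following Python sketch builds an opinion from a degree of belief and a degree of disbelief and derives the degree of ignorance. The names `Opinion`, `make_opinion`, and `is_bayesian` are hypothetical and chosen for this example only. Using the values 0.544 and 0.046 from the introductory example as b and d is our assumption that degrees of support play the role of Bel there.

```python
from typing import NamedTuple


class Opinion(NamedTuple):
    """An opinion (b, d, i) with respect to a hypothesis h."""
    belief: float      # b = Bel(h)
    disbelief: float   # d = Bel(not h)
    ignorance: float   # i = 1 - (b + d)


def make_opinion(bel_h: float, bel_not_h: float) -> Opinion:
    """Build an opinion from b and d; i is implied by the other two values."""
    if bel_h < 0 or bel_not_h < 0 or bel_h + bel_not_h > 1 + 1e-12:
        raise ValueError("need b >= 0, d >= 0 and b + d <= 1")
    # max() guards against a tiny negative value caused by rounding.
    return Opinion(bel_h, bel_not_h, max(0.0, 1.0 - (bel_h + bel_not_h)))


def is_bayesian(opinion: Opinion, tol: float = 1e-9) -> bool:
    """A Bayesian opinion (p, 1 - p, 0) leaves no room for ignorance."""
    return opinion.ignorance <= tol


example = make_opinion(0.544, 0.046)    # values from the introductory example
full_ignorance = make_opinion(0.0, 0.0)  # the corner (0, 0, 1) of the triangle
print(example, is_bayesian(example))
```

Storing i explicitly is redundant, since any two components determine the third, but it keeps the triple in the same shape as in the text.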


2.3. Historical roots of non-additive degrees of belief

As Shafer and later Kohlas pointed out [58,99,101,102], first examples of the two-dimensional (non-additive) view of degrees of belief can be found in the literature of the late seventeenth and early eighteenth centuries, well before Bayesian ideas were developed. Historically, non-additive degrees of belief were mostly motivated by judicial applications, such as the reliability of witnesses in the courtroom, or more generally by the credibility of testimonies on past events or miracles.

The first two combination rules for testimonies were published in 1699 in an anonymous article [118].3 One of them considers two independent witnesses with respective credibilities (frequencies of saying the truth) p1 and p2. If we suppose that they deliver the same report, they are either both telling the truth with probability p1 p2 or they are both lying with probability (1 − p1)(1 − p2). Every other configuration is impossible. The original formulation of the main statement is the following:

The ratio of truth saying cases to the total number of cases,

\[ \frac{p_1 p_2}{p_1 p_2 + (1 - p_1)(1 - p_2)}, \tag{2} \]

will represent the probability of both testifiers asserting the truth.4

Translated into our terminology, it means that if both witnesses report the truth of a hypothesis h, then the value for Bel(h) is given by the expression in (2). In Section 3, we will show how to obtain the same result from a probabilistic argumentation system. The corresponding formula for n independent witnesses of equal credibility p,

\[ \frac{p^n}{p^n + (1 - p)^n}, \tag{3} \]

has been mentioned in [70] by Laplace (1749–1827) and is closely related to the Condorcet Jury Theorem discussed in social choice theory [8,16,73,74]. Notice that both probabilities in (2) and (3) sum up to 1 with respect to the two possibilities h and ¬h.
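The two combination formulas can be checked numerically with a short Python sketch. The function name `agreeing_witnesses` and the credibility values 0.8, 0.9, and 0.6 are made up for illustration; the function simply evaluates the ratio of truth-saying cases to all cases in which the witnesses agree, so (2) is the case of two witnesses and (3) the case of n witnesses with equal credibility.

```python
from math import prod


def agreeing_witnesses(credibilities):
    """Probability that all witnesses tell the truth, given that they agree.

    Assumes independent witnesses. The ratio is undefined (division by zero)
    if agreement is impossible, e.g. for credibilities 1 and 0.
    """
    all_true = prod(credibilities)
    all_lying = prod(1.0 - p for p in credibilities)
    return all_true / (all_true + all_lying)


# Formula (2): two witnesses with hypothetical credibilities 0.8 and 0.9.
two = agreeing_witnesses([0.8, 0.9])

# Formula (3): n = 3 witnesses with equal hypothetical credibility p = 0.6.
three = agreeing_witnesses([0.6] * 3)

# The value for "both are lying" is obtained by swapping p and 1 - p.
both_lying = agreeing_witnesses([0.2, 0.1])
print(two, both_lying, three)
```

In this sketch the two values for h and ¬h add up to 1, which matches the remark that (2) and (3) are additive.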
It thus seems that they are classical additive probabilities, but since they do not depend on a prior probability with respect to h, they raise the controversial question of whether these formulae are proper posterior probabilities in a Bayesian sense. George Boole (1815–1864) gives a similar formula that includes a prior distribution [11], but (2) and (3) still appear to be reasonable results. The connection between Laplace’s and Boole’s formulae has been studied in [38], in which both expressions drop out as special cases of a more general model of partially reliable information sources. This general model is also applicable to situations of contradictory testimonies. It presupposes non-additive degrees of belief, but Laplace’s and Boole’s formulae themselves remain additive. However, the fact that Laplace’s formula does not require a prior probability for h turns out to be the consequence of approaching the problem from the perspective of non-additive degrees of belief. Another important historical contribution, in which the connection to non-additive degrees of belief is more obvious, can be found in the fourth part of Jakob Bernoulli’s (1654–1705) famous Ars Conjectandi (the art of conjecture) [6]. He distinguishes between necessary and contingent (uncertain) statements: A proposition is called necessary, relative to our knowledge, when its contrary is incompatible with what we know. It is contingent, if it is not entailed by what we know. With respect to the question of whether the hypothesis h is implied by the given evidence, Bernoulli analyses four possible situations: (a) the evidence is necessary and implies h necessarily; (b) the evidence is contingent, but implies h necessarily; (c) the evidence is necessary, but implies h only contingently; (d) the evidence is contingent and implies h only contingently. In (c) and (d), a further distinction is made between pure and mixed arguments. 
3 There is some disagreement about the authorship of this article. Shafer names the English cleric George Hooper (1640–1727) [100], but for Pearson, the true author is the English statistician Edmund Halley (1656–1742) [85]. Another possible author is the Scottish mathematician John Craig (1663–1731).
4 In its substance, this statement was considered important enough to be included in Francis Edgeworth’s (1845–1926) article on probability in the 11th edition of the Encyclopædia Britannica [22].

In the mixed case, it is assumed that if the evidence does not imply h, it implies ¬h, whereas nothing is said about ¬h in the pure case. Bernoulli
then considers the number of cases in which the evidence occurs and in which h (or ¬h) is entailed. Finally, the corresponding ratios with respect to the total number of cases turn out to be non-additive in (b) and in the pure versions of (c) and (d). Bernoulli also discusses the problem of combining several testimonies. Essentially, his combination rules are special cases of what is known today as Dempster’s rule of combination (see next subsection). In the mixed version of (c), the results of the combination coincide with Laplace’s formula, again without requiring a prior probability for h. Laplace’s analysis is thus included in Bernoulli’s analysis, but the connection to non-additive cases is now more obvious. Even more general is Johann Heinrich Lambert’s (1728–1777) discussion in [69]. From Lambert’s perspective, Bernoulli’s pure and mixed arguments are special cases of a more general situation, in which a syllogism (logical argument) has three parts, the affirmative, the negative, and the indeterminate. There is a number attached to each of these parts, all three of them summing up to 1. This is exactly what we call today an opinion ωh = (b, d, i). In this sense, Bernoulli’s distinction between pure and mixed arguments is a restriction to positive and Bayesian opinions, respectively, but Lambert’s discussion covers the general case. A more comprehensive summary of Bernoulli’s, Lambert’s, and Laplace’s work with corresponding links to the modern view is given in [58,99]. Notice that these very old ideas, until they were rediscovered by Dempster, Hacking, and Shafer in the second half of the 20th century [18,34,98], were completely eliminated from mainstream probability for almost three full centuries.

2.4. Connections to more recent developments

The idea of defining non-additive degrees of belief or opinions on the basis of probabilities of provability has been the motivation of many other approaches.
To the best of our knowledge, the term probability of provability (or probability of necessity) was first used by Pearl [83] and later by Laskey et al. in [71] and Smets in [103] in their discussions of the connection between Bayesian probabilities and belief functions. Ruspini proposed a similar view, but he prefers to talk about epistemic probabilities P(Kh) and P(K¬h) of the epistemic states Kh (h is known) and K¬h (¬h is known), respectively [92,93]. A similar view is discussed in [14,46,47], where pairs (b, d) are called Dempster pairs. The notion of a Dempster pair is obviously linked to the Dempster–Shafer theory (DST) [19,98], which is also known as the theory of belief functions or the Theory of Evidence. This theory also yields a pair of values for belief and disbelief, but the degree of disbelief is usually expressed by the degree of plausibility Pl(h) = 1 − Bel(¬h). This implies Bel(h) ≤ Pl(h) for all possible hypotheses h and thus defines a partitioning of the unit interval [0, 1] into three blocks. The same type of belief/plausibility pairs is used in the related Transferable Belief Model (TBM) [104] and the Theory of Hints [62]. Notice that the spirit behind such belief/plausibility pairs is very similar to the modal operators □ (necessity) and ◇ (possibility) in modal logic [9]. Another two-dimensional representation of belief results from using the principle of indifference to transform belief/plausibility pairs into additive probabilities. In the TBM framework, this transformation is called pignistic transformation [104], and its result is called betting probability BetP(h). In the simple case of two complementary hypotheses h and ¬h, BetP(h) is simply the arithmetic mean of Bel(h) and Pl(h).
BetP(h) together with the degree of ignorance is an alternative pair, which reflects precisely the view of some moderate Bayesians, who admit that in addition to the (additive) degree of belief one should also consider the strength of the belief, which depends on the amount of available supporting evidence. In [48], Hawthorne describes this point in the following way:

“I contend that Bayesians need two distinct notions of probability. We need the usual degree-of-belief notion that is central to the Bayesian account of rational decision. But Bayesians also need a separate notion of probability that represents the degree to which evidence supports hypotheses.”

In Hawthorne’s sense, additional supporting evidence can function in two different ways: it may increase either the degree of belief or the strength of belief (or both). This point is the central idea of what Schum [97] calls the Scandinavian School of Evidentiary Value [23,30], another non-additive approach to degrees of belief. It is known today as the Evidentiary Value Model (EVM) [94] and originates from the work of the Swedish lawyer Ekelöf in the early 1960s [24].


Instead of talking about non-additive degrees of belief, some authors prefer to call them non-additive probabilities [32,95,96]. They are often understood as bounds of probability intervals, which are induced by sets of compatible probability functions [66,68,105,113]. Such bounds are often called lower and upper probabilities, which is similar to what Dempster originally had in mind [18]. Today, the common general term for this particular class of approaches is imprecise probabilities [112]. Imprecise or non-additive probabilities have also been in use in physics for a long time, where the role of the non-additivity is to describe the deviation of elementary particles in mechanical wave-like behavior [28]. In this paper, in order to avoid unnecessary confusion, we prefer to make a strict distinction between additive probabilities (in the classical sense) and non-additive degrees of belief or degrees of support. The theory of probabilistic argumentation demonstrates how to use the former to obtain the latter.

3. Probabilistic argumentation

Let us now build up a formal theory of probabilistic argumentative reasoning, which adopts at its core the characteristics of degrees of belief and opinions as suggested in the previous section. For this, we require the qualitative (or logical) part of the available information to be expressed by a set $\Phi$ of well-formed sentences of a logical language, which we call the knowledge base. The logical language itself is supposed to possess a well-defined model-theoretic semantics, in which a proper entailment relation $\models$ is defined in terms of set inclusion of models (or interpretations) in some underlying universe. The simplest non-trivial case is the language of propositional logic, where the universe is a multi-dimensional Boolean space over a set of propositional variables. The application of this particular case to probabilistic argumentation has been extensively discussed in the literature [39,43,44]. Other simple languages are obtained from finite set constraints [40], interval constraints [79], or general multivariate constraints such as (linear) equations and/or inequalities [61,117]. In this paper, we do not restrict ourselves to a particular language, but we assume the universe to be a multi-dimensional space $\Omega_V$ generated by a set of variables $V$. The corresponding logical language will be denoted by $\mathcal{L}_V$, of which $\Phi$ is a proper subset.

To represent the quantitative (or probabilistic) part of the given information, we suppose the existence of a fully specified probability measure $P$ over a sample space $\Omega_W$, which itself is generated by a subset $W \subseteq V$ of so-called probabilistic variables. To keep the mathematics in our discussion as simple as possible, we restrict $\Omega_W$ to be discrete and finite, but most concepts and definitions are extendable to the continuous case. Otherwise, we do not make further assumptions regarding the specification of the probability measure $P$. The simplest and most efficient specification results from assuming the variables in $W$ to be mutually independent (as in the example in Section 1.2), but other more flexible and powerful techniques such as Bayesian or Markov networks are applicable as well [76]. Note that we do not impose any particular interpretation of probability.

In this section, we start with a general discussion of the conceptual differences between logical and probabilistic reasoning. These are the two main mathematical tools to construct a probabilistic theory of logical arguments. Our discussion will give us a better understanding of the key connections, which will then be generalized into a combined (argumentation-oriented) theory of logical and probabilistic reasoning.

3.1. Connecting logic and probability

Logic and probability theory both have a long history in science.
They are mainly rooted in philosophy and mathematics, but are nowadays important tools in many other fields such as computer science and particularly Artificial Intelligence. Some philosophers studied the connection between logical and probabilistic reasoning, and a great number of attempts to combine these disciplines have been made, but logic and probability theory are still widely perceived to be separate theories. This is quite surprising, since both disciplines are driven by the same goal, namely to evaluate hypotheses through a formal process of reasoning. One of the key conceptual points, which separates probability theory and logic, is the following: classical probabilistic reasoning presupposes the existence of a probability measure over all variables, whereas pure logical reasoning does not deal with probabilities at all. In other words, logic presupposes a probability measure over none of the variables involved. If we call the variables involved in the probability measure probabilistic, we can say that a probabilistic model consists of probabilistic variables only, whereas all variables of a logical model are non-probabilistic. From this point of view, one of the main differences between logical and probabilistic reasoning is the number of probabilistic variables. This simple observation turns out to be crucial for understanding many similarities and differences between logical and probabilistic reasoning.⁵

Fig. 5. The connection between probabilistic argumentation and the classical fields of logical and probabilistic reasoning through different sets of probabilistic variables.

3.2. Probabilistic argumentation systems

With the above remarks in mind, building a more general theory of reasoning is almost straightforward. The simple idea is to allow an arbitrary number of probabilistic variables. More formally, if the available information is represented over a set of variables $V$, we suppose that there is a subset $W \subseteq V$ of probabilistic variables, all with finite domains. If $\Omega_X$ denotes the domain of a single variable $X \in V$ and $\Omega_W$ the corresponding Cartesian product with respect to all variables in $W$, we can consider a probability space $(\Omega_W, 2^{\Omega_W}, P)$, where $2^{\Omega_W}$ denotes the power set of $\Omega_W$ and $P : 2^{\Omega_W} \to [0, 1]$ a corresponding probability measure that satisfies the Kolmogorov axioms. The finiteness assumption with regard to $\Omega_W$ is not a conceptual restriction of this theory, but it allows us to define $P$ with respect to the σ-algebra $2^{\Omega_W}$ and thus helps to keep the mathematics simple. For the general case of arbitrary sets of probabilistic variables, let us now define the mathematical structure of a probabilistic argumentation system, into which the available information needs to be compiled.

Definition 1. A probabilistic argumentation system is a quintuple

$$\mathcal{A} = (V, \mathcal{L}_V, \Phi, W, P), \tag{4}$$

whose components $V$, $\mathcal{L}_V$, $\Phi$, $W$, and $P$ are as defined above. We will later see precisely how the general theory of probabilistic argumentation degenerates into the classical fields of logical and probabilistic reasoning for $W = \emptyset$ and $W = V$, respectively, but the general idea of this connection is already depicted in Fig. 5.

5 The literature on combining logic and probability is huge, see e.g. [1,27,29,31,49,78,80,115], but the idea of distinguishing between probabilistic and non-probabilistic variables seems to be relatively new and unexplored.

Example 2. To illustrate the concept of a probabilistic argumentation system, consider the simple story in which our friend Alice flips a fair coin and promises to invite us to a barbecue tomorrow night provided that the coin lands on heads. Alice is well known to always keep her promises, but she does not say anything about what she will do if the coin lands on tails, i.e. she may or may not organize the barbecue in that case. Of course, we would like to know whether the barbecue takes place or not. How can this knowledge be expressed in terms of a probabilistic argumentation system? The given evidence consists of two pieces: the first one is Alice’s reliable promise, and the second one is the fact that the two possible outcomes of tossing a fair coin are known to be equally likely. Thus the evidence is best modeled with two Boolean variables, say $H$ (for heads) and $B$ (for barbecue), with domains $\Omega_H = \Omega_B = \{0, 1\}$, a (uniform) probability function over $\Omega_H$, and a propositional sentence $h \to b$ (with $h$ and $b$ as placeholders for the atomic events $H = 1$ and $B = 1$, respectively). We thus have $V = \{H, B\}$, $\Phi = \{h \to b\}$, $W = \{H\}$, and $P(h) = P(\neg h) = 0.5$.

Altogether we get a probabilistic argumentation system $\mathcal{A} = (V, \mathcal{L}_V, \Phi, W, P)$ as defined above, where $\mathcal{L}_V$ is the language of propositional logic.

Example 3. Another simple example is the judicial problem of two independent witnesses who deliver the same report (see historical notes in Section 2.3). Let $p_1$ and $p_2$ be their respective credibilities (frequencies of telling the truth). To model this situation as a probabilistic argumentation system, consider five Boolean variables $REL_1$ (reliability of Witness 1), $REL_2$ (reliability of Witness 2), $REP_1$ (report of Witness 1), $REP_2$ (report of Witness 2), and $HYP$ (the hypothesis in question). Again, we use propositions $rel_i$, $rep_i$, and $hyp$ as placeholders for $REL_i = 1$, $REP_i = 1$, and $HYP = 1$, respectively. We thus have $V = \{REL_1, REL_2, REP_1, REP_2, HYP\}$ and $W = \{REL_1, REL_2\}$ with respective marginal probabilities $P(rel_1) = p_1$ and $P(rel_2) = p_2$. The required probability measure $P$ then follows from the independence assumption. To model the logical constraints of this example, suppose that a reliable witness knows the true state of the hypothesis and thus delivers a positive (negative) report whenever the hypothesis is true (false). Unreliable witnesses are supposed to act the other way around. We can thus consider propositional sentences $rel_i \to (hyp \leftrightarrow rep_i)$ and $\neg rel_i \to (hyp \leftrightarrow \neg rep_i)$, which can be merged into a single sentence $rel_i \leftrightarrow (hyp \leftrightarrow rep_i)$. If we suppose that both witnesses deliver a positive report, we get $\Phi = \{rel_1 \leftrightarrow (hyp \leftrightarrow rep_1),\ rel_2 \leftrightarrow (hyp \leftrightarrow rep_2),\ rep_1,\ rep_2\}$ for our knowledge base, which is logically equivalent to $\Phi = \{rel_1 \leftrightarrow hyp,\ rel_2 \leftrightarrow hyp,\ rep_1,\ rep_2\}$. All elements together constitute a probabilistic argumentation system $\mathcal{A} = (V, \mathcal{L}_V, \Phi, W, P)$, where $\mathcal{L}_V$ is again the language of propositional logic.

3.3. Conflicts

For a given probabilistic argumentation system $\mathcal{A} = (V, \mathcal{L}_V, \Phi, W, P)$, the problem to solve is to judge whether a hypothesis, expressed by an additional sentence $h \in \mathcal{L}_V$, is true or false. For this, we denote the set of models of $\Phi$ (the elements of the universe $\Omega_V$ for which $\Phi$ is true) by $E_V = \llbracket \Phi \rrbracket \subseteq \Omega_V$, and thus assume the true state of the world to be exactly one of its elements. Think of $E_V$ as the available evidence with regard to all variables $V$. Furthermore, let $H = \llbracket h \rrbracket \subseteq \Omega_V$ denote the set of models of the hypothesis $h \in \mathcal{L}_V$, i.e. $h$ is true if and only if the true state of the world is in $H$. In the following, we will use the sets $E_V$ and $H$ interchangeably with $\Phi$ and $h$, respectively.

The main formal definitions in this section are based on two key observations. The first one is the fact that the evidence $E_V \subseteq \Omega_V$, which restricts the set of possible states relative to $V$, also restricts the possible states relative to $W$. We call the elements $s \in \Omega_W$ scenarios, and by projecting $E_V$ from $\Omega_V$ to $\Omega_W$, as illustrated in Fig. 6, we get the set $E_W = E_V^{\downarrow W} \subseteq \Omega_W$ of scenarios which are consistent with the knowledge base $\Phi$. This means that exactly one element of $E_W$ corresponds to the true state of the world, and it implies that the scenarios of the complementary set, $\Omega_W \setminus E_W$, are all incompatible with $\Phi$. In other words, they represent states which are in conflict with the available knowledge base, written as $\Phi|s \models \bot$.

Definition 4. If $\Phi|s \models \bot$ holds for a scenario $s \in \Omega_W$, then $s$ is called a conflict of $\Phi$. The set of all such conflicts is denoted by

$$\mathrm{CONFS}_{\mathcal{A}} = \{s \in \Omega_W : \Phi|s \models \bot\} = \Omega_W \setminus E_W. \tag{5}$$
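The projection view behind Definition 4 can be checked by brute force on a small example. The following is a minimal Python sketch, not part of the original formalism: it assumes that all variables are Boolean and encodes each sentence as a Python predicate over a state dictionary (an illustrative encoding chosen here). It enumerates the models of the knowledge base of Example 3 and obtains the conflicts as the scenarios that lie outside the projection.

```python
from itertools import product

# Variables of Example 3; W holds the probabilistic ones.
V = ["REL1", "REL2", "REP1", "REP2", "HYP"]
W = ["REL1", "REL2"]

# Knowledge base after two positive reports. Illustrative encoding:
# each sentence is a predicate on a state (a dict variable -> 0/1).
PHI = [
    lambda x: bool(x["REL1"]) == (x["HYP"] == x["REP1"]),  # rel1 <-> (hyp <-> rep1)
    lambda x: bool(x["REL2"]) == (x["HYP"] == x["REP2"]),  # rel2 <-> (hyp <-> rep2)
    lambda x: x["REP1"] == 1,                              # rep1
    lambda x: x["REP2"] == 1,                              # rep2
]

def models(variables, sentences):
    """Return all states over `variables` that satisfy every sentence."""
    states = (dict(zip(variables, bits))
              for bits in product((0, 1), repeat=len(variables)))
    return [x for x in states if all(phi(x) for phi in sentences)]

def project(states, onto):
    """Project a collection of states to the variables in `onto`."""
    return {tuple(x[v] for v in onto) for x in states}

E_V = models(V, PHI)                            # models of the knowledge base
E_W = project(E_V, W)                           # consistent scenarios
OMEGA_W = set(product((0, 1), repeat=len(W)))   # all scenarios
CONFS = OMEGA_W - E_W                           # conflicts, second form of eq. (5)

print(sorted(E_W))    # [(0, 0), (1, 1)]
print(sorted(CONFS))  # [(0, 1), (1, 0)]
```

Exhaustive enumeration is exponential in the number of variables, so this sketch is only meant to make the definition concrete for toy examples.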

In the above definition, $\Phi|s$ represents the conditional knowledge base given $s$, which we obtain from $\Phi$ by replacing all occurrences of probabilistic variables by their respective values in $s$ (see Section 1.2 for an example). Note that conflicts may arise very naturally, especially when the available information is complemented with actual observations or facts from the real world. In the following, we will often use CONFS as a short form for $\mathrm{CONFS}_{\mathcal{A}}$.

Fig. 6. Projecting the knowledge base from V to W.

Fig. 7. Evidence conditioned on various scenarios.

3.4. Arguments and counter-arguments

The second observation goes in the other direction, that is from $\Omega_W$ to $\Omega_V$. Let us assume that a certain scenario $s \in \Omega_W$ is the true scenario. With respect to $\Omega_V$, this particular situation reduces the set of possible states from $E_V$ to

$$E_V|s = \{x \in E_V : x^{\downarrow W} = s\} = \llbracket \Phi|s \rrbracket, \tag{6}$$

where $x^{\downarrow W}$ denotes the projection of a state $x$ from $\Omega_V$ to $\Omega_W$. This restricted set of states corresponds to the models of $\Phi|s$ and thus contains all the elements of $E_V$ that are compatible with $s$. This idea is illustrated in Fig. 7 for four different scenarios $s_0$, $s_1$, $s_2$, and $s_3$. Note that $s \in E_W$ implies $E_V|s \neq \emptyset$ (respectively $\Phi|s \not\models \bot$) and vice versa.

Consider now a consistent scenario $s \in E_W$ for which $E_V|s \subseteq H$ (respectively $\Phi|s \models h$) holds. This means that $h$ is a logical consequence of $s$ and $\Phi$, and $s$ can thus be seen as a defeasible or hypothetical proof for $h$ in the light of $\Phi$. We must say defeasible, because it is uncertain whether $s$ is the true scenario or not. In other words, $h$ is only supported by $s$, but not entirely proven, and every supporting scenario is thus a defeasible logical argument for $h$.

Definition 5. If $\Phi|s \models h$ holds for a consistent scenario $s \in E_W$ and a hypothesis $h \in \mathcal{L}_V$, then $s$ is called an argument for $h$. With

$$\mathrm{ARGS}_{\mathcal{A}}(h) = \{s \in E_W : \Phi|s \models h\} = \{s \in \Omega_W : \Phi|s \models h,\ \Phi|s \not\models \bot\} \tag{7}$$

we denote the set of all such arguments.⁶
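Definition 5 can be made concrete on the barbecue story of Example 2. The sketch below is an illustration under stated assumptions rather than a method from the paper: variables are Boolean, sentences and hypotheses are encoded as Python predicates, and the function names `conditioned_models` and `args` are hypothetical. It computes the conditioned evidence of (6) for each scenario and collects the scenarios that qualify as arguments according to (7).

```python
from itertools import product

V = ["H", "B"]   # H: coin lands on heads, B: barbecue takes place
W = ["H"]        # only the coin is probabilistic
PHI = [lambda x: (not x["H"]) or bool(x["B"])]   # Alice's promise: h -> b

def conditioned_models(s):
    """E_V|s of eq. (6): models of PHI whose projection to W equals s."""
    result = []
    for bits in product((0, 1), repeat=len(V)):
        x = dict(zip(V, bits))
        if tuple(x[v] for v in W) == s and all(phi(x) for phi in PHI):
            result.append(x)
    return result

def args(h):
    """ARGS(h) of eq. (7): consistent scenarios whose models all satisfy h."""
    found = set()
    for s in product((0, 1), repeat=len(W)):
        e = conditioned_models(s)
        # `e` non-empty means s is consistent; all(...) means E_V|s lies within H.
        if e and all(h(x) for x in e):
            found.add(s)
    return found

barbecue = lambda x: x["B"] == 1
print(args(barbecue))                    # {(1,)}
print(args(lambda x: not barbecue(x)))   # set()
```

In this encoding, heads is the only argument for the barbecue, and nothing supports its negation: the tails scenario is consistent with the promise but entails neither outcome.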

Similarly, the elements of $\mathrm{ARGS}_{\mathcal{A}}(\neg h) = \{s \in E_W : \Phi|s \models \neg h\}$ are logical counter-arguments, which refute $h$ in the light of the given evidence. When no confusion is anticipated, we omit the reference to $\mathcal{A}$ and use the short forms $\mathrm{ARGS}(h)$ and $\mathrm{ARGS}(\neg h)$. Note that $\mathrm{ARGS}(h) \cap \mathrm{ARGS}(\neg h) = \emptyset$ holds for all possible hypotheses $h \in \mathcal{L}_V$. In the example of Fig. 7, the hypothesis $h$ is supported by the argument $s_3$, but not by $s_0$, $s_1$, or $s_2$ ($s_0$ is a conflict). Similarly, $\neg h$ is supported (or $h$ is refuted) by $s_1$, but not by $s_0$, $s_2$, or $s_3$. In the case of $s_2$, no definite conclusion is possible for $h$. The existence of such neutral scenarios is the reason for the non-additivity of degrees of support, as we will see below and in the next subsection. The decomposition of the set $\Omega_W$ into supporting, refuting, conflicting, and neutral scenarios is depicted in Fig. 8.

Fig. 8. Arguments, counter-arguments, conflicts, and neutral scenarios.

Example 6. In the situation of Example 3, we have two probabilistic variables $REL_1$ and $REL_2$, i.e. depending on whether the two witnesses are telling the truth or not, this yields four different scenarios $s_0 = (0, 0)$, $s_1 = (0, 1)$, $s_2 = (1, 0)$, and $s_3 = (1, 1)$. After receiving two positive reports, $rep_1$ and $rep_2$, our knowledge base becomes $\Phi = \{rel_1 \leftrightarrow hyp,\ rel_2 \leftrightarrow hyp,\ rep_1,\ rep_2\}$, from which we obtain $\Phi|s_0 = \{\neg hyp, rep_1, rep_2\}$, $\Phi|s_1 = \{\bot\}$, $\Phi|s_2 = \{\bot\}$, and $\Phi|s_3 = \{hyp, rep_1, rep_2\}$. This implies that $s_3$ is an argument and $s_0$ is a counter-argument for $hyp$, and that $s_1$ and $s_2$ are conflicts. Note that there are no neutral scenarios. In other words, it can either be the case that both witnesses are telling the truth (in which case we conclude $hyp$) or that they are both lying (in which case we conclude $\neg hyp$). Every other configuration is impossible. This is exactly the argument used in the original formulation of the problem in [118].

6 Alternatively, we may define the set of arguments simply by $\mathrm{Args}(h) = \{s \in \Omega_W : \Phi|s \models h\} = \mathrm{ARGS}(h) \cup \mathrm{CONFS}$, which then implies $\mathrm{CONFS} = \mathrm{Args}(\bot)$ and therefore $\mathrm{ARGS}(h) = \mathrm{Args}(h) \setminus \mathrm{Args}(\bot)$. From this perspective, it seems that $\mathrm{Args}(h)$ is the more fundamental concept, from which both CONFS and $\mathrm{ARGS}(h)$ follow. In fact, most computational methods are designed to compute the so-called quasi-arguments $\mathrm{Args}(h)$, see [35,36,39], but at the conceptual core, the choice between $\mathrm{ARGS}(h)$ and $\mathrm{Args}(h)$ remains a matter of taste. Here we prefer $\mathrm{ARGS}(h)$ to stress that conflicts should not be regarded as proper arguments.

The key definitions in (5) and (7) have a number of mathematical consequences. One property, which is particularly important for computational purposes, is shown in the following theorem, in which $S^{\downarrow W}$ denotes the projection of a subset $S \subseteq \Omega_V$ to $\Omega_W$. The proof of the theorem is given in Appendix A.

Theorem 7.

$$\mathrm{ARGS}(h) = E_W \setminus (E_V \cap H^c)^{\downarrow W}. \tag{8}$$
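As a sanity check of Theorem 7 on a small instance, the following Python sketch computes the set of arguments in two ways, once scenario by scenario as in (7) and once by projection as in (8), and compares the results for the witness problem of Example 3. The Boolean encoding and the function names `args_by_definition` and `args_by_projection` are assumptions made for this illustration; a check on one example is of course not a proof.

```python
from itertools import product

def states(variables):
    """All Boolean states over the given variables."""
    return [dict(zip(variables, bits))
            for bits in product((0, 1), repeat=len(variables))]

def project(xs, onto):
    """Project a collection of states to the variables in `onto`."""
    return {tuple(x[v] for v in onto) for x in xs}

def args_by_definition(V, W, phi, h):
    """Eq. (7): consistent scenarios s whose conditioned evidence lies in H."""
    e_v = [x for x in states(V) if all(f(x) for f in phi)]
    result = set()
    for s in project(e_v, W):
        if all(h(x) for x in e_v if tuple(x[v] for v in W) == s):
            result.add(s)
    return result

def args_by_projection(V, W, phi, h):
    """Eq. (8): E_W minus the projection of the models that violate h."""
    e_v = [x for x in states(V) if all(f(x) for f in phi)]
    outside_h = [x for x in e_v if not h(x)]          # E_V intersected with H^c
    return project(e_v, W) - project(outside_h, W)

V = ["REL1", "REL2", "REP1", "REP2", "HYP"]
W = ["REL1", "REL2"]
PHI = [
    lambda x: bool(x["REL1"]) == (x["HYP"] == x["REP1"]),
    lambda x: bool(x["REL2"]) == (x["HYP"] == x["REP2"]),
    lambda x: x["REP1"] == 1,
    lambda x: x["REP2"] == 1,
]
hyp = lambda x: x["HYP"] == 1
not_hyp = lambda x: x["HYP"] == 0

for h in (hyp, not_hyp):
    assert args_by_definition(V, W, PHI, h) == args_by_projection(V, W, PHI, h)

print(args_by_projection(V, W, PHI, hyp))      # {(1, 1)}
print(args_by_projection(V, W, PHI, not_hyp))  # {(0, 0)}
```

The two printed sets agree with Example 6, where $s_3 = (1, 1)$ is the argument and $s_0 = (0, 0)$ the counter-argument for $hyp$.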

This theorem tells us that the problem of computing $\mathrm{ARGS}(h)$ is essentially a projection (or variable elimination) problem (see [56] for a generic solution of the projection problem). Translated into the terminology of logic, it means that the variables in $U = V \setminus W$ need to be eliminated from the set of sentences $\Phi \cup \{\neg h\}$, which we obtain from the knowledge base by adding the negated hypothesis to it. If $\Phi$ is a clausal set and the variables in $U$ are propositions, it is possible to realize the elimination as a resolution-based procedure [39]. Note that variable elimination in propositional logic can also be viewed as a quantifier elimination problem [114].

Other mathematical consequences of (5) and (7) arise if we consider special situations like $\Phi \equiv \top$, $\Phi \equiv \bot$, $h \equiv \top$, or $h \equiv \bot$. Table 2 summarizes some of these consequences, which are all easy to verify (for corresponding proofs in the context of propositional logic, see [39]). Further simple properties occur if we consider particular situations of interrelated hypotheses or knowledge bases. Some of these (easily verifiable) properties are shown in the following non-exhaustive list:

$$
\begin{aligned}
h \models h' \quad &\Rightarrow \quad \mathrm{ARGS}(h) \subseteq \mathrm{ARGS}(h'), \\
h \models \neg h' \quad &\Rightarrow \quad \mathrm{ARGS}(h) \cap \mathrm{ARGS}(h') = \emptyset, \\
h \equiv h_1 \wedge h_2 \quad &\Rightarrow \quad \mathrm{ARGS}(h) = \mathrm{ARGS}(h_1) \cap \mathrm{ARGS}(h_2), \\
\Phi \subseteq \Phi' \quad &\Rightarrow \quad \mathrm{CONFS}_{\mathcal{A}} \subseteq \mathrm{CONFS}_{\mathcal{A}'}.
\end{aligned}
$$

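Table 2
Consequences of (5) and (7) for various special situations

$$
\begin{array}{llccc}
 & & \mathrm{ARGS}(h) & \mathrm{ARGS}(\neg h) & \mathrm{CONFS} \\
\hline
\Phi \equiv \bot & & \emptyset & \emptyset & \Omega_W \\
\Phi \not\equiv \bot & \Phi \models h \ (\text{e.g. } h \equiv \top) & E_W & \emptyset & \Omega_W \setminus E_W \\
\Phi \not\equiv \bot & \Phi \models \neg h \ (\text{e.g. } h \equiv \bot) & \emptyset & E_W & \Omega_W \setminus E_W \\
\Phi \not\equiv \bot & \mathrm{Vars}(h) \subseteq W & E_W \cap H^{\downarrow W} & E_W \cap (H^c)^{\downarrow W} & \Omega_W \setminus E_W \\
\Phi \equiv \top & h \not\equiv \top,\ h \not\equiv \bot,\ \mathrm{Vars}(h) \cap W = \emptyset & \emptyset & \emptyset & \emptyset
\end{array}
$$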
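The interplay between a growing knowledge base and the sets of arguments and conflicts can be illustrated on Example 2. The sketch below is hypothetical in two respects: the extra observation "no barbecue" is not part of the original example, and the Boolean predicate encoding and the helper name `classify` are choices made here. It shows that adding a sentence can only enlarge the conflict set, while the sets of arguments and counter-arguments may change in either direction.

```python
from itertools import product

V, W = ["H", "B"], ["H"]

def classify(phi, h):
    """Split the scenarios into (ARGS(h), ARGS(not h), CONFS) for knowledge base phi."""
    args_h, args_not_h, confs = set(), set(), set()
    for s in product((0, 1), repeat=len(W)):
        e = []                                   # conditioned evidence E_V|s
        for bits in product((0, 1), repeat=len(V)):
            x = dict(zip(V, bits))
            if tuple(x[v] for v in W) == s and all(f(x) for f in phi):
                e.append(x)
        if not e:
            confs.add(s)                         # s contradicts phi
        elif all(h(x) for x in e):
            args_h.add(s)                        # s entails h
        elif not any(h(x) for x in e):
            args_not_h.add(s)                    # s entails not h
        # otherwise s is a neutral scenario
    return args_h, args_not_h, confs

promise = lambda x: (not x["H"]) or bool(x["B"])   # h -> b
no_barbecue = lambda x: x["B"] == 0                # assumed extra observation: not b
barbecue = lambda x: x["B"] == 1                   # hypothesis under consideration

before = classify([promise], barbecue)
after = classify([promise, no_barbecue], barbecue)

print(before)  # ({(1,)}, set(), set())
print(after)   # (set(), {(0,)}, {(1,)})
```

After the assumed observation, the former argument (heads) has become a conflict and the former neutral scenario (tails) has become a counter-argument, while the conflict set has grown from empty to one scenario.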


The last property describes the monotonic growth of the conflict set when new information arrives, i.e. when $\mathcal{A} = (V, \mathcal{L}_V, \Phi, W, P)$ is updated into $\mathcal{A}' = (V, \mathcal{L}_V, \Phi', W, P)$. Note that apart from that, $\Phi \subseteq \Phi'$ does not allow general conclusions about the relationship between $\mathrm{ARGS}_{\mathcal{A}}(h)$ and $\mathrm{ARGS}_{\mathcal{A}'}(h)$: new evidence may turn neutral scenarios into arguments or counter-arguments, or existing arguments or counter-arguments may be turned into conflicts. This is the mathematical reason for the non-monotonic behavior of our approach, as we will further see in the following subsection.

3.5. Degrees of support, possibility, and ignorance

The above definitions of conflicts, arguments, and counter-arguments are the key notions, on which the definitions of degrees of support and possibility in this subsection are based. Based on the general idea that every argument $s \in \mathrm{ARGS}(h)$ contributes to the possible truth of the hypothesis $h$ for a given knowledge base $\Phi$, we can measure the strength of such a contribution by the posterior probability $P'(\{s\}) = P(\{s\} \mid E_W)$, where $E_W$ plays the role of the evidence on which the prior probability $P$ is conditioned. To take all such contributions into account, we will now consider the same posterior probability $P'$ with respect to the whole set $\mathrm{ARGS}(h)$.

Definition 8. The degree of support of a hypothesis $h \in \mathcal{L}_V$ is the conditional probability

$$
\mathrm{dsp}_{\mathcal{A}}(h) = P'(\mathrm{ARGS}(h)) = P(\mathrm{ARGS}(h) \mid E_W) = \frac{P(\mathrm{ARGS}(h) \cap E_W)}{P(E_W)} = \frac{P(\mathrm{ARGS}(h))}{1 - P(\mathrm{CONFS})} = \frac{\sum \{P(\{s\}) : s \in \mathrm{ARGS}(h)\}}{1 - \sum \{P(\{s\}) : s \in \mathrm{CONFS}\}} \tag{9}
$$

of the event $\mathrm{ARGS}(h)$ given the evidence $E_W$.⁷ Note that degrees of support are undefined for $\Phi \equiv \bot$. Again, we will use the more convenient short form $\mathrm{dsp}(h)$ whenever possible.

Example 9. Consider again the judicial problem of Example 3 and its discussion in Example 6. From the given marginal probabilities and the independence assumption we obtain $P(\{s_0\}) = (1 - p_1)(1 - p_2)$, $P(\{s_1\}) = (1 - p_1) p_2$, $P(\{s_2\}) = p_1 (1 - p_2)$, and $P(\{s_3\}) = p_1 p_2$ by multiplication. This then leads to

$$\mathrm{dsp}(hyp) = \frac{P(\{s_3\})}{P(\{s_0, s_3\})} = \frac{p_1 p_2}{p_1 p_2 + (1 - p_1)(1 - p_2)},$$

which is exactly the same result as the one derived from the original formulation of the problem in [118]. Similarly, we obtain

$$\mathrm{dsp}(\neg hyp) = \frac{P(\{s_0\})}{P(\{s_0, s_3\})} = \frac{(1 - p_1)(1 - p_2)}{p_1 p_2 + (1 - p_1)(1 - p_2)},$$

for the negative hypothesis $\neg hyp$, which means that $\mathrm{dsp}(hyp)$ and $\mathrm{dsp}(\neg hyp)$ are additive in this particular situation. This is a consequence of the absence of neutral scenarios.

7 Using the alternative set of arguments $\mathrm{Args}(h)$ (see previous footnote), we could also define degrees of support by $\mathrm{dsp}_{\mathcal{A}}(h) = P'(\mathrm{Args}(h)) = \frac{P(\mathrm{Args}(h)) - P(\mathrm{Args}(\bot))}{1 - P(\mathrm{Args}(\bot))}$. Since both definitions are mathematically equivalent, the choice between $\mathrm{Args}(h)$ and $\mathrm{ARGS}(h)$ remains a matter of taste.

At this point, it is important to notice that $\mathrm{dsp}(h)$ is defined through an ordinary probability measure in the classical sense of Kolmogorov, for which the probabilities of complementary events $\mathrm{ARGS}(h)$ and $E_W \setminus \mathrm{ARGS}(h)$ add up to 1. Nevertheless, with respect to complementary hypotheses $h$ and $\neg h$, for which the sets $\mathrm{ARGS}(h)$ and $\mathrm{ARGS}(\neg h)$ are not necessarily complementary with respect to $E_W$ (as shown in Fig. 8), we obtain non-additive (or sub-additive) degrees of support, for which the inequality

$$\mathrm{dsp}(h) + \mathrm{dsp}(\neg h) \le 1 \tag{10}$$

holds. Degrees of support should therefore be understood as non-additive posterior probabilities of logical deducibility. Except for $\Phi \equiv \bot$, they are well-defined for all possible hypotheses $h \in \mathcal{L}_V$, that is even in cases in which the prior probability measure $P$ does not cover all variables. This increased flexibility is an important advantage over traditional probabilistic reasoning, which presupposes the existence of a complete probability measure over all variables.

Fig. 9. The opinion induced by degrees of support and possibility.

Instead of considering the degree of support of the complementary hypothesis $\neg h$, it is often more convenient to look at so-called degrees of possibility of $h$. They are usually defined in terms of degrees of support, namely by

$$\mathrm{dps}_{\mathcal{A}}(h) = 1 - \mathrm{dsp}_{\mathcal{A}}(\neg h). \tag{11}$$
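The definitions in (9) and (11) translate directly into a few lines of code for small Boolean systems. The following Python sketch is an illustration under stated assumptions: the predicate encoding, the function names `dsp` and `dps`, and the numerical credibilities $p_1 = 4/5$ and $p_2 = 3/5$ are choices made here and do not appear in the paper. Exact fractions are used so that the results can be compared with the closed form of Example 9 without rounding.

```python
from fractions import Fraction
from itertools import product

def dsp(V, W, phi, prob, h):
    """Degree of support, eq. (9): P(ARGS(h)) / (1 - P(CONFS))."""
    p_args, p_confs = Fraction(0), Fraction(0)
    all_states = [dict(zip(V, bits)) for bits in product((0, 1), repeat=len(V))]
    for s in product((0, 1), repeat=len(W)):
        # Conditioned evidence E_V|s: models of phi that agree with s on W.
        e = [x for x in all_states
             if tuple(x[v] for v in W) == s and all(f(x) for f in phi)]
        if not e:
            p_confs += prob(s)          # s is a conflict
        elif all(h(x) for x in e):
            p_args += prob(s)           # s is an argument for h
    # Raises ZeroDivisionError when every scenario is a conflict (dsp undefined).
    return p_args / (1 - p_confs)

def dps(V, W, phi, prob, h):
    """Degree of possibility, eq. (11): 1 - dsp(not h)."""
    return 1 - dsp(V, W, phi, prob, lambda x: not h(x))

# Example 2 (barbecue): fair coin, promise h -> b.
coin = lambda s: Fraction(1, 2)
promise = [lambda x: (not x["H"]) or bool(x["B"])]
b = lambda x: x["B"] == 1
print(dsp(["H", "B"], ["H"], promise, coin, b))   # 1/2
print(dps(["H", "B"], ["H"], promise, coin, b))   # 1

# Example 9 (two witnesses) with assumed credibilities p1 = 4/5, p2 = 3/5.
p1, p2 = Fraction(4, 5), Fraction(3, 5)
V = ["REL1", "REL2", "REP1", "REP2", "HYP"]
W = ["REL1", "REL2"]
PHI = [
    lambda x: bool(x["REL1"]) == (x["HYP"] == x["REP1"]),
    lambda x: bool(x["REL2"]) == (x["HYP"] == x["REP2"]),
    lambda x: x["REP1"] == 1,
    lambda x: x["REP2"] == 1,
]
witnesses = lambda s: (p1 if s[0] else 1 - p1) * (p2 if s[1] else 1 - p2)
hyp = lambda x: x["HYP"] == 1
print(dsp(V, W, PHI, witnesses, hyp))                 # 6/7
print(p1 * p2 / (p1 * p2 + (1 - p1) * (1 - p2)))      # 6/7
```

The barbecue example yields support 1/2 for the barbecue and 0 for its negation, a sub-additive pair in the sense of (10), whereas the witness example is additive because it has no neutral scenarios.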

Intuitively, the degree of possibility is thus a measure of the absence of counter-arguments. Together with the degree of support, a hypothesis is finally judged by a pair of values $\mathrm{dsp}(h)$ and $\mathrm{dps}(h)$, for which $\mathrm{dsp}(h) \le \mathrm{dps}(h)$ always holds. Or we may look at it as the corresponding interval $[\mathrm{dsp}(h), \mathrm{dps}(h)] \subseteq [0, 1]$ of length

$$\mathrm{dig}(h) = \mathrm{dps}(h) - \mathrm{dsp}(h), \tag{12}$$

which we call degree of ignorance with respect to $h$. Another view is to look at $\mathrm{dsp}(h)$ and $\mathrm{dps}(h)$ as the anchors of a unique point $\omega_h = (b, d, i)$ in the opinion triangle (see Section 2.2), with $b = \mathrm{dsp}(h)$, $d = 1 - \mathrm{dps}(h)$, and $i = \mathrm{dig}(h)$ as depicted in Fig. 9.

To conclude this subsection, let us again look at some evident mathematical properties, which occur in special situations like $\Phi \equiv \top$, $\Phi \equiv \bot$, $h \equiv \top$, or $h \equiv \bot$. Table 3 lists all immediate consequences of the properties listed in Table 2 (see previous subsection).

Table 3
Consequences of (10), (11), and (12) for various special situations

$$
\begin{array}{llccc}
 & & \mathrm{dsp}(h) & \mathrm{dps}(h) & \mathrm{dig}(h) \\
\hline
\Phi \equiv \bot & & \text{undefined} & \text{undefined} & \text{undefined} \\
\Phi \not\equiv \bot & \Phi \models h \ (\text{e.g. } h \equiv \top) & 1 & 1 & 0 \\
\Phi \not\equiv \bot & \Phi \models \neg h \ (\text{e.g. } h \equiv \bot) & 0 & 0 & 0 \\
\Phi \not\equiv \bot & \mathrm{Vars}(h) \subseteq W & x \in [0, 1] & x \in [0, 1] & 0 \\
\Phi \equiv \top & h \not\equiv \top,\ h \not\equiv \bot,\ \mathrm{Vars}(h) \cap W = \emptyset & 0 & 1 & 1
\end{array}
$$

Another immediate consequence occurs in the situation where a hypothesis $h$ logically entails another hypothesis $h'$. The entailed hypothesis then has at least the same degree of support and degree of possibility:

$$
h \models h' \quad \Rightarrow \quad
\begin{cases}
\mathrm{dsp}(h) \le \mathrm{dsp}(h'), \\
\mathrm{dps}(h) \le \mathrm{dps}(h').
\end{cases} \tag{13}
$$

Note that no such conclusions are possible for cases like $\Phi' \models \Phi$. In other words, degrees of support and possibility (and therefore degrees of ignorance) may change non-monotonically when new qualitative information or evidence arrives. Typically, confirming evidence increases the degree of support (by producing new arguments), but it may also increase the degree of possibility (by transforming counter-arguments into conflicts). Similarly, disconfirming

evidence decreases the degree of possibility in the first place (by producing new counter-arguments), but it may also decrease the degree of support (by transforming arguments into conflicts). With this, degrees of support satisfy all the properties suggested in Section 2.1 for rational degrees of belief: uniformity follows from the fact that for a given hypothesis $h$, $\mathrm{dsp}(h)$ only depends on the agent’s evidence $\mathcal{A}$, consistency is the property described in (13), non-additivity is expressed by the inequality in (10), and non-monotonicity holds for the reasons just explained.

3.6. Special cases: Logical and probabilistic reasoning

To complete the technical discussion about probabilistic argumentation, let us briefly investigate how the classical fields of logical and probabilistic reasoning fit into this general theory.

Logical reasoning. From the perspective of probabilistic argumentation, logical reasoning is characterized by $W = \emptyset$. This has a number of simple consequences. First, it implies that the set of possible scenarios $\Omega_W = \{s_0\}$ consists of a single element $s_0 = ()$ only, which represents the empty vector of values. This means that $P(\{s_0\}) = 1$ is the only possible prior probability and is thus implicitly given. Furthermore, we have $\Phi|s_0 = \Phi$, which allows us to simplify (7) into

$$
\mathrm{ARGS}(h) =
\begin{cases}
\{s_0\}, & \text{for } \bot \not\equiv \Phi \models h, \\
\emptyset, & \text{otherwise.}
\end{cases}
$$

If we assume $\Phi \not\equiv \bot$, we get $E_W = \{s_0\} = \Omega_W$ and thus $P'(\{s_0\}) = 1$. This implies

$$
\mathrm{dsp}(h) =
\begin{cases}
1, & \text{for } \Phi \models h, \\
0, & \text{otherwise.}
\end{cases}
$$

In other words, degrees of support play the role of an indicator function for the logical deducibility of $h$ from $\Phi$.

Probabilistic reasoning. Purely probabilistic models are characterized by $W = V$, which again has various simple consequences. The most obvious one is the fact that $\Omega_W = \Omega_V$, from which $E_W = E_V$ immediately follows. This allows us to write $E$ as a common placeholder for $E_W$ and $E_V$ and to simplify (7) into $\mathrm{ARGS}(h) = E \cap H$. From this simplification we obtain

$$\mathrm{dsp}(h) = \frac{P(E \cap H)}{P(E)} = P(H \mid E),$$

which is the usual way of defining posterior probabilities in the context of probabilistic reasoning.
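Both special cases can be reproduced with a small brute-force implementation of (9). In the following Python sketch, the two toy knowledge bases, the probabilities $P(a) = 1/2$ and $P(b) = 1/4$, and the function name `dsp` are assumptions made purely for illustration. With an empty set of probabilistic variables the degree of support behaves as an indicator of entailment, and with $W = V$ it coincides with the ordinary posterior probability.

```python
from fractions import Fraction
from itertools import product

def dsp(V, W, phi, prob, h):
    """Degree of support, eq. (9), by enumeration over Boolean scenarios."""
    p_args, p_confs = Fraction(0), Fraction(0)
    all_states = [dict(zip(V, bits)) for bits in product((0, 1), repeat=len(V))]
    for s in product((0, 1), repeat=len(W)):   # yields only () when W is empty
        e = [x for x in all_states
             if tuple(x[v] for v in W) == s and all(f(x) for f in phi)]
        if not e:
            p_confs += prob(s)
        elif all(h(x) for x in e):
            p_args += prob(s)
    return p_args / (1 - p_confs)

V = ["A", "B"]
implies = lambda x: (not x["A"]) or bool(x["B"])   # a -> b
a_or_b = lambda x: bool(x["A"] or x["B"])          # a or b

# Logical reasoning: W is empty, the single scenario () has probability 1.
certain = lambda s: Fraction(1)
print(dsp(V, [], [implies], certain, implies))                 # 1 (entailed)
print(dsp(V, [], [implies], certain, lambda x: x["B"] == 1))   # 0 (not entailed)

# Probabilistic reasoning: W = V with independent A and B (assumed values).
def joint(s):
    p_a, p_b = Fraction(1, 2), Fraction(1, 4)
    return (p_a if s[0] else 1 - p_a) * (p_b if s[1] else 1 - p_b)

is_a = lambda x: x["A"] == 1
support = dsp(V, V, [a_or_b], joint, is_a)

# Direct posterior P(H | E) for E = "a or b" and H = "a".
p_e = sum(joint(s) for s in product((0, 1), repeat=2) if s[0] or s[1])
p_eh = sum(joint(s) for s in product((0, 1), repeat=2) if s[0])
print(support, p_eh / p_e)   # 4/5 4/5
```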

Probabilistic argumentation is therefore a true generalization of the two classical types of logical and probabilistic reasoning. This is a remarkable conclusion, which lifts probabilistic argumentation from its original intention as a theory of argumentative reasoning up to a unified theory of logical and probabilistic reasoning. 3.7. Connection to Dempster–Shafer theory From a technical point of view, many concepts of probabilistic argumentation are closely connected to respective concepts in the Dempster–Shafer theory. This connection has been thoroughly discussed in [41], according to which every probabilistic argumentation system is expressible as a corresponding belief (or mass) function, and vice versa. More precisely, if A = (V , LV , , W, P ) denotes a given probabilistic argumentation system, we may take V as the frame of discernment and then define a mass function m : 2V → [0, 1] by m(A) = P {s} : s ∈ W , J|sK = A , for all A ⊆ V . For a hypothesis H = JhK ⊆ V , it can then be shown that the concepts of normalized belief and plausibility, 1 m(A) and Pl(H ) = 1 − Bel H c , Bel(H ) = 1 − m(∅) ∅ =A⊆H


correspond precisely to our notions of degree of support and possibility, dsp(h) and dps(h), respectively. To obtain exactly the same result in a different way, it is also possible to translate the principal components of $\mathcal{A}$ (the sentences in $\Phi$ and the probability measure $P$) individually into respective mass functions and then apply Dempster’s rule of combination (see [41] for details). Finally, we can express an arbitrary mass function as a probabilistic argumentation system and formulate Dempster’s combination rule as a particular form of merging two probabilistic argumentation systems.

Despite these technical similarities, the theories are still quite different from a conceptual point of view. For example, consider Dempster’s rule of combination, which is a crucial element of the Dempster–Shafer theory, but plays almost no role in the theory of probabilistic argumentation. Another difference is the fact that the notions of belief and plausibility in the Dempster–Shafer theory are often entirely detached from a probabilistic interpretation (especially in Smets’s TBM framework), whereas degrees of support and possibility are probabilities by definition. Finally, while the use of a logical language to express factual information is an intrinsic part of a probabilistic argumentation system, it is an almost unknown technique in the Dempster–Shafer theory. In other words, probabilistic argumentation demonstrates how to decorate the mathematical foundations of Dempster–Shafer theory with the expressiveness and convenience of a logical language.

4. Conclusion

This paper describes a formal theory of argumentative reasoning called probabilistic argumentation. The proposed reasoning process consists of a qualitative and a quantitative part. For the qualitative part, the paper offers precise formal definitions of logical arguments, counter-arguments, and conflicts, as well as a detailed account of the corresponding mathematical properties.
One of the most interesting aspects is the (conflict-driven) non-monotonic behavior of arguments and counter-arguments when new information accumulates over time. This shows that non-monotonicity, which seems to be incompatible with monotone logical systems at first sight, results naturally from the proposed argumentation-oriented approach.

The quantitative part of the reasoning process starts from the assumption of a given probability measure over some (but not necessarily all) variables involved. The idea then is to measure the weights of arguments and counter-arguments by respective probabilities. The total probabilistic weight of a set of arguments is called degree of support. This is the key concept of this paper, which provides a non-additive and non-monotonic measure of the available support for the hypothesis. The non-additivity is a consequence of the underlying logical structure, in which the sets of arguments and counter-arguments are not necessarily complementary. As is typical for a probabilistic system, non-monotonicity is obtained through Bayesian conditioning on the available evidence.

Given the mathematical properties of degrees of support, the paper suggests viewing probabilistic argumentation as a theory of rational degrees of belief, from which we may generate what some authors call an opinion. This is a simple but very instructive picture, which is useful when it comes to justifying the assumption of non-additive degrees of belief or the difference between uncertainty and ignorance.

One of the most remarkable consequences of this paper is the observation that probabilistic argumentation naturally includes the two classical approaches to automated reasoning (namely logical and probabilistic reasoning) as special cases. The parameter that makes them distinct is the number of probabilistic variables. Probabilistic argumentation is more general in the sense that it supports any number of probabilistic variables.
We can thus consider probabilistic argumentation as a new foundation for a unified theory of logical and probabilistic reasoning. As such, it may serve as a starting point for a generalized decision theory, in which the possibility of incomplete or missing information is taken into account to decide about further deliberation.

Acknowledgement

This research is supported by the Swiss National Science Foundation, Project No. PP002-102652/1, and The Leverhulme Trust. Thanks to Michael Wachter for helpful remarks and proof-reading.


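Appendix A. Proof of Theorem 7

In the following proof, we start from the definition of the set ARGS(h) and transform it step by step into the right-hand side of Theorem 7. With $S^{\uparrow V}$ we denote the vacuous extension of a subset $S \subseteq \Omega_W$ to $\Omega_V$.

$$
\begin{aligned}
\mathrm{ARGS}(h)
&= \{s \in E_W : \Phi|_s \models h\} \\
&= \{s \in E_W : E_{V|s} \subseteq H\} \\
&= E_W \setminus \{s \in E_W : E_{V|s} \not\subseteq H\} \\
&= E_W \setminus \{s \in E_W : E_{V|s} \cap H^c \neq \emptyset\} \\
&= E_W \setminus \{s \in E_W : \{x \in E_V : x^{\downarrow W} = s\} \cap H^c \neq \emptyset\} \\
&= E_W \setminus \{s \in E_W : E_V \cap \{s\}^{\uparrow V} \cap H^c \neq \emptyset\} \\
&= E_W \setminus \{s \in E_W : E_V \cap H^c \not\subseteq (\{s\}^{\uparrow V})^c\} \\
&= E_W \setminus \{s \in E_W : (E_V \cap H^c)^{\downarrow W} \not\subseteq ((\{s\}^{\uparrow V})^c)^{\downarrow W}\} \\
&= E_W \setminus \{s \in E_W : (E_V \cap H^c)^{\downarrow W} \not\subseteq \{s\}^c\} \\
&= E_W \setminus \{s \in E_W : (E_V \cap H^c)^{\downarrow W} \cap \{s\} \neq \emptyset\} \\
&= E_W \setminus (E_V \cap H^c)^{\downarrow W}. \qquad \Box
\end{aligned}
$$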
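The identity established above can also be checked mechanically on small examples. The following sketch is only an illustration under stated assumptions: variables are Boolean, the knowledge base and the hypothesis are Python predicates over a state, and $E_W$ is taken to be the projection of $E_V$ onto $W$. The function names and the example formulas are hypothetical.

```python
from itertools import product


def states(variables):
    """All Boolean assignments to `variables`, as tuples of (name, value) pairs."""
    return [tuple(zip(variables, values))
            for values in product([False, True], repeat=len(variables))]


def project(state, prob_vars):
    """Projection of a state of V onto the probabilistic variables W."""
    return tuple((v, val) for v, val in state if v in prob_vars)


def args_by_definition(variables, prob_vars, phi, h):
    """ARGS(h) = {s in E_W : every model of phi that extends s satisfies h}."""
    e_v = [x for x in states(variables) if phi(dict(x))]
    e_w = {project(x, prob_vars) for x in e_v}
    return {s for s in e_w
            if all(h(dict(x)) for x in e_v if project(x, prob_vars) == s)}


def args_by_theorem(variables, prob_vars, phi, h):
    """ARGS(h) = E_W minus the projection of (E_V intersected with H^c) onto W."""
    e_v = [x for x in states(variables) if phi(dict(x))]
    e_w = {project(x, prob_vars) for x in e_v}
    counter = {project(x, prob_vars) for x in e_v if not h(dict(x))}
    return e_w - counter


# Toy knowledge base: (a -> x) AND (b -> NOT x), with W = {a, b}.
V = ["a", "b", "x"]
W = ["a", "b"]
phi = lambda s: (not s["a"] or s["x"]) and (not s["b"] or not s["x"])
hypotheses = [
    lambda s: s["x"],
    lambda s: not s["x"],
    lambda s: s["a"] or s["x"],
]
results = [(args_by_definition(V, W, phi, h), args_by_theorem(V, W, phi, h))
           for h in hypotheses]
```

For the hypothesis x, both functions return the single scenario in which a is true and b is false; the scenario with a and b both true is a conflict and appears in neither set.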

References [1] E.W. Adams, A Primer of Probability Logic, CSLI Publications, Stanford, 1998. [2] L. Amgoud, C. Cayrol, A reasoning model based on the production of acceptable arguments, Annals of Mathematics and Artificial Intelligence 34 (1–3) (2002) 197–215. [3] L. Amgoud, C. Cayrol, Inferring from inconsistency in preference-based argumentation frameworks, Journal of Automated Reasoning 29 (2) (2002) 125–169. [4] B. Anrig, J. Kohlas, Model-based reliability and diagnostic: A common framework for reliability and diagnostics, International Journal of Intelligent Systems 18 (10) (2003) 1001–1033. [5] M. Beer, M. d’Inverno, N. Jennings, M. Luck, C. Preist, M. Schroeder, Argumentation and negotiation, Knowledge Engineering Review 14 (3) (1999) 285–289. [6] J. Bernoulli, Ars Conjectandi, Thurnisiorum, Basel, 1713. [7] P. Besnard, A. Hunter, A logic-based theory of deductive arguments, Artificial Intelligence 128 (1–2) (2001) 203–235. [8] D. Black, Theory of Committees and Elections, Cambridge University Press, Cambridge, USA, 1958. [9] P. Blackburn, M. de Rijke, Y. Venema, Modal Logic, Cambridge University Press, 2001. [10] A.J. Bonner, A logic for hypothetical reasoning, in: AAAI’88, 7th National Conference on Artificial Intelligence, Saint Paul, USA, 1988, pp. 480–484. [11] G. Boole, The Laws of Thought, Walton and Maberley, London, 1854. [12] R.E. Bryant, Graph-based algorithms for Boolean function manipulation, IEEE Transactions on Computers 35 (8) (1986) 677–691. [13] C. Cayrol, S. Doutre, J. Mengin, On decision problems related to the preferred semantics for argumentation frameworks, Journal of Logic and Computation 13 (3) (2003) 377–403. [14] M. Daniel, Algebraic structures related to Dempster–Shafer theory, in: B. Bouchon-Meunier, R.R. Yager, L.A. Zadeh (Eds.), IPMU’94, 5th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, in: LNCS, vol. 945, Springer, Paris, France, 1994, pp. 51–61. [15] A. 
Darwiche, P. Marquis, A knowledge compilation map, Journal of Artificial Intelligence Research 17 (2002) 229–264. [16] M. de Condorcet, Essai sur l’application de l’analyse à la probabilité des décisions rendues à la pluralité des voix, L’Imprimerie Royale, Paris, France, 1785. [17] J. de Kleer, A perspective on assumption-based truth maintenance, Artificial Intelligence 59 (1993) 63–67. [18] A.P. Dempster, Upper and lower probabilities induced by a multivalued mapping, Annals of Mathematical Statistics 38 (1967) 325–339. [19] A.P. Dempster, A generalization of Bayesian inference, Journal of the Royal Statistical Society 30 (1968) 205–247. [20] I. Douven, Decision theory and the rationality of further deliberation, Economics and Philosophy 18 (2) (2002) 303–328. [21] P.M. Dung, On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games, Artificial Intelligence 77 (2) (1995) 321–357. [22] F.Y. Edgeworth, Probability, in: H. Chisholm (Ed.), Encyclopædia Britannica, vol. 22, 11th ed., University of Cambridge, 1911, pp. 376–403. [23] M. Edman, Adding independent pieces of evidence, in: Modality, Morality and other Problems of Sense and Nonsense: Essays dedicated to Sören Halldén, CWK Gleerups, Lund, Sweden, 1973, pp. 180–191. [24] P.O. Ekelöf, Rättegång, sixth ed., Norstedts Juridik AB, Stockholm, Sweden, 1992. [25] D. Ellsberg, Risk, ambiguity, and the Savage axioms, Quarterly Journal of Economics 75 (1961) 643–669. [26] R. Fagin, J.Y. Halpern, Uncertainty, belief, and probability, Computational Intelligence 7 (3) (1991) 160–173. [27] R. Fagin, J.Y. Halpern, N. Megiddo, A logic for reasoning about probabilities, Information and Computation 87 (1/2) (1990) 78–128. [28] R.P. Feynman, R.B. Leighton, M. Sands, The Feynman Lectures on Physics, vol. I, second ed., Addison-Wesley, 1963. [29] J. 
Fox, Probability, logic and the cognitive foundations of rational belief, Journal of Applied Logic 1 (3–4) (2003) 197–224.


[30] P. Gärdenfors, B. Hansson, N.E. Sahlin (Eds.), Evidentiary Value: Philosophical, Judicial and Psychological Aspects of a Theory, CWK Gleerups, Lund, Sweden, 1983. [31] G. Gerla, Inferences in probability logic, Artificial Intelligence 70 (1–2) (1994) 33–52. [32] I. Gilboa, Expected utility with purely subjective non-additive probabilities, Journal of Mathematical Economics 16 (1987) 65–88. [33] M. Ginsberg, Non-monotonic reasoning using Dempster’s rule, in: R.J. Brachman (Ed.), AAAI’84, 4th National Conference on Artificial Intelligence, Austin, USA, 1984, pp. 112–119. [34] I. Hacking, The Emergence of Probability, Cambridge University Press, 1975. [35] R. Haenni, Cost-bounded argumentation, International Journal of Approximate Reasoning 26 (2) (2001) 101–127. [36] R. Haenni, Anytime argumentative and abductive reasoning, Soft Computing—A Fusion of Foundations, Methodologies and Applications 8 (2) (2003) 142–149. [37] R. Haenni, Using probabilistic argumentation for key validation in public-key cryptography, International Journal of Approximate Reasoning 38 (3) (2005) 355–376. [38] R. Haenni, S. Hartmann, Modeling partially reliable information sources: a general approach based on Dempster–Shafer theory, International Journal of Information Fusion 7 (4) (2006) 361–379. [39] R. Haenni, J. Kohlas, N. Lehmann, Probabilistic argumentation systems, in: D.M. Gabbay, P. Smets (Eds.), Handbook of Defeasible Reasoning and Uncertainty Management Systems, vol. 5: Algorithms for Uncertainty and Defeasible Reasoning, Kluwer Academic Publishers, Dordrecht, Netherlands, 2000, pp. 221–288. [40] R. Haenni, N. Lehmann, Building argumentation systems on set constraint logic, in: B. Bouchon-Meunier, R.R. Yager, L.A. Zadeh (Eds.), Information, Uncertainty and Fusion, Kluwer Academic Publishers, Dordrecht, Netherlands, 2000, pp. 393–406. [41] R. Haenni, N. 
Lehmann, Probabilistic argumentation systems: a new perspective on Dempster–Shafer theory, International Journal of Intelligent Systems 18 (1) (2003) 93–106 (Special Issue on the Dempster–Shafer Theory of Evidence). [42] R. Haenni, Towards a unifying theory of logical and probabilistic reasoning, in: F.B. Cozman, R. Nau, T. Seidenfeld (Eds.), ISIPTA’05, 4th International Symposium on Imprecise Probabilities and Their Applications, Pittsburgh, USA, 2005, pp. 193–202. [43] R. Haenni, Propositional argumentation systems and symbolic evidence theory, PhD thesis, University of Fribourg, Switzerland, 1996. [44] R. Haenni, B. Anrig, J. Kohlas, N. Lehmann, A survey on probabilistic argumentation, in: ECSQARU’01, 6th European Conference on Symbolic and Quantitative Approaches to Reasoning under Uncertainty, Workshop on Adventures in Argumentation, Toulouse, France, 2001, pp. 19–25. [45] R. Haenni, Ignoring ignorance is ignorant, Tech. rep., Center for Junior Research Fellows, University of Konstanz, Germany, 2003. [46] P. Hájek, T. Havránek, R. Jiroušek, Uncertain Information Processing in Expert Systems, CRC Press, Boca Raton, USA, 1992. [47] P. Hájek, J.J. Valdés, Generalized algebraic approach to uncertainty processing in rule-based expert systems (dempsteroids), Computers and Artificial Intelligence 10 (1991) 29–42. [48] J. Hawthorne, Degree-of-belief and degree-of-support: Why Bayesians need both notions, Mind 114 (454) (2005) 277–320. [49] C. Howson, Probability and logic, Journal of Applied Logic 1 (3–4) (2003) 151–165. [50] J. Jonczy, R. Haenni, Credential networks: a general model for distributed trust and authenticity management, in: A. Ghorbani, S. Marsh (Eds.), PST’05, 3rd Annual Conference on Privacy, Security and Trust, St. Andrews, Canada, 2005, pp. 101–112. [51] A. Jøsang, A logic for uncertain probabilities, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 9 (3) (2001) 279–311. [52] A. Jøsang, V.A. 
Bondi, Legal reasoning with subjective logic, Artificial Intelligence and Law 8 (4) (2001) 289–315. [53] A. Jøsang, Artificial reasoning with subjective logic, in: A.C. Nayak, M. Pagnucco (Eds.), 2nd Australian Workshop on Commonsense Reasoning, Perth, Australia, 1997. [54] A. Kean, G.K. Tsiknis, Assumption-based reasoning and clause management systems, Computational Intelligence 8 (1992) 1–24. [55] D. Kelsey, J. Quiggin, Theories of choice under ignorance and uncertainty, Journal of Economic Surveys 6 (2) (1992) 133–153. [56] J. Kohlas, Information Algebras: Generic Structures for Inference, Springer, London, 2003. [57] J. Kohlas, Probabilistic argumentation systems: A new way to combine logic with probability, Journal of Applied Logic 1 (3–4) (2003) 225–253. [58] J. Kohlas, Reliability of arguments, in: E. von Collani (Ed.), Defining the Science of Stochastics, in: Sigma Series in Stochastics, vol. 1, Heldermann, Lemgo, Germany, 2004, pp. 73–94. [59] J. Kohlas, Uncertain information: Random variables in graded semilattices, International Journal of Approximate Reasoning 46 (1) (2007) 17–34. [60] J. Kohlas, B. Anrig, R. Haenni, P.A. Monney, Model-based diagnostics and probabilistic assumption-based reasoning, Artificial Intelligence 104 (1998) 71–106. [61] J. Kohlas, P.A. Monney, Propagating belief functions through constraint systems, International Journal of Approximate Reasoning 5 (1991) 433–461. [62] J. Kohlas, P.A. Monney, A Mathematical Theory of Hints—An Approach to the Dempster–Shafer Theory of Evidence, Lecture Notes in Economics and Mathematical Systems, vol. 425, Springer, 1995. [63] R. Kohlas, Decentralized trust evaluation and public-key authentication, PhD thesis, University of Bern, Switzerland, 2007. [64] J. Kohlas, P.A. Monney, Probabilistic assumption-based reasoning, in: D. Heckerman, A. Mamdani (Eds.), UAI’93, 9th Conference on Uncertainty in Artificial Intelligence, Washington, USA, 1993, pp. 485–491. [65] J. Kohlas, P.A. 
Monney, An algebraic theory for statistical information based on the theory of hints, International Journal of Approximate Reasoning. [66] B.O. Koopman, The axioms and algebra of intuitive probability, Annals of Mathematics 41 (1940) 269–292. [67] S. Kraus, K. Sycara, A. Evenchik, Reaching agreements through argumentation: A logical model and implementation, Artificial Intelligence 104 (1–2) (1998) 1–69.


[68] H.E. Kyburg, Higher order probabilities and intervals, International Journal of Approximate Reasoning 2 (1988) 195–209. [69] J.H. Lambert, Neues Organon oder Gedanken über die Erforschung und Bezeichnung des Wahren und dessen Unterscheidung vom Irrtum und Schein, Johann Wendler, Leipzig, 1764. [70] P.S. Laplace, Théorie Analytique des Probabilités, third ed., Courcier, Paris, 1820. [71] K.B. Laskey, P.E. Lehner, Assumptions, beliefs and probabilities, Artificial Intelligence 41 (1) (1989) 65–77. [72] F. Lin, An argument-based approach to nonmonotonic reasoning, Computational Intelligence 9 (3) (1993) 254–267. [73] C. List, On the significance of the absolute margin, British Journal for the Philosophy of Science 55 (3) (2004) 521–544. [74] C. List, R.E. Goodin, Epistemic democracy: Generalizing the Condorcet jury theorem, Journal of Political Philosophy 9 (3) (2001) 277–306. [75] N. Maudet, S. Parsons, I. Rahwan (Eds.), ArgMAS’06, 3rd International Workshop on Argumentation in Multi-Agent Systems, Springer, Hakodate, Japan, 2006. [76] P.A. Monney, M. Chan, Modelling dependence in Dempster–Shafer theory, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 15 (1) (2007) 93–114. [77] H.T. Nguyen, E.A. Walker, On decision making using belief functions, in: R.R. Yager, J. Kacprzyk, M. Fedrizzi (Eds.), Advances in the Dempster–Shafer Theory of Evidence, John Wiley and Sons, New York, USA, 1994, pp. 311–330. [78] N.J. Nilsson, Probabilistic logic, Artificial Intelligence 28 (1) (1986) 71–87. [79] W. Older, A. Vellino, Constraint arithmetic on real intervals, in: F. Benhamou, A. Colmerauer (Eds.), Constraint Logic Programming: Selected Research, MIT Press, 1993, pp. 175–196. [80] G. Paass, Probabilistic logic, in: P. Smets, E.H. Mamdani, D. Dubois, H. Prade (Eds.), Non-Standard Logics for Automated Reasoning, Academic Press, London, 1988, pp. 213–251. [81] S. Parsons, C. Sierra, N.
Jennings, Agents that reason and negotiate by arguing, Journal of Logic and Computation 8 (3) (1998) 261–292. [82] S. Parsons, N. Maudet, P. Moraitis, I. Rahwan (Eds.), ArgMAS’05, 2nd International Workshop on Argumentation in Multi-Agent Systems, LNCS, vol. 4049, Springer, Utrecht, Netherlands, 2005. [83] J. Pearl, Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann, San Mateo, USA, 1988. [84] J. Pearl, Reasoning with belief functions: An analysis of compatibility, International Journal of Approximate Reasoning 4 (5–6) (1990) 363–389. [85] K. Pearson, The History of Statistics in the 17th and 18th Centuries against the Changing Background of Intellectual, Scientific and Religious Thought: Lectures by Karl Pearson given at University College (London) during the Academic Sessions 1921–1933, Lubrecht & Cramer, London, UK, 1978. [86] J. Picard, J. Savoy, Using probabilistic argumentation systems to search and classify web sites, IEEE Data Engineering Bulletin 24 (3) (2001) 33–41. [87] J. Picard, J. Savoy, Enhancing retrieval with hyperlinks: A general model based on propositional argumentation systems, Journal of the American Society for Information Science and Technology 54 (4) (2003) 347–355. [88] J. Picard, Probabilistic argumentation systems for information retrieval, PhD thesis, University of Neuchatel, 2000. [89] H. Prakken, G. Sartor, A dialectical model of assessing conflicting arguments in legal reasoning, Artificial Intelligence and Law 4 (1996) 331–368. [90] H. Prakken, G. Vreeswijk, Logical systems for defeasible argumentation, in: D. Gabbay, F. Guenther (Eds.), Handbook of Philosophical Logic, vol. D7: Semi Classical Logics 7, second ed., Kluwer Academic Publishers, 2001, pp. 219–318. [91] I. Rahwan, P. Moraitis, C. Reed (Eds.), ArgMAS’04, 1st International Workshop on Argumentation in Multi-Agent Systems, LNCS, vol. 3366, Springer, New York, USA, 2004. [92] E.H. Ruspini, J. Lowrance, T. 
Strat, Understanding evidential reasoning, International Journal of Approximate Reasoning 6 (3) (1992) 401–424. [93] E.H. Ruspini, The logical foundations of evidential reasoning, Tech. Rep. 408, SRI International, AI Center, Menlo Park, USA, 1986. [94] N.E. Sahlin, W. Rabinowicz, The evidentiary value model, in: D.M. Gabbay, P. Smets (Eds.), Handbook of Defeasible Reasoning and Uncertainty Management Systems, vol. 1: Quantified Representation of Uncertainty and Imprecision, Kluwer Academic Publishers, Dordrecht, Netherlands, 1998, pp. 247–266. [95] R.H. Sarin, P.P. Wakker, A simple axiomatization of nonadditive expected utility, Econometrica 60 (1992) 1255–1272. [96] D. Schmeidler, Subjective probability and expected utility without additivity, Econometrica 57 (3) (1989) 571–587. [97] D.A. Schum, Probability and the process of discovery, proof, and choice, Boston University Law Review 66 (3–4) (1986) 825–876. [98] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, 1976. [99] G. Shafer, Non-additive probabilities in the work of Bernoulli and Lambert, Archive for History of Exact Sciences 19 (1978) 309–370. [100] G. Shafer, Perspectives on the theory and practice of belief functions, International Journal of Approximate Reasoning 4 (5–6) (1990) 323–362. [101] G. Shafer, The early development of mathematical probability, in: I. Grattan-Guinness (Ed.), Companion Encyclopedia of the History and Philosophy of the Mathematical Sciences, Routledge, London, UK, 1993, pp. 1293–1302. [102] G. Shafer, The significance of Jacob Bernoulli’s Ars Conjectandi for the philosophy of probability today, Journal of Econometrics 75 (1996) 15–32. [103] P. Smets, Probability of provability and belief functions, Journal de la Logique et Analyse 133 (1991) 177–195. [104] P. Smets, R. Kennes, The transferable belief model, Artificial Intelligence 66 (1994) 191–234. [105] C.A.B.
Smith, Consistency in statistical inference and decision, Journal of the Royal Statistical Society 23 (1961) 31–37. [106] M. Smithson, Ignorance and Uncertainty, Springer, 1988. [107] B. Verheij, Arguments and defeat in argument-based nonmonotonic reasoning, in: EPIA’95, 7th Portuguese Conference on Artificial Intelligence, Madeira, Portugal, 1995, pp. 213–224.


[108] G. Vreeswijk, Defeasible dialectics: A controversy-oriented approach towards defeasible argumentation, Journal of Logic and Computation 3 (1993) 317–334. [109] M. Wachter, R. Haenni, Propositional DAGs: a new graph-based language for representing Boolean functions, in: P. Doherty, J. Mylopoulos, C. Welty (Eds.), KR’06, 10th International Conference on Principles of Knowledge Representation and Reasoning, AAAI Press, Lake District, UK, 2006, pp. 277–285. [110] M. Wachter, R. Haenni, Logical compilation of Bayesian networks with discrete variables, in: K. Mellouli (Ed.), ECSQARU’07, 9th European Conference on Symbolic and Quantitative Approaches to Reasoning under Uncertainty, LNAI, vol. 4724, Hammamet, Tunisia, 2007, pp. 536–547. [111] M. Wachter, R. Haenni, Logical compilation of Bayesian networks, Tech. Rep. iam-06-006, University of Bern, Switzerland, 2006. [112] P. Walley, Statistical Reasoning with Imprecise Probabilities, Monographs on Statistics and Applied Probability, vol. 42, Chapman and Hall, London, UK, 1991. [113] P. Walley, T.L. Fine, Towards a frequentist theory of upper and lower probability, Annals of Statistics 10 (1982) 741–761. [114] R. Williams, Algorithms for quantified Boolean formulas, in: SODA’02, 13th Annual ACM–SIAM Symposium on Discrete Algorithms, San Francisco, USA, 2002, pp. 299–307. [115] J. Williamson, Probability logic, in: D. Gabbay, R. Johnson, H.J. Ohlbach, J. Woods (Eds.), Handbook of the Logic of Argument and Inference: the Turn Toward the Practical, Elsevier, Amsterdam, 2002, pp. 397–424. [116] J. Williamson, Objective Bayesian nets, in: S.N. Artëmov, H. Barringer, A.S. d’Avila Garcez, L.C. Lamb, J. Woods (Eds.), We Will Show Them!, in: Essays in Honour of Dov Gabbay, vol. 22, College Publications, 2005, pp. 713–730. [117] N. Wilson, Uncertain linear constraints, in: R. López de Mántaras, L. Saitta (Eds.), ECAI’04: 16th European Conference on Artificial Intelligence, Valencia, Spain, 2004, pp. 231–235. 
[118] Anonymous Author, A calculation of the credibility of human testimony, Philosophical Transactions of the Royal Society 21 (1699) 359–365.

Probabilistic argumentation Rolf Haenni Bern University of Applied Sciences, Department of Engineering and Information Technology, CH-2501 Biel, Switzerland Received 29 June 2007; received in revised form 8 November 2007; accepted 20 November 2007 Available online 9 January 2008

Abstract

Argumentation in the sense of a process of logical reasoning is a very intuitive and general methodology of establishing conclusions from defeasible premises. The core of any argumentative process is the systematic elaboration, exhibition, and weighting of possible arguments and counter-arguments. This paper presents the formal theory of probabilistic argumentation, which is conceived to deal with uncertain premises for which respective probabilities are known. With respect to possible arguments and counter-arguments of a hypothesis, this leads to probabilistic weights in the first place, and finally to an overall probabilistic judgment of the uncertain proposition in question. The resulting probabilistic measure is called degree of support and possesses the desired properties of non-monotonicity and non-additivity. Reasoning according to the proposed formalism is a simple and natural generalization of the two classical forms of probabilistic and logical reasoning, in which the two traditional questions of the probability and the logical deducibility of a hypothesis are replaced by the more general question of the probability of a hypothesis being logically deducible from the available knowledge base. From this perspective, probabilistic argumentation also contributes to the emerging area of probabilistic logics.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Argumentation; Reasoning; Logic; Probability theory; Uncertainty; Degrees of belief; Probabilistic logic; Dempster–Shafer theory

1. Introduction

Traditionally, argumentation has been a topic of interdisciplinary interest in areas like philosophy, psychology, communication science, linguistics, law, economics, sociology, or other non-technical sciences. With the emergence of autonomous intelligent agents and multi-agent systems, where the aim is to make a computer decide and act in place of a human being, argumentation has also become an important research topic in computer science, particularly within the AI community [75,81,82,91].

The meaning of the English word argumentation is twofold. Many AI-oriented argumentation theories are focused on argumentation as a methodical process of logical reasoning, in which conclusions are drawn from credible reasons, see e.g. [2,3,7,13,21,72,90,107]. The dialectical meaning of argumentation, namely as a debate, negotiation, or discussion, in which reasons are advanced for and against some controversial proposition or proposal, is another primary concern of the AI view of argumentation [5,67,89,108], as most approaches are driven by the general idea of putting


forward the pros and cons of the proposition in question. This is also the case in the theory proposed in this paper, but apart from that, it is primarily an argumentative theory of logical and probabilistic reasoning.

Ideally, the construction of logical reasons for a conclusion is based on facts or generally accepted premises or principles. But in the real world, in which we need to deal with unreliable, incomplete, or even contradictory information, this seems not to be a very typical situation. An adequate logical theory of argumentation should therefore be conceived to deal with assumptions, uncertain premises, or defeasible statements, from which more or less credible logical arguments are constructed.

In the proposed theory of probabilistic argumentation, the credibility (or the weight) of logical arguments is measured by probabilities. For this, we assume the uncertainty of possible premises for defeasible arguments to be adequately representable by a joint probability function. For maximal simplicity, we often suppose the premises to be stochastically independent, in which case it is sufficient to know individual marginal probabilities, but this is not a general restriction. One can think of uncertain premises as atomic sources of uncertainty, which are not further decomposable, and it depends on the concrete problem or model whether they are independent or not. Such premises will be represented by means of so-called probabilistic variables. Restricting the possible values of a probabilistic variable forms an assumption, and a combination of such assumptions, called scenario, forms a possible logical argument. This type of argumentative reasoning (or argumentation-oriented reasoning), i.e. the process of constructing arguments from uncertain assumptions, is sometimes called assumption-based or hypothetical reasoning [10,17,54,62,64,71].

1.1. Motivation and general ideas

The ultimate goal of our approach is to quantitatively judge the truth or falsity of an uncertain proposition by an argumentative inference process. The qualitative part of the available information or evidence, on which this judgment is based, is called knowledge base, and the proposition in question is called hypothesis. Both the knowledge base and the hypothesis are supposed to be expressible by logical sentences in an appropriate logical language. The examples given in this paper are purely propositional, but the theory is applicable to many more general logics. Most formal concepts and definitions in this paper are not restricted to a specific logic.

For a given hypothesis, the first step is to construct logical arguments and counter-arguments. Each argument (counter-argument) delivers some sort of logical proof for the truth (falsity) of the hypothesis in question, in the sense that the hypothesis (complementary hypothesis) becomes a logical consequence of the knowledge base once the argument is added to it. Note that a combination of an argument with a counter-argument should obviously lead to a logical contradiction. The theory of probabilistic argumentation proposes a precise solution for how to deal with such conflicts. This is an important component of the theory, which is responsible for its non-monotonic character. Determining all possible arguments, counter-arguments, and conflicts is the logical or qualitative part of the proposed evaluation process. It is evident that inference methods to accomplish this step depend strongly on the chosen logical language.

Once the qualitative part of the evaluation is completed, the question is how to derive respective probabilistic weights. As a general solution, the theory of probabilistic argumentation proposes a non-additive measure called degree of support and its dual counterpart called degree of possibility.
They measure the credibility of the hypothesis by the total probabilities that it is supported by arguments and that it is not defeated by counter-arguments, respectively. Degrees of support are sometimes called probabilities of provability (or probabilities of deducibility) [26,83], which do not necessarily sum up to 1 for a pair of complementary hypotheses. In [84], Pearl worries about the usefulness of such probabilities of provability: Why should we concern ourselves with the probability that the evidence implies h, rather than the probability that h is true, given the evidence? Of course, we would prefer having the latter, which is some sort of an ideal situation, but the imperfection of the available information does not always provide such an ideal situation. Moreover, by considering probabilities of events of the form “the hypothesis is deducible” rather than “the hypothesis is true”, we are not leaving the classical field of additive probabilities in the sense of Kolmogorov’s axioms. Computing such degrees of support and possibility is the probabilistic or quantitative part of the evaluation process. The entire evaluation process with the connections


Fig. 1. The conceptual connections between assumptions and arguments, arguments and hypotheses (solid arcs), counter-arguments and hypotheses (dotted arcs), hypotheses and degrees of support (solid arrows), and hypotheses and degrees of possibility (dotted arrows).

between the above-mentioned concepts of assumptions, arguments and counter-arguments, hypotheses, and degrees of support and possibility is depicted in Fig. 1. As probabilistic argumentation is primarily conceived to deal with the uncertainty of possible arguments, one can also look at it as a theory of non-monotonic reasoning under uncertainty. In fact, probabilistic argumentation turns out to be a very general way of combining the two classical types of probabilistic and logical reasoning, in which the two traditional questions of the probability and the logical deducibility of a hypothesis are replaced by the more general question of the probability of a hypothesis being logically deducible from the available knowledge base [42]. Our theory is therefore a contribution to the broad area of attempts to combine logic and probability [1,27,29,31,49,78,80,115]. Since probabilities are not directly incorporated into the logical language, we propose an instance of an external probabilistic logic, but at the same time we depart from most other probabilistic logics by not considering sets of probabilities. The enrichment and further illumination of this connection is a principal goal of this paper.

Possible application areas of probabilistic argumentation exist in domains where classical logics or probability theory alone are unable to entirely capture the characteristics of reasoning with partial information. Some existing application areas are information retrieval [86–88], the authentication of public keys [37,50,63], model-based diagnostics and reliability analysis [4,60], Bayesian networks [110,111], statistics [65], and information fusion in the presence of unreliable sources [38]. The theory may also serve as a starting point for a more general decision theory, in which the amount and relevance of the available information are taken into account [45,77,106].¹ The non-additivity of the proposed measure, i.e.
the gap between respective degrees of support and possibility, naturally delivers such a degree of informativeness or, conversely, a degree of ignorance relative to the hypothesis in question. The study of decision theories with the option of further deliberation is a current research topic in the philosophy of economics [20,55,96].

¹ It’s like in real life: people do not like decisions under ignorance (they prefer betting on events they know about). This psychological phenomenon is called ambiguity aversion and has been experimentally demonstrated by Ellsberg [25]. His observations are rephrased in Ellsberg’s paradox, which is often used as an argument against decision-making on the basis of subjective (additive) probabilities.

1.2. Introductory example

In the main part of this paper, we will present the theory of probabilistic argumentation in a very general and formally rigorous form. To further expose the underlying basic ideas and technical concepts, we do not need this generality and rigor here. Instead, let’s play with a simple toy example, in which the knowledge base, denoted by Φ, is the following set of propositional sentences:

Φ = {a1 ∧ a2 → x, a3 → y, x ∨ y → z, a4 → ¬z}.

Some of the propositions involved, namely a1, a2, a3, and a4, are supposed to represent probabilistically independent uncertain events. These are the probabilistic variables in our model, which will later serve as building blocks for the construction of arguments and counter-arguments. The assumption of probabilistic independence is not a necessary restriction in our theory, but if we suppose that their marginal probabilities are known, e.g.

P(a1) = 0.7, P(a2) = 0.2, P(a3) = 0.5, P(a4) = 0.1,

it allows us to easily compute the probability of each single configuration in the corresponding 4-dimensional Boolean space by multiplication. For the configuration s3 = (1, 1, 0, 0), for example, which sets a1 and a2 to true and a3 and a4 to false, we get

P({s3}) = P(a1 ∧ a2 ∧ ¬a3 ∧ ¬a4) = 0.7 · 0.2 · (1 − 0.5) · (1 − 0.1) = 0.063.

Such a configuration, which assigns a truth value to each probabilistic variable, is called a scenario, and each truth value assignment of a scenario is called an assumption.

Let us now look at the consequences of the possible scenarios. For this, we assume a particular scenario si to be the true configuration, the one that represents the state of the real world. This assumption allows us to instantiate each occurrence of a probabilistic variable in Φ according to its value in si. The resulting set of sentences is called the conditional knowledge base Φ|si, and we can use it to see whether the given hypothesis is entailed or not. Table 1 lists all 2^4 = 16 scenarios s0, …, s15 of our example and indicates by ⊨ or ⊭ whether the hypotheses z and ¬z are logical consequences of the respective conditional knowledge bases Φ|si or not.

Table 1. All 16 scenarios of the introductory example together with their conclusions with respect to z and ¬z.

      s0   s1   s2   s3   s4   s5   s6   s7   s8   s9   s10  s11  s12  s13  s14  s15
a1    0    1    0    1    0    1    0    1    0    1    0    1    0    1    0    1
a2    0    0    1    1    0    0    1    1    0    0    1    1    0    0    1    1
a3    0    0    0    0    1    1    1    1    0    0    0    0    1    1    1    1
a4    0    0    0    0    0    0    0    0    1    1    1    1    1    1    1    1
z     ⊭    ⊭    ⊭    ⊨    ⊨    ⊨    ⊨    ⊨    ⊭    ⊭    ⊭    ⊨    ⊨    ⊨    ⊨    ⊨
¬z    ⊭    ⊭    ⊭    ⊭    ⊭    ⊭    ⊭    ⊭    ⊨    ⊨    ⊨    ⊨    ⊨    ⊨    ⊨    ⊨

Arguments: s3–s7. Counter-arguments: s8–s10. Conflicts: s11–s15.

In the case of s3 = (1, 1, 0, 0), for example, we get Φ|s3 = {x, x ∨ y → z}, of which z (unlike ¬z) is obviously a logical consequence. In other words, s3 supports the truth of z and is therefore a logical argument for z. Its probabilistic weight is P({s3}) = 0.063. An example of a counter-argument of z (i.e. of an argument for ¬z) is scenario s8 = (0, 0, 0, 1). It leads to Φ|s8 = {x ∨ y → z, ¬z}, of which ¬z is an immediate logical consequence. In case a scenario supports both the hypothesis and its complement, we call it a conflict. In our example, s11 = (1, 1, 0, 1) with Φ|s11 = {x, x ∨ y → z, ¬z} is such a conflict. Conflicts arise naturally, especially when the available background knowledge is complemented with actual observations.
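The scenario analysis of this example is small enough to be reproduced by brute force. The following Python sketch is only an illustration under stated assumptions: the function names (`scenario`, `probability`, `knowledge_base`, `entails`, `weight`) are hypothetical, entailment is checked by enumerating all truth values of x, y, and z rather than by any of the inference methods of the theory, and the last assignments anticipate the degrees of support and possibility that the text introduces next.

```python
from itertools import product

# Marginal probabilities P(a1), ..., P(a4) as given in the example.
P_TRUE = (0.7, 0.2, 0.5, 0.1)


def scenario(i):
    """Scenario s_i in the ordering of Table 1 (a1 is the lowest bit of i)."""
    return tuple(bool((i >> k) & 1) for k in range(4))


def probability(s):
    """P({s}) under the independence assumption: a product of marginals."""
    p = 1.0
    for value, p_true in zip(s, P_TRUE):
        p *= p_true if value else 1.0 - p_true
    return p


def knowledge_base(s, x, y, z):
    """Truth value of the four sentences for one complete assignment."""
    a1, a2, a3, a4 = s
    return ((not (a1 and a2) or x)       # a1 ∧ a2 → x
            and (not a3 or y)            # a3 → y
            and (not (x or y) or z)      # x ∨ y → z
            and (not a4 or not z))       # a4 → ¬z


def entails(s, z_value):
    """Does the knowledge base conditioned on s force z to equal z_value?

    An unsatisfiable conditional knowledge base entails everything,
    which is exactly the conflict case.
    """
    return all(z == z_value
               for x, y, z in product((False, True), repeat=3)
               if knowledge_base(s, x, y, z))


args_z, args_not_z, conflicts = [], [], []
for i in range(16):
    s = scenario(i)
    supports_z, supports_not_z = entails(s, True), entails(s, False)
    if supports_z and supports_not_z:
        conflicts.append(i)
    elif supports_z:
        args_z.append(i)
    elif supports_not_z:
        args_not_z.append(i)


def weight(indices):
    """Total probability of a set of scenarios given by their indices."""
    return sum(probability(scenario(i)) for i in indices)


# Conditioning on the absence of conflicts (quantitative part).
p_no_conflict = 1.0 - weight(conflicts)
dsp_z = weight(args_z) / p_no_conflict
dsp_not_z = weight(args_not_z) / p_no_conflict
dps_z = 1.0 - dsp_not_z

print(args_z, args_not_z, conflicts)
print(round(dsp_z, 3), round(dsp_not_z, 3), round(dps_z, 3))
```

Run as is, the sketch yields the index sets of Table 1 and the rounded values 0.544, 0.046, and 0.954. Scenarios s0, s1, and s2 end up in none of the three lists, and the conflicts are the scenarios s11 to s15.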
As conflicts are incompatible with the given knowledge base, we will not consider them as proper arguments or counter-arguments. Note that the complement of the set of all conflicts can be viewed as the available evidence with respect to the probabilistic variables. With respect to the hypothesis z, we formally write ARGS(z) = {s3, …, s7}, ARGS(¬z) = {s8, s9, s10}, and CONFS = {s11, …, s15} to denote the sets of arguments, counter-arguments, and conflicts, respectively. This decomposition of the 4-dimensional Boolean space is the result of the qualitative part of the evaluation. Note that in real-world sized problems, we do not require the elements of these sets to be listed explicitly. We will rather use appropriate logical representations such as DNFs (disjunctive normal forms), BDDs (binary decision diagrams), NNFs (negation normal forms), or PDAGs (propositional directed acyclic graphs) [12,15,109]. DNF representations are the core of the computational theory as presented in [39], but we do not make such a commitment in this paper.

To get a quantitative judgment of the hypothesis z, we first consider the conditional probability P(ARGS(z) | CONFS^c) of the event ARGS(z) given that the true scenario is not conflicting. This conditional probability is what we call the degree of support of z. For the particular values of our example, we obtain the following result:

\[
\begin{aligned}
\mathrm{dsp}(z) &= P(\mathrm{ARGS}(z) \mid \mathrm{CONFS}^c)
 = \frac{P(\mathrm{ARGS}(z) \cap \mathrm{CONFS}^c)}{P(\mathrm{CONFS}^c)}
 = \frac{P(\mathrm{ARGS}(z))}{1 - P(\mathrm{CONFS})} \\
&= \frac{P(\{s_3, s_4, s_5, s_6, s_7\})}{1 - P(\{s_{11}, s_{12}, s_{13}, s_{14}, s_{15}\})}
 = \frac{P(\{s_3\}) + P(\{s_4\}) + P(\{s_5\}) + P(\{s_6\}) + P(\{s_7\})}{1 - [P(\{s_{11}\}) + P(\{s_{12}\}) + P(\{s_{13}\}) + P(\{s_{14}\}) + P(\{s_{15}\})]} \\
&= \frac{0.513}{0.943} = 0.544.
\end{aligned}
\]

In an analogous way, we can compute dsp(¬z) = 0.046 for the degree of support of ¬z. From this, we obtain z’s degree of possibility by dps(z) = 1 − dsp(¬z) = 0.954. This is the second relevant measure when it comes to quantitatively judging the truth of the hypothesis z. The resulting values dsp(z) = 0.544 and dps(z) = 0.954 indicate the presence of some non-negligible arguments for z and the almost perfect absence of corresponding counter-arguments. In other words, we have some good reasons to accept, but almost no reason to reject z. We can furthermore measure the amount of available information by the difference dig(z) = dps(z) − dsp(z) = 0.41, our degree of ignorance with respect to z. As mentioned earlier, being aware of the degree of ignorance can be useful for decision making.

1.3. Goals and overview

The general purpose of this paper is to further develop, expose, and promote the theory of probabilistic argumentation. Another goal is to illuminate the connections to other theories of reasoning with uncertain information, particularly to the classical fields of logical and probabilistic reasoning. By surpassing previous publications on probabilistic argumentation in length, depth, and generality, we intend the paper to become the theory’s main and richest reference. And by not being limited to Boolean logic and through a more rigorous mathematical treatment, the paper also differs significantly from the current main reference [39].
With respect to the information theoretical perspective proposed in [57,59], in which uncertainty is represented by random variables with values in information algebras [56], and where reasoning is embedded in the corresponding algebraic structure, our paper is technically less complex and should therefore be accessible to a broader audience. In Section 2, we start by defending the position of non-additive degrees of belief and the resulting formal concept of an opinion with respect to an open question. This is a very general discussion, which we use to motivate and justify some of the major characteristics of our theory. Section 2 also includes an account on the theory’s historical root and its connection to more recent developments. Section 3, which is the main part of this paper, provides a general exposition of the theory’s main mathematical concepts and properties. We will also show how logical and probabilistic reasoning are included as special cases and say a few words on the connection to Dempster–Shafer theory. The paper ends in Section 4 with some concluding remarks. 2. Degrees of belief and opinions Argumentation as a process of logical reasoning is an entrenched part of the human intellect and therefore a very natural and general methodology for establishing conclusions on the basis of incomplete information. As such, argumentation plays a crucial role in the formation of rational beliefs and opinions, which themselves are the basis for rational decisions. In this respect, any formal theory of argumentation should also be conceived as (or at least be linked to) a theory of rational belief. As any theory of rationality must at some stage address questions like “What is the best way to represent an agent’s rational belief state?” [116] or “What laws should rational degrees of belief obey?”, we do not refrain from doing this here. We start from the position that belief is primarily quantitative and not categorical, i.e. 
we generally assume the existence of various degrees of belief. This corresponds to the observation that most human beings experience belief as a matter of degree. We follow the usual convention that such degrees of belief are values in the [0, 1]-interval, including the two extreme cases 0 for “no belief ” and 1 for “full belief ”, where any intermediate value represents its own level of certitude, thus allowing the quantification of statements like “I strongly believe . . . ” or “I can hardly believe . . . ” over the full range of the spectrum. Degrees of belief depend at least on two factors, the epistemic state of the person or agent who holds the belief, and the proposition or statement under consideration. With epistemic state we mainly refer to the information that


is available to the agent at a particular point in time t. If we denote the available information by Φ_t or simply Φ, we may refer to an agent’s degree of belief with respect to a hypothesis h by Bel_Φ(h) ∈ [0, 1]. Furthermore, if ¬h represents the complementary hypothesis of h, then Bel_Φ(¬h) is called degree of disbelief of h. Typically, we expect Φ to include sentences of a formal logical language or the specification of a probability function, but we do not further specify this at this point. Whenever no confusion is possible, we will abbreviate Bel_Φ(h) by Bel(h). The process of building up degrees of belief and disbelief from the available information is what we mean by reasoning.

2.1. Desired properties of degrees of belief

Our goal here is to point out some of the properties most people would expect to encounter in a reasonable theory of rational degrees of belief. The properties of the following (non-exhaustive) list are the most important ones:

Uniformity. If two agents possess exactly the same amount of relevant information with respect to a hypothesis h, they should have equal degrees of belief. In other words, degrees of belief are supposed to primarily depend on the available information Φ, rather than on an agent’s individual preferences or bias. This is what we express in our notation Bel_Φ(h).

Consistency. If a hypothesis h logically entails another hypothesis h′, then Bel(h) is expected to be smaller than or equal to Bel(h′). Formally, this means that h ⊨ h′ implies Bel(h) ≤ Bel(h′), where ⊨ denotes logical entailment. Furthermore, we expect Bel(⊥) = 0 for the inherently false hypothesis ⊥ (the one that entails any other hypothesis) and Bel(⊤) = 1 for the inherently true hypothesis ⊤ (the one that is entailed by any other hypothesis).

Non-Monotonicity. This property tells us that the expansion of the available information with some new information may result in lower, equal, or higher degrees of belief.
Accordingly, the new information is called disconfirming, neutral, or confirming, respectively. Formally, non-monotonicity means that Bel_t(h) and Bel_t′(h), the agent’s degrees of belief at two different points in time t and t′ (with t′ later than t), are in no pre-determined relationship. Consequently, somebody’s belief in a particular hypothesis may repeatedly go up and down when evidence accumulates over time (see Fig. 2).

Non-Additivity. More controversial is the question whether degrees of belief should be additive or not. In this paper, we will assume the non-additivity property to hold, which states that degrees of belief of complementary hypotheses h and ¬h do not necessarily add up to 1, or more generally, that the degrees of belief of two exclusive hypotheses h1 and h2 do not necessarily add up to the degree of belief of their disjunction h1 ∨ h2. Formally, non-additivity with respect to degrees of belief means Bel(h) + Bel(¬h) ≤ 1, which is a special case of Bel(h1) + Bel(h2) ≤ Bel(h1 ∨ h2). Note that non-additivity is a direct consequence of assuming Bel_∅(h) = 0 for all hypotheses h ≢ ⊤. This can be justified by arguing that the extreme case of total ignorance, represented by Φ = ∅, should not allow degrees of belief different from zero, except for h ≡ ⊤. This reflects a very cautious and skeptical attitude, according to which nothing is believed without being supported by evidence, and it implies Bel_∅(h) + Bel_∅(¬h) = 0 for all h ≢ ⊥, ⊤, a particular case of non-additive degrees of belief.

Fig. 2 illustrates the non-monotonic behavior of non-additive degrees of belief, if the knowledge base Φ_t accumulates evidence over time. Non-additive degrees of belief are also appealing as a proper way of distinguishing between aleatory uncertainty (or simply uncertainty) and epistemic uncertainty (or ignorance). Many different theories of uncertain reasoning are

Fig. 2. Non-monotone and non-additive degrees of belief when new evidence accumulates over time.


Fig. 3. The opinion triangle with its three dimensions: belief, disbelief, and ignorance.


Fig. 4. Special types of opinions.

motivated by this, e.g. the Dempster–Shafer theory [19,98], the Theory of Hints [62], the Transferable Belief Model [104], the Evidentiary Value Model [94], and many others (see Section 2.4).

Non-additivity is also a crucial property of classical logic. At first sight, since logic is usually not concerned with numbers, this is not very apparent. But if we consider logical entailment ⊨, we may often encounter cases of Φ ⊭ h and Φ ⊭ ¬h, which correspond to Bel_Φ(h) = Bel_Φ(¬h) = 0, the above-mentioned case of total ignorance with respect to h. From the perspective of a non-additive belief measure, additivity appears as an ideal case in which degrees of belief coincide with long-run frequencies or subjective probabilities. There are numerous practical examples where the available evidence is such that this ideal case actually occurs, but this is certainly not always the case.

2.2. Opinions

Belief in general and degrees of belief in particular are undoubtedly closely connected to an agent’s opinion. In a formal setting, opinions have been introduced by Jøsang as the fundamental concept of what he calls Subjective Logic [51–53]. Essentially the same concept has been studied before under different names, first by Ginsberg [33] and later by Hájek et al. [46,47] and Daniel [14]. In all cases, the starting point is a non-additive measure of belief. Here we follow Jøsang’s original definition in [53], according to which an opinion with respect to a hypothesis h is a triple

ω_h = (b, d, i),    (1)

where b = Bel(h) is the agent’s degree of belief in the hypothesis h, d = Bel(¬h) the degree of disbelief in h, and i = 1 − (b + d) the so-called degree of ignorance relative to h.² Notice that b, d, and i sum up to one, i.e. any pair of values implies the third one. Opinions should therefore be regarded as a two-dimensional rather than three-dimensional concept. For illustrative purposes, it may be useful to represent the set of all possible opinions as shown in Fig. 3 by a 2-simplex, the so-called opinion triangle. Jøsang put forward this picture in [51–53]. To represent a ternary probabilistic space, Dempster used a similar picture in one of his most influential papers almost thirty years earlier [19]. Dempster did not give it a particular name, but referred to it as barycentric coordinates, which is the correct mathematical term.

The formal concept of an opinion is very intuitive and helpful when it comes to explaining or justifying the non-additivity assumption. As shown in Fig. 4, it nicely covers various types of epistemic states, including the extreme cases of full belief by (1, 0, 0), full disbelief by (0, 1, 0), and full ignorance by (0, 0, 1). Other particular opinions are the ones on the edges of the triangle. Notice that Bayesian opinions of the form (p, 1 − p, 0), which are unambiguously characterized by a single (additive) probability value p, do not allow the agent to have “no opinion” or to say “I don’t know”. This is why we postulate non-additivity as a fundamental requirement for a general quantitative model of belief.

² In [51], Jøsang calls i degree of uncertainty rather than degree of ignorance (which is a bit misleading), and opinions are defined as quadruples (b, d, i, a) with an additional component a, the so-called relative atomicity (we do not need this here).
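As a small illustration of the triple in (1), the following Python sketch stores only b and d and derives i from them, in line with the remark that an opinion is a two-dimensional concept. The class name `Opinion`, the tolerance `TOL`, and the method `is_bayesian` are hypothetical choices made for this sketch. The example values 0.544 and 0.046 are the degrees of support of z and ¬z from Section 1.2, read here as degrees of belief and disbelief.

```python
from dataclasses import dataclass

TOL = 1e-9  # tolerance for floating-point comparisons (an arbitrary choice)


@dataclass(frozen=True)
class Opinion:
    """Hypothetical container for an opinion (b, d, i) about a hypothesis h."""
    b: float  # degree of belief, Bel(h)
    d: float  # degree of disbelief, Bel(¬h)

    def __post_init__(self):
        # Non-additivity allows b + d < 1, but never b + d > 1.
        if self.b < 0 or self.d < 0 or self.b + self.d > 1 + TOL:
            raise ValueError("need b >= 0, d >= 0 and b + d <= 1")

    @property
    def i(self):
        """Degree of ignorance, i = 1 - (b + d)."""
        return 1.0 - (self.b + self.d)

    def is_bayesian(self):
        """True for additive opinions of the form (p, 1 - p, 0)."""
        return abs(self.i) <= TOL


example = Opinion(b=0.544, d=0.046)      # values from Section 1.2
full_ignorance = Opinion(b=0.0, d=0.0)   # the opinion (0, 0, 1)
bayesian = Opinion(b=0.25, d=0.75)       # the opinion (0.25, 0.75, 0)

print(round(example.i, 3), full_ignorance.i, bayesian.is_bayesian())
```

Deriving i instead of storing it keeps the three components consistent by construction. The example opinion has a degree of ignorance of about 0.41, which matches dig(z) in the introductory example.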


2.3. Historical roots of non-additive degrees of belief

As Shafer and later Kohlas pointed out [58,99,101,102], first examples of the two-dimensional (non-additive) view of degrees of belief can be found in the literature of the late seventeenth and early eighteenth centuries, well before Bayesian ideas were developed. Historically, non-additive degrees of belief were mostly motivated by judicial applications, such as the reliability of witnesses in the courtroom, or more generally by the credibility of testimonies on past events or miracles.

The first two combination rules for testimonies were published in 1699 in an anonymous article [118].³ One of them considers two independent witnesses with respective credibilities (frequencies of saying the truth) p1 and p2. If we suppose that they deliver the same report, they are either both telling the truth with probability p1 p2 or they are both lying with probability (1 − p1)(1 − p2). Every other configuration is impossible. The original formulation of the main statement is the following:

The ratio of truth saying cases to the total number of cases,

\[
\frac{p_1 p_2}{p_1 p_2 + (1 - p_1)(1 - p_2)}, \qquad (2)
\]

will represent the probability of both testifiers asserting the truth.⁴

Translated into our terminology, it means that if both witnesses report the truth of a hypothesis h, then the value for Bel(h) is given by the expression in (2). In Section 3, we will show how to obtain the same result from a probabilistic argumentation system. The corresponding formula for n independent witnesses of equal credibility p,

\[
\frac{p^n}{p^n + (1 - p)^n}, \qquad (3)
\]

has been mentioned in [70] by Laplace (1749–1827) and is closely related to the Condorcet Jury Theorem discussed in social choice theory [8,16,73,74]. Notice that both probabilities in (2) and (3) sum up to 1 with respect to the two possibilities h and ¬h.
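The two formulae are easy to evaluate numerically. The Python sketch below is a plain transcription of (2) and (3), not the derivation from a probabilistic argumentation system announced for Section 3. The function names and the credibilities 0.8 and 0.7 are hypothetical values chosen for illustration only.

```python
def agreeing_witnesses(p1, p2):
    """Formula (2): two independent witnesses who deliver the same report.

    Undefined (division by zero) if agreement is impossible,
    e.g. for p1 = 1 and p2 = 0.
    """
    both_truthful = p1 * p2
    both_lying = (1 - p1) * (1 - p2)
    return both_truthful / (both_truthful + both_lying)


def n_agreeing_witnesses(p, n):
    """Formula (3): n independent agreeing witnesses of equal credibility p."""
    return p**n / (p**n + (1 - p)**n)


# Illustrative credibilities (not taken from the paper).
value_h = agreeing_witnesses(0.8, 0.7)
# The complementary case: both witnesses are lying, so ¬h holds.
value_not_h = agreeing_witnesses(1 - 0.8, 1 - 0.7)

print(round(value_h, 4), round(value_not_h, 4))
print(round(n_agreeing_witnesses(0.8, 2), 4))
```

For two witnesses of equal credibility, (2) and (3) coincide. In both formulae, the values obtained for h and ¬h add up to 1.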
It thus seems that they are classical additive probabilities, but since they do not depend on a prior probability with respect to h, they raise the controversial question of whether these formulae are proper posterior probabilities in a Bayesian sense. George Boole (1815–1864) gives a similar formula that includes a prior distribution [11], but (2) and (3) still appear to be reasonable results. The connection between Laplace’s and Boole’s formulae has been studied in [38], in which both expressions drop out as special cases of a more general model of partially reliable information sources. This general model is also applicable to situations of contradictory testimonies. It presupposes non-additive degrees of belief, but Laplace’s and Boole’s formulae themselves remain additive. However, the fact that Laplace’s formula does not require a prior probability for h turns out to be the consequence of approaching the problem from the perspective of non-additive degrees of belief. Another important historical contribution, in which the connection to non-additive degrees of belief is more obvious, can be found in the fourth part of Jakob Bernoulli’s (1654–1705) famous Ars Conjectandi (the art of conjecture) [6]. He distinguishes between necessary and contingent (uncertain) statements: A proposition is called necessary, relative to our knowledge, when its contrary is incompatible with what we know. It is contingent, if it is not entailed by what we know. With respect to the question of whether the hypothesis h is implied by the given evidence, Bernoulli analyses four possible situations: (a) the evidence is necessary and implies h necessarily; (b) the evidence is contingent, but implies h necessarily; (c) the evidence is necessary, but implies h only contingently; (d) the evidence is contingent and implies h only contingently. In (c) and (d), a further distinction is made between pure and mixed arguments. 
In the mixed case, it is assumed that if the evidence does not imply h, it implies ¬h, whereas nothing is said about ¬h in the pure case. Bernoulli then considers the number of cases in which the evidence occurs and in which h (or ¬h) is entailed. Finally, the corresponding ratios with respect to the total number of cases turn out to be non-additive in (b) and in the pure versions of (c) and (d). Bernoulli also discusses the problem of combining several testimonies. Essentially, his combination rules are special cases of what is known today as Dempster’s rule of combination (see next subsection). In the mixed version of (c), the results of the combination coincide with Laplace’s formula, again without requiring a prior probability for h. Laplace’s analysis is thus included in Bernoulli’s analysis, but the connection to non-additive cases is now more obvious.

Even more general is Johann Heinrich Lambert’s (1728–1777) discussion in [69]. From Lambert’s perspective, Bernoulli’s pure and mixed arguments are special cases of a more general situation, in which a syllogism (logical argument) has three parts, the affirmative, the negative, and the indeterminate. There is a number attached to each of these parts, all three of them summing up to 1. This is exactly what we call today an opinion ω_h = (b, d, i). In this sense, Bernoulli’s distinction between pure and mixed arguments is a restriction to positive and Bayesian opinions, respectively, but Lambert’s discussion covers the general case. A more comprehensive summary of Bernoulli’s, Lambert’s, and Laplace’s work with corresponding links to the modern view is given in [58,99]. Notice that these very old ideas, until they were rediscovered by Dempster, Hacking, and Shafer at the end of the 20th century [18,34,98], were completely eliminated from mainstream probability over almost three full centuries.

³ There is some disagreement about the authorship of this article. Shafer names the English cleric George Hooper (1640–1727) [100], but for Pearson, the true author is the English statistician Edmund Halley (1656–1742) [85]. Another possible author is the Scottish mathematician John Craig (1663–1731).

⁴ In its substance, this statement was considered important enough to be included in Francis Edgeworth’s (1845–1926) article on probability in the 11th edition of the Encyclopædia Britannica [22].

2.4. Connections to more recent developments

The idea of defining non-additive degrees of belief or opinions on the basis of probabilities of provability has been the motivation of many other approaches.
To the best of our knowledge, the term probability of provability (or probability of necessity) was first used by Pearl [83] and later by Laskey et al. in [71] and Smets in [103] in their discussions about the connection between Bayesian probabilities and belief functions. Ruspini proposed a similar view, but he prefers to talk about epistemic probabilities P(Kh) and P(K¬h) of the epistemic states Kh (h is known) and K¬h (¬h is known), respectively [92,93]. A similar view is discussed in [14,46,47], where pairs (b, d) are called Dempster pairs. The notion of a Dempster pair is obviously linked to the Dempster–Shafer theory (DST) [19,98], which is also known as the theory of belief functions or the Theory of Evidence. This theory also yields a pair of values for belief and disbelief, but the degree of disbelief is usually expressed by the degree of plausibility Pl(h) = 1 − Bel(¬h). This implies Bel(h) ≤ Pl(h) for all possible hypotheses h and thus defines a partition of the unit interval [0, 1] into three blocks. The same type of belief/plausibility pairs is used in the related Transferable Belief Model (TBM) [104] and the Theory of Hints [62]. Notice that the spirit behind such belief/plausibility pairs is very similar to the modal operators □ (necessity) and ◇ (possibility) in modal logic [9].

Another two-dimensional representation of belief results from using the principle of indifference to transform belief/plausibility pairs into additive probabilities. In the TBM framework, this transformation is called pignistic transformation [104], and its result is called betting probability BetP(h). In the simple case of two complementary hypotheses h and ¬h, BetP(h) is simply the arithmetic mean of Bel(h) and Pl(h).
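For two complementary hypotheses, the pignistic transformation can be spelled out in a few lines: the ignorance 1 − Bel(h) − Bel(¬h) is split evenly between h and ¬h, so that BetP(h) = Bel(h) + i/2, which equals the arithmetic mean of Bel(h) and Pl(h). The Python sketch below covers only this binary case and assumes that no belief is assigned to the contradiction. The function names are hypothetical, and the input values are the degrees of support from Section 1.2.

```python
def plausibility(bel_not_h):
    """Pl(h) = 1 - Bel(¬h)."""
    return 1.0 - bel_not_h


def betting_probability(bel_h, bel_not_h):
    """BetP(h) for the binary case {h, ¬h}.

    The ignorance i = 1 - (Bel(h) + Bel(¬h)) is shared equally
    between h and ¬h (principle of indifference).
    """
    ignorance = 1.0 - (bel_h + bel_not_h)
    return bel_h + ignorance / 2.0


bel_h, bel_not_h = 0.544, 0.046           # values from Section 1.2
bet_h = betting_probability(bel_h, bel_not_h)
bet_not_h = betting_probability(bel_not_h, bel_h)
mean_h = (bel_h + plausibility(bel_not_h)) / 2.0

print(round(bet_h, 3), round(bet_not_h, 3), round(mean_h, 3))
```

With these inputs, BetP(z) is about 0.749 and BetP(¬z) about 0.251. Unlike the degrees of belief themselves, the two betting probabilities are additive.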
BetP(h) together with the degree of ignorance is an alternative pair, which reflects precisely the view of some moderate Bayesians, who admit that in addition to the (additive) degree of belief one should also consider the strength of the belief, which depends on the amount of available supporting evidence. In [48], Hawthorne describes this point in the following way: I contend that Bayesians need two distinct notions of probability. We need the usual degree-of-belief notion that is central to the Bayesian account of rational decision. But Bayesians also need a separate notion of probability that represents the degree to which evidence supports hypotheses. In Hawthorne’s sense, additional supporting evidence can function in two different ways: it may increase either the degree of belief or the strength of belief (or both). This point is the central idea of what Schum [97] calls the Scandinavian School of Evidentiary Value [23,30], another non-additive approach to degrees of belief. It is known today as the Evidentiary Value Model (EVM) [94] and originates from the work of the Swedish lawyer Ekelöf in the early 1960ies [24].


Instead of talking about non-additive degrees of belief, some authors prefer to call them non-additive probabilities [32,95,96]. They are often understood as bounds of probability intervals, which are induced by sets of compatible probability functions [66,68,105,113]. Such bounds are often called lower and upper probabilities, which is similar to what Dempster originally had in mind [18]. Today, the common general term for this particular class of approaches is imprecise probabilities [112]. Imprecise or non-additive probabilities have also been in use in physics for a long time, where the role of the non-additivity is to describe the deviation of elementary particles in mechanical wave-like behavior [28]. In this paper, in order to avoid unnecessary confusion, we prefer to make a strict distinction between additive probabilities (in the classical sense) and non-additive degrees of belief or degrees of support. The theory of probabilistic argumentation demonstrates how to use the former to obtain the latter. 3. Probabilistic argumentation Let us now build up a formal theory of probabilistic argumentative reasoning, which adopts at its core the characteristics of degrees of belief and opinions as suggested in the previous section. For this, we require the qualitative (or logical) part of the available information to be expressed by a set of well-formed sentences of a logical language, which we call knowledge base. The logical language itself is supposed to possess a well-defined model-theoretic semantics, in which a proper entailment relation | is defined in terms of set inclusion of models (or interpretations) in some underlying universe. The simplest non-trivial case is the language of propositional logic, where the universe is a multi-dimensional Boolean space over a set of propositional variables. The application of this particular case to probabilistic argumentation has been extensively discussed in the literature [39,43,44]. 
Other simple languages are obtained from finite set constraints [40], interval constraints [79], or general multivariate constraints such as (linear) equations and/or inequalities [61,117]. In this paper, we do not restrict ourselves to a particular language, but we assume the universe to be a multi-dimensional space Ω_V generated by a set of variables V. The corresponding logical language will be denoted by L_V, of which Φ is a proper subset.

To represent the quantitative (or probabilistic) part of the given information, we suppose the existence of a fully specified probability measure P over a sample space Ω_W, which itself is generated by a subset W ⊆ V of so-called probabilistic variables. To keep the mathematics in our discussion as simple as possible, we restrict Ω_W to be discrete and finite, but most concepts and definitions are extendable to the continuous case. Otherwise, we do not make further assumptions regarding the specification of the probability measure P. The simplest and most efficient specification results from assuming the variables in W to be mutually independent (as in the example in Section 1.2), but other more flexible and powerful techniques such as Bayesian or Markov networks are applicable as well [76]. Note that we do not impose any particular interpretation of probability.

In this section, we start with a general discussion of the conceptual differences between logical and probabilistic reasoning. These are the two main mathematical tools to construct a probabilistic theory of logical arguments. Our discussion will give us a better understanding of the key connections, which will then be generalized into a combined (argumentation-oriented) theory of logical and probabilistic reasoning.

3.1. Connecting logic and probability

Logic and probability theory both have a long history in science.
They are mainly rooted in philosophy and mathematics, but are nowadays important tools in many other fields such as computer science and particularly in Artificial Intelligence. Some philosophers studied the connection between logical and probabilistic reasoning, and a great number of attempts to combine these disciplines have been made, but logic and probability theory are still widely perceived to be separate theories. This is quite surprising, since both disciplines are driven by the same goal, namely to evaluate hypotheses through a formal process of reasoning.

One of the key conceptual points, which separates probability theory and logic, is the following: classical probabilistic reasoning presupposes the existence of a probability measure over all variables, whereas pure logical reasoning does not deal with probabilities at all. In other words, logic presupposes a probability measure over none of the variables involved. If we call the variables involved in the probability measure probabilistic, we can say that a probabilistic model consists of probabilistic variables only, whereas all variables of a logical model are non-probabilistic. From this point of view, one of the main differences between logical and probabilistic reasoning is the number of probabilistic variables. This simple observation turns out to be crucial for understanding many similarities and differences between logical and probabilistic reasoning.^5

Fig. 5. The connection between probabilistic argumentation and the classical fields of logical and probabilistic reasoning through different sets of probabilistic variables.

3.2. Probabilistic argumentation systems

With the above remarks in mind, building a more general theory of reasoning is almost straightforward. The simple idea is to allow an arbitrary number of probabilistic variables. More formally, if the available information is represented over a set of variables $V$, we suppose to have a subset $W \subseteq V$ of probabilistic variables, all with finite domains. If $\Omega_X$ denotes the domain of a single variable $X \in V$ and $\Omega_W$ the corresponding Cartesian product with respect to all variables in $W$, we can consider a probability space $(\Omega_W, 2^{\Omega_W}, P)$, where $2^{\Omega_W}$ denotes the power set of $\Omega_W$ and $P : 2^{\Omega_W} \to [0, 1]$ a corresponding probability measure that satisfies the Kolmogorov axioms. The finiteness assumption with regard to $\Omega_W$ is not a conceptual restriction of this theory, but it allows us to define $P$ with respect to the $\sigma$-algebra $2^{\Omega_W}$ and thus helps to keep the mathematics simple. For the general case of arbitrary sets of probabilistic variables, let us now define the mathematical structure of a probabilistic argumentation system, into which the available information needs to be compiled.

Definition 1. A probabilistic argumentation system is a quintuple

$$\mathcal{A} = (V, \mathcal{L}_V, \Phi, W, P), \tag{4}$$

whose components $V$, $\mathcal{L}_V$, $\Phi$, $W$, and $P$ are as defined above.

We will later see precisely how the general theory of probabilistic argumentation degenerates into the classical fields of logical and probabilistic reasoning for $W = \emptyset$ and $W = V$, respectively, but the general idea of this connection is already depicted in Fig. 5.

Example 2. To illustrate the concept of a probabilistic argumentation system, consider the simple story in which our friend Alice flips a fair coin and promises to invite us to a barbecue tomorrow night provided that the coin lands on heads. Alice is well known to always keep her promises, but she does not say anything about what she is doing in case the coin lands on tails, i.e. she may or may not organize the barbecue in that case. Of course, we would like to know whether the barbecue takes place or not. How can this knowledge be expressed in terms of a probabilistic argumentation system? The given evidence consists of two pieces: the first one is Alice's reliable promise, and the second one is the fact that the two possible outcomes of tossing a fair coin are known to be equally likely. Thus the evidence is best modeled with two Boolean variables, say $H$ (for heads) and $B$ (for barbecue), with domains $\Omega_H = \Omega_B = \{0, 1\}$, a (uniform) probability function over $\Omega_H$, and a propositional sentence $h \to b$ (with $h$ and $b$ as placeholders for the atomic events $H = 1$ and $B = 1$, respectively). We have thus $V = \{H, B\}$, $\Phi = \{h \to b\}$, $W = \{H\}$, and $P(h) = P(\neg h) = 0.5$. Altogether we get a probabilistic argumentation system $\mathcal{A} = (V, \mathcal{L}_V, \Phi, W, P)$ as defined above, where $\mathcal{L}_V$ is the language of propositional logic.

^5 The literature on combining logic and probability is huge, e.g. see [1,27,29,31,49,78,80,115], but the idea of distinguishing between probabilistic and non-probabilistic variables seems to be relatively new and unexplored.

Example 3. Another simple example is the judicial problem of two independent witnesses who deliver the same report (see historical notes in Section 2.3). Let $p_1$ and $p_2$ be their respective credibilities (frequencies of saying the truth). To model this situation as a probabilistic argumentation system, consider five Boolean variables $\mathit{REL}_1$ (reliability of Witness 1), $\mathit{REL}_2$ (reliability of Witness 2), $\mathit{REP}_1$ (report of Witness 1), $\mathit{REP}_2$ (report of Witness 2), and $\mathit{HYP}$ (the hypothesis in question). Again, we use propositions $\mathit{rel}_i$, $\mathit{rep}_i$, and $\mathit{hyp}$ as placeholders for $\mathit{REL}_i = 1$, $\mathit{REP}_i = 1$, and $\mathit{HYP} = 1$, respectively. We have thus $V = \{\mathit{REL}_1, \mathit{REL}_2, \mathit{REP}_1, \mathit{REP}_2, \mathit{HYP}\}$ and $W = \{\mathit{REL}_1, \mathit{REL}_2\}$ with respective marginal probabilities $P(\mathit{rel}_1) = p_1$ and $P(\mathit{rel}_2) = p_2$. The required probability measure $P$ then follows from the independence assumption. To model the logical constraints of this example, suppose that a reliable witness knows the true state of the hypothesis and thus delivers a positive (negative) report whenever the hypothesis is true (false). Unreliable witnesses are supposed to act the other way around. We can thus consider propositional sentences $\mathit{rel}_i \to (\mathit{hyp} \leftrightarrow \mathit{rep}_i)$ and $\neg \mathit{rel}_i \to (\mathit{hyp} \leftrightarrow \neg \mathit{rep}_i)$, which can be merged into a single sentence $\mathit{rel}_i \leftrightarrow (\mathit{hyp} \leftrightarrow \mathit{rep}_i)$. If we suppose that both witnesses deliver a positive report, we get

$$\Phi = \{\mathit{rel}_1 \leftrightarrow (\mathit{hyp} \leftrightarrow \mathit{rep}_1),\ \mathit{rel}_2 \leftrightarrow (\mathit{hyp} \leftrightarrow \mathit{rep}_2),\ \mathit{rep}_1,\ \mathit{rep}_2\}$$

for our knowledge base, which is logically equivalent to $\Phi = \{\mathit{rel}_1 \leftrightarrow \mathit{hyp},\ \mathit{rel}_2 \leftrightarrow \mathit{hyp},\ \mathit{rep}_1,\ \mathit{rep}_2\}$. All elements together constitute a probabilistic argumentation system $\mathcal{A} = (V, \mathcal{L}_V, \Phi, W, P)$, where $\mathcal{L}_V$ is again the language of propositional logic.

3.3. Conflicts

For a given probabilistic argumentation system $\mathcal{A} = (V, \mathcal{L}_V, \Phi, W, P)$, the problem to solve is to judge whether a hypothesis, expressed by an additional sentence $h \in \mathcal{L}_V$, is true or false.
For this, we denote the set of models of $\Phi$ (the elements of the universe $\Omega_V$ for which $\Phi$ is true) by $E_V = \llbracket \Phi \rrbracket \subseteq \Omega_V$, and thus assume the true state of the world to be exactly one of its elements. Think of $E_V$ as the available evidence with regard to all variables $V$. Furthermore, let $H = \llbracket h \rrbracket \subseteq \Omega_V$ denote the set of models of the hypothesis $h \in \mathcal{L}_V$, i.e. $h$ is true if and only if the true state of the world is in $H$. In the following, we will use the sets $E_V$ and $H$ interchangeably with $\Phi$ and $h$, respectively.

The main formal definitions in this section are based on two key observations. The first one is the fact that the evidence $E_V \subseteq \Omega_V$, which restricts the set of possible states relative to $V$, also restricts the possible states relative to $W$. We call the elements $s \in \Omega_W$ scenarios, and by projecting $E_V$ from $\Omega_V$ to $\Omega_W$, as illustrated in Fig. 6, we get the set $E_W = E_V^{\downarrow W} \subseteq \Omega_W$ of scenarios which are consistent with the knowledge base $\Phi$. This means that exactly one element of $E_W$ corresponds to the true state of the world, and it implies that the scenarios of the complementary set, $\Omega_W \setminus E_W$, are all incompatible with $\Phi$. In other words, they represent states which are in conflict with the available knowledge base, written as $\Phi|s \models \bot$.

Fig. 6. Projecting the knowledge base from $\Omega_V$ to $\Omega_W$.

Definition 4. If $\Phi|s \models \bot$ holds for a scenario $s \in \Omega_W$, then $s$ is called a conflict of $\Phi$. The set of all such conflicts is denoted by

$$\mathrm{CONFS}_{\mathcal{A}} = \{s \in \Omega_W : \Phi|s \models \bot\} = \Omega_W \setminus E_W. \tag{5}$$

In the above definition, $\Phi|s$ represents the conditional knowledge base given $s$, which we obtain from $\Phi$ by replacing all occurrences of probabilistic variables by their respective values in $s$ (see Section 1.2 for an example). Note that conflicts may arise very naturally, especially when the available information is complemented with actual observations or facts from the real world. In the following, we will often use $\mathrm{CONFS}$ as a short form for $\mathrm{CONFS}_{\mathcal{A}}$.

3.4. Arguments and counter-arguments

The second observation goes in the other direction, that is from $W$ to $V$. Let us assume that a certain scenario $s \in \Omega_W$ is the true scenario. With respect to $V$, this particular situation reduces the set of possible states from $E_V$ to

$$E_V|s = \{x \in E_V : x^{\downarrow W} = s\} = \llbracket \Phi|s \rrbracket, \tag{6}$$

where $x^{\downarrow W}$ denotes the projection of a state $x$ from $\Omega_V$ to $\Omega_W$. This restricted set of states corresponds to the models of $\Phi|s$ and thus contains all the elements of $E_V$ that are compatible with $s$. This idea is illustrated in Fig. 7 for four different scenarios $s_0$, $s_1$, $s_2$, and $s_3$. Note that $s \in E_W$ implies $E_V|s \neq \emptyset$ (respectively $\Phi|s \not\models \bot$) and vice versa.

Fig. 7. Evidence conditioned on various scenarios.

Consider now a consistent scenario $s \in E_W$ for which $E_V|s \subseteq H$ (respectively $\Phi|s \models h$) holds. This means that $h$ is a logical consequence of $s$ and $\Phi$, and $s$ can thus be seen as a defeasible or hypothetical proof for $h$ in the light of $\Phi$. We must say defeasible, because it is uncertain whether $s$ is the true scenario or not. In other words, $h$ is only supported by $s$, but not entirely proven, and every supporting scenario is thus a defeasible logical argument for $h$.

Definition 5. If $\Phi|s \models h$ holds for a consistent scenario $s \in E_W$ and a hypothesis $h \in \mathcal{L}_V$, then $s$ is called an argument for $h$. With

$$\mathrm{ARGS}_{\mathcal{A}}(h) = \{s \in E_W : \Phi|s \models h\} = \{s \in \Omega_W : \Phi|s \models h,\ \Phi|s \not\models \bot\} \tag{7}$$

we denote the set of all such arguments.^6
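Definitions 4 and 5 can be made concrete by brute-force enumeration. The following Python lines are only a minimal sketch under stated assumptions: the knowledge base of Example 3 (with two positive reports) is encoded as an ordinary Python predicate, and all function and variable names (`phi`, `models_given`, `classify`) are hypothetical choices for this illustration, not part of the theory. Enumeration is exponential in the number of variables and is meant for small examples only.

```python
from itertools import product

V = ["REL1", "REL2", "REP1", "REP2", "HYP"]  # all variables of Example 3
W = ["REL1", "REL2"]                         # probabilistic variables

def phi(x):
    """Knowledge base: rel_i <-> (hyp <-> rep_i) for i = 1, 2, plus rep_1 and rep_2."""
    return (bool(x["REL1"]) == (x["HYP"] == x["REP1"])
            and bool(x["REL2"]) == (x["HYP"] == x["REP2"])
            and x["REP1"] == 1 and x["REP2"] == 1)

def hyp(x):
    """The hypothesis HYP = 1."""
    return x["HYP"] == 1

def models_given(s):
    """E_V|s as in (6): all models of phi that agree with scenario s on W."""
    free = [v for v in V if v not in W]
    found = []
    for bits in product((0, 1), repeat=len(free)):
        x = dict(zip(W, s))
        x.update(zip(free, bits))
        if phi(x):
            found.append(x)
    return found

def classify(h):
    """Split all scenarios into conflicts, arguments for h, arguments for not-h, and the rest."""
    result = {"conflicts": [], "arguments": [], "counter": [], "neutral": []}
    for s in product((0, 1), repeat=len(W)):
        models = models_given(s)
        if not models:                       # phi|s is inconsistent, see (5)
            result["conflicts"].append(s)
        elif all(h(x) for x in models):      # phi|s entails h, see (7)
            result["arguments"].append(s)
        elif not any(h(x) for x in models):  # phi|s entails not-h
            result["counter"].append(s)
        else:                                # neither h nor not-h follows
            result["neutral"].append(s)
    return result

print(classify(hyp))
```

For this knowledge base, the scenario (1, 1) comes out as the only argument for the hypothesis, (0, 0) as the only argument for its negation, and the two mixed scenarios as conflicts.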

^6 Alternatively, we may define the set of arguments simply by $\mathrm{Args}(h) = \{s \in \Omega_W : \Phi|s \models h\} = \mathrm{ARGS}(h) \cup \mathrm{CONFS}$, which then implies $\mathrm{CONFS} = \mathrm{Args}(\bot)$ and therefore $\mathrm{ARGS}(h) = \mathrm{Args}(h) \setminus \mathrm{Args}(\bot)$. From this perspective, it seems that $\mathrm{Args}(h)$ is the more fundamental concept, from which both $\mathrm{CONFS}$ and $\mathrm{ARGS}(h)$ follow. In fact, most computational methods are designed to compute the so-called quasi-arguments $\mathrm{Args}(h)$, see [35,36,39], but at the conceptual core, the choice between $\mathrm{ARGS}(h)$ and $\mathrm{Args}(h)$ remains a matter of taste. Here we prefer $\mathrm{ARGS}(h)$ to stress that conflicts should not be regarded as proper arguments.

Similarly, the elements of $\mathrm{ARGS}_{\mathcal{A}}(\neg h) = \{s \in E_W : \Phi|s \models \neg h\}$ are logical counter-arguments, which refute $h$ in the light of the given evidence. When no confusion is anticipated, we omit the reference to $\mathcal{A}$ and use the short forms $\mathrm{ARGS}(h)$ and $\mathrm{ARGS}(\neg h)$. Note that $\mathrm{ARGS}(h) \cap \mathrm{ARGS}(\neg h) = \emptyset$ holds for all possible hypotheses $h \in \mathcal{L}_V$. In the example of Fig. 7, the hypothesis $h$ is supported by the argument $s_3$, but not by $s_0$, $s_1$, or $s_2$ ($s_0$ is a conflict). Similarly, $\neg h$ is supported (or $h$ is refuted) by $s_1$, but not by $s_0$, $s_2$, or $s_3$. In the case of $s_2$, no definite conclusion is possible for $h$. The existence of such neutral scenarios is the reason for the non-additivity of degrees of support, as we will see below and in the next subsection. The decomposition of the set $\Omega_W$ into supporting, refuting, conflicting, and neutral scenarios is depicted in Fig. 8.

Fig. 8. Arguments, counter-arguments, conflicts, and neutral scenarios.

Example 6. In the situation of Example 3, we have two probabilistic variables $\mathit{REL}_1$ and $\mathit{REL}_2$, i.e. depending on whether the two witnesses are saying the truth or not, this yields four different scenarios $s_0 = (0, 0)$, $s_1 = (0, 1)$, $s_2 = (1, 0)$, and $s_3 = (1, 1)$. After receiving two positive reports, $\mathit{rep}_1$ and $\mathit{rep}_2$, our knowledge base becomes $\Phi = \{\mathit{rel}_1 \leftrightarrow \mathit{hyp},\ \mathit{rel}_2 \leftrightarrow \mathit{hyp},\ \mathit{rep}_1,\ \mathit{rep}_2\}$, from which we obtain $\Phi|s_0 = \{\neg \mathit{hyp}, \mathit{rep}_1, \mathit{rep}_2\}$, $\Phi|s_1 = \{\bot\}$, $\Phi|s_2 = \{\bot\}$, and $\Phi|s_3 = \{\mathit{hyp}, \mathit{rep}_1, \mathit{rep}_2\}$. This implies that $s_3$ is an argument and $s_0$ is a counter-argument for $\mathit{hyp}$, and that $s_1$ and $s_2$ are conflicts. Note that there are no neutral scenarios. In other words, it can either be the case that both witnesses are telling the truth (in which case we conclude $\mathit{hyp}$) or that they are both lying (in which case we conclude $\neg \mathit{hyp}$). Every other configuration is impossible. This is exactly the argument used in the original formulation of the problem in [118].

The key definitions in (5) and (7) have a number of mathematical consequences. One property, which is particularly important for computational purposes, is shown in the following theorem, in which $S^{\downarrow W}$ denotes the projection of a subset $S \subseteq \Omega_V$ to $\Omega_W$. The proof of the theorem is given in Appendix A.

Theorem 7.

$$\mathrm{ARGS}(h) = E_W \setminus \left(E_V \cap H^c\right)^{\downarrow W}. \tag{8}$$
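As an illustration of Theorem 7, the following minimal Python sketch computes the arguments for the barbecue hypothesis of Example 2 in two ways: directly from definition (7), and by projecting the set of models that violate the hypothesis as in (8). The names `project`, `args_direct`, and `args_projected` are hypothetical, and brute-force enumeration of the universe is assumed to be affordable only because the example is tiny.

```python
from itertools import product

V = ("H", "B")   # all variables of Example 2
W = ("H",)       # probabilistic variables

def phi(x):
    """Knowledge base {h -> b}: heads implies barbecue."""
    return (not x["H"]) or bool(x["B"])

def hyp(x):
    """Hypothesis b: the barbecue takes place."""
    return x["B"] == 1

def project(states):
    """Projection of a set of states from the universe over V to the scenarios over W."""
    return {tuple(x[v] for v in W) for x in states}

universe = [dict(zip(V, bits)) for bits in product((0, 1), repeat=len(V))]
E_V = [x for x in universe if phi(x)]   # models of the knowledge base
E_W = project(E_V)                      # consistent scenarios

# Definition (7): consistent scenarios all of whose compatible models satisfy the hypothesis.
args_direct = {s for s in E_W
               if all(hyp(x) for x in E_V if tuple(x[v] for v in W) == s)}

# Theorem 7: remove from E_W the projection of the models that violate the hypothesis.
args_projected = E_W - project([x for x in E_V if not hyp(x)])

print(args_direct, args_projected)
```

Both computations return the single scenario in which the coin lands on heads, while the tails scenario is neutral.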

This theorem tells us that the problem of computing $\mathrm{ARGS}(h)$ is essentially a projection (or variable elimination) problem (see [56] for a generic solution of the projection problem). Translated into the terminology of logic, it means that the variables in $U = V \setminus W$ need to be eliminated from the set of sentences $\Phi \cup \{\neg h\}$, which we obtain from the knowledge base by adding the negated hypothesis to it. If $\Phi$ is a clausal set and the variables in $U$ are propositions, it is possible to realize the elimination as a resolution-based procedure [39]. Note that variable elimination in propositional logic can also be viewed as a quantifier elimination problem [114].

Other mathematical consequences of (5) and (7) arise if we consider special situations like $\Phi \equiv \top$, $\Phi \equiv \bot$, $h \equiv \top$, or $h \equiv \bot$. Table 2 summarizes some of these consequences, which are all easy to verify (for corresponding proofs in the context of propositional logic, see [39]). Further simple properties occur if we consider particular situations of interrelated hypotheses or knowledge bases. Some of these (easily verifiable) properties are shown in the following non-exhaustive list:

$$
\begin{aligned}
h \models h' \ &\Rightarrow\ \mathrm{ARGS}(h) \subseteq \mathrm{ARGS}(h'),\\
h \models \neg h' \ &\Rightarrow\ \mathrm{ARGS}(h) \cap \mathrm{ARGS}(h') = \emptyset,\\
h \equiv h_1 \wedge h_2 \ &\Rightarrow\ \mathrm{ARGS}(h) = \mathrm{ARGS}(h_1) \cap \mathrm{ARGS}(h_2),\\
\Phi \subseteq \Phi' \ &\Rightarrow\ \mathrm{CONFS}_{\mathcal{A}} \subseteq \mathrm{CONFS}_{\mathcal{A}'}.
\end{aligned}
$$
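The listed properties can be checked by enumeration on a small instance. The sketch below does so for the barbecue story of Example 2; the additional sentence `no_barbecue` (an observation that the barbecue does not take place) is a purely hypothetical extension introduced here to show how the conflict set grows, and all helper names are assumptions of this illustration.

```python
from itertools import product

V, W = ("H", "B"), ("H",)   # variables and probabilistic variables of Example 2

def models(kb, s):
    """Models of the knowledge base kb (a list of predicates) that agree with scenario s."""
    free = tuple(v for v in V if v not in W)
    states = (dict(zip(W + free, s + bits)) for bits in product((0, 1), repeat=len(free)))
    return [x for x in states if all(sentence(x) for sentence in kb)]

def scenarios():
    return list(product((0, 1), repeat=len(W)))

def confs(kb):
    """Conflicts as in (5): scenarios without any compatible model."""
    return {s for s in scenarios() if not models(kb, s)}

def args(kb, h):
    """Arguments as in (7): consistent scenarios whose models all satisfy h."""
    return {s for s in scenarios() if models(kb, s) and all(h(x) for x in models(kb, s))}

def promise(x):       # h -> b
    return (not x["H"]) or bool(x["B"])

def no_barbecue(x):   # hypothetical new observation: not b
    return x["B"] == 0

def heads(x):
    return x["H"] == 1

def barbecue(x):
    return x["B"] == 1

def both(x):          # heads and barbecue
    return heads(x) and barbecue(x)

kb, kb_new = [promise], [promise, no_barbecue]

print(args(kb, both) <= args(kb, barbecue))                     # first property
print(args(kb, both) == args(kb, heads) & args(kb, barbecue))   # third property
print(confs(kb) <= confs(kb_new))                               # fourth property
print(args(kb, barbecue), args(kb_new, barbecue))               # an argument turned into a conflict
```

In this instance the heads scenario is an argument for the barbecue under the original knowledge base and becomes a conflict once the hypothetical observation is added, so the set of arguments shrinks while the set of conflicts grows.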

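Table 2. Consequences of (5) and (7) for various special situations.

| Knowledge base | Hypothesis | $\mathrm{ARGS}(h)$ | $\mathrm{ARGS}(\neg h)$ | $\mathrm{CONFS}$ |
|---|---|---|---|---|
| $\Phi \equiv \bot$ | any | $\emptyset$ | $\emptyset$ | $\Omega_W$ |
| $\Phi \not\equiv \bot$ | $\Phi \models h$ (e.g. $h \equiv \top$) | $E_W$ | $\emptyset$ | $\Omega_W \setminus E_W$ |
| $\Phi \not\equiv \bot$ | $\Phi \models \neg h$ (e.g. $h \equiv \bot$) | $\emptyset$ | $E_W$ | $\Omega_W \setminus E_W$ |
| $\Phi \not\equiv \bot$ | $\mathrm{Vars}(h) \subseteq W$ | $E_W \cap H^{\downarrow W}$ | $E_W \cap (H^c)^{\downarrow W}$ | $\Omega_W \setminus E_W$ |
| $\Phi \equiv \top$ | $h \not\equiv \top$, $h \not\equiv \bot$, $\mathrm{Vars}(h) \cap W = \emptyset$ | $\emptyset$ | $\emptyset$ | $\emptyset$ |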


The last property describes the monotonic growth of the conflict set when new information arrives, i.e. when $\mathcal{A} = (V, \mathcal{L}_V, \Phi, W, P)$ is updated into $\mathcal{A}' = (V, \mathcal{L}_V, \Phi', W, P)$. Note that apart from that, $\Phi \subseteq \Phi'$ does not allow general conclusions about the relationship between $\mathrm{ARGS}_{\mathcal{A}}(h)$ and $\mathrm{ARGS}_{\mathcal{A}'}(h)$: new evidence may turn neutral scenarios into arguments or counter-arguments, or existing arguments or counter-arguments may be turned into conflicts. This is the mathematical reason for the non-monotonic behavior of our approach, as we will further see in the following subsection.

3.5. Degrees of support, possibility, and ignorance

The above definitions of conflicts, arguments, and counter-arguments are the key notions, on which the definitions of degrees of support and possibility in this subsection are based. Based on the general idea that every argument $s \in \mathrm{ARGS}(h)$ contributes to the possible truth of the hypothesis $h$ for a given knowledge base $\Phi$, we can measure the strength of such a contribution by the posterior probability $P'(\{s\}) = P(\{s\} \mid E_W)$, where $E_W$ plays the role of the evidence on which the prior probability $P$ is conditioned. To take all such contributions into account, we will now consider the same posterior probability $P'$ with respect to the whole set $\mathrm{ARGS}(h)$.

Definition 8. The degree of support of a hypothesis $h \in \mathcal{L}_V$ is the conditional probability

$$
\begin{aligned}
\mathrm{dsp}_{\mathcal{A}}(h) &= P'(\mathrm{ARGS}(h)) = P(\mathrm{ARGS}(h) \mid E_W) = \frac{P(\mathrm{ARGS}(h) \cap E_W)}{P(E_W)} = \frac{P(\mathrm{ARGS}(h))}{1 - P(\mathrm{CONFS})}\\
&= \frac{\sum \{P(\{s\}) : s \in \mathrm{ARGS}(h)\}}{1 - \sum \{P(\{s\}) : s \in \mathrm{CONFS}\}},
\end{aligned}
\tag{9}
$$

of the event $\mathrm{ARGS}(h)$ given the evidence $E_W$.^7

Note that degrees of support are undefined for $\Phi \equiv \bot$. Again, we will use the more convenient short form $\mathrm{dsp}(h)$ whenever possible.

^7 Using the alternative set of arguments $\mathrm{Args}(h)$ (see previous footnote), we could also define degrees of support by $\mathrm{dsp}_{\mathcal{A}}(h) = P'(\mathrm{Args}(h)) = \frac{P(\mathrm{Args}(h)) - P(\mathrm{Args}(\bot))}{1 - P(\mathrm{Args}(\bot))}$. Since both definitions are mathematically equivalent, the choice between $\mathrm{Args}(h)$ and $\mathrm{ARGS}(h)$ remains a matter of taste.

Example 9. Consider again the judicial problem of Example 3 and its discussion in Example 6. From the given marginal probabilities and the independence assumption we obtain

$$P(\{s_0\}) = (1 - p_1)(1 - p_2), \quad P(\{s_1\}) = (1 - p_1)\,p_2, \quad P(\{s_2\}) = p_1 (1 - p_2), \quad P(\{s_3\}) = p_1 p_2$$

by multiplication. This leads then to

$$\mathrm{dsp}(\mathit{hyp}) = \frac{P(\{s_3\})}{P(\{s_0, s_3\})} = \frac{p_1 p_2}{p_1 p_2 + (1 - p_1)(1 - p_2)},$$

which is exactly the same result as the one derived from the original formulation of the problem in [118]. Similarly, we obtain

$$\mathrm{dsp}(\neg \mathit{hyp}) = \frac{P(\{s_0\})}{P(\{s_0, s_3\})} = \frac{(1 - p_1)(1 - p_2)}{p_1 p_2 + (1 - p_1)(1 - p_2)}$$

for the negative hypothesis $\neg \mathit{hyp}$, which means that $\mathrm{dsp}(\mathit{hyp})$ and $\mathrm{dsp}(\neg \mathit{hyp})$ are additive in this particular situation. This is a consequence of the absence of neutral scenarios.

At this point, it is important to notice that $\mathrm{dsp}(h)$ is defined by an ordinary probability measure in the classical sense of Kolmogorov, for which the probabilities of complementary events $\mathrm{ARGS}(h)$ and $E_W \setminus \mathrm{ARGS}(h)$ add up to 1. Nevertheless, with respect to complementary hypotheses $h$ and $\neg h$, for which the sets $\mathrm{ARGS}(h)$ and $\mathrm{ARGS}(\neg h)$ are not necessarily complementary with respect to $E_W$ (as shown in Fig. 8), we obtain non-additive (or sub-additive) degrees of support, for which the inequality

$$\mathrm{dsp}(h) + \mathrm{dsp}(\neg h) \leq 1 \tag{10}$$

holds. Degrees of support should therefore be understood as non-additive posterior probabilities of logical deducibility. Except for $\Phi \equiv \bot$, they are well-defined for all possible hypotheses $h \in \mathcal{L}_V$, that is even in cases in which the prior probability measure $P$ does not cover all variables. This increased flexibility is an important advantage over traditional probabilistic reasoning, which presupposes the existence of a complete probability measure over all variables.

Fig. 9. The opinion induced by degrees of support and possibility.

Instead of considering the degree of support of the complementary hypothesis $\neg h$, it is often more convenient to look at so-called degrees of possibility of $h$. They are usually defined in terms of degrees of support, namely by

$$\mathrm{dps}_{\mathcal{A}}(h) = 1 - \mathrm{dsp}_{\mathcal{A}}(\neg h). \tag{11}$$
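To make (9), (10), and (11) tangible, the following Python sketch computes degrees of support and possibility by enumerating all scenarios. It assumes Boolean variables and mutually independent probabilistic variables, so that the prior is fully given by the marginals in a dictionary `prob`; this is only one of the possible specifications of $P$ mentioned above. The function names and the numerical values 0.9 and 0.8 for the witness credibilities are hypothetical choices made for this illustration.

```python
from itertools import product
from math import isclose

def dsp(V, W, phi, prob, h):
    """Degree of support as in (9); prob maps each variable in W to P(variable = 1)."""
    free = [v for v in V if v not in W]
    p_args = p_confs = 0.0
    for s in product((0, 1), repeat=len(W)):
        p_s = 1.0                                   # prior of scenario s (independence assumed)
        for v, value in zip(W, s):
            p_s *= prob[v] if value else 1.0 - prob[v]
        models = []
        for bits in product((0, 1), repeat=len(free)):
            x = dict(zip(W, s))
            x.update(zip(free, bits))
            if phi(x):
                models.append(x)
        if not models:
            p_confs += p_s                          # s is a conflict
        elif all(h(x) for x in models):
            p_args += p_s                           # s is an argument for h
    if isclose(p_confs, 1.0):
        raise ValueError("inconsistent knowledge base: degree of support undefined")
    return p_args / (1.0 - p_confs)

def dps(V, W, phi, prob, h):
    """Degree of possibility as in (11)."""
    return 1.0 - dsp(V, W, phi, prob, lambda x: not h(x))

# Example 2: Alice's barbecue.
alice = (["H", "B"], ["H"], lambda x: (not x["H"]) or bool(x["B"]), {"H": 0.5})
b = lambda x: x["B"] == 1
not_b = lambda x: x["B"] == 0
print(dsp(*alice, b), dsp(*alice, not_b), dps(*alice, b))

# Example 3 with two positive reports and hypothetical credibilities.
p1, p2 = 0.9, 0.8
witnesses = (["REL1", "REL2", "REP1", "REP2", "HYP"], ["REL1", "REL2"],
             lambda x: (bool(x["REL1"]) == (x["HYP"] == x["REP1"])
                        and bool(x["REL2"]) == (x["HYP"] == x["REP2"])
                        and x["REP1"] == 1 and x["REP2"] == 1),
             {"REL1": p1, "REL2": p2})
hyp = lambda x: x["HYP"] == 1
print(dsp(*witnesses, hyp), p1 * p2 / (p1 * p2 + (1 - p1) * (1 - p2)))
```

In the barbecue example the tails scenario is neutral, so the degrees of support of the hypothesis and of its negation sum to less than 1, which illustrates (10). In the witness example the enumerated value agrees with the closed form of Example 9 up to floating-point rounding.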

Intuitively, the degree of possibility is thus a measure of the absence of counter-arguments. Together with the degree of support, a hypothesis is finally judged by a pair of values $\mathrm{dsp}(h)$ and $\mathrm{dps}(h)$, for which $\mathrm{dsp}(h) \leq \mathrm{dps}(h)$ always holds. Or we may look at it as the corresponding interval $[\mathrm{dsp}(h), \mathrm{dps}(h)] \subseteq [0, 1]$ of length

$$\mathrm{dig}(h) = \mathrm{dps}(h) - \mathrm{dsp}(h), \tag{12}$$

which we call degree of ignorance with respect to $h$. Another view is to look at $\mathrm{dsp}(h)$ and $\mathrm{dps}(h)$ as the anchors of a unique point $\omega_h = (b, d, i)$ in the opinion triangle (see Section 2.2), with $b = \mathrm{dsp}(h)$, $d = 1 - \mathrm{dps}(h)$, and $i = \mathrm{dig}(h)$ as depicted in Fig. 9.

To conclude this subsection, let us again look at some evident mathematical properties, which occur in special situations like $\Phi \equiv \top$, $\Phi \equiv \bot$, $h \equiv \top$, or $h \equiv \bot$. Table 3 lists all immediate consequences of the properties listed in Table 2 (see previous subsection). Another immediate consequence occurs in the situation where a hypothesis $h$ logically entails another hypothesis $h'$. In that case, both the degree of support and the degree of possibility of $h'$ are at least as large as those of $h$:

$$h \models h' \ \Rightarrow\ \begin{cases} \mathrm{dsp}(h) \leq \mathrm{dsp}(h'),\\ \mathrm{dps}(h) \leq \mathrm{dps}(h'). \end{cases} \tag{13}$$

Note that no such conclusions are possible for cases like $\Phi' \models \Phi$. In other words, degrees of support and possibility (and therefore degrees of ignorance) may change non-monotonically when new qualitative information or evidence arrives. Typically, confirming evidence increases the degree of support (by producing new arguments), but it may also increase the degree of possibility (by transforming counter-arguments into conflicts). Similarly, disconfirming evidence decreases the degree of possibility in the first place (by producing new counter-arguments), but it may also decrease the degree of support (by transforming arguments into conflicts).

Table 3. Consequences of (10), (11), and (12) for various special situations.

| Knowledge base | Hypothesis | $\mathrm{dsp}(h)$ | $\mathrm{dps}(h)$ | $\mathrm{dig}(h)$ |
|---|---|---|---|---|
| $\Phi \equiv \bot$ | any | undefined | undefined | undefined |
| $\Phi \not\equiv \bot$ | $\Phi \models h$ (e.g. $h \equiv \top$) | 1 | 1 | 0 |
| $\Phi \not\equiv \bot$ | $\Phi \models \neg h$ (e.g. $h \equiv \bot$) | 0 | 0 | 0 |
| $\Phi \not\equiv \bot$ | $\mathrm{Vars}(h) \subseteq W$ | $x \in [0, 1]$ | $x \in [0, 1]$ | 0 |
| $\Phi \equiv \top$ | $h \not\equiv \top$, $h \not\equiv \bot$, $\mathrm{Vars}(h) \cap W = \emptyset$ | 0 | 1 | 1 |

With this, degrees of support satisfy all the properties suggested in Section 2.1 for rational degrees of belief: uniformity follows from the fact that for a given hypothesis $h$, $\mathrm{dsp}(h)$ only depends on the agent's evidence $\mathcal{A}$, consistency is the property described in (13), non-additivity is expressed by the inequality in (10), and non-monotonicity holds for the reasons just explained.

3.6. Special cases: Logical and probabilistic reasoning

To complete the technical discussion about probabilistic argumentation, let us briefly investigate how the classical fields of logical and probabilistic reasoning fit into this general theory.

Logical reasoning. From the perspective of probabilistic argumentation, logical reasoning is characterized by $W = \emptyset$. This has a number of simple consequences. First, it implies that the set of possible scenarios $\Omega_W = \{s_0\}$ consists of a single element $s_0 = ()$ only, which represents the empty vector of values. This means that $P(\{s_0\}) = 1$ is the only possible prior probability and is thus implicitly given. Furthermore, we have $\Phi|s_0 = \Phi$, which allows us to simplify (7) into

$$\mathrm{ARGS}(h) = \begin{cases} \{s_0\}, & \text{for } \bot \not\equiv \Phi \models h,\\ \emptyset, & \text{otherwise.} \end{cases}$$

If we assume $\Phi \not\equiv \bot$, we get $E_W = \{s_0\} = \Omega_W$ and thus $P'(\{s_0\}) = 1$. This implies

$$\mathrm{dsp}(h) = \begin{cases} 1, & \text{for } \Phi \models h,\\ 0, & \text{otherwise.} \end{cases}$$

In other words, degrees of support play the role of an indicator function for the logical deducibility of $h$ from $\Phi$.

Probabilistic reasoning. Purely probabilistic models are characterized by $W = V$, which again has various simple consequences. The most obvious one is the fact that $\Omega_W = \Omega_V$, from which $E_W = E_V$ immediately follows. This allows us to write $E$ as a common placeholder for $E_W$ and $E_V$ and to simplify (7) into $\mathrm{ARGS}(h) = E \cap H$.
From this simplification we obtain

$$\mathrm{dsp}(h) = \frac{P(E \cap H)}{P(E)} = P(H \mid E),$$

which is the usual way of defining posterior probabilities in the context of probabilistic reasoning.

Probabilistic argumentation is therefore a true generalization of the two classical types of logical and probabilistic reasoning. This is a remarkable conclusion, which lifts probabilistic argumentation from its original intention as a theory of argumentative reasoning up to a unified theory of logical and probabilistic reasoning.

3.7. Connection to Dempster–Shafer theory

From a technical point of view, many concepts of probabilistic argumentation are closely connected to respective concepts in the Dempster–Shafer theory. This connection has been thoroughly discussed in [41], according to which every probabilistic argumentation system is expressible as a corresponding belief (or mass) function, and vice versa. More precisely, if $\mathcal{A} = (V, \mathcal{L}_V, \Phi, W, P)$ denotes a given probabilistic argumentation system, we may take $\Omega_V$ as the frame of discernment and then define a mass function $m : 2^{\Omega_V} \to [0, 1]$ by

$$m(A) = \sum \{P(\{s\}) : s \in \Omega_W,\ \llbracket \Phi|s \rrbracket = A\},$$

for all $A \subseteq \Omega_V$. For a hypothesis $H = \llbracket h \rrbracket \subseteq \Omega_V$, it can then be shown that the concepts of normalized belief and plausibility,

$$\mathrm{Bel}(H) = \frac{1}{1 - m(\emptyset)} \sum_{\emptyset \neq A \subseteq H} m(A) \quad \text{and} \quad \mathrm{Pl}(H) = 1 - \mathrm{Bel}(H^c),$$

correspond precisely to our notions of degree of support and possibility, $\mathrm{dsp}(h)$ and $\mathrm{dps}(h)$, respectively. To obtain exactly the same result in a different way, it is also possible to translate the principal components of $\mathcal{A}$ (the sentences in $\Phi$ and the probability measure $P$) individually into respective mass functions and then apply Dempster's rule of combination (see [41] for details). Finally, we can express any arbitrary mass function as a probabilistic argumentation system and formulate Dempster's combination rule as a particular form of merging two probabilistic argumentation systems.

Despite these technical similarities, the theories are still quite different from a conceptual point of view. For example, consider Dempster's rule of combination, which is a crucial element of the Dempster–Shafer theory, but almost inexistent in the theory of probabilistic argumentation. Another difference is the fact that the notions of belief and plausibility in the Dempster–Shafer theory are often entirely detached from a probabilistic interpretation (especially in Smets's TBM framework), whereas degrees of support and possibility are probabilities by definition. Finally, while the use of a logical language to express factual information is an intrinsic part of a probabilistic argumentation system, it is an almost unknown technique in the Dempster–Shafer theory. In other words, probabilistic argumentation demonstrates how to decorate the mathematical foundations of Dempster–Shafer theory with the expressiveness and convenience of a logical language.

4. Conclusion

This paper describes a formal theory of argumentative reasoning called probabilistic argumentation. The proposed reasoning process consists of a qualitative and a quantitative part. For the qualitative part, the paper offers precise formal definitions of logical arguments, counter-arguments, and conflicts, as well as a detailed exhibition of corresponding mathematical properties.
One of the most interesting aspects is the (conflict-driven) non-monotonic behavior of arguments and counter-arguments when new information accumulates over time. This shows that non-monotonicity, which seems to be incompatible with monotone logical systems at first sight, results naturally from the proposed argumentation-oriented approach.

The quantitative part of the reasoning process starts from the assumption of a given probability measure over some (but not necessarily all) variables involved. The idea then is to measure the weights of arguments and counter-arguments by respective probabilities. The total probabilistic weight of a set of arguments is called degree of support. This is the key concept of this paper, which provides a non-additive and non-monotonic measure of the available support for the hypothesis. The non-additivity is a consequence of the underlying logical structure, in which the sets of arguments and counter-arguments are not necessarily complementary. As is typical for a probabilistic system, non-monotonicity is obtained through Bayesian conditioning on the available evidence.

Given the mathematical properties of degrees of support, the paper suggests viewing probabilistic argumentation as a theory of rational degrees of belief, from which we may generate what some authors call an opinion. This is a simple but very instructive picture, which is useful when it comes to justifying the assumption of non-additive degrees of belief or the difference between uncertainty and ignorance.

One of the most remarkable consequences of this paper is the observation that probabilistic argumentation naturally includes the two classical approaches to automated reasoning (namely logical and probabilistic reasoning) as special cases. The parameter that makes them distinct is the number of probabilistic variables. Probabilistic argumentation is more general in the sense that it supports any number of probabilistic variables.
We can thus consider probabilistic argumentation as a new foundation for a unified theory of logical and probabilistic reasoning. As such, it may serve as starting point for a generalized decision theory, in which the possibility of incomplete or missing information is taken into account to decide about further deliberation.

Acknowledgement

This research is supported by the Swiss National Science Foundation, Project No. PP002-102652/1, and The Leverhulme Trust. Thanks to Michael Wachter for helpful remarks and proof-reading.


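Appendix A. Proof of Theorem 7

In the following proof, we start from the definition of the set $\mathrm{ARGS}(h)$ and transform it step by step into the right-hand side of Theorem 7. With $S^{\uparrow V}$ we denote the vacuous extension of a subset $S \subseteq \Omega_W$ to $\Omega_V$.

$$
\begin{aligned}
\mathrm{ARGS}(h) &= \{s \in E_W : \Phi|s \models h\}\\
&= \{s \in E_W : E_V|s \subseteq H\}\\
&= E_W \setminus \{s \in E_W : E_V|s \not\subseteq H\}\\
&= E_W \setminus \{s \in E_W : E_V|s \cap H^c \neq \emptyset\}\\
&= E_W \setminus \{s \in E_W : \{x \in E_V : x^{\downarrow W} = s\} \cap H^c \neq \emptyset\}\\
&= E_W \setminus \{s \in E_W : E_V \cap \{s\}^{\uparrow V} \cap H^c \neq \emptyset\}\\
&= E_W \setminus \{s \in E_W : E_V \cap H^c \not\subseteq (\{s\}^{\uparrow V})^c\}\\
&= E_W \setminus \{s \in E_W : (E_V \cap H^c)^{\downarrow W} \not\subseteq ((\{s\}^{\uparrow V})^c)^{\downarrow W}\}\\
&= E_W \setminus \{s \in E_W : (E_V \cap H^c)^{\downarrow W} \not\subseteq \{s\}^c\}\\
&= E_W \setminus \{s \in E_W : (E_V \cap H^c)^{\downarrow W} \cap \{s\} \neq \emptyset\}\\
&= E_W \setminus (E_V \cap H^c)^{\downarrow W}. \qquad \square
\end{aligned}
$$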

References

[1] E.W. Adams, A Primer of Probability Logic, CSLI Publications, Stanford, 1998.
[2] L. Amgoud, C. Cayrol, A reasoning model based on the production of acceptable arguments, Annals of Mathematics and Artificial Intelligence 34 (1–3) (2002) 197–215.
[3] L. Amgoud, C. Cayrol, Inferring from inconsistency in preference-based argumentation frameworks, Journal of Automated Reasoning 29 (2) (2002) 125–169.
[4] B. Anrig, J. Kohlas, Model-based reliability and diagnostic: A common framework for reliability and diagnostics, International Journal of Intelligent Systems 18 (10) (2003) 1001–1033.
[5] M. Beer, M. d’Inverno, N. Jennings, M. Luck, C. Preist, M. Schroeder, Argumentation and negotiation, Knowledge Engineering Review 14 (3) (1999) 285–289.
[6] J. Bernoulli, Ars Conjectandi, Thurnisiorum, Basel, 1713.
[7] P. Besnard, A. Hunter, A logic-based theory of deductive arguments, Artificial Intelligence 128 (1–2) (2001) 203–235.
[8] D. Black, Theory of Committees and Elections, Cambridge University Press, Cambridge, USA, 1958.
[9] P. Blackburn, M. de Rijke, Y. Venema, Modal Logic, Cambridge University Press, 2001.
[10] A.J. Bonner, A logic for hypothetical reasoning, in: AAAI’88, 7th National Conference on Artificial Intelligence, Saint Paul, USA, 1988, pp. 480–484.
[11] G. Boole, The Laws of Thought, Walton and Maberley, London, 1854.
[12] R.E. Bryant, Graph-based algorithms for Boolean function manipulation, IEEE Transactions on Computers 35 (8) (1986) 677–691.
[13] C. Cayrol, S. Doutre, J. Mengin, On decision problems related to the preferred semantics for argumentation frameworks, Journal of Logic and Computation 13 (3) (2003) 377–403.
[14] M. Daniel, Algebraic structures related to Dempster–Shafer theory, in: B. Bouchon-Meunier, R.R. Yager, L.A. Zadeh (Eds.), IPMU’94, 5th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, in: LNCS, vol. 945, Springer, Paris, France, 1994, pp. 51–61.
[15] A. Darwiche, P. Marquis, A knowledge compilation map, Journal of Artificial Intelligence Research 17 (2002) 229–264.
[16] M. de Condorcet, Essai sur l’application de l’analyse à la probabilité des décisions rendues à la pluralité des voix, L’Imprimerie Royale, Paris, France, 1785.
[17] J. de Kleer, A perspective on assumption-based truth maintenance, Artificial Intelligence 59 (1993) 63–67.
[18] A.P. Dempster, Upper and lower probabilities induced by a multivalued mapping, Annals of Mathematical Statistics 38 (1967) 325–339.
[19] A.P. Dempster, A generalization of Bayesian inference, Journal of the Royal Statistical Society 30 (1968) 205–247.
[20] I. Douven, Decision theory and the rationality of further deliberation, Economics and Philosophy 18 (2) (2002) 303–328.
[21] P.M. Dung, On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games, Artificial Intelligence 77 (2) (1995) 321–357.
[22] F.Y. Edgeworth, Probability, in: H. Chisholm (Ed.), Encyclopædia Britannica, vol. 22, 11th ed., University of Cambridge, 1911, pp. 376–403.
[23] M. Edman, Adding independent pieces of evidence, in: Modality, Morality and other Problems of Sense and Nonsense: Essays dedicated to Sören Halldén, CWK Gleerups, Lund, Sweden, 1973, pp. 180–191.
[24] P.O. Ekelöf, Rättegång, sixth ed., Norstedts Juridik AB, Stockholm, Sweden, 1992.
[25] D. Ellsberg, Risk, ambiguity, and the Savage axioms, Quarterly Journal of Economics 75 (1961) 643–669.
[26] R. Fagin, J.Y. Halpern, Uncertainty, belief, and probability, Computational Intelligence 7 (3) (1991) 160–173.
[27] R. Fagin, J.Y. Halpern, N. Megiddo, A logic for reasoning about probabilities, Information and Computation 87 (1/2) (1990) 78–128.
[28] R.P. Feynman, R.B. Leighton, M. Sands, The Feynman Lectures on Physics, vol. I, second ed., Addison-Wesley, 1963.
[29] J. Fox, Probability, logic and the cognitive foundations of rational belief, Journal of Applied Logic 1 (3–4) (2003) 197–224.

[30] P. Gärdenfors, B. Hansson, N.E. Sahlin (Eds.), Evidentiary Value: Philosophical, Judicial and Psychological Aspects of a Theory, CWK Gleerups, Lund, Sweden, 1983.
[31] G. Gerla, Inferences in probability logic, Artificial Intelligence 70 (1–2) (1994) 33–52.
[32] I. Gilboa, Expected utility with purely subjective non-additive probabilities, Journal of Mathematical Economics 16 (1987) 65–88.
[33] M. Ginsberg, Non-monotonic reasoning using Dempster’s rule, in: R.J. Brachman (Ed.), AAAI’84, 4th National Conference on Artificial Intelligence, Austin, USA, 1984, pp. 112–119.
[34] I. Hacking, The Emergence of Probability, Cambridge University Press, 1975.
[35] R. Haenni, Cost-bounded argumentation, International Journal of Approximate Reasoning 26 (2) (2001) 101–127.
[36] R. Haenni, Anytime argumentative and abductive reasoning, Soft Computing—A Fusion of Foundations, Methodologies and Applications 8 (2) (2003) 142–149.
[37] R. Haenni, Using probabilistic argumentation for key validation in public-key cryptography, International Journal of Approximate Reasoning 38 (3) (2005) 355–376.
[38] R. Haenni, S. Hartmann, Modeling partially reliable information sources: a general approach based on Dempster–Shafer theory, International Journal of Information Fusion 7 (4) (2006) 361–379.
[39] R. Haenni, J. Kohlas, N. Lehmann, Probabilistic argumentation systems, in: D.M. Gabbay, P. Smets (Eds.), Handbook of Defeasible Reasoning and Uncertainty Management Systems, vol. 5: Algorithms for Uncertainty and Defeasible Reasoning, Kluwer Academic Publishers, Dordrecht, Netherlands, 2000, pp. 221–288.
[40] R. Haenni, N. Lehmann, Building argumentation systems on set constraint logic, in: B. Bouchon-Meunier, R.R. Yager, L.A. Zadeh (Eds.), Information, Uncertainty and Fusion, Kluwer Academic Publishers, Dordrecht, Netherlands, 2000, pp. 393–406.
[41] R. Haenni, N. Lehmann, Probabilistic argumentation systems: a new perspective on Dempster–Shafer theory, International Journal of Intelligent Systems 18 (1) (2003) 93–106 (Special Issue on the Dempster–Shafer Theory of Evidence).
[42] R. Haenni, Towards a unifying theory of logical and probabilistic reasoning, in: F.B. Cozman, R. Nau, T. Seidenfeld (Eds.), ISIPTA’05, 4th International Symposium on Imprecise Probabilities and Their Applications, Pittsburgh, USA, 2005, pp. 193–202.
[43] R. Haenni, Propositional argumentation systems and symbolic evidence theory, PhD thesis, University of Fribourg, Switzerland, 1996.
[44] R. Haenni, B. Anrig, J. Kohlas, N. Lehmann, A survey on probabilistic argumentation, in: ECSQARU’01, 6th European Conference on Symbolic and Quantitative Approaches to Reasoning under Uncertainty, Workshop on Adventures in Argumentation, Toulouse, France, 2001, pp. 19–25.
[45] R. Haenni, Ignoring ignorance is ignorant, Tech. rep., Center for Junior Research Fellows, University of Konstanz, Germany, 2003.
[46] P. Hájek, T. Havránek, R. Jiroušek, Uncertain Information Processing in Expert Systems, CRC Press, Boca Raton, USA, 1992.
[47] P. Hájek, J.J. Valdés, Generalized algebraic approach to uncertainty processing in rule-based expert systems (dempsteroids), Computers and Artificial Intelligence 10 (1991) 29–42.
[48] J. Hawthorne, Degree-of-belief and degree-of-support: Why Bayesians need both notions, Mind 114 (454) (2005) 277–320.
[49] C. Howson, Probability and logic, Journal of Applied Logic 1 (3–4) (2003) 151–165.
[50] J. Jonczy, R. Haenni, Credential networks: a general model for distributed trust and authenticity management, in: A. Ghorbani, S. Marsh (Eds.), PST’05, 3rd Annual Conference on Privacy, Security and Trust, St. Andrews, Canada, 2005, pp. 101–112.
[51] A. Jøsang, A logic for uncertain probabilities, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 9 (3) (2001) 279–311.
[52] A. Jøsang, V.A. Bondi, Legal reasoning with subjective logic, Artificial Intelligence and Law 8 (4) (2001) 289–315.
[53] A. Jøsang, Artificial reasoning with subjective logic, in: A.C. Nayak, M. Pagnucco (Eds.), 2nd Australian Workshop on Commonsense Reasoning, Perth, Australia, 1997.
[54] A. Kean, G.K. Tsiknis, Assumption-based reasoning and clause management systems, Computational Intelligence 8 (1992) 1–24.
[55] D. Kelsey, J. Quiggin, Theories of choice under ignorance and uncertainty, Journal of Economic Surveys 6 (2) (1992) 133–153.
[56] J. Kohlas, Information Algebras: Generic Structures for Inference, Springer, London, 2003.
[57] J. Kohlas, Probabilistic argumentation systems: A new way to combine logic with probability, Journal of Applied Logic 1 (3–4) (2003) 225–253.
[58] J. Kohlas, Reliability of arguments, in: E. von Collani (Ed.), Defining the Science of Stochastics, in: Sigma Series in Stochastics, vol. 1, Heldermann, Lemgo, Germany, 2004, pp. 73–94.
[59] J. Kohlas, Uncertain information: Random variables in graded semilattices, International Journal of Approximate Reasoning 46 (1) (2007) 17–34.
[60] J. Kohlas, B. Anrig, R. Haenni, P.A. Monney, Model-based diagnostics and probabilistic assumption-based reasoning, Artificial Intelligence 104 (1998) 71–106.
[61] J. Kohlas, P.A. Monney, Propagating belief functions through constraint systems, International Journal of Approximate Reasoning 5 (1991) 433–461.
[62] J. Kohlas, P.A. Monney, A Mathematical Theory of Hints—An Approach to the Dempster–Shafer Theory of Evidence, Lecture Notes in Economics and Mathematical Systems, vol. 425, Springer, 1995.
[63] R. Kohlas, Decentralized trust evaluation and public-key authentication, PhD thesis, University of Bern, Switzerland, 2007.
[64] J. Kohlas, P.A. Monney, Probabilistic assumption-based reasoning, in: D. Heckerman, A. Mamdani (Eds.), UAI’93, 9th Conference on Uncertainty in Artificial Intelligence, Washington, USA, 1993, pp. 485–491.
[65] J. Kohlas, P.A. Monney, An algebraic theory for statistical information based on the theory of hints, International Journal of Approximate Reasoning.
[66] B.O. Koopman, The axioms and algebra of intuitive probability, Annals of Mathematics 41 (1940) 269–292.
[67] S. Kraus, K. Sycara, A. Evenchik, Reaching agreements through argumentation: A logical model and implementation, Artificial Intelligence 104 (1–2) (1998) 1–69.

[68] H.E. Kyburg, Higher order probabilities and intervals, International Journal of Approximate Reasoning 2 (1988) 195–209.
[69] J.H. Lambert, Neues Organon oder Gedanken über die Erforschung und Bezeichnung des Wahren und dessen Unterscheidung vom Irrtum und Schein, Johann Wendler, Leipzig, 1764.
[70] P.S. Laplace, Théorie Analytique des Probabilités, third ed., Courcier, Paris, 1820.
[71] K.B. Laskey, P.E. Lehner, Assumptions, beliefs and probabilities, Artificial Intelligence 41 (1) (1989) 65–77.
[72] F. Lin, An argument-based approach to nonmonotonic reasoning, Computational Intelligence 9 (3) (1993) 254–267.
[73] C. List, On the significance of the absolute margin, British Journal for the Philosophy of Science 55 (3) (2004) 521–544.
[74] C. List, R.E. Goodin, Epistemic democracy: Generalizing the Condorcet jury theorem, Journal of Political Philosophy 9 (3) (2001) 277–306.
[75] N. Maudet, S. Parsons, I. Rahwan (Eds.), ArgMAS’06, 3rd International Workshop on Argumentation in Multi-Agent Systems, Springer, Hakodate, Japan, 2006.
[76] P.A. Monney, M. Chan, Modelling dependence in Dempster–Shafer theory, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 15 (1) (2007) 93–114.
[77] H.T. Nguyen, E.A. Walker, On decision making using belief functions, in: R.R. Yager, J. Kacprzyk, M. Fedrizzi (Eds.), Advances in the Dempster–Shafer Theory of Evidence, John Wiley and Sons, New York, USA, 1994, pp. 311–330.
[78] N.J. Nilsson, Probabilistic logic, Artificial Intelligence 28 (1) (1986) 71–87.
[79] W. Older, A. Vellino, Constraint arithmetic on real intervals, in: F. Benhamou, A. Colmerauer (Eds.), Constraint Logic Programming: Selected Research, MIT Press, 1993, pp. 175–196.
[80] G. Paass, Probabilistic logic, in: P. Smets, E.H. Mamdani, D. Dubois, H. Prade (Eds.), Non-Standard Logics for Automated Reasoning, Academic Press, London, 1988, pp. 213–251.
[81] S. Parsons, C. Sierra, N. Jennings, Agents that reason and negotiate by arguing, Journal of Logic and Computation 8 (3) (1998) 261–292.
[82] S. Parsons, N. Maudet, P. Moraitis, I. Rahwan (Eds.), ArgMAS’05, 2nd International Workshop on Argumentation in Multi-Agent Systems, LNCS, vol. 4049, Springer, Utrecht, Netherlands, 2005.
[83] J. Pearl, Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann, San Mateo, USA, 1988.
[84] J. Pearl, Reasoning with belief functions: An analysis of compatibility, International Journal of Approximate Reasoning 4 (5–6) (1990) 363–389.
[85] K. Pearson, The History of Statistics in the 17th and 18th Centuries against the Changing Background of Intellectual, Scientific and Religious Thought: Lectures by Karl Pearson given at University College (London) during the Academic Sessions 1921–1933, Lubrecht & Cramer, London, UK, 1978.
[86] J. Picard, J. Savoy, Using probabilistic argumentation systems to search and classify web sites, IEEE Data Engineering Bulletin 24 (3) (2001) 33–41.
[87] J. Picard, J. Savoy, Enhancing retrieval with hyperlinks: A general model based on propositional argumentation systems, Journal of the American Society for Information Science and Technology 54 (4) (2003) 347–355.
[88] J. Picard, Probabilistic argumentation systems for information retrieval, PhD thesis, University of Neuchatel, 2000.
[89] H. Prakken, G. Sartor, A dialectical model of assessing conflicting arguments in legal reasoning, Artificial Intelligence and Law 4 (1996) 331–368.
[90] H. Prakken, G. Vreeswijk, Logical systems for defeasible argumentation, in: D. Gabbay, F. Guenther (Eds.), Handbook of Philosophical Logic, vol. D7: Semi Classical Logics 7, second ed., Kluwer Academic Publishers, 2001, pp. 219–318.
[91] I. Rahwan, P. Moraitis, C. Reed (Eds.), ArgMAS’04, 1st International Workshop on Argumentation in Multi-Agent Systems, LNCS, vol. 3366, Springer, New York, USA, 2004.
[92] E.H. Ruspini, J. Lowrance, T. Strat, Understanding evidential reasoning, International Journal of Approximate Reasoning 6 (3) (1992) 401–424.
[93] E.H. Ruspini, The logical foundations of evidential reasoning, Tech. Rep. 408, SRI International, AI Center, Menlo Park, USA, 1986.
[94] N.E. Sahlin, W. Rabinowicz, The evidentiary value model, in: D.M. Gabbay, P. Smets (Eds.), Handbook of Defeasible Reasoning and Uncertainty Management Systems, vol. 1: Quantified Representation of Uncertainty and Imprecision, Kluwer Academic Publishers, Dordrecht, Netherlands, 1998, pp. 247–266.
[95] R.H. Sarin, P.P. Wakker, A simple axiomatization of nonadditive expected utility, Econometrica 60 (1992) 1255–1272.
[96] D. Schmeidler, Subjective probability and expected utility without additivity, Econometrica 57 (3) (1989) 571–587.
[97] D.A. Schum, Probability and the process of discovery, proof, and choice, Boston University Law Review 66 (3–4) (1986) 825–876.
[98] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, 1976.
[99] G. Shafer, Non-additive probabilities in the work of Bernoulli and Lambert, Archive for History of Exact Sciences 19 (1978) 309–370.
[100] G. Shafer, Perspectives on the theory and practice of belief functions, International Journal of Approximate Reasoning 4 (5–6) (1990) 323–362.
[101] G. Shafer, The early development of mathematical probability, in: I. Grattan-Guinness (Ed.), Companion Encyclopedia of the History and Philosophy of the Mathematical Sciences, Routledge, London, UK, 1993, pp. 1293–1302.
[102] G. Shafer, The significance of Jacob Bernoulli’s Ars Conjectandi for the philosophy of probability today, Journal of Econometrics 75 (1996) 15–32.
[103] P. Smets, Probability of provability and belief functions, Journal de la Logique et Analyse 133 (1991) 177–195.
[104] P. Smets, R. Kennes, The transferable belief model, Artificial Intelligence 66 (1994) 191–234.
[105] C.A.B. Smith, Consistency in statistical inference and decision, Journal of the Royal Statistical Society 23 (1961) 31–37.
[106] M. Smithson, Ignorance and Uncertainty, Springer, 1988.
[107] B. Verheij, Arguments and defeat in argument-based nonmonotonic reasoning, in: EPIA’95, 7th Portuguese Conference on Artificial Intelligence, Madeira, Portugal, 1995, pp. 213–224.

[108] G. Vreeswijk, Defeasible dialectics: A controversy-oriented approach towards defeasible argumentation, Journal of Logic and Computation 3 (1993) 317–334.
[109] M. Wachter, R. Haenni, Propositional DAGs: a new graph-based language for representing Boolean functions, in: P. Doherty, J. Mylopoulos, C. Welty (Eds.), KR’06, 10th International Conference on Principles of Knowledge Representation and Reasoning, AAAI Press, Lake District, UK, 2006, pp. 277–285.
[110] M. Wachter, R. Haenni, Logical compilation of Bayesian networks with discrete variables, in: K. Mellouli (Ed.), ECSQARU’07, 9th European Conference on Symbolic and Quantitative Approaches to Reasoning under Uncertainty, LNAI, vol. 4724, Hammamet, Tunisia, 2007, pp. 536–547.
[111] M. Wachter, R. Haenni, Logical compilation of Bayesian networks, Tech. Rep. iam-06-006, University of Bern, Switzerland, 2006.
[112] P. Walley, Statistical Reasoning with Imprecise Probabilities, Monographs on Statistics and Applied Probability, vol. 42, Chapman and Hall, London, UK, 1991.
[113] P. Walley, T.L. Fine, Towards a frequentist theory of upper and lower probability, Annals of Statistics 10 (1982) 741–761.
[114] R. Williams, Algorithms for quantified Boolean formulas, in: SODA’02, 13th Annual ACM–SIAM Symposium on Discrete Algorithms, San Francisco, USA, 2002, pp. 299–307.
[115] J. Williamson, Probability logic, in: D. Gabbay, R. Johnson, H.J. Ohlbach, J. Woods (Eds.), Handbook of the Logic of Argument and Inference: the Turn Toward the Practical, Elsevier, Amsterdam, 2002, pp. 397–424.
[116] J. Williamson, Objective Bayesian nets, in: S.N. Artëmov, H. Barringer, A.S. d’Avila Garcez, L.C. Lamb, J. Woods (Eds.), We Will Show Them!, in: Essays in Honour of Dov Gabbay, vol. 22, College Publications, 2005, pp. 713–730.
[117] N. Wilson, Uncertain linear constraints, in: R. López de Mántaras, L. Saitta (Eds.), ECAI’04: 16th European Conference on Artificial Intelligence, Valencia, Spain, 2004, pp. 231–235.
[118] Anonymous Author, A calculation of the credibility of human testimony, Philosophical Transactions of the Royal Society 21 (1699) 359–365.