Testing Knowledge-Based Recommender Applications

ÖGAI Journal 24/4

Alexander Felfernig1, Klaus Isak2, Thomas Kruggel1

1 Institut für Wirtschaftsinformatik und Anwendungssysteme, Universitätsstraße 65-67, A-9020 Klagenfurt, Austria

2 ConfigWorks GmbH, Lakeside B01, A-9020 Klagenfurt, Austria

Abstract

Knowledge-based recommender applications provide valuable support for customers in the process of identifying solutions from a potentially large set of (complex) products. These systems are successfully applied in domains such as financial services or the computer industry. However, in order to increase the acceptance of recommender technologies in industrial environments, testing techniques have to be integrated which allow a systematic validation of recommender knowledge bases. In this paper we show the integration of white box testing approaches into recommender development. We report first experiences from applying these concepts in industrial projects.

Introduction

Recommender applications support the identification of products that fit the wishes and needs of a customer. These applications are of great importance for making product assortments accessible to customers without technical domain knowledge. Knowledge-based recommender applications (advisors) (Ardissono et al. 2003; Burke 2000; Felfernig 2005) exploit deep knowledge about the product domain in order to determine solutions for the customer. Two basic aspects have to be considered when implementing a knowledge-based recommender application. Firstly, the relevant set of products has to be identified and transformed into a corresponding formal representation, i.e. a recommender knowledge base (Felfernig et al. 2003, Felfernig 2005) has to be defined. Such a knowledge base consists of a description of the set of products, a description of the possible set of customer requirements and a set of constraints restricting the possible combinations of customer requirements and product properties. Secondly, a recommender process has to be defined which represents personalized navigation paths through a recommender application (see e.g. Figure 1). Both knowledge base design and recommender process design are supported in our development environment, which is presented e.g. in (Felfernig 2005).

Knowledge acquisition and maintenance as a collaborative process conducted by technical and domain experts is still a very time-consuming task. One of the key issues in this context is to support a systematic validation of recommender knowledge bases, i.e. to


include testing mechanisms in recommender development processes which ensure the correctness of the results calculated by the knowledge base. Testing mechanisms improve the applicability of recommender technologies in the following sense:

• Reduction of development efforts: if a new version of a recommender knowledge base is created, regression tests can be automatically triggered in order to assure that this new version is still consistent with the defined test cases. Automated regression tests reduce development and maintenance costs since faults in the knowledge base are detected very early, i.e. are not propagated to the productive environment.

• Integration with model-based debugging: test cases can be automatically generated from a recommender process definition. Besides serving as a basis for regression testing, these test cases are used for debugging recommender knowledge bases, i.e. to automatically identify potential sources of inconsistencies in the knowledge base.

• Improved acceptance of recommender technologies: although testing is considered the most pragmatic and successful technique in quality assurance, the research field is still insufficiently explored (Pretschner 2001, Preece 1997). However, our experiences from commercial projects show that the availability of structured test support increases the acceptance of knowledge-based technologies by the customer.

In this paper we show how white box testing approaches can be integrated with knowledge-based recommender technologies. Firstly, we show the representation of recommender knowledge bases and recommender processes by giving a simple example from the financial services domain. Secondly, we discuss the integration of path-oriented white box testing into recommender application development processes and present experiences from applying the presented concepts in real-world projects.

Representing Recommendation Knowledge

The first step when building a recommender application is the construction of a recommender knowledge base (see, e.g., the financial services recommender knowledge base in Figure 1) which consists of a set of variables (VC, VPROD) and a corresponding set of constraints (CR, CF, CPROD).

• Customer Properties (VC) describe possible customer requirements. Customer requirements are instantiations of customer properties. In the financial services domain, willingness to take risks (low, medium, high) is an example of such a property and willingness to take risks = low is an example of a customer requirement.

• Constraints (CR) restrict the possible combinations of customer requirements, e.g., short investment periods are incompatible with high risk investments. Confronted with such customer requirements, the recommender application indicates the incompatibility and tells the customer to change his/her requirements.

• Product Properties (VPROD) describe the properties of a given set of products. Examples of product properties in the financial services domain are recommended investment period, product type, or expected return on investment.

• Filter Conditions (CF) establish the relationship between customer requirements and an available product assortment. An example of a filter condition is that customers without experience in the financial services domain should not receive recommendations which include high risk products.

• Allowed instantiations of Product Properties are represented by constraints (CPROD) which define restrictions on the possible instantiations of variables in VPROD.

Given a set of customer requirements, we can calculate a recommendation (result) for a specific customer. We denote the task of identifying a set of products for a concrete customer as a recommendation task.

Definition 1 (Recommendation Task): A recommendation task can be defined as a Constraint Satisfaction Problem (VC, VPROD, CC ∪ CF ∪ CR ∪ CPROD), where VC is a set of variables representing possible customer requirements and VPROD is a set of variables describing product properties. CPROD is a set of constraints describing available product instances, CR is a set of constraints describing possible combinations of customer requirements and CF is a set of constraints describing the relationship between customer requirements and available products (also called filter conditions). Finally, CC is a set of concrete customer requirements (represented by unary constraints). □

Example 1 (Recommendation Task): In addition to the recommender knowledge base (VC, VPROD, CF ∪ CR ∪ CPROD) of Figure 1, CC={wrc=low, klc=beginner, idc=shortterm, slc=savings} is a set of customer requirements. □

Based on this definition of a recommendation task, we can now introduce the notion of a solution (consistent recommendation) for a recommendation task.

Definition 2 (Consistent Recommendation): An assignment of the variables in VC and VPROD is denoted as a consistent recommendation for a recommendation task (VC, VPROD, CC ∪ CF ∪ CR ∪ CPROD) iff each variable in VC, VPROD has an assigned value and this assignment is consistent with CC ∪ CF ∪ CR ∪ CPROD. □

Example 2 (Consistent Recommendation): {wrc=low, klc=beginner, idc=shortterm, slc=savings, namep=savings, erp=3, rip=low, mnivp=1, instp=A}. □

In order to be able to adapt the dialog style to a customer’s preferences and level of product domain knowledge, we have to provide mechanisms allowing the definition of personalized recommender user interfaces.
A recommender user interface can be described by a finite number of states, where state transitions are triggered by requirements imposed by customers. Figure 1 includes an example of the process model of the user interface of a financial services recommender application. Customers specify requirements (input values) for a subset of a given set of customer properties. Depending on the input of the customer the automaton changes its state, e.g. an expert without any willingness to take risks (wrc=low, klc=expert) who isn’t interested in financial advisory (awc = no) is forwarded to the state q4 (direct product search). Consequently, different subsets of variables are defined by different paths in the automaton. The automaton of Figure 1 is based on the following definition (Felfernig et al., 2006).

Definition 3 (Recommender Process): we define a Recommender Process to be a 6-tuple (Q, Σ, ∏, E, S, F), where

• Q = {q1, q2, ..., qj} is a finite set of states, where var(qi) = xi is a finite domain variable assigned to qi, prec(qi) = {φ1, φ2,..., φm} is the set of preconditions of qi (φk = {cr, cs, ..., ct} ⊆ ∏), postc(qi) = {ψ1, ψ2, ..., ψn} is the set of postconditions of qi (ψl= {cu, cv, ..., cw} ⊆ ∏), and dom(xi) = {xi=di1, xi=di2, ..., xi=dip } denotes the set of possible assignments of xi, i.e. the domain of xi.

• Σ = {xi = dij | xi = var(qi), (xi = dij ) ∈ dom(xi)} is a finite set of variable assignments (input symbols), the input alphabet.

• ∏ = {c1, c2, ..., cq} is a set of constraints (transition conditions) restricting the set of words accepted by the recommender process.

• E is a finite set of transitions ⊆ Q × ∏ × Q.

• S ⊆ Q is a set of start states.

• F ⊆ Q is a set of final states. □

A word w ∈ Σ* (i.e. a sequence of user inputs) is accepted by a recommender process if there is an accepting run of w in the process. Words accepted by the process definition are candidates for test cases (see the following section).

Customer Properties (VC):
/* level of expertise */ klc(expert, average, beginner)
/* willingness to take risks */ wrc(low, medium, high)
/* duration of investment */ idc(shortterm, mediumterm, longterm)
/* advisory wanted? */ awc(yes, no)
/* direct product search */ dsc(savings, bonds, stockfunds, singleshares)
/* type of low risk investment */ slc(savings, bonds)
/* availability of funds? */ avc(yes, no)
/* type of high risk investment */ shc(stockfunds, singleshares)

Product Properties (VPROD):
/* product name */ namep(text)
/* expected return rate */ erp(1..40)
/* risk rate of product */ rip(low, medium, high)
/* minimal investment period */ mnivp(1..14)
/* financial institute */ instp(text)

Recommender process (automaton with states q0 to q7; edges are labelled with the transition conditions c0 to c9):
var(q0)=wrc, var(q1)=klc, var(q2)=awc, var(q3)=idc, var(q4)=dsc, var(q5)=avc, var(q6)=shc, var(q7)=slc

Transition Conditions:
c0: true
c1: klc=beginner
c2: klc=expert ∨ klc=average
c3: awc=yes
c4: awc=no
c5: idc=shortterm
c6: idc≠shortterm
c7: klc≠beginner
c8: avc=no
c9: avc=yes

Constraints (CR):
CR1: wrc=high ⇒ idc≠shortterm
CR2: klc=beginner ⇒ wrc≠high
…

Filter Conditions (CF):
CF1: idc=shortterm ⇒ mnivp < 3
CF2: idc=mediumterm ⇒ mnivp >= 3 ∧ mnivp < 6
CF3: idc=longterm ⇒ mnivp >= 6
CF4: wrc=low ⇒ rip=low
CF5: wrc=medium ⇒ rip=medium ∨ rip=low
CF6: wrc=high ⇒ rip=high ∨ rip=medium ∨ rip=low
CF7: klc=beginner ⇒ rip≠high
CF8: slc=savings ⇒ namep=savings
CF9: slc=bonds ⇒ namep=bonds
…

Allowed instantiations of Product Properties (CPROD):
/* product 1 */ namep=savings ∧ erp=3 ∧ rip=low ∧ mnivp=1 ∧ instp=A ∨ …
/* product 2 */ namep=bonds ∧ erp=5 ∧ rip=medium ∧ mnivp=5 ∧ instp=B ∨ …
/* product 3 */ namep=equity ∧ erp=9 ∧ rip=high ∧ mnivp=10 ∧ instp=B

Figure 1: Example process definition and recommender knowledge base.
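The recommendation task of Definition 1 can be sketched in a few lines of Python over the knowledge base of Figure 1. This is a minimal brute-force illustration under stated assumptions, not the implementation used in the development environment: the names `PRODUCTS`, `implies`, `cr_holds`, `cf_holds` and `recommend` are hypothetical, only the three products listed in Figure 1 are included (the elided ones are left out), and customer properties without a stated requirement are treated as unconstrained.

```python
# Product catalogue (CPROD): only the three products shown in Figure 1.
PRODUCTS = [
    {"namep": "savings", "erp": 3, "rip": "low", "mnivp": 1, "instp": "A"},
    {"namep": "bonds", "erp": 5, "rip": "medium", "mnivp": 5, "instp": "B"},
    {"namep": "equity", "erp": 9, "rip": "high", "mnivp": 10, "instp": "B"},
]


def implies(premise, conclusion):
    """Logical implication premise => conclusion."""
    return (not premise) or conclusion


def cr_holds(c):
    """Compatibility constraints CR1 and CR2 on the customer requirements c."""
    return (
        implies(c.get("wrc") == "high", c.get("idc") != "shortterm")
        and implies(c.get("klc") == "beginner", c.get("wrc") != "high")
    )


def cf_holds(c, p):
    """Filter conditions CF1 to CF9 between requirements c and product p."""
    return all([
        implies(c.get("idc") == "shortterm", p["mnivp"] < 3),
        implies(c.get("idc") == "mediumterm", 3 <= p["mnivp"] < 6),
        implies(c.get("idc") == "longterm", p["mnivp"] >= 6),
        implies(c.get("wrc") == "low", p["rip"] == "low"),
        implies(c.get("wrc") == "medium", p["rip"] in ("medium", "low")),
        implies(c.get("wrc") == "high", p["rip"] in ("high", "medium", "low")),
        implies(c.get("klc") == "beginner", p["rip"] != "high"),
        implies(c.get("slc") == "savings", p["namep"] == "savings"),
        implies(c.get("slc") == "bonds", p["namep"] == "bonds"),
    ])


def recommend(cc):
    """Return all consistent recommendations for the customer requirements cc."""
    if not cr_holds(cc):
        return []
    return [{**cc, **p} for p in PRODUCTS if cf_holds(cc, p)]


# Customer requirements CC of Example 1.
cc = {"wrc": "low", "klc": "beginner", "idc": "shortterm", "slc": "savings"}
for solution in recommend(cc):
    print(solution)
```

For the requirements of Example 1 the sketch yields exactly one assignment, the savings product of Example 2. A production system would hand the same constraints to a constraint solver instead of enumerating the catalogue.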

Testing Recommender Knowledge Bases

Having completed the definition of a recommender knowledge base and the corresponding process, we want to check whether the results calculated by the recommender knowledge base are correct. For this purpose we support the specification of test cases which can be used to check the correct behaviour of a recommender knowledge base. Recommender process definitions serve as a basis for automatically generating test cases. Test cases can be generated by solving a Constraint Satisfaction Problem which is derived from specific paths of a recommender process.

Definition 4 (Path in Recommender Process): we denote a sequence p=[(q1,C1,q2), (q2,C2,q3), ..., (qi−1,Ci−1,qi)] ((qα, Cα, qβ) ∈ E) as a path in a recommender process connecting the states q1 and qi where q1 ∈ S and qi ∈ Q. Furthermore, pathvars(p)={var(q1), var(q2), …, var(qi−1), var(qi)} represents the customer properties of p and transitions(p)={C1, C2, …, Ci−1} represents the transition conditions of p. □
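Definition 4 maps directly onto a small data structure. The following Python sketch is illustrative only: the `Transition` type and the helper names are hypothetical, the state variables are those of Figure 1, q0 is assumed to be the only start state, and membership of each transition in E is not checked because the sketch does not encode the full edge set of the automaton.

```python
from typing import NamedTuple


class Transition(NamedTuple):
    source: str     # state the transition leaves
    condition: str  # name of the transition condition
    target: str     # state the transition enters


# var(q) for the states of Figure 1.
VAR = {
    "q0": "wrc", "q1": "klc", "q2": "awc", "q3": "idc",
    "q4": "dsc", "q5": "avc", "q6": "shc", "q7": "slc",
}
START_STATES = {"q0"}  # assumption: q0 is the only start state


def is_path(path):
    """A path starts in a start state and chains transitions end to start."""
    if not path or path[0].source not in START_STATES:
        return False
    return all(a.target == b.source for a, b in zip(path, path[1:]))


def pathvars(path):
    """Customer properties asked along the path, in order of the visited states."""
    states = [path[0].source] + [t.target for t in path]
    return [VAR[s] for s in states]


def transitions(path):
    """Names of the transition conditions along the path."""
    return [t.condition for t in path]


# A path through q0, q1, q3 and q7 of Figure 1.
p = [
    Transition("q0", "c0", "q1"),
    Transition("q1", "c1", "q3"),
    Transition("q3", "c5", "q7"),
]
print(is_path(p), pathvars(p), transitions(p))
# True ['wrc', 'klc', 'idc', 'slc'] ['c0', 'c1', 'c5']
```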

Example 3 (Path in Recommender Process): From the recommender process in Figure 1 we can derive e.g. the path p=[(q0,c0:true,q1), (q1,c1:klc=beginner,q3), (q3,c5:idc=shortterm,q7)], where pathvars(p)={wrc, klc, idc, slc} and transitions(p)={c0, c1, c5}. □

Since we are primarily interested in paths leading to a final state in the process definition, we introduce the concept of a consistent path.

Definition 5 (Consistent Path): we denote a path p=[(q1,C1,q2), (q2,C2,q3), ..., (qi−1,Ci−1,qi)] ((qα, Cα, qβ) ∈ E) as consistent iff ∪Cα is satisfiable. □

Example 4 (Consistent Path): Path p in Example 3 is consistent, since c0:true ∪ c1:klc=beginner ∪ c5:idc=shortterm is satisfiable. □

Now, we can calculate the set of possible input sequences for a consistent path p of a recommender process definition.

Definition 6 (Input Sequence for Path): We define an input sequence for a path p=[(q1,C1,q2), (q2,C2,q3), ..., (qi−1,Ci−1,qi)] ((qα, Cα, qβ) ∈ E) as s=[x1=d1j, x2=d2j, …, xi=dij], where xi=dij ∈ dom(xi) and ∪(xi=dij) is consistent with ∪Cα in p. □

Example 5 (Input Sequence for Path): For path p in Example 3 there exist 6 different input sequences:
s1=[wrc=low, klc=beginner, idc=shortterm, slc=savings]
s2=[wrc=low, klc=beginner, idc=shortterm, slc=bonds]
s3=[wrc=medium, klc=beginner, idc=shortterm, slc=savings]
s4=[wrc=medium, klc=beginner, idc=shortterm, slc=bonds]
s5=[wrc=high, klc=beginner, idc=shortterm, slc=savings]
s6=[wrc=high, klc=beginner, idc=shortterm, slc=bonds] □

For each of the input sequences calculated for a path in a recommender process, we can determine the corresponding results calculated by the knowledge base. One concrete combination of an input sequence and the corresponding result is denoted as a test case.

Definition 7 (Test Set): A set T = {t1, t2, …, tn} is denoted as a test set consisting of a set of test cases ti, where

• ti = [x1=d1j, x2=d2j, …, xi=dij, y1, y2, …, yn] if ∪(xi=dij) ∪ ∪(yi=dij) is consistent with the recommender knowledge base

• ti = [x1=d1j, x2=d2j, …, xi=dij] if ∪(xi=dij) ∪ ∪(yi=dij) is inconsistent with the recommender knowledge base.

In this case, xi represent customer properties and yi represent product properties (xi=dij ∈ dom(xi), yi=dij ∈ dom(yi)). □

Example 6 (Test Set): For the input sequences of Example 5 we can derive the following test set T = {
t1=[wrc=low, klc=beginner, idc=shortterm, slc=savings, namep=savings, erp=3, rip=low, mnivp=1, instp=A],
t2=[wrc=low, klc=beginner, idc=shortterm, slc=bonds],
t3=[wrc=medium, klc=beginner, idc=shortterm, slc=savings, namep=savings, erp=3, rip=low, mnivp=1, instp=A],
t4=[wrc=medium, klc=beginner, idc=shortterm, slc=bonds],
t5=[wrc=high, klc=beginner, idc=shortterm, slc=savings, namep=savings, erp=3, rip=low, mnivp=1, instp=A],
t6=[wrc=high, klc=beginner, idc=shortterm, slc=bonds]}. □

After being validated by a domain expert, test cases can be used for testing new versions of a recommender knowledge base. Note that the domain expert can define a test case


as positive (test case should be accepted by the recommender knowledge base) or negative (test case should be rejected by the recommender knowledge base). We denote a set of positive test cases as T+ and a set of negative test cases as T-. A recommender knowledge base is valid if all the positive test cases are consistent with the knowledge base and no negative ones are accepted by the knowledge base.

Definition 8 (Valid Recommender Knowledge Base): Given a recommender knowledge base KB=(VC, VPROD, CF ∪ CR ∪ CPROD) and corresponding sets T+ = {t1+, t2+, …, tn+} and T- = {t1-, t2-, …, tm-}, KB is valid iff there exists a solution for (VC, VPROD, CF ∪ CR ∪ CPROD ∪ ti+) ∀ti+ ∈ T+ and no solution exists for (VC, VPROD, CF ∪ CR ∪ CPROD ∪ ti-) ∀ti- ∈ T-. □

In situations where a recommender knowledge base becomes invalid, the task of the knowledge engineer is to identify faulty constraints in the knowledge base. This task can be solved by applying concepts of model-based diagnosis (Reiter 1987, Felfernig et al. 2004) to the automated debugging of recommender knowledge bases.
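Definitions 6 and 8 can be illustrated with a short Python sketch. It is a simplification under stated assumptions: `input_sequences`, `is_valid` and `has_solution` are hypothetical names, input sequences are found by plain enumeration of the variable domains of Figure 1, and `has_solution` is a stand-in for a real constraint solver that here knows only the filter condition CF4.

```python
from itertools import product

# Domains of the customer properties on the path of Example 3 (Figure 1).
DOMAINS = {
    "wrc": ["low", "medium", "high"],
    "klc": ["expert", "average", "beginner"],
    "idc": ["shortterm", "mediumterm", "longterm"],
    "slc": ["savings", "bonds"],
}


def input_sequences(path_vars, conditions):
    """All assignments of path_vars that satisfy every transition condition."""
    sequences = []
    for values in product(*(DOMAINS[v] for v in path_vars)):
        assignment = dict(zip(path_vars, values))
        if all(cond(assignment) for cond in conditions):
            sequences.append(assignment)
    return sequences


# Transition conditions c0, c1 and c5 of the path in Example 3.
conditions = [
    lambda a: True,                      # c0: true
    lambda a: a["klc"] == "beginner",    # c1: klc=beginner
    lambda a: a["idc"] == "shortterm",   # c5: idc=shortterm
]
sequences = input_sequences(["wrc", "klc", "idc", "slc"], conditions)
print(len(sequences))  # 6, as in Example 5


def is_valid(has_solution, positives, negatives):
    """Definition 8: every positive test case is solvable, no negative one is."""
    return (all(has_solution(t) for t in positives)
            and not any(has_solution(t) for t in negatives))


def has_solution(t):
    """Stand-in knowledge base containing only CF4: wrc=low => rip=low."""
    return not (t["wrc"] == "low" and t["rip"] != "low")


positives = [{"wrc": "low", "rip": "low"}]
negatives = [{"wrc": "low", "rip": "medium"}]
print(is_valid(has_solution, positives, negatives))  # True
```

Run as a regression test, `is_valid` would be evaluated against each new version of the knowledge base; a result of False signals that at least one validated test case no longer behaves as the domain expert specified.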

Experiences from Projects

The presented concepts for testing recommender knowledge bases have been primarily applied within the context of projects in the financial services domain (Felfernig et al. 2005). Our experiences show that domain experts are willing to accept the additional effort of inspecting test sets, since the quality of the recommendations is of serious concern, e.g. recommendations calculated for customers in the financial services domain must be correct in every case. The time domain experts have available for testing is limited, therefore mechanisms must be provided which reduce the number of tests. The complete set of possible input sequences for a knowledge base with 20 customer properties, each with a domain of cardinality 5, would comprise 5^20 (about 9.5 × 10^13) sequences. The validation of the corresponding test cases is definitely infeasible for a domain expert. Reducing the input space to 20 possible paths, each path defined by 7 variables with 5 possible values per variable, reduces the number of potential test cases to about 1.5 million (20 × 5^7), which is still infeasible. Therefore, we have to introduce additional constraints which allow a further refinement of the set of relevant test cases. Currently, the following restriction types are taken into account in our recommender application development environment (Felfernig 2005):

• Large variable domains can be split up into a corresponding set of equivalence classes which form a partition of the domain, i.e. a collection of mutually disjoint subsets whose union is the entire domain. The selection of variable assignments should be based on the concept of boundary value analysis, assuming that errors tend to occur near the extreme values of an input variable.

• Combinations of customer requirements which are inconsistent with the knowledge base can be omitted by certifying the corresponding incompatibility constraints as valid, e.g. the constraint “customers with an age above 55 must not receive a recommendation of a pension product”. If this constraint is certified, we can omit all input sequences with the corresponding assignment combination.

• In some situations the recommender system poses questions which do not have an effect on the final result (marketing questions, where no constraints are defined on the corresponding variable), e.g. when recommending pension products the customer can be asked to make a decision concerning the preferred mode of payment (one-off, regularly, want to decide later) which has no influence on the final result.

• Confronted with large variable domains and long recommender processes, random selection mechanisms are a potential means to significantly reduce the number of test cases. Different facets of random selection are possible, e.g. path selection or assignment selection (reduction of a variable domain using a statistical distribution).

By introducing such additional restrictions we can reduce the number of input sequences from 1.5 million to about 500-1000 (Felfernig 2005), which is accepted by domain experts.
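The size estimates above, and the effect of the first restriction type, can be checked with a short Python sketch. It is illustrative only: `boundary_values` is a hypothetical helper, and the equivalence classes are an assumed partition of the mnivp domain (1..14) of Figure 1 along the thresholds of the filter conditions CF1 to CF3.

```python
# Size of the unrestricted input space: 20 properties with 5 values each.
full_space = 5 ** 20
# 20 paths, each defined by 7 variables with 5 values per variable.
path_space = 20 * 5 ** 7
print(full_space)  # 95367431640625
print(path_space)  # 1562500


def boundary_values(classes):
    """Keep only the lower and upper bound of each equivalence class."""
    values = set()
    for low, high in classes:
        values.update((low, high))
    return sorted(values)


# Assumed partition of mnivp (1..14): below 3, from 3 to below 6, 6 and above.
classes = [(1, 2), (3, 5), (6, 14)]
print(boundary_values(classes))  # [1, 2, 3, 5, 6, 14]
```

In this example boundary value analysis cuts the 14 candidate values of one variable down to 6, and the reduction multiplies across all variables of a path.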

Conclusions

In this paper we have discussed basic concepts behind the development of knowledge-based recommender applications. We have shown how to apply testing techniques in order to support domain experts in the validation phase of a recommender application development project. Future work will include the development of abstraction concepts for the improved presentation of test cases to domain experts and mechanisms which increase the probability of detecting existing errors in a knowledge base.

Acknowledgements

The work presented in this paper has been done in the context of the project Koba4MS (Knowledge-based Advisors for Marketing and Sales) (Project Nr.: FFG-808479).

References

[1] L. Ardissono, A. Felfernig, G. Friedrich, D. Jannach, G. Petrone, R. Schaefer, M. Zanker. A Framework for the development of personalized, distributed web-based configuration systems. AI Magazine, 24(3):93–108, 2003.
[2] R. Burke. Knowledge-based Recommender Systems. Encyclopedia of Library & Information Systems, 69(32), 2000.
[3] A. Felfernig, G. Friedrich, D. Jannach, M. Stumptner, M. Zanker. Configuration knowledge representations for Semantic Web applications. AI Engineering Design, Analysis and Manufacturing Journal, 17:31–50, 2003.
[4] A. Felfernig, G. Friedrich, D. Jannach, M. Stumptner. Consistency-based Diagnosis of Configuration Knowledge Bases. AI Journal 2, 152, 213–234, 2004.
[5] A. Felfernig. Koba4MS: Selling Complex Products and Services Using Knowledge-Based Recommender Technologies. 7th IEEE International Conference on E-Commerce Technology, pp. 92–100, 2005.
[6] A. Felfernig, A. Kiener. Knowledge-based Interactive Selling of Financial Services with FSAdvisor. 17th Innovative Applications of Artificial Intelligence Conference (IAAI'05), Pittsburgh, Pennsylvania, AAAI Press, pp. 1475–1482, 2005.
[7] A. Felfernig, K. Shchekotykhin. Debugging User Interface Descriptions of Knowledge-based Recommender Applications. Proceedings of the ACM International Conference on Intelligent User Interfaces, Sydney, Australia, 2006 (to appear).
[8] A. Preece, S. Talbot, L. Vignollet. Evaluation of Verification Tools for Knowledge-Based Systems. International Jrnl. of Human-Computer Studies, 47, 629–658, 1997.
[9] A. Pretschner. Classical search strategies for test case generation with Constraint Logical Programming. Formal Approaches to Testing of Software, pp. 47–60, 2001.
[10] R. Reiter. A theory of diagnosis from first principles. AI Journal 23, 1, 57–95, 1987.
