Proceedings of 7th International Conference on Software Reuse (ICSR-7), Austin, TX, pp. 281-292, Apr. 15-19, 2002

An Empirical User Study of an Active Reuse Repository System

Yunwen Ye1,2

1 SRA Key Technology Laboratory, Inc., 3-12 Yotsuya, Shinjuku, Tokyo, 160-004, Japan
2 Department of Computer Science, University of Colorado, Boulder, CO 80309-430, USA
[email protected]

Abstract. This paper reports an empirical user study of an active reuse repository system. Instead of waiting passively for software developers to initiate the component location process with a well-defined reuse query, active reuse repository systems infer reuse queries from syntactic and semantic cues present in partially constructed programs in development environments, and proactively deliver components that match the inferred reuse queries. The reported empirical user study of an implemented active reuse repository system called CodeBroker shows that active repository systems promote reuse by motivating and enabling software developers to reuse components whose existence is not anticipated, and reducing the cost of reuse through the automation of the component location process.

1 Introduction

One factor that inhibits the widespread success of systematic software reuse is the problem of no attempt to reuse: software developers construct new systems from scratch rather than reusing existing software components from a reuse repository. According to previous studies [11], no attempt to reuse is the leading failure mode of software reuse. We are primarily concerned with the cognitive difficulties that prevent software developers from attempting to reuse: (1) unawareness of the existence of reusable components, and (2) the lack of means to locate the wanted components [6, 21].

If software developers do not anticipate that the repository contains a component that can be reused in their current development task, they will not attempt to reuse. An effective reuse repository often contains numerous components (for example, the Java 1.2 core library has more than 70 packages and 2100 classes), which makes it impossible for software developers to anticipate the existence of all the components. Previous empirical studies [7] conclude that most software developers can anticipate the existence of only a limited portion of the components included in a repository, and that they do not actively seek to reuse components whose existence they do not know of. This conclusion is corroborated by many reports [4, 5, 17] about reuse experience in companies.

Even if software developers are willing to reuse a component, they might not be able to do so if they perceive that reuse costs more than developing from scratch or if they are unable to locate the component. Browsing and querying have been the principal approaches to locating components. Browsing requires that users have a fairly good understanding of the structure of the reuse repository, and it does not scale. Querying requires that users be able to formulate a well-defined query that clearly states their information needs, which is cognitively challenging [8].

Active reuse repository systems that support information delivery mechanisms hold the potential to address the above two issues [20]. Most traditional reuse repository systems employ only information access mechanisms (browsing and querying), which require software developers to initiate the reuse process. Active reuse repository systems instead infer reuse queries from syntactic and semantic cues present in partially constructed programs in development environments, and proactively deliver components that match the inferred reuse queries.

An active reuse repository system called CodeBroker has been designed and developed. This paper reports an empirical user study of the CodeBroker system to show how active reuse repository systems promote reuse by encouraging and enabling software developers to reuse components whose existence is not anticipated, and by reducing the cost of reuse through the automation of the component location process.

2 The CodeBroker System

This section briefly describes the CodeBroker system, which supports Java developers by delivering task-relevant and personalized components, that is, components that can potentially be reused in the current development task and that are not yet known to the developers (see [20, 21] for details).

CodeBroker is seamlessly integrated with the programming environment Emacs. Figure 1 shows a screenshot and the architecture of the system, which consists of an interface agent and a backend search engine. Running continuously as a background process in Emacs, the interface agent infers and extracts reuse queries by analyzing the partially written program in the normal editing space of Emacs (Figure 1a). Inferred queries are passed to the search engine, which retrieves matching components from the repository. The interface agent first removes the retrieved components that are contained in discourse models and user models (see below for a brief discussion and [9] for details), and then presents the remaining components in the delivery buffer (Figure 1b). The reuse repository contains indexes that CodeBroker creates from the standard Java documentation generated by Javadoc from Java source programs, together with links to the Java documentation system.

CodeBroker delivers components whenever a doc comment or a signature definition is entered into the editing space. For example, in Figure 1a, the developer wants to create a random number between two integers and writes a doc comment to indicate so. As soon as the rightmost ‘/’ (signaling the end of a doc comment) is entered, the contents of the doc comment are extracted as a query, and components from the repository that match it are shown immediately in the delivery buffer. The similarity between a query and a component is determined by full-text retrieval techniques. CodeBroker supports both the Latent Semantic Analysis (LSA) technique [3] and the probabilistic model [16].
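The delivery cycle described in this section (extract the doc comment as a query, rank repository components against it, and drop components the developer already knows) can be illustrated with a minimal Python sketch. It is not CodeBroker's implementation. Plain cosine similarity over word counts stands in for the LSA and probabilistic retrieval models, a simple `known` set stands in for the discourse and user models, and signature matching is omitted. The function names, the three-entry repository, and its documentation strings (shortened paraphrases written for this sketch, not actual Javadoc text) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical miniature repository: component name -> documentation text.
# The texts are paraphrases invented for this sketch, not real Javadoc.
REPOSITORY = {
    "java.util.Random.nextInt":
        "Returns a random int value between zero and the given bound",
    "java.lang.Math.random":
        "Returns a random double value greater than or equal to zero "
        "and less than one",
    "java.lang.String.length":
        "Returns the length of this string",
}

# Partially written program, as it might look in the editing space.
EXAMPLE_SOURCE = (
    "/** Create a random number between two limits */\n"
    "public static int"
)


def extract_doc_comment(source):
    """Return the text of the last complete /** ... */ comment, or None."""
    matches = re.findall(r"/\*\*(.*?)\*/", source, flags=re.DOTALL)
    if not matches:
        return None
    # Strip the '*' that conventionally starts each doc comment line.
    lines = [re.sub(r"^\s*\*?\s?", "", line)
             for line in matches[-1].splitlines()]
    return " ".join(part.strip() for part in lines if part.strip())


def tokenize(text):
    """Lowercase bag of words; a stand-in for real full-text indexing."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def deliver(source, repository, known=frozenset(), limit=3):
    """Rank components against the doc comment, skipping known ones."""
    query = extract_doc_comment(source)
    if query is None:
        return []  # no complete doc comment yet, so nothing is delivered
    query_vector = tokenize(query)
    scored = [(cosine(query_vector, tokenize(doc)), name)
              for name, doc in repository.items()
              if name not in known]  # crude stand-in for the user model
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]
```

Calling `deliver(EXAMPLE_SOURCE, REPOSITORY)` ranks the two random-number entries and omits the unrelated string entry. Passing `known={"java.util.Random.nextInt"}` removes that component from the result, which mirrors in a very reduced form how already-known components are filtered out before delivery.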

Comment: Create a random number between two limits Signature: int