A new bibliometric approach to measure knowledge ...

Jan 1, 2007 - ... knowledge transfer, knowledge exchange, or knowledge management. As ...... Technology & Innovation Indicators (STI 2017), ESIEE, Paris.
A new bibliometric approach to measure knowledge transfer of internationally mobile scientists

Valeria Aman
German Centre for Higher Education Research and Science Studies (DZHW)
Schützenstraße 6a, 10117 Berlin
Phone: +49(0)30-2064177-40

Abstract

This study introduces a new bibliometric approach to study the effects of international scientific mobility on knowledge transfer. It is based on an analysis of internationally mobile and non-internationally mobile German scientists publishing in journals that are indexed in Scopus. Using bibliometric data such as co-authored articles, references and lexical abstract terms from the Scopus database, a method is presented that uses cosine similarity to measure the similarity of the knowledge bases of authors and their co-authors. This quantitative method is capable of revealing potential knowledge transfer between internationally mobile scientists and different types of co-authors. In addition, the Shannon index is used as a diversity measure to analyse the knowledge base of scientists. Analyses are presented for an overall 9-year publication period (2007 to 2015), split into a pre-mobility phase, a mobility phase and a post-mobility phase, each of which lasts for three years. Internationally mobile scientists are compared with non-internationally mobile scientists, and the potentials and limitations of the method presented are discussed. It is concluded that the proposed bibliometric approach is useful when applied on a large scale. International mobility proves to benefit the exchange of knowledge between scientists and various types of co-authors.

Introduction

There is an increased interest in understanding the mechanisms that facilitate the transfer of knowledge between scientists. Knowledge transfer issues have captured the attention of various academic disciplines such as health, education, social sciences and natural sciences. As a consequence, a variety of terms are used interchangeably, such as dissemination of knowledge, knowledge diffusion, knowledge transfer, knowledge exchange, or knowledge management. As in the majority of studies related to the topic, the term knowledge transfer is used throughout this paper and concepts of knowledge sharing and knowledge exchange are implicit because they are always in place when people interact. The internationalization of science systems makes international mobility relevant to dynamics of knowledge transfer and knowledge production. International mobility is supposed to facilitate the acquisition and recombination of knowledge that is located at distant places (Laudel, 2003; Gläser, 2006) and is indispensable when the knowledge to be transferred is tacit and thus requires personal contact, observation of colleagues, or informal meetings (Collins, 1974, 2001). It is therefore crucial to understand how international mobility influences the transfer of scientific knowledge across national borders. In this study, international mobility of scientists is understood as the physical transition between research organizations that are located in different countries of the world. Cross-border mobility is a means to gain access to international research groups (Ackers, 2008). Previous

literature related to geographic mobility and transfer of knowledge focused mainly on the size and direction of migratory flows, neglecting the mechanisms underlying knowledge transfer. In contrast, Ackers suggests that the “focus should shift towards the quality of flows and the nature of knowledge transfer processes” (2005, p.116). Moreover, there is no systematic body of empirical knowledge that allows examining the implications of mobility for the dynamics of the production and diffusion of knowledge (Cañibano et al., 2008). Although the circulation of knowledge at the transnational level is often desired and facilitated by organizational support or job change, the relationship between international mobility and knowledge circulation is relatively under-researched and the field of study suffers from a lack of empirical results. Most of the existing studies are speculations about the impact of international mobility on knowledge production and transfer, whereas published studies of true knowledge transfer are rare. However, Collins (1974) described a case where an innovative laser could be constructed by scientists who had seen the original set-up and interacted with the constructors, but could not be replicated by scientists who had only read about it and studied diagrams. We still know little about how international mobility contributes to knowledge transfer because we lack methods to measure knowledge transfer. Measuring the potential knowledge transfer of internationally mobile scientists is a complex endeavor that cannot be addressed exhaustively in this study. My research aims at contributing to a partial solution of how to indicate knowledge transfer between actors by using bibliometrics as a promising method. Apart from institutional evaluations and individual research assessments, bibliometrics can be used for measuring intellectual influences of scholarly activities. 
One of the first studies exploring bibliometric methods related to international mobility was published by Laudel (2003). International scientific mobility can be traced by looking at scientists’ affiliations through the years and their scientific output throughout their career. The main goal of this study is to test methods for measuring knowledge transfer of internationally mobile scientists. The basic assumption is that whenever scientists interact, knowledge transfer takes place. Working abroad in a research group may facilitate collaboration that is an opportunity for co-authorship and potential knowledge transfer. Whenever authors co-publish they communicate and interact, which fosters knowledge transfer. The intensity of knowledge transfer may depend on a common knowledge base. Assuming that co-authorship constitutes one opportunity where knowledge transfer takes place, the present study examines German scientists and their co-authors publishing between 2007 and 2015 in journals that are covered by Scopus. Internationally mobile German scientists are compared with non-internationally mobile scientists in the field of Chemistry, Medicine and Physics & Astronomy to account for epistemic characteristics. The data set used in this study allows for robust findings. I measure the similarity of the knowledge base of authors and different types of co-authors with the cosine similarity that was first introduced in information science by Salton and McGill (1983). In bibliometrics, first similarity measures were derived from citation-based links as well as from text-mining of article titles and abstracts (see Glänzel, 2012). In addition, I make use of the Shannon index, which is a diversity measure developed in information science to measure the information content of data sets (Shannon, 1948). In the context of international mobility the Shannon index indicates how the diversity of knowledge bases differs between scientists


being internationally mobile and those who are not. Moreover, I examine how the diversity of scientists’ knowledge bases changes over time, especially in relation to international mobility. The overall question of how international mobility of scientists contributes to knowledge transfer cannot be answered by this single publication. As a first step in this complex endeavor, I focus on co-authorship as the primary variable of research collaboration. Co-authorship can be understood as an opportunity for knowledge transfer, and the similarity of the knowledge bases of authors and co-authors functions as a proxy for knowledge transfer. The following theoretical part outlines how knowledge transfer, international mobility and research collaboration are related to each other. Subsequently, I explain the data and methods used and describe how knowledge transfer was operationalized in the underlying study. In the rest of the paper I present the results, a discussion, and an overview of the strengths and limitations of the method presented, and finish with a conclusion and outlook.

Theory

Knowledge transfer and international mobility

Knowledge production is a collective endeavor taking place on the micro-social level, the level of organizations and on the level of scientific communities (Gläser, 2003). Most scientific communities are international and have members in diverse countries working on a collective stock of knowledge. Every active member contributes to the common body of knowledge by publishing and referencing knowledge claims (Gläser & Laudel, 2001). The need to contribute to new knowledge may require international mobility as one mode to create scientific networks. Scientists are important carriers of knowledge and their international mobility represents a particular knowledge flow. Internationally mobile scientists inevitably carry knowledge with them that they can integrate, through participation in research processes, with the knowledge base of the research group they visit.
Previous literature reports that international mobility enables the dissemination of scientists’ knowledge to other places and encourages new combinations of knowledge (Laudel, 2003). The theoretical construct of knowledge recombination by Fleming (2001) suggests that mobility of scientists facilitates the mobility of knowledge, and that knowledge combined from different places is potentially more creative and more diverse than local knowledge. The specific accumulation of knowledge and its processing result in a permanent recombination of scientists’ knowledge base (Gläser, 2006). There are different types of knowledge that constitute the knowledge base of scientists. Gläser (2006) established a meaningful distinction of three knowledge types: formal knowledge that is published and can be accessed by the scientific community, informal knowledge that is communicated on demand, and tacit knowledge. Tacit knowledge refers to the highly personal knowledge that individuals find hard to express and that can only be transferred in face-to-face communication (ibid.). This knowledge type is embedded in the experiences and skills of the individual and is only revealed through its application, as long as individuals decide to share their tacit knowledge (Polanyi, 1962). Seminal empirical studies on the role of tacit knowledge were published by Collins (1974, 2001), arguing that the transfer of tacit knowledge can only be achieved through physical mobility in the form of an exchange of visits. Thus, some types of knowledge can only be transferred through international mobility that is tied to co-presence. Following Gläser’s (2006) distinction of knowledge, communication in

co-presence enables the transfer of the three different knowledge types in the following way. The transfer of knowledge about formal knowledge may be realized as the recommendation to read a specific publication. Informal knowledge could be transferred as an oral explanation of a certain method. Tacit knowledge during an international stay could be methodical know-how, i.e. the demonstration of a trick in an experiment that is nowhere described and only exists in a scientist’s mind (Gläser, 2006). The only knowledge type not restricted to co-presence is codified formal knowledge that can be transferred through e-mails, phones and real-time information technology. Although science invests considerably in the codification of knowledge in the form of scientific publications, a large part of the knowledge produced remains tacit or informal. These types of knowledge are only transferred through co-presence and interaction that require international mobility. The interaction of scientists includes not only the transfer of knowledge but also the expansion and modification of knowledge, so that knowledge transfer can be regarded as inherent to knowledge production.

Research collaboration as an opportunity for knowledge transfer

The production and transfer of knowledge is more and more a collective process. Collaboration between research groups is a dominant process in science that is necessarily tied to co-presence. Research collaboration can either require international mobility to enable interaction or result from international mobility. The growing trend towards scientific collaboration and an associated increase in co-publications in the last decades have been documented in previous studies (see e.g. Wagner & Leydesdorff, 2005). Research collaboration often starts informally at conferences or similar occasions where scientists meet and are stimulated to think about unsolved research problems, possible joint projects or the interpretation of results (Laudel, 2001).
Collaboration brings scientists together to observe, discuss and reflect on research work and to combine elements of existing knowledge. In research collaborations scientists can either be diametric in their research focus and complement each other’s work or be similar in their topical orientation and work on the same issue. According to Katz and Martin, one benefit of collaboration is “the sharing of knowledge, skills and techniques” (1997, p.14). Collaboration also ensures a division of labor and therefore a more efficient approach to a problem. Research collaboration that requires international mobility can be motivated by factors such as the desire to increase knowledge, to exchange data and skills and to advance a professional career (Luukkonen et al., 1993). Another rationale behind going abroad to collaborate with a leading research group in a field is that scientists want to increase their scientific visibility and recognition (Katz & Martin, 1997). The need to operate large-scale instruments may cause scientists to go abroad and to collaborate with foreign research groups. The urge to gain knowledge in rapidly developing research fields may also force scientists to work with international groups to make progress in a field. Results of an analysis of scientific collaboration and scientific mobility of Chinese researchers in the field of plant molecular life science suggest that international mobility emerges via scientific collaboration (Jonkers & Tijssen, 2008). Moreover, research collaboration is one way to transfer new knowledge - especially tacit knowledge that is not yet codified in publications. The physical proximity of scientists during


an international stay abroad can benefit the acquisition of skills and tacit knowledge and therefore guarantee the effective translation of contextualized knowledge. Research collaboration also involves the transfer of formal knowledge, e.g. hinting at certain publications to read and cite. In general, co-presence of scientists in research collaboration is inevitably connected to the transfer of any type of knowledge, because scientists are essential knowledge carriers that take their knowledge with them when they move from one organization to another (Gläser, 2006).

Data and methods

Data base

The data were retrieved from Scopus (Elsevier), which is licensed as custom data at the Competence Centre for Bibliometrics¹ and is integrated in an in-house database version. Scientists are identified by their author ID in Scopus. The author ID is supposed to combine all publications of an author under a single ID to handle common first and last names (Moed et al., 2013). In order to associate authors with their publication oeuvre, the algorithm uses a multifaceted approach where name spelling variants, affiliations, co-authors, subject areas and the prior publication history are taken into account.² Only the author ID enables large-scale analyses of international mobility, because it is imprecise to disentangle scientists simply on the basis of their first and last name. Moed et al. (2013) and Conchi and Michels (2014) published on the reliability of Scopus’ author ID to trace scientists’ mobility and found it acceptable if applied to a large data set so that certain flaws are counterbalanced. Kawashima and Tomizawa (2015) also evaluated the accuracy of Scopus’ author ID and reported that recall and precision are above 98%. A recent study of German scientists shows that the Scopus author ID produces good results (Aman, 2017). Internationally mobile scientists are specified as authors whose affiliation changes from one country to another.
A country relates to the geographic location of the institute at which scientists conducted and published their work. It is not the country of their nationality or official country of residence.

Data selection

The compilation of the data set is as follows. In a first step, all authors were selected who have at least one paper published with a German affiliation between 2007 and 2015 (a total of 677,195 scientists). Publications are limited to journal articles and journal reviews as well as to conference proceedings covered in Scopus. In a next step the publication count of these authors within the time frame 2007 to 2015 was determined. On the basis of these data “German authors” were defined as those who have the majority of their papers published with a German affiliation and who have published 3 to 200 publications (fractionally counted according to country

¹ Competence Centre for Bibliometrics. http://www.forschungsinfo.de/Bibliometrie/en/index.php?id=home
² The algorithm aims at higher levels of precision than recall. Thus, as soon as a publication cannot be assigned to an existing author ID, a new profile will be created under which the publication appears. The algorithm is supplemented by an author feedback system where an author can indicate whether publications are wrongly attributed to his profile.

codes).³ With this limitation I try to account for merged or split identities and exclude authors with few publications, because they hardly provide information on how authors move from one country to another. To study the effects of international mobility on knowledge transfer, I distinguish three phases: a pre-mobility phase that ranges from 2007 to 2009, a mobility phase of an undefined duration between 2010 and 2012 and a post-mobility phase that lasts from 2013 to 2015. The choice of periods is explained as follows: The most up-to-date publication year available at the time of data collection is 2015. To avoid artefacts of the Scopus database, it is deemed important not to reach back to the initial period when Scopus was launched. It is assumed that international mobility comprises an extended period of residence abroad of about two to three years and typically coincides with the post-doctoral phase. The duration of the international mobility phase is not defined, because a three-year research stay can lead to multiple publications in a single publication year so that - from a bibliometric point of view - it would seem as if a scientist has been abroad for only one year. Thus, the mobility episode (2010-2012) is characterized as a phase where at least two publications are affiliated to a non-German institution. To measure effects of international mobility on knowledge transfer, all of the German scientists under study have to publish in the phase prior to international mobility as well as after the mobility episode exclusively from German institutions. In addition, these authors have to have at least two publications in the pre-mobility phase (2007-2009) as well as in the post-mobility phase (2013-2015). All these restrictions apply to a small set of 602 German scientists. Co-authors under study are identified as those who have at least two joint publications with German scientists.
This guarantees that a co-author and a German author were interacting and not affiliated on the basis of a large collaboration. To exclude errors of homonyms (one author name relating to different persons – common name) and synonyms of co-authors (scientists having more than one author name – split identity), co-authors are limited to those who publish up to 200 journal and review publications as well as conference proceedings between 2007 and 2015. Co-authors are distinguished not only according to the pre-mobility, mobility and post-mobility phase but also according to their co-authorship type within each of these phases. To identify authors from the same German institution a specific institution coding was used that exists for all German institutions. Every institution with a German country code in Scopus is assigned a unique institution code (Winterhager et al., 2014). This institution coding allows identifying co-authors in the pre-mobility and post-mobility phase that are affiliated to the same institution as the German author under study. Thus, for the pre- and post-mobility phase the following co-author types can be distinguished:

- Co-authors from the same German institution
- Co-authors from another German institution
- Co-authors from abroad

³ Scientists under study are not German by nationality but by publishing mainly from German institutions in 2007 to 2015.

Co-authors from abroad encompass all authors who are affiliated to institutions other than those located in Germany. The identification of co-authors from the same institution is not possible for the mobility phase. Therefore, another reasonable partition of co-authors was chosen. For the mobility phase (2010-2012), I distinguish the following four co-author types:    

- Co-authors from the same German institution
- Co-authors from another German institution
- Co-authors from the country of international stay, thus co-authors affiliated to the country that equals the country of the international stay of the author under study
- Co-authors from another country than that of the international stay
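The co-author typing listed above can be illustrated with a minimal sketch. It is not the implementation used in the study: the function names, the country code "DEU", the representation of affiliations as (country code, institution code) pairs and the precedence of the checks (home institution first, then any German affiliation, then foreign countries) are assumptions made for illustration only.

```python
GERMANY = "DEU"  # hypothetical country code, chosen for this sketch


def phase_of(year):
    """Map a publication year to one of the three three-year phases."""
    if 2007 <= year <= 2009:
        return "pre-mobility"
    if 2010 <= year <= 2012:
        return "mobility"
    if 2013 <= year <= 2015:
        return "post-mobility"
    return None  # outside the period under study


def coauthor_type(phase, affiliations, home_institution, stay_country=None):
    """Classify a co-author within a phase.

    affiliations: iterable of (country_code, institution_code) pairs
    home_institution: institution code of the German author under study
    stay_country: country of the international stay (mobility phase only)
    """
    affiliations = list(affiliations)
    countries = {country for country, _ in affiliations}
    german_institutions = {
        inst for country, inst in affiliations if country == GERMANY
    }
    # Assumed precedence: the most specific category wins.
    if home_institution in german_institutions:
        return "same German institution"
    if GERMANY in countries:
        return "another German institution"
    if phase == "mobility":
        if stay_country in countries:
            return "country of international stay"
        return "another country"
    return "abroad"
```

For example, a co-author whose only affiliation is in the country of the stay would be typed as "country of international stay" during the mobility phase and simply as "abroad" in the other two phases.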

Each of the co-author types distinguished constitutes a disjoint group. This partitioning of co-authors into ten different types makes it possible to study the influence of co-authors on knowledge transfer. Co-authors are treated hierarchically in terms of the phases, i.e. they are associated with a phase depending on the publication year of the first detectable co-authored publication. The hierarchy of co-authors within a period is as follows: whenever co-authors are affiliated to the same German institution they are regarded as ‘Co-authors from the same German institution’. Whenever a German country code occurs, co-authors are treated as ‘Co-authors from another German institution’. The rest constitutes co-authors from abroad. Furthermore, Scopus’ All Science Journal Classification (ASJC) system was used. It distinguishes 334 minor fields and 27 major fields. Each of the 334 minor fields is assigned to a major field. Since the number of scientists under study is rather small, I decided to work with major fields. Scientists were attributed to their most frequently occurring major field (i.e. dominant major field). The following fields comprise a sufficient number of scientists for an appropriate analysis: Medicine (195), Physics & Astronomy (125), and Chemistry (49). For reasons of space and clarity I will mainly focus on Chemistry and take Medicine and Physics & Astronomy to exemplify specific results.

Control group

As a control group, I selected a random sample of 10,000 German scientists who have exclusively published with a German affiliation in the period 2007-2015. To measure effects of co-authorship, I also distinguish the three phases as described above (2007-2009, 2010-2012 and 2013-2015). The same restrictions on co-authors were set as for the group of German scientists who have been internationally mobile. Thus, the number of non-internationally mobile German scientists in the control group diminishes to 2,487.
Unlike for the authors who have been mobile it is neither possible to analyze co-authors from the country of international stay nor those co-authors from another country than that of the international stay. Hence, the partition of co-authors in the mobility phase is the same as in the pre- and post-mobility phase.

Operationalization of knowledge base

Apart from the methodological question of how knowledge transfer between scientists can be measured, another question of importance is whether international mobility has any impact on the knowledge base of a scientist. By definition, knowledge base is intangible and cannot be measured directly. What can be measured is rather the diversity of the knowledge regarded as

important enough to be codified in a publication than knowledge itself. The knowledge base of an author can be approached by the references or lexical terms used in publications. To answer the basic question of whether international mobility has any impact on the knowledge base of scientists, I compare the changes in the knowledge bases of internationally mobile authors with those of non-internationally mobile authors. Knowledge base on the basis of publications at the time t1 is defined as all publications cumulated from t0 until t1. A transition in knowledge base is expressed with the Shannon index - a measure of diversity. The Shannon index is calculated as follows:

$$I = -\sum_{i=1}^{n} p_i \ln(p_i),$$

where I is the Shannon index of the distribution of references or lexical terms across articles, n is the number of references or lexical terms and p_i is the share of the i-th reference or lexical term in this distribution. The higher the number of references or lexical terms used, the higher the Shannon index. Whereas t0 is constant and represents the publication date 01.01.2007, t1 varies and represents the end of year 2007 up to the end of 2015. The idea behind the cumulative view on knowledge base is that the knowledge base of a scientist increases with time. Publications that were once referenced constitute the knowledge stock and can be reused in future publications, because they are kept in mind or in a reference management system. Thus, the diversity of the knowledge base can either stagnate or grow with time. However, it can also diminish when scientists cite the same references year after year. Unlike the calculation of the Shannon index, which only considers the knowledge base of the scientists under study, the cosine similarity takes authors and their co-authors into account. Therefore, the following part discusses the appropriateness of co-authorship as a proxy for research collaboration.

Co-authorship as a proxy for research collaboration

In the underlying study, co-authorship is a first attempt to measure the impact of co-authors on the transfer of knowledge. Because of the accessibility of bibliometric data, co-authorship is continuously used as a proxy for research collaboration (Ponomariov & Boardman, 2016). When two or more authors are listed as co-authors on the same publication, it is plausible that they must have collaborated and interacted in some way (Laudel, 2002). Joint research publications reflect successful scientific collaboration and are likely to signify knowledge flows between scientists. The effective integration of internationally mobile scientists into research groups abroad can be documented by co-authored publications.
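As a brief aside on the Shannon index introduced under "Operationalization of knowledge base", the cumulative calculation from t0 to a varying t1 can be sketched as follows. This is an illustrative sketch rather than the study's code: the function names and the input layout (a dictionary mapping a publication year to the list of cited references, one entry per citing article) are assumptions, and p_i is taken here to be the share of the i-th reference among all reference occurrences cumulated so far.

```python
import math
from collections import Counter


def shannon_index(items):
    """Shannon index I = -sum(p_i * ln(p_i)) over the distinct items."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def cumulative_shannon(refs_by_year, start_year, end_year):
    """Shannon index of all references cumulated from start_year up to
    the end of each year, mirroring the fixed t0 and the varying t1."""
    pool = []
    result = {}
    for year in range(start_year, end_year + 1):
        pool.extend(refs_by_year.get(year, []))
        result[year] = shannon_index(pool)
    return result


# Hypothetical author: two distinct references in 2007, then the same
# reference cited twice in 2008.
example = cumulative_shannon(
    {2007: ["ref_a", "ref_b"], 2008: ["ref_a", "ref_a"]}, 2007, 2008
)
```

In this toy example the index falls from 2007 to 2008, because the second year only repeats a reference that was already in the knowledge stock, which corresponds to the case where diversity diminishes.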
The reason why research collaboration is measured through co-authorships is not only because bibliometric data is widely available but also because co-authorship is a true output of research collaboration. According to Laudel (2002), of all collaboration types those that involve a division of labor are assumed to be the most important types of collaboration because they include creative contributions from participating actors. Hence, this collaboration type can be measured accurately by co-authorships (ibid. p.13). However, research collaboration can take forms that vary from offering general advice and insights to active participation in research that becomes visible as co-authorship (Katz & Martin, 1997, p.3). One drawback of focusing on co-authors is that active participation can mean simply the provision of material. Another drawback is that co-authored publications

constitute a partial indicator of successful collaboration because not all joint research is published and not every co-author contributes equally to a publication (Jonkers & Tijssen, 2008). In most of the studies that use bibliometric data to analyze collaboration, one can find a caveat at the end of the paper saying that co-authorship does not necessarily entail research collaboration or that research collaboration often does not produce publications (see Ponomariov & Boardman, 2016). Due to the relatively large number of scientists in this study, co-authorship that is not associated with knowledge transfer (e.g. co-authors providing material only) should be too rare to distort the results.

Operationalization of knowledge transfer

The methodological question is how knowledge transfer can be measured bibliometrically. Successful collaborations result in co-authored publications that are available to the scientific community. These publications provide detailed information on the content (title, abstract, and keywords), influential work (references), scientists and organizations, grants, and the geographical location (affiliation). Scientific publications are not only the elementary output unit of research collaboration - they also constitute a crucial element in knowledge transfer mechanisms between scientists. Even though Gao et al. (2011) refer to patent co-inventorships, their central idea that the use of co-authors to describe knowledge transfer is meaningful can be transferred to scientific publications. It rests on the assumption that co-authors know each other well enough to effectively exchange knowledge relevant to the content of their jointly authored publications (ibid.). But what can be measured by co-authorships beyond the research output of joint collaborations? Based on the idea that co-authorship is a mechanism to transfer knowledge, it should be possible to measure the approximation in the knowledge bases of co-authors.
In this study both German authors and their co-authors function as sources and as receivers of knowledge. Whenever knowledge transfer takes place the similarity of the knowledge base of co-authors increases. However, an increasing similarity of the knowledge bases of authors and co-authors does not necessarily mean that knowledge transfer has taken place. Assuming that publications sharing the same references or lexical terms result from interaction or a common language, knowledge transfer is operationalized as the shared references and the shared abstract terms, respectively, of German scientists and their co-authors. Thus, the vectors of the number of articles by reference and the number of articles by abstract term are considered. The similarity of publications of German authors and their co-authors is computed on the basis of the references used and on the basis of lexical terms in abstracts. Again, t0 represents the publication date 01.01.2007 and t1 the end of a year. Whereas t0 is constant, t1 varies and represents the end of 2007, 2008, and so on. As an example, the vectors of the numbers of lexical terms per article in the respective publication profiles are considered. For two vectors A and B, the cosine of the angle α between them is computed using the following formula:

$$\cos(\alpha) = \frac{A \cdot B}{|A| \, |B|}.$$

Cosine similarity is defined as the cosine of the angle between two vectors. The closer the vectors, the more similar the authors are in their choice of lexical terms. Figure 1 illustrates this notion of similarity for the publication practice of two authors. The smaller the angle, the greater the similarity. In the calculation of the similarity between authors and their co-authors, co-authored publications were excluded from the co-authors' profiles and counted only as publications of the German authors.

Figure 1. Angle between publication profiles of two authors (Author 1 and Author 2; x-axis: no. of abstracts containing term A; y-axis: no. of abstracts containing term B; α marks the angle between the two profiles).
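The angle in Figure 1 can be reproduced with a few lines of code. The following is a minimal sketch, not the implementation used in the study: the function name `angle_degrees` and the two example profiles are hypothetical, and treating an empty profile as orthogonal (90 degrees) is an assumption made here for convenience.

```python
import math


def angle_degrees(a, b):
    """Angle in degrees between two count vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: an author without any terms shares nothing with anyone.
        return 90.0
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_alpha = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_alpha))


# Hypothetical profiles: (abstracts containing term A, abstracts containing term B).
author_1 = [4, 16]
author_2 = [12, 6]
alpha = angle_degrees(author_1, author_2)
```

For these made-up profiles the angle is a little under 50 degrees; two authors with no term in common would sit at 90 degrees, and two authors with proportional profiles at 0 degrees.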

To work with lexical terms, all terms were extracted from abstracts and their frequency of occurrence was determined. Instead of excluding common stop words, the specificity of terms was considered. The specificity was computed on the basis of the inverse document frequency (IDF), which weights the terms that occur. The IDF of a term t can be calculated as follows:

$$IDF_t = -\ln \frac{f_t}{N_D},$$

where f_t is the number of documents that contain the term t and N_D is the total number of documents. In this way, widely used terms are weighted lower than rarely occurring terms.

Results

In the following, the results are shown by way of example for the field of Chemistry. In the latter part of the results section, results for the fields of Medicine and Physics & Astronomy are illustrated and discussed. These three fields differ in terms of co-authorship patterns and the motivation to become internationally mobile. Epistemic practices are decisive for the propensity to become geographically mobile. International mobility in a field such as high-energy physics is at a high level of activity, mainly because of the necessity to share high-cost facilities. Thus, Physics and Astronomy is characterized by a latent mobility to large-scale facilities and the possibility to conduct experiments that are tied to certain laboratories. Chemistry is a field where international mobility is an integral part of an academic career. Chemists go abroad to work in a certain laboratory and to acquire specific methods. Medical researchers are motivated to become internationally mobile because they wish to specialize in a certain medical field, and the know-how they acquire abroad often acts as a boost for their career (Costigliola, 2011). Whereas in Chemistry the research groups working in a lab are rather small, Physics and Astronomy is characterized by large groups of scientists conducting experiments. Astronomy in particular is an international field where the majority of publications result from international collaboration. A closer look at the data shows that the average number of co-authors per publication of German scientists in 2007-2015 is 8.1 in Physics and Astronomy, followed by 3.0 in Chemistry and 2.5 in Medicine. It is therefore important to interpret results against the backdrop of field-specific characteristics.

In a first step of the analysis, I tested whether internationally mobile authors differ in their publication productivity from non-internationally mobile authors in the phase 2007-2009, i.e. prior to the mobility phase. The results for the field of Chemistry are illustrated in Figure 2.

Figure 2. Distribution of productivity of internationally mobile and non-internationally mobile authors for Chemistry in the period 2007-2009 (x-axis: number of publications in 2007-2009; y-axis: share of authors in %).

Results show that internationally mobile authors are not more productive in terms of publications than non-internationally mobile authors. The null hypothesis of the Kolmogorov-Smirnov test, namely that both groups are drawn from the same distribution, is not rejected (statistic 0.19, p-value 0.14). Thus, no significant difference is found between the distributions of publications per author in the group of internationally mobile authors and in the group of non-internationally mobile authors. Table 1 shows that the average number of publications in the period 2007-2009 is similar in the two groups.

Table 1. Overview of the statistical data for the field of Chemistry in the phase 2007-2009.

| Statistic | Value |
| --- | --- |
| No. of internationally mobile authors | 49 |
| No. of non-internationally mobile authors | 116 |
| Avg. no. of publications of mobile authors | 9.71 |
| Avg. no. of publications of non-mobile authors | 9.23 |
| Kolmogorov-Smirnov statistic | 0.19 |
| P-value | 0.14 |
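The two-sample Kolmogorov-Smirnov comparison can be sketched as follows. This is an illustration under stated assumptions, not the routine used in the study, which is not named: the function names are hypothetical, and the p-value uses the textbook asymptotic Kolmogorov series without any small-sample correction.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_1, sample_2):
    """Two-sample KS statistic D: largest gap between the empirical CDFs."""
    s1, s2 = sorted(sample_1), sorted(sample_2)
    return max(
        abs(bisect_right(s1, x) / len(s1) - bisect_right(s2, x) / len(s2))
        for x in s1 + s2
    )


def ks_p_value(d, n1, n2, terms=100):
    """Asymptotic two-sided p-value from the Kolmogorov distribution."""
    lam = math.sqrt(n1 * n2 / (n1 + n2)) * d
    if lam < 0.2:
        # For tiny lam the truncated series converges poorly; p is effectively 1.
        return 1.0
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * series))


# Group sizes and statistic as reported in Table 1 (Chemistry, 2007-2009).
p_table_1 = ks_p_value(0.19, 49, 116)
```

With the Table 1 values this approximation yields a p-value of roughly 0.17, which is close to the reported 0.14 and likewise above the conventional 0.05 threshold. A gap of this size is plausible given that the statistic is rounded to two decimals and that a different (exact or corrected) p-value routine may have been used.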


Knowledge base

Figure 3 provides an overview of the knowledge base of internationally mobile authors and non-internationally mobile authors. In the left diagram the knowledge base is computed on the basis of the references used by these two groups of authors. In the right diagram the knowledge base is computed on the basis of abstract terms. As a point of reference: assuming that all references were cited equally often (and are thus uniformly distributed), a Shannon index of 2.3 means that on average 10 different references were used; a Shannon index of 4.6 means that on average 100 different references were used. As explained in the method section, the Shannon index is treated as a cumulative measure, so that a Shannon index for the year 2011 considers all references cited in papers that were published from 01.01.2007 to 31.12.2011. To compute the Shannon index on the basis of abstract terms, the specificity of terms was considered, so that terms are weighted according to their frequency of occurrence. One can infer from Figure 3 that the knowledge base of both groups of scientists increases with time and does not stagnate. However, internationally mobile authors have a larger knowledge base than non-internationally mobile authors. In addition, the group of internationally mobile authors shows a steeper increase in the knowledge base within the mobility phase (2010-2012) than the group of non-internationally mobile authors.

Figure 3. Knowledge base of internationally mobile authors and non-internationally mobile authors in the field of Chemistry on the basis of references and abstract terms.
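The reference points quoted for the Shannon index (2.3 for 10 and 4.6 for 100 equally used references) follow from the natural-logarithm form of the index. The sketch below assumes that form and a cumulative count of how often each reference is used; the function names and the toy reference lists are hypothetical and only illustrate the calculation.

```python
import math
from collections import Counter


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over positive counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts)


def cumulative_shannon(refs_by_year):
    """Shannon index per year, accumulating all references since the first year."""
    seen = Counter()
    result = {}
    for year in sorted(refs_by_year):
        seen.update(refs_by_year[year])
        result[year] = shannon_index(seen.values())
    return result


# Uniform use of 10 and 100 references gives ln(10) and ln(100).
uniform_10 = shannon_index([1] * 10)
uniform_100 = shannon_index([1] * 100)

# Hypothetical cited references of one author in two consecutive years.
example = cumulative_shannon({2007: ["r1", "r2"], 2008: ["r2", "r3", "r4"]})
```

Rounded to one decimal, `uniform_10` and `uniform_100` are 2.3 and 4.6, matching the reference points in the text. In the toy example the index rises from 2007 to 2008 because new references enter the cumulative profile.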

Knowledge transfer on the basis of references

In the following figures the cosine similarity of authors and co-authors is computed for the vectors of the number of publications by reference. Internationally mobile German authors are presented on the left side, whereas the control group of non-internationally mobile authors is displayed on the right. Figure 4 shows the angles between authors and the different types of co-authors from the 2007-2009 period. The y-axis represents the angles between the vectors, which range from 0 to 90 degrees; 90 degrees represents orthogonality, meaning that authors and co-authors have not a single reference in common. The smaller the angle, the higher the similarity in the choice of references.

It is visible that authors and their co-authors from the 2007-2009 phase become more and more similar in their choice of references and that the increase in similarity flattens at some point. German authors are most similar to co-authors from the same German institution, independent of whether they have been abroad or not.

Figure 4. Reference similarity of internationally mobile and non-internationally mobile authors’ and co-authors’ knowledge base represented by angles in degrees for co-authors from the phase 2007-2009 distinguished by type of co-authorship in the field of Chemistry.
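The yearly angles plotted in figures of this kind can be sketched as a cumulative time series. The sketch below is an assumption-laden illustration rather than the study's code: the record layout (`id`, `year`, `refs`), the function names and the toy publications are hypothetical. It follows the description in the method section in that profiles accumulate from 2007 up to the end of each year and joint publications are counted only for the German author.

```python
import math
from collections import Counter


def reference_vector(pubs, last_year, skip_ids=frozenset()):
    """Number of articles per cited reference, for articles up to last_year."""
    counts = Counter()
    for pub in pubs:
        if pub["year"] <= last_year and pub["id"] not in skip_ids:
            counts.update(set(pub["refs"]))
    return counts


def angle_degrees(u, v):
    """Angle in degrees between two sparse count vectors (dict-like)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    if norm == 0:
        return 90.0  # assumption: an empty profile shares nothing
    return math.degrees(math.acos(min(1.0, dot / norm)))


def angle_series(author_pubs, coauthor_pubs, years):
    """Angle per year-end t1; joint papers count only for the author."""
    joint = {p["id"] for p in author_pubs} & {p["id"] for p in coauthor_pubs}
    return {
        year: angle_degrees(
            reference_vector(author_pubs, year),
            reference_vector(coauthor_pubs, year, joint),
        )
        for year in years
    }


# Hypothetical publication records; "j1" is a joint paper.
author = [
    {"id": "a1", "year": 2007, "refs": ["r1", "r2"]},
    {"id": "j1", "year": 2010, "refs": ["r3", "r4"]},
    {"id": "a2", "year": 2011, "refs": ["r3", "r5"]},
]
coauthor = [
    {"id": "c1", "year": 2007, "refs": ["r8", "r9"]},
    {"id": "j1", "year": 2010, "refs": ["r3", "r4"]},
    {"id": "c2", "year": 2011, "refs": ["r3", "r4"]},
]
series = angle_series(author, coauthor, [2007, 2011])
```

In this toy case the two profiles are orthogonal at the end of 2007 (no shared reference) and move closer by the end of 2011, once the co-author's own later paper cites references the German author also uses.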

A different observation can be made in Figure 5, where the cosine similarity of authors and co-authors is visualized for the mobility period. The first co-authored publication of German scientists and their co-authors appeared between 2010 and 2012. Before being internationally mobile, German authors had little in common with the co-authors they were working with during their international stay in 2010-2012. However, during their stay and afterwards the similarity of authors and co-authors from 2010-2012 grew steadily. It is remarkable that German authors who were internationally mobile show a steep increase in similarity to their co-authors from the country of their international stay in the phase 2010-2012. Prior to co-publishing in 2010-2012, co-authors from the country of international stay had little in common with German authors; the angle is constant at about 88 degrees. After the international mobility (2013-2015), authors are most similar to their co-authors from the same German institution with whom they co-published between 2010 and 2012. This finding is similar for the control group. Non-internationally mobile authors are most distant from co-authors from other German institutions in their choice of references.

Figure 5. Reference similarity of internationally mobile and non-internationally mobile authors’ and co-authors’ knowledge base represented by angles in degrees for co-authors from the phase 2010-2012 distinguished by type of co-authorship in the field of Chemistry.

Figure 6 reflects the similarity to co-authors from the post-mobility phase (2013-2015). The types of co-authorships are the same as in the pre-mobility phase. It is evident that co-authors who come into play between 2013 and 2015 have little in common with German authors in previous years (2007-2012). This could hint at the fact that an international stay causes a new research orientation, so that the co-authors whom scientists choose in the post-mobility phase (2013-2015) have little in common with the research work of the German scientists before going abroad and while being abroad. Furthermore, Figure 6 shows that an international stay may have influenced the topical orientation of German scientists in the sense that they are internationally oriented and more similar to co-authors from abroad than to German scientists in the post-mobility phase (2013-2015).

Figure 6. Reference similarity of internationally mobile and non-internationally mobile authors’ and co-authors’ knowledge base represented by angles in degrees for co-authors from the phase 2013-2015 distinguished by type of co-authorship in the field of Chemistry.

Co-authors from abroad are in fact more similar to German authors in the previous years, even before they co-published for the first time. This finding may also indicate that the same research focus in previous years led to joint publications in 2013-2015. In the control group, no difference is visible for the three different types of co-authors. The similarity increases steadily in the phase of co-publishing for all of the co-author types.

Knowledge transfer on the basis of abstract terms

In the following figures the angles between the vectors of the number of publications by abstract term of internationally mobile German authors (mobile group) and of those who were not mobile (control group) are displayed and discussed. It is evident in Figure 7 that for internationally mobile authors as well as non-internationally mobile authors, co-authors from the same German institution are the most similar ones. The similarity increases with every year and flattens slowly. The assumption that internationally mobile authors are more similar to co-authors from abroad than those who were not internationally mobile between 2010 and 2012 holds true.

Figure 7. Abstract term similarity of internationally mobile and non-internationally mobile authors’ and co-authors’ knowledge base represented by angles in degrees for co-authors from the phase 2007-2009 distinguished by type of co-authorship in the field of Chemistry.
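For abstract terms, the method section describes weighting terms by their inverse document frequency instead of removing stop words. How exactly the weights enter the term vectors is not spelled out, so the sketch below makes an explicit assumption: each term's article count is multiplied by its IDF before the angle is computed. The tokenizer, the function names and the four example abstracts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(abstract):
    """Set of lower-case word tokens in one abstract (naive tokenizer)."""
    return set(re.findall(r"[a-z]+", abstract.lower()))


def idf_weights(all_abstracts):
    """IDF_t = -ln(f_t / N_D) for every term in the corpus."""
    n_docs = len(all_abstracts)
    doc_freq = Counter()
    for abstract in all_abstracts:
        doc_freq.update(tokenize(abstract))
    return {term: -math.log(f / n_docs) for term, f in doc_freq.items()}


def weighted_profile(abstracts, idf):
    """Articles per term, scaled by IDF (assumed weighting scheme)."""
    counts = Counter()
    for abstract in abstracts:
        counts.update(tokenize(abstract))
    return {term: n * idf.get(term, 0.0) for term, n in counts.items()}


def angle_degrees(u, v):
    """Angle in degrees between two sparse weighted vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    if norm == 0:
        return 90.0  # assumption: an empty profile shares nothing
    return math.degrees(math.acos(min(1.0, dot / norm)))


# Hypothetical abstracts of an author and a co-author.
author_abstracts = [
    "the catalyst improves the reaction yield",
    "the ligand stabilizes the catalyst",
]
coauthor_abstracts = [
    "the polymer film degrades under the light",
    "the catalyst accelerates the polymer synthesis",
]
idf = idf_weights(author_abstracts + coauthor_abstracts)
angle = angle_degrees(
    weighted_profile(author_abstracts, idf),
    weighted_profile(coauthor_abstracts, idf),
)
```

In this toy corpus "the" occurs in every abstract and receives a weight of zero, so it no longer contributes to the similarity, while the shared but fairly common term "catalyst" contributes only weakly. This is the effect the IDF weighting is meant to achieve without a stop-word list.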

Figure 8 visualizes the cosine similarity of authors and co-authors from the mobility phase. Whereas in 2007 authors are equally similar to all three types of co-authors, the similarity changes over the years. From the beginning of the mobility phase, authors become increasingly similar to co-authors from abroad. The opposite is true for the control group. Co-authors from abroad that appear as co-authors initially in the phase 2010-2012 are the most dissimilar in 2013-2015. In terms of the remaining co-author types no clear difference is visible.

Figure 8. Abstract term similarity of internationally mobile and non-internationally mobile authors’ and co-authors’ knowledge base represented by angles in degrees for co-authors from the phase 2010-2012 distinguished by type of co-authorship in the field of Chemistry.

For co-authors of the post-mobility phase one can notice that the international stay may have an impact, in that co-authors from abroad who first appeared as co-authors in the 2013-2015 period are very similar to German authors (Figure 9). The convergence between 2007 and 2012 took place even before the first measurable co-authored publication. The control group shows a different picture: in general, all co-author types are less similar than in the group of mobile authors. Co-authors from abroad in particular appear quite distant in comparison to co-authors from Germany and those from the same institution.

Figure 9. Abstract term similarity of internationally mobile and non-internationally mobile authors’ and co-authors’ knowledge base represented by angles in degrees for co-authors from the phase 2013-2015 distinguished by type of co-authorship in the field of Chemistry.

In the final part of the results section, I focus on the abstract term similarity for the mobility phase in the fields of Medicine and Physics & Astronomy. Figure 10 shows that, in general, German authors who were internationally mobile are more similar to their co-authors than non-internationally mobile authors. As in the field of Chemistry, German authors share a larger knowledge base with co-authors from abroad.

Figure 10. Abstract term similarity of internationally mobile and non-internationally mobile authors’ and co-authors’ knowledge base represented by angles in degrees for co-authors from the phase 2010-2012 distinguished by type of co-authorship in the field of Medicine.

Finally, Figure 11 shows that, just as in Chemistry and Medicine, internationally mobile scientists become increasingly similar to all types of co-authors from the mobility phase. After the mobility phase (2013-2015), co-authors from the same German institution are the most dissimilar to German scientists who were internationally mobile between 2010 and 2012. Non-internationally mobile scientists are most similar to co-authors from abroad with whom they co-publish for the first time in the 2010-2012 phase. After the mobility phase these authors are most dissimilar to co-authors from the same German institution.

Figure 11. Abstract term similarity of internationally mobile and non-internationally mobile authors’ and co-authors’ knowledge base represented by angles in degrees for co-authors from the phase 2010-2012 distinguished by type of co-authorship in the field of Physics & Astronomy.

Discussion of results

This study aimed at testing a method to measure the effect of international mobility on knowledge transfer. Knowledge transfer was approached through co-authorships between internationally mobile and non-internationally mobile scientists, respectively, and different types of co-authors. Knowledge transfer was operationalized as the change in the knowledge base represented by references and abstract terms used by authors and co-authors. One main finding of the study is that the knowledge base of scientists grows with time, independent of whether they are internationally mobile or not. The Shannon index indicates that scientists cite additional references and use new terms with every year. However, scientists who are internationally mobile have a more diverse knowledge base in terms of references and abstract terms used than scientists who were not abroad. Results also show that similarity towards all types of co-authors grows with time. The similarity stagnates in the recent period for co-authors from the 2007-2009 phase. For co-authors who came into play in the two later periods (2010-2012; 2013-2015), the flattening of similarity is less evident, but it can be assumed that a similar stagnation may become visible in the future. Furthermore, results show that the similarity between internationally mobile scientists and certain types of co-authors grows to a greater extent than between others. This may imply that co-authors are often chosen on the basis of a common research orientation, so that a similarity in previous research work becomes evident even before a first co-published article. One could conclude that before a collaboration comes into being, the similarity between scientists has to rise and a common link is necessary.
Some of the results are unexpected, though: internationally mobile scientists show a convergence towards most of the co-author types distinguished, except for those co-authors who are in the country of international stay. This result would imply that co-authors abroad are not primarily chosen on the basis of a similar research orientation. However, the study revealed that the similarity of German authors to their co-authors from abroad increases immensely with international mobility. Evidently, scientific mobility is more than meetings between individuals because members of the host research group may become co-authors. When scientists work together, they can assess each other’s skills and adjust to each other, making the knowledge transfer more efficient. An increased similarity of the publication practice was measured especially in the three disciplines under study, in which research work is often conducted outside the national borders of a country and scientific findings result from interaction triggered by international mobility and collaboration.

Strengths and limitations of the method

The method presented proves to be useful for measuring the intellectual mobility of scientists. Unlike co-citation analysis, it is independent of the time dimension, e.g. the time that elapses before citations are received. Cosine similarity can be calculated on the basis of article titles, subject headings, or even full texts. The added value of the method presented arises from its universal applicability. Apart from the similarity between scientists and their co-authors, the method can also be used to detect emerging topics along a scientist’s trajectory by analyzing an author’s change of vocabulary over time. Time series can produce a trace of the changes in the intellectual structure of a scientist and the research group visited. Working with lexical terms is not trivial, and the occurrence of homonyms, synonyms and stop words has to be taken into account. Lexical terms are less formalized than citations, so that they tend to produce more false positives than citation links (Glänzel et al., 2017). Another downside of text-based methods is that they tend to overestimate the similarity between documents (Glänzel, 2012), especially when the same vocabulary is used in a local environment where knowledge transfer takes place.
Finally, there are confounding variables that affect the similarity measure, for instance general trends in the use of specific vocabulary among colleagues. A growing knowledge base does not necessarily imply a growing similarity. The longer scientists are in a research field, the more similar their knowledge bases may become. The bibliometric method presented does not aim at assessing the performance of individual scientists but rather the effects of international mobility at higher aggregation levels. The analysis of large datasets and the choice of a comparison group tend to cancel out randomly distributed errors. However, systematic errors are not accounted for, because certain limitations are inherent to bibliometric studies of international mobility. Bibliometric research allows tracking mobility only to the extent that scientists publish and that the affiliation is stated on the publication and is linked to them. Another factor influencing results is that a stay abroad does not necessarily result in a publication, and thus cannot be measured as a stay abroad. Moreover, the publication delay has to be kept in mind: a mobility episode that becomes evident in 2010-2012 on the basis of publications may result from earlier stays abroad. Another challenge to the bibliometric approach is that it is based on the use of unique author profiles that rely on the Scopus algorithm (Moed et al., 2013). I tried to minimize these limitations by restricting the period to recent years (2007-2015) and setting some additional restrictions on publication counts and co-authors. This study is a small contribution to a complex question, and there are fundamental challenges that hamper a proper approach. Scientists may be abroad and not publish any co-authored publication, and at the same time co-authorship does not mean that an intense exchange of knowledge has taken place. Another restriction of the approach in this study, apart from disentangling co-authorship and co-location, is that the similarity of the knowledge base is a necessary but not a sufficient condition for measuring knowledge transfer. However, a necessary but not sufficient condition is a step forward in the development of more advanced methods to measure knowledge transfer.

Conclusions and future research

The present study shows that the mere inflow and outflow rates of mobile scientists are insufficient for studying the impact of international mobility on knowledge aspects. The need to study the effects of international mobility on knowledge transfer poses new challenges for empirical research. Using a bibliometric approach that is based on data in scientific publications, I provide insight into the change of knowledge bases of internationally mobile scientists and non-internationally mobile scientists. What I measure is a specific change in the knowledge base, i.e. the increase of the diversity of the knowledge base of scientists. So far, diversity is the only approach to the problem. It is therefore used in addition to the cosine similarity that indicates the convergence of two scientists in terms of their knowledge base. Cosine similarity may still be regarded as a rough measure to account for a potential change in the knowledge base of two actors. However, the added value of the approach presented is that an increased similarity of scientists and their co-authors is capable of indicating potential knowledge transfer. This pilot study provides interesting outcomes in terms of the similarity of internationally mobile authors and non-internationally mobile authors towards certain co-author types. There is evidence to assume that scientists’ engagement in collaborative activity is a way in which knowledge is transferred (Zucker et al., 2007).
Whereas codified knowledge flows through research articles and scientific equipment, tacit knowledge is embodied in a person and is communicated through interaction when doing research together. The transfer of tacit knowledge in particular requires face-to-face communication between scientists, which makes international mobility an important mediator. Research collaboration is thus an important characteristic of international mobility that can shape knowledge transfer. That is why knowledge transfer was operationalized via co-authored publications, which result from formal as well as informal communication and interaction. With advanced bibliometric data it would be possible to distinguish the co-authors abroad more precisely. One can imagine delineating co-authors of the mobility phase on the basis of the same institution or at least the same city. However, one has to account for different spelling variants of one and the same institution or city, e.g. Antwerp and Antwerpen, Gent and Ghent, or Copenhagen and København. Moreover, the methodology presented is only able to show patterns of similarity on the basis of the publication practice of scientists and does not cover the specific content of the knowledge transferred. Future work could consider the lexical terms used in different sections of an article instead of analyzing the abstract as a whole. Further work should also test whether discipline-specific characteristics influence the robustness of the method presented. To conclude, it remains important to develop improved methods for a better understanding of the knowledge flows of mobile scientists and the added value that international mobility brings to the individual scientist as well as to research.
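The spelling-variant problem mentioned in this section could, in the simplest case, be approached with a lookup table that maps known variants to one canonical name. The sketch below is purely illustrative: the table contains only the three city pairs named in the text, the function name is hypothetical, and a real affiliation-matching step would need a much larger, curated mapping.

```python
# Known variants (case-folded) mapped to one canonical English name.
# Only the examples named in the text are included here.
CITY_VARIANTS = {
    "antwerp": "Antwerp",
    "antwerpen": "Antwerp",
    "gent": "Ghent",
    "ghent": "Ghent",
    "copenhagen": "Copenhagen",
    "københavn": "Copenhagen",
}


def canonical_city(name):
    """Return the canonical city name, or the cleaned input if unknown."""
    cleaned = name.strip()
    return CITY_VARIANTS.get(cleaned.casefold(), cleaned)


same_city = canonical_city("Gent") == canonical_city("GHENT")
```

Unknown names are passed through unchanged, so the mapping can be extended step by step without affecting affiliations that are already consistent.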

Acknowledgments

The present study is an extended version of an article presented at the 16th International Conference on Scientometrics and Informetrics, Wuhan (China), 16-20 October 2017. The study was funded by the Bundesministerium für Bildung und Forschung (BMBF) under the grant number 01PQ16002. The data builds on the bibliometric database provided by the Competence Centre for Bibliometrics (grant number: 01PQ17001). I would like to thank Jochen Gläser and Nicolai Netz for their valuable comments during the genesis of the paper. I would also like to thank two anonymous reviewers for their comments, which have helped to improve the paper substantially.

References

Ackers, L. (2005). Moving People and Knowledge: Scientific Mobility in the European Union. International Migration, 43(5), 99–129.
Ackers, L. (2008). Internationalisation, Mobility and Metrics: A New Form of Indirect Discrimination? Minerva, 46(4), 411–435. https://doi.org/10.1007/s11024-008-9110-2
Aman, V. (2017). Does the Scopus author ID suffice to track scientific international mobility? A case study based on Leibniz laureates. Presented at the 22nd Conference on Science, Technology & Innovation Indicators (STI 2017), ESIEE, Paris.
Aman, V. (2017). A new bibliometric approach to measure knowledge transfer of internationally mobile scientists. In: Proceedings of ISSI 2017 – The 16th International Conference on Scientometrics and Informetrics. Wuhan University, China, 1480–1490.
Cañibano, C., Otamendi, J., & Andújar, I. (2008). Measuring and assessing researcher mobility from CV analysis: the case of the Ramón y Cajal programme in Spain. Research Evaluation, 17(1), 17–31. https://doi.org/10.3152/095820208X292797
Collins, H. M. (1974). The TEA Set: Tacit Knowledge and Scientific Networks. Science Studies, 4, 165–186.
Collins, H. M. (2001). Tacit Knowledge, Trust and the Q of Sapphire. Social Studies of Science, 31(1), 71–85. https://doi.org/10.1177/030631201031001004
Conchi, S., & Michels, C. (2014). Scientific mobility: An analysis of Germany, Austria, France and Great Britain. Fraunhofer ISI Discussion Papers Innovation Systems and Policy Analysis. http://hdl.handle.net/10419/94371
Costigliola, V. (2011). Mobility of medical doctors in cross-border healthcare. The EPMA Journal, 2(4), 333. https://doi.org/10.1007/s13167-011-0133-7
Fleming, L. (2001). Recombinant Uncertainty in Technological Search. Management Science, 47(1), 117–132.
Gao, X., Guan, J., & Rousseau, R. (2011). Mapping collaborative knowledge production in China using patent co-inventorship. Scientometrics, 88(2), 343–362.
Glänzel, W. (2012). Bibliometric methods for detecting and analysing emerging research topics. El profesional de la Información, 21(2), 194–201.
Glänzel, W., Heeffer, S., & Thijs, B. (2017). Lexical analysis of scientific publications for nano-level scientometrics. Scientometrics, 111(3), 1897–1906. https://doi.org/10.1007/s11192-017-2336-8


Gläser, J. (2003). What Internet Use Does and Does not Change in Scientific Communities. Science Studies, 16(1), 38–51.
Gläser, J. (2006). Wissenschaftliche Produktionsgemeinschaften. Die soziale Ordnung der Forschung. Frankfurt/New York: Campus.
Gläser, J., & Laudel, G. (2001). Integrating Scientometric Indicators into Sociological Studies: Methodical and Methodological Problems. Scientometrics, 52(2), 414–434.
Jonkers, K., & Tijssen, R. (2008). Chinese researchers returning home: Impacts of international mobility on research collaboration and scientific productivity. Scientometrics, 77(2), 309–333.
Katz, J. S., & Martin, B. R. (1997). What is research collaboration? Research Policy, 26(1), 1–18. https://doi.org/10.1016/S0048-7333(96)00917-1
Kawashima, H., & Tomizawa, H. (2015). Accuracy evaluation of Scopus Author ID based on the largest funding database in Japan. Scientometrics, 103(3), 1061–1071. https://doi.org/10.1007/s11192-015-1580-z
Laudel, G. (2001). Collaboration, creativity and rewards: why and how scientists collaborate. International Journal of Technology Management, 22(7/8), 762–780.
Laudel, G. (2002). Collaboration and reward. What do we measure by co-authorships? Research Evaluation, 11(1), 3–15. https://doi.org/10.3152/147154402781776961
Laudel, G. (2003). Studying the brain drain: Can bibliometric methods help? Scientometrics, 57(2), 215–237. https://doi.org/10.1023/A:1024137718393
Luukkonen, T., Tijssen, R. J. W., Persson, O., & Sivertsen, G. (1993). The measurement of international scientific collaboration. Scientometrics, 28(1), 15–36. https://doi.org/10.1007/BF02016282
Moed, H. F., Aisati, M., & Plume, A. (2013). Studying scientific migration in Scopus. Scientometrics, 94, 929–942. https://doi.org/10.1007/s11192-012-0783-9
Polanyi, M. (1962). The Republic of Science: Its Political and Economic Theory. Minerva, 1, 54–73.
Ponomariov, B., & Boardman, C. (2016). What is co-authorship? Scientometrics, 109(3), 1939–1963.
Salton, G., & McGill, M. J. (1983). Introduction to Modern Information Retrieval. Auckland: McGraw-Hill. Retrieved from https://www.amazon.com/Introduction-InformationRetrieval-Computer-Science/dp/0070544840
Shannon, C. E. (2013). A Mathematical Theory of Communication. Bell System Technical Journal, 27(3), 379–423. https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
Wagner, C. S., & Leydesdorff, L. (2005). Network structure, self-organization, and the growth of international collaboration in science. Research Policy, 34, 1608–1618.
Winterhager, M., Schwechheimer, H., & Rimmert, C. (2014). Institutionenkodierung als Grundlage für bibliometrische Indikatoren. Bibliometrie - Praxis und Forschung, 3(14), 1–22.
Zucker, L. G., Darby, M. R., Furner, J., Liu, R. C., & Ma, H. (2007). Minerva unbound: Knowledge stocks, knowledge flows and new knowledge production. Research Policy, 36(6), 850–863. https://doi.org/10.1016/j.respol.2007.02.007