Inflectional morphology and syntax in correspondence

Inflectional morphology and syntax in correspondence: Evidence from European Portuguese
Ana R. Luís and Ryo Otoguro

University of Coimbra and Waseda University

Clitic pronouns in European Portuguese differ from clitics in other Romance languages in two important ways: (1) preverbal clitics can take wide scope over coordinated verb phrases and can be separated from the verb by (up to two) non-projecting particles (Crysmann 2002, Luís 2004); (2) the preverbal placement of clitics is dependent on a heterogeneous set of particles and phrases in preverbal position rather than on the finiteness of the verb. In this paper, we account for both (1) and (2) within the lexicalist theory of Lexical-Functional Grammar (LFG) (Kaplan and Bresnan 1982, Bresnan 2001). As to (1), we show that the scopal and distributional properties of proclitics can be straightforwardly captured by placing morphological tokens in correspondence with syntactic atoms. As to (2), we argue that the effect of proclitic triggers on clitic placement can receive a unified account if proclitic contexts are defined in terms of functional precedence (Kaplan and Zaenen 1989).

1. Introduction

Empirical evidence suggests that pronominal proclitics in European Portuguese (EP) exhibit a number of phrasal properties.1 First, their preverbal position is not dependent on the finiteness of the verb, but on a specific set of particles and phrases which must necessarily occur in preverbal position. Second, they can be separated from the verb by up to two particles and take wide scope over coordinated verb phrases. By contrast, in Italian or Spanish, both enclitics and proclitics must be immediately adjacent to the verb (Miller and Sag 1997, Monachesi 1999).

Footnote 1. We would like to thank the audiences at the International LFG Conferences in Canterbury (July 2004) and Bergen (July 2005), and at the 2nd York-Essex Morphology Meeting in Essex (November 2004). Special thanks go to the following colleagues for their helpful suggestions: Miriam Butt, Mary Dalrymple, Ron Kaplan, Tracy Holloway King, Gergana Popova, Louisa Sadler and Andrew Spencer. We also thank two anonymous reviewers for their constructive comments.

Luís (2004) and Luís and Spencer (2005) have shown that cliticisation in EP constitutes an inflectional phenomenon, despite the syntactic conditioning of proclitic placement and the syntactic transparency of the proclitic-verb sequence. Inflectional status is strongly supported by the internal properties of the pronominal cluster and by the nature of the verb-enclitic unit. A wide range of phonological, morphological, and syntactic diagnostics previously employed for other Romance languages indicate that object pronouns in EP should effectively be analysed as inflectional affixes. However, while enclitics behave like word-level affixes, proclitics attach as phrasal affixes (in the sense of Anderson (1992, 2005)).

In this paper we focus on the partly phrasal and partly morphological properties of proclitics and proclitic placement. The goal of this paper is therefore two-fold: (1) to examine the representation of phrasal affixes (i.e., proclitics) in EP and (2) to explore an account of the preverbal placement of pronominal clitics within the lexicalist framework of LFG.

With respect to (1), the syntactic transparency of the proclitic-verb sequence poses a serious challenge to lexicalist theories which disallow units smaller than the word from appearing on an independent phrase structure node. The core idea of our analysis therefore is to prohibit morphological strings from being inserted directly into the phrase structure, in harmony with standard lexicalist assumptions.
We introduce minor alterations to the classical mapping between words and phrase structure: rather than assuming a one-to-one correspondence between words and syntactic terminals, the correspondence between inflectional strings and c-structure nodes is mediated through a mapping function which allows one inflectional string to correspond to more than one syntactic terminal.

With respect to (2), we account for the effect of preverbal contexts on proclitic placement by applying f(unctional)-precedence to EP cliticisation. More precisely, we assume that the functional information contributed by each trigger f-precedes the information provided by the pronominal clitics. Our proposal captures the heterogeneous set of proclitic triggers by drawing on linear order and functional information, rather than on phrase structural positions.

The structure of our paper is as follows. In Section 2, we survey the inflectional properties of pronominal clitics and the heterogeneous nature of the proclitic contexts. We start by providing empirical evidence which shows that both preverbal and postverbal clitic clusters exhibit inflectional properties (2.1), but whereas enclitic clusters behave like word-level suffixes (2.2), proclitic clusters behave more like phrasal affixes (2.3). Section 3 provides the theoretical background within which we explore our LFG approach to phrasal affixation and proclitic placement. We begin with an overview of the correspondence between c(onstituent)-structure and f(unctional)-structure within LFG (3.1). This is followed by a discussion of the implications of the Principle of Lexical Integrity for the phrase structure representation of phrasal affixes (3.2). In Section 4, we offer an outline of our proposal. We start with a survey of the configurational properties of the EP phrase structure (4.1). We then explore a novel c-structure




representation of phrasal affixes, by formulating a mapping relation between morphological tokens and syntactic atoms (4.2). Having laid out the necessary LFG machinery, we examine each one of the proclitic contexts in terms of f(unctional)-precedence (4.3). Section 5 concludes our paper.

2. Overview of the data

In most Romance languages (e.g., Spanish, French, Italian), the alternation between the preverbal and postverbal placement of pronominal clitics is conditioned by the finiteness of the verb. Clitic placement in European Portuguese, however, is sensitive to words and phrases in preverbal position (Martins 1994). In the presence of such elements, pronominal clitics must occur preverbally. Compare the position of clitics in (1a) and (1b) below.

(1) a. O Pedro encontrou-os, porque os procurou.
       the Pedro found-3.pl.masc.acc, because 3.pl.masc.acc searched
       ‘Pedro found them, because he searched for them.’
    b. As professoras deram-lhes canetas, mas não lhes deram papel.
       the teachers gave-3.pl.dat pens, but not 3.pl.dat gave paper
       ‘The teachers gave them pens, but they didn’t give them paper.’

In (1), proclitic placement is determined by the clause-initial subordinating conjunction porque ‘because’ (cf. (1a)) and by the preverbal negation marker não ‘not’ (cf. (1b)). In the absence of such preverbal triggers, clitics appear postverbally in their default position. Proclisis is also triggered by other preverbal syntactic contexts: embedded clauses introduced by complementisers, as in (2a); relative pronouns, as in (2b); fronted focus phrases, as shown in (2c); operator-like adverbs, such as também ‘also’, até ‘even’ and já ‘already’, as in (2d); wh-phrases, as in (2e); and quantified subjects, as in (2f). As alluded to above, these words and phrases can only trigger proclisis if they occur in preverbal position. We return to each one of these proclitic contexts in Section 4.3.

(2) a. Eu sei que ele o encontrará.
       I know that he 3.sg.masc.acc will-find
       ‘I know that he will find it.’
    b. O polícia que te viu é meu tio.
       the policeman who 2.sg.acc saw is my uncle
       ‘The policeman who saw you is my uncle.’
    c. Deste livro me lembro bem.
       of-this book 1.sg.refl remember well
       ‘I remember this book well.’


    d. As crianças também o viram.
       the children also 3.sg.masc.acc saw
       ‘The children saw him, too.’
    e. Quantos presentes te ofereceram?
       how-many gifts 2.sg.dat gave
       ‘How many presents did they give you?’
    f. Todas as crianças nos disseram a verdade.
       all.pl.fem the children 1.pl.dat said the truth
       ‘All the children told us the truth.’
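The descriptive generalisation behind (1) and (2) can be illustrated with a minimal sketch. It assumes a hand-tagged list of the elements that precede the verb; the category labels, the set `PROCLISIS_TRIGGERS` and the function `clitic_placement` are hypothetical names introduced only for illustration. The sketch restates the surface pattern and is not the f-precedence analysis developed in Section 4.3.

```python
# Hypothetical category labels for the proclitic contexts listed in (1)-(2).
PROCLISIS_TRIGGERS = {
    "subordinator",        # porque 'because' (1a)
    "negation",            # não 'not' (1b)
    "complementiser",      # que 'that' (2a)
    "relative_pronoun",    # que 'who' (2b)
    "focus_phrase",        # deste livro 'of this book' (2c)
    "operator_adverb",     # também, até, já (2d)
    "wh_phrase",           # quantos presentes (2e)
    "quantified_subject",  # todas as crianças (2f)
}


def clitic_placement(preverbal_categories):
    """Return 'proclisis' if any trigger category precedes the verb.

    `preverbal_categories` lists the (hand-assigned) categories of the
    elements occurring before the verb in the clause. Without a preverbal
    trigger, the clitic surfaces in its default postverbal position.
    """
    if any(cat in PROCLISIS_TRIGGERS for cat in preverbal_categories):
        return "proclisis"
    return "enclisis"


# (1a) main clause: 'O Pedro encontrou-os' (only a plain subject precedes)
print(clitic_placement(["subject"]))
# (1a) subordinate clause: 'porque os procurou'
print(clitic_placement(["subordinator"]))
# (2d): 'As crianças também o viram'
print(clitic_placement(["subject", "operator_adverb"]))
```

Because the function only inspects material before the verb, a trigger in postverbal position has no effect, in line with the observation that these elements trigger proclisis only when they are preverbal.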

2.1 Inflectional properties

Luís (2004) shows that clitic sequences display a significant number of affix properties. Regardless of whether they occur preverbally or postverbally, clitic clusters exhibit morphophonological alternations such as fusion, syncretism, and cluster-internal allomorphy. We briefly illustrate each of these affix properties in turn.

Morphophonological fusion takes place between 3rd person accusative clitics and vowel-final dative clitics. In (3a) the 3rd person singular masculine accusative o is fused with the 1st person singular dative me, giving rise to the portmanteau cluster mo. In (3b) the same accusative clitic is fused with the 2nd person singular dative te, giving rise to the cluster to. Such portmanteau clusters surface both postverbally (cf. (3a)) and preverbally (cf. (3b)).

(3) a. Disse-mo. (*me-o)
       said-1.sg.dat-3.sg.masc.acc
       ‘S/he said it to me.’
    b. ... que to disse. (*te-o)
       ... that 2.sg.dat-3.sg.masc.acc said
       ‘... that s/he said it to you.’

Other portmanteau forms which result from the obligatory fusion between clitics are illustrated in Table 1.

Table 1. Portmanteau clusters in EP

                 3.sg.masc.acc (o)   3.sg.fem.acc (a)   3.pl.masc.acc (os)   3.pl.fem.acc (as)
1.sg.dat (me)    mo (*me-o)          ma (*me-a)         mos (*me-os)         mas (*me-as)
2.sg.dat (te)    to (*te-o)          ta (*te-a)         tos (*te-os)         tas (*te-as)




As to syncretism, we find syncretic clitic clusters when 3rd person dative clitics combine with 3rd person accusative clitics. This combination of clitics neutralises the number features of the dative clitics, as illustrated in (4), where the portmanteau cluster lho can mean either ‘it to him’ or ‘it to them’.

(4) a. Deu-lho. (*lhe-o)
       gave-3.sg/pl.dat-3.sg.masc.acc
       ‘S/he gave it to him/them.’
    b. ... que lho deu. (*lhe-o)
       ... that 3.sg/pl.dat-3.sg.masc.acc gave
       ‘... that s/he gave it to him/them.’

The complete set of syncretic forms found in EP is given in Table 2.

Table 2. Syncretism inside clitic clusters

                                   3.sg.masc.acc (o)   3.sg.fem.acc (a)   3.pl.masc.acc (os)   3.pl.fem.acc (as)
3.sg.dat (lhe) / 3.pl.dat (lhes)   lho                 lha                lhos                 lhas

Finally, object pronouns exhibit morphophonological alternations when 3rd person accusative pronouns such as o ‘him’, a ‘her’, os ‘them.MASC’ and as ‘them.FEM’ are preceded by a 1st or 2nd person plural dative pronoun. This kind of combination triggers ‘reciprocal’ allomorphy inside the cluster, given that both the dative and the accusative clitics exhibit shape alternations. Dative clitics undergo clitic-final consonant deletion, while accusative clitics surface as l-initial allomorphs, as shown below:

(5) a. Deu-no-lo. (*nos-o)
       gave-1.pl.dat-3.sg.masc.acc
       ‘S/he gave it to us.’
    b. ... que no-lo disse. (*nos-o)
       ... that 1.pl.dat-3.sg.masc.acc said
       ‘... that s/he said it to us.’

Table 3 illustrates the complete inventory of clitic clusters displaying ‘reciprocal’ allomorphy in EP.

Table 3. Morphophonological alternations inside the clitic cluster

            3.sg.masc.acc     3.sg.fem.acc      3.pl.masc.acc       3.pl.fem.acc
1.pl.dat    no-lo (*nos-o)    no-la (*nos-a)    no-los (*nos-os)    no-las (*nos-as)
2.pl.dat    vo-lo (*vos-o)    vo-la (*vos-a)    vo-los (*vos-os)    vo-las (*vos-as)

So far, we have shown that there is a wide range of idiosyncratic processes taking place inside clitic clusters which strongly suggest that EP pronominals should be viewed as inflectional affixes.
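The three cluster-internal patterns in Tables 1 to 3 can be summarised in a short sketch. It is only a restatement of the tabulated forms, not the realisation rules of Luís (2004); the feature keys (`"1sg"`, `"3sg.masc"`, and so on) and the function name `cluster` are assumptions made for this illustration.

```python
# Citation forms of the clitics that appear in Tables 1-3.
DATIVE = {"1sg": "me", "2sg": "te", "3sg": "lhe", "3pl": "lhes",
          "1pl": "nos", "2pl": "vos"}
ACCUSATIVE = {"3sg.masc": "o", "3sg.fem": "a",
              "3pl.masc": "os", "3pl.fem": "as"}


def cluster(dat, acc):
    """Return the dative + 3rd person accusative cluster form."""
    d, a = DATIVE[dat], ACCUSATIVE[acc]
    if dat in ("1sg", "2sg"):
        # Fusion (Table 1): me + o -> mo, te + as -> tas
        return d[0] + a
    if dat in ("3sg", "3pl"):
        # Syncretism (Table 2): lhe/lhes + o -> lho, number is neutralised
        return "lh" + a
    # 'Reciprocal' allomorphy (Table 3): the dative drops its final
    # consonant and the accusative surfaces as an l-initial allomorph.
    return d[:-1] + "-l" + a


print(cluster("1sg", "3sg.masc"))   # mo
print(cluster("3pl", "3sg.masc"))   # lho
print(cluster("1pl", "3sg.masc"))   # no-lo
```

The same function serves both the preverbal and the postverbal cluster, which mirrors the observation that these alternations are independent of clitic position.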

To capture the fact that enclitics and proclitics are formally and semantically identical, Luís (2004) develops an inflectional analysis which generates enclitics and proclitics through one and the same realisation rule. Within a revised version of Paradigm Function Morphology (Stump 2001), such realisation rules generate each clitic as an ‘ambifixal’ exponent, i.e., an affix that has the ability to attach either as a prefix or as a suffix (cf. Stump (1993)).

2.2 The enclitic-verb sequence

Enclitics, which must be immediately adjacent to the verb, interact with the verb in a number of morphophonologically complex ways. As this section will illustrate, they may (a) trigger allomorphy on the verb (cf. (6)), (b) undergo stem-induced allomorphy (cf. (8)) and (c) induce and undergo allomorphic variation within the same verb-enclitic sequence (cf. (10)).

Stem allomorphy takes place when 1st and 2nd person plural enclitic pronouns (i.e., nos ‘us’ and vos ‘you’) are preceded by a 1st person plural verb form, regardless of the tense or aspect properties of the verb. In this context the verb-final consonant is deleted:

(6) a. vêmo-nos (*vêmos-nos)
       see.1.pl.pres-1.pl.refl
       ‘we see us’
    b. davamo-vos (*davamos-vos)
       give.1.pl.perf-2.pl.dat
       ‘we give you’

Consonant deletion is entirely dependent on the person and number features of both the enclitic and the verb. If nos and vos are preceded by any other consonant-final verb form, as in (7), the verb-final consonant is not deleted.

(7) recebes-nos (*recebe-nos)
    receive.2.sg.pres-1.pl.acc
    ‘you receive us’




Enclitics also undergo morphophonological variation when 3rd person accusative clitics combine with 3rd person plural verb forms. In this context, accusative clitics, which by default are vowel-initial, surface as n-initial allomorphs. This generalisation applies to both lexical verbs (cf. (8a)) and auxiliaries (cf. (8b)) and is not sensitive to the tense value of the verb.

(8) a. lavam-no (*lavam-o)
       wash.3.pl.pres-3.sg.masc.acc
       ‘they wash him’
    b. tinham-nas visto (*tinham-as)
       had.3.pl.impf-3.pl.fem.acc seen
       ‘they had seen them’

By contrast, other nasal-final verbs (e.g., 3rd singular present indicative forms as in (9a) or 2nd singular imperative forms as in (9b)) select the vowel-initial clitic allomorph, showing that n-allomorphs are sensitive to the morphosyntactic properties of the preceding verb rather than to the phonological form of the verb.

(9) a. Os livros, o professor tem-os na pasta.
       the.pl books the.sg teacher has-3.pl.acc in.the briefcase
       ‘The books, the teacher has them inside his briefcase.’
    b. Põe-os na rua!
       put.2.sg.imper-3.pl.masc.acc in.the street
       ‘Throw them out!’

As shown in (10), 3rd person accusative clitics can also surface as l-initial allomorphs. These forms are selected when the preceding verb ends in one of the following consonants: -s, -z, -r.

(10) cantamo-lo (*cantamos-o)
     sing.1.pl.pres-3.sg.masc.acc
     ‘we sing it’

The l-allomorph is not attested across word boundaries. For example, vowel-initial words preceded by consonant-final words do not undergo this alternation, as shown in (11a), and definite articles that are phonologically similar to the 3rd person accusative clitics also fail to undergo the change, as in (11b).

(11) a. lápis azul
        ‘blue pencil’
     b. Tu compras o bolo
        ‘You buy the cake’

In addition, l-allomorphs trigger deletion of the verb-final consonant, as illustrated in (10). Consonant deletion, however, is blocked before an l-initial noun in (12a) and an l-initial adverb in (12b), indicating that the phenomenon does not apply across word boundaries.

(12) a. compramos luvas
        ‘we bought gloves’
     b. diz logo
        ‘speak later’

Finally, enclitic pronouns in EP also have the ability to interact with internal layers of affixation. As (13) illustrates, the cluster intervenes between the verb stem and the future/conditional agreement marker. Crucially, 3rd person accusative clitics undergo and induce the same kind of allomorphy:

(13) a. cantá-lo-ei (*cantar-o-ei)
        sing.inf-3.sg.masc.acc-1.sg.fut
        ‘I will sing it’
     b. escrevê-las-ás (*escrever-as-ás)
        write.inf-3.pl.fem.acc-2.sg.fut
        ‘you will write them’

The data surveyed in this section clearly show that EP enclitics interact morphophonologically with the verb in various ways, indicating that pronominal enclitics in EP constitute word-level suffixes.
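The verb-enclitic interactions in (6) to (10) can be collected in one small sketch. It assumes that the verb form is supplied together with its person/number value as a hypothetical tag such as `"1pl"`; the function name `attach_enclitic` is likewise an assumption. The mesoclitic forms in (13), which also involve a change in the stem vowel, are deliberately left out.

```python
THIRD_ACC = {"o", "a", "os", "as"}  # default vowel-initial accusative forms


def attach_enclitic(verb, person_number, clitic):
    """Attach a single enclitic to a finite verb form (sketch of (6)-(10))."""
    # (6): a 1pl verb drops its final -s before nos/vos; cf. (7), where a
    # 2sg verb keeps its final consonant.
    if clitic in ("nos", "vos") and person_number == "1pl" and verb.endswith("s"):
        return f"{verb[:-1]}-{clitic}"
    if clitic in THIRD_ACC:
        # (8): n-allomorph after 3pl verbs. The condition is morphosyntactic,
        # so other nasal-final forms as in (9) are unaffected.
        if person_number == "3pl":
            return f"{verb}-n{clitic}"
        # (10): l-allomorph after verb-final -s, -z, -r, which is deleted.
        if verb[-1] in "szr":
            return f"{verb[:-1]}-l{clitic}"
    return f"{verb}-{clitic}"


print(attach_enclitic("vêmos", "1pl", "nos"))    # vêmo-nos
print(attach_enclitic("recebes", "2sg", "nos"))  # recebes-nos
print(attach_enclitic("lavam", "3pl", "o"))      # lavam-no
print(attach_enclitic("tem", "3sg", "os"))       # tem-os
print(attach_enclitic("cantamos", "1pl", "o"))   # cantamo-lo
```

The function applies only to a verb and its enclitic, so sequences such as lápis azul in (11a) or compramos luvas in (12a) never come into its scope; this corresponds to the observation that the alternations do not apply across word boundaries.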

2.3 Proclitic puzzles

Whereas enclitics must always be immediately adjacent to the verb, proclitics can have wide scope over two conjoined verb phrases (cf. (14)) and be separated from the verb by intervening words (cf. (15)). Such scopal and distributional differences seem to indicate that, despite the morphological similarities illustrated in Section 2.1, proclitics do not form a cohering morphological unit with the verbal host. Illustrating the facts briefly, proclitics can take scope over a coordinated phrase:

(14) a. Apenas a minha mãe me [ajudou e incentivou].
        only the my mother 1.sg.acc [helped and encouraged]
        ‘Only my mother helped me and encouraged me.’
     b. Acho que lhes [leram uma história e deram um livro].
        think.1.sg that 3.pl.dat [have.read a story and have.given a book]
        ‘I think that they have read them a story and given them a book.’

In (14a), the proclitic me functions as the object of the coordinated verb phrase ajudou e incentivou ‘helped and encouraged’. Similarly, in (14b), the preverbal clitic lhes realises the dative argument of the conjoined verb phrase leram uma história e deram um livro. In the second example, both leram and deram have in addition a non-identical object complement: uma história is the complement of ‘read’ and um livro is the complement of ‘give’.

One further difference between enclitics and proclitics is that while enclitics must be adjacent to the host, lexical items are allowed to intervene between the proclitics and the verb.

(15) a. ... acho que ela lho ainda não disse.
        ... think that she 3.sg/pl.dat-3.sg.masc.acc yet not told
        ‘... I think that she hasn’t told it to him/her/them yet.’
     b. ... embora eu saiba que a já tens em grande dose.
        ... although I know that 3.sg.fem.acc already have in big dose
        ‘... although I know that you already have tons of it (= patience).’

In (15a), the proclitic is separated from the verb by two particles, ainda não ‘not yet’, and in (15b) by the particle já ‘already’. Although there are quite severe restrictions on the lexical items that can intervene between the proclitic cluster and the verbal host, the data clearly show that the proclitic-verb sequence is syntactically transparent.

To summarise so far, we appear to have a somewhat mixed picture concerning proclitics. On the one hand, there are a number of points of similarity with enclitics which support their affix status. On the other hand, the interpolation of what is clearly syntactic material between the proclitic and the verb constitutes an argument in favour of the syntactic attachment of the proclitics. So, what the data show is that the difference between enclitics and proclitics is not just a question of right/left linearisation of an affix to the verbal host.

Based on the evidence presented above, Luís (2004) argues that EP pronominal clitics must be allowed to attach either to the right edge of a verbal stem (in the case of enclisis) or to the left edge of a phrasal node (in the case of proclisis). The asymmetry between enclitics and proclitics is captured by treating enclitics as verbal suffixes and proclitics as phrasal affixes. This proposal elaborates on the well-known distinction between word-level affixation and phrasal affixation, formulated originally by Klavans (1985) and developed more recently by Anderson (1992, 2005), Spencer (2000) and Luís and Spencer (2005).

3. The framework

Morphology and configurational syntax constitute independent levels of linguistic structure in Lexical-Functional Grammar (LFG) (Kaplan and Bresnan 1982, Bresnan 2001, Falk 2001, Dalrymple 2001) and a strong division is assumed between word-internal structure and phrase structure. In this section, we start by surveying basic assumptions about the correspondence between c-structure and f-structure in LFG. We then discuss some of the challenges posed by EP proclitics to Lexical Integrity (Bresnan 2001) and to the one-to-one correspondence between inflected words and phrase-structure nodes.

3.1 Lexical-Functional Grammar

LFG is a non-derivational lexicalist theory with co-present parallel structures, linked by principles of correspondence. Each one of these structures has a different formal character and models a different aspect of the structure of the language (e.g., surface phrase structure, grammatical relations, semantic relations, among others). Each level of structure is therefore autonomous and obeys specific well-formedness conditions. Particularly relevant for our paper are two syntactic structures, c-structure (constituent-structure) and f-structure (functional-structure). The c-structure models the hierarchical relation (i.e., dominance) and the linear ordering (i.e., precedence) of both words and phrases, while the f-structure models grammatical relations and predicate argument structure. C-structures are represented by conventional phrase structure trees, as exemplified below:

(16) [IP [DP the robbers] [I′ [I were] [VP [V smashing] [DP the window] [Adv brutally]]]]

Whereas in transformational analyses phrase structures can dominate empty nodes, c-structures in LFG are strictly surface oriented. The Principle of Economy of Expression in (17) prohibits the theory from postulating otherwise unnecessary c-structure nodes. Therefore, a sentence such as the robbers smashed the window brutally cannot be assigned the representation in (18a), but must be represented as in (18b), where I′ dominates the VP without an I node.

(17) Economy of Expression: All syntactic phrase structure nodes are optional and are not used unless required by independent principles.




(18) a. *[IP [DP the robbers] [I′ [I e] [VP [V smashed] [DP the window] [Adv brutally]]]]
     b.  [IP [DP the robbers] [I′ [VP [V smashed] [DP the window] [Adv brutally]]]]

Transformational grammars postulate the I (or T) position regardless of whether there is phonologically overt material in this position to encode abstract functional information (such as tense and agreement). In LFG, however, functional information is represented at a different level of syntactic structure, namely the f-structure. Therefore, the c-structure functional categories are best viewed as phrase structure positions, given that no particular feature content is associated with them. The f-structure associated with the c-structure in (16) is given in (19):


(19)  [ pred    ‘smash 〈subj,obj〉’
        tense   past
        aspect  [ prog + ]
        subj    [ spec  [ pred ‘the’, def + ]
                  pred  ‘robber’
                  num   pl ]
        obj     [ spec  [ pred ‘the’, def + ]
                  pred  ‘window’
                  num   sg ]
        adj     { [ pred ‘brutally’ ] } ]

Formally, f-structures are represented as functions from attributes to values. As shown in (19), an f-structure contains a set of ordered pairs such as 〈tense, past〉 and 〈num, pl〉. The attributes are symbols which encode syntactic properties such as tense, aspect and num(ber) as well as grammatical functions (gfs) like subj(ect), obj(ect) and adj(unct). The values may be atomic (e.g., the value of tense) or they may themselves be complex (e.g., the value of subj, which is itself an f-structure). The other type of value appearing in the f-structure is called a semantic form. In (19), the value of pred (i.e., ‘smash 〈subj,obj〉’) comprises both the predicate name and a list of the governable gfs subcategorised for by the predicate.

F-structures can also be written down as a set of propositions where a function applies to an attribute and yields a value. The following parenthetic notation for functional application is used in LFG:

(20) (f a) = v iff 〈a, v〉 ∈ f, where f is an f-structure, a is an atomic symbol and v is a value.

To see how (20) works, let us consider the English sentence John cried loudly and its corresponding f-structure in (21):

(21)  f1: [ subj   f2: [ pred  ‘John’
                         pers  3
                         num   sg ]
            adj    { f3: “loudly” }
            pred   ‘cry 〈subj〉’
            tense  past ]

In (21), functions are given names, f1, f2 and f3 (the internal structure of f3 is ignored for expository purposes). Based on the principle of function application given in (20), the f-structure in (21) can be written as a set of equations:




(22)  (f1 subj) = f2
      (f2 pred) = ‘John’
      (f2 pers) = 3
      (f2 num) = sg
      (f1 pred) = ‘cry 〈subj〉’
      (f1 tense) = past
      f3 ∈ (f1 adj)
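As a minimal sketch, the f-structure in (21) can be modelled with nested Python dictionaries, so that the functional application in (20) becomes a dictionary lookup and each statement in (22) can be checked mechanically. The helper name `fapply`, the use of a list for the set-valued adj, and the placeholder content of `f3` are assumptions made for this illustration; semantic forms are simplified to plain strings.

```python
# f-structures as finite functions from attributes to values.
f2 = {"pred": "John", "pers": 3, "num": "sg"}
f3 = {"pred": "loudly"}  # placeholder: the internal structure of f3 is ignored in (21)
f1 = {"subj": f2, "adj": [f3], "pred": "cry<subj>", "tense": "past"}


def fapply(f, a):
    """(f a): return the value v such that the pair <a, v> is in f."""
    return f[a]


# The f-description in (22), one boolean per statement.
f_description = [
    fapply(f1, "subj") is f2,
    fapply(f2, "pred") == "John",
    fapply(f2, "pers") == 3,
    fapply(f2, "num") == "sg",
    fapply(f1, "pred") == "cry<subj>",
    fapply(f1, "tense") == "past",
    any(member is f3 for member in fapply(f1, "adj")),  # f3 ∈ (f1 adj)
]

print(all(f_description))                  # True
print(fapply(fapply(f1, "subj"), "num"))   # sg
```

The last line shows that applications can be nested: applying f1 to subj yields f2, which can in turn be applied to num.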

The first equation states that the pair comprising the atomic symbol subj and the f-structure f2 is a member of the f-structure f1, i.e., 〈subj, f2〉 ∈ f1. The following equations can be read in the same manner: (f2 num) = sg is equivalent to 〈num, sg〉 ∈ f2 and (f1 tense) = past is equivalent to 〈tense, past〉 ∈ f1. Note that the value of adj is a set of f-structures, which is expressed by curly brackets, so that f3 is a member of the value of adj in f1, i.e., f3 ∈ (f1 adj). A set of statements of this kind is called an f-description, and is used to specify lexical properties of words in the lexicon, as we will see below. So, because f-structures are sets of ordered pairs, they can be formulated as mathematical functions: (f1 tense) = past is equivalent to stating that f1 applies to the argument tense to yield the value past.

Since LFG has two parallel levels of syntactic structure, namely c-structure and f-structure, an important question is how to draw the correspondence between one structure and the other. As alluded to above, the co-present parallel structures are related to each other through principles of correspondence. In the case of the two syntactic structures, the correspondence is modelled by the function ϕ, which maps a c-structure node onto an f-structure, as defined in (23).

(23) ϕ: N → F, where N is the set of c-structure nodes and F is the set of f-structures.

(24) and (25) illustrate an important facet of LFG, namely the fact that the correspondence between c-structure and f-structure can be many-to-one, i.e., one f-structure can be assigned to more than one c-structure node.

(24)  c-structure:  [n1:A [n2:B] [n3:C [n4:D] [n5:E]]]

      f-structure:  f1: [ q  f2: [ s  t
                                   u  v ]
                          r  f3: [ w  x
                                   y  z ] ]

(25)  ϕ(n1) = ϕ(n3) = ϕ(n4) = f1
      ϕ(n2) = f2
      ϕ(n5) = f3

If we assign * to the current node, and refer to the mother node as M(*) (where M is a function that maps one node to its mother), the correspondences can be captured by annotating functional equations on c-structure nodes. For the c-structure in (24), the equations are annotated as in (26).


(26)  A
        B   (ϕ(M(*)) q) = ϕ(*)
        C   ϕ(M(*)) = ϕ(*)
          D   ϕ(M(*)) = ϕ(*)
          E   (ϕ(M(*)) r) = ϕ(*)

For expository reasons, the functional equations in (26) can also be substituted by the abbreviated annotations ↑ and ↓. These are defined as follows:

(27)  ↑ := ϕ(M(*))
      ↓ := ϕ(*)

Substituting the functional annotations given in (26) with the abbreviated notations ↑ and ↓ defined in (27), we produce the annotated c-structure shown below:

(28)  A
        B   (↑ q) = ↓
        C   ↑ = ↓
          D   ↑ = ↓
          E   (↑ r) = ↓
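The many-to-one character of ϕ can be made concrete with a small sketch that walks the annotated tree in (28) and assigns an f-structure to every node. The class `Node`, the function `build_phi` and the encoding of annotations (`"="` for ↑ = ↓, an attribute name such as `"q"` for (↑ q) = ↓) are assumptions introduced for illustration; atomic features such as 〈s, t〉, which would come from lexical material, are not modelled.

```python
class Node:
    """A c-structure node with an optional functional annotation."""

    def __init__(self, label, annotation=None, children=()):
        self.label = label
        # None for the root, "=" for up = down,
        # or an attribute name a for (up a) = down.
        self.annotation = annotation
        self.children = list(children)


def build_phi(root):
    """Return a dict mapping node labels to f-structures (dicts)."""
    phi = {}

    def visit(node, mother_f):
        if node.annotation is None:        # root: introduce a fresh f-structure
            f = {}
        elif node.annotation == "=":       # up = down: share the mother's f-structure
            f = mother_f
        else:                              # (up a) = down: value of a in the mother's f-structure
            f = mother_f.setdefault(node.annotation, {})
        phi[node.label] = f
        for child in node.children:
            visit(child, f)

    visit(root, None)
    return phi


# The annotated c-structure in (28).
tree = Node("A", None, [
    Node("B", "q"),
    Node("C", "=", [Node("D", "="), Node("E", "r")]),
])
phi = build_phi(tree)

print(phi["A"] is phi["C"] is phi["D"])   # True: three nodes, one f-structure
print(phi["A"]["q"] is phi["B"])          # True
print(phi["A"]["r"] is phi["E"])          # True
```

Object identity (`is`) is used on purpose: nodes annotated with ↑ = ↓ do not receive copies of the mother's f-structure but the very same object, which is what the equalities in (25) express.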

Returning to the sentence in (18b), the robbers smashed the window brutally, if we adopt the simplified c-structure annotations in (28), we obtain the following annotated c-structure:

(29)  IP
        DP   (↑ subj) = ↓
          the robbers
        I′   ↑ = ↓
          VP   ↑ = ↓
            V   ↑ = ↓
              smashed
            DP   (↑ obj) = ↓
              the window
            Adv   ↓ ∈ (↑ adj)
              brutally




In (29), the equation ↑ = ↓ on I′, VP and V ensures that these nodes are mapped onto the same f-structure as IP. The equation (↑ subj) = ↓ on the DP node indicates that the subject DP corresponds to the value of subj. Similarly, the equation (↑ obj) = ↓ states that the DP the window is mapped onto the value of obj. And ↓ ∈ (↑ adj) indicates that the adverb brutally functions as an adjunct.

The type of correspondence between c-structure and f-structure assigned to a given phrase structure configuration is language specific. However, despite such variation, attempts have been made at formulating general mapping principles. Bresnan (2001: 102–3) proposes the following generalisations:

(30) a. C-structure heads are f-structure heads.
     b. Specifiers of functional categories are the grammatical discourse functions df.
     c. Complements of functional categories are f-structure co-heads.
     d. Complements of lexical categories are the nondiscourse argument functions cf.
     e. Constituents adjoined to phrasal constituents are nonargument functions af or not annotated.

We briefly explain the mapping principles outlined in (30). When a c-structure node is mapped onto the same f-structure as its mother node, that node is called an f-structure head. So, according to (30a, c), both a c-structure head and a complement of a functional category are given ↑ = ↓. Therefore, I′, I, VP and V are all annotated as ↑ = ↓. The discourse function df comprises the subj, the topic and the focus. Therefore, following (30b), the subject DP in (29) is given (↑ subj) = ↓. (30d) constrains the annotation on complements of lexical categories. In (29), the DP the window is defined by this constraint and given the argument function (↑ obj) = ↓. Finally, (30e) assigns a non-argument functional annotation to a node that is adjoined to a phrasal constituent. This explains why ↓ ∈ (↑ adj) is assigned to the Adv node in (29).

An important aspect of LFG is its commitment to lexicalism.
The Lexical Integrity Principle given in (31) defines the lexical and syntactic components as being subject to different well-formedness principles. Words are constructed in the morphology and are inserted into the c-structure as fully inflected words. Syntactic processes therefore cannot manipulate the internal morphological structure of such words.

(31) Morphologically complete words are leaves of the c-structure tree and each leaf corresponds to one and only one c-structure node (Bresnan and Mchombo 1995, Bresnan 2001).

Some examples of lexical entries are given in (32). Each lexical entry comprises three elements: a lexical form, a category label and f-descriptions.2 The f-descriptions provide the feature content for the f-structures and therefore play a crucial role in establishing the correspondence between c-structure and f-structure.

Footnote 2. There are alternative approaches to inflectional morphology within LFG. In Finite-State Morphology, for example, lexical form is built through the combination of stems and abstract morphological feature tags (see Butt et al. (1999)).

(32) a. the      D  (↑ pred) = ‘the’
                    (↑ def) = +
     b. robbers  N  (↑ pred) = ‘robber’
                    (↑ num) = pl
     c. smashed  V  (↑ pred) = ‘smash 〈subj,obj〉’
                    (↑ tense) = past
     d. window   N  (↑ pred) = ‘window’
                    (↑ num) = sg

The f-descriptions contained in the lexical entries flow into the f-structure by means of the ↑, which maps the feature contents onto the f-structure of specific pre-terminal nodes (i.e., N, V, D, etc.). (33) shows an annotated c-structure with fully specified lexical entries, and (34) illustrates the f-structure that results from the combination of the annotations on the c-structure nodes and the f-descriptions on the lexical items.

(33)  IP
        DP   (↑ subj) = ↓
          D   (↑ spec) = ↓
            the   (↑ pred) = ‘the’, (↑ def) = +
          NP   ↑ = ↓
            N   ↑ = ↓
              robbers   (↑ pred) = ‘robber’, (↑ num) = pl, (↑ pers) = 3
        I′   ↑ = ↓
          VP   ↑ = ↓
            V   ↑ = ↓
              smashed   (↑ pred) = ‘smash 〈subj,obj〉’, (↑ tense) = past
            DP   (↑ obj) = ↓
              D   (↑ spec) = ↓
                the   (↑ pred) = ‘the’, (↑ def) = +
              NP   ↑ = ↓
                N   ↑ = ↓
                  window   (↑ pred) = ‘window’, (↑ num) = sg
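How the lexical f-descriptions in (32) and the annotations in (33) combine into the f-structure in (34) can be sketched as a simple unification over nested dictionaries. This is a simplified illustration under stated assumptions: the names `LEXICON`, `unify` and `dp` are hypothetical, semantic forms are treated as plain strings (their unique instantiation in LFG is not modelled), and the tree is hard-coded rather than parsed.

```python
import copy

# Lexical entries from (32), reduced to their f-descriptions.
LEXICON = {
    "the": {"pred": "the", "def": "+"},
    "robbers": {"pred": "robber", "num": "pl"},
    "smashed": {"pred": "smash<subj,obj>", "tense": "past"},
    "window": {"pred": "window", "num": "sg"},
}


def unify(target, source):
    """Merge `source` into `target` in place; raise on a feature clash."""
    for attr, value in source.items():
        if attr not in target:
            target[attr] = copy.deepcopy(value)
        elif isinstance(target[attr], dict) and isinstance(value, dict):
            unify(target[attr], value)
        elif target[attr] != value:
            raise ValueError(f"clash on {attr}: {target[attr]!r} vs {value!r}")
    return target


def dp(det, noun):
    """f-structure of a DP: D is annotated (up spec) = down, NP and N up = down."""
    f = {}
    unify(f, {"spec": LEXICON[det]})   # (↑ spec) = ↓ on D
    unify(f, LEXICON[noun])            # ↑ = ↓ on NP and N
    return f


# IP: DP is (↑ subj) = ↓; I′, VP and V are ↑ = ↓; the object DP is (↑ obj) = ↓.
sentence = {}
unify(sentence, {"subj": dp("the", "robbers")})
unify(sentence, LEXICON["smashed"])
unify(sentence, {"obj": dp("the", "window")})

print(sentence["subj"])
print(sentence["pred"], sentence["tense"])
```

Nodes annotated ↑ = ↓ contribute their features directly to the mother's f-structure, which is why the verb's pred and tense end up at the outermost level, while the two DPs are embedded under subj and obj. A clash such as unifying num sg with num pl raises an error, corresponding to an inconsistent f-description.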




(34)  [ pred   ‘smash 〈subj,obj〉’
        tense  past
        subj   [ spec  [ pred ‘the’, def + ]
                 pred  ‘robber’
                 num   pl ]
        obj    [ spec  [ pred ‘the’, def + ]
                 pred  ‘window’
                 num   sg ] ]

3.2 Issues

LFG treats morphology and syntax as independent levels of linguistic structure. A strong division exists between word-internal structure, on the one hand, and the structure between words, on the other, based on the underlying conviction that word-formation cannot take place in the syntax. In a lexicalist theory of grammar, the role of the morphology is to carry out morphological operations, such as combining roots and affixes and changing stem forms, among others, and to create fully inflected words. In LFG, such morphological operations are distinct from the syntactic ones, in harmony with the principle of Lexical Integrity in (31). According to this principle, at the level of c-structure, each terminal node can only be instantiated by one morphologically complete word.

The EP clitic system constitutes a challenge to the strict separation between morphology and syntax assumed in LFG. In morphological terms, as shown in Section 2, both enclitics and proclitics in EP must be treated as inflectional affixes. In effect, the strong resemblance between enclitic clusters and proclitic clusters can only be insightfully captured if both preverbal and postverbal clitic sequences are generated through the same inflectional mechanisms. In addition, we have seen that while enclitics attach to the verb like word-level suffixes, proclitics are best viewed as phrasal affixes.

At the level of c-structure, the EP enclitic must be dominated by the same node that dominates the verb, as shown in (35). This c-structure is in conformity with lexical integrity in so far as the verb-enclitic combination is dominated by one and only one c-structure node (Bresnan and Mchombo 1987).

(35)  VP
        V   ↑ = ↓
          vêem-nos   (↑ obj pred) = ‘pro’


While it is uncontroversial that proclitics contribute the same functional information as enclitics, proclitics cannot receive the c-structure representation in (35). As alluded to before, the problem posed by pronominal proclitics in EP arises from the fact that they exhibit both inflectional and syntactic properties: they exhibit a wide range of inflectional properties like their enclitic counterparts, but they also exhibit scopal and distributional properties which indicate that they are not morphologically attached to the verbal host. In phrase structural terms, their syntactic transparency suggests that proclitics should be represented as independent nodes. However, under the traditional view of Lexical Integrity, affixes (or parts of words) are not allowed to be represented as leaves of the c-structure tree. In the following section, we capture the dual properties of EP pronominal proclitics by exploring a new approach to wordhood within LFG.

4. Analysis

The properties of proclitics and proclitic triggers in EP can be summarised as follows: (a) in preverbal position, clitic affixes select a phrasal host and therefore behave like phrasal inflections; (b) proclitic triggers always precede the finite verb; (c) the position of proclitic triggers cannot be reduced to one single phrase structure position; and (d) the triggers constitute a heterogeneous group of elements which contribute a wide range of information to f-structure. Any account of EP proclisis must be able to capture these four points (see Luís (2004) for a survey of previous analyses of proclisis in EP).

4.1 EP phrase structure

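We start our analysis of EP clitic placement by laying out basic assumptions about the EP phrase structure. The schematic c-structure for EP comprises the lexical projection VP and the functional projections IP and CP, as given in (36).

(36) [Schematic c-structure tree: CP dominates XP (Spec-CP) and C′; C′ dominates C and IP; IP dominates NP/DP (Spec-IP) and I′; I′ hosts the adjoined Adv/Neg and Adv positions and dominates I and VP; VP dominates V and NP/DP.]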




Briefly, we assume that finite verbs/auxiliaries are base-generated in I or C, whereas non-finite verbs are generated in V (cf. Kroeger (1993), King (1995), Bresnan (2001)). Adverbs are left-/right-adjoined to I′, and negations are treated as a type of adj (Sells 2001). Spec-IP is the position for the subject NP/DP, annotated as (↑ subj) = ↓. Spec-CP is the position of a fronted focused phrase or a wh-phrase, both annotated as (↑ focus) = ↓. We also assume that the discourse function topic appears in Spec-CP (cf. Sells (2001) for Swedish). With respect to topic, the data in (37) seem to suggest that it is adjoined to IP, as assumed for English (Bresnan 2001: 180–3):

(37) a. Ao João, a professora deu-lhe um livro.
        to João the teacher gave-3.sg.dat a book
        ‘To João, the teacher gave a book.’
     b. Ao João, o livro, a professora deu-lho.
        to João the book the teacher gave-3.sg.dat/3.sg.masc.acc
        ‘To João, the book, the teacher gave.’

The fronted phrase ao João in (37a) appears to be adjoined to IP, and the two topicalised phrases in (37b) also seem to be (multiply) adjoined to IP. In each structure in (38), however, the fronted topic phrase is actually followed by the finite verb and the subject. For such cases, we would like to propose that the subject is sitting in Spec-IP while the verb is base-generated in C. The verb’s higher position makes the Spec-CP position available for the fronted topic.

(38) a. Este livro, dou-to eu
        this book give-2.sg.dat/3.sg.masc.acc I
        ‘This book, I give it to you.’
     b. Deste livro, lembro-me eu
        this book remember-1.sg.refl I
        ‘This book, I remember.’

Following standard LFG assumptions about c-structure/f-structure correspondence given in (30), we also assume that the functional head and its complement are f-structure co-heads. Therefore, V, V′, VP, I, I′, IP, C, C′ and CP are all annotated as ↑ = ↓. Finally, we treat the complement of V as an obj in the f-structure.

4.2 Morphological tokens and syntactic atoms

At the interface between morphology and c-structure, we put morphological tokens in correspondence with syntactic atoms:

(39) a. Morphological token: Each morphological token corresponds to a well-formed stem-affix string which is defined by morphology-internal principles.


     b. Syntactic atom: Syntactic atoms are leaves on c-structure trees; each leaf corresponds to one and only one terminal node; the insertion of syntactic atoms into c-structure is subject to standard phrase structure constraints.

At the purely morphological level, inflectional strings are generated as sequences of morphological tokens. Such formatives are independent of phrase structure and defined by principles of inflectional morphology. In our proposal, we adopt Stump’s (2001) Paradigm Function Morphology (PFM), more specifically a revised version of PFM called Generalised Paradigm Function Morphology (GPFM) (Luís and Spencer 2005, Spencer ms). Within GPFM, the Paradigm Function (PF) constitutes a global function which takes as input a lexeme and a complete set of morphosyntactic properties σ associated with that lexeme (i.e., 〈L, σ〉), yielding as output an inflected form of that lexeme and σ (i.e., 〈verb form, σ〉). The output form is generated by the PF through (1) the selection of the stem (S), (2) the realisation of the affix (R), and (3) the linearisation of the affix with respect to the stem (L).

Let us illustrate how the verb-clitic sequences vê-me in (40a) and me vê in (40b) are realised as inflectional strings. As alluded to before, proclisis can only occur if a specific word or phrase occurs preverbally. In (40b), we have the proclitic trigger raramente preceding the verb and attracting the clitic into preverbal position. In (40a), on the contrary, the clitic occurs postverbally as an enclitic because no preverbal trigger is available.

(40) a. O João vê-me raramente.
        the João sees-1.sg.dat rarely
        ‘João rarely sees me.’
     b. O João raramente me vê.

Within GPFM, the realisation of vê-me and me vê is modelled as shown in (41):

(41) a. Where σ = {TNS:pres, AGR:{PERS:3, NUM:sg}, OBJ:{PERS:1, NUM:sg, CASE:dat}},
        PF(〈VER, σ〉) =def
          i.   S: vê
          ii.  R: me
          iii. L: vê < me
     b. Where σ = {TNS:pres, AGR:{PERS:3, NUM:sg}, OBJ:{PERS:1, NUM:sg, CASE:dat}, RESTRICTED:yes},
        PF(〈VER, σ〉) =def
          i.   S: vê
          ii.  R: me
          iii. L: me < vê
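The three subfunctions in (41) can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the lookup tables STEMS and CLITICS and the function name paradigm_function are hypothetical and cover just the forms in (41); they are not part of GPFM itself, which states S, R and L as realisation rules rather than table lookups.

```python
# Hypothetical toy tables standing in for stem selection (S) and affix
# realisation (R); they cover only the forms in example (41).
STEMS = {("VER", "pres", 3, "sg"): "vê"}
CLITICS = {(1, "sg", "dat"): "me"}


def paradigm_function(lexeme, sigma):
    """Return the morphological token for <lexeme, sigma> as an ordered list."""
    agr, obj = sigma["AGR"], sigma["OBJ"]
    # S: select the stem for the lexeme and its tense/agreement features.
    stem = STEMS[(lexeme, sigma["TNS"], agr["PERS"], agr["NUM"])]
    # R: realise the pronominal affix for the object features.
    affix = CLITICS[(obj["PERS"], obj["NUM"], obj["CASE"])]
    # L: by default the affix follows the stem; RESTRICTED:yes reverses this.
    if sigma.get("RESTRICTED") == "yes":
        return [affix, stem]
    return [stem, affix]


sigma_a = {
    "TNS": "pres",
    "AGR": {"PERS": 3, "NUM": "sg"},
    "OBJ": {"PERS": 1, "NUM": "sg", "CASE": "dat"},
}
sigma_b = {**sigma_a, "RESTRICTED": "yes"}

print(paradigm_function("VER", sigma_a))  # ['vê', 'me']
print(paradigm_function("VER", sigma_b))  # ['me', 'vê']
```

In this sketch the two property sets yield the same stem and the same affix; only the linearisation step is sensitive to RESTRICTED, which mirrors the contrast between (41a) and (41b).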




In both (41a) and (41b), the morphosyntactic features associated with the lexeme are exactly the same: 3rd person singular present tense and dative 1st person singular pronominal object. Thus, the subfunctions S and R apply to a similar set of features, capturing the inflectional similarities between vê me and me vê. In both cases, therefore, the PF yields the same stem and the same affix, namely vê and me, respectively. However, (41a) and (41b) differ with respect to the subfunction L, which is responsible for the linearisation of the pronominal affix. By default, L locates the affix me to the right of the stem vê, as in (41a). In proclitic contexts, on the contrary, the subfunction L places the affix to the left of the stem. Inside the morphological component, proclisis is induced by the markedness feature {RESTRICTED:yes}, which is contained in the morphosyntactic feature set associated with the lexeme in (41b) but not in (41a). We will have more to say about the nature of this feature and its association with proclitic contexts in Section 4.3.

As alluded to above, we propose that morphological tokens are mapped onto syntactic atoms. Morphological tokens, as defined in (39a), are well-formed stem-affix strings which are obtained from the application of inflectional operations as illustrated in (41). Thus, stem-affix strings such as vê me and me vê constitute morphological tokens and are represented in square brackets: [vê, me] and [me, vê]. Syntactic atoms, as defined in (39b), are the leaves of c-structure trees. In most cases, one morphological token corresponds to one syntactic atom. When the correspondence between tokens and atoms is one-to-one, inflected words are inserted under one phrase structure terminal in the c-structure.
However, the correspondence between morphological tokens and syntactic atoms may also be non-isomorphic (i.e., either one morphological token corresponds to more than one syntactic atom or many morphological tokens correspond to one single syntactic atom). In EP, verb-enclitic sequences correspond to one syntactic atom, while proclitic-verb sequences correspond to two syntactic atoms, as illustrated in (42a) and (42b), respectively:

(42) a. [vê, me] ⇒ vê-me_I/C
     b. [me, vê] ⇒ me_Cl vê_I/C

(42a) states that [vê, me] is mapped onto a single syntactic atom with a category label I or C (see EP phrase structure in Section 4.1 above). In (42b), since the mapping between morphological tokens and syntactic atoms is one-to-many, the affix and the stem correspond to two distinct syntactic atoms. We further propose that the clitic bears the category label Cl.

Overall, our distinction between tokens and atoms constitutes an attempt at accounting for the mismatch between morphology and syntax within a lexicalist framework. Other mismatch phenomena between morphology and phrase structure have been attested in a number of unrelated languages. Otoguro (2006) and Luís and Otoguro (2006: 40–43), for example, extend the current proposal to Hindi-Urdu verb morphology and show that, in the future tense, two morphological tokens correspond


to one syntactic atom. For this language, the mapping is mediated through an algorithm that captures a many-to-one correspondence between tokens and atoms.3

Turning now to the c-structure representation of EP pronominal clitics, in (43a) enclitics constitute a single morphological token which is mapped onto a single syntactic atom. EP proclitics are mapped onto two c-structure terminals, as shown in (43b):

(43) a. [Tree for the morphological token [vê, me]: IP dominates NP (o João) and I′; under I′, the single terminal I (vê-me) precedes Adv (raramente).]
     b. [Tree for the morphological token [me, vê]: IP dominates NP (o João) and I′; under I′, Adv (raramente) precedes I; within I, the clitic me under Cl is adjoined to the I that hosts the verb vê.]



In (43b), we position the pronominal clitic under Cl in the c-structure and adjoin it to an X0 (cf. Sadler and Arnold (1994), Sadler (1997), Toivonen (2003)). Interpolated elements, such as adverbs and the negation marker, exemplified in (44), are allowed to undergo multiple X0 adjunction, following the proposal by Luís and Sadler (2003).

3. Another lexicalist approach to the morphology-syntax mismatch is formulated in Wescoat (2002), who argues that English subject-auxiliary sequences such as he’ll correspond to one single morphological unit which is mapped onto two syntactic terminals. Wescoat formulates lexical-sharing trees for the treatment of English inflected pronouns, allowing two or more terminal nodes to share the same morphological object. For EP, the problem with Wescoat’s analysis is that it requires shared nodes to be immediately adjacent.




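(44) [Tree: CP dominates C′, which dominates C (porque) and IP; IP dominates DP (ele) and I′; under I, the clitic o (Cl) and the non-projecting adverbs ainda and não (Adv) are adjoined, in that order, to the I hosting the finite verb visitou.]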

The elements appearing between the pronominal clitic and the finite verb are adverbials which are allowed to adjoin to an X0. Following Toivonen (2003), we refer to these adverbials as non-projecting words (the accent over the category label Adv signals that these X0s cannot project).

Summing up, at the interface between morphology and c-structure, a labelling algorithm takes as input morphological tokens and delivers labelled syntactic atoms. Crucially, through the mapping between morphological tokens and syntactic atoms shown in (42), morphological strings cannot be inserted directly into the phrase structure. The morphology yields well-formed inflectional strings (i.e., stem-affix combinations) which realise a complete set of morphosyntactic properties, as in (41) above. However, at the level of c-structure, syntactic terminal nodes are instantiated by syntactic atoms. The insertion of syntactic atoms into c-structure is regulated by standard phrase structure principles (e.g., immediate domination, linearisation and instantiation) and EP phrase structure.

By separating the morphological generation of inflectional strings, i.e. morphological tokens, from their phrase structural properties, our proposal captures the dual properties of EP proclitics, namely their affixal behaviour illustrated in Section 2.1 and their phrasal properties outlined in Section 2.3. The key goal of our analysis is to allow single morphological tokens (i.e., stem-affix combinations) to be mapped onto one or more syntactic atoms without incurring any violation of lexical integrity. It is worth pointing out that our approach to wordhood does not require any changes to the formal properties of c-structure trees or to the f-structure to c-structure mapping.
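The token-to-atom mapping in (42) can be sketched as a small Python function. This is a minimal illustration, not the labelling algorithm itself, whose details are not spelled out here: the function name label_atoms, the representation of a token as a list of (form, kind) pairs, and the restriction to one stem plus one affix are all assumptions made for the example.

```python
def label_atoms(token, host_category="I"):
    """Map a morphological token onto labelled syntactic atoms, as in (42).

    `token` is an ordered list of (form, kind) pairs, where kind is "stem"
    or "affix". This representation is hypothetical, chosen for the sketch.
    """
    if host_category not in ("I", "C"):
        raise ValueError("finite verbs are labelled I or C")
    kinds = [kind for _, kind in token]
    if kinds == ["stem", "affix"]:
        # Enclisis: the whole token surfaces as one syntactic atom.
        stem, affix = token[0][0], token[1][0]
        return [(f"{stem}-{affix}", host_category)]
    if kinds == ["affix", "stem"]:
        # Proclisis: one token, two atoms; the clitic is labelled Cl.
        affix, stem = token[0][0], token[1][0]
        return [(affix, "Cl"), (stem, host_category)]
    raise ValueError("this sketch only handles one stem and one affix")


print(label_atoms([("vê", "stem"), ("me", "affix")]))  # [('vê-me', 'I')]
print(label_atoms([("me", "affix"), ("vê", "stem")]))  # [('me', 'Cl'), ('vê', 'I')]
```

The point of the sketch is that the same stem-affix material reaches the c-structure either as one terminal or as two, depending only on the order delivered by the morphology, so no word-internal structure needs to be visible to the phrase structure rules.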


4.3 Proclitic contexts

In this section, we offer an outline of our LFG treatment of proclitic contexts in EP. It is argued that the effect of proclitic triggers on clitic placement can be straightforwardly accounted for by an approach in which (1) proclitic contexts are defined in terms of functional precedence (Kaplan and Zaenen 1989) (4.3.1) and in which (2) morphological features are placed in correspondence with syntactic features (Sadler and Spencer 2001, Nordlinger and Sadler 2004) (4.3.2). Having provided the necessary LFG machinery, we examine each one of the proclitic contexts in turn (4.3.3).

4.3.1 F-precedence

As shown in Section 2, proclisis is triggered by a set of heterogeneous syntactic contexts. Previous syntactic analyses of EP cliticisation have tried to identify configurational similarities by placing triggers under functional nodes such as CP or IP or under functional projections such as NegP or FocP (Martins 1994, Madeira 1993). However, finding a common denominator for all proclitic contexts in the phrase structure has proven difficult. Some attempts have therefore been made at identifying natural semantic classes (e.g. downward monotone quantifiers (Crysmann 2002)), but such classes have also failed to encompass all proclitic triggers. Other studies have argued that clitic placement is largely driven by discourse-information structure (McConvell 1996). It is, however, far from clear how this intuition could be extended to subordinating complementisers or conjunctions. Overall, then, there appears to be no single configurational, semantic or discourse explanation for procliticisation in EP (Luís 2004).

One of the purposes of our paper is to show that, despite the heterogeneity of the proclitic contexts, clitic placement can be straightforwardly accounted for by an approach in which both the position and the nature of the triggers are defined through f(unctional)-precedence relations (Bresnan 2001: 195):4

(45) F-precedence (