Reciprocal Anaphora Resolution in Pashto Discourse

Rahman Ali, Mohammad Abid Khan, Muhammad Bilal, Ihsan Rabbi
Department of Computer Science, University of Peshawar, N.W.F.P, Pakistan

Abstract--It is very important for an effective machine translation system to transform the multi-sentential discourses of a source language into simple, cohesion-free discourses. Anaphora resolution is the problem in computational linguistics that addresses this transformation. This paper deals with lexical (reciprocal) anaphora resolution in the Pashto language and presents a rule-based algorithm to resolve the reciprocal pronouns (RecipPro) of Pashto. The algorithm has been tested on real-world text taken from Pashto stories, novels, newspapers and web pages, and showed an accuracy of 94.45%.

Keywords--Pashto Pronoun; Reciprocal Anaphora; Anaphora Resolution; Algorithm

I. INTRODUCTION

Computational treatment of natural languages is a challenging task because the understanding of human languages by computer is difficult. The difficulty stems from concrete problems including word sense disambiguation, syntactic and lexical ambiguities, text segmentation, irregular input, anaphora and ellipsis. The resolution of pronouns in natural languages is an important process for several natural language processing (NLP) applications including information extraction, question answering, and text summarization [6]. Several rule-based approaches have been proposed for the pronoun resolution problem in NLP [4,5,7,8,9,10]. In English, lexical anaphors have been handled by applying the anaphor binding algorithm [7].

Pashto is an Arabic script-based language that makes frequent use of pronouns. To minimize the cohesion in Pashto text, these pronouns need to be resolved. Personal anaphora in this language has been treated recently [1,2]. Similarly, all types of Pashto anaphora have recently been reported in a critical analysis of Pashto text [3]. No other work has been reported on pronoun resolution in Pashto text until now. In this paper, a rule-based approach is proposed that effectively identifies the antecedents of lexical (reciprocal) anaphors. The effectiveness of the approach is demonstrated through experiments on Pashto text taken from different genres. The approach relies on a set of factors that help in the resolution process.

The rest of the paper is organized as follows. Section 2 describes lexical anaphora and its categories. Section 3 describes the rules that help in the resolution of reciprocal pronouns. Section 4 outlines the rules in the form of an algorithm. Section 5 evaluates the approach, while Section 6 concludes the work.

II. LEXICAL ANAPHORA IN PASHTO

When the reflexive pronoun (RefPro) or RecipPro of the Pashto language refers backward to some previously mentioned noun (N) or noun phrase (NP) in the text, the phenomenon is called Pashto lexical anaphora. There are two types of lexical anaphora: reciprocal and reflexive. The following example shows both types:

(2.1) د ﮐﻠﻲ ﻣﺎﺣﻮل ؤ ـ د ځﻨﯥ ﺧﻠﻘﻮ د اﻧګڼﻮ ﻧﻮ هډو دروازې ﻧﮥ وې ـ ډﯦﺮو ﭘﮑښﯥ د ځﺎن ﻧﻪ د ﻏﻨﻮ ﺷﭙﻮﻟﻮﻧﻪ ﺗﺎؤ ﮐړي وو ـ ﺧﻠﻘﻮ ﻳﻮ ﺑﻞ ﭘﯧﮋﻧﺪل ـ
["ګﻞ ﻣﻴﻨﻪ", Pashto Novel, Mirza Jehanzeb Yar, Page-49]

[də] [kalay] [māḥwl] [wo]. [də] [jane] [xəlqw] [də] [angaṇonú] [haḍú] [dərwāze] [nə] [we]. [ḍerú] [pakṣ̌e] [də] [jān] [nə] [də] [γanú] [šapúlona] [tāw kaṛI wo]. [xəlqú] [yaw bəl] [pežəndəl].

[of] [village] [environment] [was]. [of] [some] [people] [of] [veranda-pl] [not at all] [door-pl] [not] [Were-Perfective]. [majority] [out of them] [themselves] [from] [of] [thorn] [enclosure] [wrap-Perfective]. [people] [one another] [know].

It was a rural community. The houses of some of the people were without doors. Most of them had enclosed themselves with thorn fences. The people knew one another.

In the above example, the RefPro (ځﺎن [jān]-self) in the third sentence refers back to the plural (pl) NP (ﺧﻠﻘﻮ [xəlqú]-people) in the second sentence and results in reflexive anaphora. Similarly, the RecipPro (ﻳﻮ ﺑﻞ [yaw bəl]-one another) that appears in the last sentence refers to the pl NP (ﺧﻠﻘﻮ [xəlqú]-people) of the same sentence immediately preceding it and results in reciprocal anaphora.

Lexical anaphora in English has been studied by other researchers, who used an anaphor binding algorithm to resolve these pronouns [7].

This paper is only about the resolution process of the RecipPro (ﻳﻮ ﺑﻞ [yaw bəl]) in Pashto text.

A. Reciprocal Anaphora

When the Pashto RecipPro (ﻳﻮ ﺑﻞ [yaw bəl]-each other/one another) refers to a pl NP in the preceding text, the phenomenon is called reciprocal anaphora. For example:

(2.1.1) ﺑﻴﺎ ﺑﻪ ﻣﺴﺘﻲ ﺷﺮوع ﺷﻮﻩ ـ دﻟﺘﻪ دې هﻠﮑﺎﻧﻮ او اﺧﻮا ﭘﻪ ﮐﻮر دﻧﻨﻪ ﺑﻪ ﺟﻴﻨﮑﻮ ﭘﻪ ﻳﻮ ﺑﻞ اوﺑﮥ اﭼﻮل ﺷﺮوع ﮐړل ـ
["ګﻞ ﻣﻴﻨﻪ", Pashto Novel, Mirza Jehanzeb Yar, Page-76]

[biā] [bə] [məstI] [šorúς šawá]. [dəlta] [de] [hálakānú] [aw] [aəxwá] [pə] [kúr] [dananə] [bə] [jInakú] [pə] [yaw bəl] [aúbə] [ačawəl] [šorúς kaṛəl].

[then] [to be-Particle] [intoxication] [start-Imperfective]. [here] [these] [boy-pl] [and] [there] [at] [home] [inside] [to be-Particle] [girl-pl] [at] [one another] [water] [pouring] [start].

Then the intoxication would start. Here the boys used to pour water on one another, and inside the home the girls used to do the same.

In the above discourse (DC), the RecipPro (ﻳﻮ ﺑﻞ [yaw bəl]-one another) of the second sentence refers to the pl antecedent (ﺟﻴﻨﮑﻮ [jInakú]-girls) of the same sentence. The Pashto RecipPro (ﻳﻮ ﺑﻞ [yaw bəl]) is equivalent to both the English ‘each other’ and ‘one another’. In English, ‘each other’ is used when two objects are involved and ‘one another’ when more than two objects are involved. In Pashto, the reciprocal pronoun (ﻳﻮ ﺑﻞ) is used in both cases, i.e. whether two or more objects are involved.

III. RULES FOR THE RESOLUTION OF RECIPROCAL PRONOUNS

Several rules are helpful in the resolution process of reciprocal pronouns. These are given in detail in the following sub-sections.

A. Number Agreement

The RecipPro (ﻳﻮ ﺑﻞ [yaw bəl]-each other/one another) of the Pashto language always refers to a pl NP in the same or preceding sentence(s). For example:

(3.1.1) اﻋﻼن وﺷﻮ ﭼﯥ دا ﺑﻨﻴﺎ د ﺑﻪ هﻐﻪ ﺳړے ږدي ﭼﯥ ټﻮل ﻋﻤﺮ ﮐښﯥ ﺋﯥ ﺑﺪ ﮐﺎ ر ﻧﮥ وي ﮐړې ـ ټﻮﻟﻮ ﺧﻠﻘﻮ ﻳﻮ ﺑﻞ ﺗﻪ ﮐﺘﻞ ﺷﺮوع ﮐړل -
["ګﻞ ﻣﻴﻨﻪ", Pashto Novel, Mirza Jehanzeb Yar, Page-262]

[Iςlān] [wašú] [če] [dā] [bunIād] [bə] [háγa] [saṛay] [ǧadI] [če] [ṭol] [ςúmər] [kṣ̌e] [əye] [bəd kār] [nə] [wI] [káṛe]. [ṭolú] [xəlqú] [yaw bəl] [tə] [katəl] [šorúς kaṛəl].

[announcement] [do-Perfective] [that] [this] [foundation] [to be-Particle] [that] [person] [put down] [that] [all] [life] [in] [he-Clitic] [evil deed] [not] [do-Perfective]. [all] [people] [one another] [to] [looking] [start-Imperfective].

It was announced that this foundation would be laid only by a person who had never done an evil deed in his life. All the people started looking at one another.

In the above DC, the possible antecedents for the RecipPro (ﻳﻮ ﺑﻞ [yaw bəl]-one another) that appears in the last sentence are (ﺳړے [saṛay]-person) and (ﺧﻠﻘﻮ [xəlqú]-people). Number agreement selects only the pl antecedent (ﺧﻠﻘﻮ [xəlqú]-people) and rules out the NP (ﺳړے [saṛay]-man/person) because it is singular (Sg).

B. Pleonastic

Sometimes, the Pashto RecipPro (ﻳﻮ ﺑﻞ [yaw bəl]-each other/one another) does not refer to anything specific. In English, such pronouns are termed pleonastic [7]. When it encounters such a RecipPro, the system simply declares it pleonastic. For example:

(3.2.1) دﻏﺴﯥ ﺗﺮ درﯦﻮ ورځﻮ ﺑﻪ د هﻤﺪردۍ ﭘﻪ ﺟﻮړ ډوډۍ هﻢ ورﻟﻪ ﻳﻮ ﺑﻞ ﮐﻮﻟﻪ ـ
["ګﻞ ﻣﻴﻨﻪ", Pashto Novel, Mirza Jehanzeb Yar, Page-69]

[daγáse] [tər] [darew] [wárajú] [bə] [də] [həmdərdəy] [pə] [júṛ] [ḍúḍəy] [húm] [wərla] [yaw bəl] [kawalá].

[similarly] [up to] [three] [days] [to be-Particle] [of] [sympathy] [at] [reason] [lunch/dinner] [also] [for them-Clitic] [one another] [do].

In the same way, people used to offer meals to them for up to three days out of sympathy.

In the above example, the RecipPro (ﻳﻮ ﺑﻞ) does not refer to anything specific in the text. Therefore it is pleonastic and cannot be resolved; the system just declares it pleonastic.

C. Reciprocal Pronoun Followed by a Noun

In Pashto text, the RecipPro (ﻳﻮ ﺑﻞ [yaw bəl]-each other/one another) may be immediately followed by a N. In this case, it is equivalent to the English word ‘another’ and is not a RecipPro. Hence, the system declares it non-anaphoric. For example:

(3.3.1) ګﺎ ډي ﻳﻮ ښﮥ ﺳﺎﻋﺖ ﻣﺰل وﮐړو او ﺑﻴﺎ روﻏﻮﻧﺪې ﭘﻪ ﻳﻮ ﺑﻞ ﺳټﯧﺸﻦ ودرﯦﺪو ـ
["ګﻞ ﻣﻴﻨﻪ", Pashto Novel, Mirza Jehanzeb Yar, Page-130]

[gaḍI] [yaw] [ṣ̌a] [saςət] [mazəl wákṛú] [aw] [biā] [ro] [γúnde] [pə] [yaw bəl] [sṭešən] [wadredú].

[train] [one] [complete] [hour] [travel-Perfective] [and] [then] [slowly] [at] [another] [station] [stop-Perfective].

The train slowly stopped at another station after a trip of a full hour.

In this DC, the word (ﻳﻮ ﺑﻞ [yaw bəl]-another) is immediately followed by a noun (ﺳټﯧﺸﻦ [sṭešən]-station). Therefore, it is not a reciprocal pronoun but is equivalent to the English word ‘another’.

D. Reciprocal Pronoun Preceded by a Plural Noun

In Pashto text, the RecipPro (ﻳﻮ ﺑﻞ [yaw bəl]-each other/one another) may be immediately preceded by a pl N. In this case, that N is the antecedent. For example:

(3.4.1) ورور او ﺧﻮر ﻳﻮ ﺑﻞ ﺗﻪ ﭘﻪ ﻣﻌﻨٰﻰ ﺧﻴﺰ اﻧﺪاز ﮐښـﯥ اوﮐﺘﻞ ـ
[“ﺗﻮر ﭘړوﻧﮯ”, a Pashto Novel, Prof. H., Hydat, Page-10]

[warúr aw xúr] [yaw bəl] [tə] [pə] [məςnnay xIz] [andāz] [kṣ̌e] [awkatəl].

[brother and sister] [each other] [to] [at] [evocative] [style] [in] [look-Perfective].

Brother and sister looked evocatively at each other.

In this example, the RecipPro (ﻳﻮ ﺑﻞ) is immediately preceded by the coordinated pl N (ورور او ﺧﻮر [warúr aw xúr]-brother and sister). Hence, this is the antecedent.

E. Personal Pronoun Antecedent

Sometimes, in Pashto text, the RecipPro is preceded by a pl personal pronoun (PersPro) in the same sentence. In this case, the RecipPro refers to the antecedent of the PersPro. In Pashto, a PersPro can be resolved by the strong personal anaphora resolution algorithm [1], so the RecipPro in this case is resolved by calling the algorithm of Ali et al. [1]. An example is:

(3.5.1) او ﺑﻴﺎ دا هﻢ وﺋﻴﻠﻲ ﺷﻮي دي ـ ﭼﯥ ﻣﻮﻧږ دې د ﻳﻮ ﺑﻞ ﺣﻘﻮق وﭘﯧﮋﻧﻮ ـ
["ګﻞ ﻣﻴﻨﻪ", Pashto Novel, Mirza Jehanzeb Yar, Page-242]

[aw] [biā] [dā] [húm] [wəyili šawi di]. [če] [múnǧ] [de] [də] [yaw bəl] [ḥaqwúq] [wápežanú].

[and] [then] [this] [too] [say-Perfective]. [that] [we] [of] [one another] [rights] [know].

And it has also been said that we should know the rights of one another.

In the above example, the RecipPro (ﻳﻮ ﺑﻞ) is preceded by the pl PersPro (ﻣﻮﻧږ [múnǧ]-we) in the same sentence. Here, the RecipPro (ﻳﻮ ﺑﻞ) takes the pl PersPro as its antecedent, and our system calls the strong personal pronoun resolution algorithm [1] to resolve the RecipPro (ﻳﻮ ﺑﻞ) to the correct antecedent.
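The three positional rules of subsections C, D and E can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Token` type, the tag names (`"N"`, `"PersPro"`, `"RecipPro"`, `"Other"`) and the function `resolve_by_position` are hypothetical, since the paper only states that the input is manually annotated and does not give a tag set. The sketch also assumes that a coordinated NP such as "brother and sister" has been annotated as a single pl N, and that the rules are tried in the order C, D, E.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


# Hypothetical annotation scheme; the paper does not specify its tag set.
@dataclass
class Token:
    text: str    # transliterated surface form
    pos: str     # "N", "PersPro", "RecipPro" or "Other"
    number: str  # "pl", "Sg", or "" when not applicable


def resolve_by_position(sentence: List[Token], i: int) -> Tuple[str, Optional[int]]:
    """Apply rules C, D and E to the RecipPro at index i of one sentence.

    Returns (decision, index); index points at the relevant token, or is None.
    """
    # Rule C: an immediately following noun means the word reads as "another".
    if i + 1 < len(sentence) and sentence[i + 1].pos == "N":
        return ("non-anaphoric", None)
    # Rule D: an immediately preceding plural noun is taken as the antecedent.
    if i > 0 and sentence[i - 1].pos == "N" and sentence[i - 1].number == "pl":
        return ("antecedent", i - 1)
    # Rule E: a plural personal pronoun earlier in the same sentence; its own
    # antecedent would then be found by the personal-pronoun algorithm of [1].
    for j in range(i - 1, -1, -1):
        if sentence[j].pos == "PersPro" and sentence[j].number == "pl":
            return ("delegate-to-PersPro", j)
    return ("unresolved", None)


# Tail of example (3.3.1): "... at another station stopped".
another_station = [
    Token("pə", "Other", ""),
    Token("yaw bəl", "RecipPro", ""),
    Token("sṭešən", "N", "Sg"),
    Token("wadredú", "Other", ""),
]
# Start of example (3.4.1): "brother and sister ... each other".
brother_sister = [
    Token("warúr aw xúr", "N", "pl"),
    Token("yaw bəl", "RecipPro", ""),
    Token("tə", "Other", ""),
]
print(resolve_by_position(another_station, 1))
print(resolve_by_position(brother_sister, 1))
```

In example (3.5.1) the RecipPro is itself followed by a noun inside a possessive phrase, so a fuller implementation would need a finer condition than the one shown here to keep rule C from firing in that case; the source does not spell that condition out.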

F. Recency

In most cases, the antecedent of the RecipPro (ﻳﻮ ﺑﻞ) is present in the same sentence. In Pashto, the RecipPro prefers the nearest pl NP in the same or preceding sentence(s). Example (2.1.1) given in Section II.A shows that the RecipPro (ﻳﻮ ﺑﻞ) selects the antecedent (ﺟﻴﻨﮑﻮ [jInakú]-girls) instead of (هﻠﮑﺎﻧﻮ [hálakānú]-boys) because it is the nearest one.

G. Universal Pronoun Antecedent

Sometimes, the RecipPro (ﻳﻮ ﺑﻞ) refers to a universal pronoun (UnivPro) in the same or preceding sentence(s). In this case, the UnivPro acts as the antecedent of the RecipPro (ﻳﻮ ﺑﻞ). For example:

(3.7.1) ﺧﻮ ﭘﻪ ﺧﯧټﻪ ﺑﻪ هﺮ ﺳړے ﻣﻮړ ؤ ـ ﭘﻪ ﻏﻢ ښﺎدۍ ﮐښﯥ ﺑﻪ ﻳﻮ ﺑﻞ ﺗﻪ وﻻړ ؤ ـ
["ګﻞ ﻣﻴﻨﻪ", Pashto Novel, Mirza Jehanzeb Yar, Page-66]

[xú] [pə] [xeṭa] [bə] [hər saṛay] [múṛ] [wo]. [pə] [γəm] [šādəy] [kṣ̌e] [bə] [yaw bəl] [tə] [walāṛ] [wo].

[but] [at] [stomach] [may] [every person] [satisfied/fed up] [was]. [at] [sorrow] [merriment] [in] [may] [one another] [to] [ready-Perfective].

Every person used to get sufficient food to fight hunger. They would help one another on occasions of sorrow and happiness.

In the above DC, the RecipPro (ﻳﻮ ﺑﻞ) refers to the UnivPro (هﺮ ﺳړے [hər saṛay]-every person), which acts as the correct antecedent.

H. Indefinite Pronoun Antecedent

Sometimes, a Pashto indefinite pronoun (IndefPro) appears either in the current or in the immediately preceding sentence(s). In this case, if the other rules fail to identify the antecedent, the RecipPro (ﻳﻮ ﺑﻞ) refers to the IndefPro. For example:

(3.8.1) ﺧﻮ ﭘﻪ دې ﮐښﯥ ﺑﻪ هﻢ ﭼﺎ ﭘﻪ ارام ﻧﮥ ﭘﺮﯦښﻮدو ـ ﻳﻮ ﺑﻞ راﺗﻪ وې هﻠﮑﻪ د ﺧﺪاﺋﮯ ﻓﻀﻞ دے ﭘﻪ ﺧﭙﻠﻪ ﺗﻨﺨﻮاﻩ دې ښﮥ ګﺬارﻩ ﮐﯧږي ـ
["ګﻞ ﻣﻴﻨﻪ", Pashto Novel, Mirza Jehanzeb Yar, Page-233]

[xú] [pə] [de] [kṣ̌e] [bə] [húm] [čā] [pə] [árām] [nə] [páreṣ̌əwdú]. [yaw bəl] [rā ta] [we] [hálaka] [də] [xdāəy] [faḏəl] [day] [pə] [xapalá] [tənxawáh] [de] [ṣ̌á] [g̣úδāra] [keǧI].

[but] [at] [this] [in] [to be-Particle] [too] [some one] [at] [rest] [not] [leave-pl-Perfective]. [one another] [to me-clitic] [say] [boy] [of] [God] [grace] [is] [at] [own] [salary] [your-Clitic] [better] [means of livelihood] [do].

But even then no one left us undisturbed. They would say that, by the grace of God, your salary satisfies all your needs of livelihood.

In the above DC, the RecipPro (ﻳﻮ ﺑﻞ) of the second sentence refers to the IndefPro (ﭼﺎ [čā]-some one) of the first sentence, and this IndefPro acts as the antecedent for this RecipPro (ﻳﻮ ﺑﻞ).

I. Repetition

Sometimes, the RecipPro (ﻳﻮ ﺑﻞ [yaw bəl]-each other/one another) is repeated several times in the same DC. In this case, all the reciprocal anaphoric devices (ADs) refer to the same antecedent. For example:

(3.9.1) ﭘﻪ دې ﻧﺮئ ﻧﺮئ هﻮا ﺑﻪ اوﻧﯥ ﻳﻮ ﺑﻞ ﺗﻪ ﻧﺰدې ﺷﻮې ﺧﻮ ﻳﻮ ﺑﻞ ﺑﻪ ﺋﯥ ښﮑﻞ ﻧﮥ ﮐړې ﺷﻮ ـ او ﭼﯥ ﮐﻠﻪ ﺑﻪ هﻮا ﻟږﻩ ﺗﯧﺰﻩ ﺷﻮﻩ ـ ﻧﻮ ﭘﻪ ﻳﻮ ﺑﻞ ﺑﻪ راﭘﺮﯦﻮﺗﯥ ـ او ﺧﭙﻠﻮ ﻣﯧﻨځ ﮐښﯥ ﺑﻪ ﺋﯥ ګﻨګﻮﺳﮯ ﺷﺎن اوﮐړو ـ ﻳﻮ ﺑﻞ ﺗﻪ ﺑﻪ ﺋﯥ ددې ټﻮﻟﯥ ورځﯥ ﺣﺎ ل واوروﻟﻮ ـ
[“ګﺮل ﻓﺮﯦﻨډ”, Pashto novel, Prof. H., Hidayat, Page-4]

[pə] [de] [nárəy] [nárəy] [hawā] [bə] [awne] [yaw bəl] [tə] [nIzde] [šawe] [xú] [yaw bəl] [bə] [əye] [ṣ̌úkəl nə kaṛe šo]. [aw] [če] [kalá] [ba] [háwa] [laǧá] [teza šawa]. [nú] [pə] [yaw bəl] [bə] [raprewate]. [aw] [xapalo menj kṣ̌e] [bə] [əye] [gúngwsay šān awkṛo]. [yaw bəl] [tə] [bə] [əye] [da de] [ṭole] [wáraje] [ḥāl] [wāwrawalú].

[at] [this] [mild] [mild] [air] [to be-Particle] [tree-pl] [one another] [to] [near] [come-Perfective] [but] [one another] [may] [they-clitic] [kiss-Neg-Perfective]. [and] [that] [when] [may] [air] [a little] [fast-Perfective]. [so] [at] [one another] [may] [fall-pl-Fem-Perfective]. [and] [themselves] [may] [they-clitic] [whisper-Perfective]. [one another] [to] [may] [they-clitic] [of] [this] [whole] [day] [condition-pl] [tell-Perfective].

While the air was blowing mildly, the trees would bend towards one another but could not kiss one another. When the air blew a little faster, they would fall on one another and whisper among themselves. They would tell one another about the events of the whole day.

In the above example, the RecipPro (ﻳﻮ ﺑﻞ) is repeated four times. All of the occurrences refer to the same antecedent (اوﻧﯥ [awne]-trees), selected by one of the above rules.

All the rules discussed above are formatted, summarized and listed in Table 1, the rule table on which the algorithm operates.

IV. ALGORITHM

1. Take manually annotated Pashto text.
2. Scan the text, DC by DC, for RecipPro (ﻳﻮ ﺑﻞ [yaw bəl]).
3. for each DCi in the text
   a. for each RecipPro (ﻳﻮ ﺑﻞ) in the DCi
      i. for each rule Rj in the Rule Table
         1. If (Rj.Condition = DCi.format) then
         2. Apply Rj.Rule to get the antecedent (Rj.Antecedent) or apply (Rj.Action).
         3. end if
      ii. end for
   b. end outer for
4. end outermost for
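The rule-driven loop of the ALGORITHM section can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the authors' system. The `Mention` and `Rule` types, the tag names and the helper functions are hypothetical. The first matching rule is assumed to win. A DC is simplified to a flat list of annotated mentions, and each rule searches only the mentions that precede the RecipPro. Only the antecedent-selection rules are covered (number agreement with recency, universal pronoun, indefinite pronoun) together with repetition; the positional and pleonastic checks are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Mention:
    text: str
    kind: str         # hypothetical tags: "N", "UnivPro", "IndefPro", "RecipPro"
    number: str = ""  # "pl" or "Sg" for nouns


@dataclass
class Rule:
    name: str
    condition: Callable[[List[Mention], int], bool]
    action: Callable[[List[Mention], int], Optional[str]]


def nearest_before(dc: List[Mention], i: int, test) -> Optional[str]:
    """Text of the closest mention before position i that passes test."""
    for m in reversed(dc[:i]):
        if test(m):
            return m.text
    return None


def make_rule(name: str, test) -> Rule:
    """Build a rule whose antecedent is the nearest preceding match of test."""
    return Rule(
        name,
        condition=lambda dc, i: nearest_before(dc, i, test) is not None,
        action=lambda dc, i: nearest_before(dc, i, test),
    )


# Order matters: plural nouns first, then UnivPro, then IndefPro as fallbacks.
RULE_TABLE = [
    make_rule("number agreement + recency", lambda m: m.kind == "N" and m.number == "pl"),
    make_rule("universal pronoun antecedent", lambda m: m.kind == "UnivPro"),
    make_rule("indefinite pronoun antecedent", lambda m: m.kind == "IndefPro"),
]


def resolve_discourse(dc: List[Mention]) -> Dict[int, Optional[str]]:
    """Map the index of every RecipPro in dc to its antecedent text (or None)."""
    results: Dict[int, Optional[str]] = {}
    previous: Optional[str] = None
    for i, m in enumerate(dc):
        if m.kind != "RecipPro":
            continue
        antecedent = previous  # repetition: reuse the earlier decision in this DC
        if antecedent is None:
            for rule in RULE_TABLE:
                if rule.condition(dc, i):
                    antecedent = rule.action(dc, i)
                    break  # assumption: the first matching rule wins
        results[i] = antecedent
        previous = antecedent
    return results


# Mentions of example (2.1.1): boys, girls, then the RecipPro.
dc_211 = [
    Mention("hálakānú", "N", "pl"),
    Mention("jInakú", "N", "pl"),
    Mention("yaw bəl", "RecipPro"),
]
print(resolve_discourse(dc_211))
```

Keeping the rules in a list of condition/action pairs mirrors the rule table the algorithm iterates over, so adding or reordering rules does not require changing the loop itself.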

V. EVALUATION

Because no annotated corpus or other NLP tools are available for the Pashto language, a small manually tagged and segmented corpus was created. This corpus is composed of 362 selected examples from Pashto novels, stories, newspapers, websites and other sources, containing 397 RecipPro(s). The algorithm correctly resolved 375 of the 397 RecipPro(s), an accuracy of 94.45%.

VI. CONCLUSION

In this paper, an algorithm has been developed for the resolution of reciprocal pronouns in the Pashto language. The algorithm uses constraints and preferences to resolve these ADs and achieves an accuracy of 94.45%. It takes manually preprocessed Pashto text as input because tools for tagging and segmentation of Pashto text are not yet available.

LIST OF ABBREVIATIONS

Abbreviation | Term
NLP | Natural Language Processing
NP | Noun Phrase
N | Noun
pl | Plural
Sg | Singular
PersPro | Personal Pronoun
RecipPro | Reciprocal Pronoun
UnivPro | Universal Pronoun
RefPro | Reflexive Pronoun
IndefPro | Indefinite Pronoun
Sent | Sentence
AD | Anaphoric Device
DC | Discourse
R | Rule No

TABLE 1: RULES TABLE (Pronoun: RecipPro (ﻳﻮ ﺑﻞ))

Rule No | Rule | Condition | Antecedent/Action
R1 | Reciprocal pronoun followed by a noun/noun phrase | …,(RecipPro),(N-pl/Sg) | It is not a RecipPro
R2 | Pleonastic | …,(RecipPro-Pleonastic),… | Pleonastic / Non-Anaphoric
R3 | Reciprocal pronoun preceded by a pl noun | …,(N-pl),(RecipPro),… | N-pl
R4 | Personal pronoun antecedent | Sent1[…],…,Sentn[…,PersPro-pl,…,(RecipPro)],… | PersPro-pl; call the strong personal anaphora resolution algorithm (Ali et al., 2007)
R5 (a) | Number agreement | (N1-pl),(N2-Sg),(N3-pl),(N4-Sg),…,(Ni-pl),(Ni+1-Sg),(RecipPro),… | (N1-pl),(N3-pl),(Ni-pl)
R5 (b) | Recency | (N1-pl),…,(N3-pl),…,(Ni-pl),(RecipPro) | (Ni-pl)
R6 | Universal pronoun antecedent | If there is no pl N or NP in the DC but it has a UnivPro | UnivPro
R7 | Indefinite pronoun antecedent | If there is no pl N or NP or UnivPro in the DC but it has an IndefPro | IndefPro
R8 | Repetition | If the RecipPro is repeated in the DC | Refers to the same antecedent; apply the above rules to get the antecedent

REFERENCES

[1] Ali, R., Khan, M. Abid, Rabbi, I. (2007), “Strong Personal Anaphora Resolution in Pashto Discourse”. In Proceedings of the IEEE ICET 3rd International Conference on Emerging Technologies. Islamabad, Pakistan, pp 148-154.

[2] Ali, R., Khan, M. Abid, Ahmad, R., Rabbi, I. (2008), “Rule Based Personal References Resolution in Pashto Discourse for Better Machine Translation”. In Proceedings of the IEEE ICEE 2nd International Conference on Electrical Engineering. UET Lahore, Pakistan, pp 57-62.

[3] Ali, R., Khan, M. Abid, Bilal, M., Rabbi, I. (2008), “Empirical Analysis of Pashto Text for Types of Pashto Anaphora”. In Proceedings of the International Conference on Information & Communication Technology (ICICT2008), University of Science & Technology, Bannu, Pakistan.

[4] Baldwin, B. (1995), “Cogniac: A discourse processing engine”. Ph.D. thesis, Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, PA.

[5] Brennan, S. E., Friedman, M. W., Pollard, C. (1987), “A centering approach to pronouns”. In Proceedings of the 25th Annual Meeting of the Association for Computational Linguistics (ACL), pp 155–162.

[6] Iida, R., Inui, K., Matsumoto, Y. (December 2005), “Anaphora Resolution by Antecedent Identification Followed by Anaphoricity Determination”. ACM Transactions on Asian Language Information Processing, Vol. 4, pp 417-434.

[7] Lappin, S., Leass, H. (1994), “An Algorithm for Pronominal Anaphora Resolution”. Computational Linguistics, 20(4), pp 535-561.

[8] Mitkov, R. (1997), “Factors in anaphora resolution: they are not the only things that matter. A case study based on two different approaches”. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics (ACL) and the 8th Conference of the European Chapter of the Association for Computational Linguistics (EACL) Workshop on Operational Factors in Practical, Robust Anaphora Resolution.

[9] Nakaiwa, H., Shirai, S. (1996), “Anaphora resolution of Japanese zero pronouns with deictic reference”. In Proceedings of the 16th International Conference on Computational Linguistics (COLING), pp 812–817.

[10] Okumura, M., Tamura, K. (1996), “Zero pronoun resolution in Japanese discourse based on centering theory”. In Proceedings of the 16th International Conference on Computational Linguistics (COLING), pp 871–876.