Slides

Cross-Cutting Models of Lexical Semantics

Joseph Reisinger and Raymond Mooney

Distributional Lexical Semantics

• Represent “meaning” as a point/vector in a high-dimensional space
• Word relatedness correlates with some distance metric

Almuhareb and Poesio (2004), Baroni and Lenci (2009), Bullinaria and Levy (2007), Erk (2007), Griffiths et al. (2007), Landauer and Dumais (1997), Moldovan (2006), Padó and Lapata (2007), Pantel and Pennacchiotti (2006), Sahlgren (2006), Turney and Pantel (2010)

[Figure: “bat” and “disco” as points in the space, separated by a distance d]
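As a minimal sketch of the idea on this slide, the snippet below treats each word as a vector of context counts and uses cosine similarity as the relatedness measure. The three context features and all counts are made up for illustration; the slides do not say which metric or features are used at this point.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

# Hypothetical counts over three invented context features:
# ("swing the ___", "night at the ___", "join the ___")
vectors = {
    "bat":   [8.0, 0.0, 1.0],
    "club":  [5.0, 6.0, 7.0],
    "disco": [0.0, 9.0, 1.0],
}

sim_bat_club = cosine_similarity(vectors["bat"], vectors["club"])
sim_bat_disco = cosine_similarity(vectors["bat"], vectors["disco"])
```

With these toy counts, "bat" comes out far more similar to "club" than to "disco", which is the kind of relatedness judgement a distributional model is meant to support.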

Distributional Lexical Semantics

[Figure: “bat”, “club”, and “disco” as points in the space]

“meaning violates the triangle inequality” Tversky and Gati (1982), Griffiths et al. (2007)

Distributional Lexical Semantics

[Figure: “bat” and “disco”, with “club” split into two senses, club1 and club2]

“meaning violates the triangle inequality” Tversky and Gati (1982), Griffiths et al. (2007)

• Address metric violations by learning word sense clusters / making use of local context

• Can we build a model that captures this directly?
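The triangle-inequality point can be illustrated with a small sketch. All coordinates below are invented, and taking the distance to the nearest sense is one simple way to use sense clusters, not necessarily the approach taken in the work cited on the slide.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical 2-D positions for illustration only.
bat = (0.0, 0.0)
disco = (10.0, 0.0)

# One vector per word: a true metric forces
#   d(bat, disco) <= d(bat, club) + d(club, disco),
# so a single "club" point cannot be close to both.
club = (5.0, 0.0)
single_vector_bound = euclidean(bat, club) + euclidean(club, disco)

# Two sense vectors: club1 sits near "bat", club2 near "disco".
club_senses = [(1.0, 0.0), (9.0, 0.0)]

def sense_distance(senses, other):
    """Distance from a multi-sense word to another point via its nearest sense."""
    return min(euclidean(s, other) for s in senses)

d_bat_club = sense_distance(club_senses, bat)
d_club_disco = sense_distance(club_senses, disco)
d_bat_disco = euclidean(bat, disco)

# The word-level "distance" no longer obeys the triangle inequality.
violates_triangle = d_bat_disco > d_bat_club + d_club_disco
```

Here "club" is at distance 1 from both "bat" and "disco", while "bat" and "disco" stay 10 apart, which a single point per word could never achieve.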

Cross-cutting Concept Organization

[Figure: food items organized under overlapping category labels: breakfast food, french food, healthy, snack, chinese food, unhealthy, dinner food, indian food]

• Human concept organization exhibits cross-cutting structure Rosch et al. (1976); Ross & Murphy (1999); Medin et al. (2005); Shafto et al. (2011)

• Each categorization system controls what kinds of generalizations (e.g. inferences) are valid.

• Do word usages exhibit similar cross-cutting?

• Xue, Chen and Palmer (2006): sense disambiguation requires vastly different features for different polysemous verbs in Chinese.

Multi View Multinomial Clustering

• There are many valid word clusterings, each capturing different aspects of syntax or topicality

• We introduce a model to explicitly capture multiple organizational systems

• Cross-cutting categorization / latent subspaces with separate, coherent clusterings

• Implement using LDA and DPMM primitives / Gibbs sampling

[Figure 1 content, as recovered:
Contexts: and is ___, we are ___, he is ___; and are ___, which was ___, who are ___
Word clusters: unwilling willing reluctant refusing glad | exceedingly sincerely logically justly appropriately | about because
Contexts: brand new ___, selection of ___, ___ for sale; results for ___, the latest ___, to buy ___
Word clusters: samsung panasonic toshiba sony epson | toyota nissan mercedes volvo audi | dunlop yokohama toyo uniroyal michelin]

Figure 1: Example clusterings from MVM applied to Google n-gram data. Top contexts (features) for each view are shown, along with examples of word clusters. Although these particular examples are interpretable, in …
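One of the "DPMM primitives" mentioned above is the prior over cluster assignments. A common way to express it is the Chinese restaurant process, sketched below under the assumption that this representation is used; the slides do not state it, and the function name and `alpha` value are illustrative.

```python
def crp_assignment_probs(cluster_sizes, alpha):
    """Prior probability of joining each existing cluster, or a new one.

    cluster_sizes: number of items currently in each cluster.
    alpha: concentration parameter (larger values favor new clusters).
    The last entry of the result is the probability of opening a new cluster.
    """
    total = sum(cluster_sizes) + alpha
    probs = [size / total for size in cluster_sizes]
    probs.append(alpha / total)
    return probs

# Two existing clusters with 3 and 1 members, alpha = 1.0.
probs = crp_assignment_probs([3, 1], alpha=1.0)
```

In a Gibbs sampler, these prior weights would typically be multiplied by the likelihood of the item's features under each cluster before sampling a new assignment.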

Multi View Multinomial Clustering Model

[Diagram: Views 1, 2, and 3, each partitioning the data into its own set of clusters]

Data (example words, each with document features and context features)

Austin
Documents: History of Austin, Texas; University of Texas Medical Branch; 1993 Pacific hurricane season; Rutherford B. Hayes; List of pipeline accidents; List of Austin City Limits performers; Texas in the American Civil War; 6th Cavalry Regiment (United States)
Contexts: ___ texas homes, ___ law school, the citizens of ___, the ___ business directory, ___ police department, university in ___, ___ vacation rentals, the ___ parks and, by the ___ business journal, coming to ___, the ___ area, deals on ___ hotels

Betrayed
Documents: Survivor: The Amazon; Personal life of Marcus Tullius Cicero; Numb3rs; Huns; Rurouni Kenshin; Liberation of Paris; The Knightly Tale of Gologras and Gawain; Territories in The Pendragon Adventure; A Storm of Swords; Connor MacLeod; Paul Atreides
Contexts: her manner ___, being ___ by their, ___ and murdered, ___ his weakness, she ___ him, ___ the secret, ___ by her husband, a voice that ___, who felt ___, ___ to the police, ___ their country, suspected of having ___, ___ the confidence, even when ___

Cat
Documents: South China Tiger; Hybrid (biology); List of mammals of Cameroon; Cantonese cuisine; Pound Puppies; Wonder Pets; The Wizard of Oz (1902 stage play); Mee-Ow; Animal rights; Rickrolling; Mera (comics); Taboo food and drink; Tuna; Garfield: The Movie
Contexts: ate the ___, have a ___ and a, the ___ and the mouse, the ___ who killed, ___ toys by, ___ in the city, ___ was diagnosed, crazy ___ lady, ___ of the month, protect your ___ from, new ___ food, and bought a ___, ___ or other animal, a sick ___

[Diagram: each word d (e.g. Cat, Betrayed) receives one cluster assignment per view: $c_{1,d}$ in View 1, $c_{2,d}$ in View 2, $c_{3,d}$ in View 3]

• Select a cluster assignment $c_{v,d}$ for d in each view v (DPMM), i.e. words are assigned to clusters within each view

• Select a view $v_{f,d}$ for each observed feature f, and generate it from $c_{v_{f,d},d}$ (LDA), i.e. features are distributed between views
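The two generative steps above can be sketched as follows. This is a toy illustration only: all names, weights, and feature distributions are hypothetical, the number of clusters per view is fixed for simplicity (a DPMM would leave it unbounded), and inference by Gibbs sampling is not shown.

```python
import random

def sample_index(weights, rng):
    """Draw an index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

def generate_features(view_weights, cluster_weights, feature_dists, n_features, rng):
    """Generate (view, feature) pairs for one word d.

    Hypothetical arguments:
      view_weights[v]        weight of view v for this word
      cluster_weights[v][c]  weight of cluster c within view v
      feature_dists[v][c]    dict mapping feature -> probability
    """
    # Step 1: one cluster assignment c_{v,d} per view.
    clusters = [sample_index(weights, rng) for weights in cluster_weights]
    features = []
    for _ in range(n_features):
        # Step 2: pick a view v_{f,d} for this feature, then emit the feature
        # from the cluster the word occupies in that view.
        view = sample_index(view_weights, rng)
        dist = feature_dists[view][clusters[view]]
        names = list(dist)
        feature = names[sample_index([dist[name] for name in names], rng)]
        features.append((view, feature))
    return clusters, features

# Toy setup: 2 views with 2 clusters each (made-up numbers).
feature_dists = [
    [{"the ___ and the mouse": 0.7, "a sick ___": 0.3},
     {"city of ___": 0.6, "hotels in ___": 0.4}],
    [{"she ___ him": 0.5, "___ by her husband": 0.5},
     {"we are ___": 0.8, "he is ___": 0.2}],
]
rng = random.Random(0)
clusters, features = generate_features(
    view_weights=[0.5, 0.5],
    cluster_weights=[[0.9, 0.1], [0.5, 0.5]],
    feature_dists=feature_dists,
    n_features=5,
    rng=rng,
)
```

Each generated feature is tied to exactly one view, so a word's features are split across views while the word itself keeps a cluster label in every view.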

Figure 2: Topics with Senses: Shows top 20% of features for each view in a 3-view MVM fit to Google n-gram context data; different views place different mass on different sets of features. Cluster groupings within each view are shown. View 1 cluster 2 and View 3 cluster 1 both contain past-tense verbs, but only overlap on a subset of syntactic features.

[Figure 2 content, as recovered. Panels: View 1, View 2, View 3, with cluster groupings within each view, plotted against context features.
Word groups shown:
- betrayed conquered disappointed divorced embarked frustrated guarded hated knocked murdered praised stationed stole summoned wounded
- arbitrary characteristic comparative evolutionary fundamental inadequate inferior integral mystical poetic psychological radical singular systematic
- austin baltimore charlotte dallas pittsburgh richmond
- kent liverpool manchester newcastle
- austin betrayed charlotte conquered disappointed divorced embarked frustrated guarded hated jackson kent knocked murdered newcastle praised richmond secretly stationed stole summoned wounded
- arbitrary betrayed characteristic conquered disappointed divorced embarked evolutionary examine franklin frustrated fundamental guarded hated inadequate inferior integral jackson knocked likelihood murdered mystical poetic praised proportional radical secretly singular stationed stole summoned systematic wounded
- arbitrary austin baltimore characteristic comparative dallas evolutionary franklin fundamental inadequate inferior integral jackson kent likelihood liverpool mystical newcastle pittsburgh poetic proportional psychological radical richmond singular
- secretly
Example context features: real estate in ___, city of ___, welcome to ___, located in ___, born in ___, hotels in ___, we are ___, he is ___, she was ___, might be ___]

Data

Austin
Documents: History of Austin, Texas; University of Texas Medical Branch; 1993 Pacific hurricane season; Rutherford B. Hayes; List of pipeline accidents; List of Austin City Limits performers; Texas in the American Civil War; 6th Cavalry Regiment (United States)
Contexts: ___ texas homes, ___ law school, the citizens of ___, the ___ business directory, ___ police department, university in ___, ___ vacation rentals, the ___ parks and, by the ___ business journal, coming to ___, the ___ area, deals on ___ hotels

Betrayed
Documents: Survivor: The Amazon; Personal life of Marcus Tullius Cicero; Numb3rs; Huns; Rurouni Kenshin; Liberation of Paris; The Knightly Tale of Gologras and Gawain; Territories in The Pendragon Adventure; A Storm of Swords; Connor MacLeod; Paul Atreides
Contexts: her manner ___, being ___ by their, ___ and murdered, ___ his weakness, she ___ him, ___ the secret, ___ by her husband, a voice that ___, who felt ___, ___ to the police, ___ their country, suspected of having ___, ___ the confidence, even when ___

• Word set: Top 43.7k words ranked by frequency in Wikipedia (excluding the top 1% as stop words)

• Syntax features: Contextual patterns from combined Google Web n-gram + Google Books n-gram corpus (3.5M features)

• Document features: Wikipedia article occurrence count (120k features)
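To make the "contextual patterns" concrete, here is a rough sketch of how such features could be derived from n-gram counts by blanking out the target word. The n-grams and counts are invented, and the actual extraction pipeline behind the 3.5M features is not described on the slide.

```python
from collections import Counter

def context_patterns(ngram_tokens, target):
    """Return one pattern per occurrence of target, with that token replaced by ___."""
    patterns = []
    for i, token in enumerate(ngram_tokens):
        if token == target:
            blanked = ngram_tokens[:i] + ["___"] + ngram_tokens[i + 1:]
            patterns.append(" ".join(blanked))
    return patterns

# Hypothetical (n-gram, corpus count) pairs.
ngrams = [
    ("the cat and the mouse", 12),
    ("a sick cat", 3),
    ("new cat food", 7),
]

# Accumulate a sparse feature vector for the word "cat".
cat_features = Counter()
for text, count in ngrams:
    for pattern in context_patterns(text.split(), "cat"):
        cat_features[pattern] += count
```

The resulting keys ("the ___ and the mouse", "a sick ___", "new ___ food") have the same shape as the context features listed for Cat earlier in the deck.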

Intrusion Task

word: humor, ingenuity, delight, advertisers, astonishment
context: ___ is characterized, symptoms of ___, cases of ___, in cases of ___, real estate in ___
document: Puerto Rican cuisine, Greek cuisine, ThinkPad, Palestinian cuisine, Field ration

• “Model-based” lexical semantics: read word similarity directly from the model
• Intruders are drawn from the top terms in other clusters
• More robust than asking for numeric similarity judgements
• Less inter-rater calibration required

Chang et al. (2009)
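A minimal sketch of how one intrusion question might be assembled from a clustering, following the bullets above. The cluster names, the extra words in the second cluster, and the function itself are hypothetical; the slides do not specify how many top terms are shown or how intruders are sampled beyond "top terms in other clusters".

```python
import random

def build_intrusion_task(clusters, target, n_items=4, rng=None):
    """Return (shuffled options, intruder) for one intrusion question.

    clusters: dict mapping cluster name -> list of top terms, best first.
    target: the cluster being evaluated.
    The intruder is drawn from the top terms of a different cluster.
    """
    rng = rng or random.Random()
    items = list(clusters[target][:n_items])
    other = rng.choice([name for name in clusters if name != target])
    # Raises IndexError if the other cluster offers no unused term.
    intruder = rng.choice([term for term in clusters[other] if term not in items])
    options = items + [intruder]
    rng.shuffle(options)
    return options, intruder

# Hypothetical clusters; the first reuses the words from the slide's example.
clusters = {
    "cluster_a": ["humor", "ingenuity", "delight", "astonishment"],
    "cluster_b": ["advertisers", "retailers", "vendors"],
}
options, intruder = build_intrusion_task(clusters, "cluster_a", rng=random.Random(1))
```

A rater sees only `options` and is asked to pick the item that does not belong; the answer is correct when it matches `intruder`.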

Evaluation

• Amazon Mechanical Turk
• 1256 unique raters (Country=US, >96% approval)
• 5.7k unique intrusion tasks at 5x duplication: ~30k evaluations total
• 2736 rejected
• Per-user average time for …

[Sidebar: User Comments (U1, U2, U3)]
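The "% correct" scores reported for each model can be thought of as the fraction of rater responses that identify the intruder. The sketch below assumes a simple per-response tally; the response records are invented, and any filtering of rejected responses is left out.

```python
from collections import defaultdict

def percent_correct(responses):
    """Fraction of responses per model in which the rater chose the intruder.

    responses: iterable of (model_name, chosen_item, intruder_item) tuples.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for model, chosen, intruder in responses:
        totals[model] += 1
        hits[model] += int(chosen == intruder)
    return {model: hits[model] / totals[model] for model in totals}

# Hypothetical rater responses for two model labels.
responses = [
    ("MVM", "advertisers", "advertisers"),
    ("MVM", "humor", "advertisers"),
    ("MVM", "ThinkPad", "ThinkPad"),
    ("LDA", "delight", "advertisers"),
    ("LDA", "ThinkPad", "ThinkPad"),
]
scores = percent_correct(responses)
```

Scores fall in the range 0.0 to 1.0, matching the axis used in the result plots.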

Syntax features only (freq > 50; “common”)

[Figure (a): Syntax-only, common n-gram contexts. Plots of % correct (0.0 to 1.0) on context intrusion and word intrusion for each model configuration: DPMM−0.1−0.1, DPMM−0.1−0.01; LDA−50M, 100M, 200M, 300M, 500M, 1000M (each with −0.1−0.1 and −0.1−0.01); MVM−3M, 5M, 10M, 20M, 30M, 50M, 100M with −0.1−0.01, plus MVM−5M and MVM−10M with −0.1−0.005. Additional panels plot % correct against model size (clusters) for LDA and MVM.]

Syntax features only (freq < 50; “rare”)

[Figure (b): Syntax-only, rare n-gram contexts. Plots of % correct (0.0 to 1.0) on context intrusion and word intrusion for each model configuration: DPMM−0.1−0.1, DPMM−0.1−0.01; LDA−50M, 100M, 200M, 300M, 500M, 1000M (each with −0.1−0.1 and −0.1−0.01); MVM−3M, 5M, 10M, 20M, 30M, 50M, 100M with −0.1−0.01, plus MVM−5M and MVM−10M with −0.1−0.005. Additional panels plot % correct against model size (clusters) for LDA and MVM.]

“Common” syntax features + document features

[Figure (c): Syntax+Documents, common n-gram contexts. Plots of % correct (0.0 to 1.0) on context intrusion, document intrusion, and word intrusion for each model configuration: DPMM−0.1−0.1, DPMM−0.1−0.01; LDA−50M, 100M, 200M, 300M, 500M, 1000M (each with −0.1−0.1 and −0.1−0.01); MVM−3M, 5M, 10M, 20M, 30M, 50M, 100M with −0.1−0.01, plus MVM−5M and MVM−10M with −0.1−0.005. Additional panels plot % correct against model size (clusters) for LDA and MVM.]

Conclusion

• Introduced a latent variable model accounting for cross-cutting / multiple clustering structure in word meaning

• Large-scale human evaluation of the semantic coherence of similarity predictions

• Significantly higher precision intrusion identification than related model-based approaches

• Even for fine-grained clusterings