NASA's Online Machine Aided Indexing System - NTRS

NASA Contractor Report 4518

NASA's Online Machine Aided Indexing System

Final Report

June P. Silvester, Michael T. Genuardi, and Paul H. Klingbiel
RMS Associates
Linthicum Heights, Maryland

Prepared for
NASA Scientific and Technical Information Program
under Contract NASW-4584

Rejected transactions are reported with one of the following messages:

Too Many $'s.  More than one $ appears in the transaction. Only one $ may be used; it separates the key from the posting term(s).

Invalid Format.  The transaction does not conform to one of the following formats:

    Word;00$Posting term(s)
    Word1;Word2$Posting term(s)
    DEL$(Key of the record to be deleted)

Posting Terms Not Found in the Thesaurus.  The posting term(s) may be invalid or may have been removed from the thesaurus.

[The remainder of the error-message table is not recoverable from the extracted text.]

The person doing the maintenance corrects the rejected transactions in the modification dataset and re-executes the NASALOAD command.
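The three rejection messages named for NASALOAD ("Too Many $'s", "Invalid Format", "Posting Terms Not Found in the Thesaurus") can be illustrated with a minimal Python sketch. It is not the NLD implementation. The function name `check_transaction`, the comma as separator between posting terms, the treatment of `00` as always acceptable, and the representation of the thesaurus as a Python set are all assumptions made for illustration.

```python
# Hypothetical sketch of transaction checking; the format details are assumptions.
TOO_MANY = "Too Many $'s"
INVALID = "Invalid Format"
NOT_FOUND = "Posting Terms Not Found in the Thesaurus"


def check_transaction(line, thesaurus):
    """Return None if the transaction looks acceptable, else an error message."""
    if line.count("$") > 1:
        return TOO_MANY
    if line.count("$") == 0:
        return INVALID
    # Exactly one $: the left side is the key, the right side the posting term(s).
    key, postings = (part.strip() for part in line.split("$"))
    if not key or not postings:
        return INVALID
    if key == "DEL":
        # DEL$key: the right side names the record to delete, not posting terms.
        return None
    terms = [t.strip() for t in postings.split(",")]
    # Assumption: "00" stands for "no posting term" and needs no thesaurus entry.
    missing = [t for t in terms if t != "00" and t not in thesaurus]
    return NOT_FOUND if missing else None
```

A caller could run every line of the modification dataset through such a check and collect the rejected lines for correction before the next load.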

Following execution of the NASALOAD command, the printout generated is examined by NLD personnel to:

See whether or not the job has run satisfactorily. (The Job Control Language return codes (RC) should all equal zeros.)

See whether or not there are any errors listed that must be corrected.

Record the number of changes and additions to, and deletions from, the KB. These figures are accumulated for reports.

[A passage on changing or replacing existing KB entries, including continuation entries, is not recoverable from the extracted text.]

NASAPRNT. This command generates a print of the KB sorted alphabetically by key. A sample page is shown in Figure E-1.

NASANVRT. This command generates a print of the KB sorted alphabetically by posting term. Entries with multiple posting terms are listed once for each posting term. Before the creation of the KBM program, this printout was essential in order to locate entries with a particular posting term. A sample page is shown in Figure E-2.

08/01/92    NASA LEXICAL DICTIONARY    PAGE

[Key and posting-term columns are listed separately below; row alignment was lost in extraction.]

Keys:
(HG,CD)TE;999 | &;999 | /U/;999 | A;AND | A;AND;B | A;AND;B;STARS | A;AND;B-TYPE | A;AND;F | A;AND;F;MAIN | A;AND;F;MAIN;SEQUENCE | A;AND;F;STARS | A;AND;F;TYPE | A;AND;F-TYPE | A;BAND | A;GIANT | A;GIANTS | A;STAR | A;STARS | A;SUPERGIANT | A;SUPERGIANTS | A;TYPE | A;TYPE;SHELL | A;TYPE;SHELL;STAR | A;TYPE;SHELL;STARS | A;TYPE;STAR | A;TYPE;STARS | A;136T | A;999 | A-;AND | A-;AND;B-STARS | A-;999 | A-ALKYLACRYLATE;POLYMERS | A-ALKYLACRYLATE;999 | A-B;BINARY | A-B;999 | A-BAND;999 | A-C;999 | A-F;STARS | A-I;TECHNOLOGY | A-M/AGC;999 | A-SHELL;STAR | A-SHELL;STARS | A-SI:H;FILMS

Posting terms:
CADMIUM TELLURIDES, MERCURY TELLURIDES | 00 | 00 | A STARS, B STARS | A STARS, B STARS | A STARS, F STARS, MAIN SEQUENCE STARS | A STARS, F STARS | A STARS, F STARS | A STARS, F STARS | 00 | A STARS, GIANT STARS | A STARS, GIANT STARS | 00 | A STARS | A STARS, SUPERGIANT STARS | A STARS, SUPERGIANT STARS | A STARS | A STARS | A STARS | A STARS | GALACTIC CLUSTERS | 00 | A STARS, B STARS | 00 | ACRYLIC RESINS | ACRYLATES, ALKYL COMPOUNDS | BINARY MIXTURES | 00 | 00 | ALTERNATING CURRENT | A STARS, F STARS | ARTIFICIAL INTELLIGENCE | AUTOMATIC GAIN CONTROL | A STARS | A STARS | AMORPHOUS SILICON,

Figure E-1. NASAPRNT

10/29/91    NASA LEXICAL DICTIONARY BY POSTING TERM    PAGE 1230

T  COMMERCIAL;LAUNCH;VEHICLES        SPACE COMMERCIALIZATION, LAUNCH VEHICLES
T  COMMERCIAL;LAUNCH;VEHICLE         SPACE COMMERCIALIZATION, LAUNCH VEHICLES
T  EARTH;TO;ORBIT;VEHICLE            LAUNCH VEHICLES
T  EARTH;TO;ORBIT;VEHICLES           LAUNCH VEHICLES
T  EARTH-TO-ORBIT;LAUNCH;VEHICLE     LAUNCH VEHICLES
T  EARTH-TO-ORBIT;LAUNCH;VEHICLES    LAUNCH VEHICLES
T  EARTH-TO-ORBIT;VEHICLES           LAUNCH VEHICLES
T  BOOSTER;VEHICLE                   LAUNCH VEHICLES
T  BOOSTER;VEHICLES                  LAUNCH VEHICLES
T  AIR-BREATHING;LAUNCH;VEHICLE      AIR BREATHING BOOSTERS, LAUNCH VEHICLES
T  AIR;BREATHING;LAUNCH;VEHICLES     AIR BREATHING BOOSTERS, LAUNCH VEHICLES
T  AIR-BREATHING;LAUNCH;VEHICLES     AIR BREATHING BOOSTERS, LAUNCH VEHICLES
T  CARRIER;ROCKET                    LAUNCH VEHICLES
T  CARRIER;ROCKETS                   LAUNCH VEHICLES
T  AIR;BREATHING;LAUNCH;VEHICLE      AIR BREATHING BOOSTERS, LAUNCH VEHICLES
T  SOVIET;LAUNCH;VEHICLES            LAUNCH VEHICLES, SOVIET SPACECRAFT
T  SOVIET;LAUNCH;VEHICLE             LAUNCH VEHICLES, SOVIET SPACECRAFT
T  LAUNCH;WINDOW                     LAUNCH WINDOWS
T  LAUNCH;WINDOWS                    LAUNCH WINDOWS
T  LAUNCH;TIME                       LAUNCH WINDOWS
T  LAUNCH;OR;LANDING;WINDOW          LAUNCH WINDOWS, SPACECRAFT LANDING, WINDOWS (INTERVALS)
C  LAUNCHER;00                       LAUNCHERS
T  ELECTROMAGNETIC;LAUNCHERS         ELECTROMAGNETIC PROPULSION, LAUNCHERS
T  BOX;LAUNCHER                      LAUNCHERS
T  BOX;LAUNCHERS                     LAUNCHERS
E  LAUNCHERS;00                      LAUNCHERS
T  LAUNCHING;DEVICE                  LAUNCHERS
T  LAUNCHING;DEVICES                 LAUNCHERS
T  LAUNCH;TUBES                      LAUNCHERS
T  LAUNCH;MODES                      LAUNCHING
T  LAUNCH;00                         LAUNCHING
T  LIFT-OFF;00                       LAUNCHING
T  LIFT;OFF                          LAUNCHING
C  GROUND/LAUNCH;00                  LAUNCHING
T  LAUNCHING;BASE                    LAUNCHING BASES
T  LAUNCH;COMPLEX;00                 LAUNCHING BASES
T  LAUNCH;FACILITIES                 LAUNCHING BASES
T  LAUNCH;FACILITY                   LAUNCHING BASES
T  LAUNCH;COMPLEXES                  LAUNCHING BASES
T  LAUNCH;CONTROL;CENTER             LAUNCHING BASES
T  LAUNCH;CONTROL;FACILITIES         LAUNCHING BASES
T  LAUNCH;CONTROL;FACILITY           LAUNCHING BASES
T  LAUNCHING;BASES                   LAUNCHING BASES
T  LAUNCHING;COMPLEXES               LAUNCHING BASES
T  LAUNCHING;FACILITIES              LAUNCHING BASES
T  LAUNCHING;FACILITY                LAUNCHING BASES
T  LAUNCHER;COMPLEXES                LAUNCHING BASES

Figure E-2. NASANVRT
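The NASANVRT listing groups keys under posting terms, and a key with several posting terms appears under each of them. A minimal Python sketch of that inversion follows. The function name `invert_kb`, the dictionary representation of the KB, and the secondary sort by key within a posting term are assumptions for illustration; the sample page does not show keys in alphabetical order within a posting term, so the real ordering rule is unknown.

```python
# Hypothetical sketch: build a "by posting term" listing from key -> posting terms.
def invert_kb(kb):
    """Return (posting_term, key, all_terms) rows sorted by posting term, then key."""
    rows = []
    for key, terms in kb.items():
        for term in terms:
            # One row per posting term, so multi-term entries are listed repeatedly.
            rows.append((term, key, terms))
    rows.sort(key=lambda row: (row[0], row[1]))
    return rows


sample_kb = {
    "SOVIET;LAUNCH;VEHICLE": ["LAUNCH VEHICLES", "SOVIET SPACECRAFT"],
    "LAUNCH;WINDOW": ["LAUNCH WINDOWS"],
    "CARRIER;ROCKET": ["LAUNCH VEHICLES"],
}
listing = invert_kb(sample_kb)
```

Here `SOVIET;LAUNCH;VEHICLE` produces two rows, one under LAUNCH VEHICLES and one under SOVIET SPACECRAFT, which mirrors how such entries show both posting terms in the sample page.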

NASAFIND. This command is a quick and flexible way of displaying, on the terminal screen, one or more records in a VSAM or sequential dataset. Display begins with the record matching a requested key or partial key, and records are returned in groups of ten unless a count is specified. The recoverable operands are:

    IDS(dataset.name.here)
    FK('key;of;record')
    COUNT(xx)

[The remaining detail of the NASAFIND description, including the handling of a key that is not found, is not recoverable from the extracted text.]

NEWACC. This program is used to test the translation program. It provides a quick way of displaying how MAI will translate words, phrases, or one line of text through the KB. A sample of NEWACC output is illustrated in Figure E-3.
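The NASAFIND behavior suggested by the FK('key;of;record') and COUNT(xx) operands, namely positioning at a full or partial key and returning a limited number of records, can be sketched in Python with a binary search over a sorted list. This is an illustration under assumptions, not the command's implementation: the function name `find_records`, the in-memory list of (key, posting terms) pairs, and the default count of ten are hypothetical stand-ins for the VSAM or sequential dataset.

```python
import bisect


# Hypothetical sketch: position at the first key >= first_key, return `count` records.
def find_records(records, first_key, count=10):
    """records must be a list of (key, posting_terms) tuples sorted by key."""
    keys = [key for key, _ in records]
    start = bisect.bisect_left(keys, first_key)
    return records[start:start + count]


records = [
    ("A;STAR", "A STARS"),
    ("A;STARS", "A STARS"),
    ("A;TYPE", "A STARS"),
    ("A;TYPE;STAR", "A STARS"),
]
hits = find_records(records, "A;T", count=2)
```

Python compares strings by code point, which differs from the ordering visible in the sample NASAPRNT page (for example, keys ending in digits such as A;999 follow A;TYPE;STARS there, consistent with an EBCDIC collating sequence), so a faithful port would need a custom sort key.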

[Screen prompts shown in the sample: PLEASE ENTER PHRASE; ENTER A x. TO TERMINATE; PROCESSING PHRASE. The timestamp is truncated in the extracted text.]

Input phrase: abundances in chemically peculiar and normal A-type stars

ABUNDANCES;999    .USE.  ABUNDANCE
IN;999            .USE.  00
CHEMICALLY;999    .USE.  00
PECULIAR;999      .USE.  00
AND;999           .USE.  00
NORMAL;A_STARS    .USE.  A STARS
A;TYPE;STARS      .USE.  A STARS

Figure E-3. Sample of NEWACC Output
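The sample NEWACC output pairs KB keys with posting terms: single words appear with a `;999` suffix, multiword keys are joined with semicolons, and `00` appears where no index term results. One way such a lookup could work is a longest-match scan over the input words, sketched below in Python. This is an assumed technique for illustration, not the documented algorithm of the translation program; the function name `translate`, the `max_words` limit, and the tiny dictionary are hypothetical.

```python
import re


# Hypothetical sketch of a longest-match phrase lookup against a small KB.
def translate(phrase, kb, max_words=5):
    """Return (key, posting_terms) pairs found by scanning left to right."""
    words = re.findall(r"[A-Za-z0-9]+", phrase.upper())
    found = []
    i = 0
    while i < len(words):
        # Try the longest candidate key first, then progressively shorter ones.
        for n in range(min(max_words, len(words) - i), 0, -1):
            key = ";".join(words[i:i + n])
            if n == 1:
                key += ";999"  # assumption: single-word keys carry the ;999 suffix
            if key in kb:
                found.append((key, kb[key]))
                i += n
                break
        else:
            i += 1  # no KB entry starts at this word; move on
    return found


kb = {
    "ABUNDANCES;999": ["ABUNDANCE"],
    "IN;999": ["00"],
    "A;TYPE;STARS": ["A STARS"],
}
result = translate("abundances in A-type stars", kb)
```

Trying longer keys first lets `A;TYPE;STARS` win over any single-word entry for `A`, which is the behavior the sample output suggests for "A-type stars".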

REPORT DOCUMENTATION PAGE (Form Approved, OMB No. 0704-0188)

1. AGENCY USE ONLY (leave blank)
2. REPORT DATE: September 1993
3. REPORT TYPE AND DATES COVERED: Contractor Report
4. TITLE AND SUBTITLE: NASA's Online Machine Aided Indexing System, Final Report
5. FUNDING NUMBERS: C NASW-4584
6. AUTHOR(S): June P. Silvester, Michael T. Genuardi, and Paul H. Klingbiel
7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES): NASA Center for AeroSpace Information, Linthicum Heights, MD 21090-2934
8. PERFORMING ORGANIZATION REPORT NUMBER:
9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES): National Aeronautics and Space Administration, Washington, DC 20546
10. SPONSORING/MONITORING AGENCY REPORT NUMBER: NASA-CR-4518
11. SUPPLEMENTARY NOTES:
12a. DISTRIBUTION/AVAILABILITY STATEMENT: Unclassified - Unlimited. Subject Category 82
12b. DISTRIBUTION CODE:

13. ABSTRACT (maximum 200 words)

This report describes the NASA Lexical Dictionary, a machine aided indexing system used online at the National Aeronautics and Space Administration's Center for AeroSpace Information (CASI). This system is comprised of a text processor that is based on the computational, non-syntactic analysis of input text, and an extensive 'knowledge base' that serves to recognize and translate text-extracted concepts. The structure and function of the various NLD system components are described in detail. Methods used for the development of the knowledge base are discussed. Particular attention is given to a statistically-based text analysis program that provides the knowledge base developer with a list of concept-specific phrases extracted from large textual corpora. Production and quality benefits resulting from the integration of machine aided indexing at CASI are discussed along with a number of secondary applications of NLD-derived systems including on-line spell checking and machine aided lexicography.

14. SUBJECT TERMS: computer techniques, dictionaries, information retrieval, information systems, terminology, thesauri
15. NUMBER OF PAGES: 64
16. PRICE CODE: A04
17. SECURITY CLASSIFICATION OF REPORT: Unclassified
18. SECURITY CLASSIFICATION OF THIS PAGE: Unclassified
19. SECURITY CLASSIFICATION OF ABSTRACT: Unclassified
20. LIMITATION OF ABSTRACT: Unlimited

Available from NASA Center for AeroSpace Information, 800 Elkridge Landing Road, Linthicum Heights, MD 21090-2934, (301) 621-0390