Comparison of Grammar-Based and Statistical Language Models Trained on the Same Data

Hockey, Beth Ann; Rayner, Manny

This paper presents a methodologically sound comparison of the performance of grammar-based (GLM) and statistical (SLM) recognizer architectures, using data from the Clarissa procedure navigator domain. The Regulus open source packages make this comparison possible by providing a method for constructing a grammar-based language model by training on a corpus. We construct grammar-based and statistical language models from the same corpus and find that the grammar-based language models perform better in this domain. The best SLM version has a semantic error rate of 9.6%, while the best GLM version has a semantic error rate of 6.0%. Part of this advantage is accounted for by the GLM's lower Word Error Rate (WER) and Sentence Error Rate (SER): WER is 7.42% for the SLM versus 6.27% for the GLM, and SER is 12.41% versus 9.79%. The rest is most likely accounted for by the fact that the GLM architecture is able to use logical-form-based features, which permit tighter integration of recognition and semantic interpretation.
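To put the reported figures on a common scale, the short Python sketch below computes the relative error reduction of the GLM over the SLM for each metric. It is only an illustration of the arithmetic, not part of the paper: the function name `relative_reduction` is hypothetical, and the sketch assumes that in each reported pair the higher figure belongs to the SLM and the lower to the GLM, as the abstract's wording implies.

```python
# Illustrative arithmetic only; the figures are those quoted in the abstract.
# Assumption: in each pair the first (higher) value is the SLM and the
# second (lower) value is the GLM.

def relative_reduction(slm_rate: float, glm_rate: float) -> float:
    """Return the relative error reduction of the GLM over the SLM, in percent."""
    return (slm_rate - glm_rate) / slm_rate * 100.0

# (SLM %, GLM %) pairs taken from the abstract.
reported = {
    "semantic error rate": (9.6, 6.0),
    "WER": (7.42, 6.27),
    "SER": (12.41, 9.79),
}

reductions = {
    metric: relative_reduction(slm, glm)
    for metric, (slm, glm) in reported.items()
}

for metric, value in reductions.items():
    print(f"{metric}: {value:.1f}% relative reduction")
```

Under these assumptions the relative reduction in semantic error rate (about 37.5%) is larger than the reductions in WER (about 15.5%) and SER (about 21.1%), which is consistent with the abstract's statement that recognition accuracy explains only part of the GLM's advantage.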

Document ID

20050240847

Acquisition Source

Ames Research Center

Document Type

Conference Paper

Date Acquired

August 23, 2013

Publication Date

January 1, 2005

Meeting Information

Meeting: AAAI Workshop on Spoken Language Understanding

Location: Pittsburgh, PA

Country: United States

Start Date: July 9, 2005

End Date: July 10, 2005

Sponsors: American Association for Artificial Intelligence

Distribution Limits

Public

Public Use Permitted.
