Experiments on Supervised Learning Algorithms for Text Categorization

Namburu, Setu Madhavi; Tu, Haiying; Luo, Jianhui; Pattipati, Krishna R.

The modern information society faces the challenge of handling a massive volume of online documents, news, intelligence reports, and so on. Using this information accurately and in a timely manner has become a major concern in many areas. Although such information may also include images and voice, we focus on the categorization of text data in this paper. We provide a brief overview of the information processing flow for text categorization and discuss two supervised learning algorithms, namely support vector machines (SVM) and partial least squares (PLS), which have been successfully applied in other domains, e.g., fault diagnosis [9]. While SVM has been well explored for binary classification and has been reported as an efficient algorithm for text categorization, PLS has not yet been applied to text categorization. Our experiments are conducted on three data sets: the Reuters-21578 dataset on corporate mergers and data acquisitions (ACQ), WebKB, and 20-Newsgroups. Results show that the performance of PLS is comparable to that of SVM in text categorization. A major drawback of SVM for multi-class categorization is that it requires a voting scheme based on the results of pairwise classification. PLS does not have this drawback and could be a better candidate for multi-class text categorization.

Document ID

20060051818

Acquisition Source

Ames Research Center

Document Type

Conference Paper

Date Acquired

August 23, 2013

Publication Date

January 1, 2005

Publication Information

ISBN: 0-7803-8870-4

Meeting Information

Meeting: Aerospace 2005 IEEE Conference

Country: United States

Start Date: March 5, 2005

End Date: March 12, 2005

Sponsors: Institute of Electrical and Electronics Engineers

Distribution Limits

Public
