Effectiveness of Methods for Syntactic and Semantic Recognition of Numeral Strings: Tradeoffs Between Number of Features and Len

Download eprints.utas.edu.au

Abstract. This paper describes and compares methods based on N-grams (specifically trigrams and pentagrams), together with five features, for recognising the syntactic and semantic categories of numeral strings representing money, number, date, etc., in texts. The system employs three interpretation processes: word N-gram construction with a tokeniser; rule-based processing of numeral strings; and N-gram-based classification. We extracted numeral strings from 1,111 online newspaper articles. For numeral string interpretation, we chose 112 (10%) of the 1,111 articles to provide unseen test data (1,278 numeral strings), and used the remaining 999 articles to provide 11,525 numeral strings for extracting N-gram-based constraints to disambiguate the meanings of the numeral strings. The word trigrams method resulted in 83.8%
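As a rough illustration of the first process named in the abstract (word N-gram construction with a tokeniser), the following Python sketch collects a word trigram or pentagram around each numeral string in a sentence. It is only a sketch under stated assumptions: the token patterns, the padding symbol, the choice to centre each window on the numeral, and the names `tokenise` and `numeral_contexts` are all hypothetical and are not taken from the paper, whose actual tokeniser, five features, and rules are not given in this record.

```python
import re

# Hypothetical patterns (assumptions, not the paper's tokeniser):
# a numeral string is a run of digits with optional internal separators.
NUMERAL_RE = re.compile(r"\d+(?:[.,:/-]\d+)*")
# A token is a numeral string, an alphabetic word, or one punctuation mark.
TOKEN_RE = re.compile(r"\d+(?:[.,:/-]\d+)*|[A-Za-z]+|[^\sA-Za-z\d]")


def tokenise(text):
    """Split text into numeral, word, and punctuation tokens."""
    return TOKEN_RE.findall(text)


def numeral_contexts(tokens, n=3, pad="<PAD>"):
    """Return (numeral, window) pairs, one per numeral token.

    Each window is a tuple of n tokens centred on the numeral
    (n=3 for trigrams, n=5 for pentagrams). Centring the window
    is an assumption made for this sketch.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so the numeral can sit in the centre")
    half = n // 2
    padded = [pad] * half + list(tokens) + [pad] * half
    contexts = []
    for i, tok in enumerate(tokens):
        if NUMERAL_RE.fullmatch(tok):
            # tokens[i] sits at padded[i + half], so this slice is centred on it.
            contexts.append((tok, tuple(padded[i:i + n])))
    return contexts


if __name__ == "__main__":
    sentence = "The company paid $ 25 million on 12/08/2007 ."
    tokens = tokenise(sentence)
    for numeral, window in numeral_contexts(tokens, n=3):
        print(numeral, window)
    for numeral, window in numeral_contexts(tokens, n=5):
        print(numeral, window)
```

In a pipeline of the kind the abstract outlines, windows like these, collected from the training articles, would be the raw material for N-gram-based constraints; how the paper derives and applies those constraints is not stated here.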

Kyongho Min, William H. Wilson, Byeong Ho Kang

Artificial Intelligence | AUSAI 2007 | Numeral Strings | Numeral Strings Interpretation | Strings Representing Money |

Added	12 Aug 2010
Updated	12 Aug 2010
Type	Conference
Year	2007
Where	AUSAI
Authors	Kyongho Min, William H. Wilson, Byeong Ho Kang

Sciweavers