Punctuation Detection with Full Syntactic Parsing
Authors | |
---|---|
Year of publication | 2010 |
Type | Article in Periodical |
Magazine / Source | Research in Computing Science, Special issue: Natural Language Processing and its Applications |
Web | http://www.cicling.org/2010/Vol46.pdf |
Field | Informatics |
Keywords | punctuation; grammar checking; parsing; syntactic analysis |
Description | In many languages, including Czech, the correct placement of punctuation is governed by complex guidelines. Although these guidelines draw on morphology, syntax and semantics, state-of-the-art systems for punctuation detection and correction are limited to simple rule-based backbones. In this paper we present a syntax-based approach that uses the Czech parser synt. The parser uses an adapted chart parsing technique to build a chart structure for the sentence; synt can then process the chart and provide several kinds of output. The implemented punctuation detection technique relies on one of them: the automatic and unambiguous extraction of optimal syntactic structures from the sentence (noun phrases, verb phrases, clauses, relative clauses or inserted clauses). This feature yields information about the syntactic structures that determine where punctuation is expected. We also present experiments showing that the method can cover most syntactic phenomena needed for punctuation detection or correction. |
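
The idea in the description, deriving expected punctuation from extracted syntactic structures, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Span` type, the labels, and the function names are hypothetical and do not reflect synt's actual output format or interface, and the rule used (a comma on each side of a relative or inserted clause) is a deliberate simplification of the Czech guidelines.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical syntactic structure over word indices (not synt's format)."""
    label: str  # e.g. "relative_clause", "inserted_clause", "noun_phrase"
    start: int  # index of the first word in the structure
    end: int    # index one past the last word in the structure


# Assumption: only these structures are delimited by commas in this sketch.
COMMA_LABELS = {"relative_clause", "inserted_clause"}


def split_words_and_commas(text):
    """Return the words of `text` and the set of word indices followed by a comma."""
    words, commas = [], set()
    for tok in re.findall(r"\w+|,", text):
        if tok == ",":
            if words:
                commas.add(len(words) - 1)
        else:
            words.append(tok)
    return words, commas


def expected_commas(words, spans):
    """Return word indices after which the structures call for a comma."""
    positions = set()
    for span in spans:
        if span.label not in COMMA_LABELS:
            continue
        if span.start > 0:
            positions.add(span.start - 1)  # comma before the structure
        if span.end < len(words):
            positions.add(span.end - 1)    # comma after it, unless the sentence ends
    return positions


def check_punctuation(text, spans):
    """Return (missing, superfluous) comma positions as sorted word indices."""
    words, actual = split_words_and_commas(text)
    expected = expected_commas(words, spans)
    return sorted(expected - actual), sorted(actual - expected)
```

For the sentence "Muž který přišel, je můj bratr." with a relative clause assumed to cover the words "který přišel" (`Span("relative_clause", 1, 3)`), `check_punctuation` reports a missing comma after word 0 ("Muž") and no superfluous commas.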