A Neural Model for Part-of-Speech Tagging in Historical Texts

Hardmeier, Christian

oai:pure.ed.ac.uk:publications/bba43940-2737-4ab7-ba6c-575aa098acdf

Publication date: 16 December 2016

Abstract

Historical texts are challenging for natural language processing because they differ linguistically from modern texts and because they lack orthographical and grammatical standardisation. We use a character-level neural network to build a part-of-speech (POS) tagger that can process historical data directly, without requiring a separate spelling normalisation stage. Its performance on a Swedish verb identification task and a German POS tagging task is similar to that of a two-stage model. We analyse the performance of this tagger and of a more traditional baseline system, discuss some of the remaining problems for tagging historical data, and suggest how the flexibility of our neural tagger could be exploited to address diachronic divergences in morphology and syntax in early modern Swedish with the help of data from closely related languages.

Last time updated on 21/11/2019

This paper was published in Edinburgh Research Explorer.
