Ripple-down rules based open information extraction for the web documents

Abstract

The World Wide Web contains a massive amount of information in unstructured natural language, and obtaining valuable information from informally written Web documents is a major research challenge. One research focus is Open Information Extraction (OIE), which aims to develop relation-independent information extraction. OIE systems seek to extract all potential relations from the text rather than only a few pre-defined relations. Previous machine learning-based OIE systems require large volumes of labelled training examples and have trouble handling the NLP tool errors caused by the Web's informality. These systems use self-supervised learning, which generates a labelled training dataset automatically using NLP tools and heuristic rules. As the number of NLP tool errors increases because of the Web's informality, the self-supervised labelling technique produces noisy labels and critical extraction errors. This thesis presents Ripple-Down Rules based Open Information Extraction (RDROIE), an approach to OIE that uses the Ripple-Down Rules (RDR) incremental learning technique. The key advantages of this approach are that it does not require a labelled training dataset, can handle the freer writing style that occurs in Web documents, and can correct errors introduced by NLP tools. The RDROIE system, with minimal low-cost rule addition, outperformed previous OIE systems on informal Web documents.
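
As a rough illustration of the incremental rule addition the abstract refers to, the Python sketch below shows a generic single-classification Ripple-Down Rules tree. It is not the RDROIE implementation. The feature names (`pattern`, `verb`), the labels, and the example cases are hypothetical and chosen only to show how a new rule is attached at the point where an existing rule gave a wrong conclusion.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    condition: Callable[[dict], bool]
    conclusion: str
    except_rule: Optional["Rule"] = None  # tried next when this rule fires
    else_rule: Optional["Rule"] = None    # tried next when this rule does not fire


def classify(root: Rule, case: dict) -> Optional[str]:
    """Return the conclusion of the last rule that fired along the path."""
    result = None
    node = root
    while node is not None:
        if node.condition(case):
            result = node.conclusion
            node = node.except_rule
        else:
            node = node.else_rule
    return result


def add_rule(root: Rule, case: dict, new_rule: Rule) -> None:
    """Attach new_rule at the end of the path that `case` follows."""
    node = root
    while True:
        if node.condition(case):
            if node.except_rule is None:
                node.except_rule = new_rule
                return
            node = node.except_rule
        else:
            if node.else_rule is None:
                node.else_rule = new_rule
                return
            node = node.else_rule


# Default rule: always fires and concludes that there is no relation.
root = Rule(lambda c: True, "no_relation")

# Hypothetical case 1: a noun phrase, a verb, a noun phrase.
case1 = {"pattern": "NP VERB NP"}
add_rule(root, case1, Rule(lambda c: c.get("pattern") == "NP VERB NP", "relation"))

# Hypothetical case 2: an informal token wrongly tagged as a verb by an NLP tool.
case2 = {"pattern": "NP VERB NP", "verb": "lol"}
add_rule(root, case2, Rule(lambda c: c.get("verb") == "lol", "no_relation"))

# Hypothetical case 3: no verb between the noun phrases.
case3 = {"pattern": "NP NP"}
```

Each correction is a single new rule attached below the rule that misfired, so earlier cases keep their conclusions and no labelled training set is involved in this sketch.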

Last time updated on 10/04/2018

This paper was published in UNSWorks.
