Automatic extraction of knowledge from web documents

Alani, Harith; Kim, Sanghee; Millard, David E.; Weal, Mark J.; Lewis, Paul H.; Hall, Wendy and Shadbolt, Nigel R. (2003). Automatic extraction of knowledge from web documents. In: 2nd International Semantic Web Conference - Workshop on Human Language Technology for the Semantic Web and Web Services, 20-23 Oct 2003, Sanibel Island, Florida, USA.

URL: http://iswc2003.semanticweb.org/#workshops

Abstract

Much of the digital information available today is written as text documents such as web pages, reports, papers, and emails. Extracting the knowledge of interest from multiple such sources in a timely fashion is therefore crucial. This paper provides an update on the Artequakt system, which uses natural language tools to automatically extract knowledge about artists from multiple documents based on a predefined ontology. The ontology represents the type and form of knowledge to extract. This knowledge is then used to generate tailored biographies. The information extraction process of Artequakt is detailed and evaluated in this paper.

About

Item ORO ID: 20050
Item Type: Conference or Workshop Item
Academic Unit or School: Knowledge Media Institute (KMi); Faculty of Science, Technology, Engineering and Mathematics (STEM)
Research Group: Centre for Research in Computing (CRC)
Depositing User: Harith Alani
