Feeding decision support systems with Web information typically
requires sifting through an unwieldy amount of information that is
available in human-friendly formats only. Our focus is on a proposal
to extract information from semi-structured documents in a
structured format, with an emphasis on it being scalable and
open.
open. By semi-structured we mean that it must focus on informa tion that is rendered using regular formats, not free text; by scal able, we mean that the system must require a minimum amount of
human intervention and it must not be targeted to extracting in formation from a particular domain or web site; by open, we mean
that it must extract as much useful information as possible and not
be subject to any pre-defined data model. In the literature, there is
only one open proposal, and it is not scalable, since it requires human
supervision on a per-domain basis. In this paper, we present a new
proposal that relies on a number of heuristics to identify patterns
that are typically used to represent the information in a web
document. Our experimental results confirm that our proposal is very
competitive in terms of effectiveness and efficiency.
Ministerio de Economía y Competitividad TIN2016-75394-R
Ministerio de Economía y Competitividad TIN2013-40848-