Continued reliance on human operators for managing
data centers is a major impediment to their
ever reaching extreme dimensions. Large computer systems
in general, and data centers in particular, will ultimately be
managed using predictive computational and executable models
obtained through data-science tools, and at that point, the
intervention of humans will be limited to setting high-level
goals and policies rather than performing low-level operations.
Data-driven autonomics, where management and control are
based on holistic predictive models that are built and updated
using generated data, opens one possible path towards limiting
the role of operators in data centers. In this paper, we present
a data-science study of a public Google dataset collected in a
12K-node cluster with the goal of building and evaluating a
predictive model for node failures. We use BigQuery, the big
data SQL platform from the Google Cloud suite, to process
massive amounts of data and generate a rich feature set
characterizing machine state over time. We describe how an
ensemble classifier can be built out of many Random Forest
classifiers, each trained on these features, to predict whether machines
will fail in a future 24-hour window. Our evaluation reveals
that if we limit false positive rates to 5%, we can achieve true
positive rates between 27% and 88% with precision varying
between 50% and 72%. We discuss the practicality of including
our predictive model as the central component of a data-driven
autonomic manager and operating it on-line with live data
streams (rather than off-line on data logs). All of the scripts
used for BigQuery and classification analyses are publicly
available from the authors’ website.
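
The evaluation criterion above (cap the false positive rate at 5%, then report the true positive rate and precision) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the unweighted averaging used to combine per-classifier scores, and the toy scores for 4 failing and 20 safe machines are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def average_scores(member_scores: Sequence[Sequence[float]]) -> List[float]:
    """Combine per-classifier failure scores by unweighted averaging.

    Assumption: the ensemble's combination rule may differ in the paper;
    plain averaging is used here only to keep the sketch short.
    """
    n_members = len(member_scores)
    return [sum(column) / n_members for column in zip(*member_scores)]


def operating_point(
    scores: Sequence[float], labels: Sequence[int], max_fpr: float = 0.05
) -> Tuple[float, float, float, float]:
    """Return (threshold, fpr, tpr, precision) at the lowest threshold
    whose false positive rate stays within max_fpr.

    labels: 1 = machine fails in the prediction window, 0 = safe.
    A machine is predicted to fail when its score >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one failing and one safe example")

    best = None
    # Walk thresholds from strictest to loosest; TPR can only grow.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fpr = fp / negatives
        if fpr > max_fpr:
            break  # any lower threshold would also exceed the cap
        best = (threshold, fpr, tp / positives, tp / (tp + fp))
    if best is None:
        raise ValueError("no threshold satisfies the false positive cap")
    return best


# Toy data (hypothetical): 4 failing machines followed by 20 safe ones,
# scored by two ensemble members.
labels = [1] * 4 + [0] * 20
member_a = [1.0, 0.75, 0.5, 0.25] + [0.75, 0.25] + [0.0] * 18
member_b = [0.75, 0.75, 0.75, 0.25] + [0.5, 0.5] + [0.25] * 18

ensemble = average_scores([member_a, member_b])
threshold, fpr, tpr, precision = operating_point(ensemble, labels, max_fpr=0.05)
```

On this toy data the selected threshold admits one false positive out of 20 safe machines (a 5% false positive rate) while catching 3 of the 4 failures, so both the true positive rate and the precision are 0.75. The sketch also shows why the two metrics can diverge on real data: precision depends on the ratio of failing to safe machines, whereas the true positive rate does not.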