New perspectives on the performance of machine learning classifiers for mode choice prediction: An experimental review

Hillel, Tim; (2021) New perspectives on the performance of machine learning classifiers for mode choice prediction: An experimental review. In: Proceedings of the 21st Swiss Transport Research Conference. (pp. 1-33). Swiss Transport Research Conference. Green open access.

Abstract

It appears to be a commonly held belief that Machine Learning (ML) classification algorithms should achieve substantially higher predictive performance than manually specified Random Utility Models (RUMs) for choice modelling. This belief is supported by several papers in the mode choice literature, which highlight stand-out performance of non-linear ML classifiers compared with linear models. However, many studies which compare ML classifiers with linear models have a fundamental flaw in how they validate models on out-of-sample data. This paper investigates the implications of this issue by repeating the experiments of three past papers using two different sampling methods for panel data. The results indicate that using trip-wise sampling with travel diary data causes significant data leakage. Furthermore, the results demonstrate that this data leakage introduces substantial bias in model performance estimates, particularly for flexible non-linear classifiers. Grouped sampling is found to address the issues associated with trip-wise sampling and provides reliable estimates of true Out-Of-Sample (OOS) predictive performance. Whilst the results from this study indicate that there is a slight predictive performance advantage of non-linear classifiers over linear Logistic Regression (LR) models, this advantage is much more modest than has been suggested by previous investigations.
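
The difference between the two sampling methods named in the abstract can be illustrated with a minimal sketch. It is not the paper's code: the toy travel diary, the field names (`person_id`, `trip_id`, `mode`) and the helper functions are all hypothetical, and the paper's actual datasets and splitting procedure may differ. The sketch only shows why trip-wise sampling lets the same person appear on both sides of the split, while grouped sampling does not.

```python
import random


def trip_wise_split(trips, test_fraction=0.25, seed=0):
    """Assign individual trips to train/test at random (hypothetical helper).

    Trips made by the same person can land on both sides of the split.
    """
    rng = random.Random(seed)
    shuffled = list(trips)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def grouped_split(trips, test_fraction=0.25, seed=0):
    """Assign whole persons to train or test (hypothetical helper).

    Every trip of a given person ends up on the same side of the split.
    """
    rng = random.Random(seed)
    persons = sorted({t["person_id"] for t in trips})
    rng.shuffle(persons)
    n_test = max(1, int(len(persons) * test_fraction))
    test_persons = set(persons[:n_test])
    train = [t for t in trips if t["person_id"] not in test_persons]
    test = [t for t in trips if t["person_id"] in test_persons]
    return train, test


def shared_persons(train, test):
    """Return the persons who contribute trips to both train and test."""
    return {t["person_id"] for t in train} & {t["person_id"] for t in test}


# Assumed toy travel diary: 10 persons with 4 trips each (40 trips in total).
trips = [
    {"person_id": p, "trip_id": k, "mode": "walk" if p % 2 else "car"}
    for p in range(10)
    for k in range(4)
]

train_t, test_t = trip_wise_split(trips)
train_g, test_g = grouped_split(trips)

# Trip-wise: 10 test trips cannot be made up of whole persons (4 trips each),
# so at least one person is split across train and test.
print("persons in both sets, trip-wise:", len(shared_persons(train_t, test_t)))
# Grouped: no person appears in both sets.
print("persons in both sets, grouped:", len(shared_persons(train_g, test_g)))
```

In this sketch a flexible classifier trained on the trip-wise split could learn person-specific patterns from the training trips and be scored on other trips by the same people, which is the kind of leakage the abstract describes. Libraries such as scikit-learn offer grouped splitters for the same purpose.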

Type: Proceedings paper
Title: New perspectives on the performance of machine learning classifiers for mode choice prediction: An experimental review
Event: 21st Swiss Transport Research Conference
Location: Ascona, Switzerland
Dates: 12 Sep 2021 - 14 Sep 2021
Open access status: An open access version is available from UCL Discovery
Publisher version: https://www.strc.ch/2021.php
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher's terms and conditions.
Keywords: Machine learning, random utility, discrete choice models, validation
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Civil, Environ and Geomatic Eng
URI: https://discovery.ucl.ac.uk/id/eprint/10174117