Engaging Big Data
Jan 2016 - Dec 2020
Big Data-driven agent-based transport simulations
Traditionally, disaggregate demand modelling has been useful in explaining travel behaviour and forecasting travel demand. However, the biggest challenge in disaggregate demand modelling has perhaps been the reliance on a few observations of (usually) reported human behaviour. Recorded in travel diary surveys, these observations are later generalised to the population at large. The reliability of the model predictions is compromised when uncertainties or errors occur in this process of generalisation, and when these inaccuracies further interact and compound. Moreover, surveys are very expensive to conduct, take a long time to encode and release, and are only performed periodically, which adds the dimension of time to the problems of generalisation.
With the advent of big data, from sources like transit smart cards and cellular phones, comes the potential to revolutionise disaggregate travel demand modelling by overcoming many of these problems. These data sources are congruent with the granularity and mode of analysis of the disaggregate modelling approach, providing direct observation of individual travel behaviour between activity locations on a daily basis.
In the Engaging Big Data research project, we build on the research that started in the first phase of FCL, where we developed a simulation of transit entirely driven by smart card data. In this phase, our aim is to extend our scope to include a wide array of big data sources, including mobile phone data. We fuse these with travel survey data to derive a complete and constantly updated travel demand description based on the actual behaviour of commuters.
We develop novel machine learning methods that can associate activity purpose and sociodemographic information with commuter trajectories. To ensure privacy, we investigate generative techniques that can produce endless streams of commuter trajectories that, while not linked to specific individuals, are realistic and representative of the population, and which will allow the system to be deployed in a ‘machine-eyes-only’ mode. The project also aims to address implementation challenges, including distributed simulation and automatic calibration of simulation parameters.
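As a rough illustration of what such a generative technique might look like, the sketch below follows the general idea named in the abstracts further down this page: a Markov model proposes daily schedules, and rejection sampling keeps them so that the mix of daily tour types matches a target distribution. It is a minimal sketch under stated assumptions, not the project's implementation: the activity states, transition probabilities, tour-type definitions, target shares and all function names are hypothetical.

```python
import random
from collections import Counter

# All values below are illustrative assumptions, not project data.
STATES = ["home", "work", "other"]
# Hypothetical transition probabilities between activity states,
# standing in for aggregates that a data provider might supply.
TRANSITIONS = {
    "home": {"home": 0.6, "work": 0.3, "other": 0.1},
    "work": {"home": 0.2, "work": 0.7, "other": 0.1},
    "other": {"home": 0.5, "work": 0.1, "other": 0.4},
}
# Hypothetical target shares of daily tour types in the population.
TARGET_TOUR_SHARE = {"stay_home": 0.2, "commute": 0.5, "other_only": 0.1, "mixed": 0.2}


def sample_schedule(rng, slots=6):
    """Draw one day (a few coarse time slots) from a first-order Markov chain."""
    state = "home"
    schedule = [state]
    for _ in range(slots - 1):
        probs = TRANSITIONS[state]
        state = rng.choices(STATES, weights=[probs[s] for s in STATES])[0]
        schedule.append(state)
    return schedule


def tour_type(schedule):
    """Classify a schedule by the set of activity states it visits."""
    visited = set(schedule)
    if visited == {"home"}:
        return "stay_home"
    if "work" in visited and "other" in visited:
        return "mixed"
    return "commute" if "work" in visited else "other_only"


def generate_population(rng, n_agents, pilot_size=5000):
    """Rejection sampling: accept Markov proposals so tour types follow the target."""
    # Estimate the proposal distribution q(tour type) from a pilot run.
    pilot = Counter(tour_type(sample_schedule(rng)) for _ in range(pilot_size))
    ratio = {t: TARGET_TOUR_SHARE[t] / (n / pilot_size) for t, n in pilot.items()}
    bound = max(ratio.values())  # M such that p(t) <= M * q(t) for every seen type
    population = []
    while len(population) < n_agents:
        candidate = sample_schedule(rng)
        # Accept with probability p(t) / (M * q(t)); unseen types are rejected.
        if rng.random() < ratio.get(tour_type(candidate), 0.0) / bound:
            population.append(candidate)
    return population


if __name__ == "__main__":
    agents = generate_population(random.Random(0), n_agents=1000)
    print(Counter(tour_type(a) for a in agents))
```

None of the generated schedules corresponds to an observed person; only aggregate quantities enter the sampler, which is the property a privacy-oriented design relies on.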
Publications
- Digital twin travellers: Disaggregated travel demand from aggregated mobile phone data – A privacy by design approach. Cuauhtemoc Anda. Public PhD Presentation. Zurich, Switzerland. 2022. Presentation.
- Synthesising digital twin travellers: Individual travel demand from aggregated mobile phone data. Cuauhtémoc Anda, Sergio Arturo Ordonez Medina and Kay W. Axhausen. Transportation Research Part C: Emerging Technologies, vol. 128, pp. 103118, Amsterdam: Elsevier, 2021.
Mobile phone data generated in mobile communication networks has the potential to improve current travel demand models and, more generally, how we plan for better urban transportation systems. However, due to its high dimensionality, it remains possible to re-identify the users behind the mobile phone traces even after anonymisation. This risk makes its usage outside the telecommunication network incompatible with recent data privacy regulations, hampering its adoption in transportation-related applications. To address this issue, we propose a framework designed to use only user-aggregated mobile phone data to synthesise realistic daily individual mobility: Digital Twin Travellers. We explore different strategies built around modified Markov models and an adaptation of the Rejection Sampling algorithm to recreate realistic daily schedules and locations. We also define a one-day mobility population score to measure the similarity between the population of generated agents and the real mobile phone user population. Ultimately, we show that a series of histograms provided by the telecommunication service provider (TSP) can plausibly be disaggregated into new, useful, synthetic individual-level information, thereby building a big data travel demand framework that is designed in accordance with current data privacy regulations.
- Synthesising digital twin travellers: Individual travel demand from aggregated mobile phone data. Cuauhtémoc Anda, Sergio Arturo Ordonez Medina and Kay W. Axhausen. Arbeitsberichte Verkehrs- und Raumplanung, vol. 1559, Singapore: Future Cities Laboratory (FCL), Singapore ETH Centre, 2020.
Mobile phone data generated in mobile communication networks has the potential to improve current travel demand models and, more generally, how we plan for better urban transportation systems. However, due to its high dimensionality, it remains possible to re-identify the users behind the mobile phone traces even after anonymisation. This risk makes its usage outside the telecommunication network incompatible with recent data privacy regulations, hampering its adoption in transportation-related applications. To address this issue, we propose a framework designed to use only user-aggregated mobile phone data to synthesise realistic daily individual mobility: Digital Twin Travellers. We explore different strategies built around modified Markov models and an adaptation of the Rejection Sampling algorithm to recreate realistic daily schedules and locations. We also define a one-day mobility population score to measure the similarity between the population of generated agents and the real mobile phone user population. Ultimately, we show that a series of histograms provided by the telecommunication service provider (TSP) can plausibly be disaggregated into new, useful, synthetic individual-level information, thereby building a big data travel demand framework that is designed in accordance with current data privacy regulations.
- Privacy-by-design generative models of urban mobility. Cuauhtémoc Anda and Sergio Arturo Ordonez Medina. Arbeitsberichte Verkehrs- und Raumplanung, vol. 1454, Singapore: FCL, Singapore ETH Centre, 2019.
New streams of Location-based Services (LBS) Big Data have raised societal concerns regarding data privacy. Even though these types of data sets are anonymised and aggregated in space and time, the risk of a privacy breach through user re-identification remains. Still, LBS data has the potential to improve current travel demand models and transportation applications. With this in mind, we introduce a Privacy by Design framework that generates realistic disaggregated daily mobility patterns without the need for any personal information or access to individual-level LBS data. In the first step of the framework, we estimate the joint probability distribution of daily mobility patterns using modified Markov models, followed by an adaptation of the rejection sampling algorithm to improve the distribution of the daily tour types. We validate the synthetic mobility patterns against six different distributions and reach an average accuracy of over 95%. With this, we hope to open the discussion in the transportation community regarding data privacy and travel demand models.
- A single trajectory is a tragedy, 1.2 million is Big Data. Pieter Jacobus Fourie, Cuauhtémoc Anda and Sergio A. Ordoñez Medina. ETH Blog - Engaging Mobility, Singapore: Engaging Mobility Group, Future Cities Laboratory (FCL), 2018.
There is still such a thing as bad publicity, as a recent New York Times exposé on app-driven person tracking confirms. Here’s how to stay out of the headlines by rolling your own data. We have developed methods that allow data stewards to stream completely synthetic location trails, which can fulfil the needs of many location-based services and unconditionally guarantee individual privacy.
- Big data, AI, and data privacy for transport planning. Cuauhtémoc Anda. Department of Urban Planning, Tongji University. Shanghai, China. 2018. Presentation.
- A time-space model of disaggregated urban mobility from aggregated mobile phone data. Cuauhtémoc Anda. 15th International Conference on Travel Behavior Research (IATBR 2018), Santa Barbara, CA, USA. Singapore; Zurich: Future Cities Laboratory (FCL); IVT, ETH Zurich, July 15-20, 2018.
- A time-space model of disaggregated urban mobility from aggregated mobile phone data. Cuauhtémoc Anda and Sergio A. Ordoñez Medina. 15th International Conference on Travel Behavior Research (IATBR 2018), Santa Barbara, CA, USA. Zurich: IVT, ETH Zurich, July 15-20, 2018.
- Archetypes of urban travelers: Clustering of mobile phone users in Singapore. Cuauhtémoc Anda. Mobile Tartu 2018, Tartu, Estonia. Singapore: FCL, Singapore ETH Centre, June 27-29, 2018.
- Archetypes of urban travellers: Clustering of mobile phone users in Singapore. Cuauhtémoc Anda and Sergio A. Ordoñez Medina. Mobile Tartu 2018, Tartu, Estonia. Zurich: ETH Zurich, June 27-29, 2018.
An important task in transportation planning is to segment travel demand into groups of homogeneous individuals for better analysis. Traditionally, travel demand is segmented using socio-demographic information. However, this information is often not available in new streams of Big Data. We therefore propose a methodology to uncover the different archetypes of urban travellers based only on the mobility traces left by their mobile phones. We define 5 variables that explain traits of travel behaviour within these digital traces, and then evaluate two clustering algorithms on a dimensionally reduced space. Finally, we present 16 archetypes of urban travellers for the case of Singapore. Results from this study constitute one of the first steps towards the development of new Big Data travel demand models when privacy is a concern.
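The clustering idea in the last abstract can be pictured with a small sketch. The abstract does not name the two clustering algorithms, the dimensionality reduction method or the five variables, so everything here is an assumption: plain k-means (Lloyd's algorithm) is used as a stand-in, the reduction step is left out for brevity, and the two features (trips per day, travel radius in km) are hypothetical.

```python
import math
import random


def standardise(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, rng, iterations=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centroids = rng.sample(points, k)  # random initial centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical users: [trips per day, travel radius in km].
    local = [[rng.gauss(2, 0.3), rng.gauss(3, 0.5)] for _ in range(20)]
    roaming = [[rng.gauss(6, 0.3), rng.gauss(15, 0.5)] for _ in range(20)]
    print(kmeans(standardise(local + roaming), k=2, rng=rng))
```

Standardising first keeps a feature with a large numeric range, such as travel radius, from dominating the distance computation; a real analysis would also need a principled way to choose the number of clusters.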