A distributionally robust optimization approach to reconstructing missing locations and paths using high-frequency trajectory data

Daily high-frequency trajectory data (e.g., 0.1-s connected vehicle data) provide a promising foundation for improving the observability of travel demand dynamics. However, raw trajectories are not always accurate or complete because of technical and privacy issues. This paper proposes a data-driven optimization modeling framework to reconstruct the location-duration-path choices for missing observations in incomplete trajectories. By processing raw trajectories collected over many days, we observe a set of historical location-duration-path choices and identify missing observations in the space and time dimensions. To improve computational efficiency, we apply data-driven network-time prisms that reduce the search space for the missing choices. We then formulate Distributionally Robust Optimization (DRO) models with likelihood bounds, a special case of data-driven optimization models using φ-divergences (i.e., the χ² distance), to reconstruct the missing choices. To solve the minimax programs of the DRO models while maintaining tractability, we reformulate them as equivalent dual problems using strong duality theory and solve the dual problems instead. To demonstrate and validate the proposed models, we use a real-world connected vehicle dataset containing around 2,800 connected vehicles over two separate months in Southeast Michigan from the Safety Pilot Model Deployment (SPMD) project and a transportation network from OpenStreetMap.
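
As a rough illustration of the kind of minimax problem a χ²-based DRO model involves, the sketch below computes the worst-case expected cost over all distributions within a χ² ball around an empirical distribution. It is a generic textbook construction, not the formulation used in the paper: the function name `chi2_worst_case`, the cost vector, the radius `rho`, and the sample frequencies are all hypothetical, and the divergence is assumed to take the form sum((p - q)^2 / q). Under that assumption, and as long as the radius is small enough that no probability turns negative, the inner maximization has the closed form mean + sqrt(rho) * standard deviation.

```python
import numpy as np

def chi2_worst_case(q, cost, rho):
    """Worst-case expected cost over {p : sum((p - q)^2 / q) <= rho, sum(p) = 1}.

    q    : empirical probabilities of the candidate choices (all strictly positive)
    cost : cost assigned to each candidate choice
    rho  : radius of the chi-square ball (hypothetical tuning parameter)
    Returns the worst-case expected cost and the distribution attaining it.
    """
    q = np.asarray(q, dtype=float)
    cost = np.asarray(cost, dtype=float)
    mean = float(q @ cost)
    std = float(np.sqrt(q @ (cost - mean) ** 2))
    if std == 0.0:
        # All costs are equal, so every distribution gives the same expectation.
        return mean, q.copy()
    # Tilt the empirical distribution toward the high-cost choices.
    p = q * (1.0 + np.sqrt(rho) * (cost - mean) / std)
    if np.any(p < 0):
        # The closed form ignores p >= 0; it is only valid while that bound is inactive.
        raise ValueError("rho too large for the closed form; use a numerical solver")
    return mean + np.sqrt(rho) * std, p

# Hypothetical example: three candidate choices with historical frequencies q.
q = np.array([0.5, 0.3, 0.2])
cost = np.array([1.0, 2.0, 4.0])
rho = 0.01
value, p = chi2_worst_case(q, cost, rho)
print("nominal expected cost:", q @ cost)
print("worst-case expected cost:", value)
print("worst-case distribution:", p)
```

For larger radii the nonnegativity constraints become active and the closed form no longer applies; in that regime a dual reformulation solved numerically, in the spirit of what the abstract describes, would be the usual route.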

Publisher's Statement

© 2019 Elsevier Ltd. All rights reserved. Publisher's version of record: https://doi.org/10.1016/j.trc.2019.03.012

Publication Title

Transportation Research Part C: Emerging Technologies