Rights statement: © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
}
TY - GEN
T1 - GeoMatch
T2 - Efficient Large-Scale Map Matching on Apache Spark
AU - Zeidan, Ayman
AU - Lagerspetz, Eemil
AU - Zhao, Kai
AU - Nurmi, Petteri Tapio
AU - Tarkoma, Sasu
AU - Vo, Huy T.
PY - 2019/1/24
Y1 - 2019/1/24
N2 - We contribute GeoMatch, a novel, scalable, and efficient big-data pipeline for large-scale map matching on Apache Spark. GeoMatch improves on existing spatial big data solutions by utilizing a novel spatial partitioning scheme inspired by Hilbert space-filling curves. Thanks to the partitioning scheme, GeoMatch can effectively balance operations across different processing units and achieve significant performance gains. We demonstrate the effectiveness of GeoMatch through rigorous and extensive benchmarks on large-scale urban spatial data sets ranging from 166,253 to 3.78 billion location measurements. Our results show over 17-fold performance improvements compared to previous works while achieving better processing accuracy than current solutions (97.48%).
AB - We contribute GeoMatch, a novel, scalable, and efficient big-data pipeline for large-scale map matching on Apache Spark. GeoMatch improves on existing spatial big data solutions by utilizing a novel spatial partitioning scheme inspired by Hilbert space-filling curves. Thanks to the partitioning scheme, GeoMatch can effectively balance operations across different processing units and achieve significant performance gains. We demonstrate the effectiveness of GeoMatch through rigorous and extensive benchmarks on large-scale urban spatial data sets ranging from 166,253 to 3.78 billion location measurements. Our results show over 17-fold performance improvements compared to previous works while achieving better processing accuracy than current solutions (97.48%).
U2 - 10.1109/BigData.2018.8622488
DO - 10.1109/BigData.2018.8622488
M3 - Conference contribution/Paper
SP - 384
EP - 391
BT - 2018 IEEE International Conference on Big Data (Big Data)
PB - IEEE
ER -