TY - GEN
T1 - The single pixel GPS
T2 - 20th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2012
AU - Feldman, Dan
AU - Sung, Cynthia
AU - Rus, Daniela
PY - 2012
Y1 - 2012
N2 - We present algorithms for simplifying and clustering patterns from sensors such as GPS, LiDAR, and other devices that can produce high-dimensional signals. The algorithms are suitable for handling very large (e.g. terabytes) streaming data and can be run in parallel on networks or clouds. Applications include compression, denoising, activity recognition, road matching, and map generation. We encode these problems as (k, m)-segment mean problems. Formally, we provide (1 + ε)-approximations to the k-segment and (k, m)-segment mean of a d-dimensional discrete-time signal. The k-segment mean is a k-piecewise linear function that minimizes the regression distance to the signal. The (k,m)-segment mean has an additional constraint that the projection of the k segments on Rd consists of only m ≤ k segments. Existing algorithms for these problems take O(kn2) and nO(mk) time respectively and O(kn2) space, where n is the length of the signal. Our main tool is a new coreset for discrete-time signals. The coreset is a smart compression of the input signal that allows computation of a (1 + ε)-approximation to the k-segment or (k,m)-segment mean in O(n log n) time for arbitrary constants ε,k, and m. We use coresets to obtain a parallel algorithm that scans the signal in one pass, using space and update time per point that is polynomial in log n. We provide empirical evaluations of the quality of our coreset and experimental results that show how our coreset boosts both inefficient optimal algorithms and existing heuristics. We demonstrate our results for extracting signals from GPS traces. However, the results are more general and applicable to other types of sensors.
AB - We present algorithms for simplifying and clustering patterns from sensors such as GPS, LiDAR, and other devices that can produce high-dimensional signals. The algorithms are suitable for handling very large (e.g. terabytes) streaming data and can be run in parallel on networks or clouds. Applications include compression, denoising, activity recognition, road matching, and map generation. We encode these problems as (k, m)-segment mean problems. Formally, we provide (1 + ε)-approximations to the k-segment and (k, m)-segment mean of a d-dimensional discrete-time signal. The k-segment mean is a k-piecewise linear function that minimizes the regression distance to the signal. The (k,m)-segment mean has an additional constraint that the projection of the k segments on Rd consists of only m ≤ k segments. Existing algorithms for these problems take O(kn2) and nO(mk) time respectively and O(kn2) space, where n is the length of the signal. Our main tool is a new coreset for discrete-time signals. The coreset is a smart compression of the input signal that allows computation of a (1 + ε)-approximation to the k-segment or (k,m)-segment mean in O(n log n) time for arbitrary constants ε,k, and m. We use coresets to obtain a parallel algorithm that scans the signal in one pass, using space and update time per point that is polynomial in log n. We provide empirical evaluations of the quality of our coreset and experimental results that show how our coreset boosts both inefficient optimal algorithms and existing heuristics. We demonstrate our results for extracting signals from GPS traces. However, the results are more general and applicable to other types of sensors.
KW - Douglas Peucker
KW - big data
KW - coresets
KW - line simplification
KW - streaming
KW - trajectories clustering
UR - http://www.scopus.com/inward/record.url?scp=84872763132&partnerID=8YFLogxK
U2 - https://doi.org/10.1145/2424321.2424325
DO - https://doi.org/10.1145/2424321.2424325
M3 - Conference contribution
SN - 9781450316910
T3 - GIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems
SP - 23
EP - 32
BT - 20th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2012
Y2 - 6 November 2012 through 9 November 2012
ER -