Wednesday, September 26, 2007

Paper #9 - Streaming Algorithms for Line Simplification

Paper:
Streaming Algorithms for Line Simplification
(Mohammad Ali Abam et al)

Summary:
Line simplification is a technique used to approximate (i.e., simplify) a line of possibly infinite streaming points in order to accommodate limited storage. While its use in the paper primarily focuses on Geographical Information Systems (GIS), it also has practical applications in sketch recognition. Even though line simplification is used to overcome limited storage, current computation is quite costly for known algorithms. The goal of the paper is to present an algorithm which can reduce these costs into something more manageable.

Two distance metrics and a ratio are used to aid in computation and to determine efficiency, respectively. The first distance metric is Hausdorff distance, which both measure the distance of two sets of points. Hausdorff distance takes the largest distance from a set of smallest distances of each point in a set to the other, while Frechet distance can be defined as measuring the distance between two paths when both are traversed forward at different time intervals. A competitive ratio, which is a ratio of a simple even-number interval line simplification algorithm to the optimal algorithm in question, determines how efficient the optimal algorithm is.

The Abam linear simplification method claims to have a good competitive ratio under the two conditions of being monotonic and having an error-predicting oracle. The algorithm itself consists of four steps, given that streamed points are already simplified. The first step is to set the next streamed point as the next line-simplified point. The second step is to compute the error, with the oracle, of the previous point based on the line segment of the current point and a point that’s two points away, and to also add the previous point and its priority (its error if removed from the line simplification) into a priority queue. The third step is to extract the point in the priority queue with the minimum priority. The last step is to update the priority of the previous and succeeding point of the extracted point.

The key component in the algorithm is this use of the error oracle. There are three types of paths which the points stream in that determine how the oracle responds. For convex paths, which are paths that forms a convex polygon when the last point connects to the first point, and xy-monotone paths, which are paths that are intersected by horizontal or vertical lines at most one point, the Hausdorff distance is used. For arbitrary paths, the Frechet distance is used. With the given results, O(1) competitive ratio was achieved for the Hausdorff and Frechet cases with O(k) and O(k*k) additional storage, respectively.

Discussion:
Line simplification on potentially infinite streaming points with traditional techniques is a very costly operation, and the reduction of cost with the Abam method seems impressive. As Dr. Hammond elaborated, the Abam method has potential in use for certain sketches such as those which the user scribbles in a fixed region many more points than can be handled by general sketch recognition methods.

One issue which was brought up by fellow namesake Paul (Corey), which I didn’t realize was a problem at the time until he brought it up during my presentation of the paper, was the removal of a point with lowest priority each time a new point is streamed into the simplified line. I was unable to hear his entire argument about it, but I would intuitively think that the computation used to determine which points to remove would lose relevance if storage savings is no better than an algorithm which simply removes even-numbered points.

No comments: