Get Closest Pair of Points (2D)

Get Closest Pair of Points (2D)

The closest pair of points problem is defined as follows: Given n points on the two-dimensional plane, find the pair of points such that the distance between them is smaller than any other pair of points.

The following divide and conquer algorithm finds the closest pair in O(n log n) time.

Key Implementation

Step 0: sort points by x

public Point[] getClosestPair(Point[] ps){
	//Step 0: sort by x
	sortPointsByX(ps);
	//then
	return getClosestPair(ps,0,ps.length-1);
}

Using Quick sort to get ps sorted by x (horizon value) in O(n log n).

private void sortPointsByX(Point[] ps) {
	int size = ps.length;
	quickSort(ps,0,size-1);
}
private void quickSort(Point[] ps, int left, int right) {
	if (left >= right) return;
	//partition
	int i = left, j = right;
	Point tmp;
	double pivot = (ps[left].x + ps[right].x) / 2;
	while (i <= j) {
		//comparing the value of the specific column
		while (ps[i].x < pivot)i++;
		//comparing the value of the specific column
		while (ps[j].x > pivot)j--;
		if (i <= j) {
			tmp = ps[i];
			ps[i] = ps[j];
			ps[j] = tmp;
			i++;
			j--;
		}
	}
	if (left < i - 1) quickSort(ps, left, i - 1);
	if (i < right) quickSort(ps, i, right);
}

Step 1: Split the set of points into two equal-sized subsets by the median of the x-dimensions of the points.

public Point[] getClosestPair(Point[] ps, int start, int end){
	//...
	//Step 1, find medium X
	int mediumX = (start+end)/2;
    //...(step2 & step3)
	return returnPair;
}

Step 2: Solve the problem recursively. Find the closest pair of each subset.

	Point[] leftClosestPair = this.getClosestPair(ps, start, mediumX);
	Point[] rightCloestPair = this.getClosestPair(ps, mediumX, end);
	double leftDelta = this.getDistance(leftClosestPair);
	double rightDelta = this.getDistance(rightCloestPair);

Step 3. Find the minimal distance among the pair of points in which one lies on one side and the other lies on the other side.

Calculate the edge of 2-delta area.

	int leftEdge = this.getLeftEdgeOfDelta(ps, mediumX, leftDelta);
	int rightEdge = this.getRightEdgeOfDelta(ps, mediumX, rightDelta);

Sort the points in [leftEdge, rightEdge] by Y. Time cost: n logn. Array idxToOrderY[idx]=order means: the idx-th point in [leftEdge, rightEdge] is the order-th point in a sorted list of those points. Array orderYToIdx[order]=idx means, the order-th point of a sorted list has an index of idx in all all points(ps). By doing this, we can find all the possible pairs in 2-delta area without increasing the time complexity.

	int yOrderInfo[][] = this.sortPointsByY(ps,leftEdge,rightEdge);
	int idxToOrderY[] = yOrderInfo[0];
	int orderYToIdx[] = yOrderInfo[1];

For each point in [leftEdge, mediumX], check if there exists a point in [mediumX, rightEdge] which has a distance less than delta to that point.
The out-loop continues no more than n/2 times. There cannot be more than 9 points in area of 2-delta * 2-delta. (Actually, it is no need to test the distance if both of the two points come from the same side.) Otherwise there would be a pair of points that would have distance less than delta. Thus, the two loops inside continue no more than 9 times.

	for(int i=leftEdge;i<=mediumX;i++){
		int orderY = idxToOrderY[i-leftEdge];
		//the following two loops will continue for only 12 times
		//search up
		for(int j=orderY-1;j>=0;j--){
			Point[] pair = new Point[]{ps[i],ps[orderYToIdx[j]]};
			//break when Y distance is bigger than delta
			if(this.getYDistance(pair) > delta) break;
			//check if this distance is smaller
			double distance = this.getDistance(pair);
			if(distance < returnDelta){
				returnDelta = distance;
				returnPair = pair;
			}
		}
		//search down
		for(int j=orderY+1;j<idxToOrderY.length;j++){
			//...			
		}
	}

Testing##

To test the accuracy and efficiency of this algorithm, a brute-force algorithm has been used as comparison.

public Point[] getClosestPairInBruteForce(Point[] ps){
	if(ps.length==2)
		return ps;
	Point[] pair = new Point[2];
	double min = Double.MAX_VALUE;
	for(int i=0;i<ps.length;i++)
		for(int j=i+1;j<ps.length;j++){
			double dis = this.getDistance(ps[i],ps[j]);
			if(dis<min){
				min = dis;
				pair[0] = ps[i];
				pair[1] = ps[j];
			}
		}
	return pair;
}

In the testing, we need to estimate the time cost of both algorithms for a specific size of input. Input size ranges from 100 to 15000. To get average running time, each case will be execute 5 times.

//case number
int total = 5;
//how many points in each case
int pointSize[] = new int[{100,200,500,800,1000,
    1200,1500,2000,3000,3500,4500,
    6000,7500,8500,10000,11000,
    12000,13000,14000,15000};

The input data are generated randomly.

public Point[] generateRandomPoints(int size,
        double maxX, double minX,
        double maxY,double minY){
	Point[] points = new Point[size];
	for(int i=0;i<size;i++){
		points[i] = new Point(
            Math.random()*(maxX-minX)+minX,
            Math.random()*(maxY-minY)+minY);
	}
	return points;
}

Running results of the test cases
Size 100 200 500 800 1000 1200 1500 2000 3000 3500
DC(ms) 0 0 1 1 1 1 2 3 5 7
Brute(ms) 0 1 10 27 46 71 111 186 415 556
Size 4500 6000 7500 8500 10000 11000 12000 13000 14000 15000
DC(ms)10 16 23 30 45 62 77 90 107 125
Brute(ms) 943 1645 2539 3231 4478 5575 6875 7961 9135 10432

The following graph shows the time cost of Divide and Conquer algorithm(O(n log n)).


The efficiency of divide and conquer algorithm can be easily observed when compared with brute-force algorithm(***O(n^2)***).

Java File###

Download: FindClosestPairOfPoints.java


Reference###

Amrinder Arora. Analysis And Design of Algorithms