Geometric Median

Last modified: 2021-05-31 22:54:37             Author: Mango

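The ordinary (one-dimensional) median is the point that minimizes the sum of distances to all the data values. A similar concept applies in two-dimensional space.
Given N points in 2-D space, the task is to find a single point (x, y) such that the sum of the distances from it to the input points is minimized (this point is also known as the center of minimum distance).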
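To make the one-dimensional case concrete, here is a minimal sketch that compares the sum of absolute distances at the median and at the mean. The helper names absDistSum and median1D are illustrative assumptions and do not appear in the implementation later in this article.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Sum of absolute distances from candidate c to every value (1-D case).
double absDistSum(const std::vector<double>& values, double c)
{
    double sum = 0.0;
    for (double v : values)
        sum += std::fabs(v - c);
    return sum;
}

// Median of a non-empty list; for even sizes this sketch simply
// returns the lower of the two middle elements.
double median1D(std::vector<double> values)
{
    assert(!values.empty());
    std::sort(values.begin(), values.end());
    return values[(values.size() - 1) / 2];
}
```

For the values {1, 2, 9}, the median is 2 with a distance sum of 1 + 0 + 7 = 8, while the mean 4 gives 3 + 2 + 5 = 10.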

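Examples:

Input: (1, 1), (3, 3)
Output: Geometric Median = (2, 2) with minimum distance = 2.82843

Input: (0, 0), (0, 0), (0, 12)
Output: Geometric Median = (0, 0) with minimum distance = 12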

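Approach:
At first glance, the problem seems to ask for the midpoint or geometric center (in other words, the centroid) of the given input points. Since the centroid is the "center" of the input, it seems it should automatically minimize the sum of distances from itself to all the input points. Computing it is similar to finding the center of gravity of N discrete particles of equal mass. The first example test case even gives the correct answer this way. But what happens when we apply the same logic to the second example?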

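We can clearly see that the geometric center, or centroid, of (0, 0), (0, 0), (0, 12) is (0, 4). By the Euclidean distance formula, the total distance from the centroid to the three input points is 4 + 4 + 8 = 16. The optimal point, however, is (0, 0), which gives a total distance of 0 + 0 + 12 = 12. So where did we go wrong?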
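The arithmetic above can be checked with a short sketch. It uses std::vector and std::hypot rather than the raw arrays of the implementation later in this article, and the centroid helper is an illustrative assumption.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point {
    double x, y;
};

// Sum of Euclidean distances from p to every point in pts.
double distSum(const Point& p, const std::vector<Point>& pts)
{
    double sum = 0.0;
    for (const Point& q : pts)
        sum += std::hypot(q.x - p.x, q.y - p.y);
    return sum;
}

// Arithmetic mean of the coordinates (the centroid).
Point centroid(const std::vector<Point>& pts)
{
    assert(!pts.empty());
    Point c = { 0.0, 0.0 };
    for (const Point& q : pts) {
        c.x += q.x;
        c.y += q.y;
    }
    c.x /= pts.size();
    c.y /= pts.size();
    return c;
}
```

With the points (0, 0), (0, 0), (0, 12), centroid returns (0, 4) and distSum evaluates to 16 there, whereas distSum at (0, 0) is 12.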

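Intuitively, the centroid gives us the arithmetic mean of the input points. What we need, however, is a measure of central tendency for which the cost of reaching it (in other words, the total Euclidean distance) is minimized. This is called the geometric median of a set of points, and it differs from the centroid in much the same way that the median of a list of numbers differs from its mean.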

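In general, no exact closed-form formula is known for the geometric median. Problems of this kind are solved by approximation: we repeatedly refine a candidate solution and check whether it brings us closer to the true geometric median.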

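Algorithm

There are two important variables:

  • current_point – stores the x and y coordinates of the point that is currently the best candidate for the geometric median.
  • minimum_distance – stores the sum of the Euclidean distances from current_point to all the input points.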

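After every approximation step, if a new point with a lower distance sum is found, both current_point and minimum_distance are updated to the new point and the new distance.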

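First, we find the centroid of the given points, take it as current_point (the candidate median) and store its distance sum in minimum_distance. Then we iterate over the input points, assume each one in turn to be the median, and compute the sum of its distances to the other points. If this sum is less than minimum_distance, we update current_point and minimum_distance to the new values. Otherwise, the old values are kept.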
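The initialization step described above can be sketched as a standalone function. This is a vector-based illustration under the assumption that the input is non-empty; the name initialCandidate is hypothetical and is not used in the implementation later in this article.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point {
    double x, y;
};

// Sum of Euclidean distances from p to every point in pts.
double distSum(const Point& p, const std::vector<Point>& pts)
{
    double sum = 0.0;
    for (const Point& q : pts)
        sum += std::hypot(q.x - p.x, q.y - p.y);
    return sum;
}

// Hypothetical helper: start from the centroid, then switch to an
// input point if one of them has a strictly smaller distance sum.
Point initialCandidate(const std::vector<Point>& pts)
{
    assert(!pts.empty());
    Point best = { 0.0, 0.0 };
    for (const Point& q : pts) {
        best.x += q.x;
        best.y += q.y;
    }
    best.x /= pts.size();
    best.y /= pts.size();
    double bestDist = distSum(best, pts);

    for (const Point& q : pts) {
        double d = distSum(q, pts);
        if (d < bestDist) {
            bestDist = d;
            best = q;
        }
    }
    return best;
}
```

For (0, 0), (0, 0), (0, 12) this returns the input point (0, 0), because its distance sum of 12 beats the centroid's 16. For (0, 0), (4, 0), (2, 3) the centroid (2, 1) is kept, since no input point does better.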

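Then we enter a while loop. Inside the loop, we move a distance of test_distance (assumed to be 1000 for this example) from current_point in each of 4 directions (left, up, right, down), which gives us 4 new points. We then compute the sum of distances from each of these new points to the given input points. If this sum is less than the previous minimum_distance, we update current_point and minimum_distance to the new values and repeat the while loop. Otherwise, we divide test_distance by 2 and repeat the while loop.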

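The while loop terminates based on a threshold called lower_limit. The smaller this value, the higher the precision of the approximation. The loop ends once test_distance is no longer greater than lower_limit.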
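The refinement loop can also be sketched in isolation, as a function that returns the refined point instead of printing it. The name refine and its default arguments are assumptions made for this illustration; the defaults mirror the test_distance of 1000 and lower_limit of 0.01 used in this article.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point {
    double x, y;
};

// Sum of Euclidean distances from p to every point in pts.
double distSum(const Point& p, const std::vector<Point>& pts)
{
    double sum = 0.0;
    for (const Point& q : pts)
        sum += std::hypot(q.x - p.x, q.y - p.y);
    return sum;
}

// Hypothetical helper: probe the 4 axis neighbours at distance
// testDistance, move to the first one that improves the distance
// sum, and halve testDistance whenever none of them does.
Point refine(Point current, const std::vector<Point>& pts,
             double testDistance = 1000.0, double lowerLimit = 0.01)
{
    assert(lowerLimit > 0.0);
    const Point directions[4] = { { -1.0, 0.0 }, { 0.0, 1.0 },
                                  { 1.0, 0.0 }, { 0.0, -1.0 } };
    double best = distSum(current, pts);

    while (testDistance > lowerLimit) {
        bool improved = false;
        for (const Point& d : directions) {
            Point candidate = { current.x + testDistance * d.x,
                                current.y + testDistance * d.y };
            double dist = distSum(candidate, pts);
            if (dist < best) {
                best = dist;
                current = candidate;
                improved = true;
                break;
            }
        }
        if (!improved)
            testDistance /= 2.0;
    }
    return current;
}
```

Starting from the centroid (0, 4) of (0, 0), (0, 0), (0, 12), the search moves down the y-axis and stops close to (0, 0), with a distance sum slightly above the optimum of 12. The gap shrinks as lowerLimit is reduced.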

Below is the implementation of the above approach:

// C++ implementation of the approach
#include <cmath>
#include <iostream>
using namespace std;
  
// To store a point in 2-D space
struct Point {
    double x, y;
};
  
// Test points. These points are the left,
// up, right and down relative neighbours
// (arranged circularly) to the
// current_point at a distance of
// test_distance from current_point
Point test_point[] = { { -1.0, 0.0 },
                       { 0.0, 1.0 },
                       { 1.0, 0.0 },
                       { 0.0, -1.0 } };
  
// Lowest Limit till which we are going
// to run the main while loop
// Lower the Limit higher the accuracy
double lower_limit = 0.01;
  
// Function to return the sum of Euclidean
// Distances
double distSum(Point p,
                        Point arr[], int n)
{
    double sum = 0;
    for (int i = 0; i < n; i++) {
        double distx = abs(arr[i].x - p.x);
        double disty = abs(arr[i].y - p.y);
        sum += sqrt((distx * distx) + (disty * disty));
    }
  
    // Return the sum of Euclidean Distances
    return sum;
}
  
// Function to calculate the required
// geometric median
void geometricMedian(Point arr[], int n)
{
  
    // Current x coordinate and y coordinate,
    // initialised to zero before accumulating
    Point current_point = { 0.0, 0.0 };
  
    for (int i = 0; i < n; i++) {
        current_point.x += arr[i].x;
        current_point.y += arr[i].y;
    }
  
    // Here current_point becomes the
    // Geographic MidPoint
    // Or Center of Gravity of equal
    // discrete mass distributions
    current_point.x /= n;
    current_point.y /= n;
  
    // minimum_distance becomes sum of
    // all distances from MidPoint to
    // all given points
    double minimum_distance = 
       distSum(current_point, arr, n);
  
    // Try every input point as a candidate
    // median and keep the best one found
    for (int i = 0; i < n; i++) {
        double newd = distSum(arr[i], arr, n);
        if (newd < minimum_distance) {
            minimum_distance = newd;
            current_point = arr[i];
        }
    }
  
    // Assume test_distance to be 1000
    double test_distance = 1000;
    int flag = 0;
  
    // Test loop for approximation starts here
    while (test_distance > lower_limit) {
  
        flag = 0;
  
        // Loop for iterating over all 4 neighbours
        for (int i = 0; i < 4; i++) {
  
            // Finding Neighbours done
            Point newpoint;
            newpoint.x = current_point.x
                 + (double)test_distance * test_point[i].x;
            newpoint.y = current_point.y
                 + (double)test_distance * test_point[i].y;
  
            // New sum of Euclidean distances
            // from the neighbor to the given
            // data points
            double newd = distSum(newpoint, arr, n);
  
            if (newd < minimum_distance) {
  
                // Approximating and changing
                // current_point
                minimum_distance = newd;
                current_point.x = newpoint.x;
                current_point.y = newpoint.y;
                flag = 1;
                break;
            }
        }
  
        // This means none of the 4 neighbours
        // has the new minimum distance, hence
        // we divide by 2 and reiterate while
        // loop for better approximation
        if (flag == 0)
            test_distance /= 2;
    }
  
    cout << "Geometric Median = ("
         << current_point.x << ", "
         << current_point.y << ")";
    cout << " with minimum distance = "
         << minimum_distance;
}
  
// Driver code
int main()
{
  
    // const so that arr is a standard
    // fixed-size array
    const int n = 2;
    Point arr[n];
    arr[0].x = 1;
    arr[0].y = 1;
    arr[1].x = 3;
    arr[1].y = 3;
    geometricMedian(arr, n);
  
    return 0;
}
Output:
Geometric Median = (2, 2) with minimum distance = 2.82843

References: Geometric median, Center of minimum distance
