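[python] Find nearest value in numpy array

Is there a numpy-idiomatic way, such as a built-in function, to find the value in an array that is nearest to a given value?

Example: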

np.find_nearest( array, value )

Answer

import numpy as np
def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]

array = np.random.random(10)
print(array)
# [ 0.21069679  0.61290182  0.63425412  0.84635244  0.91599191  0.00213826
#   0.17104965  0.56874386  0.57319379  0.28719469]

value = 0.5

print(find_nearest(array, value))
# 0.568743859261

Answer

If your array is sorted and very large, this is a much faster solution:

import math
import numpy as np

def find_nearest(array, value):
    # array must be sorted in ascending order
    idx = np.searchsorted(array, value, side="left")
    if idx > 0 and (idx == len(array) or math.fabs(value - array[idx-1]) < math.fabs(value - array[idx])):
        return array[idx-1]
    else:
        return array[idx]

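This scales to very large arrays. If you cannot assume the array is already sorted, you can easily modify the function above to sort it first. It is overkill for small arrays, but once they get large this is much faster.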
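The modification for unsorted input mentioned above could look roughly like the sketch below. The name `find_nearest_unsorted` is made up for illustration, and the function assumes a non-empty 1-D array. Note that it sorts on every call, which costs O(n log n), so it only pays off if the sorted array is computed once and reused across many lookups.

```python
import numpy as np

def find_nearest_unsorted(array, value):
    """Return the element of `array` closest to `value` (input need not be sorted)."""
    sorted_array = np.sort(np.asarray(array))   # O(n log n); cache this if you query repeatedly
    idx = np.searchsorted(sorted_array, value, side="left")
    # Clamp so that both idx-1 and idx are valid neighbours to compare.
    idx = min(max(idx, 1), len(sorted_array) - 1)
    left, right = sorted_array[idx - 1], sorted_array[idx]
    # On a tie the right-hand neighbour wins, as in the function above.
    return left if abs(value - left) < abs(value - right) else right

print(find_nearest_unsorted([0.9, 0.1, 0.57, 0.3], 0.5))
```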

Answer

With a slight modification, the answer above works with arrays of arbitrary dimension (1d, 2d, 3d, ...):

def find_nearest(a, a0):
    "Element in nd array `a` closest to the scalar value `a0`"
    idx = np.abs(a - a0).argmin()
    return a.flat[idx]

Or, written as a single line:

a.flat[np.abs(a - a0).argmin()]
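As a small illustration of the n-dimensional case, the sketch below applies the same expression to a 2-D array. The call to `np.unravel_index` is an addition not present in the answer; it converts the flat index back into a (row, column) position in case the location is needed as well as the value.

```python
import numpy as np

a = np.array([[0.0, 1.0, 2.0],
              [3.0, 4.0, 5.0]])
a0 = 3.8

flat_idx = np.abs(a - a0).argmin()          # index into the flattened array
nearest = a.flat[flat_idx]                  # the closest element itself
pos = np.unravel_index(flat_idx, a.shape)   # same position as an n-d index

print(nearest, pos)
```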

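Answer

Summary of answer: if you have a sorted array, the bisection code (given below) performs the fastest: ~100-1000 times faster for large arrays and ~2-100 times faster for small arrays. It does not require numpy either. If you have an unsorted array, then for a large array you should consider first using an O(n log n) sort and then bisection, and for a small array method 2 seems the fastest.

First you should clarify what you mean by nearest value. Often one wants the interval on an abscissa, e.g. array = [0, 0.7, 2.1], value = 1.95, where the answer would be idx = 1. This is the case that I suspect you need (otherwise the following can be modified very easily with a follow-up conditional statement once you have found the interval). The optimal way to do this is with bisection, which I will provide first. Note that it does not require numpy at all and is faster than using numpy functions, because those perform redundant operations. Then I will provide a timing comparison against the other methods presented here by other users.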

Bisection:

def bisection(array, value):
    '''Given an ``array`` , and given a ``value`` , returns an index j such that ``value`` is between array[j]
    and array[j+1]. ``array`` must be monotonic increasing. j=-1 or j=len(array) is returned
    to indicate that ``value`` is out of range below and above respectively.'''
    n = len(array)
    if value < array[0]:
        return -1
    elif value > array[n-1]:
        return n
    jl = 0      # Initialize lower
    ju = n-1    # and upper limits.
    while ju - jl > 1:          # If we are not yet done,
        jm = (ju + jl) >> 1     # compute a midpoint with a bitshift
        if value >= array[jm]:
            jl = jm             # and replace either the lower limit
        else:
            ju = jm             # or the upper limit, as appropriate.
        # Repeat until the test condition is satisfied.
    if value == array[0]:       # edge cases at bottom
        return 0
    elif value == array[n-1]:   # and top
        return n-1
    else:
        return jl
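For comparison, the same interval lookup can be sketched with the standard-library `bisect` module, together with the follow-up conditional that turns the interval index into a nearest-element index. The names `interval_index` and `nearest_index` are made up for this sketch. Unlike `bisection` above, `interval_index` returns `len(array) - 1` rather than `len(array)` when the value lies above the range.

```python
from bisect import bisect_right

def interval_index(array, value):
    """Index j with array[j] <= value < array[j+1] for a sorted sequence.

    Returns -1 if value is below array[0] and len(array)-1 if it is at or above array[-1].
    """
    return bisect_right(array, value) - 1

def nearest_index(array, value):
    """Index of the element of a sorted, non-empty sequence closest to value."""
    j = interval_index(array, value)
    if j < 0:                      # below the range: first element is nearest
        return 0
    if j >= len(array) - 1:        # at or above the last element
        return len(array) - 1
    # Follow-up conditional: pick whichever end of the interval is closer.
    return j if value - array[j] <= array[j + 1] - value else j + 1

array = [0, 0.7, 2.1]
print(interval_index(array, 1.95), nearest_index(array, 1.95))
```

With the example from the text (array = [0, 0.7, 2.1], value = 1.95) the interval index is 1, while the nearest element is at index 2.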

Now I will define the code from the other answers; each of them returns an index:

import math
import numpy as np

def find_nearest1(array,value):
    idx,val = min(enumerate(array), key=lambda x: abs(x[1]-value))
    return idx

def find_nearest2(array, values):
    indices = np.abs(np.subtract.outer(array, values)).argmin(0)
    return indices

def find_nearest3(array, values):
    values = np.atleast_1d(values)
    indices = np.abs(np.int64(np.subtract.outer(array, values))).argmin(0)
    out = array[indices]
    return indices

def find_nearest4(array,value):
    idx = (np.abs(array-value)).argmin()
    return idx


def find_nearest5(array, value):
    idx_sorted = np.argsort(array)
    sorted_array = np.array(array[idx_sorted])
    idx = np.searchsorted(sorted_array, value, side="left")
    if idx >= len(array):
        idx_nearest = idx_sorted[len(array)-1]
    elif idx == 0:
        idx_nearest = idx_sorted[0]
    else:
        if abs(value - sorted_array[idx-1]) < abs(value - sorted_array[idx]):
            idx_nearest = idx_sorted[idx-1]
        else:
            idx_nearest = idx_sorted[idx]
    return idx_nearest

def find_nearest6(array,value):
    xi = np.argmin(np.abs(np.ceil(array[None].T - value)),axis=0)
    return xi

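Now I will time the codes.
Note that methods 1, 2, 4 and 5 do not give the interval correctly. Methods 1, 2 and 4 round to the nearest point in the array (e.g. >= 1.5 -> 2), and method 5 always rounds up (e.g. 1.45 -> 2). Only methods 3 and 6, and of course bisection, give the interval properly.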

array = np.arange(100000)
val = array[50000]+0.55
print( bisection(array,val))
%timeit bisection(array,val)
print( find_nearest1(array,val))
%timeit find_nearest1(array,val)
print( find_nearest2(array,val))
%timeit find_nearest2(array,val)
print( find_nearest3(array,val))
%timeit find_nearest3(array,val)
print( find_nearest4(array,val))
%timeit find_nearest4(array,val)
print( find_nearest5(array,val))
%timeit find_nearest5(array,val)
print( find_nearest6(array,val))
%timeit find_nearest6(array,val)

(50000, 50000)
100000 loops, best of 3: 4.4 µs per loop
50001
1 loop, best of 3: 180 ms per loop
50001
1000 loops, best of 3: 267 µs per loop
[50000]
1000 loops, best of 3: 390 µs per loop
50001
1000 loops, best of 3: 259 µs per loop
50001
1000 loops, best of 3: 1.21 ms per loop
[50000]
1000 loops, best of 3: 746 µs per loop

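For the large array, bisection takes about 4 µs, compared with about 260 µs for the next-best method and 1.21 ms for the slowest numpy-based method, which is roughly two orders of magnitude faster; the pure-Python method 1 is far slower still at 180 ms. For smaller arrays bisection is ~2-100 times faster.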
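The `%timeit` lines above are IPython magics. Outside IPython, a comparable measurement can be sketched with the standard-library `timeit` module, as below. The two helper functions are illustrative stand-ins (a `np.searchsorted`-based interval lookup and the `argmin` approach), not the exact functions timed above, and the numbers will differ by machine, so no output is shown.

```python
import timeit
import numpy as np

array = np.arange(100000)
val = array[50000] + 0.55

def interval_searchsorted(array, value):
    # Index j with array[j] <= value < array[j+1]; array must be sorted.
    return int(np.searchsorted(array, value, side="right")) - 1

def nearest_argmin(array, value):
    # Index of the nearest element; works on unsorted input, but scans everything.
    return int(np.abs(array - value).argmin())

repeats = 200
for fn in (interval_searchsorted, nearest_argmin):
    total = timeit.timeit(lambda: fn(array, val), number=repeats)
    print(f"{fn.__name__}: {total / repeats * 1e6:.1f} us per call")
```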

Answer

Here is an extension that finds the nearest vector in an array of vectors.

import numpy as np

def find_nearest_vector(array, value):
  # Euclidean distance from each row of `array` to `value`
  idx = np.linalg.norm(np.asarray(array) - np.asarray(value), axis=1).argmin()
  return array[idx]

A = np.random.random((10,2))*100
""" A = array([[ 34.19762933,  43.14534123],
   [ 48.79558706,  47.79243283],
   [ 38.42774411,  84.87155478],
   [ 63.64371943,  50.7722317 ],
   [ 73.56362857,  27.87895698],
   [ 96.67790593,  77.76150486],
   [ 68.86202147,  21.38735169],
   [  5.21796467,  59.17051276],
   [ 82.92389467,  99.90387851],
   [  6.76626539,  30.50661753]])"""
pt = [6, 30]
print(find_nearest_vector(A, pt))
# array([  6.76626539,  30.50661753])

Answer

If you don't want to use numpy, this will do it:

def find_nearest(array, value):
    n = [abs(i-value) for i in array]
    idx = n.index(min(n))
    return array[idx]

Answer

Here is a version that handles a non-scalar "values" array:

import numpy as np

def find_nearest(array, values):
    indices = np.abs(np.subtract.outer(array, values)).argmin(0)
    return array[indices]

Or a version that returns a numeric type (e.g. int, float) if the input is a scalar:

def find_nearest(array, values):
    values = np.atleast_1d(values)
    indices = np.abs(np.subtract.outer(array, values)).argmin(0)
    out = array[indices]
    return out if len(out) > 1 else out[0]
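One thing to keep in mind with `np.subtract.outer` is that it builds a full len(array) x len(values) matrix, which can get large. If the array is sorted, a lookup based on `np.searchsorted` avoids that. The following is only a sketch under that assumption; `find_nearest_sorted` is a made-up name, and the input is assumed to be a non-empty, ascending 1-D array.

```python
import numpy as np

def find_nearest_sorted(array, values):
    """Nearest element of a sorted 1-D `array` for each entry of `values`."""
    array = np.asarray(array)
    values = np.atleast_1d(values)
    idx = np.searchsorted(array, values, side="left")
    idx = np.clip(idx, 1, len(array) - 1)        # keep idx-1 and idx in range
    left, right = array[idx - 1], array[idx]
    # Step back one position wherever the left neighbour is strictly closer.
    idx = np.where(values - left < right - values, idx - 1, idx)
    return array[idx]

print(find_nearest_sorted(np.array([0, 0.7, 2.1]), [1.95, 0.2, -3, 10]))
```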