Question
I'm trying to compare an image to a list of other images and return the images from that list (like Google image search) that are at least 70% similar.
I got this code from this post and adapted it to my context:
import cv2
import numpy as np

# MEDIA_ROOT is assumed to be defined elsewhere (e.g. in the project settings)
# Load the image
img = cv2.imread(MEDIA_ROOT + "/uploads/imagerecognize/armchair.jpg")
# Convert it to grayscale
imgg = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# SURF extraction
surf = cv2.FeatureDetector_create("SURF")
surfDescriptorExtractor = cv2.DescriptorExtractor_create("SURF")
kp = surf.detect(imgg)
kp, descriptors = surfDescriptorExtractor.compute(imgg, kp)
# Setting up samples and responses for kNN
samples = np.array(descriptors)
responses = np.arange(len(kp), dtype=np.float32)
# kNN training
knn = cv2.KNearest()
knn.train(samples, responses)
modelImages = [MEDIA_ROOT + "/uploads/imagerecognize/1.jpg", MEDIA_ROOT + "/uploads/imagerecognize/2.jpg", MEDIA_ROOT + "/uploads/imagerecognize/3.jpg"]
for modelImage in modelImages:
    # Now loading a template image and searching for similar keypoints
    template = cv2.imread(modelImage)
    templateg = cv2.cvtColor(template, cv2.COLOR_BGR2GRAY)
    keys = surf.detect(templateg)
    keys, desc = surfDescriptorExtractor.compute(templateg, keys)
    for h, des in enumerate(desc):
        des = np.array(des, np.float32).reshape((1, 128))
        retval, results, neigh_resp, dists = knn.find_nearest(des, 1)
        res, dist = int(results[0][0]), dists[0][0]
        if dist < 0.1:  # draw matched keypoints in red color
            color = (0, 0, 255)
        else:  # draw unmatched in blue color
            #print dist
            color = (255, 0, 0)
        # Draw matched key points on original image
        x, y = kp[res].pt
        center = (int(x), int(y))
        cv2.circle(img, center, 2, color, -1)
        # Draw matched key points on template image
        x, y = keys[h].pt
        center = (int(x), int(y))
        cv2.circle(template, center, 2, color, -1)
    # Show the query image and the current template, then wait for a key press
    cv2.imshow('img', img)
    cv2.imshow('tm', template)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
My question is: how can I compare the image with the list of images and get only the similar ones? Is there a method for doing this?
Recommended answer
I suggest you take a look at the earth mover's distance (EMD) between the images. This metric gives a sense of how hard it is to transform one normalized grayscale image into another, and it can be generalized to color images. A very good analysis of this method can be found in the following paper:
robotics.stanford.edu/~rubner/papers/rubnerIjcv00.pdf
It can be computed either on the whole image or on the histogram (which is much faster than the whole-image method). I'm not sure which function allows a full image comparison, but for histogram comparison you can use the cv.CalcEMD2 function.
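As a minimal sketch of the histogram variant, assuming 8-bit grayscale images held as NumPy arrays: for one-dimensional normalized histograms with unit bin spacing, the EMD reduces to the sum of absolute differences between the two cumulative histograms, so it can be computed without a solver. The helper names `gray_histogram` and `histogram_emd` and the two synthetic images are made up for illustration; this is not the cv.CalcEMD2 API.

```python
import numpy as np

def gray_histogram(image, bins=256):
    """Return the normalized intensity histogram of an 8-bit grayscale image."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / hist.sum()

def histogram_emd(hist_a, hist_b):
    """EMD between two normalized 1-D histograms with unit bin spacing.

    In one dimension the EMD equals the sum of absolute differences
    between the two cumulative distributions.
    """
    return float(np.abs(np.cumsum(hist_a) - np.cumsum(hist_b)).sum())

# Hypothetical stand-ins for real grayscale images (e.g. loaded with cv2.imread)
dark = np.zeros((4, 4), dtype=np.uint8)        # every pixel at intensity 0
shifted = np.full((4, 4), 10, dtype=np.uint8)  # every pixel at intensity 10

distance = histogram_emd(gray_histogram(dark), gray_histogram(shifted))
print(distance)  # 10.0: all of the mass moves by 10 bins
```

Because only two cumulative sums are needed, this is cheap enough to run against every image in a list.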
The only problem is that this method does not define a percentage of similarity, but a distance that you can filter on.
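If a percentage is still wanted, one possible convention (an assumption, not part of the EMD itself) is to divide the distance by the largest distance that can occur and subtract the result from 1. For a 256-bin histogram with unit bin spacing, that maximum is 255. The function names and the distance values below are hypothetical.

```python
def distance_to_similarity(distance, max_distance):
    """Map a distance in [0, max_distance] to a similarity in [0, 1]."""
    return 1.0 - min(distance, max_distance) / max_distance

def select_similar(distances, max_distance, min_similarity=0.7):
    """Return the names whose similarity reaches the threshold.

    distances: dict mapping an image name to its distance from the query.
    """
    return [name for name, d in distances.items()
            if distance_to_similarity(d, max_distance) >= min_similarity]

# Hypothetical EMD values between a query image and three list images
distances = {"1.jpg": 12.0, "2.jpg": 140.0, "3.jpg": 60.0}
print(select_similar(distances, max_distance=255.0))  # ['1.jpg', '3.jpg']
```

Filtering directly on the raw distance with a tuned threshold works just as well; the rescaling only makes the threshold easier to read.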
I know that this is not a complete working algorithm, but it is a basis for one, so I hope it helps.
Here is a rough sketch of how the EMD works in principle. The main idea is to take two normalized matrices (two grayscale images, each divided by its sum) and define a flux matrix that describes how to move the gray from each pixel of the first image to each pixel of the second in order to obtain the second image (it can also be defined for non-normalized images, but that is more difficult).
In mathematical terms the flow matrix is actually a four-dimensional tensor that gives the flow from point (i,j) of the old image to point (k,l) of the new one, but if you flatten your images you can turn it into an ordinary matrix, just one that is a little harder to read.
This flow matrix has three constraints: each term must be non-negative, the sum of each row must equal the value of the starting pixel, and the sum of each column must equal the value of the destination pixel.
Given this, you have to minimize the cost of the transformation, which is the sum, over all pairs (i,j) and (k,l), of the flow from (i,j) to (k,l) multiplied by the distance between (i,j) and (k,l).
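Written compactly for the flattened images, the description above becomes a linear program. The symbols are notation chosen here rather than taken from the answer: x_p and y_q are the normalized pixel values of the first and second image, d_{pq} is the Euclidean distance between the positions of pixels p and q, and f_{pq} is the flow from p to q.

```latex
\begin{aligned}
\mathrm{EMD}(x, y) = \min_{f}\ & \sum_{p}\sum_{q} f_{pq}\, d_{pq} \\
\text{subject to}\quad & f_{pq} \ge 0 \quad \text{for all } p, q, \\
& \sum_{q} f_{pq} = x_p \quad \text{for all } p, \\
& \sum_{p} f_{pq} = y_q \quad \text{for all } q.
\end{aligned}
```

Since both images sum to one, the two families of equality constraints together already force the total flow to be one.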
It looks a little complicated in words, so here is the test code. The logic is correct; I'm not sure why the SciPy solver complains about it (you might want to look at OpenOpt or something similar):
import numpy as np
from scipy.optimize import minimize

# original data: two 2x2 images, each normalized to sum to 1
x = np.random.rand(2, 2)
x /= x.sum()
y = np.random.rand(2, 2)
y /= y.sum()
# initial guess of the flux matrix:
# just the product of the image x as row for the image y as column.
# This is a working flux, but is not an optimal one
F = np.outer(x, y).flatten()
# distance matrix, based on Euclidean distance between pixel positions
# (indexing='ij' keeps the pixel order consistent with flatten())
row_x, col_x = np.meshgrid(range(x.shape[0]), range(x.shape[1]), indexing='ij')
row_y, col_y = np.meshgrid(range(y.shape[0]), range(y.shape[1]), indexing='ij')
rows = (row_x.reshape((-1, 1)) - row_y.reshape((1, -1)))**2
cols = (col_x.reshape((-1, 1)) - col_y.reshape((1, -1)))**2
D = np.sqrt(rows + cols)
D = D.flatten()
x = x.flatten()
y = y.flatten()
# COST = sum(F*D)
# cost function and its (constant) gradient
fun = lambda F: np.sum(F*D)
jac = lambda F: D
# array of constraints
# the constraint of sum one is implicit given the later constraints
cons = []
# each row and column should sum to the value of the start and destination array
# (i=i binds the current index; without it every lambda would use the last i)
cons += [{'type': 'eq', 'fun': lambda F, i=i: np.sum(F.reshape((x.size, y.size))[i, :]) - x[i]} for i in range(x.size)]
cons += [{'type': 'eq', 'fun': lambda F, i=i: np.sum(F.reshape((x.size, y.size))[:, i]) - y[i]} for i in range(y.size)]
# the values of F should be non-negative: one (min, max) pair per entry
bnds = [(0, None)]*F.size
res = minimize(fun=fun, x0=F, method='SLSQP', jac=jac, bounds=bnds, constraints=cons)
The variable res contains the result of the minimization, but as I said, I'm not sure why it complains about a singular matrix.
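One plausible source of the singular-matrix complaint is that the equality constraints are linearly dependent: the row sums and the column sums both add up to the same total, so one of them is redundant. As an alternative sketch (not the author's method), the same transport problem can be handed to a linear-programming solver. The version below assumes SciPy 1.6 or later for `method="highs"`, drops the one redundant constraint, and uses a made-up helper name `emd_linprog`.

```python
import numpy as np
from scipy.optimize import linprog

def emd_linprog(x, y):
    """Solve the transport problem between two 2-D grayscale images.

    Returns the minimal cost and the flow matrix, whose rows index the
    pixels of x and whose columns index the pixels of y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_n = x / x.sum()
    y_n = y / y.sum()
    n, m = x_n.size, y_n.size
    # (row, column) position of every pixel, in flatten() order
    pos_x = np.indices(x.shape).reshape(2, -1).T
    pos_y = np.indices(y.shape).reshape(2, -1).T
    D = np.sqrt(((pos_x[:, None, :] - pos_y[None, :, :]) ** 2).sum(axis=2))
    # Equality constraints on the flattened (n, m) flow matrix
    row_sums = np.kron(np.eye(n), np.ones((1, m)))  # sum of row p == x_p
    col_sums = np.kron(np.ones((1, n)), np.eye(m))  # sum of column q == y_q
    # Drop the last column constraint: it follows from all the others
    A_eq = np.vstack([row_sums, col_sums[:-1]])
    b_eq = np.concatenate([x_n.ravel(), y_n.ravel()[:-1]])
    res = linprog(D.ravel(), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun, res.x.reshape(n, m)

# All of the mass sits in opposite corners of a 2x2 image
cost, flow = emd_linprog([[1, 0], [0, 0]], [[0, 0], [0, 1]])
print(cost)  # about 1.414, the diagonal distance sqrt(2)
```

The number of flow variables grows with the product of the two pixel counts, so this is only practical for small or heavily downsampled images.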
The only problem with this algorithm is that it is not very fast, so it isn't practical to run it on demand; instead, you have to run it patiently when the dataset is created and store the results somewhere.
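A rough sketch of that precompute-and-store idea, under several assumptions: the distance function, the file name, the use of pickle, and the tiny 4-bin histograms are all placeholders for illustration. Any slower EMD implementation could be passed in as `distance`.

```python
import itertools
import os
import pickle
import tempfile

import numpy as np

def build_distance_table(signatures, distance):
    """Compute the distance for every unordered pair of dataset images.

    signatures: dict mapping an image name to its precomputed signature.
    distance:   callable taking two signatures and returning a float.
    """
    table = {}
    for a, b in itertools.combinations(sorted(signatures), 2):
        table[(a, b)] = distance(signatures[a], signatures[b])
    return table

def lookup(table, a, b):
    """Read a stored distance regardless of the order of the two names."""
    if a == b:
        return 0.0
    return table[tuple(sorted((a, b)))]

# Placeholder distance: 1-D histogram EMD via cumulative sums
emd_1d = lambda h1, h2: float(np.abs(np.cumsum(h1) - np.cumsum(h2)).sum())

# Hypothetical 4-bin histograms standing in for real image signatures
signatures = {
    "1.jpg": np.array([1.0, 0.0, 0.0, 0.0]),
    "2.jpg": np.array([0.0, 1.0, 0.0, 0.0]),
    "3.jpg": np.array([0.0, 0.0, 0.0, 1.0]),
}

# Run once when the dataset is created, then store the result
table = build_distance_table(signatures, emd_1d)
path = os.path.join(tempfile.mkdtemp(), "emd_table.pkl")
with open(path, "wb") as f:
    pickle.dump(table, f)

# Later, at query time, load the stored table instead of recomputing
with open(path, "rb") as f:
    stored = pickle.load(f)
print(lookup(stored, "2.jpg", "1.jpg"))  # 1.0
```

This only covers comparisons between images already in the dataset; a brand-new query image still has to be compared against each stored signature.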