Problem Description
I was doing a fun project: solving a Sudoku from an input image using OpenCV (as in Google Goggles etc.). I completed the task, but at the end I found a little problem, which is why I came here.
I did the programming using the Python API of OpenCV 2.3.1.
Below is what I did:
- Read the image
- Find the contours
- Select the one with the maximum area (and also roughly square)
- Find the corner points
For example, given below:
(Note that here the green line coincides correctly with the true boundary of the Sudoku, so the Sudoku can be correctly warped. Check the next image)
Warp the image to a perfect square
For example, the image:
Perform OCR (for which I used the method I have given in Simple Digit Recognition OCR in OpenCV-Python)
And the method worked well.
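The corner-to-square warp in the steps above can be sketched without OpenCV as a plain linear solve; this is what `cv2.getPerspectiveTransform` computes under the hood. The corner coordinates below are made up for the example:

```python
import numpy as np

def homography(src_pts, dst_pts):
    # Solve the 8-unknown linear system for the projective transform
    # mapping the 4 src corners onto the 4 dst corners.
    A, b = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

# Hypothetical detected corners of the sudoku quad, in (x, y) order,
# mapped onto a 450x450 target square.
corners = [(10, 20), (200, 30), (190, 210), (15, 200)]
square = [(0, 0), (450, 0), (450, 450), (0, 450)]
H = homography(corners, square)

# Applying H to each detected corner should land exactly on the square.
mapped = []
for (x, y) in corners:
    p = H @ np.array([x, y, 1.0])
    mapped.append((p[0] / p[2], p[1] / p[2]))
```

In practice you would pass the 3x3 matrix `H` to `cv2.warpPerspective`; the point of the sketch is only the shape of the mapping.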
Problem:
Check out this image.
Performing step 4 on this image gives the result below:
The red line drawn is the original contour, which is the true outline of the sudoku boundary.
The green line drawn is the approximated contour, which will be the outline of the warped image.
Of course, there is a difference between the green line and the red line at the top edge of the sudoku. So while warping, I am not getting the original boundary of the Sudoku.
My question:
How can I warp the image on the correct boundary of the Sudoku, i.e. the red line, OR how can I remove the difference between the red line and the green line? Is there any method for this in OpenCV?
Answer
I have a solution that works, but you'll have to translate it to OpenCV yourself. It's written in Mathematica.
The first step is to adjust the brightness in the image, by dividing each pixel by the result of a closing operation:
src = ColorConvert[Import["http://davemark.com/images/sudoku.jpg"], "Grayscale"];
white = Closing[src, DiskMatrix[5]];
srcAdjusted = Image[ImageData[src]/ImageData[white]]
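A Python-side sketch of this normalization, using SciPy's grayscale closing (in OpenCV this would be `cv2.morphologyEx` with `MORPH_CLOSE`); the 11x11 square window standing in for `DiskMatrix[5]` is an assumption:

```python
import numpy as np
from scipy import ndimage

def normalize_brightness(gray):
    # Divide each pixel by a grayscale closing to flatten uneven
    # illumination; `gray` is a float image in [0, 1]. The 11x11
    # structuring element approximates a radius-5 disk.
    closed = ndimage.grey_closing(gray, size=(11, 11))
    return gray / np.maximum(closed, 1e-6)  # avoid division by zero

# Sanity check: an evenly lit (flat) image normalizes to 1.0 everywhere.
flat = np.full((20, 20), 0.5)
out = normalize_brightness(flat)
```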
The next step is to find the sudoku area, so I can ignore (mask out) the background. For that, I use connected component analysis, and select the component with the largest convex area:
components =
  ComponentMeasurements[ColorNegate@Binarize[srcAdjusted],
    {"ConvexArea", "Mask"}][[All, 2]];
largestComponent = Image[SortBy[components, First][[-1, 2]]]
By filling this image, I get a mask for the sudoku grid:
mask = FillingTransform[largestComponent]
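The component selection plus hole filling could be sketched in Python as follows; ranking components by plain pixel area instead of `ConvexArea` is a simplification:

```python
import numpy as np
from scipy import ndimage

def sudoku_mask(binary):
    # Label connected components, keep the largest one by pixel count
    # (the original ranks by convex area), then fill its holes to get
    # a solid mask of the grid region.
    labels, n = ndimage.label(binary)
    if n == 0:
        return np.zeros_like(binary, dtype=bool)
    sizes = ndimage.sum(binary, labels, index=range(1, n + 1))
    largest = labels == (np.argmax(sizes) + 1)
    return ndimage.binary_fill_holes(largest)

# A hollow square (the grid outline) plus a small noise blob: the mask
# keeps the square, drops the blob, and fills the square's interior.
img = np.zeros((12, 12), dtype=bool)
img[2:10, 2] = img[2:10, 9] = img[2, 2:10] = img[9, 2:10] = True
img[0, 0] = True  # noise component
mask = sudoku_mask(img)
```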
Now, I can use a 2nd order derivative filter to find the vertical and horizontal lines in two separate images:
lY = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {2, 0}], {0.02, 0.05}], mask];
lX = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {0, 2}], {0.02, 0.05}], mask];
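A possible Python counterpart using second-order Gaussian derivative filters (sigma 3 as above; the synthetic test image is made up): the second derivative across the row direction responds to horizontal lines, and across the column direction to vertical lines.

```python
import numpy as np
from scipy import ndimage

def line_responses(gray, sigma=3.0):
    # order=(2, 0): 2nd derivative along rows -> fires on horizontal lines
    # order=(0, 2): 2nd derivative along cols -> fires on vertical lines
    d2y = ndimage.gaussian_filter(gray, sigma, order=(2, 0))
    d2x = ndimage.gaussian_filter(gray, sigma, order=(0, 2))
    return d2y, d2x

# One dark horizontal line on a white background: the (2, 0) response
# peaks on the line, while the (0, 2) response stays near zero there.
img = np.ones((31, 31))
img[15, :] = 0.0
d2y, d2x = line_responses(img)
```

Thresholding these responses (as `MorphologicalBinarize` does above) and multiplying by the grid mask then isolates the line pixels.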
I use connected component analysis again to extract the grid lines from these images. The grid lines are much longer than the digits, so I can use caliper length to select only the grid-line components. Sorting them by position, I get 2x10 mask images, one for each of the vertical/horizontal grid lines in the image:
verticalGridLineMasks =
  SortBy[ComponentMeasurements[lX, {"CaliperLength", "Centroid", "Mask"},
      # > 100 &][[All, 2]], #[[2, 1]] &][[All, 3]];
horizontalGridLineMasks =
  SortBy[ComponentMeasurements[lY, {"CaliperLength", "Centroid", "Mask"},
      # > 100 &][[All, 2]], #[[2, 2]] &][[All, 3]];
Next I take each pair of vertical/horizontal grid lines, dilate them, calculate the pixel-by-pixel intersection, and calculate the center of the result. These points are the grid line intersections:
centerOfGravity[l_] := ComponentMeasurements[Image[l], "Centroid"][[1, 2]]
gridCenters =
  Table[centerOfGravity[
    ImageData[Dilation[Image[h], DiskMatrix[2]]]*
    ImageData[Dilation[Image[v], DiskMatrix[2]]]],
   {h, horizontalGridLineMasks}, {v, verticalGridLineMasks}];
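The dilate-intersect-centroid step translates quite directly; a sketch (the dilation radius 2 mirrors `DiskMatrix[2]`, though SciPy's default cross-shaped structuring element is an approximation of a disk):

```python
import numpy as np
from scipy import ndimage

def intersection_center(h_mask, v_mask, radius=2):
    # Dilate one horizontal and one vertical line mask, intersect them
    # pixel-wise, and take the centroid of the overlap: that is the
    # grid crossing point, as (row, col).
    h = ndimage.binary_dilation(h_mask, iterations=radius)
    v = ndimage.binary_dilation(v_mask, iterations=radius)
    return ndimage.center_of_mass(h & v)

# One horizontal and one vertical line crossing at row 10, column 7.
h = np.zeros((21, 21), dtype=bool); h[10, :] = True
v = np.zeros((21, 21), dtype=bool); v[:, 7] = True
cy, cx = intersection_center(h, v)
```

Looping this over all 10x10 line pairs yields the `gridCenters` table of the Mathematica version.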
The last step is to define two interpolation functions for X/Y mapping through these points, and transform the image using these functions:
fnX = ListInterpolation[gridCenters[[All, All, 1]]];
fnY = ListInterpolation[gridCenters[[All, All, 2]]];
transformed =
  ImageTransformation[srcAdjusted, {fnX @@ Reverse[#], fnY @@ Reverse[#]} &,
    {9*50, 9*50}, PlotRange -> {{1, 10}, {1, 10}}, DataRange -> Full]
All of the operations are basic image processing functions, so this should be possible in OpenCV, too. The spline-based image transformation might be harder, but I don't think you really need it: using the perspective transformation you use now on each individual cell will probably give good enough results.