opencv学习笔记

1 图片读取与通道

图片读取后,默认是个numpy的3维数组(row行数是height, col高度是图片宽度,3是BGR通道)

import cv2
img1 = cv2.imread('./dog_backpack.png')
img1.shape
(1401, 934, 3)

注意上面通道顺序是BGR哦,反人类吧,需要做转换,才能正常显示图片,如下:

img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)

import matplotlib.pyplot as plt
plt.imshow(img)

调整图片尺寸(硬调)

img1 = cv2.resize(img1, (1200, 1200))

根据百分比调整的api有点反人类,建议用ratio计算后调用上面的

2 如何给窗口设置回调

# Create a named window for connections
cv2.namedWindow('Test')

# Bind draw_rectangle function to mouse cliks
cv2.setMouseCallback('Test', draw_circle) 

def draw_circle(event,x,y,flags,param):

    global center,clicked

    # get mouse click on down and track center
    if event == cv2.EVENT_LBUTTONDOWN:
        center = (x, y)
        clicked = False
    
    # Use boolean variable to track if the mouse has been released
    if event == cv2.EVENT_LBUTTONUP:
        clicked = True

3 图形绘制

 

4 图片混合(Blending)和叠加(Paste)

只有相同尺寸的图才能Blending,效果如下图:

blended = cv2.addWeighted(src1=img1, alpha=0.7, src2=img2, beta=0.3, gamma=0)

不相同尺寸图片,可以先构造与大图片相同的空图片,把小图放上去,两张图做blend

import cv2
import numpy as np
import matplotlib.pyplot as plt

img1 = cv2.imread('dog_backpack.png')
img2 = cv2.imread('watermark_no_copy.png')

img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2RGB)
img2 = cv2.cvtColor(img2, cv2.COLOR_BGR2RGB)
img2 = cv2.resize(img2, (640, 640))

# make a new img of big size
img_new = np.ones(img1.shape, dtype=np.uint8) * 255
# set small to new big size
img_new[0:img2.shape[0], 0:img2.shape[1],:] = img2

# blended same size
blended = cv2.addWeighted(src1=img1,alpha=0.7,src2=img_new,beta=0.3,gamma=0)
plt.imshow(blended)

效果:

5 阈值(Threshold)处理

可以在灰度图的基础上,做阈值(Threshold)处理,可以得到对比度不同的图,这个在很多场景很有用:

img = cv2.imread('./rainbow.jpg', 0)
th, th_img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
plt.imshow(th_img, cmap='gray')

效果如下图:

参数还可以用THRESH_BINARY_INV(黑白反)、THRESH_TRUNC(大于th截断,小于的不变)、THRESH_TOZERO(大于th不变,小于设成0)、THRESH_TOZERO_INV(大于设成0,小于不变)

此外,上述与参数还可以 逻辑或 cv2.THRESH_OTSU、cv2.THRESH_TRIANGLE两个自动选择阀值的参数(此时需保证参数的阀值为0)

效果可以参考:

6 模糊(Blur)、平滑(Smooth)

主要用于去除图片噪声,保留主要细节。

没太get到这俩的区别,都有很多实现方法

不同kernel在线效果网站:Image Kernels

原始图

使用2D核过滤,可以发现边缘模糊了,空心字变实了

kernel = np.ones(shape=(5,5), dtype=np.float32) / 25
img_filter = cv2.filter2D(img, -1, kernel)

使用cv内置模糊blur:

img_filter = cv2.blur(img, ksize=(5,5))

高斯模糊:

img_filter = cv2.GaussianBlur(img, (5,5), 10)

中值模糊:

img_filter = cv2.medianBlur(img, 7)

双边滤波(Bilateral Filtering):

img_filter = cv2.bilateralFilter(img,9,75,75) 

7 使用均值模糊(Median Blur)去噪声

均值模糊的主要用途,其实是去除噪声:

先给图加点噪点:

修复后:

img_fix = cv2.medianBlur(img, 5)

 

 

8 伽玛校正

原始图

注意:所有变换前,都需要把0~255弄到0~1之间,方便变换,变换完成后,其实是需要转会0~255的,但是plt的imshow能兼容识别0~1。

使用伽玛校正(Gamma Correction) 提升亮度:

import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('./bricks.jpg').astype(np.float32) / 255
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

gamma = 1/4
img_gamma = np.power(img, gamma)

plt.imshow(img_gamma)

gamma参数小于1变亮

gamma参数大于1变暗

9 Erosion(腐蚀)

腐蚀(erosion)用于提取图像的边缘

原图:

腐蚀操作,直接这里kernel和iteration都会影响'腐蚀'的效果:

kernel = np.ones((5,5), np.uint8)
erosion_img = cv2.erode(img, kernel, iterations=4)

plt.imshow(erosion_img, cmap='gray')

10 形态学(Morphological Operators)操作之开运算(Open)去除随机噪声

先加一些随机噪声:

white_noise = np.random.randint(low=0, high=2, size=(600,600))
white_noise = white_noise * 255

noise_img = white_noise + img

 

使用形态学(Morphological)开运算(Open)处理,去处图像中孤立的随机噪声,其原理是先腐蚀再膨胀:

kernel = np.ones((5,5),np.uint8)
open_img = cv2.morphologyEx(noise_img, cv2.MORPH_OPEN, kernel)
plt.imshow(open_img, cmap='gray')

 


11 形态学(Morphological Operators)操作之闭运算(Close)去除前景噪声

现在增加难度,只给前景(ABCDE)增加随机黑色噪声

black_noise = np.random.randint(low=0,high=2,size=(600,600))
black_noise= black_noise * -255
noise_img = black_noise + img
noise_img[noise_img==-255] = 0

此时再用OPEN方法,就会发现,整个都被摸掉了,需要用Close方法,原理是先扩张再腐蚀:

kernel = np.ones((5,5),np.uint8)
open_img = cv2.morphologyEx(noise_img, cv2.MORPH_CLOSE, kernel)
plt.imshow(open_img, cmap='gray')

12 形态学(Morphological Operators)操作之梯度(Gradient)

形态学梯度(Gradient)是膨胀与腐蚀(Erosion)的差值,用于刻画目标边界 或 边缘位于图形灰度级别剧烈变化的区域,突出高亮区域的外围,如下(图片使用无噪声的初始版本):

kernel = np.ones((5,5),np.uint8)
g_img = cv2.morphologyEx(img, cv2.MORPH_GRADIENT, kernel)
plt.imshow(g_img, cmap='gray')

 

13 索贝尔算子(Sobel)、Laplacian算子提取边缘

原图:

索贝尔算子,提取边缘,可以用第3、4个参数控制横向和纵向

img_sobelx = cv2.Sobel(img, cv2.CV_64F,1,0,ksize=5)
img_sobely = cv2.Sobel(img, cv2.CV_64F,0,1,ksize=5)
img_sobel = cv2.Sobel(img, cv2.CV_64F,1,1,ksize=5)

以下3个图分别是,横向、纵向、横纵的:

可以发现最后的这个,反而不如x和y的,可以对这俩做叠加(Blending):

img_sobelx = cv2.Sobel(img, cv2.CV_64F,1,0,ksize=5)
img_sobely = cv2.Sobel(img, cv2.CV_64F,0,1,ksize=5)
img_blended = cv2.addWeighted(src1=img_sobelx,alpha=0.5,src2=img_sobely,beta=0.5,gamma=0)

也可以用Laplacian算子:

img_lap = cv2.Laplacian(img,cv2.CV_64F)

14 计算图片的直方图

单通道R:

hist_values = cv2.calcHist([bricks_img], channels=[0], mask=None, histSize=[256], ranges=[0,256])

绘制3个通道的,注意我这里转成rgb了:

bricks_img = cv2.imread('./bricks.jpg')
bricks_img = cv2.cvtColor(bricks_img, cv2.COLOR_BGR2RGB)


img = bricks_img

color = ('r', 'g', 'b')

for i, col in enumerate(color):
  histr = cv2.calcHist([img], [i], None, [256], [0,256])
  plt.plot(histr,color = col)
  plt.xlim([0,256])

plt.show()

15 遮罩

img = rainbow_img

mask = np.zeros(img.shape[:2], np.uint8)
mask.shape
mask[300:400, 100:400] = 255

masked_img = cv2.bitwise_and(img, img, mask = mask)
plt.imshow(masked_img)

16 利用hsv通道的均值化(equalize)提高对比度

hsv的v通道一般是亮度,我们单独对它做均值话,可以提高图片整体亮度

原图

import cv2
import matplotlib.pyplot as plt

img_gorilla = cv2.imread('./gorilla.jpg')
img_gorilla_hsv = cv2.cvtColor(img_gorilla, cv2.COLOR_RGB2HSV)

img_gorilla_hsv[:,:,2] = cv2.equalizeHist(img_gorilla_hsv[:,:,2])
img_gorilla_new = cv2.cvtColor(img_gorilla_hsv, cv2.COLOR_HSV2RGB)
plt.imshow(img_gorilla_new)

 

效果

17 打开摄像头并绘制桢

import cv2

cap = cv2.VideoCapture(0)


# 获取设备长宽
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

# 计算坐标并保持是int
x = width//2
y = height//2

# Width and height
w = width//4
h = height//4

while True:
    # 绘制帧
    ret, frame = cap.read()

    # Draw a rectangle on stream
    
    cv2.rectangle(frame, (x, y), (x+w, y+h), color=(0,0,255),thickness= 4)

    # Display the resulting frame
    cv2.imshow('frame', frame)

   # This command let's us quit with the "q" button on a keyboard.
    # Simply pressing X on the window won't work!
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# When everything is done, release the capture
cap.release()
cv2.destroyAllWindows()

18 打开视频并绘制桢

注意这里速度要按照fps做一下等待,否则速度过快

import cv2
import time
# Same command function as streaming, its just now we pass in the file path, nice!
cap = cv2.VideoCapture('../DATA/video_capture.mp4')

# FRAMES PER SECOND FOR VIDEO
fps = 25

# Always a good idea to check if the video was acutally there
# If you get an error at thsi step, triple check your file path!!
if cap.isOpened()== False: 
    print("Error opening the video file. Please double check your file path for typos. Or move the movie file to the same location as this script/notebook")
    

# While the video is opened
while cap.isOpened():
    
    
    
    # Read the video file.
    ret, frame = cap.read()
    
    # If we got frames, show them.
    if ret == True:
        
        

        # 解决速度过快问题
        time.sleep(1/fps)
        cv2.imshow('frame',frame)
 
        # Press q to quit
        if cv2.waitKey(25) & 0xFF == ord('q'):
            
            break
 
    # Or automatically break this whole loop if the video is over.
    else:
        break
        
cap.release()
# Closes all the frames
cv2.destroyAllWindows()

18 目标检测之模板匹配(Template Matching)

用于目标是大图的一部分的情况,最简单的case。

import cv2
import matplotlib.pyplot as plt

full = cv2.imread('./sammy.jpg')
full = cv2.cvtColor(full, cv2.COLOR_BGR2RGB)

face = cv2.imread('./sammy_face.jpg')
face = cv2.cvtColor(face, cv2.COLOR_BGR2RGB)
height, width, channels = face.shape

methods = ['cv2.TM_CCOEFF', 'cv2.TM_CCOEFF_NORMED', 'cv2.TM_CCORR','cv2.TM_CCORR_NORMED', 'cv2.TM_SQDIFF', 'cv2.TM_SQDIFF_NORMED']

for m in methods:
  full_copy = full.copy()

  method = eval(m)
  res = cv2.matchTemplate(full_copy, face, method)

  min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)

  if m in [cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED]:
      top_left = min_loc    
  else:
      top_left = max_loc

  bottom_right = (top_left[0] + width, top_left[1] + height)
  cv2.rectangle(full_copy,top_left, bottom_right, 255, 10)

  # plt
  plt.subplot(121)
  plt.imshow(res)
  plt.title('Result of Template Matching')
  
  plt.subplot(122)
  plt.imshow(full_copy)
  plt.title('Detected Point')
  plt.suptitle(m)

  plt.show()

不同方法的效果如下:

 

 

19 角点检测(Corner Detection)

Harris算法

import cv2
import matplotlib.pyplot as plt

chess = cv2.imread('flat_chessboard.png')
chess_gray = cv2.cvtColor(chess, cv2.COLOR_BGR2GRAY)

dst = cv2.cornerHarris(src=chess_gray, blockSize=2, ksize=3, k=0.04)
dst = cv2.dilate(dst, None)

chess[dst>0.01*dst.max()] = [0, 255, 0]

plt.imshow(chess, cmap='gray')

Shi-Tomasi (Good Features to Track Paper)
import cv2
import matplotlib.pyplot as plt
import numpy as np

chess = cv2.imread('real_chessboard.jpg')
chess_gray = cv2.cvtColor(chess, cv2.COLOR_BGR2GRAY)

# 64 is how many best corner try to find
corners = cv2.goodFeaturesToTrack(chess_gray,64,0.01,10)
corners = np.int0(corners) # convert np arr to int

for i in corners:
    x,y = i.ravel()
    cv2.circle(chess,(x,y),3,255,-1)

plt.imshow(chess, cmap='gray')

20 边缘检测(Edge Detection)

Canny算法
import cv2
import matplotlib.pyplot as plt

img = cv2.imread('./sammy_face.jpg')

# best threshold for canny
median_val = np.median(img)
lower = int(max(0, 0.7 * median_val))
upper = int(max(255, 1.3 * median_val))

# blur & canny
blur_img = cv2.blur(img, ksize=(4,4))
edges = cv2.Canny(image=blur_img, threshold1=lower , threshold2=upper)

plt.imshow(edges)

其他算法如前面提过的Sobel、Laplacian也能用。会粗一些。

21 网格检测(Grid Detection)

Grid detection(网格检测)是指在图像中检测出网格的位置和大小。这个技术通常用于计算机视觉和机器人视觉中,例如在相机标定和姿态估计中。在相机标定中,我们需要知道相机的内部参数和外部参数,以便将图像中的像素坐标转换为真实世界中的坐标。为了计算这些参数,我们需要在图像中检测出一个已知大小的网格,并使用它来计算相机的畸变和投影矩阵。一部分Grid Detection是用前面讲过的角点检查算法实现的。

import cv2
import matplotlib.pyplot as plt

flat_chess = cv2.imread('./flat_chessboard.png')
found, corners = cv2.findChessboardCorners(flat_chess,(7,7))

if found:
  flat_chess_copy = flat_chess.copy()
  cv2.drawChessboardCorners(flat_chess_copy, (7, 7), corners, found)
  plt.imshow(flat_chess_copy)
else:
    print("OpenCV did not find corners. Double check your patternSize.")

圆形网格检测:

import cv2
import matplotlib.pyplot as plt

dots = cv2.imread('./dot_grid.png')
found, corners = cv2.findCirclesGrid(dots, (10,10), cv2.CALIB_CB_SYMMETRIC_GRID)

dbg_image_circles = dots.copy()
cv2.drawChessboardCorners(dbg_image_circles, (10, 10), corners, found)

plt.imshow(dbg_image_circles)

22 轮廓检测(Contour Detection)

Contour Detection(轮廓检测)是一种在计算机视觉中用于检测图像中物体边界的技术。轮廓是连接物体边界上所有连续点的曲线,这些点具有相同的颜色或强度。轮廓检测在许多计算机视觉应用中都很有用,例如目标识别、图像分割和形状分析。

  • 下面代码中的RETR_CCOMP是内外轮廓都要,RETR_EXTERNAL是之要外轮廓,还有RETR_TREE / RETR_LIST可选。
  • 数组最后一个值-1 / 0 / 1 / 4是分组,这里的-1是外轮廓
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('./internal_external.png',0)

contours, hierarchy = cv2.findContours(img, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)

# Set up empty array
external_contours = np.zeros(img.shape)

# For every entry in contours
for i in range(len(contours)):
    
    # last column in the array is -1 if an external contour (no contours inside of it)
    if hierarchy[0][i][3] == -1:
        
        # We can now draw the external contours from the list of contours
        cv2.drawContours(external_contours, contours, i, 255, -1)

plt.imshow(external_contours,cmap='gray')

换成 != -1,则是内轮廓:

if hierarchy[0][i][3] != -1:

23 根据轮廓检测绘制外框,合并父子结构

在边缘检测的基础上,还可以利用父子结构,进行简单合并,只保留最大的

img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)


edges = cv2.Canny(img_gray, 100, 300)
contours, hierarchy = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

merged_contours = []
for i, c in enumerate(contours):
    if hierarchy[0][i][3] == -1:
        merged_contours.append(c)
    else:
        merged_contours[-1] = cv2.convexHull(np.concatenate((merged_contours[-1], c)))

cv2.drawContours(img, merged_contours, -1, (0, 255, 0), 2)

cv2_imshow(img)

24 特征匹配(Feature Matching)

"特征匹配"(Feature Matching),是计算机视觉中的一种技术,用于在两个或多个图像中找到相同的特征点,并将它们匹配起来。特征匹配在很多计算机视觉应用中都有广泛的应用,例如图像拼接、物体识别和跟踪等。

ORB特征匹配:

效果不好,

import cv2
import numpy as np
import matplotlib.pyplot as plt

def display(img,cmap='gray'):
    fig = plt.figure(figsize=(12,10))
    ax = fig.add_subplot(111)
    ax.imshow(img,cmap='gray')

img1 = cv2.imread('./reeses_puffs.png',0)
img2 = cv2.imread('./many_cereals.jpg',0)

# Initiate ORB detector
orb = cv2.ORB_create()

# find the keypoints and descriptors with ORB
kp1, des1 = orb.detectAndCompute(img1,None)
kp2, des2 = orb.detectAndCompute(img2,None)

# create BFMatcher object
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

# Match descriptors.
matches = bf.match(des1,des2)

# Sort them in the order of their distance.
matches = sorted(matches, key = lambda x:x.distance)

# Draw first 25 matches.
img_matches = cv2.drawMatches(img1,kp1,img2,kp2,matches[:25],None,flags=2)

display(img_matches)

SIFT特征:

效果较好,发现大部分特征都匹配到了对的图上

import cv2
import numpy as np
import matplotlib.pyplot as plt

def display(img,cmap='gray'):
    fig = plt.figure(figsize=(12,10))
    ax = fig.add_subplot(111)
    ax.imshow(img,cmap='gray')

img1 = cv2.imread('./reeses_puffs.png',0)
img2 = cv2.imread('./many_cereals.jpg',0)

# Create SIFT Object
sift = cv2.xfeatures2d.SIFT_create()

# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1,None)
kp2, des2 = sift.detectAndCompute(img2,None)

# BFMatcher with default params
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1,des2, k=2)

# Apply ratio test
good = []
for match1,match2 in matches:
    if match1.distance < 0.75*match2.distance:
        good.append([match1])

# cv2.drawMatchesKnn expects list of lists as matches.
img_matches = cv2.drawMatchesKnn(img1,kp1,img2,kp2,good,None,flags=2)

display(img_matches)

上面两个搜索都是基于bf(暴力的),速度有些慢,我们可以换一个匹配方法,例如flann的:

这里不止画了最佳match,把特征也画出来了(红点)

import cv2
import numpy as np
import matplotlib.pyplot as plt

def display(img,cmap='gray'):
    fig = plt.figure(figsize=(12,10))
    ax = fig.add_subplot(111)
    ax.imshow(img,cmap='gray')

img1 = cv2.imread('./reeses_puffs.png',0)
img2 = cv2.imread('./many_cereals.jpg',0)

# Initiate SIFT detector
sift = cv2.xfeatures2d.SIFT_create()

# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1,None)
kp2, des2 = sift.detectAndCompute(img2,None)

# FLANN parameters
FLANN_INDEX_KDTREE = 0
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks=50)  

flann = cv2.FlannBasedMatcher(index_params,search_params)

matches = flann.knnMatch(des1,des2,k=2)

# Need to draw only good matches, so create a mask
matchesMask = [[0,0] for i in range(len(matches))]

# ratio test
for i,(match1,match2) in enumerate(matches):
    if match1.distance < 0.7*match2.distance:
        matchesMask[i]=[1,0]

draw_params = dict(matchColor = (0,255,0),
                   singlePointColor = (255,0,0),
                   matchesMask = matchesMask,
                   flags = 0)

img_matches = cv2.drawMatchesKnn(img1,kp1,img2,kp2,matches,None,**draw_params)

display(img_matches)

25 分水岭算法(Watershed Algorithm)

Watershed Algorithm(分水岭算法)是一种图像分割算法,主要用于将图像分割成多个区域。它可以将图像中的每个像素视为地形表面上的一个点,并将这些点视为水滴。当水滴落在山峰上时,它会沿着山峰的边缘流动,直到到达山谷。在图像中,山峰表示图像中的物体,而山谷表示物体之间的边界。通过将每个像素视为水滴并模拟水滴的流动,Watershed Algorithm可以将图像分割成多个区域,每个区域代表一个物体。Watershed Algorithm在计算机视觉和图像处理领域有广泛的应用,例如图像分割、目标检测、图像分析等。

在下面的例子中,首先找到明确是前景的点,然后再用分水岭算法

import numpy as np
import cv2
import matplotlib.pyplot as plt

def display(img,cmap=None):
    fig = plt.figure(figsize=(10,8))
    ax = fig.add_subplot(111)
    ax.imshow(img,cmap=cmap)

img_coins = cv2.imread('./pennies.jpg')
# blur
img_coins_blur = cv2.medianBlur(img_coins, 35)
# threshold adaptive
img_coins_gray = cv2.cvtColor(img_coins_blur,cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(img_coins_gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
# noise removal
kernel = np.ones((3,3),np.uint8)
opening = cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel, iterations = 2)
# sure background area
sure_bg = cv2.dilate(opening,kernel,iterations=3)
# Finding sure foreground area
dist_transform = cv2.distanceTransform(opening,cv2.DIST_L2,5)
ret, sure_fg = cv2.threshold(dist_transform,0.7*dist_transform.max(),255,0)
# Finding unknown region
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg,sure_fg)
# Marker labelling
ret, markers = cv2.connectedComponents(sure_fg)
# Add one to all labels so that sure background is not 0, but 1
markers = markers+1
# Now, mark the region of unknown with zero
markers[unknown==255] = 0
# Apply Watershed Algorithm to find Markers
markers = cv2.watershed(img_coins,markers)

contours, hierarchy = cv2.findContours(markers.copy(), cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)

# For every entry in contours
for i in range(len(contours)):
    
    # last column in the array is -1 if an external contour (no contours inside of it)
    if hierarchy[0][i][3] == -1:
        
        # We can now draw the external contours from the list of contours
        cv2.drawContours(img_coins, contours, i, (255, 0, 0), 10)

display(img_coins,cmap='gray')

26 人脸检测(Face Detection)

这里是检测有没有人脸,不是识别,用的是cv的传统方法

import numpy as np
import cv2 
import matplotlib.pyplot as plt

nadia = cv2.imread('./Nadia_Murad.jpg',0)
denis = cv2.imread('./Denis_Mukwege.jpg',0)
solvay = cv2.imread('./solvay_conference.jpg',0)

face_cascade = cv2.CascadeClassifier('./haarcascade_frontalface_default.xml')

def detect_face(img):
    
  
    face_img = img.copy()
  
    face_rects = face_cascade.detectMultiScale(face_img) 
    
    for (x,y,w,h) in face_rects: 
        cv2.rectangle(face_img, (x,y), (x+w,y+h), (255,255,255), 10) 
        
    return face_img
    
result = detect_face(nadia)
plt.imshow(result,cmap='gray')

在摄像头 / 视频上检测

cap = cv2.VideoCapture(0) 

while True: 
    
    ret, frame = cap.read(0) 
     
    frame = detect_face(frame)
 
    cv2.imshow('Video Face Detection', frame) 
 
    c = cv2.waitKey(1) 
    if c == 27: 
        break 
        
cap.release() 
cv2.destroyAllWindows()

27 光流(Optical Flow)

GitHub Copilot: 在计算机视觉中,光流(Optical Flow)是指在连续的两帧图像中,同一物体上的像素在时间上的变化量。光流可以用来描述物体的运动,是计算机视觉中的一个重要问题。

光流算法可以通过比较相邻帧之间的像素强度值来计算像素的运动。具体来说,它会在第一帧中选择一个像素,然后在第二帧中找到与该像素最相似的像素,并计算它们之间的位移量。这个过程会对图像中的每个像素进行,从而得到整个图像的光流场。

光流算法可以用于许多计算机视觉应用,例如运动跟踪、视频稳定和人体姿态估计等。

nextPts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, frame_gray, prevPts, None, **lk_params)

flow = cv2.calcOpticalFlowFarneback(prvsImg,nextImg, None, 0.5, 3, 15, 3, 5, 1.2, 0)

28 MeanShift目标追踪(MeanShift Tracking)

在计算机视觉中,MeanShift Tracking是一种目标跟踪算法,用于在视频序列中跟踪运动的目标。它基于颜色直方图的密度估计方法,可以自适应地调整目标区域的大小和形状,以适应目标的运动和变化。

MeanShift Tracking算法的基本思想是在当前帧中选择一个初始目标区域,并计算该区域的颜色直方图。然后,它会在下一帧中搜索与该直方图最相似的区域,并将其作为新的目标区域。这个过程会迭代进行,直到目标区域的位置不再发生变化或达到最大迭代次数为止。

# Grab the Frame in HSV
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        
        # Calculate the Back Projection based off the roi_hist created earlier
        dst = cv2.calcBackProject([hsv],[0],roi_hist,[0,180],1)
        
        # Apply meanshift to get the new coordinates of the rectangle
        ret, track_window = cv2.meanShift(dst, track_window, term_crit)
        
        # Draw the new rectangle on the image
        x,y,w,h = track_window
        img2 = cv2.rectangle(frame, (x,y), (x+w,y+h), (0,0,255),5)
        
        cv2.imshow('img2',img2)

29 CamShift目标追踪(CamShift Tracking)

GitHub Copilot: 在计算机视觉中,CamShift Tracking是一种基于MeanShift Tracking算法的改进算法,用于在视频序列中跟踪运动的目标。它可以自适应地调整目标区域的大小和形状,以适应目标的运动和变化,并且可以处理目标的旋转和缩放。

CamShift Tracking算法的基本思想是在当前帧中选择一个初始目标区域,并计算该区域的颜色直方图。然后,它会在下一帧中搜索与该直方图最相似的区域,并将其作为新的目标区域。与MeanShift Tracking算法不同的是,CamShift Tracking算法会根据目标区域的颜色直方图计算目标的中心点和大小,并使用这些信息来调整搜索窗口的大小和形状。这个过程会迭代进行,直到目标区域的位置和大小不再发生变化或达到最大迭代次数为止。

# Grab the Frame in HSV
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        
        # Calculate the Back Projection based off the roi_hist created earlier
        dst = cv2.calcBackProject([hsv],[0],roi_hist,[0,180],1)
        
        
        # Apply Camshift to get the new coordinates of the rectangle
        ret, track_window = cv2.CamShift(dst, track_window, term_crit)
        
        # Draw it on image
        pts = cv2.boxPoints(ret)
        pts = np.int0(pts)
        img2 = cv2.polylines(frame,[pts],True, (0,0,255),5)
        cv2.imshow('img2',img2)

30 其他目标追踪算法

GitHub Copilot: OpenCV中的tracking API提供了多种目标跟踪算法,包括Boosting、MIL、KCF、TLD、MedianFlow、CamShift和MeanShift等。这些算法都是用于在视频序列中跟踪运动的目标,但它们的实现方式和性能特点不同。

其中,CamShift和MeanShift算法是基于颜色直方图的密度估计方法,可以自适应地调整目标区域的大小和形状,以适应目标的运动和变化。CamShift算法是MeanShift算法的改进版,可以处理目标的旋转和缩放。

Boosting、MIL和KCF算法是基于机器学习的目标跟踪算法,使用分类器来区分目标和背景,并根据分类器的输出来更新目标的位置和大小。这些算法通常具有较高的跟踪精度和鲁棒性,但需要较长的训练时间和较高的计算资源。

TLD和MedianFlow算法是基于特征点的目标跟踪算法,使用特征点来描述目标的外观和运动,并根据特征点的位置和运动来更新目标的位置和大小。这些算法通常具有较快的跟踪速度和较好的鲁棒性,但对目标的外观和运动有一定的限制。

在实际应用中,可以根据目标的特点和应用场景选择合适的跟踪算法。例如,对于需要处理目标旋转和缩放的场景,可以选择CamShift算法;对于需要高精度和鲁棒性的场景,可以选择Boosting、MIL或KCF算法;对于需要快速跟踪和鲁棒性的场景,可以选择TLD或MedianFlow算法。

 

Leave a Reply

Your email address will not be published. Required fields are marked *