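Prerequisite: K-means clustering
The internet is filled with huge amounts of data in the form of images. People upload millions of pictures every day to social media sites such as Instagram and Facebook, and to cloud storage platforms such as Google Drive. With this much data, image compression techniques become important for shrinking images and reducing storage space.
In this article, we look at image compression using K-means clustering, an unsupervised learning algorithm.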
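Image compression using K-means clustering
An image is made up of many intensity values known as pixels. In a color image, each pixel takes 3 bytes and holds an RGB (red-green-blue) value: a red intensity, then a green intensity, then a blue intensity, one byte each.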
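To make the pixel layout concrete, here is a minimal sketch that builds a tiny color image by hand as a NumPy array. The 2 x 2 image and its pixel values are made up for illustration; they are not taken from the article's input file.

```python
import numpy as np

# A hypothetical 2 x 2 color image: each pixel is one (R, G, B) triple,
# and each channel is stored in a single byte (uint8, range 0..255).
image = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [255, 255, 255]]],
    dtype=np.uint8,
)

height, width, channels = image.shape
print(image.shape)   # (2, 2, 3): rows, columns, color channels
print(image.nbytes)  # 12: 4 pixels * 3 bytes per pixel
print(image[0, 0])   # the top-left pixel, which is pure red
```

An image loaded from disk has the same three-dimensional layout, only with many more rows and columns.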
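Approach:
K-means clustering groups similar colors together into k clusters (say k = 64) of different colors (RGB values). Each cluster centroid is therefore a color vector in RGB color space that represents its cluster. These k centroids then replace all the color vectors in their respective clusters. As a result, we only need to store one label per pixel, telling which cluster that pixel belongs to, plus a record of the color vector of each cluster center.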
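The saving can be estimated with a little arithmetic. The sketch below compares the raw size of an image (24 bits per pixel) with the size of the label-plus-palette representation described above. The function names and the 128 x 128 image size are assumptions chosen for illustration, and the numbers describe uncompressed pixel data only; a real file format such as PNG applies its own compression on top.

```python
def original_bits(height, width, bits_per_pixel=24):
    """Raw size: every pixel stores a full 24-bit RGB value."""
    return height * width * bits_per_pixel


def compressed_bits(height, width, k, bits_per_pixel=24):
    """Size after clustering: one label per pixel plus k palette colors."""
    # Number of bits needed to distinguish k labels (at least 1).
    bits_per_label = max(1, (k - 1).bit_length())
    return height * width * bits_per_label + k * bits_per_pixel


# Hypothetical example: a 128 x 128 image reduced to 16 colors.
h, w, k = 128, 128, 16
before = original_bits(h, w)      # 393216 bits
after = compressed_bits(h, w, k)  # 65920 bits
print(before, after, before / after)
```

With these assumed dimensions the labels need 4 bits per pixel instead of 24, so the representation is roughly six times smaller; the palette itself adds only 16 * 24 bits.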
Libraries needed:
-> NumPy library: sudo pip3 install numpy
-> Matplotlib library: sudo pip3 install matplotlib

Below is the Python implementation:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

# scipy.misc.imread and scipy.misc.imsave are no longer available in
# current SciPy releases, so Matplotlib handles image I/O here.


def read_image():
    # Load the PNG image as a 3-D array (rows, columns, channels).
    img = mpimg.imread('bird_small.png')

    # Uncomment the lines below to view the loaded image.
    # plt.imshow(img)
    # plt.show()

    # Matplotlib returns PNG data as floats in [0, 1]; integer data
    # (other formats) is scaled so that the values are small.
    if np.issubdtype(img.dtype, np.integer):
        img = img / 255

    return img


def initialize_means(img, clusters):
    # Flatten the image into a 2-D array with one row per pixel.
    points = np.reshape(img, (img.shape[0] * img.shape[1], img.shape[2]))
    m, n = points.shape

    # clusters is the number of clusters, i.e. the number of colors we choose.
    # means is the array of assumed means (centroids),
    # initialized to randomly chosen pixels.
    means = np.zeros((clusters, n))
    for i in range(clusters):
        means[i] = points[np.random.randint(m)]

    return points, means


def distance(p, q):
    # Euclidean distance between two color vectors (all channels).
    return np.sqrt(np.sum(np.square(p - q)))


def k_means(points, means, clusters):
    iterations = 10  # the number of iterations
    m, n = points.shape

    # index[j] is the cluster to which pixel j belongs.
    index = np.zeros(m, dtype=int)

    # k-means algorithm.
    while iterations > 0:
        # Assignment step: attach each pixel to its nearest centroid.
        for j in range(m):
            minv = float('inf')
            temp = 0
            for k in range(clusters):
                d = distance(points[j], means[k])
                if d < minv:
                    minv = d
                    temp = k
            index[j] = temp

        # Update step: move each centroid to the mean of its pixels.
        # A cluster with no pixels keeps its previous centroid.
        for k in range(clusters):
            members = points[index == k]
            if len(members) > 0:
                means[k] = members.mean(axis=0)

        iterations -= 1

    return means, index


def compress_image(means, index, img, clusters):
    # Recover the compressed image by assigning each pixel
    # the color of its corresponding centroid.
    centroid = np.array(means)
    recovered = centroid[index.astype(int), :]

    # Get back the 3-D array (rows, columns, channels).
    recovered = np.reshape(recovered, img.shape)

    # Plot the compressed image.
    plt.imshow(recovered)
    plt.show()

    # Save the compressed image.
    plt.imsave('compressed_' + str(clusters) + '_colors.png', recovered)


# Driver code
if __name__ == '__main__':
    img = read_image()

    clusters = 16
    answer = input('Enter the number of colors in the compressed image. default = 16\n')
    if answer.strip():
        clusters = int(answer)

    points, means = initialize_means(img, clusters)
    means, index = k_means(points, means, clusters)
    compress_image(means, index, img, clusters)
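The nested per-pixel loops in k_means are easy to follow but slow in pure Python. As a hedged alternative, the sketch below performs the same assignment and update steps with NumPy broadcasting. The names assign_labels and quantize_colors, the seed parameter, and the random 32 x 32 test image are all assumptions introduced for illustration; they are not part of the article's code. The intermediate array has shape (pixels, k, channels), so memory use grows quickly for large images.

```python
import numpy as np


def assign_labels(points, means):
    """Return, for each point, the index of its nearest centroid."""
    # diff has shape (N, k, C): every pixel minus every centroid.
    diff = points[:, None, :] - means[None, :, :]
    return np.argmin(np.sum(diff ** 2, axis=2), axis=1)


def quantize_colors(image, k, iterations=10, seed=0):
    """Cluster the pixels of an (H, W, C) image into k colors."""
    rng = np.random.default_rng(seed)
    points = image.reshape(-1, image.shape[2]).astype(float)

    # Start from k distinct pixels chosen at random.
    means = points[rng.choice(len(points), size=k, replace=False)].copy()

    for _ in range(iterations):
        labels = assign_labels(points, means)
        # Move each centroid to the mean of its pixels;
        # a centroid with no pixels is left where it is.
        for c in range(k):
            members = points[labels == c]
            if len(members) > 0:
                means[c] = members.mean(axis=0)

    # Final assignment so the labels match the final centroids.
    labels = assign_labels(points, means)
    recovered = means[labels].reshape(image.shape)
    return means, labels, recovered


# Hypothetical usage on a synthetic image of random colors.
demo_rng = np.random.default_rng(1)
image = demo_rng.random((32, 32, 3))
means, labels, recovered = quantize_colors(image, k=8)
print(recovered.shape)
print(len(np.unique(recovered.reshape(-1, 3), axis=0)))  # at most 8 colors
```

Only means (the palette) and labels (one small integer per pixel) need to be stored; recovered is rebuilt from them on demand.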
Input image:
[image]
Output:
[image]