Image compression using K-means clustering
Prerequisite: K-means clustering
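The Internet is filled with huge amounts of data in the form of images. Every day, people upload millions of pictures to social media sites such as Instagram and Facebook and to cloud storage platforms such as Google Drive. With this much data, image compression techniques become important for shrinking images and reducing storage space.
In this article, we look at image compression using the K-means clustering algorithm, which is an unsupervised learning algorithm.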
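An image is made up of many intensity values called pixels. In a color image, each pixel takes 3 bytes that hold its RGB (red-green-blue) values: the red intensity first, then green, then blue.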
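To make the pixel layout concrete, here is a minimal sketch that builds a tiny color image as a NumPy array. The 2 x 2 image and the variable names are made up for illustration; they are not part of the article's data.

```python
import numpy as np

# A hypothetical 2x2 color image: shape is (rows, columns, channels).
tiny = np.array([[[255, 0, 0], [0, 255, 0]],
                 [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

rows, cols, channels = tiny.shape
red, green, blue = tiny[0, 0]      # the top-left pixel is pure red

# One byte per channel, so 3 bytes per pixel.
bytes_per_pixel = tiny.itemsize * channels
total_bytes = tiny.nbytes          # rows * cols * 3

print(tiny.shape, bytes_per_pixel, total_bytes)
```

The last axis holds the three channel values of one pixel, which is the color vector that K-means will cluster.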
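Approach:
K-means clustering groups similar colors into 'k' clusters (say k=64). Each cluster centroid is therefore a color vector in RGB color space that represents its cluster. These 'k' centroids then replace all the color vectors in their respective clusters. As a result, we only need to store one label per pixel, which tells us the cluster that pixel belongs to, plus a record of the color vector of each cluster centroid.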
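The saving from storing labels plus a palette can be estimated with some simple arithmetic. The sketch below is only an illustration: the helper names and the 128 x 128 image size are assumptions, and it counts raw bits rather than the size of any saved file.

```python
import math
import numpy as np

def original_size_bits(height, width, bits_per_channel=8):
    """Bits needed to store every pixel as a full RGB triple."""
    return height * width * 3 * bits_per_channel

def compressed_size_bits(height, width, k, bits_per_channel=8):
    """Bits for a k-color palette plus one cluster label per pixel."""
    palette_bits = k * 3 * bits_per_channel
    label_bits = height * width * math.ceil(math.log2(k))
    return palette_bits + label_bits

# Assumed example: a 128 x 128 image reduced to k = 16 colors.
h, w, k = 128, 128, 16
before = original_size_bits(h, w)
after = compressed_size_bits(h, w, k)
print(before, after, round(before / after, 2))

# Reconstruction is a palette lookup: each label indexes a centroid.
palette = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])  # toy palette, k = 2
labels = np.array([[0, 1], [1, 0]])
restored = palette[labels]         # shape (2, 2, 3)
```

With 16 colors each label fits in 4 bits instead of 24, so under these assumptions the raw representation shrinks by a factor of roughly six. An image file written as ordinary RGB will not necessarily shrink by the same factor, since the file format applies its own encoding.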
Libraries needed:
-> Numpy library: sudo pip3 install numpy
-> Matplotlib library: sudo pip3 install matplotlib
-> scipy library: sudo pip3 install scipy
Below is the Python implementation:
import numpy as np
import matplotlib.pyplot as plt


def read_image(path='bird_small.png'):
    # scipy.misc.imread was removed from recent SciPy releases,
    # so the image is loaded with matplotlib instead.
    # The result is a 3D matrix: (rows, columns, channels).
    img = plt.imread(path)

    # uncomment the lines below to view the loaded image
    # plt.imshow(img)
    # plt.show()

    # matplotlib returns PNG data as floats in [0, 1]; integer
    # data (0-255) is scaled so that the values are small
    if img.dtype == np.uint8:
        img = img / 255
    # keep only the RGB channels (drop alpha if present)
    img = img[:, :, :3]
    return img


def initialize_means(img, clusters):
    # flatten the image into a 2D matrix: one row per pixel,
    # one column per color channel
    points = np.reshape(img, (img.shape[0] * img.shape[1],
                              img.shape[2]))
    m, n = points.shape

    # clusters is the number of clusters,
    # i.e. the number of colors that we choose.
    # means is the array of assumed means or centroids.
    # Random initialization: pick distinct pixels so that every
    # channel of every centroid starts from a real color.
    chosen = np.random.choice(m, size=clusters, replace=False)
    means = points[chosen].astype(float)
    return points, means


# Function to measure the euclidean distance
# between two color vectors (all channels)
def distance(p1, p2):
    return np.sqrt(np.sum(np.square(p1 - p2)))


def k_means(points, means, clusters):
    iterations = 10  # the number of iterations
    m, n = points.shape

    # index[j] is the cluster to which pixel j belongs
    index = np.zeros(m, dtype=int)

    # k-means algorithm
    while iterations > 0:
        # assignment step: attach each pixel to its nearest centroid
        for j in range(m):
            minv = np.inf
            temp = 0
            for k in range(clusters):
                d = distance(points[j], means[k])
                if d < minv:
                    minv = d
                    temp = k
            index[j] = temp

        # update step: move each centroid to the mean of its pixels
        for k in range(clusters):
            members = points[index == k]
            # an empty cluster keeps its previous centroid
            if len(members) > 0:
                means[k] = members.mean(axis=0)

        iterations -= 1

    return means, index


def compress_image(means, index, img, clusters):
    # recovering the compressed image by replacing
    # each pixel with its corresponding centroid
    centroid = np.array(means)
    recovered = centroid[index.astype(int), :]

    # getting back the 3D matrix (rows, columns, channels)
    recovered = np.reshape(recovered, img.shape)
    recovered = np.clip(recovered, 0, 1)

    # plotting the compressed image
    plt.imshow(recovered)
    plt.show()

    # saving the compressed image
    # (scipy.misc.imsave is no longer available either)
    plt.imsave('compressed_' + str(clusters) +
               '_colors.png', recovered)


# Driver Code
if __name__ == '__main__':
    img = read_image()

    answer = input('Enter the number of colors in the compressed image. default = 16\n')
    # fall back to the default when nothing is entered
    clusters = int(answer) if answer.strip() else 16

    points, means = initialize_means(img, clusters)
    means, index = k_means(points, means, clusters)
    compress_image(means, index, img, clusters)
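The nested Python loops in k_means get slow as the image grows. As a rough sketch of an alternative, the same assignment and update steps can be written with NumPy broadcasting. The function name k_means_vectorized and the toy data are assumptions made for illustration, and keeping the old centroid for an empty cluster is one possible choice, not the only one.

```python
import numpy as np

def k_means_vectorized(points, means, iterations=10):
    """K-means iterations using broadcasting (hypothetical helper)."""
    means = means.astype(float).copy()
    for _ in range(iterations):
        # squared distances, shape (num_points, num_clusters)
        diff = points[:, None, :] - means[None, :, :]
        dist2 = np.sum(diff ** 2, axis=2)
        # assignment step: nearest centroid for every point
        index = np.argmin(dist2, axis=1)
        # update step: mean of the points assigned to each cluster
        for k in range(len(means)):
            members = points[index == k]
            if len(members) > 0:   # leave empty clusters unchanged
                means[k] = members.mean(axis=0)
    return means, index

# Toy data: two well-separated groups of colors.
points = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                   [1.0, 1.0, 1.0], [0.9, 1.0, 1.0]])
start = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
means, index = k_means_vectorized(points, start)
print(index)
```

The intermediate array has one entry per (pixel, cluster, channel) combination, so for very large images it may be necessary to process the pixels in chunks.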
Input image:
Output: