How to Use the Vision API from Google Cloud | Set 2

Last modified: 2021-04-17 01:41:38             Author: Mango

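Prerequisite: Creating a virtual machine and setting up the API on Google Cloud

The previous article showed how to use the Vision API for face detection, logo detection, label detection, and landmark detection. This article covers more of its features: optical character recognition, handwritten text detection, image property detection, SafeSearch detection, and object detection.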

Text detection (optical character recognition):

This feature detects and extracts the text in an image.

import io
import os

from google.cloud import vision
from matplotlib import patches as pch
from matplotlib import pyplot as plt

# Point the client library at the service-account key file.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = vision.ImageAnnotatorClient()

f = 'image_filename.jpg'
with io.open(f, 'rb') as image_file:
    content = image_file.read()

# google-cloud-vision 2.0+ exposes Image directly;
# older releases used vision.types.Image instead.
image = vision.Image(content=content)
response = client.text_detection(image=image)
texts = response.text_annotations

# Show the image so the detected boxes can be drawn on top of it.
a = plt.imread(f)
fig, ax = plt.subplots(1)
ax.imshow(a)

for text in texts:
    print(text.description)

    vertices = [(vertex.x, vertex.y)
                for vertex in text.bounding_poly.vertices]
    print('Vertices covering text: {}\n\n'.format(vertices))

    # Approximate the (possibly rotated) polygon with an axis-aligned
    # rectangle anchored at the first vertex.
    rect = pch.Rectangle(vertices[0],
                         vertices[1][0] - vertices[0][0],
                         vertices[2][1] - vertices[0][1],
                         linewidth=1, edgecolor='r', facecolor='none')
    ax.add_patch(rect)

plt.show()

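The code above extracts the text from the given image and prints the vertex coordinates of the polygon that encloses each piece of text. The first entry is the full text of the image, and the entries after it are the individual words.
For example, when the following image is given as input:

Output: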

MY MORNING
ROUTINE
How Successful People Start
Every Day Inspired
BENJAMIN SPALL and MICHAEL XANDER

Vertices covering text: [(38, 71), (348, 71), (348, 602), (38, 602)]


MY
Vertices covering text: [(46, 71), (108, 82), (100, 128), (38, 117)]


MORNING
Vertices covering text: [(129, 79), (348, 118), (338, 170), (120, 131)]


ROUTINE
Vertices covering text: [(96, 135), (292, 170), (283, 219), (87, 184)]


How
Vertices covering text: [(68, 200), (101, 205), (98, 221), (65, 216)]


Successful
Vertices covering text: [(104, 207), (196, 222), (193, 238), (101, 224)]


People
Vertices covering text: [(202, 222), (257, 231), (254, 251), (199, 242)]


Start
Vertices covering text: [(265, 232), (311, 239), (309, 255), (262, 248)]


Every
Vertices covering text: [(112, 238), (155, 246), (152, 265), (109, 258)]


Day
Vertices covering text: [(160, 246), (189, 251), (185, 271), (157, 266)]


Inspired
Vertices covering text: [(194, 251), (262, 263), (258, 283), (191, 271)]


BENJAMIN
Vertices covering text: [(57, 534), (118, 546), (115, 561), (54, 549)]


SPALL
Vertices covering text: [(122, 550), (160, 558), (157, 572), (119, 564)]


and
Vertices covering text: [(165, 560), (185, 564), (182, 577), (162, 573)]


MICHAEL
Vertices covering text: [(190, 564), (250, 576), (247, 590), (187, 578)]


XANDER
Vertices covering text: [(254, 575), (311, 587), (308, 602), (251, 591)]
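
The word polygons in the output above are slightly rotated, so the first vertex is not always the top-left corner of the region. As a minimal sketch (the helper name `axis_aligned_box` is hypothetical and not part of the Vision API), the following turns a list of `(x, y)` vertices into an axis-aligned box in the `(x, y, width, height)` form that `matplotlib.patches.Rectangle` expects:

```python
def axis_aligned_box(vertices):
    """Return (x_min, y_min, width, height) enclosing all (x, y) vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    x_min, y_min = min(xs), min(ys)
    return x_min, y_min, max(xs) - x_min, max(ys) - y_min


# Vertices copied from the "MY" entry in the sample output above.
my_box = axis_aligned_box([(46, 71), (108, 82), (100, 128), (38, 117)])
print(my_box)
```

A box computed this way always covers the whole word, at the cost of including some background when the text is tilted.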

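Document/handwritten text detection:

This feature also performs optical character recognition, but it is tuned for dense documents, including handwriting.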

import io
import os

from google.cloud import vision
from matplotlib import pyplot as plt

# Point the client library at the service-account key file.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = vision.ImageAnnotatorClient()

f = 'image_filename.jpg'
with io.open(f, 'rb') as image_file:
    content = image_file.read()

# google-cloud-vision 2.0+ exposes Image directly;
# older releases used vision.types.Image instead.
image = vision.Image(content=content)
response = client.document_text_detection(image=image)

a = plt.imread(f)
plt.imshow(a)

# Walk the hierarchy: page -> block -> paragraph -> word -> symbol.
txt = []
for page in response.full_text_annotation.pages:
    for block in page.blocks:
        print('\nBlock confidence: {}%\n'.format(block.confidence * 100))
        for paragraph in block.paragraphs:
            for word in paragraph.words:
                # A word is stored as a sequence of single-character symbols.
                word_text = ''.join([symbol.text for symbol in word.symbols])
                txt.append(word_text)

print(txt)
plt.show()

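The code above recognizes and extracts the handwritten text in the image, then prints the confidence of each block followed by the list of recognized words.
For example, when we provide the following image as input:

Output: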

Block confidence: 97.00000286102295%

['Geeks', 'for', 'Geeks', 'A', 'computer', 'science', 'portal', 'for', 'Geeks', '.']
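
The word list above keeps punctuation as separate tokens. As a rough sketch of turning such a list back into readable text, the hypothetical helper `join_words` below (not part of the Vision API) inserts a space before every token except common punctuation marks. The set of punctuation marks is an assumption chosen for this example.

```python
# Tokens that should attach to the previous word (an assumption for this sketch).
PUNCTUATION = set(".,;:!?")


def join_words(words):
    """Join OCR word tokens with spaces, attaching punctuation to the previous word."""
    text = ""
    for word in words:
        if text and word not in PUNCTUATION:
            text += " "
        text += word
    return text


# Word list copied from the sample output above.
words = ['Geeks', 'for', 'Geeks', 'A', 'computer', 'science',
         'portal', 'for', 'Geeks', '.']
print(join_words(words))
```

When only the full text is needed, the response also carries it directly in `response.full_text_annotation.text`, which avoids rebuilding it by hand.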

Image property detection:

This feature detects general attributes of an image, such as its dominant colors.

import io
import os

from google.cloud import vision
from matplotlib import pyplot as plt

# Point the client library at the service-account key file.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = vision.ImageAnnotatorClient()

f = 'image_filename.jpeg'
with io.open(f, 'rb') as image_file:
    content = image_file.read()

# google-cloud-vision 2.0+ exposes Image directly;
# older releases used vision.types.Image instead.
image = vision.Image(content=content)

response = client.image_properties(image=image)
properties = response.image_properties_annotation

a = plt.imread(f)
plt.imshow(a)

# Each entry is one dominant color: the fraction of pixels it covers
# and its red, green, and blue components (0-255).
for color in properties.dominant_colors.colors:
    print('fraction: {}'.format(color.pixel_fraction))
    print('\tr: {}'.format(color.color.red))
    print('\tg: {}'.format(color.color.green))
    print('\tb: {}'.format(color.color.blue))

plt.show()

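The code takes an image as input and returns its dominant colors. For each color it prints the fraction of pixels that color covers and its red, green, and blue components. For example, when the following image is given as input:

Output: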

fraction: 0.036332178860902786
        r: 5.0
        g: 185.0
        b: 6.0
fraction: 0.03337658569216728
        r: 131.0
        g: 207.0
        b: 13.0
fraction: 0.029988465830683708
        r: 253.0
        g: 169.0
        b: 5.0
fraction: 0.0262399073690176
        r: 254.0
        g: 123.0
        b: 5.0
fraction: 0.03553921729326248
        r: 253.0
        g: 248.0
        b: 12.0
fraction: 0.02104959636926651
        r: 249.0
        g: 36.0
        b: 6.0
fraction: 0.024581892415881157
        r: 3.0
        g: 35.0
        b: 188.0
fraction: 0.03424163907766342
        r: 6.0
        g: 122.0
        b: 200.0
fraction: 0.027032872661948204
        r: 140.0
        g: 32.0
        b: 185.0
fraction: 0.029411764815449715
        r: 10.0
        g: 177.0
        b: 217.0
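
The raw fractions and RGB components above are easier to compare once they are sorted and written as hex color codes. The sketch below assumes the values have already been copied into plain `(fraction, red, green, blue)` tuples. The helper names `to_hex` and `rank_colors` are hypothetical and not part of the Vision API.

```python
def to_hex(red, green, blue):
    """Format RGB components (0-255, possibly floats) as a #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(int(red), int(green), int(blue))


def rank_colors(colors):
    """Sort (fraction, r, g, b) tuples by fraction, largest first, as (hex, fraction)."""
    ranked = sorted(colors, key=lambda c: c[0], reverse=True)
    return [(to_hex(r, g, b), fraction) for fraction, r, g, b in ranked]


# Three entries copied from the sample output above.
sample = [
    (0.03337658569216728, 131.0, 207.0, 13.0),
    (0.036332178860902786, 5.0, 185.0, 6.0),
    (0.03553921729326248, 253.0, 248.0, 12.0),
]
for hex_code, fraction in rank_colors(sample):
    print(hex_code, fraction)
```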

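SafeSearch property detection:

Detects explicit content in an image, such as adult or violent content. This feature uses five categories (adult, spoof, medical, violence, and racy) and returns the likelihood that each one is present in the given image.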

import io
import os

from google.cloud import vision
from matplotlib import pyplot as plt

# Point the client library at the service-account key file.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = vision.ImageAnnotatorClient()

f = 'image_filename.jpg'
with io.open(f, 'rb') as image_file:
    content = image_file.read()

# google-cloud-vision 2.0+ exposes Image directly;
# older releases used vision.types.Image instead.
image = vision.Image(content=content)

a = plt.imread(f)
plt.imshow(a)

response = client.safe_search_detection(image=image)
safe = response.safe_search_annotation

# Names indexed by the numeric likelihood value returned by the API.
likelihood_name = ('UNKNOWN', 'VERY_UNLIKELY', 'UNLIKELY',
                   'POSSIBLE', 'LIKELY', 'VERY_LIKELY')

print('Adult: {}'.format(likelihood_name[safe.adult]))
print('Medical: {}'.format(likelihood_name[safe.medical]))
print('Spoofed: {}'.format(likelihood_name[safe.spoof]))
print('Violence: {}'.format(likelihood_name[safe.violence]))
print('Racy: {}'.format(likelihood_name[safe.racy]))

plt.show()

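For a given image, the code reports how likely the image is to contain each of the five kinds of content, from VERY_UNLIKELY to VERY_LIKELY.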
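
In practice the likelihood levels are usually compared against a cutoff. The sketch below uses the same ordering of names as the code above and a hypothetical helper, `flagged_categories`, which is not part of the Vision API. The ratings dictionary holds made-up values for illustration, not real API output.

```python
LIKELIHOOD_NAMES = ('UNKNOWN', 'VERY_UNLIKELY', 'UNLIKELY',
                    'POSSIBLE', 'LIKELY', 'VERY_LIKELY')


def flagged_categories(ratings, threshold='LIKELY'):
    """Return the category names whose numeric likelihood is at or above threshold."""
    limit = LIKELIHOOD_NAMES.index(threshold)
    return sorted(name for name, level in ratings.items() if level >= limit)


# Made-up ratings for illustration: category name -> numeric likelihood (0-5).
ratings = {'adult': 1, 'medical': 2, 'spoof': 4, 'violence': 3, 'racy': 5}
print(flagged_categories(ratings))
print(flagged_categories(ratings, threshold='POSSIBLE'))
```

The default cutoff of `LIKELY` is only an assumption for this example. A stricter application could pass `POSSIBLE` instead.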

Object detection:

Detects and extracts multiple objects from an image. It localizes each object and returns its coordinates.

import io
import os

from google.cloud import vision
from matplotlib import pyplot as plt

# Point the client library at the service-account key file.
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = os.path.join(
    os.curdir, 'credentials.json')

client = vision.ImageAnnotatorClient()

f = 'image_filename.jpg'
with io.open(f, 'rb') as image_file:
    content = image_file.read()

# google-cloud-vision 2.0+ exposes Image directly;
# older releases used vision.types.Image instead.
image = vision.Image(content=content)

a = plt.imread(f)
plt.imshow(a)

response = client.object_localization(image=image)
objects = response.localized_object_annotations

# Print the name and confidence score of every localized object.
print('Number of objects found: ', len(objects))
for object_ in objects:
    print('Object: ', object_.name)
    print('Confidence: ', object_.score)

plt.show()

For example, when we give the following image as input:

Output:

Number of objects found:  1
Object:  Scissors
Confidence:  0.540185272693634
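
The sample above prints only the name and score. The coordinates of each object are available in `object_.bounding_poly.normalized_vertices`, where `x` and `y` are fractions of the image width and height. As a minimal sketch, the hypothetical helper `to_pixel_vertices` (not part of the Vision API) scales such values to pixel coordinates. The vertex values and image size below are made up for illustration.

```python
def to_pixel_vertices(normalized_vertices, image_width, image_height):
    """Scale (x, y) pairs in the range 0-1 to integer pixel coordinates."""
    return [(round(x * image_width), round(y * image_height))
            for x, y in normalized_vertices]


# Made-up normalized box for a 640x480 image.
box = [(0.25, 0.5), (0.75, 0.5), (0.75, 0.75), (0.25, 0.75)]
print(to_pixel_vertices(box, 640, 480))
```

With a real response, the input list could be built as `[(v.x, v.y) for v in object_.bounding_poly.normalized_vertices]`.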

For more information, see the Cloud Vision API documentation.