
Computer Vision Fundamentals with Google Cloud


Extracting Text from the Images using the Google Cloud Vision API

Lab · 1 hour · 5 credits · Advanced
Note: This lab may include AI tools to support your learning.

Overview

In this lab, you learn how to extract text from images using the Google Cloud Vision API. This lab demonstrates how to upload image files to Google Cloud Storage, extract text from the images using the Google Cloud Vision API, translate the text using the Google Cloud Translation API, and save your translations back to Cloud Storage. Google Cloud Pub/Sub is used to queue the various tasks and trigger the right Cloud Functions to carry them out.

Lab objectives

In this lab, you learn how to perform the following tasks:

  • Write and deploy several Background Cloud Functions.
  • Upload images to Cloud Storage.
  • Extract, translate and save text contained in uploaded images.

Setup and requirements

For each lab, you get a new Google Cloud project and set of resources for a fixed time at no cost.

  1. Sign in to Qwiklabs using an incognito window.

  2. Note the lab's access time (for example, 1:15:00), and make sure you can finish within that time.
    There is no pause feature. You can restart if needed, but you have to start at the beginning.

  3. When ready, click Start lab.

  4. Note your lab credentials (Username and Password). You will use them to sign in to the Google Cloud Console.

  5. Click Open Google Console.

  6. Click Use another account and copy/paste credentials for this lab into the prompts.
    If you use other credentials, you'll receive errors or incur charges.

  7. Accept the terms and skip the recovery resource page.

Activate Cloud Shell

Cloud Shell is a virtual machine that contains development tools. It offers a persistent 5-GB home directory and runs on Google Cloud. Cloud Shell provides command-line access to your Google Cloud resources. gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab completion.

  1. Click the Activate Cloud Shell button (Activate Cloud Shell icon) at the top right of the console.

  2. Click Continue.
    It takes a few moments to provision and connect to the environment. When you are connected, you are also authenticated, and the project is set to your PROJECT_ID.

Sample commands

  • List the active account name:
gcloud auth list

(Output)

Credentialed accounts:
 - <myaccount>@<mydomain>.com (active)

(Example output)

Credentialed accounts:
 - google1623327_student@qwiklabs.net
  • List the project ID:
gcloud config list project

(Output)

[core]
project = <project_ID>

(Example output)

[core]
project = qwiklabs-gcp-44776a13dea667a6

Note: Full documentation of gcloud is available in the gcloud CLI overview guide.

Task 1. Visualize the flow of data

The flow of data in this lab's application involves several steps:

  1. An image that contains text in any language is uploaded to Cloud Storage.
  2. A Cloud Function is triggered, which uses the Vision API to extract the text and detect the source language.
  3. The text is queued for translation by publishing a message to a Pub/Sub topic. A translation is queued for each target language different from the source language.
  4. If a target language matches the source language, the translation queue is skipped, and text is sent to the result queue, another Pub/Sub topic.
  5. A Cloud Function uses the Translation API to translate the text in the translation queue. The translated result is sent to the result queue.
  6. Another Cloud Function saves the translated text from the result queue to Cloud Storage.
  7. The results are found in Cloud Storage as txt files for each translation.

It may help to visualize the steps:

[Diagram: Visualizing the flow of data]
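For reference, the work queued between these steps travels as small JSON messages on Pub/Sub. Based on the sample code you examine in Task 3, a message published to the translation queue has roughly this shape (the values here are hypothetical):

{
  "text": "Hello world",
  "filename": "menu.jpg",
  "lang": "es",
  "src_lang": "en"
}

Messages on the result queue have the same shape, except that src_lang is omitted and text holds the translated string.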

Task 2. Prepare the application

  1. Copy the script below and paste it into Cloud Shell. Before pressing Enter, change the bucket name. To make the name unique, you can use your project ID, since it is itself unique; for example, image_bucket_YOUR_PROJECT_ID. Or choose any name you like, as long as it uses only lowercase letters, numbers, hyphens (-), underscores (_), and dots (.).
gcloud storage buckets create gs://YOUR_IMAGE_BUCKET_NAME --location={{{project_0.default_region|set at lab start}}}
  2. Copy the script below and paste it into Cloud Shell. Before pressing Enter, change the bucket name. To make the name unique, you can use your project ID, since it is itself unique; for example, result_bucket_YOUR_PROJECT_ID. Or choose any name you like, as long as it uses only lowercase letters, numbers, hyphens (-), underscores (_), and dots (.).
gcloud storage buckets create gs://YOUR_RESULT_BUCKET_NAME --location={{{project_0.default_region|set at lab start}}}
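If you want to confirm that both buckets were created (an optional check, not a graded step), list the buckets in your project:

gcloud storage ls

You should see the two gs:// URIs for the buckets you just created.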

Click Check my progress to verify the objective.

Create two Cloud Storage buckets
  3. Copy the script below and paste it into Cloud Shell. Before pressing Enter, change YOUR_TRANSLATE_TOPIC_NAME.
gcloud pubsub topics create YOUR_TRANSLATE_TOPIC_NAME
  4. Copy the script below and paste it into Cloud Shell. Before pressing Enter, change YOUR_RESULT_TOPIC_NAME.
gcloud pubsub topics create YOUR_RESULT_TOPIC_NAME
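Likewise, you can optionally verify that both topics exist:

gcloud pubsub topics list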

Click Check my progress to verify the objective.

Create Pub/Sub topics
  5. Clone the sample app repository to your Cloud Shell:
git clone https://github.com/GoogleCloudPlatform/python-docs-samples.git
  6. Change to the directory that contains the Cloud Functions sample code:
cd python-docs-samples/functions/ocr/app/
  7. The python-docs-samples/functions/ocr/app/ folder contains a main.py file, which includes the process_image, detect_text, translate_text, save_result, and validate_message functions that implement the flow described in Task 1, Visualize the flow of data.
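Optionally, list the contents of the directory to confirm you are in the right place; you should see main.py along with the sample's supporting files (for example, a requirements.txt):

ls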

Task 3. Understand the code

Let’s take a closer look at the main.py file:

Import dependencies

The application must import several dependencies in order to communicate with Google Cloud services:

functions/ocr/app/main.py

import base64
import json
import os

from google.cloud import pubsub_v1
from google.cloud import storage
from google.cloud import translate_v2 as translate
from google.cloud import vision

vision_client = vision.ImageAnnotatorClient()
translate_client = translate.Client()
publisher = pubsub_v1.PublisherClient()
storage_client = storage.Client()

project_id = os.environ["GCP_PROJECT"]

Process images

The following function reads an uploaded image file from Cloud Storage and calls a function to detect whether the image contains text:

functions/ocr/app/main.py

def process_image(file, context):
    """Cloud Function triggered by Cloud Storage when a file is changed.
    Args:
        file (dict): Metadata of the changed file, provided by the
                     triggering Cloud Storage event.
        context (google.cloud.functions.Context): Metadata of triggering event.
    Returns:
        None; the output is written to stdout and Stackdriver Logging
    """
    bucket = validate_message(file, "bucket")
    name = validate_message(file, "name")

    detect_text(bucket, name)

    print("File {} processed.".format(file["name"]))

The following function extracts text from the image using the Cloud Vision API and queues the text for translation:

functions/ocr/app/main.py

def detect_text(bucket, filename):
    print("Looking for text in image {}".format(filename))

    futures = []

    image = vision.Image(
        source=vision.ImageSource(gcs_image_uri=f"gs://{bucket}/{filename}")
    )
    text_detection_response = vision_client.text_detection(image=image)
    annotations = text_detection_response.text_annotations

    if len(annotations) > 0:
        text = annotations[0].description
    else:
        text = ""
    print("Extracted text {} from image ({} chars).".format(text, len(text)))

    detect_language_response = translate_client.detect_language(text)
    src_lang = detect_language_response["language"]
    print("Detected language {} for text {}.".format(src_lang, text))

    # Submit a message to the bus for each target language
    to_langs = os.environ["TO_LANG"].split(",")
    for target_lang in to_langs:
        topic_name = os.environ["TRANSLATE_TOPIC"]
        if src_lang == target_lang or src_lang == "und":
            topic_name = os.environ["RESULT_TOPIC"]
        message = {
            "text": text,
            "filename": filename,
            "lang": target_lang,
            "src_lang": src_lang,
        }
        message_data = json.dumps(message).encode("utf-8")
        topic_path = publisher.topic_path(project_id, topic_name)
        future = publisher.publish(topic_path, data=message_data)
        futures.append(future)

    for future in futures:
        future.result()

Translate text

The following function translates the extracted text and queues the translated text to be saved back to Cloud Storage:

functions/ocr/app/main.py

def translate_text(event, context):
    if event.get("data"):
        message_data = base64.b64decode(event["data"]).decode("utf-8")
        message = json.loads(message_data)
    else:
        raise ValueError("Data sector is missing in the Pub/Sub message.")

    text = validate_message(message, "text")
    filename = validate_message(message, "filename")
    target_lang = validate_message(message, "lang")
    src_lang = validate_message(message, "src_lang")

    print("Translating text into {}.".format(target_lang))
    translated_text = translate_client.translate(
        text, target_language=target_lang, source_language=src_lang
    )

    topic_name = os.environ["RESULT_TOPIC"]
    message = {
        "text": translated_text["translatedText"],
        "filename": filename,
        "lang": target_lang,
    }
    message_data = json.dumps(message).encode("utf-8")
    topic_path = publisher.topic_path(project_id, topic_name)
    future = publisher.publish(topic_path, data=message_data)
    future.result()

Save the translations

Finally, the following function receives the translated text and saves it back to Cloud Storage:

functions/ocr/app/main.py

def save_result(event, context):
    if event.get("data"):
        message_data = base64.b64decode(event["data"]).decode("utf-8")
        message = json.loads(message_data)
    else:
        raise ValueError("Data sector is missing in the Pub/Sub message.")

    text = validate_message(message, "text")
    filename = validate_message(message, "filename")
    lang = validate_message(message, "lang")

    print("Received request to save file {}.".format(filename))

    bucket_name = os.environ["RESULT_BUCKET"]
    result_filename = "{}_{}.txt".format(filename, lang)
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(result_filename)

    print("Saving result to {} in bucket {}.".format(result_filename, bucket_name))
    blob.upload_from_string(text)

    print("File saved.")

Task 4. Deploy the functions

This task describes how to deploy your functions.

  1. Enter the commands below to fetch the project details and grant the pubsub.publisher role to the Cloud Storage service agent, which allows Cloud Storage events to trigger your functions through Pub/Sub.
PROJECT_ID=$(gcloud config get-value project)
PROJECT_NUMBER=$(gcloud projects list --filter="project_id:$PROJECT_ID" --format='value(project_number)')
SERVICE_ACCOUNT=$(gcloud storage service-agent --project=$PROJECT_ID)

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member serviceAccount:$SERVICE_ACCOUNT \
  --role roles/pubsub.publisher
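Optionally, you can confirm that the binding was added by inspecting the project's IAM policy, filtered to the service agent (a generic gcloud pattern, not a lab-specific step):

gcloud projects get-iam-policy $PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:$SERVICE_ACCOUNT" \
  --format="table(bindings.role)"

The output should include roles/pubsub.publisher.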
  2. To deploy the image processing function with a Cloud Storage trigger, run the following command in the directory that contains the sample code. Replace YOUR_IMAGE_BUCKET_NAME, YOUR_GCP_PROJECT_ID, YOUR_TRANSLATE_TOPIC_NAME and YOUR_RESULT_TOPIC_NAME.
gcloud functions deploy ocr-extract \
  --gen2 \
  --runtime python312 \
  --region={{{project_0.default_region|set at lab start}}} \
  --source=. \
  --entry-point process_image \
  --trigger-bucket YOUR_IMAGE_BUCKET_NAME \
  --service-account $PROJECT_NUMBER-compute@developer.gserviceaccount.com \
  --allow-unauthenticated \
  --set-env-vars "^:^GCP_PROJECT=YOUR_GCP_PROJECT_ID:TRANSLATE_TOPIC=YOUR_TRANSLATE_TOPIC_NAME:RESULT_TOPIC=YOUR_RESULT_TOPIC_NAME:TO_LANG=es,en,fr,ja"

where YOUR_IMAGE_BUCKET_NAME is the name of the Cloud Storage bucket to which you upload your images.

Note: If you get a permission error while deploying the function, wait 2-3 minutes and re-run the command.

Click Check my progress to verify the objective.

Deploy the image processing function with a Cloud Storage trigger
  3. To deploy the text translation function with a Cloud Pub/Sub trigger, run the following command in the directory that contains the sample code. Replace YOUR_TRANSLATE_TOPIC_NAME, YOUR_GCP_PROJECT_ID and YOUR_RESULT_TOPIC_NAME.
gcloud functions deploy ocr-translate \
  --gen2 \
  --runtime python312 \
  --region={{{project_0.default_region|set at lab start}}} \
  --source=. \
  --trigger-topic YOUR_TRANSLATE_TOPIC_NAME \
  --entry-point translate_text \
  --service-account $PROJECT_NUMBER-compute@developer.gserviceaccount.com \
  --allow-unauthenticated \
  --set-env-vars "GCP_PROJECT=YOUR_GCP_PROJECT_ID,RESULT_TOPIC=YOUR_RESULT_TOPIC_NAME"

Click Check my progress to verify the objective.

Deploy the text translation function with a Cloud Pub/Sub trigger
  4. To deploy the function that saves results to Cloud Storage with a Cloud Pub/Sub trigger, run the following command in the directory that contains the sample code. Replace YOUR_RESULT_TOPIC_NAME, YOUR_GCP_PROJECT_ID and YOUR_RESULT_BUCKET_NAME.
gcloud functions deploy ocr-save \
  --gen2 \
  --runtime python312 \
  --region={{{project_0.default_region|set at lab start}}} \
  --source=. \
  --trigger-topic YOUR_RESULT_TOPIC_NAME \
  --entry-point save_result \
  --service-account $PROJECT_NUMBER-compute@developer.gserviceaccount.com \
  --allow-unauthenticated \
  --set-env-vars "GCP_PROJECT=YOUR_GCP_PROJECT_ID,RESULT_BUCKET=YOUR_RESULT_BUCKET_NAME"

Click Check my progress to verify the objective.

Deploy the function that saves results to Cloud Storage with a Cloud Pub/Sub trigger
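Before moving on, you can optionally confirm that all three functions deployed successfully:

gcloud functions list --regions={{{project_0.default_region| REGION}}}

ocr-extract, ocr-translate, and ocr-save should each be listed, typically with state ACTIVE.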

Task 5. Upload an image

  1. Download an image from the cloud-training bucket:
gsutil cp gs://cloud-training/OCBL307/menu.jpg .

Alternatively, you can upload your own image that contains text, or any image from this sample project, to your Cloud Storage bucket.

  2. Upload the image to your Cloud Storage bucket:
gsutil cp IMAGE_NAME gs://YOUR_IMAGE_BUCKET_NAME

where

  • IMAGE_NAME is the name of the image file (containing text) that you downloaded in the previous step.
  • YOUR_IMAGE_BUCKET_NAME is the name of the bucket where you are uploading images.
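For example, if you downloaded menu.jpg in the previous step and named your bucket image_bucket_YOUR_PROJECT_ID as suggested in Task 2, the command would look like this (a hypothetical example; adjust to the names you actually chose):

gsutil cp menu.jpg gs://image_bucket_YOUR_PROJECT_ID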
  3. Watch the logs to be sure the executions have completed:
gcloud functions logs read --limit 100 --region={{{project_0.default_region| REGION}}}
  4. You can view the saved translations in the Cloud Storage bucket you used for YOUR_RESULT_BUCKET_NAME, as shown in the example below.
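Optionally, list the result bucket and print one of the translations. File names follow the filename_lang.txt pattern used by the save_result function, so an upload of menu.jpg would produce objects such as menu.jpg_es.txt (hypothetical names shown):

gsutil ls gs://YOUR_RESULT_BUCKET_NAME
gsutil cat gs://YOUR_RESULT_BUCKET_NAME/menu.jpg_es.txt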

Click Check my progress to verify the objective.

Upload an image to your image Cloud Storage bucket

Task 6. Delete the Cloud Functions

Deleting Cloud Functions does not remove any resources stored in Cloud Storage.

To delete the Cloud Functions you created, run the following commands and follow the prompts:

gcloud functions delete ocr-extract --region={{{project_0.default_region| REGION}}}
gcloud functions delete ocr-translate --region={{{project_0.default_region| REGION}}}
gcloud functions delete ocr-save --region={{{project_0.default_region| REGION}}}
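Optionally, list the functions again to confirm they were removed; the three names should no longer appear:

gcloud functions list --regions={{{project_0.default_region| REGION}}}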

You can also delete Cloud Functions from the Google Cloud console.

End your lab

When you have completed your lab, click End Lab. Qwiklabs removes the resources you’ve used and cleans the account for you.

You will be given an opportunity to rate the lab experience. Select the applicable number of stars, type a comment, and then click Submit.

The number of stars indicates the following:

  • 1 star = Very dissatisfied
  • 2 stars = Dissatisfied
  • 3 stars = Neutral
  • 4 stars = Satisfied
  • 5 stars = Very satisfied

You can close the dialog box if you don't want to provide feedback.

For feedback, suggestions, or corrections, please use the Support tab.

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.
