
Scikit-learn Model Serving with Online Prediction Using AI Platform


1 hour 20 minutes 1 Credit

GSP245


Overview

If you've built machine learning models with scikit-learn, and you want to serve your models in real time for an application, managing the resulting infrastructure may sound like a nightmare. Fortunately, there's an alternative - serving your trained scikit-learn models on AI Platform.

You can now upload a model you've already trained onto Cloud Storage and use AI Platform Prediction to support scalable prediction requests against your trained model.

In this lab you learn how to train a simple scikit-learn model, deploy the model to AI Platform Prediction, and make online predictions against that model.

How to bring a scikit-learn model to AI Platform

Getting your model ready for prediction can be done in 5 steps:

  1. Create and save a model to a file
  2. Upload the saved model to Cloud Storage
  3. Create a model resource in AI Platform
  4. Create a model version (linking your scikit-learn model)
  5. Make an online prediction

This lab walks you through the five steps listed above.


What you'll learn

  • How to create a model on AI Platform
  • How to make online predictions against your model on AI Platform

Before you jump into the lab, learn about the different tools you'll be using to get online prediction up and running on AI Platform:

Google Cloud lets you build and host applications and websites, store data, and analyze data on Google's scalable infrastructure.

AI Platform Prediction is a managed service that enables you to easily build machine learning models that work on any type of data, of any size.

Cloud Storage is a unified object storage for developers and enterprises, from live data serving to data analytics/ML to data archiving.

Cloud SDK is a command line tool which allows you to interact with Google Cloud products.

Set up and requirements

Before you click the Start Lab button

Read these instructions. Labs are timed and you cannot pause them. The timer, which starts when you click Start Lab, shows how long Google Cloud resources will be made available to you.

This hands-on lab lets you do the lab activities yourself in a real cloud environment, not in a simulation or demo environment. It does so by giving you new, temporary credentials that you use to sign in and access Google Cloud for the duration of the lab.

To complete this lab, you need:

  • Access to a standard internet browser (Chrome browser recommended).
Note: Use an Incognito or private browser window to run this lab. This prevents any conflicts between your personal account and the Student account, which could cause extra charges to be incurred on your personal account.
  • Time to complete the lab---remember, once you start, you cannot pause a lab.
Note: If you already have your own personal Google Cloud account or project, do not use it for this lab to avoid extra charges to your account.

How to start your lab and sign in to the Google Cloud Console

  1. Click the Start Lab button. If you need to pay for the lab, a pop-up opens for you to select your payment method. On the left is the Lab Details panel with the following:

    • The Open Google Console button
    • Time remaining
    • The temporary credentials that you must use for this lab
    • Other information, if needed, to step through this lab
  2. Click Open Google Console. The lab spins up resources, and then opens another tab that shows the Sign in page.

    Tip: Arrange the tabs in separate windows, side-by-side.

    Note: If you see the Choose an account dialog, click Use Another Account.
  3. If necessary, copy the Username from the Lab Details panel and paste it into the Sign in dialog. Click Next.

  4. Copy the Password from the Lab Details panel and paste it into the Welcome dialog. Click Next.

    Important: You must use the credentials from the left panel. Do not use your Google Cloud Skills Boost credentials. Note: Using your own Google Cloud account for this lab may incur extra charges.
  5. Click through the subsequent pages:

    • Accept the terms and conditions.
    • Do not add recovery options or two-factor authentication (because this is a temporary account).
    • Do not sign up for free trials.

After a few moments, the Cloud Console opens in this tab.

Note: You can view the menu with a list of Google Cloud Products and Services by clicking the Navigation menu at the top-left.

Activate Cloud Shell

Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on the Google Cloud. Cloud Shell provides command-line access to your Google Cloud resources.

  1. Click Activate Cloud Shell at the top of the Google Cloud console.

  2. Click Continue.

It takes a few moments to provision and connect to the environment. When you are connected, you are already authenticated, and the project is set to your PROJECT_ID. The output contains a line that declares the PROJECT_ID for this session:

Your Cloud Platform project in this session is set to YOUR_PROJECT_ID

gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.

  1. (Optional) You can list the active account name with this command:

gcloud auth list

Output:

ACTIVE: *
ACCOUNT: student-01-xxxxxxxxxxxx@qwiklabs.net

To set the active account, run:
$ gcloud config set account `ACCOUNT`
  2. (Optional) You can list the project ID with this command:

gcloud config list project

Output:

[core]
project = <project_ID>

Example output:

[core]
project = qwiklabs-gcp-44776a13dea667a6

Note: For full documentation of gcloud in Google Cloud, refer to the gcloud CLI overview guide.

Create a storage bucket

Select Navigation menu > Cloud Storage > Buckets:


Then click CREATE BUCKET. Give the bucket a unique name and click CONTINUE. Set the Location type to Region and make sure that the location is set to . Then click CREATE.
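
If you prefer the command line, you can create the bucket from Cloud Shell instead. This is an optional sketch; the bucket name below is a placeholder that you must replace with your own unique name:

gsutil mb -l {{{project_0.default_region}}} gs://your-unique-bucket-name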

Click Check my progress to verify the objective.

Create a storage bucket

Create Virtual Machine

Run this command in Cloud Shell to create a Debian 11 virtual machine:

gcloud compute instances create scikit-vm \
  --image-project=debian-cloud \
  --image-family=debian-11 \
  --service-account=$(gcloud config get-value project)@$(gcloud config get-value project).iam.gserviceaccount.com \
  --scopes=cloud-platform,default,storage-full \
  --zone={{{project_0.default_zone}}} \
  --tags http-server,https-server

The service account, scopes, and tags flags are used to give our Virtual Machine access to our bucket and the AI Platform API.

Click Check my progress to verify the objective.

Create Virtual Machine

When the Virtual Machine creation finishes, ssh into it:

gcloud compute ssh --zone={{{project_0.default_zone}}} scikit-vm

When prompted, type Y. Then, press Enter twice to continue with an empty passphrase.

Now, working from the virtual machine, update the package index and install pip and virtualenv:

sudo apt-get update
sudo apt-get install -y python3-pip
sudo apt-get install -y virtualenv

Install scikit-learn

Create an isolated Python environment:

virtualenv ml-env -p python3.9

Activate the virtualenv:

source ml-env/bin/activate

Run the following to install the required packages in your virtualenv:

pip install google-api-python-client==1.6.2
pip install scikit-learn==1.1.2
pip install pandas==1.4.3
pip install --upgrade google-api-python-client

Set up environment variables

You'll next set up these environment variables in order to run subsequent steps.

  • PROJECT_ID - Use the PROJECT_ID that matches your Google Cloud project.
  • MODEL_PATH - The path to your model on Cloud Storage.
  • MODEL_NAME - Use your own model name, such as 'census'.
  • VERSION_NAME - Use your own version, such as 'v1'.
  • REGION - Use the default region; this is the region where the model will be deployed.
  • INPUT_DATA_FILE - A JSON file that contains the data used as input to your model's predict method. (For more info, see the complete guide.)

Replace the placeholder values in the lines below with your own values, then run the commands:

export PROJECT_ID=your-project-id
export MODEL_PATH=gs://your-created-bucket-id
export MODEL_NAME=census
export VERSION_NAME=v1
export REGION={{{project_0.default_region}}}
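
As a quick optional check (not one of the lab's graded steps), you can print the variables back to confirm they are set:

echo "$PROJECT_ID $MODEL_PATH $MODEL_NAME $VERSION_NAME $REGION"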

The data for this lab

The Census Income Data Set that this sample uses for training is hosted by the UC Irvine Machine Learning Repository. Citation: Dua, D. and Karra Taniskidou, E. (2017). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.

  • Training file is adult.data
  • Evaluation file is adult.test

Create a directory to hold the data:

mkdir census_data

Download the data you'll use for this lab:

curl https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data --output census_data/adult.data
curl https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test --output census_data/adult.test

Disclaimer: This dataset is provided by a third party. Google provides no representation, warranty, or other guarantees about the validity or any other aspects of this dataset.
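
If you'd like an optional sanity check that the files downloaded correctly, you can peek at the training file from a Python shell inside your virtualenv. This is just a sketch, not a lab step; the column names below mirror the COLUMNS tuple that train.py defines later:

import pandas as pd

# Column names mirror the COLUMNS tuple defined later in train.py
COLUMNS = (
    'age', 'workclass', 'fnlwgt', 'education', 'education-num',
    'marital-status', 'occupation', 'relationship', 'race', 'sex',
    'capital-gain', 'capital-loss', 'hours-per-week', 'native-country',
    'income-level'
)

df = pd.read_csv('./census_data/adult.data', header=None, names=COLUMNS)
print(df.shape)   # roughly (32561, 15)
print(df.head())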

Click Check my progress to verify the objective.

Create directory and download data files

Train and save your model


Load the data into a pandas DataFrame to prepare it for use with scikit-learn. Train a simple model using scikit-learn's RandomForestClassifier. Save the model to a file that can be uploaded to AI Platform.

First, create and train a scikit-learn model. Once that's done, you can save the model in a format that AI Platform can use.

At this point you need a code editor to create and update files. You can use the shell editors that are installed on your Virtual Machine, such as nano or vim. This lab uses the Nano code editor.

Create a new file called train.py.

nano train.py

Paste or type the following content into the train.py file. Each section is explained as you go.

Add this to set up the imports:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
import joblib
from sklearn.feature_selection import SelectKBest
from sklearn.pipeline import FeatureUnion
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelBinarizer

Now define the format of your input data including unused columns. (These are the columns from the census data files):

# Define the format of your input data including unused columns (These are the columns from the census data files)
COLUMNS = (
    'age',
    'workclass',
    'fnlwgt',
    'education',
    'education-num',
    'marital-status',
    'occupation',
    'relationship',
    'race',
    'sex',
    'capital-gain',
    'capital-loss',
    'hours-per-week',
    'native-country',
    'income-level'
)

# Categorical columns are columns that need to be turned into a numerical value to be used by scikit-learn
CATEGORICAL_COLUMNS = (
    'workclass',
    'education',
    'marital-status',
    'occupation',
    'relationship',
    'race',
    'sex',
    'native-country'
)

Next, add this to load the training census dataset:

# Load the training census dataset
with open('./census_data/adult.data', 'r') as train_data:
    raw_training_data = pd.read_csv(train_data, header=None, names=COLUMNS)

# Remove the column we are trying to predict ('income-level') from our features list
# and convert the DataFrame to a list of lists
train_features = raw_training_data.drop('income-level', axis=1).to_numpy().tolist()
# Create our training labels list, converting the DataFrame to a list of lists
train_labels = (raw_training_data['income-level'] == ' >50K').to_numpy().tolist()

# Load the test census dataset
with open('./census_data/adult.test', 'r') as test_data:
    raw_testing_data = pd.read_csv(test_data, names=COLUMNS, skiprows=1)

# Remove the column we are trying to predict ('income-level') from our features list
# and convert the DataFrame to a list of lists
test_features = raw_testing_data.drop('income-level', axis=1).to_numpy().tolist()
# Create our test labels list, converting the DataFrame to a list of lists
test_labels = (raw_testing_data['income-level'] == ' >50K.').to_numpy().tolist()

Add the following to convert the categorical columns to a numerical value in the training dataset:

# Since the census data set has categorical features, we need to convert
# them to numerical values. We'll use a list of pipelines to convert each
# categorical column and then use FeatureUnion to combine them before calling
# the RandomForestClassifier.
categorical_pipelines = []

# Each categorical column needs to be extracted individually and converted to a numerical value.
# To do this, each categorical column will use a pipeline that extracts one feature column via
# SelectKBest(k=1) and a LabelBinarizer() to convert the categorical value to a numerical one.
# A scores array (created below) will select and extract the feature column. The scores array is
# created by iterating over the COLUMNS and checking if it is a CATEGORICAL_COLUMN.
for i, col in enumerate(COLUMNS[:-1]):
    if col in CATEGORICAL_COLUMNS:
        # Create a scores array to get the individual categorical column.
        # Example:
        #  data = [39, 'State-gov', 77516, 'Bachelors', 13, 'Never-married', 'Adm-clerical',
        #          'Not-in-family', 'White', 'Male', 2174, 0, 40, 'United-States']
        #  scores = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        #
        #  Returns: [['State-gov']]
        scores = []
        # Build the scores array
        for j in range(len(COLUMNS[:-1])):
            if i == j:  # This column is the categorical column we want to extract.
                scores.append(1)  # Set to 1 to select this column
            else:  # Every other column should be ignored.
                scores.append(0)
        skb = SelectKBest(k=1)
        skb.scores_ = scores
        # Convert the categorical column to a numerical value
        lbn = LabelBinarizer()
        r = skb.transform(train_features)
        lbn.fit(r)
        # Create the pipeline to extract the categorical feature
        categorical_pipelines.append(
            ('categorical-{}'.format(i), Pipeline([
                ('SKB-{}'.format(i), skb),
                ('LBN-{}'.format(i), lbn)])))

# Create pipeline to extract the numerical features
skb = SelectKBest(k=6)
# From COLUMNS use the features that are numerical
skb.scores_ = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0]
categorical_pipelines.append(('numerical', skb))

# Combine all the features using FeatureUnion
preprocess = FeatureUnion(categorical_pipelines)

Note: There is a tradeoff when preprocessing the data before it gets used by the model:

  • Pros: The model is very simple.
  • Cons: The data must be converted to numerical values before it gets passed to the model

Finally, create, train, and save the classifier to a file:

# Create the classifier
classifier = RandomForestClassifier()

# Transform the features and fit them to the classifier
classifier.fit(preprocess.transform(train_features), train_labels)

# Create the overall model as a single pipeline
pipeline = Pipeline([
    ('union', preprocess),
    ('classifier', classifier)
])

# Export the model to a file
joblib.dump(pipeline, 'model.joblib')

print('Model trained and saved')

Completed File

Your completed code should look similar to what is below:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
import joblib
from sklearn.feature_selection import SelectKBest
from sklearn.pipeline import FeatureUnion
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelBinarizer

# Define the format of your input data including unused columns (These are the columns from the census data files)
COLUMNS = (
    'age',
    'workclass',
    'fnlwgt',
    'education',
    'education-num',
    'marital-status',
    'occupation',
    'relationship',
    'race',
    'sex',
    'capital-gain',
    'capital-loss',
    'hours-per-week',
    'native-country',
    'income-level'
)

# Categorical columns are columns that need to be turned into a numerical value to be used by scikit-learn
CATEGORICAL_COLUMNS = (
    'workclass',
    'education',
    'marital-status',
    'occupation',
    'relationship',
    'race',
    'sex',
    'native-country'
)

# Load the training census dataset
with open('./census_data/adult.data', 'r') as train_data:
    raw_training_data = pd.read_csv(train_data, header=None, names=COLUMNS)

# Remove the column we are trying to predict ('income-level') from our features list
# and convert the DataFrame to a list of lists
train_features = raw_training_data.drop('income-level', axis=1).to_numpy().tolist()
# Create our training labels list, converting the DataFrame to a list of lists
train_labels = (raw_training_data['income-level'] == ' >50K').to_numpy().tolist()

# Load the test census dataset
with open('./census_data/adult.test', 'r') as test_data:
    raw_testing_data = pd.read_csv(test_data, names=COLUMNS, skiprows=1)

# Remove the column we are trying to predict ('income-level') from our features list
# and convert the DataFrame to a list of lists
test_features = raw_testing_data.drop('income-level', axis=1).to_numpy().tolist()
# Create our test labels list, converting the DataFrame to a list of lists
test_labels = (raw_testing_data['income-level'] == ' >50K.').to_numpy().tolist()

# Since the census data set has categorical features, we need to convert
# them to numerical values. We'll use a list of pipelines to convert each
# categorical column and then use FeatureUnion to combine them before calling
# the RandomForestClassifier.
categorical_pipelines = []

# Each categorical column needs to be extracted individually and converted to a numerical value.
# To do this, each categorical column will use a pipeline that extracts one feature column via
# SelectKBest(k=1) and a LabelBinarizer() to convert the categorical value to a numerical one.
# A scores array (created below) will select and extract the feature column. The scores array is
# created by iterating over the COLUMNS and checking if it is a CATEGORICAL_COLUMN.
for i, col in enumerate(COLUMNS[:-1]):
    if col in CATEGORICAL_COLUMNS:
        # Create a scores array to get the individual categorical column.
        # Example:
        #  data = [39, 'State-gov', 77516, 'Bachelors', 13, 'Never-married', 'Adm-clerical',
        #          'Not-in-family', 'White', 'Male', 2174, 0, 40, 'United-States']
        #  scores = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        #
        #  Returns: [['State-gov']]
        scores = []
        # Build the scores array
        for j in range(len(COLUMNS[:-1])):
            if i == j:  # This column is the categorical column we want to extract.
                scores.append(1)  # Set to 1 to select this column
            else:  # Every other column should be ignored.
                scores.append(0)
        skb = SelectKBest(k=1)
        skb.scores_ = scores
        # Convert the categorical column to a numerical value
        lbn = LabelBinarizer()
        r = skb.transform(train_features)
        lbn.fit(r)
        # Create the pipeline to extract the categorical feature
        categorical_pipelines.append(
            ('categorical-{}'.format(i), Pipeline([
                ('SKB-{}'.format(i), skb),
                ('LBN-{}'.format(i), lbn)])))

# Create pipeline to extract the numerical features
skb = SelectKBest(k=6)
# From COLUMNS use the features that are numerical
skb.scores_ = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0]
categorical_pipelines.append(('numerical', skb))

# Combine all the features using FeatureUnion
preprocess = FeatureUnion(categorical_pipelines)

# Create the classifier
classifier = RandomForestClassifier()

# Transform the features and fit them to the classifier
classifier.fit(preprocess.transform(train_features), train_labels)

# Create the overall model as a single pipeline
pipeline = Pipeline([
    ('union', preprocess),
    ('classifier', classifier)
])

# Export the model to a file
joblib.dump(pipeline, 'model.joblib')

print('Model trained and saved')

Save the file by pressing Ctrl+x, y, and then Enter.

Run the train.py file from your SSH session on the virtual machine:

python train.py

Example Output (do not copy)

Model trained and saved

Note: If you get a deprecation warning related to the NumPy module, ignore it and move forward in the lab.
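
Before uploading, you can optionally load the saved model back and make a single local prediction. This is only a sketch and not a graded lab step; the sample row is the example from the code comments, written with the leading spaces that appear in the raw census values:

import joblib

# Load the pipeline that train.py just saved
pipeline = joblib.load('model.joblib')

# One raw row in the same format the deployed model will receive
sample = [39, ' State-gov', 77516, ' Bachelors', 13, ' Never-married',
          ' Adm-clerical', ' Not-in-family', ' White', ' Male',
          2174, 0, 40, ' United-States']

print(pipeline.predict([sample]))  # e.g. [False], meaning income is not >50K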

Upload the saved model


To use your model with AI Platform, you'll need to upload it to Cloud Storage. This step takes your local model.joblib file and uploads it to Cloud Storage via the Cloud SDK using gsutil.

gsutil cp ./model.joblib $MODEL_PATH/model.joblib
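
Optionally, confirm the upload before moving on; you should see the model.joblib object listed under your bucket:

gsutil ls $MODEL_PATH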

Click Check my progress to verify the objective.

Upload the saved model

Create a model resource


AI Platform organizes your trained models using model and version resources. An AI Platform model is a container for the versions of your machine learning model. More information on model resources and model versions can be found in the AI Platform documentation.

In this step you'll create a container that can hold several different versions of your actual model.

gcloud ai-platform models create $MODEL_NAME --region $REGION
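
Optionally, verify that the model resource now exists:

gcloud ai-platform models list --region $REGION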

Click Check my progress to verify the objective.

Create a model resource

Create a model version


Now it's time to get a version of your model uploaded to your container. This will allow you to make online predictions. The model version requires a few components, specified here.

  • name - The name specified for the version when it was created. This will be the VERSION_NAME variable you declared at the beginning.
  • deploymentUri - The Cloud Storage location of the trained model used to create the version. This is the bucket where you uploaded the model with your MODEL_PATH.
  • runtimeVersion - The AI Platform runtime version to use for this deployment. This is set to 2.8.
  • framework - This specifies whether you're using SCIKIT_LEARN or XGBOOST. This is set to SCIKIT_LEARN.
  • pythonVersion - Set the value to "3.7".

Run the following to upload your model to your container:

gcloud beta ai-platform versions create $VERSION_NAME \
  --model $MODEL_NAME \
  --origin $MODEL_PATH \
  --runtime-version="2.8" \
  --framework="SCIKIT_LEARN" \
  --python-version="3.7" \
  --region=$REGION

Note: It can take several minutes before your model becomes available.

Confirm your model's deployment with the following command:

gcloud ai-platform versions list --model $MODEL_NAME --region $REGION

You should receive a similar output:

NAME  DEPLOYMENT_URI  STATE
v1    gs://my-bucket  READY
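
If you want more detail about the deployment, such as the framework and runtime version, you can optionally describe the version:

gcloud ai-platform versions describe $VERSION_NAME --model $MODEL_NAME --region $REGION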

Click Check my progress to verify the objective.

Create a model version

Make an online prediction


It's time to make an online prediction in Python with your newly deployed model.

First, use your editor to create a new file called test.py:

nano test.py

Now add content to this file. The first part is very similar to what you did with the training data, so it is only covered briefly:

  • Define the format of the columns
  • Load the test dataset
  • Convert the test data to numerical values

import googleapiclient.discovery
import os
import pandas as pd
from google.api_core.client_options import ClientOptions

PROJECT_ID = os.environ['PROJECT_ID']
VERSION_NAME = os.environ['VERSION_NAME']
MODEL_NAME = os.environ['MODEL_NAME']

# Define the format of your input data including unused columns (These are the columns from the census data files)
COLUMNS = (
    'age',
    'workclass',
    'fnlwgt',
    'education',
    'education-num',
    'marital-status',
    'occupation',
    'relationship',
    'race',
    'sex',
    'capital-gain',
    'capital-loss',
    'hours-per-week',
    'native-country',
    'income-level'
)

# Categorical columns are columns that need to be turned into a numerical value to be used by scikit-learn
CATEGORICAL_COLUMNS = (
    'workclass',
    'education',
    'marital-status',
    'occupation',
    'relationship',
    'race',
    'sex',
    'native-country'
)

# Load the training census dataset
with open('./census_data/adult.data', 'r') as train_data:
    raw_training_data = pd.read_csv(train_data, header=None, names=COLUMNS)

# Remove the column we are trying to predict ('income-level') from our features list
# and convert the DataFrame to a list of lists
train_features = raw_training_data.drop('income-level', axis=1).to_numpy().tolist()
# Create our training labels list, converting the DataFrame to a list of lists
train_labels = (raw_training_data['income-level'] == ' >50K').to_numpy().tolist()

# Load the test census dataset
with open('./census_data/adult.test', 'r') as test_data:
    raw_testing_data = pd.read_csv(test_data, names=COLUMNS, skiprows=1)

# Remove the column we are trying to predict ('income-level') from our features list
# and convert the DataFrame to a list of lists
test_features = raw_testing_data.drop('income-level', axis=1).to_numpy().tolist()
# Create our test labels list, converting the DataFrame to a list of lists
test_labels = (raw_testing_data['income-level'] == ' >50K.').to_numpy().tolist()

Now set up the Google API client to make a prediction request with your test data, then print out the results:

endpoint = 'https://{{{project_0.default_region}}}-ml.googleapis.com'
client_options = ClientOptions(api_endpoint=endpoint)
service = googleapiclient.discovery.build('ml', 'v1', client_options=client_options)

name = 'projects/{}/models/{}'.format(PROJECT_ID, MODEL_NAME)
name += '/versions/{}'.format(VERSION_NAME)

# Due to the size of the data, it needs to be split in 2
first_half = test_features[:int(len(test_features)/2)]
second_half = test_features[int(len(test_features)/2):]

complete_results = []
for data in [first_half, second_half]:
    responses = service.projects().predict(
        name=name,
        body={'instances': data}
    ).execute()

    if 'error' in responses:
        print(responses['error'])
    else:
        complete_results.extend(responses['predictions'])

# Print the first 10 responses
for i, response in enumerate(complete_results[:10]):
    print('Prediction: {}\tLabel: {}'.format(response, test_labels[i]))

Completed file

The completed code should look similar to this:

import googleapiclient.discovery
import os
import pandas as pd
from google.api_core.client_options import ClientOptions

PROJECT_ID = os.environ['PROJECT_ID']
VERSION_NAME = os.environ['VERSION_NAME']
MODEL_NAME = os.environ['MODEL_NAME']

# Define the format of your input data including unused columns (These are the columns from the census data files)
COLUMNS = (
    'age',
    'workclass',
    'fnlwgt',
    'education',
    'education-num',
    'marital-status',
    'occupation',
    'relationship',
    'race',
    'sex',
    'capital-gain',
    'capital-loss',
    'hours-per-week',
    'native-country',
    'income-level'
)

# Categorical columns are columns that need to be turned into a numerical value to be used by scikit-learn
CATEGORICAL_COLUMNS = (
    'workclass',
    'education',
    'marital-status',
    'occupation',
    'relationship',
    'race',
    'sex',
    'native-country'
)

# Load the training census dataset
with open('./census_data/adult.data', 'r') as train_data:
    raw_training_data = pd.read_csv(train_data, header=None, names=COLUMNS)

# Remove the column we are trying to predict ('income-level') from our features list
# and convert the DataFrame to a list of lists
train_features = raw_training_data.drop('income-level', axis=1).to_numpy().tolist()
# Create our training labels list, converting the DataFrame to a list of lists
train_labels = (raw_training_data['income-level'] == ' >50K').to_numpy().tolist()

# Load the test census dataset
with open('./census_data/adult.test', 'r') as test_data:
    raw_testing_data = pd.read_csv(test_data, names=COLUMNS, skiprows=1)

# Remove the column we are trying to predict ('income-level') from our features list
# and convert the DataFrame to a list of lists
test_features = raw_testing_data.drop('income-level', axis=1).to_numpy().tolist()
# Create our test labels list, converting the DataFrame to a list of lists
test_labels = (raw_testing_data['income-level'] == ' >50K.').to_numpy().tolist()

endpoint = 'https://{{{project_0.default_region}}}-ml.googleapis.com'
client_options = ClientOptions(api_endpoint=endpoint)
service = googleapiclient.discovery.build('ml', 'v1', client_options=client_options)

name = 'projects/{}/models/{}'.format(PROJECT_ID, MODEL_NAME)
name += '/versions/{}'.format(VERSION_NAME)

# Due to the size of the data, it needs to be split in 2
first_half = test_features[:int(len(test_features)/2)]
second_half = test_features[int(len(test_features)/2):]

complete_results = []
for data in [first_half, second_half]:
    responses = service.projects().predict(
        name=name,
        body={'instances': data}
    ).execute()

    if 'error' in responses:
        print(responses['error'])
    else:
        complete_results.extend(responses['predictions'])

# Print the first 10 responses
for i, response in enumerate(complete_results[:10]):
    print('Prediction: {}\tLabel: {}'.format(response, test_labels[i]))

Save your file by pressing Ctrl+x, Y, and then Enter.

Run test.py from your SSH session on the virtual machine:

python test.py

Your output should look similar to this:

Prediction: False   Label: False
Prediction: False   Label: False
Prediction: True    Label: True
Prediction: True    Label: True
Prediction: False   Label: False
Prediction: False   Label: False
Prediction: False   Label: False
Prediction: True    Label: True
Prediction: False   Label: False
Prediction: False   Label: False

How did your model perform? Most likely the predictions and the labels are the same, indicating that the trained model performed well.
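
If you want a rough overall number instead of eyeballing ten rows, a small optional addition to the end of test.py (not part of the original lab) compares every prediction against its label:

# Optional: fraction of predictions that match the test labels
correct = sum(1 for prediction, label in zip(complete_results, test_labels)
              if prediction == label)
print('Accuracy: {:.3f}'.format(correct / len(test_labels)))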

Congratulations!

This concludes the self-paced lab, Scikit-learn Model Serving with Online Prediction Using Cloud AI Platform. You learned how to train a simple scikit-learn model, upload the model to AI Platform, and make online predictions against that model.


Finish your Quest

This self-paced lab is part of the Qwiklabs Advanced ML: ML Infrastructure Quest. A Quest is a series of related labs that form a learning path. Completing this Quest earns you a badge that recognizes your achievement. You can make your badge (or badges) public and link to them in your online resume or social media account. Enroll in this Quest and get immediate completion credit if you've taken this lab. See other available Qwiklabs Quests.

Next steps / learn more

Be sure to check out other labs in the catalog to get more practice with Google AI Platform.

Google Cloud training and certification

...helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.

Manual Last Updated August 12, 2022
Lab Last Tested August 12, 2022

Copyright 2022 Google LLC All rights reserved. Google and the Google logo are trademarks of Google LLC. All other company and product names may be trademarks of the respective companies with which they are associated.