
Data Analysis with the FraudFinder Workshop


Lab · 1 hour 30 minutes · 1 credit · Introductory
Note: This lab may incorporate AI tools to support your learning.

GSP1149


Overview

FraudFinder is a series of notebooks that demonstrates how an end-to-end Data to AI architecture works on Google Cloud, through the toy use case of a real-time fraud detection system.

Data to AI is the process of using AI/ML on data to generate insights, inform decision-making, and augment downstream applications. Throughout the FraudFinder labs, you will learn how to read historical payment transaction data stored in a data warehouse, read from a live stream of new transactions, perform exploratory data analysis (EDA), do feature engineering, ingest features into a Vertex AI Feature Store, train a model using the Feature Store, register your model in a model registry, evaluate your model, deploy your model to an endpoint, do real-time inference on your model with the Feature Store, and monitor your model.
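The journey above spans both historical (batch) data and a live stream of new transactions. As a dependency-free illustration of the streaming side — computing features over the last n minutes of transactions — here is a toy sketch (the lab itself uses Dataflow for this; all names and values are illustrative):

```python
from collections import deque

class RecentWindow:
    """Toy sliding-window aggregator: keeps transactions from the last
    `window_s` seconds and exposes simple per-window features.
    (Illustrative only; not the lab's actual Dataflow pipeline.)"""
    def __init__(self, window_s: float = 600.0):
        self.window_s = window_s
        self.events = deque()  # (timestamp, amount) pairs, oldest first

    def add(self, ts: float, amount: float) -> None:
        self.events.append((ts, amount))
        # Evict anything older than the window, relative to the newest event.
        while self.events and ts - self.events[0][0] > self.window_s:
            self.events.popleft()

    def features(self) -> dict:
        amounts = [a for _, a in self.events]
        return {
            "tx_count": len(amounts),
            "tx_sum": sum(amounts),
            "tx_avg": sum(amounts) / len(amounts) if amounts else 0.0,
        }

w = RecentWindow(window_s=600)   # 10-minute window
w.add(0.0, 20.0)
w.add(300.0, 10.0)
w.add(900.0, 40.0)               # evicts the event at t=0
feats = w.features()
```

The same idea — aggregate a customer's recent activity into features like transaction count and average amount — is what the feature engineering notebooks implement at scale.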

Scenario

Imagine that you've just joined Cymbal Bank, and you've been asked to design and create an end-to-end fraud detection solution using Google Cloud.

This hands-on lab will walk you through the entire end-to-end architecture across a series of notebooks.

What you will learn:

  • How to read historical payment transaction data stored in a data warehouse
  • How to read from a live stream of new transactions and perform exploratory data analysis (EDA)
  • How to engineer features and ingest them into a Vertex AI Feature Store
  • How to train a model using the Feature Store
  • How to register your model in a model registry and evaluate it
  • How to deploy your model to an endpoint
  • How to run real-time inference on your model with the Feature Store
  • How to monitor your model

Notebook Organization

This lab is organized across the following notebooks:

FraudFinder

  • 00_environment_setup.ipynb — Sets up the data and verifies that you can query it.
  • 01_exploratory_data_analysis.ipynb — Exploratory data analysis of historical bank transactions stored in BigQuery.
  • 02_feature_engineering_batch.ipynb — Generates new features on bank transactions by customer and terminal over the last n days, using batch feature engineering in SQL with BigQuery.
  • 03_feature_engineering_streaming.ipynb — Computes features based on the last n minutes of transactions, using streaming feature engineering with Dataflow.
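To give a flavor of the batch path: aggregating per-customer activity over the last n days is a single windowed `GROUP BY` in BigQuery SQL. The sketch below assembles such a query as a Python string; the project, dataset, table, and column names are hypothetical, not the lab's actual schema.

```python
def customer_features_sql(n_days: int) -> str:
    """Build a BigQuery SQL query that aggregates per-customer transaction
    features over the last `n_days` days (all names are illustrative)."""
    return f"""
    SELECT
      customer_id,
      COUNT(*)       AS tx_count_{n_days}d,
      AVG(tx_amount) AS avg_amount_{n_days}d,
      SUM(tx_amount) AS total_amount_{n_days}d
    FROM `project.dataset.transactions`
    WHERE tx_ts >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {n_days} DAY)
    GROUP BY customer_id
    """

query = customer_features_sql(7)
```

Run against a real table (for example with the BigQuery client library), a query like this produces one feature row per customer, ready for ingestion into a Feature Store.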

After feature engineering, you can take either of the following paths for model training and MLOps:

  • BigQuery ML
  • Vertex AI custom training

BigQuery ML

BigQuery ML (BQML) enables users to create and execute machine learning models in BigQuery using GoogleSQL queries. Learn more. If you would prefer to learn how to train a model using Python packages for machine learning, such as xgboost, then skip this section and move on to the next section, "Vertex AI Custom Training".
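In BQML, training a model is a single `CREATE MODEL` statement. A minimal sketch of what such a statement might look like for this use case follows, assembled as a Python string; the model name, source table, and column names are hypothetical.

```python
def create_model_sql(model_name: str, source_table: str) -> str:
    """Build a BQML CREATE MODEL statement for a logistic regression
    fraud classifier (model/table/column names are illustrative)."""
    return f"""
    CREATE OR REPLACE MODEL `{model_name}`
    OPTIONS (
      model_type = 'LOGISTIC_REG',
      input_label_cols = ['tx_fraud']
    ) AS
    SELECT tx_amount, tx_count_7d, avg_amount_7d, tx_fraud
    FROM `{source_table}`
    """

sql = create_model_sql("fraudfinder.fraud_lr", "fraudfinder.train_features")
```

Submitting this statement as a BigQuery job trains the model in place; it can then be registered with the Vertex AI Model Registry, as the notebooks below walk through.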

  • bqml/04_model_training_and_prediction.ipynb — Using the data you previously ingested into Vertex AI Feature Store, train a model with BigQuery ML, register it in the Vertex AI Model Registry, and deploy it to an endpoint for real-time prediction.
  • bqml/05_model_training_pipeline_formalization.ipynb — Train and deploy a logistic regression model using BQML, register it with the Model Registry, create a Vertex AI endpoint, and upload the BQML model to the endpoint.
  • bqml/06_model_deployment.ipynb — Set up the Vertex AI Model Monitoring service to detect feature skew and drift in incoming prediction requests.
  • bqml/07_model_inference.ipynb — Create a Cloud Run app to perform model inference against the endpoint deployed in the previous notebooks.
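For the inference step, a client such as the Cloud Run app sends a JSON body with an `instances` list to the deployed endpoint's predict method. The feature names and values below are illustrative, not the lab's actual schema:

```python
import json

# Sketch of the request body a client would POST to a Vertex AI
# endpoint's :predict method (feature names/values are illustrative).
instance = {"tx_amount": 42.5, "tx_count_7d": 12, "avg_amount_7d": 31.0}
body = json.dumps({"instances": [instance]})
```

The endpoint responds with a matching `predictions` list, one entry per instance, which the app can map back to a fraud/not-fraud decision.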

Vertex AI Custom Training

Vertex AI custom training enables you to run your own ML training code in the cloud, using Vertex AI. Learn more. If you would prefer to learn how to train machine learning models directly in BigQuery with SQL, followed by MLOps with Vertex AI, then please instead use the notebooks in the above section, "BigQuery ML".

  • vertex_ai/04_experimentation.ipynb — Using the data you previously ingested into Vertex AI Feature Store, train a model using xgboost in a local kernel, track hyperparameter-tuning experiments on Vertex AI, and deploy the model to an endpoint for real-time prediction.
  • vertex_ai/05_model_training_xgboost_formalization.ipynb — Build a Vertex AI dataset, build a Docker container, train a custom XGBoost model using Vertex AI custom training, evaluate the model, and deploy it to a Vertex AI endpoint.
  • vertex_ai/06_formalization.ipynb — Use Vertex AI Feature Store, Vertex AI Pipelines, and Vertex AI Model Monitoring to build and execute an end-to-end ML pipeline from components.
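The notebooks above train an XGBoost classifier; as a dependency-light stand-in that runs anywhere, the sketch below trains a scikit-learn gradient-boosted classifier (the same model family) on synthetic "transaction" features. All feature names and data are fabricated for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for engineered features such as
# (tx_amount, tx_count_7d, avg_amount_7d); not real lab data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic fraud label

# Gradient-boosted trees, the same model family as XGBoost.
model = GradientBoostingClassifier(n_estimators=50, random_state=0)
model.fit(X, y)
train_acc = model.score(X, y)
```

In the lab itself, the equivalent xgboost training run happens first in a local notebook kernel, then inside a Docker container submitted as a Vertex AI custom training job.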

Task 1. Vertex AI Workbench

In your Google Cloud project, navigate to Vertex AI Workbench. To do so, either open the following link or search for "Vertex AI Workbench" in the search bar at the top of the Google Cloud console: https://console.cloud.google.com/vertex-ai/workbench/

Task 2. Open JupyterLab

On the Workbench page, you should see that a notebook instance has already been created for you.

  1. Click "Open JupyterLab".
  2. JupyterLab opens in a new tab.

Task 3. Open the first notebook

  1. In the file directory menu on the left-hand side, double-click the "fraudfinder/" folder.
  2. Double-click 00_environment_setup.ipynb to open the first notebook.

Task 4. Follow the instructions in the notebooks

  1. Run each cell one at a time to execute the notebook.
  2. Continue through the remaining notebooks in the fraudfinder/ folder.
Note: The emphasis of the lab is to complete the FraudFinder notebooks within the allotted time. Completing the content in the bqml/ and vertex_ai/ folders is not required.

Congratulations

Next steps

Google Cloud training and certification

...helps you make the most of Google Cloud technologies. Our classes include technical skills and best practices to help you get up to speed quickly and continue your learning journey. We offer fundamental to advanced level training, with on-demand, live, and virtual options to suit your busy schedule. Certifications help you validate and prove your skill and expertise in Google Cloud technologies.

Lab Last Tested November 01, 2023

Before you begin

  1. Labs create a Google Cloud project and resources for a fixed period of time.
  2. Labs have a time limit and cannot be paused. If you end a lab, you must start over from the beginning.
  3. Click Start Lab in the top-left corner of the screen to begin.

Use incognito browsing

  1. Copy the username and password provided for the lab.
  2. Click Open console in private mode.

Sign in to the console

  1. Sign in using the lab credentials. Using other credentials may cause errors or incur charges.
  2. Accept the terms and skip the resource recovery page.
  3. Do not click End Lab unless you have finished the lab or want to restart it, as this clears your work and deletes the project.
