# GCP Cloud Functions - Python
After following this tutorial you will have a working target update stream, consuming Tracking Stream from Python, on Google Cloud Functions.
In this example, we log the arriving target updates of a filtered query that only shows target updates over Atlanta airport. Before reaching a set time-out we gracefully disconnect right after receiving a position_token; this way we avoid duplicate target update delivery. We store the position_token to re-connect to the same point in the stream on the next scheduled Cloud Functions call.
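The time-out and position_token handling described above can be sketched as follows. This is a minimal, self-contained sketch with made-up names; the real implementation lives in `function/client.py` and `function/main.py` and will differ in detail:

```python
# Hypothetical sketch of the reconnect pattern (not the actual client.py):
# consume updates until a deadline, then disconnect right after the next
# position_token so no update is delivered twice across invocations.
import time

def consume(stream, deadline_s, saved_token=None):
    """Consume (token, update) pairs until deadline_s; return the last token."""
    last_token = saved_token
    start = time.monotonic()
    for token, update in stream:
        if update is not None:
            process(update)        # e.g. log or forward the target update
        if token is not None:
            last_token = token     # remember the position in the stream
            if time.monotonic() - start >= deadline_s:
                break              # graceful disconnect right after a token
    return last_token              # store this for the next invocation

def process(update):
    print("target update:", update)
```

The returned token would be written to Cloud Storage and passed back in as `saved_token` on the next scheduled invocation.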
The GCP services comprising the infrastructure are:
- Google Cloud Function to configure the connection and to receive target updates;
- Google Cloud Scheduler Job to trigger the Cloud Function; and
- Google Cloud Storage to store the location of the stream at time-out.
## Source code

You can find and download the source code for this example here.
## Overview

This overview outlines the purpose of each file in this example and the role each plays.
- `terraform.tf` contains the infrastructure definition for GCP. Using Terraform 1.3 you can create the Cloud Function, the Storage bucket, and the Scheduler Job.
- `function/client.py` and `function/client_test.py` contain production-ready sample client code that wraps the v2/targets/stream API. It exposes target updates via callbacks and handles graceful disconnection to avoid duplicate target update delivery.
- `function/main.py` uses `function/client.py` and manages loading and storing position_tokens, which encode the position in the stream that the client has progressed to. This is also where the `TargetProcessor` class is located: it processes target updates as they come in, and exposes a callback function to do so.
- `function/Pipfile` and `function/Pipfile.lock` define all Python package dependencies for the client and the handler, such as the google package to write to Storage, the requests package to easily call network resources, and pytest for testing and development. Pip can be used inside the Pipenv virtual environment to export a list of installed requirements to `function/requirements.txt`, which Google Cloud Build uses to build the Cloud Function.
The next section lists all prerequisites, with links and notes on how to install them.
## Prerequisites

To execute this tutorial you need the following accounts and tools on your system:
- A Google Cloud account with an "Organization" and "Billing" enabled;
- The Google Cloud SDK (`gcloud`) command installed and authenticated;
- Terraform 1.3.7 to create the GCP infrastructure;
- pyenv to install and load the correct Python version;
- Git to download the source code.
Having these prerequisites in place you can walk through the next section to set up the example on your account.
## Setup

The first part of this section contains and describes the necessary commands to create the infrastructure and deploy the example. The second part shows how to set up the development environment and adapt the code to your needs.
### Creating a GCP project

First, we set up the project and environment such that Terraform can do the rest for us. We create a new, empty project, set up billing, and set environment variables that Terraform will use for variables in `terraform.tf`.
1. Retrieve your organization ID and store it in an environment variable. We will use this variable to create an empty project.
2. Create an empty project.
3. Retrieve the project ID and store it in an environment variable. Terraform will use this variable to know which project to create the services in.
4. Retrieve the billing accounts, and store the relevant one in an environment variable.
5. Enable Cloud Billing for your project, to allow us to use services that depend on it.
6. Enable the services we will use further on. Cloud Billing is a requirement for Cloud Scheduler.
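The steps above can be sketched as `gcloud` commands. This is a hedged sketch, not the example's exact commands: the project name, environment-variable names, and service list are assumptions. The commands are wrapped in a function so that sourcing the file does not run them immediately:

```shell
# Sketch of the project-setup steps; names and output formats are illustrative.
create_project() {
  # 1. Organization ID, used to create the project
  export TF_VAR_org_id=$(gcloud organizations list --format="value(ID)" | head -n 1)
  # 2. Create an empty project (the name is a placeholder)
  gcloud projects create tracking-stream-example --organization "$TF_VAR_org_id"
  # 3. Project ID, so Terraform knows which project to create the services in
  export TF_VAR_project=$(gcloud projects list \
    --filter="name:tracking-stream-example" --format="value(projectId)")
  # 4. Billing account (picks the first one; adjust as needed)
  export BILLING_ACCOUNT=$(gcloud billing accounts list --format="value(ACCOUNT_ID)" | head -n 1)
  # 5. Enable Cloud Billing for the project
  gcloud billing projects link "$TF_VAR_project" --billing-account "$BILLING_ACCOUNT"
  # 6. Enable the services used later (Cloud Billing is required for Cloud Scheduler)
  gcloud services enable cloudfunctions.googleapis.com cloudscheduler.googleapis.com \
    cloudbuild.googleapis.com storage.googleapis.com cloudbilling.googleapis.com
}
```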
### Setting up the project for Terraform

1. Create a Terraform IAM service account.
2. Allow Terraform to create infrastructure using these services.
3. Set the credentials location for our Terraform service account.
4. Create keys for the Terraform service account, and set environment variables for `gcloud`.
5. Set the region to create the infrastructure in, and make your Tracking Stream token available to Terraform.

To use Terraform, the project needs a service account that is allowed to create new service instances and set Cloud Function IAM roles. Furthermore, `terraform.tf` makes use of variables, which are set in this section using environment variables with the prefix `TF_VAR_`.
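A hedged sketch of these steps, wrapped in a function for the same reason as above. The account name, role, key path, region, and `TF_VAR_` variable names are assumptions, not necessarily those used by this example's `terraform.tf`:

```shell
# Sketch of the Terraform service-account setup; names are illustrative.
setup_terraform_account() {
  # 1. Create a service account for Terraform
  gcloud iam service-accounts create terraform --display-name "Terraform admin"
  # 2. Allow it to create infrastructure (editor is broad; narrow it in production)
  gcloud projects add-iam-policy-binding "$TF_VAR_project" \
    --member "serviceAccount:terraform@${TF_VAR_project}.iam.gserviceaccount.com" \
    --role roles/editor
  # 3./4. Create keys and point gcloud and Terraform at them
  export GOOGLE_APPLICATION_CREDENTIALS="$HOME/.config/gcloud/terraform.json"
  gcloud iam service-accounts keys create "$GOOGLE_APPLICATION_CREDENTIALS" \
    --iam-account "terraform@${TF_VAR_project}.iam.gserviceaccount.com"
  # 5. Region for the infrastructure, and the Tracking Stream token for terraform.tf
  export TF_VAR_region=us-east1
  export TF_VAR_token="<your Tracking Stream token>"
}
```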
### Creating the infrastructure

1. Create a Google App Engine app. This is a requirement for Terraform to create a Cloud Scheduler Job.
2. Initialize Terraform, and create and deploy the infrastructure.

Infrastructure creation might take a few minutes, since Google Cloud Build will build and deploy the Cloud Function behind the scenes.
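The two steps above, again as a hedged sketch (the App Engine region is an assumption; pick the one matching your setup):

```shell
# Sketch of the deployment steps; wrapped in a function so sourcing is safe.
create_infrastructure() {
  # App Engine app: required before Terraform can create a Cloud Scheduler Job
  gcloud app create --region us-east1 --project "$TF_VAR_project"
  # Initialize Terraform, then create and deploy the infrastructure
  terraform init
  terraform apply
}
```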
### Results

In https://console.cloud.google.com/functions you should now see the streamer function. Click on it, and go to "View Logs" to see the function invoked regularly, logging incoming target updates, gracefully disconnecting, and reconnecting to the stream seamlessly using `position_token`s.

Note: Propagation of IAM roles and permissions might take a while, so the first one or two invocations of the Cloud Scheduler might fail; after a while, things will consolidate.

Note: The first call of the streamer Cloud Function will report that it can't find a valid `position_token` and will start without one. Following invocations will re-start from their respective previous `position_token`.
## Development

The core functionality for further processing, filtering, or forwarding of target updates is located in `main.py` in the class `TargetProcessor`. This section shows how to modify and re-deploy that code.
For this example, the target processor only logs target updates. For a real use-case, the callback might add the target update to a list, to allow batch-processing after the time-out; it might forward the target update to a stream processor, or to a PubSub topic.
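As a hypothetical illustration of such a callback, a `TargetProcessor` variant that collects updates for batch-processing after the time-out might look like this. This is a sketch, not the class shipped in `main.py`:

```python
# Hypothetical variant of a TargetProcessor that batches target updates
# instead of only logging them; the actual class in main.py differs.
class BatchingTargetProcessor:
    def __init__(self):
        self.batch = []

    def process(self, target_update):
        """Callback handed to the stream client for every target update."""
        self.batch.append(target_update)

    def flush(self):
        """Called after the time-out; returns and clears the collected batch."""
        updates, self.batch = self.batch, []
        return updates
```

The `flush` result could then be forwarded in one go, for example to a stream processor or a Pub/Sub topic.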
To modify the Python code of the GCP Cloud Function (`main.py`) or the API wrapper (`client.py`), first install pyenv, then `cd` into the `function` directory.
- Execute `pyenv init` and follow the instructions, to get pyenv ready to load the correct Python version.
- Run `pyenv install $(cat .python-version)` to install the required Python version on your system.
- Run `pyenv shell $(cat .python-version)` to load this Python version into the active shell.
- Run `pip install pipenv` to install pipenv into the active pyenv virtual environment.
- Run `pipenv --python $(cat .python-version)` to create a virtual environment for this project.
- Run `pipenv shell` to load it into the active shell.
- Run `pipenv sync --dev` to install development and production requirements.

`pipenv --venv` shows the virtual environment location, to correctly set the environment in your IDE.
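Collected into one place, the steps above can be sketched as a shell function (assumes pyenv is already installed and initialized, and that you are inside the `function` directory):

```shell
# Sketch: the development-environment steps from the list above, in order.
# Defined as a function so sourcing this file does not start installing anything.
setup_dev_env() {
  # after `pyenv init` has been configured per its instructions:
  pyenv install "$(cat .python-version)"
  pyenv shell "$(cat .python-version)"
  pip install pipenv
  pipenv --python "$(cat .python-version)"
  pipenv shell
  pipenv sync --dev
}
```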
### Updating Dependencies for deployment

Cloud Build for Cloud Functions installs dependencies that are specified in `requirements.txt`. To mirror the content of `Pipfile.lock` in `requirements.txt`, execute the following in the pipenv shell:
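One common way to do this, consistent with the Pipfile description above, is pip's freeze command. This is a sketch; the example may use a slightly different invocation:

```shell
# Export the packages installed in the active pipenv environment so Cloud
# Build can install the same versions from requirements.txt.
python -m pip freeze > requirements.txt
```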
### Redeploying

`terraform.tf` is set up in a way that triggers a re-deployment on every execution of `terraform apply`: the call to `timestamp()` in the streamer source name triggers a re-creation of that resource. So applying Terraform is enough to re-deploy new code.
## Cleanup

To remove all resources you can do one of two things:
- Delete the project with
- Delete only the terraform-created resources with
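A hedged sketch of the two options; the exact commands are assumptions (e.g. that `TF_VAR_project` still holds your project ID), and you would run only one of them:

```shell
# Option 1: remove the whole project and everything in it.
delete_project() {
  gcloud projects delete "$TF_VAR_project"
}

# Option 2: remove only the resources Terraform created.
destroy_terraform_resources() {
  terraform destroy
}
```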