# GCP Cloud Functions - Python
After following this tutorial you will have a working target update stream, consuming Tracking Stream from Python, on Google Cloud Functions.
In this example, we log the arriving target updates of a filtered query that only shows target updates over Atlanta airport. Before reaching a set time-out we gracefully disconnect right after receiving a position_token; this way we avoid duplicate target update delivery. We store the position_token to re-connect to the same point in the stream on the next scheduled Cloud Functions call.
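The time-out and position_token handling described above can be sketched as follows. This is a minimal, self-contained sketch with made-up names; the real implementation lives in `function/client.py` and `function/main.py` and will differ in detail:

```python
# Hypothetical sketch of the reconnect pattern (not the actual client.py):
# consume updates until a deadline, then disconnect right after the next
# position_token so no update is delivered twice across invocations.
import time

def consume(stream, deadline_s, saved_token=None):
    """Consume (token, update) pairs until deadline_s; return the last token."""
    last_token = saved_token
    start = time.monotonic()
    for token, update in stream:
        if update is not None:
            process(update)        # e.g. log or forward the target update
        if token is not None:
            last_token = token     # remember the position in the stream
            if time.monotonic() - start >= deadline_s:
                break              # graceful disconnect right after a token
    return last_token              # store this for the next invocation

def process(update):
    print("target update:", update)
```

The returned token would be written to Cloud Storage and passed back in as `saved_token` on the next scheduled invocation.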
The GCP services comprising the infrastructure are:
- Google Cloud Function to configure the connection and to receive target updates;
- Google Cloud Scheduler Job to trigger the Cloud Function; and
- Google Cloud Storage to store the location of the stream at time-out.
## Source code

You can find and download the source code for this example here.
## Overview

This overview outlines the purpose of each file in this example and the role each plays.
- `terraform.tf` contains the infrastructure definition for GCP. Using Terraform 1.3 you can create the Cloud Function, the Storage bucket, and the Scheduler Job.
- `function/client.py` and `function/client_test.py` contain production-ready sample client code that wraps the v2/targets/stream API. It exposes target updates via callbacks and handles graceful disconnection to avoid duplicate target update delivery.
- `function/main.py` uses `function/client.py` and manages loading and storing position_tokens, which encode the position in the stream that the client has progressed to. This is also where the `TargetProcessor` class is located: it processes target updates as they come in, and exposes a callback function to do so.
- `function/Pipfile` and `function/Pipfile.lock` define all Python package dependencies for the client and the handler, such as the google package to write to Storage, the requests package to easily call network resources, and pytest for testing and development. Pip can be used inside the Pipenv virtual environment to export a list of installed requirements to `function/requirements.txt`, which Google Cloud Build uses to build the Cloud Function.
The next section lists all prerequisites, with links and notes on how to install them.
## Prerequisites

To execute this tutorial you need the following accounts and tools on your system:
- A Google Cloud account with an "Organization" and "Billing" enabled;
- The Google Cloud SDK (`gcloud`) command installed and authenticated;
- Terraform 1.3.7 to create the GCP infrastructure;
- pyenv to install and load the correct Python version;
- Git to download the source code.
Having these prerequisites in place you can walk through the next section to set up the example on your account.
## Setup

The first part of this section contains and describes the necessary commands to create the infrastructure and deploy the example. The second part shows how to set up the development environment and adapt the code to your needs.
### Creating a GCP project

First, we set up the project and environment such that Terraform can do the rest for us. We create a new, empty project, set up billing, and set environment variables that Terraform will use for variables in `terraform.tf`.
1. Retrieve your organization ID and store it in an environment variable. We will use this variable to create an empty project.
2. Create an empty project.
3. Retrieve the project ID and store it in an environment variable. Terraform will use this variable to know which project to create the services in.
4. Retrieve the billing accounts, and store the relevant one in an environment variable.
5. Enable Cloud Billing for your project, to allow us to use services that depend on it.
6. Enable the services we will use further on. Cloud Billing is a requirement for Cloud Scheduler.
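The steps above can be sketched as `gcloud` commands. This is a hedged sketch, not the example's exact commands: the project name, environment-variable names, and service list are assumptions. The commands are wrapped in a function so that sourcing the file does not run them immediately:

```shell
# Sketch of the project-setup steps; names and output formats are illustrative.
create_project() {
  # 1. Organization ID, used to create the project
  export TF_VAR_org_id=$(gcloud organizations list --format="value(ID)" | head -n 1)
  # 2. Create an empty project (the name is a placeholder)
  gcloud projects create tracking-stream-example --organization "$TF_VAR_org_id"
  # 3. Project ID, so Terraform knows which project to create the services in
  export TF_VAR_project=$(gcloud projects list \
    --filter="name:tracking-stream-example" --format="value(projectId)")
  # 4. Billing account (picks the first one; adjust as needed)
  export BILLING_ACCOUNT=$(gcloud billing accounts list --format="value(ACCOUNT_ID)" | head -n 1)
  # 5. Enable Cloud Billing for the project
  gcloud billing projects link "$TF_VAR_project" --billing-account "$BILLING_ACCOUNT"
  # 6. Enable the services used later (Cloud Billing is required for Cloud Scheduler)
  gcloud services enable cloudfunctions.googleapis.com cloudscheduler.googleapis.com \
    cloudbuild.googleapis.com storage.googleapis.com cloudbilling.googleapis.com
}
```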
### Setting up the project for Terraform

1. Create a Terraform IAM service account.
2. Allow Terraform to create infrastructure using these services.
3. Set the credentials location for our Terraform service account.
4. Create keys for the Terraform service account, and set environment variables for `gcloud`.
5. Set the region to create the infrastructure in, and make your Tracking Stream token available to Terraform.

To use Terraform, the project needs a service account that is allowed to create new service instances and set Cloud Function IAM roles. Furthermore, `terraform.tf` makes use of variables, which are set in this section using environment variables with the prefix `TF_VAR_`.
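A hedged sketch of these steps, wrapped in a function for the same reason as above. The account name, role, key path, region, and `TF_VAR_` variable names are assumptions, not necessarily those used by this example's `terraform.tf`:

```shell
# Sketch of the Terraform service-account setup; names are illustrative.
setup_terraform_account() {
  # 1. Create a service account for Terraform
  gcloud iam service-accounts create terraform --display-name "Terraform admin"
  # 2. Allow it to create infrastructure (editor is broad; narrow it in production)
  gcloud projects add-iam-policy-binding "$TF_VAR_project" \
    --member "serviceAccount:terraform@${TF_VAR_project}.iam.gserviceaccount.com" \
    --role roles/editor
  # 3./4. Create keys and point gcloud and Terraform at them
  export GOOGLE_APPLICATION_CREDENTIALS="$HOME/.config/gcloud/terraform.json"
  gcloud iam service-accounts keys create "$GOOGLE_APPLICATION_CREDENTIALS" \
    --iam-account "terraform@${TF_VAR_project}.iam.gserviceaccount.com"
  # 5. Region for the infrastructure, and the Tracking Stream token for terraform.tf
  export TF_VAR_region=us-east1
  export TF_VAR_token="<your Tracking Stream token>"
}
```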
### Creating the infrastructure

1. Create a Google App Engine app. This is a requirement for Terraform to create a Cloud Scheduler Job.
2. Initialize Terraform, and create and deploy the infrastructure.

Infrastructure creation might take a few minutes, since Google Cloud Build will build and deploy the Cloud Function behind the scenes.
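The two steps above, again as a hedged sketch (the App Engine region is an assumption; pick the one matching your setup):

```shell
# Sketch of the deployment steps; wrapped in a function so sourcing is safe.
create_infrastructure() {
  # App Engine app: required before Terraform can create a Cloud Scheduler Job
  gcloud app create --region us-east1 --project "$TF_VAR_project"
  # Initialize Terraform, then create and deploy the infrastructure
  terraform init
  terraform apply
}
```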
### Results

In https://console.cloud.google.com/functions you should now see the streamer function. Click on it, and go to "View Logs" to see the function invoked regularly, logging incoming target updates, gracefully disconnecting, and reconnecting to the stream seamlessly using `position_token`s.

Note: Propagation of IAM roles and permissions might take a while, so the first one or two invocations of the Cloud Scheduler might fail; after a while, things will consolidate.

Note: The first call of the streamer Cloud Function will report that it can't find a valid `position_token` and will start without one. Following invocations will re-start from their respective previous `position_token`.
## Development

The core functionality for further processing, filtering, or forwarding of target updates is located in `main.py` in the class `TargetProcessor`. This section shows how to modify and re-deploy that code.
For this example, the target processor only logs target updates. For a real use-case, the callback might add the target update to a list, to allow batch-processing after the time-out; it might forward the target update to a stream processor, or to a PubSub topic.
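As a hypothetical illustration of such a callback, a `TargetProcessor` variant that collects updates for batch-processing after the time-out might look like this. This is a sketch, not the class shipped in `main.py`:

```python
# Hypothetical variant of a TargetProcessor that batches target updates
# instead of only logging them; the actual class in main.py differs.
class BatchingTargetProcessor:
    def __init__(self):
        self.batch = []

    def process(self, target_update):
        """Callback handed to the stream client for every target update."""
        self.batch.append(target_update)

    def flush(self):
        """Called after the time-out; returns and clears the collected batch."""
        updates, self.batch = self.batch, []
        return updates
```

The `flush` result could then be forwarded in one go, for example to a stream processor or a Pub/Sub topic.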
To modify the Python code of the GCP Cloud Function (`main.py`) or the API wrapper (`client.py`), first install pyenv, then `cd` into the `function` directory.
- Execute `pyenv init` and follow the instructions, to get pyenv ready to load the correct Python version.
- Run `pyenv install $(cat .python-version)` to install the required Python version on your system.
- Run `pyenv shell $(cat .python-version)` to load this Python version into the active shell.
- Run `pip install pipenv` to install pipenv into the active pyenv virtual environment.
- Run `pipenv --python $(cat .python-version)` to create a virtual environment for this project.
- Run `pipenv shell` to load it into the active shell.
- Run `pipenv sync --dev` to install development and production requirements.

`pipenv --venv` shows the virtual environment location, to correctly set the environment in your IDE.
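Collected into one place, the steps above can be sketched as a shell function (assumes pyenv is already installed and initialized, and that you are inside the `function` directory):

```shell
# Sketch: the development-environment steps from the list above, in order.
# Defined as a function so sourcing this file does not start installing anything.
setup_dev_env() {
  # after `pyenv init` has been configured per its instructions:
  pyenv install "$(cat .python-version)"
  pyenv shell "$(cat .python-version)"
  pip install pipenv
  pipenv --python "$(cat .python-version)"
  pipenv shell
  pipenv sync --dev
}
```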
### Updating Dependencies for deployment

Cloud Build for Cloud Functions installs dependencies that are specified in `requirements.txt`. To mirror the content of `Pipfile.lock` in `requirements.txt`, execute the following in the pipenv shell:
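One common way to do this, consistent with the Pipfile description above, is pip's freeze command. This is a sketch; the example may use a slightly different invocation:

```shell
# Export the packages installed in the active pipenv environment so Cloud
# Build can install the same versions from requirements.txt.
python -m pip freeze > requirements.txt
```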
### Redeploying

`terraform.tf` is set up in a way that triggers a re-deployment on every execution of `terraform apply`: the call to `timestamp()` in the streamer source name triggers a re-creation of that resource. So applying Terraform is enough to re-deploy new code.
## Cleanup

To remove all resources you can do one of two things:
- Delete the project with
- Delete only the terraform-created resources with
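A hedged sketch of the two options; the exact commands are assumptions (e.g. that `TF_VAR_project` still holds your project ID), and you would run only one of them:

```shell
# Option 1: remove the whole project and everything in it.
delete_project() {
  gcloud projects delete "$TF_VAR_project"
}

# Option 2: remove only the resources Terraform created.
destroy_terraform_resources() {
  terraform destroy
}
```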