Notes for logging with Python on App Engine

These are my notes for how logging works for Python applications on Google App Engine and how it integrates with Cloud Logging.

I used the logging_tree package to understand how Python’s logging system gets initialized in different scenarios. Read Brandon Rhodes’ article introducing logging_tree.

Logging on the Python 2.7 runtime

  • Python’s logging system is automatically configured to use the handler in the google.appengine.api.logservice package from the SDK. No code is required in your application to enable this integration.
  • The default logging level is DEBUG.
  • Messages are buffered for a maximum of 60 seconds.
  • The log name is "projects/[PROJECT-ID]/logs/appengine.googleapis.com%2Frequest_log".
  • A request log entry can contain multiple lines (one per log record), so the handler can combine several log records into a single entry.

Logging on the Python 3.9 runtime (default setup)

  • No automatic configuration for Python logging.
  • The default handler is the STDERR handler.
  • The default logging level is WARNING (this is the default level for Python’s logging module).
  • App Engine reads the application output from STDERR and sends it to Cloud Logging.
  • The log name is "projects/[PROJECT-ID]/logs/stderr".
  • Messages are created with a textPayload, but carry no other structured information from the Python logging record (the log level, for example, is lost).
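The defaults above are easy to verify with the standard library alone; this minimal sketch shows why logging.debug() output never appears on the default Python 3.9 setup:

```python
import logging

# With no configuration, the root logger's effective level is WARNING,
# so DEBUG and INFO records are dropped before reaching any handler.
assert logging.getLogger().getEffectiveLevel() == logging.WARNING

logging.debug("dropped: below the default WARNING level")
logging.warning("written to STDERR, which App Engine forwards to Cloud Logging")
```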

Logging with google-cloud-logging on the Python 3.9 runtime

  • Add the google-cloud-logging package to "requirements.txt".
  • Enable it with import google.cloud.logging; google.cloud.logging.Client().setup_logging().
  • The default logging level is INFO.
  • Use setup_logging(log_level=logging.DEBUG) to set a DEBUG level.
  • The log name is "projects/[PROJECT-ID]/logs/app".
  • Messages are created with a "jsonPayload" and with the correct log level (the "severity" field in log records).
  • If Flask is installed, the logging handler gets the trace ID from the request.
  • If Django is installed, and you enabled google.cloud.logging.handlers.middleware.RequestMiddleware, the logging handler gets the trace ID from the request.

Logging for applications that don’t use Flask or Django

For Python applications on App Engine, the important thing is to enable the AppEngineHandler logging handler provided by the google-cloud-logging package when the application starts:

# main.py
import google.cloud.logging

google.cloud.logging.Client().setup_logging()

This will give you log messages such that you can filter by level in the logs explorer.

However if your application does not use Flask or Django, log messages will not have the request’s trace ID, and that makes it harder to identify which messages are associated with a request.

I wrote a demo application with a logging handler to use with the Bottle web framework. The handler adds the correct logging trace ID and other request information.
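The trace ID comes from the X-Cloud-Trace-Context request header, which Google's front end sets in the form "TRACE_ID/SPAN_ID;o=OPTIONS". A minimal sketch of the parsing (the helper name is mine, not from the demo app or google-cloud-logging) might look like:

```python
def trace_from_header(header_value, project_id):
    """Build the Cloud Logging trace value from X-Cloud-Trace-Context.

    Hypothetical helper: the header looks like "TRACE_ID/SPAN_ID;o=1",
    and log records want "projects/[PROJECT-ID]/traces/[TRACE-ID]".
    """
    trace_id = header_value.split("/", 1)[0]
    return f"projects/{project_id}/traces/{trace_id}"

trace = trace_from_header("0123456789abcdef/123;o=1", "my-project")
# trace == "projects/my-project/traces/0123456789abcdef"
```

A logging handler or filter can attach this value to each record emitted while the request is being handled.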

It does not take much code to extend the logging system this way, but reading the source code for google-cloud-logging reminded me that sending messages to the logging system has an overhead, and can only make your application slower. Make sure to use the logging API features that avoid unnecessary work (loggers, levels and positional message arguments), and better still, don't log a message unless you know you will need it to debug a problem later.
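The positional-argument point deserves a quick illustration: when you pass arguments to the logging call instead of pre-formatting the string, the formatting is skipped entirely for records below the logger's level.

```python
import logging

logger = logging.getLogger("demo")
logger.setLevel(logging.WARNING)

# %-style arguments are only interpolated if the record is actually
# emitted, so this dropped DEBUG call does no formatting work.
logger.debug("expensive value: %s", object())

# By contrast, an f-string is always formatted, even when the message
# is then dropped:
#   logger.debug(f"expensive value: {object()}")

# For genuinely costly work, guard it with isEnabledFor().
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("details: %s", "computed only when needed")
```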

Find your website’s URL on Cloud Run

If you have a Python app on Google Cloud Run, how can your app determine its own website URL?

When you deploy to Cloud Run, you specify a service name, and every app deployed to Cloud Run gets a unique URL. The domain in the URL looks something like "my-foo-service-8oafjf26aq-uc.a.run.app". That part in the middle is weird ("8oafjf26aq" in my example), and until you have deployed your first service in a project, it is not obvious how to determine what your app's domain will be.

Here’s one way for your app to discover its own URL after it is deployed, saving you having to hard-code the value somewhere:

# Tested with Python 3.7.
import os

# Requires google-api-python-client google-auth
import googleapiclient.discovery
import google.auth
import google.auth.exceptions

def get_project_id():
    """Find the GCP project ID when running on Cloud Run."""
    try:
        _, project_id = google.auth.default()
    except google.auth.exceptions.DefaultCredentialsError:
        # Probably running a local development server.
        project_id = os.environ.get('GOOGLE_CLOUD_PROJECT', 'development')

    return project_id

def get_service_url():
    """Return the URL for this service, depending on the environment.

    For local development, this will be http://localhost:8080/. On Cloud Run
    this is https://{service}-{hash}-{region}.a.run.app.
    """
    # https://cloud.google.com/run/docs/reference/rest/v1/namespaces.services/list
    try:
        service = googleapiclient.discovery.build('run', 'v1')
    except google.auth.exceptions.DefaultCredentialsError:
        # Probably running the local development server.
        port = os.environ.get('PORT', '8080')
        url = f'http://localhost:{port}'
    else:
        # https://cloud.google.com/run/docs/reference/container-contract
        k_service = os.environ['K_SERVICE']
        project_id = get_project_id()
        parent = f'namespaces/{project_id}'

        # The global end-point only supports list methods, so you can't use
        # namespaces.services/get unless you know what region to use.
        request = service.namespaces().services().list(parent=parent)
        response = request.execute()

        for item in response['items']:
            if item['metadata']['name'] == k_service:
                url = item['status']['url']
                break
        else:
            raise EnvironmentError('Cannot determine service URL')

    return url

This code uses the metadata service to find the Google Cloud Platform project ID, and the K_SERVICE environment variable to find the Cloud Run service name. With that, it makes a request to the namespaces.services.list API, which returns a list of all the services deployed in a project. Looping through the list, it finds the matching service definition, and returns the URL for the service.

Maybe there’s a simpler approach, because this is a bunch of code that one really shouldn’t need to write. I wish Cloud Run would expose the same sort of environment variables that App Engine provides.

Finding the Cloud Tasks location from App Engine

When using the Google Cloud Tasks API you need to specify the project ID and location. It would be good to not hard-code these for your app, and instead determine the values when the application starts or on first using the API.

Code for this post is available on GitHub.

For an App Engine service, the project ID is readily available, both as a runtime environment variable and from the metadata service. The GOOGLE_CLOUD_PROJECT environment variable is your GCP project ID.
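Reading it is a one-liner; the fallback value here is my own choice for local development, where the runtime variable is not set:

```python
import os

# GOOGLE_CLOUD_PROJECT is set by the App Engine runtime; the
# "development" fallback is an assumption for local development.
project_id = os.environ.get("GOOGLE_CLOUD_PROJECT", "development")
```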

The tasks API location is a bit harder to determine. The App Engine region name (us-central, europe-west, etc.) is not exposed as an environment variable, and there’s no end-point for it in the App Engine metadata service.

However on App Engine the GAE_APPLICATION environment variable exposes the application ID (the same as the project ID) prefixed by a short region code. We can use this cryptic region code to identify a Cloud Tasks API location. For example, all App Engine services deployed in the us-central region have a GAE_APPLICATION value that starts with s~, such as s~dbux-test.

Google’s documentation lists all the App Engine regions, but as far as I know there is no Google documentation for these short region code prefixes. So here is the list of App Engine regions, taken from the gcloud app regions list command, along with the short region prefix that appears when an App Engine application is deployed in each.

App Engine region        Short prefix
asia-east1               zde
asia-east2               n
asia-northeast1          b
asia-northeast2          u
asia-northeast3          v
asia-south1              j
asia-southeast1          zas
asia-southeast2          zet
australia-southeast1     f
europe-central2          zlm
europe-west              e
europe-west2             g
europe-west3             h
europe-west6             o
northamerica-northeast1  k
southamerica-east1       i
us-central               s
us-east1                 p
us-east4                 d
us-west1                 zuw
us-west2                 m
us-west3                 zwm
us-west4                 zwn

N.B. App Engine’s europe-west and us-central region names are equivalent to the Cloud Tasks locations europe-west1 and us-central1 respectively.

So from your Python code you can determine the Cloud Tasks location using this list of short region prefixes.

import os

# Hard-coded list of region prefixes to location names.
REGIONCODES_LOCATIONS = {
    'e': 'europe-west1',  # App Engine region europe-west.
    's': 'us-central1',  # App Engine region us-central.
    'p': 'us-east1',
    'j': 'asia-south1',
    # And others.
}

def get_project_and_location_for_tasks():
    # This works on App Engine, won't work on Cloud Run.
    app_id = os.environ['GAE_APPLICATION']
    region_code, _, project_id = app_id.partition('~')

    return project_id, REGIONCODES_LOCATIONS[region_code]

Nice! It does feel a little hacky to hard-code that region-to-location map, though, and it won't handle App Engine regions that aren't in the list.

A more robust solution is to get the location from the Cloud Tasks API. This has the advantage of also working on Cloud Run, but requires 3 more HTTP requests (although those should be super quick). From the command-line, one can use gcloud --project=[project_id] tasks locations list (docs).

The equivalent API method is projects.locations.list.

# pip install google-auth google-api-python-client
import google.auth
import googleapiclient.discovery

def get_project_and_location_for_tasks():
    # Get the project ID from the metadata service. Works on
    # Cloud Run and App Engine.
    _, project_id = google.auth.default()

    name = f'projects/{project_id}'
    service = googleapiclient.discovery.build('cloudtasks', 'v2')
    request = service.projects().locations().list(name=name)
    # Fails if the Cloud Tasks API is not enabled.
    response = request.execute()

    # Grab the first location (in practice there is only one).
    # The response also includes 'name' which is a full location ID like
    # 'projects/[project_id]/locations/[locationId]'.
    first_location = response['locations'][0]

    return project_id, first_location['locationId']

That will fail with an exception if the tasks API is not enabled on the project. When running the application in your local development environment, you will probably want to avoid making requests to public APIs, so that will add some complexity that this code ignores.
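One way to skip the API in local development, sketched with a hypothetical guard (the environment-variable checks come from the documented runtime contracts, but the fallback location is my own assumption):

```python
import os

def is_on_gcp():
    # K_SERVICE is set by Cloud Run and GAE_ENV is "standard" on App
    # Engine standard; neither is normally set on a local machine.
    return "K_SERVICE" in os.environ or os.environ.get("GAE_ENV") == "standard"

def get_location(fetch_from_api):
    # Only call the Cloud Tasks API when actually running on GCP;
    # the local fallback value here is a development-only assumption.
    if not is_on_gcp():
        return "us-central1"  # hypothetical development default
    return fetch_from_api()
```

Here fetch_from_api would be a callable wrapping the projects.locations.list request shown above.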

The projects.locations.list response is a list of locations. I don't know if it is currently possible for there to be more than one location in the list; the docs suggest that the Cloud Tasks service always follows whichever region the App Engine service is deployed to, and an App Engine service is always tied to one region (it cannot change region later).

What is the location if you use the tasks API from a Cloud Run service, in a project which has never deployed an App Engine service? I don’t know.

I did a test with a project that had an existing App Engine service deployed to region us-central. In the same GCP project I deployed a Cloud Run service in the europe-west1 region, and from the Cloud Run service the call to the projects.locations.list API returned 1 location: us-central1.