There are various bits of metadata about your project and the current request available to your Web app when running on Google App Engine. Let’s see what they are.
These notes show how to access things using Python 3, but they apply to any App Engine standard runtime (except for the older Python 2.7, Go and Java runtimes).
Additional fields in the WSGI request environ
A request to your app results in your code being invoked with a WSGI environ object that captures the request as a series of key / value pairs describing the requested URL, the headers sent by the client, and any data sent by the client.
As well as standard variables such as HTTP_HOST and QUERY_STRING, apps on App Engine receive additional headers that are inserted by the App Engine service itself (not sent by the client). When present, you can rely on these headers having been set by App Engine itself; they cannot be set by a malicious client trying to trick your app.
These extra header names all start with HTTP_X_APPENGINE. Here’s a table of the names and example values:
Key | Example value |
---|---|
HTTP_X_APPENGINE_CITY | london |
HTTP_X_APPENGINE_CITYLATLONG | 51.507351,-0.127758 |
HTTP_X_APPENGINE_COUNTRY | GB |
HTTP_X_APPENGINE_DEFAULT_VERSION_HOSTNAME | app-engine-example.appspot.com |
HTTP_X_APPENGINE_HTTPS | on |
HTTP_X_APPENGINE_REGION | eng |
HTTP_X_APPENGINE_REQUEST_LOG_ID | 5ec7d80400ff0736739830b90c0001737e64627578746f6e2d706573740001696e626a756e64000820 |
HTTP_X_APPENGINE_USER_IP | 33.53.215.240 |
Those location-related headers are cool! In my experience they tend to be fairly accurate, but there are times Google is unable to determine a location. Note that HTTP_X_APPENGINE_REGION describes the requesting user’s region, not the region/zone where your app is hosted.
The HTTP_X_APPENGINE_DEFAULT_VERSION_HOSTNAME header is interesting because in February 2020 App Engine moved to having a short region code in *.appspot.com host names. I wonder if projects created prior to the introduction of the new scheme all have the old-style host names in this header.
How come the headers from the request are prefixed with HTTP_? Because WSGI extends CGI.
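Here is a minimal sketch of pulling those headers out of the environ. The function names (appengine_headers, client_lat_long, app) are my own inventions for illustration, not part of any App Engine API, and the lat/long parser assumes the "latitude,longitude" shape shown in the table above.

```python
def appengine_headers(environ):
    """Return the App Engine headers from a WSGI environ as a plain dict."""
    prefix = "HTTP_X_APPENGINE_"
    return {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }


def client_lat_long(environ):
    """Parse HTTP_X_APPENGINE_CITYLATLONG into floats, or None if unusable."""
    raw = environ.get("HTTP_X_APPENGINE_CITYLATLONG", "")
    try:
        # Unpacking fails with ValueError unless there are exactly two parts.
        latitude, longitude = (float(part) for part in raw.split(","))
    except ValueError:
        return None
    return latitude, longitude


def app(environ, start_response):
    """Minimal WSGI app that echoes the App Engine headers as plain text."""
    lines = [f"{k}={v}" for k, v in sorted(appengine_headers(environ).items())]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [body]
```

Because the location headers are sometimes missing, client_lat_long returns None rather than raising, so the caller has to decide what "unknown location" means for them.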
Environment variables
Your code has access to a regular Unix-style environment. App Engine uses this to share details of the runtime. Here’s a table with some of those keys and example values. See the documentation for environment variables for a complete list.
Key | Example value |
---|---|
GAE_APPLICATION | s~app-engine-example |
GAE_DEPLOYMENT_ID | 426619129872753442 |
GAE_ENV | standard |
GAE_INSTANCE | 0c61b117cf53359b13df64f50b43903717d4d1d624f37d602146c9642e9ab832ea12a3a |
GAE_MEMORY_MB | 256 |
GAE_RUNTIME | python37 |
GAE_SERVICE | default |
GAE_VERSION | 20200221t144119 |
GOOGLE_CLOUD_PROJECT | app-engine-example |
PYTHONDONTWRITEBYTECODE | 1 |
Interesting things:
- GAE_APPLICATION is your project name, but prefixed with a letter and tilde. That prefix identifies the App Engine region your app belongs to, which you must choose the first time you create the App Engine service in your project. You can’t change the region later. I swear there’s documentation listing these one-letter prefixes, but I can’t find it now.
- GAE_SERVICE and GAE_VERSION identify the version of your app that is running. Well useful if you think everything must be divided into micro-services (it should not) or if you deploy multiple versions to test things before setting the default version (you should).
- PYTHONDONTWRITEBYTECODE is a Python-specific thing that prevents the creation of *.pyc files and __pycache__ directories. In theory your app would start a little faster on the first request if this option was turned off, but in practice it doesn’t matter, and this makes life easier for how App Engine runs your code. Given that, I don’t get why the Google Cloud blog highlights the parallel filesystem cache feature in their Python 3.8 beta announcement. Maybe it is enabled on that runtime; I haven’t tested it myself.
For the older Python 2.7 standard runtime, the OS environment includes all the request things as well. Again, because CGI.
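Reading the environment variables from the table above needs nothing more than os.environ. This is a rough sketch: runtime_info and the keys of the dict it returns are names I made up, and splitting GAE_APPLICATION on the tilde assumes the "prefix~project" shape shown in the example value.

```python
import os


def runtime_info(env=os.environ):
    """Collect a few App Engine runtime details from the environment."""
    application = env.get("GAE_APPLICATION", "")
    # GAE_APPLICATION looks like "s~app-engine-example": prefix, tilde, project.
    prefix, sep, project = application.partition("~")
    if not sep:
        # No tilde found: treat the whole value as the project name.
        prefix, project = "", application
    return {
        "on_standard": env.get("GAE_ENV") == "standard",
        "region_prefix": prefix,
        "project": env.get("GOOGLE_CLOUD_PROJECT", project),
        "service": env.get("GAE_SERVICE"),
        "version": env.get("GAE_VERSION"),
    }
```

Passing the environment in as an argument (defaulting to os.environ) makes it easy to try the function locally with a fake dict, where none of the GAE_ variables are set.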
The metadata service
Google’s Compute Engine metadata service is there on App Engine too (but not for the older Python 2.7 standard runtime). Except App Engine doesn’t get all the same things, and it is read-only.
The metadata service runs as an HTTP server. Your code makes requests to http://metadata.google.internal/computeMetadata/v1/ and descendant paths. On App Engine the metadata service exposes information about service accounts, the project’s zone, and the project’s ID.
The service account path is particularly useful, covering some of what used to be available with the google.appengine.api.app_identity APIs. If your code uses the google-auth package, you can get credentials with a short-lived access token for the default service account via google.auth.default(), which in turn gets the token from the metadata service.
But if you want to change the OAuth scopes of the access token, you can go get it yourself. This Python snippet gets a token with scopes for the Google Spreadsheets API:
import requests
url = 'http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token'
headers = {'Metadata-Flavor': 'Google'}  # sent with every metadata request
scopes = ['https://www.googleapis.com/auth/spreadsheets']
params = {'scopes': ','.join(scopes)}
response = requests.get(url, headers=headers, params=params)
response.raise_for_status()
# The body is a JSON document; the token itself is in the access_token field.
token = response.json()['access_token']
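The token path responds with a small JSON document that includes an expires_in field (see the example value in the table below), so it is worth holding on to a token instead of asking for a new one on every request. Here is one possible sketch of that. CachedToken, its fetch argument and the 60 second margin are all my own assumptions, not something the metadata service or google-auth provides; fetch stands for any callable that returns the decoded JSON from the token path.

```python
import time


class CachedToken:
    """Cache an access token until shortly before it expires.

    `fetch` is a hypothetical callable returning the decoded JSON from the
    token path, e.g. {"access_token": "...", "expires_in": 1799, ...}.
    """

    def __init__(self, fetch, margin_seconds=60, clock=time.monotonic):
        self._fetch = fetch
        self._margin = margin_seconds
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return the cached token, fetching a new one when needed."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            payload = self._fetch()
            self._token = payload["access_token"]
            # Refresh a little early so a token is never used right at expiry.
            self._expires_at = now + payload["expires_in"] - self._margin
        return self._token
```

In an app you would pass a function that makes the metadata request and returns response.json(). The clock argument is only there so the expiry logic can be exercised without waiting half an hour.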
These are the paths currently exposed by the App Engine metadata service, relative to http://metadata.google.internal/computeMetadata/v1/, with example values:
Path | Example value |
---|---|
instance/service-accounts/default/aliases | default |
instance/service-accounts/default/email | app-engine-example@appspot.gserviceaccount.com |
instance/service-accounts/default/identity?audience=foo | JWT token |
instance/service-accounts/default/scopes | https://www.googleapis.com/auth/appengine.apis<br>https://www.googleapis.com/auth/cloud-platform<br>https://www.googleapis.com/auth/cloud_debugger<br>https://www.googleapis.com/auth/devstorage.full_control<br>https://www.googleapis.com/auth/logging.write<br>https://www.googleapis.com/auth/monitoring.write<br>https://www.googleapis.com/auth/trace.append<br>https://www.googleapis.com/auth/userinfo.email |
instance/service-accounts/default/token | {"access_token": "token", "expires_in": 1799, "token_type": "Bearer"} |
instance/zone | projects/123456789012/zones/us16 |
project/numeric-project-id | 123456789012 |
project/project-id | app-engine-example |
If your project has multiple service accounts, there will be multiple entries under the instance/service-accounts path.
You can get all this (except for the access and identity tokens) with one request:
import requests
url = 'http://metadata.google.internal/computeMetadata/v1/'
headers = {'Metadata-Flavor': 'Google'}
params = {'recursive': 'true'}  # ask for the whole tree in one JSON document
response = requests.get(url, headers=headers, params=params)
response.raise_for_status()
data = response.json()
It annoys me that the zone is provided here, but not the region, and that there is no documented way to derive the longer region identifier from a short zone identifier (as far as I am aware). But it looks like that will change soon!
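In the meantime, the most you can do with the instance/zone value is pull it apart. A small sketch, assuming the value always has the "projects/NUMBER/zones/ZONE" shape shown in the table above; parse_zone is my own helper name, and it does not attempt the zone-to-region mapping that is missing.

```python
def parse_zone(value):
    """Split an instance/zone value into (project_number, zone)."""
    parts = value.strip().split("/")
    if len(parts) != 4 or parts[0] != "projects" or parts[2] != "zones":
        raise ValueError(f"unexpected zone value: {value!r}")
    return parts[1], parts[3]
```

Raising on anything unexpected is deliberate: if Google changes the format, I would rather find out from an error than from a silently wrong zone.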
Conclusion
Getting information about the App Engine runtime environment turns out to be incredibly useful for your app because you can remove hard-coded assumptions from your code. The metadata service is particularly interesting because it allows more flexible response types (as opposed to strings in Unix environment variables) and a clear path for Google to extend it without fear of breaking your crazy code.