In the post called How to Create AWS Lambda Layers for Python, I briefly touch upon the need to retrieve workflow and environment data from GitLab’s GraphQL API. I publish this data to an internal dashboard to track the status of workflows (success, failure, skipped) and the state of environments (available, stopped). The dashboard uses InfluxDB as a data source.

In this post, I take the concepts described in An Introduction to GraphQL Queries and Mutations and apply them to the GitLab GraphQL API. I start by exploring the GraphQL API schema for GitLab and providing an example of the results. Then, I define a Python function to query GraphQL APIs of any type. Finally, I touch on translating the results into a payload compatible with InfluxDB’s line protocol.
Exploring the GraphQL API Schema
GitLab hosts a browser-based integrated development environment (IDE) using GraphiQL (pronounced “graphical”). It is supremely handy for walking through various GraphQL schemas to find desired information.
A few things I like about GraphiQL:
- Information on queries, mutations, and fields is found in the right-hand panel using the same hierarchy as the API.
- Pressing Ctrl + Space provides a list of available fields based on cursor placement.
- The “play” button sends a live request to the API with user credentials supplied.

The two query fields I am interested in are environments and pipelines. These are both found within the projects field along with a membership argument to limit results to projects of which my account is a member. Additionally, the first: 1 argument for pipelines limits results to the most recent pipeline run.
Once I tweak the query in GraphiQL to provide the desired results, I save the query into a Python variable named gitlabQuery for later usage.
gitlabQuery = """
{
  projects(membership: true) {
    nodes {
      name
      fullPath
      environments {
        nodes {
          name
          state
        }
      }
      pipelines(first: 1) {
        nodes {
          status
          duration
          finishedAt
        }
      }
    }
  }
}
"""
Example GraphQL Query Results
What do these results look like? I’ve snipped out one object from the query results as an example of what to expect:
{
  "name": "lab-azure-site-deploy-eus",
  "fullPath": "string",
  "environments": {
    "nodes": [
      {
        "name": "lab",
        "state": "stopped"
      }
    ]
  },
  "pipelines": {
    "nodes": [
      {
        "status": "SUCCESS",
        "duration": 1396,
        "finishedAt": "2020-07-17T06:29:31Z"
      }
    ]
  }
}
This lab-azure-site-deploy-eus project provides Infrastructure as a Service (IaaS) cloud resources in the Azure East US region. It is used for on-demand demonstrations, hence the lab environment name, and is stopped when not in use to save cost. Additionally, the last pipeline run took 1396 seconds (a little over 23 minutes), finished on 17th July 2020, and produced a status result of success.
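As a quick sanity check on that reading, here is a minimal sketch that pulls the same values out of the example object with plain Python. The dictionary is hand-copied from the example result above (minus fullPath), and the summary format is only an illustration, not something the dashboard requires.

```python
from datetime import datetime, timezone

# One project node, hand-copied from the example query result.
project = {
    "name": "lab-azure-site-deploy-eus",
    "environments": {"nodes": [{"name": "lab", "state": "stopped"}]},
    "pipelines": {
        "nodes": [
            {"status": "SUCCESS", "duration": 1396, "finishedAt": "2020-07-17T06:29:31Z"}
        ]
    },
}

pipeline = project["pipelines"]["nodes"][0]

# duration is reported in seconds; split it into minutes and seconds.
minutes, seconds = divmod(pipeline["duration"], 60)

# finishedAt is an ISO 8601 timestamp in UTC with a trailing "Z".
finished = datetime.strptime(
    pipeline["finishedAt"], "%Y-%m-%dT%H:%M:%SZ"
).replace(tzinfo=timezone.utc)

name = project["name"]
status = pipeline["status"]
summary = f"{name}: {status} in {minutes}m {seconds}s on {finished:%Y-%m-%d}"
print(summary)  # lab-azure-site-deploy-eus: SUCCESS in 23m 16s on 2020-07-17
```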
Armed with a working GraphQL query, it is time to switch from GraphiQL to Python for further scripted queries.
Defining a Python Function for GraphQL Queries
I use a simple Python function named run_query to send a request to an API. I found a slightly different version of this function on GitHub and altered it to suit my needs (kudos to Andrew Mulholland).
The function accepts the URI address, query (as defined earlier in this post), a desired status code, and the authentication header. If the desired status code is not returned, the function throws an exception.
import requests

def run_query(uri, query, statusCode, headers):
    # Send the GraphQL query as a JSON body in an HTTP POST request.
    request = requests.post(uri, json={'query': query}, headers=headers)
    if request.status_code == statusCode:
        return request.json()
    else:
        raise Exception(f"Unexpected status code returned: {request.status_code}")
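One thing run_query does not look at is the body of the response. GraphQL servers commonly answer with the expected status code even when the query itself is rejected, and report the problem in an "errors" list inside the JSON body. The sketch below shows one way to check for that. The extract_data helper is hypothetical (it is not part of the original function), and the two sample bodies are hand-written stand-ins for what request.json() might return.

```python
def extract_data(body):
    """Return the "data" member of a GraphQL response body.

    Hypothetical helper: raises if the server reported any errors.
    """
    errors = body.get("errors")
    if errors:
        messages = "; ".join(e.get("message", "unknown error") for e in errors)
        raise RuntimeError(f"GraphQL query failed: {messages}")
    return body["data"]


# Hand-written stand-ins for parsed API responses.
ok_body = {"data": {"projects": {"nodes": []}}}
bad_body = {"errors": [{"message": "Field 'nope' doesn't exist on type 'Query'"}]}

print(extract_data(ok_body))

try:
    extract_data(bad_body)
except RuntimeError as err:
    print(err)
```

Calling something like extract_data(run_query(...)) would then surface query mistakes right away instead of failing later with a KeyError while walking the results.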
The next step is defining the variables needed for the request. I define the URI, headers, and status code directly in the Python script. I don’t see these values changing very often, if ever. The token, however, is both sensitive and frequently changed. It is stored elsewhere (Vault) and retrieved as an environment variable when invoked by an AWS Lambda function.
gitlabURI = 'https://gitlab.com/api/graphql'
gitlabToken = 'string'
gitlabHeaders = {"Authorization": "Bearer " + gitlabToken}
gitlabStatusCode = 200
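For the environment variable route, a minimal sketch might look like the following. The variable name GITLAB_TOKEN and the build_headers helper are assumptions made for illustration; use whatever name your Lambda configuration actually exposes.

```python
import os


def build_headers(env_var="GITLAB_TOKEN"):
    """Build the Authorization header from a token held in an environment variable.

    Hypothetical helper; the default variable name is an assumption.
    """
    token = os.environ.get(env_var)
    if not token:
        # Fail early with a clear message rather than sending "Bearer None".
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {"Authorization": "Bearer " + token}


# Usage, once the variable is set in the runtime environment:
# gitlabHeaders = build_headers()
```

Failing early keeps a missing token from turning into a confusing authentication error from the API.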
Finally, I execute the query while passing along all required information. The results are returned to a variable named result.
result = run_query(gitlabURI, gitlabQuery, gitlabStatusCode, gitlabHeaders)
Working with the Results
The result variable contains the value of return request.json(). This is JSON-formatted content from the API request. A structured object is fairly easy to parse when compared to scraping a large string payload! 🙂
As an example, these rudimentary for statements will walk the JSON payload and store information in a format that is compatible with InfluxDB’s line protocol.
payloadWorkflow = []
for n in result["data"]["projects"]["nodes"]:
    for v in n["pipelines"]["nodes"]:
        if v["status"] is not None:
            name = n["name"]
            status = v["status"]
            duration = v["duration"]
            finishedAt = v["finishedAt"]
            payloadWorkflow.append(f"workflow,project={name},duration={duration},finishedAt={finishedAt} status=\"{status}\"")
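The same approach could cover the environments half of the query. The sketch below is one possible shape, not the layout used by the dashboard: the measurement name environment and the tag keys are assumptions. It also escapes commas, equals signs, and spaces in tag values, which line protocol requires, in case a project or environment name contains any of them.

```python
def escape_tag(value):
    """Backslash-escape the characters that are special in line protocol tag values."""
    for char in (",", "=", " "):
        value = value.replace(char, "\\" + char)
    return value


def environment_lines(result):
    """Translate environment states into line protocol strings.

    Hypothetical helper; the measurement and tag names are illustrative.
    """
    lines = []
    for project in result["data"]["projects"]["nodes"]:
        for env in project["environments"]["nodes"]:
            project_tag = escape_tag(project["name"])
            env_tag = escape_tag(env["name"])
            state = env["state"]
            lines.append(f'environment,project={project_tag},name={env_tag} state="{state}"')
    return lines


# Hand-built sample in the same shape as the query results.
sample = {
    "data": {
        "projects": {
            "nodes": [
                {
                    "name": "lab-azure-site-deploy-eus",
                    "environments": {"nodes": [{"name": "lab", "state": "stopped"}]},
                }
            ]
        }
    }
}

payloadEnvironment = environment_lines(sample)
print(payloadEnvironment[0])
# environment,project=lab-azure-site-deploy-eus,name=lab state="stopped"
```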
Once the payload has been translated, it is sent over to InfluxDB by way of the influxdb-client for Python. Sensitive information has been replaced with the word string.
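from influxdb_client import InfluxDBClient
from influxdb_client.client.write_api import SYNCHRONOUS

influxToken = "string"
influxOrg = "string"
influxBucket = "string"
influxClient = InfluxDBClient(url="https://us-west-2-1.aws.cloud2.influxdata.com", token=influxToken)
# Write synchronously so the call blocks until the points are sent.
write_api = influxClient.write_api(write_options=SYNCHRONOUS)
write_api.write(influxBucket, influxOrg, payloadWorkflow)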
This InfluxDB table displays the stored results.

The steps up to this point should provide enough information to successfully query the GraphQL API of choice using Python. What is ultimately done with that information will be driven by the use case being solved.
Next Steps
Please accept a crisp high five for reaching this point in the post!
If you’d like to learn more about APIs, or other modern technology approaches, head over to the Guided Learning page.