Creating Google Analytics Alerts Based On Calculated Metrics

They told us it was impossible. They told us it was madness. They were wrong.

Full source code can be found at: https://github.com/Risgn94/CalculatedMetrics

Background

There comes a point when you need to automate yet another manual task because there are not enough hours in the day. We believe that most processes can, and definitely should, be automated. The only question is how intelligent, dynamic and autonomous the system should be.

For some time, we have used the calculated metric AdWords Clicks / AdWords Sessions to look for tracking issues. Please read this post from my colleague Lars Larsen for further elaboration.

As we use this metric to make sure our tracking is working, it is crucial to catch problems right away. So the question is: how do we make sure that we catch discrepancies?

Google Analytics does not allow custom alerts on calculated metrics (because they are custom, I guess). To start with, we used a combination of SuperMetrics for Google Sheets and a simple Python script that read the data and sent an email under certain conditions. This process was not really scalable, and we knew a better procedure must exist, but it was the best alternative at the time (trust me, I Googled a lot and found tools that were almost able to do it).

We knew the AdWords and Google Analytics APIs existed, but they had always seemed like overkill for simply reporting two metrics. In the end, though, that was the final solution.

Getting started

After finally admitting to ourselves that we needed the Google Analytics Core Reporting API, a simple search for “getting started google core reporting api” was enough to get the right answer. This led me to this page, which explains in detail how to set up a simple API call for an installed application: https://developers.google.com/analytics/devguides/reporting/core/v3/quickstart/installed-py

The first few steps are easy to follow if you know a little bit of Python. Be aware that I am using Python 3.x and not 2.x as shown in the quick start.

Assuming you have:

  • Enabled the Google Analytics Core Reporting API
  • Downloaded JSON credentials
  • Installed Client Library

Let’s move on.

Now you can copy the source code from the URL mentioned above into your favourite Python editor (mine is PyCharm). The first thing we need to change is a library import. After some googling I found that one of the libraries used has been renamed. Therefore, at line 5, change:

from apiclient.discovery import build

to:

from googleapiclient.discovery import build  # new "apiclient"
from datetime import timedelta, datetime  # explained later

Your imports now need to look like this:

import argparse
from googleapiclient.discovery import build
import httplib2
from oauth2client import client
from oauth2client import file
from oauth2client import tools
from datetime import timedelta, datetime

If you are using Python 3.x, you need to change a couple of “print” statements into function calls, now located at lines 95, 96 and 99. If you cannot find them, running the script will definitely tell you where they are located.
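For illustration, the change amounts to wrapping the printed expression in parentheses. The strings and variable names below are stand-ins and may not match the quick start's lines exactly:

```python
# Hypothetical values standing in for what the quick start reads from the API.
profile_name = "All Web Site Data"
total_sessions = "1234"

# Python 2 statement form (a SyntaxError in Python 3):
#     print 'View (Profile): %s' % profile_name
# Python 3 function form:
print('View (Profile): %s' % profile_name)
print('Total Sessions: %s' % total_sessions)
```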

Running the script will now generate an “analytics.dat” file, which tells Google that your script is authorized to use the data associated with your account. The console will print a link for authorizing the script.

If your browser does not automatically open a new window like the one shown below, click the link highlighted in the console. In the window, you just need to choose the account associated with the view you want to get data from:

Now you have generated the analytics.dat file, which sits in the project folder with your other files. If you only want the sessions for the last 7 days, we are done. But I guess that is not the case, so now we are going to modify the script. If you are not familiar with Python or programming, hold on tight!

Modifying the script

Since I use the script for multiple accounts, the first thing I would like to do is turn it into a class and convert the API call into a method of this class, with the client secrets path stored as an instance variable.

First of all, insert the following code in HelloAnalytics.py:

class init:
    def __init__(self):
        self.SCOPES = ['https://www.googleapis.com/auth/analytics.readonly']
        self.DISCOVERY_URI = ('https://analyticsreporting.googleapis.com/$discovery/rest')
        self.CLIENT_SECRETS_PATH = 'client_secrets.json'  # Path to client_secrets.json file.

        # Initialize the analyticsreporting service object. The authorized
        # service object ends up in self._analytics.

        # Parse command-line arguments.
        parser = argparse.ArgumentParser(
            formatter_class=argparse.RawDescriptionHelpFormatter,
            parents=[tools.argparser])
        flags = parser.parse_args([])

        # Set up a Flow object to be used if we need to authenticate.
        flow = client.flow_from_clientsecrets(
            self.CLIENT_SECRETS_PATH, scope=self.SCOPES,
            message=tools.message_if_missing(self.CLIENT_SECRETS_PATH))

        # Prepare credentials, and authorize HTTP object with them.
        # If the credentials don't exist or are invalid run through the native client
        # flow. The Storage object will ensure that if successful the good
        # credentials will get written back to a file.
        storage = file.Storage('analyticsreporting.dat')
        credentials = storage.get()
        if credentials is None or credentials.invalid:
            credentials = tools.run_flow(flow, storage, flags)
        http = credentials.authorize(http=httplib2.Http())

        # Build the service object.
        analytics = build('analytics', 'v4', http=http, discoveryServiceUrl=self.DISCOVERY_URI)
        self._analytics = analytics

The first three lines of the method store the settings for later use, and the rest of the code prepares an object from which we are able to make calls to the Google Analytics Core Reporting API. The final line:

self._analytics = analytics

is the object from which we do this.

You can now remove all the rest of the code, so that we only have our imports and our init class.

Next, we need to create two functions: one for requesting the data, and one for retrieving the data from the call. First, let us create our API call function:

    def get_sessions_30_days_total(self, view_Id):
        return self.return_response(self._analytics.reports().batchGet(
            body={
                'reportRequests': [
                    {
                        'viewId': view_Id,
                        'dateRanges': [{'startDate': '30daysAgo', 'endDate': 'yesterday'}],
                        'metrics': [
                            {'expression': 'ga:sessions'},
                            {'expression': 'ga:adClicks'}
                        ],
                        "metricFilterClauses": [{
                            "filters": [{
                                "metricName": "ga:sessions",
                                "operator": "GREATER_THAN",
                                "comparisonValue": "0"
                            }]
                        }],
                        "orderBys": [
                            {"fieldName": "ga:sessions", "sortOrder": "DESCENDING"},
                        ],
                        'dimensions': [
                            {"name": "ga:sourceMedium"},
                            {"name": "ga:date"},
                            {"name": "ga:adDistributionNetwork"}
                        ]
                    }
                ]
            }
        ).execute())

This function takes one parameter, namely the view ID from which we want to extract the data, and performs a request. Explaining the structure of the call itself would take up a lot of space, so I recommend reading the documentation at: https://developers.google.com/analytics/devguides/reporting/core/v4/basics. In short: I get the sessions and clicks for the given view ID, from 30 days ago to yesterday, keeping only rows with more than zero sessions, ordered by sessions descending, with the dimensions sourceMedium, date and adDistributionNetwork.
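If you want to reuse the same request with other date ranges, one option is to build the body in a small helper. This is only a sketch: build_report_request and its start_date and end_date parameters are hypothetical names that are not part of the original script, and the helper only builds the dictionary without calling the API.

```python
def build_report_request(view_id, start_date='30daysAgo', end_date='yesterday'):
    """Return a batchGet body mirroring the request described above.

    Hypothetical helper for illustration; it does not contact the API.
    """
    return {
        'reportRequests': [{
            'viewId': view_id,
            'dateRanges': [{'startDate': start_date, 'endDate': end_date}],
            'metrics': [
                {'expression': 'ga:sessions'},
                {'expression': 'ga:adClicks'},
            ],
            # Keep only rows with more than zero sessions.
            'metricFilterClauses': [{'filters': [{
                'metricName': 'ga:sessions',
                'operator': 'GREATER_THAN',
                'comparisonValue': '0',
            }]}],
            'orderBys': [{'fieldName': 'ga:sessions', 'sortOrder': 'DESCENDING'}],
            'dimensions': [
                {'name': 'ga:sourceMedium'},
                {'name': 'ga:date'},
                {'name': 'ga:adDistributionNetwork'},
            ],
        }]
    }


body = build_report_request('YOUR_VIEW_ID', start_date='7daysAgo')
print(body['reportRequests'][0]['dateRanges'])
```

The resulting dictionary could then be passed as the body argument of batchGet in place of the inline literal.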

Next, we need to implement our response sorting function. It basically reorders the response in a way that makes the data easier to manage:

    def return_response(self, response):
        """Parses the Analytics Reporting API V4 response into a flat list."""
        return_Values = []
        for report in response.get('reports', []):
            columnHeader = report.get('columnHeader', {})
            metricHeaders = columnHeader.get('metricHeader', {}).get('metricHeaderEntries', [])
            rows = report.get('data', {}).get('rows', [])

            for row in rows:
                dimensions = row.get('dimensions', [])
                dateRangeValues = row.get('metrics', [])

                # One entry per metric value, tagged with the row's dimensions.
                for values in dateRangeValues:
                    for metricHeader, value in zip(metricHeaders, values.get('values')):
                        name = metricHeader.get('name')
                        return_Values.append({'dimension': dimensions, 'name': name, 'value': value})

        return return_Values
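To illustrate the shape of the list this produces, here is a standalone sketch of the same flattening logic applied to a made-up response. The function name flatten_report and all values in sample_response (the date and both numbers) are invented for illustration and are not real data.

```python
def flatten_report(response):
    """Standalone version of the return_response logic, without the class."""
    flat = []
    for report in response.get('reports', []):
        headers = (report.get('columnHeader', {})
                         .get('metricHeader', {})
                         .get('metricHeaderEntries', []))
        for row in report.get('data', {}).get('rows', []):
            dimensions = row.get('dimensions', [])
            for date_range in row.get('metrics', []):
                # Pair each metric header with its value for this row.
                for header, value in zip(headers, date_range.get('values')):
                    flat.append({'dimension': dimensions,
                                 'name': header.get('name'),
                                 'value': value})
    return flat


# Made-up response with a single row; real responses contain many rows.
sample_response = {
    'reports': [{
        'columnHeader': {
            'dimensions': ['ga:sourceMedium', 'ga:date', 'ga:adDistributionNetwork'],
            'metricHeader': {'metricHeaderEntries': [
                {'name': 'ga:sessions', 'type': 'INTEGER'},
                {'name': 'ga:adClicks', 'type': 'INTEGER'},
            ]},
        },
        'data': {'rows': [{
            'dimensions': ['google / cpc', '20180101', 'Google Search'],
            'metrics': [{'values': ['120', '115']}],
        }]},
    }]
}

for entry in flatten_report(sample_response):
    print(entry)
```

Each row of the report becomes one dictionary per metric, so the single row above yields two entries: one for ga:sessions and one for ga:adClicks, both carrying the same dimension list.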

Now our HelloAnalytics.py class is done. Next, we need to create some extra functionality to further sort our data.

Be aware that we are only looking at Search Network data in this example. If you want all CPC channels, remove this line from functions.py (and read from source_Dict instead of network_Dict in the lines that follow it):

network_Dict = [x for x in source_Dict if x['dimension'][2] != 'Content']

Let's create a file named “functions.py” with the following code:

from datetime import date, timedelta


def getYesterday():
    # ga:date values are formatted as YYYYMMDD.
    yesterday = date.today() - timedelta(1)
    date_String = yesterday.strftime('%Y%m%d')
    return date_String


def sortAdwClicksSessions(json_Data):
    # Keep only yesterday's google / cpc rows outside the Content network.
    yesterday = getYesterday()
    data_Dict = [x for x in json_Data if x['dimension'][1] == yesterday]
    source_Dict = [x for x in data_Dict if x['dimension'][0] == 'google / cpc']
    network_Dict = [x for x in source_Dict if x['dimension'][2] != 'Content']
    adw_Clicks_Arr = [x['value'] for x in network_Dict if x['name'] == 'ga:adClicks']
    adw_Sessions_Arr = [x['value'] for x in network_Dict if x['name'] == 'ga:sessions']
    # Sum the values; an empty list gives 0.0, so no error handling is needed.
    adw_Clicks = sum((float(value) for value in adw_Clicks_Arr), 0.0)
    adw_Sessions = sum((float(value) for value in adw_Sessions_Arr), 0.0)
    return {"adw_Clicks": adw_Clicks, "adw_Sessions": adw_Sessions}


def sortSessions30Total(json_Data):
    # Total google / organic sessions over the whole date range.
    source_Dict = [x for x in json_Data if x['dimension'][0] == 'google / organic']
    sessions_Dict = [x for x in source_Dict if x['name'] == 'ga:sessions']
    total_Sessions = 0
    for values in sessions_Dict:
        total_Sessions = total_Sessions + int(values['value'])
    return float(total_Sessions)


def sortSessionsYesterday(json_Data):
    # Sessions from the first google / organic row for yesterday, or 0 if none.
    yesterday = getYesterday()
    source_Dict = [x for x in json_Data if x['dimension'][0] == 'google / organic']
    data_Dict = [x for x in source_Dict if x['dimension'][1] == yesterday]
    sessions_Dict = [x for x in data_Dict if x['name'] == 'ga:sessions']
    try:
        return float(sessions_Dict[0]['value'])
    except IndexError:
        return 0

The last file we need to create is our “main.py” containing the following code:

import HelloAnalytics as HA
import functions as f


def main():
    view_Id = 'YOUR_VIEW_ID'
    data_Object = HA.init()
    data_Response = data_Object.get_sessions_30_days_total(view_Id)

    adw = f.sortAdwClicksSessions(data_Response)
    adw_Sessions = adw['adw_Sessions']
    adw_Clicks = adw['adw_Clicks']

    print("AdWords Clicks vs. Sessions for yesterday:")
    print("Sessions: ", adw_Sessions)
    print("Clicks: ", adw_Clicks)


if __name__ == "__main__":
    main()

That’s it! Running main.py should result in the following (with your data, of course):

and just to check with Google Analytics:

To turn this into a full alert, you need to run it on a continuously running service. We used an Ubuntu droplet at digitalocean.com. You might also want to notify users by email, which can be done by following this tutorial: http://naelshiab.com/tutorial-send-email-python/
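As a rough sketch of what the alert step could look like, the snippet below computes AdWords Clicks / AdWords Sessions from the two numbers that main.py prints and builds an email when the ratio leaves an accepted range. The function names, the 0.8 and 1.2 thresholds, the addresses and the SMTP host are all assumptions made for illustration; they are not taken from our production setup. Treating "no sessions" as an alert is also just one possible choice.

```python
from email.message import EmailMessage


def clicks_per_session(adw_clicks, adw_sessions):
    """Return AdWords Clicks / AdWords Sessions, or None if there are no sessions."""
    if adw_sessions == 0:
        return None
    return adw_clicks / adw_sessions


def build_alert(adw_clicks, adw_sessions, lower=0.8, upper=1.2,
                sender='alerts@example.com', recipient='you@example.com'):
    """Return an EmailMessage if the ratio is outside [lower, upper], else None."""
    ratio = clicks_per_session(adw_clicks, adw_sessions)
    if ratio is not None and lower <= ratio <= upper:
        return None  # Ratio looks normal, nothing to report.
    message = EmailMessage()
    message['Subject'] = 'AdWords tracking alert'
    message['From'] = sender
    message['To'] = recipient
    shown = 'undefined (no sessions)' if ratio is None else '%.2f' % ratio
    message.set_content('Clicks: %s\nSessions: %s\nClicks / Sessions: %s'
                        % (adw_clicks, adw_sessions, shown))
    return message


alert = build_alert(adw_clicks=100.0, adw_sessions=40.0)
if alert is not None:
    print(alert['Subject'])
    # To actually send it (host, port and login are placeholders):
    # import smtplib
    # with smtplib.SMTP('smtp.example.com', 587) as server:
    #     server.starttls()
    #     server.login('user', 'password')
    #     server.send_message(alert)
```

In main.py, the values returned by sortAdwClicksSessions could be passed straight into build_alert, with the thresholds tuned to whatever ratio is normal for your account.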

If you need any help, feel free to contact me.

If you like this post, please show your support, and let me know in the comments if something is missing.

Best Regards

Asger Thyregod — AdNudging.com