Forecasting web traffic using Google Analytics and Facebook Prophet

Posted On 16 Sep 2022

Learn how you can predict traffic changes and forecast when periods of stagnation or negative growth are to be expected.

Seriously.

This article will show you how you can:

  • Predict traffic changes, and let your boss know ahead of time when to expect periods of stagnation or negative growth.
  • Know what to expect during periods of increased or decreased traffic, so you can tell whether a decline is in line with predictions or whether something is going wrong and traffic is falling more than it should.
  • Include a graph of what’s coming in your updates to your boss or client, so they know you aren’t just making excuses after the fact.

Want to skip the info and just click a few buttons?

While we’ll be going through running the code to forecast your web traffic and what each of the sections does, you can skip this and jump right to the Colab here if you aren’t interested in knowing what’s going on and how to make adjustments.

For those who want to run the code locally and be able to edit the hyperparameters (a fancy name for the settings you choose before a run, which control how the model behaves and generally keep one value for the complete run), let’s go!

Important note before you begin: The further ahead you ask it to predict, the wider the gap between the low and high estimates gets as the model becomes “less sure of itself.”

How to forecast your Google Analytics traffic

We’ll be using two systems to accomplish our goal:

  1. UA Query Explorer: In this example, we’re going to use Universal Analytics for our forecasting. I will adjust the code in the Colab in about a year to GA4, but because it needs a year or more of data to really do the job, using UA, for now, makes the most sense and few people have GA4 data going back more than a year. UA Explorer is a tool that will quickly and easily generate the API URL that will pull our analytics for us.
  2. Facebook Prophet: Prophet is a forecasting model built and open-sourced by Facebook. It includes a lot of great built-in features, such as the ability to import holidays. It’s what’ll turn our analytics data into a forecast.

For those who wish to run locally, you can obviously do so, and the code provided will get the job done.

So, let’s dive in and get you predicting your future traffic!

1. Connect your instance

What this means is you’re “turning on” Google Colab so you can run scripts from it.

2. Import the needed libraries

The next thing we need to do is to import the libraries we need to make all this work.

They are:

  • pandas – a Python library for data manipulation (to help us work with time-series data structures).
  • numpy – needed to work with arrays (like our data and sessions array).
  • matplotlib – we’ll be using this to create some visualizations.
  • json – used to work with JSON data.
  • requests – used to make HTTP requests (like pulling analytics data).
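  • fbprophet – used for time series forecasting (newer releases are published under the name prophet).
  • pystan – the Python interface to Stan, the statistical engine Prophet uses to fit the model and estimate things like the probability of the traffic being X on a date in the future.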

To run it, all you need to do is click the play button.

You’ll see a bunch of downloads start and the play button turn into a spinning icon indicating it’s working. When everything has finished downloading and installing, the play button will reappear.

3. Sign up for Google Analytics demos & tools

You need to log in using the Google account tied to the analytics you want to access.

4. Configure the analytics you’re pulling

Next you need to select the account, property and view you want to pull your traffic data from.

Where it notes to pick a metric, you can pick from many of your traffic metrics depending on what you want to know. Examples might be:

  • Sessions (the one I use most)
  • Visitors
  • Unique visitors
  • Pageviews

Additionally, when you click the “segments” field a list of all the segments for the property (including custom segments) will display so you can select what traffic you want to look at.

After you’ve run the query, copy the API request URL.
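For a sense of what that URL contains, here is a minimal sketch that assembles a similar request by hand. It assumes the Universal Analytics Core Reporting API v3 endpoint; the view ID, dates, and token are placeholders, and `build_query_url` is a hypothetical helper, not something from the Colab.

```python
from urllib.parse import urlencode

# Core Reporting API v3 endpoint (assumed to be what the Query Explorer targets).
BASE_URL = "https://www.googleapis.com/analytics/v3/data/ga"


def build_query_url(view_id, start_date, end_date, access_token,
                    metric="ga:sessions"):
    """Assemble a one-row-per-day request URL. All values are placeholders."""
    params = {
        "ids": f"ga:{view_id}",
        "start-date": start_date,
        "end-date": end_date,
        "metrics": metric,
        "dimensions": "ga:date",  # one row per day
        "access_token": access_token,
    }
    # safe=":" keeps the "ga:" prefixes readable instead of encoding them as %3A.
    return f"{BASE_URL}?{urlencode(params, safe=':')}"


url = build_query_url("12345678", "2020-01-01", "2022-08-31", "YOUR_TOKEN")
print(url)
```

Swapping the `metric` argument (for example to `ga:pageviews`) is the hand-written equivalent of picking a different metric in the tool.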

5. Import analytics into the colab

Click the play button in the next cell:

You will be asked to enter the API query you just copied:

Paste it in and hit “Enter.”

You should be presented with a graph of the traffic over the date range you selected.
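Behind that cell, the response has to be turned into dates and numbers before it can be plotted. The sketch below shows one way to do that, assuming the response carries a `rows` list of `[date, value]` string pairs with dates in `YYYYMMDD` form; the sample payload is made up and `rows_to_daily_counts` is a hypothetical name.

```python
from datetime import datetime


def rows_to_daily_counts(payload):
    """Turn [date, value] string pairs into (date, int) tuples."""
    parsed = []
    for date_str, value_str in payload.get("rows", []):
        day = datetime.strptime(date_str, "%Y%m%d").date()
        parsed.append((day, int(value_str)))
    return parsed


# Made-up payload, trimmed to the single field used here.
sample = {"rows": [["20220101", "120"], ["20220102", "98"], ["20220103", "143"]]}
daily = rows_to_daily_counts(sample)
print(daily[0])
```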

6. Formatting

The next cell just changes the column headings to what Facebook Prophet expects.
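Prophet expects the date column to be named `ds` and the value column to be named `y`. A minimal sketch of the renaming, assuming the incoming columns are called `ga:date` and `ga:sessions` (the Colab's actual names may differ) and using made-up values:

```python
import pandas as pd

# Made-up frame using the column names assumed above.
raw = pd.DataFrame({
    "ga:date": ["20220101", "20220102"],
    "ga:sessions": ["120", "98"],
})

# Rename to the headings Prophet looks for, then fix the types.
df = raw.rename(columns={"ga:date": "ds", "ga:sessions": "y"})
df["ds"] = pd.to_datetime(df["ds"], format="%Y%m%d")
df["y"] = df["y"].astype(int)
print(df.dtypes)
```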

7. (Optional) Save

This step is completely unnecessary if you don’t plan on referencing back to the traffic numbers or forecasted numbers. I personally find it handy, but some won’t.

The first thing you’ll track is simply the traffic numbers (same as you could export).

I promise it gets more interesting.

8. Adding holidays

The next step is to add holidays and to determine how seasonality is considered. There are some options and ways you can tweak things, or you can run it as is.

The decisions you need to make are:

  • What years do you want to pull the holidays for?
  • What country do you want to pull the holidays for?
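To make those two choices concrete, the sketch below builds a small holiday table in the shape Prophet accepts: a `holiday` column and a `ds` column. The two fixed-date holidays are a hypothetical stand-in for a full country calendar, and `build_holidays` is not part of the Colab.

```python
from datetime import date

import pandas as pd

# Hypothetical fixed-date holidays as (month, day); a real run would pull
# the full calendar for the chosen country.
FIXED_HOLIDAYS = {"New Year's Day": (1, 1), "Christmas Day": (12, 25)}


def build_holidays(years):
    """Return one row per holiday per year, with columns `holiday` and `ds`."""
    records = [
        {"holiday": name, "ds": date(year, month, day)}
        for year in years
        for name, (month, day) in FIXED_HOLIDAYS.items()
    ]
    frame = pd.DataFrame(records)
    frame["ds"] = pd.to_datetime(frame["ds"])
    return frame


holidays = build_holidays([2020, 2021, 2022, 2023])
print(holidays.head())
```

A frame like this can be passed to the model as `Prophet(holidays=holidays)`. Prophet also has a built-in shortcut, `add_country_holidays(country_name=...)`, if you would rather not build the table yourself.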

Additionally, you’ll notice the line:

m = Prophet(interval_width=0.95, yearly_seasonality=True, weekly_seasonality=True, daily_seasonality=False, seasonality_mode = "additive", changepoint_range = 0.85)

You can change any of the parameters to suit your needs, though these settings should work decently in most scenarios:

  • interval_width: The width of the uncertainty interval drawn around the forecast. Set to 0.95, the band between the low and high estimates is meant to contain the actual value about 95% of the time. Set it lower and you get a narrower band that real traffic will fall outside of more often. Set it higher and you get a wider, more cautious band. Prophet’s default is 0.80.
  • yearly_seasonality: Models patterns that repeat every year.
  • weekly_seasonality: Models patterns that repeat every week.
  • daily_seasonality: Models patterns that repeat within each day. It only matters when your data is more granular than one point per day.
  • seasonality_mode: Set to either “additive” or “multiplicative”. Additive (the default) keeps the size of each seasonal swing constant in absolute terms: a holiday spike adds roughly the same number of sessions each year, no matter how much your baseline traffic has grown. Multiplicative scales the swing with the trend, so the spike stays a more-or-less steady percentage of baseline traffic and gets larger in absolute terms as the site grows. Use it in scenarios where there are growing surges. For example, in a growing town that sees a bigger seasonal increase each year, not only is there growth, but that growth gets larger with each interval.
  • changepoint_range: Changepoints are points where the traffic trend changes significantly. By default, Prophet only looks for changepoints in the first 80% of the history. Setting this to 0.85 extends that to the first 85%.
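The difference between the two seasonality modes is easiest to see with numbers. This is a simplified sketch with made-up figures, not Prophet's internals: additive adds a fixed amount to the trend, while multiplicative scales the trend by a fraction.

```python
def additive(trend, seasonal_amount):
    """Seasonal effect is a fixed number of sessions added to the trend."""
    return trend + seasonal_amount


def multiplicative(trend, seasonal_fraction):
    """Seasonal effect is a fraction of the trend, so it grows with traffic."""
    return trend * (1 + seasonal_fraction)


# Made-up numbers: baseline daily traffic doubles from one year to the next.
for trend in (1000, 2000):
    print(trend, additive(trend, 200), multiplicative(trend, 0.2))
```

At a baseline of 1,000 both modes give the same spike. At 2,000 the additive spike is still 200 sessions, while the multiplicative spike has grown to 400.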

This is a tip-of-the-iceberg scenario. There are other parameters you can review and apply as you feel so inspired. Details on them are available here.

I’ve set things here to what seems to work well for me in most (but not all) cases.

Yearly and weekly seasonality impact most businesses. Daily, not so much.

9. Crunch the numbers

Thankfully you don’t have to do it.

Simply click the run button.

And you’ll soon see:

Not all the rows or columns are showing. If they were, what you’d see is:

  • The upper end of the range the model considers likely (yhat_upper).
  • The lower end of that range (yhat_lower).
  • The predicted value (yhat).

Importantly, you’ll see “periods=90” in the code above. That is the number of days I’m going to get predictions for.

I’ve found 90 works decently. After that, the range gets pretty large between high and low but can be interesting to look at.
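For readers running locally, the whole step can be sketched as a single function. It assumes a DataFrame with `ds` and `y` columns, reuses the parameter values quoted earlier, and uses Prophet's built-in country holidays with a placeholder country code. The function name and its defaults are hypothetical, and the Colab may organize this differently.

```python
def forecast_sessions(df, periods=90, country="US"):
    """Fit Prophet on a ds/y frame and return the model plus key forecast columns."""
    # Imported here so the function can be defined even where Prophet
    # is not installed yet.
    try:
        from prophet import Prophet      # package name in newer releases
    except ImportError:
        from fbprophet import Prophet    # older name, as listed in this article

    m = Prophet(interval_width=0.95, yearly_seasonality=True,
                weekly_seasonality=True, daily_seasonality=False,
                seasonality_mode="additive", changepoint_range=0.85)
    m.add_country_holidays(country_name=country)  # placeholder country
    m.fit(df)

    # Dates covering the history plus `periods` extra days to predict.
    future = m.make_future_dataframe(periods=periods)
    forecast = m.predict(future)
    return m, forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]
```

Calling `m, forecast = forecast_sessions(df)` and then `forecast.tail()` would show the last few predicted days. Raising `periods` looks further ahead at the cost of a wider gap between `yhat_lower` and `yhat_upper`.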

10. (Optional) Save predictions

This is an optional step for those who would like to save their predicted values, or use them to test against different parameter values (those discussed in step eight above).

Once run, you’ll just click the link:

Which takes you to:

Each time you run it your numbers and results will be stored and can be easily accessed at a future time to compare with different runs.

It will also give you the numbers to reference if you’re ever asked for a predicted value for a specific day.
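If you are running locally, one simple way to get the same kind of run-by-run record is to write each forecast to its own timestamped CSV file. This is only a sketch of that idea: the Colab may store results elsewhere, `save_run` is a hypothetical helper, and the row below is made up.

```python
import csv
import tempfile
from datetime import datetime
from pathlib import Path

FIELDS = ["ds", "yhat", "yhat_lower", "yhat_upper"]


def save_run(rows, out_dir, label="forecast"):
    """Write one run to its own timestamped CSV so runs can be compared later."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    path = Path(out_dir) / f"{label}-{stamp}.csv"
    with path.open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return path


# Made-up forecast row, written to a temporary folder for the example.
rows = [{"ds": "2022-09-17", "yhat": 1040.2, "yhat_lower": 910.5, "yhat_upper": 1175.9}]
saved = save_run(rows, tempfile.mkdtemp())
print(saved.name)
```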

11. The magic

Hit the run button and you get what you’ve likely come here to get.

Optional

I’ve added an extra Insights section. It simply displays the impact of some of the areas we’ve been discussing above.

In the top chart, you can see where the different change points are. Further down, you get insights into how the different seasonal trends are impacting the predictions.

Closing

I’ve always looked for ways to predict in advance what’s coming my way.

It’s always better to show your boss or client that a slowdown is expected a week before it happens rather than try to explain it after the fact.

This insight can also help you plan your strategy.

Your work at peak traffic may look different from your work during a lull. You can look back over your analytics trends month by month and year by year and try to piece it together, or just let machines do what machines do best.
