Tue, May 14, 2019

Data Layer in Google Tag Manager: An Advanced Guide

dataLayer and the data layer. When I first started with web analytics, I couldn’t figure out whether these were the same thing or different. Every source claimed something different, and I hadn't yet learned how to read Google's documentation :)

In any case, I can divide my entire work with GTM into two large periods: BEFORE I understood what the data layer is, and AFTER. Although the data layer is not the simplest thing to grasp, in this article I’ll try to present the information as clearly as possible.

dataLayer and the data layer – what’s the difference?
The data layer – a two-sided view
Pushing data into dataLayer
Interacting with the dataLayer in Google Tag Manager

dataLayer and the data layer – so what’s the difference?

The first thing to address is what exactly the data layer and dataLayer are, and how these two concepts differ.

The data layer is a data structure that ideally contains all the data you want to process and transfer from your website or another resource to other services. Examples of such services include analytics platforms like Google Analytics 4. And an example of the data we send might be transaction details: transaction ID, transaction revenue, shipping cost.

There are several reasons why we use such an intermediary environment:

Data is not always available by default on the page or in the markup. Very often, the thank-you page shows only the order number, while additional data such as the transaction amount and the list of purchased items is not displayed. But this is exactly the kind of data we want to see in our analytics system and use for more precise advertising.

24.1 Order information on the thank you page

The data is on the page/markup, but it’s difficult to collect and format properly. Even if the data is there, you’ll need at least basic coding skills to extract it.

24.2 Getting data from the page using jQuery

The data on the page may change. Even if you know how to grab the right data using a small code snippet, remember that information on the page can change. The slightest rearrangement of elements can break your analytics system and cause data loss. Are you willing to take that risk?

If not, let’s continue exploring the data layer and dataLayer. We've already talked a bit about the first, now let’s move on to the second.

Put simply, you can think of dataLayer as the name of the data layer in Google Tag Manager. When you install the GTM code on your site, it already includes the creation of an array with this name.

24.3 Adding a dataLayer array when installing the GTM code on the website

Of course, you can rename it if you want, but this is rarely done.

In other words, dataLayer is a specific implementation of the data layer within Google’s tag manager. Throughout this article, I will deliberately use both terms (data layer and dataLayer), and I mean the same thing — the data layer in GTM.

The data layer – a two-sided view

The difficulty in working with dataLayer is that you need to look at this entity from two sides — business and programming:

Business side: dataLayer is a data structure for storing, processing, and transferring important business information about your website’s context to other programs or systems.
Programming side: dataLayer is a JavaScript array of objects, and each object stores data as key-value pairs. The key is the name of the variable (a string), and the value can be any valid JavaScript data type.
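
From the programming side, that structure can be pictured roughly as below. This is only a minimal sketch: the keys (`pageType`, `userStatus`, `cartItems`, `categories`) and their values are hypothetical and not something GTM requires.

```javascript
// A dataLayer is a plain JavaScript array; each element is an object of key-value pairs.
// All keys and values below are hypothetical, chosen only for illustration.
const dataLayer = [
  { pageType: 'product' },                    // string value
  { userStatus: 'logged-in', cartItems: 2 },  // several keys in one object, number value
  { categories: ['shoes', 'sale'] }           // array value
];

// Keys are strings; values keep their JavaScript types.
console.log(Array.isArray(dataLayer));       // true
console.log(typeof dataLayer[1].cartItems);  // "number"
console.log(Object.keys(dataLayer[1]));      // ['userStatus', 'cartItems']
```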

Marketers usually don’t want to dive into programming and its limitations, while developers typically just code and don’t want to get into the business side. This very situation is the root of most issues.

Pushing data into dataLayer

Since dataLayer is an array of objects that hold key-value pairs, that’s exactly the form in which we need to pass data in. To create the array with some initial data, you can use the following command:

javascript

<script>
  // Creates dataLayer from scratch with one entry; earlier entries, if any, are lost.
  dataLayer = [{ 'varName': 'varVal' }];
</script>

This command assigns the value varVal to the variable varName. Sounds simple enough, but this method of pushing data may have side effects. The code initializes the array from scratch, so if there were previous entries in the dataLayer, they’ll be overwritten.

To avoid this, if you already have a dataLayer array on the page, use the command below — it allows you to append new values:

javascript

<script>
  // Appends one entry; requires that dataLayer already exists on the page.
  dataLayer.push({ 'varName': 'varVal' });
</script>

However, this method has its own downside: if the dataLayer array hasn’t been created yet at the time the code runs, the call throws an error and the information won’t be saved.

To summarize, use the first method before the GTM code loads, and the second after. Or use this alternative that works in both cases:

javascript

<script>
  // Reuse the existing array if there is one; otherwise start with an empty array.
  window.dataLayer = window.dataLayer || [];
  window.dataLayer.push({ 'varName': 'varVal' });
</script>

This code first checks whether the dataLayer exists. If it does, it appends to it; if not, it initializes it and then appends.
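
The same check can be wrapped in a small helper so that every script on the site pushes data the same way. This is a sketch, not part of GTM: `pushToDataLayer` is a hypothetical name, and `globalThis` stands in for `window` so the snippet also runs outside a browser.

```javascript
// Hypothetical helper: creates dataLayer if it is missing, then appends to it.
// globalThis equals window in a browser; it is used here so the sketch also runs in Node.
function pushToDataLayer(entry) {
  globalThis.dataLayer = globalThis.dataLayer || [];
  globalThis.dataLayer.push(entry);
  return globalThis.dataLayer.length; // number of entries after the push
}

// First call: the array does not exist yet, so it is created.
pushToDataLayer({ varName: 'varVal' });
// Second call: the array exists, so the entry is appended and nothing is overwritten.
pushToDataLayer({ anotherVar: 'anotherVal' });

console.log(globalThis.dataLayer.length); // 2
```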

In practice, the amount of data passed into the data layer is often much greater than a single key-value pair. For example, transaction data, which we mentioned earlier, would look something like this:

24.4 Sending transaction information to the dataLayer
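
In code, such a push might look like the sketch below. The key names (`transactionId`, `transactionTotal`, `transactionShipping`, `transactionProducts`), the event name, and all values are assumptions for illustration; the exact names depend on what the tags and variables in your GTM container are configured to read.

```javascript
// globalThis equals window in a browser; used so the sketch also runs in Node.
globalThis.dataLayer = globalThis.dataLayer || [];

// All key names and values here are illustrative assumptions.
globalThis.dataLayer.push({
  event: 'transaction',          // custom event name a GTM trigger could listen for
  transactionId: 'T-1001',       // order number
  transactionTotal: 59.98,       // revenue
  transactionShipping: 4.99,     // shipping cost
  transactionProducts: [         // purchased items
    { sku: 'SKU-1', name: 'Blue T-shirt', price: 19.99, quantity: 2 },
    { sku: 'SKU-2', name: 'Cap', price: 20.00, quantity: 1 }
  ]
});
```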

Interacting with the dataLayer in Google Tag Manager

GTM relies heavily on interacting with the dataLayer. It’s this object that allows GTM to fire tags asynchronously. The algorithm works like this:

Information about an event is pushed to the dataLayer;
The event is detected using the event variable;
A trigger is configured based on the event;
When the trigger condition is met, a tag is fired.

No one waits for the tag to finish executing. If new data is pushed to the dataLayer, a new cycle begins in parallel. This is very similar to a queue working on a “first in – first out” principle. And thanks to this, when using the data layer, you can be sure that the required data will be available as soon as it’s needed. To add something to the queue, use the dataLayer.push method. You can use it to pass simple data or declare events.
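
The four steps and the queue behaviour described above can be imitated with a toy model. This is only a sketch of the idea, not how GTM is implemented: `queue`, `triggers`, `processQueue`, and the event names are all hypothetical.

```javascript
// Toy model of "push -> detect event -> match trigger -> fire tag".
// Not GTM's real implementation; all names are hypothetical.
const queue = [];          // plays the role of dataLayer
const firedTags = [];      // log of tags that fired, in order

// Each trigger maps an event name to the tag it should fire.
const triggers = {
  purchase: 'Purchase tag',
  signup: 'Signup tag'
};

function processQueue() {
  // First in, first out: always take the oldest message first.
  while (queue.length > 0) {
    const message = queue.shift();
    const tag = triggers[message.event]; // steps 2-3: read the event key, look up a trigger
    if (tag) {
      firedTags.push(tag);               // step 4: the tag fires
    }
  }
}

queue.push({ event: 'signup' });   // step 1: event information is pushed
queue.push({ plan: 'pro' });       // plain data, no event key: no tag fires
queue.push({ event: 'purchase' });

processQueue();
console.log(firedTags); // ['Signup tag', 'Purchase tag']
```

Note how the entry without an `event` key is consumed but fires nothing, which is why it helps to declare an event whenever a tag should react to the data.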

It’s important to understand that declaring a data layer is not mandatory. You can technically get values from the page using other methods. However, you cannot use events without a data layer. Therefore, it's recommended to not only pass data but also declare an event. A good example is the transaction event shown earlier.

Let’s look at an example:

24.5 Data sent to the dataLayer on the Window Loaded page load event

We have three events related to user or browser actions:

Window Loaded — the full page load event. At this point, we only see the event name and a unique event ID.
Click — the event of clicking an element. In this case, we see a different event name and lots of technical data about the click: element ID, class, etc.

24.6 Data sent to the dataLayer on the element click event

Main Funnel: a custom event sent at a moment we’ve defined ourselves (in our case, a click on the “create project” button). Here we only send information about the specific action and the ID of the user who performed it.

24.7 Data sent to the dataLayer during a custom event
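
A push like the one in this custom event could be produced by code along these lines. The sketch assumes an event named `Main Funnel` with the keys `action` and `UID`; the function name `trackMainFunnel` and the sample values are hypothetical.

```javascript
// globalThis equals window in a browser; used so the sketch also runs in Node.
globalThis.dataLayer = globalThis.dataLayer || [];

// Hypothetical helper that reports one funnel step for one user.
function trackMainFunnel(action, uid) {
  globalThis.dataLayer.push({
    event: 'Main Funnel', // event name a GTM trigger can be built on
    action: action,       // what the user did (key name is an assumption)
    UID: uid              // ID of the user who did it
  });
}

trackMainFunnel('create project', 'user-42');
console.log(globalThis.dataLayer[globalThis.dataLayer.length - 1].UID); // "user-42"
```

In a browser, the call would typically sit inside a click listener attached to the “create project” button, with the real user ID passed in place of the sample value.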

As we can see, at the moment of each event (and by the way, these events are used to configure triggers), we can only interact with certain pieces of information.

To access the data available in the data layer, you can use a variable of the “Data Layer Variable” type, where the variable name should match the key name. For example, here’s how we can retrieve the UID value that we passed in our earlier custom event:

24.8 Setting up a Data Layer variable to retrieve the User ID
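
Conceptually, a Data Layer Variable gives you the most recent value pushed under its key. The function below is a simplified sketch of that idea, assuming flat keys only; `readDataLayerVariable` is a hypothetical name, and GTM's own lookup is more involved than this.

```javascript
// Simplified sketch: return the latest value stored under `key`, or undefined.
// Assumes flat keys; nested objects and GTM's merge rules are not modelled.
function readDataLayerVariable(dataLayer, key) {
  for (let i = dataLayer.length - 1; i >= 0; i--) {
    const entry = dataLayer[i];
    if (Object.prototype.hasOwnProperty.call(entry, key)) {
      return entry[key];
    }
  }
  return undefined;
}

// Hypothetical sample data: two pushes, the second one overrides UID.
const sampleLayer = [
  { UID: 'user-41' },
  { event: 'Main Funnel', action: 'create project', UID: 'user-42' }
];

console.log(readDataLayerVariable(sampleLayer, 'UID'));     // "user-42"
console.log(readDataLayerVariable(sampleLayer, 'missing')); // undefined
```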

As for the second part of the variable’s configuration — Data Layer Version — you’ll typically use version 2, as version 1 is used much less often. More on the differences between the versions can be found in this article.

Instead of Conclusion

At the beginning of the article, I said that the data layer is a data structure that ideally contains all the data you want to process and transmit from your website. A good practice is to consolidate all necessary data into this structure. A good example of this kind of unification: you need to set up dynamic remarketing in Google Ads, Enhanced Ecommerce in Google Analytics 4, and dynamic remarketing in Facebook. The clumsy solution would be to write three separate technical specs and pass the required data to each system individually. But there’s a smarter way: push all the data into the data layer once and then distribute it to the necessary platforms.

Check out these code snippets: they’re very similar to one another. Take the product detail view event as an example:

Google Ads dynamic remarketing code:

24.9 Google Ads dynamic remarketing code for the product detail view event

Enhanced Ecommerce in Google Analytics 4:

24.10 Enhanced Ecommerce Google Analytics 4 code for the product detail view event

Facebook dynamic remarketing code:

24.11 Facebook dynamic remarketing code for the product detail view event

With the help of the data layer, you can gather all this information into a single array and save both developer time and your own nerves. Here’s an example of such an array:

24.12 Data array in the dataLayer for the product detail view event
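
To make the one-source, many-destinations idea concrete, here is a sketch: one product-detail entry, and three small functions that reshape it for each platform. Everything in it is an assumption made for illustration, including the entry's key names, the function names, and the output shapes, which only loosely follow parameter names these platforms commonly use (such as `view_item` and `ViewContent`). Check each platform's current documentation before relying on them.

```javascript
// One hypothetical data layer entry for a product detail view.
const productView = {
  event: 'productDetail',
  product: { id: 'SKU-1', name: 'Blue T-shirt', price: 19.99, currency: 'USD' }
};

// Each function reshapes the same entry for one platform (shapes are illustrative).
function toGoogleAds(entry) {
  return { event: 'view_item', value: entry.product.price,
           items: [{ id: entry.product.id, google_business_vertical: 'retail' }] };
}

function toGoogleAnalytics(entry) {
  return { event: 'view_item', currency: entry.product.currency, value: entry.product.price,
           items: [{ item_id: entry.product.id, item_name: entry.product.name,
                     price: entry.product.price }] };
}

function toFacebook(entry) {
  return { event: 'ViewContent', content_ids: [entry.product.id], content_type: 'product',
           value: entry.product.price, currency: entry.product.currency };
}

console.log(toGoogleAds(productView).items[0].id);            // "SKU-1"
console.log(toGoogleAnalytics(productView).items[0].item_id); // "SKU-1"
console.log(toFacebook(productView).content_ids[0]);          // "SKU-1"
```

In GTM itself the reshaping is usually done with Data Layer Variables feeding each tag rather than with hand-written functions; the point is that the developer pushes the product data only once.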

I’m sure that after reading this article, you still have some questions. Feel free to ask them in the comments.

Author

Maks Hapchuk

Web Analyst, Marketer