dataLayer and the data layer. When I first started with web analytics, I couldn’t figure out whether these were the same thing or different. Every source claimed something different, and I hadn't yet learned how to read Google's documentation :)
In any case, I can divide my entire work with GTM into two large periods: BEFORE I understood what the data layer is, and AFTER. Although the data layer is not the simplest thing to grasp, in this article I’ll try to present the information as clearly as possible.
The first thing to address is what exactly the data layer and dataLayer are, and what the difference between these concepts is.
The data layer is a data structure that ideally contains all the data you want to process and transfer from your website or another resource to other services. Examples of such services include analytics platforms like Google Analytics 4. And an example of the data we send might be transaction details: transaction ID, transaction revenue, shipping cost.
There are several reasons why we use such an intermediary environment:
Let’s continue exploring the data layer and dataLayer. We've already talked a bit about the first, so now let’s move on to the second.
Put simply, you can think of dataLayer as the name of the data layer in Google Tag Manager. When you install the GTM code on your site, it already includes the creation of an array with this name.
Of course, you can rename it if you want, but this is rarely done.
In other words, dataLayer is a specific implementation of the data layer within Google’s tag manager. Throughout this article, I will deliberately use both terms (data layer and dataLayer), and I mean the same thing — the data layer in GTM.
The difficulty in working with dataLayer is that you need to look at this entity from two sides — business and programming:
Marketers usually don’t want to dive into programming and its limitations, while developers typically just code and don’t want to get into the business side. This very situation is the root of most issues.
Since dataLayer is an array of objects, the data inside each object is stored as key-value pairs, and that’s exactly how we need to pass it in. To create the array with some data, you can use the following command:
<script>
dataLayer = [{'varName' : 'varVal'}];
</script>
This command assigns the value varVal to the variable varName. Sounds simple enough, but this method of pushing data may have side effects. The code initializes the array from scratch, so if there were previous entries in the dataLayer, they’ll be overwritten.
To avoid this, if you already have a dataLayer array on the page, use the command below — it allows you to append new values:
<script>
dataLayer.push({ 'varName' : 'varVal'});
</script>
However, this method has its own downside: if the dataLayer array hasn’t been created yet at the time the code runs, the call throws a JavaScript error and the information won’t be saved.
To summarize, use the first method before the GTM code loads, and the second after. Or use this alternative that works in both cases:
<script>
window.dataLayer = window.dataLayer || [];
window.dataLayer.push({ 'varName' : 'varVal'});
</script>
This code first checks whether the dataLayer exists. If it does, it appends to it; if not, it initializes it and then appends.
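To see the three approaches side by side, here is a minimal sketch. It assumes a plain object named `page` standing in for the browser's `window` (so it also runs outside a browser), and the helper name `safePush` is hypothetical, not part of GTM.

```javascript
// `page` is a stand-in for the browser's `window` object (an assumption for this sketch).
const page = {};

// Method 1: plain assignment creates the array from scratch.
page.dataLayer = [{ pageType: 'home' }];
page.dataLayer = [{ varName: 'varVal' }]; // the earlier { pageType: 'home' } entry is gone
const afterAssignment = page.dataLayer.length;

// Method 2: push appends, but only works if the array already exists.
const emptyPage = {};
let pushFailed = false;
try {
  emptyPage.dataLayer.push({ varName: 'varVal' });
} catch (error) {
  pushFailed = true; // a TypeError here, because dataLayer is undefined on this object
}

// Method 3: create the array only when it is missing, then append.
// `safePush` is a hypothetical helper name.
function safePush(root, message) {
  root.dataLayer = root.dataLayer || [];
  root.dataLayer.push(message);
  return root.dataLayer;
}

safePush(emptyPage, { varName: 'varVal' });   // creates the array, then appends
safePush(page, { anotherVar: 'anotherVal' }); // keeps the existing entry, then appends
```

After this runs, `emptyPage.dataLayer` holds one entry and `page.dataLayer` holds two, because the third method never discards what is already there.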
In practice, the amount of data passed into the data layer is often much greater than a single key-value pair. For example, transaction data, which we mentioned earlier, would look something like this:
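A rough sketch of such a push, assuming hypothetical key names (`transactionId`, `transactionRevenue`, `shippingCost`), a hypothetical event name `transaction`, and made-up sample values. The actual names depend on what your tags and variables expect.

```javascript
// Fall back to globalThis so the sketch also runs outside a browser.
const root = typeof window !== 'undefined' ? window : globalThis;

root.dataLayer = root.dataLayer || [];
root.dataLayer.push({
  event: 'transaction',     // hypothetical event name
  transactionId: 'T-1001',  // hypothetical keys with sample values
  transactionRevenue: 149.9,
  shippingCost: 9.9
});
```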
GTM relies heavily on interacting with the dataLayer. It’s this object that allows GTM to fire tags asynchronously. The algorithm works like this:
When a message is pushed, GTM checks whether it contains the event variable. No one waits for the tag to finish executing. If new data is pushed to the dataLayer, a new cycle begins in parallel. This is very similar to a queue working on a “first in, first out” principle. Thanks to this, when using the data layer, you can be sure that the required data will be available as soon as it’s needed. To add something to the queue, use the dataLayer.push method. You can use it to pass simple data or declare events.
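The two uses of dataLayer.push can be sketched as follows. The key names (`userType`, `formId`) and the event name `formSubmitted` are made up for illustration, and the loop at the end is only a toy model of first-in, first-out processing, not GTM's actual internals.

```javascript
// Fall back to globalThis so the sketch also runs outside a browser.
const root = typeof window !== 'undefined' ? window : globalThis;
root.dataLayer = root.dataLayer || [];

// 1. Pass simple data: there is no `event` key, so no event is declared.
root.dataLayer.push({ userType: 'registered' });

// 2. Declare an event: the `event` key is what triggers can be built on.
root.dataLayer.push({ event: 'formSubmitted', formId: 'contact' });

// Toy model: walk the messages in the order they were pushed
// and collect the names of the declared events.
const declaredEvents = [];
for (const message of root.dataLayer) {
  if (message && message.event) {
    declaredEvents.push(message.event);
  }
}
```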
It’s important to understand that declaring a data layer is not mandatory. You can technically get values from the page using other methods. However, you cannot use events without a data layer. Therefore, it's recommended to not only pass data but also declare an event. A good example is the transaction event shown earlier.
Let’s look at an example:
We have three events related to user or browser actions:
Window Loaded: the full page load event. At this point, we only see the event name and a unique event ID.
Click: the event of clicking an element. In this case, we see a different event name and lots of technical data about the click: element ID, class, etc.
Main Funnel: a custom event sent at a moment we’ve defined ourselves (in our case, a click on the “create project” button). Here we only send the specific action information and the ID of the user who performed it.
As we can see, at the moment of each event (and by the way, these events are used to configure triggers), we can only interact with certain pieces of information.
To access the data available in the data layer, you can use a variable of the “Data Layer Variable” type, where the variable name should match the key name. For example, here’s how we can retrieve the UID value that we passed in our earlier custom event:
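Conceptually, a Data Layer Variable looks up a key by name and, in the common case, returns the most recent value pushed for it. The function below is a simplified model of that lookup, not GTM's real implementation. The names `readDataLayerValue` and `action`, as well as the sample values, are hypothetical; `UID` and the Main Funnel event come from the example above.

```javascript
// Simplified model: scan from the newest message to the oldest
// and return the first value found under the given key.
function readDataLayerValue(dataLayer, key) {
  for (let i = dataLayer.length - 1; i >= 0; i -= 1) {
    const message = dataLayer[i];
    if (message && Object.prototype.hasOwnProperty.call(message, key)) {
      return message[key];
    }
  }
  return undefined; // the key was never pushed
}

const sampleLayer = [
  { pageType: 'dashboard' }, // hypothetical earlier push
  { event: 'Main Funnel', action: 'create project', UID: 'user-42' } // sample values
];

const uid = readDataLayerValue(sampleLayer, 'UID');     // found under the exact key name
const missing = readDataLayerValue(sampleLayer, 'uid'); // not found: the name must match exactly
```

This is why the variable name in GTM has to match the key exactly, including letter case.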
As for the second part of the variable’s configuration — Data Layer Version — you’ll typically use version 2, as version 1 is used much less often. More on the differences between the versions can be found in this article.
At the beginning of the article, I said that the data layer is a data structure that ideally contains all the data you want to process and transmit from your website. A good practice is to consolidate all necessary data into this structure. A great example of this kind of unification is the following situation: you need to set up dynamic remarketing in Google Ads (formerly AdWords), e-commerce tracking in Google Analytics 4, and dynamic remarketing in Facebook. The clumsy solution would be to write three separate technical specs and pass the required data to each system individually. But there’s a smarter way: push all the data into the data layer once and then distribute it to the necessary platforms.
Check out these code snippets: they’re very similar, for example, for the product detail view event:
With the help of the data layer, you can gather all this information into a single array and save both developer time and your own nerves. Here’s an example of such an array:
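One possible shape for such a unified push is sketched below. All data layer key names (`productId`, `productName`, and so on), the event name, and the sample values are hypothetical. In GTM the distribution itself is done by tags and variables, so the two mapping functions only illustrate how one object can feed several platforms. The target field names follow the commonly documented GA4 `view_item` and Facebook pixel `ViewContent` parameters and should be checked against each platform's current documentation.

```javascript
// Fall back to globalThis so the sketch also runs outside a browser.
const root = typeof window !== 'undefined' ? window : globalThis;
root.dataLayer = root.dataLayer || [];

// One push that carries everything the platforms need (hypothetical keys).
const productView = {
  event: 'productDetailView',
  productId: 'SKU-123',
  productName: 'Blue T-Shirt',
  productPrice: 19.99,
  currency: 'USD'
};
root.dataLayer.push(productView);

// Illustrative mapping to GA4-style `view_item` parameters.
function toGa4ViewItem(data) {
  return {
    currency: data.currency,
    value: data.productPrice,
    items: [
      { item_id: data.productId, item_name: data.productName, price: data.productPrice }
    ]
  };
}

// Illustrative mapping to Facebook pixel `ViewContent` parameters.
function toFacebookViewContent(data) {
  return {
    content_ids: [data.productId],
    content_type: 'product',
    value: data.productPrice,
    currency: data.currency
  };
}

const ga4Params = toGa4ViewItem(productView);
const facebookParams = toFacebookViewContent(productView);
```

The developer implements a single push, and each platform's payload is derived from it instead of being specified separately.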
I’m sure that after reading this article, you still have some questions. Feel free to ask them in the comments.