Deep dive: Mobile Attribution via Android Privacy Sandbox and without GAID
Google has been working on their Privacy Sandbox initiative for a while, mostly focused on the web. Yesterday Google also announced their plans to expand the project to the Android platform. While the web sandbox is trying to remove third-party cookies (mostly because of outside pressure), the Android Privacy Sandbox project’s goal would be to deprecate the GAID.
Doing so would impact a lot of mobile marketing systems, including measurement, targeting, retargeting, and more.
The good news is that Google’s proposal for measurement – as it stands right now – seems quite good. In fact, it’s better than good. It’s surprisingly thoughtful, flexible, and proves that you can get rid of GAID without wreaking havoc on the app ecosystem.
That might not be very surprising given Google’s deep roots in the advertising ecosystem, their inherent understanding of the space, and their desire to leave it unbroken. It’s probably also the battle scars from trying to do similar things on the web that led them to a pretty great first draft.
Google says the expected timeline for the project is 2+ years – and we suspect that what we’re seeing today will keep on changing in that period. Nonetheless, I want to applaud Google for putting these ideas forth as proposals, ahead of time, and seeking feedback! We didn’t get a chance to do that with SKAdNetwork, and I think it’s such a shame …
The Android Privacy Sandbox itself has a few topics that address targeting and retargeting, but this article is going to focus mostly on the attribution stuff … because well … it’s a thing we care about a lot here at Singular 🙂
But first – a caveat:
This post is based on my initial interpretation of what is a fairly early and complex document released by Google. If you think I got a detail wrong, please tell me about it!
The TL;DR: Attribution with the Android Privacy Sandbox
The main idea here is that Google will eventually deprecate GAID, but still wants advertisers to be able to run user acquisition effectively. As the biggest ad network on the planet, Google knows measurement is key to that.
To do that, Google will create an on-device service that will be built into the Android OS. They call that service the “Attribution Reporting API,” and this service will have a few tasks:
- Store views and clicks reported by ad networks (Google calls these “sources”)
- Store conversion events like installs, purchases, signups reported by Apps/Singular (Google calls these “triggers”)
- Match reported conversion events with reported views/clicks that are stored on the device
- Send the data out to networks/measurement partners/advertisers in two forms:
- Event-level reports: reports that contain super detailed upper-funnel breakdowns (campaign, sub-campaign, creative, down to the click_id itself) paired with super limited lower-funnel data (1 to 3 bit conversion value)
- Aggregatable reports: more balanced reports that contain selected upper-funnel breakdowns (GEO, campaign, etc.) paired with more detailed lower-funnel data (revenue, number of purchases, etc.)
Event-level reports kinda sound like SKAdNetwork. Are they similar?
At first read – it seems way better than SKAdNetwork.
The “event-level reports” will provide a super granular breakdown of your upper-funnel data (think campaign, sub-campaign, creative, down to the click_id itself), paired with fairly limited conversion values (up to 3 bits, vs. SKAN’s 6 bits).
It does seem, however, that it will be possible to send up to 3 conversion events at separate times that will each be attributed to the view or click. So probably the first conversion event will be used for the install, while the subsequent 2 conversion events will be for meaningful KPIs that can happen later in the user’s lifecycle. That’s huge … and was one of the main features we wanted from SKAdNetwork.
Here’s a summary of event-level reports vs. SKAN:
What kind of reports can we build with event-level reports?
To make it simpler to understand, here’s an example of what a set of event-level postbacks could look like:
The “Source event ID” is defined by the network, and is essentially the click/view ID defined by the network. That means that this has a 1:1 mapping to all the data marketers care about, such as Campaign Name, Creative Name, Country, OS, Audience, etc.
The “trigger type” is defined by the advertiser/MMP, and is very similar to SKAN conversion values. It does have fewer bits than SKAN, but you can send up to 3 triggers at different times … so good news for cohort reporting!
Here’s a super simplistic example for how an advertiser/MMP can define “trigger type:”
- 001 = install
- 010 = level completed
- 101 = 7 day revenue > $10
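To make the two halves of an event-level report concrete, here is a minimal Python sketch. It assumes the 3-bit trigger types from the list above, and it invents a network-side lookup table (`SOURCE_METADATA`), source event IDs, and campaign names purely for illustration. None of these names come from Google’s documentation.

```python
# Trigger types as defined in the example list above (3 bits each).
TRIGGER_TYPES = {
    0b001: "install",
    0b010: "level completed",
    0b101: "7 day revenue > $10",
}

# Hypothetical network-side lookup: source event ID -> campaign metadata.
# IDs and names are made up for this sketch.
SOURCE_METADATA = {
    1001: {"campaign": "US_Launch", "creative": "video_a"},
    1002: {"campaign": "US_Launch", "creative": "banner_b"},
}


def describe_report(source_event_id: int, trigger_type: int) -> dict:
    """Join one event-level report with upper-funnel metadata."""
    if not 0 <= trigger_type < 8:
        raise ValueError("trigger type must fit in 3 bits")
    row = dict(SOURCE_METADATA.get(source_event_id, {}))
    row["source_event_id"] = source_event_id
    row["event"] = TRIGGER_TYPES.get(trigger_type, "unknown")
    return row


# Three hypothetical postbacks: two for the same click, one for another.
reports = [(1001, 0b001), (1001, 0b101), (1002, 0b001)]
rows = [describe_report(source_id, trigger) for source_id, trigger in reports]
for row in rows:
    print(row)
```

The point of the sketch is the join: the network owns the meaning of the source event ID, the advertiser/MMP owns the meaning of the trigger type, and a report row needs both.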
Putting all this together, the result in a Singular dashboard could look like this:
So while the “trigger type” (conversion value) is more limited than SKAN, you get incredible granularity on the upper funnel data, and you get the possibility of sending multiple conversion events … and that by itself already provides a more powerful report than what we’re currently getting from SKAdNetwork.
And you know what – there’s more.
Check out the next section about “Aggregatable Reports.”
What kind of reports can we build with the “aggregatable” data?
This area is particularly exciting, and it has a pretty clever implementation as proposed by Google, but there are also some vague details at this point.
The general concept is as follows:
- When an ad network reports clicks and views to the Android Attribution API, they will include “aggregation keys” that will represent certain breakdowns. These keys can be up to 128 bits, which is super useful (128 bits is a TON). For example, imagine that ad network “MobCool” always uses 16 bits for their Campaign ID, and 16 bits for their Creative ID. And let’s say unimaginatively their Campaign ID is 1 and Creative ID is 2. To create the breakdown, they convert the Campaign ID and Creative ID to bits, and concatenate them. And this would result in a 32 bit string:
0000 0000 0000 0001 0000 0000 0000 0010
- Sometime later, when the app is installed, the advertiser/MMP will report conversion events from the app to the Android Attribution API. These events can also append their own breakdowns into the aggregation key. For example, if the advertiser wants to further break down their marketing reports by device model, they could build a key with 8 bits (I chose 8 arbitrarily; they can use 20 bits if they want), and it will look like this:
- The bits from the ad network and the bits from the advertiser/MMP would get appended together. So for a user that came from Campaign 1, saw Creative 2, and uses Acer Liquid A1, the breakdown key would be:
0000 0000 0000 0001 0000 0000 0000 0010 0000 0010
And this is only 40 bits. We have up to 128 bits … so imagine the useful breakdowns possible with this! (Note: we’re not sure yet if there’s a limit on how many different breakdowns you can have. 128 bits is very large.)
- When reporting the conversion event, the advertiser/MMP must also report a numerical value for the conversion event itself. This can be a simple counter for things like installs (e.g. 1), or any number for things like revenue (e.g. $100). This value is what gets summed together, when the aggregation happens (see more about that later). (Note: there are limits on the values you can pass here. This is part of the differential privacy mechanism Google employs).
- Every time there’s a successful match between a conversion event (“trigger”) and a view/click (“source”), Android’s on-device service will store this information, together with the aggregation key that was concatenated together. It will then send this user-level data in an encrypted form to the customer’s ad tech platform (MMP, ad network).
- Now the customer’s adtech platforms (MMP, ad network, etc) have a lot of these “encrypted aggregatable user-level data” records. In order to aggregate them, Google came up with a cool idea. There will be a separate network server called an “Aggregation Service” that will simultaneously decrypt and aggregate the user-level data, based on these predefined aggregation keys. The end result from the above example could look like this:
- And this is obviously simplistic. A proper implementation would add a ton more dimensionality, and at Singular this table would also be connected with various data sources to arrive at real ROI. Google is allowing the aggregation key to be as long as 128 bits – which means there’s a ton of room for adding meaningful breakdowns, and the resulting reports could be really good.
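The key-building steps above can be sketched in a few lines of Python. This is only an illustration of the bit concatenation, not Google’s API: the helper names (`pack_fields`, `format_bits`) are made up, and the 8-bit device field with code 2 for the Acer Liquid A1 is an assumption chosen to reproduce the 40-bit key shown earlier.

```python
MAX_KEY_BITS = 128  # upper bound on the aggregation key mentioned above


def pack_fields(fields):
    """Concatenate (value, width_in_bits) pairs into one integer key.

    Returns (key, total_bits). Earlier fields end up in the higher bits.
    """
    key = 0
    total_bits = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        key = (key << width) | value
        total_bits += width
    if total_bits > MAX_KEY_BITS:
        raise ValueError("aggregation key exceeds 128 bits")
    return key, total_bits


def format_bits(key, total_bits):
    """Render the key as zero-padded binary in groups of four."""
    bits = format(key, f"0{total_bits}b")
    return " ".join(bits[i:i + 4] for i in range(0, total_bits, 4))


# Network part: 16-bit Campaign ID 1 and 16-bit Creative ID 2.
network_key, network_bits = pack_fields([(1, 16), (2, 16)])
print(format_bits(network_key, network_bits))
# 0000 0000 0000 0001 0000 0000 0000 0010

# Full key: network part plus an assumed 8-bit device model code of 2.
full_key, full_bits = pack_fields([(1, 16), (2, 16), (2, 8)])
print(format_bits(full_key, full_bits))
# 0000 0000 0000 0001 0000 0000 0000 0010 0000 0010
```

Fixed field widths matter here: both sides need to know where each breakdown starts and ends in order to decode the key later.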
Plus, on the aggregation service:
This “Aggregation Service” is a program built and signed by Google, but will be running on other companies’ servers in what’s called a “Trusted Execution Environment.” That means that if Singular is running an “Aggregation Service” we can’t manipulate how it works, and it’ll work how Google intended.
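As a rough mental model of what the aggregation step does, here is a minimal Python sketch that groups records by aggregation key and sums their values. It deliberately skips decryption and any privacy noise, and the cap `MAX_CONTRIBUTION` is a made-up number standing in for whatever value limits Google ends up enforcing. The real service would run inside the Trusted Execution Environment, not in code like this.

```python
from collections import defaultdict

# Hypothetical per-record cap; the real limits are defined by Google.
MAX_CONTRIBUTION = 1000


def aggregate(records):
    """Sum values per aggregation key.

    records: iterable of (aggregation_key, value) pairs, assumed to be
    already decrypted for the purpose of this sketch.
    """
    totals = defaultdict(int)
    for key, value in records:
        clamped = max(0, min(value, MAX_CONTRIBUTION))
        totals[key] += clamped
    return dict(totals)


# Keys reuse the 40-bit layout from the example above:
# 16-bit campaign, 16-bit creative, 8-bit device model.
records = [
    (0x0001000202, 100),   # campaign 1, creative 2, device 2
    (0x0001000202, 50),    # same breakdown, another user
    (0x0001000301, 2000),  # campaign 1, creative 3, device 1 (gets capped)
]
totals = aggregate(records)
for key, total in totals.items():
    print(f"{key:010x} -> {total}")
```

Only the summed rows leave the service in this model, which is why the choice of breakdowns in the key determines what a marketer can see in the final report.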
How will the attribution waterfall logic work?
We think we understand what’s going on, but as we said at the top of this article – there’s a lot of complexity and some unclear direction in the developer documentation as it currently exists. That will probably get cleared up as the project progresses, but not everything is 100% set in stone or fully explained.
Here’s what we think is happening:
- Every ad network can report their views and/or clicks (which Google calls “sources”) of its ads.
- Networks are also allowed to assign priorities for each of their touchpoints (e.g. most networks will probably agree that clicks should have a higher priority than impressions).
- When a conversion event (what Google calls a “trigger”) is reported by the installed app, Android will try to match that conversion event with all the relevant views and clicks, and choose the one with the highest priority.
- Apparently, this logic is done completely separately for each network… which seems to imply that if two or more networks had clicks and/or views (something that happens in real life) both of them will receive attribution postbacks (event-level and aggregated reports).
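Here is a minimal Python sketch of the per-network logic as we read it. The `Source` class, the network names, and the tie-break on the most recent timestamp are our own assumptions; the documentation only tells us that the highest priority wins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    network: str
    kind: str       # "click" or "view"
    priority: int   # assigned by the network; higher wins
    timestamp: int  # seconds, used only as an assumed tie-break


def pick_winner(sources):
    """Highest priority wins; ties go to the most recent source (assumption)."""
    return max(sources, key=lambda s: (s.priority, s.timestamp))


def attribute_per_network(sources):
    """Run the matching separately for each network's own sources."""
    by_network = {}
    for source in sources:
        by_network.setdefault(source.network, []).append(source)
    return {network: pick_winner(group) for network, group in by_network.items()}


sources = [
    Source("MobCool", "view", priority=1, timestamp=100),
    Source("MobCool", "click", priority=2, timestamp=90),
    Source("OtherNet", "view", priority=1, timestamp=120),
]
winners = attribute_per_network(sources)
for network, source in winners.items():
    print(network, "->", source.kind)
# MobCool -> click
# OtherNet -> view
```

Both networks end up with a winning touchpoint for the same conversion, which is exactly the double-counting problem.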
Clearly the last item above raises the question of deduplication across ad networks. We can’t have all the networks winning (and getting paid) as this would skew the marketer’s view quite considerably by double counting.
The documentation specifically addresses third party measurement use-cases by enabling redirects on the view/click events. So this is how we think companies like Singular could achieve deduplication across networks for their customers:
- When ad networks report clicks and views, they would use the “Attribution-Reporting-Redirects” header in the reply (see function registerAttributionSource), which will include Singular’s endpoint.
- The API will then reach out to Singular’s servers, and we would register the exact same views and clicks based on our customer’s waterfall and prioritization choice (e.g. clicks are more important than views, the attribution window should be X, etc …).
- Once a conversion event is reported by our SDK in the advertiser’s app, the Android API will match it against all views and clicks that we reported, and choose the relevant touchpoint based on the prioritization we supplied – thus achieving deduplication across ad networks.
If this flow is implemented in the way we proposed, this would also have the added benefit of providing some MTA reporting!
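The effect of that flow can be sketched in Python: once every network’s touchpoints are registered under a single party, one prioritization picks one winner. Everything here is hypothetical, including the `Touchpoint` class, the click-over-view priorities, and the 7-day and 1-day attribution windows, which stand in for a customer’s own waterfall settings.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical customer settings: clicks beat views, each with its own window.
PRIORITY_BY_KIND = {"click": 2, "view": 1}
WINDOW_SECONDS = {"click": 7 * 86400, "view": 1 * 86400}


@dataclass(frozen=True)
class Touchpoint:
    network: str
    kind: str       # "click" or "view"
    timestamp: int  # seconds


def deduplicate(touchpoints, conversion_time) -> Optional[Touchpoint]:
    """Pick a single winner across all networks, or None if nothing is eligible."""
    eligible = [
        t for t in touchpoints
        if 0 <= conversion_time - t.timestamp <= WINDOW_SECONDS[t.kind]
    ]
    if not eligible:
        return None
    # Highest priority first; most recent touchpoint breaks ties (assumption).
    return max(eligible, key=lambda t: (PRIORITY_BY_KIND[t.kind], t.timestamp))


touchpoints = [
    Touchpoint("MobCool", "click", timestamp=0),
    Touchpoint("OtherNet", "view", timestamp=3000),
]
winner = deduplicate(touchpoints, conversion_time=3600)
print(winner.network, winner.kind)
# MobCool click

late = deduplicate(touchpoints, conversion_time=8 * 86400)
print(late)
# None
```

Because all touchpoints are visible in one place, the losing touchpoints are also known, which is where the multi-touch reporting benefit would come from.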
Open questions: so much more to learn
There’s a lot we have yet to dig into, and plenty that will need clarification from Google or the ecosystem as we progress. Here are a few open questions for me:
- How will we do fraud detection, management, and mitigation?
For now, it’s unclear what prevents companies from claiming that they got a response from Google. We didn’t see any mentions of cryptographic signatures, as we have in SKAdNetwork. Also, how does this API prevent malicious parties from registering endless clicks and impressions whenever they’d like? (In other words, click spamming.) We did notice the registerAttributionSource function expects an “InputEvent” to register clicks, but we’re not sure how it verifies this was an ad click, and not just any InputEvent. Also – for views, there is no such need for an event.
- Why URLs versus parameters?
Why do the registerAttributionSource and triggerAttribution methods always require receiving a URL versus receiving the parameters directly? (We assume it’s some protection as the URL resolution will be done in another process, but can’t be sure.)
- How will the industry coordinate on aggregation keys?
In the section above, we explained how aggregation keys are a combination of values from the ad networks (that define part of the key on views/clicks), and advertisers/MMPs (that define the other part of the key on conversion events). The API enables the advertiser/MMP to overwrite all breakdowns, and caution must be taken so that we don’t overwrite something the ad network needed by accident. This creates some complexity, and is perhaps something Google could improve in the API, so that it won’t require coordination between a ton of companies. (64 bits for networks, 64 bits for advertisers/MMPs? :))
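To show what a fixed 64/64 split could look like, here is a small Python sketch. This is purely our own illustration of the idea, not anything in Google’s proposal: the helper names and the rule that the two halves are combined with a bitwise OR are assumptions.

```python
HALF_BITS = 64
HALF_MASK = (1 << HALF_BITS) - 1


def network_part(value: int) -> int:
    """Place the network's breakdowns in the upper 64 bits."""
    if not 0 <= value <= HALF_MASK:
        raise ValueError("network value must fit in 64 bits")
    return value << HALF_BITS


def advertiser_part(value: int) -> int:
    """Place the advertiser/MMP breakdowns in the lower 64 bits."""
    if not 0 <= value <= HALF_MASK:
        raise ValueError("advertiser value must fit in 64 bits")
    return value


def merge(network_key: int, advertiser_key: int) -> int:
    """Combine both halves, refusing any key that strays into the other half."""
    if network_key & HALF_MASK:
        raise ValueError("network key touches the advertiser half")
    if advertiser_key >> HALF_BITS:
        raise ValueError("advertiser key touches the network half")
    return network_key | advertiser_key


# Network: campaign 1 and creative 2 packed as 0x00010002.
# Advertiser: hypothetical device model code 2.
key = merge(network_part(0x00010002), advertiser_part(2))
print(f"{key:032x}")
# 00000000000100020000000000000002
```

With a split like this, neither side can overwrite the other’s breakdowns by accident, and no cross-company coordination on bit positions is needed.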
- How will Google Ads use this functionality?
Apple has a different process for Apple Search Ads than other ad networks have to follow. Will Google Ads follow this new Privacy Sandbox for Android just like every other ad network? Early indications are yes, but there are still some questions.
- How can mobile app deferred deep linking functionality be supported?
Deferred deep linking has traditionally been supported with server-side attribution processing, Google’s current install referrer framework, and even some on-device methods. However, they could all also be used to circumvent the privacy goals of the Privacy Sandbox. We would like to see a deferred deep linking solution built into the APIs to help advertisers deliver a streamlined user experience for their new users.
It’s early days. Google literally just announced this initiative, and clearly there’s going to be a lot of iteration. We’re excited that Google is looking for feedback, and we’re excited to collaborate with them on this initiative. The early documents show a much healthier approach that can offer dramatically better reporting than SKAdNetwork, which is very encouraging. Who knows, maybe there’s even a few things here that Apple might include in SKAN …
One thing is sure, you can count on us to keep covering this, and iterating on it with Google and our customers.
If you’re looking to discuss this further, we highly encourage you to join our professional Slack group on all matters of mobile attribution privacy.