Privacy Sandbox on Android and Singular: how it will all work (part 1)
How do you break more while being less disruptive? When we recently hosted a webinar on Privacy Sandbox on Android, InMobi’s Sergio Serra said that while Privacy Sandbox breaks more than SKAdNetwork, it will be less disruptive.
Sure, that might sound illogical, but part of the reason for it can be found in the chat I recently had with two software architects at Singular on the Growth Masterminds podcast.
Privacy Sandbox breaks more
Essentially, Privacy Sandbox will break more in the industry because in a sense it’s a full-scope advertising suite, with capability for targeting, for attribution and measurement, and for retargeting. That’s a 360-degree view from an advertiser wanting to place an ad (Topics API), to being able to measure the effectiveness of that ad in what people do after experiencing it (Attribution Reporting API), to being able to create audiences and retarget former customers, users, or players (Protected Audience API, formerly known as FLEDGE).
In contrast of course, Apple’s SKAdNetwork has no targeting mechanism, no notion of audiences, and no ability to retarget: it’s a privacy framework for adtech, not an adtech framework for privacy.
So naturally, Privacy Sandbox is more complicated. It creates new solutions for more parts of the advertising ecosystem, which is exactly why it’s also going to break more existing tech.
“Integrating with Android Privacy Sandbox is more complicated than integrating with SKAdNetwork,” says Singular software architect Ron Shub.
“It is a really complex solution for marketing and performance measurement,” agrees Singular chief software architect Yuval Carmel.
But Privacy Sandbox is also less disruptive
But that’s not the whole story. It’s also less disruptive.
And there’s a very simple reason why: adtech companies are going to build technology to manage all the aspects of targeting and audiences and measurement that Privacy Sandbox is breaking. And marketers will be able to use those solutions pretty much just like they’re using existing solutions.
With a few caveats, of course.
“Luckily for our customers, of course, we plan to be there for them and handle the heavy lifting, of course,” says Yuval Carmel. “We’ll guide them quite a lot.”
Of course it’s not just Singular.
Ad networks, demand-side platforms, exchanges, and supply-side platforms are all building targeting and audiences capabilities based on Topics API and Protected Audience API, plus all the other privacy-safe targeting and retargeting criteria they can still utilize after the eventual deprecation of the Google Advertising ID (GAID). That includes context, time, type of supply, coarse location, and dozens of other factors that DSPs and other adtech solution providers use for identifier-free mobile traffic.
The plan and the hope — to be validated by actual experience when Google eventually turns on Privacy Sandbox at scale — is that marketers will be able to do their jobs and the adtech ecosystem will manage everything else.
We’ll see how well that goes, of course, but most in the mobile performance marketing ecosystem seem to feel that the time and preparation Google is investing into Privacy Sandbox will result in much less marketing signal degradation than we saw on iOS with the introduction of SKAN 3.
The Google referrer
One big helper in that regard: the Google Play install referrer.
On the web, a website receives a referrer when a potential customer clicks on a link, telling the website where that visitor is coming from. In the same way, the Google referrer in mobile app installs provides insight into where an ad was clicked, providing a direct-line last-click measurement solution for advertisers.
(Note: this wasn’t 100% clear at the time of our conversation, as you might note in the video or podcast transcript below, but it is now.)
“The Google Play install referrer … is probably the best and most accurate mechanism to attribute and match ad clicks to app installs on Android devices,” says Yuval Carmel. “As long as it’s available … most marketing attribution and our attribution as well will work with this reliable and robust solution. It’s the best solution. You can basically pass things to the Google Play Store and you get it as an intent from them in the MMP SDK, in our SDK. So it works the best and you can reliably attribute users that way.”
Referrer data is critical, but it’s not the only data that marketers will be getting from Privacy Sandbox.
Attribution Reporting API in Privacy Sandbox
The Attribution Reporting API in Privacy Sandbox on Android supports 2 kinds of reports:
- Event-level data
  - Super detailed upper-funnel breakdowns
  - Super limited lower-funnel data
- Aggregated data
  - Selected upper-funnel breakdowns
  - More detailed lower-funnel data
(See more about these reports in Singular CEO Gadi Eliashiv’s Attribution API breakdown.)
The event-level data for conversions or engagement is granular but super limited: 1 bit for view-through attribution and 3 bits of data for click-through. One bit will pretty much tell you that an event happened or didn’t: someone registered or didn’t, someone bought or didn’t. More challenging: you have to pick the one event you care about most and won’t get data about anything else.
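To make those bit budgets concrete, here is a minimal Python sketch of how little fits into them. The bit counts come from the paragraph above; the event names, the code mapping, and the function names are hypothetical illustrations, not part of any Google or Singular API.

```python
# Bit budgets quoted above for event-level reports.
VIEW_THROUGH_BITS = 1
CLICK_THROUGH_BITS = 3


def distinct_values(bits: int) -> int:
    """Number of distinct values that can be represented in `bits` bits."""
    return 2 ** bits


# Hypothetical mapping of conversion events to small integer codes.
CLICK_EVENT_CODES = {
    "no_event": 0,
    "registration": 1,
    "add_to_cart": 2,
    "purchase": 3,
}


def encode_event(event: str, bits: int, codes: dict[str, int]) -> int:
    """Return the code for `event`, refusing codes that exceed the bit budget."""
    code = codes[event]
    if not 0 <= code < distinct_values(bits):
        raise ValueError(f"code {code} for {event!r} does not fit in {bits} bit(s)")
    return code


print(distinct_values(VIEW_THROUGH_BITS), "values for view-through")
print(distinct_values(CLICK_THROUGH_BITS), "values for click-through")
```

With one bit there are only two possible values, which is why a view-through report can say little more than "it happened" or "it didn't".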
The upper funnel insights from event-level data, on the other hand, are super-detailed.
Pretty much the opposite is true for the aggregated data, where you only get a few upper funnel breakdowns, but you get more detailed conversion data. Since the data is aggregated — and there’s noise added — privacy is preserved.
Each will have its own place.
“Event level reports are good for optimization; the aggregatable reports are very good for reporting for a campaign performance measurement,” Shub says. “And that’s why here in Singular we really focus on that because it’s going to give the most accurate data with less noise and much more flexibility.”
So how does the flow work in Privacy Sandbox?
At a very simplified level, here’s what a real-world scenario looks like under Privacy Sandbox for Android:
- User clicks on an ad
- Ad network registers what Google calls a source for that click, which encodes some data about where the click was, what campaign the ad is from, creative information, and more.
- User installs the app
- App publisher registers conversion events, which Google is calling triggers.
- A scheduled job runs on the user’s device in Privacy Sandbox, creating encrypted data about those triggers.
- When the app publisher decides they have enough data, that encrypted data goes to an attribution aggregation service.
- The attribution aggregation service decrypts the data, summarizes it into a report, and adds noise.
- A marketer gets a summary report and has to decode the dimensions that were encoded when registering sources and triggers.
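The steps above can be sketched as a toy simulation in Python. This is only an illustration under stated assumptions: it skips encryption and on-device scheduling entirely, the function names and the bit layout of the key are hypothetical, and Laplace-distributed noise is an assumption of this sketch rather than a description of Google's service.

```python
import random
from collections import defaultdict


def register_source(campaign: int, country: int) -> int:
    """Ad network side: pack click dimensions into the upper bits of a key.

    Hypothetical layout: country must be below 256.
    """
    return (campaign << 16) | (country << 8)


def register_trigger(source_key: int, event: int, value: int) -> tuple[int, int]:
    """Advertiser side: add an event code (below 256) in the low bits, plus a value."""
    return (source_key | event, value)


def laplace_noise(rng: random.Random, scale: float) -> float:
    """Difference of two exponentials with mean `scale` is Laplace(0, scale)."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def aggregate(reports, noise_scale: float, rng: random.Random) -> dict[int, float]:
    """Stand-in for the aggregation service: sum values per key, then add noise."""
    totals = defaultdict(int)
    for key, value in reports:
        totals[key] += value
    return {key: total + laplace_noise(rng, noise_scale) for key, total in totals.items()}


def decode(key: int) -> dict[str, int]:
    """Marketer side: recover the dimensions that were encoded into the key."""
    return {"campaign": key >> 16, "country": (key >> 8) & 0xFF, "event": key & 0xFF}


rng = random.Random(0)
click = register_source(campaign=7, country=1)
reports = [
    register_trigger(click, event=2, value=10),
    register_trigger(click, event=2, value=5),
]
for key, total in aggregate(reports, noise_scale=1.0, rng=rng).items():
    print(decode(key), round(total, 2))
```

The point of the sketch is the shape of the flow: dimensions are chosen when the source is registered, values arrive with triggers, and the marketer only ever sees noisy per-key totals.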
It all sounds very complicated, and that’s probably because it is. But the good news is that Singular — and other adtech vendors — are doing the heavy lifting.
“We will do everything for the UA managers,” says Shub. “They don’t even need to know everything that is going on under the hood. I’m sure they want to know how confident we are in the data, but all this infrastructure is something that we’re going to support together with ad networks. And it’s our job to prepare this infrastructure.”
There’s more, of course, including details on the data you get back and how you use it, and what to set up in your sources and triggers.
That will all have to wait for the second part of our interview with Yuval Carmel and Ron Shub, coming soon.
Subscribe to Growth Masterminds
Subscribe to our YouTube channel with a single click.
Or pick your favorite audio podcasting platform on the podcast home page.
And … a full transcript: Singular and Privacy Sandbox on Android
Note: this is AI-generated and lightly edited. It may contain errors. Check the actual audio or video if in doubt.
Privacy Sandbox on Android is coming soon and it’s time to embrace the suck.
Hello and welcome to Growth Masterminds. My name is John Koetsier. I kind of love the phrase embrace the suck. It’s something I’ve said to my kids a lot when they’re doing something that’s hard. It’s something I say to myself when I’m doing something that’s hard but worthwhile. It’s about leaning in to hard stuff so you can eventually make it easy.
We saw that happening on iOS with ATT and SKAN. Marketers who leaned in and got it got better at it. They did better at it and they achieved better results. It’s likely the same thing is going to happen with Privacy Sandbox and Android. It’s not gonna be easy, but it will happen. It’s coming. The GAID is going away and it will be worthwhile to lean in. So to start embracing the suck, we’re chatting today with two people.
Ron Shub is a software architect at Singular and Yuval Carmel is the chief architect at Singular. Welcome Ron, welcome Yuval.
Hey, great to be here.
Awesome, super pumped to have you. Hey, we’re diving into Privacy Sandbox for one of the first times, I think, on Growth Masterminds. We’ve talked about it a bunch on the blog. We’re gonna do more on that as well. But as everybody knows, Privacy Sandbox is an attempt from Google to do three things, right?
Show relevant content and ads, measure ad effectiveness, and limit tracking.
Yuval, maybe kick us off here. What’s your overall impression of Privacy Sandbox on Android?
So yeah, John, I think Google are investing a lot of time in engineering this solution. You can see it quite a bit. They’re building a privacy-oriented solution, which will require UA marketers to define properly what they want to track and in which granularity in advance.
So the solution basically migrates the attribution logic to the device. Like we’re moving in to do more work on the client itself.
And then basically, it requires more complex management of the triggers and the conversions you want to track. So the marketing data won’t be available anymore with PII attached to it as we are used to it, like user-level data. Things are going to be different a little bit and probably reporting and like the data reporting is going to be more aggregated. We’re going to see more aggregated reports, people using more aggregated data and less relying on user-level data.
So the idea here is obviously that it would be hard to trace back a single user’s behavior and usage of specific apps. One thing to mention is that the sandbox team in Google are working closely with us and other partners to make sure the solution is well scrutinized and tested, like well architected as well, which is pretty cool. And it’s a great approach.
If you take the opposite, you mentioned embrace the suck. So embrace the suck is like a military term, right? And because you have to, there’s no other way. You have to embrace the suck. And with previous changes in the privacy landscape with other companies, you really had to embrace the suck. You couldn’t do anything, you couldn’t talk to anyone. There was no work relations over there.
And here, basically, we see Google working with the partners and trying to make sure the PSA will be easier to embrace. Let’s say that.
Love it, love it, love it. Ron, if you look at Privacy Sandbox and you look at, you know, SKAN on iOS, compare and contrast them for us.
So yeah, Google are trying to solve the same problem, right? And doing a privacy-enhancing solution that limits user-level tracking. But I can say that the approach is super different between SKAdNetwork and Android Privacy Sandbox, for example.
And firstly, taking ATT: just to recap, ATT is App Tracking Transparency, first introduced in iOS 14.5, which changed the behavior from opt-out to opt-in. So when you install an app, you get a pop-up, and you need to say if you allow tracking. And then most people say no. And then we, or the MMPs, don’t have the IDFA anymore. The IDFA is reset, and it makes it much harder to track marketing data.
So Google’s approach is a bit different. They say we first want to have an alternative. We want to have the Privacy Sandbox and APIs. And only later, perhaps, we’ll talk about deprecating the Google Advertising ID. So we think the Google Advertising ID is here to stay for now, and there’s no particular plans to deprecate it [immediately].
And when they do deprecate it, they will give substantial notice beforehand. And when talking about SKAdNetwork compared to Google attribution, which is the Attribution Reporting API, then they’re both, again, trying to solve the same thing. But the approaches are, again, super, super different. So a few teasers, documentation-wise, Google supplies tons of documentation and repositories. They basically open sourced everything to us. And it’s really helpful in understanding how the infrastructure will look, what we can do with it, how we can deal with noise, et cetera, et cetera. And a few specific teasers are, for example, dimensions.
In SKAdNetwork, you get a SKAN campaign ID, which is a number between 0 and 99, just 100 options. Now in SKAdNetwork 4.0, you get a bit more options with Source ID, but it’s really standardized and it’s quite capped.
Google supplies 128 bits, which is much, much, much bigger. It gives you a lot of places to encode your dimensions. Another small teaser is … Apple limited the way that you, when you get the conversion, the conversion window is practically limited. So you can only track after installs for a few days.
And Google doesn’t have this limit at all. So you can get lots of aggregatable reports and raw data. And you decide when you want to get the summary report. So you can also track a month after the install happened. And it really attests to the level of freedom that Google provides. But with that comes also complexity. So integrating with Android Privacy Sandbox is more complicated than integrating with SKAdNetwork.
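As a rough illustration of the difference in room, here is a minimal Python sketch that packs several dimensions into a 128-bit key and compares that space with the 100 SKAN campaign IDs mentioned above. The field names and widths are hypothetical choices for this example, not a layout prescribed by Google or used by Singular.

```python
SKAN_CAMPAIGN_IDS = 100  # campaign IDs 0 to 99, as described above
KEY_BITS = 128           # size of the key described above

# Hypothetical layout: (dimension name, width in bits). Unused bits stay spare.
LAYOUT = [("campaign", 32), ("creative", 32), ("country", 16), ("event", 16)]


def pack(fields: dict[str, int]) -> int:
    """Pack dimension values into a single integer key, first field highest."""
    if sum(width for _, width in LAYOUT) > KEY_BITS:
        raise ValueError("layout does not fit in the key")
    key = 0
    for name, width in LAYOUT:
        value = fields.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        key = (key << width) | value
    return key


def unpack(key: int) -> dict[str, int]:
    """Reverse of pack(): peel fields off the low end, last field first."""
    fields = {}
    for name, width in reversed(LAYOUT):
        fields[name] = key & ((1 << width) - 1)
        key >>= width
    return fields


example = {"campaign": 123456, "creative": 42, "country": 840, "event": 3}
key = pack(example)
print(f"key = {key:#x}")
print(unpack(key))
print(f"key space is {2 ** KEY_BITS // SKAN_CAMPAIGN_IDS:.3e} times larger than SKAN campaign IDs")
```

More room does not remove the need to plan: every extra dimension multiplies the number of keys, and each key's total still has noise added to it.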
That is an interesting phrase, and that perhaps will not come to many marketers as happy news, because most have said that SKAN is really challenging, and that was already in 3, and 4 is more challenging with more options and more postbacks and more delays and all those sorts of things. Now there’s some testing happening right now, Privacy Sandbox is in beta. You’ve been working with a developer preview.
What are some of the biggest learnings and takeaways from working with the preview?
Yeah, so we’ve been very impressed with Google’s efforts, like Yuval mentioned. Bunch of documentations and repositories that helped us go through and understand what’s going to be introduced really soon. Our current objective is to complete an end-to-end testing of the Attribution Reporting API, which means from registering a click through querying the aggregation service and getting back the decoded and decrypted summary report.
There are some hurdles on the way. It doesn’t go very smoothly, but Google has been super attentive.
There’s currently another issue that we’re facing, and we think we’re on the verge of solving it. And we’re really excited to be able to finish the end-to-end test and be practically ready for the Privacy Sandbox. So yeah, Google is super responsive in answering questions and helping us out and they seem to be investing tons of resources and we enjoy working together.
Super interesting. You’re working on it now. As we’re recording, it’s May 1. This will probably be released in two weeks or so. We have, I think, till the end of the year before the GAID might go away and other things will happen. But yeah, there’s going to be some time there.
So big picture, if we look at Privacy Sandbox, there’s three privacy-preserving APIs, as Google calls them across web and Android, right?
There’s Topics API, which should allow for much better targeting than we currently have on iOS, but I still think the granularity is gonna be a challenge. There’s what they used to call FLEDGE, which is now the Protected Audience API, that should allow retargeting and maybe a couple other use cases. But you’re most concerned, of course, with the Attribution Reporting API.
I don’t know why, there’s some odd reason about that, but that’s where you’re spending most of your time. Your thoughts, how good is it?
Yeah, so that’s a great question. As you mentioned, it looks like Google is basically trying to find a solution for every reason you use Google Advertising ID, right? So they have the Topics API for targeting purposes and FLEDGE, now the Protected Audience API, for audiences and retargeting. And then the Attribution Reporting API.
So yeah, what’s my thoughts regarding it and how good is it?
The Attribution Reporting API in Privacy Sandbox is going to change the way some of us are used to seeing marketing data. So we have a lot of customers that are still used to seeing user-level data and marketing data attached to specific users, to their PII even. So although we have SKAN and the changes in the industry, many customers, we heard, basically stopped marketing on iOS and just heavily based themselves on Android.
And now they’ll have the same issue in Android as well. And user-level data is going to be super limited.
You have like three bits, as Ron mentioned, or we’d mention in one of the next questions maybe. But you have like just three bits to collect information regarding an event. But of course, the attribution aggregated reporting can be utilized for the marketing purposes, if you understand the requirements and how to execute it.
So is it good?
It’s a good solution for privacy, first and foremost. It’s a great solution for privacy because everything will be more private. Your PII is more protected. Your information is more protected. Your behavior is less easily measured and tracked. But it is a really complex solution for marketing and performance measurement. Luckily for our customers, of course, we plan to be there for them and handle the heavy lifting, of course, in the beginning, and we’ll guide them quite a lot. But what’s going to be hard? So it’s going to be hard to understand how to define your aggregation keys or basically dimensions or the granularity in which you want to measure and track your conversions.
It’s going to be a little bit complex to understand how to manage the conversion events themselves. And we will talk about the meat more later on, but the idea here is that the solution is really generic and it gives you the freedom to figure out the best measurement method for you. Whatever you need, you can basically implement here, which is a great advantage, but it will be quite complex, I think, for advertisers to manage by themselves.
Super, super interesting. Just to take a half a step back for a second and think about the landscape of mobile and mobile apps and how it’s changing, changed over the past year and a half and over the next year.
It’ll be fascinating to look back at this point in history and say, oh, that’s why apps do this now, because the conversion models for advertisers need to change, so the product needed to. It’ll be interesting to see. Ron, let’s turn to you.
I always ask, when I talk to people about SKAN, what their estimate is for the level of drop in ad efficiency from IDFA to SKAN.
And I’m gonna ask you that in terms of Android. It’s really early, of course, you’re just starting to work with this. It’s not out in the wild. It’s not being used by very many people at all, just tested.
Can you guesstimate how much ad efficiency might drop as we move from GAID to Android Privacy Sandbox?
I’ll take this one. So I think with the GAID and the ad efficiency drop, basically the problem here is measurement, right? The measurement efficiency is going to drop without the GAID. And once the measurement isn’t as accurate as possible, of course, the ad efficiency will drop as well.
So the thing is that for click-through attributions, we have the Google Play install referrer as well. I think most of our Android-based attributions are based on Google Play install referrer, which is probably the best and most accurate mechanism to attribute and match ad clicks to app installs in Android devices.
So as long as it’s available, the Google Play install referrer … I believe most marketing attribution and our attribution as well will work with this reliable and robust solution. It’s the best solution. You can basically pass things to the Google Play Store and you get it as an intent from them in the MMP SDK, in our SDK. So it works the best and you can reliably attribute users that way.
So if Google deprecates the Google Advertising ID but doesn’t deprecate the Google Play install referrer, I think for click-through attribution we still have a really robust and accurate method to calculate attributions, the same as we do today. The great question for Google, for which probably none of them really have a definitive answer, is what will happen with the Google Play install referrer. We’re still waiting on an answer and I think some of them are still waiting on an answer. So no one really knows what the plan is over there.
Regarding view-through attribution, it’s a bit different. You don’t have the Google Play install referrer over there. We’re currently relying on getting impression reporting from the networks themselves, from the partners themselves. Note, I’m not talking about self-attributing ones like Google Ads and Facebook.
I think about the other guys out there that are basically sending the impressions to us, and then we’re the ones matching them, using the Google Advertising ID, with the clicks that we also get. And in that aspect, I think that the Google Advertising ID deprecation will affect view-through attribution quite heavily. And I think we see that the sandbox solution will probably be used first and foremost for view-through attribution and not for click-through, as long as we have the referrer solution.
Super interesting. So we’ll see if we continue to have that. We saw that fingerprinting became an issue on iOS. And Apple addressed it and said, hey, that is measurement. That is tracking. That is not allowed. What do you see happening with fingerprinting on Android as Privacy Sandbox comes in?
Right, so we assume the same will probably happen. I mean, people will still continue to use fingerprinting as long as they can and feel comfortable doing it. But as Apple did, I’m assuming that also one day in the future, Google will decide that they want to make fingerprinting harder and even not possible, in some cases, and will not really enable it anymore. And when that happens, then people will need to find other solutions. But Google did mention that they will give a substantial heads up before that. So we’re not thinking it’s going to happen any time soon. And for now, fingerprinting is still in play.
Talk about event level reports, and how they differ from aggregated reports?
Right, so Google’s Attribution Reporting API supports two types of reports.
Event-level reports are more suited for optimization. So they can be used, for example, as training data for machine learning models when you want to optimize ad placements. But the issue with event-level reports is that the conversion data is super coarse. You get almost no data about what happened after the install. So practically, for view-through attribution, you get one bit. And for click-through attribution, you get three bits. So practically, you can’t encode almost anything there.
Go into detail here … one bit. What can you do in one bit? Is that like on or off? Is that one bit? It’s binary?
You can practically say, did that happen or did that not happen? So you can say add to cart happened. So that’s pretty much it. You have to choose one event and focus on it and just say if it happened or didn’t happen, but that’s not much information. In aggregated reports, you get a lot of freedom in the conversion data … the same 128 bits we mentioned before, that you can split between registering the source of the click or the view and registering triggers. So you can register many metrics, events or revenue, and really measure what happened, what journey the user took on the device after installing the app. So again, event-level reports are good for optimization, aggregatable reports are very good for reporting for a campaign performance measurement. And that’s why here in Singular we really focus on that because it’s going to give the most accurate data with less noise and much more flexibility.
So there’s some noise added. Walk us through the flow. Give us a sense of what happens when you’re placing an ad. Somebody clicks on it, they install, they do something. How does that flow work through all of Privacy Sandbox?
Yeah, so that’s a question that there’s no really short answer for, but I will try to summarize it as much as possible. So the first steps are on the device. So a user sees an ad, and then you register a source in Google. And you define already the dimensions that you want to measure. And then the user.
And registering a source is simply telling Google where the ad was placed, what app it was in or what website it was in, correct?
Exactly. You can put a bunch of things in the aggregation key, which practically means the campaign, like you said, what country, whatever you decide to encode into the dimensions there. But yes, you define the dimension on that, on that step. And that’s the registered source. It’s in the publisher app.
After the user installs the app, once they do events or generate revenue that you want to measure, you’ll register those as triggers. So you can register multiple triggers. Exactly. So you can measure multiple triggers. There is a limitation here that we’re going to mention soon enough.
But triggers are practically only the events and metrics that you want to measure.
And after you register the triggers, some time later, there’s like a scheduled job that runs on the device. And once it runs, then you get what’s called aggregatable reports, which are practically raw encrypted data. You can’t do anything with it so far. So you have to keep it in your database and store it for a while until you get enough aggregatable reports, enough raw data.
And once you do, then you take a batch of aggregated raw reports and you send them to the attribution aggregation service. What the aggregation service does for you is decrypt the data, summarize it to a report. You can think about it as a table and add noise to each row of that table. So each dimension’s permutations, for example, campaign A from country US will contain multiple installs.
And to that, and multiple revenue and multiple events, and to that, Google will add noise. So they say exactly what the distribution of the noise is. And you can already start thinking about, how do I encode the right numbers in order to make the noise not very harmful? So signal-to-noise ratio is here, right? So you want to have a much higher signal than the noise.
So once you get back a summary report from the aggregation service, you have to decode the dimensions that you encoded when you registered the source and the trigger. And you need to try, using data science algorithms and whatnot, to remove the noise and to give the actual numbers. And the way we did that with SKAdNetwork, and I assume we’ll do the same here, is give you the approximation of a number and a confidence interval that really says how sure we are in the number that we give you. And that depends on the volume of the installs you had in that period of time.
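A minimal Python sketch of the "approximation plus confidence interval" idea follows. It assumes, purely for illustration, that the noise on each total is Laplace-distributed with a known scale; the function names and numbers are hypothetical, and this is not a description of Singular's actual modeling.

```python
import math


def laplace_half_width(noise_scale: float, confidence: float = 0.95) -> float:
    """Half-width t such that Laplace(0, scale) noise lies in [-t, t] with the given probability.

    For Laplace noise, P(|X| <= t) = 1 - exp(-t / scale), so t = -scale * ln(1 - confidence).
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    return -noise_scale * math.log(1.0 - confidence)


def estimate(noisy_total: float, noise_scale: float, confidence: float = 0.95) -> dict[str, float]:
    """Report the noisy total as the estimate, with an interval for the true total."""
    half_width = laplace_half_width(noise_scale, confidence)
    return {
        "estimate": noisy_total,
        "low": noisy_total - half_width,
        "high": noisy_total + half_width,
        "relative_error": half_width / noisy_total if noisy_total else math.inf,
    }


# Same noise scale, different volumes: the interval is the same width,
# but it matters far less when the total is large.
for total in (100, 10_000):
    print(total, estimate(total, noise_scale=10.0))
```

Because the noise does not grow with the total, the relative uncertainty shrinks as volume grows, which matches the point that confidence depends on how many installs you had in the period.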
And so Ron, there was a lot of you need to and you’ll get and other stuff like that. And what you’re saying and you’re getting data that you can’t understand and it has to go somewhere that’s not you. And then you get it back and there’s some noise added to it before you get it back.
Now, please tell me that all the “you” here is actually Singular: getting this data you can’t decrypt, sending it someplace where it can get decrypted and where it gets its noise added, and then getting back out something that you can actually read. As a marketer, the “you” marketer doesn’t have to do all that stuff. Is that correct?
That’s absolutely correct.
So we will try to do, we will do everything for the UA managers. They don’t even need to know everything that is going on under the hood. I’m sure they want to know how confident we are in the data, but all this infrastructure is something that we’re going to support together with ad networks. And, it’s our job to prepare this infrastructure …
That’s a promise. You heard it here first.
… which is a really cool infrastructure, by the way. But UA managers don’t have to worry about it at all. They have to worry about what metrics and events they want to measure and decide which ones are the most prioritized, because there are hard decisions, like we have in SKAdNetwork. We can’t get all the events and revenue as we used to.
Excellent. And you know what? You just ended our session here because there’s so much more that we do need to talk about. We need to talk about, wow, SDK runtime.
We need to talk about the data that you’ve been talking about, the flow, what’ll come back, you know, all of that stuff, the modeling that you’ll do. We’ll have to talk about, you know, web to app, what we think the impact will be here. Yuval, we’ve got a lot more to ask you, and Ron, we’ve got a lot more to ask you, but this just became part A and we’re going to have to do part B a couple of weeks down the road or something like that.
‘Cause it’s like 30 minutes now and you know, probably a little long for this particular one and also you guys have lives you’ve got to, you know, get back to and I’ve got a meeting that I’m already late for as well, but thank you so much for this time. I really do appreciate that. We’ll set something up.
You have to talk to me again. That’s the bad news. But the good news is you get the rest of your evening off …