This is the final part in a 3-part series on why the era of the “mobile attribution provider” is coming to an end – and what we can expect from Attribution 2.0. Read part 1 here and part 2 here. The full article was originally published by Hacker Noon.

We’re exploring Branch’s perspective on an ideal Attribution 2.0 solution in five chapters. Today, we’ll cover the fifth and final chapter:

  1. What does “attribution” even mean? A brief history of marketing attribution, including offline, digital, and mobile.
  2. How mobile attribution providers became blind. The reason why these platforms are rapidly losing the ability to do their job.
  3. The future of attribution. How a “persona graph” provides reliable and accurate measurement everywhere.
  4. Reviewing traditional attribution techniques. A deep dive into how measurement worked in the single-platform worlds of websites and apps.
  5. The next generation: a persona graph. Why a persona graph works, and how we built one at Branch.

Chapter 5: The next generation: a persona graph

This chapter explains how a persona graph works, addresses common concerns around user privacy and data security, and goes in depth on how we built Branch’s persona graph. It ends by comparing the older generation of mobile attribution providers with what is possible with a persona graph.

The problem with traditional attribution techniques is that they are either probabilistic (meaning there’s a chance the data is wrong), or siloed inside a single platform (web or app). A persona graph provides the best of both worlds: deterministic matching that works across platforms.

Imagine the game of Concentration (for those who haven’t played this in a few years, it’s the one where you flip two random cards over, hoping to find a match). The chances of discovering a pair on your first turn are extremely low, but over time (and time is the critical element here), you learn where everything is. Eventually, assuming you have a good memory, you’re uncovering matches on almost every round.

Now, let’s take the metaphor one step further: instead of you flipping cards to learn where they are, imagine a hypothetical situation where you get to join a game in progress, where every card on the table has already been turned face up by other players before your first turn. It wouldn’t be much of a game, but you’d be guaranteed to find a match every time.

That’s the concept behind a persona graph: by sharing matches between anonymous data points, everyone wins. Like a Concentration game where all the cards have already been flipped before your first turn, a persona graph allows you to accurately match users that YOU haven’t seen before, but someone else in the network has.

The elephants in the room: privacy, security, and confidentiality.

For a persona graph to survive, there are a couple of critical things that must be guaranteed: 1) privacy and security of user data, and 2) confidentiality.

User privacy and data security. A persona graph makes it possible to recognize a given user in different places, but it does not tell you anything about WHO that user is. If the user wants you to know that information, then you already have it in your own system — the persona graph simply closes the loop by telling you that you’re seeing an existing customer in a new place. And like cookies or device IDs, the user can reset their connection to the persona graph on demand.

In other words, the persona graph must take the same approach to privacy as the postal service. Our letter carriers need to know our physical location in order to deliver mail, but they’re only concerned with the address, not the addressee. We trust that they won’t open our letters and won’t sell information about what we buy to the highest bidder.

At Branch, we feel so strongly about user privacy that we have made a number of public commitments about it. The short version can be expressed as three points in plain English: 1) we proactively limit the data we collect to only what is absolutely necessary to power the service that we deliver to our customers, 2) we will only ever provide our customers with data about end-user activity that happens on their own apps or websites, and 3) we do not rent or sell end-user personal data, period (not as targeting audiences to other Branch customers, not via cookie-syncing side deals with identity companies, not via an “independent” subsidiary — we just don’t do it).

In addition, we rigorously and proactively follow best practices to purge sensitive data and protect our platform against bad actors.

Confidentiality. The only data that is available via a persona graph is knowledge of the connection itself. Not where or how the connection was made, or by which company’s end user. A persona graph must guarantee that it will never allow Pepsi to purchase a list of Coke’s customers.

Said another way, the Swiss have avoided every war in Europe for over 500 years, because everyone recognizes that they are (and always will be) neutral. A persona graph must maintain the same unimpeachable reputation.

A peek inside the Branch persona graph

When we set out to build Branch in 2014, there was already a well-established industry of mobile attribution providers. All of them were competing with each other for the low-hanging fruit of measuring ad-driven app installs. If you work in the mobile industry, you’re likely familiar with their names already (Branch acquired the attribution business of one last year).

Even though the Branch platform might resemble a traditional attribution provider on the surface, the engine underneath is something fundamentally, radically different.

We decided to take a different approach: we realized the app install ad was a bubble that would eventually deflate, and we also knew that seamless user experiences would become increasingly important as marketers began to care about other channels and conversion events again. So we started by solving the more difficult technical problems that everyone else was ignoring (this is the story we told two years ago in Deep Linking is Not Enough).

The result: through solving the cross-platform user experience problem at scale, for many of the best-known brands in the world, we created a persona graph that allows Branch to provide an attribution solution that is both more accurate and more reliable than anything else available.

Here’s how it works today:

Step 1: Collect deterministic IDs

Believe it or not, this is actually the relatively easy part. User activity occurs in fragments across platforms, and the goal is to have a deterministic ID for each of them. Since Branch’s customers invest most of their marketing resources into websites and mobile apps, these are the platforms where we’ve focused the majority of our effort so far. But the same principle applies anywhere.

To create deterministic IDs on the web, we use a JavaScript SDK to set first-party cookies. Inside apps, we offer native SDKs to leverage device IDs.

We’ve also built SDKs for desktop apps on macOS and Windows, and custom OTT (Over The Top) device integrations. We will continue adding support for new platforms as customers request them.

Step 2: Create persona matches

Once we have an ID for an identity fragment, we use a layered system of cross-platform matching techniques to tie it back to a persona record on the persona graph. Here are a few examples:

  • Deep links. When a user clicks a link to go from one place to another, that is an ideal time to make a connection. This is our primary method for matching fragments that exist on the same device (e.g., Safari, Facebook browser, native apps), and one of the most reliable because it’s driven by the user’s own activity.
  • User IDs. When a user logs into an account, they’re providing a unique ID that can then be matched if the same user signs in later in another place. We only use this signal to a limited extent today, because there are a number of tricky problems related to shared devices, but we’re actively working on solutions and see a lot of promise in this method. As a side note, this is the only matching method we’ve seen competitors use when they talk about “people-based attribution.” Given the shared device challenges mentioned above, or the fact that (depending on the vertical) the vast majority of visitors never log in, this is certainly an area to question if you’re currently working with one of them.
  • Google Play referrer. Google passes a limited amount of data through the Play Store during the first install. Branch uses this one-time connection to create a permanent match back to the persona graph.
  • Fingerprinting. This is one cross-platform matching method we don’t use to build the persona graph, but it deserves a mention because it is so commonplace in the attribution industry. Branch sometimes has to fall back on fingerprinting when the persona graph can’t provide a stronger pre-existing match, so we’ve invested in an IPv6-based engine that greatly increases accuracy over traditional mobile attribution providers that still rely exclusively on IPv4.
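One way to picture how these matches accumulate is as a set-merging problem: every deterministic match joins two identity fragments into the same persona. The following is a minimal sketch under that assumption, using a union-find structure. The `PersonaGraph` class, its method names, and the fragment ID strings are hypothetical illustrations, not Branch’s actual implementation.

```python
class PersonaGraph:
    """Toy persona graph: each fragment ID points at a representative persona."""

    def __init__(self):
        self._parent = {}

    def _find(self, fragment_id):
        # Register unseen fragments as their own single-member persona.
        self._parent.setdefault(fragment_id, fragment_id)
        root = fragment_id
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every fragment on the path straight at the root.
        while self._parent[fragment_id] != root:
            self._parent[fragment_id], fragment_id = root, self._parent[fragment_id]
        return root

    def add_match(self, fragment_a, fragment_b):
        """Record a deterministic match (for example, a deep link click)."""
        root_a, root_b = self._find(fragment_a), self._find(fragment_b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_persona(self, fragment_a, fragment_b):
        return self._find(fragment_a) == self._find(fragment_b)


graph = PersonaGraph()
graph.add_match("cookie:safari-123", "device:abcd")        # deep link click
graph.add_match("device:abcd", "cookie:fb-browser-456")    # second deep link click
print(graph.same_persona("cookie:safari-123", "cookie:fb-browser-456"))
```

Because matches are transitive in this sketch, two browser cookies that were each linked to the same device ID end up in one persona even though they were never observed together.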

Because of Branch’s massive, worldwide scale, we can also use machine learning to uncover connections between different personas that likely belong to the same user but haven’t yet been deterministically merged. We call these “probabilistic matches” because they’re not 100% guaranteed on each end, but they’re still useful when combined with the high degree of confidence that we get from observing other deterministic patterns.

Here’s how probabilistic matching compares to fingerprinting:

Fingerprinting. Fingerprinting has to happen in real time. In other words, it requires a guess to be made based solely on whatever data is available at the exact moment a user does something. That user might be sitting alone at home (high accuracy situation), or they might be sharing public wifi with several thousand other people while walking around a shopping mall (very low accuracy situation). With fingerprinting, the system has only two choices: 1) it can take a gamble and make the match, or 2) it can throw away the match and say no attribution happened. All of the fancy “dynamic fingerprinting” systems offered by traditional mobile attribution providers are really just trying to decide when to choose option 2.

Probabilistic matching. Because the persona graph is persistent, Branch can afford to be patient. We don’t have to play roulette in real time when the conversion event occurs; instead, we’re able to preemptively store “prob-matches” when the system detects no ambiguity (e.g., when the user is alone at home) to use later (e.g., when the user is inside a crowded shopping mall). For example, the algorithm might create a prob-match if it notices that persona A and persona B have matching fingerprints, were both active on the same IP within 60 seconds of each other, and no other activity occurred from that IP within the last day.
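The example rule in the previous paragraph can be written down as a small predicate. This is only a sketch of that one example: the `Event` type, the field names, and the reading of “no other activity” as “no activity from any third persona” are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    persona: str
    fingerprint: str
    ip: str
    timestamp: float  # seconds since epoch


WINDOW_SECONDS = 60                   # both personas active within 60 seconds
QUIET_PERIOD_SECONDS = 24 * 60 * 60   # no other activity on the IP in the last day


def should_prob_match(a, b, ip_history):
    """Return True if events a and b satisfy the example prob-match rule.

    ip_history holds the events observed from the shared IP (it may include a and b).
    """
    if a.persona == b.persona:
        return False
    if a.fingerprint != b.fingerprint or a.ip != b.ip:
        return False
    if abs(a.timestamp - b.timestamp) > WINDOW_SECONDS:
        return False
    latest = max(a.timestamp, b.timestamp)
    for event in ip_history:
        if event.persona in (a.persona, b.persona):
            continue
        # Any third persona on this IP within the last day makes the match ambiguous.
        if event.ip == a.ip and latest - QUIET_PERIOD_SECONDS <= event.timestamp <= latest:
            return False
    return True
```

The key difference from real-time fingerprinting is that this check can run whenever conditions are unambiguous, and the resulting match is stored for later use.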

When making these prob-matches between different personas, our system records a “confidence level.” This allows us to move linked personas in and out of consideration depending on the use case. For example, a “match guaranteed” deep link used for auto-login would obviously require a confidence level of 100%, but the industry expects ad installs to be matched with a confidence level usually between 50–85% (the persona graph allows Branch to hit the top end of this range without being forced to accept lower-confidence matches).

Today, Branch dynamically sets the confidence level required for each use case, but this is a configuration we could expose directly to our customers in the future.
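As a rough sketch of how a per-use-case confidence requirement could work, consider the filter below. The use-case names and the `usable_links` function are hypothetical; the 100% figure comes from the text above, and 85% is simply the top of the industry range mentioned there.

```python
# Minimum confidence required before a linked persona is considered (illustrative values).
REQUIRED_CONFIDENCE = {
    "match_guaranteed_deep_link": 1.00,
    "ad_install_attribution": 0.85,
}


def usable_links(links, use_case):
    """Keep only the linked personas that meet the threshold for this use case.

    links is a list of (persona_id, confidence) pairs, with confidence in [0, 1].
    """
    threshold = REQUIRED_CONFIDENCE[use_case]
    return [persona for persona, confidence in links if confidence >= threshold]


links = [("persona-b", 1.0), ("persona-c", 0.9), ("persona-d", 0.6)]
print(usable_links(links, "match_guaranteed_deep_link"))
print(usable_links(links, "ad_install_attribution"))
```

Storing the confidence alongside each link, rather than discarding weaker links up front, is what lets the same graph serve both strict and more tolerant use cases.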

Step 3: Scale the network

It’s impossible to just “build a persona graph” because — in the beginning — there is no reason for anyone to sign up.

Why? The value of a persona graph increases for everyone as more companies contribute to it, which means the benefit of joining an existing persona graph is enormous, but there is very little incentive to be one of the first participants in a brand new persona graph. It would be like giving up that already-flipped Concentration game for a new one where you’re playing all by yourself.

Because Branch started out by solving cross-platform user experiences, our persona graph scaled as a natural side-effect of other products that provide independent value at the same time. This approach allowed the Branch persona graph (which now covers over 50,000 companies) to reach critical mass. However, while basic deep linking was a hard problem to solve back in 2014, it is now well on the way to commoditization. Today, it would be almost impossible to get a persona graph off the ground using basic deep links, let alone ever reach a similar level of coverage.

Step 4: Use the match data

What can Branch do with these cross-platform/cross-channel/cross-device personas? Here are a few examples:

Solve attribution ambiguities. This is the obvious one, of course. The persona graph makes it possible to correctly attribute the complicated user journeys we’ve been discussing, such as when you and the other Starbucks customer were both using the same shopping app, and traditional fingerprint-based attribution methods couldn’t tell the difference.

Provide data for true multi-touch reporting. Using multi-touch modeling to better understand user activity is the Promised Land of attribution: every marketer wants it, and everyone has a different idea of what it should be. But there’s one thing everyone should agree on: multi-touch attribution is only as good as the data you feed it, and bad data compounds the problem.

The persona graph allows Branch to consolidate data from across channels and platforms. Legacy mobile attribution providers completely miss this data, which means their “multi-touch attribution” is really just “multi-ad app install attribution.”

Protect user privacy. Fingerprinting has long been a necessary evil for mobile attribution, but inaccurate measurement isn’t the only cost — when fingerprinting matches the wrong user, this also introduces user privacy issues because it means the system believes it is dealing with someone else. The persona graph allows Branch to dramatically reduce the risk of incorrect matching (we even offer a “match guaranteed” flag to enforce it), better protecting the privacy of end users.

Go beyond measurement. Attribution is only possible if the conversion happens in the first place. The persona graph allows Branch to provide the seamless cross-platform user experiences that make this more likely, improving the performance of all your marketing efforts.

For example, if a user lands on your website, even though they already have your app installed, Branch can use the persona graph to detect this and show that user the option to seamlessly switch over to the same content inside your app, where they’re much more likely to complete a purchase.

Comparing persona graph attribution with previous-generation alternatives

To wrap up, let’s revisit the three core tasks of an attribution system, and compare the capabilities of a persona graph-based platform with the traditional alternatives.

1. Capture interactions

Mobile attribution providers started with ads, and have struggled ever since to retrofit their systems in a way that accommodates other channels.

A persona graph supports ads, and also email, web, social, search, offline, and more.

2. Count conversions

Mobile attribution providers are optimized to capture app install events, and aren’t set up to handle non-install conversions that happen on other platforms. Many of them are now rushing to figure out how to perform basic web measurement, a problem that was solved years before apps entered the picture.

A persona graph can attribute app installs, and also captures other down-funnel conversions on websites, desktop apps, OTT devices, and more.

3. Link conversions back to interactions that drove them

As described in Building Attribution 2.0, mobile attribution providers have two matching methods available: they default to device IDs, and fall back on fingerprinting.

A persona graph-powered system can also use device IDs for single-platform user journeys (app-to-app), and has device ID <> web cookie pairs for cross-platform (web-to-app) user journeys. It may occasionally have to fall back on fingerprinting when a matched ID pair is not yet available, but this is a far less frequent situation.
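The matching order described above amounts to a simple fallback chain. The sketch below is illustrative only: the function name, the dictionary keys, and the `id_pairs` set of stored cookie and device ID pairs are hypothetical.

```python
def choose_match_method(click, conversion, id_pairs):
    """Pick the strongest available way to link a conversion back to a click.

    click and conversion are dicts that may contain "device_id" and "web_cookie".
    id_pairs is a set of (web_cookie, device_id) tuples already known to the graph.
    """
    device_id = conversion.get("device_id")

    # 1. Single-platform journey (app-to-app): the device ID matches directly.
    if device_id and click.get("device_id") == device_id:
        return "device_id"

    # 2. Cross-platform journey (web-to-app): use a stored cookie/device ID pair.
    cookie = click.get("web_cookie")
    if cookie and device_id and (cookie, device_id) in id_pairs:
        return "persona_graph_id_pair"

    # 3. No matched ID pair is available yet: fall back on fingerprinting.
    return "fingerprint"
```

A system without stored ID pairs has only steps 1 and 3, which is why every web-to-app journey lands on fingerprinting for the traditional providers.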

What comes next

Fragmentation in the digital ecosystem is a hornet’s nest that can’t be un-kicked, and the challenge of attribution between web and app is just the beginning — it’s going to get worse (just imagine what it will be like when you need to attribute between your toaster and your car!)

Attribution based on a persona graph makes it possible to handle this fragmentation, and a persona graph built on user-driven link activity is even more powerful because it leads to a virtuous circle: links are the common thread of digital marketing, which means they’ll always be the natural choice for every channel, platform, and device. These links help build the persona graph, and the result is increased ROI, comprehensive measurement everywhere, and more reliable links.

No other platform-specific attribution solution is even in the same league.

At Branch, we see attribution as one part of a holistic solution that provides far more than app install measurement. Our true mission is to solve the problem of content discovery in the modern digital ecosystem. Deep linking was one critical part of this mission. Fixing attribution is another. But the real win is yet to come…stay tuned!

Appendix: FAQ & Objections

What if device manufacturers try to limit the persona graph?

Device manufacturers have a duty to protect their users. They also need to ensure their ecosystems allow companies to be commercially viable. A privacy-conscious, third-party persona graph is an excellent fit for both of these requirements.

Branch works closely with a number of device manufacturers. They are aware of our platform, and supportive of the solution we’ve built.

Doesn’t a persona graph allow companies to steal their competitors’ proprietary data?

No, it does not, because the only data available via a persona graph is knowledge of the connection itself. Not where or how the connection was made, or by which company’s end user. A healthy persona graph contains thousands of participants, ensuring no single company is disproportionately represented, and to survive, a persona graph must guarantee that it will never allow any company to access data it hasn’t independently earned.

Persona graphs sound problematic for user privacy…

A persona graph makes it possible to recognize a given user in different places, but it does not tell you anything about WHO that user is. And like cookies or device IDs, the connection is resettable on demand.

Branch feels so strongly about user privacy that we’ve adopted the Branch Guiding Privacy Principles. Here they are in full:

We limit the data we collect. We practice data minimization, which means that we avoid collecting or storing information that we don’t need to provide our services. The personal data that we collect is limited to data like advertising identifiers, IP address, and information derived from resettable cookies (the full list is below in our privacy policy). We do not collect or store information such as names, email addresses, physical addresses, or SSNs. Nor do we want to. In fact, our Terms & Conditions prohibit our customers from sharing with Branch any kind of sensitive end-user information. We will collect phone numbers if a customer uses our Text-Me-the-App feature — but in that case, we will collect and process end-user phone numbers solely to enable the text message, and will delete it within 7 days afterward.

We will only provide you with data about actual end-user activity on your apps or websites. Our customers can only access “earned” cookies or identifiers. This means that an end user must visit a customer’s site before our customer can see the cookie; and an end user must download a customer’s app in order for Branch to collect the end user’s advertising identifier for that customer. In short, the Branch services benefit customers who already have seen an end user across their platforms and want to understand the relationship between those web visits and app sessions.
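To make the idea of “earned” identifiers concrete, here is a minimal sketch of an access check. The `EarnedIdentifierStore` class and its methods are hypothetical and only illustrate the rule that a customer can see an identifier after it has appeared on that customer’s own properties.

```python
from collections import defaultdict


class EarnedIdentifierStore:
    """Tracks which identifiers each customer has seen on its own apps or sites."""

    def __init__(self):
        self._seen = defaultdict(set)

    def record_visit(self, customer, identifier):
        # Called when an end user visits this customer's site or opens its app.
        self._seen[customer].add(identifier)

    def visible_to(self, customer, identifiers):
        """Filter a list of identifiers down to the ones this customer has earned."""
        earned = self._seen.get(customer, set())
        return [identifier for identifier in identifiers if identifier in earned]


store = EarnedIdentifierStore()
store.record_visit("customer-a", "cookie-1")
store.record_visit("customer-b", "cookie-2")
print(store.visible_to("customer-a", ["cookie-1", "cookie-2"]))
```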

We do not rent or sell personal data. No Branch customer can access another Branch customer’s end-user data. And we are not in the business of renting or selling any customer’s end-user data to anyone else. To enable customers to control their end-user personal data, they can request deletion of that data at any time, whether in bulk or for a specific end user. These controls are available to customers worldwide, although we designed them to comply with GDPR requirements as well.

How is a persona graph different from “identity resolution” or “people-based marketing” products?

While these products may have similar-sounding names and seem comparable on the surface, they are very different underneath. Here are three major contrasts:

How they are built. The data for these products is typically purchased in bulk from third-parties and then aggregated into profiles. The Branch persona graph is built from directly-observed user activity, and does not incorporate any personal data acquired from external sources.

What they contain. The user profiles available via these products typically contain sensitive personal data like name, email address, age, gender, shopping preferences, and so on. The Branch persona graph contains only anonymized, cross-platform identifier matches, and has no use for sensitive personal data — we don’t even accept it from customers.

How they are used. A major use case for these products is selling audiences for retargeting ads. This is a fundamentally different objective than the accurate measurement and seamless user experiences that Branch exists to provide.

What about fraud?

Fraud is a never-ending game of cat-and-mouse: as long as there is value changing hands (the literal definition of an ad), fraud can never be truly solved because savvy fraudsters will always find a way through.

The realistic objective of a mobile attribution provider is to block “stupid fraud,” and make fraud hard enough that fraudsters will go somewhere else. The best way to do this is by weeding out anything that doesn’t reflect a realistic human activity pattern. A persona graph has vastly more sophisticated data to use for this assessment than any single-channel, single-platform system.

What about when the persona graph doesn’t cover a user?

Even with a network the size of Branch’s, there are still situations when the persona graph isn’t available. A few examples: the first time a new device is seen, browser cookie resets, ITP in iOS, etc.

In those situations, the system has to fall back on the next-best matching technique available. In Branch’s case, this is still as good as (and usually better than) what is available via legacy attribution providers.

What about cross-device attribution?

Cross-device is a surprisingly complicated problem. In theory, the persona graph can connect data across devices, just like it does across channels and platforms.

Some mobile attribution providers have recently begun leaning on cross-device tracking as their entry into “people-based attribution.” Essentially, they merge activity based on a customer-supplied identifier such as an email address or username — if you sign in with the same ID on two devices, then they consider these to belong to the same person for attribution purposes.

This sounds logical on the surface, and it works for these providers because they’re still approaching measurement from a siloed, one-app-at-a-time perspective. Branch already does similar cross-device conversion merging based on user IDs on an app-by-app basis, in addition to the persona graph.

Here’s where things get complicated for cross-device as part of a persona graph:

It’s fairly rational to assume the majority of activity on a single mobile device is from a single human. Sure, people let their friends make a phone call, or check the status of a flight, and this has the potential to muddy attribution data somewhat, but the impact is pretty limited. However, if a user lets a friend sign into their email account on a laptop to print a flight confirmation, and the attribution provider then uses that as the basis to merge identity fragments across the entire persona graph network, the cascading effects could lead to massive unintended consequences.

Our customers ask us about cross-device attribution regularly, and our research team has made good progress. We feel data integrity is the most valuable thing we can offer, so we haven’t rushed because we want to make sure we get this right.

Why are deep links so important?

Some legacy mobile attribution providers feel that deep links aren’t critical to attribution. And from a certain perspective, they’re right: it’s perfectly possible to be a bean counter without also being a knowledgeable guide. At Branch, we feel this is an extremely shortsighted perspective, because the ongoing fragmentation of our digital ecosystem means that without working links, eventually there will be nothing left to measure.

Let’s illustrate this with an example from the offline world:

Imagine a billboard for your local car dealership. Driving down the highway, on the way home from the grocery store, you see this billboard advertising the newest plug-in hybrid. You don’t really need a new car, but your old one has been leaking oil all over the garage floor for months and the Check Engine light came on last week, so you decide (on the spur of the moment) that you want to stop in for a test drive.

You’re excited. You can almost smell “new car” already, and you’re all set to take the highway exit for the dealership…but the off-ramp is blocked by a big orange sign: “Closed for Construction.” You’d have to go five minutes further up the highway to the next exit, and then spend ten minutes figuring out how to drive back on local roads. And besides, that milk in the back seat is going to spoil if you leave it in the sun. You give up and go home.

A week later, you happen to be driving by the dealership again. The highway exit has reopened, and that new car smell has been following you around everywhere for the last few days. But the billboard is now advertising your local bank, and that ad you saw a week earlier has completely faded from your memory. When the salesperson asks, “What caused you to come by today?”, you say, “Oh, I just happened to be in the neighborhood.”

Now, the dealership has two problems:

You might never have come back after the first broken journey. You might even have gone to another dealership instead, because all new cars smell pretty much the same.

The dealership has no idea that the billboard is the real reason behind your visit. Because you don’t even remember yourself. If you end up buying, the billboard was a worthwhile investment…but they’ll never know this, because the highway construction interrupted your journey and broke the dealership’s attribution loop.

It’s not much of a stretch to replace “car” with “app,” “billboard” with “install ad,” and “highway exit” with “link.”

The reality is that in the digital world today, links are the customer journey. If your links don’t work, then even the best measurement tool in the world can’t help you attribute conversions that never happened.

Bottom line: if you find an attribution system that claims to provide measurement without also solving for links that work in every situation (and proving with verifiable data that their links don’t break), be very, very skeptical. It’s likely you’re dealing with a legacy system that hasn’t adapted to changes that have happened in the ecosystem over the last few years.

What if another company creates a persona graph?

This is always a possibility, but due to the nature of network effects, it would be extremely challenging for any other company to reach the critical mass necessary to compete with the Branch persona graph.

Because Branch started out by solving cross-platform user experiences, our persona graph scaled as a natural side-effect of other products that provide independent value at the same time. This approach allowed the Branch persona graph (which now covers over 50,000 companies) to reach critical mass. However, while basic deep linking was a hard problem to solve back in 2014, it is now well on the way to commoditization. Today, it would be almost impossible to get a persona graph off the ground using basic deep links, let alone ever reach a similar level of coverage.

What about Self-Attributing Networks (SANs)?

SANs like Facebook, Google, Twitter, and so on hook their walled gardens into the ecosystem via the device ID. The difference is that instead of allowing attribution providers to observe all of the user’s interactions, the SAN just responds with a Yes or No when asked the question “hey, this device ID just did something…did you see that user in the last X days?”
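The Yes/No exchange can be modeled in a few lines. This is a toy sketch of the interaction pattern only; the `SelfAttributingNetwork` class and its methods are hypothetical and do not correspond to any real network’s API.

```python
from datetime import datetime, timedelta


class SelfAttributingNetwork:
    """Toy SAN: remembers the last interaction per device ID and answers Yes or No."""

    def __init__(self):
        self._last_seen = {}

    def record_interaction(self, device_id, when):
        previous = self._last_seen.get(device_id)
        if previous is None or when > previous:
            self._last_seen[device_id] = when

    def saw_device(self, device_id, now, lookback_days):
        """Answer: did this device ID interact within the last lookback_days days?"""
        last = self._last_seen.get(device_id)
        if last is None:
            return False
        return timedelta(0) <= now - last <= timedelta(days=lookback_days)


san = SelfAttributingNetwork()
san.record_interaction("device-1", datetime(2019, 1, 1))
print(san.saw_device("device-1", datetime(2019, 1, 5), lookback_days=7))
```

Note that the caller learns nothing beyond the boolean answer, which is the “black box” trade-off described in the next paragraph.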

The SAN approach has advantages (fraud is an almost non-existent problem) and disadvantages (it’s a black box that provides very little visibility), but it’s a reality of the ad ecosystem.

Since most walled gardens already connect their users across platforms through a user ID or email address, there’s no reason why the SAN can’t start reporting activities by that user on other devices/platforms. This sort of connection gets incorporated into the persona graph automatically through the associated device ID.

What about Limit Ad Tracking (LAT) on iOS?

When LAT is enabled, iOS sends the IDFA as a string of zeros. Currently, it appears that around 20% of iOS users have this setting enabled. Without an IDFA, Branch is unable to connect that user to the persona graph, but we are still able to perform attribution via fingerprinting or the IDFV (an alternative device ID that is available even with LAT enabled, but scoped to a single app/vendor).
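A sketch of how an SDK might handle this case is shown below. The function name and return shape are hypothetical; the only fact relied on is that a zeroed IDFA arrives as an all-zero UUID string.

```python
ZEROED_IDFA = "00000000-0000-0000-0000-000000000000"


def pick_identifier(idfa, idfv):
    """Choose which identifier to use for attribution on iOS.

    Returns a (kind, value) tuple. Only the "idfa" kind can be connected to the
    persona graph; "idfv" is scoped to a single app/vendor.
    """
    if idfa and idfa != ZEROED_IDFA:
        return ("idfa", idfa)
    if idfv:
        return ("idfv", idfv)
    # Neither identifier is usable, so fingerprinting is the last resort.
    return ("fingerprint", None)
```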

Branch is a mobile linking platform providing unified mobile experiences and measurement for more than 50,000 mobile apps, including Airbnb, Pinterest, BuzzFeed, Tinder, Foursquare, Yelp, and Sephora. Branch’s linking platform can help you grow your mobile app through features like deep linking, sharing, referrals, mobile banners and interstitials, custom app onboarding, and unified attribution across platforms and channels. Learn more about Branch or contact sales today.