The Data Integration Imperative
The Guest
Eric Dodds is the Head of Product Marketing at RudderStack, a Warehouse Native Customer Data Platform built to help data teams at brands such as Crate & Barrel, Stripe, Anheuser-Busch, and Priceline deliver value across the entire data lifecycle, from collection to unification and activation. Eric has been a founder, a CMO, a strategist, and an account manager. He also hosts his own podcast, The Data Stack Show, and he’s here today to talk about data integration.
Common mistakes in big data
Our interview with Eric began with an overview of common big data mistakes he sees with marketers and data teams.
“Maybe this is a little bit of a technical way to describe it,” begins Eric, “but the biggest mistake I see is companies not abstracting data collection/transformation from the tools where it’s used in a business context. The challenge I see is that when you rely on a tool as your main source of truth, and you don’t abstract information from the data layer, you’re beholden to the tool’s data, API, and models - so you end up working with their limitations.”
Marketing platforms make data abstraction difficult
Eric goes on to describe how Salesforce, one of the largest CRM platforms in the world, makes abstraction particularly cumbersome for marketers, sales teams, and businesses.
“Salesforce is a Frankenstein of a nightmare when it comes to data abstraction,” explains Eric. “Look at two pain-point scenarios. 1) Subscription business: customers are on recurring payments. If I wanted to find out the lifetime value of a customer, representing that in Salesforce is difficult. You’d need a sum total of value, plans, and try to define the relationship and value it. So you have to create different objects for each customer and it gets complicated. 2) User actions: let’s say there’s an important part of your customer journey, such as content, webinars, or downloads. Mapping touchpoints in a journey within Salesforce is very hard. If you want to build a churn model, you can’t do that through Salesforce. If you aren’t able to abstract data, you have limited ability to create the analytics you need to understand marketing efforts.”
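To make the subscription scenario concrete, here is a minimal sketch of how lifetime value could be computed once recurring payments live in your own data layer rather than inside a CRM. The record layout (`customer_id`, `paid_on`, `amount`) and the sample rows are assumptions for illustration, not a schema from the interview or from any vendor.

```python
from collections import defaultdict
from datetime import date

# Hypothetical payment records, as they might land in a warehouse table.
payments = [
    {"customer_id": "c1", "paid_on": date(2024, 1, 1), "amount": 50.0},
    {"customer_id": "c1", "paid_on": date(2024, 2, 1), "amount": 50.0},
    {"customer_id": "c1", "paid_on": date(2024, 3, 1), "amount": 50.0},
    {"customer_id": "c2", "paid_on": date(2024, 2, 15), "amount": 20.0},
]


def lifetime_value(rows):
    """Sum every payment per customer to get a simple lifetime value."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer_id"]] += row["amount"]
    return dict(totals)


ltv = lifetime_value(payments)
```

Because the payments are plain rows, the same table could also feed other questions (months active, time since last payment) without defining new objects per customer.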
The difference between structured data and event data
Eric then turns to the importance of data classification in abstraction, and talks about the differences between structured data (highly specific to platforms and stored in a predefined format) and event data (the connective tissue describing the provenance of structured data).
“Dealing with event data that is chronological is a lot different than dealing with structured data that describes user attributes,” says Eric. “Then you end up with custom fields that don’t help - ETL is one source to one destination, and generally with event data, you need a one-to-many integration functionality. You want to send event data to the warehouse, showing up as tables but structured entirely differently than platform data. So you have a table of events (form submissions, calls, clicks, page views) in strict chronological order, and this can help you get a full sense of the business intelligence in the data layer.”
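The one-to-many idea can be sketched in a few lines: each event is collected once and then delivered to every registered destination, with the warehouse copy kept as a chronological table. Everything named here (`Event`, `EventRouter`, the two in-memory destinations) is hypothetical and only illustrates the pattern; it is not RudderStack's API.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable


@dataclass(frozen=True)
class Event:
    """One user action, such as a page view or a form submission."""
    user_id: str
    name: str
    timestamp: datetime
    properties: dict = field(default_factory=dict)


Destination = Callable[[Event], None]


class EventRouter:
    """Collect an event once, then fan it out to every destination."""

    def __init__(self) -> None:
        self._destinations: list[Destination] = []

    def add_destination(self, destination: Destination) -> None:
        self._destinations.append(destination)

    def track(self, event: Event) -> None:
        for destination in self._destinations:
            destination(event)


# Two stand-in destinations: a warehouse table and a downstream tool.
warehouse_events: list[Event] = []
crm_updates: list[tuple] = []

router = EventRouter()
router.add_destination(warehouse_events.append)
router.add_destination(lambda e: crm_updates.append((e.user_id, e.name)))

# Events are tracked out of order here to show the sort step matters.
router.track(Event("u1", "form_submission", datetime(2024, 1, 1, 9, 5)))
router.track(Event("u1", "page_view", datetime(2024, 1, 1, 9, 0)))

# The warehouse copy is read back in strict chronological order.
events_table = sorted(warehouse_events, key=lambda e: e.timestamp)
```

The collection code never references a specific tool, so adding or swapping a destination does not change how events are captured.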
For a full rundown on data integration imperatives, why data maturity has nothing to do with size, how to avoid VLOOKUP hell, and whether Eric prefers a house full of kittens to a house full of puppies, listen to the full episode at the links provided!
The Links
https://www.rudderstack.com/
RudderStack Data Maturity Guide
LISTEN TO THE FULL SHOW -> Stay tuned, stay curious and subscribe to What Gets Measured on Apple Podcasts, Spotify, YouTube or add it as a Favorite on your podcast player of choice.