I had a conversation recently with Brian Lent, founder, chairman, and CTO of Medio. If you don’t know Brian, he has worked with companies such as Google and Amazon to build and hone their algorithms and is currently taking predictive analytics to mobile engagement. The perspective he brings as a data scientist has ramifications not only for big data analytics but also for how we architect our master data and ensure its quality.
We discussed big data analytics in the context of behavior and engagement. Think shopping carts and search. At the core, analytics is about the “closed loop.” It is, as Brian says, a rinse and repeat cycle. You gain insight for relevant engagement with a customer, you engage, then you take the results of that engagement and put them back into the analysis.
Sounds simple, but think about what that means for data management. Brian provided two principles:
  • Context is more important than source.
  • You need to know the customer.
To get the most from your data, it is commonly accepted that you need to introduce flexibility. Often this translates into more processing power, whether in servers or in RAM. The database structure changes too, shedding the model and moving to a columnar store or a distributed file system. However, the conversation is still grounded in the data source. This is the fallacy.
The beauty and strength of the big data systems that originated with companies like Google, Amazon, and Netflix is not how the data was stored or where processing occurred. It is the ability to place context first. Brian illustrated this with search: it was not caring about search terms (structured and catalogued) that made Google successful, but understanding content and navigation. This shift means continuously asking and answering “why?” and “what for?”, then using that insight to influence again and again. To make this happen, analytics and semantic technology need to work together and live as a service, influencing the information fabric rather than sculpting a persistent state in a data source.
This brings us to the second principle: know the customer. The insight gained from behavior and events is interesting in broad terms. Those patterns provide strategic insight. But to truly influence, you need strong master data. The difference today is that the customer master is constantly redefined in a “closed-loop” system. Context from an ever-increasing array of channels, whether through direct engagement or behavior in the market, is a continuous force. Just as with data sources, data quality and master data management tools have to shed strict enforcement and sculpted standards. They have to continuously take these inputs, learn the new customer master definition, and be ready to apply the new profiles in the next cycle.
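To make the idea of a continuously redefined customer master a little more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not something Brian or Medio described: the `CustomerProfile` class, the `apply_events` and `next_best_context` functions, the `(channel, topic, responded)` event shape, and the `decay` parameter are all hypothetical. The point is that each cycle folds engagement results back into the profile, keyed by context rather than by source system.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CustomerProfile:
    """Hypothetical customer master record whose attributes are learned, not fixed."""
    customer_id: str
    # Interest weights keyed by (channel, topic) context, not by source system.
    interests: dict = field(default_factory=lambda: defaultdict(float))


def apply_events(profile, events, decay=0.5):
    """Fold one cycle of engagement results back into the profile.

    Existing weights are decayed first so that recent context counts for more.
    `decay` is an illustrative value, not a recommendation.
    """
    for key in list(profile.interests):
        profile.interests[key] *= decay
    for channel, topic, responded in events:
        if responded:
            profile.interests[(channel, topic)] += 1.0
    return profile


def next_best_context(profile):
    """Pick the strongest (channel, topic) pair to use in the next engagement."""
    if not profile.interests:
        return None
    return max(profile.interests, key=profile.interests.get)


# Two passes through the loop: engage, observe, feed the results back in.
profile = CustomerProfile("c-001")
apply_events(profile, [("mobile", "shoes", True), ("email", "shoes", False)])
apply_events(profile, [("web", "jackets", True), ("web", "jackets", True)])
print(next_best_context(profile))  # ('web', 'jackets')
```

Nothing here enforces a fixed schema for what a customer "is". The profile is simply whatever the most recent cycles of context say it is, which is the spirit of the closed loop, even if a real system would use far richer models than a decayed counter.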
When you consider data quality and master data management within a big data and analytics strategy, remember two things: put context over catalogue, and converge analytics with processing. This will ensure insight patterns stay connected with master data.