March 1, 2009
I had an amazing client experience the other day. I had searched long and hard for a client with a flawless, perfect, 100% efficient and effective BI environment and applications. My criteria were tough, and that's why it took me so long (I've been searching for as long as I've been in the BI business, almost 30 years). These applications had to be plug & play, involve little or no manual setup, be 100% automated, incorporate all relevant data and content, and allow all end users to self-serve every single BI requirement. Imagine my utter and absolute amazement when I finally stumbled on one.
The most remarkable part was that this was a very typical large enterprise. It had grown over many years through multiple acquisitions, and as a result had many separate and disconnected front- and back-office applications running on various platforms and architectures. Its senior management suffered from a typical myopic attitude, mostly based on immediate gratification, caused by a compensation structure that rewarded only immediate tangible results and did not put significant weight on long-term goals and plans. Sound familiar? If you haven't worked for one of these enterprises, the color of the sky in your world is probably purple.
But while the enterprise suffered from all the typical ills of its peers, it hit the jackpot with BI. Here's what I saw:
- All data and content repositories, regardless of where they were located, were constantly crawled and inspected by special "agent" programs. These programs extracted all entities and relationships between entities from structured databases, semi-structured content repositories, and unstructured content from all over the enterprise.
- Every time the company acquired a new business, or needed to bring a new application (especially a legacy one, typically poorly documented) into its BI mix, all they had to do was put a special "sniffer" device on the new network. Within a few hours, these sniffer devices interpreted all of the IP packets from the network, reverse engineered application data and process models, and constructed a nice, neat repository of all business transactions, with each component described by rich metadata.
- All of the entities extracted by agent programs and business transactions constructed by sniffer devices were put on a service bus and sent out to recipients across the enterprise.
- One recipient was a special hybrid index/database which, with no manual intervention and no pre-built data models, simply indexed, cross-referenced and stored all entities, including many-to-many relationships between all entities. Each data element was also automatically tagged with a "data confidence" index, based on whether the data came from a trustworthy source like a controlled application database, or from a spreadsheet on someone's C: drive.
- The portions of the index/database that contained mission-critical and/or highly sensitive and proprietary information were stored on servers inside the corporate firewall. All other segments were stored in the cloud for lower maintenance costs and ease of deployment. A data virtualization layer made the physical storage location irrelevant – all data seemed to come from the same logical location, as far as the analytical applications and users were concerned.
- Other recipients of these entities and transactions were in-memory caches or streaming databases that applied business rules to the data (such as correlating events, identifying patterns, outliers and other conditions, and acting on them) in real time.
- More recipients of the entities and transactions were predictive modeling applications, which correlated past and current events, applied models and instantaneously constructed various predictive indicators – more entities.
- All rules and process workflows (business rules, data transformation rules, data quality rules, metrics calculations, predictive models, etc.) were created and managed in a single application, but automatically applied and executed in the right locations. Such locations could be business rules engines, ETL applications, BPM applications, data cleansing and profiling applications, predictive models, etc. The rule and process building application was agnostic as to where the rules and processes were physically executed.
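To make the "data confidence" index from the list above a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the source categories, the numeric scores, and the names `SOURCE_CONFIDENCE`, `TaggedElement` and `tag_element` are hypothetical, since all the dream told me was that a controlled application database rated higher than a spreadsheet on someone's C: drive.

```python
from dataclasses import dataclass

# Hypothetical source categories and scores (assumptions, not from any product).
# The only ordering taken from the text: controlled database > local spreadsheet.
SOURCE_CONFIDENCE = {
    "controlled_app_db": 0.9,
    "content_repository": 0.6,
    "local_spreadsheet": 0.2,
}
DEFAULT_CONFIDENCE = 0.1  # assumed score for sources we know nothing about


@dataclass(frozen=True)
class TaggedElement:
    """A data element together with its provenance and confidence score."""
    name: str
    value: object
    source_type: str
    confidence: float


def tag_element(name, value, source_type):
    """Attach a confidence score to a data element based on where it came from."""
    confidence = SOURCE_CONFIDENCE.get(source_type, DEFAULT_CONFIDENCE)
    return TaggedElement(name, value, source_type, confidence)
```

With a tag like this attached at load time, an analytical application could, for example, prefer the higher-confidence value when two sources disagree about the same element.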
Other than the application that built the rules, all steps described so far were completely automated, with setup and execution proceeding with little or no manual intervention. Sounds too good to be true? But wait!
- Now that all information was in one place, with every single entity cross-referenced and related to every other entity, and described by rich metadata, any user was able to self-serve all of their BI needs with little or no involvement from IT. End users who knew exactly what they were looking for pulled data into in-memory databases and constructed models (with hierarchies and aggregates) on the fly. The other use cases involved so-called guided searches, where users did not have specific questions they were looking to answer, but rather needed to explore all relevant information that was available to them.
- The queries and analyses that these users ran automatically updated the underlying index/database with new entity-to-entity relationships, indices, aggregates and other optimizations that further enhanced similar future analyses. The index/database also correlated specific business questions asked by the users with the actual queries and models and proactively suggested – via hints – similar approaches to new users just beginning their data explorations.
- While most of the structured processes and rules were built by the application described above, the environment also constantly analyzed ad-hoc processes and automatically converted them to structured workflows, if necessary. For example, when an end user discovered a previously unknown data pattern or condition, he/she may have posted a question about what the condition meant and how to act on it to a collaborative web site, such as a blog. Coworkers who recognized the data condition suggested certain actions (such as updating an operational application, sending an email to a subject matter expert, etc.), which the original user then acted upon. The collaborative BI application noticed that this ad-hoc process repeated a few times and automatically created a structured workflow with specific rules on what the steps were and who had to act on the newly discovered pattern. The next time a user discovered a similar data pattern, he/she was presented with an option to execute a pre-built workflow in order to act on that information.
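The last item, promoting a repeated ad-hoc process to a structured workflow, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `AdHocProcessTracker`, its methods, and the threshold of three repetitions are all hypothetical (the environment I saw only needed the process to repeat "a few times"), and a real system would have to match similar patterns rather than identical ones.

```python
from collections import Counter

PROMOTION_THRESHOLD = 3  # assumed stand-in for "repeated a few times"


class AdHocProcessTracker:
    """Counts repeated (pattern, actions) pairs and promotes them to workflows."""

    def __init__(self, threshold=PROMOTION_THRESHOLD):
        self.threshold = threshold
        self.seen = Counter()   # (pattern, actions) -> number of occurrences
        self.workflows = {}     # pattern -> list of actions in the workflow

    def record(self, pattern, actions):
        """Record one ad-hoc handling of a data pattern by an end user."""
        key = (pattern, tuple(actions))
        self.seen[key] += 1
        # Promote once the same response has been observed often enough.
        if self.seen[key] >= self.threshold and pattern not in self.workflows:
            self.workflows[pattern] = list(actions)

    def suggest(self, pattern):
        """Return the pre-built workflow for a pattern, or None if there is none."""
        return self.workflows.get(pattern)
```

A user who hits a known pattern would call `suggest` first and be offered the stored steps; if it returns `None`, the user is back to asking coworkers on the blog, and that exchange gets passed to `record`.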
I was speechless. I was completely stunned that such a BI environment actually existed, even though I was a little worried that once such applications became ubiquitous, most BI practitioners, such as myself, would have to start looking for other jobs.
And then I woke up, and realized I had just fallen asleep after working for many hours on several research papers that would help Forrester clients (vendors and IT) achieve such a BI nirvana state sometime in the future. And I felt comfortable that our BI jobs were safe for now.