We naturally use external data when making informed decisions, and our applications need them to deliver more relevant results. Exogenous data are central to these new practices, but organizations need a way to ingest and leverage them.
Exogenous, external, third-party… whatever you decide to call them, data coming from the other side of the wall (not a wall as in Game of Thrones, but the wall marking the boundary of the company’s IT assets) are now essential. Organizations will gradually come to use exogenous data just as naturally as individuals do when making informed decisions. This mix of internal and external data is invaluable for avoiding a blinkered view and opening up a wealth of new possibilities.
It is clearly difficult to adjust or sustain predictive models without this type of data mix. On a building site, poor weather conditions can turn roads into a swamp, which slows down truck movements and increases energy consumption. In a movie theater, customer reviews, school holidays and obviously the weather (again!) can influence box office sales. We could provide countless examples, but they would all point towards the same challenges that need to be addressed in order to get the best results out of exogenous data.
#1 Incorporate data with a focus on speed
Exogenous data are intrinsically scattered, fragmented and technically varied. What they can contribute to the organization remains theoretical until it has been tested. That is why organizations must be able to incorporate exogenous data quickly, keep costs under control, and run tests with short turnaround times. The keyword here is agility.
Quickly setting up a test environment, capturing data through whatever methods are available (file import, API, etc.), cleansing the data for smooth ingestion, and storing them in a model suited to their type for subsequent use: in a conventional environment, where no application can be deployed without first issuing detailed specifications, stacking up these tasks soon creates a tunnel effect. Only an Analytics as a Service approach can shorten the timeline and quickly provide the means to detect correlations. Such approaches are driven by agile methods and by highly automated Modern Analytics platforms, of which ForePaaS is a perfect illustration. The aim is to spare users the complexity of assembling the stack (the required software components) so that they can concentrate on the essentials: interpreting the data for business insights.
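The capture–cleanse–store sequence can be sketched in a few lines. This is a minimal illustration, not a ForePaaS feature: the JSON payload stands in for a real API response, and the field names are invented.

```python
import json
import sqlite3

# Hypothetical API payload: daily weather readings for two building sites.
# In practice this would come from an HTTP call or a file import.
raw_payload = json.dumps([
    {"date": "2024-03-01", "site": "A", "rain_mm": "12.5"},
    {"date": "2024-03-02", "site": "A", "rain_mm": None},  # missing reading
    {"date": "2024-03-02", "site": "B", "rain_mm": "3.0"},
])

def cleanse(records):
    """Drop incomplete rows and coerce types before ingestion."""
    clean = []
    for r in records:
        if r.get("rain_mm") is None:
            continue  # discard rows with missing measurements
        clean.append((r["date"], r["site"], float(r["rain_mm"])))
    return clean

# Store in a model matching the data type (here, a simple relational table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather (date TEXT, site TEXT, rain_mm REAL)")
rows = cleanse(json.loads(raw_payload))
conn.executemany("INSERT INTO weather VALUES (?, ?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM weather").fetchone()[0]
```

The point of the sketch is the turnaround: each step is disposable and testable on its own, which is what makes short iteration cycles possible.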
#2 Disseminate exogenous data
How exactly do you determine the value of the contribution made by exogenous data? The answer is simple: by making sure they actually circulate. A single point of view is rarely enough when you do not yet know exactly what you are looking for, especially since the relevance of a given data set may only be revealed at the local level. A mass retail company looking to use new external data to improve the performance of its different sections will struggle if it chooses to ignore feedback from its section managers…
Experience shows that unexpected uses emerge on the front lines. To be exploited effectively, data must be able to circulate throughout the company. Decentralizing data should not be treated as a simple afterthought: in practice, it is an integral part of any Analytics as a Service strategy.
Such good intentions can, however, clash with the very nature of Modern Analytics platforms whose billing is based on the number of users. That is one of the reasons we have not chosen this model: it goes against the natural grain, since involving the right users in a data project increases its value. The idea is to foster, not hinder, that synergy.
#3 Transform data into a service
External data, and especially open data, should not be regarded as a “gift”. They are a service rendered, one that calls for a service in return. The aim is not only to be transparent about where data come from, but also to make a commitment to the value of the resulting service.
An insurance company collecting healthcare data from its policyholders has to explain the rationale behind its project and specify which new service (for the customer) the data will support, not only for regulatory reasons but also to obtain customer consent that is worth the paper it is written on. Explaining why data are being collected is also the best guarantee of obtaining high-quality data. The critical question, in both B2B and B2C markets, is how to return data as a service to the people who submit their information.
As we have seen, the key to a successful transformation lies in the ability to incorporate and correlate information with other data sets, and to display it or easily embed it in applications. This is the natural evolution of data applications: enriching the mix of internal and external data over time to produce more relevant, meaningful information. There is no need to gaze into a crystal ball to see that telling the two types apart will become more and more difficult. In the age of Analytics as a Service, the dividing line between data will become increasingly blurred.