Ever-increasing business competition in every industry sector, and the resulting efforts to find new ways to stay ahead of the game, have led, among other things, to a growing demand for data scientists.
But what exactly are they expected to do that can’t already be done by the existing business intelligence (BI) folk? Are data scientists going to replace conventional BI practitioners? The answer is easy: no, they will not. Yet data scientists do bring something new, even though today’s technology already enables BI professionals and users to query and report on data in just about any way imaginable.

In virtually every modern enterprise, large or small, data is captured and stored in several databases. ETL tools enable this data to be massaged, manipulated, moved, merged, transformed and integrated as needed. Database and BI professionals work together to identify the data elements needed and how to analyse them across a multitude of data stores using data warehousing techniques and powerful OLAP tools. Big Data stores may also exist alongside conventional warehouses, and these too are tapped in similar ways, although with a different set of technologies, tools and techniques. On the surface, the approach used by BI professionals may appear similar to that used by data scientists.

It all starts with the business user asking a question or wanting to find out something that does not readily appear in available reports, but has to be gleaned as an insight by creatively bringing together and querying the right types of data in bespoke ways, and presenting the results either in tabular formats or as graphical visualizations. The BI technologists figure out how to produce the results given all the data available. They look at where the data is stored, and how it is stored, and use the right tools at their disposal to extract it, clean and integrate it, and then query and analyse it. The business user has to remain closely involved at all stages to provide direction and interpretation while the BI professionals and other data technologists apply their technical skills.
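The extract, clean, integrate and query steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two "source systems", the table and column names, and the figures are all hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# Hypothetical raw extracts from two separate source systems.
raw_customers = [(1, " north "), (2, "South"), (3, "NORTH")]   # (id, region)
raw_orders = [(1, 120.0), (2, 80.0), (3, None), (1, 40.0)]     # (customer_id, amount)


def clean_customers(rows):
    """Normalise region names so that ' north ' and 'NORTH' both become 'North'."""
    return [(cid, region.strip().title()) for cid, region in rows]


def clean_orders(rows):
    """Drop orders whose amount is missing."""
    return [(cid, amount) for cid, amount in rows if amount is not None]


def revenue_by_region(customers, orders):
    """Integrate the two cleaned extracts and answer a historical question."""
    conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    rows = conn.execute(
        """
        SELECT c.region, SUM(o.amount)
        FROM orders AS o
        JOIN customers AS c ON c.id = o.customer_id
        GROUP BY c.region
        ORDER BY c.region
        """
    ).fetchall()
    conn.close()
    return rows


report = revenue_by_region(clean_customers(raw_customers), clean_orders(raw_orders))
for region, total in report:
    print(f"{region}: {total:.2f}")
```

The result is a tabular answer about what has already happened, which the business user then interprets.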

The data scientist goes a step further, but in a different way. A data scientist also understands what physical data stores look like, and is able to integrate data, query it and extract meaning from it, either at a prototype level or for a full data set. They are able to do this using various tools and a knowledge of data modelling and SQL. Apart from this, they too are able to use reporting and visualization tools to present and explain their results.

The additional skill that they bring with them, and which sets them apart from BI practitioners, is a strong understanding of how to apply statistical techniques not only to produce insights about the past, but also to make predictions about the future. This is not purely a capability to apply mathematics. It also means working very closely with the business to iteratively validate both the input data sets and the interpretations of the output, and to present them in a meaningful manner. Modelling data and presenting the results requires the ability to work with languages like R, and sometimes with additional visualization tools. A data scientist is more than likely to be able to work with a number of tools and technologies for the extraction, integration and querying of data, and for the presentation of visual results. Depending on the depth of technical know-how required on the IT side, however, it is quite possible that they would need additional help from the BI professionals.
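To make the difference between describing the past and predicting the future concrete, here is a minimal sketch in Python. It assumes a hypothetical series of monthly sales figures and uses an ordinary least-squares straight line as the simplest possible stand-in for a statistical model; real predictive work would involve far more careful model selection and validation with the business.

```python
from statistics import mean

# Hypothetical monthly sales figures, oldest first.
monthly_sales = [100.0, 110.0, 120.0, 130.0]


def historical_average(values):
    """A backward-looking summary: what has sales looked like so far?"""
    return mean(values)


def fit_linear_trend(values):
    """Fit y = intercept + slope * x by ordinary least squares,
    where x is the position of each value (0, 1, 2, ...).
    Needs at least two values."""
    xs = range(len(values))
    x_bar = mean(xs)
    y_bar = mean(values)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


def forecast_next(values):
    """A forward-looking estimate: extend the fitted line one period ahead."""
    slope, intercept = fit_linear_trend(values)
    return intercept + slope * len(values)


print("Average so far:", historical_average(monthly_sales))
print("Forecast for next month:", forecast_next(monthly_sales))
```

The first function answers a reporting question, while the second makes a claim about a month for which no data exists yet, and it is only as trustworthy as the assumption that the trend continues.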

In short, if there is a question about whether the work required to provide information from data needs a BI specialist or a data scientist, the first rule of thumb is to ask whether what’s required is a prediction of the future or a historical look at the past.