Once the business question that most critically needs an answer has been framed, the next step is to determine whether it’s actually feasible to find an answer to that question.

Step 2: Think about what data might help answer those questions.

This step is important. Is it data about what customers have already bought? If so, what kind of data about those purchases, and how many kinds? At this point, the thinking should not be constrained by knowledge of what data is already available in structured enterprise databases; the aim is simply a fresh wishlist of desirable information.

The kinds of data required lead to the next train of thought: how and where the data that could provide the required answers might be found. Could it be in conversations that customers are having outside the company? Could it be found in customer support data, including conversations? Would sales data help? Thinking about the sources of data in such broad terms leads to further thinking that narrows down the source within each broad area. For example, there may be clues in emails received from customers, or in discussions going on in social media. The data may even be available within the enterprise, just not in the same department.

At this point it’s not necessary, of course, to think this through completely and produce an exhaustive list. That activity can be pursued later by the actual project team. For now, a high-level understanding is adequate, because it will be useful in determining courses of action to factor into the analytics plan.

It’s common knowledge that businesses today send and receive vast amounts of data in different forms and through various channels of communication. It is estimated that in some environments traditional structured data stores (including enterprise databases and data warehouses) contain no more than 10% of all the data that actually flows through the business. Structured data includes the types traditionally captured and stored by systems for ERP, CRM, accounting, and so on. Vast amounts of data also exist in unstructured or semi-structured forms; some of it may have been stored for various purposes, and some of it may not have been stored at all.

Unstructured data includes the content of email, voice conversations, video streams, and streams of data from devices such as sensors and scanners that may be used in sales, operations, or security functions. There could also be various types of logs, as well as clickstreams captured from websites. Other examples include location data and sentiment data obtained through analysis of internet activity and communication.
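For teams that like to capture this kind of wishlist in a lightweight, shareable form, the thinking above could be recorded as a simple inventory. The following is only a minimal sketch in Python: the `DataNeed` record, its field names, and the helper functions are hypothetical, and the sample entries (including whether each item is already stored) are illustrative placeholders drawn from the examples in this section, not findings.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataNeed:
    """One wishlist entry (hypothetical structure, for illustration only)."""
    description: str                # what information would help answer the question
    candidate_source: str           # where it might be found
    form: str                       # "structured" or "unstructured"
    already_stored: Optional[bool]  # True, False, or None when not yet known


# Illustrative entries only; the already_stored values are assumptions.
wishlist = [
    DataNeed("What customers already bought", "sales data", "structured", True),
    DataNeed("Customer support conversations", "customer support data", "unstructured", None),
    DataNeed("Emails received from customers", "email", "unstructured", None),
    DataNeed("Discussions among customers", "social media", "unstructured", False),
]


def count_by_form(needs: List[DataNeed]) -> Counter:
    """Count wishlist entries by form (structured vs. unstructured)."""
    return Counter(need.form for need in needs)


def needs_follow_up(needs: List[DataNeed]) -> List[str]:
    """List entries whose availability is unknown or known to be missing."""
    return [need.description for need in needs if need.already_stored is not True]


print(count_by_form(wishlist))
print(needs_follow_up(wishlist))
```

Even a rough list like this gives a first sense of how much of the desired information lies outside the structured stores, and which items need to be raised with IT or other departments.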

The point of gaining this early understanding of the types of data, and of the sources in which they may be found, is to form at least a starting idea of their scope. This is useful when discussing the initiative further with IT and the business, and it provides guidance for the next step of the business analytics project.