Tuesday, January 14, 2014
Data, data everywhere
“They want 78 new reports from the EDW and Big Data, which have taken more than a year and a major part of my budget to build! They already have hundreds from the transactional systems, printed on reams of paper that no one reads. All the Excel sheets they churn out from data dumps from various systems, plus a bit of external data, get into management meetings where everyone has a different number.” Looking at the CIO, I sympathized with his predicament; it was a familiar story.
We have all, at some time or other, been frustrated with endless requests for reports and data dumps from all and sundry; a lot of effort is spent analyzing the past and validating hypotheses about what worked and what did not. Requests flow like rainwater on a slope, a never-ending stream, many of them similar to requests from neighbors in the workplace who do not talk to each other. Reports get built for a casual question in a meeting, never to be used again; when another one pops up from new quarters, the effort is repeated.
We hear of associations, correlations and insights that were not possible in the past, as we did not know how to combine an apple and a pineapple to get a watermelon. Structured data was easy until we started going to multiple sources with limited commonality. Even then, with statistical models diving through seas of data, the proverbial needle could be found in the haystack. People buying napkins buy beer, not vice versa; owners of red cars have a higher propensity to be rash drivers, and so on. You could correlate anything to sunspots!
Not too long ago the need to explore unstructured data emerged, and with the social media explosion the dimensions for analysis changed. Thus Big Data began its journey to challenge the conventional way of looking at data and information. Jumping on the bandwagon, one and all hyped the term to include variants that stretched the imagination. Along came new roles that everyone thought were important for the future, Data Scientist and Chief Digital Officer to name a couple; did such a species exist, or were these just plain old profiles, glamorized?
Moving from hundreds of GB of data to thousands of GB does not make it Big Data. The amount of data being created and added to corporate storage is growing exponentially. Data types are also expanding, with technology offering ways to mine them. Dashboards and cubes work well in selected situations, but their actionability is still wishful thinking. Enterprise manager thinking has yet to evolve beyond reports from transactional systems; thus the data scientist remains a glorified report writer.
The CIO narrated his woes, which started with the Company Board approving a really large budget, along with unrealistic expectations, for the project they called BBF (Bigger Better Faster). With much fanfare the project was kicked off, many people were inducted into the team, and a few pretentious youngsters were hired to lead them to insights thus far unknown from this prestigious, first-of-its-kind-in-the-industry Big Data project. The CIO kept his reservations to himself, knowing they would not be given a kind ear.
The project team got bigger faster than anyone thought possible; the technology they bought was deemed better than what they had. Everyone loved the progress they made in the initial months. Then the reality check started, with the target audience (managers) putting across what they wanted in order to run their business better, grow bigger and reach customers faster than their competitors. Challenges with technology and data consistency appeared small compared to the change required in the mindset within.
Activity reports on social media, portal registrations, access reports, keyword searches and a few more were the peak of expectation. There was no marriage between the old and the new, as if they lived in separate worlds. What could have been remained buried somewhere, while everyone wanted better and faster transactional or tactical reports. The rich stream of data that could have been big for the business was diverted and converted into wasted effort. In the corporate world, I believe that the overwhelming data deluge is far from being tamed.
Do you know different?