EE Times-India

How to handle Big Data

Posted: 29 Jan 2015

Keywords: Big Data, analytics, visualisation, entropy

The point of this is that simply visualising the data or using tricky transformations is a very hit-and-miss approach to extracting information from noisy data. And the trickier the visualisation or the transformation becomes, the more uncertain the whole analysis becomes. Information does not just emerge from data; you have to start with a hypothesis and test your hypothesis against the data to avoid biases and artefacts. Visualisations can provide suggestive hints—that is part of the discovery process—but these are hints and clues, not foundation pillars in a supportable claim.

Information is not the same as understanding

Big data analytics make no claim to provide any understanding of what you are analysing. They will almost certainly provide trend graphs because their models need to have some predictive power to have value, but they have no idea whether they are modelling the spot price of pork bellies or shoe-buying trends in affluent neighbourhoods. This may be enough for many business applications, and can possibly guide investment decisions over the short term, but it probably isn't good enough to drive a long-term plan, and it almost certainly isn't good enough to drive decisions on technical data. The problem is that big data analytics don't reveal the underlying "gears" in the data. They see correlation only between the variables you chose to analyse.

Again, statisticians run into this all the time. If two variables track each other, you have no idea whether the relation is causal, whether it is driven by a hidden variable, or whether it is an accidental correlation in the selected sample. Of course, you can perform additional experiments to see if you can eliminate, or better understand, these possibilities, but this is ultimately a guessing game unless you have a model (a presumed understanding) of what you believe is happening. Once you have such a model, you need to test it against the data and, in particular, you need to test the predictive power of the model and its ability to inform you about new data possibilities you have not yet seen. Doing this is no longer the domain of big data analysis; this is hard-core statistical analysis, a field that is already well understood.
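
As a rough illustration of the hidden-variable case, here is a minimal Python sketch using synthetic data. Everything in it is an assumption made for the example: the variable names, the noise levels and the helper functions `pearson` and `residuals` are hypothetical and do not come from any analytics package. Two series are each driven by the same hidden factor; they correlate strongly with each other, yet the correlation largely disappears once the hidden factor is regressed out of both.

```python
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

def residuals(y, x):
    """Residuals of y after a least-squares straight-line fit on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
             / sum((u - mean_x) ** 2 for u in x))
    return [v - mean_y - slope * (u - mean_x) for u, v in zip(x, y)]

# Synthetic data (an assumption for this sketch): x and y never influence
# each other; both simply follow the same hidden factor plus independent noise.
rng = random.Random(0)
n = 5000
hidden = [rng.gauss(0, 1) for _ in range(n)]
x = [h + rng.gauss(0, 0.5) for h in hidden]
y = [h + rng.gauss(0, 0.5) for h in hidden]

# Raw correlation between the two observed variables.
r_raw = pearson(x, y)

# Partial correlation: correlate what is left of x and y after the
# hidden factor has been regressed out of each.
r_partial = pearson(residuals(x, hidden), residuals(y, hidden))

print(f"raw correlation:     {r_raw:.3f}")
print(f"partial correlation: {r_partial:.3f}")
```

The catch, of course, is that this check is only possible if you already suspect the hidden factor and have measured it, which is exactly why a model has to come before the analysis.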


Any empirically supported assertion comes with uncertainty; for example: "Polls show the Senate will swing Republican by 6 seats, with a 2-seat margin of error." Uncertainty can be quantified quite precisely once you have a model: you fit the data to the model and use one of many possible statistical techniques to compute the uncertainty in the fit, and therefore in your model.
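
To make this concrete, the following Python sketch assumes the simplest possible model, a straight line y = a + b*x with independent Gaussian noise, and uses ordinary least squares as one of the many possible techniques. The data are synthetic and the function name `fit_line` is hypothetical; the point is only that the fit returns an uncertainty (here the standard error of the slope) alongside the estimate.

```python
import math
import random

def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x.

    Returns (a, b, se_b), where se_b is the standard error of the slope b.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((u - mean_x) ** 2 for u in x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual variance, with n - 2 degrees of freedom (two fitted parameters).
    rss = sum((v - (a + b * u)) ** 2 for u, v in zip(x, y))
    sigma2 = rss / (n - 2)
    se_b = math.sqrt(sigma2 / sxx)
    return a, b, se_b

# Synthetic data (an assumption for this sketch): true slope 2.0,
# true intercept 1.0, Gaussian noise with unit standard deviation.
rng = random.Random(1)
x = [i / 10 for i in range(200)]
y = [1.0 + 2.0 * u + rng.gauss(0, 1.0) for u in x]

intercept, slope, slope_se = fit_line(x, y)

# Approximate 95% confidence interval for the slope (normal approximation).
low, high = slope - 1.96 * slope_se, slope + 1.96 * slope_se
print(f"slope = {slope:.3f} +/- {slope_se:.3f} (95% CI {low:.3f} to {high:.3f})")
```

The interval is only as trustworthy as the model behind it: if the real relationship is not a straight line, or the noise is not independent, the quoted uncertainty will be too optimistic.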
