
Defining "Big Data"

By Ideas Lab Staff February 22, 2013

Big Data, a recently launched journal, explains how we can define and make sense of "big data."

The term “big data” has emerged as a popular way to describe the increasing volume of information available about companies, consumers and other subjects. But how can we define and make sense of big data?

According to IBM, big data spans four dimensions: volume, velocity, variety and veracity. “Big data is more than simply a matter of size; it is an opportunity to find insights in new and emerging types of data and content, to make your business more agile, and to answer questions that were previously considered beyond your reach,” IBM states.

SAS, a software solutions company, says the issue goes beyond acquiring large amounts of data. “It’s what you do with your big data that matters,” the company states on its website. “The hopeful vision for big data is that organizations will be able to harness relevant data and use it to make the best decisions.”

And in the first volume of a new journal, Big Data, Editor-in-Chief Edd Dumbill attempts a few definitions of his own, including, “Big data is data that exceeds the processing capacity of conventional database systems.”

However it’s defined, “big data” will be a term “hard to escape this year,” Dumbill states. The concept is also now relevant to organizations that operate outside purely digital environments. Mobile phones, for example, provide opportunities to engage and interact with people both physically and virtually, while advances in sensors, robots and other tools have “broadened the reach of the algorithm,” Dumbill writes.

But some, such as New York Times columnist David Brooks, argue that big data analysis still has limitations. In a Feb. 19 column, Brooks outlines several “things big data does poorly.” For example, he points out that big data can measure the quantity of social interactions but not their quality, and that data struggles to give context.

“Data analysis is pretty bad at narrative and emergent thinking, and it cannot match the explanatory suppleness of even a mediocre novel,” Brooks states in his column.

Still, Dumbill suggests that the line between virtual and physical worlds is blurring. “We stand at an incredible time of opportunity, considerably better able to improve our lot through computing than in any previous era,” he writes.