This all-too-true analogy is attributed to Duke Professor Dan Ariely. Big Data, its connotations, and its often-incorrect implications are also a pet peeve of mine. This week’s diatribe will hopefully demystify Big Data and give a functional, albeit basic, understanding of what it is. A later installment will examine what it can and can’t do.
Big Data is one of those terms that everyone says but no one really knows how to use properly. Here’s the basic description: Big Data is the large amount of information that is generated on a daily basis and is too big for traditional analytics software to process. In other words, it’s a lot of data. But truth be told, it’s much more complex than a simple definition suggests.
Technically speaking, Big Data was originally defined by the 3 V’s – Volume, Velocity and Variety. Some argue that we are now up to 5 V’s, adding in Veracity and Value, but for this piece we will stick with the original 3.
Volume – As the name would suggest, “Big Data” is…big. Conceptually, this means there is too much data for a human to be able to functionally use, far more than the roughly one million rows (1,048,576, to be exact) that Excel caps you at. Big Data was once measured in gigabytes, but is now measured in terabytes and is quickly approaching petabytes. To put the data growth in context, according to Domo, 90% of the world’s data was created in the last two years.
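To make that scale concrete, here is a rough back-of-the-envelope sketch in Python. The row counts and record sizes are made-up numbers purely for illustration; the only real constant is Excel’s worksheet limit of 1,048,576 rows. The helper names (fits_in_excel, human_size) are my own inventions, not part of any library.

```python
EXCEL_MAX_ROWS = 1_048_576  # row limit of a modern Excel worksheet


def fits_in_excel(row_count: int) -> bool:
    """Return True if a table with this many rows fits in one worksheet."""
    return row_count <= EXCEL_MAX_ROWS


def human_size(num_bytes: int) -> str:
    """Express a byte count in decimal units (1 KB = 1,000 B)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the number under 1,000,
        # or at the largest unit we know about.
        if size < 1000 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1000


if __name__ == "__main__":
    # Hypothetical workload: 5 million events a day, 200 bytes each.
    rows_per_day = 5_000_000
    bytes_per_row = 200

    print("Fits in Excel?", fits_in_excel(rows_per_day))
    print("Per day: ", human_size(rows_per_day * bytes_per_row))
    print("Per year:", human_size(rows_per_day * bytes_per_row * 365))
```

Even this modest, invented workload blows past a spreadsheet on day one, which is really the whole point of the Volume “V”.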
Velocity – Velocity is the speed at which data is generated, absorbed and analyzed. It is key in Big Data because all of the data in the world is useless if you can’t use it in a timely manner. One of the most alluring uses of Big Data is recognizing and leveraging trends as they are happening.
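For a feel of what “as they are happening” can look like in practice, here is a minimal sketch of a rolling counter: it tracks how many events arrived in the last N seconds, so a sudden spike shows up right away instead of in next month’s report. The class name RollingCounter and the timestamps are hypothetical, and the sketch assumes events arrive in time order; real streaming tools are far more elaborate.

```python
from collections import deque


class RollingCounter:
    """Count events seen within the most recent time window."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, timestamp: float) -> int:
        """Record one event and return how many fall inside the window.

        Assumes timestamps are passed in non-decreasing order.
        """
        self._timestamps.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that are a full window (or more) older than the newest.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


if __name__ == "__main__":
    counter = RollingCounter(window_seconds=60)
    # Made-up event times, in seconds.
    for t in [0, 10, 65, 70, 71, 72, 73]:
        print(f"t={t:>3}s  events in last minute: {counter.add(t)}")
```

The design choice worth noticing is that old events are thrown away as new ones arrive, so the answer is always current and the memory used stays proportional to the window, not to everything ever seen.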
Variety – This is the diversity of the data, both in types and formats. This is also where structured vs. unstructured data comes into play. Whatever the format, media or structure, true Big Data tools can organize, analyze and report on these diverse styles, allowing new insights, and theoretically new advantages, to be gleaned. Understanding how an increased presence of X means a higher likelihood of Y is impossible if you don’t bring in both X and Y. For example, X could be demographic data and Y could be a product: knowing whether the surrounding populace is college students or retirees sets the baseline for pricing on everything from housing to food options.
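Here is a toy version of the X-and-Y idea, as a sketch only. The neighborhoods, groups and prices are entirely invented, and the function name average_price_by_group is my own; the point is simply that the pattern only appears once the demographic data (X) and the sales data (Y) are brought together.

```python
from collections import defaultdict
from statistics import mean

# X: hypothetical demographic data, keyed by neighborhood.
demographics = {
    "Campus Heights": "college students",
    "Elm Park": "college students",
    "Sunset Villas": "retirees",
}

# Y: hypothetical sales records from a completely separate source.
sales = [
    {"area": "Campus Heights", "item": "lunch", "price": 8.0},
    {"area": "Elm Park", "item": "lunch", "price": 10.0},
    {"area": "Sunset Villas", "item": "lunch", "price": 15.0},
    {"area": "Unknown Town", "item": "lunch", "price": 12.0},
]


def average_price_by_group(demographics, sales):
    """Join sales to demographics by area and average price per group."""
    prices = defaultdict(list)
    for sale in sales:
        group = demographics.get(sale["area"])
        if group is None:
            # We have Y but no X for this area, so it cannot be compared.
            continue
        prices[group].append(sale["price"])
    return {group: mean(values) for group, values in prices.items()}


if __name__ == "__main__":
    for group, avg in average_price_by_group(demographics, sales).items():
        print(f"{group}: average lunch price {avg:.2f}")
```

Notice that the “Unknown Town” sale gets dropped: without matching demographic data there is nothing to compare it against, which is exactly the X-without-Y problem described above.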
Another way to look at Variety is through the lens of a crime scene. Looking at an event from a variety of angles encourages transparency. Think about what a DA looks for compared to a coroner, or a law enforcement officer, or a witness, or the family, or the investigator. Pull all of those perspectives together and a defensible picture usually comes to light.
Big Data is an incredibly powerful tool. Some, including myself, argue it has the potential to have a bigger impact on our world than any other tech discovery. It has the ability to influence every aspect of your life, from what sits on your grocery store shelves to how world leaders address a crisis. But before it can be used, it needs to be understood, at least conceptually. It’s very elaborate, but it’s also incredibly useful when it is used correctly. Hopefully, breaking Big Data down and simplifying it here means that you, in turn, can do the same. The next time someone tries to sell you a “Big Data” solution, come back with “tell me about the velocity of your data” or “what is your variety and what tools do you use to handle it?” and you will likely know whether it’s a modern miracle or technological snake oil.