The following is an excellent blog post written by Bernard Marr. It gives an insightful and concise view of what Big Data is.
Big data is one of THE
biggest buzzwords around at the moment, and I believe big data will
change the world. Some say it will be even bigger than the Internet.
What’s certain is that big data will impact everyone's life. Having said that, I
also think that the term 'big data' is not very well defined and is, in
fact, not well chosen. Let me use this article to explain what's behind
the massive 'big data' buzz and demystify some of the hype.
Basically,
big data refers to our ability to collect and analyze the vast amounts
of data we are now generating in the world. The ability to harness the
ever-expanding amounts of data is completely transforming our ability to
understand the world and everything within it. The advances in
analyzing big data allow us to, for example, decode human DNA in minutes, find
cures for cancer, accurately predict human behavior, foil terrorist
attacks, pinpoint marketing efforts and prevent diseases. Take this
business example: Wal-Mart is able to take data from your past buying
patterns, their internal stock information, your mobile phone location
data, social media as well as external weather information and analyze
all of this in seconds so it can send you a voucher for a BBQ cleaner to
your phone – but only if you own a barbecue, the weather is nice and
you are currently within a 3-mile radius of a Wal-Mart store that has
the BBQ cleaner in stock. That's scary stuff, but one step at a time:
let's first look at why we have so much more data than ever before.
In
my talks and training sessions on big data I talk about the
'datafication of the world'. This datafication is caused by a number of
things including the adoption of social media, the digitalization of
books, music and videos, the increasing use of the Internet as well as
cheaper and better sensors that allow us to measure and track
everything. Just think about it for a minute:
- When you were reading a book in the past, no external data was
generated. If you now use a Kindle or Nook device, it tracks what you
are reading, when you are reading it, how often you read it, how quickly
you read it, and so on.
- When you were listening to CDs in the past, no data was generated.
Now we listen to music on our iPhones or digital music players, and these
devices are recording data on what we are listening to, when and how
often, in what order, etc.
- Today, most of us carry smartphones, and they are constantly
collecting and generating data by logging our location, tracking our
speed, monitoring which apps we are using as well as who we are ringing
or texting.
- Sensors are increasingly used to monitor and capture everything from
temperature to power consumption, from ocean movements to traffic
flows, from dustbin collections to your heart rate. Your car is full of
sensors and so are smart TVs, smart watches, smart fridges, etc. Take
my new scales (which I, as a gadget freak, love!): they measure (and
keep a record of) my weight, my % body fat, my heart rate and even the
air quality in our bedroom. When I step on the scales they
automatically recognize me, take all the measurements and then send them
via Bluetooth to my iPhone, which gives me stats on how my Body Mass
Index etc. is changing. This information is then also synced with the
data collected by my Up band, which tracks how many calories I have
consumed and burnt in a day and how well I have slept at night.
- Finally, combine all this now with the billions of internet searches
performed daily, the billions of status updates, wall posts, comments
and likes generated on Facebook each day, the 400+ million tweets sent
on Twitter per day and the 72 hours of video uploaded to YouTube every
minute.
I am sure you are getting the point. The volume of data is
growing at a frightening rate. Google’s executive chairman Eric Schmidt
puts it succinctly: “From the dawn of civilization until 2003,
humankind generated five exabytes of data. Now we produce five exabytes
every two days…and the pace is accelerating.”
Not only do we have a
lot of data, we also have a lot of different and new types of data:
text, video, web search logs, sensor data, financial transactions and
credit card payments, etc. In the world of ‘Big Data’ we talk about the 4
Vs that characterize big data:
- Volume – the vast amounts of data generated every second
- Velocity – the speed at which new data is generated and moves around
(credit card fraud detection is a good example where millions of
transactions are checked for unusual patterns in almost real time)
- Variety – the increasingly different types of data (from financial
data to social media feeds, from photos to sensor data, from video
capture to voice recordings)
- Veracity – the messiness of the data (just think of Twitter posts with hashtags, abbreviations, typos and colloquial speech)
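To make the velocity example a little more concrete, here is a minimal sketch of what "checking transactions for unusual patterns" could look like. It is only an illustration under simple assumptions: the function name `flag_unusual`, the threshold of 3 standard deviations and the sample amounts are all made up for this example, and real fraud detection systems use far richer signals than the purchase amount alone.

```python
from statistics import mean, stdev


def flag_unusual(history, new_amounts, threshold=3.0):
    """Return the new amounts that sit far outside a customer's usual spending.

    Hypothetical helper: an amount counts as unusual when it lies more than
    `threshold` standard deviations away from the mean of `history`.
    """
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs >= 2 values
    if sigma == 0:
        # Every past amount was identical, so any different amount stands out.
        return [a for a in new_amounts if a != mu]
    return [a for a in new_amounts if abs(a - mu) / sigma > threshold]


# Made-up past purchases for one card, followed by three incoming transactions.
past_purchases = [12.5, 40.0, 23.9, 18.0, 35.2, 27.4]
incoming = [22.0, 950.0, 31.0]

print(flag_unusual(past_purchases, incoming))  # prints [950.0]
```

The point is only that each incoming transaction is compared with what is normal for that customer, which is cheap enough to do as the transactions arrive.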
So, we have a lot of data, in different formats, that is often
fast moving and of varying quality – why would that change the world?
The reason the world will change is that we now have the technology to
bring all of this data together and analyze it.
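One common way such technology copes with scale is to split a job into chunks, process each chunk independently and then merge the partial results. The sketch below illustrates that pattern with a simple word count. It is not Hadoop's API: the function names and sample posts are invented for this example, and local threads stand in for what would be separate machines in a real cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, n_chunks):
    """Deal the records out into n_chunks roughly equal lists."""
    return [records[i::n_chunks] for i in range(n_chunks)]


def count_words(chunk):
    """'Map' step: count words in one chunk, independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(partials):
    """'Reduce' step: add the partial counts together."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def distributed_word_count(records, n_workers=3):
    """Split the work, count each chunk in its own worker, merge the results."""
    chunks = split_into_chunks(records, n_workers)
    # Threads stand in for separate computers (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return merge_counts(partials)


# Made-up status updates standing in for a much larger data set.
posts = ["big data", "big buzz", "data data"]
print(distributed_word_count(posts, n_workers=2).most_common(2))
# prints [('data', 3), ('big', 2)]
```

Because no chunk depends on any other, adding more workers lets the same job cover more data in roughly the same time, which is the property that matters here.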
In the past we had
traditional database and analytics tools that couldn’t deal with
extremely large, messy, unstructured and fast-moving data. Without going
into too much detail, we now have software like Hadoop and others which
enable us to analyze large, messy and fast-moving volumes of structured
and unstructured data. These tools do it by breaking the task up between many
different computers (which is a bit like how Google breaks up the
computation of its search function). As a consequence, companies
can now bring together these different and previously inaccessible data
sources to generate impressive results. Let’s look at some real
examples of how big data is used today to make a difference:
- The FBI is combining data from social media, CCTV cameras, phone
calls and texts to track down criminals and predict the next terrorist
attack.
- Facebook is using face recognition tools to compare the photos you
have uploaded with those of others to find potential friends of yours
(see my post on how Facebook is exploiting your private information using big data tools).
- Politicians are using social media analytics to determine where they have to campaign the hardest to win the next election.
- Video analytics and sensor data from baseball or football games are
used to improve the performance of players and teams. For example, you can
now buy a baseball with over 200 sensors in it that will give you
detailed feedback on how to improve your game.
- Artists like Lady Gaga are using data on our listening preferences
and sequences to determine the most popular playlist for their live gigs.
- Google’s self-driving car is analyzing a gigantic amount of data
from sensors and cameras in real time to stay on the road safely.
- The GPS information on where our phone is and how fast it is moving is now used to provide live traffic updates.
- Companies are using sentiment analysis of Facebook and Twitter posts to determine and predict sales volume and brand equity.
- Supermarkets are combining their loyalty card data with social media
information to detect and leverage changing buying patterns. For
example, it is easy for retailers to predict that a woman is pregnant
simply based on the changing buying patterns. This allows them to target
pregnant women with promotions for baby-related goods.
- A hospital unit that looks after premature and sick babies is
generating a live stream of every heartbeat. It then analyzes the data to
identify patterns. Based on the analysis the system can now detect
infections 24 hours before the baby would show any visible symptoms, which
allows early intervention and treatment.
And these examples are just the beginning. Companies are barely
starting to get to grips with the new world of big data. In conclusion
then, big data will change the world. In terms of language I prefer to
talk about the ‘datafication of the world’ in relation to the
ever-growing amounts of data and ‘large-scale analytics’ (or simply
‘analytics’ because what is large now will be normal tomorrow) in
relation to our ability to analyze and harness big data.
---------------------------------------
Authored by: Bernard Marr
https://www.linkedin.com/today/post/article/20130527063838-64875646-what-the-hell-is-big-data