Photo Credit: Alan Brandt Photography
Big Data garners a lot of hype these days.
Hype is good; hype drives innovation. It comes from legitimate, compelling financial motives: gaining a competitive edge and driving ROI. Yet hype also leads to a lot of misunderstanding. Sometimes the actual definition of a concept, such as Big Data, becomes secondary in the frenzy.
Quartz outlines many misconceptions about Big Data in a piece titled “Most data isn’t “big,” and businesses are wasting money pretending it is.” The article points out that the processing some firms perform on powerful data clusters could actually be done on a personal computer. A recent study from UC Berkeley found that most of the jobs that engineers at Yahoo and Facebook ask their computing clusters to perform are in the “megabyte to gigabyte” range. These are some of the largest, most prominent internet firms in the world, the ones you would expect to be processing the largest amounts of data on the planet. Yet Yahoo and Facebook aren’t coming close to using all of that computing power.
Beyond the striking fact that many large firms aren’t processing Big Data, it is crazy to think that some sources recommend that small businesses analyze big data, as in the Forbes piece “3 Steps to Incorporate Big Data Into Your Small Business.” Small businesses would do better to run a few calculations in Excel instead of pretending they’re processing Big Data.
MIT consolidated a list of definitions of Big Data. Of these, I believe Intel offers an excellent, quantifiable definition:
“Big data opportunities emerge in organizations generating a median of 300 terabytes of data a week. The most common forms of data analyzed in this way are business transactions stored in relational databases, followed by documents, e-mail, sensor data, blogs, and social media.”
Gartner also defines Big Data by the “three Vs”: Volume, Velocity, and Variety.
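As a rough sanity check against the 300-terabytes-a-week figure in the Intel definition, a few lines of arithmetic can show how far an organization is from that scale. The sketch below is only an illustration: the helper names, the decimal definition of a terabyte, and the 50 GB/day sample input are all assumptions made for the example, not figures from any of the sources cited here.

```python
# Hypothetical helpers: compare an organization's weekly data volume with
# the 300 TB/week median from the Intel definition.
BYTES_PER_TB = 10**12  # decimal terabyte; an assumption, the quote does not specify
INTEL_MEDIAN_TB_PER_WEEK = 300


def weekly_volume_tb(bytes_per_day: float) -> float:
    """Convert an average daily byte count into terabytes per week."""
    return bytes_per_day * 7 / BYTES_PER_TB


def share_of_intel_median(bytes_per_day: float) -> float:
    """Return the weekly volume as a fraction of the 300 TB/week figure."""
    return weekly_volume_tb(bytes_per_day) / INTEL_MEDIAN_TB_PER_WEEK


if __name__ == "__main__":
    daily_bytes = 50 * 10**9  # made-up input: 50 GB of new data per day
    print(f"{weekly_volume_tb(daily_bytes):.2f} TB/week")
    print(f"{share_of_intel_median(daily_bytes):.2%} of the 300 TB/week median")
```

A result far below 100% suggests the workload may still fit comfortably on a single machine, which is the point the Quartz article makes.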
To expand upon this definition, it’s helpful to list and categorize data forms that will make up Big Data. Jeff Kelly from Wikibon outlines sources and potential uses of Big Data in his post “Big Data: Hadoop, Business Analytics and Beyond.” They are summarized and expanded upon here:
In an apt research report titled “Nobody ever got fired for buying a cluster,” Microsoft underscores the notion that because Big Data is so widely hyped, IT professionals can plunge into purchasing powerful server clusters even when they do not need them to process their data.
With these reality checks in mind, I recommend these steps before your enterprise goes out and buys a cluster.
- If you don’t have terabytes of data to process yet, start small. There’s no harm in just doing analysis. Rapid experimentation based on your own hypotheses can sometimes be as valuable as analytics drawn from mountains of data.
- As FastCompany underlines, enterprises should first track everything in preparation for Big Data analysis tomorrow. This should include every source of data previously identified.
- Understand that mobile data is the missing link in Big Data analysis right now. Mobile data can fill a major hole in existing analysis: location. If we know where someone is, in conjunction with their actions and what’s going on around them, we can build more contextual applications to optimize and personalize mobile experiences.
This is powerful stuff. Believe in the hype, but proceed strategically in your Big Data efforts. Down the line, these are the types of analysis we can start to generate:
- Recommendation Engines
- Sentiment Analysis
- Risk Modeling
- Fraud Detection
- Marketing Campaign Analysis
- Customer Churn Analysis
- Social Graph Analysis
- Customer Experience Analytics
- Network Modeling
- Research and Development