This is a guest post written by Scott Raspa. He works at IKANOW, a big data software company, where he is involved in the company’s sales and marketing efforts supporting public and private sector clients. He can be found on Twitter @sraspa.
Big Data is the biggest trend in IT right now; however, the term is thrown around loosely and is becoming increasingly ambiguous. Everyone seems to be doing some sort of “Big Data” nowadays, which can cause great confusion among organizations with actual Big Data needs. We at IKANOW focus on unstructured data analytics and may be a little biased, but we believe it is an essential part of any Big Data offering.
One question we hear all the time is “what’s the business value in unstructured data?” Occasionally, we also receive raised eyebrows and blank stares when referring to unstructured data. To help answer this question and provide clarity about unstructured data, let’s take a look at the differences between structured and unstructured data, and how it all relates to Big Data and business value.
Structured data, according to TheFreeDictionary, “resides in fixed fields within a record or file. Relational databases and spreadsheets are examples of structured data. Although data in XML files are not fixed in location like traditional database records, they are nevertheless structured, because the data are tagged and can be accurately identified.”
On the other hand, unstructured data does not follow a predefined data model and does not fit into relational databases. Emails, videos, social media, RSS, documents, and text are all examples of unstructured data.
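To make the distinction concrete, here is a minimal Python sketch. The CSV record, the email text, and the function names are all made up for illustration, and the regular expression is only a crude stand-in for the extraction a real analytics tool would perform.

```python
import csv
import io
import re

# Structured: a CSV record with fixed, named fields (hypothetical sample data).
structured_source = "customer_id,city,purchase_total\n1001,Baltimore,42.50\n"

# Unstructured: free text with no predefined schema (hypothetical sample email).
unstructured_source = (
    "Hi team, I stopped by the Baltimore store yesterday and spent about $42.50. "
    "Loved the service, but the checkout line was slow."
)


def read_structured(text):
    """Each value is addressable by field name; no interpretation is needed."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_from_unstructured(text):
    """Find dollar amounts with a regular expression (a simple heuristic)."""
    return [float(m) for m in re.findall(r"\$(\d+(?:\.\d+)?)", text)]


rows = read_structured(structured_source)
print(rows[0]["city"])
print(extract_from_unstructured(unstructured_source))
```

In the structured case the city is simply looked up by its field name. In the unstructured case the same kind of fact has to be inferred from the text, which is where the analytical effort (and the opportunity) lies.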
So is unstructured data the same as Big Data? Unstructured data is just one piece of the puzzle. Many of you have likely heard of the four V’s of Big Data: Velocity, Volume, Variety, and the fourth being Value or Veracity, depending on whom you ask. For the most part, unstructured data is responsible for the Variety, as new forms of data are constantly bursting into the marketplace and we can only speculate what we’ll see by the end of 2013. As IBM highlights in their article What is Big Data, “Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the world today has been created in the last two years alone.”
In my opinion, Variety is one of the most important aspects of Big Data and what makes it so Valuable! The ability to capture billions of raw data points from a diverse set of sources, often directly from customers, subscribers, or target audiences, and transform that data into actionable intelligence is a capability that has only recently become widely available. We all know the amount of data and the speed at which it accumulates are growing exponentially. This can be a difficult problem to tackle in and of itself. However, if you fail to take advantage of the Variety of data, the Volume and Velocity become much more of a downside than a value-add.
So where should you start? Oftentimes we want to start with the technology. This can be a huge mistake. Steve Jones from Capgemini wrote a great piece in which he states, “Making sense of unstructured data isn’t about technology, it’s a business challenge”; “the first task is to establish the question you’re trying to answer” and “identify the problem you’re trying to solve.” I couldn’t agree more. You first need to understand your business problem and the question(s) you ultimately need to answer. However, this is often more difficult than it sounds, because “you don’t know what you don’t know.” Many of us were trained to gain insights from structured data our entire careers and now have access to an enormous amount of data that we don’t know what to do with. Understanding an organization’s problems and helping develop a solution are some of the most important aspects of a Big Data solution provider’s job.
Once you’ve identified the business problem and the questions you need answered, it’s important to tie this back to a Return on Investment (ROI). It is critical that we not only answer our client’s questions, but also determine the value our solution will bring to the company. Will their sales increase? If so, by how much? Will they gain a competitive edge? How will that impact their top-line growth? These are the types of questions that need to be answered, for two main reasons: first, to gain upper management’s blessing to move forward, and second, to make the project a success.
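As a rough illustration of tying a project back to ROI, the sketch below uses the common definition ROI = (gain - cost) / cost. The function name and the dollar figures are hypothetical and chosen only to show the arithmetic; they do not come from any real client engagement.

```python
def roi(gain, cost):
    """Return ROI as a fraction: (gain - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (gain - cost) / cost


# Hypothetical figures for illustration only.
projected_sales_lift = 150_000.0  # extra revenue attributed to the project
project_cost = 100_000.0          # software, services, and staff time

print(f"{roi(projected_sales_lift, project_cost):.0%}")
```

The hard part in practice is not the formula but estimating the gain credibly, which is why the questions above need answers before the project starts.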
The great thing about unstructured data is there are benefits for organizations of any size and in any industry – from startups and small businesses, to multinational corporations and Fortune 10s. While larger organizations tend to be more concerned with Volume and Velocity, the elements of Variety and Veracity are vital to small businesses and startups.
Whether you Like it or not, Facebook happens to be a great example to highlight the importance of Variety in unstructured data. Since going live in 2004, Facebook has prompted its more than one billion users to create a profile that advertises their age and gender; political and religious preferences; favorite music, movies, and books; interests and activities; and in many cases their email address and phone number. You don’t need a background in marketing to understand the value of this data, and the business world quickly caught on. The numbers fluctuate depending on the source, but a majority of American companies now maintain a Facebook page, and those that do not are attending courses or paying consultants to get them up and running.
Another great example is Foursquare, a location-based social media site that allows users to “check in” from venues identified by the GPS in their smartphones. A number of promotional pilots and partnerships have allowed Foursquare to increasingly incentivize the call to action; in June 2011, it paired with American Express to offer discounts applied directly to account holders who “checked in” at a participating vendor. The concept of “checking in” quickly caught on with the social media power players, with both Facebook and Twitter implementing the feature in 2010 and 2011. Digital marketing firm Silverpop analyzed all publicly available “check-ins” via Twitter for the night of Thanksgiving and Black Friday 2012. Starbucks managed to top the nation’s mega-retailers (Target, Walmart, and Best Buy), possibly due in part to the rewards it offers customers for each check-in. These loyalty marketing tactics can be applied by any small business in any locality. Whether just opening its doors or barely making ends meet, a small business can cheaply and with little effort expand its reach and influence first-time customers.
How an organization uses its unstructured data depends on its business objectives. There are many great applications available today that allow organizations to manage and monitor their web or social media activity (Google Analytics, HubSpot, Hootsuite, KISSMetrics, etc.), each with measurable ROI for their clients. There is nothing wrong with these applications, and they will continue to play an important role. However, as technologies evolve and demands change, organizations will be looking for ways to increase their ROI through more in-depth analysis of unstructured data. This is where companies like IKANOW and Datasift come into play, allowing organizations to go beyond “basic” analytics and dive deeper into their unstructured data to do things such as predictive analytics, temporal and geospatial visualization, sentiment analysis, and much more.
So, is there business value in unstructured data? With a sound strategy and the right tools, absolutely.