The phenomenon of “Big Data” exacerbates the tension between potential benefits and privacy risks by upping the ante on both sides of the equation. Any project can fail for any number of reasons: bad management, poor budgeting, or just a lack of relevant skills. However, big data projects bring their own specific risks.

Disturbingly, Only 13% of Companies Currently Achieve Full-scale Implementation of Their In-house Big Data Projects

Such a low success rate should concern organisations embarking on big data projects, since many businesses are choosing to adopt big data without a clear understanding of what the return on investment (ROI) will be.

Big Data Investment Projects Require Experimentation to Discover Where They Can Best Be Used to Yield Significant ROI

Given the amount of risk involved, we need to be aware of the dangers that could potentially arise if we fail to cover all of the bases. Here are the eight biggest risks of big data projects, essentially a basic checklist that should be taken into account when developing a strategy for your “Big Data” project.

Big Data Presents New Challenges Impacting the Entire Risk Spectrum

1. Data Security

This risk is obvious and often uppermost in our minds when we are considering the logistics of data collection and analysis. Data theft is a rampant and growing area of crime and attacks are getting bigger and more damaging. The bigger your data, the bigger the target it presents to criminals.

2. Data Privacy

Closely related to the issue of security is privacy. Risk mitigation strategies are essential for protecting privacy. You need to be sure that the sensitive information you are storing and collecting isn’t going to be divulged through damaging misuse by yourself or by the people to whom you have delegated the responsibility for analysing and reporting on it.

On the one hand, big data unleashes tremendous benefits not only to individuals but also to communities and society at large, including breakthroughs in health research, sustainable development, energy conservation and personalised marketing. On the other hand, big data introduces new privacy and civil liberties concerns including high-tech profiling, automated decision-making, discrimination, and algorithmic inaccuracies or opacities that strain traditional legal protections.

3. Costs

Data collection, aggregation, storage, analysis, mapping and reporting cost a lot of money. These costs can be mitigated by careful budgeting, but getting it wrong at that point can lead to spiralling costs, potentially negating any value added to your bottom line by your data-driven initiative. A well-developed strategy will clearly set out what you intend to achieve and the benefits that can be gained so they can be balanced against the resources allocated to the project.
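As a rough illustration of balancing expected benefits against allocated resources, the sketch below computes a simple ROI figure. The function name `simple_roi`, the cost categories chosen, and every number are hypothetical assumptions for illustration, not figures from any real project.

```python
from typing import Mapping


def simple_roi(expected_benefit: float, costs: Mapping[str, float]) -> float:
    """Return ROI as a fraction: (benefit - total cost) / total cost."""
    total_cost = sum(costs.values())
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return (expected_benefit - total_cost) / total_cost


# Hypothetical annual figures, for illustration only.
project_costs = {
    "collection": 40_000,
    "storage": 25_000,
    "analysis": 60_000,
    "reporting": 15_000,
}
roi = simple_roi(expected_benefit=175_000, costs=project_costs)
print(f"Estimated ROI: {roi:.0%}")  # prints "Estimated ROI: 25%"
```

Re-running the calculation with pessimistic cost estimates is a quick way to see how little overspend it takes to push the figure below zero.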

4. Time to Deployment
The amount of time required to deploy a big data solution can vary significantly depending on the type of implementation. It’s worth considering that an in-house solution can take over six months to build, depending on the requirements, whereas a cloud-based solution requires no internal infrastructure.

5. Scalability
The ability to scale a project up or down is crucial. Organisations underestimate how quickly their data can and will grow, or fail to take into account varying usage levels. Cloud-based systems will offer more scalability, allowing businesses to increase or decrease usage as required.

6. Enhanced Transparency

Big data analysis is prone to errors, inaccuracies and bias. Consequently, organisations should provide more transparency into their automated processing operations and decision-making processes, including eligibility factors and marketing profiles.

7. Bad Data

Bad data means collecting irrelevant, out-of-date, or erroneous data. The big data revolution has led to a “collect everything and analyse it later” approach. If you are not analysing the right, up-to-date data, you won’t draw the right conclusions or provide value.
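A minimal sketch of screening records before analysis might look like the following. The field names (`customer_id`, `updated`, `amount`), the one-year freshness window, and the rule that negative amounts are erroneous are all assumptions made for this example; a real project would define its own checks.

```python
from datetime import date, timedelta


def split_usable(records, today, max_age_days=365):
    """Separate records into usable and rejected lists.

    A record is rejected if a required field is missing or empty, if it was
    last updated more than max_age_days before `today`, or if its amount is
    negative (treated here as an obvious data-entry error).
    """
    required = ("customer_id", "updated", "amount")
    cutoff = today - timedelta(days=max_age_days)
    usable, rejected = [], []
    for rec in records:
        missing = any(rec.get(field) in (None, "") for field in required)
        if missing or rec["updated"] < cutoff or rec["amount"] < 0:
            rejected.append(rec)
        else:
            usable.append(rec)
    return usable, rejected


# Hypothetical sample data, for illustration only.
records = [
    {"customer_id": 1, "updated": date(2024, 5, 1), "amount": 120.0},
    {"customer_id": 2, "updated": date(2019, 1, 15), "amount": 80.0},    # out of date
    {"customer_id": None, "updated": date(2024, 4, 2), "amount": 50.0},  # missing id
    {"customer_id": 4, "updated": date(2024, 3, 9), "amount": -10.0},    # erroneous
]
usable, rejected = split_usable(records, today=date(2024, 6, 1))
print(len(usable), "usable;", len(rejected), "rejected")  # prints "1 usable; 3 rejected"
```

Keeping the rejected records, rather than silently dropping them, makes it possible to review how much of the collected data is actually fit for analysis.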

8. Accessibility
Big data is only useful if it is accessible by the people who can actually learn something from the data and implement it into everyday business practices.

Currently 57% of Organisations Cite Skills Gap as a Major Inhibitor to New System Adoption

Once again, ease of accessibility will vary based on implementation, so selecting a vendor that matches your internal capabilities will be crucial. Consider the time and investment it will take to train teams when calculating time to value and overall ROI.

Here are some real-world examples of Big Data in action:

  • Manufacturers are monitoring minute vibration data from their equipment, which changes slightly as it wears down, to predict the optimal time to replace or maintain.
  • The government is making data public at the national, state, and city levels so that users can develop new applications for the public good.
  • Financial Services organisations are using data mined from customer interactions to slice and dice their users into finely tuned segments.
  • Hospitals are analysing medical data and patient records to predict which patients are likely to be readmitted within a few months of discharge.
  • Web-based businesses are developing information products that combine data gathered from customers to offer more appealing recommendations.

Responses

  1. Greg Thein

    Interesting post (as usual!).

    I’m always a fan of KISS (keep it simple, stupid). I think the ease of collecting data and information can lead to grand plans, but, like the article said, will implementation always follow through or yield desired results? And if it’s improperly analyzed, will it yield an incorrect conclusion?

    Interestingly, people think of big data and privacy issues as a recent phenomenon. My father worked for a newspaper in advertising and market research. Back in the 80s and 90s (and probably even earlier) it was already quite common. There was one data aggregator who tracked and compiled lots of data on people from a wide variety of sources. If you sent in a card when you purchased a product – that went into the database. If you called the free movieline (or another free information number – sports, horses) listed in the newspaper, it went in the database (your phone number was linked to you). If you subscribed to a certain magazine – it went into the database. If you bought a car, it went in the database (new vehicle purchases were available from counties). They knew the income demographics of your neighborhood from the census. He said it was incredible how much information was collected and aggregated about individuals. The difference was that it was probably better controlled and couldn’t be hacked (maybe physically stolen, though). It was primarily used by and sold to those who advertised to people.

    Data can be good and revealing though. My wife works at a major hospital system. In the annual employee survey, cleaning people complained that they often didn’t have the materials they needed in their closets and had to go to a central repository and restock them themselves. Their badges also often didn’t give them access to areas they needed to clean (even non-sensitive areas) and they had to go get a security guard and tie up themselves and the security guard for up to 20 minutes. When they did a study and tallied up all the wasted time, they found it was a staggering amount and costly. The problems were resolved almost immediately – closets were always fully stocked and badges had all the access they needed. That wouldn’t be an argument for collecting data from people’s badges to track them every minute of every day; it would, though, be an argument for targeted data studies to investigate and evaluate a specific problem.

  2. Anonymous

    In a sense, security and privacy are the easy problems, where technology can shine. The more difficult upfront problems are knowing what you are collecting and that it is consistent and well defined, especially when coming from multiple sources.

    Then it gets trickier. As the author says, “Big data analysis is prone to errors, inaccuracies and bias” – exacerbated by any weaknesses in data collection, which are often invisible to the analyst. Analysis creep can occur: As more data are available, more analysis is possible, leading to more data requested, then more data to analyze . . .

    I am particularly interested in predictive uses and making sure that the predictions can be reviewed for accuracy (and then revised to improve future prediction).

  3. Anonymous

    From a data perspective, the techniques of data mining, Bayesian and BAGGed regressions haven’t changed too much, but the technology that allows you to run bandsaw analytics has become much more prolific and outsource-able, even over the past 3 years. The issues that remain constant in these pushes are having sound theory-driven approaches to structuring the analyses and, as stated, good, clean data. Even inconsistent data can be logged out and refined, but the reliability of the base connection between inputs and realities is a noisy and stubborn obstacle. In the end, Big Data efforts most often fail if the leader (or lead group) has an insufficient analytics skillset to guide the business partners through the parameters and the WIIFM, purvey analysis quality, and attach the capability of the results back to filling a knowledge gap for the business. If those are done well, analysis leads to insight, insight leads to action. If not, then quite the contrary.