EXPLAINER: Big Data

The information landscape is undergoing a significant shift, and realising the full potential of this trend will require an effective management and analysis strategy.

Big Data, as the name suggests, is about making sense of extremely large sets of data. It is set to change the information landscape and, for those who embrace it, will provide a strong competitive advantage and insights that were previously impossible.

Every day, organisations produce and capture enormous amounts of information about their customers, suppliers and operations. As well as increasing the size of datasets, Big Data is broadening the types of data that can be added to the mix, so that information from multimedia, smartphones and social networking sites can now be collected, stored and analysed. We are faced with more potential information than ever before.

Gartner analysts claim that information volume is growing at a minimum rate of 59 percent annually. IDC’s latest Digital Universe study estimated that the total volume of data being stored in the world will reach 35 zettabytes by 2020 (one zettabyte equals one trillion gigabytes).
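
To put these figures in perspective, the arithmetic can be sketched in a few lines of Python. The 59 percent growth rate and the 35 zettabyte estimate are the figures quoted above; the function names are illustrative only, and the doubling-time calculation assumes the growth rate stays constant, which is a simplification.

```python
import math

# Decimal (SI) units: one gigabyte is 10^9 bytes, one zettabyte is 10^21 bytes.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_ZETTABYTE = 10**21


def zettabytes_to_gigabytes(zettabytes):
    """Convert a whole number of zettabytes to gigabytes."""
    return zettabytes * BYTES_PER_ZETTABYTE // BYTES_PER_GIGABYTE


def doubling_time_years(annual_growth_rate):
    """Years for a volume to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


if __name__ == "__main__":
    print(zettabytes_to_gigabytes(1))    # gigabytes in one zettabyte
    print(zettabytes_to_gigabytes(35))   # gigabytes in the 2020 estimate
    print(doubling_time_years(0.59))     # years to double at 59 percent a year
```

Under that constant-growth assumption, information volume would double roughly every year and a half.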

The value in Big Data is that it allows us to access and use these highly valuable, large scale data sets for increasingly sophisticated analysis and strategic direction for businesses. Decisions can be made by reviewing all of the facts, not just a sample or an aggregation. Information will no longer be just stored and archived to tape, aggregated to a manageable size or simply thrown away due to the inability of current data warehousing solutions to cope.
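
A small sketch shows why reviewing every record can matter more than reviewing an aggregate. The transaction amounts below are invented for illustration, and the rule of flagging anything more than five times the median is an arbitrary assumption, not a recommended fraud test.

```python
from statistics import mean, median

# Hypothetical card transaction amounts, invented for illustration only.
transactions = [42, 38, 45, 40, 41, 39, 43, 44, 40, 950]


def summarise(amounts):
    """Aggregate view: the kind of summary kept when detail is discarded."""
    return {"count": len(amounts), "mean": mean(amounts)}


def flag_unusual(amounts, multiple=5):
    """Record-level view: return amounts above `multiple` times the median."""
    threshold = multiple * median(amounts)
    return [amount for amount in amounts if amount > threshold]


if __name__ == "__main__":
    print(summarise(transactions))
    print(flag_unusual(transactions))
```

The summary reports only a count and an average, so the single unusual transaction disappears into the mean. The record-level check, which needs the full dataset, picks it out.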

Who benefits from Big Data?

Organisations with truly enormous datasets in highly competitive markets will benefit the most from Big Data technology.

Telecommunications companies can bring together and analyse data from web logs and call records that will allow them to reduce churn, manage risk, predict consumer behaviour and support long term strategic thinking.

In the finance industry, global credit card companies can use Big Data to analyse consumer behaviour and fraud activity, and international trading exchanges can look at risk, fraud and money laundering through pattern analysis across all trades.

Government agencies across the world have already begun to embrace Big Data for its ability to better manage risk, improve public services, drive policy decisions and develop strategic direction.

Retailers, organisations in the energy and manufacturing industries, and those in the sciences will also benefit greatly from the advantages delivered by Big Data.

The technology behind Big Data

Big Data tools are characterised by open source technologies, often originating from the engineering communities of large web players such as Facebook, Yahoo or Google.

The most widely adopted Big Data tool is Hadoop, an open source technology managed as an Apache Software Foundation project. Hadoop is intended to ease the complexities of performing large-scale batch operations on data, and it provides the basic tools needed for Big Data projects. It is also designed for redundancy and reliability: because data is replicated in multiple locations, the failure of an individual piece of hardware should not halt processing or lose data.
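
To give a feel for what a large-scale batch operation looks like, here is a minimal sketch of the map, shuffle and reduce pattern that Hadoop's MapReduce model is built around. It is plain Python running in a single process, not the Hadoop API; the function names and the sample log lines are hypothetical. In a real cluster the framework would distribute these steps across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for each word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single total."""
    return key, sum(values)


def run_batch_job(records):
    """Run map, shuffle and reduce in-process over an iterable of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical web log events, invented for illustration only.
    log_lines = ["login search checkout", "login search", "login"]
    print(run_batch_job(log_lines))
```

Because each record is mapped independently and each key is reduced independently, the same job can be split across as many machines as the data requires, which is what makes the pattern suit very large datasets.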

The major BI vendors are announcing support for, or solutions using, Big Data technology based on Hadoop.

A matter of cost

Big Data technology complements any existing investment in data warehouse technology.

Building large data lakes requires minimal investment in toolsets, and storage is now so cheap that it is almost free. Because Hadoop runs on inexpensive hardware, those embracing Big Data can add capacity as needed at low cost.

When looking at long-term hardware requirements, organisations will need to rely less on small numbers of large machines and look more at large numbers of commodity machines or cloud resources.

A good option is to buy capacity in small, standard units. Infrastructure as a Service (IaaS) vendors and other cloud resources offer significant time-to-market advantages to organisations able to make use of them.

Considerations for Big Data adoption

As with any new technology, there are a number of considerations to review before incorporating Big Data into a BI program.

Skills are critical. Big Data adoption will need people with a very specific skill set: those who can manage large, distributed data sets and the hardware that comes with them. It also requires people who can make sense of the data and put it into a business context; think data scientists as opposed to the data analysts and data miners of today.

Managing expectations is also vital. A proof of concept will only demonstrate value at scale: the benefits of Big Data kick in at the multi-terabyte and petabyte range, not on a couple of laptops. Similarly, it is important to note that Big Data suits large-scale analytics and long-term strategic direction. It will not deliver monthly management reporting or ad hoc queries over structured data.

A final note

As the amount of data available continues to grow rapidly, businesses that fail to develop the skills and resources to manage and analyse it will find themselves at a competitive disadvantage.

Successful business intelligence projects will need to consider Big Data as part of their data landscape for the value that it delivers. More and more organisations will look toward statistics and data mining to set strategic direction and gain greater insights to stay ahead of the pack.

Big Data will help organisations better manage risk and improve the customer experience, fundamentally changing the way information is managed and used.

Cameron Wall is managing partner of business intelligence consulting firm C3 Business Solutions.