Why your big data collection is failing


The world’s enterprise data is doubling every two years, but for most organizations, it isn’t doing much good. One in three businesses relies on incomplete or unreliable data, and only 4 percent are using best practices to get the most out of their data.

Unfortunately, most data uncertainty comes from simple mistakes business owners don’t realize they’re making. Data can be immensely helpful, but left unchecked, human error can turn a data lake into a swamp and render your data useless.

Yet taking a few steps to ensure your data is clean can transform your business. Using data, Walmart can predict the type of toys it needs to stock for Christmas, as well as how to price them to beat out the competition. Amazon uses buyer data to figure out what customers might be interested in, or what items are commonly purchased together, increasing sales. UPS optimizes its truck routes to minimize left turns, stop signs, and other enemies of fuel efficiency, saving as much as $300 million a year.

It’s time to take a closer look at your data. Here are a couple of ways to avoid mistakes and make your data collection stronger.

Make data more than an IT operation

While everyone wants business goals and company actions to line up, the distance between the corporate and IT sides of a business can open a gap between the two. Unfortunately, that gap leads to poor data management, governance, and collection, not to mention uncertainty.

Before you ramp up data collection or proceed any further, consult with the business side of the company to figure out what they’re looking for. How should this data benefit the business? What problem should it solve? What are the organization’s goals? If you don’t know the answers to these questions, sit down and hash out how IT can help executives fulfill your organizational objectives.

Otherwise, for example, you might get stuck collecting product data, when you should really be gathering more customer information. This creates unnecessary noise, leading to poor analysis and low ROI.

One of the best methodologies for aligning your data and business needs is the Cross-Industry Standard Process for Data Mining (CRISP-DM), a model IBM has adopted and promotes. It helps ensure you deliver on your organization’s expectations through six steps: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.

Succeed through data scrubbing

Once the business goals are clear, ensure you’re getting the most accurate input possible. Some of the most common data errors can be avoided by simply scrubbing the data before running analytics or business intelligence.

Although the process is time-consuming, data scrubbing can help remove incomplete, inaccurate, and duplicate information. For example, two people filling out forms may enter the same street, but one uses “Avenue,” while the other puts “Ave.” Instead of recognizing these as the same, the system could register them as two streets, making any analytics you run less reliable than they would be on data scrubbed for consistency. The same goes for typos, empty fields, or other human errors.
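To make the “Avenue” versus “Ave.” problem concrete, here is a minimal Python sketch of one scrubbing pass. It is an illustration under stated assumptions, not a complete solution: the `SUFFIXES` map, the `street` field name, and the sample records are all hypothetical, and real address data usually needs a fuller abbreviation list or a dedicated address-validation service.

```python
import re

# Hypothetical abbreviation map for illustration only; a real list would be
# longer and would need care with ambiguous cases (e.g. "St" as "Saint").
SUFFIXES = {"ave": "avenue", "st": "street", "rd": "road", "blvd": "boulevard"}


def normalize_street(raw):
    """Lowercase, drop punctuation, and expand known suffix abbreviations."""
    cleaned = re.sub(r"[^\w\s]", "", raw).lower()
    return " ".join(SUFFIXES.get(word, word) for word in cleaned.split())


def scrub(records):
    """Skip records with an empty street field and normalize the rest."""
    scrubbed = []
    for record in records:
        street = normalize_street(record.get("street") or "")
        if not street:
            continue  # incomplete entry: set aside rather than analyze
        scrubbed.append({**record, "street": street})
    return scrubbed


# Assumed sample input: two spellings of one street plus an empty field.
records = [
    {"customer": "A", "street": "Fifth Avenue"},
    {"customer": "B", "street": "Fifth Ave."},
    {"customer": "C", "street": ""},
]

cleaned = scrub(records)
distinct_streets = {record["street"] for record in cleaned}
print(len(cleaned), "usable records,", len(distinct_streets), "distinct street")
```

After the pass, both spellings resolve to the same street, so an analysis that groups by street treats customers A and B as neighbors instead of splitting them across two locations. The empty entry is skipped here for simplicity; in practice you might route it to a person for review instead.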

To push accuracy further, scrubbing should be done by a subject matter expert—someone who knows the business side of what you’re doing, and can reliably say that the input data exists, makes sense, and is accurate. Product data, for example, should be reviewed by someone who is familiar with the product and can recognize what actually exists and was sold at the time of input.

You can also appoint a Chief Data Officer (CDO) to oversee data governance. While data scrubbing can be a long and tedious process, it’s crucial to getting accurate and useful results from your analytics or business intelligence.

Better data makes better business

Although many companies continue to make data mistakes, those that set clear business goals can have massive success. Walmart, Amazon, UPS—each of these companies sets specific, data-oriented business goals, tracks the correct data, and puts it to use. With the right understanding and practices, your company can do the same.
