What You Need to Know About Data Quality

7 factors to consider when shopping for data

Big data is big business. As more data providers strive to feed the growing demand for deep audience insights, the quality of the data being offered can drop off dramatically. It is no secret that selling data is a volume-based business: the higher the volume, the more money the data provider earns. Avoid data providers that chase high volume and lack quality-control standards. What’s more, even good-quality data can lead you astray if it’s misinterpreted or used inappropriately.

Here’s a look at seven considerations when shopping for data so you can make the best decisions for your brand.

How old is it?

All data decays, or becomes less relevant, over time. People move, their interests change, and their email addresses can become spam traps. Also, think about how much the age of the data matters in the context of your products and services. For example, a list of likely engagement ring buyers has a shorter shelf life than, say, a list of gourmet food delivery customers with ongoing intent to purchase.
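One way to put this into practice is to filter records by age, using a cutoff that depends on the segment. The following is a minimal sketch, not a standard: the field names (segment, collected_on) and the shelf-life values are assumptions chosen for illustration, and you would set them based on your own products.

```python
from datetime import date

# Hypothetical shelf lives in days; these numbers are illustrative assumptions.
SHELF_LIFE_DAYS = {
    "engagement_ring_buyers": 90,
    "gourmet_food_delivery": 365,
}


def is_fresh(segment, collected_on, today):
    """Return True if a record is still within its segment's shelf life."""
    age_days = (today - collected_on).days
    return age_days <= SHELF_LIFE_DAYS[segment]


def keep_fresh(records, today):
    """Keep only the records that have not outlived their segment's shelf life."""
    return [
        r for r in records
        if is_fresh(r["segment"], r["collected_on"], today)
    ]
```

Two records of the same age can get different verdicts: a five-month-old engagement ring lead is dropped, while a five-month-old food delivery customer is kept.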

Logical connections

Does the data reflect logically sound inferences? For example, if an individual visits a web page that mentions the word “car,” is it reasonable to assume they are in the market for a vehicle? Similarly, a list of people interested in paint might include house painters as well as fine artists. Choose a data provider that pieces together hundreds of data points to avoid inaccurate assumptions about audience intent.

Combating fraudsters

What is the data provider doing to combat fraudulent records? This is the question we see brands and agencies overlook most often. With data breaches, hacks, and scams commonplace, you need to understand what specific actions your data providers are taking to purge harmful data from the ecosystem. Their answer will tell you whether volume or quality takes priority.

Deterministic vs. probabilistic

Some providers compile deterministic data, which is based on actual behaviors and self-reported data from surveys, applications, or transactions. Others offer probabilistic data, which uses modeling to create assumptions about your audience. While probabilistic data allows you to gain audience scale, it falls short when it comes to accuracy and individual insights.

Consider the source

The data supply chain has many players and processing steps: data collectors, data aggregators, data-management platforms, demand-side platforms, deduplication, domain-space resolution, and more. What goes in one end of the supply chain may be unrecognizable on the other end. The closer you are to your data source, the better your data.

Cookies and reach

Be aware that not all cookies create reach for your online ads. Cookies get deleted, or users fail to show up in the footprint. You may find only half, or even a tenth, of the users you purchase. Ask your data provider whether they account for these potential pitfalls when estimating your ad reach.
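The arithmetic behind this is simple, and worth running before you buy. The sketch below assumes a single "match rate," meaning the fraction of purchased users who still have a usable cookie and actually appear where your ads run. That term and the function names are our own illustrative assumptions; ask your provider how they measure the equivalent figure.

```python
def estimated_reach(purchased_users, match_rate):
    """Estimate how many purchased users your ads can actually reach.

    match_rate is an assumed input between 0.0 and 1.0: the share of
    purchased users who still have a usable cookie and show up where
    your ads run.
    """
    if not 0.0 <= match_rate <= 1.0:
        raise ValueError("match_rate must be between 0.0 and 1.0")
    return round(purchased_users * match_rate)


def cost_per_reached_user(total_cost, purchased_users, match_rate):
    """Effective cost per user reached, rather than per user purchased."""
    reached = estimated_reach(purchased_users, match_rate)
    if reached == 0:
        raise ValueError("no reachable users at this match rate")
    return total_cost / reached
```

If you buy one million users and only a tenth can be found, you reach 100,000 people, and your real cost per user is ten times the quoted price.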

Prospect density

As a function of quality, how much data does the provider reject? Consider whether the number of prospects in the segment you’re purchasing seems reasonable. For example, are there really 300 million females in the U.S.? That is an actual figure we have seen while consulting with a client. Inflated estimates of prospect density will lead to poor campaign ROI later on. A little common sense will go a long way.
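Both checks in this section can be written down in a few lines. This is a rough sketch under stated assumptions: the function names are hypothetical, and the population ceiling is a figure you supply yourself from a source you trust, such as current census data. The 170 million used in the note after the code is only an approximate placeholder for the number of females in the U.S.

```python
def exceeds_ceiling(segment_size, population_ceiling):
    """Return True when a segment claims more people than can plausibly exist."""
    return segment_size > population_ceiling


def rejection_rate(records_received, records_kept):
    """Fraction of incoming records the provider discards for quality reasons."""
    if records_received <= 0:
        raise ValueError("records_received must be positive")
    if not 0 <= records_kept <= records_received:
        raise ValueError("records_kept must be between 0 and records_received")
    return (records_received - records_kept) / records_received
```

With a ceiling of roughly 170 million, exceeds_ceiling(300_000_000, 170_000_000) returns True, which flags the segment for a conversation with the provider. A rejection rate near zero is also worth asking about, since it suggests little is being filtered out.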

While there might not be established integrity standards in the data industry, you can do your part to ensure you get the best data possible. Asking strategic questions and discussing the topics we’ve mentioned will help you compare data providers.  

Webbula offers only self-reported, deterministic data that is linked at an individual level. We keep brands safe from fraud, robots, and scammers each and every day, and we remove those hazardous records from our audiences. Contact us today to see how you can drive campaign ROI with high-quality data.