The problem with data quality in advertising insights, according to data scientists

Kirsten Lamb

Good data is essential to great advertising. 

Good data = validated ideas. Targeted, personalized campaigns. And better ROI. 

But if you want to prove the power of an idea or feel confident in your advertising decisions you need high-quality data to get you there.  

And bad data is one of the biggest issues in advertising research. Only 45% of professionals trust their data assets and, according to AdAge, around half of the data creatives use to target advertising is incorrect. 

In the words of Timur Yarnall, founder and CEO of Neutronian, bad data powers the ad industry: 

“Unfortunately, most ad campaigns are fueled by irrelevant, untrustworthy data. If you don't have high-quality data, you're in a lose-lose situation: you lose out on potential customers because of mistargeted ads, and you put yourself at risk financially, legally, and reputationally.”

From data bias to data hacking and duplicate data to incorrect data, there are a ton of reasons why your data may not be trustworthy or reliable. 

In this post, I’ll talk you through the issues surrounding data quality in advertising research. We’ll cover what good data is and show you how you can improve the quality of your data with the right tools and best practices. 

Let’s jump in.

What is data quality?

Data quality measures the condition of data. You can measure whether your data is “good enough” by assessing elements of your data — including data integrity, accuracy, completeness, consistency, uniqueness and validity. 

Why data quality matters 

As an essential part of data management, data quality is all about making sure the data you use in your reporting, analysis and decision making is accurate, reliable and trustworthy. 

If your data is incomplete, inconsistent, or incorrect then you won’t have the information you need to make the best decisions for your brand. 

Bad data can lead to mistakes and misinterpretations. 

It can give you incomplete insights into your audience, throw off segmentation and targeting, and drag down your ROI.  

With bad data you may:

  • Think you need to switch up a much-loved product 

  • Choose the wrong channels

  • Push for campaign messaging that will never resonate with your ideal customers

Experian reports that bad data: 

  1. Wastes 42% more resources 

  2. Impacts the customer experience by 39% 

  3. Damages the reliability of your analytics by 38% 

The benefits of good-quality data are too many to count, from helping you find the perfect ad concept to giving you in-depth insights into your ever-changing audiences. You may even ask whether a brand can survive without it. 

And what’s the first step to better data? 

Putting the right data quality measures in place. Let’s take a look.

How to measure data quality

To make sure your data quality is at the level you need it to be, you need to set up data quality standards. Data quality standards are a set of clearly defined and documented data criteria that your data needs to hit for it to be fit for purpose. The right data quality criteria can help you make sure your data is reliable and accurate. 

Data quality metrics 

To assess your data quality, you can use a handful of metrics to help you uncover any hidden data problems. 

Here are some of the best data quality metrics examples: 

1. Data-to-error ratio

How many errors does your data have? The fewer errors you find, the better your data is. 

Data-to-error ratio formula: Divide your total number of errors by the total number of items. 
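
Here's a minimal sketch of that formula in Python. The function name and the example numbers are made up for illustration, and what counts as an "error" is something you'd define in your own data quality standards.

```python
def error_ratio(total_errors: int, total_items: int) -> float:
    """Total number of errors divided by total number of items."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    return total_errors / total_items


# Hypothetical example: 25 errors found across 1,000 survey responses.
ratio = error_ratio(25, 1000)
print(f"{ratio:.1%}")  # 2.5%
```

Tracking this figure after each data collection round shows whether your error rate is trending up or down.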

2. Duplicate data record rate 

Duplicate data is common because people make mistakes. It’s also common because of data silos. Different people can store the same data in different databases or systems — leading to duplicate records. Whatever the reason for it, duplicate data can have a huge impact on the accuracy of your data.  

Duplicate data record rate formula: Calculate the percentage of duplicate entries in your dataset compared to the rest of your records. 
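
As a rough sketch, here's one way to calculate that percentage in Python. It assumes two records are duplicates when they share the same normalized email address. That matching rule, the function name and the sample records are all assumptions for illustration, and the rate is expressed as a share of all records.

```python
from collections import Counter


def duplicate_rate(records, key):
    """Percentage of records that repeat an earlier record with the same key."""
    if not records:
        return 0.0
    counts = Counter(key(record) for record in records)
    # For each key, every record beyond the first is a duplicate.
    duplicates = sum(n - 1 for n in counts.values())
    return 100 * duplicates / len(records)


# Hypothetical respondent records; the first two differ only in case and spacing.
respondents = [
    {"email": "ana@example.com"},
    {"email": "Ana@Example.com "},
    {"email": "ben@example.com"},
    {"email": "cy@example.com"},
]
rate = duplicate_rate(respondents, key=lambda r: r["email"].strip().lower())
print(rate)  # 25.0
```

Normalizing the key before counting matters: without it, the first two records would look unique and the rate would read as zero.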

3. Data time-to-value 

How quickly are you turning your data into insights? And how quickly does your data have an impact on your revenue, customer satisfaction, and audience engagement? 

Quality data is easy to work with. 

If you need to spend a large chunk of time on revising or updating your data then this may be a sign of poor data quality. 

Data time-to-value formula: Define the value you want to measure. Then assess how long it takes until you hit that value. 

4. Number of empty values 

Missing data can throw off your analysis. The number of empty values shows you how often important data is missing or added to the wrong field. 

Calculation: This one’s simple: count and keep a record of how many of your entries have empty fields. Then track this figure over time. 
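
A minimal sketch of that count might look like this. It assumes your entries are Python dictionaries and treats None, blank strings and missing keys as empty. The field names are hypothetical.

```python
def count_empty_values(records, fields):
    """Count empty values per field; None, blank strings and missing keys count as empty."""
    empty = {field: 0 for field in fields}
    for record in records:
        for field in fields:
            value = record.get(field)
            if value is None or (isinstance(value, str) and not value.strip()):
                empty[field] += 1
    return empty


# Hypothetical survey responses.
responses = [
    {"age": 34, "region": "UK"},
    {"age": None, "region": ""},
    {"age": 51},
]
print(count_empty_values(responses, ["age", "region"]))  # {'age': 1, 'region': 2}
```

Saving the result with a date each time you run it gives you the trend over time.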

The 8 dimensions of data quality

Along with the right metrics, you also need to understand the essence of good data. 

Quality data is made up of a number of essential elements or dimensions. 

By understanding each one, you can quickly figure out if your data is good enough. You can also use data quality dimensions to find the root of your data problems. 

For example, you can find out whether your analysis is being muddied by out-of-date information (known as data timeliness) or missing data points (that’s part of data completeness). 

“Data quality dimensions also allow for monitoring how the quality of data stored in various systems and/or across departments changes over time. These attributes are one of the building blocks for any data quality initiative. 

Once you know against what data quality dimensions you will evaluate your datasets, you can define metrics. For instance, duplicate records number or percentage will indicate the uniqueness of data.”

— Altexsoft


1. Accuracy

Data accuracy is all about how correct your data is. 

In his research with a top advertiser, CEO at Exchange Lab, Chris Dobson found that two important elements of campaign insight — attribution and performance data — only matched half of the time. 

As a result, the standard of data the company used with their software was only around 50% accurate — impacting their ability to make reliable marketing decisions. 

Accuracy can also be a huge issue when it comes to surveys, whether your respondents speed through the questions or biased questions skew their answers.  

If your data is inaccurate then you may make the wrong assumptions about your audiences and campaigns. You may target the wrong people, push for the wrong ad concepts, or pour your advertising budget into the wrong channels. 

2. Completeness

You can define your datasets as complete when they have all the elements they should and you have all the information you need to make an informed decision.

Take customer data — if your customer data is complete then you’ll have in-depth data on demographics, psychographics, buying behaviors and channel interactions. 

You can use it to get a holistic view of your customers and make reliable data-informed decisions about how to target them, convert them, and keep them. 

3. Consistency

If your data is consistent then it will stay the same across different records, networks and applications. 

For example: 

  1. You store a customer’s age across different databases. The age of your customer should be the same throughout each of your datasets.

  2. You run a survey on a new product release. If you ran the survey again, the answers should be similar to the ones you got in your first batch.

As IBM notes, when your data is consistent, you know you can trust that information, “Consistent data improves the ability to link data from multiple sources. This, in turn, supplements your data set and increases the utility of the data.”
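
To make the customer age example concrete, here's a hedged sketch that compares one field across two sources and reports where they disagree. The source names, customer IDs and data layout are hypothetical. In practice you'd load these from your own databases.

```python
def find_inconsistencies(sources, field):
    """Return {customer_id: {source_name: value}} where sources disagree on a field."""
    seen = {}
    for source_name, records in sources.items():
        for customer_id, record in records.items():
            if field in record:
                seen.setdefault(customer_id, {})[source_name] = record[field]
    # Keep only customers with more than one distinct value for the field.
    return {
        customer_id: values
        for customer_id, values in seen.items()
        if len(set(values.values())) > 1
    }


# Hypothetical data: the same two customers stored in two systems.
crm = {"c1": {"age": 34}, "c2": {"age": 51}}
survey_tool = {"c1": {"age": 34}, "c2": {"age": 15}}

mismatches = find_inconsistencies({"crm": crm, "survey_tool": survey_tool}, "age")
print(mismatches)  # {'c2': {'crm': 51, 'survey_tool': 15}}
```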

4. Uniqueness

Data uniqueness means that the data in a dataset is original. 

If your data is unique then you’ll be able to answer “yes” to the question: “Does this information appear only once in my dataset?”

But if you have duplicate data on a respondent’s buying behavior then your data isn’t unique. 

5. Timeliness

Timeliness measures how up-to-date your data is. If you’re relying on data from 2022 to make decisions in 2024 then your data won’t pass the timeliness test. 

And if your data isn’t timely then it’s not reliable. 

You need to make decisions based on who your audience is right now, not who they were last year. And brand perception can change in weeks with a sudden event, competitor decision or unexpected celebrity endorsement.
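
One simple way to run a timeliness test is to flag records older than a cutoff. The sketch below assumes each record carries a collection date and that anything older than a year is too old. Both the field name and the one-year threshold are assumptions you'd tune to your own research.

```python
from datetime import date, timedelta


def stale_records(records, as_of, max_age_days):
    """Return the records collected more than max_age_days before as_of."""
    cutoff = as_of - timedelta(days=max_age_days)
    return [record for record in records if record["collected_on"] < cutoff]


# Hypothetical records with a collection date.
records = [
    {"id": 1, "collected_on": date(2022, 3, 1)},
    {"id": 2, "collected_on": date(2024, 1, 15)},
]
stale = stale_records(records, as_of=date(2024, 2, 1), max_age_days=365)
print([record["id"] for record in stale])  # [1]
```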

6. Validity

Data validity measures the extent to which your data matches the format, type, standards and range you expect it to. To keep it simple: if your data is valid then it measures what you expect it to measure. 

Examples of data validity include emails following the expected “@” format or a date of birth following the standard “month, day, year” format. 
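
Here's a small sketch of both checks using only Python's standard library. The email pattern is deliberately loose (something, then "@", then a domain with a dot) and is not a full standards check. The date format string is an assumption for a numeric month/day/year layout.

```python
import re
from datetime import datetime

# Loose illustrative pattern: text, "@", text, ".", text, with no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_valid_email(value):
    """True if the value looks like an email address."""
    return bool(EMAIL_PATTERN.match(value))


def is_valid_birth_date(value, fmt="%m/%d/%Y"):
    """True if the value parses as a real calendar date in the expected format."""
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True


print(is_valid_email("ana@example.com"))  # True
print(is_valid_email("ana.example.com"))  # False
print(is_valid_birth_date("07/04/1990"))  # True
print(is_valid_birth_date("02/30/1990"))  # False (February 30 does not exist)
```

Note that a value can be valid without being accurate: "07/04/1990" passes the format check even if the respondent was born in a different year.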

7. Freshness

Your data is fresh if it accurately describes the “real world, right now.” When your data is fresh, it’s up-to-date. It can tell you what’s currently happening with your campaigns, audiences, creatives and platforms. 

If you’re thinking that sounds like data timeliness, you’re not wrong. 

Metaplane does a great job of explaining how it's different: “This data quality dimension is closely related to the timeliness of the data but is compared against the present moment, rather than the time of a task.” 

8. Integrity

With data integrity, your data stays accurate and consistent over time. When your data is consistent, you know it hasn’t changed and you can still rely on it to get all the information you need. 

The two main types of data integrity are: 

1. Physical integrity

Events like natural disasters can physically impact your data. You can protect the physical integrity of your data by putting protections and disaster recovery plans in place to protect your hardware. 

2. Logical integrity 

Database hacks or user mistakes can undercut your logical data integrity. You can support logical integrity by making sure that your data remains unchanged when you use it in a relational database — maintaining its accuracy and consistency.


Data quality management challenges and best practices

Now you know the essence and the importance of good data, let’s take a look at some of the biggest data quality management challenges and their solutions. 

Data quality management challenges

As you’ve seen in the different dimensions of data quality we’ve covered above, there are a lot of things that can drag down the quality of your data. 

Here’s what you need to watch out for: 

1. Siloed data 

The average business uses 91 marketing cloud services. And that’s just one type of tool in your tech stack. Many businesses struggle with data silos, where data is isolated or scattered across different tools and departments. As Chris Dobson says in this Forbes article: “For many marketers, the abundance of data produced by disparate sources has made the task of identifying and unifying relevant insight seem colossal.” 

If you don’t put integration processes in place or centralize your data then your data may be “watered down” — undermining your data integrity. 

2. Duplicate data 

Data silos often lead to duplicate data — one of the biggest data quality problems. Without the right data cleansing practices (more on that below), it’s easy to ‘collect’ duplicate data. Duplicate data can mess up your statistical results and undermine your analytics. This can throw off your ability to make data-informed choices. 

3. Missing data 

Missing data is another one of the biggest data quality issues. Missing data often comes from failures during data collection. The more complex your datasets are, the more likely you are to miss important information.

Not only can missing data impact your analysis, but it’s also time-consuming to get the data you need. 

One way to make sure you have the data you need is to use automated systems for form completion. Take consumer surveys: you can use automation to automatically reject any form submissions with incomplete fields.
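
As an illustration, an automated completeness check on survey submissions could look something like this. The required field names are hypothetical, and a real survey platform would usually enforce this at the form level instead.

```python
# Hypothetical list of fields every submission must fill in.
REQUIRED_FIELDS = ("age", "region", "favorite_ad")


def split_submissions(submissions, required=REQUIRED_FIELDS):
    """Separate submissions into accepted (complete) and rejected (incomplete)."""
    accepted, rejected = [], []
    for submission in submissions:
        # A field is incomplete if it is absent, None, or an empty string.
        missing = [f for f in required if submission.get(f) in (None, "")]
        if missing:
            rejected.append(submission)
        else:
            accepted.append(submission)
    return accepted, rejected


submissions = [
    {"age": 34, "region": "UK", "favorite_ad": "A"},
    {"age": 51, "region": "", "favorite_ad": "B"},
    {"age": 28, "favorite_ad": "A"},
]
accepted, rejected = split_submissions(submissions)
print(len(accepted), len(rejected))  # 1 2
```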

4. Inaccurate data 

What’s the cause of 75% of data loss? Human error. 

From migration to input, it’s easy for both consumers and businesses to make mistakes when dealing with data. 

Think: Extra zeros, misspelled locations and misreported psychographics. 

Other factors like data decay can also make your data less accurate. Gartner reports that globally, 3% of data decays each month. Data decay is the slow breakdown of data quality over time. And data decay can come down to things like data aging or software glitches.
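
To get a feel for what a 3% monthly decay rate adds up to, here's a quick back-of-the-envelope calculation. It assumes the decay compounds, meaning 3% of whatever is still accurate goes bad each month. That compounding model is an assumption for illustration, not something stated in the figure above.

```python
def share_still_accurate(monthly_decay, months):
    """Fraction of data still accurate after compounding decay for a number of months."""
    return (1 - monthly_decay) ** months


remaining = share_still_accurate(0.03, 12)
print(f"{remaining:.0%}")  # 69%
```

Under that assumption, roughly three in ten records would be out of date after a year if nothing is refreshed.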

Data quality management best practices

“Successful brands treat data as foundational — and not as a commodity.

To ensure successful execution, strategic brands see data as the foundational piece of their campaigns — not a supplemental afterthought.”

— Matt Frattaroli, Vice President of Alliant

If you want to treat data as a fundamental part of your advertising strategy rather than an afterthought or commodity, then put your time into building processes and architecture that will support your data quality. 

This is where data quality management comes into play. 

Data quality management (or DQM for short) is the name for the set of processes and practices organizations build on to make sure their data hits the level of quality they need it to. With the right data quality management practices in place, you can swerve or lessen the impact of siloed, duplicate, or missing data. 

With DQM practices, you can step up your data quality, improving your data timeliness, integrity, and more. Here’s how. 

1. Profile your data 

Data profiling is the first step in the DQM process. When you profile your data, you consolidate it and analyze it for any issues, such as errors or missing information. Once you know what’s wrong with your data, you can move to step two: cleaning your data. 

2. Clean your data 

After profiling your data, it’s time to clean it. When you clean your data, you correct, remove or update it to make sure it meets a certain standard. By putting regular data cleansing practices in place, you can remove duplicate data, check for missing data and fix any mistakes. 
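
Here's a hedged sketch of what a basic cleaning pass could look like. It assumes records are keyed by email address, normalizes that key, skips records where it's missing and keeps the first record for each key. The field name and the "keep the first one" rule are assumptions. Your own rules may call for merging records or flagging them for review instead.

```python
def clean_records(records, key_field="email"):
    """Normalize the key field, skip records missing it, and drop later duplicates."""
    cleaned, seen = [], set()
    for record in records:
        raw = record.get(key_field)
        if not isinstance(raw, str) or not raw.strip():
            continue  # key is missing or blank: skip rather than guess
        key = raw.strip().lower()
        if key in seen:
            continue  # duplicate of a record we already kept
        seen.add(key)
        cleaned.append({**record, key_field: key})
    return cleaned


# Hypothetical records before cleaning.
raw_records = [
    {"email": " Ana@Example.com", "region": "UK"},
    {"email": "ana@example.com", "region": "UK"},
    {"email": "", "region": "US"},
    {"email": "ben@example.com", "region": "US"},
]
cleaned = clean_records(raw_records)
print([record["email"] for record in cleaned])  # ['ana@example.com', 'ben@example.com']
```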

How often do you need to clean your data? That depends on how big your business is. 

If you're a larger business, clean your data every three to six months. If you’re a smaller business, you can get away with cleaning your data once a year. 

3. Integrate your data 

“When insights processes happen on an ad hoc basis, the consumer sits at the periphery instead of at the center. When brands test ads and don’t even store their data in any platform or system, insights are obtained on a one-off basis. When that data doesn’t get integrated into a broader ecosystem, the focus is only on the point-in-time impact.”

- Nataly Kelly, Zappi CMO

As we explored above, data silos get in the way of advertising insights. Data integration is essential for giving you a complete view of your audience, channel, and campaign data. Put the infrastructure and platforms in place to integrate and centralize your data. 

4. Monitor and repair your data  

By monitoring your data, you can spot any mistakes and empty data fields and fix them to improve your data’s reliability and accuracy. 

Data validation is often part of this process — put checks in place to make sure your data meets the expected criteria and formats. You can use automation to help roll out data monitoring and repair at scale, over time. 

5. Data governance

Data governance means putting policies and procedures in place for how your organization manages its data. 

Examples of data governance include defining data ownership, access controls and data maintenance processes. 

Data quality management stakeholders

Data quality management stakeholders help to shape, roll out and uphold the data quality management practices your team lays out. 

What data governance roles are the most important?

1. Data admin 

The data admin puts your data governance program into place. They manage and organize your data and provide data access to data users. 

2. Data steward

It’s a data steward’s role to manage both data and processes. It’s their job to make sure that your data meets your quality standards. They also help people use data in an effective and appropriate way. They typically set up and make sure teams stick to policies. 

3. Data custodian 

The data custodian has responsibility for your technical environment and databases. They handle the storage, movement and security of your data. They also work with your data stewards to fix data quality issues. 

4. Data users 

Like the name suggests, your data users are the people who use your data. Examples may include: analysts, marketers, and high-level decision makers. 

Data quality management tools = better data

Data quality management tools can be an important part of your data quality management approach. You can use these tools to help automate your data quality management processes and improve the quality of your data. 

At Zappi, we help you make sure your consumer insights data is reliable and trustworthy. We built our agile market research platform to help give you data you can trust and turn into actionable advertising insights. 

When it comes to survey data, we’ve built targeted systems that screen out low-quality responses from bots or fraudulent respondents. Known as the Zappi Quality Score, this system assesses 14 signals of quality at the individual respondent level — empowering you to improve the accuracy and reliability of your research data.

Read more on the Zappi Quality Score here.

You’re able to centralize your research methodologies and improve the quality of your research data with Zappi, as well as help turn actionable consumer insights into the kinds of ads that keep people talking and buying. 

“Since partnering with Zappi, our creative effectiveness has improved by almost a third across all our advertising. This equates to PepsiCo gaining hundreds of millions in value.”

— Stephan Gans, SVP Chief Consumer Insights & Analytics Officer, PepsiCo


Improve your data quality with Zappi

From a successful ad campaign to a winning product launch, so much of your brand’s success depends on the quality of your data. But without a solid understanding of the essential elements of good data, and without the right DQM processes and tools in place, bad data can take you in the wrong direction, from ads that bomb to wasted revenue. 

If you’re looking to create targeted ads and better products with consumer insights data you can trust, reach out to us.
