How to clean your dirty data
In the geolocation business, accuracy is paramount. It’s all too easy to fuel an otherwise perfect campaign with dirty data, rendering it null and void, or pinpoint an individual to within five meters of their location, only to find you targeted the wrong person.
Dirty data, arising from technical glitches, corrupted data and criminal activity, is a common problem in location-based marketing. Its collection and sale affect 88% of companies and lead to a 12% revenue loss on average. But brands can protect themselves. They can start by working with partners who take fraud and inaccuracy seriously. They must demand transparency from their vendors by asking how they gather, filter and validate their data. This will help advertisers discern which vendors are worth working with, and in the process build a data landscape where vendors are held accountable.
Alternatively, accountability can be outsourced to third-party validators who carefully assess the accuracy of each campaign and ensure it’s reaching the right people. The quality of location verification still lags behind fraud, viewability, and content verification, but if the demand is there then the industry will step up to the mark. By requesting campaign validation, or choosing partners who already work closely with third-party validators, brands will see the quality and availability of location data verification gradually improve.
A beacon technology company is the kind of third party capable of demonstrating a campaign’s accuracy and helping brands understand their customers better through more accurate attribution. Because GPS and Wi-Fi location signals form the basis of beacon and proximity data, cross-referencing the two means a campaign’s accuracy is checked twice over.
Vendors, for their part, need to make sure they are being transparent with their clients and scrupulous with the data they sell. They not only need to be able to spot bad data and remove it; they should also know how to use multiple data points to mitigate sample bias. For instance, brands can partner with platforms like ours, which use a process of multi-point quality control to ensure that GPS data is accurate. We then cross-reference this data against anonymized Wi-Fi addresses and our internal database for verification and scale. If any of the data fails our proprietary tech’s scrutiny on the first pass, it’s excluded from our database. Our data hygiene protocol means that on some days 85% of the data we collect is culled.
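To make the idea of multi-point quality control concrete, here is a minimal Python sketch of what such a filter could look like. It is not the proprietary process described above: the `Record` fields, the three checks and the 100-meter agreement threshold are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, in meters


@dataclass
class Record:
    """One location observation. All field names are hypothetical."""
    device_id: str
    gps_lat: float
    gps_lon: float
    wifi_lat: float  # position implied by a Wi-Fi lookup
    wifi_lon: float


def approx_distance_m(lat1, lon1, lat2, lon2):
    """Equirectangular approximation, adequate over a few hundred meters."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return EARTH_RADIUS_M * math.hypot(dx, dy)


def is_clean(rec, max_gap_m=100.0):
    """Apply each check in turn; a record must pass all of them."""
    # Check 1: coordinates must be physically possible.
    if not (-90 <= rec.gps_lat <= 90 and -180 <= rec.gps_lon <= 180):
        return False
    # Check 2: an exact (0, 0) fix is a common placeholder for "no location".
    if rec.gps_lat == 0 and rec.gps_lon == 0:
        return False
    # Check 3: GPS and Wi-Fi must agree to within max_gap_m (assumed threshold).
    gap = approx_distance_m(rec.gps_lat, rec.gps_lon, rec.wifi_lat, rec.wifi_lon)
    return gap <= max_gap_m


def cull(records, max_gap_m=100.0):
    """Return the surviving records and the share that was discarded."""
    kept = [r for r in records if is_clean(r, max_gap_m)]
    culled_share = 1 - len(kept) / len(records) if records else 0.0
    return kept, culled_share
```

Tracking the culled share alongside the surviving records is a deliberate choice: it gives a vendor a single figure to report when a client asks how aggressively the data is being filtered.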
The fight against dirty data is an important one, but the attention to detail can’t end there. The issue of accuracy extends beyond purifying data, because with geolocation, a gap of five meters can put you light years from your target.
If you’re less than satisfied with your data sets, and have less to work with than expected, then there are innovative ways to leverage them to make sure you’re zeroing in on the desired audience. Let’s say a luxury brand wants to target guests at a high-end hotel. If the data isn’t specifically identifying people within the hotel’s boundaries, the brand may be targeting someone totally unsuitable who’s passing on the street. In this case, drawing polygons around our target zones – i.e. manually drawing lines around the perimeter of the hotel on a map – can tell us exactly who is inside or outside a building. After all, when it comes to hyperlocal targeting, whether someone is inside or outside a building makes a world of difference.
Polygons are drawn around what we see when we take a bird’s eye view of buildings and other locations. A store, stadium or airport cannot be mapped out into perfect squares or circles, so drawing polygons ringfences highly specific locations to make the data you have count.
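For readers who want to see how the inside-or-outside test works, the sketch below uses ray casting, a standard point-in-polygon technique. It is an illustration rather than any particular platform's method: the function name and the sample coordinates are made up, and treating latitude and longitude as flat coordinates is an assumption that only holds for building-sized shapes.

```python
def point_in_polygon(lat, lon, polygon):
    """Ray casting: count how many polygon edges lie to the east of the point.

    polygon is a list of (lat, lon) vertices in drawing order. An odd number
    of crossings means the point is inside.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]  # wrap around to close the shape
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside


# A hypothetical L-shaped hotel footprint, traced vertex by vertex.
HOTEL = [
    (40.0000, -74.0000), (40.0000, -73.9996), (40.0002, -73.9996),
    (40.0002, -73.9998), (40.0004, -73.9998), (40.0004, -74.0000),
]

guest_inside = point_in_polygon(40.0001, -73.9997, HOTEL)     # in the lobby wing
passerby_inside = point_in_polygon(40.0003, -73.9997, HOTEL)  # in the notch of the L
```

The passer-by in this example stands inside the hotel's bounding rectangle but outside its actual footprint, which is exactly the case a simple square or circle would get wrong.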
But drawing polygons isn’t a viable option for multiple locations, so that’s where another technique comes in. If a coffee shop chain wanted to target people near all of its branches, without the need to discern who’s inside and who’s outside the premises, we’d simply drop a pin in the coffee shop’s various locations and draw a circle around them. The size of the circle would depend on the campaign’s goals, and the radius could be increased or decreased as the campaign runs. This method scales well for chains, as the locations of the various branches can be identified and incorporated easily.
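The pin-and-circle approach comes down to a distance check against each branch. The following sketch uses the haversine great-circle formula; the branch names, coordinates and radii are invented for the example, and `branches_in_range` is a hypothetical helper rather than a real API.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def branches_in_range(lat, lon, branches, radius_m):
    """Names of every branch whose circle of radius_m contains the point."""
    return [
        name
        for name, (blat, blon) in branches.items()
        if haversine_m(lat, lon, blat, blon) <= radius_m
    ]


# Hypothetical pins for two branches of a coffee shop chain.
BRANCHES = {
    "high_street": (51.5000, -0.1000),
    "station": (51.5100, -0.1200),
}

# A device roughly 100 meters north of the high street branch.
nearby = branches_in_range(51.5009, -0.1000, BRANCHES, radius_m=150)
```

Because the radius is an ordinary parameter, widening or narrowing the circle mid-campaign means changing one number, and adding a branch means adding one pin to the dictionary.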
Consumers are increasingly wedded to their smartphones, but even though there’s more location data than ever before, quantity just isn’t a substitute for quality. And unfortunately, the more marketing dollars that go towards dodgy dealings and dirty data, the greater the issue will become for the industry as a whole. But it’s possible to stop the data rot. With a combined effort from brands, vendors and third-party companies, the industry can change. Some spring cleaning, scrupulous data partners and good use of the data you do trust are enough to ensure a healthier landscape and better results.