Pete Doe, Media Research & Data Expert, Nielsen
If we want to know what advertising will look like in 2020, we will need to be able to measure it – so what will research look like in 2020? The increasing availability of “Big Data” – for example, online server information, set top box TV data and customer purchase data – is probably the most important factor. Ever-improving computing capabilities and ever-cooler consumer electronic devices suggest a future where ads and promotions are delivered to exactly the right person at the right time in the right place, and (in a somewhat Orwellian or even Pavlovian scenario) the consumer response is inevitably to purchase. All risk is taken out of the system through the use of completely relevant and accurately addressed messaging.
In this scenario the research data are census-based and come directly from the communication process. The same information moves in two directions – downstream to the consumer and upstream to the advertiser. This is very different from the sample-based currency research panels that are commonly used to determine advertising value for many media today.
This vision of a closed-loop system that guides people effortlessly toward purchase is, for the most part, likely to remain just a vision. The real world is a messier place, and there are several reasons why the reality will be different. The four main issues are ownership, coverage, definition and privacy.
Data ownership: the vast amounts of data collected are proprietary. An advertiser planning a national TV advertising campaign would find it difficult to work through all the different individual cable operators to get a consolidated view of where to place its ad. Creating a workable confederation of all the different data sources may eventually occur but it seems unlikely to be in place by 2020.
Coverage: not every transaction or exposure is electronically detected. The majority of set top boxes do not return data, some TVs receive over-the-air signals and do not return data, and people still look at magazines and drive by billboards. And people still talk to each other and make purchasing decisions based on their friends’ recommendations.
Definition: Big Data are machine data – a set top box is tuned to a particular network (though the TV may be off), a computer requests a web page, scanners record purchases. All these data are at least one step removed from people – we know an event happened, but we cannot associate it definitively with an individual. This dilutes the addressability of ad placement – an online ad served and counted using cookies does not directly reflect an identified individual’s exposure.
Privacy: privacy will always be an issue that has the potential to limit data availability, either explicitly due to consumer agreements or legislation, or because of companies’ caution in handling data. For example, the foundation of a cable company’s business is revenue from customers, not the data that the customers’ set top boxes generate, so protecting the customers’ privacy is clearly of paramount concern.
However, despite all these barriers, we are seeing a shift in the research world towards using these data sources in combination with more traditional research data. It seems logical to suggest that hybrid data integrations that combine machine data with person-centric sample-based measurements are likely to be the best way to measure the total consumer experience. High quality sample-based measurement gives a precise view of people’s media and consumer behavior, but the cost of this research is high and sample sizes are small. Big Data can deliver thousands or millions of records giving a sensitive measurement of changes in behavior but with the caveats above – the data will have coverage and definitional issues that deliver a biased or incomplete view of behavior. Combining the two approaches – person-centric samples and machine based Big Data – should give us the best of both worlds.
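The hybrid logic described above – using a small, trusted person-centric sample to correct a large but biased machine-derived measurement – can be sketched as a simple ratio-style calibration. This is a hypothetical illustration only (the function name and the numbers are invented, and real panel/Big Data integrations involve far more elaborate statistical modeling):

```python
# Hypothetical sketch of hybrid calibration: a small panel (unbiased but
# noisy) corrects a large machine-derived measurement (precise but biased
# by coverage gaps, e.g. boxes left on with the TV off, unmetered homes).

def calibrated_estimate(panel_rate, machine_rate_overlap, machine_rate_full):
    """Scale the full-footprint machine measurement by the bias observed
    where panel and machine data cover the same homes."""
    correction = panel_rate / machine_rate_overlap
    return machine_rate_full * correction

# Invented numbers: the panel says 12% of people watched a program, while
# machine data covering those same panel homes report only 10% tuning.
panel_rate = 0.12            # person-centric panel measurement
machine_rate_overlap = 0.10  # machine measurement in the overlap homes
machine_rate_full = 0.11     # machine measurement, full footprint

print(calibrated_estimate(panel_rate, machine_rate_overlap, machine_rate_full))
```

The panel supplies the person-level truth the machine data lack, while the machine data supply the scale and sensitivity the panel cannot afford – which is the "best of both worlds" the paragraph describes.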
Research/information companies and Big Data companies have already embarked on this, integrating data sources such as meter panels with set top box data or online server results. This requires new statistical modeling techniques as well as a clear understanding of the pros and cons of all the data sources being used. The simple research paradigm of “sample/measure/project” that has held sway since the 1930s is not enough when multiple data sources feed into the measurement. The research world has a natural tendency toward conservatism – change means trend breaks, and that can cause business uncertainty – and if the change entails more complexity there can be a natural reluctance to consider it. For the benefits of Big Data to be realized, then, there is some work to be done. The research world needs to (and in some cases is beginning to) agree on standards and measures of quality around Big Data and its use in research, particularly for the currency research that underpins the advertising business.
If the research world can embrace Big Data in a way that satisfies all constituents, it will be a positive move. A better understanding of consumers and better addressing of messages to them have to be good for everyone. Perhaps in 2020 an advertiser may claim that only 40% of advertising is wasted, not 50%. But we will never get to zero, because people are too individual and unpredictable for that… and let’s hope that is always the case.