SAN DIEGO, CALIFORNIA

ELECTRONIC MUSIC TESTING

ANALOG vs. DIGITAL

MAY 2005

In 1988, long before the launch of Pinnacle Media, Frank Cody and Owen Leach sat down in a conference room in Princeton, NJ, and began examining the flaws, impracticalities, and numerous problems that programmers had uncovered in traditional paper & pencil music testing. Before hailing what emanated from those meetings (the development of electronic data collection, or interactive-digital methodology, and eventually the birth of Pinnacle Media Worldwide), it is important to outline what those inherent flaws were at the time and, in fact, still are today.

It was actually programmers within the industry, not researchers, who began to notice some consistently troublesome occurrences during the course of their paper & pencil music tests. The following is a summary of those issues, followed by an examination of how they have been remedied through technological advances.

1- Intellectual Responses

For years, music was tested intellectually. We asked people to score songs on a 5- or 7-point scale, which made respondents feel as if they were taking an SAT-style test and forced them to put a number on a product that they simply did not use that way. Music is not a product of intellect; it is one of emotion & passion. In fact, most researchers and programmers will agree that the station generating the most emotion will earn the distinction of the most compelling, superlative product. The task was to identify statistically reliable and accurate methods that could measure and harness that emotion. Simply put, people listen to the radio and react emotionally, changing stations when they dislike a song and turning it up when they love it!

2- Too Many Questions in Seven Seconds

Once a hook begins, much like driving a car, there are a couple of seconds of reaction time. That leaves only about five seconds of a seven-second hook to make the several calculations and intellectual decisions required. Within those five seconds, the first question asked in a paper & pencil test is, “Do you know that song?” or “Is it familiar?”

Next, they must score the song by first translating their emotional response (“I love it, I like it, it’s just ok, I don’t like it, or I hate it”) to a number from 1-5, and then transfer that score onto a scantron sheet by filling in an SAT-style bubble. Then, within the same five seconds, they must determine whether they are “tired of the song” (and quite often “to what degree”). Finally, in some cases they are asked, “On what station would you most expect to hear that song?”

It was the King of Pollsters, George Gallup, who pointed out that the best research asks only one question at a time! There was simply too much information to process intellectually in a seven-second period.

We learned, after studying and interviewing hundreds of respondents following tests of this style, that they would fall behind very quickly and, as a result, would often just fill in bubbles arbitrarily or copy a neighbor’s responses to keep up with the count…which leads to:

3- Keeping Track of the Slate Numbers

With paper & pencil music tests, it is imperative to have song slates before each title so the respondent can be certain the scores on the scantron sheet match the correct songs. One of the greatest threats to accurate research resulting from this problem was that people often either left bubbles empty or got to the end and realized that they had mismatched the last hundred songs, with each score off by one. Slate numbers were most often perceived as yet another piece of information that respondents had to process, which also led to:

4- Extreme Fatigue

Inarguably, six or seven hundred titles represent a great deal of material to present to respondents regardless of the methodology. All research companies agree that a top priority is getting what is needed with as little fatigue as possible. The number one complaint consistently received during this long period of “method testing” was the sheer volume of material. Yet traditional music testing achieved the worst grades and added insult to injury by not only inserting two seconds before each cut, but also reminding respondents every seven seconds of just how many titles they had heard (e.g., “song # five-hundred and seventy-three”). With up to four responses for each song, the level of fatigue becomes great: seven hundred titles translate to as many as 2,800 calculations within a two-hour period.
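The arithmetic behind that workload can be checked quickly. The sketch below is illustrative only: it takes the upper end of the title count (700), the four responses per song, and the two-hour session quoted above, and the function names are our own, not part of any Pinnacle tool.

```python
# Figures quoted in the text above; nothing here is measured data.
TITLES = 700               # upper end of "six or seven hundred titles"
RESPONSES_PER_TITLE = 4    # familiarity, score, burn, station attribution
SESSION_MINUTES = 120      # the "two-hour period"


def total_responses(titles: int, per_title: int) -> int:
    """Number of separate answers a respondent must give."""
    return titles * per_title


def seconds_per_response(titles: int, per_title: int, session_minutes: int) -> float:
    """Average time available for each answer if the session is used in full."""
    return session_minutes * 60 / total_responses(titles, per_title)


if __name__ == "__main__":
    print(total_responses(TITLES, RESPONSES_PER_TITLE))
    print(round(seconds_per_response(TITLES, RESPONSES_PER_TITLE, SESSION_MINUTES), 2))
```

With these inputs a respondent gives 2,800 answers and has, on average, under three seconds for each one.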

5- Position Bias

Many firms using paper & pencil methodology conducted only a single session, with all respondents in the room at one time. There is, without a doubt, a definite correlation between fatigue and test position. With any methodology, the songs tested at the beginning and the end exhibit statistical variances from those tested in the middle. In second and third sessions, some firms would simply reverse the order, assuming that this would adjust for the inherent bias; however, moving titles from the middle to the beginning and end has become the preferred method. To maintain the best possible adjustment, Pinnacle’s DMT™ tests songs in groups of fifty. Most firms now use this as the industry standard, regardless of the methodology. There is yet another bias that is not considered when testing in only one session:
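One way to picture testing in groups of fifty is as a rotation of song blocks between sessions, so that titles heard in the middle of one session open or close another. The sketch below is an assumption for illustration: the text does not describe the actual rotation scheme used, and `make_blocks` and `session_order` are hypothetical helper names.

```python
from typing import List, Sequence


def make_blocks(songs: Sequence[str], block_size: int = 50) -> List[List[str]]:
    """Split the test list into consecutive groups of `block_size` titles."""
    return [list(songs[i:i + block_size]) for i in range(0, len(songs), block_size)]


def session_order(blocks: List[List[str]], session_index: int, n_sessions: int) -> List[str]:
    """Return the play order for one session.

    Assumed scheme: each session starts at a different block, spaced evenly
    through the list, and wraps around to the beginning.
    """
    if not blocks:
        return []
    shift = (session_index * len(blocks)) // n_sessions
    rotated = blocks[shift:] + blocks[:shift]
    return [song for block in rotated for song in block]


if __name__ == "__main__":
    songs = [f"song{i}" for i in range(600)]   # placeholder titles
    blocks = make_blocks(songs)
    for session in range(3):
        order = session_order(blocks, session, n_sessions=3)
        print(session, order[0], order[-1])
```

Every session still hears every title once; only the position of each group of fifty changes.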

6- Lifestyle Variables and Sample Quotas

While single sessions may appear more efficient and are certainly more cost-effective, they do not take into account the various lifestyles and abilities of the sample. A single session minimizes the ability to randomize your sample properly by limiting it to only those people who are able to attend at one specific time. Research is much better served by offering respondents a choice of session times, which in turn makes it more practical to meet the quotas outlined within the screener.

However, we must mention that conducting one music test over a 3-4 week period in order to avoid any “single night” bias is in itself flawed, because it introduces the “time and age” of the song as a variable. A new song heard by one session in the first or second week of its life cycle and by another session four weeks later is no longer the same stimulus, and that impacts the final scores. The song’s age is now a variable that can no longer be quantified, especially by those testing burn and familiarity, since time is the variable that has the greatest impact on those responses.

Along with other quality firms, Pinnacle Media guarantees samples within ±10% and will supplement a project with an additional session when necessary to meet the strategic and tactical goals of the study. As a high-quality research firm, Pinnacle Media measures its own success by delivering the correct sample and converting that into improved ratings.

The Biases of ALL Music Research

It must be said that all research has some inherent bias. The German physicist Werner Heisenberg developed the “Uncertainty Principle”, which is popularly paraphrased as saying that you cannot truly observe something without introducing variables that disturb it. Here is a list of variables that impact ALL research regardless of methodology:

* No environment can exactly match those in which listeners use the product.
* Only those people willing to participate are even included in a research project. (The silver lining to that of course, being they are the same people who will agree to fill out a diary.)
* Only those willing to answer their phone when being recruited can participate.
* No one ever listens to only the hook of a song.
* Moods of respondents based on their day, activities, stresses, etc. are all variables out of our control.

The list actually goes on but the best research minimizes these variables and attempts to “level the playing field” so all things are equal. In the imperfect world of research, that must remain a most important goal.

 

The Digital Solution

Twenty-five years ago, Madison Avenue developed an alternative to traditional focus groups that enabled them to capture the emotional appeal of television commercial concepts and TV pilots. Shortly thereafter, film studios picked up on the growing trend to test “rushes” of films in production, and political campaigns began using the technology in focus groups and auditorium studies for testing candidates. This new methodology came with the advent of digital, interactive data collection. In 1988 we first began exploring how best to utilize this technology to test music and soon discovered its ability to harness true emotional appeal. Measured against all the flaws outlined above, it quickly became apparent that the dials didn’t just minimize those biases; they eliminated them.

1- Intellectual Responses

Pinnacle’s Digital Music Test™ uses a scale of 0-100, yet we do not force respondents to actually “score” songs. They are instructed to do what they do when listening to the radio. When they like a song they turn it up. When they don’t like it they turn it down. The degree to which they turn up or down tells us how much they like or dislike the song. We learned to ask respondents to tell us how they “feel” about the song today. While interactive dials have the ability to measure “burn” and “familiarity”, Pinnacle Media chooses to live by the Gallup axiom: “ask only one question at a time”.

We must note that “burn” is truly a function of current music, and in fact we utilize it in OnlineTRACKER. Burn was developed in order to track the life cycle of new music, which is usually 20-25 weeks, after which a song either disappears or makes it into the library. As a result, respondents are asked to use their dial to reflect how they feel about the song today. How they used to feel is irrelevant. What programmers need to determine is how people feel about songs…not why, without forcing them through too many hoops.

We have learned that asking about “familiarity” likewise clouds the issue and is an intellectual response. The answer to that question, like burn, is already built into their emotional response to every song. We therefore leave “burn” and “unfamiliarity” to callout and OnlineTRACKER, where the playing of a hook does not begin until the previous song’s information has been recorded.

2- Testing Relative Product Quality

With the advent of digital technology came another tremendous advantage. We discovered an ability to measure actual on-air music mixes of the client station, as well as their competitors. By scoping down two hours of music from each station we could now measure the “relative product quality” of one station against several others. This would become one of the most empowering and compelling points-of-differentiation between dials and paper & pencil. How does one’s core and cume respond to each station? What is their “intent-to-listen” to each station’s mix? And are they able to correctly attribute each to the proper radio station?

3- Less Fatigue

After doing several side-by-side tests using dials and paper & pencil, we learned that respondents felt the dial was more like listening to the radio, more emotional, more accurate, and far less fatiguing, with far fewer complaints about the amount of test material. Since the dial is read second-by-second and data is recorded in synchronization with every song, respondents no longer need to keep track of the songs. There are therefore no slates, so participants work less and complete the same number of songs in less time. As a result, Pinnacle’s DMT™ is afforded nearly 25 more minutes than paper & pencil to ask perceptual questions and test other types of material:
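A rough check on that time saving, assuming it comes mainly from dropping the two-second slate that paper & pencil tests insert before each cut (the text does not itemize the saving, so treat this as a plausibility sketch rather than the firm’s own calculation):

```python
def slate_minutes(titles: int, slate_seconds: float = 2.0) -> float:
    """Minutes spent on slates alone for a test of `titles` songs."""
    return titles * slate_seconds / 60


if __name__ == "__main__":
    # The text cites tests of six or seven hundred titles.
    for titles in (600, 700):
        print(titles, round(slate_minutes(titles), 1))
```

For 600 to 700 titles the slates alone account for roughly 20 to 23 minutes, which is in line with the figure of nearly 25 minutes once other overhead is removed.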

4- Digital Content Analysis™

Digital-dial technology also offers Pinnacle clients the option of testing morning shows and personalities, as well as TV spot campaigns (including those of competitors), keeping in mind that the technology was first developed for this purpose. Along with testing on-air music mixes (and even prototypical pods), this translates into wonderful opportunities to turn a simple “music test” into something far more valuable, while still recognizing the limits of a tactical sample.

5- Immediate Results

One advantage that clients enjoy is the viewing room and next-day results. During a Pinnacle Media Digital Music Test™, clients sit in a hidden, adjacent room watching the results on the screen in real time. They can “see” how the sample “feels” moment-to-moment, giving them an actual snapshot of their audience during any song. You can literally and graphically see when they tune out and when they turn it up. Data from all sessions is then crunched overnight, with final results presented to the client the next morning, less than 12 hours after testing is complete.

Winner of the “Most Asked Question” Award

The most common question asked by first-time users is whether or not the previous song will impact the way one responds to the next song. There are a few points that must be made here:

A) If previous songs impact the way one feels about the next song, then all music testing is superfluous. Once we put songs on the radio and place them next to other songs, what we learned in the test would become meaningless, since the previous song heard would be affecting how listeners feel about the song on the air right now.

B) If it’s true with dial technology, then it’s true with paper & pencil, since neither method asks the respondent to forget the previous song before scoring the next one. However…

C) Paper & pencil testing actually affords more chance of previous songs impacting subsequent songs, through what is called the “rule of the 3’s”. A paper & pencil answer sheet keeps a running history of how respondents scored all previous songs. It is right in front of them, and they begin to decide what they should score next based on how the previous answers look. If a respondent has scored a sequence of three straight 5’s, the temptation to vary the answer for upcoming songs becomes greater. We have all experienced this on standardized tests: when we saw too many answers that appeared the same, our temptation was to just change it up. With dials there is no previous history to bias subsequent songs. All respondents see in front of them is the current second.

D) If previous material had an impact on subsequent material, then it would affect personality testing as well. Many research firms use dial methodology for this purpose and notice no variance in their content testing.

E) This continues to be tested time and time again by simply placing the same song in the same study multiple times. These songs always test within acceptable statistical variance. This is something never tested using paper & pencil, but it occurs at least four or five times in every music test we do, mostly because as-is pods end up playing a few of the same songs in head-to-head battles.
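A repeat-play check of this kind can be sketched as a paired comparison of each respondent’s dial score on the two plays of the same song. The text does not say how “acceptable statistical variance” is defined, so the threshold below (mean difference within two standard errors) is our assumption, and the scores are made-up placeholders, not test data.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def repeat_play_check(
    first: Sequence[float], second: Sequence[float], z: float = 2.0
) -> Tuple[float, float, bool]:
    """Compare two plays of the same song, respondent by respondent.

    Returns (mean difference, standard error of the difference, consistent?).
    "Consistent" is an assumed rule: |mean difference| <= z * standard error.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need the same respondents, at least two, for both plays")
    diffs = [b - a for a, b in zip(first, second)]
    mean_diff = mean(diffs)
    std_err = stdev(diffs) / sqrt(len(diffs))
    return mean_diff, std_err, abs(mean_diff) <= z * std_err


if __name__ == "__main__":
    # Hypothetical 0-100 dial scores for six respondents.
    play_one = [70, 55, 80, 62, 90, 45]
    play_two = [72, 50, 83, 60, 88, 49]
    print(repeat_play_check(play_one, play_two))
```

Individual respondents may move a few points between plays; what matters for the claim above is that the sample-level difference stays small relative to its standard error.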

Pinnacle Media Worldwide

Pinnacle Media Worldwide was conceived, created, and developed by programmers for programmers. Each member of Pinnacle’s team has been in the trenches fighting the battles, and each is dedicated to creating new, reliable methods that help our clients drive ratings and revenue. We, like many others, have learned that “insisting on doing things the same way means you could be doing them wrong.” If not, we would all still be watching VHS tapes, playing vinyl records, using cassette players in our cars, and heaven help us all…talking on pay phones!

Creating the most advanced and accurate methods of research remains Pinnacle Media’s goal and mission as we continue to help empower our clients in the coming years.

 

© Copyright 2001-2010 Pinnacle Media Worldwide.com
