Counting On Our Toes
When it comes to homespun customer research in the membership resort industry, there’s good news…and bad news. The good news is that it really isn’t much different from the way it’s done in any other industry. The bad news, however, is that just because it isn’t different doesn’t make it any easier. To be meaningful, it has to be done right.
With that thought, we’d like to share a few insights with managers intractably bent on saving a few dollars in consulting fees by doing their own project. A word of caution: customer research can be tricky. The following paragraphs are meant to provide a simple overview of the high points, not a graduate seminar, so we won’t be discussing margin of error, significance testing, normal and non-normal distributions—none of the wickedly technical stuff lurking in the shadowy labyrinth of statistical analysis. It comes puckishly to mind that 16th-century territorial maps often bore the inscription “Hic Sunt Dracones” (here be dragons) to denote unexplored areas potentially fraught with peril. You’ve been warned.
A key step in the customer research process is figuring out what the heck it is you want to know. Be sure to canvass your stakeholders as you develop research objectives so as to avoid producing unendorsed, scattershot information that is neither relevant nor actionable. All the knowledge in the world won’t put another camper on the site, collect another maintenance fee, or develop another activity if you cannot do something with what you learn. If your findings are not endorsed, relevant, and actionable…they make a dandy paperweight.
A next step involves choosing the proper research method. Generally speaking, there are three: survey, observation, and experiment. While each method has its place, the survey’s broad familiarity suggests we focus on it. Surveys can be done over the phone, through the mail, in face-to-face interviews, on the internet, in malls and stores, and with focus groups: just about anywhere you have questions to ask and people to ask them of. A very effective and economical type of survey is the mail questionnaire. Because its generally longer format can be completed in the respondent’s home (privately, conveniently, and with the appearance of anonymity), the mail questionnaire fosters candor and superior depth of inquiry. What it lacks, however, is spontaneous follow-up dialogue, so include an open-ended opinion section: it will give respondents an opportunity to share their views about topics of their own choosing…and that will help plug holes inadvertently left in the survey’s design.
What do we mean by “design”?
We mean thinking about specific questions to ask, wording, rating scales, the kind of data you want to generate, survey length, completion instructions, gender bias, response incentives—and more. This is a complex topic, so it will be addressed in a follow-up article, but here’s a word to the wise in the meantime: pay close attention to the questionnaires you receive in the mail, because as often as not they’re coming from professionals. Done judiciously, it may be possible to adapt something that is already out there. Note the word “adapt,” folks: we are not encouraging blatant thievery…just suggesting that if you pay attention to what the professionals are doing, wait for a dark night with plenty of cloud cover…well, you get the idea: if they sent it to you, they must have wanted you to have it!
This next point is very important: unless your population of interest is small (fewer than 500 households, say), you’re probably going to need a sample. A sample is a cross-section of the target population that is small enough to keep the project economical, but large enough to be representative. It matters that the sample be representative because the work invested in development goes right out the window if the questionnaire is sent to the wrong people! That’s obvious, right? Maybe, but the fact is that many surveys produce questionable, even useless results because not enough attention was given to the appropriateness of potential respondents. Would we ask our dentist whether our car has a timing belt or timing chain? Probably not. Nor would most of us ask our mechanic to take a look at our teeth, maybe give us a price on a crown. A well-designed survey asks the right questions; a thoughtfully drawn sample ensures those questions are asked of the right people.
To draw your sample, you need a task-appropriate list. For example, political pollsters often use voting lists. A Chamber of Commerce wanting to query local businesses might use the Yellow Pages directory. A resort’s membership list generally works well so long as it is comprehensive enough to include small but important groups, and does not have a built-in bias or hidden order. If the purpose of the survey is to gather broadly based information, you’ll obviously not want to use a delinquent-dues list to generate the sample. On the other hand, if you want to find out why people aren’t paying you, the delinquent-dues list may be a good place to start. Think task-appropriate.
There are several methods for drawing the sample, but with a mail questionnaire intended for a cross-section of your members, you might use systematic sampling. Also called “nth name” sampling because it draws every third, tenth, or “nth” name from the list, systematic sampling is a bit less cumbersome than random sampling and is suitable for most applications.
Let’s say that you have 2000 members, they’re a fairly homogeneous group, and your research objectives are simple. After talking it over with your stakeholders, you feel that 100 survey responses will produce a representative cross-section. (Note: more diversity and more complexity call for a larger sample.)
Response rates ranging from 10% to 30% are common, so for this illustration, we’ll assume 25%, which means that you want to identify and mail 400 households because a 25% response rate should produce 100 returns. To create that list of 400 households, divide the target population by sample size—this generates the nth name (2000/400 = 5), which means you’re going to send the survey to every fifth household. Randomly select a starting point on the list, highlight every fifth household, draw a few additions to accommodate invalid addresses…and you’re golden. Well…golden-ish, anyhow.
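For readers who keep the membership list in a spreadsheet or database, the nth-name arithmetic above can be sketched in a few lines of Python. Treat it as an illustration rather than a prescribed tool: the function name systematic_sample, the made-up household labels, and the choice to pick the random starting point within the first interval are all assumptions added here, and the few extra names for invalid addresses are not modeled.

```python
import math
import random


def systematic_sample(members, target_returns, response_rate, seed=None):
    """Return (sample, interval) for an nth-name draw from a member list."""
    # Mailings needed so expected returns reach the target (100 / 0.25 = 400).
    mail_count = math.ceil(target_returns / response_rate)
    # The "nth name": population divided by sample size (2000 / 400 = 5).
    interval = max(1, len(members) // mail_count)
    # Assumption: the random starting point is chosen within the first interval.
    start = random.Random(seed).randrange(interval)
    return members[start::interval][:mail_count], interval


# Hypothetical list standing in for a 2000-household membership roll.
members = [f"household-{i:04d}" for i in range(1, 2001)]
sample, interval = systematic_sample(members, target_returns=100, response_rate=0.25)
print(interval, len(sample))  # 5 400
```

Passing a seed makes the draw repeatable, which can help if stakeholders later ask how the list was produced.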
Once you’ve drawn the sample, it’s time to print and mail the questionnaire. Unless in-house publishing capabilities are really good, it’s time to spend a little money with the local printer: appearance is one aspect of the project too important to do on the cheap. Also, be sure to include a postage-paid return envelope with the mail packet; and since we’re already spending money, it never hurts to add a little incentive for completing the questionnaire. A date-certain drawing for $500 can bump your return rate substantially.
Lurking ahead
Do-it-yourself customer research tips will continue in a future issue with Counting On Our Toes, Part Two as we address questionnaire design, simple analytical techniques—and the all-important KISS factor. One final, final thought: this overview in no way is intended to replace the kind of highly technical services provided by the many fine professional consumer research consultants that serve our industry (including—ahem–the author). If you are concerned that what you want to accomplish is more than you should tackle, pick up the phone: it’s better to be out a few bucks and right… than ahead a few bucks—and wrong.
The KISS principle
Now let’s examine two more aspects: the KISS principle as it relates to questionnaire design, and statistical analysis. We should mention, however, that the caveat from the first article applies equally here: these few paragraphs are an overview, not an exhaustive study of all relevant issues. Management intent on doing its own customer research is strongly urged to explore the subject more closely because the pesky old devil does indeed hide in the details.
In questionnaire design, the acronym KISS means just about what you think it does, absent the pejorative: Keep It Short and Simple. KISS applies to the scope of the project, the length of the questionnaire or survey, and the wording of its questions or “items.” There is an inverse correlation between survey length and complexity on the one hand, and level of response on the other: as the questionnaire becomes longer or more complex, the response rate typically will decline. A rule of thumb for mail questionnaire length is no more than four 8½” by 11” panels, including instructions. Regarding clarity: use short sentences or questions, and “package” them, which is to say, keep like-items together under a common heading and sequenced, if possible, in a logical pattern with a natural flow leading from one subject to the next. When it comes to questionnaire design, remember the “3 Ts”: simplicity, brevity, clarity.
There are, of course, research subjects that lend themselves neither to simplicity nor brevity and in those circumstances, clarity takes on critical importance… and one of the most common violations of clarity is the compound question. Here’s a compound question that actually appeared in the “Restaurant” section of a resort satisfaction survey the author recently saw in use (the name of the restaurant has been changed, obviously):
Q: Are you satisfied with the variety and pricing at the Yum-Yum Restaurant?
Note the potential for confusion—if respondents answer “Yes,” they probably are satisfied with each aspect—variety and pricing. However, if they answer “No,” it isn’t possible to determine whether they are dissatisfied with one, the other, or both. A better approach separates the two aspects while providing greater depth of information, and could look something like this:
INSTRUCTION: On the rating scale beside each question, place a checkmark in front of the score that most closely matches your level of satisfaction. Please note that 1=Very Dissatisfied; 2=Dissatisfied; 3=Neutral; 4=Satisfied; 5=Very Satisfied. If you have not yet been to the restaurant, place a checkmark here __ and move on to the next section of the survey.
- Q1. How satisfied are you with the restaurant’s Variety?
- Q2. How satisfied are you with the restaurant’s Pricing?
- Q3. What changes would you make to improve the restaurant?
With the addition of the 1-5 scale, this approach also creates a numerical basis for analysis of a more sophisticated nature—although we should point out not all practitioners agree that 1-5 scales or so-called “Likert-type” scales are suitably quantitative for advanced statistical treatment. Nonetheless, when (1) there are five or more points on the scale; (2) the intended interpretation of each point is specified; and (3) interval distances are perceived to be equal, the author generally accepts the measure as quantitatively sufficient. With four points, the scale may be thought of as “directionally correct.” Because of between-point information loss, quantitative analysis of scales with fewer than four points can be problematic.
Statistics as a science or discipline is concerned with the collection, analysis, interpretation, and presentation of data. Data is the most fundamental of three levels of increasing abstraction; information and knowledge follow in order, conceptually. Where data may be thought of as the display of a fact, information is the useful interpretation of that fact, and knowledge is the overarching guidance inferred from that information. For real-world summarization, estimation, and prediction, there are two major branches of statistics: descriptive and inferential; and within these branches are four levels of statistical measurement: nominal, such as gender, hair color, or political party; ordinal, which includes such concepts as first, second and third, or Poor, Fair, Good, and Excellent; interval, such as measures of temperature and certain Likert-type rating scales; and ratio, such as weight, height, distance, and so on. With each successive level comes greater analytical depth and testing stringency. Although this sounds complex, and often is, the epistemological witch’s brew that is the science of statistics is dedicated to the task of finding answers to three ironically simple research questions: What are things like? Why are they like that? And, what will they be like if we change something?
Properly conducted customer research is a process of learning and discovery, a fundamental purpose of which is the refinement or development of activities and services that improve the customer experience. Few other industries rely so heavily as our own on the “happy camper.”
An example of such a project is drawn from a survey conducted with a client resort situated in several hundred acres of pristine, heavily wooded hills abundantly populated with wildlife: a truly beautiful natural setting. In the course of the research, it was learned that a key motivational driver for participation in the outdoor recreation lifestyle was the feeling of tranquility and emotional relief that came from immersion in the natural environment…but astonishingly, one of the greatest deficiencies in member satisfaction was in member access to that experience! Management was understandably surprised: even though the resort was located within some of the most incredible natural scenery in the state, the survey revealed unequivocally that members were less than satisfied with their “natural” experience!
Further analysis revealed the necessary solution. A professional Naturalist was employed to assist management in the planning and development of a well-informed nature-centric program complete with extensive hiking trails, Naturalist-accompanied tours, overnight camping trips, and numerous other member-interactive events. On the promotional side of the equation, the program was trumpeted in an orchestrated mix of newsletter articles, viral campaigns, and resort signage. The result was doubly gratifying: not only did member satisfaction go through the roof, but maintenance fee revenue spiked appreciably as well.
This exemplifies the earlier paragraph about data, information, and knowledge. The data that began the sequence of events in this particular project was a measured disconnect drawn from paired scales measuring importance and satisfaction; interpretation of that data yielded the conclusion (information) that member achievement of a particular experience was deficient, and the knowledge eventuating from that information served to remind that experience is the product of activity and cannot be assumed to exist simply because underlying resources are abundant. As service providers, we must always connect the dots.
Looking ahead to Counting On Our Toes—Part Three, we will overview specific common statistics that anyone armed with a modern calculator can produce—and we’ll also take a look at an exotic statistical method that has been known to cause nose bleeds and vapors in the faint-of-heart. You’ve been warned.
Central Tendency
Let’s discuss one of the most practically useful concepts in the science of statistics, Central Tendency, and examine a related measure of distribution or “data dispersion,” the standard deviation. We then close with a look at what may be the two most popular calculations in statistical analysis: the correlation coefficient, and its workhorse cousin, the regression equation.
To begin: The organizing concept of Central Tendency is that if nothing else is known, the best description of a body of data will be the measures of its typical or middle nature: the mean, median, and mode, with each statistic representing a somewhat different perspective of the data.
We generally think of the mean as the arithmetic average: the sum of all data entries divided by the number of data entries. While the mean can give us a very good sense of the “middle of things,” it does have a potentially serious drawback: because its usual calculation includes each entry in the data set, the mean can be heavily influenced by extreme scores, called “outliers.”
To illustrate, imagine you’ve been tasked with creating an income profile for your resort’s management staff and you have 5 incomes of $20,000, $50,000, $60,000, $70,000 and $80,000, where the average or mean income is $56,000. (For the record, it’s the poor Activities Director who is being underpaid the $20,000 while the Reservations Manager, Housekeeping Manager, Operations Manager, and HR Director are raking in the big money.)
Now suppose that the Sales Manager’s income somehow slipped into the mix and replaced the HR Director’s $80,000 with his own $500,000. Sales Managers everywhere are probably having a chuckle at this figure, but all joking aside, please note that with inclusion of the outlier, the mean spikes to $140,000, which is 2.5 times the original $56,000, an increase of 150%! Since you are the one who has been tasked with creating the income profile, you have to ask yourself whether the average is more realistic, more representative, if reported with or without the outlier.
When outliers distort results, as in this illustration, one solution is to perform a technique known as “data trimming,” in which both the highest and lowest scores are eliminated. While not a perfect solution, data trimming has a prestigious pedigree in that it was done for years by Olympic judges to score figure skating, diving, and gymnastics. Data trimming applied to our set of 5 scores would eliminate both the $20,000 and $500,000 entries, leading to a more reasonable mean of $60,000.
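The effect of the outlier, and of trimming it away, can be checked with a short Python sketch. The helper name trimmed_mean is an assumption introduced here; it drops exactly one lowest and one highest value, which is the simple version of trimming described in the text.

```python
from statistics import mean


def trimmed_mean(values):
    """Mean after dropping the single lowest and single highest value."""
    ordered = sorted(values)
    return mean(ordered[1:-1])


staff = [20_000, 50_000, 60_000, 70_000, 80_000]
with_outlier = [20_000, 50_000, 60_000, 70_000, 500_000]

print(mean(staff))                       # 56000
print(mean(with_outlier))                # 140000
print(mean(with_outlier) / mean(staff))  # 2.5
print(trimmed_mean(with_outlier))        # 60000
```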
Before we leave our discussion of the mean, please permit a sidebar. An important aspect of what is known as the “normal distribution,” popularly characterized as the “bell-shaped curve,” is equality between the mean and the median. (We’ll visit the median next.) Interestingly, many if not most data distributions are normal or somewhat so, in which case not only will the mean and median be equal, but also about two-thirds of the data set will fall within plus-or-minus one standard deviation (1 SD) of the mean, and 95% will fall within plus-or-minus two standard deviations (2 SD). A technical description of the standard deviation awaits another article, but as a measure of variation or “spread” within the data set, think of the standard deviation as the average distance from the mean. Relatively small SDs indicate that most scores are tightly clustered around the mean, and larger SDs suggest that scores are more widely dispersed. (If you are an investor with low risk propensity, you’ll probably be looking for investment return scores with a fairly small standard deviation!) The SD is an enormously informative statistic, much stronger than the data range, and a powerful ally of the mean.
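As a rough illustration of the sidebar, the sketch below computes a standard deviation for the five original staff incomes and counts how many fall within one and two SDs of the mean. Two assumptions are worth flagging: it uses the population formula (pstdev in Python’s statistics module), which the article does not specify, and five incomes are far too few to resemble a bell curve, so the shares only loosely echo the two-thirds and 95% figures. The helper name share_within is made up for this example.

```python
from statistics import mean, pstdev


def share_within(values, k):
    """Fraction of values lying within k standard deviations of the mean."""
    center = mean(values)
    spread = pstdev(values)  # assumption: population standard deviation
    inside = [v for v in values if abs(v - center) <= k * spread]
    return len(inside) / len(values)


staff = [20_000, 50_000, 60_000, 70_000, 80_000]
print(round(pstdev(staff)))    # 20591
print(share_within(staff, 1))  # 0.6
print(share_within(staff, 2))  # 1.0
```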
It often is useful or even necessary, in addition to reporting the mean, also to report the mid-point of the data set: the median. In odd-numbered data sets, the median is found after sorting the data in ascending or descending order and then selecting the value that divides the set into halves above and below that value. If a data set has an even number of entries, the data is sorted in ascending or descending order and the median is taken as the average of the middle two values. With the 5 incomes from our example, the middle value of $60,000 is the median, regardless of whether the Sales Manager’s income slipped in or not, and it is precisely this “robustness” to the otherwise pernicious effect of outliers that makes the median an attractive alternative to the mean when the score distribution is skewed, or “tail heavy.”
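A quick Python check shows the median holding steady with or without the Sales Manager’s income. The four-income list at the end is hypothetical, included only to show the even-numbered case where the two middle values are averaged.

```python
from statistics import median

staff = [20_000, 50_000, 60_000, 70_000, 80_000]
with_outlier = [20_000, 50_000, 60_000, 70_000, 500_000]

print(median(staff))         # 60000
print(median(with_outlier))  # 60000

# Hypothetical even-numbered set: the two middle values are averaged.
even_set = [20_000, 50_000, 60_000, 70_000]
print(median(even_set))      # 55000.0
```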
The third common measure of central tendency is the mode. This statistic simply is the data entry that occurs with the greatest frequency. Absent repetition, there is no mode. If two entries occur with the same greatest frequency, that data set is said to be bimodal. While not a terribly exciting statistic, the mode does have the distinction of being the only measure of central tendency that can describe data at the nominal level. For example, if you asked in your survey whether members prefer tent camping, RV camping, or staying in a rental unit, the most frequently chosen preference would be the mode.
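Finding the mode of nominal data amounts to counting answers. In the sketch below, the survey answers are invented purely for illustration; they are not results from any actual resort survey.

```python
from collections import Counter
from statistics import multimode

# Hypothetical survey answers, invented purely for illustration.
answers = ["RV", "tent", "RV", "rental", "RV", "tent", "rental", "RV"]

counts = Counter(answers)
print(counts.most_common(1))  # [('RV', 4)]
print(multimode(answers))     # ['RV']

# A bimodal case: two answers tie for the greatest frequency.
tied = ["tent", "RV", "tent", "RV", "rental"]
print(multimode(tied))        # ['tent', 'RV']
```

One design note: multimode returns every value when nothing repeats, so it is worth checking the counts before reporting a mode.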
Moving on, we find that our underpaid Activities Director wants a raise, and thinks she has found a way to justify it. She’s noticed over time that when she runs an ad in the local papers, attendance revenue increases for the resort’s outdoor recreation complex, drawing in substantial community participation. She feels that if she can increase advertising expenditure a bit, she can increase quarterly attendance revenue to above $300,000 (a resort record), but she knows she must prove that a strong association or “correlation” exists between advertising and attendance…and then somehow calculate the right amount of advertising expenditure to achieve the goal. Dragging out her statistics primer, she finds that the formula for the correlation coefficient, r, is: r = [n∑xy – (∑x)(∑y)] / [√(n∑x² – (∑x)²) × √(n∑y² – (∑y)²)].
Undaunted by what appears to be a very complex equation, she begins. The letter “y” will be the symbol for the attendance revenue variable, and “x” will be the symbol for the advertising expenditure variable. She further interprets the formula’s symbols as follows: ∑ means “sum” or “summate”; √ means “square root”; the lower case letter “n” is the number of variable pairs; and that little ² means “squared.” Using 8 quarters (24 months) worth of data entries, she lists x (in 000s of advertising $) = 2.4, 1.6, 2.0, 2.6, 1.4, 1.6, 2.0, and 2.2 and matches each with its respective quarterly y (in 000s of attendance revenue $): 225, 184, 220, 240, 180, 184, 186, and 215. Because she knows that the correlation coefficient is essentially a calculation of sums, she follows the guide below to calculate her entries for the equation (her sums are shown in parentheses):
(1) ∑ x means “Find the sum of the x-values.” (15.8)
(2) ∑ y means “Find the sum of the y-values.” (1634)
(3) ∑ xy means “Multiply each x-value by its corresponding y-value and find the sum.” (3289.8)
(4) ∑x² means “Square each x-value and find the sum.” (32.44)
(5) ∑y² means “Square each y-value and find the sum.” (337,558)
She then substitutes her sums (15.8, 1634, 3289.8, 32.44, and 337,558) into the formula:
r = [n∑xy – (∑x)(∑y)] / [√(n∑x² – (∑x)²) × √(n∑y² – (∑y)²)]; which becomes: [8(3289.8) – (15.8)(1634)] / [√(8(32.44) – 15.8²) × √(8(337,558) – 1634²)]; which then becomes 501.2 / (√9.88 × √30,508); which finally yields the correlation coefficient .913…and the Activities Director is ecstatic because she knows that correlation varies between -1.0 for perfect negative correlation (where revenue would decrease as advertising expenditures increase) and +1.0 for perfect positive correlation (where revenue increases as advertising expenditure increases), so a positive correlation coefficient of .913 provides outstanding support to the argument for an increase in the advertising budget. But how much of an increase in advertising is needed? For this calculation, she turns to regression.
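For anyone who would rather let a computer do the summing, the same calculation can be sketched in Python using the eight quarters of data listed earlier. The function name pearson_r and the variable names are assumptions chosen for this example; the arithmetic follows the five sums in the guide above.

```python
import math

# Eight quarters of data from the text, both series in thousands of dollars.
ad_spend = [2.4, 1.6, 2.0, 2.6, 1.4, 1.6, 2.0, 2.2]
revenue = [225, 184, 220, 240, 180, 184, 186, 215]


def pearson_r(x, y):
    """Correlation coefficient computed from the five sums in the text."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


print(round(pearson_r(ad_spend, revenue), 3))  # 0.913
```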
Returning to her statistics primer, she finds the regression equation, y = mx + b, where y is the dependent variable (quarterly attendance revenue); m is the slope or “pitch” (think rise over run) of the regression line; x is the independent or “causative” variable (quarterly advertising expenditure); and b is the constant, the value of y when x is zero. Knowing the regression formula is closely related to the correlation formula, she borrows her earlier correlation information and by way of substitution solves for m = [n∑xy – (∑x)(∑y)] / [n∑x² – (∑x)²]; and she then solves for b, where b = ∑y/n – m(∑x/n).
She finds that the regression equation y = b + mx translates to y = 104.061 + 50.73x (with x and y both in thousands of dollars). She applies her findings hypothetically and increases her advertising expenditure to $3000…and discovers that revenue increases to about $256,000: (50.73 x $3000) + $104,061 = $256,251. A little short of her goal of $300,000 for the quarter, she then tries $4000: (50.73 x $4000) + $104,061 = $306,981, and finds to her delight that the path to her quarterly revenue goal is clear: if she spends $4000 in advertising over the coming quarter, she can expect attendance revenue to exceed $300,000. In a Hollywood ending, the Activities Director’s boss agrees with her findings, approves the budget increase, sees revenue go up, and promotes the Activities Director to Recreation Vice President.
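The slope, the constant, and the two trial budgets can be reproduced with a short Python sketch built on the same eight quarters of data. The function name fit_line is an assumption for this example, and the last line, which solves the equation backward for the spend that reaches $300,000, is an added illustration rather than a step the Activities Director takes in the story. Note also that $3000 and $4000 both lie above the largest spend in her data ($2,600), so the sketch assumes the straight-line relationship continues to hold there.

```python
# Eight quarters of data from the text, both series in thousands of dollars.
ad_spend = [2.4, 1.6, 2.0, 2.6, 1.4, 1.6, 2.0, 2.2]
revenue = [225, 184, 220, 240, 180, 184, 186, 215]


def fit_line(x, y):
    """Least-squares slope m and constant b for y = mx + b."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - m * (sum_x / n)
    return m, b


m, b = fit_line(ad_spend, revenue)
print(round(m, 2), round(b, 2))  # 50.73 104.06
print(round(m * 3.0 + b))        # 256  (thousands, at $3000 of advertising)
print(round(m * 4.0 + b))        # 307  (thousands, at $4000 of advertising)
print(round((300 - b) / m, 2))   # 3.86 (thousands of ad $ to reach 300)
```

Because the code keeps the unrounded slope and constant, its predictions differ by a few dollars from the hand calculation, which uses the rounded 50.73 and 104.061.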
We should note here that we’ve provided simply an overview to what can be an extremely complex process of analysis, so if the intended project of customer research appears beyond immediate reach, pick up the phone and call one of the many outstanding consumer research professionals in our industry. It’s much, much better to spend a few bucks and be right…than to save a few bucks and be wrong.
You may reach the author, Ken Will, Director of Consumer Research for D & A Solutions, Ltd. Retired from timeshare. Originally printed in the September/October 2009 Resort Trades Management & Operations Magazine