Current Research

PPRG Research Project Topics

Voting and Elections

Gender Bias in U.S. Presidential Elections

(With Cecilia Mo)

The goal of this research project is to assess the effects of gender bias in contemporary U.S. presidential elections. In doing so, we acknowledge that there are many forms of bias against women and female candidates that can affect voter decisions and ultimately electoral outcomes, from explicitly sexist attitudes to unconscious preferences for male leaders over female leaders, and we take care to measure each of them. Our ultimate goal is to measure the impact of these various forms of gender bias on vote choice, and thereby their impact on electoral outcomes, in contemporary elections.

We propose a two-pronged approach to studying gender bias in U.S. presidential elections. First, we test whether voters’ biases against women and female political candidates contributed to Clinton’s loss in the 2016 presidential election. Second, we leverage the unprecedented nature of the 2020 Democratic Party presidential primary, in which there have been multiple viable female candidates, to test whether voters’ bias against women and female candidates is diminishing support for the women running in the Democratic primaries. Because political party is a powerful evaluative criterion in elections today, the make-up of the primary election affords researchers a unique opportunity to assess whether voters are willing to vote for a woman for President when party identification is not a distinguishing factor among the candidates under consideration.

Using a probability sample of Americans, we test a range of measures of bias against female candidates to identify the impact of gender bias in the 2016 U.S. presidential election, as well as the potential for such bias to impact the 2020 presidential election.

An Exploration of Forces Driving Vote Choices in the 2008 American Presidential Election

(With Omair Akhtar, Josh Pasek, Keith Payne, Trevor Tompson, and Yphtach Lelkes)

At the beginning of September 2008, Barack Obama was ahead of John McCain in the polls, but by a much smaller margin than many forecasting models predicted. In conjunction with the Associated Press, Yahoo! News, and the Stanford University Institute for Research in the Social Sciences, we investigated potential explanations for this discrepancy by regressing vote choice at that time, and actual voting behavior on Election Day, on a series of predictors, including: racism; beliefs about the candidates’ competence in governing, military experience, integrity, elitism, and issue priorities; the candidates’ familiarity to voters; voter preference for divided government; voter perception that Obama is a Muslim; Obama’s relationship with Reverend Wright; evaluations of the prospective first ladies; resentment among Hillary Clinton supporters; perceptions of the economy; presidential approval ratings; party identification; and ideology.

A variety of racism measures were used, including symbolic racism, racial resentment, stereotypes of Blacks’ personalities, affect toward Blacks and Whites, other explicit measures, and the Affect Misattribution Procedure (AMP), a measurement tool developed by social psychologists to measure prejudice without explicitly asking respondents about it. The procedure asks respondents to evaluate various Chinese ideographs and measures the spillover of affect inspired by nearly subliminal exposures to Black and White faces preceding them.
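
To make the scoring concrete, here is a minimal sketch of how AMP responses are commonly aggregated: the proportion of ideographs judged pleasant following White primes minus the proportion following Black primes. The data layout and column names are illustrative assumptions, not the project’s actual code.

    # Illustrative AMP scoring, assuming one row per trial with the race
    # of the briefly flashed prime face and the respondent's
    # pleasant/unpleasant judgment of the ideograph that followed it.
    import pandas as pd

    def amp_bias_scores(trials: pd.DataFrame) -> pd.Series:
        """One score per respondent: proportion of ideographs judged
        pleasant after White primes minus that after Black primes."""
        pleasant = (
            trials.groupby(["respondent_id", "prime_race"])["judged_pleasant"]
            .mean()
            .unstack("prime_race")
        )
        return pleasant["White"] - pleasant["Black"]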

We identified a wide range of factors that explained vote choices as well as decisions about whether to vote or abstain.

The Effect of Polls on Political Behavior

(with Neil Malhotra)

In recent years, there has been much speculation about the possibility that pre-election polls gauging the status of a horserace between competing candidates may affect the behavior of voters on Election Day.  Specifically, some observers have asserted that polls showing a runaway victory may discourage voters from turning out, because they think their vote will not make a difference in the outcome. 

To test this idea, we are exploring the impact of public opinion poll results on political attitudes, beliefs, and behavior in the electoral context. We are developing and testing a causal model to explain how people update their assessments of the closeness of a race using information from polls and how they then use this closeness assessment to evaluate candidates and decide whether or not to vote.

To test our model, we conducted a survey experiment (administered by Harris Interactive with a sample of the general U.S. adult population) about a hypothetical presidential race between Hillary Clinton and Jeb Bush in 2008.  Some respondents were told about a poll showing Clinton far ahead of Bush, other respondents were told about a poll showing the race as tight, and still other respondents were not told about any poll result.  Some respondents were asked about their perceptions of race closeness before being told about the poll results, and others were not asked this question. 

We explored three sets of questions:

First, we explored how people used the poll results to update their assessments of the closeness of a race, and what factors moderate this process. Previous studies have ignored this updating process – indeed, they did not even measure people’s prior beliefs before providing poll information.  We discovered that providing any poll result caused our survey respondents to update their closeness assessments in the direction of the poll result.  Interestingly, asking people about their prior beliefs before providing poll results reduced updating.

Second, we explored whether polls showing one candidate far ahead cause people to “jump on the bandwagon” and support the frontrunner. We went beyond the existing literature by analyzing the relation between “the bandwagon effect” and the updating process, the validity of various proposed causal mechanisms underlying the “bandwagon effect,” and the variables that moderate these causal processes. Among many findings, we observed that respondents were more likely to vote for Governor Bush when exposed to the tight poll result.

Third, we explored whether polls showing a lopsided race reduced political participation by making citizens feel their actions would be inconsequential. We found that polls showing one candidate behind another had a demobilizing effect; people exposed to polls showing Governor Bush trailing were less likely to vote, volunteer for his campaign, or contribute money.

Rationalization of Candidate Preferences and Mischaracterization of the Causes of Votes

(with Alison Pfent)

For years, researchers have been interested in what determines citizens’ decisions about which candidates to vote for in presidential elections. These decisions have intrigued political scientists and social psychologists alike, and their research has led to the discovery of numerous elements that seem to play causal roles: party identification, performance of the incumbent, the health of the national economy, candidates’ stances on important national issues, perceptions of candidates’ personalities, and more. Although correlations of various political attitudes and beliefs with candidate preferences are well documented and assumed to reflect influence on candidate preferences, social psychology (and cognitive consistency theories in particular) suggests a very different interpretation of these correlations: rationalization.

We are investigating the theory that people form candidate preferences and thereafter change related political attitudes and beliefs so they become more consistent with their candidate choice.  Using all available National Election Study data, we have found evidence of pervasive rationalization in every instance possible to analyze.  We have found evidence of rationalization in ideological self-identifications, party identification, and stances on abortion and other national policy issues. We are now attempting to employ panel data to yield strong evidence about the causes of candidate choices, eliminating the contaminating influence of rationalization.

Candidate Name Order Effects in Elections

(with Joanne Miller, Michael Tichy, Daniel Schneider, Eyal Ophir, Key Lee, Daniel Blocksom, and Alexander Tahk)

A great deal of evidence suggests that survey respondents’ answers to closed-ended questions can be influenced by the order in which the choices are presented. However, the impact of order depends upon whether the choices are presented visually or orally. Under visual presentation conditions, people are inclined to select the first response options they encounter, whereas under oral presentation conditions, people are inclined to select the choices they encounter last.

Given that these order effects appear quite consistently in surveys, we were interested in whether they would appear in real elections. When they enter voting booths, citizens encounter candidates’ names visually, either printed on paper or displayed on a voting machine. Findings from survey research therefore suggest that people may be inclined to select names toward the top of the list. To test this idea, we collected actual voting returns for the 1992 elections in three large counties in Ohio. Precincts there are randomly assigned to receive different orders of candidate names, so we were able to analyze these data as if they resulted from an experiment. And indeed, people were inclined to vote for candidates whose names appeared toward the top of the ballot. These effects were much more common in races about which voters knew less and in which the partisan affiliations of the candidates were not listed on the ballot.
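
Because name orders are assigned to precincts essentially at random, the returns can be analyzed with a simple experimental contrast. The sketch below illustrates one such comparison; the data layout and column names are assumptions for illustration, not the project’s actual code.

    # Sketch: for each candidate, compare mean precinct vote share when
    # listed first on the ballot vs. when listed in any other position.
    import pandas as pd

    def first_position_advantage(returns: pd.DataFrame) -> pd.Series:
        """Assumes one row per (precinct, candidate) with columns
        'candidate', 'ballot_position', and 'vote_share'."""
        returns = returns.assign(listed_first=returns["ballot_position"] == 1)
        means = (
            returns.groupby(["candidate", "listed_first"])["vote_share"]
            .mean()
            .unstack("listed_first")
        )
        return means[True] - means[False]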

In 2001-2002, we conducted another test of these effects.  This time, we analyzed data from the 2000 general election for the entire state of Ohio, as well as the states of North Dakota and California.  In all three states, portions of the state (such as precincts or counties) were assigned to receive different orders of candidate names for all races we analyzed, so we were able to analyze these data as if they resulted from an experiment.  Again, name order effects were found in many of the races we analyzed.  A side effect of this research project was the discovery of how widely name order laws vary in the U.S.

Currently, we are engaged in two large data collection enterprises to improve our understanding of moderators of ballot order effects and to investigate the implications of different statistical approaches to testing hypotheses about them. For the 2004 presidential race, we are focusing on the election in Ohio, where ballot order was implemented at the precinct level. We are especially interested in the impact of voting method on effect strength and in irregularities in the implementation of rotation. More extensively, we are gathering election results for all statewide races in California from 1976 to 2006 to investigate ballot order effects in over 50 different races.

Survey Research Methodology

Cross-National Replication of Question Design Experiments

(with Henning Silber, Tobias Stark, Annelies Blom, Melvin John, Johan Martinsson, Peter Lynn, Karen Lawson, Sanne Lund Clement, Guðbjörg Andrea Jónsdóttir, Michael Bosnjak, Anne Cornilleau, Endre Tvinnereim, Ana Belchior, and Ruoh-rong Yu)

Our research explores whether the principles of question design operating in the field today, which are based mostly on American data, can legitimately be generalized across countries. More precisely, we implemented well-tested split-ballot design experiments from single-country contexts in multiple countries to gauge country-specific differences in response behavior, satisficing, and social desirability response bias. We conducted these experiments in Canada, Denmark, France, Germany, Iceland, the Netherlands, Norway, Portugal, Sweden, Taiwan, the UK, and the United States. The data were collected during the same time period from longitudinal panels or cross-sectional surveys, all based on probability sampling methods. This study design enables us to compare response patterns in thirteen countries at the national level. The study included experiments on question order, acquiescence, question wording, no-opinion response options, and response order. It has been presented at international conferences such as the Conference of the European Research Association, the Conference on Survey Methods in Multinational, Multiregional and Multicultural Contexts (3MC), and the Conference of the American Association for Public Opinion Research. We are currently working on multiple manuscripts to publish our research findings.

Generalization of Classic Question Order Effects Across Cultures

(with Stark, Silber, Krosnick, Blom, Aoyagi, Belchior, Bosnjak, Clement, John, Jónsdóttir, Lawson, Lynn, Martinsson, Shamshiri-Petersen, Tvinnereim, and Yu)

Questionnaire design is routinely guided by classic experiments on question form, wording, and context conducted decades ago. This article explores whether two question order effects (one due to the norm of evenhandedness and the other due to subtraction or perceptual contrast) appear in surveys of probability samples in the United States and 11 other countries (Canada, Denmark, Germany, Iceland, Japan, the Netherlands, Norway, Portugal, Sweden, Taiwan, and the United Kingdom; N = 25,640). Advancing the theory of question order effects, we propose necessary conditions for each effect to occur, and we found that the effects occurred in the nations where these necessary conditions were met. Surprisingly, the abortion question order effect appeared even in some countries in which the necessary condition was not met, suggesting that the question order effect there (and perhaps elsewhere) was not due to subtraction or perceptual contrast. The question order effects were not moderated by education. The strength of the effect due to the norm of evenhandedness was correlated with various cultural characteristics of the nations. Strong support was observed for the form-resistant correlation hypothesis.

GENSI: A new graphical tool to collect ego-centered network data

(with Tobias Stark and Jon Krosnick)

This study (1) tested the effectiveness of a new survey tool for collecting ego-centered network data and (2) assessed the impact of giving people feedback about their network on subsequent responses. The new tool, GENSI (Graphical Ego-centered Network Survey Interface), allows respondents to describe all network contacts at once via a graphical representation of their networks. In an online experiment, 434 American adults were randomly assigned to answer traditional network questions or GENSI and were randomly assigned to receive feedback about their network or not. The traditional questionnaire and GENSI took the same amount of time to complete, and measurements of the racial composition of the network showed equivalent convergent validity in both survey tools. However, the new tool appears to solve what past researchers have considered to be a problem with online administration: exaggerated numbers of network connections. Moreover, respondents reported enjoying GENSI more than the traditional tool. Thus, using a graphical interface to collect ego-centered network data seems promising. However, telling respondents how their network compared to that of the average American reduced the convergent validity of measures administered after the feedback was provided, suggesting that such feedback should be avoided.

The Accuracy of Measurements with Probability and Nonprobability Survey Samples: Replication and Extensions

(with Bo MacInnis, Jon A. Krosnick, Annabell S. Ho, Mu-Jung Cho)

Many studies in various countries have found that telephone and internet surveys of probability samples yielded data that were more accurate than internet surveys of nonprobability samples, but some authors have challenged this conclusion. This paper describes a replication and an expanded comparison of data collected in the United States, using a variety of probability and nonprobability sampling methods, using a set of 50 measures of 40 benchmark variables, larger than any used in the past, and assessing accuracy using a new metric for this literature: root mean squared error.

Despite substantial drops in response rates since a prior comparison, the probability samples interviewed by telephone or the internet were the most accurate. Internet surveys of a probability sample combined with an opt-in sample were less accurate; least accurate were internet surveys of opt-in panel samples. These results were not altered by implementing poststratification using demographics.
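
For readers unfamiliar with the metric, root mean squared error here simply aggregates each sample’s deviations from the benchmark values. A minimal sketch, with illustrative numbers:

    # Root mean squared error of survey estimates against benchmarks.
    import math

    def rmse(estimates: list[float], benchmarks: list[float]) -> float:
        squared = [(e - b) ** 2 for e, b in zip(estimates, benchmarks)]
        return math.sqrt(sum(squared) / len(squared))

    # A sample missing three benchmarks by +2, -1, and +4 percentage
    # points has RMSE of sqrt((4 + 1 + 16) / 3), about 2.65 points.
    print(rmse([52.0, 29.0, 14.0], [50.0, 30.0, 10.0]))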

Does succeeding on attention checks moderate treatment effects?

(with Sebastian Lundmark, John Protzko, Matt Berent, Jon A. Krosnick, Jonathan Schooler, Brian Nosek, Leif Nelson, Charlie Ebersole, and Nick Buttrick)

Attention checks have become a staple when administering questionnaires to participants in online access panels and on crowdsourcing platforms such as Amazon Mechanical Turk. The reason for their growing popularity is simple: they are thought to identify participants who paid too little attention to have received the experimental manipulation properly. By analyzing data from 80 experiments testing 16 independent hypotheses, the present research investigates whether answering attention checks correctly or incorrectly moderates the effects of the experiments. In addition to the large number of treatment effects tested, several different types of attention checks and their moderating effects will be compared.

The memory of survey responses: Do respondents remember?

(with Sebastian Lundmark and Jon A. Krosnick)

Pre-test/post-test experiments, panel studies, and multitrait mixed methods all rely on the assumption that respondents do not remember their responses. If respondents remember their responses, any subsequent measurement of a similar or identical construct will likely be biased, and these methods will not produce valid results. Fortunately, previous research in psychology and survey methodology has found that people generally do not remember events accurately, nor the responses they gave in a questionnaire. In contrast to such findings, our research presents results from several survey-embedded experiments in which respondents correctly remembered their responses. Respondents were able to retain their previous responses for longer than previous research would suggest. The results indicate that 20 minutes later, and after answering many intervening questions, many respondents accurately reported the exact answer they had given. A large majority of these respondents remembered their answer when reminded of what the response options had been (i.e., recognition), but even more surprisingly, many respondents could retain their answers without being reminded of the response options (i.e., free recall). These findings should be a cause for concern for any study design relying on the assumption that respondents quickly forget their previous answers.

The Net Promoter Score (NPS) and predicting relative revenue growth

(with Sebastian Lundmark, Jon A. Krosnick, Ellen Konar, Matt Berent, Yphtach Lelkes, Daniel Schneider, Randall Thomas, and David Yeager)

The Net Promoter Score (NPS) may be the most popular measure of customer satisfaction, touted as the single best predictor of company revenue growth. However, little peer-reviewed evidence appraising the NPS’s ability to predict growth has been published. Analyzing data from five surveys covering 53 observations of U.S. companies and their revenue growth, we found no evidence of reliable predictive ability when coding responses to the standard NPS question in various ways, estimating revenue growth in various ways, allowing different functional forms of the relationship, and measuring likelihood to recommend using various question wordings. Asking people how many times they had recommended the company in the recent past did predict growth marginally significantly and positively.
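
For reference, the standard NPS coding works as follows (the project above also examined alternative codings): on the 0-10 "likelihood to recommend" item, respondents answering 9 or 10 are promoters, 0 through 6 are detractors, and the score is the percentage of promoters minus the percentage of detractors.

    # Standard Net Promoter Score coding (one of the codings examined).
    def net_promoter_score(ratings: list[int]) -> float:
        promoters = sum(1 for r in ratings if r >= 9)
        detractors = sum(1 for r in ratings if r <= 6)
        return 100.0 * (promoters - detractors) / len(ratings)

    # Example: 4 promoters, 3 passives, 3 detractors out of 10 -> NPS = 10.
    print(net_promoter_score([10, 9, 9, 10, 8, 7, 7, 5, 3, 6]))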

The Dunning-Kruger Effect Revisited

(with Sebastian Lundmark and Jon A. Krosnick)

In 1999, the Journal of Personality and Social Psychology published the now very well-cited article "Unskilled and Unaware of It: How Difficulties in Recognizing One’s Own Incompetence Lead to Inflated Self-Assessments" by Justin Kruger and David Dunning. In the paper, the authors suggested that people who performed a task poorly often failed to recognize that fact. Not only did those poorly performing individuals overestimate their own performance, but their inability also seemed to rob them of the skills necessary to understand that they had performed worse than their peers. This phenomenon has come to be known as the Dunning-Kruger effect (the D-K effect). Since their seminal article, Dunning and Kruger and colleagues have repeatedly shown that the D-K effect travels across knowledge domains, tasks, skills, and abilities.

Although Dunning and Kruger have replicated the D-K effect across many studies, most of those studies measured participants’ relative performance using a survey question wording that may turn out to have been problematic. In the majority of their studies, (1) participants were instructed to guess the percentile of their performance, (2) the instructions for guessing relative performance were contradictory, (3) the guesses were not corrected for subadditivity, and (4) participants were asked to compare themselves to a not clearly defined group of peers. The present research aims to assess the influence that these four problems may have had on estimates of the D-K effect on relative performance. Furthermore, improvements to the relative performance measurement will be implemented in an attempt to estimate a more distinct and clearly defined D-K effect.

How to measure political attitudes and beliefs: evidence from a meta-analysis and a large-scale experiment examining number of points on rating scales

(With Eran Amsalem, Alex Tahk, Jon A. Krosnick)

When the National Election Study surveys measure Americans’ attitudes on policy issues, they typically employ 7-point rating scales. But attitudes are also measured in those surveys using 2-, 3-, 4-, 5-, and 101-point scales. Thus, in this very visible and important set of surveys, there is no standard for the number of scale points to be used when measuring attitudes. This variation is not unique to this case, because there is in fact tremendous variety in the scale lengths employed by researchers throughout the social sciences.

In addition to the inconsistent use of rating scales when measuring attitudes, another source of disagreement in the literature has been methodological research assessing the performance of different scale lengths. Prior experimental work systematically varying the number of points on a rating scale has reached divergent conclusions. Some studies have suggested that the optimal scale should consist of seven points. Others, however, have concluded that more categories—i.e. 11-point scales—provide better measurement; that a 3-point format should be preferred; and even that measurement reliability and validity do not change systematically with changes in the number of response categories.

To bring clarity to this confusion, we begin this study by offering a theory explaining the influence of scale length on reliability and validity. We then test the theory in two studies. Study 1 is a meta-analysis of the existing experimental studies (k = 39) in which the length of rating scales was varied experimentally. Study 2 is the largest and most rigorous experiment to date on the impact of scale length: a three-wave experiment with a nationally representative sample (N = 6,055), randomly assigning respondents to one of ten scale lengths (2 to 11 points), measuring dozens of attitudes in a variety of domains, and evaluating the influence of various moderators. Across the two studies, we find that measurement reliability and validity improve as scales lengthen, up to 7 points for bipolar constructs and up to 4 points for unipolar constructs, with no benefit from lengthening scales further. These findings provide clear recommendations for political scientists about how to measure political judgments optimally.
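
One simple version of the reliability analysis such a design permits is sketched below: test-retest correlations computed separately within each randomly assigned scale length. The column names and data layout are assumptions for illustration, not the study’s actual code.

    # Test-retest reliability by assigned scale length, assuming one row
    # per respondent with wave-1 and wave-2 ratings of the same attitude.
    import pandas as pd

    def reliability_by_scale_length(df: pd.DataFrame) -> pd.Series:
        """Correlation of wave-1 and wave-2 ratings within each value of
        'scale_points'; higher values indicate more reliable scales."""
        return df.groupby("scale_points").apply(
            lambda g: g["wave1_rating"].corr(g["wave2_rating"])
        )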

How accurate are survey measurements of objective phenomena?

(With Lisanne Wichgers and LinChiat Chang)

The prominence of surveys is based on the assumption that survey results are accurate. A number of authoritative books and papers have provided thorough instruction on the various sources of survey errors, including measurement errors. In contrast to the plethora of literature on survey error, there is no comprehensive collection of positive evidence on survey accuracy. This imbalance in the literature could convey an unwarrantedly negative view of survey validity and reliability to the average consumer of research results.

A central repository of evidence that speaks to the accuracy of survey estimates can go a long way towards achieving several objectives. Perhaps most importantly, we need this collection of evidence because professionals in the field of survey research routinely need to demonstrate the value of surveys to audiences who are skeptical of the reliability and validity of survey measures. Furthermore, a comprehensive review of this evidence could help researchers compare different ways of conducting surveys in order to identify methods that yield the most accurate results. We could also document problems and pitfalls, and possible ways to improve survey measurement when accuracy is not that high.

In this project, we report a review of various types of evidence assessing accuracy (against benchmarks not derived from self-reports) across a wide spectrum of content areas. We focus only on self-reports of events, behaviors, acts, and physical attributes that can be objectively verified. Moreover, we include only peer-reviewed research publications that provide original empirical evidence on survey accuracy, and only survey measures about past or present behaviors or attributes. We study thousands of instances of survey accuracy assessment from hundreds of papers documenting original empirical evidence. We include papers from diverse areas of research, including crime, demographics, economic indicators, healthcare, labor force statistics, market research, philanthropy, politics, psychology, substance abuse, media, and much more.

Survey Research

(with Penny Visser & Paul Lavrakas)

Social psychologists have long recognized that every method of scientific inquiry is subject to limitations and that choosing among research methods inherently involves trade-offs. With the control of a laboratory experiment, for example, comes an artificiality that raises questions about the generalizability of results. And yet the naturalness of a field study or an observational study can jeopardize the validity of causal inferences. The inevitability of such limitations has led many methodologists to advocate the use of multiple methods and to insist that substantive conclusions can be most confidently derived by triangulating across measures and methods that have nonoverlapping strengths and weaknesses (see, e.g., Brewer, this volume, Ch. 1; Campbell & Fiske, 1959; Campbell & Stanley, 1963; Cook & Campbell, 1969; Crano & Brewer, 1986; E. Smith, this volume, Ch. 2).

This chapter describes a research methodology that we believe has much to offer social psychologists interested in a multimethod approach: survey research. Survey research is a specific type of field study that involves the collection of data from a sample of elements (e.g., adult women) drawn from a well-defined population (e.g., all adult women living in the United States) through the use of a questionnaire (for more lengthy discussions, see Babbie, 1990; Fowler, 1988; Lavrakas, 1993; Weisberg, Krosnick, & Bowen, 1996). We begin the chapter by suggesting why survey research may be valuable to social psychologists and then outline the utility of various study designs. Next, we review the basics of survey sampling and questionnaire design. Finally, we describe procedures for pretesting questionnaires and for data collection.

Moderators of Ballot Name Order Effects: The Role of Information and Ambivalence

(with Nuri Kim)

Much evidence suggests that candidate name order effects occur in elections, but we know surprisingly little about the psychological mechanism(s) responsible for these effects. A handful of past studies have identified conditions in which the effect is more or less pronounced, generally relating either to characteristics of the election (e.g., type of race) or to individual attributes (e.g., education). Adding to such external and dispositional contingencies, the current study focuses on the cognitive processes that lie beneath the observed effect. Two main moderators are examined: the amount of information voters have about the candidates, and the ambivalence voters feel toward the candidates. An experiment embedded in a national survey was conducted to test both explanations.

Unmotivated Anonymity

(with Yphtach Lelkes, David Marx, Charles Judd, and Bernadette Park)

Public opinion researchers often assume that promising anonymity to survey respondents minimizes social desirability response bias. Anonymity may indeed encourage reporting of potentially embarrassing attitudes and behaviors and may discourage over-stating socially admirable attitudes and behaviors, but past studies have generally not tested whether anonymity makes reports more accurate. Two experimental studies demonstrated that making participants completely anonymous when answering self-administered paper and pencil questionnaires led them to report fewer socially admirable opinions and behaviors and more socially embarrassing opinions and behaviors.  But complete anonymity also induced more survey satisficing and lower accuracy of reports of factual matters.  These studies suggest that complete anonymity may not be a costless method for minimizing social desirability response bias and that less extreme versions of confidentiality may be preferable. 

Improving Survey Design and Accuracy for the National Science Foundation

(with Curtiss L. Cobb III)

The National Science Foundation (NSF) is mandated by the U.S. Congress to provide a central clearinghouse for the collection, interpretation, and analysis of data on the science and engineering resources of the nation.  NSF partially fulfills this responsibility by annually conducting three large national surveys designed to collect uniform data that allows for a detailed analysis of the employment, educational, and demographic characteristics of those trained in science and engineering fields.

To help NSF carry out this mission, we are conducting a series of experiments and analyses and providing technical expertise on three new efforts: (1) Designing a starting salary question for the Survey of Earned Doctorates (SED). We have designed a survey experiment to determine which question type for salary provides the best quality of data without hindering the quality of other SED data. (2) Designing a question on the field of a person’s bachelor’s degree for inclusion in the American Community Survey (ACS), yielding a recommendation of the best wording for a field-of-degree question for the ACS. The accuracy of the data yielded by different question designs is being tested in a survey experiment using a sample of Stanford University undergraduate alumni and their parents. (3) Conducting statistical analyses of the Survey of Doctorate Recipients to assess whether survey responses vary systematically depending on the mode in which the data are collected, to profile post-docs in various fields, and to assess whether studying migration of post-docs from one field to another is possible given the existing data structure. Additional cognitive work using focus groups will identify optimal ways to ask survey questions of post-docs.

Conversational Conventions

(with Allyson Holbrook, Richard Carson, & Robert Mitchell)

Research in linguistics suggests that conventions govern the order in which words are listed in sentences during everyday conversations. We examine one such convention: that when listing two terms, one positive and the other negative, it is conventional to list the positive one first (e.g., like or dislike, for or against, support or oppose). Specifically, we examine whether, when asking a question to gauge a person’s attitude, it is conventional to offer the positive or affirmative response choice first and the negative response choice second.

We found that in everyday conversation it is indeed conventional to offer the positive or affirmative response option first. We have also found that violating conversational conventions can sometimes reduce the quality of responses to attitude questions. When the two options are presented in the unconventional order, expectations are violated, and people are surprised and distracted, so responses are made more slowly and with more error. These effects are most apparent among respondents with the least cognitive skills: those with low GPAs or little formal education.

If there is a convention regarding the order in which response alternatives to such questions should be offered, one might presume that researchers would never violate it, so the problems caused by violating the convention would never occur. However, there is a reason why researchers may violate the convention: response order effects. A great deal of research has found that the order in which response choices are offered can influence the distribution of answers to closed-ended questions, sometimes advantaging alternatives presented first, and other times advantaging alternatives presented last. In order to minimize the impact of such response order effects on response distributions, some questionnaire design experts have advised that response order be systematically rotated across respondents, and at least one major survey firm, the Gallup Organization, routinely rotates response alternatives in order to estimate and control for response order effects.

In the past, the only apparent costs of such rotation have been that it increases the complexity and expense of the survey and introduces a source of systematic measurement error that must then be modeled in multivariate statistical analyses. However, our research suggests that presenting responses in the unconventional order makes respondents’ cognitive tasks more difficult and reduces data quality. Consequently, the best solution may be to use only the conventional response order and to eliminate response order effects by enhancing respondents’ motivation to answer survey questions thoughtfully.

Response Rates in Surveys by the News Media and Government Contractor Survey Research Firms

(with Allyson Holbrook & Alison Pfent)

In recent years, there has been widespread speculation that response rates for national surveys have been low and are dropping due to increasing respondent reluctance to be interviewed. This concern is accompanied by additional worry that low and dropping response rates are associated with decreased representativeness of survey samples and therefore reduced accuracy.

We initiated a project to attempt to better understand current response rates in the best and most visible surveys being done of nationally representative populations by telephone via Random Digit Dialing.  To this end, we approached the nation’s leading news media polling organizations and the nation’s leading survey research firms that do large-scale telephone surveys for agencies of the federal government.  All of the organizations we approached agreed to provide to us full disposition codes for recent national RDD telephone surveys, answers to a series of questions about how the surveys were conducted, and unweighted distributions of demographic variables for the respondents who completed interviews.

We found that response rates for the news media surveys were lower than those for the government contractors and that there was considerable variability in these response rates, with some very low and others quite high.  Observed response rates were correlated strongly with refusal rates and more weakly with contact rates.  Various aspects of survey procedure were associated with higher response rates, as would be expected, including longer field periods, shorter questionnaires, the payment of incentives, sending of advance letters, and more.
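
Response rates from full disposition codes are conventionally computed using the AAPOR formulas; the sketch below shows Response Rate 1, the most conservative standard definition, under the simplifying assumption that all cases of unknown eligibility are counted in the denominator.

    # AAPOR Response Rate 1: completed interviews divided by all
    # eligible and unknown-eligibility cases.
    def aapor_rr1(complete: int, partial: int, refusal: int,
                  non_contact: int, other: int, unknown: int) -> float:
        return complete / (complete + partial + refusal + non_contact
                           + other + unknown)

    # Example: 1,000 completes among 4,000 total cases -> RR1 = 0.25.
    print(aapor_rr1(1000, 200, 1400, 900, 100, 400))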

Most importantly, the unweighted demographics of the surveys were compared to data on the nation gathered via the U.S. Census Bureau’s Current Population Survey (an authoritative benchmark). The survey samples were remarkably similar to the nation in terms of age, race, gender, education, and income. Higher response rate surveys manifested slightly less error than lower response rate surveys, but these differences were quite small.

The Survey Response Process in Telephone and Face-to-Face Surveys: Differences in Respondent Satisficing and Social Desirability Response Bias

(with Melanie Green & Allyson Holbrook)

In recent decades, survey research throughout the world has shifted from emphasizing in-person interviewing of block-listed samples to random digit dialing (RDD) samples interviewed by telephone. In this paper, we propose three hypotheses about how this shift may bring with it changes in the psychology of the survey response, involving survey satisficing, enhanced social desirability response bias, and compromised sample representativeness among the most socially vulnerable segments of populations. We report tests of these hypotheses using data from three national mode experiments. As expected, RDD telephone samples were less representative of the population, significantly under-representing its most socially vulnerable segments. Furthermore, telephone respondents were more likely to satisfice (as evidenced by no-opinion responding, non-differentiation, acquiescence, and interview length), were less cooperative and engaged in the interview, and were more likely to express dissatisfaction with its length. Telephone respondents were also more suspicious about the interview and more likely to present themselves in socially desirable ways than were face-to-face respondents. These findings shed light on the nature of the survey response process, on the costs and benefits associated with particular survey modes, and on the nature of social interaction generally.
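
One of the satisficing indicators mentioned above, non-differentiation, can be quantified in a simple way: the share of adjacent items in a battery that received identical ratings. This is a minimal illustrative version, not the paper’s actual measure.

    # Non-differentiation ("straight-lining") across a rating battery.
    def non_differentiation(ratings: list[int]) -> float:
        """Proportion of consecutive item pairs answered identically;
        1.0 means the respondent gave the same rating throughout."""
        if len(ratings) < 2:
            return 0.0
        same = sum(1 for a, b in zip(ratings, ratings[1:]) if a == b)
        return same / (len(ratings) - 1)

    print(non_differentiation([3, 3, 3, 3, 3]))  # 1.0: pure straight-lining
    print(non_differentiation([1, 4, 2, 5, 3]))  # 0.0: fully differentiated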

Acquiescence Biases Answers to Agree/Disagree Rating Scale Questions

(with Willem Saris and Eric Schaeffer)

Agree/disagree rating scales are tremendously popular in questionnaire research, but for 50 years, researchers have known that answers to these questions are biased by acquiescence response bias. In our new paper, we show that this and other problems significantly compromise the validity of measurements made with agree/disagree scales. Fortunately, it is always easy to ask the same questions with construct-specific response alternatives instead, and doing so simplifies the respondent’s task and gathers more useful data.

Improving Election Forecasting

(with LinChiat Chang)

Surveys that forecast election outcomes have implications for campaign strategies, financial contributions, political analysis in the mass media and academia, as well as actual electoral turnout. However, techniques for improving the accuracy of election forecasting in polling organizations are often proprietary and not amenable to comparative evaluation. With the intention of moving the field toward viewing these techniques as an appropriate terrain for scientific investigation, we investigated how election forecasting would be affected by (a) elimination of respondents who are not likely to vote, (b) allocation of undecided respondents to candidates or referendum positions, (c) weighting of samples for representativeness, (d) addition of random responses, (e) controlling for candidate name order effects.

Using data from the 1997-1999 Buckeye State Polls collected by the Center for Survey Research at The Ohio State University, we found that (a) although better forecasting is achieved when using a combination of filters rather than separate filters to eliminate non-voters, there is a limit on the number of filters that should be applied: filtering down to 50% of the sample is optimal for forecasting candidate races, while forecasting of referenda benefits from using only 10-20% of the original sample. (b) Random allocation of undecided respondents to candidates or issue positions improved the forecasting potential of the pre-election surveys for both candidate races and referenda. (c) Although weighting of the unfiltered samples did not consistently improve forecasting, substantial improvement was attained in referenda forecasting by weighting the samples after the optimal number of filters had been applied. (d) Addition of random responses did not improve forecasting. (e) Candidate name order effects emerged in the surveys, with recency effects most pronounced for less publicized races and among respondents with less education. Based on these results, we were able to provide a set of recommendations on how to improve election forecasting using empirically validated techniques.

The Optimal Length of Rating Scales to Maximize Reliability and Validity

(with Alex Tahk)

Survey research frequently uses multi-point scales to assess respondents’ views.  These scales vary from two points (e.g., agree or disagree) to 101 points (e.g., the American National Election Study’s thermometer-style ratings).  Scales can also vary in another regard: being bipolar (meaning the zero point is in the middle and the end points are opposites, such as extremely positive and extremely negative) or unipolar (meaning the zero point is at one end, as in “not at all important”).   However, different scale lengths may differ in reliability, so it is important to understand how the length of the scales affects the reliability of the responses.

To explore the relation between scale length and reliability, we conducted a meta-analysis of the results of many past studies.  Our data consist of results from 706 tests of reliability taken from thirty different between-subject studies.  We combined various measures of reliability and various sample sizes, controlling for these and other factors in determining the relation of scale length to reliability.

In general, we found that five- or seven-point scales produced the most reliable results.  Bipolar scales performed best with seven points, whereas unipolar scales performed best with five.  We also found that offering a midpoint on a bipolar scale, indicating a neutral position, increased reliability.

Instigators of Satisficing

(with Sowmya Anand, George Bizer, Melanie Green, Ken Mulligan, & Wendy Smith)

Satisficing theory proposes a number of survey features and individual differences that are likely to make satisficing more or less likely. For example, individuals who have not given much thought to an issue are theorized to be especially likely to satisfice; similarly, individuals with less education or who do not like to think may tend to satisfice. On the other hand, individuals who are specifically instructed to try to answer questions carefully and accurately may be more likely to provide optimal answers. We are currently subjecting these proposed moderators of satisficing to experimental tests, using both student and national samples. In particular, these experiments will shed light on the ways in which various interpersonal and situational factors interact to influence survey responding.

Development of Survey Questionnaires for NASA to Track US Aviation Safety

(with Mike Silver)

We are currently assisting in the development of several national-level survey questionnaires for a NASA program designed to track changes in U.S. aviation safety. This project has applied research on the relationship between the organization of events in memory and the recall of those memories to survey instrument design. More specifically, this research has used focus groups and individual interviews with pilots and air traffic controllers to identify the safety-related issues that inform the content of the questionnaire, as well as preliminary information on the organization of related events, identification of a key cognitive organization scheme used by pilots in their memories of safety-related events, assessments of pilots’ abilities to accurately recall events over time, and an experimental demonstration of the effectiveness of using memory cues matching the pilots’ organization scheme to enhance accurate recall of safety-related events.

Comparing the Results of Probability and Non-Probability Sample Surveys

Researchers interested in collecting survey data from national samples often consider three possible ways of doing so: (1) telephone interviewing of RDD samples, (2) internet data collection from non-probability samples of volunteers, and (3) internet data collection from probability samples recruited via RDD. In order to help inform such choices, a single questionnaire was designed and administered by each of eight survey firms (one doing RDD telephone interviewing, one doing internet data collection from a probability sample, and six doing internet data collection from non-probability samples; each sample included approximately 1,000 people). The firm that conducted internet data collection from a probability sample and one firm that collected data from volunteer respondents (SPSS) were told in advance that results would be compared across firms; the remaining firms were not told in advance that such comparisons would be made. A set of preliminary results was presented at the 2005 annual meeting of the American Association for Public Opinion Research (AAPOR). Questions about this study can be directed to Professor Douglas Rivers at Stanford University (rivers@stanford.edu).

A Comparison of Minimally Balanced and Fully Balanced Forced Choice Items

(With Eric Shaeffer, Gary Langer, and Dan Merkle)

Survey researchers are generally mindful that balancing the wording of a question can alter the distributions of answers obtained. However, researchers who choose to use balanced questions can choose among multiple ways to achieve this aim. A "fully balanced" question involves fully restating the competing point of view, whereas a "minimally balanced" question simply uses the words "or not" or a phrase of that sort to briefly acknowledge a second viewpoint.

In two studies using national sample survey data, we compared the distributions and concurrent validity of responses across fully and minimally balanced questions. We also explored whether the impact of full balancing varied with respondent education, a variable that has been shown in prior studies to regulate the magnitude of various response effects. Across these studies, minimally balanced and fully balanced questions resulted in similar distributions of responses of equivalent validity, and this pattern did not vary with respondent education.

A third study examined the distributions of responses to factual knowledge questions using a sample of college undergraduates. Participants provided responses to either fully balanced or minimally balanced questions that were worded either in a conversationally conventional way (e.g., greater or less than) or in a non-conventional way (e.g., less than or greater than).  The latter approach has been shown in other research to disrupt processing and reduce data quality, and we found here that the unconventional question wording yielded fewer correct answers from respondents.  When the unconventional wording was used, full balancing enhanced answer accuracy over what was obtained with minimal balancing.  But when a question is worded in a way consistent with conversational conventions, full balancing did not offer an advantage over minimal balancing in terms of response accuracy.

Therefore, when questions are worded in conversationally conventional ways, the practical benefits of minimal balancing give that approach a relative advantage over full balancing.  However, if researchers were inclined to violate conversational conventions, fully balanced items may offer an advantage.

Comparing the Quality of Data from Telephone and Internet Surveys

(with LinChiat Chang)

With their response rates declining and costs rising, telephone surveys are increasingly difficult to conduct.  At the same time, Internet data collection is emerging as a viable alternative, in two forms.  Some firms are distributing computer equipment to national samples recruited through RDD calling, and other firms are attracting volunteer respondents and then building panels of those individuals with some demographic characteristics distributed as they are in the nation.  Most firms assemble panels of respondents who provide data on a regular basis.

Given the obvious practical advantages of Internet-based data collection, it seems worthwhile to conduct objective tests of this relatively new method in direct comparison with the dominant alternative methodology: telephone interviewing. To do so, we commissioned a set of side-by-side surveys using a single questionnaire to gauge public opinion and voting intentions regarding the 2000 U.S. Presidential Election from national samples of American adults.

Data were collected by three houses: The Ohio State University Center for Survey Research (CSR), Knowledge Networks (KN), and Harris Interactive (HI).  The CSR did RDD telephone interviewing.  KN recruited respondents via RDD telephone interviews and equipped them with WebTV, which then permitted Internet data collection.  HI respondents joined a panel after seeing and responding to invitations to participate in regular surveys; the invitation appeared on the Excite search engine web page and in various other places as well.  These respondents also completed Internet surveys.

This study suggests that Internet-based data collection represents a viable approach to conducting representative sample surveys. Internet-based data collection compromises sample representativeness, more so when respondents volunteer than when they are recruited by RDD methods. But Internet data collection improves the accuracy of the reports respondents provide relative to telephone interviews.

Response Option Order, Respondent Ability, Respondent Motivation, Task Difficulty, and Linguistic Structure

(with Allyson Holbrook)

Satisficing theory suggests that respondents may sometimes choose the first satisfactory response alternative they consider, rather than carefully considering all the response alternatives. This theory predicts that respondents are most likely to satisfice when they are unable and/or unmotivated to think carefully about a question and when the question is difficult to answer. When questions are presented orally, respondents typically cannot start thinking about the response alternatives until all have been read, so they more fully process the response alternatives read last. This process typically leads to recency effects when questions are presented orally. In a meta-analysis of 212 dichotomous response order experiments in telephone surveys conducted by the Gallup Organization between 1995 and 1998, we are testing the impact of respondent ability, respondent motivation, and task difficulty on the likelihood and magnitude of response order effects. In addition, we are exploring a new hypothesis: that the order in which response options are considered can be affected by the linguistic structure of the question.

Measuring the Frequency of Regular Behaviors: Comparing the ‘Typical Week’ to the ‘Past Week’

(with LinChiat Chang)

Respondents’ reports of behavioral frequencies have implications for important issues spanning the spectrum of unemployment rates, medical epidemiology, neighborhood and community evaluations, transport infrastructure, crime rates, consumer behavior, and government health resource allocation. Despite numerous assumptions about the relative strengths and weaknesses of questions asking about the past week vs. a typical week, there is a lack of empirical evidence comparing the performance of these two question forms. One previous study revealed no significant difference between past week and typical week measures, but those analyses treated variances in these two question forms as if they were the same. Using more appropriate analysis techniques, we compared the validity of “typical” week and “past” week reports using data from the 1989 National Election Pilot Study, in which respondents were randomly assigned to report TV news and newspaper exposure during either a typical week or the past week. The predictive validity of the measures was assessed using objective tests of current events and political knowledge, as well as self-report assessments of political knowledge. The typical week question form proved to be consistently superior, especially among the most educated respondents. We encourage further attempts to replicate the current findings in other domains of behavioral frequencies.

Designing Good Questionnaires

(with Leandre Fabrigar)

Thousands of experimental studies have compared the effectiveness of questionnaire items written in different ways, yet these studies have never been brought together in a single review. We are now completing a book doing just that. We will draw upon this literature to recommend when to use open-ended vs. closed-ended questions, when to use rating scales vs. ranking tasks, how many points to put on rating scales and how to label the points verbally, how the order of response choices influences answers, whether to offer "don't know" response options, how to word and order questions, whether to ask people to recall their attitudes at prior times, and whether to ask people to explain the causes of their thinking and actions. The result of our efforts is an empirically validated set of recommendations about how to maximize the reliability and validity of data collected via questionnaires.

Exploring the Impact of Sequential Ordering on the Interpretation of Fully Verbally Labeled Ratings Scales

(with Annabell Suh and Philip Garland)

Rating scales often contain fully verbally labeled response options presented in a sequential order (e.g., from extremely positive to extremely negative). There is some evidence that presenting response options sequentially aids respondents. If response options are vague, for instance, the order of presentation might help to clarify the meaning of each response option. On the other hand, other evidence suggests that verbal labels are clear enough on their own; respondents may, then, only be taking into account the literal meanings of the response options and not any context information.

This series of studies aims to examine these two questions: first, how respondents interpret the meaning of a response option scale and second, whether verbal labels are clear and unambiguous on their own. It compares the data quality and respondent satisfaction of two types of rating scales: one presented in a sequential order and one presented in an entirely random order. Initial results suggest that respondents are not aided by and do not employ order information, indicating that verbal labels may be clear enough on their own.

The Accuracy of Direct vs. Filtered Questions

(with Rajesh Srinivisan, Annabell Suh, and Philip Garland)

A seemingly simple frequency question, for example asking how many times one has seen a movie, can be presented in two different ways. It can be asked directly, as in, “How many times in the past week have you seen a movie?” It can also be asked with a filter: first a yes/no question, “Have you seen a movie in the past week?”, followed by the frequency question only for those who answer yes.

Previous research has found that these two different types of questions lead to different frequencies, with the direct question generally leading to higher frequencies than the filtered question does. This is the first study to examine which question type is more accurate and to determine why the difference occurs. It also rules out a previously suggested explanation for the difference.

Public Attitudes on Global Warming

Fox and Not-Fox Television News Impact on Opinions on Global Warming: Selective Exposure, Not Motivated Reasoning

(Jon A. Krosnick, Bo MacInnis)

The influence of the mass media on political beliefs and attitudes has been of interest to scholars for many years, spurred seven decades ago by radio broadcasts reaching mass audiences for the first time. Half a century ago, landmark publications proclaimed that the news media have “minimal effects” and this conclusion has been supported by much work since then. One possible explanation for minimal effects in real world settings is that people’s opinions are solidly grounded and highly resistant to change. Another possible explanation is that people rarely pay enough attention to news content in order to be influenced by it. But a third possibility is that exposure to different media sources might cause opinion changes in opposite directions that cancel out in the aggregate. Specifically, people may choose to expose themselves to media with which they generally agree (“selective exposure”) and may be especially influenced by messages that align with their more general political orientations, a process that could appropriately be called “motivated reasoning.” “Minimal effects” documented in past research may therefore be an illusion, attributable to the failure to account for the varying content of news coverage and variation across people in their media exposure diets and acceptance proclivities.

We tested these hypotheses with regard to global warming, using two national probability-sample surveys of American adults. Specifically, we explored (1) whether “minimal effects” are observed when lumping all news media exposure together, (2) whether differentiating Fox News from not-Fox news exposure yields evidence of attitude change in opposite directions and canceling out in the aggregate, (3) whether there is a dose–response relation between exposure and opinions, (4) whether Republicans were more likely to acquire information from Fox News, whereas Democrats were more likely to acquire information from other television news sources, and (5) whether motivated reasoning is observed, whereby Republicans were more persuaded by Fox News and Democrats were more persuaded by not-Fox television news.

The Impact of Candidates’ Statements about Global Warming on Electoral Success in 2008 to 2015: Evidence Using Five Methodologies

(with Bo MacInnis, Jon A. Krosnick)

Using various methodologies and a large number of data sources, we tested opposing speculations in the literature about whether adopting a ‘green’ position on global warming helps or harms political candidates in elections. A special focus is on whether these effects might be particularly pronounced among people who attach greater personal importance to the issue of global warming. Across eight studies, we used survey experiments, content analysis, and a traditional political science regression approach to examine the link between a green position on global warming and electoral success. On one hand, the results suggest that political candidates—and in particular Democrats—can only gain votes by taking a green position. Moreover, a simulation building on an extensive content analysis of candidates’ websites suggests that the outcomes of some races could have been flipped if candidates had spoken differently about global warming. On the other hand, the simulation also suggests that these changes would most likely not have altered which party controlled the U.S. Senate and House of Representatives. Across the studies, people who attached greater personal importance to global warming were especially inclined to vote for candidates who took a green position.

The present work provides an answer to the question of whether green positions help or harm political candidates and makes a more general contribution to political science, as it illustrates how the coordinated application of several research methods can illuminate the impact of a single issue on candidate choice.

Trust in Scientists’ Statements about the Environment and American Public Opinion on Global Warming

(with Bo MacInnis, Jon A. Krosnick)

We explore a question of great political significance: what affects public perceptions of global warming? The focus is on people’s trust in climate scientists. Building on several social psychological theories, we assess both the effect of trust in climate scientists and the accuracy of a piece of conventional wisdom: namely, that controversies around the integrity of climate scientists have led to increasing disbelief in global warming among the general public.

We employed a large database and an experimental design to rule out alternative explanations of the findings. The accumulating evidence suggests that the effect of the controversies about climate scientists’ integrity on the public’s trust in scientists has been exceedingly small at best. Interestingly, people who did not trust scientists were not affected by statements of climate experts that confirmed the existence of global warming, but they were also not affected by experts who were skeptical about global warming. Instead, these people seemed to have based their beliefs about global warming on their own experience with the weather. Conversely, people who trusted scientists were more influenced by scientists of all types. Moreover, trust in scientists was not influenced by media coverage of global warming.

The American Public’s Preference for Preparation for the Possible Effects of Global Warming: Impact of Communication Strategies

(with Bo MacInnis, Jon A. Krosnick, Adina Abeles, Margaret R. Caldwell, Erin Prahler, Debbie Drake Dunne)

Experiments embedded in surveys of nationally representative samples of American adults assessed whether attitudes toward preparation for the possible effects of global warming varied depending on who endorsed such efforts, the stated purpose of preparation, the consequences of global warming targeted in a preparation message, and the words used to describe preparation and its alternative. Collapsing across all experiments, most (74%) Americans preferred preparing for possible consequences of global warming. The experimental manipulations produced statistically significant variation in this percentage, but in ways inconsistent with a series of perspectives that yield predictions about this variation. Preference for preparation was not greater when it was described using more familiar or simpler terms (preference for preparation was greatest when it was described as to “increase preparedness” and least when described as “increase resilience”), when efforts were said to be focused on people’s health rather than on people and the environment generally or on coastal ecosystems in particular, or when preparation was endorsed by more generally trusted groups (preference for preparation was highest when no one explicitly endorsed it or when endorsed by government officials or university researchers and declined when religious leaders or business leaders endorsed it). Thus, these experiments illustrate the value of empirical testing to gauge the impact of variation in descriptions of policy options in this arena and illustrate how communication approaches may have influenced public opinion in the past.

Does the American Public Support Legislation to Reduce Greenhouse Gas Emissions?

(with Jon A. Krosnick, Bo MacInnis)

Despite efforts by some congressional legislators to pass laws to limit greenhouse gas emissions and reduce the use of fossil fuels, no such laws have yet been adopted. Is this failure to pass new laws attributable to a lack of public desire for such legislation? Data from national surveys conducted by the Political Psychology Research Group at Stanford University support two answers to this question. First, large majorities of Americans have endorsed a variety of policies designed to reduce greenhouse gas emissions; second, policy support has been consistent across years and across scopes and types of policies. Popular policies include fuel economy and energy-efficiency standards, mandated use of renewable sources, and limitations on emissions by utilities and by businesses more generally.

Support for policies has been price sensitive, and the American public appears to have been willing to pay enough money for these purposes to cover their costs. Consistent with these policy endorsements, surveys show that large majorities of Americans believe that global warming has been happening, that it is attributable to human activity, and that future warming will be a threat if unaddressed. Not surprisingly, these beliefs appear to have been important drivers of public support for policies designed to reform energy generation and use. Thus, it seems inappropriate to attribute lack of legislation to lack of public support in these arenas.

Public Opinion on Climate Change Hard to Nudge by Questionnaire Design Manipulations

(with Catherine Chen)

Public policymakers may wish to take into account public opinion on climate change as they craft legislation, but if the public doesn’t have real and crystalized opinions, perhaps such reliance would be unwise. Although much literature documents instances in which seemingly innocuous changes in question design produced statistically significant shifts in the distributions of answers, no such investigation has yet focused on a broad array of questions about climate change. This paper reports 90 survey experiments with representative samples of American adults exploring the extent to which answers to such questions are influenced by various manipulations. Of 128 tests, just 23 yielded statistically significant effects, slightly more than would be expected by chance alone. These results confirm researchers’ suspicions that such effects do occur, but such effects are the exception rather than the rule, which is reassuring about the robustness of survey evidence.

Do People Lie When Reporting Their Beliefs on Global Warming and Support for Green Policies? A Test Using the Item Count Technique

The recent decade has witnessed an increase in people’s concern about the environment and support for green policies (e.g., Gallup, 2019). However, a threat to the optimistic vision mapped out by current opinion polls is social desirability bias, the tendency for respondents to answer questions in ways that they think are socially admirable (Edwards, 1957). People who believe that being “green” (i.e., endorsing environmental protection beliefs and supporting environmental protection policies) is socially desirable may overreport their green tendencies to fit in with society in general. Those who believe that members of their own social group are likely to be skeptical about global warming and green policies may suppress their green beliefs.

The current research uses the Item Count Technique (ICT) to investigate whether people intentionally lie when reporting their belief in the existence of global warming and their support for green policies. 1,500 respondents are randomly assigned to three groups: one answers traditional direct questions on 1) the existence of global warming and 2) support for green policies, and the other two are shown lists of statements and asked how many of the items they endorse. The ICT allows researchers to estimate the proportion of people who endorse the existence of global warming and support green policies without any individual revealing his or her answer. It is hypothesized that the proportions measured using the traditional question format and the ICT do not differ significantly. Support for these hypotheses would indicate that polls elicit candid answers and that public opinion on climate change issues is crystallized.
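
For concreteness, the ICT estimator is just a difference in mean counts between the two list conditions. A minimal sketch, with simulated counts standing in for real responses:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical counts: the control group reports how many of 4
# innocuous statements they endorse; the treatment group sees the same
# list plus the sensitive item (e.g., "global warming is happening").
control = rng.integers(0, 5, size=500)    # 0-4 items endorsed
treatment = rng.integers(0, 6, size=500)  # 0-5 items endorsed

# Because no one reports which items they endorsed, the difference in
# mean counts estimates the proportion endorsing the sensitive item.
prop = treatment.mean() - control.mean()
se = np.sqrt(treatment.var(ddof=1) / treatment.size
             + control.var(ddof=1) / control.size)
print(f"Estimated proportion endorsing: {prop:.3f} (SE {se:.3f})")
```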

When climate scientists expressing uncertainty helps or hinders their credibility among the public

(with Lauren Howe, Bo MacInnis, Jon Krosnick, Ezra Markowitz & Robert Socolow)

Making predictions about the future, including predicting the consequences of global warming, inherently involves some uncertainty. Yet scientists may be concerned that expressing uncertainty in their predictions would make the public less confident in their findings. In a published study, PPRG scholars and collaborators tested how climate scientists expressing uncertainty affects the public’s response to scientists’ predictions about sea level rise.

The researchers conducted an experiment as part of a survey of a nationally representative sample of 1,174 American adults. Survey respondents were randomly assigned to read information about scientists’ predictions of sea level rise that included different expressions of uncertainty.

Some participants read information in which scientists expressed no uncertainty about sea level rise, including the statement: “Global warming will cause the surface of the oceans around the world to rise about 4 feet.” Other participants read information in which scientists expressed a worst-case scenario alongside this estimate, stating that “Global warming will cause the surface of the oceans around the world to rise about 4 feet. However, sea level could rise as much as 7 feet.” Another group of participants read information in which scientists expressed both a best-case and worst-case scenario about sea level rise, reading “Global warming will cause the surface of the oceans around the world to rise about 4 feet. However, sea level could rise as little as 1 foot or it could rise by as much as 7 feet.”

Then, the researchers measured how these different expressions of uncertainty affected public trust in scientists, by asking respondents to indicate how much they trust the things that scientists say about the environment. The researchers also measured public acceptance of the scientists’ messages by asking respondents to indicate how serious they thought the effects of sea level rise caused by global warming would be.

Interestingly, adding just the worst-case scenario did not increase trust in scientists or acceptance of their messages. But expressing both a best-case and worst-case scenario increased both public trust in scientists and acceptance of their messages.

However, there were limits to the benefits of expressing a best-case and worst-case scenario. Some respondents read these same predictions about sea level rise and also read a statement that scientists believe that global warming will increase storms such as hurricanes, and that these storms would lead to storm surges that can exacerbate the consequences of sea level rise. Acknowledging this additional uncertainty about unpredictable damage caused through storm surges undermined the constructive impact of acknowledging a best-case and worst-case scenario about gradual sea level rise.

Taken together, these findings suggest that the public is quite receptive to uncertainty in the form of a full range of possible outcomes. Such expressions of uncertainty increased both public trust in scientists and receptiveness to their messages in this study. But juxtaposing a range of possible outcomes with discussion about the forces that make it impossible to predict the full extent of possible damage backfired, decreasing public trust in scientists and acceptance of their claims.


Partisanship and Attitudes about Global Warming

(with Ariel Malka)

Since the 1990s, the volume of information about global warming (GW) transmitted to the general public has increased dramatically.  This increase in coverage was initially sparked by the emergence of a scientific consensus that human-caused GW has, in fact, been occurring and that it may have devastating consequences (Intergovernmental Panel on Climate Change, 1995).  Despite the emergence of this consensus, the messages about GW conveyed to the general public during the last decade have often been mixed.  Much mainstream news coverage has suggested that GW is real, human-caused, potentially catastrophic, and something that the federal government should deal with, but a good deal of coverage has also presented various more skeptical views as well.  Perhaps driven by a desire to appear politically impartial and/or to cover all viewpoints fully, news media outlets have often quoted individuals conveying that human-caused GW is not happening, or that the government should not take aggressive action to deal with GW.

In this research, we are examining the impact of this information flow on Americans’ attitudes and beliefs about GW.  In one study, we compared survey responses from a representative American sample in 1998 to those of another representative American sample collected in 2007.  During those years, Americans became more likely to A) hold basic beliefs about GW that are congruent with the scientific consensus, B) favor greater federal government action to deal with GW, C) perceive agreement among scientists that GW has been happening, and D) possess stronger attitudes and beliefs about GW.  However, these increases occurred only among Democrats and Non-Partisans, not among Republicans.  Consequently, Republicans and Democrats are now more different from one another than they were in the late 1990s.

In another study, we examined the relation between knowledge about GW and concern about this issue.  Information campaigns about GW are often predicated on the assumption that learning more about GW will lead people to become more concerned about it.  Using data from three surveys of nationally representative samples of American adults, we found that the relation between knowledge and concern about GW is more complex than this view suggests.  Among people who trust scientists to provide reliable information about the environment and among Democrats and Non-Partisans, increased knowledge has been associated with increased concern.  But among people who are skeptical about scientists and among Republicans, more knowledge was generally not associated with greater concern.  The association of knowledge with concern among Democrats and Non-Partisans who trust scientists was mediated by perceptions of consensus among scientists about GW’s existence and by perceptions that humans are a principal cause of GW.  Thus, when studying the relation of knowledge and concern, it is important to take into account the content of the information that different types of people acquire and choose to rely upon.

The Development of Public Beliefs and Attitudes about Global Warming

(with Penny Visser, Allyson Holbrook, & Laura Lowe)

In September, 1995, the international community of scientists who study the environment announced that they had come to a new consensus that global warming has been occurring as the result of human activities and that it will have very significant and costly consequences for the world unless some steps are taken to slow its development. This new consensus was reported to Americans via television news programs and in newspapers, but these two media carried slightly different messages. Whereas the television messages simply acknowledged the new scientific consensus, newspaper stories acknowledged that a minority of scientists disagreed with this position, and newspaper stories published in October and November, 1995, were especially skeptical.

In December, 1995, we conducted a telephone survey of a representative sample of Ohio adults to study the diffusion and impact of this information. And in short, we found that people formed their beliefs about whether or not global warming is real using both news media information and their own personal experiences. Television exposure did indeed encourage people to believe more in the existence of global warming, whereas newspaper exposure discouraged such a belief. But these media effects occurred only among people who were highly trusting of scientists to provide accurate information. People who were distrusting of scientists based their assessments of the existence of global warming on their own first-hand observations of changes in temperature and air pollution levels in recent years. Those who thought temperatures had gotten warmer and who thought pollution had increased were especially likely to believe in global warming.

We also examined the origins of people’s attitudes toward global warming. Although most people thought global warming would be negative, some felt it would be neither positive nor negative, and a few actually thought it would be positive overall. And these attitudes were apparently driven by people’s beliefs about its impact on factors immediately relevant to daily life: food, water, and shelter. People who believed global warming would hurt food and water supplies and would flood coastal living areas held negative attitudes. In contrast, global warming’s perceived impact on the beauty of natural scenery, on processes of plant and animal species extinction, on animal migration, and the like was inconsequential. Therefore, it appears that people’s attitudes were driven by their beliefs about the immediate material interests of society.

This survey project also allowed us to explore some general issues in the attitude literature. For example, we examined whether four dimensions of attitude strength (attitude importance, prior thought, certainty, and perceived knowledge) are all reflections of a single underlying construct. And although a factor analysis of them yielded a single factor, they were correlated quite differently with demographic variables, psychological antecedents, and a measure of the magnitude of the false consensus effect. This evidence reinforces the general conclusion that attitude strength is not a unitary construct.
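
To illustrate the factor-analytic logic here (all numbers invented for illustration), a single dominant eigenvalue of the inter-item correlation matrix is the classic one-factor signature, yet that alone cannot settle whether the dimensions are one construct:

```python
import numpy as np
from numpy.linalg import eigvalsh

# Hypothetical correlation matrix among four strength dimensions:
# importance, prior thought, certainty, perceived knowledge.
R = np.array([[1.00, 0.45, 0.40, 0.42],
              [0.45, 1.00, 0.38, 0.44],
              [0.40, 0.38, 1.00, 0.41],
              [0.42, 0.44, 0.41, 1.00]])

# One large eigenvalue suggests a single common factor...
print(np.round(sorted(eigvalsh(R), reverse=True), 2))

# ...but the dimensions can still correlate differently with outside
# criteria (demographics, false consensus), which is the evidence
# described above that attitude strength is not a unitary construct.
```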

In September and October, 1997, we conducted a telephone survey of a representative sample of adults.  From December, 1997 through February, 1998, we re-interviewed a portion of those interviewed in September and October, as well as an additional representative sample of adults who had not previously been interviewed.  Between these two sets of interviews, the White House Conference on Global Climate Change occurred, and hundreds of stories on global warming were broadcast on television and radio, and published in newspapers and magazines across the country.  Our goal was to re-examine our findings from the Ohio survey with a national sample and to study how this media coverage changed public beliefs and attitudes about global warming.

On the surface, American public opinion about global warming did not seem to change in response to media coverage of the issues.  However, changes did occur when party identification was considered.  Strong Democrats moved in the direction of the message coming from the White House (i.e., that global warming would happen, that it would be bad, and that something should be done about it) while strong Republicans moved in the opposite direction.  So even though overall attitudes did not change, opinions polarized along party lines.  In addition to this partisan polarization, the media attention led the public to do more thinking about the issue of global warming and to be more certain of their opinions about global warming.  People were also able to report their opinion about global warming more quickly during the second set of interviews, one indicator that people’s opinions about global warming were more crystallized after the media campaign.

Attitude Strength and Issue Publics

Exploring the Latent Structure of Strength-Related Attitude Attributes

(with Penny Visser & George Bizer)

Some attitudes are durable and impactful, whereas others are weak and inconsequential. Over the last few decades, researchers have identified roughly a dozen attributes of attitudes that differentiate the strong from the weak. However, considerable controversy remains regarding the relations among these attributes. Some scholars have suggested that the various strength-related attributes reflect a small number of latent constructs, whereas others have suggested that each is a distinct construct in its own right. We review this ongoing controversy, and we then review a diverse set of recent studies that provide new evidence in support of the latter perspective. We consider the implications of our findings for the conceptualization of attitude strength and for the methods by which it is studied.

The Measurement of Attitudes

(with Charles Judd & Bernd Wittenbrink)

Attitude measurement is pervasive. Social psychologists routinely measure attitudes when studying their causes (e.g., Fishbein & Ajzen, 1975; Tesser, Whitaker, Martin, & Ward, 1998; Zajonc, 1968), how they change (e.g., Festinger, 1957; Hovland, Janis, & Kelley, 1953; Petty & Cacioppo, 1986), and their impact on cognition and behavior (e.g., Lord, Ross, & Lepper, 1979). Attitude measurement is also frequently done by political scientists, sociologists, economists, and other academics. Commercial market researchers are constantly engaged in measuring attitudes toward real and imagined consumer products and services. Beginning in the 1990s, all agencies of the U.S. federal government initiated surveys to measure attitudes toward the services they provided. And the news media regularly conduct and report surveys assessing public attitudes toward a wide range of objects. One of the most consequential examples is the routine measurement of Americans’ approval of their president.

To gauge people’s attitudes, researchers have used a wide variety of measurement techniques. These techniques have varied across history, and they vary across professions today. This variation is due both to varying philosophies of optimal measurement and varying availability of resources that limit assessment procedures. When attitude measurement was first formalized, the pioneering scholars presumed that an attitude could be accurately assessed only using a large set of questions that were selected via an elaborate procedure (e.g., Likert, 1932; Thurstone, 1928). But today, attitudes are most often assessed using single questions with relatively simple wordings and structures, and the variability of the approaches is striking, suggesting that there is not necessarily one optimal way to achieve the goal of accurate measurement.

Recently, however, scholars have begun to recognize that the accumulating literature points to clear advantages and disadvantages of various assessment approaches, so there may in fact be ways to optimize measurement by making good choices among the available tools. Furthermore, some challenging puzzles have appeared in the literature on attitude measurement that are stimulating a reevaluation of widely shared presumptions. This makes the present a particularly exciting time for reconsidering the full range of issues relevant to attitude measurement.

In this chapter, we offer a review of issues and literatures of use to researchers interested in assessing attitudes. We begin by considering the definition of attitudes, because no measurement procedure can be designed until the construct of interest has been specified. We review a range of different definitions that have been adopted throughout the history of social psychology but settle on one that we believe captures the core essence of the notion of attitudes and that we use to shape our discussions throughout.

Because attitudes, like all psychological constructs, are latent, we cannot observe them directly. So all attitude measurement depends on those attitudes being revealed in overt responses, either verbal or nonverbal. We, therefore, turn next to outlining the processes by which we believe attitudes are expressed, so we can harness those processes to accurately gauge the construct. Finally, we outline the criteria for optimal measurement that we use throughout the rest of the chapter: reliability, validity, and generalizability.

Having thus set the stage, we turn to describing and evaluating various techniques for measuring attitudes, beginning with direct self-reports (which overtly ask participants to describe their attitudes). We outline many ways by which a researcher can design direct self-report measures well and less well. Next, we acknowledge the limits of such direct self-reports. A range of alternative assessment techniques, some old and others very new, have been developed to deal with these limitations, and we review those techniques next.

Attitude Importance and Attitude Accessibility

(with George Bizer)

Some scholars have argued that people use attitude accessibility as a heuristic with which to infer attitude importance, whereas others have argued that importance causes accessibility. Through a series of experiments, we have examined the relation between these constructs.  We failed to find an effect of accessibility on importance, whereas we did find effects of importance on accessibility.  These findings have helped us to better understand the relation between these two constructs and, perhaps more importantly, the underlying structure of attitude strength in general.  Specifically, it appears that importance and accessibility represent distinct constructs, and some apparent effects of importance may be mediated by accessibility.

The Development of Attitude Strength Over the Life-Cycle

(with Penny Visser)

A number of theories posit that people’s attitudes become stronger as they get older, though they disagree on exactly how and when this might occur. Using data from national and regional surveys of adults, we have found that people’s political attitudes are especially open to change between ages 18 and 25, become more resistant to change immediately thereafter, and become more open to change at the end of the life-cycle. Other manifestations of attitude strength (e.g., the personal importance of attitudes, the confidence with which they are held, and the amount of knowledge people feel they have) also show this same surge and decline.

We have recently expanded this program of research in a number of ways.  First, we are exploring the generalizability of our results to other attitude domains.  Because our research, like virtually all of the investigations that have preceded it, has focused on attitudes toward social and political issues, it is not clear whether the observed pattern of openness to attitude change is unique to social and political attitudes, or whether it describes age-related fluctuation in openness to change more generally.

Second, in addition to resistance to attitude change, we are exploring changes over the life span in some of the other defining qualities of attitude strength. Specifically, we are assessing changes over the life span in the degree to which attitudes (1) motivate and guide behavior and (2) direct information processing.

Finally, we are moving beyond a simple description of the relation between age and openness to attitude change to explore the causal underpinnings of this relation.  Specifically, we are testing several possible mediators of the relation between age and openness to change, including (1) changes in the size, composition, and frequency of contact with people’s social networks, (2) changes in the frequency of role transitions and new social identifications, (3) changes in the nature of people’s self concept, and (4) changes in cognitive functioning over the life span.

This program of research promises to enrich our understanding of the aging process and refine our appreciation of the adult life cycle.  Equally important, however, this research will contribute to a broader understanding of the social and psychological factors that determine susceptibility to attitude change in general.

How People Form Political Attitudes

(with Allyson Holbrook)

Many researchers have argued that citizens combine information about political candidates by simply subtracting the number of unfavorable beliefs they have about a candidate from the number of favorable beliefs they have about the candidate.  This describes a symmetric linear process.  It is symmetric because favorable and unfavorable beliefs have the same magnitude of impact on attitudes.  It is linear because as beliefs are added, they have the same amount of impact as earlier beliefs.  For example, five beliefs have five times as much impact as one belief.  In addition, according to a symmetric linear model (SLM), citizens who have no favorable or unfavorable beliefs about a candidate have neutral attitudes toward him or her.

Work in psychology adopting a behavioral adaptive perspective suggests a number of amendments to the SLM.  According to this perspective, human cognitive and behavioral processes develop because they facilitate survival and reproduction in a potentially hostile world.  Approaching any new object with favorable expectations is worthwhile, because it could be food or could facilitate acquisition of food.  However, vigilantly scanning for any signs of danger an object might pose is also important, so that harmful objects can be avoided.  In the absence of any information about an object, then, attitudes toward it should be slightly positive.  And people should be especially attentive to the first information they receive about an object in order to form an accurate first impression.  Then, if the object appears to pose no immediate threat, vigilance can taper off, so that the impact of each additional piece of information acquired about the object may diminish.  However, because one must vigilantly scan for signs of danger in an object, unfavorable information should have more impact than favorable information, and vigilance to additional unfavorable information should not taper off to the same degree as attention to additional favorable information.

The model we propose, the asymmetric nonlinear model (ANM), is based on this approach and makes three predictions about attitudes toward political candidates that differ from those of the SLM. First, citizens who have no favorable or unfavorable beliefs about a candidate should have slightly positive attitudes toward him or her. Second, new information should have less impact as the amount of previously acquired information increases. And third, unfavorable information should have greater impact than favorable information, and/or as the amount of previously acquired information increases, the impact of unfavorable information should decrease more slowly than the impact of favorable information.
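
As a rough formalization (the functional forms and parameter values here are illustrative assumptions, not the authors' exact specification), the two models might be written as follows:

```python
import numpy as np

def slm(n_favorable, n_unfavorable, beta=1.0):
    """Symmetric linear model: equal, constant impact per belief;
    zero beliefs imply a neutral (zero) attitude."""
    return beta * (n_favorable - n_unfavorable)

def anm(n_favorable, n_unfavorable, baseline=0.2, b_fav=0.8, b_unfav=1.2):
    """Asymmetric nonlinear model (illustrative functional form only):
    slightly positive baseline with no information, diminishing marginal
    impact via log, and unfavorable beliefs weighted more heavily."""
    return (baseline
            + b_fav * np.log1p(n_favorable)
            - b_unfav * np.log1p(n_unfavorable))

# With no beliefs, the SLM predicts neutrality, the ANM a slight positive.
print(slm(0, 0), anm(0, 0))
# Under the ANM, the fifth favorable belief adds less than the first.
print(anm(5, 0) - anm(4, 0), anm(1, 0) - anm(0, 0))
```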

We compared the SLM and the ANM using National Election Study (NES) data from presidential elections from 1972 to 1996.  Cross-sectional NES data showed that the ANM describes attitudes toward presidential candidates and political parties better than the SLM among respondents high and low in political involvement (measured using education, political knowledge, time of voting decision, and whether or not respondents voted).  Longitudinal NES data (collected before and after presidential elections between 1980 and 1996) showed that the ANM outperforms the SLM in describing the impact of beliefs on changes over time in attitudes toward presidential candidates.  And the ANM revealed that voter turnout is enhanced by a stronger preference for the preferred candidate, as long as at least one candidate is disliked, whereas the SLM failed to detect this effect.  Thus, the ANM appears to be superior to the SLM, and it has important implications for understanding the impact of election campaigns on citizens’ preferences and actions.

Gauging the Attitude-Behavior Relation

(with Christopher Bryan)

Social psychologists have long been interested in the extent to which behavior is consistent with attitudes.  One approach to the study of this topic is to measure features of attitudes that relate to the strength of the attitudes. A great deal of research has shown that certain features of attitudes are related to the strength of the attitude-behavior relation.  Examples of such features are the personal importance of the attitude object and the certainty with which the attitude is held.  These features are referred to as Strength-Related Attitude Features (or SRAFs).

Our research focuses on a methodological issue related to the way in which the effects of SRAFs are determined.  Existing social psychological research on this topic has used one of two types of statistical analysis to gauge the effect of SRAFs on the attitude-behavior relation.  The first assumes that, to the extent that people engage in behavior related to an attitude object, it will be consistent with their attitude.  For example, it is assumed that if a person is opposed to legalized abortion and signs a petition on the issue, that person will sign a petition against legalized abortion and not for it.  Making this assumption, some studies have tested whether various SRAFs predict the number of attitude-expressive behaviors a person performs, ignoring the types of behaviors.

A second approach does not make any assumptions about valence matching between attitudes and behavior.  Studies that used this approach treated the direction of a person’s attitude as a variable in the analysis and tested whether various SRAFs interacted with a person’s attitude to predict behavior, taking into account the direction of that behavior.

Although both of these approaches are designed to test the same theoretical question, we have found that, in our data on attitudes about legalized abortion, the methods  yield very different results.  This might help to explain the fact that different studies of SRAF effects sometimes yielded contradictory results.  Therefore, attitude researchers should carefully choose their analytic method, because that choice can affect their results.

Presidential Approval and Gas Prices: The Bush Presidency in Historical Context

(with Laurel Harbridge)

During the last two years, journalists and scholars have speculated about the possibility that rising gasoline prices have caused the decline in President Bush’s approval ratings.  But documenting causality has been difficult, because both variables have trended together, albeit in opposite directions, since 2002.  In order to separate correlation from causation, we created a monthly time series from 1976 to 2006 to place the relation between gas prices and approval in the context of numerous presidential administrations.  Controlling for traditional economic, event, and scandal predictors of presidential approval, we implemented time series analysis to assess the historical relation between gas prices and approval and to test whether President George W. Bush’s approval has been impacted differentially by gas prices.  In addition, we tested whether data from different polling agencies produce different results and whether an average measure of approval, collapsing across “houses”, is sufficient.  Preliminary evidence indicates that, when controlling for other factors, gas prices have not been a significant factor in determining presidential approval either in the past or during the current Bush presidency, but that they were powerful determinants of approval during Jimmy Carter’s presidency, when the White House’s responsibility for this aspect of the economy was much more apparent to Americans.

A Reexamination of the False Consensus Effect: Projection or Conformity?

(with Lori Gauthier)

Most explanations of the false consensus effect (FCE) presume that people exaggerate the extent to which others share their own attitudes, as the result of projection from self to others. Surprisingly, the accumulated evidence on this issue has rarely tested this proposition directly. We returned FCE research to its most basic level by examining the relation between the effect’s key variables. Study 1 manipulated respondents’ attitudes to test whether those attitudes shape their perceptions of others. Study 2 adopted procedures analogous to the Asch experimental paradigm to test whether perceptions of others’ attitudes influence one’s own attitude via conformity. Altering participants’ attitudes did not impact their perceptions of others’ attitudes, but manipulating participants’ perceptions of others’ attitudes did influence their own attitudes. These results challenge the widely held belief that people project their attitudes onto others and suggest that conformity drives the FCE.

The Impact of Policy Change Threat on Financial Contributions to Interest Groups

(Joanne M. Miller, Jon A. Krosnick, Allyson Holbrook, Laura Lowe, Alex Tahk)

Many scholars have proposed that citizen activism intended to influence government policy has one of its primary motivations in citizen dissatisfaction with current life circumstances.  Less prominent—and never directly tested—is the notion that the perceived threat of undesirable policy change could motivate political behaviors aimed at averting the threat.  In this research, we explore the relationship between threat perception and financial contributions to interest groups, using data from three representative sample surveys.  Because the financial contributions do not appear to fit any standard parametric distribution, we rely on non-parametric tests of differences in means and on regressions with bootstrapped standard errors to ensure that our findings are robust to assumptions about the distribution of financial contributions.
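
A minimal sketch of the bootstrapped-standard-error logic, using simulated skewed contribution data rather than the survey data themselves:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: a binary threat perception and dollar
# contributions, which are skewed rather than normally distributed.
n = 400
threat = rng.integers(0, 2, size=n)
contributions = rng.exponential(scale=20, size=n) * (1 + threat)

def ols_slope(x, y):
    """Slope of a bivariate OLS regression of y on x."""
    x = np.asarray(x, dtype=float)
    return np.cov(x, y)[0, 1] / x.var(ddof=1)

# Resample respondents with replacement; the spread of the resampled
# slopes is a standard error that makes no parametric assumption
# about the distribution of contributions.
boot = [ols_slope(threat[idx], contributions[idx])
        for idx in (rng.integers(0, n, size=n) for _ in range(2000))]
print(f"slope = {ols_slope(threat, contributions):.2f}, "
      f"bootstrap SE = {np.std(boot, ddof=1):.2f}")
```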

The research demonstrates that the threat of an undesired policy change did motivate financial contributions to interest groups.  There is also weak evidence that policy change threat motivated activism most when people attached great personal importance to a policy issue and when they had sufficient resources to permit participation.

Self-Interest, Values, Involvement, and Susceptibility to Attitude Change

(With Stephanie Lampron & Eric Shaeffer)

In the social psychological literature on attitude change, there has been an ongoing debate regarding how being personally involved in an issue impacts attitude change.  One perspective has been that involvement in an issue can be separated into two types: one based on self-interest and one based on value-relevance.  It was hypothesized that these two constructs would have separate effects on attitude change.  The other perspective is that involvement need not be separated into its more specific components, because they will all lead to the same effect on attitude change.  No previous studies have provided an adequate framework from which to test these hypotheses.

An experimental framework was created in which to test the effects of self-interest and value-relevance on attitude change.  In the study, participants were led to connect the issue of comprehensive graduation exams to either their self-interest or their values before they read either a strong or a weak counter-attitudinal message regarding the issue, which consisted of appeals to both self-interest and values.

The findings provide preliminary support for the idea that value-relevance interacts with message strength in the same manner as self-interest to affect attitude change.  In addition, high value-relevance also led to a greater resistance to attitude change than high self-interest when messages were weak, indicating that separate effects may also be plausible.

Currently, this study is being re-run with new self-interest and value-relevance manipulations as well as clearer definitions of self-interest and values. The results of this study will hopefully elucidate the earlier findings while providing greater insight into how involvement is best conceptualized as a variable in attitude change.

Attitude Strength, Threat, and Political Activism

(with Joanne Miller & Laura Lowe)

A great deal of research has explored the psychological origins of citizen activism intended to influence government policy in democratic societies.  Although various scholars have speculated that one motivator of such behavior is the perception of a credible threat of policy change in an undesirable direction, this hypothesis has never been directly tested using data on individual citizens and their perceptions.  Our research tests the hypothesis that policy change threat leads to activism.  More specifically, we are testing the effect of a citizen’s belief that a piece of legislation that he/she does not want to see passed (whether for health or safety reasons, or just personal preference) might actually become law.

We suspect that such threats will be more likely to inspire activism among some individuals than others, one determining factor being the personal importance of the issue to the individual. To a citizen who attaches a great deal of importance to an issue, an undesirable policy change would be personally devastating.  But no matter how much significance people attach to an issue, they cannot be immediately and vigorously responsive to a threat unless they have the available resources in terms of time and/or money.  Resources have, of course, been recognized as very important determinants of activism.  But in all past research, resources have been statistically treated as having direct effects on activism, as if simply having more money or more free time, in and of itself, directly inspires political action.  Our reasoning implies instead that the effect of resources on activism depends on the presence of a threat, a hypothesis that, surprisingly, has never been explicitly tested; we have now done so.

We have conducted a series of studies to test the following hypotheses: 1) that policy change threat leads to political activism; 2) that the effect of policy change threat on activism is moderated by personal importance; and 3) that the effect of policy change threat on activism is moderated by income.  These studies have employed a variety of methods, including nationally representative telephone surveys and a field experiment, and have examined threat in a variety of contexts, including attitudes toward the environment, abortion, and presidential candidates.

The results of these studies have been quite compelling: in all cases, the threat of an undesirable policy change motivated activism aimed at preventing the change.  In addition, we have found that personal importance exacerbates the effect of threat on activism, as does income, when the type of activism examined is financial contributions (an act for which income is a necessary resource).  We are currently conducting a nationally representative telephone survey experiment and a laboratory experiment to extend our findings and provide additional causal evidence of the effect of threat on activism.

Contingent Valuation

Valuing environmental goods via contingent valuation: comparing referendum and direct question

(with Lisanne Wichgers and Eric Shaeffer)

During the last several decades, numerous surveys have been done to assess the monetary value that people place on the existence of various natural resources that have been damaged by human actions. For example, the Exxon Valdez oil tanker damaged ecosystems in Prince William Sound, Alaska, and surveys later sought to ascertain the value of those ecosystems for litigation. This sort of survey research has been called “contingent valuation”.

A heated debate has been raging in the contingent valuation literature about how best to measure these monetary values. Some scholars argue that it is best to ask respondents referendum questions about whether they would be willing to pay a specified amount of money to prevent the same sort of damage from happening again in the future. Different respondents are randomly assigned to be asked about different amounts, and everyone’s answers are used to assess the public’s total willingness to pay. Other scholars argue that this approach may bias estimates of willingness to pay by anchoring thinking on the particular values offered. Therefore, these scholars suggest, it is preferable to ask direct questions about how much respondents would be willing to pay.
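
To see how the referendum format yields a willingness-to-pay (WTP) estimate even though no respondent ever states an amount, here is a sketch of one standard approach in this literature (a logit acceptance curve integrated over bids); the data and dollar amounts are invented:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

# Hypothetical referendum data: each respondent sees one randomly
# assigned dollar amount ("bid") and votes yes or no on paying it.
bids = rng.choice([5.0, 10.0, 25.0, 50.0, 100.0], size=1000)
true_wtp = rng.exponential(scale=30, size=1000)   # unobserved in practice
yes = (true_wtp >= bids).astype(int)

# Model the probability of accepting a bid as a function of its size.
model = LogisticRegression().fit(bids.reshape(-1, 1), yes)

# Mean WTP is the area under the acceptance ("survival") curve.
grid = np.linspace(0, 300, 3001)
p_accept = model.predict_proba(grid.reshape(-1, 1))[:, 1]
mean_wtp = ((p_accept[1:] + p_accept[:-1]) / 2 * np.diff(grid)).sum()
print(f"Estimated mean WTP: ${mean_wtp:.2f}")
```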

We initiated a project assessing the impact of anchoring on responses to these sorts of referendum questions. By experimentally comparing referendum questions to direct questions, we find that they do yield significantly different results. In line with prior research, the referendum question yielded higher mean maximum willingness to pay (WTP) and a higher perceived cost of the program than the direct question. Furthermore, WTP increased when the scope of the program increased from one polluted river to three. As expected, the effect of scope was mediated by perceived cost only for the direct question, not for the referendum question. This suggests that the direct question capped respondents’ WTP at their estimates of the program’s cost, thereby risking under-reports of actual WTP, whereas referendum questions may more fully reveal it.

Valuing Environmental Goods via Contingent Valuation: Comparing Referendum and Direct Question

(with Eric Shaeffer, Stephanie Lampron, Penny Visser, Trevor Thompson, and Daniel Schneider)

This companion study addresses the same debate described in the preceding project: whether the monetary value people place on damaged natural resources is better measured by referendum questions with randomly assigned dollar amounts or by direct questions about maximum willingness to pay.

We initiated a project assessing the impact of anchoring on responses to these sorts of referendum questions. By experimentally comparing referendum questions to direct questions, we find that they do yield significantly different results. On average, the referendum questions yielded higher estimates of mean maximum willingness to pay than did direct questions.  However, the referendum questions and direct questions both manifested equivalently high levels of correlational validity via their theoretically sensible relations with an array of predictors.

News Media Influence

News Media Agenda-Setting and Priming

(with Joanne Miller)

A great deal of literature has shown that the news media have the ability to influence people’s political judgments. One particular media effect is agenda-setting, the notion that by paying attention to a particular national problem, the media can induce people to cite it as the most important national problem. A second media effect is priming, the idea that prolonged focus on a political issue can lead Americans to derive their overall evaluations of their President’s job performance primarily from his handling of that issue.

We tested a widely-held assumption about the cognitive mechanism responsible for these effects: accessibility. In short, scholars have presumed that media attention to an issue makes attitudes and beliefs about that issue especially accessible, which leads people to select the issue as the country’s most important and to place weight on it when evaluating presidential performance. However, our laboratory studies clearly refuted these hypotheses by showing that although news media attention to a problem did increase the accessibility of relevant attitudes and knowledge, this increase in accessibility did not mediate either effect. Furthermore, agenda-setting did not mediate priming; that is, considering a problem to be the nation’s most important did not lead people to place greater weight on it when evaluating presidential performance. These findings challenge prevailing wisdom about these news media effects and encourage future research seeking to identify the mechanisms that are in fact at work.
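
The mediation test described here follows standard product-of-coefficients logic; a minimal sketch with simulated data (variable names hypothetical):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)

# Hypothetical mediation check: does media exposure (X) change issue
# weighting (Y) *through* attitude accessibility (M)?
n = 600
exposure = rng.integers(0, 2, size=n).astype(float)
accessibility = 0.8 * exposure + rng.normal(0, 1, n)    # X raises M
weighting = 0.5 * exposure + 0.0 * accessibility + rng.normal(0, 1, n)

# Path a: effect of exposure on accessibility.
a = sm.OLS(accessibility, sm.add_constant(exposure)).fit().params[1]
# Path b: effect of accessibility on weighting, controlling for exposure.
X = sm.add_constant(np.column_stack([exposure, accessibility]))
b = sm.OLS(weighting, X).fit().params[2]

# The indirect effect a*b is near zero here, mirroring the finding that
# accessibility rose with coverage but did not carry the priming effect.
print(f"indirect effect a*b = {a * b:.3f}")
```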

News Media Priming: Derivation or Rationalization?

(with Brent Bannon, Stanford University, and Laura Brannon, Kansas State University)

A great deal of evidence is consistent with the news media priming hypothesis, but no past study has yet used survey data to directly test whether news media coverage of an issue increases the consistency between domain-specific evaluations and overall evaluations of the president via derivation, rationalization, or both.  The research reported here applied covariance structure modeling to longitudinal data from the 1990-1992 American National Election Study panel survey to gauge the impact of the surge in media coverage of the economy between 1991 and 1992 on derivation and rationalization of overall evaluations of President George H. W. Bush.  All analytic approaches yielded support for the same conclusion: increased media attention to the economy increased derivation and reduced rationalization, consistent with presumptions about the workings of news media priming.  The reduction in rationalization caused by media coverage of an issue most likely means that past studies have under-estimated the magnitude of priming.

Racism and Prejudice

The Impact of Interviewer Race and Gender on Survey Results:  The Case of Pre-Election Polls

(with Mario Callegaro, Robert P. Daves, Femke De Keulenaer, Daniel Schneider, and Yphtach Lelkes)

Most discussions of bias due to the race and gender of a survey interviewer center on the quality and veridicality of the responses given by the survey respondent. That is, respondents are thought to bow to social desirability pressures and give answers that they believe will be less offensive or more appealing to the interviewer given his/her race or gender. However, an often overlooked but potentially damaging source of error involves interviewer recruitment differences due to race and gender.  That is, interviewers of some races or genders may be especially effective at recruiting respondents with matching races or genders, thereby inducing sample composition bias.

Hierarchical regression models were used to disentangle various sources of interviewer error in a survey conducted in Minneapolis during the 2001 mayoral election, when a black female incumbent was challenged by a white male. Stated vote intention and favorability toward the incumbent increased when the interviewer was a black female. Furthermore, survey recruitment also varied with the race and gender of the interviewer, as expected.
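
A minimal sketch of the kind of hierarchical (multilevel) model involved, with respondents nested within interviewers; the data and effect sizes are invented for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)

# Hypothetical data: 40 interviewers, 25 respondents each; the
# predictor codes whether the interviewer was a black female.
n_int, n_per = 40, 25
interviewer = np.repeat(np.arange(n_int), n_per)
black_female = np.repeat(rng.integers(0, 2, n_int), n_per)
int_effect = np.repeat(rng.normal(0, 0.5, n_int), n_per)
favorability = (5 + 0.6 * black_female + int_effect
                + rng.normal(0, 1, n_int * n_per))

df = pd.DataFrame({"favorability": favorability,
                   "black_female": black_female,
                   "interviewer": interviewer})

# A random intercept per interviewer separates interviewer-level
# variance from the fixed effect of interviewer race/gender.
fit = smf.mixedlm("favorability ~ black_female", df,
                  groups=df["interviewer"]).fit()
print(fit.summary())
```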

Thus, this study indicates that interviewer race and gender may introduce bias in two different steps: recruitment and response generation.

Best Practices in Science

Best Practices in Science Conference

The aim of science is to accurately depict some aspect of the world. Recently, problems in science have been on the rise, from failures to replicate findings in many different fields to questionable research practices and unethical behavior. After many discussions at Stanford University’s Center for Advanced Study in the Behavioral Sciences, a group of scholars, led by Jon Krosnick, Lee Jussim, and Simine Vazire, convened the Best Practices in Science Conference at Stanford University on June 18-19, 2015. The conference gathered experts to present research and discuss problems in science, their causes, and possible solutions.

We are writing a report that summarizes the insights from the discussions at the conference and provides empirical research questions designed to examine the extent of the problems, their sources or causes, and the impact of possible solutions. These research questions can be used in future studies to empirically investigate how scientific practice can best be improved, and to further develop the behavioral science behind scientific practice.

Funding for the conference was provided by the Fetzer Franklin Fund. 

The Replication Project

(with Sebastian Lundmark)

The reproducibility of scientific findings has become an issue of growing concern in science. Disciplines including medicine, psychology, genetics, and biology have been repeatedly challenged by findings that are not as robust as they initially appeared. Shrinking effects and outright failures of replication raise questions not only about the specific findings they challenge, but more generally about the confidence that we can have in published results that have yet to be verified independently.

The Replication Project aims to investigate these issues through a collaboration among the University of California, Santa Barbara; the University of California, Berkeley; the University of Virginia; and Stanford University. Four research teams, one at each university, will individually produce new experimental treatment effects, which they will share with the other universities, which will in turn attempt to replicate the initial findings. This structured collaboration allows the four teams to participate in one of the first multi-site, multi-replication meta-science experiments in social psychology, with the goal of investigating the reproducibility and potential decline of brand-new scientific findings.

The project is funded by the Fetzer Franklin Fund.

Port Columbus International Airport – Customer Experience Survey – Wave 1

At the request of the Columbus Airport Authority, a research team at the Ohio State University under the direction of Professor Jon Krosnick conducted a survey of a representative sample of passengers arriving at and departing from Port Columbus International Airport between February 16 and March 30, 2002. Interviews lasted approximately 10 minutes on average and were conducted by trained interviewers equipped with palm-top computers in which the questionnaire was programmed. The questions measured the experiences people had and the services they used while at the airport and asked for evaluations of the quality of those experiences.


Dr. Krosnick’s principal collaborator on this project, Amanda Scott, is now a principal researcher at and co-owner of The Strategy Team, based in Columbus, Ohio.