Choice modelling
Choice modelling attempts to model the decision process of an individual or segment via revealed preferences or stated preferences made in a particular context or contexts. Typically, it uses discrete choices to infer the positions of items on some relevant latent scale. Many alternative models exist in econometrics, marketing, sociometrics and other fields, including utility maximization, optimization applied to consumer theory, and a plethora of other identification strategies, which may be more or less accurate depending on the data, sample, hypothesis and the particular decision being modelled. In addition, choice modelling is regarded as the most suitable method for estimating consumers' willingness to pay for quality improvements in multiple dimensions.
Related terms
A number of terms are considered synonyms for the term choice modelling. Some are accurate, and some are used in industry applications although considered inaccurate in academia. These include the following:
- Stated preference discrete choice modeling
- Discrete choice
- Choice experiment
- Stated preference studies
- Conjoint analysis
- Controlled experiments
Theoretical background
The theory behind choice modelling was developed independently by economists and mathematical psychologists. The origins of choice modelling can be traced to Thurstone's research into food preferences in the 1920s and to random utility theory. In economics, random utility theory was then developed by Daniel McFadden, and in mathematical psychology primarily by Duncan Luce and Anthony Marley. In essence, choice modelling assumes that the utility an individual derives from item A over item B is a function of the frequency with which they choose item A over item B in repeated choices. Because he used the normal distribution, Thurstone was unable to generalise this binary choice into a multinomial choice framework, which is why the method languished for over 30 years. However, from the 1960s through the 1980s the method was axiomatised and applied in a variety of types of study.
Distinction between revealed and stated preference studies
Choice modelling is used in both revealed preference (RP) and stated preference (SP) studies. RP studies use the choices already made by individuals to estimate the value they ascribe to items: they "reveal their preferences, and hence values, by their choices". SP studies use the choices individuals make under experimental conditions to estimate these values: they "state their preferences via their choices". McFadden successfully used revealed preferences to predict the demand for the Bay Area Rapid Transit before it was built. Luce and Marley had previously axiomatised random utility theory but had not used it in a real world application; furthermore, they spent many years testing the method in SP studies involving psychology students.
History
McFadden's work earned him the Nobel Memorial Prize in Economic Sciences in 2000. However, much of the work in choice modelling had for almost 20 years been proceeding in the field of stated preferences. Such work arose in various disciplines, originally transport and marketing, due to the need to predict demand for new products that were potentially expensive to produce. This work drew heavily on the fields of conjoint analysis and design of experiments, in order to:
- Present to consumers goods or services that were defined by particular features that had levels, e.g. "price" with levels "$10, $20, $30"; "follow-up service" with levels "no warranty, 10 year warranty";
- Present configurations of these goods that minimised the number of choices needed in order to estimate the consumer's utility function.
Relationship with conjoint analysis
Choice modelling from the outset suffered from a lack of standardisation of terminology and all the terms given above have been used to describe it. However, the largest disagreement has proved to be geographical: in the Americas, following industry practice there, the term "choice-based conjoint analysis" has come to dominate. This reflected a desire that choice modelling reflect the attribute and level structure inherited from conjoint analysis, but show that discrete choices, rather than numerical ratings, be used as the outcome measure elicited from consumers. Elsewhere in the world, the term discrete choice experiment has come to dominate in virtually all disciplines. Louviere and colleagues in environmental and health economics came to disavow the American terminology, claiming that it was misleading and disguised a fundamental difference discrete choice experiments have from traditional conjoint methods: discrete choice experiments have a testable theory of human decision-making underpinning them, whilst conjoint methods are simply a way of decomposing the value of a good using statistical designs from numerical ratings that have no psychological theory to explain what the rating scale numbers mean.
Designing a choice model
Designing a choice model or discrete choice experiment (DCE) generally involves the following steps:
- Identifying the good or service to be valued;
- Deciding on what attributes and levels fully describe the good or service;
- Constructing an experimental design that is appropriate for those attributes and levels, either from a design catalogue, or via a software program;
- Constructing the survey, replacing the design codes with the relevant attribute levels;
- Administering the survey to a sample of respondents in any of a number of formats including paper and pen, but increasingly via web surveys;
- Analysing the data using appropriate models, often beginning with the multinomial logistic regression model, given its attractive properties in terms of consistency with economic demand theory.
Identifying the good or service to be valued
- the research question in an academic study, or
- the needs of the client
Deciding on what attributes and levels fully describe the good or service
Constructing an experimental design that is appropriate for those attributes and levels, either from a design catalogue, or via a software program
A strength of discrete choice experiments (DCEs) and conjoint analyses is that they typically present a subset of the full factorial. For example, a phone with two brands, three shapes, three sizes and four amounts of memory has 2 × 3 × 3 × 4 = 72 possible configurations. This is the full factorial and in most cases is too large to administer to respondents. Subsets of the full factorial can be produced in a variety of ways, but in general they share one aim: to enable estimation of a certain limited number of parameters describing the good, such as main effects, two-way interactions, etc. This is typically achieved by deliberately confounding higher order interactions with lower order interactions. For example, two-way and three-way interactions may be confounded with main effects. This has the following consequences:
- The number of profiles is significantly reduced;
- A regression coefficient for a given main effect is unbiased if and only if the confounded terms are zero;
- A regression coefficient is biased in an unknown direction and with an unknown magnitude if the confounded interaction terms are non-zero;
- No correction can be made at the analysis to solve the problem, should the confounded terms be non-zero.
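The size of the full factorial in the phone example can be checked by enumerating it. The following is a minimal sketch; the level labels (brand names, shapes, memory sizes) are hypothetical placeholders, and only the level counts (2, 3, 3 and 4) come from the text.

```python
from itertools import product

# Attribute levels for the phone example. The labels are hypothetical;
# only the number of levels per attribute (2, 3, 3, 4) is taken from the text.
attributes = {
    "brand": ["A", "B"],
    "shape": ["bar", "flip", "slide"],
    "size": ["small", "medium", "large"],
    "memory_gb": [64, 128, 256, 512],
}

# The full factorial is the Cartesian product of all attribute levels.
full_factorial = [
    dict(zip(attributes, combo)) for combo in product(*attributes.values())
]

print(len(full_factorial))  # 72
```

A fractional design is then a chosen subset of these 72 profiles rather than a random draw from them.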
Designs are available from catalogues and statistical programs. Traditionally they had the property of orthogonality, whereby all attribute levels can be estimated independently of each other. This ensures zero collinearity and can be explained using the following example.
Imagine a car dealership that sells both luxury cars and used low-end vehicles. Using the utility maximisation principle and assuming an MNL model, we hypothesise that the decision to buy a car from this dealership is the sum of the individual contribution of each of the following to the total utility.
- Price
- Marque
- Origin
- Performance
Sales data from this dealership are collinear, however, because the cars on offer are either:
- high performance, expensive German cars
- low performance, cheap American cars
An experimental design in a choice experiment is a strict scheme for controlling and presenting hypothetical scenarios, or choice sets, to respondents. For the same experiment, different designs could be used, each with different properties. The best design depends on the objectives of the exercise.
It is the experimental design that drives the experiment and the ultimate capabilities of the model. Many very efficient designs exist in the public domain that allow near optimal experiments to be performed.
For example, the Latin square 16^17 design allows the estimation of all main effects of a product that could have up to 16^17 configurations. Furthermore, this could be achieved within a sample frame of only around 256 respondents.
Below is an example of a much smaller design. This is a 3^4 main effects design.
0 | 0 | 0 | 0 |
0 | 1 | 1 | 2 |
0 | 2 | 2 | 1 |
1 | 0 | 1 | 1 |
1 | 1 | 2 | 0 |
1 | 2 | 0 | 2 |
2 | 0 | 2 | 2 |
2 | 1 | 0 | 1 |
2 | 2 | 1 | 0 |
This design would allow the estimation of main effects utilities from 81 possible product configurations assuming all higher order interactions are zero. A sample of around 20 respondents could model the main effects of all 81 possible product configurations with statistically significant results.
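The orthogonality of the nine-row design above can be verified directly. The sketch below uses one common operational definition, stated here as an assumption: a design is orthogonal if, for every pair of columns, each combination of levels appears equally often. The function name `is_orthogonal` is illustrative only.

```python
from collections import Counter
from itertools import combinations

# The nine-row, four-attribute design from the table above
# (each attribute has levels coded 0, 1, 2).
design = [
    (0, 0, 0, 0),
    (0, 1, 1, 2),
    (0, 2, 2, 1),
    (1, 0, 1, 1),
    (1, 1, 2, 0),
    (1, 2, 0, 2),
    (2, 0, 2, 2),
    (2, 1, 0, 1),
    (2, 2, 1, 0),
]

def is_orthogonal(rows):
    """Return True if every pair of columns contains each level pair equally often."""
    n_cols = len(rows[0])
    for i, j in combinations(range(n_cols), 2):
        pair_counts = Counter((row[i], row[j]) for row in rows)
        n_levels_i = len({row[i] for row in rows})
        n_levels_j = len({row[j] for row in rows})
        all_pairs_present = len(pair_counts) == n_levels_i * n_levels_j
        equally_often = len(set(pair_counts.values())) == 1
        if not (all_pairs_present and equally_often):
            return False
    return True

print(is_orthogonal(design))  # True
print(len(design), 3 ** 4)    # 9 81
```

Nine profiles therefore stand in for the 81 possible configurations, at the cost of assuming that all interactions are zero.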
Some examples of other experimental designs commonly used:
- Balanced incomplete block designs
- Random designs
- Main effects
- Higher order interaction designs
- Full factorial
It is worth reiterating, however, that small designs that estimate main effects typically do so by deliberately confounding higher order interactions with the main effects. This means that unless those interactions are zero in practice, the analyst will obtain biased estimates of the main effects. Furthermore, the analyst has no way of testing this, and no way of correcting it in analysis. This emphasises the crucial role of design in DCEs.
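The bias described above can be illustrated with a deliberately tiny case. The sketch below is an assumption-laden toy, not a design from the text: it uses a half fraction of a two-level, three-attribute design (levels coded -1/+1) in which attribute C is set equal to the product A×B, and a hypothetical noise-free utility function in which C has no main effect but the A×B interaction is non-zero.

```python
from itertools import product

# Half fraction of a 2^3 design with levels coded -1/+1: C is generated as A*B,
# so only 4 of the 8 full-factorial profiles are used.
runs = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

# Hypothetical "true" utility: A has a main effect of 1.0, C has NO main effect,
# but the A-by-B interaction is 2.0.
def true_utility(a, b, c):
    return 1.0 * a + 0.0 * c + 2.0 * a * b

responses = [true_utility(a, b, c) for a, b, c in runs]

def main_effect(column):
    """Least-squares main-effect estimate for a -1/+1 column of an orthogonal design."""
    return sum(run[column] * y for run, y in zip(runs, responses)) / len(runs)

effect_a = main_effect(0)
effect_c = main_effect(2)

print(effect_a)  # 1.0  (correct)
print(effect_c)  # 2.0  (true value is 0.0: the A-by-B interaction is absorbed)
```

Because the C column is identical to the A×B column in this fraction, no analysis of these four profiles can tell the two effects apart.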
Constructing the survey
Constructing the survey typically involves:
- Doing a "find and replace" in order that the experimental design codes are replaced by the attribute levels of the good in question.
- Putting the resulting configurations into a broader survey that may include questions pertaining to sociodemographics of the respondents. This may aid in segmenting the data at the analysis stage: for example, males may differ from females in their preferences.
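The "find and replace" step can be sketched as a simple lookup from design codes to attribute levels. The attributes and level labels below are hypothetical (the price levels echo the example given earlier), and only the first three rows of a coded design are shown.

```python
# Hypothetical attributes, each with three levels indexed by design code 0, 1, 2.
levels = {
    "price": ["$10", "$20", "$30"],
    "warranty": ["no warranty", "1 year warranty", "10 year warranty"],
    "colour": ["black", "silver", "blue"],
    "delivery": ["collect in store", "standard post", "next day courier"],
}

# First three rows of a coded design; each tuple holds one code per attribute.
coded_rows = [
    (0, 0, 0, 0),
    (0, 1, 1, 2),
    (0, 2, 2, 1),
]

def decode(row):
    """Replace each design code with the corresponding attribute level."""
    return {attr: levels[attr][code] for attr, code in zip(levels, row)}

profiles = [decode(row) for row in coded_rows]

for profile in profiles:
    print(profile)
```

Each decoded profile can then be laid out as one alternative in a choice set within the wider questionnaire.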
Administering the survey to a sample of respondents in any of a number of formats including paper and pen, but increasingly via web surveys
Analysing the data using appropriate models, often beginning with the multinomial logistic regression model, given its attractive properties in terms of consistency with economic demand theory
Analysing the data from a DCE requires the analyst to assume a particular type of decision rule - or functional form of the utility equation in economists' terms. This is usually dictated by the design: if a main effects design has been used then two-way and higher order interaction terms cannot be included in the model. Regression models are then typically estimated. These often begin with the conditional logit model - traditionally, although slightly misleadingly, referred to as the multinomial logistic regression model by choice modellers. The MNL model converts the observed choice frequencies into utility estimates via the logistic function. The utility associated with every attribute level can be estimated, thus allowing the analyst to construct the total utility of any possible configuration. However, a DCE may alternatively be used to estimate non-market environmental benefits and costs.
Strengths
- Forces respondents to consider trade-offs between attributes;
- Makes the frame of reference explicit to respondents via the inclusion of an array of attributes and product alternatives;
- Enables implicit prices to be estimated for attributes;
- Enables welfare impacts to be estimated for multiple scenarios;
- Can be used to estimate the level of customer demand for alternative 'service products' in non-monetary terms; and
- Potentially reduces the incentive for respondents to behave strategically.
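As an illustration of how implicit prices are commonly derived, the sketch below assumes a linear-in-parameters utility function that includes a price attribute, in which case the implicit price of an attribute is the negative ratio of its coefficient to the price coefficient. The coefficient values and attribute names are hypothetical, not estimates from any study.

```python
# Hypothetical coefficients from a fitted choice model with linear utility.
# "price" is utility per dollar; the others are utility per unit of the attribute.
coefficients = {
    "price": -0.25,
    "warranty_10yr": 1.5,
    "premium_brand": 0.75,
}

def implicit_price(coefs, attribute, price_key="price"):
    """Marginal willingness to pay: -(attribute coefficient) / (price coefficient)."""
    return -coefs[attribute] / coefs[price_key]

wtp_warranty = implicit_price(coefficients, "warranty_10yr")
wtp_brand = implicit_price(coefficients, "premium_brand")

print(wtp_warranty)  # 6.0 dollars
print(wtp_brand)     # 3.0 dollars
```

Because the result is a ratio of two coefficients, it is unaffected by any common scale factor applied to both.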
Weaknesses
- Discrete choices provide only ordinal data, which provides less information than ratio or interval data;
- Inferences from ordinal data, to produce estimates on an interval/ratio scale, require assumptions about error distributions and the respondent's decision rule;
- Fractional factorial designs used in practice deliberately confound two-way and higher order interactions with lower order estimates in order to make the design small: if the higher order interactions are non-zero then main effects are biased, with no way for the analyst to know or correct this ex post;
- Non-probabilistic decision-making by the individual violates random utility theory: under a random utility model, utility estimates become infinite;
- There is one fundamental weakness of all limited dependent variable models such as logit and probit models: the means and variances on the latent scale are perfectly confounded. In other words they cannot be separated.
The mean-variance confound
A large utility estimate for an item can therefore mean any of the following:
- Respondents place the item high up on the latent scale, or
- Respondents do not place the item high up on the scale BUT they are very certain of their preferences, consistently choosing the item over others presented alongside, or
- Some combination of the two.
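The confound can be made concrete with a small numerical sketch. It assumes the standard multinomial logit form, in which choice probabilities depend on utilities only through their product with a scale factor that is inversely related to the error variance. The utilities, scale values and function name below are hypothetical.

```python
import math

def choice_probabilities(utilities, scale=1.0):
    """Multinomial logit probabilities; a larger scale means a smaller error variance."""
    scaled = [scale * u for u in utilities]
    largest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Case 1: the first item sits high on the latent scale, with noisy choices.
high_value = choice_probabilities([2.0, 0.0, 0.0], scale=1.0)

# Case 2: the first item sits only slightly higher, but respondents are very consistent.
high_certainty = choice_probabilities([0.5, 0.0, 0.0], scale=4.0)

print(high_value)
print(high_certainty)  # identical to high_value
```

The two cases generate the same choice probabilities, so choice data alone cannot distinguish a high mean from a low variance.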
Versus traditional ratings-based conjoint methods
Major problems with ratings questions that do not occur with choice models are:
- no trade-off information. A risk with ratings is that respondents tend not to differentiate between perceived 'good' attributes and rate them all as attractive.
- variant personal scales. Different individuals value a '2' on a scale of 1 to 5 differently. Aggregation of the frequencies of each of the scale measures has no theoretical basis.
- no relative measure. How does an analyst compare something rated a 1 to something rated a 2? Is one twice as good as the other? Again there is no theoretical way of aggregating the data.
Other types
Ranking
Rankings do tend to force the individual to indicate relative preferences for the items of interest. Thus the trade-offs between these can, as in a DCE, typically be estimated. However, ranking models must test whether the same utility function is being estimated at every ranking depth: e.g. the same estimates must result from the bottom rank data as from the top rank data.
Best–worst scaling
Best–worst scaling is a well-regarded alternative to ratings and ranking. It asks people to choose their most and least preferred options from a range of alternatives. By subtracting or integrating across the choice probabilities, utility scores for each alternative can be estimated on an interval or ratio scale, for individuals and/or groups. Various psychological models may be utilised by individuals to produce best-worst data, including the MaxDiff model.
Uses
Choice modelling is particularly useful for:
- Predicting uptake and refining new product development
- Estimating the implied willingness to pay for goods and services
- Product or service viability testing
- Estimating the effects of product characteristics on consumer choice
- Variations of product attributes
- Understanding brand value and preference
- Demand estimates and optimum pricing