Respondent behavior in conjoint studies often deviates from the assumptions of random utility theory. We refer to deviations from normative choice behavior as data pathologies. A variety of models have been developed that attempt to correct for specific pathologies (i.e., screening rules, respondent quality, attribute non-attendance, etc.). While useful, these approaches tend to be both conceptually complex and computational intensive. As such, these approaches have not widely diffused into the practice of marketing research. In this paper we draw on innovations in machine learning to develop a practical approach that relies on (clever) randomization strategies and ensembling to simultaneously accommodate multiple data pathologies in a single model. We provide tips and tricks on how to implement this approach in practice.
/code
Scripts with prefixes (e.g.,01_import-data.R
,02_clean-data.R
) and functions in/code/src
./data
Simulated and real data, the latter not pushed./figures
PNG images and plots./output
Output from model runs, not pushed./presentations
Presentation slides, without knitted PDFs pushed./private
A catch-all folder for miscellaneous files, not pushed./renv
Project library, once initialized (see below)./writing
Case studies and the paper, without its knitted PDF pushed.renv.lock
Information on the reproducible environment.
Every package you install lives in your system library, accessible to
all projects. However, packages change. Add a reproducible environment
by creating a project library using the {renv}
package.
- Initialize the project library once using
renv::init()
. - Once you’ve installed packages, add them to the project library using
renv::snapshot()
. - If a project library already exists, install the associated packages
with
renv::restore()
.
For more details on using GitHub, Quarto, etc. see Research Assistant Training.