Assessing diabetes-relevant data provided by undergraduate and crowdsourced web-based survey participants for honesty and accuracy

Background: To eliminate health disparities, research will depend on our ability to reach select groups of people (eg, samples of a particular racial or ethnic group with a particular disease); unfortunately, researchers often experience difficulty obtaining high-quality data from samples of sufficient size.
Objective: Past studies utilizing Amazon Mechanical Turk (MTurk) applaud its diversity, so our initial objective was to capitalize on MTurk's diversity to investigate psychosocial factors related to diabetes self-care.
Methods: In Study 1, a “Health Survey” was posted on MTurk to examine diabetes-relevant psychosocial factors. The survey was restricted to individuals who were 18 years of age or older with diabetes. Detection of irregularities in the data, however, prompted an evaluation of the quality of MTurk health-relevant data. This ultimately led to Study 2, which utilized an alert statement to improve conscientious behavior, or the likelihood that participants would be thorough and diligent in their responses. Trap questions were also embedded to assess conscientious behavior.
Results: In Study 1, of 4165 responses, 1246 were generated from 533 unique IP addresses completing the survey multiple times within close temporal proximity. Ultimately, only 252 responses were found to be acceptable. Further analyses indicated additional quality concerns with this subsample. In Study 2, as compared with the MTurk sample (N=316), the undergraduate sample (N=300) included more females and fewer individuals who were married. The samples did not differ with respect to race. Although the presence of an alert resulted in fewer trap failures (mean=0.07) than when no alert was present (mean=0.11), this difference failed to reach significance: F(1,604)=2.5, P=.11, η²=.004, power=.35. The modal trap failure response was zero, while the mean was 0.092 (SD=0.32). There were a total of 60 trap failures in a context where the potential could have exceeded 16,000.
Conclusions: Published studies that utilize MTurk participants are rapidly appearing in the health domain. While MTurk may have the potential to be more diverse than an undergraduate sample, our efforts did not meet the criteria for what would constitute a diverse sample in and of itself. Because some researchers have experienced successful data collection on MTurk while others report disastrous results, Kees et al recently identified the types and magnitude of cheating behavior occurring on Web-based platforms as one essential area of research. The present studies can contribute to this dialogue, and alternately provide evidence of disaster and success. Moving forward, it is recommended that researchers employ best practices in survey design and deliberately embed trap questions to assess participant behavior. We would strongly suggest that standards be in place for publishing the results of Web-based surveys: standards that protect against publication unless there are suitable quality assurance tests built into the survey design, distribution, and analysis.

Publication Name

JMIR Diabetes
