Strategies for the identification and prevention of survey fraud: Data analysis of a web-based survey

Document Type

Journal Article

Publication Date

7-1-2021

Journal

JMIR Cancer

Volume

7

Issue

3

DOI

10.2196/30730

Keywords

Cancer patients; Cancer survivors; CAPTCHA; COVID-19; Data integrity; Fraud; Fraudulent responses; Online surveys; Pandemic; Research methods; Survey

Abstract

Background: To assess the impact of COVID-19 on cancer survivors, we fielded a survey promoted via email and social media in winter 2020. Examination of the data showed suspicious patterns that warranted serious review.

Objective: The aim of this paper is to review the methods used to identify and prevent fraudulent survey responses.

Methods: As precautions, we included a Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA), a hidden question, and instructions for respondents to type a specific word. To identify likely fraudulent data, we defined a priori indicators that warranted elimination or suspicion. If a survey contained two or more suspicious indicators, the survey was eliminated. We examined differences between the retained and eliminated data sets.

Results: Of the total responses (N=1977), nearly three-fourths (n=1408) were dropped and one-fourth (n=569) were retained after data quality checking. Comparisons of the two data sets showed statistically significant differences across almost all demographic characteristics.

Conclusions: Numerous precautions beyond the inclusion of a CAPTCHA are needed when fielding web-based surveys, particularly if a financial incentive is offered.
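
The elimination rule described in the Methods can be illustrated with a minimal sketch. It assumes each response has already been checked against the a priori indicators and carries two sets of flags: indicators that warrant elimination on their own and indicators that only raise suspicion. The class, the function names, and the example flag labels are hypothetical, and the abstract does not list the actual indicators; only the threshold of two suspicious indicators comes from the text.

```python
from dataclasses import dataclass, field

# Threshold stated in the abstract: two or more suspicious indicators
# lead to elimination of the survey.
SUSPICION_THRESHOLD = 2


@dataclass
class Response:
    """One survey response with its quality flags (hypothetical structure)."""
    response_id: str
    elimination_flags: set = field(default_factory=set)  # each one is disqualifying
    suspicion_flags: set = field(default_factory=set)    # counted against the threshold


def is_eliminated(response: Response) -> bool:
    """Return True if the response should be dropped under the stated rule."""
    if response.elimination_flags:
        return True
    return len(response.suspicion_flags) >= SUSPICION_THRESHOLD


def split_responses(responses):
    """Partition responses into (retained, eliminated) lists."""
    retained = [r for r in responses if not is_eliminated(r)]
    eliminated = [r for r in responses if is_eliminated(r)]
    return retained, eliminated


# Illustrative data only; the flag labels are invented for this example.
sample = [
    Response("r1"),
    Response("r2", suspicion_flags={"flag_a"}),
    Response("r3", suspicion_flags={"flag_a", "flag_b"}),
    Response("r4", elimination_flags={"flag_c"}),
]
retained, eliminated = split_responses(sample)
```

In this sketch, `r1` and `r2` are retained, while `r3` (two suspicious indicators) and `r4` (one eliminating indicator) are dropped. Keeping the two flag types separate mirrors the abstract's distinction between indicators that warranted elimination and those that warranted suspicion.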
