Strategies for the identification and prevention of survey fraud: Data analysis of a web-based survey
Cancer patients; Cancer survivors; CAPTCHA; COVID-19; Data integrity; Fraud; Fraudulent responses; Online surveys; Pandemic; Research methods; Survey
Background: To assess the impact of COVID-19 on cancer survivors, we fielded a survey promoted via email and social media in winter 2020. Examination of the data showed suspicious patterns that warranted serious review.
Objective: The aim of this paper is to review the methods used to identify and prevent fraudulent survey responses.
Methods: As precautions, we included a Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA), a hidden question, and instructions for respondents to type a specific word. To identify likely fraudulent data, we defined a priori indicators that warranted elimination or suspicion. If a survey contained two or more suspicious indicators, the survey was eliminated. We examined differences between the retained and eliminated data sets.
Results: Of the total responses (N=1977), nearly three-fourths (n=1408, 71.2%) were dropped and just over one-fourth (n=569, 28.8%) were retained after data quality checking. Comparisons of the two data sets showed statistically significant differences across almost all demographic characteristics.
Conclusions: Numerous precautions beyond the inclusion of a CAPTCHA are needed when fielding web-based surveys, particularly if a financial incentive is offered.
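The screening rule described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the indicators, so the `Response` class, the flag names, and the reading that a single elimination-level indicator removes a response outright are all assumptions made for illustration. Only the threshold of two or more suspicious indicators comes from the abstract.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# From the abstract: two or more suspicious indicators lead to elimination.
SUSPICION_THRESHOLD = 2


@dataclass
class Response:
    """Hypothetical container for one survey response and its quality flags."""
    response_id: str
    # Assumed: indicators that by themselves warrant elimination.
    elimination_flags: List[str] = field(default_factory=list)
    # Assumed: indicators that only raise suspicion.
    suspicion_flags: List[str] = field(default_factory=list)


def should_eliminate(response: Response) -> bool:
    """Return True if the response would be dropped under the assumed rule."""
    if response.elimination_flags:
        return True
    return len(response.suspicion_flags) >= SUSPICION_THRESHOLD


def split_responses(responses: List[Response]) -> Tuple[List[Response], List[Response]]:
    """Partition responses into (retained, eliminated), preserving order."""
    retained: List[Response] = []
    eliminated: List[Response] = []
    for response in responses:
        (eliminated if should_eliminate(response) else retained).append(response)
    return retained, eliminated


if __name__ == "__main__":
    # Flag names below are invented examples, not the study's indicators.
    sample = [
        Response("r1"),
        Response("r2", suspicion_flags=["implausibly_fast_completion"]),
        Response("r3", suspicion_flags=["implausibly_fast_completion",
                                        "duplicate_free_text"]),
        Response("r4", elimination_flags=["answered_hidden_question"]),
    ]
    retained, eliminated = split_responses(sample)
    print("retained:", [r.response_id for r in retained])
    print("eliminated:", [r.response_id for r in eliminated])
```

Keeping the two flag lists separate makes the rule auditable: each dropped response can be traced to the indicator or indicators that triggered its removal, which supports the kind of retained-versus-eliminated comparison the abstract reports.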
Pratt-Chapman, M., Moses, J., & Arem, H. (2021). Strategies for the identification and prevention of survey fraud: Data analysis of a web-based survey. JMIR Cancer, 7(3). http://dx.doi.org/10.2196/30730