The effect of different methods to identify, and scenarios used to address energy intake misestimation on dietary patterns derived by cluster analysis

Abstract
Abstract Background All self-reported dietary intake data are characterized by measurement error, and validation studies indicate that the estimation of energy intake (EI) is particularly affected. Methods Using self-reported food frequency and physical activity data from Alberta’s Tomorrow Project participants (n = 9847 men 16,241 women), we compared the revised-Goldberg and the predicted total energy expenditure methods in their ability to identify misreporters of EI. We also compared dietary patterns derived by k-means clustering under different scenarios where misreporters are included in the cluster analysis (Inclusion); excluded prior to completing the cluster analysis (ExBefore); excluded after completing the cluster analysis (ExAfter); and finally, excluded before the cluster analysis but added to the ExBefore cluster solution using the nearest neighbor method (InclusionNN). Results The predicted total energy expenditure method identified a significantly higher proportion of participants as EI misreporters compared to the revised-Goldberg method (50% vs. 47%, p < 0.0001). k-means cluster analysis identified 3 dietary patterns: Healthy, Meats/Pizza and Sweets/Dairy. Among both men and women, participants assigned to dietary patterns changed substantially between ExBefore and ExAfter and also between the Inclusion and InclusionNN scenarios (Hubert and Arabie’s adjusted Rand Index, Kappa and Cramer’s V statistics < 0.8). Conclusions Different scenarios used to account for EI misreporters influenced cluster analysis and hence the composition of the dietary patterns. Continued efforts are needed to explore and validate methods and their ability to identify and mitigate the impact of EI misestimation in nutritional epidemiology.
Description
Keywords
Citation
Nutrition Journal. 2021 May 08;20(1):42