Exploring data reduction strategies in the analysis of continuous pressure imaging technology

Abstract

Background: Science is becoming increasingly data intensive as digital innovations bring new capacity for continuous data generation and storage. This progress also brings difficulties, as many scientific initiatives struggle with the sheer volume of data produced. Here we present a case study of a data-intensive randomized clinical trial assessing the utility of continuous pressure imaging (CPI) for reducing pressure injuries.

Objective: To explore an approach, based on a nested subset of pressure data, for reducing the amount of CPI data required for analyses to a manageable size without loss of critical information.

Methods: Data from four enrolled study participants who were excluded from the analytical phase of the study were used to develop an approach to data reduction. A two-step data strategy was used. First, raw data were sampled at different intervals (every 5, 30, 60, 120, and 240 s) to identify the optimal measurement frequency. Second, similarity between adjacent frames was evaluated using correlation coefficients to identify position changes of enrolled study participants. Data strategy performance was evaluated through visual inspection using heat maps and time series plots.

Results: Sampling one frame every 60 s provided a reasonable representation of changes in interface pressure over time. This approach translated to using only 1.7% of the collected data in analyses. In the second step, it was found that 160 frames within 24 h represented the pressure states of study participants. In total, only 480 frames from the 72 h of collected data would be needed for analyses without loss of information. Only ~0.2% of the raw data collected would be required for assessment of the primary trial outcome.

Conclusions: Data reduction is an important component of big data analytics. Our two-step strategy markedly reduced the amount of data required for analyses without loss of information. This data reduction strategy, if validated, could be used in other CPI studies and in other settings where large amounts of both temporal and spatial data must be analysed.
BMC Medical Research Methodology. 2023 Mar 01;23(1):56
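
The two-step strategy described in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the synthetic 8 x 8 pressure frames, and the 0.9 correlation threshold are all hypothetical choices made for illustration. The abstract does not state the raw capture rate; one frame per second is assumed here because it is consistent with the reported 1.7% and ~0.2% figures.

```python
import numpy as np


def subsample(frames, raw_interval_s, target_interval_s):
    """Step 1: keep one frame every `target_interval_s` seconds."""
    step = int(round(target_interval_s / raw_interval_s))
    if step < 1:
        raise ValueError("target interval must be >= raw interval")
    return frames[::step]


def frame_correlation(a, b):
    """Pearson correlation between two pressure frames, flattened."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    if a.std() == 0 or b.std() == 0:
        # Correlation is undefined for a constant frame; fall back to equality.
        return 1.0 if np.array_equal(a, b) else 0.0
    return float(np.corrcoef(a, b)[0, 1])


def position_change_indices(frames, threshold=0.9):
    """Step 2: flag frames whose correlation with the previous frame is low.

    The threshold is an assumed value, not one taken from the paper.
    The first frame is always kept as the initial pressure state.
    """
    kept = [0]
    for i in range(1, len(frames)):
        if frame_correlation(frames[i - 1], frames[i]) < threshold:
            kept.append(i)
    return kept


def retained_fraction(n_kept, duration_s, raw_interval_s=1.0):
    """Share of the raw frames that remain after reduction."""
    return n_kept / (duration_s / raw_interval_s)


def make_demo_frames(n_seconds=600, switch_at=300, seed=0):
    """Synthetic data: load on the left half, then on the right half."""
    rng = np.random.default_rng(seed)
    left = np.zeros((8, 8))
    left[:, :4] = 50.0
    right = np.zeros((8, 8))
    right[:, 4:] = 50.0
    frames = np.empty((n_seconds, 8, 8))
    frames[:switch_at] = left
    frames[switch_at:] = right
    # Small sensor noise so that frames within a posture are not identical.
    return frames + rng.normal(0.0, 0.5, size=frames.shape)


if __name__ == "__main__":
    raw = make_demo_frames()
    sampled = subsample(raw, raw_interval_s=1.0, target_interval_s=60.0)
    changes = position_change_indices(sampled)
    print(len(raw), len(sampled), changes)
    print(f"{retained_fraction(480, 72 * 3600):.4%}")
```

Under the one-frame-per-second assumption, keeping every 60th frame retains 1/60 of the data (about 1.7%), and 480 frames out of 72 h of recording correspond to 480 / 259,200 of the raw frames (about 0.2%), which matches the proportions reported in the Results. Comparing each frame only with its immediate predecessor follows the abstract's description of "adjacent frames"; a real pipeline would also need to choose the threshold empirically, for example by the kind of visual inspection of heat maps and time series plots that the authors describe.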