ZINB-Grad: A Gradient Based Linear Model Outperforming Deep Models For Single-Cell Analysis.
Abstract
Single-cell RNA-seq (scRNA-seq) is an emerging experimental protocol for profiling gene expression levels at single-cell resolution. It significantly improves researchers’ ability to characterize diversity among biological systems. Although scRNA-seq experiments are powerful and popular, they suffer from technical noise, dropouts, batch effects, and biases. Many statistical and machine learning methods have been designed to overcome these challenges. In recent years, there has been a shift from traditional statistical models to deep learning models. But are deep models better for scRNA-seq analysis? The published literature claims that deep-learning-based models, such as scVI, outperform conventional models, such as ZINB-WaVE. Here, we use a novel optimization procedure combined with modern machine learning software packages to overcome the scalability and efficiency challenges inherent in traditional tools. We show that our implementation is more efficient than both conventional models and deep learning models. Moreover, our model scales to millions of samples, and its accuracy is comparable to that of deep models. As the devil is in the implementation details, the superiority of deep models may not be due to their sophisticated deep architecture. Instead, their effectiveness stems from the optimization procedure built into deep-model implementations, a procedure that many traditional models could also adopt.
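For readers unfamiliar with the zero-inflated negative binomial (ZINB) family that the model's name refers to, the sketch below writes out its log-likelihood, the kind of objective a gradient-based optimizer would minimize. The abstract does not state a parameterization, so this assumes the common one: mean `mu`, inverse dispersion `theta`, and extra-zero (dropout) probability `pi`. The function names are illustrative and are not taken from the authors' implementation.

```python
import math


def nb_log_pmf(y, mu, theta):
    """Log-probability of count y under a negative binomial
    with mean mu > 0 and inverse dispersion theta > 0."""
    return (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(y, mu, theta, pi):
    """Log-probability of count y under a zero-inflated negative binomial.
    pi (assumed 0 <= pi < 1) is the probability of an extra zero."""
    if y == 0:
        # A zero arises either from dropout or from the NB component itself.
        nb_zero = math.exp(nb_log_pmf(0, mu, theta))
        return math.log(pi + (1.0 - pi) * nb_zero)
    # A positive count can only come from the NB component.
    return math.log(1.0 - pi) + nb_log_pmf(y, mu, theta)


def zinb_nll(counts, mu, theta, pi):
    """Negative log-likelihood of a list of counts sharing one (mu, theta, pi)."""
    return -sum(zinb_log_pmf(y, mu, theta, pi) for y in counts)
```

In a full model, `mu`, `theta`, and `pi` would vary per cell and per gene and would be fit by minimizing this quantity over mini-batches; that part depends on details the abstract does not give and is left out here.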