Abstract:
The human body has billions of cells, each specialized for its own
function, and each cell carries a genome in its nucleus. The activity of
the genome is controlled by a multitude of molecular complexes
collectively called the epigenome.
Previously, scientists believed that human diseases were caused only by
changes in the DNA sequence or by infectious agents present in the
environment. However, recent studies have revealed that changes
in the epigenome are also associated with disease.
Our aim is to create an imputation method for noisy, sparse, and highly
unbalanced single-cell epigenome data. This problem is challenging because
there is no existing imputation method for large, unbalanced single-cell
epigenome datasets. Moreover, the analysis of such data is of significant
importance in biology for preventing and curing many critical diseases.
Here we propose an imputation method called RITs for imputing
single-cell epigenome profiles. We evaluated our proposed method through
several evaluation approaches and compared its results with those of
traditional imputation methods, although those methods were designed for
gene expression data. Our proposed method outperforms them in every test
and proves to be a reliable imputation method even for large, unbalanced
data. We tested our method on a scATAC-seq dataset
of cells from organs of the adult mouse to check the robustness and
efficiency of this method. Under all conditions and in all tests, our
imputation method RITs ranked first. The generality of RITs and its
robustness to very noisy and sparse datasets suggest that it is a
next-generation imputation method for single-cell profiles.