Background: Topologically associating domains (TADs) are considered the structural and functional units of the genome. However, there is a lack of an integrated resource for TADs in the literature where researchers can obtain family classifications and detailed information about TADs. Results: We built an online knowledge base TADKB integrating knowledge for TADs in eleven cell types of human and mouse. For each TAD, TADKB provides the predicted three-dimensional (3D) structures of chromosomes and TADs, and detailed annotations about the protein-coding genes and long non-coding RNAs (lncRNAs) existent in each TAD. Besides the 3D chromosomal structures inferred by population Hi-C, the single-cell haplotype-resolved chromosomal 3D structures of 17 GM12878 cells are also integrated in TADKB. A user can submit query gene/lncRNA ID/sequence to search for the TAD(s) that contain(s) the query gene or lncRNA. We also classified TADs into families. To achieve that, we used the TM-scores between reconstructed 3D structures of TADs as structural similarities and the Pearson's correlation coefficients between the fold enrichment of chromatin states as functional similarities. All of the TADs in one cell type were clustered based on structural and functional similarities respectively using the spectral clustering algorithm with various predefined numbers of clusters. We have compared the overlapping TADs from structural and functional clusters and found that most of the TADs in the functional clusters with depleted chromatin states are clustered into one or two structural clusters. This novel finding indicates a connection between the 3D structures of TADs and their DNA functions in terms of chromatin states. Conclusion: TADKB is available at http://dna.cs.miami.edu/TADKB/.
- Family classification
- Long non-coding RNAs
- Single-cell 3D genome structures
- Topologically associating domains
ASJC Scopus subject areas