
1. How to install CNIT


1.1. Download the standalone version of CNIT.

1.2. Install the CNIT standalone version

1.3. Run the CNIT standalone version

2. Download CNIT dataset


2.1 Training & Test set

The CNIT training dataset contains protein-coding and noncoding transcripts from human (GRCh38) and Arabidopsis (EnsemblPlants v37). To evaluate the performance of CNIT across species, we further built a test set covering 10 animals (mouse, anole lizard, chicken, gorilla, Xenopus, macaque, chimpanzee, orangutan, zebrafish, and worm) and 25 plants (Aegilops tauschii, Amborella trichopoda, Arabidopsis lyrata, Beta vulgaris, Brachypodium distachyon, Brassica napus, Brassica oleracea, Brassica rapa, Chlamydomonas reinhardtii, Galdieria sulphuraria, Glycine max, Medicago truncatula, Musa acuminata, Oryza brachyantha, Oryza sativa, Physcomitrella patens, Populus trichocarpa, Selaginella moellendorffii, Setaria italica, Solanum lycopersicum, Solanum tuberosum, Sorghum bicolor, Theobroma cacao, Vitis vinifera, and Zea mays). Animal protein-coding and noncoding transcripts were selected from the RefSeq database. For plants, coding and noncoding transcripts were obtained from the RefSeq and EnsemblPlants (v37) databases, respectively, restricted to transcripts with status “KNOWN”. Users can download all sequences in the training and test sets via the links below.

| Animal | mRNAs | LncRNAs | Plant | mRNAs | LncRNAs | Plant | mRNAs | LncRNAs |
| Human (Training set) | Download | Download | Arabidopsis thaliana (Training set) | Download | Download | Musa acuminata | Download | Download |
| Mouse | Download | Download | Aegilops tauschii | Download | Download | Oryza brachyantha | Download | Download |
| Anole lizard | Download | Download | Amborella trichopoda | Download | Download | Oryza sativa | Download | Download |
| Chicken | Download | Download | Arabidopsis lyrata | Download | Download | Physcomitrella patens | Download | Download |
| Gorilla | Download | Download | Beta vulgaris | Download | Download | Populus trichocarpa | Download | Download |
| Xenopus | Download | Download | Brachypodium distachyon | Download | Download | Selaginella moellendorffii | Download | Download |
| Macaque | Download | Download | Brassica napus | Download | Download | Setaria italica | Download | Download |
| Chimpanzee | Download | Download | Brassica oleracea | Download | Download | Solanum lycopersicum | Download | Download |
| Lamprey | Download | Download | Brassica rapa | Download | Download | Solanum tuberosum | Download | Download |
| Orangutan | Download | Download | Chlamydomonas reinhardtii | Download | Download | Sorghum bicolor | Download | Download |
| Zebrafish | Download | Download | Galdieria sulphuraria | Download | Download | Theobroma cacao | Download | Download |
|  |  |  | Glycine max | Download | Download | Vitis vinifera | Download | Download |
|  |  |  | Medicago truncatula | Download | Download | Zea mays | Download | Download |
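
Once the mRNA and lncRNA files for a species are downloaded, they can be combined into one labeled set. The following is a minimal Python sketch, assuming the downloads are FASTA-formatted text (an assumption; the file format is not stated here). The function names `parse_fasta` and `build_labeled_set` and the 1/0 labels are hypothetical and are not part of CNIT.

```python
from typing import Dict, Iterable, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {sequence_id: sequence}.

    Assumes standard FASTA: a '>' header line followed by one or more
    sequence lines. The ID is the first token of the header.
    """
    records: Dict[str, str] = {}
    current_id = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if current_id is not None:
                records[current_id] = "".join(chunks)
            tokens = line[1:].split()
            # Fall back to a positional name if the header is empty.
            current_id = tokens[0] if tokens else f"seq{len(records) + 1}"
            chunks = []
        elif current_id is not None:
            chunks.append(line.upper())
    if current_id is not None:
        records[current_id] = "".join(chunks)
    return records


def build_labeled_set(
    coding: Dict[str, str], noncoding: Dict[str, str]
) -> List[Tuple[str, str, int]]:
    """Label coding transcripts 1 and noncoding transcripts 0 (hypothetical labels)."""
    labeled = [(sid, seq, 1) for sid, seq in coding.items()]
    labeled += [(sid, seq, 0) for sid, seq in noncoding.items()]
    return labeled
```

To use it on downloaded files, pass an open file handle, for example `parse_fasta(open("mouse_mRNA.fa"))`, where the file name is a placeholder for whatever the download is saved as.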

2.2 Dataset for comparison with other software

To compare the performance of CNIT with that of other software tools, we also use the independent test sets for human, mouse, zebrafish, fly, worm and Arabidopsis from the CPC2 dataset.
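
A comparison between tools on a shared test set usually comes down to a few summary statistics computed from each tool's predictions. The sketch below assumes binary labels (1 for coding, 0 for noncoding) and reports sensitivity, specificity and accuracy. These metrics are an illustrative choice, since the metrics used for CNIT are not stated here, and `binary_metrics` is a hypothetical helper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute sensitivity, specificity and accuracy for 0/1 labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # coding called coding
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # noncoding called noncoding
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # noncoding called coding
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # coding called noncoding

    def ratio(num: int, den: int) -> float:
        # Return NaN instead of raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
    }
```

Running the same function on the predictions of each tool for the same test set gives directly comparable numbers.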