1) IsomiR Identification

Plant IsomiR Atlas (PIA) is a database of isomiRs identified across plant species. In version v1.0, PIA holds 196,829 unique isomiR signatures (98,734 unique isomiR sequences) identified from 6,167 plant miRNA hairpins using 667 Illumina small RNA sequencing datasets from 23 species. The genomes, primary transcripts and annotation information for these species are mostly from Phytozome, except those of Nelumbo nucifera, which are from lotus-db (Table 1).
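As a toy illustration of what identifying an isomiR from a miRNA hairpin involves, the sketch below locates a read that matches the hairpin exactly and reports how its ends are shifted relative to the canonical miRNA. It is not the isomiRIden code: the function name, the offset convention and the short hairpin sequence are assumptions made for this example only.

```python
def templated_offsets(hairpin, canonical, read):
    """Return (offset_5p, offset_3p) of an exactly matching read relative
    to the canonical miRNA, or None if the read is not a templated match.

    Assumed convention (illustrative only):
      offset_5p < 0 : read starts upstream of the canonical start (5' addition)
      offset_5p > 0 : read starts downstream (5' trimming)
      offset_3p > 0 : read ends downstream of the canonical end (3' addition)
      offset_3p < 0 : read ends upstream (3' trimming)
    """
    can_start = hairpin.find(canonical)
    if can_start < 0:
        raise ValueError("canonical miRNA not found in hairpin")
    can_end = can_start + len(canonical)

    # Scan every exact occurrence of the read (zero mismatches) and keep
    # the first one that overlaps the canonical miRNA locus.
    pos = hairpin.find(read)
    while pos >= 0:
        end = pos + len(read)
        if pos < can_end and end > can_start:
            return pos - can_start, end - can_end
        pos = hairpin.find(read, pos + 1)
    return None


# Toy hairpin fragment: flanking bases around a 20-nt canonical sequence.
HAIRPIN = "AAACCC" + "TGACAGAAGAGAGTGAGCAC" + "GGGTTT"
CANONICAL = "TGACAGAAGAGAGTGAGCAC"

canonical_hit = templated_offsets(HAIRPIN, CANONICAL, CANONICAL)
five_prime_add = templated_offsets(HAIRPIN, CANONICAL, "C" + CANONICAL)
three_prime_trim = templated_offsets(HAIRPIN, CANONICAL, CANONICAL[:-1])
three_prime_add = templated_offsets(HAIRPIN, CANONICAL, CANONICAL + "GG")
# A read with an internal substitution has no exact match on the hairpin,
# so it is not a templated isomiR in this sketch.
substituted = templated_offsets(HAIRPIN, CANONICAL, "TGACAGTAGAGAGTGAGCAC")
```

A read with offsets (0, 0) is the canonical miRNA itself. Any other pair of offsets describes a templated variant, and a `None` result marks a read that would have to be examined with mismatches allowed.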
The species-specific hairpin and mature sequences used for isomiR identification are from miRBase and the Plant Non-coding RNA Database. We merged the miRNA sequences from these two databases and removed redundant entries. Most datasets are from the NCBI SRA database and were converted into collapsed FASTA files by an in-house Perl script, which calls cutadapt to remove adapter sequences. Clean reads are then used for isomiR identification by isomiRIden, a Perl script modified from isomiR2Function. Briefly, sequenced reads and canonical miRNAs are mapped to species-specific pre-miRNAs, allowing no mismatches. By comparing the mapping information, templated isomiRs and their positions relative to canonical miRNAs are identified. Next, reads that do not map to the precursors are mapped to the genome, allowing no mismatches. Reads that do not map to the genome are then mapped to the species-specific pre-miRNAs again, this time allowing two mismatches. By comparing the mapping information, non-templated isomiRs are identified, together with their positions relative to canonical miRNAs and their mismatch positions. Finally, the identified isomiRs are indexed and classified into different categories (Figure 1). For more information, see our paper "isomiR2Function: An Integrated Workflow for Identifying MicroRNA Variants in Plants".

2) Structure of Plant IsomiR Atlas

PIA is implemented in MySQL, PHP, JavaScript and Perl, and the database is free to access. The MySQL database of PIA consists of eleven tables: seq, hairpin, exist, evidence, isomiR_alignment, mature_alignment, precursor_alignment, targetfinder, psRNATarget, miRNA_targetfinder and datasets. datasets is an independent table (Figure 2) that stores information about the datasets. The relations between the other tables, and their columns, are shown in Figure 3, where 'P' marks a primary key, 'I' an indexed column and 'F' a foreign key.
A column marked 'F' references the column of the same name, in another table, that is not marked 'F'. For example, evidence.ID, hairpin.ID, targetfinder.ID and psRNATarget.ID are foreign keys that reference seq.ID.
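The foreign-key convention above can be tried out with a small, self-contained sketch. It uses an in-memory SQLite database through Python's `sqlite3` module as a stand-in for MySQL, and models only the seq and evidence tables. Apart from the `ID` columns named in the text, every column name and value here (`sequence`, `dataset`, `iso1`, `SRRdemo`) is hypothetical and not taken from the PIA schema.

```python
import sqlite3


def build_demo_db():
    """Create a two-table toy schema: seq.ID is the primary key ('P'),
    evidence.ID is an indexed foreign key ('I', 'F') referencing it."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE seq (
            ID       TEXT PRIMARY KEY,   -- 'P'
            sequence TEXT NOT NULL       -- hypothetical column
        );
        CREATE TABLE evidence (
            ID      TEXT NOT NULL REFERENCES seq(ID),  -- 'F'
            dataset TEXT NOT NULL                      -- hypothetical column
        );
        CREATE INDEX evidence_id_idx ON evidence(ID);  -- 'I'
        """
    )
    return conn


def sequences_with_evidence(conn):
    """Join evidence rows to their sequence through the shared ID column."""
    query = (
        "SELECT s.ID, s.sequence, e.dataset "
        "FROM seq AS s JOIN evidence AS e ON e.ID = s.ID "
        "ORDER BY s.ID, e.dataset"
    )
    return conn.execute(query).fetchall()


def insert_evidence(conn, iso_id, dataset):
    """Insert an evidence row; return False if the foreign key rejects it."""
    try:
        conn.execute("INSERT INTO evidence VALUES (?, ?)", (iso_id, dataset))
        return True
    except sqlite3.IntegrityError:
        return False


demo = build_demo_db()
demo.execute("INSERT INTO seq VALUES (?, ?)", ("iso1", "TGACAGAAGAGAGTGAGCAC"))
accepted = insert_evidence(demo, "iso1", "SRRdemo")
rejected = insert_evidence(demo, "missing_id", "SRRdemo")
joined = sequences_with_evidence(demo)
```

Because evidence.ID must match an existing seq.ID, the second insert is refused, which is the behaviour the 'F' marker in Figure 3 describes under this reading.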