|multi-FASTA format (for small RNAs and target transcripts):||
>AT1G27360.1
AAGGTATCTATTTGCCTAGCCAGAGTTATATATAGGATTGATTGTCTAGTCTTTTCTTAT
ATGATTTTTGTTCTCATTTACTAATCAAAGTTCTGCAAACTTGTAGTTGTTGTAGGATTT
GTTGCTCTGGCTCTGGTGGTAGGTCTATGAAATCAACCCATATCGTGAATGGACTGCAAC
ATGGTATCTTCGTCCCAGTGGGATTGGGAGCATTTGATCATGTCCAATCCGTCAAGGACT
GAAGATGACAGCAAACAG
>AT1G27360.4 | Symbols: | squamosa promoter-binding protein
CTGGGTGAAACATAGAAAAGTTTCTCTTGCTCAAGTTAATGATAAAAGGGTGAGAGCAAT
AAACGCTGATAAGCCTTGTCTGGTCCTTGGAATTTTGAATTTTCTTTTTCTATCTTACTT
ATAGTATTGGTAGTTGAGGGTGTCGTCGATAAGTTGTTGTAGGATTTGTTGCTCTGGCTC
TGGTGGTAGGTCTATGAAATCAACCCATATCGTGAATGGACTGCAACATGGTATCTTCGT
CCCAGTGGGATTGGGAGCATTTGATCATGTCCAATCCGTCAAGGACTGAAGATGACAGCA
AACAGCTACCTACTGAGTGGGAAATTGAAAAAGGTGAAGGAATTGAATCTATAGTTCCAC
ATTTCTCAGGCCTTGAGAGAGTCAGTAGTGGCTCTGCCACCAGCTTCTGGCACACTGCTG
TATCGAAAAGCTCACAGTCGACCTCTATCAACTCATCATCTCCCGAAGCCAAACGATGCA
AGCTTGCATCAGA
|Short Tags: for small RNA sequences, one sequence per line||
UGACAGAAGAGAGUGAGCAC
UUGACAGAAGAUAGAGAGCAC
UCCCAAAUGUAGACAAAGCA
UGUGUUCUCAGGUCACCCCUU
UGUGUUCUCAGGUCACCCCUG
UGUGUUCUCAGGUCACCCCUG
UGGUAGCAGUAGCGGUGGUAA
AAGCUCAGGAGGGAUAGCGCC
AAGCUCAGGAGGGAUAGCGCC
|Pure Sequence: a single target transcript sequence without a FASTA header (may span multiple lines)||
CTGGGTGAAACATAGAAAAGTTTCTCTTGCTCAAGTTAATGATAAAAGGGTGAGAGCAAT
AAACGCTGATAAGCCTTGTCTGGTCCTTGGAATTTTGAATTTTCTTTTTCTATCTTACTT
ATAGTATTGGTAGTTGAGGGTGTCGTCGATAAGTTGTTGTAGGATTTGTTGCTCTGGCTC
TGGTGGTAGGTCTATGAAATCAACCCATATCGTGAATGGACTGCAACATGGTATCTTCGT
CCCAGTGGGATTGGGAGCATTTGATCATGTCCAATCCGTCAAGGACTGAAGATGACAGCA
AACAGCTACCTACTGAGTGGGAAATTGAAAAAGGTGAAGGAATTGAATCTATAGTTCCAC
ATTTCTCAGGCCTTGAGAGAGTCAGTAGTGGCTCTGCCACCAGCTTCTGGCACACTGCTG
TATCGAAAAGCTCACAGTCGACCTCTATCAACTCATCATCTCCCGAAGCCAAACGATGCA
AGCTTGCATCAGA
Prior to analysis, the back-end pipeline checks the submitted small RNA (sRNA) sequences, mainly miRNAs and phasiRNAs, against the following standards:
Users can submit target candidate sequences of interest in this section. A typical target transcript sequence can be a cDNA, EST, Unigene, mRNA, genomic segment, etc. The server searches these submitted target candidates for possible target sites of the (submitted or preloaded) small RNA sequences (mainly miRNA and ta-siRNA, sic passim). Prior to analysis, the back-end pipeline checks the submitted sequences against the following standards:
Raw NGS data need to be preprocessed prior to submission. For miRNAs sequenced by NGS, users should first convert the reads either into FASTA format or into short tags (see the examples above). To reduce data size, users need to filter the sequences by length and keep only those of 19-25 nt. Redundant sequences can be removed to reduce data size further. For mRNA transcripts (target candidates) sequenced by NGS, we recommend de novo transcriptome assembly, which generates longer contigs and improves prediction quality. It also reduces the workload of the analysis server.
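The preprocessing steps described above (length filtering and removal of redundant reads) can be sketched as follows. This is a minimal illustration only, not the server's own code: the function names, the `sRNA_<index>_x<count>` header scheme and the idea of keeping the read count in the header are assumptions made for the example.

```python
def preprocess_reads(reads, min_len=19, max_len=25):
    """Keep reads of min_len..max_len nt and collapse duplicates.

    Returns a dict mapping each unique sequence to its read count,
    in first-seen order (dicts preserve insertion order in Python 3.7+).
    """
    counts = {}
    for read in reads:
        seq = read.strip().upper()
        if not (min_len <= len(seq) <= max_len):
            continue  # too short or too long for a small RNA
        counts[seq] = counts.get(seq, 0) + 1
    return counts


def to_fasta(counts, prefix="sRNA"):
    """Write the collapsed reads as multi-FASTA text (hypothetical header scheme)."""
    lines = []
    for index, (seq, n) in enumerate(counts.items(), start=1):
        lines.append(f">{prefix}_{index}_x{n}")
        lines.append(seq)
    return "\n".join(lines)


if __name__ == "__main__":
    raw = [
        "UGACAGAAGAGAGUGAGCAC",
        "UGUGUUCUCAGGUCACCCCUG",
        "UGUGUUCUCAGGUCACCCCUG",  # duplicate, will be collapsed
        "ACGU",                   # too short, will be dropped
    ]
    print(to_fasta(preprocess_reads(raw)))
```

Collapsing before submission keeps the upload small; the short-tag format is obtained by writing only the dictionary keys, one per line.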
The V1 scoring schema [PMID:21622958] was developed with reference to the animal model, based on a series of early research papers. One of its major features is that the seed region spans positions 2-8 only, and there is no limit on the number of mismatches that may occur in the seed region. In our early study, the V1 schema identified all validated miRNA-target pairs (usually validated by 5'-RACE) in our curated dataset when the maximum expectation was set to 5.0. In psRNATarget, the default maximum expectation for this schema is set to 3.0 for compatibility reasons.
We improved the default scoring schema using a curated dataset that includes the miRNA-target pairs validated after the V1 schema was published. The improved schema (V2, 2017 release) finds more curated miRNA-target pairs in the updated dataset without a significant increase in total output. In the V2 schema, the seed region has been extended to positions 2-13, and the maximum number of mismatches (excluding G-U pairs) allowed in the seed region has been restricted to two. In addition, the analysis of target accessibility has been disabled because its value did not change the final output. The default maximum expectation is set to 5.0, which recalls 93% of the validated miRNA-target pairs, compared with the 86% recall reached by the V1 schema at the same cutoff.
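The V2 seed rule (positions 2-13, at most two mismatches, G-U pairs not counted) can be illustrated with a short sketch. It assumes an ungapped alignment in which the small RNA is written 5' to 3' and the target site 3' to 5', so that the same index in both strings forms a pair; the function names are hypothetical and the code does not reproduce the server's full scoring.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def seed_mismatches(srna, target_3to5, seed_start=2, seed_end=13):
    """Count seed-region mismatches, not counting G-U wobble pairs.

    Positions are 1-based and counted from the 5' end of the small RNA.
    Assumes an ungapped alignment (an assumption of this sketch).
    """
    s = srna.upper().replace("T", "U")
    t = target_3to5.upper().replace("T", "U")
    last = min(seed_end, len(s), len(t))
    mismatches = 0
    for pos in range(seed_start, last + 1):
        pair = (s[pos - 1], t[pos - 1])
        if pair not in WATSON_CRICK and pair not in WOBBLE:
            mismatches += 1
    return mismatches


def passes_v2_seed_rule(srna, target_3to5, max_mismatches=2):
    """True if the pair has no more than max_mismatches in positions 2-13."""
    return seed_mismatches(srna, target_3to5) <= max_mismatches
```

Calling `seed_mismatches(srna, target_3to5, seed_end=8)` gives the count over the shorter V1 seed, although V1 places no limit on that number.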
Users may change the settings to handle special cases of target recognition. For example, some miRNA-target interactions accommodate long indels, so "Penalty for opening gap" can be reduced to display more interactions of this kind. "Extra weight in seed region" can be increased to give more weight to seed-region recognition. "Calculate target accessibility" can be enabled to take into account the effect of mRNA secondary structure on target recognition. Please refer to the help information below when adjusting the schema.
The expectation value is the penalty for mismatches between the mature small RNA and the target sequence. A higher value indicates less similarity between the small RNA and the target candidate, and therefore a less likely interaction. The default penalty rules are set by the scoring schema. Maximum expectation is the cutoff: any small RNA-target pair with an expectation greater than the cutoff is discarded from the final result. The recommended values are 3.0-5.0, depending on the scoring schema.
hspsize is the length of the region in which the server scores complementarity between the small RNA and the target transcript. The recommended range for hspsize is 19-20. Be aware that the scoring algorithm only penalizes mismatches within this region (from nt 1 to nt hspsize); subsequent mismatches are ignored. In addition, submitted small RNAs that are shorter than the HSP value are removed.
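The effect of hspsize on the submitted small RNAs can be sketched as below. This is an illustration under simple assumptions, with a hypothetical function name: sequences shorter than hspsize are dropped, and only the first hspsize nucleotides of the remaining ones are returned as the scored region.

```python
def scored_regions(srnas, hspsize=19):
    """Return the region of each small RNA that would be scored.

    srnas maps a name to a sequence. Small RNAs shorter than hspsize are
    removed; for the rest, only nt 1..hspsize are kept, because mismatches
    beyond that point are ignored by the scoring.
    """
    kept = {}
    for name, seq in srnas.items():
        if len(seq) < hspsize:
            continue  # shorter than the HSP value: removed
        kept[name] = seq[:hspsize]
    return kept
```

With hspsize set to 19, a 21-nt small RNA is scored on its first 19 nt only, so mismatches at positions 20 and 21 do not affect its expectation.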
The number of top (the best) target gene candidates that will be listed for each submitted small RNA.
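How the maximum expectation and the number of top targets combine can be shown with a small sketch. It reads "maximum expectation" as an inclusive upper bound, which is an assumption, and the tuple layout, function name and parameter names are hypothetical.

```python
from collections import defaultdict


def select_top_targets(hits, max_expectation, top_n):
    """Apply the expectation cutoff, then keep the best top_n targets per small RNA.

    hits is an iterable of (srna_id, target_id, expectation) tuples.
    Lower expectation means a better match, so candidates are sorted ascending.
    """
    by_srna = defaultdict(list)
    for srna_id, target_id, expectation in hits:
        if expectation <= max_expectation:  # inclusive cutoff (assumption)
            by_srna[srna_id].append((expectation, target_id))
    return {srna_id: sorted(cands)[:top_n] for srna_id, cands in by_srna.items()}
```

A small RNA whose candidates all exceed the cutoff does not appear in the result at all.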
The accessibility of the mRNA target site to the small RNA has been identified as one of the important factors in target recognition, because secondary structure (stems etc.) around the target site prevents the small RNA (including miRNA and ta-siRNA, sic passim) and the mRNA target from making contact. The psRNATarget server employs RNAup to calculate target accessibility, which is represented by the energy required to open (unpair) the secondary structure around the target site (usually the region complementary to the small RNA plus its upstream and downstream flanks) on the target mRNA (see figure below). The less energy required, the more likely it is that the small RNA can contact (and cleave) the target mRNA.
The figure above illustrates the energy that is required to open the secondary structure around the target site. We use the software RNAup, described by Muckstein et al (2005, pmid=16446276), to calculate this value, denoted as UPE.
Besides the target site itself (the region complementary to the small RNA), its two flanks on the mRNA must also be free of secondary structure for the small RNA (including miRNA and ta-siRNA, sic passim) to bind and cleave (see the two red up-arrows in the following figure). The reason is that the small RNA binds to the target mRNA in the groove of the RISC complex, which needs extra space on both sides of the target site. Kertesz et al (2007) (PMID:17893677) suggested that 17 upstream and 13 downstream nucleotides of the target site should be considered in target accessibility analysis.
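The window that needs to be opened can be computed from the target-site coordinates with a short sketch. It assumes 1-based inclusive coordinates on the mRNA and clips the window at the transcript ends; the function name is hypothetical, and how the server handles sites near the ends is not stated here.

```python
def accessibility_window(site_start, site_end, transcript_len,
                         upstream=17, downstream=13):
    """Return (start, end) of the region considered for accessibility.

    Coordinates are 1-based and inclusive on the target mRNA. The window is
    the target site extended by `upstream` nt on its 5' side and `downstream`
    nt on its 3' side, clipped to the transcript boundaries.
    """
    if not (1 <= site_start <= site_end <= transcript_len):
        raise ValueError("target site must lie within the transcript")
    window_start = max(1, site_start - upstream)
    window_end = min(transcript_len, site_end + downstream)
    return window_start, window_end
```

For a 21-nt site at positions 100-120 of a 500-nt transcript, the window covers positions 83-133, that is 51 nt in total.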
In addition to cleaving mRNA, plant miRNAs reportedly also inhibit the translation of target genes. This often happens when a mismatch occurs around the center of the complementary region, because the central region is essential for cleavage (Brodersen et al 2008, PMID: 18483398). This mechanism differs from the translational inhibition by animal miRNAs, although the latter also inhibits gene expression at the translational level.
Users can set the coordinates of the central region; any mismatch within this region is reported as a trigger of translational inhibition.
The two-hit model (Axtell et al, 2005; PMID:17081978) suggests that a miRNA or ta-siRNA may have multiple target sites (i.e. complementary regions) on a specific target transcript, which increases the recognition activity of the miRNA/ta-siRNA towards the mRNA target. The server reports the number of target sites for each small RNA/target pair. Users are advised to prefer sRNA/target pairs with more target sites.
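Counting target sites per small RNA/target pair, and ranking pairs with more sites first, can be sketched as below. The input layout (one tuple per predicted site with its start coordinate) and the function names are assumptions for the example, not the server's output format.

```python
from collections import Counter


def count_target_sites(sites):
    """Count distinct target sites for each (small RNA, target) pair.

    sites is an iterable of (srna_id, target_id, site_start) tuples;
    identical tuples are counted once.
    """
    unique_sites = set(sites)
    return Counter((srna_id, target_id) for srna_id, target_id, _start in unique_sites)


def rank_pairs_by_sites(sites):
    """Return [((srna_id, target_id), n_sites), ...], pairs with more sites first."""
    counts = count_target_sites(sites)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Pairs with the same number of sites are ordered alphabetically so that the ranking is reproducible.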