IMEx Glossary
Imperfection Limit/Repeat Unit
The Imperfection Limit for a repeat size (e.g. tri, penta) is the maximum number of point mutations (substitutions and indels)
allowed in a repeat unit (word) for it to still be considered a match.
Example :
The two words ATGC and ATGG are considered matching repeat units if the imperfection
limit is 1 or more.
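The matching rule above can be sketched in Python. This is only an illustration: it assumes that "point mutations" are counted as the Levenshtein edit distance between two words (each substitution, insertion or deletion costs 1). The function names `edit_distance` and `units_match` are hypothetical, and this glossary does not say how IMEx itself counts imperfections.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: substitutions, insertions and deletions cost 1 each."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def units_match(unit_a: str, unit_b: str, imperfection_limit: int) -> bool:
    """Assumed rule: two repeat units match if they differ by at most the limit."""
    return edit_distance(unit_a, unit_b) <= imperfection_limit


print(units_match("ATGC", "ATGG", 1))  # one substitution, limit 1
print(units_match("ATGC", "ATGG", 0))  # same pair, no imperfection allowed
```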
Imperfection Percentage
Imperfection percentage of a microsatellite repeat tract is the percentage of mismatches
in the entire repeat tract. Imperfection Percentage can be calculated as follows:
Formula :
Imperfection % = (No. of imperfections in entire tract / Total No. of Bases in tract) * 100
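As a small worked example, the formula can be written as a Python function. The function name `imperfection_percentage` and the input numbers are illustrative assumptions, not values taken from IMEx.

```python
def imperfection_percentage(imperfections: int, total_bases: int) -> float:
    """Imperfection % = (imperfections in tract / total bases in tract) * 100."""
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    return 100.0 * imperfections / total_bases


# A hypothetical tract of 24 bases containing 3 imperfections.
print(imperfection_percentage(3, 24))
```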
Minimum Repeat Number
The Minimum Repeat Number of a repeat tract is the minimum number of repeat units a tract must contain
to be considered a valid microsatellite tract.
Example :
If the Min. Repeat Number for mono is set to 15, only those mononucleotide repeat tracts
that contain at least 15 nucleotides are considered valid tracts.
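The threshold can be illustrated with a minimal Python sketch that reports only perfect tandem repeats of a given motif. It ignores imperfections entirely and uses a regular expression, which is an assumption made for brevity and not the IMEx search algorithm. The name `find_perfect_tracts` is hypothetical.

```python
import re


def find_perfect_tracts(sequence: str, motif: str, min_repeats: int):
    """Return (start, end, repeat_count) for perfect tracts with >= min_repeats units.

    Coordinates are 0-based with an exclusive end.
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(motif), min_repeats))
    return [(m.start(), m.end(), (m.end() - m.start()) // len(motif))
            for m in pattern.finditer(sequence)]


# Two poly-A tracts: one of 15 bases and one of 10 bases.
seq = "GG" + "A" * 15 + "CC" + "A" * 10 + "T"
print(find_perfect_tracts(seq, "A", 15))  # only the 15-base tract qualifies
print(find_perfect_tracts(seq, "A", 10))  # both tracts qualify
```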
Protein Table File (.ptt)
The PTT file format is a table of protein features. A Protein Table (PTT) file contains the
location, strand, length, ID, generic name, COG assignment and functional annotation for every predicted protein in a whole
genome. These protein table files, which end in a .ptt extension, can be obtained from NCBI's ftp site: ftp://ftp.ncbi.nih.gov/genomes/.
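A minimal Python sketch of reading such a table follows. It assumes the tab-delimited layout commonly seen in NCBI .ptt files: title lines, then a header row beginning with "Location", then one row per protein with locations written as start..end. The column names and the sample rows below are made up for illustration and are not stated in this glossary. `parse_ptt` is a hypothetical helper.

```python
import csv
import io

# Made-up sample in the assumed layout (not real annotation data).
SAMPLE_PTT = (
    "Example organism, complete genome - 1..5000\n"
    "2 proteins\n"
    "Location\tStrand\tLength\tPID\tGene\tSynonym\tCode\tCOG\tProduct\n"
    "190..255\t+\t21\t1001\tgeneA\tx0001\t-\t-\thypothetical protein A\n"
    "337..2799\t-\t820\t1002\tgeneB\tx0002\t-\tCOG0000\thypothetical protein B\n"
)


def parse_ptt(handle):
    """Parse a PTT-style table into a list of dicts with integer Start/End."""
    lines = handle.read().splitlines()
    # Skip the title lines by locating the header row (assumed to start with "Location").
    header_idx = next(i for i, line in enumerate(lines)
                      if line.startswith("Location\t"))
    reader = csv.DictReader(lines[header_idx:], delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    records = []
    for row in reader:
        start, end = row["Location"].split("..")
        row["Start"], row["End"] = int(start), int(end)
        records.append(row)
    return records


for rec in parse_ptt(io.StringIO(SAMPLE_PTT)):
    print(rec["Start"], rec["End"], rec["Strand"], rec["Gene"], rec["Product"])
```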
dMAX : Maximum distance allowed between SSRs for them to be considered a cSSR
dMAX is the maximum distance (threshold) allowed between two SSRs for them to form a potential compound SSR. IMEx treats 2 or more SSRs as a single compound SSR if the distance between any two adjacent SSRs is less than or equal to the dMAX value set by the user.
The dMAX value can be between 0 and 50.
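The grouping rule can be sketched in Python as follows. The sketch assumes each SSR is a (start, end) pair with 0-based, end-exclusive coordinates, and that the distance between two SSRs is the number of bases between the end of one and the start of the next. Both conventions are assumptions, since this glossary does not define how IMEx measures the distance. `group_compound_ssrs` is a hypothetical name.

```python
def group_compound_ssrs(ssrs, dmax):
    """Group SSRs whose gap to the previous SSR is <= dmax.

    Returns only groups with 2 or more SSRs (the potential compound SSRs).
    """
    if not 0 <= dmax <= 50:
        raise ValueError("dMAX must be between 0 and 50")
    groups, current = [], []
    for ssr in sorted(ssrs):
        # Gap between this SSR's start and the previous SSR's end.
        if current and ssr[0] - current[-1][1] <= dmax:
            current.append(ssr)
        else:
            if len(current) >= 2:
                groups.append(current)
            current = [ssr]
    if len(current) >= 2:
        groups.append(current)
    return groups


# Illustrative coordinates; the gaps between neighbours are 3, 70, 5 and 60 bases.
ssrs = [(0, 12), (15, 30), (100, 120), (125, 140), (200, 210)]
print(group_compound_ssrs(ssrs, 5))
print(group_compound_ssrs(ssrs, 3))
```

In this sketch a dMAX of 0 groups only SSRs that touch or overlap.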
Standardization Level
The Microsatellites and Compound Microsatellites in IMEx are standardized according to the following Levels.
For Microsatellites:
As illustrated in the above figure, the SSR with trinucleotide motif ATG is equivalent to the SSRs with motifs TGA and GAT, as these are related to ATG by circular permutation. This is referred to as L1 standardization. At level L2, in addition to these three motifs, CAT, ATC and TCA become equivalent to ATG through reverse complementation followed by circular permutation. At level L3, in addition to all the motifs at L2, TAC, ACT and CTA become equivalent to ATG through complementation followed by circular permutation. Many pathogenic bacteria are known to show genome rearrangements; hence at level “F”, in addition to the motifs considered up to L3, GTA, TAG and AGT, which arise from motif reversal followed by circular permutation, are also considered equivalent to ATG.
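The four levels can be reproduced with a short Python sketch that builds the set of motifs equivalent to a given motif. The function names are hypothetical, and picking the alphabetically smallest motif as the representative in `standard_motif` is an assumption made for illustration; this glossary does not say which representative IMEx reports.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
LEVELS = ("L1", "L2", "L3", "F")


def rotations(motif: str) -> set:
    """All circular permutations of a motif."""
    return {motif[i:] + motif[:i] for i in range(len(motif))}


def equivalent_motifs(motif: str, level: str) -> set:
    """Motifs treated as equivalent to `motif` at the given standardization level."""
    if level not in LEVELS:
        raise ValueError("level must be one of %s" % (LEVELS,))
    rank = LEVELS.index(level)
    seeds = [motif]                                      # L1: circular permutation
    if rank >= 1:
        seeds.append(motif.translate(COMPLEMENT)[::-1])  # L2: reverse complement
    if rank >= 2:
        seeds.append(motif.translate(COMPLEMENT))        # L3: complement
    if rank >= 3:
        seeds.append(motif[::-1])                        # F: reversal
    result = set()
    for seed in seeds:
        result |= rotations(seed)
    return result


def standard_motif(motif: str, level: str) -> str:
    """Assumed representative: the alphabetically smallest equivalent motif."""
    return min(equivalent_motifs(motif, level))


for level in LEVELS:
    print(level, sorted(equivalent_motifs("ATG", level)))
```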
© Copyright 2006 All Rights Reserved
www.cdfd.org.in