Sputnik is a C program that searches DNA sequence files in FASTA format for microsatellite repeats. The sequence file is specified on the command line, and each hit is written to stdout along with its position in the sequence, its length, and a score determined by the length of the repeat and the number of errors.
Sputnik uses a recursive algorithm to search for repeated patterns of nucleotides with a unit length between 2 and 5. Insertions, mismatches, and deletions are tolerated but lower the overall score. Sputnik does not search against a "library" of known microsatellites. Instead, it reads through the entire sequence, assumes that a repeat begins at every position, compares the subsequent nucleotides with the values that assumption predicts, and applies a simple scoring rule. Each nucleotide that matches the predicted value adds to the score; each "error" subtracts from it. If the resulting score rises above a preset threshold, the region is written out along with its position and score. If the score falls below a cutoff threshold, the search is abandoned and begins again at the next nucleotide.
When an error is encountered, each of the three possible kinds of error (mismatch, insertion, and deletion) is assumed in turn, and a recursive call to the comparison routine is made for each. Any branch whose resulting score stays above the cutoff threshold returns that score, and the best of the three is pursued.
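The search described above can be sketched in a few lines of C. This is a minimal illustration, not Sputnik's actual source: the names (Hit, extend, scan_at) are hypothetical, the constants are assumed values chosen for the example, and the way each error kind shifts the sequence position and the position within the repeat unit is one plausible reading of the description. The repeat unit is assumed to be the first ulen nucleotides at the starting position.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values for illustration only; not taken from Sputnik's source. */
enum {
    MATCH_POINTS = 1,   /* added for each nucleotide that matches the prediction */
    ERROR_POINTS = -6,  /* added for each mismatch, insertion, or deletion */
    CUTOFF_SCORE = -1,  /* abandon the search once the score drops below this */
    REPORT_SCORE = 8,   /* report a region whose best score reaches this */
    MAX_DEPTH    = 5    /* maximum number of nested error branches */
};

typedef struct {
    int    score;  /* best score reached */
    size_t end;    /* index one past the last nucleotide at that score */
} Hit;

/* Compare seq[pos..] against the repeating unit, starting at unit[phase]. */
static Hit extend(const char *seq, size_t len, const char *unit, size_t ulen,
                  size_t pos, size_t phase, int score, int depth)
{
    Hit best, alt[3];
    int i;

    /* Consume every nucleotide that matches the predicted value. */
    while (pos < len && seq[pos] == unit[phase]) {
        score += MATCH_POINTS;
        pos++;
        phase = (phase + 1) % ulen;
    }
    best.score = score;
    best.end = pos;

    if (pos >= len || depth >= MAX_DEPTH)
        return best;

    /* An error was encountered: charge it, and give up if below the cutoff. */
    score += ERROR_POINTS;
    if (score < CUTOFF_SCORE)
        return best;

    /* Assume each kind of error in turn and recurse. */
    alt[0] = extend(seq, len, unit, ulen, pos + 1, (phase + 1) % ulen,
                    score, depth + 1);              /* mismatch  */
    alt[1] = extend(seq, len, unit, ulen, pos + 1, phase,
                    score, depth + 1);              /* insertion */
    alt[2] = extend(seq, len, unit, ulen, pos, (phase + 1) % ulen,
                    score, depth + 1);              /* deletion  */

    /* Pursue the best of the three, if it beats the score before the error. */
    for (i = 0; i < 3; i++)
        if (alt[i].score > best.score)
            best = alt[i];
    return best;
}

/* Assume a repeat of unit length ulen begins at start.
   Returns 1 and fills *hit if the best score reaches REPORT_SCORE. */
int scan_at(const char *seq, size_t start, size_t ulen, Hit *hit)
{
    size_t len = strlen(seq);

    assert(ulen >= 2 && ulen <= 5);
    if (start + ulen > len)
        return 0;
    *hit = extend(seq, len, seq + start, ulen, start + ulen, 0, 0, 0);
    return hit->score >= REPORT_SCORE;
}
```

A full scan would call scan_at for every start position and every unit length from 2 to 5. Under the assumed constants, a perfect repeat scores its length minus the unit length, and each tolerated error costs the penalty plus the point the nucleotide would otherwise have earned. Worked by hand, the sketch gives 25 for the 27-nucleotide GT repeat and 35 for the 44-nucleotide GA repeat with one mismatch that appear in the sample output; this does not by itself confirm that Sputnik uses these values.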
Here is a sample of the output from Sputnik run against a library constructed from a GenBank search for "HUMAN REPEAT" sequences:
    > sputnik rep.lib
    >hshprma LOCUS HSHPRMA 249 bp DNA PRI 01-MAY-1993
    dinucleotide 128 : 171 -- length 44 score 35
    GAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAAAGAGAGAGA
    dinucleotide 184 : 210 -- length 27 score 25
    GTGTGTGTGTGTGTGTGTGTGTGTGTG
    >hum315mfd LOCUS HUM315MFD 251 bp ds-DNA PRI 04-AUG-1993
    trinucleotide 210 : 246 -- length 37 score 16
    TTATTATTATTATTTTATTTTATTTTATTATTATTAT
    ...
Sputnik can be recompiled to change the score or threshold parameters, or the maximum recursion depth. In practice scores diverge quickly and adjusting these has little effect on anything other than the execution time. It might benefit from a nicer interface and output that was easier to parse.
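Even in its current form, the hit lines in the sample output are regular enough to read with sscanf. The following is a minimal sketch that assumes every hit line follows the layout "<kind> <start> : <end> -- length <n> score <n>" seen in the sample; the HitLine type and parse_hit_line function are hypothetical names, not part of Sputnik.

```c
#include <assert.h>
#include <stdio.h>

typedef struct {
    char kind[32];  /* repeat type, e.g. "dinucleotide" */
    int  start;     /* first position of the repeat, as printed */
    int  end;       /* last position of the repeat, as printed */
    int  length;
    int  score;
} HitLine;

/* Returns 1 if line is a hit line in the assumed layout, 0 otherwise.
   Header lines (starting with '>') and sequence lines return 0. */
int parse_hit_line(const char *line, HitLine *out)
{
    int n;

    assert(line != NULL && out != NULL);
    n = sscanf(line, "%31s %d : %d -- length %d score %d",
               out->kind, &out->start, &out->end,
               &out->length, &out->score);
    return n == 5;
}
```

A caller could read Sputnik's stdout line by line with fgets, pass each line to parse_hit_line, and treat the line that follows a successful parse as the repeat sequence itself.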
Sputnik was developed by Chris Abajian at the University of Washington Department of Molecular Biotechnology in September 1994. It is not currently "supported", but it is a small program and easy to modify.
If you have any questions or need help, email me at
Chris Abajian <chrisa@espressosoftware.com>