Biopolym. Cell. 1989; 5(4):30-37.
Structure and Function of Biopolymers
Statistical significance of the occurrence of some complex nucleotide combinations: comparison of the DNA models
1Suboch G. M., 1Sprizhitsky Yu. A.
  1. Institute of Molecular Genetics, Academy of Sciences of the USSR
    Moscow, USSR

Abstract

A scheme for modeling the DNA chain as a sequence of nucleotide runs of different lengths is presented. The advantages of this method and the range of its application are discussed. A procedure is suggested for estimating the statistical significance of the occurrence of some complex sequence structures in DNA by the Monte Carlo method. It uses a bootstrap algorithm and requires a comparatively small number of calculations. Three different models used to derive such estimates for the frequencies of homopurine-homopyrimidine mirror repeats in the DNA of phage K and in rodent noncoding regions are compared.
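
The general idea of a run-based bootstrap can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not specify the three models, so the resampling rule, the toy statistic, and the names `to_runs`, `resample_runs`, `count_mirror_repeats` and `bootstrap_p_value` are assumptions made here for illustration only. The sketch splits a sequence into runs of identical nucleotides, builds surrogate sequences by drawing observed runs with replacement, and reports the fraction of surrogates that contain at least as many homopurine or homopyrimidine mirror repeats of a fixed width as the original.

```python
import random
from itertools import groupby

def to_runs(seq):
    """Split a DNA string into (base, length) runs of identical nucleotides."""
    return [(base, len(list(grp))) for base, grp in groupby(seq.upper())]

def resample_runs(runs, rng):
    """Assumed surrogate model: draw observed runs with replacement.

    A draw is skipped when its base equals the base of the previous run,
    so that adjacent runs never merge. The surrogate has the same number
    of runs as the original, but its total length may differ.
    """
    out = []
    while len(out) < len(runs):
        base, length = rng.choice(runs)
        if out and out[-1][0] == base:
            continue
        out.append((base, length))
    return "".join(base * length for base, length in out)

def count_mirror_repeats(seq, width):
    """Toy statistic: windows of the given width that consist only of purines
    or only of pyrimidines and read the same in both directions.
    Single-base windows such as AAAA are counted as well."""
    count = 0
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        single_class = set(window) <= set("AG") or set(window) <= set("CT")
        if single_class and window == window[::-1]:
            count += 1
    return count

def bootstrap_p_value(seq, width=6, n_resamples=200, seed=0):
    """Monte Carlo estimate of how often a surrogate reaches the observed count."""
    rng = random.Random(seed)
    runs = to_runs(seq)
    observed = count_mirror_repeats(seq.upper(), width)
    hits = sum(
        count_mirror_repeats(resample_runs(runs, rng), width) >= observed
        for _ in range(n_resamples)
    )
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_resamples + 1)
```

Calling `bootstrap_p_value(sequence, width=6)` returns the observed count and an empirical p-value. The number of resamples controls the resolution of that estimate, which is why a bootstrap of this kind can remain comparatively cheap.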

References

[1] Day GR, Blake RD. Statistical significance of symmetrical and repetitive segments in DNA. Nucleic Acids Res. 1982;10(24):8323-39.
[2] Saurin W. Repetitive palindromic sequences in Escherichia coli. Detection and characterization with a new computer program. Comput Appl Biosci. 1987;3(2):121-7.
[3] Suboch GM, Sprizhitsky YuA, Alexandrov AA. Occurrence of homopurine-homopyrimidine mirror repeats in natural DNAs. Biopolym Cell. 1989;5(4):24-30.
[4] Sprizhitskii IuA, Nechipurenko IuD, Aleksandrov AA, Vol'kenshtein MV. Characteristics of nucleotide blocks in coding and non-coding DNA sequences from different organisms. Mol Biol (Mosk). 1988;22(2):338-56.
[5] Efron B. Bootstrap Methods: Another Look at the Jackknife. Ann Stat. 1979;7(1):1-26.
[6] Boos DD, Monahan JF. Bootstrap methods using prior information. Biometrika. 1986;73(1):77-83.
[7] GenBank (1986). Genetic sequence data bank, R. 44.0. BBN laboratories, USA.
[8] Borodovskii MIu, Sprizhitskii IuA, Golovanov EI, Aleksandrov AA. Statistical characteristics of primary structures of the functional regions of the Escherichia coli genome. III. Computer recognition of coding regions. Mol Biol (Mosk). 1986;20(5):1390-8.