Statistical significance of the occurrence of some complex nucleotide combinations: comparison of the DNA models

Authors

  • G. M. Suboch, Institute of Molecular Genetics, Academy of Sciences of the USSR, Moscow, USSR
  • Yu. A. Sprizhitsky, Institute of Molecular Genetics, Academy of Sciences of the USSR, Moscow, USSR

DOI:

https://doi.org/10.7124/bc.0000D0

Abstract

A scheme for modeling the DNA chain as a sequence of nucleotide runs of different lengths is presented. The advantages of this method and the range of its application are discussed. A procedure is suggested for estimating, by the Monte Carlo method, the statistical significance of the occurrence of some complex sequence structures in DNA. It uses a bootstrap algorithm and requires a comparatively small number of calculations. Three different models used to derive such estimates for the frequencies of homopurine-homopyrimidine mirror repeats in the DNA of phage K and in rodent noncoding regions are compared.
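The abstract does not spell out the procedure, so the following is only a minimal Python sketch of the general idea, not the authors' algorithm. It assumes that a "run" is a maximal stretch of purines (A, G) or pyrimidines (C, T), that a mirror repeat is a fixed-width window lying inside one such run and reading the same in both directions, and that surrogate sequences are built by redrawing runs with replacement from the observed runs of the same class. All function names, the window width, and the number of resamples are hypothetical choices made for illustration.

```python
import random
from itertools import groupby

PURINES = set("AG")  # assumption: any other letter is treated as a pyrimidine


def to_runs(seq):
    """Split a sequence into maximal purine or pyrimidine runs."""
    return ["".join(g) for _, g in groupby(seq, key=lambda b: b in PURINES)]


def count_mirror_repeats(seq, width=8):
    """Count windows of the given width that are all-purine or
    all-pyrimidine and equal to their own reverse."""
    count = 0
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        is_purine = [b in PURINES for b in window]
        homogeneous = all(is_purine) or not any(is_purine)
        if homogeneous and window == window[::-1]:
            count += 1
    return count


def resample_by_runs(seq, rng):
    """Build a surrogate sequence: keep the observed alternation of run
    classes, but redraw each run with replacement from the observed runs
    of the same class (a bootstrap over runs)."""
    runs = to_runs(seq)
    pools = {True: [], False: []}
    for run in runs:
        pools[run[0] in PURINES].append(run)
    return "".join(rng.choice(pools[run[0] in PURINES]) for run in runs)


def monte_carlo_p_value(seq, width=8, n_resamples=200, seed=0):
    """Estimate how often surrogates contain at least as many mirror
    repeats as the observed sequence."""
    rng = random.Random(seed)
    observed = count_mirror_repeats(seq, width)
    hits = sum(
        count_mirror_repeats(resample_by_runs(seq, rng), width) >= observed
        for _ in range(n_resamples)
    )
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_resamples + 1)
```

Calling `monte_carlo_p_value(sequence)` on a DNA string returns the observed count and an empirical p-value. Because each surrogate reuses whole observed runs, the run-length distribution of each class is approximately preserved, which is one plausible reading of why a run-based model suits this kind of test.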

References

Day GR, Blake RD. Statistical significance of symmetrical and repetitive segments in DNA. Nucleic Acids Res. 1982;10(24):8323-39.

Saurin W. Repetitive palindromic sequences in Escherichia coli. Detection and characterization with a new computer program. Comput Appl Biosci. 1987;3(2):121-7.

Suboch GM, Sprizhitsky YuA, Alexandrov AA. Occurrence of homopurine-homopyrimidine mirror repeats in natural DNAs. Biopolym Cell. 1989;5(4):24-30.

Sprizhitskii IuA, Nechipurenko IuD, Aleksandrov AA, Vol'kenshtein MV. Characteristics of nucleotide blocks in coding and non-coding DNA sequences from different organisms. Mol Biol (Mosk). 1988;22(2):338-56.

Efron B. Bootstrap methods: another look at the jackknife. Ann Stat. 1979;7(1):1-26.

Boos DD, Monahan JF. Bootstrap methods using prior information. Biometrika. 1986;73(1):77-83.

GenBank (1986). Genetic sequence data bank, R. 44.0. BBN laboratories, USA.

Borodovskii MIu, Sprizhitskii IuA, Golovanov EI, Aleksandrov AA. Statistical characteristics of primary structures of the functional regions of the Escherichia coli genome. III. Computer recognition of coding regions. Mol Biol (Mosk). 1986;20(5):1390-8.

Published

1989-07-20

Section

Structure and Function of Biopolymers