
Abstract

Grant Number: 1Z01BC008396-13
Project Title: Molecular Information Theory
PI Information:
Name: SCHNEIDER, THOMAS W.
Email: schneidert@saic.com

Abstract: Information theory is a powerful tool for understanding the DNA and RNA patterns that define genetic control systems. My theoretical work is divided into several levels. Level 0, briefly described below, is the study of genetic sequences bound by proteins or other macromolecules. The success of this theory suggested that other aspects of information theory should also apply to molecular biology. Level 1 theory introduces the more general concept of the molecular machine, and the concept of a machine capacity equivalent to Shannon's channel capacity. In Level 2, the Second Law of Thermodynamics is connected to the capacity theorem. This connection defines the limits of Maxwell's Demon and of future molecular computers. The project also has three interrelated activities: theory, computer analysis and genetic engineering experiments.
In Level 0, I showed that binding sites on nucleic acids usually contain just about the amount of information needed for molecules to find the sites in the genome. Apparent exceptions to this "working hypothesis" have revealed many new phenomena. The first major anomaly was found at bacteriophage T7 promoters, which conserve twice as much information as the polymerase requires to locate them. The most likely explanation is that a second protein binds to the DNA. In another case, we discovered that the F incD region has a three-fold excess of conservation, which implies that three proteins bind there. We are investigating both anomalies experimentally.
Two graphical methods have been invented to display the structure of binding sites. A sequence logo shows the average patterns in a set of binding sites. The recently invented walker shows individual binding sites. Displaying many walkers simultaneously has become such a powerful tool for investigating genetic structure that it will undoubtedly replace consensus sequences. Walkers can be used to distinguish mutations from polymorphisms, and this has clinical applications.
See http://www.lecb.ncifcrf.gov/~toms/schneider.html for further information. Z01 BC 08396-11
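The Level 0 comparison described in the abstract, information conserved at a set of binding sites versus information needed to locate those sites in the genome, can be illustrated with a minimal Python sketch. It is not taken from the project's own software. It assumes the definitions commonly used for this comparison: the information at each aligned position is 2 bits minus the Shannon entropy of the bases observed there, and the information needed to find num_sites sites among genome_positions possible positions is log2(genome_positions / num_sites). The function names, the toy sites, and the genome size are hypothetical, and the small-sample correction that a real analysis would need is omitted.

```python
import math
from collections import Counter


def position_information(sites):
    """Information (bits) at each position of a set of aligned DNA sites.

    Assumes equiprobable background bases (2 bits maximum per position)
    and applies no small-sample correction.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must be aligned to the same length")
    info = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        total = sum(counts.values())
        # Shannon entropy of the observed base frequencies at this position.
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        info.append(2.0 - entropy)
    return info


def r_sequence(sites):
    """Total information conserved across the aligned sites, in bits."""
    return sum(position_information(sites))


def r_frequency(genome_positions, num_sites):
    """Information needed to pick num_sites out of genome_positions, in bits."""
    return math.log2(genome_positions / num_sites)


if __name__ == "__main__":
    # Hypothetical toy data, invented purely for illustration.
    toy_sites = ["ACGT", "ACGT", "ACGA", "ACGC"]
    conserved = r_sequence(toy_sites)
    needed = r_frequency(genome_positions=1024, num_sites=len(toy_sites))
    print(f"conserved: {conserved:.2f} bits, needed: {needed:.2f} bits")
    print(f"ratio conserved/needed: {conserved / needed:.2f}")
```

Under these assumptions, a ratio near 1 corresponds to the "working hypothesis" in the abstract, while a ratio near 2 or 3 would correspond to the kind of excess conservation reported for the T7 promoters and the F incD region.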

Public Health Relevance:
Not available.

Thesaurus Terms:
computer assisted sequence analysis, computer data analysis, genetic promoter element, information theory, nucleic acid sequence, nucleic acid structure, structural biology, virus genetics
DNA binding protein, bacteriophage T7, binding site, informatics, intermolecular interaction, thermodynamics

Institution:
Fiscal Year: 2001
Department:
Project Start:
Project End:
ICD: DIVISION OF BASIC SCIENCES - NCI
IRG: LECB

