Biochemistry Publications

Discovery and validation of information theory-based transcription factor and cofactor binding site motifs.

Ruipeng Lu
Eliseos J Mucaki
Peter K Rogan

Document Type

Article

Publication Date

3-17-2017

Journal

Nucleic Acids Research

Volume

45

Issue

5

First Page

e27

Last Page

URL with Digital Object Identifier

https://doi.org/10.1093/nar/gkw1036

Abstract

Data from ChIP-seq experiments can derive the genome-wide binding specificities of transcription factors (TFs) and other regulatory proteins. We analyzed 765 ENCODE ChIP-seq peak datasets of 207 human TFs with a novel motif discovery pipeline based on recursive, thresholded entropy minimization. This approach, while obviating the need to compensate for skewed nucleotide composition, distinguishes true binding motifs from noise, quantifies the strengths of individual binding sites based on computed affinity and detects adjacent cofactor binding sites that coordinate with the targets of primary, immunoprecipitated TFs. We obtained contiguous and bipartite information theory-based position weight matrices (iPWMs) for 93 sequence-specific TFs, discovered 23 cofactor motifs for 127 TFs and revealed six high-confidence novel motifs. The reliability and accuracy of these iPWMs were determined via four independent validation methods, including the detection of experimentally proven binding sites, explanation of effects of characterized SNPs, comparison with previously published motifs and statistical analyses. We also predict previously unreported TF coregulatory interactions (e.g. TF complexes). These iPWMs constitute a powerful tool for predicting the effects of sequence variants in known binding sites, performing mutation analysis on regulatory SNPs and predicting previously unrecognized binding sites and target genes.
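The idea of scoring individual binding sites in bits can be illustrated with a small sketch. This is not the authors' motif discovery pipeline, which is not described in enough detail here to reproduce; it is a minimal example of a Shannon information weight matrix, assuming an equiprobable background (2 bits per position), a pseudocount in place of a small-sample correction, and a handful of made-up aligned sites. The function names and the toy sequences are hypothetical.

```python
import math

BASES = "ACGT"


def information_weight_matrix(sites, pseudocount=0.5):
    """Build per-position weights 2 + log2 f(b, l) from aligned sites.

    Assumptions (not from the paper): equiprobable background, and a
    pseudocount instead of a small-sample correction term.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    n = len(sites)
    matrix = []
    for pos in range(length):
        column = [s[pos] for s in sites]
        weights = {}
        for base in BASES:
            freq = (column.count(base) + pseudocount) / (n + 4 * pseudocount)
            weights[base] = 2.0 + math.log2(freq)
        matrix.append(weights)
    return matrix


def site_information(matrix, seq):
    """Sum the weights of the bases in seq: the site's information in bits."""
    if len(seq) != len(matrix):
        raise ValueError("sequence length must match the matrix width")
    return sum(column[base] for column, base in zip(matrix, seq))


# Hypothetical toy alignment, for illustration only.
toy_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTCA"]
toy_matrix = information_weight_matrix(toy_sites)

strong = site_information(toy_matrix, "TGACTCA")  # matches the alignment
weak = site_information(toy_matrix, "AAAAAAA")    # mostly mismatches
print(f"strong site: {strong:.2f} bits, weak site: {weak:.2f} bits")
```

In this scheme a sequence that resembles the aligned sites scores well above zero bits, while a sequence that mismatches at most positions scores below zero, which is the sense in which a weight matrix of this kind can rank candidate sites by strength.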

Notes

This article was published by Oxford University Press in Nucleic Acids Research and is available open access at: https://doi.org/10.1093/nar/gkw1036

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Citation of this paper:

Lu R, Mucaki E and Rogan PK. Discovery and Validation of Information Theory-Based Transcription Factor and Cofactor Binding Site Motifs, Nucleic Acids Research. 45(5): e27, 2017

Included in

Biochemistry Commons, Bioinformatics Commons, Computational Biology Commons, Genomics Commons
