Bioinformatics; Computational Biology – Dr. Baldi’s group works at the intersection of biological and computer sciences, using probabilistic/machine learning techniques to address biological problems and mine large data sets produced by massive data acquisition technologies, such as genome sequencing, high-throughput drug screening, and DNA microarrays. Current projects include the prediction of protein secondary and tertiary structure, the study of DNA structure in relation to several biological processes (protein binding, gene regulation, triplet repeat expansion diseases), and the analysis of gene expression data. Dr. Baldi’s group is building a suite of genomics and proteomics programs for the prediction of protein structure and function and the analysis of microarray data.
Recent Publications
- Whisenant TC, Ho DT, Benz RW, Rogers JS, Kaake RM, Gordon EA, Huang L, Baldi P, Bardwell L. (2010) Computational prediction and experimental verification of new MAP kinase docking sites and substrates including Gli transcription factors. PLoS Comput Biol. 6(8). pii: e1000908.
- Compani B, Su T, Chang I, Cheng J, Shah KH, Whisenant T, Dou Y, Bergmann A, Cheong R, Wold B, Bardwell L, Levchenko A, Baldi P, Mjolsness E. (2010) A Scalable and Integrative System for Pathway Bioinformatics and Systems Biology. Adv Exp Med Biol. 680:523-534.
- Cruz-Fisher MI, Cheng C, Sun G, Pal S, Teng A, Molina DM, Kayala MA, Vigil A, Baldi P, Felgner PL, Liang X, de la Maza LM. (2010) Identification of immunodominant antigens by probing a whole Chlamydia ORFome microarray using sera from immunized mice. Infect Immun. [Epub ahead of print]
- Daily K, Rigor P, Christley S, Xie X, Baldi P. (2010) Data structures and compression algorithms for high-throughput sequencing technologies. BMC Bioinformatics. 11:514.
- Magnan CN, Zeller M, Kayala MA, Vigil A, Randall A, Felgner PL, Baldi P. (2010) High-throughput prediction of protein antigenicity using protein microarray data. Bioinformatics. [Epub ahead of print]
- Kao A, Chiu CL, Vellucci D, Yang Y, Patel VR, Guan S, Randall A, Baldi P, Rychnovsky SD, Huang L. (2010) Development of a novel cross-linking strategy for fast and accurate identification of cross-linked peptides of protein complexes. Mol Cell Proteomics. [Epub ahead of print]
- Nasr R, Hirschberg DS, Baldi P. (2010) Hashing algorithms and data structures for rapid searches of fingerprint vectors. J Chem Inf Model. 50(8):1358-68.
- Baldi P, Nasr R. (2010) When is chemical similarity significant? The statistical distribution of chemical similarity scores and its extreme values. J Chem Inf Model. 50(7):1205-22.
- Liang L, Leng D, Burk C, Nakajima-Sasaki R, Kayala MA, Atluri VL, Pablo J, Unal B, Ficht TA, Gotuzzo E, Saito M, Morrow WJ, Liang X, Baldi P, Gilman RH, Vinetz JM, Tsolis RM, Felgner PL. (2010) Large scale immune profiling of infected humans and goats reveals differential recognition of Brucella melitensis antigens. PLoS Negl Trop Dis. 4(5):e673.
- Mochon AB, Jin Y, Kayala MA, Wingard JR, Clancy CJ, Nguyen MH, Felgner P, Baldi P, Liu H. (2010) Serological profiling of a Candida albicans protein microarray reveals permanent host-pathogen interplay and stage-specific responses during candidemia. PLoS Pathog. 2010 Mar 26;6(3):e1000827. Erratum in: PLoS Pathog 6(9).
- Guzmán DL, Randall A, Baldi P, Guan Z. (2010) Computational and single-molecule force studies of a macro domain protein reveal a key molecular determinant for mechanical stability. Proc Natl Acad Sci U S A. 107(5):1989-94. Epub 2010 Jan 13.
- Baldi P, Itti L. (2010) Of bits and wows: A Bayesian theory of surprise with applications to attention. Neural Netw. 23(5):649-66. Epub 2009 Dec 28.
- Molina DM, Pal S, Kayala MA, Teng A, Kim PJ, Baldi P, Felgner PL, Liang X, de la Maza LM. (2010) Identification of immunodominant antigens of Chlamydia trachomatis using proteome microarrays. Vaccine. 28(17):3014-24. Epub 2009 Dec 29.
- Chen JH, Baldi P. (2009) No electron left behind: a rule-based expert system to predict chemical reactions and reaction mechanisms. J Chem Inf Model. 49(9):2034-43.
- Felgner PL, Kayala MA, Vigil A, Burk C, Nakajima-Sasaki R, Pablo J, Molina DM, Hirst S, Chew JS, Wang D, Tan G, Duffield M, Yang R, Neel J, Chantratita N, Bancroft G, Lertmemongkolchai G, Davies DH, Baldi P, Peacock S, Titball RW. (2009) A Burkholderia pseudomallei protein microarray reveals serodiagnostic and cross-reactive antigens. Proc Natl Acad Sci U S A. 106(32):13499-504. Epub 2009 Jul 28.
- Baldi P, Hirschberg DS. (2009) An intersection inequality sharper than the Tanimoto triangle inequality for efficiently searching large databases. J Chem Inf Model. 49(8):1866-70.
- Magnan CN, Randall A, Baldi P. (2009) SOLpro: accurate sequence-based prediction of protein solubility. Bioinformatics. 25(17):2200-7. Epub 2009 Jun 23.
- Brandon MC, Wallace DC, Baldi P. (2009) Data structures and compression algorithms for genomic sequence data. Bioinformatics. 25(14):1731-8. Epub 2009 May 15.
- Swamidass SJ, Azencott CA, Lin TW, Gramajo H, Tsai SC, Baldi P. (2009) Influence relevance voting: an accurate and interpretable virtual high throughput screening method. J Chem Inf Model. 49(4):756-66.
- Nasr RJ, Swamidass SJ, Baldi PF. (2009) Large scale study of multiple-molecule queries. J Cheminform. 1(1):7.
- Sweredoski MJ, Baldi P. (2009) COBEpro: a novel system for predicting continuous B-cell epitopes. Protein Eng Des Sel. 22(3):113-20. Epub 2008 Dec 10.