Nancy R. Zhang

Assistant Professor of Statistics, Stanford University

Contact

Snail mail: 390 Serra Mall, Department of Statistics, Stanford University, Stanford, CA 94305.
Email: [email protected]
Fax: 650-725-8977


Click here for a pdf copy of my CV.

Current research interests

My research is data-driven, with the data coming primarily from current biological applications. Today's data sets are rich in structure and high in dimension, motivating new statistical models as well as new perspectives on classic statistical concepts. I am currently focusing on the detection of genomic variation from high-density SNP chips and next-generation sequencing experiments. These applications motivate new methods in change-point detection, scan statistics, and model and variable selection.

Publications

Siegmund, D., Yakir, B. and Zhang, N.R., 2011, The false discovery rate for scan statistics, Biometrika, in press.

Zhang, N.R. and Siegmund, D., 2011, Model Selection for High Dimensional, Multi-sequence Change-point Problems, Statistica Sinica, in press.

Muralidharan, O., Natsoulis, G., Bell, J., Newburger, D., Xu, H., Keta, I., Ji, H. and Zhang, N., 2011, A Cross-Sample Statistical Model for SNP Detection in Short-Read Sequencing Data, Nucleic Acids Research, in press.

Efron, B. and Zhang, N.R., 2011, False Discovery Rates and Copy Number Variation, Biometrika, 98, 251-271.

Download pdf, supplementary materials.

Siegmund, D.O., Yakir, B. and Zhang, N.R., 2011, Detecting simultaneous variant intervals in aligned sequences.  Annals of Applied Statistics, 5, 645-668.

Download pdf.      

Chan, H.P.*, Zhang, N.R.*, and Chen, Louis H.S., 2010, Importance sampling of word patterns in DNA and protein sequences.   Journal of Computational Biology, 17, 1697-1709.

Download pdf.  

Chen, H., Xing, H. and Zhang, N.R., 2011, Stochastic segmentation models for allele-specific copy number estimation with SNP-array data, PLoS Computational Biology, 7, e1001060.

Download pdf, software, user guide, example code.   


Bickel, P., Boley, N., Brown, B., Huang, H. and Zhang, N.R., 2010, Non-parametric methods for genomic inference.  Annals of Applied Statistics, 4, 1660-1697.

 

Download pdf, software.  

Li, F and Zhang, NR, 2010, Bayesian Variable Selection in Structured High-Dimensional Covariate Spaces with Applications in Genomics. JASA Theory and Methods, 105, 1202-1214.

Download pdf, software.  

Siegmund, D.O., Yakir, B. and Zhang, N.R., 2010, Tail approximations for maxima of random fields by likelihood ratio transformations.  Sequential Analysis, 29, 245-262.

Zhang, N.R., Siegmund, D.O., Ji, H., and Li, J. 2010, Detecting simultaneous change-points in multiple sequences. Biometrika, 97,  631-645.

Download pdf, supplementary, software.                       

Zhang, N.R., 2010, DNA copy number profiling in normal and tumor genomes.  Frontiers in Computational and Systems Biology, ed. Jianfeng Feng, Wenjiang Fu and Fengzhu Sun, pp. 259-281, Springer-Verlag: London. 

Download pdf.  

Zhang, NR, Senbabaoglu, Y and Li, J, 2010, Joint Estimation of DNA Copy Number from Multiple Platforms.  Bioinformatics, 26, 153-160.

Download pdf, software.  

Chan, HP, Tu, IP and Zhang, NR, 2008, Boundary Crossing Probability Computations in the Analysis of Scan Statistics, in Scan Statistics: Methods and Applications, ed. Glaz, J., Pozdnyakov, V. and Wallenstein, S., 89-105 (Boston: Birkhäuser).

Download pdf.  

Lai, TL, Xing, H and Zhang, NR, 2008, Stochastic segmentation models for array-based comparative genomic hybridization data analysis. Biostatistics 9, 290-307. 

Download pdf, software.  

Zhang, NR, Wildermuth, MC, and Speed, TP, 2008, Transcription factor binding site prediction with multivariate gene expression data. Annals of Applied Statistics 2, 332-365. 

Download pdf, software (begin by reading file analysis_README).  

The ENCODE Consortium, 2007, Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447, 799-816.

Download pdf, software.  

Chan, HP and Zhang, NR, 2007, Scan statistics with weighted observations. JASA Theory and Methods 102, 595-602.

Download pdf, Matlab code for analysis in paper.  

Zhang, NR and Siegmund, DO, 2007, A Modified Bayes Information Criterion with Applications to the Analysis of Comparative Genomic Hybridization Data. Biometrics 63, 22-32.

Download pdf, software.  

           

Submitted Preprints

A Flexible Approach for Targeted Human Genome Resequencing and Variant Discovery.  (with Georges Natsoulis et al.)

Change-point model on non-homogeneous Poisson processes with application in copy number profiling by next-generation DNA sequencing. (with Jeremy Shen.)

Detecting mutations in mixed sample sequencing data using empirical Bayes. (with Omkar Muralidharan et al.)

Multiple hypothesis testing, adjusting for latent variables. (with Yunting Sun and Art Owen.)

 

Awards and Grants

Sloan Fellowship (2011)

New World Silver Medal for Best PhD Thesis in the Mathematical Sciences (2007)

NSF DMS Grant 0906394 “Change-point Problems in Genomic Profiling” (2009)

NSF DMS Grant 1043204 “Statistical Methods for Threat Detection” (2010)

NIH R01 HG006137-01 “Statistical Models for Genome Sequencing and Association” (2011)

  

Courses

         Statistics 191                              Applied statistics.

         Statistics 203                              Introduction to regression models and analysis of variance.

         Statistics 205                              Nonparametric statistics.

         Statistics 215                              Stochastic processes in Biology.

         Statistics 345                              Special topics course on computational biology.  (Spring 2008)

         Statistics 345/Genetics 245        Computational algorithms for statistical genetics. (Spring 2009)

         Statistics 366                              Statistical models in biology.