Recent & Upcoming Events

Feb 20, 2018

Lessons (Not) to be Learned from the Debate about Genetic Privacy by Dr. Ellen Wright Clayton

CPCP Privacy/Fairness Seminar

Oct 31, 2017

The Fairsquare Project: Countering Programs that Discriminate by Dr. Aws Albarghouthi

CPCP Privacy/Fairness Seminar

Oct 24, 2017

Friends Don’t Let Friends Deploy Black-Box Models: The Importance of Intelligibility in Machine Learning for Healthcare by Dr. Rich Caruana

CPCP Privacy/Fairness Seminar

Sep 26, 2017

The Bounty of the Commons by Dr. Casey Greene

CPCP Privacy/Fairness Seminar

Jun 1, 2017

CPCP Third Annual Retreat

The day-long program will feature presentations about using Big Data to improve human health.

Training Resources

CPCP Seminar: Lessons (Not) to be Learned from the Debate about Genetic Privacy Seminar Video

Talk by Ellen Wright Clayton MD, JD, Vanderbilt University

CPCP Seminar: Project Fairsquare Seminar Video

Talk by Aws Albarghouthi PhD, University of Wisconsin Department of Computer Sciences

CPCP Seminar: Friends Don't Let Friends Deploy Black-Box Models: The Importance of Intelligibility in Machine Learning for Healthcare Seminar Video

Talk by Rich Caruana PhD, Microsoft Research

CPCP Seminar: The Bounty of the Commons Seminar Video

Talk by Casey Greene PhD, University of Pennsylvania

Abstract: This is an exciting time in biomedical data science. It is now possible to collect substantial information about individuals and their encounters with health care. Our ultimate goal is to integrate this data, along with data and findings from those engaged in basic science, to identify new opportunities to improve health. Broad data sharing will further our progress towards this goal. However, data sharing poses both cultural and technological challenges. I'll discuss our work to address technical issues, including analysis approaches that lift techniques from the field of software engineering and data sharing approaches that employ deep generative neural networks. I'll also touch on our work to shift cultures, including the research parasite and research symbiont awards (applications for each due Sept 30!).

CPCP 2017 Retreat: Phenotype Models for Breast Cancer Screening Symposium Video

Talk by Beth Burnside and Ming Yuan

Recent Publications

Statistical tests and identifiability conditions for pooling and analyzing multisite datasets. Zhou HH, Singh V, Johnson SC, Wahba G, and the Alzheimer’s Disease Neuroimaging Initiative. Proceedings of the National Academy of Sciences USA, 2018

Integrative genomic analysis predicts causative cis-regulatory mechanisms of the breast cancer-associated genetic variant rs4415084. Zhang Y, Manjunath M, Zhang S, Chasman D, Roy S, Song JS. Cancer Research, 2018

Anxiety-related experience-dependent white matter structural differences in adolescence: A monozygotic twin difference approach. Adluru N, Luo Z, VanHulle CA, Schoen AJ, Davidson RJ, Alexander AL, Goldsmith HH. Scientific Reports, 7(1): 8749, 2017

When can multi-site datasets be pooled for regression? Hypothesis tests, L2-consistency and neuroscience applications. Zhou HH, Zhang Y, Ithapu VK, Johnson SC, Wahba G, Singh V. Proceedings of the International Conference on Machine Learning (ICML), 2017

Riemannian nonlinear mixed effects models: analyzing longitudinal deformations in neuroimaging. Kim HJ, Adluru N, Suri H, Vemuri BC, Johnson SC, Singh V. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017

Recent Resources

atSNP Search software

atSNP Search is a web tool that evaluates the impact of single nucleotide polymorphisms (SNPs) on transcription factor (TF) binding in silico. It statistically quantifies whether any given SNP from dbSNP Build 144 is likely to lead to gain and/or loss of function for binding of any TF from existing TF binding profile databases.

RKColocal software

RKColocal provides basic functions for dual-channel image input/output and colocalization analysis. It includes tools to display the joint distribution of pixel intensities, evaluate the average degree of colocalization in a given region, quantify the degree of colocalization at each pixel, and identify colocalized regions.

CMINT software

Chromatin Module INference on Trees (CMINT) is an algorithm for learning chromatin modules, defined as groups of genomic loci that have similar chromatin states. Chromatin states in turn are defined by a combination of chromatin mark profiles.

scDD software

scDD is an R package to identify genes with distributional changes across conditions in a single-cell RNA-seq experiment.

scPattern software

scPattern is an R package to identify and classify gene expression changes in ordered single-cell RNA-seq experiments.