Research
Current Research Interests
I am broadly interested in problems at the intersection of biology and machine learning. My current interests include:
- Generative models and pretraining for proteins and chemistry
- Machine learning for protein engineering
- Uncertainty quantification in neural networks
Publications
Evolutionary velocity with protein language models. Brian L. Hie, Kevin K. Yang, and Peter S. Kim. Cell Systems, 2022. 10.1016/j.cels.2022.01.003
Machine learning modeling of family wide enzyme-substrate specificity screens. Samuel Goldman, Ria Das, Kevin K Yang, Connor W Coley. PLOS Computational Biology, 2022. 10.1371/journal.pcbi.1009853
A topological data analytic approach for discovering biophysical signatures in protein dynamics. Wai Shing Tang, Gabriel Monteiro da Silva, Henry Kirveslahti, Erin Skeens, Bibo Feng, Timothy Sudijono, Kevin K. Yang, Sayan Mukherjee, Brenda Rubenstein, Lorin Crawford. PLOS Computational Biology, 2022. 10.1371/journal.pcbi.1010045
Adaptive machine learning for protein engineering. Brian L. Hie and Kevin K. Yang. Current Opinion in Structural Biology, 2022. 10.1016/j.sbi.2021.11.002
FLIP: Benchmark tasks in fitness landscape inference for proteins. Christian Dallago, Jody Mou, Kadina E. Johnston, Bruce J. Wittmann, Nicholas Bhattacharya, Samuel Goldman, Ali Madani, Kevin K. Yang. NeurIPS 2021 Datasets and Benchmarks Track. 10.1101/2021.11.09.467890
Protein sequence design with deep generative models. Zachary Wu, Kadina E. Johnston, Frances H. Arnold, and Kevin K. Yang. Current Opinion in Chemical Biology, 2021. 10.1016/j.cbpa.2021.04.004
Learned embeddings from deep learning to visualize and predict protein sets. Christian Dallago, Konstantin Schütze, Michael Heinzinger, Tobias Olenyi, Maria Littmann, Amy X. Lu, Kevin K. Yang, Seonwoo Min, Sungroh Yoon, James T. Morton, Burkhard Rost. Current Protocols, May 2021. 10.1002/cpz1.113
Signal Peptides Generated by Attention-Based Neural Networks. Zachary Wu, Kevin K. Yang, Michael J. Liszka, Alycia Lee, Alina Batzilla, David Wernick, David P. Weiner, and Frances H. Arnold. ACS Synthetic Biology, 10 July 2020. 10.1021/acssynbio.0c00219
Machine learning-guided channelrhodopsin engineering enables minimally-invasive optogenetics. Bedbrook CN, Yang KK, Robinson JE, Gradinaru V, Arnold FH. Nature Methods, October 14, 2019. 10.1038/s41592-019-0583-8.
Machine-learning-guided directed evolution for protein engineering. Yang KK, Wu Z, Arnold FH. Nature Methods, July 15, 2019. 10.1038/s41592-019-0496-6.
Batched stochastic Bayesian optimization via combinatorial constraints design. Yang KK, Chen Y, Lee A, Yue Y. AISTATS 2019. arxiv.
The Generation of Thermostable Fungal Laccase Chimeras by SCHEMA-RASPP Structure-Guided Recombination in Vivo. Mateljak I, Rice A, Yang KK, Tron T, Alcalde M. ACS Synthetic Biology, March 21, 2019. 10.1021/acssynbio.8b00509
Learned protein embeddings for machine learning. Yang KK, Wu Z, Bedbrook CN, Arnold FH. Bioinformatics. 23 March 2018. 10.1093/bioinformatics/bty178.
Machine learning to predict eukaryotic expression and plasma membrane localization of engineered integral membrane proteins. Bedbrook CN, Yang KK, Rice AJ, Gradinaru V, Arnold FH. PLOS Computational Biology 13(10): e1005786 (2017). 10.1371/journal.pcbi.1005786.
Structure-Guided SCHEMA Recombination Generates Diverse Chimeric Channelrhodopsins. C. N. Bedbrook, A. J. Rice, K. K. Yang, X. Ding, S. Chen, E. M. LeProust, V. Gradinaru, F. H. Arnold. Proceedings of the National Academy of Sciences 114, E2624-E2633 (2017). 10.1073/pnas.170026911.
Preprints
Exploring evolution-based &-free protein language models as protein function predictors. Mingyang Hu, Fajie Yuan, Kevin K. Yang, Fusong Ju, Jin Su, Hui Wang, Fei Yang, Qiuyang Ding. arxiv
Masked inverse folding with sequence transfer for protein representation learning. Kevin K. Yang, Niccolò Zanichelli, Hugh Yeh. biorxiv
Convolutions are competitive with transformers for protein sequence pretraining. Kevin K. Yang, Alex X. Lu, Nicolo Fusi. biorxiv
Randomized gates eliminate bias in sort-seq assays. Brian L. Trippe, Buwei Huang, Erika A. DeBenedictis, Brian Coventry, Nicholas Bhattacharya, Kevin K. Yang, David Baker, Lorin Crawford. biorxiv