Current Research Interests

I am broadly interested in problems at the intersection of biology and machine learning. My current interests include:

  • Generative models and pretraining for proteins and chemistry
  • Machine learning for protein engineering
  • Uncertainty quantification in neural networks

Publications

Protein sequence design with deep generative models. Zachary Wu, Kadina E. Johnston, Frances H. Arnold, and Kevin K. Yang. Current Opinion in Chemical Biology, 2021. 10.1016/j.cbpa.2021.04.004

Learned embeddings from deep learning to visualize and predict protein sets. Christian Dallago, Konstantin Schütze, Michael Heinzinger, Tobias Olenyi, Maria Littmann, Amy X. Lu, Kevin K. Yang, Seonwoo Min, Sungroh Yoon, James T. Morton, Burkhard Rost. Current Protocols, May 2021. 10.1002/cpz1.113

Signal Peptides Generated by Attention-Based Neural Networks. Zachary Wu, Kevin K. Yang, Michael J. Liszka, Alycia Lee, Alina Batzilla, David Wernick, David P. Weiner, and Frances H. Arnold. ACS Synthetic Biology, 10 July 2020. 10.1021/acssynbio.0c00219

Machine learning-guided channelrhodopsin engineering enables minimally-invasive optogenetics. Bedbrook CN, Yang KK, Robinson JE, Gradinaru V, Arnold FH. Nature Methods, October 14, 2019. 10.1038/s41592-019-0583-8.

Machine-learning-guided directed evolution for protein engineering. Yang KK, Wu Z, Arnold FH. Nature Methods, July 15, 2019. 10.1038/s41592-019-0496-6.

Batched stochastic Bayesian optimization via combinatorial constraints design. Yang KK, Chen Y, Lee A, Yue Y. AISTATS 2019. arXiv.

The Generation of Thermostable Fungal Laccase Chimeras by SCHEMA-RASPP Structure-Guided Recombination in Vivo. Mateljak I, Rice A, Yang KK, Tron T, Alcalde M. ACS Synthetic Biology, March 21, 2019. 10.1021/acssynbio.8b00509

Learned protein embeddings for machine learning. Yang KK, Wu Z, Bedbrook CN, Arnold FH. Bioinformatics, 23 March 2018. 10.1093/bioinformatics/bty178.

Machine learning to predict eukaryotic expression and plasma membrane localization of engineered integral membrane proteins. Bedbrook CN, Yang KK, Rice AJ, Gradinaru V, Arnold FH. PLOS Computational Biology 13(10): e1005786 (2017). 10.1371/journal.pcbi.1005786.

Structure-Guided SCHEMA Recombination Generates Diverse Chimeric Channelrhodopsins. C. N. Bedbrook, A. J. Rice, K. K. Yang, X. Ding, S. Chen, E. M. LeProust, V. Gradinaru, F. H. Arnold. Proceedings of the National Academy of Sciences 114, E2624-E2633 (2017). 10.1073/pnas.170026911.

Preprints

Evolutionary velocity with protein language models. Brian L. Hie, Kevin K. Yang, and Peter S. Kim. bioRxiv.

Adaptive machine learning for protein engineering. Brian L. Hie and Kevin K. Yang. arXiv.

Machine learning-guided channelrhodopsin engineering enables minimally-invasive optogenetics. Bedbrook CN, Yang KK, Robinson JE, Gradinaru V, Arnold FH. bioRxiv.

Machine learning in protein engineering. Yang KK, Wu Z, Arnold FH. arXiv.

Presentations

“Signal Peptides Generated by Attention-Based Neural Networks.” GSK Data Forum. 16 Feb 2021.

“Signal Peptides Generated by Attention-Based Neural Networks.” Boğaziçi University Biotech Conference. 15 Jan 2021.

“Machine optimization and generation of proteins.” Ohio State University Society for Biological Engineering. 6 Nov 2020.

“Probabilistic protein engineering.” Janelia Research Center. 30 May 2019.

“Learning the language of proteins.” Gray-Hill Seminar, Occidental College. 29 June 2018.

“Machine Learning to Predict Eukaryotic Expression and Plasma Membrane Localization of an Integral Membrane Protein.” Proteins Gordon Research Seminar. 18 June 2017.