Responding to 'That's not biology/chemistry!'
When Frances won the Nobel Prize in Chemistry, a Spanish chemist emailed her asking how to respond to colleagues who dismissed the chemist’s work as not chemistry at all because it used machine learning:
A few analytical-chemistry people are criticizing our work because we use machine learning and neural networks as chemometrical tools for solving some identification/discrimination problems. In one particular undergraduate work we use multispectral data (imaging) and neural networks (e.g., CNN programmed in Matlab) for differentiating dogs from wolfs bite-marks.
We’ve been told this (multispectral+CNN) work is not chemistry at all, and that this particular research work is a shame for the chemistry world.
Here is my (lightly-edited) response:
I’m a PhD student in Frances’s lab at Caltech. She forwarded me your email and asked me to respond because my PhD has focused on using machine learning for protein engineering, so I have strong feelings about people who criticize work using machine learning as ‘not chemistry’ or ‘not biology.’
Using multispectral data and a neural network to differentiate dog bites from wolf bites is really cool! It seems like an ideal application for a neural network, because there are sound (chemical) reasons to believe that the bites should generate different spectra, but it may be very difficult to tell the spectra apart by eye.
Here’s what I would say to your critics:
If you collect some spectral data and use linear regression to draw some conclusions from it, nobody will argue that it’s not chemistry. Using a more complex model, such as a neural network, doesn’t magically make your work not chemistry. You’re simply using a very powerful computational tool to draw conclusions from your data. Your subject-area expertise as a chemist is what allows you to choose a reasonable model and evaluate model performance.
Machine learning isn’t the solution to every problem in chemistry, but it’s already proven its ability to solve chemical problems, and will continue to do so.
For example:
ACS Central Science thinks this is chemistry: https://pubs.acs.org/doi/full/10.1021/acscentsci.7b00572
Here’s one in Nature: https://www.nature.com/articles/nature25978
This site has many more papers (but doesn’t seem to be updated anymore): https://github.com/kangway/mlchempapers
In short, machine learning is one of many tools that belong in the chemist’s arsenal.