Learning More, with Less


Practicing chemists solve problems via “chemical intuition”, a quality that lets them skip intermediate details and get to the essential result, even if the outcome is counterintuitive to the uninitiated. There is no human shortcut to building this intuition; chemists hone their skills through years of experience learning and memorizing patterns of molecular structure and reactivity. It is in this spirit that Vijay Pande and co-workers propose, in “Low Data Drug Discovery with One-Shot Learning” in this issue of ACS Central Science, a computational approach to chemical prediction that learns from a small number of examples. The paper touches on many central themes at the intersection of the three main components of computation in chemistry: molecular representation, chemical space exploration, and machine learning to accelerate computation. For discovering new molecules, the enormity of chemical space cannot be overstated; the number of “small” to “medium” sized molecules is estimated to be in the range of 10 to 10^180, a number that is a hundred orders of magnitude larger than the number of atoms in the visible universe. With only a small number of examples, chemists are able to distinguish and assess the potential function of a molecule for a given task. For example, we recently created a “Molecular Tinder” application that helped us in the design of molecules for organic displays. In analogy to the dating application, Molecular Tinder was a voting system that allowed us to harvest information from experimentalists who voted “Yes”, “No”, or “Maybe” on the synthesizability of molecules. The voting results allowed us to design algorithms that preferentially generated molecules with practical synthetic routes, which were eventually synthesized and tested in real devices. Another very important aspect of human intuition is “transferability”, which enables the generalization of knowledge learned in a particular domain to untested domains.
Everyone who has passed an undergraduate organic chemistry test has had to show that they can generalize from one domain to another. This is a much more challenging task for a computer. We are sometimes able to predict molecular properties, with varying degrees of success, using quantum chemistry calculations, but when these simulations are involved, supralinear computational scaling laws hinder the application of most common algorithms to complex molecules. Therefore, if we ever hope to explore chemical space efficiently for successful molecular design, we cannot go unaided by intuition. It is often thought in the artificial intelligence (AI) community that any human decision that can be made in a matter of a few seconds can, in theory, be learned and automated by a computer. There have been many recent examples where deep learning is solving increasingly complex tasks and getting closer to the performance of humans, even surpassing it in certain tasks such as the game of Go, with AlphaGo. This progress has been propelled mainly by two factors: broader availability of data and cheaper

