Article
Finding the K best synthesis plans
Affiliations
Organisations
- (1) University of Southern Denmark, grid.10825.3e, SDU
- (2) University of Vienna, grid.10420.37
- (3) Bioinformatics Group, Department of Computer Science, Interdisciplinary Center for Bioinformatics, Härtelstraße 16-18, 04107, Leipzig, Germany
- (4) Max Planck Institute for Mathematics in the Sciences, grid.419532.8
- (5) Fraunhofer Institute for Cell Therapy and Immunology, grid.418008.5
- (6) Santa Fe Institute, grid.209665.e
- (7) University of Copenhagen, grid.5254.6, KU
Description
In synthesis planning, the goal is to synthesize a target molecule from available starting materials, possibly optimizing costs such as price or environmental impact of the process. Current algorithmic approaches to synthesis planning are usually based on selecting a bond set and finding a single good plan among those induced by it. We demonstrate that synthesis planning can be phrased as a combinatorial optimization problem on hypergraphs by modeling individual synthesis plans as directed hyperpaths embedded in a hypergraph of reactions (HoR) representing the chemistry of interest. As a consequence, a polynomial time algorithm to find the K shortest hyperpaths can be used to compute the K best synthesis plans for a given target molecule. Having K good plans to choose from has many benefits: it makes the synthesis planning process much more robust when in later stages adding further chemical detail, it allows one to combine several notions of cost, and it provides a way to deal with imprecise yield estimates. A bond set gives rise to a HoR in a natural way. However, our modeling is not restricted to bond set based approaches-any set of known reactions and starting materials can be used to define a HoR. We also discuss classical quality measures for synthesis plans, such as overall yield and convergency, and demonstrate that convergency has a built-in inconsistency which could render its use in synthesis planning questionable. Decalin is used as an illustrative example of the use and implications of our results.