Molecular Structure Disassembly Program (MOSDAP): A chemical information model to automate structure-based physical property estimation

Chemical information theory and molecular structure searching have long been used as computational aids by researchers in the pharmaceutical field to estimate molecular structure-property relationships and to assist in drug design. Because they are tailored to these and other specific applications, such tools have been expensive to develop and are typically very specialized. Often, they are not readily available and are not part of the open literature. Because the number of chemicals in commercial use is growing daily (with over 18 million molecular species now catalogued by Chemical Abstracts Service), engineers in the chemical process industries need predictive structure-property algorithms. The most common and useful methods are group contribution methods, which require only the chemical structure of interest. Unfortunately, each group contribution method typically has its own fragment library and specialized rules, making such models difficult to automate for general use by the engineering community. This work, which has culminated in the creation of the Molecular Structure Disassembly Program (MOSDAP) software, focuses on combining and improving upon the best published methods in four areas: (1) lexicographical entry of structures, (2) prescreening methods, (3) abstract representation of molecular structures, and (4) structure manipulation routines. Additional features, such as a custom modification of the published Ullmann substructure search algorithm specific to molecular graphs and an exact cover procedure to resolve structural ambiguities, have been added to address specific problems encountered in group contribution methods. At present, most of the popular published group contribution methods can be automated using MOSDAP as a general engine for converting formula line notation (e.g., SMILES strings) into corresponding sets of functional groups and/or features.
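
As a rough illustration of the exact cover idea mentioned in the abstract, the Python sketch below enumerates every way to partition the heavy atoms of one molecule into non-overlapping fragments. It is not MOSDAP code. The molecule (ethyl acetate), the atom numbering, the group names, and the hard-coded list of candidate matches (standing in for the output of a substructure search) are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import FrozenSet, Iterator, List, Tuple

# A candidate match: (group name, indices of the atoms that the group occupies).
Match = Tuple[str, FrozenSet[int]]

# Hypothetical heavy-atom numbering for ethyl acetate, CH3-C(=O)-O-CH2-CH3:
#   0: CH3 carbon, 1: carbonyl carbon, 2: carbonyl oxygen,
#   3: ester oxygen, 4: CH2 carbon, 5: terminal CH3 carbon
ATOMS: FrozenSet[int] = frozenset(range(6))

# Hypothetical substructure-search output; group names are illustrative only.
MATCHES: List[Match] = [
    ("CH3", frozenset({0})),
    ("CH3CO", frozenset({0, 1, 2})),
    ("CH3COO", frozenset({0, 1, 2, 3})),
    ("COO", frozenset({1, 2, 3})),
    ("CH2O", frozenset({3, 4})),
    ("CH2", frozenset({4})),
    ("CH3", frozenset({5})),
]


def exact_covers(atoms: FrozenSet[int], matches: List[Match]) -> Iterator[List[Match]]:
    """Yield every selection of matches that covers each atom exactly once."""
    if not atoms:
        yield []  # nothing left to cover: one (empty) way to finish
        return

    def candidates(atom: int) -> List[Match]:
        # Matches that contain this atom and touch only still-uncovered atoms.
        return [m for m in matches if atom in m[1] and m[1] <= atoms]

    # Branch on the uncovered atom with the fewest options to prune early.
    pivot = min(atoms, key=lambda a: (len(candidates(a)), a))
    for name, covered in candidates(pivot):
        for rest in exact_covers(atoms - covered, matches):
            yield [(name, covered)] + rest


def group_counts(cover: List[Match]) -> Counter:
    """Collapse a cover into group-name occurrence counts."""
    return Counter(name for name, _ in cover)


covers = list(exact_covers(ATOMS, MATCHES))
for cover in covers:
    print(dict(group_counts(cover)))
```

When more than one cover exists, as it does for this toy input, the fragmentation is ambiguous, and a method-specific rule would still be needed to choose among the alternatives. The sketch only enumerates them.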

Publication Title

Journal of Chemical Information and Computer Sciences