
Electronic data

  • 2021CarruthersPhD

    Final published version, 75.2 MB, PDF document

    Available under license: None


Solubility Prediction From First Principles

Research output: Thesis › Doctoral Thesis

  • James Carruthers
Publication date: 2021
Number of pages: 159
Awarding Institution
  • Lancaster University
Thesis sponsors
  • Syngenta
Award date: 12/08/2021
Original language: English


Solubility is a phenomenon of critical importance in countless areas of nature and industry. Solubility drives geological evolution through sedimentation and erosion. The solubility of pharmaceuticals and agrochemicals determines their efficacy and how they have to be formulated to make the most efficient use of resources. Solubility determines the fate of artificial chemicals in nature. There are many areas of science where recreating the system in a lab environment is physically impossible or prohibitively expensive, so the ability to simulate these systems is a high priority. This thesis explores methods to estimate solubilities from direct simulation of molecular systems and seeks to test their accuracy, precision and efficiency, and how they can be further improved.
The first study seeks to recreate the solubility of urea in water using two different thermodynamic cycles (molecular and atomic routes) and two different sets of force fields (Özpınar and TIP3P versus Hölzl and TIP4P/2005) of significantly different ages. This project is a test of simulation software, to see if the thermodynamic cycles produce the same results, and a test of the force fields, to see if the newer force fields give a better estimate of the solubility of urea in water than the older force fields. Neither set of force fields had actually been tested in this way. The solubilities are also estimated using direct coexistence simulations to test the efficiency of this method with modern software and computing power. According to the direct coexistence method, the newer Hölzl and TIP4P/2005 force fields are closer to reproducing the solubility of urea in water than the older Özpınar and TIP3P force fields, but the simulations take a very long time to equilibrate and a different solubility is obtained depending on whether the initial system is subsaturated or supersaturated. The Özpınar and TIP3P force fields failed to produce sensible chemical potential data. The chemical potentials derived from Hölzl and TIP4P/2005 through the molecular route agree with the direct coexistence results. The atomic route gives too low an estimate of the chemical potential difference between the crystal and solution.
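As a rough illustration of why the chemical potential difference between crystal and solution matters so much, the sketch below uses the textbook ideal-dilute relation x_sat = exp(-Δμ/RT), where Δμ is the solute's reference chemical potential in solution minus its chemical potential in the crystal. This is an assumption made for illustration only: it is not the thermodynamic cycle used in the thesis, the function name `ideal_dilute_solubility` is hypothetical, and a highly soluble solute such as urea would need activity corrections.

```python
import math

# Gas constant in kJ/(mol K)
R = 8.314462618e-3


def ideal_dilute_solubility(delta_mu, temperature):
    """Saturation mole fraction in the ideal-dilute limit (illustrative assumption).

    delta_mu: reference chemical potential of the solute in solution minus
              the chemical potential of the crystal, in kJ/mol.
    temperature: absolute temperature in K.
    """
    return math.exp(-delta_mu / (R * temperature))


def solubility_error_factor(mu_error, temperature):
    """Multiplicative error in solubility caused by an error mu_error (kJ/mol)
    in the chemical potential difference."""
    return math.exp(abs(mu_error) / (R * temperature))
```

Because the dependence is exponential, an error of RT ln 10 in Δμ (a few kJ/mol near room temperature) shifts the estimated solubility by a factor of ten, which is why small systematic errors in a thermodynamic cycle can dominate the result.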
The second study seeks to recreate the temperature/solubility phase diagram of butanol and water using the GAFF and TIP3P force fields, with direct coexistence simulations and with free energy calculations to construct the curves of free energy of mixing. The thermodynamics of mutual solvation are complicated by the competing solvation processes between phases and require a more thorough analysis than the solvation of solids, which potentially means that direct coexistence simulations are competitive. The direct coexistence simulations were much more efficient than anticipated and gave statistically rigorous estimates of the solubilities of butanol and water. Numerically, they were not accurate estimates, but they reproduced the qualitative behaviour of the phase diagram, and the critical temperature of miscibility was closely reproduced at just above 100°C. The free energy calculations failed to produce chemical potential data with the precision required to create the curves of free energy of mixing at any temperature, but the excess chemical potential calculations showed the correct behaviour: electrostatic interactions are more favourable in water than in butanol, and dispersion interactions are more favourable in butanol than in water.
The third study explores the phenomenon of polymorphism, where a molecule can adopt multiple different crystal arrangements depending on temperature and pressure. The stability of polymorphs affects how soluble a molecule is in a particular solvent; higher stability means lower solubility. The drug molecule carbamazepine exists in four known polymorphs. The GAFF force field was tested on how well it can reproduce these polymorphs, judged on crystal unit cell parameters. The chemical potentials of carbamazepine in its four polymorphs and in water were calculated to see how the solubilities of the four polymorphs compare with experimental data. The GAFF force field closely reproduced three of the four polymorphs, with one having issues on a single crystal axis. The stability hierarchy of the four polymorphs was reproduced, but the calculated solubility of carbamazepine was over an order of magnitude lower than experimental data.
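The link between polymorph stability and solubility can be sketched with a standard thermodynamic relation: if all polymorphs dissolve into the same solution and that solution is treated as ideal-dilute (an assumption made here for illustration), the solubility ratio of two forms is exp((μ_A − μ_B)/RT). The function name and the numbers below are hypothetical placeholders, not results from the thesis.

```python
import math

# Gas constant in kJ/(mol K)
R_KJ = 8.314462618e-3


def relative_solubilities(crystal_mu, temperature):
    """Solubility of each polymorph relative to the most stable one.

    crystal_mu: mapping of polymorph label -> crystal chemical potential (kJ/mol).
    temperature: absolute temperature in K.
    Assumes an ideal-dilute solution, so the ratio depends only on the
    difference in crystal chemical potentials.
    """
    mu_ref = min(crystal_mu.values())  # most stable form has the lowest mu
    return {
        form: math.exp((mu - mu_ref) / (R_KJ * temperature))
        for form, mu in crystal_mu.items()
    }


# Made-up placeholder values in kJ/mol, for illustration only.
example_mu = {"form_a": 0.0, "form_b": 1.0, "form_c": 2.5}
ratios = relative_solubilities(example_mu, 298.15)
for form in sorted(ratios, key=ratios.get):
    print(form, round(ratios[form], 2))
```

The most stable form always has a ratio of 1, and every less stable form is predicted to be more soluble, which is the "higher stability means lower solubility" ordering described above.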
In conclusion, these studies show that much progress is still required in general-use force field development before solubility estimation can go mainstream. In some applications, direct coexistence simulations will give faster solubility estimates than free energy calculations, but they cannot give the same thermodynamic insight. For free energy calculations, the thermodynamic cycle should be as simple as possible to avoid unnecessary errors, a thermodynamic adaptation of Occam’s Razor. Finally, there needs to be development of dedicated software for setting up free energy simulations.
There were thousands of simulations in these studies and much time was dedicated to writing input files by hand and troubleshooting errors in them. Dedicated software that automates the process will reduce errors and open up free energy simulations to wider use.