Saturday, October 20, 2018

theoretical chemistry - How to identify hydrogen bonds and other non-covalent interactions from structure considerations?


Chemistry is governed by a wide range of interactions, from ionic and covalent bonding, or other types of strong interactions, towards weaker types of bonding, attraction, or repulsion, that typically occur between molecules. The latter are often referred to as chemical, molecular, or non-covalent interactions. Popular examples include ion-dipole interactions, hydrogen bonds, and London dispersion forces.


These non-covalent interactions play very important roles in DNA, protein-drug interactions, catalyst-substrate interactions, self-assembly, chemical reactions, among others.


How can I identify and analyse hydrogen bonds and other non-covalent interactions, possibly from experimental and calculated structures?



Answer




It is safe to say that there will always be intermolecular forces at play. By the time you consider them, you should already have a good idea of the molecules involved in your system.
Based on the composition and molecular structures you can make certain assumptions. In a molecule it is straightforward to estimate (bond) polarities based on electronegativities, and then to infer from these how the molecules might arrange. I will work through an example based on the interactions between adenine and thymine later.[1]


With the advent of the information age, tools that every chemist has at her or his disposal have become more sophisticated. We have access to many digital resources like databases,[2] or publication servers[3] to retrieve a vast amount of information. Molecular modelling,[4] or even more sophisticated quantum chemical calculations,[5] have become more important; and free tools are available for everyone to use.


With that being said, here are some points that might help you identify hydrogen bonds and other non-covalent interactions. For that purpose, let's have a look at our example molecules:


[Figures: adenine; thymine]


Immediately we can formulate a couple of assumptions based on the schematic representation. In adenine there are five nitrogen atoms, which have a higher electronegativity than carbon. Negative partial charges will therefore be located mostly at these. Hydrogens in organic compounds usually carry positive partial charges, although $\ce{C-H}$ bonds tend to be a lot less polarised than $\ce{N-H}$ bonds. Whenever a hydrogen is involved in a bond, that bond can potentially act as a hydrogen-bond donor (see below).
Similar observations can be made for thymine. Here we have two terminal oxygen atoms, which will carry a negative partial charge, since they have one of the highest electronegativities. These are often able to accept hydrogen bonds. On the other hand we also have $\ce{N-H}$ bonds, which can act as hydrogen bond donors.
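The electronegativity argument can be made slightly more quantitative. The following is a minimal sketch, assuming Pauling electronegativities and taking their difference as a crude measure of bond polarity; the helper `bond_polarity` is a hypothetical name used only for this illustration.

```python
# Pauling electronegativities (standard tabulated values).
PAULING_EN = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44}


def bond_polarity(atom_a, atom_b):
    """Return (difference in electronegativity, symbol of the negative end)."""
    chi_a, chi_b = PAULING_EN[atom_a], PAULING_EN[atom_b]
    negative_end = atom_a if chi_a > chi_b else atom_b
    return abs(chi_a - chi_b), negative_end


# Compare the X-H bonds discussed in the text.
for heavy_atom in ("C", "N", "O"):
    delta, negative_end = bond_polarity(heavy_atom, "H")
    print(f"{heavy_atom}-H: delta chi = {delta:.2f}, "
          f"partial negative charge on {negative_end}")
```

The larger the difference, the more positive the hydrogen and the better the bond should act as a hydrogen-bond donor, which is why $\ce{N-H}$ is expected to be a much better donor than $\ce{C-H}$.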


Charges & Electrostatic Potential Surfaces


For many molecules, structures are readily available. If not, some molecular editors let you optimise built (guessed) structures with their implemented force fields. Based on those you can already do a few analyses. One tool that is quite powerful for various tasks is Avogadro; it lets you read crystal structures, perform basic calculations, and much more. If you are just playing around, this is a really good choice.
For example, I have imported the crystal structure of adenine into Avogadro, optimised it, and calculated the electrostatic potential. Alternatively, after extracting Cartesian coordinates, Molden lets you easily calculate the charges.[6]
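To illustrate what such a potential map encodes, here is a rough sketch that evaluates the electrostatic potential of a set of point charges, $V(\mathbf{r}) = \sum_i q_i / |\mathbf{r} - \mathbf{R}_i|$ in atomic units. The coordinates are those of the C(7)=O(1) carbonyl group of thymine from the appendix; the partial charges of ±0.5 e are hypothetical placeholders, not results from Avogadro or Molden, and `esp_from_point_charges` is a made-up helper name.

```python
import math

ANGSTROM_PER_BOHR = 0.529177


def esp_from_point_charges(point, charges):
    """Electrostatic potential (atomic units) at `point` from point charges.

    `point` and the charge positions are in angstrom;
    `charges` is a list of (charge in e, (x, y, z)) tuples.
    """
    potential = 0.0
    for charge, position in charges:
        r_bohr = math.dist(point, position) / ANGSTROM_PER_BOHR
        potential += charge / r_bohr
    return potential


# C(7)=O(1) of thymine, coordinates in angstrom from the appendix.
carbon = (17.83067, 11.52964, 1.63864)
oxygen = (19.03780, 11.79565, 1.63996)
# HYPOTHETICAL partial charges, chosen only for illustration.
charges = [(+0.5, carbon), (-0.5, oxygen)]

# Probe points 2 angstrom beyond each atom along the C=O axis.
bond_length = math.dist(carbon, oxygen)
unit = [(o - c) / bond_length for c, o in zip(carbon, oxygen)]
beyond_oxygen = tuple(o + 2.0 * u for o, u in zip(oxygen, unit))
beyond_carbon = tuple(c - 2.0 * u for c, u in zip(carbon, unit))

v_oxygen_side = esp_from_point_charges(beyond_oxygen, charges)
v_carbon_side = esp_from_point_charges(beyond_carbon, charges)
print(f"V beyond O: {v_oxygen_side:+.4f} a.u., V beyond C: {v_carbon_side:+.4f} a.u.")
```

The negative potential on the oxygen side is the region a positively polarised hydrogen is attracted to, which is what the coloured surfaces in the molecular editors visualise.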



[Figures: screenshot of Avogadro; screenshot of Molden]


Hydrogen Bonds


Many molecular editors try to guess hydrogen bonds based on their implemented cutoff values. That certainly is very helpful, but not everything can be automatised in this way, and weaker interactions in particular won't be found. One then has to dig a bit deeper.
As a nice and concise example I have picked an intermolecular 2:1 complex between adenine and thymine, for which the crystal structure is available.[1]


schematic representation of adenine-thymine 2:1 complex


There are two principal structural parameters for deciding on hydrogen bonds: (a) The distance between the hydrogen $\ce{H}$ and the hydrogen-bond acceptor $\ce{Y}$ is significantly shorter than the sum of their respective van-der-Waals radii, $d(\ce{XH\bond{...}Y}) < r_\mathrm{vdW}(\ce{H}) + r_\mathrm{vdW}(\ce{Y})$.[7] (b) The angle around the hydrogen is nearly linear, $\angle(\ce{XH\bond{...}Y}) \approx 180^\circ$. For weakly polarised $\ce{XH}$ bonds, isotropic dispersion forces become more important (while the directional electrostatic and covalent contributions decrease), therefore the angle becomes more flexible.
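Both criteria are easy to evaluate from Cartesian coordinates. The following is a minimal sketch, not a definitive implementation: the function names are hypothetical, the angle cutoff of 150° is an arbitrary assumption of mine, and the example atoms N(5), H(6), and N(36) are taken from the optimised geometry in the appendix.

```python
import math


def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees."""
    v1 = [p - q for p, q in zip(a, vertex)]
    v2 = [p - q for p, q in zip(b, vertex)]
    dot = sum(p * q for p, q in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against rounding errors.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def hydrogen_bond_geometry(donor, hydrogen, acceptor):
    """Return (H...Y distance, X-H...Y angle in degrees)."""
    return math.dist(hydrogen, acceptor), angle_deg(donor, hydrogen, acceptor)


def looks_like_hydrogen_bond(donor, hydrogen, acceptor, vdw_sum, min_angle=150.0):
    """Criterion (a): d < sum of vdW radii; criterion (b): angle close to linear.

    `min_angle` is an ASSUMED cutoff, not a value from the literature.
    """
    d, theta = hydrogen_bond_geometry(donor, hydrogen, acceptor)
    return d < vdw_sum and theta >= min_angle


# N(5)-H(6)...N(36), coordinates in angstrom from the appendix.
n5 = (16.88238, 12.55083, 1.67499)
h6 = (17.23771, 13.53695, 1.70257)
n36 = (17.71219, 15.24141, 1.74956)

d, theta = hydrogen_bond_geometry(n5, h6, n36)
# 2.99 angstrom is the sum of the N and H radii (179 pm + 120 pm).
is_hbond = looks_like_hydrogen_bond(n5, h6, n36, vdw_sum=2.99)
print(f"d = {d:.3f} angstrom, angle = {theta:.1f} degrees, hydrogen bond: {is_hbond}")
```

For weakly polarised donors such as $\ce{C-H}$, a looser angle cutoff may be more appropriate, in line with the flexibility mentioned above.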


crystal structure of adenine-thymine 2:1 complex with displayed hydrogen bond distances and angles


We easily see that the bond angles are close to what we expect for hydrogen bonds. I have reproduced a few values from Batsanov's paper below, with the caveat that the value for hydrogen varies strongly with the chemical environment, from $\pu{110 - 161 pm}$, so I used the classic value from Bondi.[7c] Since all the distances are around $\pu{200 pm}$, they are well below the threshold set by criterion (a).


$$ \begin{array}{lr} \text{Element }\ce{Y} & r_\mathrm{vdW}(\ce{Y})/\pu{pm}\\\hline \ce{H} & \approx 120\\ \ce{C} & 196 \\ \ce{N} & 179 \\ \ce{O} & 171 \\\hline \end{array}\hspace{2ex} \begin{array}{lr}\\ \text{Contact }\ce{H\bond{...}Y} & \sum r_\mathrm{vdW}(\ce{Y},\ce{H})/\pu{pm}\\\hline \ce{H\bond{...}C} & 316 \\ \ce{H\bond{...}N} & 299 \\ \ce{H\bond{...}O} & 291 \\\hline \end{array} $$
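As a quick cross-check of the distance criterion, the sketch below compares five $\ce{H\bond{...}Y}$ contacts against the tabulated sums of radii. Note the assumptions: the coordinates come from the optimised geometry in the appendix (not from the crystal structure shown in the figure), the atom labels follow the order of that coordinate list, and for each contact I picked the hydrogen directly bonded to the named donor atom.

```python
import math

# van-der-Waals radii in pm, as tabulated above.
VDW_RADII_PM = {"H": 120, "C": 196, "N": 179, "O": 171}

# label: (hydrogen position, acceptor element, acceptor position), in angstrom
CONTACTS = {
    "N(37)H...O(1)": ((19.98577, 13.44513, 1.68592), "O", (19.03780, 11.79565, 1.63996)),
    "N(5)H...N(36)": ((17.23771, 13.53695, 1.70257), "N", (17.71219, 15.24141, 1.74956)),
    "C(41)H...O(2)": ((15.71578, 15.97320, 1.77996), "O", (14.74303, 13.36808, 1.71115)),
    "N(20)H...O(2)": ((12.92612, 13.11883, 1.71578), "O", (14.74303, 13.36808, 1.71115)),
    "N(3)H...N(24)": ((14.04669, 10.93403, 1.64128), "N", (12.26612, 10.63929, 1.64378)),
}

results = {}
for label, (h_xyz, acceptor, y_xyz) in CONTACTS.items():
    distance_pm = 100.0 * math.dist(h_xyz, y_xyz)  # angstrom -> pm
    limit_pm = VDW_RADII_PM["H"] + VDW_RADII_PM[acceptor]
    results[label] = (distance_pm, limit_pm)
    print(f"{label}: d = {distance_pm:.0f} pm, sum of radii = {limit_pm} pm, "
          f"shorter: {distance_pm < limit_pm}")
```

In this geometry the $\ce{CH\bond{...}O}$ contact is by far the longest and only slightly below the sum of radii, which fits the picture of a much weaker interaction.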


A quite interesting approach to revealing non-covalent interactions was presented by Johnson et al.; the corresponding program is easy to use and only requires Cartesian coordinates.[8]



plot of non-covalent interactions of adenine-thymine 2:1 complex


The surfaces between the molecules represent these interactions: green represents weak interactions, typically found for dispersion; blue represents stronger interactions, typically found for hydrogen bonds; red displays repulsive forces, typically found within ring or cage systems.
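The quantity behind these surfaces is the reduced density gradient, $s = \frac{1}{2(3\pi^2)^{1/3}} \frac{|\nabla\rho|}{\rho^{4/3}}$, which drops towards zero in low-density regions between interacting fragments.[8a] The toy sketch below is not the NCIPLOT code: it assumes a promolecular density built from two hydrogen 1s densities placed 4 bohr apart and evaluates $s$ only along the internuclear axis, where the perpendicular gradient components vanish by symmetry. All function names are hypothetical, and the colouring of the surfaces (by the sign of the second Hessian eigenvalue times the density) is not covered.

```python
import math


def hydrogen_1s_density(r):
    """Hydrogen 1s electron density in atomic units at distance r (bohr)."""
    return math.exp(-2.0 * r) / math.pi


def promolecular_density(x, centres):
    """Sum of atomic densities at position x on the internuclear axis."""
    return sum(hydrogen_1s_density(abs(x - c)) for c in centres)


def reduced_density_gradient(x, centres, step=1e-5):
    """Reduced density gradient s at x, using a central finite difference."""
    rho = promolecular_density(x, centres)
    drho_dx = (promolecular_density(x + step, centres)
               - promolecular_density(x - step, centres)) / (2.0 * step)
    prefactor = 1.0 / (2.0 * (3.0 * math.pi ** 2) ** (1.0 / 3.0))
    return prefactor * abs(drho_dx) / rho ** (4.0 / 3.0)


centres = (-2.0, 2.0)  # two atoms, 4 bohr apart
s_midpoint = reduced_density_gradient(0.0, centres)
s_off_centre = reduced_density_gradient(1.0, centres)
print(f"s at midpoint: {s_midpoint:.3e}, s at x = 1 bohr: {s_off_centre:.3f}")
```

The vanishing gradient at the midpoint, combined with a low density there, is the signature of a non-covalent contact that the isosurfaces pick up.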


If you have access to quantum chemical software, you can also obtain this plot from wave function files (.wfn).


Another possibility is to analyse the electron density in terms of the quantum theory of atoms in molecules (QTAIM).[9] For this you do need a wave function file. The analysis, however, is straightforward and will either yield a bond path or not. If there is a bond path, we can estimate the strength of the bond with the methodology developed by Espinosa et al.[9d] According to this, the bond strength is approximately half the value of the potential energy density at the bond critical point: $$E_\mathrm{H-Bond} = \frac{1}{2}V(r_{\mathrm{BCP}[\ce{XH\bond{...}Y}]})$$
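Wave function analysers usually report the potential energy density in atomic units, so a unit conversion is needed to arrive at $\pu{kJ mol-1}$. A minimal sketch of that arithmetic follows; the input value is a hypothetical number chosen for illustration, not one taken from the calculation discussed here, and the function name is made up.

```python
HARTREE_TO_KJ_PER_MOL = 2625.4996


def espinosa_energy_kj_per_mol(v_bcp_au):
    """Estimate E = V(BCP) / 2, with V given in atomic units."""
    return 0.5 * v_bcp_au * HARTREE_TO_KJ_PER_MOL


# HYPOTHETICAL potential energy density at a bond critical point (a.u.).
v_example = -0.0355
energy = espinosa_energy_kj_per_mol(v_example)
print(f"V(BCP) = {v_example} a.u.  ->  E = {energy:.1f} kJ/mol")
```

Since $V$ is negative at the bond critical point, the estimate comes out negative; its magnitude is what is usually quoted as the bond strength.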


I have performed such a calculation at the DF-B97D3/def2-TZVPP level of theory with Gaussian 09. The optimised geometry is given in the appendix at the end.


QTAIM plot of the adenine-thymine 2:1 complex


$$ \begin{array}{lr} \text{H-Bond} & E_\mathrm{H-Bond}/\pu{kJ mol-1}\\\hline \mathrm{N(37)H \cdots O(1)} & 46.6\\ \mathrm{N(5)H \cdots N(36)} & 38.5\\ \mathrm{C(41)H \cdots O(2)} & 3.2\\ \mathrm{N(20)H \cdots O(2)} & 50.5\\ \mathrm{N(3)H \cdots N(24)} & 29.9\\\hline \end{array} $$


A general word of caution applies to the above: the absolute values are only approximate, but they fall within the expected range. A very nice side effect of this methodology is that it can be applied to intramolecular hydrogen bonds, too.


Concluding remarks


Dispersive interactions and hydrogen bonds are becoming more and more important in rational reaction design, be it for understanding the molecular structure of biomolecules or as a guiding principle for catalyst-substrate interactions. With further development of computer technology, such analyses should become more accessible to everyone. I hope this post demonstrates that gaining more insight can actually be quite easy (and free).





Notes and References




  1. (a) Based on the structure from S. Chandrasekhar, T. R. Naik, S. K. Nayak, T. N. Row, Bioorg. Med. Chem. Lett. 2010, 20 (12), 3530-3533. DOI: 10.1016/j.bmcl.2010.04.131 PMID: 20493694 CSD: 739016 (b) Adenine, CSID: 185 (c) S. Mahapatra, S. K. Nayak, S. J. Prathapa, T. N. Guru Row, Cryst. Growth Des. 2008, 8 (4), 1223–1225. DOI: 10.1021/cg700743w, CSD: 652573 (d) Thymine, CSID: 1103 (e) G. Portalone, L. Bencivenni, M. Colapietro, A. Pieretti, F. Ramondo, Acta Chem. Scand. 1999, 53, 57-68. DOI: 10.3891/acta.chem.scand.53-0057, CSD: 136916




  2. (a) The Cambridge Structural Database (CSD), https://www.ccdc.cam.ac.uk/ (b) Crystallography Open Database (COD), http://www.crystallography.net/cod/ (c) Computational Chemistry Comparison and Benchmark DataBase, http://cccbdb.nist.gov/ (Only for 1799 small molecules and atoms) (d) Handbook of Chemistry and Physics, http://hbcponline.com/faces/contents/ContentsSearch.xhtml (e) ...





  3. (a) SciFinder, https://www.cas.org/products/scifinder (b) Google Scholar, https://scholar.google.de/ (c) Web of Science, (formerly known as Web of Knowledge) http://www.webofknowledge.com/ (d) ...




  4. (a) MolCalc, http://molcalc.org/ (b) Pitt Quantum Repository, https://pqr.pitt.edu/ (At the time of writing it was dead.) Github: pittquantum (c) Many open source molecular editors include the possibility to use force field calculations. For example: Avogadro, molden (d) For more on molecular modelling in the open source domain see S. Pirhadi, J. Sunseri, D. R. Koes, J. Mol. Graph. Model. 2016, 69, 127-143. An updated online version of this catalog can be found at https://opensourcemolecularmodeling.github.io.




  5. (a) For an extensive, but not necessarily complete, list of quantum chemistry software see Wikipedia. (b) For the purpose of this demonstration I will be using the proprietary software Gaussian. (c) To view crystal structures, Mercury can be obtained (for free) from the Cambridge Crystallographic Data Centre (CCDC), which also hosts CSD. https://www.ccdc.cam.ac.uk/solutions/csd-system/components/mercury/




  6. (a) Tutorial for Avogadro (b) Tutorial for molden





  7. (a) A concise (and as far as I can tell newest) list of van-der-Waals radii of many elements can be found in S. S. Batsanov, Inorg. Mat. 2001, 37, 871-885. DOI: 10.1023/A:1011625728803 (mirrored pdf) (b) A list of van-der-Waals radii can also be found on Wikipedia (c) A. Bondi, J. Phys. Chem. 1964, 68, 441-451. DOI: 10.1021/j100785a001




  8. (a) The original publication: E. R. Johnson, S. Keinan, P. Mori-Sánchez, J. Contreras-García, A. J. Cohen, W. Yang, J. Am. Chem. Soc. 2010, 132, 6498-6506. DOI: 10.1021/ja100936w (b) The presentation of the program: J. Contreras-García, E. R. Johnson, S. Keinan, R. Chaudret, J-P. Piquemal, D. N. Beratan, W. Yang, J. Chem. Theory Comput. 2011, 7, 625-632. DOI: 10.1021/ct100641a (c) Download the code: http://www.lct.jussieu.fr/pagesperso/contrera/nciplot.html (d) You'll also need VMD (Visual Molecular Dynamics) from the University of Illinois




  9. (a) A very brief introduction can be found on Wikipedia. The corresponding book: Bader, Richard (1994). Atoms in Molecules: A Quantum Theory. USA: Oxford University Press. ISBN 978-0-19-855865-1. (publisher) (b) Multiwfn - A Multifunctional Wavefunction Analyzer; http://sobereva.com/multiwfn/ corresponding paper: T. Lu, F. Chen, J. Comput. Chem. 2012, 33, 580-592. DOI: 10.1002/jcc.22885 (c) Startup script (and examples) for Linux version: https://github.com/polyluxus/runMultiwfn.bash (shameless self-plug) (d) E. Espinosa, E. Molins, C. Lecomte, Chem. Phys. Lett. 1998, 285, 170–173.







Appendix


Optimised Structure of the adenine-thymine 2:1 complex calculated at DF-B97D3/def2-TZVPP in Gaussian 09 Rev. E.01



45
E(RB97D3/def2TZVPP/W06) = -1388.51095169
O 19.03780 11.79565 1.63996
O 14.74303 13.36808 1.71115
N 15.07940 11.09762 1.64043
H 14.04669 10.93403 1.64128
N 16.88238 12.55083 1.67499
H 17.23771 13.53695 1.70257
C 17.83067 11.52964 1.63864
C 17.29332 10.17534 1.60048
C 15.51719 12.40225 1.67775
C 15.94232 10.03610 1.60357
H 15.46489 9.06076 1.57654
C 18.24710 9.02139 1.56030
H 18.89656 9.08263 0.68030
H 18.90669 9.02997 2.43483
H 17.71085 8.06866 1.53480
N 8.28060 10.09397 1.65692
H 7.68055 9.28386 1.63653
N 8.91986 12.24943 1.71837
N 10.48101 9.01368 1.60709
N 11.91467 12.94472 1.71757
H 12.92612 13.11883 1.71578
H 11.26613 13.71385 1.74609
C 10.03193 11.42408 1.68464
N 12.26612 10.63929 1.64378
C 11.41986 11.70069 1.68284
C 9.65939 10.07380 1.64587
C 7.89700 11.42267 1.70074
H 6.85459 11.71169 1.71748
C 11.75955 9.39245 1.60917
H 12.49648 8.59107 1.57896
N 21.22700 16.46826 1.76840
N 17.31372 17.43807 1.81488
H 16.80939 18.31038 1.84233
N 19.61652 18.27309 1.82815
C 18.69253 17.30751 1.80467
N 17.71219 15.24141 1.74956
N 20.64990 14.21957 1.70686
H 19.98577 13.44513 1.68592
H 21.63800 14.02512 1.69520
C 20.27365 15.50797 1.74532
C 16.77892 16.17095 1.78070
H 15.71578 15.97320 1.77996
C 18.91965 15.92447 1.76371
C 20.85479 17.75435 1.80728
H 21.67003 18.47621 1.82413
