Friday, June 29, 2018

computational chemistry - Structure that breaks InChI


I am currently testing a piece of software that generates the InChI for a given structure. I also want to test error cases, since end users will supply the mol files that serve as input for the software.


I have found this description of what the current version of InChI cannot represent (Technical FAQ, section 4.15, at inchi-trust.org):




InChI currently does not support the representation of:



  • Polymers

  • Complex organometallics

  • Markush structures

  • Mixtures

  • Conformers

  • Excited state and spin isomers

  • Non-local stereochemistry/chirality

  • Topological isomers


  • Cluster molecules

  • Polymorphs

  • Unspecific isotopic enrichment

  • Reactions


Also, InChI is not suitable for very large compounds; technically, InChI input may not contain more than 1023 atoms.


Please note that there are IUPAC InChI subcommittee working groups currently addressing some of these matters. Details of these efforts can be found at: http://iupac.org/web/ins/802



Unfortunately, I lack the chemical background to build a molfile with one of the listed features, i.e. a structure for which no InChI can be generated, so that the software reports an error.
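The size limit quoted above is the one item that can be probed without chemical insight, by generating an oversized input programmatically. The following is a minimal sketch that writes a V3000 molfile for an unbranched carbon chain (the V2000 counts line only has three characters for the atom count, so it cannot express more than 999 atoms). The function name `chain_molfile_v3000` and the zig-zag coordinates are illustrative assumptions, and whether the software under test reads V3000 input at all has to be checked separately.

```python
def chain_molfile_v3000(n_carbons: int) -> str:
    """Build a V3000 molfile for an unbranched carbon chain (hydrogens implicit)."""
    if n_carbons < 2:
        raise ValueError("need at least two carbon atoms")
    # V3000 files keep a placeholder V2000-style counts line ending in 'V3000'.
    counts = "  0  0  0" + " " * 3 + "  0  0" + " " * 12 + "999 V3000"
    lines = [
        f"carbon chain, {n_carbons} atoms",  # header line 1: title
        "",                                  # header line 2: program/timestamp
        "",                                  # header line 3: comment
        counts,
        "M  V30 BEGIN CTAB",
        f"M  V30 COUNTS {n_carbons} {n_carbons - 1} 0 0 0",
        "M  V30 BEGIN ATOM",
    ]
    for i in range(1, n_carbons + 1):
        # Simple 2D zig-zag layout; the exact geometry is arbitrary here.
        x = 1.3 * (i - 1)
        y = 0.75 * (i % 2)
        lines.append(f"M  V30 {i} C {x:.4f} {y:.4f} 0.0000 0")
    lines += ["M  V30 END ATOM", "M  V30 BEGIN BOND"]
    for i in range(1, n_carbons):
        # bond index, bond type (1 = single), first atom, second atom
        lines.append(f"M  V30 {i} 1 {i} {i + 1}")
    lines += ["M  V30 END BOND", "M  V30 END CTAB", "M  END"]
    return "\n".join(lines) + "\n"


# 1100 explicit atoms, i.e. more than the 1023 mentioned in the FAQ.
molfile = chain_molfile_v3000(1100)
```

Writing `molfile` to disk and feeding it to the software gives one test case just above the limit; calling the function with 1023 and 1024 gives the boundary cases.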



Answer




Okay, let's tackle at least one problem here. Consider the rotation of 1,2-dichloroethane (BP86/cc-pVDZ):
rotation of dichloroethane


These conformational changes can be further rationalised:



  • C and C' are equivalent conformations, since they are mirror images of each other.

  • The same applies to B and B'.

  • A and C are local minima and can be referred to as conformers in the sense used above.

  • B and D are transition states; they are not stable.
    If you consider the rotation from A to C as a reaction, such a state marks the maximum energy needed for the conversion. I believe that "reactions" in the list above means that InChI cannot accurately describe a transition state.
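The four stationary points differ only in the Cl-C-C-Cl torsion angle, which can be checked directly from the appendix coordinates. The sketch below uses the common atan2 form of the dihedral formula; the helper names and the choice to report only the magnitude are my own, and the numbers are copied from the 1,2-dichloroethane A to D blocks.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# (Cl, C, C', Cl') positions in angstrom, taken from the appendix.
CONFORMERS = {
    "A": ((1.723641, 0.000000, -0.593797), (0.000000, 0.000000, 0.000000),
          (0.000000, 0.000000, 1.521307), (-1.723641, 0.000000, 2.115104)),
    "B": ((1.518312, -0.758527, -0.631286), (0.030242, 0.085780, -0.007620),
          (-0.030242, 0.085780, 1.528927), (-1.518312, -0.758527, 2.152593)),
    "C": ((1.083384, -1.241726, -0.615213), (0.066377, 0.127741, 0.004558),
          (-0.066377, 0.127741, 1.516749), (-1.083384, -1.241726, 2.136520)),
    "D": ((0.337146, -1.473219, -0.795160), (0.164887, 0.141535, -0.000240),
          (-0.164887, 0.141535, 1.521547), (-0.337146, -1.473219, 2.316468)),
}

# Magnitude only: the sign just distinguishes the mirror-image rotamers.
ANGLES = {label: abs(dihedral_deg(*atoms)) for label, atoms in CONFORMERS.items()}

for label, angle in ANGLES.items():
    print(f"{label}: |Cl-C-C-Cl| = {angle:.1f} degrees")
```

With these geometries the magnitudes should come out near 180° (A), 120° (B), 70° (C) and 0° (D), i.e. anti, eclipsed, gauche and syn. All four share the same connectivity, which is why a connectivity-based identifier cannot tell them apart.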





For the term non-local chirality I suggest you look at BINAP or BINOL.
structure of BINAP
In this case I have used a slightly abridged version of BINAP for educational purposes, where $\ce{Ph -> Me}$.


R-BINAP S-BINAP
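That the two appendix structures really are of opposite handedness can be verified numerically. A minimal sketch, assuming the four atoms picked below (the two carbons of the biaryl bond and the two phosphorus-bearing ring carbons next to them) are a fair probe of the axis: the scalar triple product of these four points is a signed volume that changes sign under reflection. Which enantiomer gets which sign is arbitrary; only the sign change matters.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def signed_volume(p0, p1, p2, p3):
    """Scalar triple product (p1-p0) . ((p2-p0) x (p3-p0))."""
    u, v, w = sub(p1, p0), sub(p2, p0), sub(p3, p0)
    vxw = (v[1] * w[2] - v[2] * w[1],
           v[2] * w[0] - v[0] * w[2],
           v[0] * w[1] - v[1] * w[0])
    return sum(a * b for a, b in zip(u, vxw))


# Order: C(-P), C(axis), C'(axis), C'(-P); coordinates from the appendix.
R_ATOMS = ((0.437392, 1.073619, -0.788457), (-0.086423, -0.055821, -0.031546),
           (-0.090885, -0.047699, 1.480085), (1.148913, -0.158515, 2.237313))
S_ATOMS = ((0.166467, 1.182512, -0.673969), (-0.154302, -0.008671, 0.004167),
           (-0.061735, -0.093981, 1.507135), (-1.110430, 0.372113, 2.322674))

vol_r = signed_volume(*R_ATOMS)
vol_s = signed_volume(*S_ATOMS)
print("opposite handedness:", vol_r * vol_s < 0)
```

No single atom in either structure is a stereocentre; the handedness only shows up in the relative arrangement of the two ring systems, which is what makes this kind of chirality non-local.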


Other wonderful examples of non-local chirality include helicenes, which have helical chirality. Here is one example of a fancy azonia derivative of hexahelicene (parent compound and links). The structure is available at the Cambridge Crystallographic Data Centre (CCDC) Database. I removed the anion and converted it to xyz coordinates which I include in the appendix. Here are the structures, left P-chirality, right M-chirality:
scheme of 8a-azonia[6]helicene
P-8a-azonia[6]helicene M-8a-azonia[6]helicene


Another form of non-local chirality is planar chirality, as can be seen in trans-cyclooctene (the bottom two structures). For further reading I suggest Neuenschwander et al., "The Conformations of Cyclooctene: Consequences for Epoxidation Chemistry."
Structures of cyclooctene





Markush structures are a kind of summary structure for certain compound classes. A very simple example would be a monosubstituted biphenyl system, where the upper structure is a representation of all three below. Note, however, that each individual instance of a Markush structure can be described by InChI without problems.
biphenyl markush structure
More complicated structures are often used in patents and in other scientific publications, usually referring to a very general procedure or mechanism. I found one on Wikimedia Commons that could well represent a million molecules:
markush structure
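Since every concrete instance of a Markush structure is an ordinary molecule, one way to handle such input is to enumerate the instances first and generate an identifier for each. The sketch below does this for the monosubstituted biphenyl example using SMILES templates; the substituent list and the function name `enumerate_markush` are hypothetical choices for illustration, not something taken from the figure.

```python
from itertools import product

# SMILES templates for 2-, 3- and 4-substituted biphenyl.
# "{R}" marks the position of the variable group.
TEMPLATES = {
    "2": "{R}c1ccccc1-c1ccccc1",
    "3": "{R}c1cccc(c1)-c1ccccc1",
    "4": "{R}c1ccc(cc1)-c1ccccc1",
}

# Hypothetical substituents; "C" is a methyl group in SMILES.
SUBSTITUENTS = ["F", "Cl", "Br", "C"]


def enumerate_markush(templates, substituents):
    """Return {label: SMILES} for every position/substituent combination."""
    return {
        f"{position}-{group}": template.format(R=group)
        for (position, template), group in product(templates.items(), substituents)
    }


INSTANCES = enumerate_markush(TEMPLATES, SUBSTITUENTS)
for label, smiles in INSTANCES.items():
    print(label, smiles)
```

Three positions and four substituents already give twelve molecules; the count grows multiplicatively with every additional variable site, which is how a single patent drawing can stand for a million compounds.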




Another way to break InChI is with clusters. Although the definition is not very specific, the $\ce{[KO{}^{\mathit{t}}Bu]_4}$ tetramer (Wikipedia, ChemSpider) certainly contains one. While the monomer can be described via InChI, the tetramer or a polymer should fail. The structure was published by Chisholm et al. in Polyhedron, and the *.cif file can be obtained from the CCDC. I cleaned it up a little and included the *.xyz in the appendix. You can quite nicely see the $\ce{K4}$ tetrahedral cluster in the middle (hydrogens omitted for clarity).
molecular structure of the potassium butoxide tetramer
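The tetrahedral arrangement of the four potassium atoms can be confirmed from the appendix coordinates: in a regular tetrahedron all six vertex-to-vertex distances are equal. A short check, with the K positions copied from the tetramer block (the variable names are mine):

```python
import math
from itertools import combinations

# Potassium positions in angstrom, from the crystal-structure block.
K_ATOMS = [
    (-1.309381, -1.309381, -1.309381),
    (-1.309381, 1.309381, 1.309381),
    (1.309381, 1.309381, -1.309381),
    (1.309381, -1.309381, 1.309381),
]

# All six pairwise K...K distances.
K_DISTANCES = [math.dist(a, b) for a, b in combinations(K_ATOMS, 2)]

for d in K_DISTANCES:
    print(f"{d:.4f}")
```

All six values coincide at about 3.70 Å, so the four K atoms sit on the corners of a regular tetrahedron, with the four oxygen atoms occupying the remaining corners of the distorted cube.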




Last but not least, I'd like to cover complex organometallic molecules, using the Grubbs catalysts as an example. Grubbs shared the 2005 Nobel Prize in Chemistry with Yves Chauvin and Richard R. Schrock for this research. One of the key publications was "Ring-opening metathesis polymerization (ROMP) of norbornene by a Group VIII carbene complex in protic media". The crystal structure can also be obtained from the CCDC. Another example stems from "Synthesis and Applications of $\ce{RuCl2(=CHR')(PR3)2}$: The Influence of the Alkylidene Moiety on Metathesis Activity" and is also available via the CCDC. This structure can be found in the appendix. It is generally known as a first-generation Grubbs catalyst (hydrogens omitted for clarity).
scheme of Grubbs I
molecular structure of Grubbs I






The following appendix lists the nuclear coordinates in *.xyz (XMol) file format, with angstroms as the unit. (Unless noted otherwise, structures were obtained at BP86/cc-pVDZ.)
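The blocks below list only the atom rows. A complete XMol xyz file additionally starts with the atom count on the first line and a comment on the second, so a pasted block needs a small wrapper before most programs will read it. A minimal sketch, where the function name `to_xmol_xyz` and the output column widths are my own choices:

```python
def to_xmol_xyz(title: str, block: str) -> str:
    """Wrap bare 'symbol x y z' rows into a complete XMol xyz file."""
    rows = []
    for line in block.splitlines():
        fields = line.split()
        if not fields:
            continue  # tolerate stray blank lines inside a pasted block
        if len(fields) != 4:
            raise ValueError(f"expected 'symbol x y z', got: {line!r}")
        symbol = fields[0]
        x, y, z = (float(value) for value in fields[1:])
        rows.append(f"{symbol:<2} {x:14.9f} {y:14.9f} {z:14.9f}")
    # Line 1: number of atoms, line 2: free-text comment, then the atoms.
    return "\n".join([str(len(rows)), title, *rows]) + "\n"


DCE_A = """
C        0.000000000      0.000000000      0.000000000
C 0.000000000 0.000000000 1.521307000
Cl 1.723641000 0.000000000 -0.593797000
H -0.486731000 -0.904743000 -0.410753000
H -0.486731000 0.904743000 -0.410753000
Cl -1.723641000 0.000000000 2.115104000

H 0.486731000 -0.904743000 1.932060000
H 0.486731000 0.904743000 1.932060000
"""

xyz_text = to_xmol_xyz("1,2-dichloroethane A", DCE_A)
print(xyz_text)
```

Note that an xyz file carries no bond information, so turning these geometries into molfiles still requires a separate bond-perception step in whatever cheminformatics toolkit is at hand.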


1,2-dichloroethane A


C        0.000000000      0.000000000      0.000000000
C 0.000000000 0.000000000 1.521307000
Cl 1.723641000 0.000000000 -0.593797000
H -0.486731000 -0.904743000 -0.410753000
H -0.486731000 0.904743000 -0.410753000
Cl -1.723641000 0.000000000 2.115104000
H 0.486731000 -0.904743000 1.932060000
H 0.486731000 0.904743000 1.932060000

1,2-dichloroethane B


C        0.030242000      0.085780000     -0.007620000
C -0.030242000 0.085780000 1.528927000
Cl 1.518312000 -0.758527000 -0.631286000
H -0.837191000 -0.442465000 -0.444148000
H 0.070555000 1.115212000 -0.412109000
Cl -1.518312000 -0.758527000 2.152593000
H 0.837191000 -0.442465000 1.965455000
H -0.070555000 1.115212000 1.933416000

1,2-dichloroethane C


C        0.066377000      0.127741000      0.004558000
C -0.066377000 0.127741000 1.516749000
Cl 1.083384000 -1.241726000 -0.615213000
H -0.919786000 0.050600000 -0.492443000
H 0.568277000 1.063384000 -0.317636000
Cl -1.083384000 -1.241726000 2.136520000
H 0.919786000 0.050600000 2.013750000
H -0.568277000 1.063384000 1.838943000

1,2-dichloroethane D


C        0.164887000      0.141535000     -0.000240000
C -0.164887000 0.141535000 1.521547000
Cl 0.337146000 -1.473219000 -0.795160000
H -0.633932000 0.666019000 -0.558099000
H 1.123149000 0.665665000 -0.177529000
Cl -0.337146000 -1.473219000 2.316468000
H 0.633932000 0.666019000 2.079406000
H -1.123149000 0.665665000 1.698836000

R-BINAP


C       -0.086423000     -0.055821000     -0.031546000
C -0.090885000 -0.047699000 1.480085000
C 1.148913000 -0.158515000 2.237313000
C 1.064686000 -0.139089000 3.676951000
C -0.131097000 -0.014450000 4.357670000
C -1.357362000 0.116010000 3.652970000
C -1.345263000 0.107319000 2.211098000
P 2.768926000 -0.321256000 1.451350000
H 1.988600000 -0.233144000 4.265654000
H -0.135982000 -0.008251000 5.459401000
C -2.590782000 0.269421000 4.370415000
C -2.623021000 0.305506000 1.559779000
C 0.437392000 1.073619000 -0.788457000
C 0.410661000 0.991852000 -2.228054000
C -0.079670000 -0.105834000 -2.908992000
C -0.579162000 -1.233570000 -2.204534000
C -0.579999000 -1.219208000 -0.762676000
P 1.107602000 2.557133000 -0.001640000
H 0.790955000 1.839345000 -2.816335000
H -0.077540000 -0.112552000 -4.010726000
C -1.063317000 -2.378355000 -2.921905000
C -1.047368000 -2.424579000 -0.111239000
C -3.792005000 0.431032000 3.705083000
H -2.558632000 0.263827000 5.471798000
H -4.735477000 0.552772000 4.258523000
C -3.797353000 0.457087000 2.279022000
H -2.662047000 0.340912000 0.466043000
H -4.750761000 0.602046000 1.748556000
C -1.503463000 -3.517255000 -0.830391000
H -1.036094000 -2.475958000 0.982488000
H -1.854657000 -4.415344000 -0.299810000
C -1.523650000 -3.499569000 -2.256520000
H -1.052652000 -2.347534000 -4.023279000
H -1.890198000 -4.377463000 -2.809877000
C 4.061075000 -0.509159000 2.730645000
C 2.844045000 -1.846744000 0.446017000
C 1.585856000 3.772639000 -1.280334000
C -0.173490000 3.383258000 1.008127000
H 0.254332000 4.289560000 1.485098000
H -1.067700000 3.703930000 0.418133000
H -0.532498000 2.717241000 1.815764000
H 1.997887000 4.670081000 -0.777298000
H 2.377005000 3.362349000 -1.938832000
H 0.749130000 4.118465000 -1.941170000
H 5.044249000 -0.601334000 2.227532000
H 4.100503000 0.383637000 3.385840000
H 3.944572000 -1.404262000 3.395364000
H 3.842692000 -1.931312000 -0.030903000
H 2.674387000 -2.779210000 1.039384000
H 2.087489000 -1.826871000 -0.361580000

S-BINAP


C       -1.383946000     -3.448543000     -2.256996000
C -1.065989000 -2.266144000 -2.910129000
C -0.652376000 -1.110469000 -2.179491000
C -0.563545000 -1.177237000 -0.738280000
C -0.899026000 -2.413534000 -0.098700000
C -1.299620000 -3.519323000 -0.838209000
C -0.154302000 -0.008671000 0.004167000
C 0.166467000 1.182512000 -0.673969000
C 0.072046000 1.221328000 -2.098779000
C -0.323252000 0.113168000 -2.831301000
C -0.061735000 -0.093981000 1.507135000
C -1.110430000 0.372113000 2.322674000
C -0.989013000 0.267385000 3.742090000
C 0.132224000 -0.289631000 4.336601000
C 1.210137000 -0.777916000 3.542877000
C 1.113248000 -0.682564000 2.103804000
C 2.206553000 -1.174312000 1.320794000
C 3.329975000 -1.726929000 1.922868000
C 3.419059000 -1.820539000 3.340084000
C 2.377998000 -1.353835000 4.130094000
P -2.590653000 1.186493000 1.502738000
C -3.750658000 -0.293139000 1.402088000
P 0.636241000 2.692004000 0.339668000
C 2.511918000 2.555623000 0.253510000
C -3.385046000 2.079953000 2.947654000
C 0.388676000 4.078351000 -0.898992000
H -1.802841000 0.635802000 4.383638000
H 0.202504000 -0.360978000 5.434128000
H 0.319305000 2.150696000 -2.632672000
H -0.386992000 0.166196000 -3.930250000
H 2.436377000 -1.419019000 5.228476000
H 4.313241000 -2.262708000 3.805535000
H 2.146133000 -1.110484000 0.224695000
H 4.157224000 -2.097887000 1.298347000
H -0.834575000 -2.478936000 0.997042000
H -1.552289000 -4.458551000 -0.322477000
H -1.128190000 -2.199481000 -4.008215000
H -1.700498000 -4.331984000 -2.832408000
H 2.950535000 3.473538000 0.698290000
H 2.842563000 1.687658000 0.858736000
H 2.884117000 2.439555000 -0.786611000
H 0.678948000 5.023670000 -0.395042000
H 0.990357000 3.982867000 -1.828352000
H -0.686746000 4.150262000 -1.160750000
H -4.302108000 2.573768000 2.564446000
H -3.673500000 1.421270000 3.794842000
H -2.700063000 2.872570000 3.312044000
H -4.742112000 0.063129000 1.051704000
H -3.359638000 -1.014277000 0.656570000
H -3.871382000 -0.807607000 2.379049000

M-8a-azonia[6]helicene (from crystal structure)


N        3.565811000      0.259443000     13.724558000
C 7.445232000 -1.196589000 12.564039000
C 8.629416000 -1.762757000 12.151335000
C 8.885324000 -1.978555000 10.783139000
C 7.914184000 -1.694865000 9.864763000
C 6.665024000 -1.171130000 10.265448000
C 5.584378000 -1.088690000 9.330104000
C 4.339207000 -0.800151000 9.738213000
C 4.088732000 -0.406137000 11.083608000
C 2.763235000 -0.187914000 11.543327000
C 2.522579000 0.029096000 12.830042000
C 3.246403000 0.443720000 15.064121000
C 4.198045000 0.692252000 15.970654000
C 5.531025000 0.961394000 15.546639000
C 6.527506000 1.327523000 16.488699000
C 7.769629000 1.675468000 16.086777000
C 8.044342000 1.864594000 14.697195000
C 9.259232000 2.459858000 14.281133000
C 9.466841000 2.806590000 12.982045000
C 8.445715000 2.608977000 12.035743000
C 7.265466000 1.985829000 12.396483000
C 7.061105000 1.553020000 13.720492000
C 5.833567000 0.912900000 14.172787000
C 4.884651000 0.303088000 13.284635000
C 5.151668000 -0.288539000 12.002691000
C 6.444932000 -0.841371000 11.632053000
H 7.296301000 -1.066868000 13.466331000
H 9.304643000 -2.036748000 12.840647000
H 9.760406000 -2.388329000 10.496986000
H 8.038683000 -1.964007000 8.943383000
H 5.780686000 -1.333585000 8.395468000
H 3.604979000 -0.909262000 9.084780000
H 2.029733000 -0.315211000 10.922946000
H 1.672696000 0.060618000 13.273677000
H 2.290439000 0.327335000 15.253240000
H 3.954366000 0.824398000 16.914659000
H 6.255785000 1.333585000 17.427224000
H 8.447606000 1.891266000 16.702563000
H 9.925491000 2.630800000 14.988120000
H 10.286033000 3.200604000 12.619714000
H 8.559741000 2.909640000 11.135042000
H 6.570317000 1.891266000 11.711236000


M-8a-azonia[6]helicene (from BP86/STO-3G)


N        0.241946000      3.510759000     13.746203000
C -1.207087000 7.492952000 12.512135000
C -1.801472000 8.692275000 12.089683000
C -2.102492000 8.912745000 10.707738000
C -1.853647000 7.899258000 9.772330000
C -1.278056000 6.643678000 10.182732000
C -1.146616000 5.553783000 9.229249000
C -0.770948000 4.283969000 9.646130000
C -0.392477000 4.045979000 11.023979000
C -0.161103000 2.688046000 11.484550000
C 0.042219000 2.428893000 12.821687000
C 0.400500000 3.183323000 15.136378000
C 0.639392000 4.176721000 16.059802000
C 0.949043000 5.527644000 15.625469000
C 1.361398000 6.519481000 16.597330000
C 1.808770000 7.766658000 16.182869000
C 1.981223000 8.066019000 14.770425000
C 2.627560000 9.287588000 14.362758000
C 2.913363000 9.530579000 13.012425000
C 2.578886000 8.546652000 12.028110000
C 1.916409000 7.363991000 12.392255000
C 1.565834000 7.097902000 13.763105000
C 0.907348000 5.851096000 14.211296000
C 0.281492000 4.885356000 13.292158000
C -0.309154000 5.144896000 11.968134000
C -0.894790000 6.445456000 11.575101000
H -1.007504000 7.339193000 13.585554000
H -2.048406000 9.472285000 12.834293000
H -2.558542000 9.868716000 10.388465000
H -2.123797000 8.035037000 8.708340000
H -1.422274000 5.738729000 8.172836000
H -0.763274000 3.430969000 8.943112000
H -0.251900000 1.843110000 10.776492000
H 0.087281000 1.420761000 13.274365000
H 0.295196000 2.107892000 15.372408000
H 0.697253000 3.916258000 17.133249000
H 1.320711000 6.254708000 17.669775000
H 2.110372000 8.530075000 16.926174000
H 2.921715000 10.016221000 15.141242000
H 3.423497000 10.463796000 12.708003000
H 2.853573000 8.716110000 10.969950000
H 1.691114000 6.611596000 11.618453000

$\ce{KO{}^{\mathit{t}}Bu}$ tetramer (from crystal structure)


K       -1.309381000     -1.309381000     -1.309381000
O -1.313567000 1.313567000 -1.313567000
C -2.107232000 2.107232000 -2.107232000
C -3.502008000 2.298114000 -1.478495000
H -3.842748000 1.465100000 -1.465100000
H -4.060420000 2.821364000 -2.059512000
H -3.516240000 2.821364000 -0.619528000
O -1.313567000 -1.313567000 1.313567000
O 1.313567000 -1.313567000 -1.313567000
K -1.309381000 1.309381000 1.309381000
K 1.309381000 1.309381000 -1.309381000
C -1.478495000 3.502008000 -2.298114000
C -2.298114000 1.478495000 -3.502008000
K 1.309381000 -1.309381000 1.309381000
C -2.107232000 -2.107232000 2.107232000
C 2.107232000 -2.107232000 -2.107232000
O 1.313567000 1.313567000 1.313567000
H -1.465100000 3.842748000 -1.465100000
H -2.059512000 4.060420000 -2.821364000
H -0.619528000 3.516240000 -2.821364000
H -1.465100000 1.465100000 -3.842748000
H -2.821364000 2.059512000 -4.060420000
H -2.821364000 0.619528000 -3.516240000
C -1.478495000 -3.502008000 2.298114000
C -3.502008000 -2.298114000 1.478495000
C -2.298114000 -1.478495000 3.502008000
C 3.502008000 -1.478495000 -2.298114000
C 1.478495000 -2.298114000 -3.502008000
C 2.298114000 -3.502008000 -1.478495000
C 2.107232000 2.107232000 2.107232000
H -1.465100000 -3.842748000 1.465100000
H -2.059512000 -4.060420000 2.821364000
H -0.619528000 -3.516240000 2.821364000
H -3.842748000 -1.465100000 1.465100000
H -4.060420000 -2.821364000 2.059512000
H -3.516240000 -2.821364000 0.619528000
H -1.465100000 -1.465100000 3.842748000
H -2.821364000 -2.059512000 4.060420000
H -2.821364000 -0.619528000 3.516240000
H 1.465100000 -1.465100000 -3.842748000
H 3.842748000 -1.465100000 -1.465100000
H 4.060420000 -2.059512000 -2.821364000
H 3.516240000 -0.619528000 -2.821364000
H 2.059512000 -2.821364000 -4.060420000
H 0.619528000 -2.821364000 -3.516240000
H 1.465100000 -3.842748000 -1.465100000
H 2.821364000 -4.060420000 -2.059512000
H 2.821364000 -3.516240000 -0.619528000
C 1.478495000 2.298114000 3.502008000
C 3.502008000 1.478495000 2.298114000
C 2.298114000 3.502008000 1.478495000
H 1.465100000 1.465100000 3.842748000
H 2.059512000 2.821364000 4.060420000
H 0.619528000 2.821364000 3.516240000
H 3.842748000 1.465100000 1.465100000
H 4.060420000 2.059512000 2.821364000
H 3.516240000 0.619528000 2.821364000
H 1.465100000 3.842748000 1.465100000
H 2.821364000 4.060420000 2.059512000
H 2.821364000 3.516240000 0.619528000

Grubbs I catalyst


Ru      1.923022    1.782458    4.279841
Cl 3.508375 2.054525 6.061616
P 0.620071 3.495015 5.336792
C 2.967497 2.881791 3.239789
Cl 0.101984 1.197998 2.838894
P 2.855052 -0.401312 3.741075
C 2.954465 3.343954 1.839333
Cl 3.310061 4.678711 -2.473972
C 1.880873 3.257547 0.960743
C 1.973325 3.667349 -0.352875
C 3.185197 4.168660 -0.812833
C 4.252303 4.292067 0.012884
C 4.145603 3.890748 1.341393
C -0.637087 4.207002 4.175769
C -1.979932 4.694567 4.692950
C -2.932151 4.983356 3.538285
C -2.339331 5.946324 2.517809
C -0.957803 5.505405 2.056512
C -0.041546 5.252113 3.235271
C 1.483967 4.856669 6.241316
C 0.518279 5.817396 6.934517
C 1.256971 6.738015 7.899778
C 2.346867 7.495106 7.161401
C 3.302557 6.547168 6.465690
C 2.565538 5.630984 5.497585
C -0.333586 2.698072 6.719178
C -1.014762 1.418485 6.241149
C -1.859827 0.796208 7.358000
C -1.043744 0.536540 8.601177
C -0.361483 1.805588 9.075860
C 0.493782 2.427234 7.979254
C 2.628515 -0.975242 1.991593
C 3.236690 -0.059043 0.937820
C 2.687448 -0.416553 -0.435697
C 2.971677 -1.856587 -0.778868
C 2.466883 -2.799363 0.290130
C 2.999176 -2.423218 1.675026
C 4.622054 -0.540862 4.248552
C 5.523290 0.453308 3.517538
C 6.887074 0.518241 4.175936
C 7.549339 -0.842629 4.233326
C 6.649166 -1.859830 4.903939
C 5.271941 -1.927013 4.232992
C 1.995357 -1.737789 4.704830
C 2.010202 -1.452357 6.212537
C 1.292567 -2.535071 7.008639
C -0.120796 -2.764155 6.507353
C -0.110733 -3.109699 5.031603
C 0.567738 -2.010334 4.218268
H 3.716836 3.154747 3.670969
H 1.120535 2.917815 1.276640
H 1.189061 3.632874 -0.962082
H 5.017596 4.645026 -0.262690
H 4.792512 4.003484 1.897392
H -0.815388 3.399122 3.615754
H -1.902150 5.510235 5.223685
H -2.378700 3.974732 5.228704
H -3.726326 5.342882 3.821555
H -3.181986 4.184074 3.125510
H -2.984577 6.066660 1.748479
H -2.250423 6.781889 2.909669
H -1.054116 4.674447 1.537657
H -0.576082 6.145292 1.467384
H 0.723372 4.965838 2.939787
H 0.125317 6.138803 3.761321
H 1.921040 4.375890 6.908583
H 1.676706 6.235725 8.636983
H -0.145301 5.369077 7.433963
H 0.091950 6.422548 6.224250
H 0.676388 7.347621 8.295653
H 2.791319 8.070082 7.723424
H 1.896364 8.099751 6.522077
H 3.914233 7.048872 6.008409
H 3.698100 6.058287 7.055823
H 3.116098 5.021349 5.094850
H 2.158660 6.180195 4.786983
H -1.006024 3.336226 6.935353
H -1.559257 1.598311 5.456258
H -0.325083 0.765200 6.003389
H -2.608220 1.404611 7.524315
H -2.214635 0.029761 7.044111
H -0.418035 -0.117788 8.397718
H -1.627822 0.182306 9.277813
H 0.176244 1.645756 9.829964
H -1.053680 2.460437 9.333028
H 0.877440 3.225032 8.287287
H 1.235327 1.881817 7.763580
H 1.720632 -0.869158 1.853889
H 4.184273 -0.146643 0.965428
H 3.012517 0.820454 1.136093
H 1.847900 -0.200343 -0.409930
H 3.096540 0.151437 -1.100956
H 3.904789 -1.982340 -0.878422
H 2.584028 -2.105280 -1.617970
H 2.763571 -3.703670 0.105411
H 1.556168 -2.778368 0.317905
H 4.607903 -0.216537 5.166797
H 3.990062 -2.477871 1.686571
H 2.642313 -3.025155 2.339113
H 5.156005 1.289122 3.512016
H 5.590006 0.129866 2.598457
H 6.737696 0.838429 5.032942
H 7.431919 1.106030 3.739569
H 8.390108 -0.830548 4.679900
H 7.729962 -1.148361 3.309561
H 6.514020 -1.632968 5.797587
H 7.041226 -2.668142 4.887375
H 4.662063 -2.600521 4.676553
H 5.418404 -2.225128 3.326293
H 2.516573 -2.545817 4.572816
H 1.564734 -0.621934 6.319621
H 2.866067 -1.356878 6.528769
H 1.774146 -3.375710 6.968817
H 1.285356 -2.270001 7.937592
H -0.502310 -3.449358 7.037418
H -0.746615 -1.928105 6.670990
H 0.328488 -3.937948 4.917492
H -1.013735 -3.235955 4.696631
H 0.558651 -2.294513 3.264385
H 0.034321 -1.214512 4.228140
