9-7 AI for Crystallography: Introducing “Crystract”, an R package to Read and Analyze CIF Files
Session: Early Career Investigators in Mineralogy and Crystallography
Presenting Author:
Anirudh Prabhu
Authors:
Prabhu, Anirudh (1); Ngo, Don (2); Maria-Hubner, Julia (3); Ralph, Jolyon (4)
(1) Earth and Planets Laboratory, Carnegie Science, Washington, DC, USA; (2) Earth and Planets Laboratory, Carnegie Science, Washington, DC, USA; (3) Technische Universität Dresden, Dresden, Germany; (4) mindat.org, Mitcham, Surrey, United Kingdom
Abstract:
Solid matter consists of three-dimensional arrangements of atoms, predominantly organized in repeating patterns (motifs). The properties of materials, whether natural or synthetic, can be related directly to their structural arrangement, regardless of whether they are organic, polymeric, or inorganic. For example, the arrangement of carbon atoms accounts for the stark difference between diamond and graphite. The large-scale analysis of crystal structures is fundamental not only to understanding minerals and their evolution through time but also to designing new materials. As crystallographic databases continue to expand, the need for powerful computational tools to harness their information has become critical.
The Crystallographic Information File (CIF) is the standard format for disseminating these data, yet researchers face a significant bottleneck in its programmatic use. Manual data extraction is tedious, and syntactic inconsistencies among CIFs complicate high-throughput analysis, forcing researchers into workflows fragmented across different software environments. To improve reproducibility and accelerate discovery, we have developed “Crystract”, an open-source R package that addresses these challenges by providing an efficient solution for the batch processing and statistical analysis of CIF data. The package streamlines the extraction of essential information, including unit cell parameters, atomic coordinates, and symmetry operations. A key feature of “Crystract” is its ability to propagate experimental uncertainties from the CIFs through derived quantities, such as interatomic distances and bond angles, facilitating a more rigorous statistical treatment.
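As an illustration of what propagating uncertainties into an interatomic distance involves, the following is a minimal Python sketch, not the “Crystract” API or its actual method. It assumes first-order (linear) error propagation, treats all inputs as uncorrelated, and estimates the partial derivatives by central finite differences; the function names and argument layout are hypothetical.

```python
import math


def distance(cell, frac1, frac2):
    """Distance between two fractional positions in a general (triclinic) cell.

    cell = (a, b, c, alpha, beta, gamma), lengths in angstroms, angles in degrees.
    """
    a, b, c, alpha, beta, gamma = cell
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    dx, dy, dz = (p - q for p, q in zip(frac1, frac2))
    # d^2 = dX^T G dX, written out with the metric tensor G of the cell.
    d2 = ((a * dx) ** 2 + (b * dy) ** 2 + (c * dz) ** 2
          + 2 * a * b * dx * dy * cg
          + 2 * a * c * dx * dz * cb
          + 2 * b * c * dy * dz * ca)
    return math.sqrt(d2)


def distance_with_uncertainty(cell, cell_su, frac1, frac1_su, frac2, frac2_su,
                              rel_step=1e-6):
    """Return (d, sigma_d) using first-order propagation.

    Assumption: every input is uncorrelated with every other input, so
    sigma_d^2 = sum_i (dd/dp_i * sigma_i)^2.
    """
    values = list(cell) + list(frac1) + list(frac2)
    sus = list(cell_su) + list(frac1_su) + list(frac2_su)

    def f(v):
        return distance(v[0:6], v[6:9], v[9:12])

    variance = 0.0
    for i, (v, su) in enumerate(zip(values, sus)):
        if su == 0:
            continue  # exact parameters contribute nothing
        h = rel_step * max(abs(v), 1.0)
        up, down = values.copy(), values.copy()
        up[i], down[i] = v + h, v - h
        derivative = (f(up) - f(down)) / (2 * h)  # central difference
        variance += (derivative * su) ** 2
    return f(values), math.sqrt(variance)


# Hypothetical example: cubic cell, a = 4.000(10) angstroms, x of atom 1 = 0.000(1).
d, sigma = distance_with_uncertainty(
    cell=(4.0, 4.0, 4.0, 90.0, 90.0, 90.0),
    cell_su=(0.01, 0.0, 0.0, 0.0, 0.0, 0.0),
    frac1=(0.0, 0.0, 0.0), frac1_su=(0.001, 0.0, 0.0),
    frac2=(0.5, 0.0, 0.0), frac2_su=(0.0, 0.0, 0.0),
)
print(f"d = {d:.4f} +/- {sigma:.4f}")
```

Ignoring correlations is a simplification: symmetry-related atoms and refined cell parameters are generally correlated, and a rigorous treatment would use the full variance-covariance matrix where it is available.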
We also implement established algorithms to generate full unit cells and determine atomic bonding environments. By integrating these tools directly within the R statistical environment, “Crystract” provides a seamless workflow for crystallographic analysis. As a fully open-source package to be made available on the Comprehensive R Archive Network (CRAN), “Crystract” promotes reproducible, data-driven crystallographic research across Earth and space science.
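For readers unfamiliar with unit cell generation, one common approach can be sketched in Python as follows: apply each symmetry operation to every atom of the asymmetric unit, wrap the result into the cell, and discard duplicates. This is an illustrative sketch under stated assumptions, not the algorithm “Crystract” implements. The helper names are hypothetical, the parser handles only simple operation strings such as "-x, y+1/2, -z" (coefficients of +1 or -1), and the tolerance is an arbitrary choice.

```python
import re
from fractions import Fraction

# One signed term: either a number (optionally a fraction) or a variable x/y/z.
_TERM = re.compile(r"([+-]?)(?:(\d+(?:\.\d+)?)(?:/(\d+))?|([xyz]))")


def parse_symop(text):
    """Parse a string like '-x, y+1/2, -z' into three (coefficients, shift) rows."""
    rows = []
    for component in text.lower().replace(" ", "").split(","):
        coeffs = {"x": 0, "y": 0, "z": 0}
        shift = Fraction(0)
        for sign, num, den, var in _TERM.findall(component):
            s = -1 if sign == "-" else 1
            if var:
                coeffs[var] += s
            else:
                shift += s * Fraction(num) / int(den or 1)
        rows.append((coeffs, float(shift)))
    return rows


def apply_symop(rows, pos):
    """Apply a parsed operation to a fractional position and wrap into the cell."""
    vals = dict(zip("xyz", pos))
    return tuple(
        (sum(c * vals[v] for v, c in coeffs.items()) + shift) % 1.0
        for coeffs, shift in rows
    )


def same_site(p, q, tol):
    """True if two fractional positions coincide, allowing for periodic wrap."""
    for a, b in zip(p, q):
        diff = abs(a - b)
        if min(diff, 1.0 - diff) > tol:
            return False
    return True


def expand_unit_cell(atoms, symops, tol=1e-4):
    """Generate all symmetry-equivalent sites for (label, position) pairs."""
    ops = [parse_symop(s) for s in symops]
    sites = []
    for label, pos in atoms:
        for op in ops:
            new = apply_symop(op, pos)
            # Keep the site only if this atom has not already been placed there.
            if not any(l == label and same_site(new, p, tol) for l, p in sites):
                sites.append((label, new))
    return sites


# Hypothetical input: four operations and two atoms, one on an inversion centre.
symops = ["x, y, z", "-x, y+1/2, -z+1/2", "-x, -y, -z", "x, -y+1/2, z+1/2"]
atoms = [("C1", (0.1, 0.2, 0.3)), ("Fe1", (0.0, 0.0, 0.0))]
for label, site in expand_unit_cell(atoms, symops):
    print(label, tuple(round(v, 4) for v in site))
```

The atom on the general position yields four sites, while the atom on the inversion centre yields only two, because the duplicate check removes images that map onto an existing site.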
Geological Society of America Abstracts with Programs. Vol. 57, No. 6, 2025
doi: 10.1130/abs/2025AM-10241
© Copyright 2025 The Geological Society of America (GSA), all rights reserved.
Session Format: Oral
Presentation Date: 10/19/2025
Presentation Start Time: 09:55 AM
Presentation Room: HBGCC, 214D