15-4 Revealing Hidden Structure in Elemental XRF Datasets with Unsupervised Non-Negative Matrix Factorization
Session: Integrated Digital Workflows in Geoscience: Mapping, Marine Exploration, and Machine Learning
Presenting Author:
Jeffrey Pietras
Authors:
Pietras, Jeffrey Todd (1); Kelley, Mari K. (2); Rust, Tyler Jacob (3)
(1) Binghamton University, Binghamton; (2) US Army Corps of Engineers, Geotechnical Engineering and Geosciences Branch; (3) San Francisco
Abstract:
Large elemental concentration datasets collected by portable XRF (pXRF) spectrometers are now commonplace in core analysis, outcrop studies, soil and environmental surveys, and other applications. These datasets are typically explored using elemental abundance and elemental ratio logs, along with bivariate or multivariate cross-plots, to identify trends, correlations, and compositional groupings. While these approaches are powerful and intuitive, they often struggle to disentangle overlapping geochemical signals when multiple mineral phases or sources contribute to a sample. Here, we apply unsupervised non-negative matrix factorization (NMF) as a complementary, data-driven approach that decomposes multivariate datasets into a limited number of interpretable compositional factors and their sample-specific contributions.
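For orientation, the decomposition described above can be written in the standard NMF form. This is the general textbook formulation with a Frobenius-norm objective, given here as an assumption; the abstract does not state which objective or solver was used. The symbols X, W, H, n, m, and k are notation introduced for this sketch.

```latex
% X: n samples by m elements, all entries non-negative
% W: sample-specific contributions (n by k)
% H: compositional factors, i.e. elemental fingerprints (k by m)
X \approx W H, \qquad
X \in \mathbb{R}_{\ge 0}^{n \times m},\quad
W \in \mathbb{R}_{\ge 0}^{n \times k},\quad
H \in \mathbb{R}_{\ge 0}^{k \times m}

% A common way to fit the factors (assumed here, not confirmed by the abstract):
\min_{W \ge 0,\; H \ge 0} \; \lVert X - W H \rVert_F^{2}
```

For the dataset in this study, n = 2,120 samples, m = 18 elements, and the reported model uses k = 7 factors. Because both W and H are constrained to be non-negative, each sample is represented as an additive combination of factors, which is what makes the factors readable as compositional components.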
To rigorously test the interpretability of this approach, we constructed a controlled dataset composed of mixtures with known endmembers. Eight natural earth powders used as pigments for painting were characterized using pXRF spectroscopy for elemental composition and XRD with Rietveld refinement for mineralogical composition. They were then combined into binary mixtures of known proportions, yielding 112 samples. A painting was also created using unquantified mixtures of the eight endmembers, allowing factor contribution maps to be examined in a spatial context. Linseed oil was used as the binder because pXRF spectrometers cannot detect organic compounds, so the binder contributes minimal elemental contamination. The binary mixtures and a 3 mm-spaced grid of 50 by 40 measurements (2,000 points) across the painting were analyzed by pXRF and combined with the endmembers into a 2,120-sample matrix (8 endmembers, 112 mixtures, and 2,000 grid points) containing 18 elemental concentrations per sample.
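The layout of the sample matrix described above can be sketched in Python with synthetic numbers. Everything numeric below is a placeholder: the endmember compositions and painting values are random stand-ins, and the four mixing fractions are hypothetical, chosen only because eight endmembers give 28 unique pairs and 28 x 4 = 112. The abstract does not state which pairs or proportions were actually used, and real pXRF responses need not mix linearly.

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(0)
N_ENDMEMBERS, N_ELEMENTS = 8, 18

# Synthetic stand-in for measured endmember compositions
# (rows: powders, columns: elemental concentrations).
endmembers = rng.uniform(0.0, 100.0, size=(N_ENDMEMBERS, N_ELEMENTS))

# Hypothetical mixing fractions (assumption, not from the abstract).
fractions = (0.2, 0.4, 0.6, 0.8)

# Idealized linear binary mixing for every unique pair of endmembers.
mixtures = np.array([
    f * endmembers[i] + (1.0 - f) * endmembers[j]
    for i, j in combinations(range(N_ENDMEMBERS), 2)
    for f in fractions
])

# Placeholder for the 50 x 40 grid of measurements across the painting.
painting = rng.uniform(0.0, 100.0, size=(50 * 40, N_ELEMENTS))

# Stack everything into one samples-by-elements matrix.
X = np.vstack([endmembers, mixtures, painting])
print(X.shape)  # (2120, 18)
```

Keeping the endmembers and known mixtures in the same matrix as the painting grid means the factors are fit once and the recovered contributions for the known samples can be checked against their true proportions.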
The NMF workflow was implemented in Python, with iterative code development and debugging assisted by a large language model (ChatGPT). Bootstrapping was used to assess the reproducibility and stability of the extracted factors, while k-fold cross-validation guided selection of the optimal number of factors. The resulting seven-factor model resolves compositional factors with elemental fingerprints corresponding to distinct minerals or geochemical associations. Contribution maps accurately capture spatial variability in the dataset. Importantly, NMF allows individual elements to contribute to multiple factors, reflecting the fact that elements are commonly hosted in more than one phase. For example, calcium was distributed across several factors, each associated with a different elemental assemblage. These results demonstrate that NMF enhances the interpretive power of pXRF elemental datasets and provides an objective tool that can complement traditional interpretive workflows in a wide range of geological applications.
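A workflow of the kind described above might look roughly like the following sketch, which uses scikit-learn's NMF on a small synthetic matrix. This is not the authors' code. The helper names (fit_nmf, cv_error, bootstrap_stability), the row-wise hold-out scheme, the cosine-similarity matching of factors, and all parameter values are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.decomposition import NMF
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Synthetic non-negative data: 300 samples mixed from 3 hidden factors.
H_true = rng.uniform(0.0, 1.0, size=(3, 10))
W_true = rng.dirichlet(np.ones(3), size=300)
noise = rng.normal(0.0, 0.01, size=(300, 10))
X = np.clip(W_true @ H_true + noise, 0.0, None)


def fit_nmf(data, k):
    """Fit a k-factor NMF model with a deterministic initialization."""
    model = NMF(n_components=k, init="nndsvda", max_iter=1000, random_state=0)
    model.fit(data)
    return model


def cv_error(data, k, n_splits=5):
    """Mean relative reconstruction error on held-out samples."""
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    errors = []
    for train, test in folds.split(data):
        model = fit_nmf(data[train], k)
        # Factors stay fixed; only contributions are solved for test rows.
        W_test = model.transform(data[test])
        resid = data[test] - W_test @ model.components_
        errors.append(np.linalg.norm(resid) / np.linalg.norm(data[test]))
    return float(np.mean(errors))


def unit_rows(H):
    """Scale each factor to unit length so factors compare by shape."""
    return H / (np.linalg.norm(H, axis=1, keepdims=True) + 1e-12)


def bootstrap_stability(data, k, n_boot=10):
    """Mean cosine similarity between reference and bootstrap factors."""
    ref = unit_rows(fit_nmf(data, k).components_)
    scores = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(data), size=len(data))  # resample rows
        boot = unit_rows(fit_nmf(data[idx], k).components_)
        sim = ref @ boot.T
        # Factor order is arbitrary, so pair factors to maximize similarity.
        rows, cols = linear_sum_assignment(-sim)
        scores.append(sim[rows, cols].mean())
    return float(np.mean(scores))


for k in range(1, 6):
    print(k, round(cv_error(X, k), 4), round(bootstrap_stability(X, k), 4))
```

With row-wise hold-out, reconstruction error often keeps falling as k grows, so in practice one would look for the point where improvement levels off and where the bootstrap similarity is still high, rather than the minimum error alone. Other cross-validation schemes for NMF, such as masking individual matrix entries, are also in use, and the abstract does not say which one was applied.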
Geological Society of America Abstracts with Programs. Vol. 58, No. 2, 2026
© Copyright 2026 The Geological Society of America (GSA), all rights reserved.
Category: Topical Sessions
Session Format: Oral
Presentation Date: 3/22/2026
Presentation Start Time: 04:30 PM
Presentation Room: CCC, Room 25