56-4 Deep learning approaches to the phylogenetic placement and classification of fossil grass silica short cell phytoliths
Session: Phylogenetic and Computational Approaches in Paleobiology and Paleoecology, Part II
Presenting Author:
Benjamin Alexander Lloyd
Authors:
Lloyd, Benjamin A. (1), Adaïmé, Marc-Élie (2), Punyasena, Surangi W. (3), Gallaher, Timothy J. (4), Hermans, Rosalie M. (5), Kong, Shu (6), Strömberg, Caroline A.E. (7)
(1) Earth & Space Sciences, University of Washington, Seattle, WA, USA
(2) Data Science Lab, Smithsonian Institution, Washington, DC, USA; Plant Biology, University of Illinois Urbana-Champaign, Urbana, IL, USA
(3) Plant Biology, University of Illinois Urbana-Champaign, Urbana, IL, USA
(4) Bishop Museum, Honolulu, HI, USA
(5) Archeology, Environmental Changes, and Geochemistry, Vrije Universiteit Brussel, Brussels, Belgium
(6) University of Macau, Macau, Macao SAR, China
(7) Biology, University of Washington, Seattle, WA, USA
Abstract:
Phytoliths, microscopic silica bodies that infill plant tissues, have proven instrumental for tracking the evolution and proliferation of grasses and grasslands in deep time, because the distinct morphologies of Grass Silica Short Cell Phytoliths (GSSCPs) afford more taxonomic specificity than Poaceae pollen and leaf fossils, which are largely non-diagnostic. However, the precision with which GSSCPs can be identified in the fossil record is limited by the difficulty and subjectivity of phytolith morphotype classification. Convolutional Neural Networks (CNNs) are a powerful, widely used machine learning tool for image classification, with the potential to enhance the reproducibility and speed of phytolith identification. Trained CNNs may also help uncover previously unrecognized connections between morphology and phylogeny, expanding the ability of researchers to interpret fossil phytolith assemblages. To streamline model training and focus directly on phytolith morphology, we produced training image sets from 3D meshes of GSSCPs generated from confocal images. We used 1,421 3D models of phytoliths extracted from 94 vouchered herbarium specimens spanning the Bambusoideae, Oryzoideae, and Pooideae (BOP) clade of Poaceae. Using 2D renditions of these models, as opposed to brightfield images of GSSCPs, allowed for precise control of phytolith alignment, image quality, and color, as well as consistent evaluation of different angles of observation. We rendered images along each phytolith’s X, Y, and Z axes to produce three training datasets of 2,842 images each. When trained on modern GSSCPs, our best-performing CNNs achieve 78% accuracy at the subfamily level and 61% accuracy at the tribe level. We then used these models to predict the subfamily-level affinities of 269 expert-identified fossil phytoliths. Baseline accuracy was 73%; restricting the results to the 134 fossils for which the CNN had >90% confidence in its predictions raised accuracy to 98%. To maximize the evolutionary information obtainable from CNN-learned GSSCP features, we are also exploring phylogenetically informed methods for clustering and phylogenetic placement, which may allow for more specific fossil phytolith identification than has previously been possible.
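The confidence filtering described above can be illustrated with a minimal sketch. The abstract does not state how confidence was computed; the sketch assumes it is the highest class probability output by the classifier. The function name `selective_accuracy` and the toy probabilities and labels are hypothetical and are not the authors' code or data.

```python
from typing import List, Sequence, Tuple


def selective_accuracy(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float = 0.9,
) -> Tuple[int, float]:
    """Accuracy over predictions whose top-class probability exceeds threshold.

    Assumption: "confidence" means the largest class probability. Returns
    (number of predictions kept, accuracy among the kept predictions).
    """
    kept = 0
    correct = 0
    for p, y in zip(probs, labels):
        # Predicted class is the index of the largest probability.
        pred = max(range(len(p)), key=p.__getitem__)
        if p[pred] > threshold:
            kept += 1
            correct += int(pred == y)
    accuracy = correct / kept if kept else float("nan")
    return kept, accuracy


# Toy example (made-up values): three classes standing in for three subfamilies.
toy_probs: List[List[float]] = [
    [0.95, 0.03, 0.02],  # confident, predicts class 0
    [0.50, 0.30, 0.20],  # not confident, predicts class 0
    [0.05, 0.92, 0.03],  # confident, predicts class 1
    [0.10, 0.15, 0.75],  # not confident, predicts class 2
]
toy_labels = [0, 1, 1, 2]  # expert identifications (hypothetical)

baseline = selective_accuracy(toy_probs, toy_labels, threshold=0.0)
filtered = selective_accuracy(toy_probs, toy_labels, threshold=0.9)
print(baseline, filtered)
```

Filtering of this kind trades coverage for accuracy: fewer specimens receive a prediction, but the retained predictions are more often correct, which mirrors the reported change from 269 fossils at 73% to 134 fossils at 98%.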
Geological Society of America Abstracts with Programs. Vol. 57, No. 6, 2025
© Copyright 2025 The Geological Society of America (GSA), all rights reserved.
Category
Topical Sessions
Description
Session Format: Oral
Presentation Date: 10/19/2025
Presentation Start Time: 02:15 PM
Presentation Room: 304B