28-5 From Incomplete Drilling Data to Synthetic Seismograms: Machine Learning Based Prediction of Key LWD Logs for Core-Log-Seismic Integrations
Session: Geoscience and Hydrogeology in the AI Era: From Predictive Models to Real-Time Applications (Posters)
Poster Booth No.: 103
Presenting Author: Ms. Jenna Lauren Everard
Authors: Everard, Jenna Lauren (1); Nicholson, Uisdean (2)
(1) Earth and Environmental Sciences, Columbia University, New York, New York, USA; (2) School of Energy, Geoscience, Infrastructure and Society, Heriot-Watt University, Edinburgh, United Kingdom
Abstract:
Logging While Drilling (LWD) facilitates the real-time collection of downhole geophysical measurements during subsurface drilling, providing information about properties such as density, resistivity, porosity, and acoustic velocity. These logs have clear applications in both industrial and scientific drilling, from aiding in identifying coal- and hydrocarbon-rich units in energy exploration to enabling stratigraphic correlation, lithologic interpretation, and depth correction for physical core samples during scientific drilling.
Formation density and P-wave velocity (Vp) are two particularly important logs, as they are used to generate synthetic seismograms that help integrate seismic datasets in the time domain with both well logs and physical cores in the depth domain. However, these logs are not always available because of cost or logistical constraints. Further, logging operations can be unpredictable due to borehole collapse, water leaks, and lost tools. As a result, logs are often missing or incomplete.
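To illustrate why density and Vp together suffice for a synthetic seismogram, the following is a minimal sketch of the standard convolutional approach, not the authors' implementation. It assumes depth in meters, Vp in m/s, density in g/cm³, a zero-phase Ricker wavelet, and that the velocity at each sample applies to the interval below it; the function names, the 30 Hz wavelet frequency, and the two-layer test model are all hypothetical choices made for illustration.

```python
import numpy as np

def ricker(freq_hz, dt_s, length_s=0.128):
    """Zero-phase Ricker wavelet with an odd number of samples, peak at the center."""
    half = int(round(length_s / (2.0 * dt_s)))
    t = np.arange(-half, half + 1) * dt_s
    a = (np.pi * freq_hz * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def synthetic_seismogram(depth_m, density, vp, dt_s=0.002, freq_hz=30.0):
    """Convert depth-domain density and Vp logs into a time-domain synthetic trace."""
    # Two-way travel time to each depth sample (assumes vp[i] holds for interval i -> i+1).
    dz = np.diff(depth_m)
    twt = np.concatenate([[0.0], np.cumsum(2.0 * dz / vp[:-1])])
    # Acoustic impedance, resampled onto a regular time axis.
    impedance = density * vp
    t = np.arange(0.0, twt[-1], dt_s)
    imp_t = np.interp(t, twt, impedance)
    # Normal-incidence reflection coefficients between adjacent time samples.
    rc = (imp_t[1:] - imp_t[:-1]) / (imp_t[1:] + imp_t[:-1])
    # Convolve the reflectivity series with the source wavelet.
    trace = np.convolve(rc, ricker(freq_hz, dt_s), mode="same")
    return t[1:], rc, trace

# Hypothetical two-layer model: one impedance contrast at 500 m depth.
depth = np.arange(0.0, 1001.0, 1.0)
density = np.where(depth < 500.0, 2.0, 2.4)
vp = np.where(depth < 500.0, 2000.0, 3000.0)
t, rc, trace = synthetic_seismogram(depth, density, vp)
```

In this sketch a gap in either input log propagates directly into the impedance and the reflectivity series, which is why a missing density or Vp log prevents the tie between depth-domain and time-domain data.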
With the rising popularity of machine learning techniques in the geosciences, some have proposed using ML models to complete partial logs or even generate missing ones. However, currently available models tend to focus on the former and are primarily developed for industrial applications. This has limited the amount of data, the geographical variation, and the lithologic diversity used in training, thereby narrowing the scope of these models' applications.
Here, we compile a new, large, and comprehensive training dataset using all publicly available oceanic and continental scientific drilling logs. Of the 712 public DSDP/ODP/IODP datasets, 503 contain density logs, 353 contain Vp logs, and just 267 (37.5% of all datasets) contain both. With data from these, our training set spans a wide geographical range and captures highly variable lithologies and geologic environments, making our final model broadly applicable. We train various ML models, including polynomial regression, random forest, extreme gradient boosting (XGB), neural network, and ensemble methods, to predict density and Vp logs independently of one another using other commonly recorded logging parameters as inputs. Our final, most effective model achieves an R² greater than 0.95 for both density and Vp predictions. To support open science and broader application, final models and tools are available in a public repository with an accessible user interface that enables others to input their own LWD data to predict missing logs and generate synthetic seismograms.
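As a rough illustration of the simplest model family named above (polynomial regression) and of how an R² score is computed on held-out samples, the sketch below fits a degree-2 polynomial by least squares. It is not the authors' model or data: the two input columns stand in for unspecified logging parameters, the target is a made-up density-like curve, and every function name and coefficient is a hypothetical choice for illustration.

```python
import numpy as np

def poly_features(X):
    """Degree-2 design matrix: bias, linear terms, squares, and pairwise products."""
    n, d = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(d)]
    cols += [X[:, i] * X[:, j] for i in range(d) for j in range(i, d)]
    return np.column_stack(cols)

def fit_poly(X, y):
    """Ordinary least-squares fit of the polynomial coefficients."""
    coef, *_ = np.linalg.lstsq(poly_features(X), y, rcond=None)
    return coef

def predict_poly(coef, X):
    return poly_features(X) @ coef

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic placeholder data (NOT drilling data): two normalized input logs
# and a made-up density-like target with a small amount of noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 2))
y = 1.8 + 0.5 * X[:, 0] + 0.3 * X[:, 1] ** 2 + rng.normal(0.0, 0.01, 500)

# Hold out the last 100 samples so R² is measured on data unseen during fitting.
train, test = slice(0, 400), slice(400, 500)
coef = fit_poly(X[train], y[train])
r2 = r2_score(y[test], predict_poly(coef, X[test]))
```

A score on toy data like this says nothing about performance on real logs; the point is only that R² should be evaluated on samples, or ideally whole boreholes, that were excluded from training.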
Geological Society of America Abstracts with Programs. Vol. 57, No. 6, 2025
© Copyright 2025 The Geological Society of America (GSA), all rights reserved.
Category: Topical Sessions
Session Format: Poster
Presentation Date: 10/19/2025
Presentation Room: Hall 1
Author Availability: 9:00–11:00 a.m.