Advisor: Professor Alexandre Tartakovsky
Physics-informed Machine Learning Methods to Environmental Modeling and Uncertainty Quantification
Abstract
Machine learning approaches are increasingly recognized for enhancing prediction and model inversion
in natural and engineering systems due to their high expressivity, improved training algorithms,
and fast inference speed. However, effectively applying data-driven machine learning methods to
environmental problems, particularly in data-limited contexts such as subsurface modeling and
large-scale climate simulations, remains challenging. One major limitation is the scarcity of high-quality
data that accurately captures the heterogeneous and multi-scale nature of these systems.
Additionally, environmental problems are often high-dimensional, making them susceptible to the
“curse of dimensionality” (CoD). This poses significant computational challenges, especially in inverse
parameter estimation, uncertainty quantification (UQ), and design and optimization tasks,
where solving these problems requires extensive physics-based simulations.
To address these challenges, this thesis aims to develop machine learning approaches that integrate
physical domain knowledge (i.e., governing physical equations) as a form of regularization.
Additionally, we explore efficient algorithms to mitigate the effects of high dimensionality in inverse
problems and UQ. Our research unfolds into three well-defined objectives: The first is physics-informed
machine learning (PIML) for deterministic PDE problems. Here, we focus on developing,
improving, and analyzing PIML approaches in the deterministic regime, designed to accurately
approximate solutions to partial differential equations (PDEs) and to learn PDE operators.
We propose an improved training framework for physics-informed neural networks (PINNs) that
systematically targets different sources of PINN errors. The effectiveness of our improved PINN
model is demonstrated in forward, inverse, and backward advection-dispersion equations (ADEs)
with sharply perturbed initial conditions, where conventional PINNs struggle to learn effectively.
We also introduce a reduced-order method called the physics-informed Karhunen-Loève expansion
(PICKLE), which combines model reduction techniques with PIML to efficiently solve space-time-dependent
PDE problems. A key advantage of these two methods is their ability to seamlessly
assimilate additional data. Additionally, we propose reduced-order neural operator surrogate models,
including KL-DNN and VAE-DNN, which employ dimension reduction techniques to identify
low-dimensional subspaces and construct operator-based mappings between PDE parameters and
solutions. These surrogate models offer significant computational and energy efficiency, as they
can be trained in separate components while maintaining accuracy comparable to state-of-the-art
neural operators such as the Fourier Neural Operator.
The second objective is to develop scalable UQ methods to address the challenges posed by
complex posterior distributions in high-dimensional, PDE-constrained inverse problems. These
problems are often affected by the CoD and exhibit pathological behaviors in high-dimensional
probability spaces, making existing Bayesian inference methods computationally demanding. We propose a
Bayesian UQ framework based on the randomize-then-optimize approach, which enables efficient
posterior sampling by tailoring the optimization process to the loss functions of our physics-informed
machine learning models, including PINN and PICKLE.
The third objective is to develop effective digital twins capable of accurately and rapidly predicting
PDE solutions under a wide range of control variables while remaining differentiable with
respect to them. To achieve this, we integrate the proposed neural operator-based surrogate models
with transfer learning techniques, allowing models trained on one set of conditions to generalize efficiently
to new scenarios with minimal retraining. Additionally, we provide a rigorous mathematical
analysis of the transferability of these surrogate models for both linear and nonlinear PDEs.
This thesis outlines a comprehensive roadmap for advancing physics-informed machine learning
techniques. We present numerical results for each objective, demonstrating their effectiveness and
laying the foundation for broader applications to large-scale, real-world problems.