Efficient Learning of Accurate Surrogates for Simulations of Complex Systems

This research proposes an online learning method for efficiently constructing surrogate models that can accurately emulate complex systems. The method consists of three key components:

  1. Sampling strategy for generating new training and testing data;
  2. Learning strategy for generating candidate surrogate models based on the training data;
  3. Validation metric for evaluating the effectiveness of candidate surrogate models on the testing data.

The authors use radial basis function (RBF) interpolation to build the response surface of the surrogate model. The online method aims to ensure that the surrogate captures all local extrema of the response surface, including extrema at the endpoints. It also validates and updates the surrogate continuously: the surrogate is retrained whenever its performance falls below a validity threshold.
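To make the idea of an RBF response surface concrete, here is a minimal sketch of RBF interpolation in one dimension. It assumes a Gaussian kernel and NumPy; the kernel choice, the shape parameter `epsilon`, and the function `expensive_model` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def fit_rbf(x_train, y_train, epsilon=1.0):
    """Fit Gaussian-RBF interpolation weights for 1-D inputs."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Pairwise distances between training points.
    r = np.abs(x_train[:, None] - x_train[None, :])
    # Gaussian kernel matrix; solving phi @ w = y makes the surrogate
    # pass through every training point.
    phi = np.exp(-(epsilon * r) ** 2)
    weights = np.linalg.solve(phi, y_train)
    return x_train, weights, epsilon


def predict_rbf(model, x_new):
    """Evaluate the fitted surrogate at new inputs (returns a 1-D array)."""
    centers, weights, epsilon = model
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    r = np.abs(x_new[:, None] - centers[None, :])
    return np.exp(-(epsilon * r) ** 2) @ weights


def expensive_model(x):
    # Hypothetical stand-in for a costly simulation.
    return np.sin(3.0 * x) + 0.5 * x


x_train = np.linspace(0.0, 2.0, 9)
model = fit_rbf(x_train, expensive_model(x_train), epsilon=2.0)
```

Once fitted, `predict_rbf(model, x)` is cheap to evaluate, which is the point of a surrogate. How well it matches the true model between training points depends on where those points lie, and that is what the sampling strategy addresses.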

The main innovations of the authors are:

  1. Proposing an optimizer-driven sampling strategy that aims to include all local extrema of the response surface in the training data, so that the surrogate model remains valid over the long term.
  2. Designing an automated online learning workflow, with explicit validation and updating mechanisms, that aims to produce a surrogate model that stays valid for future data.
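One way to picture optimizer-driven sampling: run local optimizers from several starting points, once to minimize and once to maximize, and keep every model evaluation they make as training data. The sketch below assumes a simple compass (pattern) search as the optimizer, which is a stand-in chosen for brevity and not necessarily what the authors use. The function `model` and all parameter values are hypothetical.

```python
import math
import random


def model(x):
    # Hypothetical 1-D response surface with interior and endpoint extrema.
    return math.sin(3.0 * x) + 0.5 * x


def compass_search(f, x0, lo, hi, log, sign=1.0, step=0.25, tol=1e-3):
    """Minimize sign * f on [lo, hi], logging every (x, f(x)) evaluation."""

    def evaluate(x):
        y = f(x)
        log.append((x, y))  # store the raw model output, not the signed one
        return sign * y

    x, fx = x0, evaluate(x0)
    while step > tol:
        moved = False
        for cand in (x - step, x + step):
            cand = min(max(cand, lo), hi)  # stay inside the bounds
            if cand == x:
                continue
            fc = evaluate(cand)
            if fc < fx:
                x, fx, moved = cand, fc, True
                break
        if not moved:
            step *= 0.5  # no better neighbor: refine the step size
    return x


def optimizer_driven_samples(f, lo, hi, n_starts=8, seed=0):
    """Collect training data from minimizing and maximizing searches."""
    rng = random.Random(seed)
    log = []
    for _ in range(n_starts):
        x0 = rng.uniform(lo, hi)
        for sign in (1.0, -1.0):  # sign=1 seeks minima, sign=-1 seeks maxima
            compass_search(f, x0, lo, hi, log, sign=sign)
    return log


samples = optimizer_driven_samples(model, 0.0, 2.0)
```

Because each search converges toward a local extremum, the logged points cluster around the minima and maxima of the response surface, which are the regions the surrogate most needs to reproduce.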

Research process, value, and highlights:

a) Validation workflow:

  - First, link the model to a database to automatically store the model's inputs and outputs;
  - Then, retrieve the corresponding surrogate model from the database and evaluate its effectiveness using the testing data;
  - If ineffective, retrain the surrogate model using the stored model evaluation results;
  - If the retrained surrogate model has higher quality, store it; otherwise, use the sampler to generate new model evaluation data;
  - Repeat the above process until an effective surrogate model is generated on the testing data.
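The validation workflow can be sketched as a loop. This is a simplified illustration under stated assumptions, not the authors' implementation: the "database" is an in-memory list, the learner is a piecewise-linear interpolant standing in for RBF interpolation, the validation metric is the maximum absolute error on fresh test data, and the names `model`, `train`, `max_error`, and `learn_surrogate` are hypothetical.

```python
import bisect
import math
import random

database = []  # hypothetical in-memory stand-in for the evaluation database


def model(x):
    """Hypothetical expensive model; every call is logged to the database."""
    y = math.sin(3.0 * x) + 0.5 * x
    database.append((x, y))
    return y


def train(records):
    """Fit a piecewise-linear interpolant (a simple stand-in for RBF)."""
    points = sorted(dict(records).items())
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def surrogate(x):
        i = bisect.bisect_left(xs, x)
        if i == 0:
            return ys[0]
        if i == len(xs):
            return ys[-1]
        t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
        return ys[i - 1] + t * (ys[i] - ys[i - 1])

    return surrogate


def max_error(surrogate, test_data):
    """Validation metric: worst absolute error on the test data."""
    return max(abs(surrogate(x) - y) for x, y in test_data)


def learn_surrogate(lo, hi, tol=0.05, batch=4, max_iter=50, seed=0):
    rng = random.Random(seed)
    model(lo)
    model(hi)
    stored = train(database)  # the currently stored surrogate
    for iteration in range(max_iter):
        # Fresh test data: model evaluations the stored surrogate has not seen.
        test_data = [(x, model(x))
                     for x in (rng.uniform(lo, hi) for _ in range(batch))]
        current_error = max_error(stored, test_data)
        if current_error <= tol:
            return stored, True, iteration
        # Invalid: retrain on everything in the database (test data included).
        candidate = train(database)
        if max_error(candidate, test_data) < current_error:
            stored = candidate
        else:
            # No improvement: ask the sampler for more model evaluations.
            # (An exact interpolant always improves here; this branch matters
            # for learners that do not reproduce their training data.)
            for _ in range(batch):
                model(rng.uniform(lo, hi))
    return stored, False, max_iter


surrogate, valid, n_rounds = learn_surrogate(0.0, 2.0)
```

The random draws used for test data and resampling here are a placeholder; an optimizer-driven sampler could be substituted without changing the loop.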

b) Impact of sampling strategy and optimizer settings on efficiency:

  - The authors compared the efficiency of optimizer-driven sampling and random sampling on benchmark functions;
  - Optimizer-driven sampling reproduces function behavior near extrema better, while random sampling converges faster in terms of average error;
  - They also evaluated how different optimizer configurations affect sampling efficiency.
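The two criteria in this comparison, average error and error near an extremum, measure different things. The sketch below shows one way to compute both for two training sets of equal size. It assumes NumPy, a hypothetical benchmark function, and a piecewise-linear surrogate via `np.interp` as a stand-in for RBF interpolation. Replacing one random point with the extremum is only a crude proxy for optimizer-driven sampling, and no particular outcome from the paper is implied.

```python
import numpy as np


def benchmark(x):
    # Hypothetical 1-D benchmark function with interior extrema.
    return np.sin(3.0 * x) + 0.5 * x


def error_metrics(x_train, x_extremum, lo=0.0, hi=2.0):
    """Return (average error on a grid, error at one known extremum)."""
    x_train = np.sort(np.asarray(x_train, dtype=float))
    y_train = benchmark(x_train)
    grid = np.linspace(lo, hi, 401)
    # Piecewise-linear surrogate via np.interp (stand-in for RBF).
    approx = np.interp(grid, x_train, y_train)
    avg_err = float(np.mean(np.abs(approx - benchmark(grid))))
    ext_err = float(abs(np.interp(x_extremum, x_train, y_train)
                        - benchmark(x_extremum)))
    return avg_err, ext_err


grid = np.linspace(0.0, 2.0, 401)
x_max = float(grid[np.argmax(benchmark(grid))])  # grid estimate of the maximum

rng = np.random.default_rng(0)
random_samples = rng.uniform(0.0, 2.0, size=8)
# Same budget, but one point is spent on the extremum itself.
extremum_samples = np.append(random_samples[:-1], x_max)

avg_random, ext_random = error_metrics(random_samples, x_max)
avg_extremum, ext_extremum = error_metrics(extremum_samples, x_max)
```

Comparing `avg_random` with `avg_extremum`, and `ext_random` with `ext_extremum`, separates overall accuracy from accuracy at the extremum.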

c) Application to two physical problems:

  - Constructed accurate surrogate models for the high-density nuclear matter equation of state, including the phase transition region;
  - Constructed high-precision surrogate models for the radial distribution function of strongly coupled plasmas.

d) Research value:

  - The proposed online learning method can efficiently generate surrogate models of complex systems that remain valid over the long term, avoiding limitations of traditional methods;
  - The method has broad application prospects in fields such as physical modeling, where it can help improve computational efficiency;
  - The method is general and can be applied to a variety of scientific problems.

e) Research highlights:

  - An innovative sampling strategy aims to ensure that the surrogate model captures all critical points of the response surface;
  - An automated online learning workflow continuously evaluates and improves surrogate model quality;
  - The method was successfully applied to challenging physical problems, namely the nuclear matter equation of state and plasma simulations.

In summary, this research presents an online learning method for efficiently constructing surrogate models of complex systems that remain valid over the long term, which helps improve simulation efficiency and save computational resources. The method is general and applies to a range of problems in fields such as physical modeling.