Synthetic Lagrangian Turbulence by Generative Diffusion Models

Studying the statistical and geometrical properties of particles advected by a turbulent flow remains a significant challenge. Despite outstanding efforts in theory, numerical simulations, and experiments over the past 30 years, no existing model realistically reproduces the statistical and topological features of turbulent particle trajectories. This study proposes a machine learning method based on state-of-the-art diffusion models (DMs) that generates single-particle trajectories in three-dimensional high-Reynolds-number turbulence, thereby bypassing the need for direct numerical simulations or experiments to obtain reliable Lagrangian data.

Paper Information: The authors are from the University of Rome and other institutions. The paper was published in the April 2024 issue of Nature Machine Intelligence.

Research Methods:

(a) Research Process
The study first uses direct numerical simulations (DNS) of the three-dimensional Navier-Stokes equations to generate high-Reynolds-number turbulent fields, and tracks 327,680 Lagrangian particle trajectories in them to build a high-quality training dataset. Two diffusion models are then trained on this dataset: DM-1c, which generates a single velocity component, and DM-3c, which generates the three correlated velocity components simultaneously.
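To make the training setup more concrete, here is a minimal sketch of the forward (noising) process of a standard denoising diffusion model, applied to a one-dimensional velocity signal. The linear noise schedule, the number of steps, and all function names are assumptions chosen for illustration and are not taken from the paper. The toy sine signal only stands in for one velocity component of a DNS trajectory. A trained network would learn to reverse this process, turning Gaussian noise back into a trajectory.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a common default, assumed here)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def noise_trajectory(x0, alpha_bar_t, rng):
    """Sample x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [signal_scale * v + noise_scale * rng.gauss(0.0, 1.0) for v in x0]


rng = random.Random(0)

# Toy stand-in for one velocity component sampled along a trajectory.
x0 = [math.sin(2.0 * math.pi * k / 64.0) for k in range(256)]

betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

# Halfway through the schedule the signal is partly buried in noise;
# at the final step it is almost pure Gaussian noise.
x_mid = noise_trajectory(x0, alpha_bars[499], rng)
x_end = noise_trajectory(x0, alpha_bars[-1], rng)
```

Because `alpha_bars` decays monotonically toward zero, the final sample retains almost no information about `x0`, which is why generation can start from pure noise.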

(b) Main Results
- The synthetic data closely reproduce the probability distribution functions of turbulent particle velocity increments and accelerations, including the observed tails of extreme events out to 60 standard deviations.
- The synthetic data accurately reproduce multiscale statistics, such as velocity increment structure functions and generalized flatness, from large scales down to the Kolmogorov scale (including the transition range between the inertial and Kolmogorov scales), and capture the enhanced intermittency around the dissipative time scale.
- The synthetic data accurately reproduce the behavior of the local scaling exponents, which is the most stringent multiscale test in turbulence statistics.
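The multiscale statistics named above can be sketched in a few lines. The example assumes the definitions commonly used in Lagrangian turbulence: the structure function S_p(tau) is the mean of the p-th power of the velocity increment v(t + tau) - v(t), the generalized flatness is S_p / S_2^(p/2), and the local scaling exponent is the logarithmic derivative d log S_p / d log S_2. The function names and the Gaussian random-walk test signal are hypothetical choices for illustration, not data or code from the paper.

```python
import math
import random


def structure_function(v, lag, p):
    """S_p(lag): mean of (v[t + lag] - v[t]) ** p along one trajectory."""
    incs = [v[t + lag] - v[t] for t in range(len(v) - lag)]
    return sum(d ** p for d in incs) / len(incs)


def flatness(v, lag, p=4):
    """Generalized flatness F_p(lag) = S_p / S_2 ** (p / 2)."""
    return structure_function(v, lag, p) / structure_function(v, lag, 2) ** (p / 2)


def local_exponent(v, lags, p=4):
    """Finite-difference estimate of d log S_p / d log S_2 between lags."""
    logs = [
        (math.log(structure_function(v, lag, 2)),
         math.log(structure_function(v, lag, p)))
        for lag in lags
    ]
    return [
        (b1 - b0) / (a1 - a0)
        for (a0, b0), (a1, b1) in zip(logs, logs[1:])
    ]


# Non-intermittent reference signal: a Gaussian random walk.
rng = random.Random(1)
walk = [0.0]
for _ in range(20000):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

lags = [1, 2, 4, 8]
F4 = [flatness(walk, lag) for lag in lags]
zeta4 = local_exponent(walk, lags)
```

For this Gaussian signal the flatness stays near 3 and the fourth-order local exponent near 2 at every lag. Intermittent turbulence data depart from those values as the time lag shrinks, and that scale-dependent departure is what a generative model must reproduce to pass this test.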

(c) Research Significance
Previous theoretical and empirical models could not generate synthetic trajectories with realistic turbulence statistics across the entire dynamic range, and this work overcomes that limitation. The proposed data-driven model can produce large volumes of high-quality synthetic data, supporting downstream tasks in turbulence applications that require pre-training, such as diffusion and mixing. In addition, the model extrapolates well to extreme events: it generates high-intensity, low-probability events not seen during training while still preserving realistic statistics.

(d) Research Highlights
- First use of state-of-the-art diffusion models to generate three-dimensional turbulent Lagrangian particle trajectories
- Highly faithful reproduction of classic statistics from large scales down to the Kolmogorov scale
- Demonstrated ability to extrapolate to extreme events, overcoming a limitation of previous models
- A new avenue for producing large volumes of high-quality synthetic data in turbulence-related fields

(e) The work also discusses training convergence, generalization ability, interpretability, and computational cost. The current model cannot yet adapt to different flow conditions, but future integration with conditional diffusion models is expected to broaden its applicability. Overall, this research supplies high-quality synthetic data that can support a deeper understanding of turbulent phenomena and accelerate the development of downstream tasks.