Workshop on Scientific Computing and Machine Learning

2024.11.25

Organizers: 鲁剑锋 (Department of Mathematics, Duke University), 明平兵 (Academy of Mathematics and Systems Science, Chinese Academy of Sciences)

Dates: December 22-28, 2024


Program

December 23, 2024: Talks

8:30-8:40    Opening ceremony: opening remarks and group photo
8:40-9:40    明平兵: Scientific Computing and Machine Learning
9:40-10:40   毛志平: Solving Differential Equations with Deep Learning: Algorithms, Theory, Applications, and Advances (I)
10:40-11:10  Tea break
11:10-11:50  高卫国: How Do Flow Matching Models Memorize and Generalize in Sample Data Subspaces?
11:50        Lunch break
14:00-14:40  张钊: A Surrogate Model for Parameterized Reservoir Flow Based on Discretized Physics-Informed Neural Networks
14:40-15:20  于灏: Generalization Error Estimate of a Machine Learning Method for Solving High Dimensional Schrödinger Eigenvalue Problems
15:20-15:50  Tea break
15:50-17:10  Forum
17:10        Dinner

 

December 24, 2024: Talks

8:30-9:30    毛志平: Solving Differential Equations with Deep Learning: Algorithms, Theory, Applications, and Advances (II)
9:30-10:30   王飞: Adaptive Growing Randomized Neural Networks for Solving Partial Differential Equations
10:30-11:00  Tea break
11:00-11:40  贺巧琳: Moving Sampling Physics-informed Neural Networks induced by Moving Mesh PDE
11:40        Lunch break
14:00-17:00  Free discussion
17:00        Dinner

 

December 25, 2024: Talks

8:30-9:10    史作强: Interface Laplace Learning: Nonlocal Interface Model Helps Semi-Supervised Learning
9:10-9:50    邱越: Data-driven Discovery of (Partial) Differential Equations with Machine Learning
9:50-10:20   Tea break
10:20-11:00  唐科军: Solving high-dimensional partial differential equations with deep learning requires rethinking sample generation
11:00-11:40  廖奇峰: Density Estimation with the KRnet Flow Model and Solving High-Dimensional PDEs
11:40        Lunch break
14:00-14:40  项阳: Exploring the Training Challenges of Neural Network-Based PDE Solvers
14:40-15:20  潘晓敏: Applications and Potential of Monolithic Projection-Based Methods with Staggered Time Discretization
15:20-15:50  Tea break
15:50-16:30  张悦嘉: Parallel Coordinate Descent Methods for Full Configuration Interaction
16:30-17:10  郭义销: A deep learning framework for solving heat equations with non-smooth boundary conditions and obstacle constraints
17:10        Dinner


December 26, 2024: Talks

8:30-9:10    王东: Efficient and stable methods for computing partition problems
9:10-10:10   焦雨领: Theoretical Study of Deep Learning (1)
10:10-10:40  Tea break
10:40-11:40  焦雨领: Theoretical Study of Deep Learning (2)
11:40        Lunch break
14:00-14:40  应文俊: Physics-informed Data-driven Cavitation Model for a Specific Mie-Grüneisen Equation of State
14:40-15:20  凌舒扬: Beyond Unconstrained Features: Neural Collapse for Shallow Neural Networks with General Data
15:20-15:50  Tea break
15:50-16:30  冯亚妮: Dimension-reduced KRnet maps for high-dimensional Bayesian inverse problems
16:30-17:10  陈恒准: On the Continuity of Schur-Horn Mapping
17:10        Dinner


December 27, 2024: Talks

8:30-9:30    焦雨领: Theoretical Study of Deep Learning (3)
9:40-10:40   焦雨领: Theoretical Study of Deep Learning (4)
10:40-11:10  Tea break
11:10-11:50  廖钰蕾: Spectral Barron space and deep neural network approximation
11:50        Lunch break
14:00-17:00  Free discussion
17:00        Dinner


 

Abstracts of Invited Talks

(ordered alphabetically by surname)

Speaker: 冯亚妮 (The University of Hong Kong)

Title: Dimension-reduced KRnet maps for high-dimensional Bayesian inverse problems

Abstract:

       We present a dimension-reduced KRnet map approach (DR-KRnet) for high-dimensional Bayesian inverse problems, based on an explicit construction of a map that pushes forward the prior measure to the posterior measure in the latent space. Our approach consists of two main components: a data-driven VAE prior and a density approximation of the posterior of the latent variable. In practice, it may not be trivial to specify a prior distribution that is consistent with the available prior data; in other words, the complex prior information is often beyond simple hand-crafted priors. We employ a variational autoencoder (VAE) to approximate the underlying distribution of the prior dataset, realized through a latent variable and a decoder. Using the decoder provided by the VAE prior, we reformulate the problem in a low-dimensional latent space. In particular, we seek an invertible transport map given by KRnet to approximate the posterior distribution of the latent variable. Moreover, an efficient surrogate model is constructed to reduce the computational cost of solving both the forward and adjoint problems involved in the likelihood computation. Numerical experiments demonstrate the accuracy and efficiency of DR-KRnet for high-dimensional Bayesian inverse problems.

 

 

Speaker: 高卫国 (Fudan University)

Title: How Do Flow Matching Models Memorize and Generalize in Sample Data Subspaces?

Abstract:

       Real-world data is often assumed to lie within a low-dimensional structure embedded in high-dimensional space. In practical settings, we observe only a finite set of samples, forming what we refer to as the sample data subspace. It serves an essential approximation supporting tasks such as dimensionality reduction and generation. A major challenge lies in whether generative models can reliably synthesize samples that stay within this subspace rather than drifting away from the underlying structure. In this work, we provide theoretical insights into this challenge by leveraging Flow Matching models, which transform a simple prior into a complex target distribution via a learned velocity field. By treating the real data distribution as discrete, we derive analytical expressions for the optimal velocity field under a Gaussian prior, showing that generated samples memorize real data points and represent the sample data subspace exactly. To generalize to suboptimal scenarios, we introduce the Orthogonal Subspace Decomposition Network (OSDNet), which systematically decomposes the velocity field into subspace and off-subspace components. Our analysis shows that the off-subspace component decays, while the subspace component generalizes within the sample data subspace, ensuring generated samples preserve both proximity and diversity. This is joint work with Ming Li.
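The closed-form optimal velocity field for a discrete target, which drives the memorization behavior described above, is simple enough to integrate directly. The NumPy sketch below is our own illustration, not the speakers' setup: the two-point dataset, the standard Gaussian prior, the linear path x_t = (1-t) x0 + t y, and explicit Euler stepping are all assumptions made for the demo.

```python
import numpy as np

def optimal_velocity(x, t, Y):
    # Closed-form optimal flow-matching velocity for a discrete target {y_j}
    # under a standard Gaussian prior and the linear path x_t = (1-t) x0 + t y.
    # Posterior weights over endpoints: w_j ∝ exp(-|x - t y_j|^2 / (2 (1-t)^2)).
    d2 = ((x - t * Y) ** 2).sum(axis=1)
    w = np.exp(-(d2 - d2.min()) / (2.0 * (1.0 - t) ** 2))
    w /= w.sum()
    ybar = w @ Y                          # posterior mean of the data endpoint
    return (ybar - x) / (1.0 - t)

Y = np.array([[1.0, 0.0], [-1.0, 0.0]])   # a two-point "dataset"
rng = np.random.default_rng(1)
x = rng.normal(size=2)                    # draw a sample from the prior
for t in np.linspace(0.0, 0.99, 100):     # explicit Euler with dt = 0.01
    x = x + 0.01 * optimal_velocity(x, t, Y)
dist = np.min(np.linalg.norm(Y - x, axis=1))   # distance to nearest data point
```

Integrating this field carries the prior sample essentially onto one of the two data points, which is exactly the memorization regime analyzed in the talk.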

Speaker: 贺巧琳 (Sichuan University)

Title: Moving Sampling Physics-informed Neural Networks induced by Moving Mesh PDE

Abstract:

       In this work, we propose an end-to-end adaptive sampling framework (MMPDE-Net) based on deep neural networks and the moving mesh method, which adaptively generates new sampling points by solving the moving mesh PDE. The model focuses on improving the quality of sampling-point generation. Moreover, we develop an iterative algorithm based on MMPDE-Net that distributes the sampling points more precisely and controllably. Since MMPDE-Net is independent of the deep learning solver, we combine it with physics-informed neural networks (PINN) to propose moving sampling PINN (MS-PINN), and we establish an error estimate for the method under certain assumptions. Finally, we demonstrate the performance improvement of MS-PINN over PINN through numerical experiments on four typical examples, which verify the effectiveness of our method.

 

 

Speaker: 焦雨领 (Wuhan University)

Title: Theoretical Study of Deep Learning

Abstract:

       In this talk, we will discuss the theory of deep learning in the framework of nonparametric estimation, focusing on approximation, generalization, optimization, and minimax lower bounds.

 


 

Speaker: 廖奇峰 (ShanghaiTech University)

Title: Density Estimation with the KRnet Flow Model and Solving High-Dimensional PDEs

Abstract:

       Probability density estimation remains a difficult problem in computational science and engineering. By coupling the Knothe-Rosenblatt (KR) rearrangement with flow-based generative modeling, we developed an invertible transport map, called KRnet, for high-dimensional density estimation. This talk gives an overview of KRnet and introduces its adaptive variants for solving high-dimensional partial differential equations.

 

 

Speaker: 廖钰蕾 (National University of Singapore)

Title: Spectral Barron space and deep neural network approximation

Abstract:

       We prove the sharp embedding between the spectral Barron space and the Besov space with embedding constants independent of the input dimension. Given the spectral Barron space as the target function space, we prove a dimension-free convergence result that if the neural network contains $L$ hidden layers with $N$ units per layer, then the upper and lower bounds of the $L^2$-approximation error are $\mathcal{O}(N^{-sL})$ with $0 < sL\le 1/2$, where $s\ge 0$ is the smoothness index of the spectral Barron space. This is a joint work with Prof. Pingbing Ming.


 

Speaker: 凌舒扬 (NYU Shanghai)

Title: Beyond Unconstrained Features: Neural Collapse for Shallow Neural Networks with General Data

Abstract:

       Neural collapse (NC) is a phenomenon that emerges at the terminal phase of training (TPT) of deep neural networks (DNNs): the features of data in the same class collapse to their respective sample means, and the sample means exhibit a simplex equiangular tight frame (ETF). In the past few years, there has been a surge of works explaining why NC occurs and how it affects generalization. Since DNNs are notoriously difficult to analyze, most works focus on the unconstrained feature model (UFM). In this work, we focus on shallow ReLU neural networks and try to understand how the width, depth, data dimension, and statistical properties of the training dataset influence neural collapse. We provide a complete characterization of when NC occurs for two- or three-layer neural networks. For two-layer ReLU networks, a sufficient condition for the global minimizer of the regularized empirical risk to exhibit the NC configuration depends on the data dimension, sample size, and signal-to-noise ratio (SNR) in the data, rather than on the network width. For three-layer networks, we show that NC occurs as long as the first layer is sufficiently wide. Regarding the connection between NC and generalization, we show that generalization depends heavily on the SNR in the data. Our results significantly extend the state-of-the-art theoretical analysis of NC under the UFM by characterizing its emergence in shallow nonlinear networks and showing how it depends on data properties and network architecture.

 

 

Speaker: 毛志平 (Xiamen University)

Title: Solving Differential Equations with Deep Learning: Algorithms, Theory, Applications, and Advances

Abstract:

       In this talk, I will first review the development of deep learning methods for solving differential equations, along with the current basic algorithms (PINN, Deep Ritz, WAN, ELM, and related methods) and their theory. Next, I will introduce deep operator learning algorithms (PDE-Net, DeepONet, GNO/FNO, etc.) and applications of deep learning and deep operator learning in engineering. Finally, I will present some recent advances in solving differential equations with neural networks and share some thoughts on the future directions of this field.


 

Speaker: 明平兵 (Academy of Mathematics and Systems Science, Chinese Academy of Sciences)

Title: Scientific Computing and Machine Learning

Abstract:

       We focus on the integration of scientific computing with machine learning, in particular on deep learning based numerical methods for partial differential equations. We discuss the state of the art of this topic in China and abroad, the challenges it faces, and its future trends.

 

 

Speaker: 潘晓敏 (Shanghai University)

Title: Applications and Potential of Monolithic Projection-Based Methods with Staggered Time Discretization

Abstract:

       An advanced monolithic projection-based method equipped with staggered time discretization (MPM-STD) has been crafted to probe into the non-Oberbeck–Boussinesq (NOB) effects within a diverse array of natural convection conditions, notably where fluid properties significantly alter with temperature. Employing the Crank–Nicolson scheme in tandem with staggered time discretization for momentum and energy equations enables the decoupling of these equations. This approach directly addresses the complexity of the variable coefficient Poisson equation through an innovative pressure-correction scheme, converting it into a constant coefficient equation for enhanced computation efficiency. Through numerical tests in two-dimensional periodic NOB Rayleigh–Bénard convection (RBC) with glycerol, the MPM-STD's second-order accuracy and efficiency were validated. The application of this method to differentially heated cavity problems in air and RBC in liquids under NOB effects demonstrated its capacity to significantly mitigate timestep constraints and enhance computational efficiency, surpassing conventional semi-implicit and explicit techniques. Additionally, the application of MPM-STD to challenging three-dimensional turbulence scenarios, as evidenced in direct simulations of turbulent RBC under NOB effects, exemplifies its robustness and potential for tackling complex fluid dynamics investigations. The method's extensive applicability, including the study of NOB effects in RBC with various aspect ratios and under different conditions, underscores its adaptability and comprehensive utility in addressing intricate fluid dynamics challenges.


 

Speaker: 邱越 (Chongqing University)

Title: Data-driven Discovery of (Partial) Differential Equations with Machine Learning

Abstract:

       Differential equations, including partial differential equations (PDEs), are prevalent in both scientific research and engineering applications. However, the governing equations are unknown in many situations, and it can be challenging to study such systems directly from first principles. Using data from the underlying systems to model and predict them therefore becomes important and necessary. In this talk, we introduce two types of machine learning approaches for the data-driven discovery of (partial) differential equations. In the first part, we focus on Koopman operator theory, which transforms nonlinear but finite-dimensional systems into linear yet infinite-dimensional ones, so that data can be leveraged to learn a linear system. The key lies in constructing a set of bijective functions for this transformation, and we propose a coupling-flow invertible neural network approach. In the second part, we introduce the multi-fidelity Gaussian process (MFGP) for the sparse identification of nonlinear dynamical systems (SINDy). Our MFGP-SINDy is not only robust to noise but also quantifies uncertainty and exploits multi-fidelity data to reduce the computational complexity. We demonstrate the advantage of our algorithms over state-of-the-art methodologies on various PDE benchmark problems.
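As a concrete anchor for the SINDy part of the abstract, here is a minimal sequential-thresholded-least-squares sketch in plain NumPy. The toy setup (recovering dx/dt = -2x from a small candidate library) is our own; the Koopman and MFGP components of the talk are not modeled.

```python
import numpy as np

def stls(Theta, dxdt, lam=0.1, iters=10):
    # Sequential thresholded least squares: the core sparse regression of SINDy.
    xi, *_ = np.linalg.lstsq(Theta, dxdt, rcond=None)
    for _ in range(iters):
        xi[np.abs(xi) < lam] = 0.0           # prune small coefficients
        big = np.abs(xi) >= lam
        if big.any():                        # refit on the surviving terms
            xi[big], *_ = np.linalg.lstsq(Theta[:, big], dxdt, rcond=None)
    return xi

t = np.linspace(0.0, 2.0, 200)
x = np.exp(-2.0 * t)                         # trajectory of dx/dt = -2 x
dxdt = np.gradient(x, t)                     # numerical time derivative
Theta = np.column_stack([np.ones_like(x), x, x**2])   # candidate library [1, x, x^2]
xi = stls(Theta, dxdt)                       # expect roughly [0, -2, 0]
```

The thresholding step is what produces a sparse, interpretable model; the MFGP variant in the talk replaces the plain least squares with a multi-fidelity Gaussian process fit.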

 

 

Speaker: 史作强 (Tsinghua University)

Title: Interface Laplace Learning: Nonlocal Interface Model Helps Semi-Supervised Learning

Abstract:

       We introduce a novel framework, called Interface Laplace learning, for graph-based semi-supervised learning. Motivated by the nonlocal interface model, we introduce a Laplace learning model that incorporates an interface term. This model challenges the long-standing assumption that functions are smooth at all unlabeled points. In the proposed approach, we add an interface term to the Laplace learning model at the interface positions. We provide a practical algorithm to approximate the interface positions using k-hop neighborhood indices, and to learn the interface term from labeled data without artificial design. Our method is efficient and effective, and we present extensive experiments demonstrating that Interface Laplace learning achieves better performance than other recent semi-supervised learning approaches at extremely low label rates on the MNIST, FashionMNIST, and CIFAR-10 datasets.
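For readers unfamiliar with the baseline, plain graph Laplace learning (without the interface term proposed in the talk) propagates labels by solving a discrete Laplace equation with the labeled nodes held fixed. A minimal sketch on a toy path graph, entirely our own example:

```python
import numpy as np

# Toy graph: a path 0-1-2-3-4; nodes 0 and 4 carry labels 0.0 and 1.0.
W = np.diag(np.ones(4), 1)
W = W + W.T                                  # symmetric adjacency matrix
L = np.diag(W.sum(axis=1)) - W               # combinatorial graph Laplacian
labeled, unlabeled = np.array([0, 4]), np.array([1, 2, 3])
g = np.array([0.0, 1.0])                     # label values on the labeled nodes
# Solve L u = 0 on the unlabeled nodes, labeled values acting as boundary data.
u = np.linalg.solve(L[np.ix_(unlabeled, unlabeled)],
                    -L[np.ix_(unlabeled, labeled)] @ g)
```

The harmonic extension interpolates linearly along the path; the interface model in the talk modifies exactly this linear system near the approximated interface positions.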


 

Speaker: 唐科军 (Shenzhen University of Advanced Technology)

Title: Solving high-dimensional partial differential equations with deep learning requires rethinking sample generation

Abstract:

       Using deep learning methods to approximate an unknown function often involves computing an integral. The effective way to discretize the high-dimensional integration is Monte Carlo sampling, and the discretization accuracy of the loss function is one of the key points to the solution accuracy. One fundamental numerical difficulty is that random samples in the training set introduce statistical errors into the discretization of the loss functional which may become the dominant error in the final approximation, and therefore overshadow the modeling capability of the neural network. In this talk, we will show how to use deep adaptive sampling methods for solving high-dimensional partial differential equations (including Fokker-Planck equations and surrogate modeling of parametric differential equations), and its applications to rare events in molecular simulations.

 

 

Speaker: 王东 (The Chinese University of Hong Kong)

Title: Efficient and stable methods for computing partition problems

Abstract:

       A Dirichlet k-partition of a domain is a collection of k pairwise disjoint open subsets such that the sum of their first Laplace-Dirichlet eigenvalues is minimal. In this talk, we propose a new relaxation of the problem by introducing auxiliary indicator functions of the domains, and we develop a simple and efficient diffusion generated method to compute Dirichlet k-partitions for arbitrary domains. The method alternates just three steps: 1. convolution, 2. thresholding, and 3. projection. It is simple, easy to implement, insensitive to initial guesses, and can be applied to arbitrary domains without any special discretization; at each iteration, the computational complexity is linear in the discretization of the computational domain. Moreover, we theoretically prove the energy-decaying property of the method. Experiments are performed to show the accuracy of the approximation, the efficiency, and the unconditional stability of the algorithm. We will also extend the method to spectral classification problems and introduce deep learning based algorithms for these problems.
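The three-step iteration named in the abstract fits in a few lines. Below is a one-dimensional toy version of ours: a few explicit heat steps stand in for the convolution, no eigenvalues are evaluated, and the grid, step counts, and initialization are all demo assumptions rather than the authors' scheme.

```python
import numpy as np

def diffusion_generated_partition(k, n, heat_steps=40, iters=30, seed=0):
    # Toy diffusion-generated method on n grid points of an interval:
    # alternate 1) convolution (heat flow), 2) thresholding, 3) projection
    # back to indicator functions, mirroring the abstract's three-step loop.
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=n)      # random initial partition
    u = np.eye(k)[labels].T                  # k indicator functions (rows)
    for _ in range(iters):
        v = u.copy()
        for _ in range(heat_steps):          # 1. convolution via explicit heat steps
            v[:, 1:-1] += 0.25 * (v[:, :-2] - 2.0 * v[:, 1:-1] + v[:, 2:])
        labels = v.argmax(axis=0)            # 2. thresholding (pointwise argmax)
        u = np.eye(k)[labels].T              # 3. projection to indicators
    return labels

labels = diffusion_generated_partition(k=2, n=100)
```

Each sweep only smooths and re-thresholds the indicator functions, which is why the cost per iteration is linear in the size of the discretization, as the abstract states.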


 

Speaker: 王飞 (Xi'an Jiaotong University)

Title: Adaptive Growing Randomized Neural Networks for Solving Partial Differential Equations

Abstract:

       Traditional numerical methods face numerous challenges in handling high-dimensional problems, complex domain decompositions, and the error accumulation caused by time iteration. Meanwhile, neural network methods based on optimization training suffer from insufficient accuracy, slow training, and uncontrollable errors due to the lack of efficient optimization algorithms. To combine the advantages of these two approaches and overcome their shortcomings, randomized neural network methods have been proposed. These methods leverage the strong approximation capabilities of neural networks to circumvent the limitations of classical numerical methods, while also addressing the accuracy and training-efficiency issues of neural networks. In this talk, by incorporating a posteriori error estimation as feedback, we propose Adaptive Growing Randomized Neural Networks for solving PDEs. This approach adaptively generates the network structure, significantly improving the approximation capability.
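The core mechanics of a randomized network, random frozen hidden parameters plus a single linear least-squares solve for the output layer, can be shown in a few lines. This is a generic ELM-style function-fitting sketch of ours; the adaptive growing strategy and the PDE residual terms of the talk are not included, and the width and weight scales are arbitrary demo choices.

```python
import numpy as np

def fit_randomized_nn(x, y, width=200, seed=0):
    # Random-feature (ELM-style) fit: hidden weights and biases are sampled
    # once and frozen; only the output layer is found by linear least squares.
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=4.0, size=width)    # frozen hidden weights (1D input)
    b = rng.uniform(-4.0, 4.0, size=width)   # frozen hidden biases
    H = np.tanh(np.outer(x, W) + b)          # random hidden-layer features
    c, *_ = np.linalg.lstsq(H, y, rcond=None)
    return lambda s: np.tanh(np.outer(s, W) + b) @ c

x = np.linspace(0.0, 1.0, 400)
y = np.sin(2.0 * np.pi * x)
u = fit_randomized_nn(x, y)
err = np.max(np.abs(u(x) - y))               # training error of the fit
```

Because only the output layer is trained, the fit reduces to one linear solve, which is the source of the controllable error and fast training emphasized in the abstract.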

 

 

Speaker: 项阳 (The Hong Kong University of Science and Technology)

Title: Exploring the Training Challenges of Neural Network-Based PDE Solvers

Abstract:

       Neural network-based methods have emerged as powerful tools for solving partial differential equations (PDEs) in scientific and engineering applications. However, training such networks can be challenging, often resulting in limited accuracy. In this talk, we examine the training dynamics of neural network-based PDE solvers, with a particular focus on understanding and mitigating training difficulties. By analyzing the eigenvalue distribution of the associated kernel, we leverage the concept of effective rank as a quantitative measure of these challenges. Specifically, we address two key aspects: (1) the role of initialization strategies, including partition of unity and variance scaling techniques, in influencing training dynamics; and (2) the treatment of the differential operator within the loss function from a training perspective. Additionally, we present comprehensive experiments employing PDE-solving frameworks such as Physics-Informed Neural Networks (PINNs), the Deep Ritz method, and the operator learning framework DeepONet to validate our theoretical analysis.

 


 

Speaker: 应文俊 (Shanghai Jiao Tong University)

Title: Physics-informed Data-driven Cavitation Model for a Specific Mie-Grüneisen Equation of State

Abstract:

       We will present a novel one-fluid cavitation model for a specific Mie-Grüneisen equation of state (EOS), the polynomial EOS, based on an artificial neural network. Not only the physics-informed equation but also experimental data are embedded into the proposed model through an optimization problem. The physics-informed, data-driven model provides the pressure of interest within the cavitation region, where the pressure tends to zero and the density follows the same trend. The model is then applied to challenging compressible multiphase flow simulations, such as nuclear and underwater explosions. Numerical simulations show that our model agrees well with the corresponding experimental data, in one to three dimensions, using an h-adaptive mesh refinement algorithm and load balancing techniques on structured and unstructured grids.

 

 

Speaker: 张钊 (Shandong University)

Title: A Surrogate Model for Parameterized Reservoir Flow Based on Discretized Physics-Informed Neural Networks

Abstract:

       Physics-informed neural networks make it possible to build fluid-flow simulations and surrogate models that fuse observed data with physical information. The basic idea is to form a loss function from the governing equations, the initial and boundary conditions, and the observed data, and to train a neural network on it, with the loss evaluated via automatic differentiation. However, applying physics-informed neural networks to reservoir flow faces two difficulties. First, automatic differentiation cannot strictly enforce mass conservation between collocation points on heterogeneous reservoir models. Second, multiscale regions such as wells and fractures exhibit large pressure gradients, which degrade the accuracy of automatic differentiation. To address these issues, we combine the finite volume method with convolutional neural networks to build physics-informed convolutional neural networks for numerical simulation and surrogate modeling. The flow equations are discretized with the finite volume method, and the residual of the discrete equations is used to approximate the residual of the original equations and to construct the loss function. Mature algorithms from reservoir simulation, such as discrete fracture and well models, can likewise be encoded into the loss function through the finite volume scheme, which facilitates the adoption of physics-informed neural networks in reservoir simulation. The network is a convolutional neural network, suited to structured model inputs; geological models are typically structured. Training efficiency is further improved by adaptive sampling, and the architecture can be further optimized with residual networks and attention mechanisms. Finally, the accuracy and computational efficiency of the algorithm are verified on typical reservoir-flow examples.


 

Student Forum

(ordered alphabetically by surname)

 

Speaker: 陈恒准 (Fudan University)

Title: On the Continuity of Schur-Horn Mapping

Abstract:

       The Schur-Horn theorem is a well-known result that characterizes the relationship between the diagonal elements and eigenvalues of a symmetric (Hermitian) matrix. In this work, we extend this theorem by exploring the eigenvalue perturbation of a symmetric (Hermitian) matrix with fixed diagonals, which is referred to as the continuity of the Schur-Horn mapping. We introduce a concept called strong Schur-Horn continuity, characterized by minimal constraints on the perturbation. We demonstrate that several categories of matrices exhibit strong Schur-Horn continuity. Leveraging this notion, along with a majorization constraint on the perturbation, we prove the Schur-Horn continuity for general symmetric (Hermitian) matrices. The Schur-Horn continuity finds applications in oblique manifold optimization related to quantum computing.
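For orientation, the classical theorem being extended can be stated compactly as a majorization condition (standard textbook formulation, written here in our notation, with entries sorted in decreasing order as $d_{[1]} \ge \dots \ge d_{[n]}$):

```latex
% Schur-Horn theorem: for real vectors d (prescribed diagonal) and
% \lambda (prescribed spectrum), a symmetric matrix with diagonal d and
% eigenvalues \lambda exists iff d is majorized by \lambda.
\exists\, A = A^{\top}\ \text{with}\ \operatorname{diag}(A) = d,\
\operatorname{spec}(A) = \lambda
\iff
\sum_{i=1}^{k} d_{[i]} \le \sum_{i=1}^{k} \lambda_{[i]}
\quad (1 \le k \le n),
\qquad
\sum_{i=1}^{n} d_i = \sum_{i=1}^{n} \lambda_i .
```

The talk's "continuity" question asks how the attainable spectra vary under perturbations that keep the diagonal $d$ fixed.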

 

 

Speaker: 郭义销 (Academy of Mathematics and Systems Science, Chinese Academy of Sciences)

Title: A deep learning framework for solving heat equations with non-smooth boundary conditions and obstacle constraints

Abstract:

       In this talk, we will present a deep learning framework for pricing American options based on the Black-Scholes-Merton (BSM) equation. By using specially designed functions, our method ensures that the numerical solution not only satisfies the boundary condition exactly, but also matches the non-smooth and singular behavior of the solution near the boundary. Our approach effectively addresses the challenges posed by inequality constraints and can be easily extended to high-dimensional settings. We will also present various experiments demonstrating that the proposed method is accurate and robust in a range of situations, outperforming traditional numerical methods and other machine learning methods.

 


 

Speaker: 于灏 (Academy of Mathematics and Systems Science, Chinese Academy of Sciences)

Title: Generalization Error Estimate of a Machine Learning Method for Solving High Dimensional Schrödinger Eigenvalue Problems

Abstract:

       The high-dimensional Schrödinger eigenvalue problem plays a crucial role in various fields, such as computational chemistry, condensed matter physics, and quantum computing. Although classical numerical methods have achieved great success in solving low-dimensional PDEs and eigenvalue problems, a major challenge persists: the curse of dimensionality. Recently, significant progress has been made in applying deep neural networks to solve PDEs and Schrödinger eigenvalue problems. In this talk, we introduce a machine learning method for computing eigenvalues and eigenfunctions of the Schrödinger operator with Dirichlet boundary conditions, including eigenvalues deep in the spectrum. A cut-off function technique is employed to construct trial functions that satisfy the Dirichlet boundary conditions exactly; this approach outperforms the standard boundary penalty method, as demonstrated by numerical tests. Under the assumption that the eigenfunctions belong to a spectral Barron space, we derive a dimension-free convergence rate for the generalization error bound of the method, with constants that grow at most polynomially in $d$. This assumption is verified by proving a new regularity result for the eigenfunctions when the potential lies in an appropriate spectral Barron space. Moreover, we prove a sharp accumulation rate of the generalization error and extend the generalization bound to the normalized penalty method, which is widely used in practice.
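The cut-off construction, multiplying the network by a factor that vanishes on the boundary so the Dirichlet condition holds exactly rather than through a penalty, is easy to illustrate in one dimension. This is a toy sketch of ours with a frozen random-feature stand-in for the network; the actual method trains the network and works in high dimension.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=30)
b = rng.normal(size=30)
c = rng.normal(size=30)

def network(x):
    # Stand-in for a trained neural network on [0, 1].
    return np.tanh(np.outer(x, W) + b) @ c

def trial(x):
    # The cut-off factor x (1 - x) vanishes at x = 0 and x = 1, so the trial
    # function satisfies the homogeneous Dirichlet condition exactly,
    # independently of the network parameters.
    return x * (1.0 - x) * network(x)

vals = trial(np.array([0.0, 0.5, 1.0]))
```

Multiplying by the cut-off removes the boundary condition from the loss entirely, which is why no boundary penalty term (and no associated penalty-weight tuning) is needed.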

 

 

Speaker: 张悦嘉 (Fudan University)

Title: Parallel Coordinate Descent Methods for Full Configuration Interaction

Abstract:

       Solving the time-independent Schrödinger equation gives us full access to the chemical properties of molecules. Among ab initio methods, full configuration interaction (FCI) provides the numerically exact solution under a predefined basis set. However, the FCI problem scales exponentially with the number of basis functions and electrons and suffers from the curse of dimensionality. We develop a multi-threaded parallel coordinate descent full configuration interaction algorithm (CDFCI) for electronic structure ground-state calculations in the configuration interaction framework. The algorithm solves an unconstrained nonconvex optimization problem via a modified block coordinate descent method with a deterministic compression strategy. CDFCI captures and updates important determinants with frequencies proportional to their importance. We demonstrate the efficiency of the algorithm on practical systems.
       Solving the time-independent Schrödinger equation gives us full access to the chemical properties of molecules. Among all the ab-initio methods, full configuration interaction (FCI) provides the numerically exact solution under a predefined basis set. However, the FCI problem scales exponentially with respect to the number of bases and electrons and suffers from the curse of dimensionality.<br>We develop a mutli-threaded parallel coordinate descent full configuration interaction algorithm, for the electronic structure ground-state calculation in the configuration interaction framework. The algorithm solves an unconstrained nonconvex optimization problem, via a modified block coordinate descent method with a deterministic compression strategy. CDFCI captures and updates appreciative determinants with different frequencies proportional to their importance. We demonstrate the efficiency of the algorithm on practical systems.