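Advanced Methods and Theories in High-dimensional Image Processing

2025.02.23

Organizers: 吴国宝 (Hong Kong Baptist University, Chair Professor), 赵熙乐 (University of Electronic Science and Technology of China, Professor), 王超 (Southern University of Science and Technology, Associate Researcher)

Dates: 2025.03.16 to 2025.03.22


Workshop on Advanced Methods and Theories in High-dimensional Image Processing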

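Overall Schedule

| Time \ Date | 3.17 | 3.18 | 3.19 | 3.20 | 3.21 |
|---|---|---|---|---|---|
| Chair | | 黄玉梅 | 李朝迁 | 丁维洋 | 赵熙乐 |
| 9:00-9:30 | 高连如 | 骆其伦 | 庄晓生 | 魏益民 | 王红霞 |
| 9:30-10:00 | Michael Ng | 王治国 | 刘昭强 | 王卫卫 | 崔卓须 |
| 10:00-10:15 | | | | | |
| Chair | Michael Ng | | 庄晓生 | 王卫卫 | |
| 10:15-10:45 | Andrea Marinoni | 王书强 | 李洽 | 喻高航 | |
| 10:45-11:15 | 潘珺珺 | 陈冲 | 刘皓 | 何雨晨 | |
| 11:15-11:45 | 刘晔 | 王超 | 缪吉飞 | 曾铁勇 | |
| 11:45-14:30 | | | | | |
| Chair | | 庞彤瑶 | Free discussion | 刘昭强 | Free discussion |
| 14:30-15:00 | 王尧 | 李超 | | 韩雨 | |
| 15:00-15:30 | 庄丽娜 | Kit Ian Kou | | 曹相湧 | |
| 15:30-15:45 | | | | | |
| Chair | 潘珺珺 | 赵熙乐 | | 曹相湧 | |
| 15:45-16:15 | 彭江军 | 李敏 | | 冯寒 | |
| 16:15-16:45 | 丁猛 | 丁乔乔 | | | |
| 16:45-17:15 | 郑玉棒 | 芮翔宇 | | 冀腾宇 | |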

 

 

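2025/3/17 (Monday)

| Time | Speaker | Title | Chair |
|---|---|---|---|
| 09:00-09:30 | 高连如 | Hyperspectral Remote Sensing Image Processing and Information Extraction | |
| 09:30-10:00 | Michael Ng | Quaternion Computation for Signal and Image Processing | |
| 10:00-10:15 | | | |
| 10:15-10:45 | Andrea Marinoni | Multimodal Data Analysis with Sparse and Missing Data: From Fluid Graph to Soft Manifolds | Michael Ng |
| 10:45-11:15 | 潘珺珺 | Non-Negative Reduced Biquaternion Matrix Factorization with Applications in Color Face Recognition | Michael Ng |
| 11:15-11:45 | 刘晔 | Multimodal Integration Algorithms and Their Applications to Omics Data | Michael Ng |
| 11:45-14:30 | | | |
| 14:30-15:00 | 王尧 | Fast and Provable Nonconvex Low-Tubal-Rank Tensor Recovery | |
| 15:00-15:30 | 庄丽娜 | Denoising Methods for Hyperspectral Remote Sensing Images | |
| 15:30-15:45 | | | |
| 15:45-16:15 | 彭江军 | Hyperspectral Image Denoising Based on Subspace Prior Mining | 潘珺珺 |
| 16:15-16:45 | 丁猛 | Bilateral Tensor Low-Rank Representation for Insufficient Observed Samples in Multidimensional Image Clustering and Recovery | 潘珺珺 |
| 16:45-17:15 | 郑玉棒 | Tensor Network Decomposition and Its Applications in High-Dimensional Data Restoration | 潘珺珺 |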


 









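2025/3/18 (Tuesday)

| Time | Speaker | Title | Chair |
|---|---|---|---|
| 09:00-09:30 | 骆其伦 | Bayesian Dictionary Learning on Robust Tubal Transformed Tensor Factorization | 黄玉梅 |
| 09:30-10:00 | 王治国 | Terahertz 3D Tomographic Imaging: Theory and Challenges from Physical Modeling to Computational Reconstruction | 黄玉梅 |
| 10:00-10:15 | | | |
| 10:15-10:45 | 王书强 | Multimodal Brain Imaging Computation Based on Generative AI | |
| 10:45-11:15 | 陈冲 | Convergence Analysis of the Nonlinear Kaczmarz Method for Systems of Nonlinear Equations with Component-wise Convex Mappings | |
| 11:15-11:45 | 王超 | Self-supervised Representations for Spectral and Temporal Imaging | |
| 11:45-14:30 | | | |
| 14:30-15:00 | 李超 | tnGPS: Discovering Unknown Tensor Network Structure Search Algorithms via Large Language Models (LLMs) | 庞彤瑶 |
| 15:00-15:30 | Kit Ian Kou | Advanced Signal Processing via Hypercomplex Analysis | 庞彤瑶 |
| 15:30-15:45 | | | |
| 15:45-16:15 | 李敏 | Nonconvex Low-Rank Modeling and Algorithms for Snapshot Compressive Sensing | 赵熙乐 |
| 16:15-16:45 | 丁乔乔 | Flow-based Generative Models for Medical Image Synthesis | 赵熙乐 |
| 16:45-17:15 | 芮翔宇 | Unsupervised Hyperspectral Pansharpening Based on Low-Rank Decomposition and Diffusion Models | 赵熙乐 |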


 









2025/3/19 (Wednesday)

| Time | Speaker | Title | Chair |
|---|---|---|---|
| 09:00-09:30 | 庄晓生 | Spherical Framelets from Spherical Designs | 李朝迁 |
| 09:30-10:00 | 刘昭强 | Generative PCA and Fast Diffusion Sampling | 李朝迁 |
| 10:00-10:15 | | | |
| 10:15-10:45 | 李洽 | First-Order Algorithms for Single-Ratio Minimization Problems | 庄晓生 |
| 10:45-11:15 | 刘皓 | Operator Learning and Neural Scaling Laws | 庄晓生 |
| 11:15-11:45 | 缪吉飞 | Quaternion Higher-Order Singular Value Decomposition and Its Applications in Color Image Processing | 庄晓生 |
| 11:45-14:30 | | | |
| 14:30-15:00 | Free discussion | | |
| 15:00-15:30 | | | |
| 15:30-15:45 | | | |
| 15:45-16:15 | | | |
| 16:15-16:45 | | | |
| 16:45-17:15 | | | |


 









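2025/3/20 (Thursday)

| Time | Speaker | Title | Chair |
|---|---|---|---|
| 09:00-09:30 | 魏益民 | Coseparable Nonnegative Tensor Factorization With T-CUR Decomposition | 丁维洋 |
| 09:30-10:00 | 王卫卫 | Low-Light Image Enhancement Based on Deep Learning | 丁维洋 |
| 10:00-10:15 | | | |
| 10:15-10:45 | 喻高航 | Sketching Methods for Large-Scale Tensor Decomposition with Applications | 王卫卫 |
| 10:45-11:15 | 何雨晨 | Region-based Image Vectorization | 王卫卫 |
| 11:15-11:45 | 曾铁勇 | Fast and Reliable Score-Based Generative Model for Parallel MRI | 王卫卫 |
| 11:45-14:30 | | | |
| 14:30-15:00 | 韩雨 | Fabric Image Recolorization by Fuzzy Pretrained Neural Network | 刘昭强 |
| 15:00-15:30 | 曹相湧 | Large Generative Models for Remote Sensing and Their Applications | 刘昭强 |
| 15:30-15:45 | | | |
| 15:45-16:15 | 冯寒 | Spherical Image Processing via Spherical Haar Framelets and CNNs | 曹相湧 |
| 16:15-16:45 | | Applications of Diffusion Generative Models to Computational Imaging Inverse Problems | 曹相湧 |
| 16:45-17:15 | 冀腾宇 | Unsupervised Network Methods Driven by Low-Rankness and Smoothness for Remote Sensing Images | 曹相湧 |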


 









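2025/3/21 (Friday)

| Time | Speaker | Title | Chair |
|---|---|---|---|
| 09:00-09:30 | 王红霞 | Learning Input Encodings for Kernel-Optimal Implicit Neural Representations | 赵熙乐 |
| 09:30-10:00 | 崔卓须 | White-Box Transformers for Accelerated MRI | 赵熙乐 |
| 10:00-10:15 | | | |
| 10:15-10:45 | | | |
| 10:45-11:15 | | | |
| 11:15-11:45 | | | |
| 12:00-14:30 | | | |
| 14:30-15:00 | Free discussion | | |
| 15:00-15:30 | | | |
| 15:30-15:45 | | | |
| 15:45-16:15 | | | |
| 16:15-16:45 | | | |
| 16:45-17:15 | | | |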


 

   

 

 

Talk Titles and Abstracts

(in order of presentation)

 

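Hyperspectral Remote Sensing Image Processing and Information Extraction

高连如 (Aerospace Information Research Institute, Chinese Academy of Sciences)

Abstract: Hyperspectral remote sensing combines imaging with spectroscopy: each pixel corresponds to a nearly continuous spectral curve that reflects the diagnostic spectral absorption differences of materials. Through feature extraction, image classification, target detection, and related techniques, it enables precise identification and discrimination of land-cover types, which is of great importance to many civilian and specialized fields. Hyperspectral remote sensing images contain rich spatial, spectral, and radiometric information about ground objects and have the characteristic advantage of unifying image and spectrum. While this gives them great application potential and value, their distinctive cube-shaped, high-dimensional data structure also poses challenges for information extraction. Over the past decade or so, many new methods for hyperspectral image processing and information extraction have been proposed on the basis of traditional machine learning, and with the rapid development of big data and artificial intelligence, deep network models for intelligent hyperspectral information extraction have advanced quickly in recent years. This talk covers traditional machine learning methods and also introduces new intelligent, deep-learning-based techniques for hyperspectral remote sensing image processing and information extraction.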

  

Quaternion Computation for Signal and Image Processing

Michael Ng (Hong Kong Baptist University)

Abstract: In this talk, we share recent results on quaternion computation for signal and image processing. Examples are given to demonstrate the results.

 

 

Multimodal Data Analysis with Sparse and Missing Data: From Fluid Graph to Soft Manifolds

Andrea Marinoni (UiT The Arctic University of Norway)

Abstract: Modern real-life datasets are increasingly characterized by multimodal sparse observations with missing records. If not adequately addressed, this property can dramatically degrade the performance of any data analysis system. The effect is more evident when considering information extraction in continuous spaces, which is a key factor for diverse tasks (e.g., learning, inferring, predicting) across operational scenarios. In particular, the reliability of data analysis directly depends on how well the geometry of the manifold in continuous space matches the data structure. State-of-the-art manifold-based learning algorithms assume that the projection onto a tangent space of each point in the manifold locally resembles a Euclidean space. Although this condition helps in achieving efficient analytical solutions to the embedding problem, it is not an adequate setup for working with missing records and sparse observations. In this work, we introduce a new class of manifold, named soft manifold, that can handle this situation. Soft manifolds are mathematical structures with spherical symmetry where the tangent spaces to each point are hypocycloids whose shape is defined according to the velocity of information propagation across the data points, inspired by the fluid dynamics diffusion mechanism. Experimental results on reconstruction tasks on synthetic and real datasets (especially drawn from environmental monitoring scenarios) show how the proposed approach enables more accurate and reliable characterization of information in continuous spaces than the state of the art.

 

 

Non-Negative Reduced Biquaternion Matrix Factorization with Applications in Color Face Recognition

潘珺珺 (Hong Kong Baptist University)

Abstract: Reduced biquaternion (RB), a four-dimensional algebra well-suited for representing color pixels, has recently gained significant attention in the field of image processing. This talk will introduce a novel non-negative RB matrix factorization (NRBMF) model to overcome the key difficulties associated with establishing a non-negative quaternion matrix factorization model, which are primarily due to the multiplication limitations of traditional quaternions. To solve the NRBMF model, we reformulate it as an RB alternating non-negative least squares (RB-ANNLS) problem. We then develop an RB projected gradient algorithm to solve the optimization problem by introducing a method to compute the gradient of a real-valued function with RB matrix variables. Experimental results on color face recognition demonstrate the effectiveness and superiority of the proposed NRBMF model, highlighting its potential for advancing color image processing applications.

 

 

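Multimodal Integration Algorithms and Their Applications to Omics Data

刘晔 (South China University of Technology)

Abstract: Advances in single-cell multi-omics sequencing have accumulated massive cross-scale, multimodal data, spanning microscopic molecular omics (genomics, proteomics, epigenomics, etc.) and mesoscopic imaging omics (such as H&E staining and immunofluorescence imaging). In recent years, spatial omics technologies have preserved the relative positions of cells within tissue while sequencing them, providing an effective tool for revealing cell-cell interactions and their microenvironment. However, compared with single-omics, multi-omics technologies tend to have larger sequencing errors, leading to strong heterogeneity, incompleteness, and low quality in multi-omics data. To address this, we propose a multimodal data integration algorithm that learns a unified representation of heterogeneous spatial omics data. The proposed algorithm achieves more accurate results in tasks such as spatial domain identification and pseudo-spatiotemporal developmental trajectory analysis.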

 

Fast and Provable Nonconvex Low-Tubal-Rank Tensor Recovery

王尧 (Xi'an Jiaotong University)

Abstract: This work considers the problem of recovering a tensor with an underlying low-tubal-rank structure from a small number of corrupted linear measurements. Traditional convex optimization approaches to this problem require computing the tensor singular value decomposition (t-SVD), which is computationally intensive and renders them impractical for large-scale tensors. To address this challenge, we propose an efficient and effective nonconvex optimization approach based on a factorization procedure akin to the Burer-Monteiro (BM) method. Precisely, our procedure decomposes a large tensor into two smaller factor tensors and then solves the problem through factorized gradient descent (FGD). This strategy eliminates the need for t-SVD computation, thereby reducing computational costs and storage requirements. We provide rigorous theoretical analysis to ensure the convergence of FGD in both noise-free and noisy situations. It is worth noting that our procedure does not require precise estimation of the tensor tubal rank: even when the tubal rank is slightly overestimated, our method continues to demonstrate robust performance. Moreover, the connections between the proposed procedure and the fine-tuning of large language models will be discussed.
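As a rough illustration of the factorized form mentioned in the abstract above, the following sketch implements the standard FFT-based t-product and builds a tensor as the t-product of two small factor tensors. It is not the speaker's FGD algorithm; the function names `t_product`, `t_transpose`, and `fourier_slice_ranks`, as well as the sizes used, are assumptions chosen for this example.

```python
import numpy as np

def t_product(A, B):
    """t-product of A (n1 x r x n3) and B (r x n2 x n3) via FFT along mode 3."""
    if A.shape[1] != B.shape[0] or A.shape[2] != B.shape[2]:
        raise ValueError("incompatible shapes for the t-product")
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    # Multiply matching frontal slices in the Fourier domain.
    C_hat = np.einsum("irk,rjk->ijk", A_hat, B_hat)
    # For real inputs the result is real up to rounding error.
    return np.fft.ifft(C_hat, axis=2).real

def t_transpose(A):
    """Transpose every frontal slice, then reverse the order of slices 2..n3."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def fourier_slice_ranks(X):
    """Matrix rank of each frontal slice in the Fourier domain."""
    X_hat = np.fft.fft(X, axis=2)
    return [int(np.linalg.matrix_rank(X_hat[:, :, k])) for k in range(X.shape[2])]

# Hypothetical sizes: an 8 x 7 x 5 tensor stored through two factors of width r = 2.
rng = np.random.default_rng(0)
n1, n2, n3, r = 8, 7, 5, 2
L = rng.standard_normal((n1, r, n3))
R = rng.standard_normal((n2, r, n3))
X = t_product(L, t_transpose(R))  # X = L * R^T, tubal rank at most r
```

Storing `L` and `R` takes (n1 + n2) * r * n3 numbers instead of n1 * n2 * n3, which is the storage saving a factorized approach relies on; every Fourier-domain slice of `X` has rank at most `r`.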

  

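Denoising Methods for Hyperspectral Remote Sensing Images

庄丽娜 (Aerospace Information Research Institute, Chinese Academy of Sciences)

Abstract: The strength of hyperspectral remote sensing images lies in their very high spectral resolution, which captures detailed information about the electromagnetic spectrum and enables fine-grained retrieval of surface properties. The price of this advantage is lower signal energy in each spectral channel, resulting in higher noise levels than those of conventional multispectral or RGB cameras. Image noise poses a major challenge for applying hyperspectral images to fine-grained extraction of land-cover information, so hyperspectral image denoising has become an important research problem in hyperspectral data processing. In this talk, I will present some of our recent progress on hyperspectral remote sensing image denoising and describe how spectral low-rankness, spatial self-similarity, and deep image priors can be exploited for hyperspectral image denoising.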

 

 

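Hyperspectral Image Denoising Based on Subspace Prior Mining

彭江军 (Northwestern Polytechnical University)

Abstract: Hyperspectral images (HSIs) are widely used in remote sensing, environmental monitoring, and other fields thanks to their rich spectral information, but they are susceptible to noise and their data are complex to process. This study proposes a restoration framework based on mining subspace priors from low-rank decomposition and demonstrates that the subspace factors accurately characterize the priors of the original data. The study covers three approaches: handcrafted regularization, supervised deep learning, and unsupervised deep learning. Handcrafted regularization uses total variation regularization to improve denoising efficiency but mines the prior insufficiently; supervised deep learning characterizes the prior finely through a learnable deep denoiser on the representation coefficients but faces generalization problems; unsupervised deep learning adopts a deep dual-branch network, which characterizes the prior accurately but takes longer to run. Together, these methods demonstrate the advantages of subspace prior mining and the potential of deep learning for prior characterization.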

Bilateral Tensor Low-Rank Representation for Insufficient Observed Samples in Multidimensional Image Clustering and Recovery

丁猛 (Southwest Jiaotong University)

Abstract: In this work, we study the subspace clustering and recovery of multidimensional images. Existing matrix-based and tensor-based subspace clustering methods successfully use unilateral information (i.e., the similarity between image samples) to cluster samples into subspaces by means of low-rank representation. The key issue with these unilateral representation-based methods is that the number of samples in each subspace must be sufficient for subspace representation. In practice, clustering performance can degrade when there is only a small number of observed samples in each subspace. To address the problem of insufficient observed samples, we propose to introduce hidden tensor data to supplement the insufficient number of observed samples. We employ both observed samples and hidden tensor data under low-rank constraints, so that a new bilateral tensor low-rank representation (BTLRR) for subspace clustering is formulated. We show that a closed-form solution with block-diagonal tensor structure is obtained in subspace clustering of observed samples and hidden tensor data. The proposed BTLRR optimization problem can also be solved by using a convex relaxation technique and the augmented Lagrangian multiplier algorithm. The proposed BTLRR can fully explore the bilateral information of observations, including not only the similarity between samples but also the relationship among features. Extensive numerical results on multidimensional image data clustering and recovery show that the proposed bilateral representation is more effective and robust than state-of-the-art methods (e.g., the popular LRR and TLRR methods).

  

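Tensor Network Decomposition and Its Applications in High-Dimensional Data Restoration

郑玉棒 (Southwest Jiaotong University)

Abstract: Tensor network decomposition aims to decompose a large-scale, high-order tensor into a series of small-scale, low-order factors (also called core tensors), and in recent years it has received wide attention and application in scientific computing, machine learning, computer vision, and other fields. This talk will review the basic operations of tensor network decomposition, introduce representative fixed-topology tensor network decompositions (such as the tensor train decomposition, the tensor ring decomposition, and the fully-connected tensor network decomposition) as well as topology self-revealing tensor network decompositions, and on this basis discuss the capability and performance of different tensor network decompositions in high-dimensional data restoration.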

  

Bayesian Dictionary Learning on Robust Tubal Transformed Tensor Factorization

骆其伦 (South China Normal University)

Abstract: Recent work on the tensor singular value decomposition (t-SVD), which applies the Fourier transform along the tubes of a third-order tensor, has achieved promising performance on multi-dimensional data recovery problems. However, such a fixed transformation, e.g., the discrete Fourier transform or the discrete cosine transform, cannot adapt to different datasets, and thus it is not flexible enough to exploit the low-rank and sparse properties of the variety of multi-dimensional datasets. In this paper, we consider a tube as an atom of a third-order tensor and construct a data-driven dictionary from the observed noisy data along the tubes of the given tensor. Then a Bayesian dictionary learning model with tensor tubal transformed factorization, aiming to identify the underlying low-tubal-rank structure of the tensor effectively via the data-adaptive dictionary, is developed to solve the tensor robust principal component analysis (TRPCA) problem. With the defined page-wise tensor operators, a variational Bayesian dictionary learning algorithm is established that updates the posterior distributions instantaneously along the third dimension to solve the TRPCA problem. Extensive experiments on real-world applications, such as color image and hyperspectral image denoising and background/foreground separation, demonstrate both the effectiveness and the efficiency of the proposed approach in terms of various standard metrics.

 

 

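Terahertz 3D Tomographic Imaging: Theory and Challenges from Physical Modeling to Computational Reconstruction

王治国 (Sichuan University)

Abstract: Thanks to its unique penetration capability and nondestructive nature, terahertz imaging shows great potential in medical diagnosis, industrial nondestructive testing, security screening, and other fields. This talk systematically presents recent advances in terahertz three-dimensional tomographic imaging. First, we discuss the physical characteristics of terahertz imaging and its application background. Second, we examine in depth the mathematical modeling of terahertz tomography, including the formulation of the forward problem and the analysis of the inverse problem. We then focus on several key classes of image reconstruction algorithms. Finally, we discuss the main challenges currently facing terahertz 3D tomography, such as improving sampling efficiency and spatial resolution, and outline future research directions.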

  

 

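Multimodal Brain Imaging Computation Based on Generative AI

王书强 (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)

Abstract: The rapid development of brain imaging has made it much easier to study brain structure and function noninvasively, and brain networks help us better understand the mechanisms underlying neurological diseases. Brain imaging and brain network computation is an important tool for deciphering the mechanisms of brain cognition and for early warning of brain diseases. Generative AI models such as VAEs, GANs, and diffusion models have broad application prospects in this area. This talk first presents work on cross-modal brain image reconstruction and 3D brain reconstruction based on generative AI algorithms such as GANs and VAEs. It then discusses the main problems and challenges in reconstructing brain networks from brain images and analyzes the strengths and weaknesses of existing brain network construction tools. Finally, to address the heterogeneity encountered when fusing multimodal brain images, it proposes a multimodal brain network computation method based on fused representation learning of brain structure and function and, taking Alzheimer's disease (AD) as an example, analyzes and compares the abnormal changes in brain connectivity at each stage. The experimental results suggest that the models may provide potential biomarkers for early AD prediction.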

 

 

Convergence Analysis of the Nonlinear Kaczmarz Method for Systems of Nonlinear Equations with Component-wise Convex Mappings

陈冲 (Chinese Academy of Sciences)

Abstract: Motivated by a class of nonlinear imaging inverse problems, for instance, multispectral computed tomography (MSCT), we study the convergence theory of the nonlinear Kaczmarz method (NKM) for solving systems of nonlinear equations with component-wise convex mappings, namely, where the function corresponding to each equation is convex. Such nonlinear mappings may not satisfy the commonly used component-wise tangential cone condition (TCC). For this purpose, we propose a novel condition named the relative gradient discrepancy condition (RGDC), and use it to prove the convergence, and even the convergence rate, of the NKM with several general index selection strategies, including the cyclic strategy and the maximum residual strategy. In particular, we investigate the application of the NKM to solving nonlinear systems in MSCT image reconstruction. We prove that the nonlinear mappings in this context fulfill the proposed RGDC rather than the component-wise TCC, and establish global convergence of the NKM based on the previously obtained results. Numerical experiments further illustrate the numerical convergence of the NKM for MSCT image reconstruction.
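For readers unfamiliar with the method named in the abstract above, here is a minimal sketch of the nonlinear Kaczmarz iteration with the cyclic index strategy, x <- x - f_i(x) / ||grad f_i(x)||^2 * grad f_i(x). The two-equation exponential system is a toy example invented for illustration (each component is convex, but it is not the MSCT model), and the code does not check the RGDC discussed in the talk.

```python
import numpy as np

def nonlinear_kaczmarz(fs, grads, x0, sweeps=200):
    """Cyclic nonlinear Kaczmarz: one Newton-like step per equation, in turn."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(sweeps):
        for f, g in zip(fs, grads):  # cyclic index selection
            gi = g(x)
            denom = gi @ gi
            if denom > 0.0:
                x = x - (f(x) / denom) * gi
    return x

# Toy system (assumed for this sketch): f_i(x) = exp(a_i . x) - b_i = 0.
A = np.array([[1.0, 0.5], [0.2, 1.0]])
x_true = np.array([0.5, -0.3])
b = np.exp(A @ x_true)

fs = [lambda x, a=a, bi=bi: np.exp(a @ x) - bi for a, bi in zip(A, b)]
grads = [lambda x, a=a: np.exp(a @ x) * a for a in A]

x_hat = nonlinear_kaczmarz(fs, grads, x0=np.zeros(2))
```

Replacing the inner `for` loop with a choice of the equation having the largest `abs(f(x))` would give a maximum residual strategy, the other selection rule mentioned in the abstract.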

  

Self-supervised Representations for Spectral and Temporal Imaging

王超 (Southern University of Science and Technology)

Abstract: Spectral and temporal imaging problems, such as hyperspectral imaging and dynamic medical image reconstruction, are widely encountered in machine learning and computer vision. These areas often face challenges associated with high dimensionality and limited ground-truth data. In this talk, I will discuss several self-supervised learning strategies that apply to various applications, from remote sensing to computational imaging. The proposed approaches integrate the concept of low-rank matrix factorization, leverage continuity through neural representation, and employ variational techniques from a model-based approach. Extensive experimental results reveal that these self-supervised learning techniques perform competitively, often outperforming traditional supervised learning methods in various real-world imaging scenarios.

  

tnGPS: Discovering Unknown Tensor Network Structure Search Algorithms via Large Language Models (LLMs)

李超 (RIKEN)

Abstract: Tensor networks are efficient for extremely high-dimensional representation, but their model selection, known as tensor network structure search (TN-SS), is a challenging problem. Although several works have targeted TN-SS, most existing algorithms are manually crafted heuristics with poor performance, suffering from the curse of dimensionality and local convergence. In this work, we think outside the box, studying how to harness large language models (LLMs) to automatically discover new TN-SS algorithms, replacing the involvement of human experts. By observing how human experts innovate in research, we model their common workflow and propose an automatic algorithm discovery framework called tnGPS. The proposed framework is an elaborate prompting pipeline that instructs LLMs to generate new TN-SS algorithms through iterative refinement and enhancement. The experimental results demonstrate that the algorithms discovered by tnGPS exhibit superior performance in benchmarks compared to the current state-of-the-art methods. Our code is available at https://github.com/ChaoLiAtRIKEN/tngps.

 

 

Advanced Signal Processing via Hypercomplex Analysis

Kit Ian Kou (University of Macau)

Abstract: This presentation explores how hypercomplex analysis can be applied to high-dimensional signal processing. It makes use of advanced mathematical constructs to address complex challenges associated with data. The study investigates the use of quaternions and other hypercomplex numbers to enhance various signal processing techniques and offers novel solutions for applications such as image recognition, signal recovery, and data compression.

  

Nonconvex Low-Rank Modeling and Algorithms for Snapshot Compressive Sensing

李敏 (Shenzhen University)

Abstract: The reconstruction of snapshot compressive imaging (SCI) presents a significant challenge in signal processing. The primary goal of SCI is to employ a low-dimensional sensor to capture high-dimensional data in a compressed form. As a result, compared to traditional compressive sensing, SCI emphasizes capturing structural information and enhancing the reconstruction quality of high-dimensional videos and hyperspectral images. This paper proposes a novel SCI reconstruction method that integrates a non-convex regularization approximation with rank minimization. Furthermore, we address the characterization of structural information by leveraging nonlocal self-similarity across video frames to improve the reconstruction quality. We also develop an optimization algorithm based on the alternating direction method of multipliers (ADMM) to solve the model and provide a convergence analysis of the algorithm. Extensive experiments demonstrate that the proposed approach can potentially achieve effective SCI reconstruction.

  

Flow-based Generative Models for Medical Image Synthesis

丁乔乔 (Shanghai Jiao Tong University)

Abstract: The synthesis of high-quality medical images is critical for enhancing clinical decision-making, diagnostic accuracy, and treatment planning, as well as for applications such as data augmentation and image quality improvement. Flow-based generative models have demonstrated significant potential in modeling complex data distributions and generating realistic synthetic images. This talk presents two novel approaches that contribute to advancements in flow-based generative modeling for medical image synthesis. The first approach introduces SyMOT-Flow, an invertible transformation model that minimizes the symmetric maximum mean discrepancy between samples from two unknown distributions, incorporating an optimal transport cost as regularization. This ensures short-distance and interpretable mappings, leading to more stable and accurate sample generation. The model is validated through low-dimensional illustrative examples and high-dimensional bi-modality medical image generation tasks. The second approach proposes Bi-DPM (Bi-directional Discrete Process Matching), a novel model for bi-modality image synthesis. Unlike traditional flow-based methods that rely on computationally intensive ordinary differential equation (ODE) solvers, Bi-DPM utilizes forward and backward flows with enhanced consistency over discrete time steps. This results in efficient and high-quality image synthesis guided by paired data. Experimental results on MRI T1/T2 and CT/MRI datasets show that Bi-DPM achieves superior image quality and synthesizes anatomical regions more accurately than existing methods. These contributions offer practical advancements in flow-based medical image synthesis, addressing computational efficiency and image fidelity while providing tools that can support improved clinical workflows and outcomes.

  

 

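Unsupervised Hyperspectral Pansharpening Based on Low-Rank Decomposition and Diffusion Models

芮翔宇 (Xi'an Jiaotong University)

Abstract: Hyperspectral pansharpening is the task of fusing a high-resolution panchromatic (PAN) image and a low-resolution hyperspectral (LRHS) image into a single high-resolution hyperspectral (HRHS) image. A major problem in this task is how to characterize the information in high-dimensional data. To this end, we propose an unsupervised method that combines low-rank decomposition with an unsupervised diffusion model, simultaneously exploiting the rich image information in diffusion models pretrained on RGB images and a simple yet effective low-rank decomposition. Specifically, the HRHS image can be decomposed into the product of two low-rank tensors (namely, a base tensor and a coefficient matrix). We design a concise low-rank decomposition such that the coefficient matrix can be estimated directly from the observed images, while the spatial features of the base tensor resemble those of RGB images and can therefore be generated by the pretrained diffusion model. The proposed method offers good interpretability of high-dimensional data information and shows excellent generalization across different datasets.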

 

 

Spherical Framelets from Spherical Designs

庄晓生 (City University of Hong Kong)

Abstract: In this talk, we discuss the structures of the variational characterization of the spherical t-design, its gradient, and its Hessian in terms of fast spherical harmonic transforms. Moreover, we propose solving the minimization problem of the spherical t-design using the trust-region method to provide spherical t-designs with large values of t. Based on the obtained spherical t-designs, we develop (semi-discrete) spherical tight framelets as well as their truncated systems and their fast spherical framelet transforms for practical spherical signal/image processing. Thanks to the large spherical t-designs and the localization property of our spherical framelets, we are able to provide signal/image denoising using local thresholding techniques based on a fine-tuned spherical cap restriction. Many numerical experiments are conducted to demonstrate the efficiency and effectiveness of our spherical framelets and spherical designs, including Wendland function approximation, ETOPO data processing, and spherical image denoising. Research was supported in part by the Research Grants Council of Hong Kong (Project nos.: CityU 11309122, CityU 11302023, CityU 11301224) and a grant from the Innovation and Technology Commission of Hong Kong (Project no. MHP/054/22).

  

Generative PCA and Fast Diffusion Sampling

刘昭强 (University of Electronic Science and Technology of China)

Abstract: In this talk, I will first briefly discuss the problem of principal component analysis with generative modeling assumptions. The key assumption is that the underlying signal lies near the range of a Lipschitz continuous generative model with bounded low-dimensional inputs. We propose a quadratic estimator and show that it enjoys the optimal statistical rate. Moreover, we provide a variant of the classic power method, which projects the calculated data onto the range of the generative model during each iteration. We show that under suitable conditions, this method converges exponentially fast to a point achieving the above-mentioned statistical rate. We perform experiments on various image datasets to illustrate the performance gains of our method over the classic power method and the truncated power method devised for sparse principal component analysis. Next, I will briefly discuss our recent work on accelerating diffusion sampling with optimized time steps. Diffusion (generative) models (DMs) have shown remarkable performance in high-resolution image synthesis, but their sampling efficiency still leaves much to be desired due to the typically large number of sampling steps. Recent advancements in high-order numerical ODE solvers for DMs have enabled the generation of high-quality images with far fewer sampling steps, but most sampling methods simply employ uniform time steps. To address this issue, we propose a general framework for designing an optimization problem that seeks more appropriate time steps for a specific numerical ODE solver for DMs. This optimization problem aims to minimize the distance between the ground-truth solution to the ODE and an approximate solution corresponding to the numerical solver. It can be efficiently solved using the constrained trust region method, taking less than 15 seconds. Our extensive experiments on both unconditional and conditional sampling using pixel- and latent-space DMs validate the effectiveness of the optimized time steps.

 

First-Order Algorithms for Single-Ratio Minimization Problems

李洽 (Sun Yat-sen University)

Abstract: In this talk, we focus on a class of single-ratio minimization problems, where the numerator of the objective is the sum of a nonsmooth nonconvex function f and a smooth nonconvex function h, while the denominator g is a nonsmooth convex function. We analyze the first-order necessary optimality condition and propose the Proximity-Gradient-Subgradient Algorithm (PGSA), along with its modified version PGSA_L, which incorporates a line-search scheme. Specifically, when f is convex, we implement a backtracked extrapolation technique to enhance PGSA. In addition, for cases where f is block-separable, we present a novel reformulation of the problem and propose a solution framework called the multi-proximity gradient algorithm (MPGA). We establish subsequential convergence analysis for all the proposed algorithms. Moreover, global convergence is established using the Kurdyka-Łojasiewicz (KL) property of auxiliary functions and some mild additional assumptions. Furthermore, we prove that in several special cases of fractional programs, the corresponding KL exponents are 1/2, which leads to linear convergence rates for the proposed algorithms. Finally, some preliminary numerical results demonstrate the efficiency of our algorithms.

  

Operator Learning and Neural Scaling Laws

刘皓 (Hong Kong Baptist University)

Abstract: Deep neural networks have demonstrated great success in many applications. For operator learning and large language models, neural scaling laws are observed in many works. Most of the observed laws are power laws, i.e., the testing error can be written as a power of the number of parameters or the number of training samples. However, theoretical explanations of the scaling laws are largely missing. In this presentation, we focus on operator learning and analyze the approximation and generalization errors of some popular network architectures. We provide a theoretical explanation of neural scaling laws and show that if the data has low-dimensional structures, one can achieve power laws.
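To make the phrase "power law" in the abstract above concrete: if the test error behaves like C * n^(-alpha) in the number of samples n, then log(error) is linear in log(n) with slope -alpha. The sketch below estimates alpha by a least-squares line fit in log-log coordinates. The data are synthetic and the function name `fit_power_law` is an assumption of this example; it is not taken from the talk.

```python
import numpy as np

def fit_power_law(n, err):
    """Fit err ≈ C * n**(-alpha) by least squares in log-log coordinates."""
    slope, intercept = np.polyfit(np.log(n), np.log(err), deg=1)
    return -slope, np.exp(intercept)

# Synthetic, noise-free measurements that follow err = 3 * n**(-0.5) exactly.
n = np.array([1e2, 1e3, 1e4, 1e5])
err = 3.0 * n ** (-0.5)
alpha, C = fit_power_law(n, err)
```

With measured errors the fit is only approximate, and a visibly curved log-log plot is a sign that a single power law does not describe the data.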

 

 

Quaternion Higher-Order Singular Value Decomposition and Its Applications in Color Image Processing

缪吉飞 (Yunnan University)

Abstract: Higher-order singular value decomposition (HOSVD) is one of the most efficient tensor decomposition techniques. It has the salient ability to represent high-dimensional data and extract features. On the other hand, in recent years, quaternions have proven to be a very suitable tool for color pixel representation, as they can effectively preserve the cross-channel correlation of color channels. Motivated by the advantages of HOSVD and quaternion algebra, we generalize HOSVD to the quaternion domain and define quaternion-based HOSVD (QHOSVD). Theoretically, QHOSVD is a proper tensor generalization of quaternion singular value decomposition (QSVD) and a proper quaternion generalization of the standard HOSVD. From an application perspective, the proposed QHOSVD can be widely used in various visual data processing tasks involving color pixels. As examples, we present two applications of QHOSVD in color image processing: color image denoising and multi-focus color image fusion.
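The standard real-valued HOSVD that the abstract above takes as its starting point can be sketched in a few lines of NumPy. This is the classical real version only, not the quaternion QHOSVD of the talk; the helper names `unfold`, `mode_product`, `hosvd`, and `reconstruct` are assumptions of this example.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: the chosen mode indexes rows, all other modes are flattened."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    out = np.tensordot(M, T, axes=(1, mode))  # the new axis comes first
    return np.moveaxis(out, 0, mode)

def hosvd(T):
    """Return the core tensor and one orthonormal factor matrix per mode."""
    factors = []
    for mode in range(T.ndim):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U)
    core = T
    for mode, U in enumerate(factors):
        core = mode_product(core, U.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Multiply the core by every factor matrix to recover the tensor."""
    T = core
    for mode, U in enumerate(factors):
        T = mode_product(T, U, mode)
    return T

rng = np.random.default_rng(0)
T = rng.standard_normal((4, 5, 3))
core, factors = hosvd(T)
T_rec = reconstruct(core, factors)
```

Keeping only the leading columns of each factor matrix gives the truncated HOSVD commonly used for compression and denoising; the quaternion version in the talk replaces the real SVD of each unfolding with its quaternion counterpart.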

  

Coseparable Nonnegative Tensor Factorization With T-CUR Decomposition

魏益民 (Fudan University)

Abstract: Nonnegative matrix factorization (NMF) is an important unsupervised learning method for extracting meaningful features from data. To address the NMF problem within a polynomial-time framework, researchers have introduced a separability assumption, which has recently evolved into the concept of coseparability. This advancement offers a more efficient core representation for the original data. However, real-world data, such as images or videos, are more naturally represented as multidimensional arrays. Applying NMF to high-dimensional data involves vectorization, which risks losing essential multidimensional correlations. To retain these inherent correlations in the data, we turn to tensors (multidimensional arrays) and leverage the tensor t-product. This approach extends coseparable NMF to the tensor setting, creating what we term coseparable nonnegative tensor factorization (NTF). In this work, we provide an alternating index selection method to select the coseparable core. Furthermore, we validate the t-CUR sampling theory and integrate it with the tensor discrete empirical interpolation method to introduce an alternative, randomized index selection process. These methods have been tested on both synthetic and facial analysis datasets. The results demonstrate the efficiency of coseparable NTF when compared to coseparable NMF.

 

 

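Low-Light Image Enhancement Based on Deep Learning

王卫卫 (Xidian University)

Abstract: Images captured at night, in extremely weak light, or in adverse weather often suffer from insufficient brightness, low contrast, and loss of detail, resulting in poor image quality that fails to meet application needs. Low-light image enhancement techniques are therefore needed to process such images and improve their brightness, contrast, detail sharpness, and color accuracy, thereby improving visual quality and usability. After a brief review of existing methods, this talk discusses two neural-network-based low-light image enhancement models: a regularization model based on Retinex decomposition, and a model that estimates the transfer function from the cartoon component of the image.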

 

Sketching Methods for Large-Scale Tensor Decomposition with Applications

喻高航 (Hangzhou Dianzi University)

Abstract: Large tensors are frequently encountered in various fields such as computer vision, scientific simulations, sensor networks, and data mining. However, these tensors are often too large for convenient processing, transfer, or storage. Fortunately, they typically exhibit a low-rank structure that can be leveraged through tensor decomposition. Performing large-scale tensor decomposition can nevertheless be time-consuming, and sketching is a useful technique for reducing the dimensionality of the data. In this talk, we present some efficient two-sided sketching methods for large-scale tensor low-rank decomposition. A rigorous theoretical analysis is also conducted to assess the approximation error of the proposed method. Specifically, we improve the proposed method with power iteration to achieve more precise approximate solutions. Extensive numerical experiments and comparisons on low-rank approximation of synthetic large tensors and real-world data such as color images and grayscale videos illustrate the efficiency of the proposed approach in terms of both CPU time and approximation accuracy.
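The idea of sketching with power iteration mentioned in the abstract above can be illustrated on a matrix, which is a simpler setting than the two-sided tensor sketches of the talk. The sketch below is the well-known randomized range finder followed by a small SVD; the function name `sketched_low_rank` and its parameters are assumptions of this example, not the speaker's method.

```python
import numpy as np

def sketched_low_rank(A, rank, oversample=5, power_iters=1, seed=0):
    """Approximate truncated SVD of A from a Gaussian sketch of its range."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + oversample, m, n)
    Omega = rng.standard_normal((n, k))   # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)        # orthonormal basis for the sketch
    for _ in range(power_iters):          # power iteration sharpens the basis
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    B = Q.T @ A                           # small k x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :rank], s[:rank], Vt[:rank]

# Synthetic test matrix of exact rank 4 (sizes chosen arbitrarily).
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 4)) @ rng.standard_normal((4, 40))
U, s, Vt = sketched_low_rank(A, rank=4)
A_approx = (U * s) @ Vt
```

Only products with `A` and `A.T` touch the large matrix, and the SVD is computed on the small matrix `B`, which is where the time saving comes from. Applying the same idea to tensor unfoldings is one common route to sketched tensor decompositions.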

 

Region-based Image Vectorization

何雨晨 (City University of Hong Kong)

Abstract: Image vectorization is an important technique that enables a resolution-free, compact, and scale-invariant representation of raster images. Through this approach, various important geometric measures can be effectively and robustly captured. In this talk, we shall discuss our recent work on region-based image vectorization integrating affine shortening flow.

 

Fast and Reliable Score-Based Generative Model for Parallel MRI

曾铁勇 (The Chinese University of Hong Kong)

Abstract: The score-based generative model (SGM) can generate high-quality samples and has been successfully adopted for magnetic resonance imaging (MRI) reconstruction. However, recent SGMs may take thousands of steps to generate a high-quality image. In addition, SGMs neglect to exploit the redundancy in k-space. To overcome these two drawbacks, in this talk, we propose a fast and reliable SGM (FRSGM). First, we propose deep ensemble denoisers (DEDs) consisting of an SGM and a deep denoiser, which are used to solve the proximal problem of the implicit regularization term. Second, we propose a spatially adaptive self-consistency (SASC) term as the regularization term of the k-space data. We use the alternating direction method of multipliers (ADMM) to solve the minimization model of compressed sensing (CS)-MRI incorporating the image prior term and the SASC term, which is significantly faster than related works based on SGMs. Meanwhile, we can prove that the iterating sequence of the proposed algorithm has a unique fixed point. In addition, the DED and the SASC term can significantly improve the generalization ability of the algorithm. The features mentioned above make our algorithm reliable, including the fixed-point convergence guarantee, the exploitation of k-space, and the powerful generalization ability.

  

 

Fabric image recolorization by fuzzy pretrained neural network

韩雨 (Shenzhen University)

Abstract: In fabric design, image recolorization is commonly used to generate synthetic fabric images that can serve as new design proposals. Classical non-learning-based image recolorization methods for fabric color design are formulated as variational models that integrate image decomposition and image segmentation models. Although these classical methods can generate relatively good color design proposals, they carry a much higher computational burden, since classical variational models typically rely on inefficient iterative algorithms to solve the associated minimization problems. This work addresses the inefficiency by introducing a novel image recolorization architecture based on two neural networks: (1) a pretrained neural network for intrinsic image decomposition, and (2) a pretrained plug-and-play denoiser for image segmentation. Numerical results demonstrate that, compared with classical variational image recolorization methods, the proposed method runs more efficiently while better preserving texture details in the fabric images being recolored.

 

 

Generative Large Models for Remote Sensing and Their Applications

曹相湧 (Xi'an Jiaotong University)

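Abstract: Generative models are a popular research area in the current era of large models, and diffusion models are a representative technique. This talk mainly presents our team's preliminary explorations over the past two years in applying generative diffusion models to remote sensing image processing. Specifically, it covers: 1. applications of diffusion models to low-level remote sensing image processing tasks (such as denoising, super-resolution, inpainting, and fusion); 2. generative large models based on diffusion models (including CRS-Diff, a multi-condition controllable generative large model for remote sensing, and HSIGene, a generative large model for hyperspectral images); 3. generative models for high-level vision tasks in remote sensing.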

  

Spherical Image Processing via Spherical Haar Framelets and CNNs

冯寒 (City University of Hong Kong)

Abstract: In the realm of high-dimensional image processing, spherical images pose unique challenges due to their non-Euclidean nature. This talk focuses on spherical image processing using framelet-based techniques. We first introduce a general framework for constructing Haar-type tight framelets on any compact set with a hierarchical partition. Specifically, an area-regular hierarchical partition on the 2-sphere is developed, leading to spherical Haar tight framelets with directionality. These framelets are then used in denoising experiments. Additionally, a convolutional neural network (CNN) model integrating fast framelet decomposition and reconstruction algorithms is proposed. Subsequently, we address spherical image inpainting by employing a fast directional spherical Haar framelet transform and devising an optimization framework based on the sparsity of framelet coefficients. A deep CNN denoiser, Double-S2HaarNet, is introduced, which is based on a progressive encoder-decoder architecture. By using a plug-and-play method to handle the optimization model, the proposed approach is applied to recover damaged spherical images.

 

 

Applications of Diffusion Generative Models in Computing Image Inverse Problems

李季 (Capital Normal University)

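Abstract: The key to computing image inverse problems is the design of the image prior representation. With the development of deep learning in recent years, image priors have gradually evolved from traditional representations, such as expert-designed and data-driven ones, to representations learned by generative models. A more accurate image prior undoubtedly improves performance on image inverse problems. This talk first briefly introduces the paradigm by which diffusion generative models learn data distributions, then presents the mathematical modeling of diffusion generative models and how to develop inverse-problem algorithms based on them. The iterative generation process of diffusion generative models blends well with iterative algorithms for inverse problems; the key question is how to derive diffusion-based generative algorithms for inverse problems mathematically. This talk presents the main ideas and computational framework of generative algorithms for image inverse problems.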

 

 

Unsupervised Network Methods Driven by Low-Rankness and Smoothness for Remote Sensing Images

冀腾宇 (Northwestern Polytechnical University)

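Abstract: Anomaly detection and cloud removal are two key challenges in remote sensing image processing, but existing methods are often limited in performance because they ignore complex background characteristics or cross-domain spatiotemporal correlations. To address these limitations, we use continuous low-rank representation and multi-domain spatiotemporal smoothness to transform remote sensing images into a feature domain for processing. Specifically, for the complex structure of remote sensing images, we design a continuous low-rank tensor function based on an unsupervised network to model the low-rankness and smoothness of the background, and we balance the complex characteristics of the background and anomalies through an overfitting masking mechanism. For the spatiotemporal smoothness of remote sensing images, we introduce dual temporal smoothness constraints in the image domain and the gradient domain, and reconstruct the global remote sensing image using low-rank tensor regularization. Experiments show that, in anomaly detection, the proposed method significantly outperforms mainstream algorithms in complex scenes such as airports and coasts; in cloud removal, the PSNR/SSIM of the cloud-removed results exceed those of existing methods, with particularly strong preservation of details such as farmland textures and building edges.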

  

 

Learning Input Encodings for Kernel-Optimal Implicit Neural Representations

王红霞 (National University of Defense Technology)

Abstract: Implicit Neural Representations (INRs) rely heavily on architectural choices for good generalization, and developing theoretically grounded approaches to architecture design remains an active area of research. Through theoretical analysis of the infinite-width limit, we establish a methodology that characterizes INR generalization by means of kernel alignment. We first formulate the optimal kernel that minimizes the pointwise expected squared error, then demonstrate that the Neural Tangent Kernel of the composed function (an INR with input encoding) can approximate any positive semidefinite dot-product kernel through adjustments of the input feature mapping. Building on these insights, we propose a Kernel Alignment Regularizer (KAR) that integrates naturally with existing INR systems to enhance kernel alignment. We further develop Plug-in Encoding for Aligned Kernels (PEAK) to refine INR models with KAR using learnable input encoding. Numerical experiments on image inpainting and Fourier phase retrieval illustrate the efficiency of PEAK compared with strong baseline methods, including a vanilla MLP (ReLU activation), Fourier feature networks, and DINER (hash-mapped coordinates).
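For readers unfamiliar with kernel alignment, the short sketch below computes the classical normalized alignment score between two Gram matrices and evaluates it for two fixed input encodings. This is only an assumed, simplified stand-in: the abstract does not give the exact form of KAR, the kernel here is the plain dot-product kernel of the encoded inputs rather than a Neural Tangent Kernel, and `kernel_alignment` and `fourier_encoding` are hypothetical names.

```python
import numpy as np

def kernel_alignment(K1, K2):
    """Normalized Frobenius inner product between two Gram matrices."""
    return np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2))

def fourier_encoding(x, freqs):
    """Hypothetical input encoding: [cos(x*f), sin(x*f)] for 1-D coordinates."""
    proj = np.outer(x, freqs)
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=1)

# Target signal sampled on a 1-D grid, and its rank-one target kernel.
x = np.linspace(0.0, 1.0, 64)
target = np.sin(2 * np.pi * 3 * x)
K_target = np.outer(target, target)

# Compare how well the kernels induced by two encodings align with the target.
for freqs in (np.array([1.0]), 2 * np.pi * np.array([1.0, 3.0, 5.0])):
    Phi = fourier_encoding(x, freqs)
    K_enc = Phi @ Phi.T              # dot-product kernel of the encoded inputs
    print(freqs, kernel_alignment(K_enc, K_target))
```

A learnable encoding would treat `freqs` (or a more general feature map) as trainable parameters and add a term rewarding higher alignment to the training loss; that step is not shown here.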

  

 

White-Box Transformers for Accelerated MRI

崔卓须 (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)

Abstract: Over the past decade, convolutional neural networks (CNNs) have significantly advanced accelerated MRI. However, their inherently limited receptive field constrains their ability to capture long-range dependencies, restricting feature extraction to local regions. In contrast, Transformers have demonstrated exceptional performance in natural language processing and computer vision by effectively modeling global relationships. This capability makes them a promising alternative for accelerated MRI, where leveraging long-range dependencies could enhance reconstruction quality. Despite their potential, existing Transformer-based MRI reconstruction models often lack interpretability, raising concerns about their reliability in clinical applications. To address this challenge, this talk will explore how MRI-specific physical priors can be incorporated to guide the development of interpretable attention mechanisms, enabling the design of white-box Transformers tailored for fast MRI reconstruction. Additionally, we will discuss strategies for constructing structured, interpretable prompts and optimizing prompt-tuning approaches to improve the adaptability of Transformer-based models across diverse imaging scenarios, including variations in undersampling patterns, anatomical regions, and data acquisition settings.