
[Paper Reading] [Close Reading] 3D Gaussian Splatting for Real-Time Radiance Field Rendering

news 2025/6/20 13:43:54

Table of contents

- 1. What:
- 2. Why:
- 3. How:
  - 3.1 Real-time rendering
  - 3.2 Adaptive Control of Gaussians
  - 3.3 Differentiable 3D Gaussian splatting
- 4. Self-thoughts

1. What:

What does this paper set out to do? (Summarize it in one sentence, based on the abstract and conclusion.)

To achieve both efficiency and quality, the paper starts from the sparse points produced during camera calibration and represents the scene with 3D Gaussians, which keep the desirable properties of continuous volumetric radiance fields while avoiding unnecessary computation in empty space. It then interleaves optimization and density control of the Gaussians, notably optimizing anisotropic covariance, to represent the scene accurately. Lastly, it introduces a fast, visibility-aware rendering algorithm that accelerates training and enables real-time rendering, achieving state-of-the-art results in the field.

2. Why:

Under what conditions or needs was this research proposed (Introduction)? What core problems or deficiencies does it address, what have others done, and what are the innovations? (From the Introduction and Related Work.)

This may cover Background, Question, Others, and Innovation:

Three strands of related work answer these questions.

Traditional Scene Reconstruction and Rendering

Traditional reconstruction pipelines such as SfM and MVS re-project and blend the input images into the novel-view camera, using the reconstructed geometry to guide this re-projection (from 2D to 3D).

Drawback: they cannot completely recover from unreconstructed regions, or from “over-reconstruction”, when MVS generates non-existent geometry.

Neural Rendering and Radiance Fields

Neural rendering is a broader category of techniques that leverage deep learning for image synthesis, while a radiance field is a specific technique within neural rendering that represents how light and color are distributed in 3D space.

Earlier deep-learning methods for novel-view synthesis mostly relied on MVS-based geometry, which is also their major drawback.
NeRF follows the volumetric-representation line of work and introduced positional encoding and importance sampling.
Faster training methods focus on spatial data structures that store (neural) features, which are subsequently interpolated during volumetric ray-marching, as well as on different encodings and MLP capacity.
Notable recent works include InstantNGP and Plenoxels, both of which rely on Spherical Harmonics.

Spherical Harmonics can be understood as a set of basis functions defined on the sphere; a weighted sum of them fits a function of direction, such as view-dependent color.

Introduction to Spherical Harmonics - Zhihu (zhihu.com)

Point-Based Rendering and Radiance Fields

Methods from human performance capture inspired the choice of 3D Gaussians as the scene representation.
Point-based rendering and rendering with spheres had already been achieved in earlier work.

3. How:


Following the gradient flow in the paper’s pipeline, this section connects Sections 4, 5, and 6 of the paper.

First, start from the loss function, which combines an ${\mathcal L}_{1}$ loss with a D-SSIM term, as shown below:

${\mathcal L}=(1-\lambda){\mathcal L}_{1}+\lambda{\mathcal L}_{\mathrm{D-SSIM}}.\tag{1}$
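To make equation (1) concrete, here is a minimal NumPy sketch of the loss. It rests on several assumptions that are not in the notes above: the SSIM is computed over a single window covering the whole image (real implementations use a sliding Gaussian window), the D-SSIM term is taken to be `1 - SSIM`, and the default `lam=0.2` is the value the paper reports using. The function names are hypothetical.

```python
import numpy as np

def l1_loss(rendered, target):
    """Mean absolute difference between two images."""
    return float(np.mean(np.abs(rendered - target)))

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM over one window spanning the whole image (a simplification)."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)

def total_loss(rendered, target, lam=0.2):
    """Equation (1): (1 - lambda) * L1 + lambda * D-SSIM."""
    d_ssim = 1.0 - global_ssim(rendered, target)  # assumed definition of D-SSIM
    return (1.0 - lam) * l1_loss(rendered, target) + lam * d_ssim

rng = np.random.default_rng(0)
image = rng.random((8, 8))
noisy = np.clip(image + 0.1 * rng.standard_normal((8, 8)), 0.0, 1.0)
print(total_loss(image, image), total_loss(noisy, image))
```

Identical images give a loss of (numerically) zero, and any difference between the two images raises both terms.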

This loss relates the ground-truth image to the rendered image, so to carry out the optimization we need to dive into the rendering process. From the related-work chapter, we know that point-based $\alpha$-blending and NeRF-style volumetric rendering share essentially the same image formation model. That is

$C=\sum_{i=1}^{N}T_{i}(1-\exp(-\sigma_{i}\delta_{i}))c_{i}\quad\mathrm{with}\quad T_{i}=\exp\left(-\sum_{j=1}^{i-1}\sigma_{j}\delta_{j}\right).\tag{2}$

This paper uses a typical neural point-based approach that has the same form as (2), which can be written as:

$C=\sum_{i\in N}c_{i}\alpha_{i}\prod_{j=1}^{i-1}(1-\alpha_{j}) \tag{3}$

From this formulation, we can see that the scene representation must carry a color $c$ and an opacity $\alpha$ for each element; substituting $\alpha_{i}=1-\exp(-\sigma_{i}\delta_{i})$ into (2) gives exactly (3). These are attached to each Gaussian, with Spherical Harmonics used to represent color, just as in Plenoxels. The other attributes are the position and the covariance matrix. So we now have the four attributes that represent the scene: for each Gaussian, a position 𝑝, an opacity 𝛼, a covariance Σ, and SH coefficients representing the color 𝑐.
Knowing the basic elements, let’s now work backward, starting with rendering, which the authors had already addressed in an earlier paper.
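The claim that (2) and (3) describe the same image formation model can be checked numerically. The sketch below assumes $\alpha_{i}=1-\exp(-\sigma_{i}\delta_{i})$, so that $T_{i}=\prod_{j<i}(1-\alpha_{j})$; the function names and the sample values are made up for illustration.

```python
import numpy as np

def composite_volumetric(sigma, delta, color):
    """Equation (2): NeRF-style volumetric rendering along one ray."""
    tau = sigma * delta
    # T_i = exp(-sum_{j<i} sigma_j * delta_j); the first sample sees T = 1.
    transmittance = np.exp(-np.concatenate(([0.0], np.cumsum(tau)[:-1])))
    weights = transmittance * (1.0 - np.exp(-tau))
    return weights @ color

def composite_alpha(alpha, color):
    """Equation (3): front-to-back point-based alpha blending."""
    # prod_{j<i} (1 - alpha_j); the first point sees a factor of 1.
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alpha)[:-1]))
    return (transmittance * alpha) @ color

sigma = np.array([0.5, 2.0, 1.0])   # densities along the ray
delta = np.array([0.1, 0.2, 0.3])   # sample spacings
color = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])

alpha = 1.0 - np.exp(-sigma * delta)
print(composite_volumetric(sigma, delta, color))
print(composite_alpha(alpha, color))
```

Both calls print the same RGB value, which is why the representation only needs a color and an opacity per element.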

3.1 Real-time rendering

This method is independent of gradient propagation but critical for real-time performance; it was published in an earlier paper by the authors.

Some earlier games had tried to model the world with ellipsoids and render them, which is the same idea as the rendering process of Gaussian splatting. The latter, however, relies on many techniques to make good use of threads and the GPU.

First, it splits the screen into 16×16-pixel tiles and culls the 3D Gaussians against the view frustum and each tile, keeping only Gaussians whose 99% confidence interval intersects the view frustum.
Then it instantiates each Gaussian once per tile it overlaps and assigns each instance a key that combines view-space depth and tile ID.
Then it sorts the Gaussian instances by these keys using a single fast GPU radix sort.
Finally, it launches one thread block per tile; for a given pixel, it accumulates color and $\alpha$ values by traversing the tile's list front-to-back, until the accumulated $\alpha$ approaches one.
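The four steps above can be sketched on the CPU in plain Python. This is only an illustration under stated assumptions, not the authors' CUDA rasterizer: each splat is reduced to a 2D center, a screen-space radius (assumed to already cover its confidence interval), and a positive depth; Python's `sort` stands in for the GPU radix sort; the 64-bit key layout (tile ID in the high 32 bits, depth bits in the low 32 bits) and the `min_transmittance` stopping threshold are assumptions; all function names are hypothetical.

```python
import struct

TILE = 16  # tile edge length in pixels, as in the text

def depth_bits(depth):
    """Reinterpret a positive float32 depth as an unsigned int (order-preserving)."""
    return struct.unpack("<I", struct.pack("<f", depth))[0]

def build_sorted_instances(centers, radii, depths, width, height):
    """Return [(key, gaussian_id)] sorted by tile ID first, then by depth."""
    tiles_x = (width + TILE - 1) // TILE
    tiles_y = (height + TILE - 1) // TILE
    instances = []
    for gid, ((cx, cy), r, d) in enumerate(zip(centers, radii, depths)):
        # Tiles touched by the splat's bounding square, clamped to the screen.
        # A splat that lies fully off-screen yields an empty range (culled).
        tx0, tx1 = max(int((cx - r) // TILE), 0), min(int((cx + r) // TILE), tiles_x - 1)
        ty0, ty1 = max(int((cy - r) // TILE), 0), min(int((cy + r) // TILE), tiles_y - 1)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                tile_id = ty * tiles_x + tx
                instances.append(((tile_id << 32) | depth_bits(d), gid))
    instances.sort()  # stand-in for the single GPU radix sort
    return instances

def blend_pixel(alphas, colors, min_transmittance=1e-4):
    """Front-to-back blending of depth-sorted splats for a single pixel."""
    transmittance = 1.0
    out = [0.0, 0.0, 0.0]
    for a, c in zip(alphas, colors):
        weight = transmittance * a
        out = [o + weight * ch for o, ch in zip(out, c)]
        transmittance *= 1.0 - a
        if transmittance < min_transmittance:  # pixel is saturated, stop early
            break
    return out

instances = build_sorted_instances(
    centers=[(8, 8), (20, 8), (8, 8)], radii=[4, 6, 4],
    depths=[5.0, 2.0, 1.0], width=32, height=16)
print([(key >> 32, gid) for key, gid in instances])
print(blend_pixel([0.5, 0.5], [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]))
```

Because the tile ID occupies the high bits, one sort groups instances by tile and orders them by depth inside each tile at the same time, so each tile's list is a contiguous range of the sorted array.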

3.2 Adaptive Control of Gaussians

While fitting Gaussians to the scene, we adapt the number of Gaussians and their density per unit volume to strengthen the representation of the scene. This involves two operations, clone and split, as shown below.

[Figure: the clone and split operations]

Both are triggered by view-space positional gradients: under-reconstructed and over-reconstructed regions alike have large view-space positional gradients. A small Gaussian in an under-reconstructed region is cloned, while a large Gaussian in an over-reconstructed region is split into two smaller ones.
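The clone-or-split decision can be sketched as follows. The gradient threshold `0.0002` and the scale division factor `1.6` are the values reported in the paper; `size_threshold`, the function names, and the choice to sample the two new positions from the original Gaussian are assumptions made for this illustration.

```python
import numpy as np

def densify_decision(grad_norm, scale_max, grad_threshold=0.0002, size_threshold=0.01):
    """Return boolean masks (clone, split) over all Gaussians.

    grad_norm: average magnitude of each Gaussian's view-space positional gradient.
    scale_max: largest scaling component of each Gaussian.
    size_threshold: hypothetical cutoff between "small" and "large" Gaussians.
    """
    needs_densify = grad_norm > grad_threshold
    clone = needs_densify & (scale_max <= size_threshold)  # under-reconstruction
    split = needs_densify & (scale_max > size_threshold)   # over-reconstruction
    return clone, split

def split_gaussian(mean, cov, scale, rng, factor=1.6):
    """Replace one large Gaussian by two smaller ones.

    The new centers are drawn using the original Gaussian as a sampling
    distribution, and the scale is divided by `factor`.
    """
    new_means = rng.multivariate_normal(mean, cov, size=2)
    new_scale = scale / factor
    return new_means, new_scale

grad_norm = np.array([0.001, 0.001, 0.00001])
scale_max = np.array([0.005, 0.5, 0.5])
clone, split = densify_decision(grad_norm, scale_max)
print(clone, split)

rng = np.random.default_rng(0)
means, scale = split_gaussian(np.zeros(3), np.diag([0.25, 0.25, 0.25]),
                              np.array([0.5, 0.5, 0.5]), rng)
print(means.shape, scale)
```

In this toy input the first Gaussian is cloned, the second is split, and the third is left alone because its gradient is below the threshold.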

3.3 Differentiable 3D Gaussian splatting

We now know how rendering and the control of Gaussians work. Finally, we discuss how the gradients are propagated back to the parameters we can optimize, which mainly concerns how the Gaussian function is handled.

The basic simplified form of a 3D Gaussian is:

$G(x)=e^{-\frac{1}{2}(x)^{T}\Sigma^{-1}(x)}.\tag{4}$

We use $\alpha$-blending to combine the Gaussians into the rendered image, so that we can compute the loss function and carry out the optimization. So now we need to know how to optimize the Gaussians and compute their gradients.

When rasterizing, the 3D scene must be projected to 2D. The authors want a 3D Gaussian to remain a Gaussian under this projection (otherwise the rasterized result would no longer be a Gaussian and all the effort would be in vain). So we need a way to transform the covariance matrix into camera coordinates that preserves the affine relationship. That is

$\Sigma'=JW\Sigma W^{T}J^{T},\tag{5}$

where $W$ is the viewing transformation and $J$ is the Jacobian of the affine approximation of the projective transformation.

Another problem is that a covariance matrix must be positive semi-definite, which gradient descent on its entries cannot guarantee. So we build it from a scaling matrix 𝑆 and a rotation matrix 𝑅 to ensure this. That is

$\Sigma=RSS^{T}R^{T}\tag{6}$

We can then store a 3D vector 𝑠 for scaling and a quaternion 𝑞 for rotation, and the gradients are propagated back to them. This completes the optimization process.
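Equations (5) and (6) can be tried out with a few lines of NumPy. This is a sketch under assumptions that go beyond the notes: the quaternion is ordered `(w, x, y, z)` and normalized before use, $W$ is taken to be the 3×3 rotation part of the world-to-camera transform, and $J$ is written as the 2×3 Jacobian of a pinhole projection with hypothetical focal lengths `fx`, `fy`, so that the result is directly the 2×2 screen-space covariance. All function names are made up.

```python
import numpy as np

def quat_to_rotmat(q):
    """Rotation matrix from a quaternion (w, x, y, z), normalized first."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])

def covariance_3d(q, s):
    """Equation (6): Sigma = R S S^T R^T, positive semi-definite by construction."""
    R = quat_to_rotmat(q)
    S = np.diag(s)
    return R @ S @ S.T @ R.T

def perspective_jacobian(t, fx, fy):
    """2x3 Jacobian of (fx * x / z, fy * y / z) at the camera-space point t."""
    tx, ty, tz = t
    return np.array([
        [fx / tz, 0.0, -fx * tx / tz ** 2],
        [0.0, fy / tz, -fy * ty / tz ** 2],
    ])

def project_covariance(cov3d, W, J):
    """Equation (5): Sigma' = J W Sigma W^T J^T."""
    return J @ W @ cov3d @ W.T @ J.T

q = np.array([0.3, -0.5, 0.7, 0.1])  # unnormalized quaternion, as an optimizer might hold it
s = np.array([0.2, 1.0, 0.05])       # per-axis scaling
Sigma = covariance_3d(q, s)
J = perspective_jacobian(np.array([0.1, -0.2, 2.0]), fx=500.0, fy=500.0)
Sigma_2d = project_covariance(Sigma, np.eye(3), J)
print(np.linalg.eigvalsh(Sigma))
print(Sigma_2d)
```

Whatever values the optimizer assigns to 𝑞 and 𝑠, the eigenvalues of Σ are the squared scales and therefore never negative, which is the point of the factorization.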

4. Self-thoughts

Summary of different representations

Explicit representation: Mesh, Point Cloud
Implicit representation
- Volumetric representation: NeRF
  
  The density value returned at a sample point reflects whether geometry occupies that location.
- Surface representation: SDF (Signed Distance Function)
  
  Outputs the distance from a point to the nearest surface, where a positive value means the point is outside the surface and a negative value means it is inside.

