First Advisor

Feng Liu

Term of Graduation

Winter 2023

Date of Publication


Document Type


Degree Name

Doctor of Philosophy (Ph.D.) in Computer Science


Computer Science




High resolution imaging, Monte Carlo method, Neural networks (Computer science)



Physical Description

1 online resource (xiv, 79 pages)


Physically-based image synthesis has attracted considerable attention due to its wide applications in visual effects, video games, design visualization, and simulation. However, obtaining visually satisfactory renderings with ray tracing algorithms often requires casting a large number of rays and thus takes a vast amount of computation. The extensive computational and memory requirements of ray tracing methods pose a challenge, especially when running these rendering algorithms on resource-constrained platforms, and impede their use in applications that require high resolutions and refresh rates. This thesis presents three methods to address the challenge of efficient rendering.

First, we present a hybrid rendering method to speed up Monte Carlo rendering algorithms. Our method first generates two versions of a rendering: one at a low resolution with a high sample rate (LRHS) and the other at a high resolution with a low sample rate (HRLS). We then develop a deep convolutional neural network to fuse these two renderings into a high-quality image as if it were rendered at a high resolution with a high sample rate. Specifically, we formulate this fusion task as a super-resolution problem that generates a high-resolution rendering from a low-resolution input (LRHS), assisted by the HRLS rendering. The HRLS rendering provides critical high-frequency details that are difficult for any super-resolution method to recover from the LRHS rendering alone. Our experiments show that our hybrid rendering algorithm is significantly faster than state-of-the-art Monte Carlo denoising methods while rendering high-quality images when tested on both our own BCR dataset and the Gharbi dataset.
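To see why rendering two cheaper versions can cost less than one high-resolution, high-sample-rate rendering, the following sketch counts primary samples only. The resolutions and sample rates are illustrative assumptions, not values from the thesis, and the function name `ray_budget` is hypothetical. The count ignores the cost of running the fusion network, so it is an upper bound on the possible saving rather than a measured speedup.

```python
def ray_budget(width, height, spp):
    """Number of primary samples needed for one rendering."""
    return width * height * spp


# Illustrative settings (assumed, not taken from the thesis).
HIGH_RES = (1024, 1024)
LOW_RES = (512, 512)
HIGH_SPP = 64
LOW_SPP = 4

# Reference: high resolution with a high sample rate.
reference = ray_budget(*HIGH_RES, HIGH_SPP)

# Hybrid: LRHS (low resolution, high sample rate)
# plus HRLS (high resolution, low sample rate).
lrhs = ray_budget(*LOW_RES, HIGH_SPP)
hrls = ray_budget(*HIGH_RES, LOW_SPP)
hybrid = lrhs + hrls

speedup = reference / hybrid
print(f"reference samples: {reference}")
print(f"hybrid samples:    {hybrid}")
print(f"sample-count ratio: {speedup:.2f}")
```

Under these assumed settings the two inputs together need roughly a third of the samples of the reference rendering; the actual gain depends on the chosen resolutions, sample rates, and network inference time.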

Second, we investigate super-resolution to reduce the number of pixels to render and thus speed up Monte Carlo rendering algorithms. While great progress has been made in super-resolution technologies, super-resolution is essentially an ill-posed problem and cannot recover high-frequency details in renderings. To address this problem, we exploit high-resolution auxiliary features to guide the super-resolution of low-resolution renderings. These high-resolution auxiliary features can be quickly rendered by a rendering engine and, at the same time, provide valuable high-frequency details to assist super-resolution. To this end, we develop a cross-modality Transformer network that consists of an auxiliary feature branch and a low-resolution rendering branch. These two branches are designed to fuse high-resolution auxiliary features with the corresponding low-resolution rendering. Furthermore, we design residual densely-connected Swin Transformer groups that learn to extract representative features for high-quality super-resolution. Our experiments show that our auxiliary-feature-guided super-resolution method outperforms both state-of-the-art super-resolution methods and Monte Carlo denoising methods in producing high-quality renderings.
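The idea of pairing a low-resolution rendering with high-resolution auxiliary features can be illustrated with a much simpler stand-in for the network. The sketch below is not the cross-modality Transformer described above: it only upsamples the rendering with nearest-neighbor repetition and concatenates it per pixel with the auxiliary features, which is one plausible way to prepare aligned inputs. The function names and the choice of a single depth value as the auxiliary feature are assumptions made for illustration.

```python
def upsample_nearest(image, scale):
    """Repeat each low-resolution pixel scale x scale times."""
    height, width = len(image), len(image[0])
    return [
        [image[y // scale][x // scale] for x in range(width * scale)]
        for y in range(height * scale)
    ]


def stack_with_auxiliary(low_res, auxiliary, scale):
    """Concatenate upsampled colour with per-pixel auxiliary features."""
    upsampled = upsample_nearest(low_res, scale)
    if len(upsampled) != len(auxiliary) or len(upsampled[0]) != len(auxiliary[0]):
        raise ValueError("auxiliary features must match the target resolution")
    return [
        [tuple(color) + tuple(aux) for color, aux in zip(color_row, aux_row)]
        for color_row, aux_row in zip(upsampled, auxiliary)
    ]


# A 1x1 RGB rendering and a 2x2 map holding one (assumed) depth value per pixel.
low_res = [[(0.5, 0.5, 0.5)]]
auxiliary = [[(1.0,), (2.0,)], [(3.0,), (4.0,)]]
stacked = stack_with_auxiliary(low_res, auxiliary, scale=2)
print(stacked[0][1])
```

Every output pixel shares the same blurry colour but carries its own auxiliary value, which is the high-frequency signal a learned model can use to restore detail.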

Third, we present a deep-learning-based Monte Carlo denoising method for stereoscopic images. Research on deep-learning-based Monte Carlo denoising has made significant progress in recent years. However, existing methods are mostly designed for single-image Monte Carlo denoising, and stereoscopic Monte Carlo denoising is less explored. Traditional methods require first rendering a noiseless image for one view, which is time-consuming. Recent deep-learning-based methods achieve promising results on single-image Monte Carlo denoising, but their performance on stereoscopic images is compromised because they do not consider the spatial correspondence between the left and right images. Our method takes stereoscopic images rendered at a low number of samples per pixel (spp) as input and estimates a high-quality result. Specifically, we extract features from the two stereoscopic images and warp the features from one image to the other using a disparity that is fine-tuned from the disparity calculated from geometry. To train our network, we collected a large-scale Blender Cycles Stereo Ray-tracing dataset. Our experiments show that our method outperforms state-of-the-art methods when the sampling rates are low.
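The warping step can be sketched for a single-channel feature map, assuming rectified stereo images and the common convention that a left-view pixel at column x corresponds to the right-view column x - d, where d is the disparity. This is a minimal illustration with linear interpolation along each row; the function name `warp_right_to_left` is hypothetical, the disparity is taken as given, and the fine-tuning of the disparity described above is not shown.

```python
import math


def warp_right_to_left(right, disparity, fill=0.0):
    """Resample a right-view feature map onto the left view's pixel grid.

    right:     2D list of feature values from the right view.
    disparity: 2D list of per-pixel disparities defined on the left view
               (assumed convention: x_right = x_left - disparity).
    fill:      value used where the source column falls outside the image.
    """
    height, width = len(right), len(right[0])
    warped = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src = x - disparity[y][x]
            if src < 0 or src > width - 1:
                continue  # no valid correspondence; keep the fill value
            x0 = math.floor(src)
            x1 = min(x0 + 1, width - 1)
            t = src - x0
            # Linear interpolation between the two nearest source columns.
            warped[y][x] = (1 - t) * right[y][x0] + t * right[y][x1]
    return warped


right_features = [[10.0, 20.0, 30.0, 40.0]]
half_pixel = [[0.5, 0.5, 0.5, 0.5]]
print(warp_right_to_left(right_features, half_pixel))
```

Pixels whose correspondence falls outside the other view receive the fill value, which is one reason a denoiser may still need the features of the target view itself.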


In Copyright. URI: This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

Persistent Identifier