First Advisor

Feng Liu

Term of Graduation

Winter 2020

Date of Publication


Document Type


Degree Name

Doctor of Philosophy (Ph.D.) in Computer Science


Computer Science




Computer graphics, Computer vision, Image processing -- Digital techniques



Physical Description

1 online resource (xiv, 111 pages)


Novel view synthesis is a classic problem in computer vision. It refers to the generation of previously unseen views of a scene from a sparse set of input images taken from different viewpoints. One example is the interpolation of views between the two images of a stereo camera. Another classic problem in computer vision is video frame interpolation, which is important for video processing. It refers to the generation of video frames between existing ones and is commonly used to increase the frame rate of a video or to match its frame rate to the refresh rate of the monitor on which it is displayed. Interestingly, off-the-shelf video frame interpolation can be applied directly, and successfully, to the aforementioned stereo view interpolation problem.

Video frame interpolation can be seen as temporal novel view synthesis. However, this perspective is rarely taken, and novel view synthesis generally concerns generating unseen views in space rather than in time. For this reason, the sparse set of input images used for spatial novel view synthesis is commonly either captured at the same time or assumed to depict a static scene. This paradigm limits the applicability of novel view synthesis in real-world scenarios.

This thesis addresses three applications of novel view synthesis and provides practical solutions that do not require difficult-to-acquire multi-view imagery: video frame interpolation, which performs temporal video-to-video synthesis; synthesizing the 3D Ken Burns effect from a single image, which performs spatial image-to-video synthesis; and synthesizing video action shots, which performs spatiotemporal video-to-video and video-to-image synthesis. These applications not only explore different dimensions of time and space but also perform novel view synthesis on everyday image and video footage. This is in stark contrast to the large body of existing work, which focuses on spatial novel view synthesis and requires multiple input views that were either captured at the same time or captured under the assumption of a static scene.


In Copyright. This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

Persistent Identifier

717147_supp_C729C764-4976-11EA-962C-1F524D662D30 (1).avi (97503 kB)
Proposed softmax splatting

717147_supp_E6D556BE-4976-11EA-B4B1-22524D662D30.avi (103052 kB)
Proposed 3D Ken Burns effect

717147_supp_DD261D78-4977-11EA-82C4-C6554D662D30.avi (98358 kB)
Proposed video action shot synthesis framework