Ghost Towns: Semantically Labelled Object Removal From Video

William Clifford, Charles Markham

Research output: Contribution to conference, Paper, peer-review

Abstract

This paper describes a method for producing a video of a road in which the foreground items that obstruct the view of the road, i.e. other vehicles, have been removed. Once these regions have been identified, they are replaced using suitable images that closely resemble the original background. The work considers an approach that uses multiple video sequences of the same road (C1...Cn). One video, Cp, is identified as the one that requires the least repair. All instances of vehicles in each frame of video were identified using a Convolutional Neural Network (CNN). The regions associated with each vehicle were then filled using suitable regions from frames of the remaining streams (assuming at least one of these streams has a visible background region that matches the query region in Cp). To match frames and locate suitable patches, the video sequences ideally need to be aligned both temporally and structurally. To match the frames temporally, a bag of visual words approach was taken. To align the frames structurally, a template search was performed on regions surrounding the region to be replaced. Given the template matches, the region between these templates in the matching frame was used to fill the area previously occupied by the vehicles, leaving behind only the background.
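
The structural alignment and fill step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes grayscale frames held as 2-D NumPy arrays, uses a brute-force sum-of-squared-differences search in place of whatever template matcher the paper used, and simplifies the "templates surrounding the region" to a single strip of context to the left of the vehicle box. The names `find_template`, `fill_from_donor`, `box` and `margin` are hypothetical.

```python
import numpy as np


def find_template(image, template):
    """Return (row, col) of the top-left corner where `template` best
    matches `image`, using the minimum sum of squared differences."""
    ih, iw = image.shape
    th, tw = template.shape
    best_score, best_pos = np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = image[r:r + th, c:c + tw].astype(np.float64)
            score = np.sum((window - template) ** 2)
            if score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos


def fill_from_donor(target, donor, box, margin=4):
    """Replace the region `box` = (top, left, height, width) in `target`
    with the corresponding background region found in `donor`.

    Assumption: a strip `margin` pixels wide, immediately left of the box,
    is background in both frames and serves as the search template.
    """
    top, left, height, width = box
    template = target[top:top + height, left - margin:left].astype(np.float64)
    r, c = find_template(donor, template)
    # The patch to copy starts directly to the right of the matched strip.
    patch = donor[r:r + height, c + margin:c + margin + width]
    if patch.shape != (height, width):
        raise ValueError("donor patch falls outside the donor frame")
    repaired = target.copy()
    repaired[top:top + height, left:left + width] = patch
    return repaired
```

Here `target` would be a frame from Cp with a detected vehicle box, and `donor` a temporally matched frame from another stream in which the same background is visible. In practice an optimised matcher would replace the double loop, which is only practical for small frames.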
Original language: English
Publication status: Published - 1 Jan 2019
Externally published: Yes
Event: IMVIP 2019: Irish Machine Vision & Image Processing - Technological University Dublin, Dublin, Ireland
Duration: 28 Aug 2019 to 30 Aug 2019

Conference

Conference: IMVIP 2019: Irish Machine Vision & Image Processing
Country/Territory: Ireland
City: Dublin
Period: 28/08/19 to 30/08/19

Keywords

  • video
  • road
  • foreground items
  • vehicles
  • Convolutional Neural Network
  • CNN
  • video sequences
  • background
  • bag of visual words
  • template search
