Comparison between the Fréchet Video Distance implementation from StyleGAN-V and the original paper
In this repo, we demonstrate that the FVD implementation from the StyleGAN-V paper is equivalent to the original one once the videos are already loaded into memory and resized to the target resolution.
The main difference between our FVD evaluation protocol and the paper's is that we strictly specify how data should be processed, how clips should be sampled, etc.
The problem with the original implementation is that it leaves the following unspecified:
- data processing: in which format videos are stored (directories of JPG/PNG frames, MP4, etc.) and how frames are resized, normalized, etc.
- clip sampling strategy: how clips are selected (from the beginning of the video or randomly, at which framerate, how many clips per video, etc.)
- how many fake and real videos should be used for evaluation
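To make the clip-sampling ambiguity concrete, here is a minimal sketch of one possible strategy: sampling a fixed number of fixed-length clips from random start positions with a given frame stride. The function name, parameters, and defaults are hypothetical illustrations, not the repo's actual API.

```python
import numpy as np

def sample_clip_indices(num_frames, clip_len=16, num_clips=4, stride=1, seed=0):
    """Return frame-index arrays for randomly positioned fixed-length clips.

    Hypothetical helper: every choice here (random starts, fixed stride,
    fixed number of clips per video) is exactly the kind of decision an
    FVD protocol must pin down, since each variant yields different scores.
    """
    rng = np.random.default_rng(seed)  # seeded for reproducible evaluation
    # Last valid start so the final sampled frame stays inside the video.
    max_start = num_frames - (clip_len - 1) * stride - 1
    if max_start < 0:
        raise ValueError("Video too short for the requested clip length/stride.")
    starts = rng.integers(0, max_start + 1, size=num_clips)
    return [np.arange(s, s + clip_len * stride, stride) for s in starts]
```

Changing any default (e.g. sampling only from the start of each video, or using a stride of 2) changes which frames the feature network sees, which is why two otherwise identical FVD implementations can disagree.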
That's why our evaluation protocol fixes each of these choices explicitly.