# Taming Transformers for High-Resolution Image Synthesis
CVPR 2021 (Oral)

**tl;dr** We combine the efficiency of convolutional approaches with the expressivity of transformers by introducing a convolutional VQGAN, which learns a codebook of context-rich visual parts, whose composition is modeled with an autoregressive transformer (a conceptual sketch follows the news list below).

### News
- Added a checkpoint of a VQGAN trained with f8 compression and Gumbel-Quantization.
- Added accelerated sampling via caching of keys/values in the self-attention operation, used in `scripts/sample_fast.py` (a sketch of the technique follows this list).
- Added pretrained, unconditional models on FFHQ and CelebA-HQ.
- Added a pretrained, 1.4B transformer model trained for class-conditional ImageNet synthesis, which obtains state-of-the-art FID scores among autoregressive approaches and outperforms BigGAN.
- Our paper received an update: see the revised arXiv version and the corresponding changelog.
- Use `legacy=False` in the quantizer config to enable the updated quantizer; for backward compatibility it is disabled by default (which corresponds to always training with `beta=1.0`).
- Thanks to rom1504 it is now easy to train a VQGAN on your own datasets.
- Added scene synthesis models as proposed in the paper High-Resolution Complex Scene Synthesis with Transformers, see this section.
- More pretrained VQGANs (e.g. an f8 model with only 256 codebook entries) are available in our new work on Latent Diffusion Models.
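The accelerated sampling mentioned in the news list works by caching keys and values so that each decoding step only computes attention for the newest token instead of re-running attention over the whole sequence. The snippet below is a minimal, self-contained sketch of that idea, not the code in `scripts/sample_fast.py`; the class name `CachedSelfAttention` and its interface are illustrative assumptions.

```python
# Minimal sketch of key/value caching for autoregressive self-attention.
# Not the repository's implementation; names and interface are assumptions.
import torch
from torch import nn


class CachedSelfAttention(nn.Module):
    def __init__(self, dim: int, n_heads: int):
        super().__init__()
        assert dim % n_heads == 0
        self.n_heads = n_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, cache=None):
        # x: (batch, new_tokens, dim); during cached sampling new_tokens == 1.
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (t.view(B, T, self.n_heads, C // self.n_heads).transpose(1, 2)
                   for t in (q, k, v))

        if cache is not None:
            # Re-use keys/values of previously generated tokens so attention
            # over the full history is never recomputed from scratch.
            k = torch.cat([cache["k"], k], dim=2)
            v = torch.cat([cache["v"], v], dim=2)
        new_cache = {"k": k, "v": v}

        att = (q @ k.transpose(-2, -1)) / (k.shape[-1] ** 0.5)
        if T > 1:
            # When several tokens are fed at once (e.g. an initial prompt),
            # restrict each query to its own and earlier positions.
            S = k.shape[2]
            mask = torch.tril(torch.ones(T, S, device=x.device), diagonal=S - T).bool()
            att = att.masked_fill(~mask, float("-inf"))
        y = (att.softmax(dim=-1) @ v).transpose(1, 2).reshape(B, T, C)
        return self.proj(y), new_cache


# Usage sketch: feed one token per step and thread the cache through.
attn = CachedSelfAttention(dim=64, n_heads=4)
out, cache = attn(torch.randn(1, 1, 64))         # first token, no cache yet
out, cache = attn(torch.randn(1, 1, 64), cache)  # later steps reuse cached k/v
```

The saving comes from the second call: only the newest token's query, key, and value are computed, while all earlier keys/values are read from the cache.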
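For the two-stage idea in the tl;dr, the sketch below illustrates only the quantization step, assuming a generic encoder output and codebook rather than the repository's actual modules: continuous encoder features are snapped to their nearest codebook entries, and the resulting integer indices form the sequence that the autoregressive transformer models.

```python
# Conceptual sketch of the quantization stage described in the tl;dr.
# Not the repository's quantizer; shapes and names are assumptions.
import torch


def quantize(z, codebook):
    """Snap encoder features to their nearest codebook entries.

    z: (batch, h*w, dim) continuous encoder output.
    codebook: (n_codes, dim) learned "visual parts".
    Returns the quantized vectors and the integer indices that an
    autoregressive transformer would model as a sequence.
    """
    # Squared Euclidean distance between every feature vector and every code.
    dist = (z.unsqueeze(-2) - codebook).pow(2).sum(-1)  # (batch, h*w, n_codes)
    idx = dist.argmin(dim=-1)                           # discrete code indices
    return codebook[idx], idx


z = torch.randn(2, 16 * 16, 256)   # e.g. a 16x16 grid of 256-dim features
codebook = torch.randn(1024, 256)  # 1024 codes
z_q, idx = quantize(z, codebook)   # idx: (2, 256) token sequence for the transformer
```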