MiniMax's Hailuo Video Team Makes Its First Open-Source Release

【#Tech24H】MiniMax's Hailuo Video team has open-sourced VTP, its visual tokenizer pre-training framework. VTP targets a known weakness of traditional visual tokenizers: they achieve high reconstruction accuracy yet yield poor generation performance. The framework improves model performance by jointly optimizing image-text contrastive learning, self-supervised learning, and reconstruction objectives, offering new insights for the development of generative models.
Editor: Zhang Liyan









