arXiv: RealCustom++: Representing Images as Real Textual Word for Real-Time Customization

Name
RealCustom++: Representing Images as Real Textual Word for Real-Time Customization
Home page
https://yiyibooks.cn/arxiv/2408.09744v3/index.html
Original URL
https://arxiv.org/pdf/2408.09744
Description
Text-to-image customization, which takes texts and images depicting given subjects as inputs, aims to synthesize new images that align with both the text semantics and the subject appearance. This task provides precise control over details that text alone cannot capture and is fundamental to various real-world applications, garnering significant interest from academia and industry. Existing works follow the pseudo-word paradigm, which represents the given subjects as pseudo-words and combines them with the given texts to jointly guide the generation.