The following pages link to Extending Text2Image Models to Accept Multi-Modal Conditions by Encoding to the CLIP Latent Space:
Displayed 1 item.