BENGALURU: Imagine seeing yourself playing guitar or sharing a meal with Einstein. Researchers at the Indian Institute of Science (IISc) have developed a system that makes this possible. Their system allows users to seamlessly insert themselves or others into AI-generated scenes and make precise facial feature adjustments.
“Our system makes these creative scenarios possible while maintaining precise control over facial features,” explains Prof Venkatesh Babu, adding that the breakthrough lies in the team’s novel approach of combining the strengths of two AI models.
The system, developed at IISc’s Vision and AI Lab (VAL) in the Department of Computational and Data Sciences (CDS), combines two powerful image-generation technologies: Text-to-Image (T2I) diffusion models and Style Generative Adversarial Network (StyleGAN) models.
The research team, comprising Rishubh Parihar, Sachidanand VS, Sabariswaran Mani, and Tejan Karmali, working under the guidance of Prof Babu, has created a model that transforms StyleGAN’s facial representations into a format compatible with T2I models. This has helped overcome the individual limitations of the models.
While T2I models excel at creating complex scenes from text descriptions, they struggle with precise face editing. StyleGAN models, conversely, are adept at generating and modifying realistic faces but are limited to face portraits. The team’s solution introduces an adapter that bridges this gap, allowing seamless integration of both capabilities.
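As a rough illustration of how such an adapter might work, the sketch below projects a StyleGAN-style face latent into a short sequence of token embeddings that a T2I model could condition on alongside ordinary text tokens. The dimensions (512 for the face latent, 768 for token embeddings), the number of tokens, the random weights, and all names are illustrative assumptions, not the team’s actual implementation.

```python
import numpy as np

# Hypothetical sizes: StyleGAN's w latent is commonly 512-d; many T2I
# text encoders use 768-d token embeddings. Both are assumptions here.
W_DIM, TOKEN_DIM, N_TOKENS = 512, 768, 4

rng = np.random.default_rng(0)

# A minimal "adapter": a linear map (randomly initialised here; in a
# real system it would be learned) that turns one face latent into a
# short sequence of pseudo-tokens in the T2I model's conditioning space.
adapter_weights = rng.standard_normal((W_DIM, N_TOKENS * TOKEN_DIM)) * 0.02

def adapt(w_latent: np.ndarray) -> np.ndarray:
    """Project a StyleGAN w latent into N_TOKENS T2I token embeddings."""
    tokens = w_latent @ adapter_weights            # (N_TOKENS * TOKEN_DIM,)
    return tokens.reshape(N_TOKENS, TOKEN_DIM)     # (N_TOKENS, TOKEN_DIM)

w = rng.standard_normal(W_DIM)     # stand-in for a real face latent
face_tokens = adapt(w)
print(face_tokens.shape)           # (4, 768)
```

In this picture, the T2I model never needs to understand StyleGAN’s latent space directly; the adapter translates between the two, which is what lets each model keep doing what it is best at.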
“A key feature of the system is its ability to handle multiple subjects in a single image without mixing up their facial features. The parallel generation technique ensures that each person’s identity remains distinct while blending naturally with the background scene. Users can modify individual facial attributes—such as adding a smile or beard—without affecting other subjects in the image,” Babu said.
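The per-subject editing Babu describes is reminiscent of classic StyleGAN latent-space editing, where each subject has its own latent and an attribute edit shifts only that latent along a learned direction. The sketch below uses random stand-in latents and a hypothetical “smile” direction; the variable names and numbers are illustrative, not taken from the paper.

```python
import numpy as np

W_DIM = 512                       # StyleGAN w-latent size (illustrative)
rng = np.random.default_rng(1)

# Hypothetical per-subject latents and a learned "add smile" direction
# in w space; all values are randomly initialised for illustration.
alice = rng.standard_normal(W_DIM)
bob = rng.standard_normal(W_DIM)
smile = rng.standard_normal(W_DIM)
smile /= np.linalg.norm(smile)    # unit-length attribute direction

def edit(w: np.ndarray, direction: np.ndarray, strength: float) -> np.ndarray:
    """Shift one subject's latent along an attribute direction."""
    return w + strength * direction

alice_smiling = edit(alice, smile, 2.0)
# Only Alice's latent moves; Bob's is untouched, so under a scheme that
# generates each subject from its own latent, his face is unaffected.
```

Because each subject’s identity lives in its own latent, an edit like this cannot leak onto the other faces in the scene, which matches the behaviour described in the quote.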
This development represents a significant step forward in generative AI technology, offering new possibilities for creative expression and image manipulation. The system’s ability to maintain precise control over facial features while generating complex scenes opens up numerous applications in fields ranging from entertainment to digital art.
“While we’ve kept our code open for anybody wanting to use it responsibly, we urge people not to misuse it, as any new technology can be put to misuse,” Babu warned.