Apple researchers have introduced an AI model that lets users describe the changes they want to make to a photograph in plain language, eliminating the need for traditional photo editing software. The model, named MGIE (MLLM-Guided Image Editing), is the result of a collaboration between Apple and the University of California, Santa Barbara.
Capable of executing various editing tasks including cropping, resizing, flipping, and applying filters solely through text prompts, MGIE represents a significant advancement in image editing technology. This innovation can handle both straightforward and intricate editing requests, such as altering specific objects within a photo or enhancing brightness levels.
MGIE harnesses multimodal large language models, first interpreting the user's prompt and then generating the corresponding edit. For instance, a request for a “bluer sky” translates to adjusting the brightness of the sky portion of an image. This approach ensures editing instructions are interpreted and executed precisely.
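Conceptually, that is a two-stage flow: a multimodal model turns a terse prompt into an explicit, visual-aware edit intention, and an editing model then applies that intention to the image. The sketch below is a minimal illustration of this idea only, not Apple's released code; the class and function names (MockMultimodalLM, MockImageEditor, edit_with_text) are hypothetical placeholders, and the "bluer sky" rule is hard-coded purely to mirror the article's example.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the two stages described above: an MLLM that
# expands a terse prompt into an explicit edit intention, and an editor
# that applies it. Names and behaviour are illustrative, not MGIE's API.

@dataclass
class EditIntention:
    region: str        # part of the image to change, e.g. "sky"
    operation: str     # concrete adjustment, e.g. "increase brightness"
    strength: float    # how strongly to apply it


class MockMultimodalLM:
    """Placeholder for an MLLM that derives a visual-aware intention."""

    def derive_intention(self, image_path: str, prompt: str) -> EditIntention:
        # A real MLLM would consider the image and the prompt together.
        # Here we hard-code the article's "bluer sky" example.
        if "bluer sky" in prompt.lower():
            return EditIntention(region="sky", operation="increase brightness", strength=0.2)
        return EditIntention(region="whole image", operation="no-op", strength=0.0)


class MockImageEditor:
    """Placeholder for the model that executes the derived edit."""

    def apply(self, image_path: str, intention: EditIntention) -> str:
        print(f"Editing {image_path}: {intention.operation} on {intention.region} "
              f"(strength {intention.strength})")
        return image_path.replace(".jpg", "_edited.jpg")


def edit_with_text(image_path: str, prompt: str) -> str:
    """Two-stage flow: prompt -> explicit intention -> applied edit."""
    mllm = MockMultimodalLM()
    editor = MockImageEditor()
    intention = mllm.derive_intention(image_path, prompt)
    return editor.apply(image_path, intention)


if __name__ == "__main__":
    edit_with_text("beach.jpg", "give this photo a bluer sky")
```

The point of the intermediate step is that the language model, rather than the user, supplies the concrete editing parameters implied by an otherwise vague instruction.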
For example, a prompt to “make it more healthy” when editing an image of a pepperoni pizza results in the addition of vegetable toppings. Similarly, instructing the model to “add more contrast to simulate more light” enhances the brightness of a dark image, such as tigers in the Sahara.
In a statement accompanying the release, the researchers highlighted the model’s ability to derive explicit visual-aware intentions, leading to meaningful image enhancements. They conducted extensive studies validating MGIE’s effectiveness across various editing scenarios, emphasising its improved performance while maintaining competitive efficiency. Furthermore, they envision the MLLM-guided framework contributing to future advancements in vision-and-language research.
Apple has made MGIE accessible for download via GitHub, with a web demo also available on Hugging Face Spaces. However, the company has not disclosed its plans for the model beyond research purposes.
While image generation platforms such as OpenAI’s DALL-E 3 offer similar capabilities, and Adobe’s Firefly AI model powers the generative fill feature in Photoshop, MGIE marks Apple’s own foray into generative AI and signals its commitment to incorporating advanced AI features into its products. CEO Tim Cook has previously said the company intends to expand AI functionality across its devices; recent efforts include the December release of MLX, an open-source machine learning framework aimed at making it easier to train AI models on Apple Silicon chips.
Source: Business Today