Enhance Your Images with Guiding Image Editing using camenduru/ml-mgie Actions

22 Apr 2025
The camenduru/ml-mgie Cognitive Actions edit images from natural-language instructions. They rely on Multimodal Large Language Models to guide the edit, so a developer can describe a change in a short prompt rather than scripting it by hand. This article covers the "Perform Guiding Image Editing" action: its input and output schema, a usage example, and potential applications.

Prerequisites

Before you start integrating the Cognitive Actions, ensure you have the following:

  • An API key for accessing the Cognitive Actions platform.
  • Basic knowledge of making HTTP requests and handling JSON data.

For authentication, you will typically pass your API key in the request headers.
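
As a minimal sketch of header-based authentication, the helper below reads the key from an environment variable and builds the request headers. The variable name COGNITIVE_ACTIONS_API_KEY and the Bearer scheme are assumptions carried over from the conceptual example in this article, not confirmed platform details.

```python
import os


def build_auth_headers(env_var: str = "COGNITIVE_ACTIONS_API_KEY") -> dict:
    """Build request headers from an API key stored in the environment.

    The variable name and the Bearer scheme are assumptions for illustration.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; the helper fails early with a clear message if the variable is missing.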

Cognitive Actions Overview

Perform Guiding Image Editing

Description: This action edits an image according to a text instruction. You supply the image URL and a prompt, and can tune the result with a seed and two configuration parameters.

Category: Image Processing

Input

The action requires the following fields in the input schema:

  • inputImage (string, required): The URL of the input image to be processed. This must be a valid URI.
  • prompt (string, optional): A textual instruction guiding the editing process (e.g., "make the frame red").
  • seed (integer, optional): A seed value for randomization, defaulting to 13331.
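  • textConfiguration (number, optional): A numeric setting for how the text prompt is applied, defaulting to 7.5. The schema does not explain it further; the name and default suggest a text guidance scale.
  • imageConfiguration (number, optional): A numeric setting for how the input image is applied, defaulting to 1.5. The schema does not explain it further; the name and default suggest an image guidance scale.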
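
The field list above can be turned into a small payload builder that applies the documented defaults and rejects obviously invalid input before any request is sent. This is a sketch, not part of the platform: the function name build_payload is hypothetical, and restricting inputImage to absolute http(s) URLs is an assumption that is stricter than "a valid URI".

```python
from urllib.parse import urlparse

# Defaults taken from the field list above.
DEFAULT_SEED = 13331
DEFAULT_TEXT_CONFIGURATION = 7.5
DEFAULT_IMAGE_CONFIGURATION = 1.5


def build_payload(input_image, prompt=None, seed=DEFAULT_SEED,
                  text_configuration=DEFAULT_TEXT_CONFIGURATION,
                  image_configuration=DEFAULT_IMAGE_CONFIGURATION):
    """Return an input dict for the action; raise ValueError on bad input."""
    if not isinstance(input_image, str):
        raise ValueError("inputImage must be a string")
    parsed = urlparse(input_image)
    # Assumption: only absolute http(s) URLs are accepted.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("inputImage must be an absolute http(s) URL")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise ValueError("seed must be an integer")
    payload = {
        "inputImage": input_image,
        "seed": seed,
        "textConfiguration": float(text_configuration),
        "imageConfiguration": float(image_configuration),
    }
    # prompt is optional, so only include it when one was given.
    if prompt is not None:
        payload["prompt"] = prompt
    return payload
```

Calling build_payload("https://example.com/photo.jpg", prompt="make the frame red") yields a dict with the same keys as the example input, using the default seed and configuration values.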

Example Input:

{
    "seed": 13331,
    "prompt": "make the frame red",
    "inputImage": "https://replicate.delivery/pbxt/KNSKXP6DiykiZn7bEsZoZiaxGmE5o90BSUbDr67KrbOZcAvc/_input_0.jpg",
    "textConfiguration": 7.5,
    "imageConfiguration": 1.5
}

Output

The action typically returns a JSON object containing:

  • path (string): A URL link to the edited image.
  • text (string): A description of the changes made to the image.

Example Output:

{
    "path": "https://assets.cognitiveactions.com/invocations/da368fcf-2cab-4ecd-b6ed-24f1bf8bea73/561b4af8-5472-45ab-9460-f0d375f939ab.png",
    "text": "If the frame of the glasses in the image were made red, the overall appearance of the scene would change significantly. The red frame would draw more attention to the glass and create a stronger contrast with the black frame."
}
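
Because the output shape is described as typical rather than guaranteed, it can help to validate a result before using it. The sketch below extracts the two documented fields from an already-decoded result; the function name parse_edit_result is hypothetical, and treating text as optional is an assumption.

```python
from urllib.parse import urlparse


def parse_edit_result(result):
    """Extract (image_url, description) from a decoded action result."""
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object")
    path = result.get("path")
    # Require an http(s) URL so later download code gets something usable.
    if not isinstance(path, str) or urlparse(path).scheme not in ("http", "https"):
        raise ValueError("result has no usable 'path' URL")
    # Assumption: 'text' may be absent, so fall back to an empty string.
    text = result.get("text") or ""
    return path, text
```

Passing the example output above returns its path URL and its text description as a tuple; a result without a usable path raises ValueError instead of failing later during download.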

Conceptual Usage Example (Python)

Here’s a conceptual Python code snippet to demonstrate how you might invoke the "Perform Guiding Image Editing" action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "8a3d6ec3-b6f2-4a73-86f2-32d97e1e1e76"  # Action ID for Perform Guiding Image Editing

# Construct the input payload based on the action's requirements
payload = {
    "seed": 13331,
    "prompt": "make the frame red",
    "inputImage": "https://replicate.delivery/pbxt/KNSKXP6DiykiZn7bEsZoZiaxGmE5o90BSUbDr67KrbOZcAvc/_input_0.jpg",
    "textConfiguration": 7.5,
    "imageConfiguration": 1.5
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # requests has no default timeout; fail instead of hanging indefinitely
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body was not valid JSON (covers every JSON decode error requests can raise)
            print(f"Response body: {e.response.text}")

In this snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The endpoint URL and the request body structure are hypothetical, so adjust them to match the platform's documentation. The payload follows the action's input schema, and the response is printed as formatted JSON.

Conclusion

The "Perform Guiding Image Editing" action from the camenduru/ml-mgie Cognitive Actions suite lets developers edit images from a short text instruction, with a seed and two configuration values available for tuning. Consider integrating it into applications that need instruction-driven editing, such as photo editing tools or content generation platforms. Happy coding!