Enhance Your Images with LEdits++: A Developer's Guide to Cognitive Actions

21 Apr 2025

Editing images by describing the desired change in plain text has become a common approach in image processing. The LEdits++ action, part of the adirik/leditsplusplus spec, lets developers add, remove, or modify objects in an image through textual prompts. It inverts the input image and reconstructs it according to the parameters you supply, which gives you fine-grained control over how strongly each edit is applied.

Prerequisites

Before diving into using Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform to authenticate your requests. This key will typically be passed in the request headers.
  • Familiarity with basic JSON structures, as you'll be constructing input payloads in this format.
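
To keep the key out of source control, one option is to read it from the environment and build the request headers from it. The sketch below is illustrative only: the environment variable name COGNITIVE_ACTIONS_API_KEY and the helper names load_api_key and build_headers are assumptions, and the Bearer scheme simply mirrors the conceptual example later in this guide.

```python
import os


def load_api_key(env_var: str = "COGNITIVE_ACTIONS_API_KEY") -> str:
    """Return the API key from the environment (variable name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


def build_headers(api_key: str) -> dict:
    """Build JSON request headers; the Bearer scheme is assumed, not confirmed."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Failing early with a clear message when the variable is missing is usually easier to debug than an authentication error returned by the server.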

Cognitive Actions Overview

Perform Textual Image Editing with LEdits++

Description:
Use LEdits++ to perform sophisticated image editing by specifying objects to add, remove, or modify within images using textual prompts. This operation supports image inversion and reconstructs input images based on given parameters for advanced customization.

Category: image-processing

Input

The input for this action includes several fields, both required and optional. Here’s a breakdown:

  • image (required): A URI pointing to the input image to be edited.
    Example:
    "image": "https://replicate.delivery/pbxt/Kdrtkd4IdmtW53B6l9tG1upqxL6xFMhXobvcQ27qayMQFAIA/girl_with_a_pearl_earring.jpeg"
  • skip (optional): The portion of initial steps ignored for inversion and generation. Defaults to 0.15.
    Example:
    "skip": 0.3
  • editingThreshold (optional): Comma-separated float values representing edit thresholds for each editing prompt. Defaults to 0.9 if empty.
    Example:
    "editingThreshold": "0.75"
  • sourceDescription (optional): A prompt describing the input image used for guidance during inversion. If empty, guidance is disabled.
    Example:
    "sourceDescription": ""
  • editingWarmupSteps (optional): The number of diffusion steps per prompt where guidance is not applied. Defaults to 0.
    Example:
    "editingWarmupSteps": 8
  • editingInstructions (optional): Comma-separated descriptions of objects to add, remove, or edit. Defaults to None, which inverts and reconstructs the input image.
    Example:
    "editingInstructions": "glasses"
  • negativeFirstPrompt (optional): The negative prompt for the first text encoder to guide the image generation. Defaults to None.
    Example:
    "negativeFirstPrompt": ""
  • editingGuidanceScale (optional): Comma-separated float values giving the guidance scale for each editing prompt. Defaults to 5 if empty.
    Example:
    "editingGuidanceScale": "3.0"
  • negativeSecondPrompt (optional): The negative prompt for the second text encoder. Defaults to None if negativeFirstPrompt is empty.
    Example:
    "negativeSecondPrompt": ""
  • numberOfInversionSteps (optional): The number of steps for image inversion. Defaults to 50.
    Example:
    "numberOfInversionSteps": 50
  • sourceGuidanceStrength (optional): Defines the strength of guidance during inversion. Defaults to 3.5.
    Example:
    "sourceGuidanceStrength": 3.5
  • reverseEditingInstructions (optional): Comma-separated booleans (True/False), one per editing prompt, indicating whether the corresponding prompt in editingInstructions should be increased or decreased. Defaults to False.
    Example:
    "reverseEditingInstructions": "False"

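Because several fields carry one comma-separated value per editing prompt, it is easy to send lists of mismatched length. The following is a minimal sketch of a helper that assembles the payload and checks those lengths before the request is sent. The function name build_payload, its argument names, and the choice of "," without spaces as the separator are assumptions made for illustration; only the payload field names come from the list above.

```python
def join_csv(values) -> str:
    """Join values into the comma-separated string form the fields expect."""
    return ",".join(str(v) for v in values)


def build_payload(image, prompts, thresholds=None, guidance_scales=None,
                  reverse=None, **extra) -> dict:
    """Assemble an LEdits++ input payload (hypothetical helper).

    prompts is a list of editing prompts; thresholds, guidance_scales, and
    reverse, when given, must have one entry per prompt.
    """
    prompts = [p.strip() for p in prompts]
    if any(not p or "," in p for p in prompts):
        raise ValueError("Prompts must be non-empty and must not contain commas.")

    payload = {"image": image, "editingInstructions": join_csv(prompts)}

    per_prompt = {
        "editingThreshold": thresholds,
        "editingGuidanceScale": guidance_scales,
        "reverseEditingInstructions": reverse,
    }
    for field, values in per_prompt.items():
        if values is None:
            continue  # Omit the field so the action applies its default
        if len(values) != len(prompts):
            raise ValueError(
                f"{field} has {len(values)} values for {len(prompts)} prompts."
            )
        payload[field] = join_csv(values)

    # Pass through scalar options such as skip or editingWarmupSteps unchanged
    payload.update(extra)
    return payload
```

For example, build_payload(url, ["glasses"], thresholds=[0.75], guidance_scales=[3.0], reverse=[False], skip=0.3) produces the same per-prompt strings as the example input that follows.
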
Example Input

Here's how a complete input payload might look:

{
  "skip": 0.3,
  "image": "https://replicate.delivery/pbxt/Kdrtkd4IdmtW53B6l9tG1upqxL6xFMhXobvcQ27qayMQFAIA/girl_with_a_pearl_earring.jpeg",
  "editingThreshold": "0.75",
  "editingWarmupSteps": 8,
  "editingInstructions": "glasses",
  "editingGuidanceScale": "3.0",
  "numberOfInversionSteps": 50,
  "sourceGuidanceStrength": 3.5,
  "reverseEditingInstructions": "False"
}

Output

The action typically returns a URI pointing to the edited image. For instance:

"https://assets.cognitiveactions.com/invocations/48ab9aab-57cb-444e-9bac-6201df654866/e48ef27d-b1f5-43cb-8b8f-2b3b8c21d9b7.png"

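Since the result is a URI rather than image bytes, a typical next step is to download the file. The sketch below uses only the Python standard library and derives the local filename from the URI path. The helper names filename_from_url and download_image, the fallback name edited.png, and the 60-second timeout are illustrative assumptions; whether the returned URI requires authentication is not covered by this guide.

```python
import os
import shutil
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url: str, default: str = "edited.png") -> str:
    """Use the last path segment of the URL as the filename, or a fallback."""
    name = os.path.basename(urlparse(url).path)
    return name or default


def download_image(url: str, dest_dir: str = ".", timeout: float = 60) -> str:
    """Stream the image at url into dest_dir and return the local path."""
    path = os.path.join(dest_dir, filename_from_url(url))
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses
    with urlopen(url, timeout=timeout) as response, open(path, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return path
```

Streaming with shutil.copyfileobj avoids holding the whole image in memory, which matters little for one file but helps when processing batches.
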
Conceptual Usage Example (Python)

Here's a conceptual Python code snippet demonstrating how to call the LEdits++ action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "4974ca09-4b33-4eea-a77c-a42f02d88f93" # Action ID for Perform Textual Image Editing with LEdits++

# Construct the input payload based on the action's requirements
payload = {
    "skip": 0.3,
    "image": "https://replicate.delivery/pbxt/Kdrtkd4IdmtW53B6l9tG1upqxL6xFMhXobvcQ27qayMQFAIA/girl_with_a_pearl_earring.jpeg",
    "editingThreshold": "0.75",
    "editingWarmupSteps": 8,
    "editingInstructions": "glasses",
    "editingGuidanceScale": "3.0",
    "numberOfInversionSteps": 50,
    "sourceGuidanceStrength": 3.5,
    "reverseEditingInstructions": "False"
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=60 # Avoid hanging indefinitely if the server stalls
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body is not valid JSON; fall back to the raw text
            print(f"Response body: {e.response.text}")

In this example, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The action ID corresponds to the LEdits++ operation, and the input payload is structured according to the specifications provided.

Conclusion

The LEdits++ action lets you edit images through textual prompts, with per-prompt thresholds and guidance scales to control each change. By using this Cognitive Action, you can give users advanced image editing capabilities without implementing the inversion and editing steps yourself.

Consider exploring additional use cases, experimenting with different prompts, and integrating these capabilities into your projects. Happy coding!