Enhance Your Images with the adirik/t2i-adapter-sdxl-sketch Cognitive Actions

22 Apr 2025

The adirik/t2i-adapter-sdxl-sketch Cognitive Actions let developers use Stable Diffusion XL to generate and modify images guided by a sketch. The input sketch constrains the composition, while the text prompt and a handful of numeric parameters control style, detail, and how closely the output follows the sketch. This guide walks through the action's inputs and outputs and shows how to call it from your own application.

Prerequisites

To get started with the Cognitive Actions, you will need:

  • An API key for the Cognitive Actions platform.
  • Basic knowledge of working with APIs and handling JSON data.

For authentication, you will typically pass your API key in the request headers when making API calls to execute the actions.

Cognitive Actions Overview

Modify Image Using Sketches

The Modify Image Using Sketches action combines an input image with sketch conditioning to generate new images. The prompt, scheduler, guidance scale, and adapter conditioning parameters described below determine how the final output looks.

Input

The input for this action is structured in a JSON object that includes several properties.

  • image (required): The URI of the input image in a supported format.
  • prompt (optional): A text prompt that describes the desired characteristics of the output image. Default is "a robot, mount fuji in the background, 4k photo, highly detailed".
  • scheduler (optional): Select the algorithm used for image generation; defaults to "K_EULER_ANCESTRAL".
  • randomSeed (optional): An integer seed value for reproducibility.
  • guidanceScale (optional): A scaling factor for prompt adherence, typically between 0 and 10; default is 7.5.
  • negativePrompt (optional): Text describing elements to avoid in the output image, which helps keep the result focused on the intended prompt. The example below passes it as a single comma-separated string.
  • numberOfSamples (optional): Number of images to generate per request, maximum of 4; default is 1.
  • numberOfInferenceSteps (optional): Defines the number of diffusion steps for generating images, affecting detail and quality; default is 30.
  • adapterConditioningScale (optional): Scale factor for adapter-conditioned image processing; default is 0.9.
  • adapterConditioningFactor (optional): Determines the extent to which the input image influences the output; default is 1.
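
The parameter list above can be turned into a small payload builder. The following is a minimal sketch, not part of the Cognitive Actions platform: build_payload is a hypothetical helper, the defaults are copied from the list above, and the lower bound of 1 for numberOfSamples is an assumption (only the maximum of 4 is documented).

```python
from typing import Any, Dict

# Defaults copied from the parameter list above.
DEFAULTS: Dict[str, Any] = {
    "prompt": "a robot, mount fuji in the background, 4k photo, highly detailed",
    "scheduler": "K_EULER_ANCESTRAL",
    "guidanceScale": 7.5,
    "numberOfSamples": 1,
    "numberOfInferenceSteps": 30,
    "adapterConditioningScale": 0.9,
    "adapterConditioningFactor": 1,
}

# Optional fields for which the list above gives no default.
OPTIONAL_WITHOUT_DEFAULT = {"randomSeed", "negativePrompt"}


def build_payload(image: str, **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: merge caller overrides into the documented defaults."""
    if not image:
        raise ValueError("'image' is required")
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_WITHOUT_DEFAULT
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload = {"image": image, **DEFAULTS, **overrides}
    # Documented maximum is 4; the minimum of 1 is an assumption.
    if not 1 <= payload["numberOfSamples"] <= 4:
        raise ValueError("numberOfSamples must be between 1 and 4")
    return payload
```

Calling build_payload("https://example.com/sketch.png", randomSeed=42) would return the documented defaults plus a fixed seed, which is convenient when you want to compare the effect of changing a single parameter.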

Here is an example input for this action:

{
  "image": "https://replicate.delivery/pbxt/Jbn9KLY1fCG2CNgBrsJgFKARXnIPi5I5vsO0gARPB8Olfz0O/org_sketch.png",
  "prompt": "a robot, mount fuji in the background, 4k photo, highly detailed",
  "scheduler": "K_EULER_ANCESTRAL",
  "guidanceScale": 7.5,
  "negativePrompt": "extra digit, fewer digits, cropped, worst quality, low quality, glitch, deformed, mutated, ugly, disfigured",
  "numberOfSamples": 1,
  "numberOfInferenceSteps": 30,
  "adapterConditioningScale": 0.9,
  "adapterConditioningFactor": 1
}

Output

The action typically returns an array of URIs pointing to the generated images. Here’s an example of the output you might receive:

[
  "https://assets.cognitiveactions.com/invocations/3cc07b5d-eb97-435c-94ab-fde307354ce3/5839dd1d-6584-4b34-8dd3-389d54ce416c.png",
  "https://assets.cognitiveactions.com/invocations/3cc07b5d-eb97-435c-94ab-fde307354ce3/295ad175-881c-4d68-b277-9febe6036b11.png"
]
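
Since the output is a list of image URIs, a common next step is to save the files locally. The sketch below uses only the Python standard library; local_name and save_images are hypothetical helper names, and the code assumes the result URIs can be fetched with a plain GET request without extra authentication, which the source does not confirm.

```python
from pathlib import Path, PurePosixPath
from typing import List
from urllib.parse import urlparse
from urllib.request import urlopen


def local_name(uri: str, index: int) -> str:
    """Derive a local file name from the last path segment of a result URI."""
    name = PurePosixPath(urlparse(uri).path).name
    # Fall back to a generated name if the URI has no usable path segment.
    return name or f"output_{index}.png"


def save_images(uris: List[str], out_dir: str = "outputs") -> List[Path]:
    """Download each URI into out_dir and return the paths that were written."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, uri in enumerate(uris):
        path = target / local_name(uri, index)
        # Assumption: the URI is publicly readable via a plain GET.
        with urlopen(uri, timeout=30) as response:
            path.write_bytes(response.read())
        saved.append(path)
    return saved
```

Passing the array returned by the action to save_images would write one file per URI into the outputs directory, named after the last segment of each URI.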

Conceptual Usage Example (Python)

Here’s a conceptual Python snippet that demonstrates how to call the Modify Image Using Sketches action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "268b5fee-8e20-4b46-badb-75118b464820" # Action ID for Modify Image Using Sketches

# Construct the input payload based on the action's requirements
payload = {
    "image": "https://replicate.delivery/pbxt/Jbn9KLY1fCG2CNgBrsJgFKARXnIPi5I5vsO0gARPB8Olfz0O/org_sketch.png",
    "prompt": "a robot, mount fuji in the background, 4k photo, highly detailed",
    "scheduler": "K_EULER_ANCESTRAL",
    "guidanceScale": 7.5,
    "negativePrompt": "extra digit, fewer digits, cropped, worst quality, low quality, glitch, deformed, mutated, ugly, disfigured",
    "numberOfSamples": 1,
    "numberOfInferenceSteps": 30,
    "adapterConditioningScale": 0.9,
    "adapterConditioningFactor": 1
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Avoid hanging indefinitely; adjust for long-running generations
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON
            print(f"Response body: {e.response.text}")

In this Python snippet, replace COGNITIVE_ACTIONS_API_KEY and COGNITIVE_ACTIONS_EXECUTE_URL with your actual API key and endpoint; the URL and the request body structure shown here are hypothetical. The action ID for "Modify Image Using Sketches" is included, and the input payload follows the parameters described above. On success the snippet prints the JSON response; on failure it prints the error along with the HTTP status and response body when they are available.

Conclusion

The adirik/t2i-adapter-sdxl-sketch Cognitive Actions let developers generate and modify images from sketches, with control over the prompt, scheduler, guidance scale, and adapter conditioning. Try varying these parameters on your own sketches and consider how the action might fit into your next project.