Transform Images Effortlessly with ControlNet Cognitive Actions

The ControlNet Cognitive Actions give developers prompt-driven image transformation through pre-built actions. You describe the change you want in a prompt and tune parameters such as scale and strength to control the result. Integrating these actions into your applications lets you automate image processing tasks that would otherwise require manual editing.
Prerequisites
Before diving into the Cognitive Actions, ensure you have the following:
- An API key for the Cognitive Actions platform to authenticate your requests.
- Basic familiarity with JSON structure and Python programming.
- A valid URL for each image you wish to process.
To authenticate your requests, you will pass your API key in the headers of your HTTP requests.
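As a minimal sketch of that step, the helper below reads the key from an environment variable instead of hard-coding it. The variable name COGNITIVE_ACTIONS_API_KEY and the function name build_auth_headers are assumptions made for illustration, and the Bearer scheme simply mirrors the usage example later in this article rather than confirmed platform documentation.

```python
import os


def build_auth_headers(env_var: str = "COGNITIVE_ACTIONS_API_KEY") -> dict:
    """Build HTTP headers from an API key stored in the environment.

    The environment variable name is an assumption made for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return {
        # Bearer scheme assumed, matching the usage example in this article
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Failing early when the key is missing tends to give a clearer error than a 401 response from the server.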
Cognitive Actions Overview
Perform Image Transformation with ControlNet
This action uses ControlNet to transform an input image according to a text prompt. Optional parameters let you tune the output, and a negative prompt steers the result away from undesirable characteristics.
- Category: Image Processing
Input
The input schema accepts the following fields; only inputImagePath and prompt are required:
- inputImagePath (string, required): URI of the image to be processed.
- prompt (string, required): Main instruction for the image transformation.
- seed (number, optional): Starting point for randomness; default is 3.5.
- scale (number, optional): Controls the effect scale; default is 9.
- strength (number, optional): Intensity of the effect; default is 1.
- ddimSteps (integer, optional): Number of sampling steps; default is 20.
- negativePrompt (string, optional): Comma-separated traits to avoid in the output, such as "lowres, bad anatomy, worst quality"; the default is a preset list of such traits.
- enableGuessMode (boolean, optional): Infers missing details; default is false.
- imageResolution (integer, optional): Output image resolution; default is 512 pixels.
- numberOfSamples (integer, optional): Number of samples to generate; default is 1.
- additionalPrompt (string, optional): Additional instructions; default is "best quality, extremely detailed".
- detectionResolution (integer, optional): Detection resolution; default is 512 pixels.
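- estimatedTimeOfArrival (number, optional): Documented as the estimated time for processing; default is 0. The name and default resemble the eta setting used for DDIM sampling in ControlNet, so treat the exact meaning of this field as unconfirmed.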
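The field list above can be sketched as a small payload builder that enforces the two required fields and fills in the documented defaults. The names DEFAULTS and build_payload are hypothetical helpers, not part of the platform. Because the exact default for negativePrompt is not spelled out in the list, the sketch leaves that field out unless you pass it, on the assumption that the service then applies its own default.

```python
# Defaults copied from the field list above; negativePrompt is omitted
# because its exact default value is not stated there.
DEFAULTS = {
    "seed": 3.5,
    "scale": 9,
    "strength": 1,
    "ddimSteps": 20,
    "enableGuessMode": False,
    "imageResolution": 512,
    "numberOfSamples": 1,
    "additionalPrompt": "best quality, extremely detailed",
    "detectionResolution": 512,
    "estimatedTimeOfArrival": 0,
}


def build_payload(input_image_path: str, prompt: str, **overrides) -> dict:
    """Return an input payload with required fields checked and defaults applied."""
    if not input_image_path or not prompt:
        raise ValueError("inputImagePath and prompt are required")
    allowed = set(DEFAULTS) | {"negativePrompt"}
    unknown = sorted(set(overrides) - allowed)
    if unknown:
        raise ValueError(f"Unknown fields: {', '.join(unknown)}")
    payload = {"inputImagePath": input_image_path, "prompt": prompt}
    payload.update(DEFAULTS)   # documented defaults first
    payload.update(overrides)  # caller-supplied values win
    return payload
```

For instance, build_payload("https://example.com/room.jpg", "Put furniture", ddimSteps=30) keeps every default except the number of sampling steps. Rejecting unknown keys catches typos such as ddimsteps before a request is sent.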
Example Input
Here’s an example of a valid JSON payload for this action:
{
  "seed": 3.5,
  "scale": 9,
  "prompt": "Put furniture",
  "strength": 1,
  "ddimSteps": 20,
  "inputImagePath": "https://replicate.delivery/pbxt/KSbEAl9T5UeDInRBNViM9xIpyYR2XNcRVqXC8gKJeWSfNe1y/point3d-commercial-imaging-ltd-nQlVMCHPysY-unsplash.jpg",
  "negativePrompt": "longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality",
  "enableGuessMode": false,
  "imageResolution": 512,
  "numberOfSamples": 1,
  "additionalPrompt": "best quality, extremely detailed",
  "detectionResolution": 512,
  "estimatedTimeOfArrival": 0
}
Output
Upon successful execution, this action typically returns an array of URLs pointing to the transformed images. Here’s an example of the expected output:
[
  "https://assets.cognitiveactions.com/invocations/fa05f264-101c-4414-9d63-45fe2603e0b4/ed78cca2-9450-43a9-afb6-dbfa9b3bb339.png"
]
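Since the output is described as typically an array of URLs, it can help to validate that shape before using it. The sketch below assumes the parsed result is exactly a JSON array of http(s) URL strings; the helper names extract_image_urls and local_filename are hypothetical, and a real response might wrap the array in another structure.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def extract_image_urls(result) -> list:
    """Check that the parsed output is a list of http(s) URL strings."""
    if not isinstance(result, list):
        raise ValueError("Expected a JSON array of image URLs")
    urls = []
    for item in result:
        if not isinstance(item, str) or urlparse(item).scheme not in ("http", "https"):
            raise ValueError(f"Not an http(s) URL: {item!r}")
        urls.append(item)
    return urls


def local_filename(url: str) -> str:
    """Derive a local file name from the last path segment of a URL."""
    return PurePosixPath(urlparse(url).path).name
```

With the example output above, local_filename yields the final path segment of the URL, which can serve as the name of the downloaded file.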
Conceptual Usage Example (Python)
To call this action using Python, you can structure your code as follows:
import requests
import json
# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint
action_id = "a384e125-55b2-419f-8e41-a88968b7d6d7" # Action ID for Perform Image Transformation with ControlNet
# Construct the input payload based on the action's requirements
payload = {
    "seed": 3.5,
    "scale": 9,
    "prompt": "Put furniture",
    "strength": 1,
    "ddimSteps": 20,
    "inputImagePath": "https://replicate.delivery/pbxt/KSbEAl9T5UeDInRBNViM9xIpyYR2XNcRVqXC8gKJeWSfNe1y/point3d-commercial-imaging-ltd-nQlVMCHPysY-unsplash.jpg",
    "negativePrompt": "longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality",
    "enableGuessMode": False,
    "imageResolution": 512,
    "numberOfSamples": 1,
    "additionalPrompt": "best quality, extremely detailed",
    "detectionResolution": 512,
    "estimatedTimeOfArrival": 0
}
headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}
try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to the raw text
            print(f"Response body: {e.response.text}")
In this code snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The action_id corresponds to the "Perform Image Transformation with ControlNet" action. The payload is structured according to the input schema, and the response will provide you with the URLs of the transformed images.
Conclusion
The ControlNet Cognitive Actions provide powerful tools for developers looking to enhance their applications with sophisticated image transformation capabilities. By utilizing the provided input parameters effectively, you can generate high-quality outputs tailored to your specifications. Explore further use cases, integrate these actions into your projects, and elevate your image processing workflows today!