Generate Stunning Images with the Kandinsky Cognitive Actions
The Kandinsky Cognitive Actions give developers pre-built tools for generating unique images by blending text and images. They run on the Kandinsky 2.1 model and are optimized for speed and customization, which suits applications ranging from art generation to content creation. Because the actions are pre-built, you can add image generation to an application without extensive machine learning expertise.
Prerequisites
Before getting started with Kandinsky Cognitive Actions, ensure you have the following:
- An API key for the Cognitive Actions platform.
- Familiarity with making HTTP requests and handling JSON payloads.
- A basic understanding of Python programming for testing the API calls.
Authentication typically involves passing your API key in the headers of your requests, ensuring secure access to the capabilities offered by the platform.
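Since authentication comes down to sending the key in a request header, a small helper can keep the key out of your source code. The sketch below is illustrative only: it assumes a Bearer-token Authorization header (the scheme used in the Python usage example in this article) and an environment variable named COGNITIVE_ACTIONS_API_KEY. The helper name build_headers and the variable name are assumptions, not part of the platform.

```python
import os


def build_headers(api_key=None):
    """Return request headers that carry the API key as a Bearer token.

    Assumption: the platform accepts "Authorization: Bearer <key>".
    The environment variable name is hypothetical.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key given; set COGNITIVE_ACTIONS_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Passing the key explicitly, as in build_headers("my-key"), is handy in tests; in deployed code the environment variable avoids committing credentials.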
Cognitive Actions Overview
Generate Image with Kandinsky
The Generate Image with Kandinsky action allows you to create images by merging textual prompts with image inputs. This action is part of the image generation category and is designed for tasks such as art creation or generating illustrative content based on textual descriptions.
Input
The input schema for this action accepts the following fields:
- task: (string) Specify the task to be performed. Currently, only "text2img" is supported.
- width: (integer) Width of the output image in pixels, ranging from 64 to 1024 (default: 512).
- height: (integer) Height of the output image, also between 64 and 1024 (default: 512).
- prompt: (string) A descriptive text guiding the image generation, outlining desired features (default: "A alien cheeseburger creature eating itself, claymation, cinematic, moody lighting").
- scheduler: (string) Select the scheduler for denoising; options include "dpm" and "ddim" (default: "ddim").
- imageWeight: (number) Adjust the weight of the input image, with values from 0 to 10 (default: 1).
- guidanceScale: (number) Controls adherence to the prompt, ranging from 1 to 10 (default: 4).
- negativePrompt: (string) Specify elements to avoid in the output, separated by commas (default includes various undesirable traits).
- numberOfOutputs: (integer) Defines how many images to generate (1 to 4, default: 1).
- numberOfStepsPrior: (integer) Number of denoising steps in the prior diffusion process (1 to 500, default: 2).
- numberOfInferenceSteps: (integer) Number of denoising steps in the inference process (1 to 500, default: 18).
Here’s an example of the JSON payload you might send to invoke this action:
{
  "task": "text2img",
  "width": 512,
  "height": 512,
  "prompt": "A alien cheeseburger creature eating itself, claymation, cinematic, moody lighting",
  "scheduler": "ddim",
  "imageWeight": 1,
  "guidanceScale": 4,
  "negativePrompt": "ugly, tiling, oversaturated, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, blurry, bad anatomy, blurred, watermark, grainy, signature, cut off, draft",
  "numberOfOutputs": 1,
  "numberOfStepsPrior": 2,
  "numberOfInferenceSteps": 18
}
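Because most fields have numeric ranges or a fixed set of allowed values, it can help to check a payload locally before sending it. The following is a minimal sketch of such a check, using only the ranges and options listed in the field descriptions above. The function name validate_inputs and the table names are hypothetical, and the platform may apply additional rules that are not covered here.

```python
# Numeric ranges from the field list: name -> (minimum, maximum)
NUMERIC_RANGES = {
    "width": (64, 1024),
    "height": (64, 1024),
    "imageWeight": (0, 10),
    "guidanceScale": (1, 10),
    "numberOfOutputs": (1, 4),
    "numberOfStepsPrior": (1, 500),
    "numberOfInferenceSteps": (1, 500),
}
# Fields the schema describes as integers rather than general numbers
INTEGER_FIELDS = {
    "width", "height", "numberOfOutputs",
    "numberOfStepsPrior", "numberOfInferenceSteps",
}
# Fields restricted to a fixed set of string values
ALLOWED_VALUES = {
    "task": {"text2img"},
    "scheduler": {"dpm", "ddim"},
}


def validate_inputs(payload):
    """Return a list of problems found in payload; an empty list means OK."""
    problems = []
    for name, (low, high) in NUMERIC_RANGES.items():
        if name not in payload:
            continue  # omitted fields fall back to the documented defaults
        value = payload[name]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name} must be a number")
        elif name in INTEGER_FIELDS and not isinstance(value, int):
            problems.append(f"{name} must be an integer")
        elif not low <= value <= high:
            problems.append(f"{name}={value} is outside {low}..{high}")
    for name, allowed in ALLOWED_VALUES.items():
        if name in payload and payload[name] not in allowed:
            problems.append(f"{name} must be one of {sorted(allowed)}")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue at once, for example in a form or a log message.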
Output
The output of this action typically returns a list of image URLs generated based on the provided inputs. Here’s an example of a possible output:
[
"https://assets.cognitiveactions.com/invocations/9dbb5ff2-5b7f-4ba1-909b-2a04ebf35a15/a6cb298c-bd8d-48f3-aaba-4b6fa39f98be.png"
]
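If the action returns a list of image URLs like the one above, a natural next step is to save the images locally. The sketch below uses only the Python standard library; the helper names filename_from_url and save_images are hypothetical, and it assumes the URLs can be fetched with a plain GET request without extra authentication, which the source does not confirm.

```python
import os
import posixpath
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url):
    """Use the last path segment of the URL as the local file name."""
    name = posixpath.basename(urlparse(url).path)
    return name or "image.png"  # fallback when the URL path ends in "/"


def save_images(urls, out_dir="."):
    """Download each URL into out_dir and return the saved file paths.

    Assumption: the URLs are publicly readable.
    """
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(out_dir, filename_from_url(url))
        with urlopen(url, timeout=60) as resp, open(path, "wb") as fh:
            fh.write(resp.read())
        saved.append(path)
    return saved
```

Calling save_images(result, "outputs") would then write one file per generated image, assuming result holds the returned list.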
Conceptual Usage Example (Python)
Below is a conceptual Python code snippet demonstrating how to call the Generate Image with Kandinsky action using a hypothetical Cognitive Actions execution endpoint:
import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "e580824f-f675-4f9c-aca8-3c13b279d30a"  # Action ID for Generate Image with Kandinsky

# Construct the input payload based on the action's requirements
payload = {
    "task": "text2img",
    "width": 512,
    "height": 512,
    "prompt": "A alien cheeseburger creature eating itself, claymation, cinematic, moody lighting",
    "scheduler": "ddim",
    "imageWeight": 1,
    "guidanceScale": 4,
    "negativePrompt": "ugly, tiling, oversaturated, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, blurry, bad anatomy, blurred, watermark, grainy, signature, cut off, draft",
    "numberOfOutputs": 1,
    "numberOfStepsPrior": 2,
    "numberOfInferenceSteps": 18
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=120  # Avoid hanging forever if the server never responds
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")
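In this code snippet, replace the values of COGNITIVE_ACTIONS_API_KEY and COGNITIVE_ACTIONS_EXECUTE_URL with your actual API key and endpoint. The endpoint and the request body structure shown here are hypothetical, so check them against the platform's documentation. The action ID and input payload are structured to match the requirements for generating an image.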
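When you only want to change one or two fields, repeating the full payload in every call gets noisy. One option, sketched below, is to keep the documented defaults in one place and merge per-call overrides into them. The names DEFAULT_INPUTS and build_request_body are hypothetical, and the {"action_id": ..., "inputs": ...} body shape is the same hypothetical structure used in the snippet above.

```python
import json

# Defaults taken from the field list and example payload in this article
DEFAULT_INPUTS = {
    "task": "text2img",
    "width": 512,
    "height": 512,
    "prompt": "A alien cheeseburger creature eating itself, claymation, cinematic, moody lighting",
    "scheduler": "ddim",
    "imageWeight": 1,
    "guidanceScale": 4,
    "negativePrompt": "ugly, tiling, oversaturated, poorly drawn hands, poorly drawn feet, poorly drawn face, out of frame, extra limbs, disfigured, deformed, body out of frame, blurry, bad anatomy, blurred, watermark, grainy, signature, cut off, draft",
    "numberOfOutputs": 1,
    "numberOfStepsPrior": 2,
    "numberOfInferenceSteps": 18,
}


def build_request_body(action_id, **overrides):
    """Merge overrides into the defaults and wrap them in the request body."""
    unknown = set(overrides) - set(DEFAULT_INPUTS)
    if unknown:
        raise KeyError(f"Unknown input field(s): {sorted(unknown)}")
    inputs = {**DEFAULT_INPUTS, **overrides}  # leaves DEFAULT_INPUTS unchanged
    return {"action_id": action_id, "inputs": inputs}


body = build_request_body(
    "e580824f-f675-4f9c-aca8-3c13b279d30a",
    prompt="A watercolor lighthouse at dusk",
    numberOfOutputs=2,
)
print(json.dumps(body["inputs"], indent=2))
```

The resulting body can be passed as the json argument of requests.post in place of the hand-written dictionary. Rejecting unknown field names catches typos such as guidancescale before a request is sent.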
Conclusion
The Kandinsky Cognitive Actions give developers a straightforward way to integrate image generation into their applications. With the Kandinsky 2.1 model behind a single action, you can produce images from a text prompt and tune the result through parameters such as the scheduler, guidance scale, and step counts. Whether you're aiming to generate artwork, enhance user interfaces, or explore creative possibilities, these actions are a practical starting point. Start experimenting today and bring your ideas to life!