Create Stunning Images with Kandinsky and ControlNet

26 Apr 2025

In digital art and design, producing high-quality images that meet specific requirements can be challenging. The "Kandinsky 2.2 ControlNet Depth" service addresses this through its Cognitive Actions, which let developers generate detailed images with fine control over the result. It combines the Kandinsky model with ControlNet conditioning and supports both text-to-image and image-to-image tasks.

With this API, developers can expect to streamline their workflow, reduce the time spent on image generation, and achieve results that align closely with their creative vision. Whether you're working on a marketing campaign, game design, or any other project requiring unique visuals, this tool can significantly enhance your productivity and output quality.

Prerequisites

To get started, you'll need a Cognitive Actions API key and a basic understanding of making API calls.

Generate Image with Kandinsky and ControlNet

This action allows you to create high-detail images using the Kandinsky model, leveraging ControlNet for enhanced control over the output's composition and dimensions.

Purpose

The "Generate Image with Kandinsky and ControlNet" action addresses the need for high-quality image generation, whether you're producing entirely new creations from text prompts or transforming existing images into new artistic interpretations. It supports customization in terms of size and content, making it versatile for various applications.

Input Requirements

To utilize this action, you'll need to provide the following inputs:

  • Seed (integer): A random seed for reproducible output; reusing the same seed with the same inputs should reproduce the result. Leave blank to use a random seed.
  • Task (string): Specify whether you're performing a 'text2img' or 'img2img' task. The default is 'img2img'.
  • Image (string): A URI of the input image, applicable only when the task is 'img2img'.
  • Width (integer): Desired width of the output image (options ranging from 384 to 2048 pixels).
  • Height (integer): Desired height of the output image (options ranging from 384 to 2048 pixels).
  • Prompt (string): The guiding text for image generation (default: "A robot, 4k photo").
  • Negative Prompt (string): A list of undesirable elements to avoid in the output, enhancing image quality.
  • Number of Outputs (integer): Specify how many images to produce (range: 1 to 4).
  • Number of Inference Steps (integer): Defines the steps for the denoising process, impacting image detail (range: 1 to 500).
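
The ranges listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the service: `build_payload` is a hypothetical helper, the key names follow the Example Input in this article, and the `seed` key name is an assumption because the example does not include it. The 768 and 75 defaults are taken from the Example Input rather than from documented defaults, and requiring an image for 'img2img' is likewise an assumption.

```python
from typing import Optional

ALLOWED_TASKS = {"text2img", "img2img"}


def build_payload(
    prompt: str = "A robot, 4k photo",
    task: str = "img2img",
    image: Optional[str] = None,
    width: int = 768,
    height: int = 768,
    negative_prompt: Optional[str] = None,
    number_of_outputs: int = 1,
    number_of_inference_steps: int = 75,
    seed: Optional[int] = None,
) -> dict:
    """Hypothetical helper: validate inputs against the documented ranges."""
    if task not in ALLOWED_TASKS:
        raise ValueError(f"task must be one of {sorted(ALLOWED_TASKS)}")
    # Assumption: an input image is mandatory for img2img.
    if task == "img2img" and not image:
        raise ValueError("img2img requires an input image URI")
    # Only the outer bounds are checked; the service may accept
    # just a fixed set of sizes within this range.
    for name, value in (("width", width), ("height", height)):
        if not 384 <= value <= 2048:
            raise ValueError(f"{name} must be between 384 and 2048")
    if not 1 <= number_of_outputs <= 4:
        raise ValueError("numberOfOutputs must be between 1 and 4")
    if not 1 <= number_of_inference_steps <= 500:
        raise ValueError("numberOfInferenceSteps must be between 1 and 500")

    payload = {
        "task": task,
        "width": width,
        "height": height,
        "prompt": prompt,
        "numberOfOutputs": number_of_outputs,
        "numberOfInferenceSteps": number_of_inference_steps,
    }
    if task == "img2img":
        payload["image"] = image
    if negative_prompt:
        payload["negativePrompt"] = negative_prompt
    if seed is not None:
        payload["seed"] = seed  # assumed key name; omit for a random seed
    return payload
```

For a text-to-image request, a call such as `build_payload(task="text2img", prompt="A castle at dusk", seed=42)` returns a payload without an `image` entry, while an out-of-range width raises `ValueError` before any network call is made.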

Expected Output

The action returns the generated images that reflect the specifications provided. As the example below shows, the output is a list of URLs pointing to the generated images.

Example Input

{
  "task": "img2img",
  "image": "https://replicate.delivery/pbxt/JBQjVyAYINXgKMvXGfE2ykyLgNE6E7ytZLC8b26BM2D0IRoG/cat.png",
  "width": 768,
  "height": 768,
  "prompt": "A robot, 4k photo",
  "negativePrompt": "lowres, text, error, cropped, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, out of frame, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck, username, watermark, signature",
  "numberOfOutputs": 1,
  "numberOfInferenceSteps": 75
}

Example Output

[
  "https://assets.cognitiveactions.com/invocations/17fddf35-9218-48f5-af5e-9eaf2b171d7c/d41026bc-3e18-49b9-9460-ba1f88ae82f2.png"
]
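
The following Python sketch shows how the action might be called. The endpoint URL is hypothetical, so check the Cognitive Actions documentation for the actual one.

import requests
import json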

# Replace with your actual Cognitive Actions API key and endpoint
# Ensure your environment securely handles the API key
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
# This endpoint URL is hypothetical and should be documented for users
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"

action_id = "52243daa-5aee-43c7-ad8b-515a5274ecec" # Action ID for: Generate Image with Kandinsky and ControlNet

# Construct the exact input payload based on the action's requirements
# This example uses the predefined example_input for this action:
payload = {
  "task": "img2img",
  "image": "https://replicate.delivery/pbxt/JBQjVyAYINXgKMvXGfE2ykyLgNE6E7ytZLC8b26BM2D0IRoG/cat.png",
  "width": 768,
  "height": 768,
  "prompt": "A robot, 4k photo",
  "negativePrompt": "lowres, text, error, cropped, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, out of frame, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck, username, watermark, signature",
  "numberOfOutputs": 1,
  "numberOfInferenceSteps": 75
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
    # Add any other required headers for the Cognitive Actions API
}

# Prepare the request body for the hypothetical execution endpoint
request_body = {
    "action_id": action_id,
    "inputs": payload
}

print(f"--- Calling Cognitive Action: {action_id} ---")
print(f"Endpoint: {COGNITIVE_ACTIONS_EXECUTE_URL}")
print(f"Action ID: {action_id}")
print("Payload being sent:")
print(json.dumps(request_body, indent=2))
print("------------------------------------------------")

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json=request_body,
        timeout=120  # Avoid hanging indefinitely; adjust for long-running generations
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully. Result:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Raised when the response body is not valid JSON
            print(f"Response body (non-JSON): {e.response.text}")
    print("------------------------------------------------")
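
Once the call succeeds, the returned URLs usually need to be saved locally. The sketch below assumes the result is a plain list of image URLs, as in the Example Output; the real execute endpoint may wrap that list in a larger response object. `output_filenames` and `download_images` are hypothetical helpers, and the standard library's `urllib` is used here only so the sketch has no extra dependency.

```python
import os
import shutil
import urllib.parse
import urllib.request
from typing import List


def output_filenames(urls: List[str]) -> List[str]:
    """Derive a local filename from the last path segment of each URL."""
    names = []
    for index, url in enumerate(urls):
        path = urllib.parse.urlparse(url).path
        # Fall back to a generated name when the URL has no usable segment.
        names.append(os.path.basename(path) or f"output_{index}.png")
    return names


def download_images(urls: List[str], dest_dir: str = "outputs",
                    timeout: float = 60.0) -> List[str]:
    """Download each image URL into dest_dir and return the saved paths."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url, name in zip(urls, output_filenames(urls)):
        target = os.path.join(dest_dir, name)
        with urllib.request.urlopen(url, timeout=timeout) as response, \
                open(target, "wb") as handle:
            shutil.copyfileobj(response, handle)
        saved.append(target)
    return saved
```

If `result` from the request above is such a list, `download_images(result)` would write each image into an `outputs` directory, naming each file after the last segment of its URL.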

Use Cases for this Action

  • Creative Projects: Artists and designers can use this action to explore new styles and concepts, generating images that match their creative briefs.
  • Marketing and Advertising: Create unique visuals for campaigns that stand out and convey specific messages effectively.
  • Game Development: Generate assets that can be used in games, ensuring they meet both artistic and technical specifications.
  • Social Media Content: Produce eye-catching images for posts or advertisements that engage audiences and enhance brand visibility.

Conclusion

The "Kandinsky 2.2 ControlNet Depth" Cognitive Action gives developers a robust tool for generating high-quality images tailored to specific needs. By simplifying image creation and allowing detailed customization, it suits a range of applications, from creative projects to marketing campaigns.

As you integrate this action into your workflow, consider how you can leverage its capabilities to enhance your projects and deliver stunning visuals that resonate with your audience. Start exploring the potential today!