Generate Stunning Images with Prompt-Free Diffusion Cognitive Actions

22 Apr 2025

Cognitive Actions in the cjwbw/prompt-free-diffusion spec let developers generate images with diffusion models without writing text prompts. Visual inputs take the place of the prompt: an input image supplies the content to work from, and a control image guides the processing. The underlying technology uses the Semantic Context Encoder (SeeCoder) and works with several diffusion models and control methods, giving you flexibility and control over the generated output.

Prerequisites

Before you start integrating Cognitive Actions into your application, ensure you have the following:

  • API Key: Obtain an API key for the Cognitive Actions platform to authenticate your requests.
  • Basic Setup: Familiarity with making HTTP requests in your preferred programming language.

Authentication is typically handled by passing the API key in the request headers, allowing you to securely interact with the Cognitive Actions API.
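
As a minimal sketch of header-based authentication, the helper below builds the request headers from an API key. It assumes a Bearer-token scheme (the same one used in the conceptual example later in this post) and a hypothetical environment variable named COGNITIVE_ACTIONS_API_KEY; check the platform documentation for the actual header format.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers carrying the API key as a Bearer token (assumed scheme)."""
    # Fall back to a hypothetical environment variable so the key stays out of source code.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment rather than hard-coding it keeps it out of version control.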

Cognitive Actions Overview

Generate Image with Prompt-Free Diffusion

This action allows you to generate images based solely on visual inputs, utilizing the prompt-free diffusion capabilities of SeeCoder. It supports multiple diffusion models and various control methods, making it a versatile tool for image generation tasks.

Input

The action requires the following fields in its input schema:

  • control (string, required): The URI for the control image that guides the processing.
  • image (string, required): The URI of the input image to be transformed.
  • seed (integer, optional): Random seed for generation. Omit the field to use a system-generated seed.
  • outputWidth (integer, optional): Specifies the output image's width (default: 512, max: 1536).
  • outputHeight (integer, optional): Specifies the output image's height (default: 512, max: 1536).
  • guidanceScale (number, optional): Scale for classifier-free guidance (default: 2, valid range: 0-10).
  • controlNetwork (string, optional): The type of ControlNet to use (default: "canny").
  • diffusionModel (string, optional): The chosen diffusion model for image generation (default: "Deliberate-v2.0").
  • contextualEncoder (string, optional): The context encoder to influence results (default: "SeeCoder").
  • preprocessingMethod (string, optional): Method for managing input data (default: "canny").
  • numberOfInferenceSteps (integer, optional): Number of denoising steps (default: 50, valid range: 1-500).
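
Checking a payload against these constraints before sending it can save a round trip. The sketch below is a hypothetical client-side helper, not part of the Cognitive Actions platform: it only checks the required fields and the numeric limits stated in the list above, and the server remains the authority on what is valid.

```python
REQUIRED_FIELDS = ("control", "image")

# (min, max) bounds taken from the field list above; a None bound means
# the list does not state one.
NUMERIC_BOUNDS = {
    "outputWidth": (None, 1536),
    "outputHeight": (None, 1536),
    "guidanceScale": (0, 10),
    "numberOfInferenceSteps": (1, 500),
}


def validate_input(payload):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not payload.get(field):
            problems.append(f"missing required field: {field}")
    for field, (low, high) in NUMERIC_BOUNDS.items():
        if field not in payload:
            continue  # optional field omitted, so the default applies
        value = payload[field]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{field} must be a number")
            continue
        if low is not None and value < low:
            problems.append(f"{field} must be >= {low}")
        if high is not None and value > high:
            problems.append(f"{field} must be <= {high}")
    return problems
```

Calling validate_input(payload) before the request and aborting when the returned list is non-empty gives clearer error messages than a generic 4xx response.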

Example Input:

{
  "image": "https://replicate.delivery/pbxt/IvpLPCeH4QTQomgDkJy4NHla7zk2lSz4Tdv6f9x6vywCsMTs/astronautridinghouse-input.jpg",
  "control": "https://replicate.delivery/pbxt/IvpLP71kyA7Zz7BiLnjoskVpbCHnaflLHVMa6DNQWwvacF9u/astronautridinghouse-canny.png",
  "outputWidth": 768,
  "outputHeight": 512,
  "guidanceScale": 2,
  "controlNetwork": "canny",
  "diffusionModel": "Deliberate-v2.0",
  "contextualEncoder": "SeeCoder",
  "preprocessingMethod": "canny",
  "numberOfInferenceSteps": 50
}

Output

The action returns a URI pointing to the generated image. It typically follows this structure:

Example Output:

https://assets.cognitiveactions.com/invocations/14227584-fd25-4613-9605-739b216d5eb6/db373abe-03df-406f-8211-d00388ed7ad5.png
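
Since the output is a URI rather than image bytes, you will usually want to download the file. The following sketch uses only the Python standard library and assumes the URI can be fetched without extra authentication, which the source does not confirm; the function names are illustrative.

```python
import os
import urllib.request
from urllib.parse import urlparse


def filename_from_uri(uri, default="output.png"):
    """Derive a local file name from the last path segment of the URI."""
    name = os.path.basename(urlparse(uri).path)
    return name or default


def download_image(uri, directory="."):
    """Fetch the image at `uri` and write it into `directory`; return the local path."""
    path = os.path.join(directory, filename_from_uri(uri))
    # Assumes the URI is publicly readable; add headers here if it is not.
    with urllib.request.urlopen(uri, timeout=60) as response, open(path, "wb") as fh:
        fh.write(response.read())
    return path
```

For the example output above, filename_from_uri yields db373abe-03df-406f-8211-d00388ed7ad5.png.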

Conceptual Usage Example (Python)

Here's how you can invoke the Generate Image with Prompt-Free Diffusion action using a conceptual Python code snippet:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "2a11a7bd-1871-45d4-b2cd-12a0dae9900f"  # Action ID for Generate Image with Prompt-Free Diffusion

# Construct the input payload based on the action's requirements
payload = {
    "image": "https://replicate.delivery/pbxt/IvpLPCeH4QTQomgDkJy4NHla7zk2lSz4Tdv6f9x6vywCsMTs/astronautridinghouse-input.jpg",
    "control": "https://replicate.delivery/pbxt/IvpLP71kyA7Zz7BiLnjoskVpbCHnaflLHVMa6DNQWwvacF9u/astronautridinghouse-canny.png",
    "outputWidth": 768,
    "outputHeight": 512,
    "guidanceScale": 2,
    "controlNetwork": "canny",
    "diffusionModel": "Deliberate-v2.0",
    "contextualEncoder": "SeeCoder",
    "preprocessingMethod": "canny",
    "numberOfInferenceSteps": 50
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=300  # Avoid hanging indefinitely; adjust for long-running generations
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # JSON decode errors (json or simplejson) subclass ValueError
            print(f"Response body: {e.response.text}")

In this code snippet, replace the value of COGNITIVE_ACTIONS_API_KEY with your actual API key. The action_id is set to the ID for generating images using prompt-free diffusion. The input payload follows the action's input schema, and the JSON response is printed as-is; because the endpoint and request structure are hypothetical, where the generated image URL appears in that response depends on the actual API.
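
Because the exact response structure is not specified here, one defensive option is to search the decoded JSON for strings that look like image URLs. This is a sketch under that assumption, with a hypothetical helper name; once you know the real response schema, read the documented field directly instead.

```python
def find_image_urls(result):
    """Recursively collect strings in a decoded JSON value that look like image URLs."""
    image_suffixes = (".png", ".jpg", ".jpeg", ".webp")
    found = []
    if isinstance(result, str):
        # Simple heuristic: an http(s) URL ending in a common image extension.
        if result.startswith(("http://", "https://")) and result.lower().endswith(image_suffixes):
            found.append(result)
    elif isinstance(result, dict):
        for value in result.values():
            found.extend(find_image_urls(value))
    elif isinstance(result, list):
        for item in result:
            found.extend(find_image_urls(item))
    return found
```

Note that the heuristic misses URLs carrying a query string after the file extension, so treat it as a starting point only.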

Conclusion

The Prompt-Free Diffusion Cognitive Action generates images from visual inputs alone, with no text prompt required. Because it supports multiple diffusion models and control methods, you can tailor the output by varying parameters such as guidanceScale, controlNetwork, and numberOfInferenceSteps. Experiment with different input parameters to find what suits your use case.