Generate Stunning Art with Stability AI's Cognitive Actions for Image Generation

22 Apr 2025

Stability AI's Stable Diffusion 3.5 Large model generates high-resolution artistic images from text prompts. Cognitive Actions expose this model as a pre-built action that developers can call from their own applications. This post walks through what the action does, its input and output structures, and how to invoke it in your projects.

Prerequisites

Before you start using Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform, which will be required for authentication when making requests.
  • Basic knowledge of JSON payloads, as you'll need to structure your requests accordingly.

Authentication typically involves sending your API key in the request headers.
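As a minimal sketch of that step, the helper below builds the request headers. It assumes a Bearer token scheme (the same one used in the conceptual example later in this post) and reads the key from an environment variable whose name, COGNITIVE_ACTIONS_API_KEY, is chosen here for illustration only. Check the platform's documentation for the actual header format.

```python
import os


def build_headers(api_key=None):
    """Return request headers carrying the API key as a Bearer token.

    The Bearer scheme and the environment variable name are assumptions
    for illustration, not confirmed details of the platform.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which is usually preferable to hard-coding it.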

Cognitive Actions Overview

Generate High-Resolution Artistic Images

Description:
This action creates detailed, high-resolution images in a range of artistic styles from a text prompt. The underlying model uses Query-Key Normalization, which is intended to improve image quality and diversity.

Category: image-generation

Input Schema: The action accepts the following parameters:

  • prompt (string): The text prompt guiding the image generation. Example: ~*~aesthetic~*~ #boho #fashion, full-body 30-something woman laying on microfloral grass, candid pose, overlay reads Stable Diffusion 3.5, cheerful cursive typography font.
  • aspectRatio (string): Specifies the aspect ratio of the output image (default is 1:1). Examples include 16:9, 3:2, etc.
  • outputFormat (string): The format of the output image (default is webp). Other options include jpg and png.
  • guidanceScale (number): Degree of adherence to the text prompt (default is 3.5, max is 20).
  • numberOfSteps (integer): The number of steps for the image sampler (default is 35, max is 50).
  • outputQuality (integer): Defines output image quality (default is 90, range: 0-100).
  • promptStrength (number): Controls the denoising level in image-to-image mode (default is 0.85).
  • randomSeed (integer, optional): Seed for reproducibility of image generation.
  • image (string, optional): Input image for image-to-image mode.
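Because several of these parameters have documented limits, it can help to check a payload locally before sending it. The function below is a sketch that enforces only the limits stated in the list above (guidanceScale up to 20, numberOfSteps up to 50, outputQuality between 0 and 100). The function name validate_inputs is hypothetical, and the platform may apply additional rules that are not covered here.

```python
def validate_inputs(inputs):
    """Return a list of problems found in an input payload (empty if none).

    Only the limits stated in the input schema above are checked;
    defaults mirror the documented ones.
    """
    problems = []

    prompt = inputs.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt must be a non-empty string")

    scale = inputs.get("guidanceScale", 3.5)
    if not isinstance(scale, (int, float)) or scale > 20:
        problems.append("guidanceScale must be a number no greater than 20")

    steps = inputs.get("numberOfSteps", 35)
    if not isinstance(steps, int) or steps > 50:
        problems.append("numberOfSteps must be an integer no greater than 50")

    quality = inputs.get("outputQuality", 90)
    if not isinstance(quality, int) or not 0 <= quality <= 100:
        problems.append("outputQuality must be an integer from 0 to 100")

    return problems
```

Returning a list of messages, rather than raising on the first failure, lets a caller report every issue at once.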

Example Input:

{
  "prompt": "~*~aesthetic~*~ #boho #fashion, full-body 30-something woman laying on microfloral grass, candid pose, overlay reads Stable Diffusion 3.5, cheerful cursive typography font",
  "aspectRatio": "1:1",
  "outputFormat": "webp",
  "guidanceScale": 4.5,
  "numberOfSteps": 40,
  "outputQuality": 90,
  "promptStrength": 0.85
}

Output: The action typically returns a JSON array containing the URL of the generated image.

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/61aaafdd-19d3-4859-a690-722468871541/9fe9593a-6b4a-4d9b-8d3c-07560d7e26bd.webp"
]
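To work with this output, a caller usually needs to pull the URL out of the array and decide on a local file name. The sketch below assumes the result has the list-of-URLs shape shown in the example; the helper names first_image_url and filename_from_url are hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def first_image_url(result):
    """Return the first URL from a list-shaped action result."""
    if not isinstance(result, list) or not result:
        raise ValueError("expected a non-empty list of URLs")
    url = result[0]
    if not isinstance(url, str) or not url.startswith("https://"):
        raise ValueError("expected an https URL")
    return url


def filename_from_url(url):
    """Derive a local file name from the last segment of the URL path."""
    return posixpath.basename(urlparse(url).path)


example_output = [
    "https://assets.cognitiveactions.com/invocations/61aaafdd-19d3-4859-a690-722468871541/9fe9593a-6b4a-4d9b-8d3c-07560d7e26bd.webp"
]
url = first_image_url(example_output)
print(filename_from_url(url))
```

From here, the image itself could be fetched with an ordinary HTTP GET on the URL and written to disk under that name.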

Conceptual Usage Example (Python): The following sketch shows how this action might be invoked from Python. The endpoint URL and the request structure are hypothetical.

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "841e1aac-bd05-4fb1-bd50-c9f22ad2fbde" # Action ID for Generate High-Resolution Artistic Images

# Construct the input payload based on the action's requirements
payload = {
    "prompt": "~*~aesthetic~*~ #boho #fashion, full-body 30-something woman laying on microfloral grass, candid pose, overlay reads Stable Diffusion 3.5, cheerful cursive typography font",
    "aspectRatio": "1:1",
    "outputFormat": "webp",
    "guidanceScale": 4.5,
    "numberOfSteps": 40,
    "outputQuality": 90,
    "promptStrength": 0.85
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=60 # Avoid waiting indefinitely if the server does not respond
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body is not valid JSON (requests' JSON errors subclass ValueError)
            print(f"Response body: {e.response.text}")

Explanation of the Python Code:

  • The API key and endpoint URL need to be replaced with actual values.
  • The action ID identifies which Cognitive Action to execute.
  • The payload variable is structured according to the input schema required by the action.
  • The code sends a POST request to the hypothetical execution endpoint and prints the results.
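The same steps can be wrapped in a reusable function. This is a sketch under the same assumptions as the example above: the endpoint URL and the {"action_id", "inputs"} body are hypothetical, and the function name execute_action is made up for illustration. The post parameter lets a test substitute a stand-in for requests.post so the logic can be checked without network access.

```python
# Hypothetical endpoint, copied from the conceptual example in this post.
EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"


def execute_action(action_id, inputs, api_key, post=None, timeout=60):
    """POST an action request and return the decoded JSON result.

    `post` defaults to requests.post; pass a stand-in callable in tests.
    """
    if post is None:
        import requests  # imported lazily so tests need no network library
        post = requests.post

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {"action_id": action_id, "inputs": inputs}  # hypothetical structure

    response = post(EXECUTE_URL, headers=headers, json=body, timeout=timeout)
    response.raise_for_status()  # raise on 4xx or 5xx responses
    return response.json()
```

A call such as execute_action(action_id, payload, COGNITIVE_ACTIONS_API_KEY) would then replace the inline request in the example, leaving error handling to the caller.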

Conclusion

Integrating the Stable Diffusion 3.5 Large Cognitive Actions into your applications provides a robust way to generate stunning artistic images based on user-defined prompts. By leveraging these pre-built actions, developers can save time and effort while enhancing their applications with advanced image generation capabilities. Explore the various parameters and unleash your creativity with AI-generated art!