Generate Stunning Images with the xarty8932/dream Cognitive Actions

23 Apr 2025

Generating images from text prompts has become one of the most practical applications of generative AI. The xarty8932/dream spec provides a Cognitive Action that lets developers create images from text, with support for img2img and inpainting. With its options for refinement, guidance, and scheduling, this pre-built action simplifies integrating image generation into your applications.

Prerequisites

Before you start using the Cognitive Actions from the xarty8932/dream spec, ensure you have the following:

  • An API key for accessing the Cognitive Actions platform.
  • Basic understanding of how to send HTTP requests, particularly POST requests with JSON payloads.

Authentication typically involves passing your API key in the headers of your request to ensure secure access to the API.
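As a minimal sketch of the header-based authentication described above: the helper below builds the request headers from an API key. The function name build_headers and the environment variable COGNITIVE_ACTIONS_API_KEY are assumptions made for illustration, and the Bearer scheme simply mirrors the conceptual example later in this article.

```python
import os


def build_headers(api_key=None):
    """Hypothetical helper: return headers that carry the API key as a Bearer token."""
    # Assumption: the key may be stored in an environment variable with this name.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key given and COGNITIVE_ACTIONS_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }


# Placeholder key for illustration only.
headers = build_headers("YOUR_COGNITIVE_ACTIONS_API_KEY")
```

Reading the key from the environment keeps it out of source control; passing it explicitly is handy in tests.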

Cognitive Actions Overview

Generate Image with Inpainting

This action generates images based on a text prompt, supporting both img2img and inpainting modes. It utilizes various refinement methods for enhanced image quality and allows you to control guidance and scheduling algorithms, as well as optional watermarking.

Input:

The Generate Image with Inpainting action requires a structured input payload. Here’s the schema breakdown:

  • mask (string, optional): URI of the input mask to use in inpaint mode.
  • seed (integer, optional): Random seed for generating consistent outputs.
  • image (string, optional): URI of the input image for img2img or inpaint modes.
  • width (integer, default: 1024): Width of the output image in pixels.
  • height (integer, default: 1024): Height of the output image in pixels.
  • prompt (string, required): A textual description of the desired output.
  • loraScale (number, default: 0.6): Scale factor for LoRA on trained models.
  • loraWeights (string, optional): Specify the LoRA weights to use.
  • outputCount (integer, default: 1): Number of images to produce (1-4).
  • refineStyle (string, default: "no_refiner"): Style for refining output images.
  • guidanceScale (number, default: 7.5): Strength of classifier-free guidance.
  • applyWatermark (boolean, default: true): Whether to apply a watermark.
  • negativePrompt (string, optional): Description of what should not be included in the output.
  • promptStrength (number, default: 0.8): Influence of the prompt on img2img or inpaint modes.
  • refineStepCount (integer, optional): Number of refinement steps for 'base_image_refiner'.
  • schedulingMethod (string, default: "K_EULER"): Scheduling method for model inference.
  • highNoiseFraction (number, default: 0.8): Fraction of noise added during refinement.
  • inferenceStepCount (integer, default: 50): Number of steps for denoising during inference.
  • disableSafetyChecker (boolean, default: false): Disable the safety checker for generated images.
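The schema above can be turned into a small client-side helper. The sketch below copies the documented defaults, enforces the two constraints the list states (prompt is required, outputCount is 1-4), and rejects field names that are not in the schema. The names DEFAULTS, OPTIONAL_FIELDS, and build_payload are hypothetical, and the platform itself may validate differently.

```python
# Defaults copied from the schema list above.
DEFAULTS = {
    "width": 1024,
    "height": 1024,
    "loraScale": 0.6,
    "outputCount": 1,
    "refineStyle": "no_refiner",
    "guidanceScale": 7.5,
    "applyWatermark": True,
    "promptStrength": 0.8,
    "schedulingMethod": "K_EULER",
    "highNoiseFraction": 0.8,
    "inferenceStepCount": 50,
    "disableSafetyChecker": False,
}

# Optional fields that the schema lists without a default.
OPTIONAL_FIELDS = {
    "mask", "seed", "image", "loraWeights", "negativePrompt", "refineStepCount",
}


def build_payload(prompt, **overrides):
    """Hypothetical helper: merge overrides into the defaults and run basic checks."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")
    payload = {**DEFAULTS, **overrides, "prompt": prompt}
    if not 1 <= payload["outputCount"] <= 4:
        raise ValueError("outputCount must be between 1 and 4")
    return payload


# A fixed seed is passed so that repeated runs can produce consistent outputs.
payload = build_payload("a windswept desert landscape", seed=42, applyWatermark=False)
```

Catching a misspelled field name locally is cheaper than waiting for a failed or silently ignored request.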

Example Input:

{
  "width": 1024,
  "height": 1024,
  "prompt": "realistic Prompt: \"A detailed portrait of a middle-aged man with a rugged face, wearing a leather jacket. He has piercing blue eyes and a determined expression. The background is a windswept desert landscape.\" - @Gary Makinson (fast)\n",
  "loraScale": 0.6,
  "outputCount": 1,
  "refineStyle": "no_refiner",
  "guidanceScale": 7.5,
  "applyWatermark": false,
  "negativePrompt": "",
  "promptStrength": 0.8,
  "schedulingMethod": "K_EULER",
  "highNoiseFraction": 0.8,
  "inferenceStepCount": 50
}
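The example input above is a plain text-to-image request. Going by the schema descriptions, supplying image alone would select img2img and supplying image together with mask would select inpainting. The sketch below encodes that reading; infer_mode is a hypothetical helper, the example.com URIs are placeholders, and how the action actually picks a mode is an assumption rather than documented behavior.

```python
def infer_mode(payload):
    """Hypothetical helper: guess which generation mode a payload selects."""
    has_image = bool(payload.get("image"))
    has_mask = bool(payload.get("mask"))
    # Assumption: a mask only makes sense together with an input image.
    if has_mask and not has_image:
        raise ValueError("mask was given without image")
    if has_mask:
        return "inpaint"
    if has_image:
        return "img2img"
    return "text2img"


inpaint_payload = {
    "prompt": "A rugged man in a leather jacket, windswept desert background",
    "image": "https://example.com/portrait.png",      # placeholder URI
    "mask": "https://example.com/portrait-mask.png",  # placeholder URI
    "promptStrength": 0.8,
}

print(infer_mode(inpaint_payload))  # prints "inpaint"
```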

Output:

Upon successful execution, the action returns a list of URLs, one for each generated image (up to four, depending on outputCount). Here’s an example of what the output might look like:

[
  "https://assets.cognitiveactions.com/invocations/6a2192c2-de01-4a18-a219-fc123dea6d97/dddf3b43-2612-429e-8c5b-e27fcbc4b2d5.png"
]
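Since the output is a list of URLs, a common follow-up is saving the images locally. The sketch below uses only the Python standard library; local_filename and download_images are hypothetical helper names, and it assumes the URLs can be fetched without extra authentication, which this article does not confirm.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def local_filename(url):
    """Return the last path segment of the URL to use as a local file name."""
    name = PurePosixPath(urlparse(url).path).name
    if not name:
        raise ValueError(f"URL has no file name component: {url}")
    return name


def download_images(urls, out_dir="outputs"):
    """Download every URL into out_dir and return the list of saved paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        dest = target / local_filename(url)
        # Assumption: the asset URL is readable without extra headers.
        with urlopen(url, timeout=30) as resp:
            dest.write_bytes(resp.read())
        saved.append(dest)
    return saved


# Usage: pass the list returned by the action, for example
# saved_paths = download_images(output_urls)
```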

Conceptual Usage Example (Python):

Here’s a conceptual example of how you might call the Generate Image with Inpainting action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "2dca5cf3-e8e3-4027-b2ea-57f99017e88b"  # Action ID for Generate Image with Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "width": 1024,
    "height": 1024,
    "prompt": "realistic Prompt: \"A detailed portrait of a middle-aged man with a rugged face, wearing a leather jacket. He has piercing blue eyes and a determined expression. The background is a windswept desert landscape.\" - @Gary Makinson (fast)",
    "loraScale": 0.6,
    "outputCount": 1,
    "refineStyle": "no_refiner",
    "guidanceScale": 7.5,
    "applyWatermark": False,
    "negativePrompt": "",
    "promptStrength": 0.8,
    "schedulingMethod": "K_EULER",
    "highNoiseFraction": 0.8,
    "inferenceStepCount": 50
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Fail instead of hanging forever if the server never responds
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; ValueError covers both json and simplejson decode errors
            print(f"Response body: {e.response.text}")

In this code snippet:

  • Replace the placeholder API key and the hypothetical endpoint with your actual values.
  • The payload is structured according to the required inputs for the action.
  • The response handling raises on HTTP error statuses and prints the full JSON response, which should include the generated image URL(s).

Conclusion

The xarty8932/dream Cognitive Action lets developers add text-to-image, img2img, and inpainting generation to their applications through one pre-built action. Its options for refinement, guidance, and scheduling give you control over output quality. As a next step, experiment with different prompts and configurations to see how each parameter affects the result. Happy coding!