Generate Stunning Images with Cognitive Actions from mikraz/mik79

22 Apr 2025

In the world of digital creativity, the ability to generate high-quality images programmatically is a game changer. The Cognitive Actions provided by the mikraz/mik79 spec allow developers to generate images, with optional inpainting, through pre-built actions. These actions simplify the process of creating images tailored to specific requirements, offering control over dimensions, output format, and level of detail.

Prerequisites

Before diving into the integration of Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Basic understanding of how to make HTTP requests with JSON payloads.

Conceptually, you authenticate by passing your API key in the headers of each request to the Cognitive Actions services.
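A minimal sketch of that header construction follows. It assumes Bearer-token authentication, which is the scheme used in the conceptual Python example later in this post. The helper name build_auth_headers and the environment variable COGNITIVE_ACTIONS_API_KEY are illustrative, not part of the platform; check the platform documentation for the actual scheme.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers carrying the API key.

    Assumption: the platform accepts a Bearer token in the
    Authorization header.
    """
    # Hypothetical environment variable name; adjust to your setup.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which is usually preferable to hard-coding it.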

Cognitive Actions Overview

Generate Image with Inpainting

The Generate Image with Inpainting action lets developers generate images from a text prompt and, optionally, inpaint an existing image using a mask. You can specify custom dimensions, output formats, and guidance levels, which makes the action versatile across applications.

Input

The input schema accepts the following fields; only prompt is required:

  • prompt (required): A descriptive text prompt guiding the image generation.
  • mask: URI of the image mask for inpainting, which overrides width and height if provided.
  • seed: Optional integer for reproducible image generation.
  • image: Optional URI of an input image for inpainting.
  • model: Selects the inference model (default is "dev").
  • width: Width of the image (only applicable if aspect ratio is "custom").
  • height: Height of the image (only applicable if aspect ratio is "custom").
  • guidanceScale: A number indicating the guidance strength (default is 3).
  • fastModeEnabled: Boolean to enable faster predictions.
  • imageAspectRatio: Defines the aspect ratio of the generated image.
  • outputImageCount: Number of images to generate (between 1 and 4).
  • imageOutputFormat: Specifies the output format (webp, jpg, or png).
  • imageOutputQuality: Quality setting for the output image.
  • inferenceStepCount: Number of steps in the image generation process.
  • imagePromptStrength: Strength of the prompt application in img2img mode.
  • loraApplicationScale and additionalLoraApplicationScale: Control the application of LoRA weights.
  • Other optional parameters include safety checker settings and loading alternate weights.
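Some of these constraints can be checked on the client before sending a request. The sketch below is a hypothetical helper (validate_inputs is not part of the platform) that encodes only the rules stated in the list above: prompt is required, outputImageCount is between 1 and 4, imageOutputFormat is one of webp, jpg, or png, and width and height apply only when the aspect ratio is "custom". The service may enforce further rules that are not covered here.

```python
ALLOWED_FORMATS = {"webp", "jpg", "png"}


def validate_inputs(inputs):
    """Return a list of problems found in an input payload (empty if none)."""
    problems = []

    # prompt is the only required field.
    prompt = inputs.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required")

    # outputImageCount must be an integer between 1 and 4.
    count = inputs.get("outputImageCount")
    if count is not None:
        is_int = isinstance(count, int) and not isinstance(count, bool)
        if not (is_int and 1 <= count <= 4):
            problems.append("outputImageCount must be an integer between 1 and 4")

    # imageOutputFormat must be one of the listed formats.
    fmt = inputs.get("imageOutputFormat")
    if fmt is not None and fmt not in ALLOWED_FORMATS:
        problems.append("imageOutputFormat must be webp, jpg, or png")

    # width/height only apply with a "custom" aspect ratio.
    has_size = "width" in inputs or "height" in inputs
    if has_size and inputs.get("imageAspectRatio") != "custom":
        problems.append("width/height only apply when imageAspectRatio is 'custom'")

    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.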

Here's an example of the input JSON payload:

{
  "model": "dev",
  "prompt": "\"A dramatic, cinematic shot of mikraz as Batman standing on a gargoyle high atop a Gotham City skyscraper...\"",
  "guidanceScale": 3.5,
  "imageAspectRatio": "16:9",
  "outputImageCount": 4,
  "imageOutputFormat": "png",
  "imageOutputQuality": 100,
  "inferenceStepCount": 28,
  "imagePromptStrength": 0.8,
  "loraApplicationScale": 1,
  "additionalLoraApplicationScale": 1
}
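The payload above is a text-to-image request without an input image. For inpainting, the schema's image and mask fields would be supplied as well. The sketch below uses a hypothetical helper name and placeholder example.com URIs; how the service interprets the mask is not described in this post, so consult the platform documentation before relying on it.

```python
def build_inpainting_inputs(prompt, image_uri, mask_uri, prompt_strength=0.8):
    """Assemble an inputs dict for an inpainting request.

    width and height are omitted because, per the schema, a provided
    mask overrides them.
    """
    return {
        "prompt": prompt,
        "image": image_uri,
        "mask": mask_uri,
        "imagePromptStrength": prompt_strength,
    }


# Placeholder URIs for illustration only.
inputs = build_inpainting_inputs(
    "a red scarf around the subject's neck",
    "https://example.com/source.png",
    "https://example.com/mask.png",
)
```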

Output

Upon successful execution, the action returns a list of URLs pointing to the generated images. For example:

[
  "https://assets.cognitiveactions.com/invocations/aa3f1f53-1743-42f9-8d2d-b5af00ea67ea/479f015f-afed-4463-b082-1fc11801ba9c.png",
  "https://assets.cognitiveactions.com/invocations/aa3f1f53-1743-42f9-8d2d-b5af00ea67ea/04b03df0-d6a3-4aad-b3f9-d3b7ca767e2c.png",
  "https://assets.cognitiveactions.com/invocations/aa3f1f53-1743-42f9-8d2d-b5af00ea67ea/72bf72f2-b5b5-4142-9e32-6c2a3649d779.png",
  "https://assets.cognitiveactions.com/invocations/aa3f1f53-1743-42f9-8d2d-b5af00ea67ea/889bbba8-127d-492a-9292-4e07e12a72b4.png"
]
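Since the result is a list of URLs, a common next step is saving the images locally. The sketch below derives a local filename from each URL and downloads it using only the standard library. It assumes the URLs can be fetched without additional authentication, which this post does not confirm; the function names and the "outputs" directory are illustrative.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def local_filename(url):
    """Return the last path segment of a URL, e.g. '<uuid>.png'."""
    return os.path.basename(urlparse(url).path)


def plan_downloads(urls, directory="outputs"):
    """Pair each URL with the local path it would be saved to."""
    return [(url, os.path.join(directory, local_filename(url))) for url in urls]


def download_all(urls, directory="outputs"):
    """Download every URL into the given directory and return the saved paths."""
    os.makedirs(directory, exist_ok=True)
    saved = []
    for url, path in plan_downloads(urls, directory):
        # Assumption: the asset URL is readable without extra headers.
        with urlopen(url, timeout=30) as resp, open(path, "wb") as fh:
            fh.write(resp.read())
        saved.append(path)
    return saved
```

Separating plan_downloads from download_all makes it easy to preview the target paths before any network access happens.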

Conceptual Usage Example (Python)

Here’s a conceptual example of how you might call this action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "0d3fb286-5a14-459e-a6cb-fcfc3bae75c6" # Action ID for Generate Image with Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "model": "dev",
    "prompt": "\"A dramatic, cinematic shot of mikraz as Batman standing on a gargoyle high atop a Gotham City skyscraper...\"",
    "guidanceScale": 3.5,
    "imageAspectRatio": "16:9",
    "outputImageCount": 4,
    "imageOutputFormat": "png",
    "imageOutputQuality": 100,
    "inferenceStepCount": 28,
    "imagePromptStrength": 0.8,
    "loraApplicationScale": 1,
    "additionalLoraApplicationScale": 1
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=120 # Avoid hanging indefinitely; tune to the expected generation time
    )
    response.raise_for_status() # Raise an exception for bad status codes

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body was not valid JSON (JSON decode errors subclass ValueError)
            print(f"Response body: {e.response.text}")

In this code snippet, replace the API key placeholder with your own key and confirm that action_id matches the Generate Image with Inpainting action. The payload follows the input schema described above. The endpoint URL and the request body structure (action_id plus inputs) are hypothetical, so check the platform documentation for the actual values.

Conclusion

The Cognitive Actions offered by the mikraz/mik79 spec give developers a straightforward way to generate compelling images. With the Generate Image with Inpainting action, you can create high-quality images and customize their dimensions, format, and guidance to your needs. Explore these actions further to enhance your applications, and consider integrating them into creative projects, game development, or content creation workflows. Happy coding!