Harness the Power of Image Generation with chuanzi/su7xiaomi Cognitive Actions

24 Apr 2025
In the evolving world of artificial intelligence, the ability to generate compelling images through user-defined parameters is a game-changer. The chuanzi/su7xiaomi Cognitive Actions provide a powerful API that allows developers to create images using advanced techniques such as LoRA testing. This integration empowers developers to customize and refine image generation, making it a valuable asset for applications in various fields, from gaming to marketing.

Prerequisites

Before you dive into using the Cognitive Actions, ensure you have the following:

  • API Key: You will need an API key from the Cognitive Actions platform to authenticate your requests.
  • Basic Understanding of API Calls: Familiarity with making HTTP requests will be beneficial.

Authentication typically involves sending your API key in the headers of your requests, allowing you to access the Cognitive Actions functionalities securely.
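As a minimal sketch of that pattern (the Bearer scheme shown here is an assumption; consult the platform's documentation for the exact header format), building authenticated request headers might look like:

```python
# Build the headers sent with every request to the Cognitive Actions API.
# NOTE: the "Bearer" scheme is an assumption for illustration; the platform
# may use a different header name or format.
def build_auth_headers(api_key: str) -> dict:
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = build_auth_headers("YOUR_COGNITIVE_ACTIONS_API_KEY")
```

You would then pass these headers with each HTTP request, as shown in the full example later in this post.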

Cognitive Actions Overview

Generate Image with LoRA Testing

Description: This action generates images using Xiaomi SU7 LoRA testing, offering precise control over various image features, including masks, dimensions, and refinement styles. It allows for advanced customization through scheduler types and guidance scales.

Category: Image Generation

Input

The input for this action is structured as a JSON object, which includes the following fields:

  • mask (string, optional): URI of the input mask for inpainting mode.
  • seed (integer, optional): Random seed for reproducibility.
  • image (string, optional): URI of the input image for image-to-image translation or inpainting.
  • width (integer, default: 1024): Width of the output image in pixels.
  • height (integer, default: 1024): Height of the output image in pixels.
  • prompt (string, default: "An astronaut riding a rainbow unicorn"): Descriptive text for the desired output.
  • loraScale (number, default: 0.6): Scale for the LoRA influence.
  • numOutputs (integer, default: 1): Number of output images to generate (1 to 4).
  • refineSteps (integer, optional): Number of refinement steps.
  • refineStyle (string, default: "no_refiner"): Method of refinement.
  • modelWeights (string, optional): LoRA model weights to use.
  • guidanceScale (number, default: 7.5): Scale factor for classifier-free guidance.
  • highNoiseFrac (number, default: 0.8): Fraction of noise for refinement.
  • applyWatermark (boolean, default: true): Whether to add a watermark.
  • negativePrompt (string, optional): Constraints on the generated image.
  • promptStrength (number, default: 0.8): Strength of the prompt in image-to-image translations.
  • schedulingMethod (string, default: "K_EULER"): Scheduling algorithm used.
  • numInferenceSteps (integer, default: 50): Total denoising steps.
  • disableSafetyChecker (boolean, default: false): Disables the safety checker.
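Because most fields have documented defaults, a small client-side helper can merge your overrides onto those defaults and catch out-of-range values before the request is sent. The helper below is a hypothetical sketch based on the field list above (it is not part of the Cognitive Actions SDK):

```python
# Documented defaults from the input schema above.
DEFAULTS = {
    "width": 1024,
    "height": 1024,
    "prompt": "An astronaut riding a rainbow unicorn",
    "loraScale": 0.6,
    "numOutputs": 1,
    "refineStyle": "no_refiner",
    "guidanceScale": 7.5,
    "highNoiseFrac": 0.8,
    "applyWatermark": True,
    "promptStrength": 0.8,
    "schedulingMethod": "K_EULER",
    "numInferenceSteps": 50,
    "disableSafetyChecker": False,
}

def build_payload(**overrides) -> dict:
    """Merge user overrides onto the defaults and validate ranges."""
    payload = {**DEFAULTS, **overrides}
    # The schema allows 1 to 4 output images per request.
    if not 1 <= payload["numOutputs"] <= 4:
        raise ValueError("numOutputs must be between 1 and 4")
    return payload

payload = build_payload(prompt="a purple su7 car", loraScale=0.7)
```

Keeping this validation client-side gives faster feedback than waiting for the API to reject a malformed request.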

Example Input:

{
  "width": 1024,
  "height": 1024,
  "prompt": "a purple su7 car in new york city street lot from above, wide shot, solo",
  "loraScale": 0.7,
  "numOutputs": 1,
  "refineStyle": "no_refiner",
  "guidanceScale": 7.5,
  "highNoiseFrac": 0.8,
  "applyWatermark": true,
  "negativePrompt": "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry",
  "promptStrength": 0.8,
  "schedulingMethod": "K_EULER",
  "numInferenceSteps": 30
}

Output

The output is a JSON array of URLs, one per generated image. Here's an example of what you might receive:

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/c2c033a7-dc28-467b-ae2f-50f146b2746c/adcfb337-b34a-4d57-9da7-9eac49aa5c35.png"
]
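Since the result is a list of URLs, a common follow-up step is to download the images to disk. The helper below is a sketch using only the standard library; the filenames and output directory are illustrative choices, not part of the API:

```python
import urllib.request
from pathlib import Path

def save_images(urls, out_dir="outputs"):
    """Download each generated image URL into out_dir; return the saved paths."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    saved = []
    for i, url in enumerate(urls):
        path = out / f"image_{i}.png"
        # Fetch the image bytes and write them to disk.
        with urllib.request.urlopen(url, timeout=60) as resp:
            path.write_bytes(resp.read())
        saved.append(str(path))
    return saved
```

You would pass the array from the action's response directly, e.g. `save_images(result)`.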

Conceptual Usage Example (Python)

Here’s how you might call this action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "67053089-0fcb-4cd2-b95a-efe651049f04"  # Action ID for Generate Image with LoRA Testing

# Construct the input payload based on the action's requirements
payload = {
    "width": 1024,
    "height": 1024,
    "prompt": "a purple su7 car in new york city street lot from above, wide shot, solo",
    "loraScale": 0.7,
    "numOutputs": 1,
    "refineStyle": "no_refiner",
    "guidanceScale": 7.5,
    "highNoiseFrac": 0.8,
    "applyWatermark": True,  # Python boolean, not JSON's lowercase "true"
    "negativePrompt": "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry",
    "promptStrength": 0.8,
    "schedulingMethod": "K_EULER",
    "numInferenceSteps": 30
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except json.JSONDecodeError:
            print(f"Response body: {e.response.text}")

In this code, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The payload is structured according to the action's input schema, ensuring that you provide the necessary parameters for successful image generation.

Conclusion

The chuanzi/su7xiaomi Cognitive Actions provide developers with an advanced toolkit for image generation, allowing for extensive customization and control. By leveraging these actions, you can enhance your applications with unique and tailored visuals, paving the way for creative solutions across various industries. Start experimenting with these capabilities today and unlock the potential of automated image generation in your projects!