Generate Stunning Images with ummtushar/pilot Cognitive Actions

22 Apr 2025

In this post, we explore the ummtushar/pilot Cognitive Actions, focusing on the action that generates images with the FLUX.1 model. This action lets developers create high-quality images from text prompts and, per its description, also supports inpainting. By integrating this pre-built action into your applications, you can save time and effort while still getting impressive visual output.

Prerequisites

Before you begin using the Cognitive Actions, ensure that you have the following:

  • An API key for the Cognitive Actions platform to authenticate your requests.
  • Basic knowledge of how to make HTTP requests and handle JSON data.

To authenticate your API calls, you'll typically include your API key in the request headers. This will grant you access to the Cognitive Actions services.
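
As a minimal sketch of that step, the helper below builds the request headers from an API key read out of the environment rather than hardcoded in source. The environment variable name COGNITIVE_ACTIONS_API_KEY and the Bearer scheme are assumptions made for illustration; check the platform's documentation for the header it actually expects.

```python
import os


def auth_headers(api_key=None):
    """Build request headers carrying the API key.

    Assumption: the platform accepts an "Authorization: Bearer <key>" header.
    The environment variable name below is hypothetical.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise RuntimeError(
            "No API key found: pass api_key or set COGNITIVE_ACTIONS_API_KEY"
        )
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Keeping the key in the environment avoids committing it to version control by accident.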

Cognitive Actions Overview

Perform Image Generation with FLUX.1

The Perform Image Generation with FLUX.1 action generates images with a fine-tuned FLUX.1 model (version 1.1). It supports inpainting in addition to text-to-image generation, and exposes controls for trading speed against quality, such as the model and outputQuality inputs described below.

Input

The input for this action requires a JSON object with the following schema:

  • prompt (required): A string that describes the image you want to generate.
  • model (optional): Either "dev" or "schnell". "schnell" is the faster FLUX.1 variant, while "dev" generally trades speed for quality.
  • aspectRatio (optional): Defines the aspect ratio of the generated image (e.g., "16:9").
  • outputFormat (optional): The format of the output image (e.g., "png").
  • guidanceScale (optional): A number controlling how closely the diffusion process follows the prompt; higher values adhere to the prompt more strictly (e.g., 3.5).
  • outputQuality (optional): Quality level for the output image (0 to 100).
  • numberOfOutputs (optional): Number of images to generate (default is 1).

Here's an example of a valid input payload:

{
  "model": "dev",
  "prompt": "create a big mac MCD burger that is on a beach with a sunny sky",
  "aspectRatio": "2:3",
  "outputFormat": "png",
  "guidanceScale": 3.5,
  "outputQuality": 89,
  "numberOfOutputs": 1
}
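
The constraints listed in the schema can be checked client-side before a request is sent. The sketch below is one way to do that; build_payload is a hypothetical helper, not part of the platform, and it only enforces what this post states (a required prompt, the two model names, a 0 to 100 quality range). The lower bound of 1 on numberOfOutputs is an assumption, and the server may apply further rules that are not covered here.

```python
ALLOWED_MODELS = {"dev", "schnell"}


def build_payload(prompt, model=None, aspect_ratio=None, output_format=None,
                  guidance_scale=None, output_quality=None, number_of_outputs=1):
    """Return an input payload dict, raising ValueError on obviously bad input."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")
    if model is not None and model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {sorted(ALLOWED_MODELS)}")
    if output_quality is not None and not 0 <= output_quality <= 100:
        raise ValueError("outputQuality must be between 0 and 100")
    # Assumption: at least one image must be requested.
    if not isinstance(number_of_outputs, int) or number_of_outputs < 1:
        raise ValueError("numberOfOutputs must be a positive integer")

    payload = {"prompt": prompt, "numberOfOutputs": number_of_outputs}
    optional = {
        "model": model,
        "aspectRatio": aspect_ratio,
        "outputFormat": output_format,
        "guidanceScale": guidance_scale,
        "outputQuality": output_quality,
    }
    # Only send the optional fields the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Omitting unset optional fields lets the service apply its own defaults instead of guessing them on the client.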

Output

The output of this action is typically a JSON array of URLs, one per generated image. For example, upon successful execution with a single output, you might receive a response like this:

[
  "https://assets.cognitiveactions.com/invocations/ba632033-6d05-449b-b163-ecf7619c4fb4/e0d161e8-3946-48dd-871f-0547ac5315fb.png"
]

This URL can be directly accessed to view or download the generated image.
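
To save the result locally, a short sketch like the following could be used. It assumes the output is a list of publicly reachable URLs, as in the example response, and uses only the Python standard library. The function names and the default "outputs" directory are illustrative choices, not part of the platform.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url: str) -> str:
    """Return the last path segment of the URL to use as a local file name."""
    name = PurePosixPath(urlparse(url).path).name
    if not name:
        raise ValueError(f"URL has no file name component: {url}")
    return name


def download_images(urls, dest_dir="outputs"):
    """Download each URL into dest_dir and return the list of saved paths."""
    target = Path(dest_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        path = target / filename_from_url(url)
        # Assumption: the URL needs no authentication to fetch.
        with urlopen(url, timeout=60) as resp:
            path.write_bytes(resp.read())
        saved.append(path)
    return saved
```

Calling download_images with the array returned by the action would write one file per URL, named after the last segment of each URL.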

Conceptual Usage Example (Python)

Here's a conceptual Python code snippet demonstrating how to call the Cognitive Actions execution endpoint for the image generation action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "517b6194-59b9-459d-ba97-6ea177736845"  # Action ID for Perform Image Generation with FLUX.1

# Construct the input payload based on the action's requirements
payload = {
    "model": "dev",
    "prompt": "create a big mac MCD burger that is on a beach with a sunny sky",
    "aspectRatio": "2:3",
    "outputFormat": "png",
    "guidanceScale": 3.5,
    "outputQuality": 89,
    "numberOfOutputs": 1
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=120,  # Avoid hanging indefinitely if the server never responds
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # JSON decode errors subclass ValueError, whichever JSON backend requests uses
            print(f"Response body: {e.response.text}")

In the code above, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The payload follows the input schema described earlier. The endpoint URL and the request body structure are hypothetical, so adjust them to match the platform's documentation. On success, the response contains the URL of the generated image.

Conclusion

The ummtushar/pilot Cognitive Actions offer a powerful way to integrate image generation into your applications with minimal effort. By leveraging the FLUX.1 action, developers can create visually stunning images based on simple text prompts. We encourage you to experiment with various parameters and explore the potential use cases, from content creation to enhancing user experiences in your applications. Happy coding!