Elevate Your App's Creativity with the merlintxu/matahari Cognitive Actions

24 Apr 2025

The merlintxu/matahari Cognitive Actions let you add image generation to an application through a single API call. The actions use a machine learning model to create images from your custom prompts, so developers can add visual content without an extensive background in AI or image processing. Because the actions are pre-built, you spend your time on the integration rather than on training or hosting a model.

Prerequisites

Before you can start using the Cognitive Actions from the merlintxu/matahari spec, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Basic familiarity with making HTTP requests and handling JSON data.
  • A development environment set up with access to Python and the requests library for API calls.

Authentication typically involves passing your API key in the headers of your requests, which allows you to securely access the Cognitive Actions.
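As a minimal sketch of that step, the helper below reads the key from an environment variable and builds the request headers. The Bearer scheme matches the usage example later in this article; the environment variable name and the build_headers helper are assumptions made for illustration, not part of the platform.

```python
import os


def build_headers(env_var: str = "COGNITIVE_ACTIONS_API_KEY") -> dict:
    """Build request headers from an API key stored in an environment variable.

    The variable name is hypothetical; use whatever name fits your deployment.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Authorization": f"Bearer {api_key}",  # Bearer scheme, as in the usage example
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which is generally safer than hard-coding it in a script.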

Cognitive Actions Overview

Generate Image with Prediction Model

The Generate Image with Prediction Model action enables you to create detailed images based on input prompts along with advanced features like image inpainting, customizable aspect ratios, and various output formats. This action falls under the category of image-generation.

Input

The input schema requires the following fields:

  • prompt (required): A string that specifies what the generated image should depict. For example: "a photo of MataHari dancing in a restaurant salon ".

Optional fields include:

  • mask: A URI string for an image mask used in inpainting mode.
  • seed: An integer to set a random seed for reproducibility.
  • image: A URI string for an input image for image-to-image transformations.
  • model: A string to specify the model (dev or schnell) for generation.
  • width: An integer to specify the image width (takes effect only if aspectRatio is custom).
  • height: An integer to specify the image height (takes effect only if aspectRatio is custom).
  • fastMode: A boolean to enable faster predictions.
  • aspectRatio: A string to define the image's aspect ratio.
  • imageFormat: A string to specify the output image format (webp, jpg, png).
  • imageQuality: An integer to define the quality of the output image.
  • inferenceSteps: An integer to set the number of denoising steps.
  • numberOfOutputs: An integer to specify how many images to generate.
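The field list above can be turned into a small client-side check before a request is sent. The sketch below is illustrative only: build_payload is a hypothetical helper, and its checks come solely from the descriptions above (required prompt, dev or schnell for model, webp, jpg or png for imageFormat). The service's real validation rules may differ.

```python
ALLOWED_MODELS = {"dev", "schnell"}
ALLOWED_FORMATS = {"webp", "jpg", "png"}


def build_payload(prompt, **options):
    """Assemble an input payload, checking only the constraints listed above."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")

    model = options.get("model")
    if model is not None and model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {sorted(ALLOWED_MODELS)}")

    image_format = options.get("imageFormat")
    if image_format is not None and image_format not in ALLOWED_FORMATS:
        raise ValueError(f"imageFormat must be one of {sorted(ALLOWED_FORMATS)}")

    payload = {"prompt": prompt}
    # Leave out options set to None so the service can apply its own defaults.
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload
```

For example, build_payload("a photo of MataHari", model="dev", imageFormat="webp") returns a dictionary with those three keys, while an unknown model name raises ValueError before any network call is made.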

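Here’s an example of a valid input payload. It also includes mainLoraScale, promptStrength, additionalLoraScale, and diffusionGuidanceScale, which are not described in the list above. Because aspectRatio is set to 16:9 rather than custom, the width and height values would not take effect according to the field descriptions: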

{
  "model": "dev",
  "width": 1024,
  "height": 768,
  "prompt": "a photo of MataHari dancing in a restaurant salon ",
  "aspectRatio": "16:9",
  "imageFormat": "webp",
  "imageQuality": 100,
  "mainLoraScale": 1,
  "inferenceSteps": 25,
  "promptStrength": 0.8,
  "numberOfOutputs": 4,
  "additionalLoraScale": 1,
  "diffusionGuidanceScale": 3.5
}

Output

The action returns an array of URLs pointing to the generated images. Here’s an example of the output you might receive:

[
  "https://assets.cognitiveactions.com/invocations/41917994-5330-4d0b-aac7-ef9d4a3f924e/60eabf97-ed4c-4b4c-9f14-180ca5512dbc.webp",
  "https://assets.cognitiveactions.com/invocations/41917994-5330-4d0b-aac7-ef9d4a3f924e/8daba53d-da43-479e-b598-b9eaab9331dc.webp",
  "https://assets.cognitiveactions.com/invocations/41917994-5330-4d0b-aac7-ef9d4a3f924e/dad98cb4-3e06-4103-9ea0-e79971c24a92.webp",
  "https://assets.cognitiveactions.com/invocations/41917994-5330-4d0b-aac7-ef9d4a3f924e/e6e518a9-aed5-43fc-8f49-4f147bad5702.webp"
]
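Since the output is a plain list of URLs, saving the images locally takes only a few lines. The following is a sketch using the Python standard library; filename_from_url and download_images are hypothetical helpers, and the code assumes the asset URLs can be fetched without extra authentication, which the source does not confirm.

```python
import shutil
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url: str) -> str:
    """Return the last path segment of a URL, such as '<uuid>.webp'."""
    name = PurePosixPath(urlparse(url).path).name
    if not name:
        raise ValueError(f"No file name found in URL: {url}")
    return name


def download_images(urls, dest_dir="outputs", timeout=30):
    """Download each URL into dest_dir and return the list of saved paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        target = dest / filename_from_url(url)
        # Stream the response body to disk rather than holding it in memory.
        with urlopen(url, timeout=timeout) as response, open(target, "wb") as handle:
            shutil.copyfileobj(response, handle)
        saved.append(target)
    return saved
```

Calling download_images with the array returned by the action would write one file per generated image, each named after the last segment of its URL.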

Conceptual Usage Example (Python)

Here’s how you can structure a Python snippet to call the Generate Image with Prediction Model action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "b063659b-f6b0-4c59-9863-6b826ab9ffde"  # Action ID for Generate Image with Prediction Model

# Construct the input payload based on the action's requirements
payload = {
    "model": "dev",
    "width": 1024,
    "height": 768,
    "prompt": "a photo of MataHari dancing in a restaurant salon ",
    "aspectRatio": "16:9",
    "imageFormat": "webp",
    "imageQuality": 100,
    "mainLoraScale": 1,
    "inferenceSteps": 25,
    "promptStrength": 0.8,
    "numberOfOutputs": 4,
    "additionalLoraScale": 1,
    "diffusionGuidanceScale": 3.5
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Fail instead of waiting indefinitely if the service does not respond
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # response.json() raises a ValueError subclass for non-JSON bodies
            print(f"Response body: {e.response.text}")

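In this example, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The action_id identifies the specific action you are invoking, and the payload follows the input schema described above. The endpoint URL and the shape of the request body are marked as hypothetical in the code, so confirm both against the platform's documentation before relying on them.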

Conclusion

The merlintxu/matahari Cognitive Actions provide powerful tools for developers to enhance their applications with dynamic image generation capabilities. By leveraging these pre-built actions, you can create visually engaging content tailored to your users' needs. Whether you're building a creative app, a marketing tool, or simply looking to explore the possibilities of AI-driven images, these actions offer a straightforward approach to achieving stunning results.

Start integrating these actions today and unleash your application's creative potential!