Create Stunning Images with the ludocomito/flux-caravaggio Cognitive Actions

25 Apr 2025

This post walks through the ludocomito/flux-caravaggio API and its Cognitive Action for image generation. The action supports image-to-image and inpainting modes, and it lets you trade quality against speed by choosing between two models. Because the action is pre-built, integrating it mostly comes down to sending a JSON payload and handling the image URLs that come back.

Prerequisites

Before diving into the Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform. This key is crucial for authenticating your requests.
  • Familiarity with JSON format, as the API interacts using JSON payloads.

Authentication typically involves passing the API key in the request headers, ensuring that your application is authorized to access the Cognitive Actions.
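
As a minimal sketch of header-based authentication, the snippet below reads the key from an environment variable instead of hard-coding it. The variable name COGNITIVE_ACTIONS_API_KEY, the helper name build_auth_headers, and the Bearer scheme are assumptions borrowed from the conceptual example later in this post; check the platform's documentation for the exact header it expects.

```python
import os


def build_auth_headers(env_var: str = "COGNITIVE_ACTIONS_API_KEY") -> dict:
    """Build request headers from an API key stored in the environment.

    The env var name and the Bearer scheme are assumptions, not confirmed API details.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before calling the API")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Keeping the key out of source code means it cannot be committed to version control by accident.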

Cognitive Actions Overview

Generate Image with LoRA

Description: This action generates images with the ludocomito/flux-caravaggio model and supports image-to-image and inpainting modes in addition to prompt-only generation. Choose the 'dev' model for more detailed images or the 'schnell' model for faster generation, and tune the remaining settings as needed.

Category: Image Generation

Input

The action accepts the following fields; only prompt is required:

{
  "prompt": "string (required)",
  "mask": "string (optional, uri)",
  "seed": "integer (optional)",
  "image": "string (optional, uri)",
  "model": "string (default: 'dev', enum: ['dev', 'schnell'])",
  "width": "integer (optional, max: 1440, min: 256)",
  "height": "integer (optional, max: 1440, min: 256)",
  "aspectRatio": "string (default: '1:1', enum: [...])",
  "outputFormat": "string (default: 'webp', enum: ['webp', 'jpg', 'png'])",
  "customWeights": "string (optional)",
  "guidanceScale": "number (default: 3, max: 10, min: 0)",
  "outputQuality": "integer (default: 80, max: 100, min: 0)",
  "enableFastMode": "boolean (default: false)",
  "promptStrength": "number (default: 0.8, max: 1, min: 0)",
  "loraWeightScale": "number (default: 1, max: 3, min: -1)",
  "numberOfOutputs": "integer (default: 1, max: 4, min: 1)",
  "numberOfInferenceSteps": "integer (default: 28, max: 50, min: 1)"
}
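
Since the schema lists explicit ranges and enums, it can help to check a payload locally before sending it. The following is a minimal sketch of such a check; the helper name validate_payload and the two lookup tables are hypothetical, and the limits are copied from the schema above. It does not cover seed, mask, image, customWeights, or aspectRatio (whose allowed values are elided in the schema), and it ignores unknown keys.

```python
# Constraints copied from the schema above: field -> (type, min, max).
NUMERIC_LIMITS = {
    "width": (int, 256, 1440),
    "height": (int, 256, 1440),
    "guidanceScale": (float, 0, 10),
    "outputQuality": (int, 0, 100),
    "promptStrength": (float, 0, 1),
    "loraWeightScale": (float, -1, 3),
    "numberOfOutputs": (int, 1, 4),
    "numberOfInferenceSteps": (int, 1, 50),
}
ENUMS = {
    "model": {"dev", "schnell"},
    "outputFormat": {"webp", "jpg", "png"},
}


def validate_payload(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    prompt = payload.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")
    for field, allowed in ENUMS.items():
        if field in payload and payload[field] not in allowed:
            problems.append(f"{field} must be one of {sorted(allowed)}")
    for field, (kind, low, high) in NUMERIC_LIMITS.items():
        if field not in payload:
            continue
        value = payload[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{field} must be a number")
        elif kind is int and not isinstance(value, int):
            problems.append(f"{field} must be an integer")
        elif not low <= value <= high:
            problems.append(f"{field} must be between {low} and {high}")
    return problems
```

For example, validate_payload({"prompt": "", "numberOfOutputs": 5}) reports two problems: the empty prompt and the out-of-range output count. The server remains the authority on what it accepts; a local check like this only catches obvious mistakes early.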

Example Input (additionalLoraScale is not listed in the schema above, so treat it as an undocumented field):

{
  "model": "dev",
  "prompt": "Painting of a couple man and woman dancing in a room with a window shading light, dramatic emphasize lights and shadows in the style of CARAVAGGIO",
  "aspectRatio": "1:1",
  "outputFormat": "webp",
  "guidanceScale": 3.5,
  "outputQuality": 80,
  "loraWeightScale": 1,
  "numberOfOutputs": 1,
  "additionalLoraScale": 0.8,
  "numberOfInferenceSteps": 45
}
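
The example above is a prompt-only request. For the image-to-image and inpainting modes mentioned in the description, a plausible sketch is to add the image URI and, for inpainting, the mask URI from the schema. That the presence of these fields selects the mode, and that promptStrength only matters when a source image is supplied, are assumptions based on the field names. The helper name build_payload is hypothetical and the URLs are placeholders.

```python
def build_payload(prompt, image=None, mask=None, prompt_strength=0.8, **extra):
    """Assemble an input payload for prompt-only, image-to-image, or inpainting use.

    Assumption: the presence of `image` / `mask` selects the mode.
    """
    if mask is not None and image is None:
        raise ValueError("a mask only makes sense together with a source image")
    payload = {"prompt": prompt, **extra}
    if image is not None:
        payload["image"] = image
        payload["promptStrength"] = prompt_strength
    if mask is not None:
        payload["mask"] = mask
    return payload


text_to_image = build_payload("A still life in the style of CARAVAGGIO", model="dev")
image_to_image = build_payload(
    "A portrait in the style of CARAVAGGIO",
    image="https://example.com/source.png",  # placeholder URL
    prompt_strength=0.6,
)
inpainting = build_payload(
    "A candle on the table in the style of CARAVAGGIO",
    image="https://example.com/source.png",  # placeholder URL
    mask="https://example.com/mask.png",  # placeholder URL
)
```

Any other schema field, such as guidanceScale or numberOfOutputs, can be passed as a keyword argument and is copied into the payload unchanged.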

Output

The action typically returns a JSON array containing a URL for each generated image. Here’s an example of the response:

[
  "https://assets.cognitiveactions.com/invocations/38f1277f-8497-4457-a845-27f7ad255157/6dd809ce-5ce3-48f6-948e-ca8d29bf696b.webp"
]
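
Because the response is a list of URLs rather than image bytes, a client usually needs a second step to fetch the files. Below is a minimal sketch using only the standard library; the helper names filename_from_url and save_images and the outputs directory are hypothetical, and it assumes the URLs can be downloaded without extra authentication.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url: str) -> str:
    """Use the last path segment of the URL as the local file name."""
    return os.path.basename(urlparse(url).path)


def _http_get(url: str) -> bytes:
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urlopen(url, timeout=60) as response:
        return response.read()


def save_images(urls, dest_dir="outputs", fetch=_http_get):
    """Download each URL into dest_dir and return the list of saved paths."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(dest_dir, filename_from_url(url))
        with open(path, "wb") as handle:
            handle.write(fetch(url))
        saved.append(path)
    return saved
```

The fetch parameter exists so the download step can be swapped out, for instance with a stub in tests or with a client that adds authentication headers.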

Conceptual Usage Example (Python)

Here is a conceptual Python code snippet demonstrating how to call the action using a hypothetical Cognitive Actions endpoint. This example focuses on structuring the input JSON payload correctly:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "d1ea8eab-d5f5-4e35-b97c-32e251cc8ac4" # Action ID for Generate Image with LoRA

# Construct the input payload based on the action's requirements
payload = {
    "model": "dev",
    "prompt": "Painting of a couple man and woman dancing in a room with a window shading light, dramatic emphasize lights and shadows in the style of CARAVAGGIO",
    "aspectRatio": "1:1",
    "outputFormat": "webp",
    "guidanceScale": 3.5,
    "outputQuality": 80,
    "loraWeightScale": 1,
    "numberOfOutputs": 1,
    "additionalLoraScale": 0.8,
    "numberOfInferenceSteps": 45
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=120 # Avoid hanging indefinitely; image generation can be slow
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body is not JSON; every JSON decode error subclasses ValueError
            print(f"Response body: {e.response.text}")

In this code snippet, replace the placeholder with your API key and confirm that the action ID matches the action you want to run. The endpoint URL and the request body structure are hypothetical, so adjust them to the platform's actual API. The payload mirrors the example input shown earlier.

Conclusion

The ludocomito/flux-caravaggio Cognitive Action covers prompt-based generation, image-to-image, and inpainting through a single JSON interface. Parameters such as the model, guidance scale, LoRA weight scale, and number of inference steps let you tune the output, whether for artistic work or practical applications. With an API key and a JSON payload in hand, you can start adding image generation to your own applications.