Generate Stunning Images with the alexandersakka/mimmi Cognitive Actions

Adding image generation to an application can open up new kinds of user experiences. The alexandersakka/mimmi spec exposes a Cognitive Action that generates images from text prompts and can optionally inpaint a masked region of an existing image. By using this pre-built action, developers can generate visual content tailored to their applications.
Prerequisites
To get started with the alexandersakka/mimmi Cognitive Actions, you'll need to meet a few basic requirements:
- API Key: Obtain an API key from the Cognitive Actions platform. This key will be necessary for authenticating your requests.
- Setup: Ensure you have a working environment with access to the internet and the ability to make HTTP requests.
When making API calls, you will include your API key in the request headers to authenticate your access. This is typically done using a Bearer token format.
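As a minimal sketch of the header construction described above, the helper below reads the key from an environment variable and returns the request headers. The helper name build_headers and the environment variable name COGNITIVE_ACTIONS_API_KEY are assumptions made for illustration; only the Bearer scheme comes from this article.

```python
import os


def build_headers(api_key=None):
    """Return HTTP headers carrying the API key as a Bearer token.

    The environment variable name is an assumption for this sketch,
    not something the platform is known to require.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided or found in the environment")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment rather than hard-coding it keeps the secret out of source control.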
Cognitive Actions Overview
Generate Image with Mask Inpainting
The Generate Image with Mask Inpainting action creates images from a text prompt and can optionally inpaint part of an existing image using a mask. It supports two inference models, dev and schnell, which differ in inference speed and are selected through the inferenceModel field. You can tune the result with parameters such as guidance scale, aspect ratio, approximate megapixels, and the number of inference steps.
Input
The input for this action follows the CompositeRequest schema, in which prompt is the only required field. The structure of the input object is shown below, with each value giving the expected type of the field:
{
  "prompt": "string",
  "mask": "string (uri)",
  "seed": "integer",
  "image": "string (uri)",
  "width": "integer",
  "goFast": "boolean",
  "height": "integer",
  "loraScale": "number",
  "numOutputs": "integer",
  "aspectRatio": "string",
  "loraWeights": "string",
  "outputFormat": "string",
  "guidanceScale": "number",
  "outputQuality": "integer",
  "additionalLora": "string",
  "inferenceModel": "string",
  "promptStrength": "number",
  "approxMegapixels": "string",
  "numInferenceSteps": "integer",
  "additionalLoraScale": "number",
  "disableSafetyChecker": "boolean"
}
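The schema above can be turned into a small client-side check. The following is a sketch, not part of any official SDK: the helper name build_payload is hypothetical, and the only rules it enforces are the ones stated in this article (prompt is required, and every other field name must appear in the schema listing).

```python
# Optional field names copied from the schema listing above.
OPTIONAL_FIELDS = {
    "mask", "seed", "image", "width", "goFast", "height", "loraScale",
    "numOutputs", "aspectRatio", "loraWeights", "outputFormat",
    "guidanceScale", "outputQuality", "additionalLora", "inferenceModel",
    "promptStrength", "approxMegapixels", "numInferenceSteps",
    "additionalLoraScale", "disableSafetyChecker",
}


def build_payload(prompt, **options):
    """Build an input object; `prompt` is required, everything else optional."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"Unknown fields: {sorted(unknown)}")
    payload = {"prompt": prompt}
    # Omit options left as None so only explicitly set fields are sent.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```

Rejecting unknown field names locally catches typos such as numOutput before a request is sent.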
Example Input:
{
  "goFast": false,
  "prompt": "A classical painting of an elderly woman MIMMI, wearing a draped robe. She is depicted in a thoughtful expression, pointing upward with his index finger in a gesture of teaching or philosophical discourse. The composition is detailed with a muted, textured background reminiscent of Renaissance art.",
  "loraScale": 1,
  "numOutputs": 1,
  "aspectRatio": "1:1",
  "outputFormat": "webp",
  "guidanceScale": 3,
  "outputQuality": 80,
  "inferenceModel": "dev",
  "promptStrength": 0.8,
  "approxMegapixels": "1",
  "numInferenceSteps": 28,
  "additionalLoraScale": 1
}
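The example input above is text-only, so it does not exercise the image and mask fields. The sketch below shows what an inpainting input might look like, under two assumptions this article does not confirm: that image and mask are meant to be supplied together, and that both take http(s) URIs. The helper name inpainting_payload and the example.com URLs are placeholders; the default promptStrength of 0.8 is taken from the example input.

```python
def inpainting_payload(prompt, image_url, mask_url, prompt_strength=0.8):
    """Return an input object that sets both `image` and `mask`.

    Assumption: the two fields are used together and accept http(s) URIs.
    """
    for name, url in (("image", image_url), ("mask", mask_url)):
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"{name} must be an http(s) URI, got {url!r}")
    return {
        "prompt": prompt,
        "image": image_url,
        "mask": mask_url,
        "promptStrength": prompt_strength,
    }


# Placeholder URLs; replace them with a real source image and mask.
payload = inpainting_payload(
    "A portrait of MIMMI wearing a red scarf",
    "https://example.com/portrait.png",
    "https://example.com/portrait-mask.png",
)
```

How the mask is interpreted (for example, which pixel values mark the region to repaint) is not specified here, so check the action's documentation before relying on this.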
Output
The output of this action is a JSON array of URLs, one for each generated image. With numOutputs set to 1, as in the example input, the array contains a single URL:
Example Output:
[
"https://assets.cognitiveactions.com/invocations/cbaabc12-1aa3-4451-b6b7-487749b82bcb/46f69b9f-0d1c-4f8a-8d51-6a3a137bca7e.webp"
]
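Once you have the list of URLs, you will usually want to save the images locally. The sketch below uses only the Python standard library; the helper names are hypothetical, and it assumes the asset URLs can be fetched without an Authorization header, which this article does not confirm.

```python
import os
import posixpath
import shutil
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url):
    """Return the last path segment of a URL to use as a local file name."""
    return posixpath.basename(urlparse(url).path)


def download_outputs(urls, directory="."):
    """Download every URL in the output list and return the local paths."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for url in urls:
        path = os.path.join(directory, filename_from_url(url))
        # Stream the response body straight to disk.
        with urllib.request.urlopen(url) as resp, open(path, "wb") as fh:
            shutil.copyfileobj(resp, fh)
        paths.append(path)
    return paths
```

Calling download_outputs(result, "images") on the example output above would write one .webp file into an images directory, assuming the URL is still reachable.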
Conceptual Usage Example (Python)
Here’s a conceptual example of how you might call the Cognitive Actions execution endpoint using Python:
import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "7b091fab-4bae-4c61-97ca-ce5b1a401cc1"  # Action ID for Generate Image with Mask Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "goFast": False,
    "prompt": "A classical painting of an elderly woman MIMMI, wearing a draped robe. She is depicted in a thoughtful expression, pointing upward with his index finger in a gesture of teaching or philosophical discourse. The composition is detailed with a muted, textured background reminiscent of Renaissance art.",
    "loraScale": 1,
    "numOutputs": 1,
    "aspectRatio": "1:1",
    "outputFormat": "webp",
    "guidanceScale": 3,
    "outputQuality": 80,
    "inferenceModel": "dev",
    "promptStrength": 0.8,
    "approxMegapixels": "1",
    "numInferenceSteps": 28,
    "additionalLoraScale": 1
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")
In this code snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The payload is structured according to the action's input requirements, and the response will include the generated image URL.
Conclusion
The alexandersakka/mimmi Cognitive Actions provide a robust solution for developers looking to create rich, visually appealing content through image generation. By leveraging features like mask inpainting and adjustable parameters, applications can achieve a high degree of customization and creativity. As you explore these actions, consider how they can enhance user engagement in your projects, and don't hesitate to iterate on the parameters to find the perfect outputs for your needs. Happy coding!