Create Stunning Images with the swk23/mon Cognitive Actions

In the realm of image generation, the swk23/mon Cognitive Actions provide a powerful API for developers looking to create visually stunning images using advanced techniques like inpainting. This set of pre-built actions allows you to harness the potential of AI to generate detailed images based on text prompts, while also offering customization options for quality, dimensions, and formats. Whether you're building an app that requires dynamic visuals or simply experimenting with AI-generated art, these Cognitive Actions are designed to simplify the process.
Prerequisites
Before diving into the integration of the swk23/mon Cognitive Actions, ensure you have the following:
- An API key for the Cognitive Actions platform.
- Familiarity with structuring JSON requests, since you'll be sending JSON-formatted data to the API.
Authentication typically involves passing your API key in the headers of your requests, allowing you to access the Cognitive Actions securely.
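As a minimal sketch of that header-based authentication, the helper below builds the request headers from an API key. The Bearer scheme matches the conceptual example later in this article; the environment variable name and the function name build_headers are assumptions made for illustration, not part of the platform's documented API.

```python
import os


def build_headers(api_key=None):
    """Return request headers that carry the API key as a Bearer token.

    Falls back to the COGNITIVE_ACTIONS_API_KEY environment variable
    (a hypothetical name) when no key is passed explicitly.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("Set COGNITIVE_ACTIONS_API_KEY or pass api_key explicitly")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which is usually preferable to hard-coding it in a script.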
Cognitive Actions Overview
Generate Image with Inpainting
The Generate Image with Inpainting action creates images from text prompts and can optionally take an input image and a mask. In inpainting mode it fills the masked regions of the supplied image, and it also offers custom settings for dimensions, quality, and output formats. Developers can choose between the 'dev' model for detailed generation and the 'schnell' model for faster results.
Input
The action requires a prompt, and it supports a variety of optional fields to customize the image generation process:
- prompt (required): A descriptive text that guides image creation.
- mask (optional): URI of an image mask for inpainting.
- image (optional): URI of an input image for inpainting modes.
- width (optional): Width of the generated image (256-1440).
- height (optional): Height of the generated image (256-1440).
- guidanceScale (optional): Scale for the diffusion process guidance (0-10).
- outputQuality (optional): Quality setting for saving output images (0-100).
- inferenceModel (optional): Choose between 'dev' or 'schnell' for the inference model.
- inferenceSteps (optional): Number of denoising steps during inference (1-50).
- imageResolution (optional): Approximate megapixel count of the generated image.
- numberOfOutputs (optional): Number of output images to generate (1-4).
- imageAspectRatio (optional): The aspect ratio for the generated image.
- imageOutputFormat (optional): Format for saving output images (webp, jpg, png).
- Additional fields for advanced users include parameters for LoRA weights, prompt intensity, and content filters.
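The ranges in the field list can be checked on the client before a request is sent. The following is a rough sketch, assuming the documented bounds are inclusive; validate_inputs and the two lookup tables are hypothetical helpers, and the service may apply different or additional rules.

```python
# Ranges and allowed values copied from the field list above.
# Treating the bounds as inclusive is an assumption.
NUMERIC_RANGES = {
    "width": (256, 1440),
    "height": (256, 1440),
    "guidanceScale": (0, 10),
    "outputQuality": (0, 100),
    "inferenceSteps": (1, 50),
    "numberOfOutputs": (1, 4),
}
ALLOWED_VALUES = {
    "inferenceModel": {"dev", "schnell"},
    "imageOutputFormat": {"webp", "jpg", "png"},
}


def validate_inputs(inputs):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    prompt = inputs.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")
    for name, (low, high) in NUMERIC_RANGES.items():
        if name in inputs:
            value = inputs[name]
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{name} must be a number")
            elif not low <= value <= high:
                problems.append(f"{name} must be between {low} and {high}")
    for name, allowed in ALLOWED_VALUES.items():
        if name in inputs and inputs[name] not in allowed:
            problems.append(f"{name} must be one of {sorted(allowed)}")
    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem with a payload at once.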
Example Input:
{
  "prompt": "\"A dimly lit office on Coruscant with large windows overlooking the city skyline, casting a soft glow into the room. Senator Mon Mothma, a poised woman with short auburn hair and sharp, intelligent eyes, sits at a circular table. She wears an elegant but simple white senatorial gown with subtle embroidery, reflecting her dignified presence. A datapad rests in front of her, and her fingers are lightly touching it as she contemplates, her expression wary and thoughtful. Shadows stretch across the room, emphasizing the secrecy of the meeting. The atmosphere is tense yet refined, the weight of political intrigue evident in her posture.\"",
  "guidanceScale": 3,
  "outputQuality": 80,
  "inferenceModel": "dev",
  "inferenceSteps": 28,
  "imageResolution": "1",
  "numberOfOutputs": 1,
  "promptIntensity": 0.8,
  "imageAspectRatio": "21:9",
  "imageOutputFormat": "jpg",
  "loraIntensityScale": 1,
  "additionalLoraScale": 1,
  "enableFastGeneration": false
}
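The example input above is text-only. For an inpainting request, the image and mask fields from the field list would be supplied as well. The sketch below shows one way to assemble such a payload; build_inpainting_inputs is a hypothetical helper and the example.com URIs are placeholders, so how the service interprets the mask should be confirmed against the platform documentation.

```python
def build_inpainting_inputs(prompt, image_uri, mask_uri, **options):
    """Combine the three inpainting fields with any optional settings."""
    inputs = {
        "prompt": prompt,
        "image": image_uri,  # URI of the input image to modify
        "mask": mask_uri,    # URI of the mask marking the region to fill
    }
    inputs.update(options)   # e.g. inferenceModel, outputQuality
    return inputs


inpainting_inputs = build_inpainting_inputs(
    "A datapad resting on a circular table",
    "https://example.com/office.png",       # placeholder URI
    "https://example.com/office-mask.png",  # placeholder URI
    inferenceModel="schnell",
    imageOutputFormat="png",
)
```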
Output
The action typically returns a list of URLs pointing to the generated images; the example below, which requested a single output, contains one entry. The output can vary based on the input parameters and the complexity of the image requested.
Example Output:
[
  "https://assets.cognitiveactions.com/invocations/29a7dc18-8326-49d1-8a45-dbbf48e85785/61157075-c1a7-45d5-b185-c3b93cad9b46.jpg"
]
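Once the URLs come back, a common next step is saving the images locally. The sketch below assumes the result is a list of URL strings shaped like the example output; output_filenames and download_outputs are hypothetical helpers, and the download uses only the Python standard library.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlparse


def output_filenames(result):
    """Map each returned image URL to the file name at the end of its path."""
    if not isinstance(result, list):
        raise TypeError("expected a list of image URLs")
    return [posixpath.basename(urlparse(url).path) for url in result]


def download_outputs(result, directory="."):
    """Fetch every URL in the result and save it under its own file name."""
    saved = []
    for url, name in zip(result, output_filenames(result)):
        target = os.path.join(directory, name)
        with urllib.request.urlopen(url, timeout=60) as response:
            with open(target, "wb") as handle:
                handle.write(response.read())
        saved.append(target)
    return saved
```

Deriving the file name from the URL path keeps the extension chosen through imageOutputFormat, so a jpg request is saved as a .jpg file.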
Conceptual Usage Example (Python)
Here’s a conceptual example of how you might call this action using Python:
import requests
import json
# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint
action_id = "05b28828-9ebe-4b29-a034-8ced6baf27c3" # Action ID for Generate Image with Inpainting
# Construct the input payload based on the action's requirements
payload = {
    "prompt": "\"A dimly lit office on Coruscant with large windows overlooking the city skyline, casting a soft glow into the room. Senator Mon Mothma, a poised woman with short auburn hair and sharp, intelligent eyes, sits at a circular table. She wears an elegant but simple white senatorial gown with subtle embroidery, reflecting her dignified presence. A datapad rests in front of her, and her fingers are lightly touching it as she contemplates, her expression wary and thoughtful. Shadows stretch across the room, emphasizing the secrecy of the meeting. The atmosphere is tense yet refined, the weight of political intrigue evident in her posture.\"",
    "guidanceScale": 3,
    "outputQuality": 80,
    "inferenceModel": "dev",
    "inferenceSteps": 28,
    "imageResolution": "1",
    "numberOfOutputs": 1,
    "promptIntensity": 0.8,
    "imageAspectRatio": "21:9",
    "imageOutputFormat": "jpg",
    "loraIntensityScale": 1,
    "additionalLoraScale": 1,
    "enableFastGeneration": False
}
headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}
try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=120,  # Avoid waiting indefinitely; adjust for slow generations
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body was not valid JSON
            print(f"Response body: {e.response.text}")
In this example, the action ID and input payload are structured to fit the requirements of the Generate Image with Inpainting action. The endpoint URL and the exact request structure are illustrative; the goal is to show how to format the input data.
Conclusion
The swk23/mon Cognitive Actions enable developers to create high-quality images with ease. By leveraging the advanced capabilities of inpainting and custom settings, you can enhance your applications with dynamic visuals tailored to your specifications. Start experimenting with these actions today and unlock the potential of AI-generated imagery for your projects!