Generate Stunning Images with the maitreyarish/visnu Cognitive Actions

Generating images from textual descriptions is now practical to build into applications. The maitreyarish/visnu spec provides Cognitive Actions that help developers create images from prompts. These pre-built actions simplify image generation and let you customize settings such as aspect ratio, image quality, and the number of inference steps.
Prerequisites
Before diving into the Cognitive Actions, ensure you have the following:
- An API key for the Cognitive Actions platform to authenticate your requests.
- Basic familiarity with JSON and making HTTP requests.
- A Python environment set up with the requests library for API interaction.
To authenticate, you will typically pass your API key in the request headers as follows:
headers = {
    "Authorization": "Bearer YOUR_COGNITIVE_ACTIONS_API_KEY",
    "Content-Type": "application/json"
}
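Rather than pasting the key into source code, you may prefer to read it from the environment. The following is a minimal sketch of that approach; the helper name build_headers and the environment variable name COGNITIVE_ACTIONS_API_KEY are assumptions made for this example, not names defined by the platform.

```python
import os


def build_headers(api_key=None):
    """Return the request headers shown above.

    If no key is passed, fall back to the COGNITIVE_ACTIONS_API_KEY
    environment variable (a hypothetical name chosen for this sketch).
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise RuntimeError(
            "Set COGNITIVE_ACTIONS_API_KEY or pass api_key explicitly"
        )
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Calling build_headers() with no arguments then works in any environment where the variable is set, and fails early with a clear message where it is not.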
Cognitive Actions Overview
Generate Image from Prompt
The Generate Image from Prompt action allows you to create a detailed image based on a text prompt. You can customize various aspects such as aspect ratio and image quality, and it supports both image-to-image and inpainting modes.
- Category: Image Generation
- Purpose: Generate an image from a descriptive text prompt.
Input
The input schema accepts the following fields. Only prompt is required; the rest are optional:
- prompt (required): The text prompt guiding the image generation (e.g., "vishnu standing and looking to the right").
- mask: URI of the image mask for inpainting mode.
- seed: An integer for reproducible image generation.
- image: URI of the input image for image-to-image or inpainting mode.
- width: Width of the generated image in pixels (256-1440).
- goFast: Enables faster image generation.
- height: Height of the generated image in pixels (256-1440).
- extraLora: Load additional LoRA weights.
- loraScale: Determines how strongly the main LoRA should be applied.
- chosenModel: Select the model for inference.
- outputCount: Number of outputs to generate (1-4).
- exportFormat: Format of the output images (webp, jpg, png).
- guidanceScale: Guidance scale for the diffusion process (0-10).
- outputQuality: Quality of output images (0-100).
- extraLoraScale: Adjust the influence of the extra LoRA.
- promptStrength: Influence of the text prompt in image-to-image mode (0-1).
- imageMegapixels: Approximate megapixel count.
- imageAspectRatio: Aspect ratio of the image.
- inferenceStepsCount: Steps for the denoising process (1-50).
- disableSafetyChecker: Option to disable the safety checker for output images.
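The numeric ranges in the list above can be checked on the client before a request is sent, which gives clearer errors than a rejected API call. The sketch below is one way to do that. The function name validate_inputs is hypothetical, the ranges are copied from the field list, and treating both ends of each range as inclusive is an assumption.

```python
# Ranges copied from the field list: (min, max), assumed inclusive.
RANGES = {
    "width": (256, 1440),
    "height": (256, 1440),
    "outputCount": (1, 4),
    "guidanceScale": (0, 10),
    "outputQuality": (0, 100),
    "promptStrength": (0, 1),
    "inferenceStepsCount": (1, 50),
}
EXPORT_FORMATS = {"webp", "jpg", "png"}


def validate_inputs(inputs):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # prompt is the only required field.
    prompt = inputs.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")

    # Check every ranged field that is present in the payload.
    for name, (low, high) in RANGES.items():
        if name not in inputs:
            continue
        value = inputs[name]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name} must be a number")
        elif not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")

    fmt = inputs.get("exportFormat")
    if fmt is not None and fmt not in EXPORT_FORMATS:
        problems.append("exportFormat must be one of: jpg, png, webp")

    return problems
```

This only covers the constraints stated in the list. Fields without a documented range, such as loraScale or imageMegapixels, are passed through unchecked.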
Example Input:
{
  "prompt": "vishnu standing and looking to the right",
  "loraScale": 1,
  "chosenModel": "dev",
  "outputCount": 1,
  "exportFormat": "webp",
  "guidanceScale": 3.5,
  "outputQuality": 90,
  "extraLoraScale": 1,
  "promptStrength": 0.8,
  "imageAspectRatio": "1:1",
  "inferenceStepsCount": 28
}
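The example input covers plain text-to-image generation. For the image-to-image and inpainting modes, the field descriptions suggest adding image (and mask, for inpainting) to the payload. The sketch below builds such a payload. The helper name build_edit_inputs and the example.com URIs are placeholders, and the assumption that supplying mask alongside image is what selects inpainting is inferred from the field descriptions rather than confirmed by the spec.

```python
def build_edit_inputs(prompt, image_uri, mask_uri=None, prompt_strength=0.8):
    """Build inputs for image-to-image mode, or inpainting when mask_uri is given."""
    if not 0 <= prompt_strength <= 1:
        raise ValueError("prompt_strength must be between 0 and 1")

    inputs = {
        "prompt": prompt,
        "image": image_uri,                 # source image for both modes
        "promptStrength": prompt_strength,  # documented for image-to-image mode
    }
    if mask_uri is not None:
        inputs["mask"] = mask_uri           # assumed to switch on inpainting
    return inputs


# Placeholder URIs; replace with images you host.
img2img = build_edit_inputs(
    "vishnu standing and looking to the right",
    "https://example.com/source.png",
)
inpaint = build_edit_inputs(
    "vishnu standing and looking to the right",
    "https://example.com/source.png",
    mask_uri="https://example.com/mask.png",
)
```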
Output
The action typically returns a JSON array containing the URL of each generated image. For example:
Example Output:
[
  "https://assets.cognitiveactions.com/invocations/499d9363-7478-4737-b640-c16e73223680/68c19c7f-0940-4b91-8729-001872b096c2.webp"
]
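Once you have the returned URLs, you will usually want to save the images locally. The following sketch uses only the Python standard library. The function names are hypothetical, and it assumes the asset URLs can be fetched with a plain GET request without authentication headers, which the source does not confirm.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring any query string."""
    return posixpath.basename(urlparse(url).path)


def save_images(urls, directory="."):
    """Download each URL into `directory` and return the list of saved paths."""
    saved = []
    for url in urls:
        path = os.path.join(directory, filename_from_url(url))
        # Assumption: the URL is publicly readable, so no auth header is sent.
        with urllib.request.urlopen(url, timeout=60) as response, \
                open(path, "wb") as fh:
            fh.write(response.read())
        saved.append(path)
    return saved
```

Passing the example output above to save_images would write one .webp file named after the last segment of the URL.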
Conceptual Usage Example (Python)
Here's a conceptual Python code snippet demonstrating how to use the Generate Image from Prompt action:
import json

import requests

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "8c085407-02d0-4f89-a279-9302184ca800"  # Action ID for Generate Image from Prompt

# Construct the input payload based on the action's requirements
payload = {
    "prompt": "vishnu standing and looking to the right",
    "loraScale": 1,
    "chosenModel": "dev",
    "outputCount": 1,
    "exportFormat": "webp",
    "guidanceScale": 3.5,
    "outputQuality": 90,
    "extraLoraScale": 1,
    "promptStrength": 0.8,
    "imageAspectRatio": "1:1",
    "inferenceStepsCount": 28
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Avoid hanging indefinitely if the server does not respond
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")
In this code snippet:
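- Replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key.
- The payload variable is structured according to the input schema and includes the one required field, prompt.
- The endpoint URL and the request body structure are hypothetical, as the code comments note; check the platform documentation for the actual values.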
Conclusion
The maitreyarish/visnu Cognitive Actions let you generate images from text prompts. With options for output quality, aspect ratio, output format, and more, developers can produce visual content tailored to their needs. Start with the example payload above, then adjust the optional fields to see how each one affects the result.