Generate Stunning Images with the mprabhakaran/prabhakar Cognitive Actions

In this blog post, we will explore the mprabhakaran/prabhakar spec, which provides a powerful Cognitive Action for generating realistic images using a custom-trained model. This action offers capabilities including image inpainting and image-to-image transformations, allowing developers to create stunning visuals based on user-defined prompts. By leveraging these pre-built actions, you can save time and resources while enhancing your applications with advanced image generation technology.
Prerequisites
Before you start integrating the Cognitive Actions, ensure you have the following set up:
- API Key: You will need an API key for the Cognitive Actions platform to authenticate your requests.
- HTTP Client: Familiarity with making HTTP requests using a library such as requests in Python.
Authentication typically involves passing your API key in the headers of your requests, allowing you to securely access the Cognitive Actions services.
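As a concrete sketch, and assuming the platform follows the common bearer-token convention (the exact header name is an assumption, not something the spec confirms), the headers for each request might look like this:

```python
# Minimal sketch of request headers, assuming standard bearer-token
# authentication. Replace the placeholder with your real API key.
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}
```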
Cognitive Actions Overview
Generate Image with Trained Model
The Generate Image with Trained Model action enables you to create realistic images based on user-specified prompts. It supports various modes, including image inpainting and image-to-image transformations, allowing you to control dimensions and aspect ratios. You can choose between two models for performance: the 'dev' model for high-quality output and the 'schnell' model for faster results.
Input
The input schema requires the following fields:
- prompt: A textual description of the image you want to generate.
- model: Select either "dev" or "schnell" for the inference model (default: "dev").
- numberOfOutputs: Specifies how many images to generate (default: 1, max: 4).
- imageAspectRatio: The aspect ratio of the image (default: "1:1").
- diffusionGuidance: Scale for the diffusion process (default: 3).
- imageOutputFormat: The format of the output image (default: "webp").
- imageOutputQuality: Quality level for output images (default: 80).
Here’s an example of the JSON payload you would send:
{
  "model": "dev",
  "prompt": "A vivid, lifelike image of Putturaj on a jungle safari, standing in front of a dense, green jungle with binoculars in hand and a hat shading his eyes. Behind him, exotic animals like elephants and colorful birds are visible in the lush vegetation. The sunlight filters through the canopy, creating dappled shadows. The style is realistic, with rich colors capturing a sense of adventure",
  "numberOfOutputs": 1,
  "imageAspectRatio": "1:1",
  "diffusionGuidance": 3.5,
  "imageOutputFormat": "webp",
  "imageOutputQuality": 90,
  "inferenceStepCount": 28,
  "inputPromptIntensity": 0.8,
  "primaryLoraIntensity": 1,
  "additionalLoraIntensity": 1
}
Output
The action typically returns an array of URLs pointing to the generated images. Here’s an example of the output you might receive:
[
  "https://assets.cognitiveactions.com/invocations/4d9fec84-1118-4daa-a554-f1c42eac8e47/19b63bde-bee9-468d-86aa-1e3d4f83ae52.webp"
]
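Because the output is simply a list of URLs, a common follow-up step is to download the images locally. Here is a small sketch using only the Python standard library; the save_images helper and the generated_N filename scheme are illustrative choices, not part of the spec:

```python
from urllib.request import urlopen

def output_filename(url, index):
    """Derive a local filename from an image URL (falls back to .webp)."""
    last_segment = url.rsplit("/", 1)[-1]
    ext = last_segment.rsplit(".", 1)[-1] if "." in last_segment else "webp"
    return f"generated_{index}.{ext}"

def save_images(urls):
    """Download each generated image and write it to the current directory."""
    paths = []
    for i, url in enumerate(urls):
        path = output_filename(url, i)
        with urlopen(url, timeout=30) as resp, open(path, "wb") as f:
            f.write(resp.read())
        paths.append(path)
    return paths
```

You would pass the array returned by the action straight into save_images to persist every generated image.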
Conceptual Usage Example (Python)
Here’s a conceptual Python code snippet demonstrating how to invoke the Generate Image with Trained Model action:
import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "c5568483-0521-43ba-88ac-3ed544024a3f"  # Action ID for Generate Image with Trained Model

# Construct the input payload based on the action's requirements
payload = {
    "model": "dev",
    "prompt": "A vivid, lifelike image of Putturaj on a jungle safari, standing in front of a dense, green jungle with binoculars in hand and a hat shading his eyes. Behind him, exotic animals like elephants and colorful birds are visible in the lush vegetation. The sunlight filters through the canopy, creating dappled shadows. The style is realistic, with rich colors capturing a sense of adventure",
    "numberOfOutputs": 1,
    "imageAspectRatio": "1:1",
    "diffusionGuidance": 3.5,
    "imageOutputFormat": "webp",
    "imageOutputQuality": 90,
    "inferenceStepCount": 28,
    "inputPromptIntensity": 0.8,
    "primaryLoraIntensity": 1,
    "additionalLoraIntensity": 1
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except json.JSONDecodeError:
            print(f"Response body: {e.response.text}")
In this code snippet, you can see how to structure the input payload and call the hypothetical Cognitive Actions execution endpoint. Make sure to replace the API key and endpoint with your actual details.
Conclusion
The mprabhakaran/prabhakar spec offers developers a robust and flexible way to generate images through the Generate Image with Trained Model action. By utilizing this powerful Cognitive Action, you can enhance your applications with stunning visuals tailored to your users' needs. Consider experimenting with different prompts and settings to explore the full potential of image generation in your projects!