Enhance Image Processing with Omini Dev's Cognitive Actions

Omini Dev is a powerful tool designed for developers looking to streamline image processing tasks. With its suite of Cognitive Actions, Omini Dev simplifies complex image manipulation operations such as filling, edge detection, depth mapping, coloring, and deblurring. By leveraging these capabilities, developers can enhance their applications with advanced image processing features, saving time and effort while achieving high-quality results.
Common use cases for Omini Dev include creating custom artwork, improving image quality for web applications, and generating visual content for marketing materials. Whether you are working on a graphics-intensive application or simply need to enhance images, Omini Dev provides the flexibility and power to meet your needs.
Prerequisites
To get started with Omini Dev, you will need an API key for the Cognitive Actions service and a basic understanding of how to make API calls.
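Rather than hard-coding the API key, a common pattern is to read it from an environment variable. A minimal sketch, assuming the variable name `COGNITIVE_ACTIONS_API_KEY` (the name is a convention used in this article, not a documented requirement):

```python
import os

def load_api_key(var_name: str = "COGNITIVE_ACTIONS_API_KEY") -> str:
    """Read the Cognitive Actions API key from the environment, failing loudly if absent."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable before calling the API.")
    return key
```

Failing early with a clear message avoids sending unauthenticated requests and keeps the key out of source control.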
Perform OminiControl Task
The "Perform OminiControl Task" action allows you to execute various image processing operations using the OminiControl framework. This action is particularly useful for developers looking to automate image manipulation tasks and enhance their applications with sophisticated image processing capabilities.
Input Requirements:
- seed (integer): A random seed for reproducible generation. Example: 42.
- task (string): The type of OminiControl task to perform. Options include 'fill', 'canny', 'depth', 'coloring', or 'deblurring'. Default is 'fill'.
- prompt (string): A textual prompt guiding the image generation process. Example: "A yellow book with the word 'OMINI' in large font on the cover."
- controlImageUri (string): URI of the control image, serving as a reference or mask. Example: https://github.com/jHorovitz/OminiControl/blob/main/assets/book_masked.jpg?raw=true
- numberOfOutputs (integer): Number of images to generate (1 to 4). Default is 1.
- outputImageFormat (string): File format for the output images, with options of 'webp', 'jpg', or 'png'. Default is 'webp'.
- outputImageQuality (integer): Compression quality for output images, ranging from 0 (lowest) to 100 (highest). Default is 80.
- guidanceScaleFactor (number): Scaling factor for guidance during the diffusion process, affecting adherence to the prompt (0 to 10). Default is 3.5.
- numberOfInferenceSteps (integer): Number of steps in the inference process (1 to 50). Default is 28.
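The ranges above can be checked client-side before submitting a request, which gives faster feedback than waiting for a server-side validation error. A minimal sketch based on the documented constraints (the helper itself is illustrative, not part of the API):

```python
# Allowed values taken from the input requirements above.
VALID_TASKS = {"fill", "canny", "depth", "coloring", "deblurring"}
VALID_FORMATS = {"webp", "jpg", "png"}

def validate_inputs(inputs):
    """Return a list of human-readable problems; an empty list means the payload looks valid."""
    problems = []
    if inputs.get("task", "fill") not in VALID_TASKS:
        problems.append(f"task must be one of {sorted(VALID_TASKS)}")
    if not 1 <= inputs.get("numberOfOutputs", 1) <= 4:
        problems.append("numberOfOutputs must be between 1 and 4")
    if inputs.get("outputImageFormat", "webp") not in VALID_FORMATS:
        problems.append(f"outputImageFormat must be one of {sorted(VALID_FORMATS)}")
    if not 0 <= inputs.get("outputImageQuality", 80) <= 100:
        problems.append("outputImageQuality must be between 0 and 100")
    if not 0 <= inputs.get("guidanceScaleFactor", 3.5) <= 10:
        problems.append("guidanceScaleFactor must be between 0 and 10")
    if not 1 <= inputs.get("numberOfInferenceSteps", 28) <= 50:
        problems.append("numberOfInferenceSteps must be between 1 and 50")
    return problems
```

Unspecified fields fall back to the documented defaults, so an empty dictionary validates cleanly.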
Expected Output: The output will be a URL pointing to the generated image, which reflects the specified processing task. Example output:
https://assets.cognitiveactions.com/invocations/e61a9ae3-b978-4843-ba62-dc6105efc66b/59fabf75-b056-4848-b7c5-fe1172c1af1a.webp
Use cases for this action:
- Art Generation: Create unique artwork by inputting creative prompts and control images.
- Image Enhancement: Improve the quality of images through deblurring or coloring tasks.
- Custom Visual Content: Generate tailored images for marketing or branding purposes, ensuring they align with specific themes or concepts.
import requests
import json
# Replace with your actual Cognitive Actions API key and endpoint
# Ensure your environment securely handles the API key
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
# This endpoint URL is hypothetical and should be documented for users
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"
action_id = "1e18138c-57e2-4521-9519-5990d1ec09dd" # Action ID for: Perform OminiControl Task
# Construct the exact input payload based on the action's requirements
# This example uses the predefined example_input for this action:
payload = {
    "seed": 42,
    "task": "fill",
    "prompt": "A yellow book with the word 'OMINI' in large font on the cover. The text 'for FLUX' appears at the bottom.",
    "controlImageUri": "https://github.com/jHorovitz/OminiControl/blob/main/assets/book_masked.jpg?raw=true",
    "numberOfOutputs": 1,
    "outputImageFormat": "webp",
    "outputImageQuality": 80,
    "guidanceScaleFactor": 3.5,
    "numberOfInferenceSteps": 8
}
headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
    # Add any other required headers for the Cognitive Actions API
}
# Prepare the request body for the hypothetical execution endpoint
request_body = {
    "action_id": action_id,
    "inputs": payload
}
print("--- Calling Cognitive Action: Perform OminiControl Task ---")
print(f"Endpoint: {COGNITIVE_ACTIONS_EXECUTE_URL}")
print(f"Action ID: {action_id}")
print("Payload being sent:")
print(json.dumps(request_body, indent=2))
print("------------------------------------------------")
try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json=request_body
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully. Result:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except json.JSONDecodeError:
            print(f"Response body (non-JSON): {e.response.text}")
print("------------------------------------------------")
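The exact shape of the response JSON is not documented here, but assuming the result contains the output image URL (as in the example output above), a small helper can derive a local filename and save the image. The function names below are illustrative:

```python
import os
from urllib.parse import urlparse

def local_filename_for(image_url: str, directory: str = ".") -> str:
    """Derive a local path for a generated image from its URL."""
    name = os.path.basename(urlparse(image_url).path) or "output.webp"
    return os.path.join(directory, name)

def download_image(image_url: str, directory: str = ".") -> str:
    """Download the generated image and return the saved path. Requires `requests`."""
    import requests
    path = local_filename_for(image_url, directory)
    resp = requests.get(image_url, timeout=60)
    resp.raise_for_status()
    with open(path, "wb") as f:
        f.write(resp.content)
    return path
```

Because the filename is taken from the URL path, the saved file keeps the format chosen via outputImageFormat (e.g. `.webp`).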
Conclusion
Omini Dev's Cognitive Actions, particularly the Perform OminiControl Task, empower developers to integrate advanced image processing functionalities into their applications easily. With the ability to automate various image tasks, developers can enhance user experience, streamline workflows, and create compelling visual content efficiently.
As you explore Omini Dev, consider how these actions can be applied to your projects, and don't hesitate to experiment with different inputs to achieve the desired results.