Generate Stunning Images with Cognitive Actions from EOM Phase 3

In the ever-evolving world of AI-driven creativity, the Cognitive Actions from the felixyifeiwang/eom-phase3 spec provide developers with powerful tools to generate visually compelling images. Specifically, the Generate Image with Inpainting action offers an intuitive way to create and refine images using an AI model that supports both img2img and inpainting modes. With customizable parameters, developers can take full control over the image creation process, catering to varied artistic needs.
Prerequisites
To use the Cognitive Actions, you will need access to the API, including an API key for authentication. Include this API key in the request headers when calling the Cognitive Actions endpoint, and handle it securely to prevent unauthorized access.
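As a minimal sketch of secure key handling (the environment variable name here is an assumption, not part of the spec), you might load the key from the environment and build the request headers like this:

```python
import os

# Hypothetical environment variable name -- adapt to your own setup.
api_key = os.environ.get("COGNITIVE_ACTIONS_API_KEY", "YOUR_COGNITIVE_ACTIONS_API_KEY")

# Send the key as a bearer token; confirm the exact auth scheme in your API docs.
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}
```

Keeping the key in an environment variable rather than in source code helps avoid accidentally committing it to version control.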
Cognitive Actions Overview
Generate Image with Inpainting
The Generate Image with Inpainting action allows developers to create images with exceptional detail and customization. By leveraging parameters such as image masks, refinement methods, and guidance scales, users can greatly influence the output.
Input
The input for this action is defined by the following schema:
- mask (string, optional): URI of the input mask for inpaint mode. Black areas will be preserved, while white areas will be inpainted.
- seed (integer, optional): Random seed for generating variations. Leave blank for a randomized seed.
- image (string, optional): URI of the input image for img2img or inpaint mode.
- width (integer, optional, default: 1024): Width in pixels of the output image.
- height (integer, optional, default: 1024): Height in pixels of the output image.
- prompt (string, optional, default: "An astronaut riding a rainbow unicorn"): Text prompt to guide the image generation.
- refine (string, optional, default: "no_refiner"): Select the style of refinement for the image generation.
- loraScale (number, optional, default: 0.6): LoRA additive scale factor.
- scheduler (string, optional, default: "K_EULER"): Type of scheduler to use for image generation.
- customWeights (string, optional): URI or name of the LoRA weights to use.
- guidanceScale (number, optional, default: 7.5): Scale for classifier-free guidance.
- applyWatermark (boolean, optional, default: true): Boolean to apply a watermark on the generated image.
- negativePrompt (string, optional, default: ""): Negative text prompt to avoid certain elements.
- promptStrength (number, optional, default: 0.8): Strength of the prompt in image generation.
- numberOfOutputs (integer, optional, default: 1): Number of images to generate.
- refinementSteps (integer, optional): Number of steps for refinement when using base_image_refiner.
- highNoiseFraction (number, optional, default: 0.8): Fraction of noise to use with expert_ensemble_refiner.
- disableSafetyChecker (boolean, optional, default: false): Boolean to disable the safety checker on generated images.
- numberOfInferenceSteps (integer, optional, default: 50): Total steps for denoising.
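Because most parameters have documented defaults, a small helper that merges caller overrides onto those defaults can keep call sites short. This is a convenience sketch, not part of the spec; the defaults below are copied from the schema above:

```python
# Documented defaults from the input schema.
DEFAULTS = {
    "width": 1024,
    "height": 1024,
    "prompt": "An astronaut riding a rainbow unicorn",
    "refine": "no_refiner",
    "loraScale": 0.6,
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "applyWatermark": True,
    "negativePrompt": "",
    "promptStrength": 0.8,
    "numberOfOutputs": 1,
    "highNoiseFraction": 0.8,
    "disableSafetyChecker": False,
    "numberOfInferenceSteps": 50,
}

def build_payload(**overrides):
    """Merge caller overrides onto the documented defaults."""
    payload = dict(DEFAULTS)
    payload.update(overrides)
    return payload

# Only the parameters you care about need to be spelled out.
payload = build_payload(prompt="a watercolor fox", numberOfOutputs=2)
```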
Example Input:
{
  "width": 1024,
  "height": 1024,
  "prompt": "a full-body TOK game character, a bearborn mage holding an arcana book, empty background",
  "refine": "no_refiner",
  "loraScale": 0.6,
  "scheduler": "K_EULER",
  "guidanceScale": 7.5,
  "applyWatermark": true,
  "negativePrompt": "out of frame, mutated, deformed, extras, sprites",
  "promptStrength": 0.8,
  "numberOfOutputs": 1,
  "highNoiseFraction": 0.8,
  "numberOfInferenceSteps": 50
}
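The example above is a plain text-to-image request. To use inpaint mode, supply both `image` and `mask` URIs; the asset URLs below are placeholders, not real endpoints:

```python
# Hypothetical asset URIs -- replace with URLs to your own hosted image and mask.
inpaint_payload = {
    "image": "https://example.com/assets/character.png",  # base image to edit
    "mask": "https://example.com/assets/mask.png",        # white = inpainted, black = preserved
    "prompt": "a glowing arcane staff, detailed fantasy illustration",
    "promptStrength": 0.8,
    "numberOfInferenceSteps": 50,
}
```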
Output
Upon successful execution, the action returns an array of URLs, one per generated image.
Example Output:
[
  "https://assets.cognitiveactions.com/invocations/ed3bffc7-eb43-4e9d-9c28-f00fdd8c5cfb/e862e5f6-249f-4fab-a5e2-077988ad2b12.png"
]
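Since the output is a JSON array of image URLs, a small helper can download each one to disk. This is a sketch assuming the URLs are publicly fetchable with a plain GET request (check whether your deployment requires signed or authenticated URLs):

```python
import os
from urllib.parse import urlparse

import requests

def local_name(url):
    """Derive a local filename from the URL's final path segment."""
    return os.path.basename(urlparse(url).path)

def save_outputs(urls, out_dir="outputs"):
    """Download each generated image URL into out_dir; return the paths written."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for url in urls:
        resp = requests.get(url, timeout=60)
        resp.raise_for_status()
        path = os.path.join(out_dir, local_name(url))
        with open(path, "wb") as f:
            f.write(resp.content)
        paths.append(path)
    return paths
```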
Conceptual Usage Example (Python)
Here’s a conceptual example of how to call the Cognitive Actions using Python:
import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "7bb1d64d-8633-4a24-a8d6-d20139a77b25"  # Action ID for Generate Image with Inpainting

# Construct the input payload based on the action's schema
payload = {
    "width": 1024,
    "height": 1024,
    "prompt": "a full-body TOK game character, a bearborn mage holding an arcana book, empty background",
    "refine": "no_refiner",
    "loraScale": 0.6,
    "scheduler": "K_EULER",
    "guidanceScale": 7.5,
    "applyWatermark": True,
    "negativePrompt": "out of frame, mutated, deformed, extras, sprites",
    "promptStrength": 0.8,
    "numberOfOutputs": 1,
    "highNoiseFraction": 0.8,
    "numberOfInferenceSteps": 50,
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical request structure
    )
    response.raise_for_status()  # Raise an exception for 4xx/5xx status codes
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # body was not valid JSON
            print(f"Response body: {e.response.text}")
In this example, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The action ID for Generate Image with Inpainting is specified, and the input payload is structured based on the required schema.
Conclusion
The Cognitive Actions from the felixyifeiwang/eom-phase3 spec empower developers to create unique and engaging images through AI. With the ability to customize various parameters and refine outputs, the potential applications are vast, from game development to digital art. Start integrating these capabilities into your applications and explore the creative possibilities!