Transform Your Images with Cognitive Actions: The Romy Style API Guide

The guillaumesimon/romy-style API exposes a Cognitive Action that generates images in the Romy style, with support for inpainting. The pre-built action wraps the image-manipulation pipeline behind a set of parameters that you pass as a JSON payload. Typical uses include adding artistic flair to user-generated content and automating creative workflows.
Prerequisites
To get started with Cognitive Actions, you'll need:
- An API key for the Cognitive Actions platform. This key allows you to authenticate your requests.
- A basic understanding of JSON for constructing input payloads.
Authentication typically involves including your API key in the request headers as a Bearer token.
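As a minimal sketch of the Bearer-token scheme described above, the helper below reads the key from an environment variable instead of hard-coding it. The function name auth_headers and the variable name COGNITIVE_ACTIONS_API_KEY are assumptions made for illustration; the platform may document a different convention.

```python
import os


def auth_headers(env_var: str = "COGNITIVE_ACTIONS_API_KEY") -> dict:
    """Build request headers that carry the API key as a Bearer token.

    The environment variable name is an assumption made for this sketch.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Keeping the key out of source code means it does not end up in version control, and failing early on a missing key gives a clearer error than a rejected request.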
Cognitive Actions Overview
Generate Styled Image with Inpainting
This action generates a styled image and supports inpainting: a mask marks which areas of the input image stay unchanged, and the remaining areas are regenerated from a text prompt. Parameters such as output dimensions, prompt strength, and the number of denoising steps give you fine control over the output.
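To make the mask idea concrete, here is a sketch that builds a rectangular mask as a grayscale PNG using only the standard library. It follows the convention stated for the mask field (black stays unchanged, white is inpainted). The function names are hypothetical, and the sketch assumes the mask should have the same dimensions as the input image, which this guide does not confirm. The API expects a URI, so the resulting bytes would still need to be hosted somewhere reachable.

```python
import struct
import zlib


def _png_chunk(kind: bytes, data: bytes) -> bytes:
    """Frame one PNG chunk: length, type, data, then CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def rectangle_mask_png(width: int, height: int, box: tuple) -> bytes:
    """Return an 8-bit grayscale PNG: white inside box (inpaint), black elsewhere (keep).

    box is (left, top, right, bottom), with right and bottom exclusive.
    """
    left, top, right, bottom = box
    if not (0 <= left < right <= width and 0 <= top < bottom <= height):
        raise ValueError("box must lie inside the image")
    # Each scanline starts with a filter-type byte; 0 means "no filter".
    black_row = b"\x00" + bytes(width)
    mixed_row = b"\x00" + bytes(left) + b"\xff" * (right - left) + bytes(width - right)
    raw = b"".join(mixed_row if top <= y < bottom else black_row for y in range(height))
    # IHDR: width, height, bit depth 8, color type 0 (grayscale), no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _png_chunk(b"IHDR", ihdr)
        + _png_chunk(b"IDAT", zlib.compress(raw))
        + _png_chunk(b"IEND", b"")
    )
```

For example, rectangle_mask_png(1024, 1024, (256, 256, 768, 768)) marks the central square for inpainting and protects the border. An imaging library would be the more usual choice for irregular masks; the standard-library version just keeps the sketch dependency-free.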
Input
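The input schema for this action defines the following fields; only those marked required must be supplied, and the rest are optional or fall back to their defaults: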
- mask (string, required): URI for an input mask where black areas remain unchanged, and white areas are inpainted.
- seed (integer, optional): Seed for random number generation (randomized by default).
- image (string, required): URI of the input image.
- width (integer, default: 1024): Desired width for the output image.
- height (integer, default: 1024): Desired height for the output image.
- prompt (string, default: "An astronaut riding a rainbow unicorn"): Text providing guidance for the generated image.
- loraScale (number, default: 0.6): Scale factor for LoRA application.
- numOutputs (integer, default: 1): Number of images to generate (1 to 4).
- refineMode (string, default: "no_refiner"): Defines the refinement style for image generation.
- refineSteps (integer, optional): Number of refinement steps for the selected refine mode.
- customWeights (string, optional): Path to custom LoRA weights.
- guidanceScale (number, default: 7.5): Scale for classifier-free guidance.
- highNoiseFrac (number, default: 0.8): Fraction of noise applied with expert ensemble refiner.
- applyWatermark (boolean, default: true): Indicates whether to apply a watermark to the generated images.
- negativePrompt (string, optional): Text indicating content to avoid in the image.
- promptStrength (number, default: 0.8): Influence of the text prompt when inpainting.
- schedulingMethod (string, default: "K_EULER"): Specifies the scheduling method for denoising.
- numInferenceSteps (integer, default: 50): Total number of denoising steps during generation.
- disableSafetyChecker (boolean, default: false): Indicates if the safety checker is disabled.
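The field list above can be captured in a small payload builder. The following is a sketch, not part of the API: build_inputs, DEFAULTS, and OPTIONAL_NO_DEFAULT are hypothetical names, and the only checks performed are the ones the list states (mask and image are required strings, numOutputs is between 1 and 4). Default values are copied from the list.

```python
# Defaults copied from the field list; names of these constants are illustrative.
DEFAULTS = {
    "width": 1024,
    "height": 1024,
    "prompt": "An astronaut riding a rainbow unicorn",
    "loraScale": 0.6,
    "numOutputs": 1,
    "refineMode": "no_refiner",
    "guidanceScale": 7.5,
    "highNoiseFrac": 0.8,
    "applyWatermark": True,
    "promptStrength": 0.8,
    "schedulingMethod": "K_EULER",
    "numInferenceSteps": 50,
    "disableSafetyChecker": False,
}
# Optional fields for which the list gives no default value.
OPTIONAL_NO_DEFAULT = {"seed", "refineSteps", "customWeights", "negativePrompt"}


def build_inputs(image: str, mask: str, **overrides) -> dict:
    """Merge overrides onto the documented defaults and run basic checks."""
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_NO_DEFAULT
    if unknown:
        raise ValueError(f"Unknown input fields: {sorted(unknown)}")
    for name, value in (("image", image), ("mask", mask)):
        if not isinstance(value, str) or not value:
            raise ValueError(f"{name} must be a non-empty URI string")
    inputs = {**DEFAULTS, **overrides, "image": image, "mask": mask}
    if not 1 <= inputs["numOutputs"] <= 4:
        raise ValueError("numOutputs must be between 1 and 4")
    return inputs
```

A call such as build_inputs("https://example.com/input.png", "https://example.com/mask.png", width=896, height=1344) returns a complete payload; the example.com URIs are placeholders. Rejecting unknown keys catches misspelled field names, such as num_outputs instead of numOutputs, before a request is sent.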
Example Input
{
  "width": 896,
  "height": 1344,
  "prompt": "children wearing wizard capes, looking at a magnificent castle on a hill, daylight, in the style of Romy",
  "loraScale": 0.6,
  "numOutputs": 1,
  "refineMode": "no_refiner",
  "guidanceScale": 7.5,
  "highNoiseFrac": 0.8,
  "applyWatermark": true,
  "promptStrength": 0.8,
  "schedulingMethod": "K_EULER",
  "numInferenceSteps": 50
}
Output
The action typically returns a list of generated image URLs. For example:
[
  "https://assets.cognitiveactions.com/invocations/3e85f475-21fb-44ee-8008-d1dd9771d99a/0fee59bd-a287-4ce8-82ea-53f1acd86edc.png"
]
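Since the output is a list of URLs, a client usually needs a local file name for each image before saving it. The sketch below derives one from each URL's path. The helper name output_filenames is hypothetical, and it assumes the response body is exactly the list shown above rather than a wrapper object.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def output_filenames(urls: list) -> list:
    """Map each output URL to a local file name taken from its last path segment."""
    names = []
    for index, url in enumerate(urls):
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            raise ValueError(f"Unexpected URL scheme in output {index}: {url!r}")
        # Fall back to a generated name if the URL path has no final segment.
        name = PurePosixPath(parsed.path).name or f"output-{index}.png"
        names.append(name)
    return names
```

Each image can then be fetched with an ordinary HTTP GET and written to the matching file name. Checking the scheme first avoids handing an unexpected value to the download step.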
Conceptual Usage Example (Python)
Here's a conceptual Python code snippet to demonstrate how to call this action:
import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint
action_id = "de8fda00-0d28-4493-a9fa-83d765756e58"  # Action ID for Generate Styled Image with Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "width": 896,
    "height": 1344,
    "prompt": "children wearing wizard capes, looking at a magnificent castle on a hill, daylight, in the style of Romy",
    "loraScale": 0.6,
    "numOutputs": 1,
    "refineMode": "no_refiner",
    "guidanceScale": 7.5,
    "highNoiseFrac": 0.8,
    "applyWatermark": True,  # Python boolean; serialized as JSON true
    "promptStrength": 0.8,
    "schedulingMethod": "K_EULER",
    "numInferenceSteps": 50
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Illustrative value; avoids waiting forever on an unresponsive server
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; show the raw text instead
            print(f"Response body: {e.response.text}")
In this example, replace the placeholder API key and the hypothetical endpoint with your actual values. The action_id identifies the specific action you intend to execute. The payload mirrors the example input above. It does not include the mask and image fields that the schema lists as required, so add URIs for both if the API rejects the request without them.
Conclusion
The guillaumesimon/romy-style Cognitive Action generates Romy-styled images, with inpainting support, from a single JSON payload. Parameters such as promptStrength, guidanceScale, and loraScale let you tune how strongly the prompt and the LoRA style shape the result, so you can create visuals tailored to user inputs.
Explore the possibilities, experiment with different prompts, and elevate your projects with this powerful API! Happy coding!