Transform Your Images with Stable Diffusion Inpainting Cognitive Actions

The Stable Diffusion Inpainting Cognitive Actions, powered by RunwayML’s Stable Diffusion model, let developers perform high-quality image inpainting: given an image, a mask, and a text prompt, the model fills the masked region with photo-realistic content. In this article, we'll explore how to integrate these actions into your applications.
Prerequisites
Before diving into the integration, ensure you have the following:
- An API key for accessing the Cognitive Actions platform.
- Basic knowledge of APIs and JSON structures.
- Familiarity with Python for implementing the example code snippets.
Authentication typically involves passing the API key in the request headers, allowing you to securely access the Cognitive Actions.
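As a minimal sketch of header-based authentication, the helper below reads the key from an environment variable and builds the request headers. The environment variable name and the Bearer scheme are assumptions that mirror the conceptual example later in this article; check the platform's documentation for the exact header format.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers carrying the API key (assumed Bearer scheme)."""
    # Fall back to an environment variable so the key stays out of source code.
    # The variable name is an assumption, not something the platform mandates.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided and COGNITIVE_ACTIONS_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of version control and lets the same code run unchanged in different deployments.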
Cognitive Actions Overview
Perform Image Inpainting
The Perform Image Inpainting action utilizes the Stable Diffusion model to create high-quality inpainted images. By generating images based on a mask and a text prompt, developers can create stunning visuals tailored to specific needs.
- Category: Image Processing
Input
The input schema for this action requires several fields:
- image (required): A URI pointing to the input image for inpainting. The width and height should both be divisible by 8; if they are not, the image is center-cropped.
- mask (required): A URI pointing to a black and white image used as a mask. White pixels will be inpainted, while black pixels will be preserved.
- prompt (optional): A text prompt guiding the image generation (e.g., "a herd of grazing sheep").
- negativePrompt (optional): Aspects to avoid in the image generation.
- guidanceScale (optional): A numeric scale (1-20) influencing how closely the generation follows the prompt (default is 7.5).
- invertMask (optional): If true, black pixels are inpainted, and white pixels are preserved.
- numberOfOutputs (optional): Specifies the number of images to generate (1-4).
- numberOfInferenceSteps (optional): The number of steps used in the denoising process (1-500).
- seed (optional): An integer to initialize the random seed for generating outputs.
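To see what the divisible-by-8 rule for the image field can mean in practice, the sketch below computes a centered crop box whose sides are multiples of 8. The action's exact cropping behavior is not documented here, so this assumes each dimension is rounded down to the nearest multiple and the trimmed pixels are split evenly between the two sides. The function name is hypothetical.

```python
def center_crop_box(width, height, multiple=8):
    """Return (left, top, right, bottom) for a centered crop whose sides are multiples of `multiple`."""
    # Round each dimension down to the nearest multiple.
    new_width = width - (width % multiple)
    new_height = height - (height % multiple)
    if new_width == 0 or new_height == 0:
        raise ValueError("image is smaller than the required multiple")
    # Split the trimmed pixels as evenly as possible between the two sides.
    left = (width - new_width) // 2
    top = (height - new_height) // 2
    return (left, top, left + new_width, top + new_height)


print(center_crop_box(1027, 770))  # (1, 1, 1025, 769)
```

The returned tuple follows the (left, upper, right, lower) box convention used by Pillow's Image.crop, so it could be applied to both the image and the mask before upload. Cropping both yourself is likely the safer way to keep them aligned.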
Example Input:
{
  "mask": "https://replicate.delivery/mgxm/188d0097-6a6f-4488-a058-b0b7a66e5677/desktop-mask.png",
  "image": "https://replicate.delivery/mgxm/f8c9cb3a-8ee8-41a7-9ef6-c65b37acc8af/desktop.png",
  "prompt": "a herd of grazing sheep",
  "guidanceScale": 7.5,
  "numberOfOutputs": 1,
  "numberOfInferenceSteps": 50
}
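Before sending a payload like the one above, it can help to check it locally against the documented constraints. The following is a minimal sketch, not part of the platform: the function name is hypothetical, and it assumes the documented ranges are inclusive.

```python
def validate_inpainting_input(payload):
    """Return a list of problems found in an input payload (empty if none)."""
    problems = []
    # Required fields must be present and non-empty.
    for field in ("image", "mask"):
        if not payload.get(field):
            problems.append(f"missing required field: {field}")
    # Optional numeric fields with their documented ranges (assumed inclusive).
    ranges = {
        "guidanceScale": (1, 20),
        "numberOfOutputs": (1, 4),
        "numberOfInferenceSteps": (1, 500),
    }
    for field, (low, high) in ranges.items():
        if field in payload:
            value = payload[field]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{field} must be a number")
            elif not low <= value <= high:
                problems.append(f"{field} must be between {low} and {high}")
    if "seed" in payload:
        seed = payload["seed"]
        if isinstance(seed, bool) or not isinstance(seed, int):
            problems.append("seed must be an integer")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.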
Output
The output of this action is typically an array of URIs pointing to the generated images.
Example Output:
[
"https://assets.cognitiveactions.com/invocations/d45768b8-aa4e-4587-af2c-a0b7b3f79d9c/06792286-67a6-432b-8b32-d8ff6ef07202.png"
]
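Since the result is a list of URIs, a natural next step is saving the images locally. The sketch below uses only the Python standard library and assumes the output URLs can be fetched without additional authentication, which may not hold for every deployment. Both function names are hypothetical.

```python
import os
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url):
    """Use the last path segment of the URL as the local file name."""
    return os.path.basename(urlparse(url).path)


def download_outputs(urls, directory="."):
    """Download each generated image and return the local file paths."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for url in urls:
        path = os.path.join(directory, filename_from_url(url))
        urllib.request.urlretrieve(url, path)  # Simple blocking download
        paths.append(path)
    return paths
```

Calling download_outputs(result, "outputs") on the array returned by the action would write one file per generated image into an outputs directory.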
Conceptual Usage Example (Python)
Here's a conceptual example of how you might call the Perform Image Inpainting action using Python. This snippet demonstrates how to structure the input payload correctly and handle the API response.
import json

import requests

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "a8c9313c-f708-4f96-a32e-4ce01ae18659"  # Action ID for Perform Image Inpainting

# Construct the input payload based on the action's requirements
payload = {
    "mask": "https://replicate.delivery/mgxm/188d0097-6a6f-4488-a058-b0b7a66e5677/desktop-mask.png",
    "image": "https://replicate.delivery/mgxm/f8c9cb3a-8ee8-41a7-9ef6-c65b37acc8af/desktop.png",
    "prompt": "a herd of grazing sheep",
    "guidanceScale": 7.5,
    "numberOfOutputs": 1,
    "numberOfInferenceSteps": 50,
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Avoid hanging indefinitely; adjust to your needs
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")
In this code snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The action_id variable corresponds to the Perform Image Inpainting action. The payload is structured as required by the action's input schema.
Conclusion
The Stable Diffusion Inpainting Cognitive Actions open a world of possibilities for image processing in your applications. By harnessing the power of AI-driven image inpainting, you can enhance creative projects, automate image editing, and much more. Start experimenting with these actions today, and see how they can transform your images into stunning visual experiences!