Enhance Your Images with Stability AI's Inpainting Cognitive Actions

23 Apr 2025

In the realm of image processing, the ability to manipulate and enhance images is more important than ever. The stability-ai/stable-diffusion-inpainting API offers a powerful toolset to developers looking to fill in masked areas of images using advanced inpainting techniques. By leveraging the Stable Diffusion model, these Cognitive Actions provide a seamless way to enhance visual content, making it ideal for various applications such as art creation, restoration, and modification. Let's dive into how you can integrate these capabilities into your applications.

Prerequisites

Before you get started, ensure that you have:

  • An API key for the Cognitive Actions platform, which you will use to authenticate your requests.
  • Basic knowledge of JSON for structuring your input and output data.

Authentication typically involves passing your API key in the headers of your HTTP requests, allowing you to securely access the Cognitive Actions.
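As a quick illustration, those headers can be assembled in a small helper. This is a sketch assuming the Bearer-token scheme used in the conceptual example later in this post; confirm the exact header format in the platform documentation:

```python
def build_headers(api_key: str) -> dict:
    """Build HTTP headers for a Cognitive Actions request.

    Assumes a Bearer-token scheme and JSON request bodies, as in the
    conceptual example later in this post.
    """
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

headers = build_headers("YOUR_COGNITIVE_ACTIONS_API_KEY")
```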

Cognitive Actions Overview

Fill Masked Image Regions

Description:
This action enhances images by filling in masked areas using the Stable Diffusion inpainting model. By utilizing a diffusion-based text-to-image generation model, it incorporates advanced training steps and mask-generation strategies to produce high-quality results.

Category: Image Processing

Input: The input schema requires several fields, with some being optional. Here are the primary fields needed to invoke this action:

  • image (required): URI of the initial image whose masked regions will be filled.
    Example:
    https://replicate.delivery/pbxt/HtGQBfA5TrqFYZBf0UL18NTqHrzt8UiSIsAkUuMHtjvFDO6p/overture-creations-5sI6fQgYIuo.png
  • mask (required): URI of a black and white image used as a mask for inpainting. White pixels will be inpainted, while black pixels will remain unchanged.
    Example:
    https://replicate.delivery/pbxt/HtGQBqO9MtVbPm0G0K43nsvvjBB0E0PaWOhuNRrRBBT4ttbf/mask.png
  • prompt (optional): A text description of the desired image appearance. The default is "a vision of paradise. unreal engine."
    Example:
    Face of a yellow cat, high resolution, sitting on a park bench
  • guidanceScale (optional): A numeric scale for classifier-free guidance, ranging from 1 to 20 (default: 7.5).
  • numberOfOutputs (optional): Specifies the number of images to generate (default: 1, maximum: 4).
  • numberOfInferenceSteps (optional): Specifies the number of denoising steps (default: 50, range: 1-500).
  • width (optional): Specifies the width of the generated image in pixels (default: 512). Must be a multiple of 64.
  • height (optional): Specifies the height of the generated image in pixels (default: 512). Must be a multiple of 64.
  • Additional fields for customization include negativePrompt, scheduler, and disableSafetyChecker.
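Because the schema constrains several numeric fields (guidanceScale 1–20, at most 4 outputs, 1–500 inference steps, dimensions in multiples of 64), it can be worth validating a payload client-side before spending an API call. The sketch below checks only the constraints listed above; the field names come from this post's schema, and the checks are illustrative rather than the platform's own validation:

```python
def validate_inpainting_inputs(payload: dict) -> list:
    """Return a list of constraint violations (empty if the payload looks valid).

    Checks only the ranges documented above: guidanceScale 1-20,
    numberOfOutputs 1-4, numberOfInferenceSteps 1-500, and width/height
    as multiples of 64.
    """
    errors = []
    for field in ("image", "mask"):
        if not payload.get(field):
            errors.append(f"{field} is required")
    if not 1 <= payload.get("guidanceScale", 7.5) <= 20:
        errors.append("guidanceScale must be between 1 and 20")
    if not 1 <= payload.get("numberOfOutputs", 1) <= 4:
        errors.append("numberOfOutputs must be between 1 and 4")
    if not 1 <= payload.get("numberOfInferenceSteps", 50) <= 500:
        errors.append("numberOfInferenceSteps must be between 1 and 500")
    for dim in ("width", "height"):
        if payload.get(dim, 512) % 64 != 0:
            errors.append(f"{dim} must be a multiple of 64")
    return errors

# A payload missing its mask, with an out-of-range guidance scale:
print(validate_inpainting_inputs({
    "image": "https://example.com/photo.png",
    "guidanceScale": 25,
}))
```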

Example Input:

{
  "mask": "https://replicate.delivery/pbxt/HtGQBqO9MtVbPm0G0K43nsvvjBB0E0PaWOhuNRrRBBT4ttbf/mask.png",
  "image": "https://replicate.delivery/pbxt/HtGQBfA5TrqFYZBf0UL18NTqHrzt8UiSIsAkUuMHtjvFDO6p/overture-creations-5sI6fQgYIuo.png",
  "prompt": "Face of a yellow cat, high resolution, sitting on a park bench",
  "guidanceScale": 7.5,
  "numberOfOutputs": 1,
  "numberOfInferenceSteps": 25
}

Output: The action typically returns URLs of the generated images. Here’s an example output:

[
  "https://assets.cognitiveactions.com/invocations/0dcf54f3-2f9c-4785-b46b-7e7f88b470a3/6f3c2f73-32bf-461d-9d92-a67ae83a2cef.png"
]
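Since the action returns a list of image URLs rather than image bytes, a common follow-up step is deriving local filenames from those URLs before downloading them. A minimal standard-library sketch, assuming result URLs shaped like the example output above:

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

def local_filename(url: str) -> str:
    """Extract the final path segment of a result URL for use as a filename."""
    return PurePosixPath(urlparse(url).path).name

urls = [
    "https://assets.cognitiveactions.com/invocations/0dcf54f3-2f9c-4785-b46b-7e7f88b470a3/6f3c2f73-32bf-461d-9d92-a67ae83a2cef.png"
]
print([local_filename(u) for u in urls])
# → ['6f3c2f73-32bf-461d-9d92-a67ae83a2cef.png']
```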

Conceptual Usage Example (Python): Here’s how you can integrate the action into your application using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "62bf6188-43f1-4faa-98a3-637b71a939f7" # Action ID for Fill Masked Image Regions

# Construct the input payload based on the action's requirements
payload = {
    "mask": "https://replicate.delivery/pbxt/HtGQBqO9MtVbPm0G0K43nsvvjBB0E0PaWOhuNRrRBBT4ttbf/mask.png",
    "image": "https://replicate.delivery/pbxt/HtGQBfA5TrqFYZBf0UL18NTqHrzt8UiSIsAkUuMHtjvFDO6p/overture-creations-5sI6fQgYIuo.png",
    "prompt": "Face of a yellow cat, high resolution, sitting on a park bench",
    "guidanceScale": 7.5,
    "numberOfOutputs": 1,
    "numberOfInferenceSteps": 25
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload} # Hypothetical structure
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # body was not valid JSON
            print(f"Response body: {e.response.text}")

In this code snippet, replace the API key and endpoint with your actual values. The input payload is structured according to the action's input requirements. This example demonstrates how to call the API and handle potential errors.

Conclusion

Integrating Stability AI's inpainting capabilities into your applications can significantly enhance the quality and creativity of your image processing workflows. By leveraging the Fill Masked Image Regions action, developers can create stunning visual outputs that meet specific needs. Explore various use cases, from digital art creation to image restoration, and start experimenting with the power of Cognitive Actions today!