Create Stunning Images with Inpainting from Text Prompts

23 Apr 2025

The Visual Library offers Cognitive Actions that let developers generate images through inpainting. Driven by text prompts, the service simplifies the creation of custom visuals and exposes options for image refinement, scheduling algorithm selection, and watermark application. Images tailored to specific requirements can enhance user experiences in marketing, gaming, and artistic applications.

Imagine being able to create unique artwork or product visuals with just a few lines of code. This capability opens up a world of possibilities for developers looking to integrate dynamic image generation into their projects. Whether you're designing a game character, crafting promotional materials, or simply experimenting with creative ideas, the Visual Library's inpainting feature can help you bring your vision to life.

Prerequisites

To get started, you'll need a Cognitive Actions API key, as well as a basic understanding of making API calls. This will enable you to access the inpainting capabilities effectively.
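
One way to handle the key safely is to read it from an environment variable instead of hard-coding it. The sketch below assumes a variable named COGNITIVE_ACTIONS_API_KEY and Bearer-token authorization; both names are illustrative assumptions, so check them against the Cognitive Actions documentation.

```python
import os


def load_api_key(env_var="COGNITIVE_ACTIONS_API_KEY"):
    """Read the API key from the environment (the variable name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before calling the API")
    return key


def build_headers(api_key):
    """Build request headers, assuming the API expects a Bearer token."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Calling build_headers(load_api_key()) then yields a dictionary that can be passed to an HTTP client such as requests.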

Generate Image with Inpainting

The Generate Image with Inpainting action creates images based on provided text prompts while allowing for specific areas of the image to be preserved or altered through inpainting techniques. This action is particularly valuable for developers who want to create customized images that reflect specific themes or concepts.
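
The action's mask input treats black areas as preserved and white areas as inpainted. As a minimal sketch of that convention, the function below writes a grayscale PNG that is black everywhere except for one white rectangle, using only the Python standard library. The function name, the rectangle-based interface, and the assumption that the service accepts an 8-bit grayscale PNG are illustrative and not taken from the Visual Library documentation.

```python
import struct
import zlib


def _png_chunk(kind, data):
    """Assemble one PNG chunk: length, type, data, CRC over type and data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def make_rect_mask_png(width, height, box):
    """Return PNG bytes for a mask: white inside box (inpainted), black elsewhere (preserved).

    box is (left, top, right, bottom) in pixels, with right and bottom exclusive.
    """
    left, top, right, bottom = box
    if not (0 <= left <= right <= width and 0 <= top <= bottom <= height):
        raise ValueError("box must lie inside the image")

    black_row = b"\x00" + b"\x00" * width  # leading byte is PNG filter type 0
    white_row = (
        b"\x00"
        + b"\x00" * left
        + b"\xff" * (right - left)
        + b"\x00" * (width - right)
    )
    rows = b"".join(white_row if top <= y < bottom else black_row for y in range(height))

    # IHDR fields: width, height, bit depth 8, color type 0 (grayscale),
    # compression 0, filter 0, no interlacing
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _png_chunk(b"IHDR", ihdr)
        + _png_chunk(b"IDAT", zlib.compress(rows))
        + _png_chunk(b"IEND", b"")
    )
```

For example, make_rect_mask_png(1024, 1024, (256, 256, 768, 768)) marks the central square for inpainting. Because the action expects a URI, the resulting file still has to be hosted somewhere the service can reach.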

Input Requirements

To use this action, you will need to provide the following inputs:

  • mask: A URI for the inpainting mask, where black areas are preserved and white areas are inpainted.
  • image: The URI of the input image for inpainting or img2img mode.
  • prompt: A text prompt that describes what the image should depict.
  • width: The desired width of the output image (default is 1024).
  • height: The desired height of the output image (default is 1024).
  • numberOfOutputs: How many images to generate (between 1 and 4).
  • Additional parameters like denoisingSteps, refinementStyle, and applyWatermark can also be specified to customize the output further.
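
The inputs listed above can be assembled and checked before a request is sent. The helper below is a sketch under stated assumptions: the function name is hypothetical, the restriction to http and https URIs is an assumption, and the only limits it enforces are the ones given in the list (numberOfOutputs between 1 and 4, width and height defaulting to 1024).

```python
from urllib.parse import urlparse


def build_inpainting_inputs(prompt, image, mask, width=1024, height=1024,
                            number_of_outputs=1, **extra):
    """Assemble the action inputs, validating the limits named in the input list."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")

    # Assumption: the service fetches image and mask over http(s)
    for name, uri in (("image", image), ("mask", mask)):
        parsed = urlparse(uri)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"{name} must be an http(s) URI")

    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if not 1 <= number_of_outputs <= 4:
        raise ValueError("numberOfOutputs must be between 1 and 4")

    inputs = {
        "prompt": prompt,
        "image": image,
        "mask": mask,
        "width": width,
        "height": height,
        "numberOfOutputs": number_of_outputs,
    }
    # Optional parameters such as denoisingSteps, refinementStyle, or applyWatermark
    inputs.update(extra)
    return inputs
```

A call such as build_inpainting_inputs("a man is dancing", "https://example.com/input.png", "https://example.com/mask.png", applyWatermark=True) returns a dictionary ready to be used as the action's input payload; the example.com URIs are placeholders.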

Expected Output

The action returns a URI that points to the generated image. For example, after submitting a request, you might receive an output like:

https://assets.cognitiveactions.com/invocations/8c9b9a59-3b96-4380-baf8-922f2c913618/f0178f38-8b0b-4689-9ce7-004e9aee552a.png
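
Since the exact shape of the response is not documented here, one cautious way to pick up the generated image links is to search the decoded JSON for strings that look like image URIs. The sketch below does that and derives a local file name from each URI. The function names and the list of file extensions are assumptions made for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed set of image extensions the service might return
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".webp")


def find_image_uris(result):
    """Collect every http(s) string ending in an image extension from nested JSON data."""
    found = []

    def walk(node):
        if isinstance(node, str):
            path = urlparse(node).path.lower()
            if node.startswith(("http://", "https://")) and path.endswith(IMAGE_SUFFIXES):
                found.append(node)
        elif isinstance(node, dict):
            for value in node.values():
                walk(value)
        elif isinstance(node, (list, tuple)):
            for value in node:
                walk(value)

    walk(result)
    return found


def local_filename(uri):
    """Return the last path segment of the URI, usable as a local file name."""
    return PurePosixPath(urlparse(uri).path).name
```

Each URI returned by find_image_uris can then be downloaded with an HTTP client and saved under local_filename(uri).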

Use Cases for this Specific Action

  • Creative Design: Artists and designers can leverage this action to quickly generate visuals for projects, allowing for rapid prototyping and experimentation.
  • Marketing Materials: Businesses can create unique promotional images tailored to specific campaigns or themes, enhancing their brand's visual storytelling.
  • Game Development: Developers can generate character designs or backgrounds dynamically based on gameplay context, enriching the user experience.
  • Personalized Content: Users can create custom images for social media or personal projects, reflecting their individual tastes and styles.

The following Python script shows one way to call this action:

import requests
import json

# Replace with your actual Cognitive Actions API key and endpoint
# Ensure your environment securely handles the API key
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
# This endpoint URL is hypothetical; use the one given in the Cognitive Actions documentation
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"

action_id = "1e80d88a-e1a5-4728-852b-eb050e1cf403" # Action ID for: Generate Image with Inpainting

# Construct the input payload based on the action's requirements.
# This example uses the action's predefined example input, which omits "mask" and "image";
# add both URIs to run an actual inpainting request.
payload = {
  "width": 1024,
  "height": 1024,
  "prompt": "In the colors and style of TOK, a man is dancing",
  "adversePrompt": "",
  "applyWatermark": True,
  "denoisingSteps": 50,
  "numberOfOutputs": 1,
  "promptIntensity": 0.7,
  "refinementStyle": "no_refiner",
  "guidanceIntensity": 7.5,
  "highNoiseFraction": 0.72,
  "schedulingAlgorithm": "K_EULER",
  "loraAdjustmentFactor": 0.7
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
    # Add any other required headers for the Cognitive Actions API
}

# Prepare the request body for the hypothetical execution endpoint
request_body = {
    "action_id": action_id,
    "inputs": payload
}

print("--- Calling Cognitive Action: Generate Image with Inpainting ---")
print(f"Endpoint: {COGNITIVE_ACTIONS_EXECUTE_URL}")
print(f"Action ID: {action_id}")
print("Payload being sent:")
print(json.dumps(request_body, indent=2))
print("------------------------------------------------")

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json=request_body
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully. Result:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # raised when the error body is not valid JSON
            print(f"Response body (non-JSON): {e.response.text}")
    print("------------------------------------------------")

Conclusion

The Visual Library's inpainting capabilities empower developers to generate stunning images directly from text prompts, offering flexibility and creativity in visual content creation. With the ability to customize various parameters, such as the number of outputs and refinement styles, the action can be tailored to meet diverse project needs. As you explore these features, consider how you can integrate them into your applications to enhance user engagement and deliver personalized experiences. Start experimenting with image generation today and unlock the potential of visual creativity!