Seamlessly Merge Images with ControlNet Using the fofr/image-merger Actions

25 Apr 2025
The fofr/image-merger Cognitive Actions let you add image merging to your applications through an API. The action merges images with optional ControlNet guidance, several merge modes, and options for animation and upscaling. Because these actions are pre-built, you can produce high-quality merged images without extensive image processing expertise.

Prerequisites

Before you start using the Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform, which you will use for authentication.
  • Basic experience with making API calls and JSON payload handling.

To authenticate your requests, you will typically pass your API key in the request headers as follows:

Authorization: Bearer YOUR_COGNITIVE_ACTIONS_API_KEY
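As a minimal sketch of the header above, the helper below builds the request headers in Python. The function name `build_headers` and the environment variable `COGNITIVE_ACTIONS_API_KEY` are assumptions made for illustration; the platform may expect the key to be supplied differently.

```python
import os
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers carrying the API key as a Bearer token.

    Falls back to the COGNITIVE_ACTIONS_API_KEY environment variable
    (a hypothetical name chosen for this sketch) when no key is passed.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which is usually preferable to hard-coding it.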

Cognitive Actions Overview

Merge Images with ControlNet

Description: This action merges two images while optionally allowing a third image to guide the merging process using ControlNet. It supports multiple merge modes and offers features for animation and upscaling.

Category: Image Processing

Input

The input for this action requires two images and offers various optional parameters to customize the merging process. Below is the schema for the input:

  • imageOne (required): URI of the first image.
  • imageTwo (required): URI of the second image.
  • seed (optional): An integer to fix the random seed for reproducibility.
  • steps (optional): The number of diffusion steps, default is 20.
  • width (optional): Width of the generated image in pixels, default is 768.
  • height (optional): Height of the generated image in pixels, default is 768.
  • prompt (optional): A textual prompt to guide the merging style.
  • animate (optional): If true, produces an animation; default is false.
  • mergeMode (optional): Specifies the merging mode; default is 'full'.
  • controlImage (optional): URI of an optional control image.
  • upscaleSteps (optional): Number of upscale steps; default is 20.
  • animateFrames (optional): Total frames for animation; default is 24.
  • negativePrompt (optional): Elements to avoid in the merged image.
  • upscaleTwoTimes (optional): If true, doubles the resolution of the final image.
  • imageOneStrength (optional): Influence of the first image, range 0 to 1.
  • imageTwoStrength (optional): Influence of the second image, range 0 to 1.
  • returnTemporaryFiles (optional): If true, returns intermediate temporary files; default is false.
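The constraints in the list above can be checked on the client before sending a request. The following is a minimal sketch, not part of the API: the helper name `validate_merge_input` is hypothetical, and it only checks the two required fields and the documented 0 to 1 range for the strength parameters.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ("imageOne", "imageTwo")
STRENGTH_FIELDS = ("imageOneStrength", "imageTwoStrength")


def validate_merge_input(payload: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the payload (empty if none)."""
    problems = []
    # Both source images are required.
    for name in REQUIRED_FIELDS:
        if not payload.get(name):
            problems.append(f"missing required field: {name}")
    # Strengths are optional, but must be numbers in the range 0 to 1.
    for name in STRENGTH_FIELDS:
        if name in payload:
            value = payload[name]
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not 0 <= value <= 1:
                problems.append(f"{name} must be a number between 0 and 1")
    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem at once.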

Example Input:

{
  "steps": 20,
  "width": 768,
  "height": 768,
  "prompt": "an svg illustration, sharp, solid color, thick outline",
  "animate": false,
  "imageOne": "https://replicate.delivery/pbxt/KLpMSbIo0rCeITgKcB6CPTsfUbSquTptlLHOR7SyDBiaUBUS/0_2.webp",
  "imageTwo": "https://replicate.delivery/pbxt/KLpMTQ754bUSZlPnrYog5JFI0mRGVoXAkQSlPk1yfHssW532/0_2-1.webp",
  "mergeMode": "left_right",
  "controlImage": "https://replicate.delivery/pbxt/KLpMSa1lK4SMNxrhDnXFkk6BYkpIZVVXg3WrQIlLPCUn4Uaw/0_3.webp",
  "upscaleSteps": 20,
  "animateFrames": 24,
  "negativePrompt": "garish, soft, ugly, broken, distorted",
  "upscaleTwoTimes": true,
  "imageOneStrength": 1,
  "imageTwoStrength": 1,
  "returnTemporaryFiles": false
}

Output

The action typically returns a JSON array containing the URL of the merged image. Here is an example of the expected output:

[
  "https://assets.cognitiveactions.com/invocations/1f9a3283-6703-4f3f-90d1-6285c42fed5a/2a91b260-cb99-4579-b2f7-5bb5e4db27d3.png"
]
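To work with an output shaped like the example above, a small amount of parsing is enough. The sketch below assumes the output has already been decoded into a Python list; the helper names `merged_image_urls` and `filename_from_url` are hypothetical, and any envelope the platform wraps around the list is not handled here.

```python
import posixpath
from typing import List
from urllib.parse import urlparse


def merged_image_urls(output: object) -> List[str]:
    """Return the HTTP(S) URL strings found in the output list."""
    if not isinstance(output, list):
        raise TypeError("expected a list of URLs")
    urls = [
        item
        for item in output
        if isinstance(item, str) and urlparse(item).scheme in ("http", "https")
    ]
    if not urls:
        raise ValueError("output contained no HTTP(S) URLs")
    return urls


def filename_from_url(url: str) -> str:
    """Return the last path segment of the URL, e.g. for saving locally."""
    return posixpath.basename(urlparse(url).path)
```

The file name derived this way can serve as a local name when saving the merged image.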

Conceptual Usage Example (Python)

Here's a conceptual example of how to call the Merge Images with ControlNet action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "b866f2ca-210d-413c-b549-6d4c1ecda170"  # Action ID for Merge Images with ControlNet

# Construct the input payload based on the action's requirements
payload = {
    "steps": 20,
    "width": 768,
    "height": 768,
    "prompt": "an svg illustration, sharp, solid color, thick outline",
    "animate": False,  # Python booleans are capitalized, unlike JSON
    "imageOne": "https://replicate.delivery/pbxt/KLpMSbIo0rCeITgKcB6CPTsfUbSquTptlLHOR7SyDBiaUBUS/0_2.webp",
    "imageTwo": "https://replicate.delivery/pbxt/KLpMTQ754bUSZlPnrYog5JFI0mRGVoXAkQSlPk1yfHssW532/0_2-1.webp",
    "mergeMode": "left_right",
    "controlImage": "https://replicate.delivery/pbxt/KLpMSa1lK4SMNxrhDnXFkk6BYkpIZVVXg3WrQIlLPCUn4Uaw/0_3.webp",
    "upscaleSteps": 20,
    "animateFrames": 24,
    "negativePrompt": "garish, soft, ugly, broken, distorted",
    "upscaleTwoTimes": True,
    "imageOneStrength": 1,
    "imageTwoStrength": 1,
    "returnTemporaryFiles": False
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # JSON decode errors raised by requests subclass ValueError
            print(f"Response body: {e.response.text}")

In this Python snippet, replace the API key placeholder with your own key. The payload mirrors the example input shown earlier, and the action ID identifies the Merge Images with ControlNet action. Note that the endpoint URL and the exact request structure provided here are illustrative.

Conclusion

The fofr/image-merger Cognitive Action lets developers merge images with a single API call. With ControlNet guidance, customizable merge modes, and options for animation and upscaling, it offers a straightforward way to bring advanced image processing into your application. Start exploring these capabilities today and consider how they can enhance your projects!