Enhance Your Images with ControlNet Preprocessors: A Developer's Guide

22 Apr 2025

Image preprocessing underpins a wide range of applications, from web development to machine learning. ControlNet Preprocessors is a suite of Cognitive Actions that runs preprocessing techniques on images, so developers can integrate these capabilities into their applications without implementing each technique themselves. This guide covers the "Perform Image Preprocessing" action, which applies a set of preprocessing techniques to an image and returns the extracted features as new images.

Prerequisites

Before you dive into using the ControlNet Preprocessors Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform, which will authenticate your requests.
  • Internet access, and a URI for each image you wish to preprocess.

Authentication generally involves passing your API key in the request headers, allowing you to securely access the actions.
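
As a minimal sketch of that step, the helper below reads the key from an environment variable instead of hard-coding it. The function name build_headers and the variable name COGNITIVE_ACTIONS_API_KEY are illustrative assumptions. The Bearer scheme matches the conceptual example later in this guide; the header format the platform actually expects is not confirmed here.

```python
import os


def build_headers(env_var="COGNITIVE_ACTIONS_API_KEY"):
    """Build request headers from an API key stored in the environment.

    Hypothetical helper: the environment variable name and the Bearer
    scheme are assumptions, not confirmed platform behavior.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Keeping the key out of source code means it does not end up in version control.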

Cognitive Actions Overview

Perform Image Preprocessing

The Perform Image Preprocessing action runs a set of preprocessing techniques on a single image, including Canny edge detection, Holistically-Nested Edge Detection (HED), face detection, and more. Each technique extracts a different kind of feature from the image, such as edges, line segments, depth, or pose.

Input:

The input schema accepts the following fields; only imageUri is required:

  • imageUri (required): A valid URI pointing to the image you wish to preprocess.
  • hedDetection (optional): A boolean flag to run HED detection. Default is true.
  • samDetection (optional): A boolean flag to run SAM detection. Default is true.
  • faceDetection (optional): A boolean flag to run face detection. Default is true.
  • mlsdDetection (optional): A boolean flag to run M-LSD (Mobile Line Segment Detection). Default is true.
  • leresDetection (optional): A boolean flag to run LeReS detection. Default is true.
  • midasDetection (optional): A boolean flag to run MiDaS detection. Default is true.
  • lineArtDetection (optional): A boolean flag to run line art detection. Default is true.
  • pidiNetDetection (optional): A boolean flag to run PidiNet detection. Default is true.
  • openPoseDetection (optional): A boolean flag to run OpenPose detection. Default is true.
  • cannyEdgeDetection (optional): A boolean flag to run Canny edge detection. Default is true.
  • normalBaeDetection (optional): A boolean flag to run NormalBae detection. Default is true.
  • lineArtAnimeDetection (optional): A boolean flag to run line art anime detection. Default is true.
  • contentShuffleDetection (optional): A boolean flag to run content shuffle detection. Default is true.
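
Because every detection flag defaults to true, a request that sets only imageUri would run all thirteen preprocessors. The sketch below shows one way to enable just the ones you need by writing every flag out explicitly. build_payload is a hypothetical helper, the image URI is a placeholder, and the sketch assumes the action honors explicit false values, which this guide does not confirm.

```python
# Flag names copied from the input schema above.
DETECTION_FLAGS = (
    "hedDetection",
    "samDetection",
    "faceDetection",
    "mlsdDetection",
    "leresDetection",
    "midasDetection",
    "lineArtDetection",
    "pidiNetDetection",
    "openPoseDetection",
    "cannyEdgeDetection",
    "normalBaeDetection",
    "lineArtAnimeDetection",
    "contentShuffleDetection",
)


def build_payload(image_uri, enabled):
    """Return an input payload that enables only the flags named in `enabled`.

    Hypothetical helper: every flag is set explicitly, so the documented
    default of true never applies by accident.
    """
    if not image_uri:
        raise ValueError("imageUri is required")
    unknown = set(enabled) - set(DETECTION_FLAGS)
    if unknown:
        raise ValueError(f"Unknown detection flags: {sorted(unknown)}")
    payload = {"imageUri": image_uri}
    for flag in DETECTION_FLAGS:
        payload[flag] = flag in enabled
    return payload


payload = build_payload(
    "https://example.com/photo.jpeg",  # placeholder image URI
    enabled={"cannyEdgeDetection", "openPoseDetection"},
)
```

Passing a flag name that is not in the schema raises ValueError before any request is sent.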

Example Input:

{
  "imageUri": "https://replicate.delivery/pbxt/JrCCMJ7WAdjQSIkZUQrUkOmyDeFhkRkgQgH1S3QvryX8Iypg/IMG_7845.jpeg",
  "hedDetection": true,
  "samDetection": true,
  "faceDetection": true,
  "mlsdDetection": true,
  "leresDetection": true,
  "midasDetection": true,
  "lineArtDetection": true,
  "pidiNetDetection": true,
  "openPoseDetection": true,
  "cannyEdgeDetection": true,
  "normalBaeDetection": true,
  "lineArtAnimeDetection": true,
  "contentShuffleDetection": true
}

Output:

Upon successful execution, the action returns an array of URLs pointing to the processed images. These images represent various preprocessing results based on the specified flags.

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/6999dac9-55d0-4e11-bb10-19401a21c2fc/3bb7630f-eae8-43c5-b96f-0847dca96f3f.png",
  "https://assets.cognitiveactions.com/invocations/6999dac9-55d0-4e11-bb10-19401a21c2fc/12a27f98-4423-4541-9f92-e79ba949fee2.png",
  ...
]
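
Since the result is a plain list of URLs, saving the processed images locally is a short loop. The sketch below uses only the Python standard library. It assumes each URL can be fetched with an unauthenticated GET request, which this guide does not state. The guide also does not say how the order of results maps to the input flags, so files are named after the last path segment of each URL rather than after a preprocessor. local_name and download_all are hypothetical helper names.

```python
import os
import shutil
from urllib.parse import urlparse
from urllib.request import urlopen


def local_name(url):
    """Use the last path segment of the URL as the local file name."""
    return os.path.basename(urlparse(url).path)


def download_all(urls, out_dir="preprocessed"):
    """Download every result URL into out_dir and return the saved paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(out_dir, local_name(url))
        # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
        with urlopen(url, timeout=60) as response, open(path, "wb") as fh:
            shutil.copyfileobj(response, fh)
        saved.append(path)
    return saved
```

Calling download_all with the list returned by the action would write one file per URL into the preprocessed directory.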

Conceptual Usage Example (Python):

Here’s a conceptual Python snippet demonstrating how to invoke the Perform Image Preprocessing action using a generic Cognitive Actions endpoint:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "ccd80c1b-bc8f-4677-9342-e247f351e4b1"  # Action ID for Perform Image Preprocessing

# Construct the input payload based on the action's requirements
payload = {
    "imageUri": "https://replicate.delivery/pbxt/JrCCMJ7WAdjQSIkZUQrUkOmyDeFhkRkgQgH1S3QvryX8Iypg/IMG_7845.jpeg",
    "hedDetection": True,
    "samDetection": True,
    "faceDetection": True,
    "mlsdDetection": True,
    "leresDetection": True,
    "midasDetection": True,
    "lineArtDetection": True,
    "pidiNetDetection": True,
    "openPoseDetection": True,
    "cannyEdgeDetection": True,
    "normalBaeDetection": True,
    "lineArtAnimeDetection": True,
    "contentShuffleDetection": True
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=30,  # Fail instead of waiting indefinitely; adjust for long-running jobs
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body was not valid JSON; covers every JSON decode error requests can raise
            print(f"Response body: {e.response.text}")

In this snippet, replace the values of COGNITIVE_ACTIONS_API_KEY and COGNITIVE_ACTIONS_EXECUTE_URL with your own. The endpoint URL and the request body structure are hypothetical, so check the platform documentation for the actual values. The action_id variable identifies the action to execute, and the payload variable holds the inputs that action expects.

Conclusion

The ControlNet Preprocessors give developers a broad set of image preprocessing capabilities. With the Perform Image Preprocessing action, you can run techniques such as edge detection and face detection through a single API call and use the resulting images in your own applications.