Detect and Visualize Objects in Images with Adirik/Codet Cognitive Actions

21 Apr 2025

The Adirik/Codet Cognitive Actions let you add object detection to your applications through the CoDet model, which is designed for image analysis and object identification. By using these pre-built actions, you can save development time while adding detection and visualization features to your application.

Prerequisites

Before you start using the Adirik/Codet Cognitive Actions, ensure that you have the following:

  • An API key for the Cognitive Actions platform, which will be required for authentication.
  • Basic familiarity with making HTTP requests and handling JSON data in your programming environment.

Authentication typically involves passing your API key in the request headers to gain access to the available actions.
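
As a minimal sketch of that step, the helper below reads the key from an environment variable instead of hard-coding it. The variable name COGNITIVE_ACTIONS_API_KEY and the Bearer scheme follow the conceptual example later in this article; both are assumptions, so check the platform's documentation for the exact header format.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers carrying the API key.

    Assumption: the platform accepts a Bearer token in the Authorization
    header. The environment variable name is hypothetical.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided; set COGNITIVE_ACTIONS_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Keeping the key out of source code means it is less likely to end up in version control or logs.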

Cognitive Actions Overview

Detect Objects in Image

The Detect Objects in Image action uses the CoDet model, trained on the LVIS dataset, to detect and identify objects within an image. For each detection it returns a bounding box, class ID, class name, and confidence score. The underlying CoDet model supports open-vocabulary training with image-caption pairs; the action itself exposes detection only, through the inputs listed below.

Input

The input for this action is structured as follows:

  • image (string, required): A URI string pointing to the input image to be processed.
  • confidence (number, optional): A numeric threshold for filtering detections, ranging from 0 to 1, with a default value of 0.5.
  • showVisualization (boolean, optional): A flag to indicate whether to plot and display detection results on the input image (default is true).

Example Input:

{
  "image": "https://replicate.delivery/pbxt/JtQteKssPB1rIll144OqsX4FDuQ2pih89jOAE8MZ3QnPx8Dw/1.jpeg",
  "confidence": 0.5,
  "showVisualization": true
}
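
The input rules above can be checked on the client before a request is sent. The following is a small sketch: the field names and defaults come from the input description, while the function name build_detection_input and its validation rules are this example's own assumptions, not part of the platform.

```python
from urllib.parse import urlparse


def build_detection_input(image, confidence=0.5, show_visualization=True):
    """Build the input payload for Detect Objects in Image.

    Field names and defaults follow the input description above;
    the validation rules are this sketch's own.
    """
    # A URI needs a scheme such as https; a bare file name has none.
    if not urlparse(image).scheme:
        raise ValueError(f"image must be a URI with a scheme, got {image!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise TypeError("confidence must be a number")
    if not 0 <= confidence <= 1:
        raise ValueError("confidence must be between 0 and 1")
    return {
        "image": image,
        "confidence": float(confidence),
        "showVisualization": bool(show_visualization),
    }
```

Calling build_detection_input("https://example.com/cat.jpg", 0.3) would produce a payload with a stricter threshold than the default, while a value such as 1.5 is rejected locally instead of costing a round trip.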

Output

This action typically returns a list of URLs pointing to the following files:

  • A JSON file containing details about detected objects, including bounding boxes and confidence scores.
  • A PNG image showing the detected objects drawn on the original image.

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/91cc4ca9-927e-4fc0-a789-3b37b2d855a6/314e02be-df1e-4efc-8d60-09107b0defe6.json",
  "https://assets.cognitiveactions.com/invocations/91cc4ca9-927e-4fc0-a789-3b37b2d855a6/141db03a-d63f-43c7-8308-30da014dd9c6.png"
]
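
Because the output is a flat list of URLs, client code has to work out which entry is which. One way to do that, sketched below, is to look at the file extension. The function name split_output_urls is hypothetical, and the sketch assumes the extensions seen in the example output (.json for detections, an image extension for the visualization).

```python
import os
from urllib.parse import urlparse


def split_output_urls(urls):
    """Group output URLs by file extension.

    Assumption: the detections file ends in .json and the visualization
    in an image extension, as in the example output above.
    """
    result = {"detections": None, "visualization": None}
    for url in urls:
        # Use only the URL path so query strings do not affect the extension.
        ext = os.path.splitext(urlparse(url).path)[1].lower()
        if ext == ".json":
            result["detections"] = url
        elif ext in (".png", ".jpg", ".jpeg"):
            result["visualization"] = url
    return result
```

The detections URL can then be fetched with an ordinary HTTP GET. The schema of that JSON file is not documented here, so inspect a real response before relying on specific field names.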

Conceptual Usage Example (Python)

Here’s a brief conceptual example of how you might call the Detect Objects in Image action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "78c261fc-2463-4096-8b7e-d382e2442ac2" # Action ID for Detect Objects in Image

# Construct the input payload based on the action's requirements
payload = {
    "image": "https://replicate.delivery/pbxt/JtQteKssPB1rIll144OqsX4FDuQ2pih89jOAE8MZ3QnPx8Dw/1.jpeg",
    "confidence": 0.5,
    "showVisualization": True # Python boolean; serialized to JSON true by requests
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=60 # Avoid waiting indefinitely if the service does not respond
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body was not valid JSON (covers every JSON decode error requests can raise)
            print(f"Response body: {e.response.text}")

In this example, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The action ID is specific to the Detect Objects in Image action, and the input JSON payload is structured as per the action's requirements. The endpoint URL and request structure are illustrative, so ensure you adapt them according to your specific setup.

Conclusion

The Adirik/Codet Cognitive Actions provide a powerful way to enhance your applications with advanced image object detection capabilities. By utilizing the Detect Objects in Image action, you can quickly implement detection features, filter results based on confidence levels, and visualize outcomes directly on images. Consider exploring further use cases, such as integrating this action into real-time applications or combining it with other actions for more complex workflows. Happy coding!