Enhance Your Applications with Image Segmentation Using Meta/SAM-2 Cognitive Actions

24 Apr 2025

The Meta/SAM-2 Cognitive Actions are pre-built actions that let developers add image segmentation to an application through an API call. They use the SAM 2 model developed by Meta AI Research to segment objects in images with various prompts, which makes them a practical option for developers who want image analysis features without hosting a model themselves.

Prerequisites

Before you start using the Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform, which you will use to authenticate your requests.
  • Basic familiarity with making API calls and handling JSON data.

To authenticate your API requests, you will generally pass your API key in the request headers.
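
As a minimal sketch of that step, the helper below builds the request headers from an API key. The Bearer scheme mirrors the conceptual Python example later in this article. The function name `build_auth_headers` and the `COGNITIVE_ACTIONS_API_KEY` environment variable are assumptions made for illustration, not part of the platform.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers that carry the API key as a Bearer token."""
    # Hypothetical convention: fall back to an environment variable so the
    # key does not have to be hard-coded in source files.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError(
            "No API key found; pass api_key or set COGNITIVE_ACTIONS_API_KEY."
        )
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Calling `build_auth_headers("my-key")` returns a dictionary that can be passed as the `headers` argument of an HTTP client.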

Cognitive Actions Overview

Segment Images Using SAM 2

The Segment Images Using SAM 2 action runs the SAM 2 model on an input image and returns masks for the objects it finds. Optional parameters control how masks are generated and which candidate masks are kept.

  • Category: Image Segmentation

Input

The action requires the following input schema:

{
  "image": "string",
  "useM2M": "boolean",
  "pointsPerSide": "integer",
  "predictedIouThreshold": "number",
  "stabilityScoreThreshold": "number"
}
  • Required Field:
    • image: The URI of the input image. This field is mandatory.
  • Optional Fields:
    • useM2M: A boolean indicating whether to use M2M, the mask-to-mask refinement step in SAM 2 (default is true).
    • pointsPerSide: The number of points sampled along each side of the image for mask generation (default is 32).
    • predictedIouThreshold: The threshold on the model's predicted Intersection over Union (IoU) score for a mask (default is 0.88).
    • stabilityScoreThreshold: The threshold on the stability score used to evaluate mask stability (default is 0.95).
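
To make the effect of these parameters concrete, here is a rough sketch of how such thresholds are typically applied during automatic mask generation. It assumes each candidate mask carries a predicted IoU and a stability score and that candidates below either threshold are dropped. The action does this work on the server, so the candidate dictionaries, the field names, and the exact comparisons are hypothetical.

```python
def filter_masks(candidates, predicted_iou_threshold=0.88,
                 stability_score_threshold=0.95):
    """Keep only the candidates whose scores meet both thresholds."""
    return [
        c for c in candidates
        if c["predicted_iou"] >= predicted_iou_threshold
        and c["stability_score"] >= stability_score_threshold
    ]


def total_prompt_points(points_per_side=32):
    """Points are sampled on a square grid, so the total is the side squared."""
    return points_per_side ** 2


# Hypothetical candidates, used only to show the filtering behavior.
candidates = [
    {"id": "a", "predicted_iou": 0.90, "stability_score": 0.96},
    {"id": "b", "predicted_iou": 0.85, "stability_score": 0.99},
    {"id": "c", "predicted_iou": 0.95, "stability_score": 0.90},
]
kept = filter_masks(candidates)
```

Under these assumptions, raising either threshold returns fewer, higher-confidence masks, while raising pointsPerSide samples the image more densely at a higher processing cost.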

Example Input:

{
  "image": "https://replicate.delivery/pbxt/LMbGi83qiV3QXR9fqDIzTl0P23ZWU560z1nVDtgl0paCcyYs/cars.jpg",
  "useM2M": true,
  "pointsPerSide": 32,
  "predictedIouThreshold": 0.88,
  "stabilityScoreThreshold": 0.95
}
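
A payload like the one above can be assembled with a small helper that fills in the documented defaults and rejects obviously invalid values before a request is sent. This is a sketch only: the function name `build_segment_payload` is hypothetical, and the validation rules (thresholds between 0 and 1, a positive integer for pointsPerSide) are assumptions, since the schema states only the field types.

```python
def build_segment_payload(image, use_m2m=True, points_per_side=32,
                          predicted_iou_threshold=0.88,
                          stability_score_threshold=0.95):
    """Build the input payload for the action, using the documented defaults."""
    if not isinstance(image, str) or not image:
        raise ValueError("image must be a non-empty URI string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if (isinstance(points_per_side, bool)
            or not isinstance(points_per_side, int)
            or points_per_side < 1):
        raise ValueError("points_per_side must be a positive integer")
    # Assumed range: both scores are treated as values between 0 and 1.
    for name, value in (
        ("predicted_iou_threshold", predicted_iou_threshold),
        ("stability_score_threshold", stability_score_threshold),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return {
        "image": image,
        "useM2M": bool(use_m2m),
        "pointsPerSide": points_per_side,
        "predictedIouThreshold": predicted_iou_threshold,
        "stabilityScoreThreshold": stability_score_threshold,
    }
```

Only `image` has to be supplied; the remaining arguments fall back to the defaults listed in the schema description.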

Output

The action typically returns the following data structure:

{
  "combined_mask": "string",
  "individual_masks": ["string"]
}
  • Output Fields:
    • combined_mask: A URI pointing to the combined mask image.
    • individual_masks: An array of URIs pointing to individual mask images for each segmented object.

Example Output:

{
  "combined_mask": "https://assets.cognitiveactions.com/invocations/a32d9736-43d6-422e-9cf1-3ce3a02afa9a/9dc055ee-8ffe-4c92-9b3c-e63ad439c04b.png",
  "individual_masks": [
    "https://assets.cognitiveactions.com/invocations/a32d9736-43d6-422e-9cf1-3ce3a02afa9a/2ea68564-d14e-4ab2-8faa-1ee2f111399b.png",
    // ... additional masks
  ]
}
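
Once a response with this shape has been parsed, the mask URIs can be pulled out and mapped to local file names. The sketch below assumes the result is a Python dictionary with the two fields described above. The helper names `collect_mask_uris` and `local_filename` are hypothetical.

```python
import os
from urllib.parse import urlparse


def collect_mask_uris(result):
    """Return (combined_mask_uri, list_of_individual_mask_uris)."""
    combined = result.get("combined_mask")
    # Treat a missing or null list as "no individual masks".
    individual = list(result.get("individual_masks") or [])
    return combined, individual


def local_filename(uri):
    """Derive a file name from the last path segment of a mask URI."""
    return os.path.basename(urlparse(uri).path)


# Sample result, reusing the URIs from the example output above.
sample = {
    "combined_mask": "https://assets.cognitiveactions.com/invocations/a32d9736-43d6-422e-9cf1-3ce3a02afa9a/9dc055ee-8ffe-4c92-9b3c-e63ad439c04b.png",
    "individual_masks": [
        "https://assets.cognitiveactions.com/invocations/a32d9736-43d6-422e-9cf1-3ce3a02afa9a/2ea68564-d14e-4ab2-8faa-1ee2f111399b.png",
    ],
}
combined, individual = collect_mask_uris(sample)
names = [local_filename(u) for u in individual]
```

Each URI can then be fetched with any HTTP client and saved under the derived name.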

Conceptual Usage Example (Python)

Here’s how you can call the Segment Images Using SAM 2 action using a conceptual Python code snippet:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "973bb4ca-8bd3-465e-9c55-3c252e0d5bfb"  # Action ID for Segment Images Using SAM 2

# Construct the input payload based on the action's requirements
payload = {
    "image": "https://replicate.delivery/pbxt/LMbGi83qiV3QXR9fqDIzTl0P23ZWU560z1nVDtgl0paCcyYs/cars.jpg",
    "useM2M": True,  # Python booleans are capitalized; JSON's lowercase true is not valid Python
    "pointsPerSide": 32,
    "predictedIouThreshold": 0.88,
    "stabilityScoreThreshold": 0.95
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60  # Avoid hanging indefinitely; adjust to your workload
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Raised when the response body is not valid JSON
            print(f"Response body: {e.response.text}")

In this code snippet:

  • Replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key.
  • The action_id is set to the ID of the action you want to execute.
  • The payload is structured according to the input schema, ensuring that all required fields are included.

Conclusion

The Meta/SAM-2 Cognitive Actions give developers a way to add image segmentation to their applications through an API call. With the SAM 2 model, you can segment the objects in an image and retrieve both a combined mask and a mask for each object. From here, consider exploring further use cases or combining this action with other capabilities in your projects.