Accelerate Image Segmentation with FastSAM Cognitive Actions

24 Apr 2025

In computer vision, image segmentation is a core technique for partitioning an image into meaningful regions. The Fast Segment Anything Model (FastSAM) offers an efficient solution to this task: a CNN-based model that delivers segmentation quality comparable to the original Segment Anything Model (SAM) while running roughly 50 times faster. By integrating the FastSAM Cognitive Action, developers can use this model through a simple API call. This blog post walks through the capabilities of the FastSAM action and how to integrate it into your applications.

Prerequisites

To start using the FastSAM Cognitive Actions, you will need:

  • An API key from the Cognitive Actions platform to authenticate your requests.
  • Basic knowledge of how to make HTTP requests in your programming language of choice, particularly how to handle JSON payloads.

For authentication, you will typically pass your API key in the headers of your requests. This is a common practice for most APIs to ensure secure access.
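As a concrete illustration, the headers for such a request might be built as follows. The `Authorization: Bearer <key>` scheme shown here matches the full example later in this post, but it is an assumption; consult the platform's API reference for the exact authentication header it expects.

```python
# Hypothetical header construction for Cognitive Actions requests.
# The "Authorization: Bearer <key>" scheme is an assumption; check the
# platform's documentation for the exact authentication mechanism.
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}
```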

Cognitive Actions Overview

Execute Fast Segment Anything

The Execute Fast Segment Anything action performs image segmentation using the Fast Segment Anything Model. This action is categorized under image-segmentation and is designed to provide high-speed segmentation results while utilizing minimal resources.

Input

The input for this action requires an inputImage URI and has several optional parameters that can enhance the segmentation process. Here’s a breakdown of the input schema:

  • inputImage (string, required): The URI pointing to the input image that will be processed.
  • modelName (string, optional): Selects the model to use for processing. Options are FastSAM-x or FastSAM-s. Default is FastSAM-x.
  • imageSize (integer, optional): Specifies the size of the image in pixels. Acceptable values are 512, 576, 640, 704, 768, 832, 896, 960, and 1024. Default is 640.
  • retina (boolean, optional): Indicates whether high-resolution segmentation masks should be drawn. Default is true.
  • pointLabels (string, optional): Labels for the point prompts, encoded as a JSON-style array string. Use 1 for foreground and 0 for background. Default is '[0]'.
  • withContours (boolean, optional): Determines whether to draw the edges of the masks. Default is false.
  • textualPrompt (string, optional): A text prompt that influences the processing (e.g., 'a black dog').
  • enhancedQuality (boolean, optional): Applies morphologyEx for better quality. Default is false.
  • boundingBoxPrompt (string, optional): Bounding-box coordinates encoded as a JSON-style array string in the format '[x, y, w, h]'. Default is '[0,0,0,0]'.
  • confidenceThreshold (number, optional): Sets the object confidence threshold for detection. Default is 0.25.
  • intersectionOverUnion (number, optional): The IOU (Intersection Over Union) threshold for filtering annotations. Default is 0.7.
  • pointCoordinatesPrompt (string, optional): Coordinates for points as an array of arrays. Default is '[[0,0]]'.
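With this many optional parameters, it can help to centralize the defaults in a small helper. The sketch below is illustrative, not part of the platform's SDK: the field names come from the input schema above, the bracketed string defaults mirror the example payload, and the `imageSize` validation encodes the accepted values listed earlier.

```python
# Illustrative helper that builds a Fast Segment Anything input payload,
# filling in the documented defaults and validating imageSize.
# This helper is hypothetical; the Cognitive Actions platform only
# requires that you send a JSON object matching the input schema.
VALID_IMAGE_SIZES = {512, 576, 640, 704, 768, 832, 896, 960, 1024}

def build_fastsam_payload(input_image, **overrides):
    if not input_image:
        raise ValueError("inputImage is required")
    payload = {
        "inputImage": input_image,
        "modelName": "FastSAM-x",
        "imageSize": 640,
        "retina": True,
        "pointLabels": "[0]",
        "withContours": False,
        "enhancedQuality": False,
        "boundingBoxPrompt": "[0,0,0,0]",
        "confidenceThreshold": 0.25,
        "intersectionOverUnion": 0.7,
        "pointCoordinatesPrompt": "[[0,0]]",
    }
    # Optional fields such as textualPrompt can be passed as overrides.
    payload.update(overrides)
    if payload["imageSize"] not in VALID_IMAGE_SIZES:
        raise ValueError(f"imageSize must be one of {sorted(VALID_IMAGE_SIZES)}")
    return payload
```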

Here’s an example input payload:

{
    "retina": true,
    "imageSize": 640,
    "modelName": "FastSAM-x",
    "inputImage": "https://replicate.delivery/pbxt/J45o004GBhb0mvUJgqQlaMM7lmHkdcOF5e1FV2yFFXttNbCH/download.jpeg",
    "pointLabels": "[0]",
    "withContours": true,
    "boundingBoxPrompt": "[0,0,0,0]",
    "confidenceThreshold": 0.4,
    "intersectionOverUnion": 0.9,
    "pointCoordinatesPrompt": "[[0,0]]"
}

Output

Upon successful execution, the action will return a URI pointing to the segmented output image. Here’s an example of the output format:

https://assets.cognitiveactions.com/invocations/a5fdbcf6-6dba-40ed-ad1c-c5081097709d/67cbebf4-3f6a-4c58-a951-c943e9bb2d0d.png
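Because the action returns a URI rather than raw image bytes, a typical follow-up step is to download the result for local processing. A minimal sketch using the requests library (the URI shape is taken from the example output above; the saving logic itself is generic):

```python
import os
from urllib.parse import urlparse

import requests

def filename_from_uri(uri):
    """Derive a local filename from the last path segment of the output URI."""
    return os.path.basename(urlparse(uri).path)

def download_segmentation(uri, dest_dir="."):
    """Download the segmented image to dest_dir and return its local path."""
    local_path = os.path.join(dest_dir, filename_from_uri(uri))
    resp = requests.get(uri, timeout=60)
    resp.raise_for_status()
    with open(local_path, "wb") as f:
        f.write(resp.content)
    return local_path
```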

Conceptual Usage Example (Python)

Below is a conceptual Python code snippet demonstrating how to call the Fast Segment Anything action using a hypothetical Cognitive Actions execution endpoint:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "0be3e72c-ea16-490e-8a75-f26fe35bad23" # Action ID for Execute Fast Segment Anything

# Construct the input payload based on the action's requirements
# (note: Python booleans True/False, not JSON true/false)
payload = {
    "retina": True,
    "imageSize": 640,
    "modelName": "FastSAM-x",
    "inputImage": "https://replicate.delivery/pbxt/J45o004GBhb0mvUJgqQlaMM7lmHkdcOF5e1FV2yFFXttNbCH/download.jpeg",
    "pointLabels": "[0]",
    "withContours": True,
    "boundingBoxPrompt": "[0,0,0,0]",
    "confidenceThreshold": 0.4,
    "intersectionOverUnion": 0.9,
    "pointCoordinatesPrompt": "[[0,0]]"
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload} # Hypothetical structure
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except json.JSONDecodeError:
            print(f"Response body: {e.response.text}")

In this snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The payload dictionary mirrors the example input above (note that booleans must be written as Python True/False, not JSON true/false), and action_id identifies the Execute Fast Segment Anything action. Request errors are caught and reported along with the response status and body where available.
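The shape of the JSON response body is not documented in this post, so extracting the output URI from `result` requires inspecting a real response. The sketch below is a guess that assumes the URI appears under an `output` key, either as a string or as a list of strings; adjust it to match the actual structure you observe.

```python
# Hypothetical extraction of the output image URI from the action's JSON
# response. The "output" key is an assumption; inspect a real response
# to find the correct field.
def extract_output_uri(result):
    output = result.get("output")
    if isinstance(output, str):
        return output
    if isinstance(output, list) and output:
        return output[0]
    raise KeyError("No output URI found in action response")

# Example with a mocked response body:
sample = {"output": "https://assets.cognitiveactions.com/invocations/abc/def.png"}
print(extract_output_uri(sample))
```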

Conclusion

The FastSAM Cognitive Action enables developers to perform high-speed image segmentation with ease. By utilizing this action, you can significantly boost the performance of your image processing applications while maintaining high-quality results. Now that you're equipped with the knowledge to integrate FastSAM, consider exploring various use cases such as object detection, image editing, and automated content analysis. Happy coding!