Integrating Image Segmentation with the jimothyjohn/nanosam Cognitive Actions

Running segmentation models efficiently matters in image processing and machine learning, especially on edge devices with limited compute. The jimothyjohn/nanosam cognitive actions give developers a streamlined way to do this. The standout feature is the Run Accelerated MobileSAM Detection action, which uses a highly accelerated MobileSAM model to deliver fast, accurate image segmentation.
This blog post will guide you through the integration of this action into your applications, enabling you to perform detection via bounding boxes or foreground and background points with ease.
Prerequisites
Before diving into the code, ensure you have the following:
- An API key for the Cognitive Actions platform.
- Basic knowledge of making HTTP requests.
- A library such as requests in Python for handling API calls.
In a typical scenario, you would authenticate by passing your API key in the headers of your requests.
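As a minimal sketch of that header-based authentication, the helper below builds the request headers from an API key. The Bearer scheme matches the usage example later in this post; the function name build_headers and the environment variable COGNITIVE_ACTIONS_API_KEY are assumptions made for illustration, not part of the platform.

```python
import os


def build_headers(api_key=None):
    """Build request headers carrying the API key as a Bearer token.

    Falls back to the COGNITIVE_ACTIONS_API_KEY environment variable
    (a hypothetical name chosen for this sketch) when no key is passed.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise RuntimeError("No API key supplied and COGNITIVE_ACTIONS_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, which is usually preferable to hard-coding it in a script.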
Cognitive Actions Overview
Run Accelerated MobileSAM Detection
The Run Accelerated MobileSAM Detection action allows you to execute a MobileSAM model optimized for edge devices, providing efficient image segmentation by detecting objects within specified bounding boxes.
Category: Image Segmentation
Input
The input for this action is defined by the CompositeRequest schema, which accepts the following fields:
- image (required): The URI of the input image.
- boundingBoxX0 (optional): Normalized starting horizontal position of the bounding box (default is 0.1).
- boundingBoxX1 (optional): Normalized ending horizontal position of the bounding box (default is 0.9).
- boundingBoxY0 (optional): Normalized starting vertical position of the bounding box (default is 0.1).
- boundingBoxY1 (optional): Normalized ending vertical position of the bounding box (default is 0.9).
Example Input:
{
  "image": "https://replicate.delivery/pbxt/JaSSMuwKbXcw75MNEWJPaey67YIX8zurroBqYSHh2OAKSI0R/pexels-jeremy-bishop-2422915.jpg",
  "boundingBoxX0": 0.1,
  "boundingBoxX1": 0.9,
  "boundingBoxY0": 0.1,
  "boundingBoxY1": 0.9
}
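If your bounding box is in pixels, it has to be converted to the normalized fields first. The sketch below assumes that "normalized" means a fraction of the image width or height, measured from the top-left corner; the schema description does not spell this out, so treat it as an assumption. The helper name bounding_box_inputs is hypothetical.

```python
def bounding_box_inputs(image_uri, box_px, image_size):
    """Convert a pixel bounding box into the action's normalized input fields.

    box_px is (x0, y0, x1, y1) in pixels and image_size is (width, height).
    Assumes normalized coordinates are fractions of width and height.
    """
    x0, y0, x1, y1 = box_px
    width, height = image_size
    if width <= 0 or height <= 0:
        raise ValueError("image width and height must be positive")
    if not (0 <= x0 < x1 <= width and 0 <= y0 < y1 <= height):
        raise ValueError("bounding box must lie inside the image with x0 < x1 and y0 < y1")
    return {
        "image": image_uri,
        "boundingBoxX0": x0 / width,
        "boundingBoxX1": x1 / width,
        "boundingBoxY0": y0 / height,
        "boundingBoxY1": y1 / height,
    }


# A 640x480 image with a box inset by 10% on every side
inputs = bounding_box_inputs("https://example.com/photo.jpg", (64, 48, 576, 432), (640, 480))
```

Validating the box before sending the request catches swapped or out-of-range coordinates locally instead of relying on an error from the service.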
Output
On success, this action returns a URI pointing to the processed image with the segmentation applied.
Example Output:
https://assets.cognitiveactions.com/invocations/a1beb024-1d99-4211-9abc-048474beecbf/6ad5d37f-bc60-4d11-ac0c-8f89dc5fbcac.jpg
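Since the output is a URI rather than image bytes, you will usually want to fetch the file afterwards. The following sketch uses only the Python standard library and assumes the URI is publicly readable without extra authentication, which this post does not confirm. The helper names output_filename and download_output are hypothetical.

```python
import shutil
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def output_filename(uri):
    """Return the last path segment of the URI, for use as a local file name."""
    name = PurePosixPath(urlparse(uri).path).name
    if not name:
        raise ValueError(f"no file name in URI: {uri!r}")
    return name


def download_output(uri, dest_dir="."):
    """Fetch the segmented image and save it into dest_dir; return the saved path."""
    target = Path(dest_dir) / output_filename(uri)
    # Stream the response to disk rather than holding the whole image in memory
    with urllib.request.urlopen(uri, timeout=30) as response, open(target, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return target
```

Calling download_output with the example output URI above would save the image under its original file name in the current directory.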
Conceptual Usage Example (Python)
Here’s a conceptual example of how you might call the Run Accelerated MobileSAM Detection action using Python:
import json

import requests

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint
action_id = "79b03823-b440-4e60-8931-b36f8f457296"  # Action ID for Run Accelerated MobileSAM Detection

# Construct the input payload based on the action's requirements
payload = {
    "image": "https://replicate.delivery/pbxt/JaSSMuwKbXcw75MNEWJPaey67YIX8zurroBqYSHh2OAKSI0R/pexels-jeremy-bishop-2422915.jpg",
    "boundingBoxX0": 0.1,
    "boundingBoxX1": 0.9,
    "boundingBoxY0": 0.1,
    "boundingBoxY1": 0.9,
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=30,  # Avoid hanging indefinitely if the service does not respond
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")
In this code snippet, we construct the input JSON payload based on the required schema and send it to the Cognitive Actions execution endpoint. The API key is included in the headers for authentication, and we handle potential errors gracefully.
Conclusion
The Run Accelerated MobileSAM Detection action from the jimothyjohn/nanosam cognitive actions offers an efficient way to add image segmentation to your applications: you supply an image URI and an optional bounding box, and you get back a URI to the segmented image.
Consider exploring other use cases for this action, such as real-time object detection in mobile applications or automated image analysis in various industries. Happy coding!