Transform Your Images into 3D Models with camenduru/dust3r Cognitive Actions

In computer vision, generating 3D models from 2D images is increasingly valuable in applications such as gaming, virtual reality, and architectural design. The camenduru/dust3r API offers a Cognitive Action that creates a 3D model from a pair of images using the DUSt3R technique. This geometric 3D vision solution integrates into your applications and gives you a way to visualize and manipulate spatial data.
Prerequisites
Before diving into the integration, ensure you have the following:
- An API key for the Cognitive Actions platform.
- Basic knowledge of making API calls and handling JSON data.
- Familiarity with Python for executing the conceptual examples provided.
Authentication typically involves passing your API key in the headers of your requests, ensuring secure access to the Cognitive Actions.
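As a minimal sketch of header-based authentication, the helper below builds the request headers from an API key, falling back to an environment variable. The helper name build_auth_headers and the variable name COGNITIVE_ACTIONS_API_KEY are assumptions made for illustration, and the Bearer scheme simply mirrors the conceptual example later in this article; check the platform's documentation for the exact header it expects.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers carrying the API key as a Bearer token."""
    # Hypothetical environment variable name; use whatever your setup defines.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided or found in the environment")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }


# Example: pass the key explicitly instead of relying on the environment.
headers = build_auth_headers("example-key")
```

Reading the key from the environment keeps it out of source control, which is usually preferable to hard-coding it in a script.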
Cognitive Actions Overview
Create 3D Model from Image Pair
Description: Generate a 3D model using the DUSt3R technique from two input images, providing an easy-to-use geometric 3D vision solution.
Category: 3D Reconstruction
Input
The Create 3D Model from Image Pair action requires the following input fields:
- imageOne (required): The URI of the first input image. Must be a valid URL.
- imageTwo (required): The URI of the second input image. Must be a valid URL.
- maskSky (optional): Indicates if the sky should be masked. Defaults to false.
- schedule (optional): The type of schedule to be used (linear or cosine). Defaults to linear.
- imageSize (optional): The size of the image in pixels. Defaults to 512.
- cameraSize (optional): The size of the camera. Defaults to 0.05.
- cleanDepth (optional): Indicates whether to clean the depth map. Defaults to true.
- windowSize (optional): Specifies the window size. Defaults to 1.
- referenceId (optional): The reference identifier. Defaults to 0.
- asPointCloud (optional): Indicates whether to output as a point cloud. Defaults to false.
- scenegraphType (optional): Specifies the type of scene graph to be used (complete, swin, or oneref). Defaults to complete.
- numberOfIterations (optional): The number of iterations to perform. Defaults to 300.
- transparentCameras (optional): Indicates if cameras should be transparent. Defaults to false.
- minimumConfidenceThreshold (optional): The minimum confidence threshold. Defaults to 3.
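The field list above can be turned into a small client-side check that runs before any request is sent. The sketch below is illustrative only: build_payload, DEFAULTS, and ALLOWED are hypothetical names, the defaults and allowed values are copied from the list, and treating a "valid URL" as an http(s) URL is an assumption. The service may apply different or stricter validation.

```python
from urllib.parse import urlparse

# Defaults and allowed values copied from the documented field list.
DEFAULTS = {
    "maskSky": False,
    "schedule": "linear",
    "imageSize": 512,
    "cameraSize": 0.05,
    "cleanDepth": True,
    "windowSize": 1,
    "referenceId": 0,
    "asPointCloud": False,
    "scenegraphType": "complete",
    "numberOfIterations": 300,
    "transparentCameras": False,
    "minimumConfidenceThreshold": 3,
}
ALLOWED = {
    "schedule": {"linear", "cosine"},
    "scenegraphType": {"complete", "swin", "oneref"},
}


def _is_url(value):
    """Assumption: a 'valid URL' means an http(s) URL with a host."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def build_payload(image_one, image_two, **overrides):
    """Merge overrides into the documented defaults and run basic checks."""
    for name, value in (("imageOne", image_one), ("imageTwo", image_two)):
        if not _is_url(value):
            raise ValueError(f"{name} must be a valid http(s) URL")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown fields: {sorted(unknown)}")
    payload = {"imageOne": image_one, "imageTwo": image_two}
    payload.update(DEFAULTS)
    payload.update(overrides)
    for field, allowed in ALLOWED.items():
        if payload[field] not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}")
    return payload
```

Calling build_payload with only the two image URLs yields a payload that uses every documented default; passing, for example, schedule="cosine" overrides just that field.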
Example Input:
{
  "maskSky": false,
  "imageOne": "https://replicate.delivery/pbxt/KVCgnEVoTYocNCeWjCeEjc1RILo8u4d3jqPX9Srak3QiX0rB/frame01.jpg",
  "imageTwo": "https://replicate.delivery/pbxt/KVCgnxgya22Ksw8WwG2gYLZIu07Ch6eQkzwoQQDeMbH2FXf4/frame02.jpg",
  "schedule": "linear",
  "imageSize": 512,
  "cameraSize": 0.05,
  "cleanDepth": true,
  "windowSize": 1,
  "referenceId": 0,
  "asPointCloud": false,
  "scenegraphType": "complete",
  "numberOfIterations": 300,
  "transparentCameras": false,
  "minimumConfidenceThreshold": 3
}
Output
The Create 3D Model from Image Pair action typically returns a URL pointing to the generated 3D model file.
Example Output:
https://assets.cognitiveactions.com/invocations/29cd7998-a11d-4a57-8fe3-5ac922c38711/cab6c4ee-2a22-436a-904f-2f2a9866c22e.glb
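Since the output is a URL, a typical next step is to save the model file locally. The following is a minimal sketch using only the Python standard library; filename_from_url and download_model are hypothetical helper names, and it assumes the returned URL can be fetched with a plain GET request without extra authentication, which the source does not confirm.

```python
import os
import shutil
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(model_url):
    """Derive a local file name from the last path segment of the URL."""
    name = os.path.basename(urlparse(model_url).path)
    if not name:
        raise ValueError("URL has no file name component")
    return name


def download_model(model_url, dest_dir="."):
    """Stream the model file to disk and return the local path."""
    path = os.path.join(dest_dir, filename_from_url(model_url))
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urlopen(model_url, timeout=60) as resp, open(path, "wb") as fh:
        shutil.copyfileobj(resp, fh)
    return path
```

For the example output above, filename_from_url returns the final path segment ending in .glb, the binary glTF format that most 3D viewers and engines can load.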
Conceptual Usage Example (Python)
Below is a conceptual Python code snippet demonstrating how to call this Cognitive Action:
import json

import requests

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint
action_id = "70ef0d3b-5040-43e3-8f3a-316032b0349d"  # Action ID for Create 3D Model from Image Pair

# Construct the input payload based on the action's requirements.
# Python booleans are True/False (JSON's true/false would raise a NameError).
payload = {
    "maskSky": False,
    "imageOne": "https://replicate.delivery/pbxt/KVCgnEVoTYocNCeWjCeEjc1RILo8u4d3jqPX9Srak3QiX0rB/frame01.jpg",
    "imageTwo": "https://replicate.delivery/pbxt/KVCgnxgya22Ksw8WwG2gYLZIu07Ch6eQkzwoQQDeMbH2FXf4/frame02.jpg",
    "schedule": "linear",
    "imageSize": 512,
    "cameraSize": 0.05,
    "cleanDepth": True,
    "windowSize": 1,
    "referenceId": 0,
    "asPointCloud": False,
    "scenegraphType": "complete",
    "numberOfIterations": 300,
    "transparentCameras": False,
    "minimumConfidenceThreshold": 3,
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:
            # The error body was not valid JSON; fall back to the raw text.
            print(f"Response body: {e.response.text}")
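In this example, replace "YOUR_COGNITIVE_ACTIONS_API_KEY" with your actual API key. The action ID identifies the Create 3D Model from Image Pair action, and the input payload follows the schema described in the Input section. The endpoint URL and the request body structure are hypothetical, so confirm both against the platform's documentation before use.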
Conclusion
The Create 3D Model from Image Pair action from the camenduru/dust3r API provides an efficient way to generate 3D models from images, opening up numerous possibilities for developers in various fields. By leveraging this Cognitive Action, you can enhance your applications with advanced 3D visualization capabilities.
Explore the integration of additional Cognitive Actions, experiment with different parameters, and unleash the full potential of your 3D modeling projects today!