Transform Your Images into 3D Models with camenduru/dust3r Cognitive Actions

23 Apr 2025

Generating 3D models from 2D images is useful in applications such as gaming, virtual reality, and architectural design. The camenduru/dust3r API offers a Cognitive Action that lets developers create a 3D model from a pair of images using the DUSt3R technique. This geometric 3D vision tool can be integrated into your applications to visualize and manipulate spatial data.

Prerequisites

Before diving into the integration, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Basic knowledge of making API calls and handling JSON data.
  • Familiarity with Python for executing the conceptual examples provided.

Authentication typically involves passing your API key in the headers of each request.
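As a minimal sketch of that header-based authentication, the helper below builds the headers once so the key never has to be hard-coded. The Bearer scheme matches the conceptual example later in this post; the environment variable name COGNITIVE_ACTIONS_API_KEY and the function name build_headers are assumptions made for illustration, not part of the platform.

```python
import os
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return request headers carrying the API key as a Bearer token."""
    # COGNITIVE_ACTIONS_API_KEY is a hypothetical environment variable name.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key supplied or found in the environment")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; passing it explicitly is still possible for scripts and tests.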

Cognitive Actions Overview

Create 3D Model from Image Pair

Description: Generate a 3D model using the DUSt3R technique from two input images, providing an easy-to-use geometric 3D vision solution.

Category: 3D Reconstruction

Input

The Create 3D Model from Image Pair action requires the following input fields:

  • imageOne (required): The URI of the first input image. Must be a valid URL.
  • imageTwo (required): The URI of the second input image. Must be a valid URL.
  • maskSky (optional): Indicates if the sky should be masked. Defaults to false.
  • schedule (optional): The type of schedule to be used (linear or cosine). Defaults to linear.
  • imageSize (optional): The size of the image in pixels. Defaults to 512.
  • cameraSize (optional): The size of the camera. Defaults to 0.05.
  • cleanDepth (optional): Indicates whether to clean the depth map. Defaults to true.
  • windowSize (optional): The window size for the swin (sliding-window) scene graph type. Defaults to 1.
  • referenceId (optional): The index of the reference image for the oneref scene graph type. Defaults to 0.
  • asPointCloud (optional): Indicates whether to output as a point cloud. Defaults to false.
  • scenegraphType (optional): Specifies the type of scene graph to be used (complete, swin, or oneref). Defaults to complete.
  • numberOfIterations (optional): The number of iterations to perform. Defaults to 300.
  • transparentCameras (optional): Indicates if cameras should be transparent. Defaults to false.
  • minimumConfidenceThreshold (optional): The minimum confidence threshold. Defaults to 3.
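The field list above can be turned into a small payload builder that fills in the documented defaults and rejects obviously invalid values before a request is sent. This is only a sketch: the helper name build_inputs is hypothetical, the restriction to http(s) URLs is an assumption about what "valid URL" means, and the platform may apply stricter or different validation.

```python
from urllib.parse import urlparse

# Defaults copied from the field list above.
DEFAULTS = {
    "maskSky": False,
    "schedule": "linear",
    "imageSize": 512,
    "cameraSize": 0.05,
    "cleanDepth": True,
    "windowSize": 1,
    "referenceId": 0,
    "asPointCloud": False,
    "scenegraphType": "complete",
    "numberOfIterations": 300,
    "transparentCameras": False,
    "minimumConfidenceThreshold": 3,
}

# Enumerated values listed in the field descriptions.
ALLOWED = {
    "schedule": {"linear", "cosine"},
    "scenegraphType": {"complete", "swin", "oneref"},
}


def _is_http_url(value):
    """Assumption: a 'valid URL' here means an absolute http(s) URL."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def build_inputs(image_one, image_two, **overrides):
    """Merge overrides into the documented defaults and run basic checks."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown fields: {sorted(unknown)}")
    for url in (image_one, image_two):
        if not _is_http_url(url):
            raise ValueError(f"Not a valid http(s) URL: {url!r}")
    inputs = {"imageOne": image_one, "imageTwo": image_two, **DEFAULTS, **overrides}
    for field, choices in ALLOWED.items():
        if inputs[field] not in choices:
            raise ValueError(f"{field} must be one of {sorted(choices)}")
    return inputs
```

For example, build_inputs(url_a, url_b, schedule="cosine") returns all fourteen fields with only the schedule changed from its default.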

Example Input:

{
  "maskSky": false,
  "imageOne": "https://replicate.delivery/pbxt/KVCgnEVoTYocNCeWjCeEjc1RILo8u4d3jqPX9Srak3QiX0rB/frame01.jpg",
  "imageTwo": "https://replicate.delivery/pbxt/KVCgnxgya22Ksw8WwG2gYLZIu07Ch6eQkzwoQQDeMbH2FXf4/frame02.jpg",
  "schedule": "linear",
  "imageSize": 512,
  "cameraSize": 0.05,
  "cleanDepth": true,
  "windowSize": 1,
  "referenceId": 0,
  "asPointCloud": false,
  "scenegraphType": "complete",
  "numberOfIterations": 300,
  "transparentCameras": false,
  "minimumConfidenceThreshold": 3
}

Output

The Create 3D Model from Image Pair action typically returns a URL pointing to the generated 3D model file.

Example Output:

https://assets.cognitiveactions.com/invocations/29cd7998-a11d-4a57-8fe3-5ac922c38711/cab6c4ee-2a22-436a-904f-2f2a9866c22e.glb
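The example URL ends in .glb, the binary glTF container. Assuming the action returns such a file, a quick sanity check after downloading is to read the 12-byte GLB header, which consists of the magic bytes glTF followed by a version number and the total file length. The helper names below are hypothetical, and the sketch assumes the result is a plain URL string as in the example output.

```python
import struct
from pathlib import PurePosixPath
from urllib.parse import urlparse


def model_filename(url):
    """Return the last path component of the model URL."""
    return PurePosixPath(urlparse(url).path).name


def parse_glb_header(data):
    """Parse the 12-byte binary glTF header and return (version, length)."""
    if len(data) < 12:
        raise ValueError("Too short to be a GLB file")
    # Little-endian layout: 4 magic bytes, uint32 version, uint32 total length.
    magic, version, length = struct.unpack("<4sII", data[:12])
    if magic != b"glTF":
        raise ValueError("Missing glTF magic bytes")
    return version, length
```

After fetching the file (for example with urllib.request.urlopen or requests.get), pass its first bytes to parse_glb_header and compare the reported length with the number of bytes received to catch truncated downloads.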

Conceptual Usage Example (Python)

Below is a conceptual Python code snippet demonstrating how to call this Cognitive Action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "70ef0d3b-5040-43e3-8f3a-316032b0349d" # Action ID for Create 3D Model from Image Pair

# Construct the input payload based on the action's requirements
payload = {
    "maskSky": False,  # Python booleans are capitalized, unlike JSON
    "imageOne": "https://replicate.delivery/pbxt/KVCgnEVoTYocNCeWjCeEjc1RILo8u4d3jqPX9Srak3QiX0rB/frame01.jpg",
    "imageTwo": "https://replicate.delivery/pbxt/KVCgnxgya22Ksw8WwG2gYLZIu07Ch6eQkzwoQQDeMbH2FXf4/frame02.jpg",
    "schedule": "linear",
    "imageSize": 512,
    "cameraSize": 0.05,
    "cleanDepth": True,
    "windowSize": 1,
    "referenceId": 0,
    "asPointCloud": False,
    "scenegraphType": "complete",
    "numberOfIterations": 300,
    "transparentCameras": False,
    "minimumConfidenceThreshold": 3
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload} # Hypothetical structure
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # JSON decode errors from requests subclass ValueError
            print(f"Response body: {e.response.text}")

In this example, replace "YOUR_COGNITIVE_ACTIONS_API_KEY" with your actual API key. The action ID identifies the Create 3D Model from Image Pair action, and the payload follows the input schema described above. The endpoint URL and request structure are hypothetical, as the code comments note, so adjust them to match your platform's documentation.

Conclusion

The Create 3D Model from Image Pair action from the camenduru/dust3r API provides an efficient way to generate 3D models from images, opening up numerous possibilities for developers in various fields. By leveraging this Cognitive Action, you can enhance your applications with advanced 3D visualization capabilities.

Explore the integration of additional Cognitive Actions, experiment with different parameters, and unleash the full potential of your 3D modeling projects today!