Transform Mobile Videos into NeRF Datasets with COLMAP Actions

23 Apr 2025
The world of 3D graphics and virtual environments is evolving rapidly, and with it the demand for tools that convert ordinary media into immersive experiences. The COLMAP actions from the jimothyjohn/colmap API are designed to streamline the process of transforming mobile videos into NeRF (Neural Radiance Fields) datasets. They give developers access to COLMAP, a structure-from-motion pipeline that estimates camera poses from images, so that 3D assets can be created from video without setting up the toolchain by hand.

By using these pre-built actions, developers can save much of the time and effort that would otherwise go into manual data preparation. The choice of output formats and image resolutions lets the same action serve a range of applications, from gaming to virtual reality. Typical use cases include creating immersive environments for games, training simulations, and augmented reality applications where detailed 3D models are essential.

Prerequisites

To use the COLMAP actions, developers need a basic understanding of API calls and an API key for accessing the Cognitive Actions. Familiarity with video formats and 3D modeling concepts also helps when tuning the settings for a specific use case.

Convert Video to NeRF Dataset

The Convert Video to NeRF Dataset action is designed to transform mobile videos into datasets that are ready for use with NeRF-compatible software like Nerfstudio. This action is categorized under video processing and addresses the challenge of preparing video content for advanced 3D modeling.

Input Requirements

The input for this action is structured as follows:

  • video (required): A URI pointing to the short sample video to be converted. For example: https://replicate.delivery/mgxm/42fc2c94-74c0-4eee-b7ed-727a9b611c24/LionStatue.MOV.
  • name (optional): The name of the experiment, defaulting to 'colmap-out'. This can help in organizing multiple outputs.
  • mediaType (optional): Specifies the type of media input with options including 'images', 'video', and 'insta360'. The default is 'video'.
  • continuous (optional): Indicates if the video is continuous, defaulting to true.
  • outputFormat (optional): Specifies the COLMAP output format, with options like 'instant-ngp', 'nerfacto', and 'arf'. Default is 'nerfacto'.
  • imageResolution (optional): Determines the resolution of images, with choices of 'Low', 'Med', and 'High'. Default is 'Low'.
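
The parameters above can be collected into a payload with a small helper. The following is a minimal sketch, not part of the API: `build_inputs` is a hypothetical function, and the option sets are copied only from the values this article lists, so the live schema may accept others.

```python
# Option sets copied from the parameter list above; the live schema may accept more.
MEDIA_TYPES = ("images", "video", "insta360")
OUTPUT_FORMATS = ("instant-ngp", "nerfacto", "arf")
IMAGE_RESOLUTIONS = ("Low", "Med", "High")


def build_inputs(video, name="colmap-out", media_type="video",
                 continuous=True, output_format="nerfacto",
                 image_resolution="Low"):
    """Hypothetical helper: build an input payload using the documented defaults."""
    if not video:
        raise ValueError("video is required")
    checks = (
        ("mediaType", media_type, MEDIA_TYPES),
        ("outputFormat", output_format, OUTPUT_FORMATS),
        ("imageResolution", image_resolution, IMAGE_RESOLUTIONS),
    )
    # Reject values that are not among the options listed in this article.
    for field, value, allowed in checks:
        if value not in allowed:
            raise ValueError(f"{field} must be one of {allowed}, got {value!r}")
    return {
        "video": video,
        "name": name,
        "mediaType": media_type,
        "continuous": bool(continuous),
        "outputFormat": output_format,
        "imageResolution": image_resolution,
    }


# Only the video URI is required; every other field falls back to its default.
payload = build_inputs(
    "https://replicate.delivery/mgxm/42fc2c94-74c0-4eee-b7ed-727a9b611c24/LionStatue.MOV",
    name="lion",
)
```

Validating on the client side catches a misspelled option before a request is sent, at the cost of having to update the option sets if the action adds new values.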

Expected Output

Upon successful execution, the action typically returns a URI pointing to the generated NeRF dataset, which could look something like this: https://assets.cognitiveactions.com/invocations/5aadd589-910e-4d63-af1b-66ee7f011832/db08becc-3b44-4b49-b85e-1c186ae1a75e.zip. This dataset is then ready to be utilized in various 3D applications.
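
Since the result is a URI to a .zip archive, a natural next step is to download and unpack it. The sketch below uses only the Python standard library; `download_dataset` and `unpack_dataset` are hypothetical helper names, and the layout of files inside the archive is not documented here, so the code simply lists whatever it finds.

```python
import io
import zipfile
from pathlib import Path
from typing import List
from urllib.request import urlopen


def unpack_dataset(zip_bytes: bytes, dest: str) -> List[str]:
    """Extract an in-memory zip archive into dest and return its sorted member names."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        archive.extractall(dest_path)
        return sorted(archive.namelist())


def download_dataset(uri: str, dest: str) -> List[str]:
    """Fetch the archive at uri (assumed to be publicly readable) and unpack it."""
    with urlopen(uri) as response:
        return unpack_dataset(response.read(), dest)
```

Calling `download_dataset(result_uri, "lion-dataset")`, where `result_uri` is the URI returned by the action, would leave the extracted files in a local `lion-dataset` directory. Reading the whole archive into memory keeps the sketch short; for large datasets, streaming to a temporary file is likely the better choice.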

Use Cases for this Specific Action

Developers might choose the Convert Video to NeRF Dataset action when they need to create 3D models from video footage for applications like virtual tours, where users explore environments from their own devices. It is also useful in game development, where real-world references can be turned into interactive assets that add realism and immersion.

The following Python example shows one way to call the action. The endpoint URL is hypothetical, as noted in the code comments.

import requests
import json

# Replace with your actual Cognitive Actions API key and endpoint
# Ensure your environment securely handles the API key
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
# This endpoint URL is hypothetical and should be documented for users
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"

action_id = "4462b25d-a61e-42bd-8b73-544afaba4dc3" # Action ID for: Convert Video to NeRF Dataset

# Construct the exact input payload based on the action's requirements
# This example uses the predefined example_input for this action:
payload = {
  "name": "lion",
  "video": "https://replicate.delivery/mgxm/42fc2c94-74c0-4eee-b7ed-727a9b611c24/LionStatue.MOV",
  "mediaType": "video",
  "continuous": True,
  "outputFormat": "nerfacto",
  "imageResolution": "Low"
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
    # Add any other required headers for the Cognitive Actions API
}

# Prepare the request body for the hypothetical execution endpoint
request_body = {
    "action_id": action_id,
    "inputs": payload
}

print(f"--- Calling Cognitive Action: Convert Video to NeRF Dataset ({action_id}) ---")
print(f"Endpoint: {COGNITIVE_ACTIONS_EXECUTE_URL}")
print(f"Action ID: {action_id}")
print("Payload being sent:")
print(json.dumps(request_body, indent=2))
print("------------------------------------------------")

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json=request_body
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully. Result:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # JSON decode errors from requests are ValueError subclasses
            print(f"Response body (non-JSON): {e.response.text}")
    print("------------------------------------------------")

Conclusion

The COLMAP Cognitive Actions give developers a practical toolset for converting mobile videos into NeRF datasets. By using these actions, developers can reduce the time spent on data preparation and choose the output format and image resolution that suit their target application.

As a next step, consider experimenting with different video sources and output formats to explore what the COLMAP actions can do. The potential applications range from gaming and simulations to augmented reality experiences, which makes this integration a useful addition to a developer's toolkit.