Create Stunning 3D Models from Images with fire/trellis Cognitive Actions

22 Apr 2025

Converting a 2D image into a 3D model is a common need in 3D graphics work. The fire/trellis Cognitive Actions generate 3D models from images using structured 3D latents, and they expose parameters such as texture size and mesh simplification so you can tune the result. This article covers the Generate 3D Model from Image action, its input and output structures, and a conceptual usage example to help you integrate it into your applications.

Prerequisites

Before diving into the integration of Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform.
  • Access to the appropriate endpoint to execute the actions.

Authentication typically involves passing your API key in the headers of your requests, allowing you to securely access and utilize the Cognitive Actions.
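
As a minimal sketch of that header-based authentication, the helper below reads the key from an environment variable instead of hardcoding it. The function name auth_headers and the variable name COGNITIVE_ACTIONS_API_KEY are illustrative choices, and the Bearer scheme is assumed from the usage example later in this article rather than confirmed by platform documentation.

```python
import os


def auth_headers(env_var: str = "COGNITIVE_ACTIONS_API_KEY") -> dict:
    """Build request headers from an API key stored in the environment.

    Hypothetical helper: the Bearer scheme is an assumption, so check
    your platform documentation for the exact header format.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Keeping the key out of source code makes it harder to commit it to version control by accident.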

Cognitive Actions Overview

Generate 3D Model from Image

The Generate 3D Model from Image action allows you to create a GLB file (3D model) from a given image. This action falls under the category of 3d-reconstruction and is designed for developers looking to enhance their applications with 3D graphics capabilities.

Input

To execute this action, provide an input JSON object with the following required and optional fields:

  • Required Fields:
    • image: (string) The URI of the input image to generate a 3D asset from.
  • Optional Fields:
    • seed: (integer) A random seed for generating consistent outputs (default is 0).
    • textureSize: (integer) Specifies the texture size for GLB extraction (default is 2048, range: 512 to 2048).
    • meshSimplify: (number) Level of mesh simplification for GLB (default is 0.65, range: 0 to 0.98).
    • generateColor: (boolean) Indicates if a color video render should be generated (default is false).
    • generateModel: (boolean) Determines if a 3D model file should be generated (default is true).
    • randomizeSeed: (boolean) Whether to randomize the seed (default is true).
    • generateNormal: (boolean) Determines if a normal video render should be generated (default is false).
    • sparseStructureSamplingSteps: (integer) Number of sampling steps for sparse structure generation (default is 12, range: 1 to 50).
    • structuredLatentSamplingSteps: (integer) Number of sampling steps for structured latent generation (default is 12, range: 1 to 50).
    • sparseStructureGuidanceStrength: (number) Guidance strength for sparse structure generation (default is 7.5, range: 0 to 10).
    • structuredLatentGuidanceStrength: (number) Guidance strength for structured latent generation (default is 3, range: 0 to 10).
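
The defaults and ranges listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Cognitive Actions platform: build_inputs is a hypothetical helper, and it assumes the documented ranges are inclusive at both ends.

```python
from typing import Any

# Defaults copied from the parameter list above.
DEFAULTS = {
    "seed": 0,
    "textureSize": 2048,
    "meshSimplify": 0.65,
    "generateColor": False,
    "generateModel": True,
    "randomizeSeed": True,
    "generateNormal": False,
    "sparseStructureSamplingSteps": 12,
    "structuredLatentSamplingSteps": 12,
    "sparseStructureGuidanceStrength": 7.5,
    "structuredLatentGuidanceStrength": 3,
}

# Documented ranges, treated here as inclusive (an assumption).
RANGES = {
    "textureSize": (512, 2048),
    "meshSimplify": (0, 0.98),
    "sparseStructureSamplingSteps": (1, 50),
    "structuredLatentSamplingSteps": (1, 50),
    "sparseStructureGuidanceStrength": (0, 10),
    "structuredLatentGuidanceStrength": (0, 10),
}


def build_inputs(image: str, **overrides: Any) -> dict:
    """Hypothetical helper: merge overrides into the defaults and check ranges."""
    if not image:
        raise ValueError("image is required")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    inputs = {"image": image, **DEFAULTS, **overrides}
    for name, (low, high) in RANGES.items():
        if not low <= inputs[name] <= high:
            raise ValueError(f"{name}={inputs[name]} is outside [{low}, {high}]")
    return inputs
```

For example, build_inputs("https://example.com/input.png", meshSimplify=0.9) returns a complete payload, while textureSize=4096 raises a ValueError before any request is made.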

Here’s an example input JSON payload:

{
  "seed": 0,
  "image": "https://replicate.delivery/pbxt/MCprdYbZYopM9oV25EeGmNQdKjeYjIB6sfyei5rTFeaXEoun/Set_in_role_playing_game__A_young_woman_with_realistic_adult_proportio_S1009161289_St8_G1.png",
  "textureSize": 2048,
  "meshSimplify": 0,
  "generateColor": false,
  "generateModel": true,
  "randomizeSeed": true,
  "generateNormal": false,
  "sparseStructureSamplingSteps": 12,
  "structuredLatentSamplingSteps": 12,
  "sparseStructureGuidanceStrength": 7.5,
  "structuredLatentGuidanceStrength": 3
}

Output

Upon successful execution, the action typically returns a JSON object with the following structure:

  • model_file: (string) The URL of the generated 3D model file (GLB).
  • color_video: (null when not generated) Optional color video output.
  • normal_video: (null when not generated) Optional normal video output.
  • combined_video: (null when not generated) Optional combined video output.
  • no_background_image: (string) The URL of the image generated without a background.

Here’s an example output JSON:

{
  "model_file": "https://assets.cognitiveactions.com/invocations/c5914e8f-aa23-406d-bebb-92989dce6c1e/1f2c258f-a57d-4247-94a6-84f995c89831.glb",
  "color_video": null,
  "normal_video": null,
  "combined_video": null,
  "no_background_image": "https://assets.cognitiveactions.com/invocations/c5914e8f-aa23-406d-bebb-92989dce6c1e/d48dd006-818c-475c-91c1-f00cd3258451.png"
}
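
Because several output fields are null when the corresponding render was not requested, it can help to filter the response down to the assets that were actually produced. The sketch below assumes the field names shown above and that every populated field is a URL string. available_assets is a hypothetical helper name.

```python
# Output field names taken from the structure described above.
OUTPUT_KEYS = (
    "model_file",
    "color_video",
    "normal_video",
    "combined_video",
    "no_background_image",
)


def available_assets(result: dict) -> dict:
    """Return only the output fields that are present and not null."""
    return {key: result[key] for key in OUTPUT_KEYS if result.get(key)}


# Shortened placeholder URLs, not real assets.
example = {
    "model_file": "https://example.com/model.glb",
    "color_video": None,
    "normal_video": None,
    "combined_video": None,
    "no_background_image": "https://example.com/no_background.png",
}
print(sorted(available_assets(example)))
```

With the example response, only model_file and no_background_image remain, so downstream code can loop over the result without checking for null values.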

Conceptual Usage Example (Python)

Here's a conceptual Python code snippet demonstrating how you might call the Cognitive Actions execution endpoint for this action. Remember that the endpoint URL and request structure are illustrative.

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "be97583c-3aa4-4b84-ad08-1c2ed6278bc9"  # Action ID for Generate 3D Model from Image

# Construct the input payload based on the action's requirements
payload = {
    "seed": 0,
    "image": "https://replicate.delivery/pbxt/MCprdYbZYopM9oV25EeGmNQdKjeYjIB6sfyei5rTFeaXEoun/Set_in_role_playing_game__A_young_woman_with_realistic_adult_proportio_S1009161289_St8_G1.png",
    "textureSize": 2048,
    "meshSimplify": 0,
    "generateColor": False,
    "generateModel": True,
    "randomizeSeed": True,
    "generateNormal": False,
    "sparseStructureSamplingSteps": 12,
    "structuredLatentSamplingSteps": 12,
    "sparseStructureGuidanceStrength": 7.5,
    "structuredLatentGuidanceStrength": 3
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=300,  # Illustrative value; without a timeout, requests can wait indefinitely
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Raised when the body is not valid JSON
            print(f"Response body: {e.response.text}")

In this code snippet, replace "YOUR_COGNITIVE_ACTIONS_API_KEY" with your actual API key. The action ID and input payload are structured based on the requirements outlined earlier.

Conclusion

The fire/trellis Cognitive Action for generating 3D models from images lets developers add 3D asset generation to their applications. Parameters such as texture size, mesh simplification, sampling steps, and guidance strength let you tune the result, and the structured output returns a GLB model file along with any optional video renders you request.