Transform Images into Stunning 3D Assets with firtoz/trellis Cognitive Actions

24 Apr 2025

Creating high-quality 3D assets from 2D images is useful to developers and artists alike. The firtoz/trellis API offers Cognitive Actions designed for this purpose. Using the TRELLIS model, you can convert input images into detailed 3D assets, including 3D Gaussians, Radiance Fields, and textured meshes. This article explains how to integrate the "Generate 3D Assets from Images" action into your applications.

Prerequisites

To get started with the Cognitive Actions, you will need:

  • An API key for the Cognitive Actions platform.
  • Basic knowledge of making HTTP requests in your preferred programming language.

Authentication is typically handled by passing the API key in the request headers, which gives you secure access to the Cognitive Actions.
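As a minimal sketch of that header-based approach, the helper below builds the request headers from an API key. The Bearer scheme matches the usage example later in this article; the helper name build_headers and the environment variable COGNITIVE_ACTIONS_API_KEY are assumptions for illustration, not part of the platform.

```python
import os


def build_headers(api_key=None):
    """Return request headers that carry the API key as a Bearer token.

    Falls back to the COGNITIVE_ACTIONS_API_KEY environment variable
    (an assumed name) when no key is passed explicitly.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided and COGNITIVE_ACTIONS_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; passing it explicitly is convenient in tests.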

Cognitive Actions Overview

Generate 3D Assets from Images

The Generate 3D Assets from Images action uses the TRELLIS model to convert input images into high-quality 3D assets. It can return a GLB model as well as color and normal-map video renders, depending on the options you enable.

  • Category: 3D Reconstruction

Input

The input for this action is structured as follows:

{
  "seed": 0,
  "images": [
    "https://example.com/image1.png",
    "https://example.com/image2.png"
  ],
  "textureSize": 2048,
  "generate3DModel": true,
  "generateColorVideo": true,
  "meshSimplification": 0.9,
  "randomizeSeedValue": true,
  "generateNormalVideo": false,
  "sparseStructureSamplingSteps": 12,
  "structuredLatentSamplingSteps": 12,
  "sparseStructureGuidanceStrength": 7.5,
  "structuredLatentGuidanceStrength": 3
}

Required Fields:

  • images (array of strings): An array of image URIs for generating the 3D asset.

Optional Fields:

  • seed (integer): A random seed for generation, defaults to 0.
  • textureSize (integer): Size of the texture in pixels (512 - 2048), applicable when generate3DModel is true.
  • generate3DModel (boolean): Indicates if a GLB format 3D model should be generated, defaults to false.
  • generateColorVideo (boolean): Determines if a color video render is generated, defaults to true.
  • meshSimplification (number): Degree of mesh simplification during GLB model extraction (0.9 - 0.98), applicable when generate3DModel is true.
  • randomizeSeedValue (boolean): Randomizes the seed if true, defaults to true.
  • generateNormalVideo (boolean): Indicates if a normal map video render should be generated, defaults to false.
  • sparseStructureSamplingSteps (integer): Number of sampling steps for sparse structure generation (1 - 50).
  • structuredLatentSamplingSteps (integer): Number of sampling steps for structured latent generation (1 - 50).
  • sparseStructureGuidanceStrength (number): Guidance strength for sparse structure generation (0 - 10).
  • structuredLatentGuidanceStrength (number): Guidance strength for structured latent generation (0 - 10).
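The documented ranges can be checked on the client before a request is sent. The sketch below is one way to do that: the bounds are copied from the field list above, while the function name validate_inputs and the rule that images must be non-empty are assumptions made for illustration. The service may apply different or additional checks.

```python
# Inclusive (min, max) bounds taken from the field descriptions above.
RANGES = {
    "textureSize": (512, 2048),
    "meshSimplification": (0.9, 0.98),
    "sparseStructureSamplingSteps": (1, 50),
    "structuredLatentSamplingSteps": (1, 50),
    "sparseStructureGuidanceStrength": (0, 10),
    "structuredLatentGuidanceStrength": (0, 10),
}


def validate_inputs(payload):
    """Return a list of problems found in the payload; an empty list means none."""
    problems = []

    # "images" is the only required field. Requiring at least one entry is an assumption.
    images = payload.get("images")
    if (
        not isinstance(images, list)
        or not images
        or not all(isinstance(uri, str) for uri in images)
    ):
        problems.append("images must be a non-empty list of URI strings")

    # Check only the optional numeric fields that are actually present.
    for name, (low, high) in RANGES.items():
        if name not in payload:
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name} must be a number")
        elif not low <= value <= high:
            problems.append(f"{name}={value} is outside the range {low} to {high}")

    return problems
```

Returning a list of problems, rather than raising on the first one, lets a caller report every issue in a single pass.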

Example Input

Here’s an example of how to structure the input JSON payload:

{
  "seed": 0,
  "images": [
    "https://replicate.delivery/pbxt/MClj4HeBGlMw8Jwr8nRJgG4gtSMuIzHYZmsV2XKeJkYtqFYg/yoimiya_3.png",
    "https://replicate.delivery/pbxt/MClj53w5pbLeLnZuBtDdhqIyolFZBXJ30nlM2d3IeCNfbawR/yoimiya_2.png",
    "https://replicate.delivery/pbxt/MClj4vk3vYcbRp88EPypUzwUnJFScjLLEqTDgVNKiQg2LiRS/yoimiya_1.png"
  ],
  "textureSize": 2048,
  "generate3DModel": true,
  "generateColorVideo": true,
  "meshSimplification": 0.9,
  "randomizeSeedValue": true,
  "generateNormalVideo": false,
  "sparseStructureSamplingSteps": 12,
  "structuredLatentSamplingSteps": 12,
  "sparseStructureGuidanceStrength": 7.5,
  "structuredLatentGuidanceStrength": 3
}

Output

The action typically returns the following output structure:

{
  "model_file": "https://assets.cognitiveactions.com/invocations/e01b8dd3-68b0-400a-a728-4a95b4d74b8d/773bb4fc-06bc-4c97-973a-ac054902b4ca.glb",
  "color_video": "https://assets.cognitiveactions.com/invocations/e01b8dd3-68b0-400a-a728-4a95b4d74b8d/7f100956-768a-4ef7-92d7-156418e96e80.mp4",
  "gaussian_ply": null,
  "normal_video": null,
  "combined_video": null,
  "no_background_images": null
}

Key Output Fields:

  • model_file: URL to the generated 3D model in GLB format.
  • color_video: URL to the generated color video.
  • The remaining fields (gaussian_ply, normal_video, combined_video, no_background_images) return null when they do not apply to the request.
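Because several output fields can be null, client code usually filters them out before downloading anything. The sketch below shows one way to do this; the function name collect_asset_urls is hypothetical, and the handling of list-valued fields (for example, no_background_images possibly holding several URLs) is an assumption, since the sample output only shows it as null.

```python
def collect_asset_urls(result):
    """Map each populated output field to a list of URLs, skipping null fields."""
    assets = {}
    for field, value in result.items():
        if value is None:
            continue  # Field was not produced for this request
        # Normalize to a list so callers can treat every field the same way.
        assets[field] = list(value) if isinstance(value, list) else [value]
    return assets


# Shape taken from the sample output above, with shortened placeholder URLs.
sample = {
    "model_file": "https://example.com/model.glb",
    "color_video": "https://example.com/color.mp4",
    "gaussian_ply": None,
    "normal_video": None,
    "combined_video": None,
    "no_background_images": None,
}

for field, urls in collect_asset_urls(sample).items():
    print(field, urls)
```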

Conceptual Usage Example (Python)

Here's a conceptual Python code snippet demonstrating how to call the action:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "595e5647-f9a3-4dc9-80eb-0d97b333036d"  # Action ID for Generate 3D Assets from Images

# Construct the input payload based on the action's requirements
payload = {
    "seed": 0,
    "images": [
        "https://replicate.delivery/pbxt/MClj4HeBGlMw8Jwr8nRJgG4gtSMuIzHYZmsV2XKeJkYtqFYg/yoimiya_3.png",
        "https://replicate.delivery/pbxt/MClj53w5pbLeLnZuBtDdhqIyolFZBXJ30nlM2d3IeCNfbawR/yoimiya_2.png",
        "https://replicate.delivery/pbxt/MClj4vk3vYcbRp88EPypUzwUnJFScjLLEqTDgVNKiQg2LiRS/yoimiya_1.png"
    ],
    "textureSize": 2048,
    "generate3DModel": True,
    "generateColorVideo": True,
    "meshSimplification": 0.9,
    "randomizeSeedValue": True,
    "generateNormalVideo": False,
    "sparseStructureSamplingSteps": 12,
    "structuredLatentSamplingSteps": 12,
    "sparseStructureGuidanceStrength": 7.5,
    "structuredLatentGuidanceStrength": 3
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=300,  # Assumed limit; without a timeout, requests can wait indefinitely
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to the raw text
            print(f"Response body: {e.response.text}")

In this code snippet:

  • Replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key.
  • The action_id corresponds to the action you wish to execute.
  • The payload is structured according to the action's input requirements.

Conclusion

The firtoz/trellis Cognitive Action for generating 3D assets from images provides an efficient way to create detailed 3D models. With a single HTTP request, developers can integrate this functionality into their applications. Possible use cases include game development, virtual reality, and animation.