Transform Images into Stunning 3D Assets with firtoz/trellis Cognitive Actions

Creating high-quality 3D assets from 2D images is a game-changer for developers and artists alike. The firtoz/trellis API offers Cognitive Actions designed specifically for this purpose. Using the TRELLIS model, you can convert input images into detailed 3D assets, including 3D Gaussians, Radiance Fields, and textured meshes. This article walks through integrating the "Generate 3D Assets from Images" action into your applications.
Prerequisites
To get started with the Cognitive Actions, you will need:
- An API key for the Cognitive Actions platform.
- Basic knowledge of making HTTP requests in your preferred programming language.
Authentication is typically handled by passing the API key in the request headers. The Python example later in this article assumes a Bearer token in the Authorization header.
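As a minimal sketch of this header-based pattern, the helper below reads the key from an environment variable instead of hard-coding it. The function name build_auth_headers and the environment variable COGNITIVE_ACTIONS_API_KEY are hypothetical, and the Bearer scheme is an assumption carried over from the usage example later in this article rather than confirmed platform behavior.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers carrying the API key.

    Assumption: the platform accepts a Bearer token in the
    Authorization header.
    """
    # Fall back to an environment variable (hypothetical name) when no key is passed in.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided and COGNITIVE_ACTIONS_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Keeping the key in the environment means it never lands in source control, and the explicit ValueError surfaces a missing key before any request is sent.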
Cognitive Actions Overview
Generate 3D Assets from Images
The Generate 3D Assets from Images action uses the TRELLIS model to convert input images into high-quality 3D assets. Depending on the options you set, it can return a GLB model as well as color and normal-map video renders.
- Category: 3D Reconstruction
Input
The input for this action is structured as follows:
{
  "seed": 0,
  "images": [
    "https://example.com/image1.png",
    "https://example.com/image2.png"
  ],
  "textureSize": 2048,
  "generate3DModel": true,
  "generateColorVideo": true,
  "meshSimplification": 0.9,
  "randomizeSeedValue": true,
  "generateNormalVideo": false,
  "sparseStructureSamplingSteps": 12,
  "structuredLatentSamplingSteps": 12,
  "sparseStructureGuidanceStrength": 7.5,
  "structuredLatentGuidanceStrength": 3
}
Required Fields:
- images (array of strings): An array of image URIs for generating the 3D asset.
Optional Fields:
- seed (integer): A random seed for generation, defaults to 0.
- textureSize (integer): Size of the texture in pixels (512 - 2048), applicable when generate3DModel is true.
- generate3DModel (boolean): Indicates if a GLB format 3D model should be generated, defaults to false.
- generateColorVideo (boolean): Determines if a color video render is generated, defaults to true.
- meshSimplification (number): Degree of mesh simplification during GLB model extraction (0.9 - 0.98), applicable when generate3DModel is true.
- randomizeSeedValue (boolean): Randomizes the seed if true, defaults to true.
- generateNormalVideo (boolean): Indicates if a normal map video render should be generated, defaults to false.
- sparseStructureSamplingSteps (integer): Number of sampling steps for sparse structure generation (1 - 50).
- structuredLatentSamplingSteps (integer): Number of sampling steps for structured latent generation (1 - 50).
- sparseStructureGuidanceStrength (number): Guidance strength for sparse structure generation (0 - 10).
- structuredLatentGuidanceStrength (number): Guidance strength for structured latent generation (0 - 10).
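Because several of these fields have documented ranges, it can help to check a payload locally before sending it. The following is a minimal sketch, not part of the platform: the function name validate_inputs and the RANGES table are hypothetical, the ranges are copied from the field list above, and treating them as inclusive (and requiring at least one image) is an assumption.

```python
# Documented (min, max) ranges for the numeric optional fields.
# Assumption: both ends of each range are inclusive.
RANGES = {
    "textureSize": (512, 2048),
    "meshSimplification": (0.9, 0.98),
    "sparseStructureSamplingSteps": (1, 50),
    "structuredLatentSamplingSteps": (1, 50),
    "sparseStructureGuidanceStrength": (0, 10),
    "structuredLatentGuidanceStrength": (0, 10),
}


def validate_inputs(payload):
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []

    # images is the only required field.
    images = payload.get("images")
    if not isinstance(images, list) or not images:
        problems.append("images must be a non-empty list of image URIs")
    elif not all(isinstance(uri, str) and uri for uri in images):
        problems.append("every entry in images must be a non-empty string")

    # Range-check only the optional numeric fields that are present.
    for field, (low, high) in RANGES.items():
        if field not in payload:
            continue
        value = payload[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{field} must be a number")
        elif not low <= value <= high:
            problems.append(f"{field} must be between {low} and {high}, got {value}")

    return problems
```

For example, validate_inputs({"images": [], "textureSize": 4096}) returns a list with two entries, one for the empty images array and one for the out-of-range texture size.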
Example Input
Here’s an example of how to structure the input JSON payload:
{
  "seed": 0,
  "images": [
    "https://replicate.delivery/pbxt/MClj4HeBGlMw8Jwr8nRJgG4gtSMuIzHYZmsV2XKeJkYtqFYg/yoimiya_3.png",
    "https://replicate.delivery/pbxt/MClj53w5pbLeLnZuBtDdhqIyolFZBXJ30nlM2d3IeCNfbawR/yoimiya_2.png",
    "https://replicate.delivery/pbxt/MClj4vk3vYcbRp88EPypUzwUnJFScjLLEqTDgVNKiQg2LiRS/yoimiya_1.png"
  ],
  "textureSize": 2048,
  "generate3DModel": true,
  "generateColorVideo": true,
  "meshSimplification": 0.9,
  "randomizeSeedValue": true,
  "generateNormalVideo": false,
  "sparseStructureSamplingSteps": 12,
  "structuredLatentSamplingSteps": 12,
  "sparseStructureGuidanceStrength": 7.5,
  "structuredLatentGuidanceStrength": 3
}
Output
The action typically returns the following output structure:
{
  "model_file": "https://assets.cognitiveactions.com/invocations/e01b8dd3-68b0-400a-a728-4a95b4d74b8d/773bb4fc-06bc-4c97-973a-ac054902b4ca.glb",
  "color_video": "https://assets.cognitiveactions.com/invocations/e01b8dd3-68b0-400a-a728-4a95b4d74b8d/7f100956-768a-4ef7-92d7-156418e96e80.mp4",
  "gaussian_ply": null,
  "normal_video": null,
  "combined_video": null,
  "no_background_images": null
}
Key Output Fields:
- model_file: URL to the generated 3D model in GLB format.
- color_video: URL to the generated color video.
- Other fields may return null if not applicable.
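Since several output fields can be null, a caller usually wants to keep only the ones that carry a URL. The sketch below shows one way to do that and to derive a local file name from each URL. The helper names collect_output_urls and local_filename are hypothetical, and the code assumes each populated field holds a single URL string, as in the sample output above; any other value type is skipped.

```python
import os
from urllib.parse import urlparse


def collect_output_urls(result):
    """Keep only the output fields whose value is a URL string, dropping nulls."""
    return {name: value for name, value in result.items() if isinstance(value, str)}


def local_filename(name, url):
    """Build a file name such as 'model_file.glb' from a field name and its URL."""
    # Take the extension from the URL path so query strings are ignored.
    extension = os.path.splitext(urlparse(url).path)[1]
    return f"{name}{extension}"
```

Applied to the sample output, collect_output_urls keeps only model_file and color_video, and local_filename maps them to model_file.glb and color_video.mp4.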
Conceptual Usage Example (Python)
Here's a conceptual Python code snippet demonstrating how to call the action:
import json

import requests

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "595e5647-f9a3-4dc9-80eb-0d97b333036d"  # Action ID for Generate 3D Assets from Images

# Construct the input payload based on the action's requirements
payload = {
    "seed": 0,
    "images": [
        "https://replicate.delivery/pbxt/MClj4HeBGlMw8Jwr8nRJgG4gtSMuIzHYZmsV2XKeJkYtqFYg/yoimiya_3.png",
        "https://replicate.delivery/pbxt/MClj53w5pbLeLnZuBtDdhqIyolFZBXJ30nlM2d3IeCNfbawR/yoimiya_2.png",
        "https://replicate.delivery/pbxt/MClj4vk3vYcbRp88EPypUzwUnJFScjLLEqTDgVNKiQg2LiRS/yoimiya_1.png",
    ],
    "textureSize": 2048,
    "generate3DModel": True,
    "generateColorVideo": True,
    "meshSimplification": 0.9,
    "randomizeSeedValue": True,
    "generateNormalVideo": False,
    "sparseStructureSamplingSteps": 12,
    "structuredLatentSamplingSteps": 12,
    "sparseStructureGuidanceStrength": 7.5,
    "structuredLatentGuidanceStrength": 3,
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:
            # The error body was not valid JSON; JSON decode errors subclass ValueError
            print(f"Response body: {e.response.text}")
In this code snippet:
- Replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key.
- The action_id corresponds to the action you wish to execute.
- The payload is structured according to the action's input requirements.
Conclusion
The firtoz/trellis Cognitive Action for generating 3D assets from images provides an efficient way to create visually stunning and detailed 3D models. With just a few lines of code, developers can integrate this functionality into their applications, enhancing user experiences and creative possibilities. Explore further use cases, such as game development, virtual reality, and animation, as you harness the power of 3D asset generation!