Transforming Images into 3D Shapes with Hunyuan3D Cognitive Actions

22 Apr 2025

Generating 3D shapes from 2D images is a common need for developers and artists working in 3D modeling and reconstruction. The ndreca/hunyuan3d-2-test API provides a Cognitive Action for this transformation that creates a 3D mesh model directly from an input image. This article explores the capabilities of the Generate 3D Shape from Image action, explains how to use it, and provides conceptual examples to help you integrate it into your applications.

Prerequisites

Before diving into the Cognitive Actions, ensure you have the following:

  • An API key for the Cognitive Actions platform that enables access to the Hunyuan3D service.
  • Basic knowledge of JSON and RESTful APIs, as the integration will involve sending and receiving JSON payloads.

When authenticating your requests, you'll typically include your API key in the request headers. This ensures that only authorized users can invoke the Cognitive Actions.
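
As a minimal sketch of that authentication step, the helper below reads the key from an environment variable instead of hard-coding it. The variable name COGNITIVE_ACTIONS_API_KEY, the function name build_auth_headers, and the Bearer scheme are assumptions for illustration; check the platform documentation for the header format it actually expects.

```python
import os


def build_auth_headers(env_var: str = "COGNITIVE_ACTIONS_API_KEY") -> dict:
    """Build request headers from an API key stored in an environment variable.

    The Bearer scheme is an assumption, not a confirmed platform requirement.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Keeping the key out of source code makes it less likely to end up in version control.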

Cognitive Actions Overview

Generate 3D Shape from Image

This action generates a 3D shape from an image referenced by URI. It exposes several options: a random seed, the number of inference steps, the guidance scale, the octree resolution, and background removal.

Input

The input for this action requires the following fields:

  • seed (integer, optional): A seed for the random number generator used during generation. Reusing a seed with the same inputs is the usual way to get repeatable output. Default value is 1234.
  • image (string, required): A valid URI pointing to the input image used to create the 3D shape.
  • steps (integer, optional): Specifies the number of inference steps for generation, ranging from 20 to 50, with a default of 50.
  • guidanceScale (number, optional): Controls the influence of the guidance during generation, ranging from 1 to 20, with a default of 5.5.
  • octreeResolution (integer, optional): Defines the resolution of the octree used for mesh generation, with options 256, 384, and 512. Default is 512.
  • removeBackground (boolean, optional): Indicates whether to remove the background from the input image. Default is true.
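
The constraints listed above can be checked on the client before a request is sent. The following is a rough sketch only: validate_inputs is a hypothetical helper, not part of the API, and it encodes just the ranges and defaults stated in this article.

```python
ALLOWED_OCTREE_RESOLUTIONS = (256, 384, 512)


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_inputs(inputs: dict) -> list:
    """Return a list of problems found in an input payload (empty if none)."""
    problems = []

    image = inputs.get("image")
    if not isinstance(image, str) or not image:
        problems.append("image is required and must be a non-empty URI string")

    if not _is_int(inputs.get("seed", 1234)):
        problems.append("seed must be an integer")

    steps = inputs.get("steps", 50)
    if not _is_int(steps) or not 20 <= steps <= 50:
        problems.append("steps must be an integer from 20 to 50")

    guidance = inputs.get("guidanceScale", 5.5)
    is_number = isinstance(guidance, (int, float)) and not isinstance(guidance, bool)
    if not is_number or not 1 <= guidance <= 20:
        problems.append("guidanceScale must be a number from 1 to 20")

    octree = inputs.get("octreeResolution", 512)
    if not _is_int(octree) or octree not in ALLOWED_OCTREE_RESOLUTIONS:
        problems.append("octreeResolution must be 256, 384, or 512")

    if not isinstance(inputs.get("removeBackground", True), bool):
        problems.append("removeBackground must be a boolean")

    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem at once. The service remains the authority on what it accepts.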

Here's an example JSON payload for the input:

{
  "seed": 1234,
  "image": "https://replicate.delivery/pbxt/MVC2B2XKgv4X13qIpW6t2m59EVfY2CqaS9e2CSsWNHPJjQAd/image.png",
  "steps": 50,
  "guidanceScale": 5.5,
  "octreeResolution": 256,
  "removeBackground": true
}

Output

Upon successful execution, the action returns a JSON object containing the generated 3D mesh file's URI. Here is an example of the expected output:

{
  "mesh": "https://assets.cognitiveactions.com/invocations/13f5056b-1185-4e62-848d-1b0894a6dd2d/dd9fa22b-42be-4ea5-b993-4d0a06d6d74d.glb"
}
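
To use the result, you would typically read the mesh field and save the file locally. The sketch below assumes the response has exactly the shape shown above and that the mesh URI can be fetched without extra authentication; mesh_filename and download_mesh are hypothetical helper names.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlparse


def mesh_filename(result: dict) -> str:
    """Derive a local file name from the last path segment of the mesh URI."""
    return posixpath.basename(urlparse(result["mesh"]).path)


def download_mesh(result: dict, directory: str = ".") -> str:
    """Download the mesh into `directory` and return the local path.

    Assumes the URI is directly downloadable, which is not confirmed here.
    """
    target = os.path.join(directory, mesh_filename(result))
    urllib.request.urlretrieve(result["mesh"], target)
    return target
```

In the example output the file name ends in .glb, so the saved file should open in tools that read binary glTF.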

Conceptual Usage Example (Python)

Below is a conceptual Python snippet demonstrating how to invoke the Generate 3D Shape from Image action. This example focuses on structuring the input JSON payload correctly and sending it to the Cognitive Actions endpoint.

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "f2b9bbf3-55cb-480f-92ee-d5bc446b6d12"  # Action ID for Generate 3D Shape from Image

# Construct the input payload based on the action's requirements
payload = {
    "seed": 1234,
    "image": "https://replicate.delivery/pbxt/MVC2B2XKgv4X13qIpW6t2m59EVfY2CqaS9e2CSsWNHPJjQAd/image.png",
    "steps": 50,
    "guidanceScale": 5.5,
    "octreeResolution": 256,
    "removeBackground": True  # Python boolean; serialized as JSON true
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body was not valid JSON; covers the JSON decode errors requests can raise
            print(f"Response body: {e.response.text}")

In this snippet:

  • Replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key.
  • Set COGNITIVE_ACTIONS_EXECUTE_URL to the actual Cognitive Actions endpoint; the URL and the request body structure shown here are hypothetical.
  • The action_id corresponds to the Generate 3D Shape from Image action.
  • The payload is constructed based on the required input schema.

Conclusion

The Generate 3D Shape from Image Cognitive Action turns a 2D image into a 3D mesh model with a single API call. Its parameters (seed, inference steps, guidance scale, octree resolution, and background removal) let developers tune the output for specific needs, whether for gaming, simulations, or advanced visualizations.

As you explore the Hunyuan3D API, consider experimenting with different images and parameters to unlock the full potential of 3D reconstruction. Happy coding!