Create Stunning 3D Models with cjwbw/shap-e Cognitive Actions

21 Apr 2025

The cjwbw/shap-e spec gives developers Cognitive Actions built on OpenAI's Shap-E model. These actions generate 3D models from text prompts or image inputs, which makes them useful for applications in gaming, virtual reality, and design. With options for render mode, render size, and batch generation, the pre-built action simplifies 3D asset generation.

Prerequisites

Before diving into the Cognitive Actions, ensure you have the following:

  • An API key for accessing the Cognitive Actions platform.
  • Basic familiarity with JSON and making HTTP requests.
  • A suitable development environment set up for making API calls.

Authentication typically involves passing your API key in the request headers.
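
As a minimal sketch, header construction could be wrapped in a small helper that reads the key from an environment variable rather than hard-coding it. The variable name COGNITIVE_ACTIONS_API_KEY and the Bearer scheme are assumptions carried over from the illustrative Python example later in this post, not confirmed platform details.

```python
import os


def build_headers(api_key=None):
    """Return request headers carrying the API key.

    The Bearer scheme and the environment variable name are assumptions.
    """
    # Prefer an explicit key; otherwise fall back to the (hypothetical) env var.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError(
            "No API key provided and COGNITIVE_ACTIONS_API_KEY is not set"
        )
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Keeping the key out of source files makes it harder to commit it to version control by accident.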

Cognitive Actions Overview

Generate 3D Implicit Functions

The Generate 3D Implicit Functions action creates conditional 3D implicit functions using the Shap-E model, which generates 3D models conditioned on a text prompt or an image. The action supports batch generation and can render the output in one of two modes: NeRF (neural radiance field) or STF (signed distance function and texture field, used for textured meshes).

Input:

The action accepts a structured input as defined in its schema:

  • image (optional): URI for a synthetic view image used to generate the 3D model. For optimal results, ensure the image has no background.
  • prompt (optional): Text prompt used for 3D model generation, which is ignored if an image is provided. Example: "a shark".
  • saveMesh (optional): Boolean flag to save the latents as meshes. Default is false.
  • batchSize (optional): Number of 3D models to generate per request. Default is 1.
  • renderMode (optional): Specifies the render mode ("nerf" or "stf"). Default is "nerf".
  • renderSize (optional): Defines the renderer size, with a default of 128.
  • guidanceScale (optional): Adjusts the guidance scale for model generation, defaulting to 15.
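
The fields above could be assembled with a small helper like the sketch below. The field names and defaults come from the list above; the function name build_payload and the validation rules (requiring a prompt or an image, and a batch size of at least 1) are assumptions added for illustration, not documented behavior of the action.

```python
ALLOWED_RENDER_MODES = {"nerf", "stf"}


def build_payload(prompt=None, image=None, save_mesh=False, batch_size=1,
                  render_mode="nerf", render_size=128, guidance_scale=15):
    """Assemble an input payload using the documented field names and defaults."""
    # Assumption: although both fields are optional, at least one is needed.
    if prompt is None and image is None:
        raise ValueError("Provide either a text prompt or an image URI")
    if render_mode not in ALLOWED_RENDER_MODES:
        raise ValueError(
            f"renderMode must be one of {sorted(ALLOWED_RENDER_MODES)}"
        )
    # Assumption: a batch size below 1 is treated as invalid.
    if batch_size < 1:
        raise ValueError("batchSize must be at least 1")

    payload = {
        "saveMesh": save_mesh,
        "batchSize": batch_size,
        "renderMode": render_mode,
        "renderSize": render_size,
        "guidanceScale": guidance_scale,
    }
    if image is not None:
        # The prompt is ignored when an image is supplied, so it is left out.
        payload["image"] = image
    else:
        payload["prompt"] = prompt
    return payload
```

Calling build_payload(prompt="a shark") yields the defaults listed above plus the prompt, while passing both a prompt and an image keeps only the image.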

Example Input:

{
  "prompt": "a shark",
  "batchSize": 1,
  "renderMode": "nerf",
  "renderSize": 128,
  "guidanceScale": 15
}

Output:

The action typically returns a list of URLs pointing to the generated output. For example:

[
  "https://assets.cognitiveactions.com/invocations/8b208a8a-e68f-441f-a41d-7cbf701f7fd3/bbc4cbea-47ce-4d4b-8707-bc4e0ddcf406.gif"
]

In this case, the output is a link to a GIF representation of the generated 3D model.
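
Since the output is a list of URLs, a caller will usually want a local filename for each entry before downloading it. The sketch below assumes the response body is a plain list of URL strings, as in the example above; the helper name output_filenames is hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def output_filenames(result):
    """Map each URL in the action's output list to its trailing filename."""
    # Assumption: the output is a list of URL strings, as in the sample above.
    if not isinstance(result, list):
        raise TypeError("Expected the output to be a list of URL strings")
    names = []
    for url in result:
        # URL paths always use forward slashes, so posixpath is the right tool.
        names.append(posixpath.basename(urlparse(url).path))
    return names
```

Each name can then be paired with its URL when saving the rendered files to disk.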

Conceptual Usage Example (Python):

Here's how you might structure a call to execute this action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "1f88b09d-5a96-497e-8069-55dcd9ea9195"  # Action ID for Generate 3D Implicit Functions

# Construct the input payload based on the action's requirements
payload = {
    "prompt": "a shark",
    "batchSize": 1,
    "renderMode": "nerf",
    "renderSize": 128,
    "guidanceScale": 15
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Avoid hanging indefinitely; adjust for long-running generations
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
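        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:
            # response.json() raises a ValueError subclass when the body is not
            # valid JSON; the exact class depends on the requests version.
            print(f"Response body: {e.response.text}")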

In this example, the action ID and the input payload are sent together in the JSON body of a single POST request. The endpoint URL and request structure are illustrative and should be adapted to your specific implementation.

Conclusion

The cjwbw/shap-e Cognitive Actions offer a practical way to generate 3D models from text prompts or images, opening up creative applications across gaming, virtual reality, and design. By using these actions, developers can streamline their workflow and add 3D content to their applications. Explore the possibilities and take your projects to the next dimension!