Generate Stunning Images with the chenxwh/sana Cognitive Actions

The chenxwh/sana API exposes a set of Cognitive Actions for generating high-resolution images across a wide range of artistic styles. The underlying Sana framework uses a Linear Diffusion Transformer to produce images up to 4096x4096 pixels, which makes it a good fit for applications that need high-quality visuals. Because the actions are pre-built, developers can save time and focus on integrating visual content into their applications.
Prerequisites
Before diving into the Cognitive Actions, ensure you have the following:
- An API key for accessing the chenxwh/sana service.
- Basic knowledge of JSON format for structuring requests and handling responses.
- Familiarity with making HTTP requests, particularly POST requests.
For authentication, you'll typically include your API key in the request headers to ensure secure access to the Cognitive Actions.
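As a minimal sketch of that header setup, the helper below builds the request headers from an API key. It assumes the Bearer token scheme used in this article's Python example; the environment variable name COGNITIVE_ACTIONS_API_KEY and the function name build_headers are illustrative choices, not part of the service.

```python
import os


def build_headers(api_key=None):
    """Return request headers carrying the API key as a Bearer token."""
    # COGNITIVE_ACTIONS_API_KEY is a hypothetical environment variable name;
    # reading the key from the environment keeps it out of source control.
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Calling build_headers("my-key") returns a dictionary that can be passed as the headers argument of an HTTP client.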
Cognitive Actions Overview
Generate High-Resolution Images with Sana
Description: This action generates images with a wide artistic range and resolutions up to 4096x4096 using the Sana framework. It utilizes the Linear Diffusion Transformer for efficient high-resolution, high-quality image synthesis with strong text-image alignment.
Category: image-generation
Input
The action expects a JSON payload with the following fields, shown here with example values:
{
  "seed": 12345,
  "width": 1024,
  "height": 1024,
  "prompt": "a cyberpunk cat with a neon sign that says 'Sana'",
  "guidanceScale": 5,
  "negativePrompt": "",
  "pageGuidanceScale": 2,
  "modelConfiguration": "1600M-1024px",
  "numberOfInferenceSteps": 18
}
- seed (optional, integer): Random seed for reproducible outputs. If omitted, a random seed is used.
- width (optional, integer): The width of the output image in pixels (default: 1024).
- height (optional, integer): The height of the output image in pixels (default: 1024).
- prompt (required, string): The text prompt used to generate the image (default: "a cyberpunk cat with a neon sign that says 'Sana'").
- guidanceScale (optional, number): Classifier-free guidance scale from 1 to 20 (default: 5).
- negativePrompt (optional, string): Elements to exclude from the output.
- pageGuidanceScale (optional, number): Scale for PAG (perturbed-attention guidance) from 1 to 20 (default: 2).
- modelConfiguration (optional, string): Specifies the model configuration (default: "1600M-1024px").
- numberOfInferenceSteps (optional, integer): Number of denoising steps used during inference (default: 18).
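The field list above can be turned into a small payload builder that catches out-of-range values before a request is sent. This is only a sketch: the function name build_sana_payload and its keyword names are hypothetical, the 1 to 20 ranges come from the list above, and the "positive integer" check on width, height, and step count is an assumption rather than a documented rule.

```python
def build_sana_payload(prompt, width=1024, height=1024, guidance_scale=5,
                       pag_guidance_scale=2, steps=18, negative_prompt="",
                       model_configuration=None, seed=None):
    """Build an input payload for the Sana action, validating documented ranges."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")

    # Both guidance scales are documented as ranging from 1 to 20.
    for name, value in (("guidanceScale", guidance_scale),
                        ("pageGuidanceScale", pag_guidance_scale)):
        if not 1 <= value <= 20:
            raise ValueError(f"{name} must be between 1 and 20, got {value}")

    # Assumption: sizes and step counts must be positive integers.
    for name, value in (("width", width), ("height", height),
                        ("numberOfInferenceSteps", steps)):
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value}")

    payload = {
        "prompt": prompt,
        "width": width,
        "height": height,
        "guidanceScale": guidance_scale,
        "negativePrompt": negative_prompt,
        "pageGuidanceScale": pag_guidance_scale,
        "numberOfInferenceSteps": steps,
    }
    # Optional fields are only sent when set, so the service applies its defaults.
    if seed is not None:
        payload["seed"] = seed
    if model_configuration is not None:
        payload["modelConfiguration"] = model_configuration
    return payload
```

Leaving seed and modelConfiguration out of the payload unless they are set mirrors the behavior described above, where an omitted seed is randomized and the model configuration falls back to its default.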
Example Input
{
  "width": 1024,
  "height": 1024,
  "prompt": "a cyberpunk cat with a neon sign that says \"Sana\"",
  "guidanceScale": 5,
  "negativePrompt": "",
  "pageGuidanceScale": 2,
  "numberOfInferenceSteps": 18
}
Output
Upon successful execution, the action returns a URL pointing to the generated image:
https://assets.cognitiveactions.com/invocations/d9f2895b-9e3c-4527-a7b6-88a148e9c39e/713c47ce-a321-4ba1-a361-428ae2f69249.png
This URL can be directly used to display the image in your application.
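If you would rather keep a local copy than link to the hosted file, the returned URL can be downloaded with the standard library. The sketch below assumes the URL is publicly readable without extra headers, which this article does not confirm; filename_from_url and download_image are illustrative helper names.

```python
import os
import posixpath
import shutil
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url):
    """Return the last path segment of the URL, ignoring any query string."""
    name = posixpath.basename(urlparse(url).path)
    if not name:
        raise ValueError(f"URL has no file name: {url}")
    return name


def download_image(url, dest_dir="."):
    """Stream the image at `url` into `dest_dir` and return the local path."""
    os.makedirs(dest_dir, exist_ok=True)
    local_path = os.path.join(dest_dir, filename_from_url(url))
    # Assumption: the asset URL can be fetched with a plain GET request.
    with urllib.request.urlopen(url, timeout=30) as response, \
            open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return local_path
```

For the example URL above, filename_from_url yields 713c47ce-a321-4ba1-a361-428ae2f69249.png, so download_image(url, "images") would write to that file name inside an images directory.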
Conceptual Usage Example (Python)
Here's a conceptual Python code snippet demonstrating how to call the Generate High-Resolution Images with Sana action:
import json

import requests

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint
action_id = "e36bbb40-982e-4000-a3fb-88e48868c7c5"  # Action ID for Generate High-Resolution Images with Sana

# Construct the input payload based on the action's requirements
payload = {
    "width": 1024,
    "height": 1024,
    "prompt": "a cyberpunk cat with a neon sign that says \"Sana\"",
    "guidanceScale": 5,
    "negativePrompt": "",
    "pageGuidanceScale": 2,
    "numberOfInferenceSteps": 18
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},
        timeout=120,  # Fail instead of hanging forever if the service stalls
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body was not valid JSON
            print(f"Response body: {e.response.text}")
In this code, replace YOUR_COGNITIVE_ACTIONS_API_KEY with your actual API key. The payload follows the action's input fields. On success, the script prints the JSON response, which should contain the image URL; on failure, it prints the error along with the status code and response body when they are available.
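The script prints the whole response rather than picking out the image URL, because the exact response shape is not documented here. One hedged way to cope with that is to walk the parsed JSON and collect anything that looks like an image URL. The helper name find_image_urls and the list of file extensions are assumptions made for illustration.

```python
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".webp")  # Assumed output formats


def find_image_urls(result):
    """Recursively collect strings that look like image URLs from parsed JSON."""
    urls = []
    if isinstance(result, str):
        # Ignore any query string when checking the file extension.
        path = result.lower().split("?")[0]
        if result.startswith(("http://", "https://")) and path.endswith(IMAGE_EXTENSIONS):
            urls.append(result)
    elif isinstance(result, dict):
        for value in result.values():
            urls.extend(find_image_urls(value))
    elif isinstance(result, (list, tuple)):
        for item in result:
            urls.extend(find_image_urls(item))
    return urls
```

Passing the parsed response to find_image_urls returns a list of matching URLs, in the order they appear, or an empty list if none are found. Once you know the real response structure, a direct key lookup is simpler and safer than this search.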
Conclusion
The chenxwh/sana Cognitive Actions give developers access to high-resolution image generation through a simple HTTP request. By integrating these actions into your applications, you can produce visual content tailored to your needs. As you explore the Sana framework, experiment with different prompts, guidance scales, and model configurations to see how each one affects the output. Happy coding!