Create Stunning Images with Edge Control using Flux Canny Pro Actions

The black-forest-labs/flux-canny-pro API offers Cognitive Actions for creating professional-quality images. One of its standout features is edge-guided generation based on Canny edge detection: the edges extracted from a control image constrain the layout of the generated image. This is especially useful for artists and developers who want to convert sketches into detailed artworks while preserving the original structure and composition.
In this blog post, we will explore the capabilities of the Generate Images with Edge Control action. We will cover how to integrate this action into your applications, including its input requirements, output expectations, and a conceptual Python example to get you started.
Prerequisites
Before diving into the Cognitive Actions, ensure you have the following:
- An API key for accessing the Cognitive Actions platform.
- Basic familiarity with making HTTP requests and handling JSON data.
- A development environment set up for making API calls (e.g., Python with the requests library).
Authentication
Authentication typically involves passing your API key in the headers of your requests. This ensures that your application can securely access the Cognitive Actions services.
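As a minimal sketch of the header-based authentication described above, the helper below builds request headers from an API key. The Bearer scheme matches the conceptual example later in this post; the helper name build_auth_headers and the environment variable name COGNITIVE_ACTIONS_API_KEY are assumptions made for illustration, not part of any documented API.

```python
import os


def build_auth_headers(api_key: str) -> dict:
    """Return HTTP headers carrying the API key as a Bearer token (assumed scheme)."""
    if not api_key:
        raise ValueError("An API key is required")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Reading the key from the environment keeps it out of source control.
# The variable name is hypothetical; fall back to a placeholder if it is unset.
api_key = os.environ.get("COGNITIVE_ACTIONS_API_KEY") or "YOUR_COGNITIVE_ACTIONS_API_KEY"
headers = build_auth_headers(api_key)
```

Keeping header construction in one place makes it easier to change the scheme if your platform documents a different one.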
Cognitive Actions Overview
Generate Images with Edge Control
The Generate Images with Edge Control action allows you to create professional images by utilizing edge-guided techniques. This action is especially beneficial for transforming sketches into intricate art and retexturing images while maintaining their original composition.
Input
The input to this action requires the following fields, as defined in its schema:
- controlImage (string, required): A URI pointing to the input image used for edge guidance. Accepted formats include jpeg, png, gif, or webp.
- prompt (string, required): A text description that serves as the basis for the image generation.
- seed (integer, optional): A random seed for reproducibility (default: not set).
- steps (integer, optional): Number of diffusion steps for image generation (default: 50, range: 15-50).
- guidance (number, optional): Balances adherence to the prompt and image diversity (default: 30, range: 1-50).
- outputFormat (string, optional): Desired output image format (default: "jpg").
- safetyTolerance (integer, optional): Moderation level for image safety (default: 2, range: 1-6).
- promptUpsampling (boolean, optional): Adjusts the prompt for more creative output (default: false).
Example Input:
{
  "steps": 28,
  "prompt": "a photo of a car on a city street",
  "guidance": 25,
  "controlImage": "https://replicate.delivery/pbxt/M0j11UQhwUWoxUQ9hJCOaALsAHTeoPZcGGtUf6n3BJxtKHul/output-14.webp",
  "outputFormat": "jpg",
  "safetyTolerance": 2,
  "promptUpsampling": false
}
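The required fields and numeric ranges listed above can be checked on the client before a request is sent. The following is a rough sketch under that assumption: validate_inputs is a hypothetical helper, not part of the API, and the service remains the authority on what it accepts.

```python
REQUIRED_FIELDS = ("controlImage", "prompt")

# (minimum, maximum, integer_only), taken from the field list above
NUMERIC_LIMITS = {
    "steps": (15, 50, True),
    "guidance": (1, 50, False),
    "safetyTolerance": (1, 6, True),
}


def validate_inputs(inputs: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = inputs.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field} is required and must be a non-empty string")
    for field, (low, high, integer_only) in NUMERIC_LIMITS.items():
        if field not in inputs:
            continue  # optional field; the service applies its default
        value = inputs[field]
        allowed = int if integer_only else (int, float)
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, allowed):
            problems.append(f"{field} must be {'an integer' if integer_only else 'a number'}")
        elif not low <= value <= high:
            problems.append(f"{field} must be between {low} and {high}")
    return problems
```

Calling validate_inputs on the example input above should return an empty list, while a payload with "steps": 10 would be flagged as out of range.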
Output
Upon successful execution, the action will return a URI pointing to the generated image. Here’s an example of the expected output:
Example Output:
https://assets.cognitiveactions.com/invocations/57809f7a-7519-4bed-ab9f-d67631a1244f/dab2e8f6-419e-43ac-9d8c-9b4b72a0f5c2.jpg
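Since the output is a URI, a typical next step is saving the image locally. The sketch below assumes the URI is publicly readable without extra headers and that you have already extracted it from the response; how the URI is wrapped in the response body is not specified here. The helper names filename_from_uri and download_image are hypothetical, and the standard library is used only to keep the sketch dependency-free.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def filename_from_uri(image_uri: str, default: str = "generated.jpg") -> str:
    """Derive a local file name from the last path segment of the URI."""
    name = os.path.basename(urlparse(image_uri).path)
    return name or default


def download_image(image_uri: str, directory: str = ".") -> str:
    """Stream the image at image_uri into directory and return the saved path."""
    path = os.path.join(directory, filename_from_uri(image_uri))
    with urllib.request.urlopen(image_uri, timeout=60) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)
    return path
```

For the example output above, filename_from_uri yields dab2e8f6-419e-43ac-9d8c-9b4b72a0f5c2.jpg.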
Conceptual Usage Example (Python)
Here is a conceptual Python code snippet illustrating how to call the Generate Images with Edge Control action through a hypothetical Cognitive Actions execution endpoint:
import json

import requests

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint
action_id = "d6089dd6-782e-4646-bc14-d633a5f24194"  # Action ID for Generate Images with Edge Control

# Construct the input payload based on the action's requirements
payload = {
    "steps": 28,
    "prompt": "a photo of a car on a city street",
    "guidance": 25,
    "controlImage": "https://replicate.delivery/pbxt/M0j11UQhwUWoxUQ9hJCOaALsAHTeoPZcGGtUf6n3BJxtKHul/output-14.webp",
    "outputFormat": "jpg",
    "safetyTolerance": 2,
    "promptUpsampling": False,  # Python boolean; serialized as JSON false
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},
    )
    response.raise_for_status()  # Raise an exception for bad status codes
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    # e.response is None when no HTTP response was received (e.g., connection error)
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")
In this snippet, replace YOUR_COGNITIVE_ACTIONS_API_KEY and the hypothetical endpoint with your actual API key and the service endpoint. The action_id is set to the specific ID for the Generate Images with Edge Control action. The input payload is constructed based on the schema outlined earlier.
Conclusion
The black-forest-labs/flux-canny-pro API gives developers a practical way to use edge-guided image generation. With the Generate Images with Edge Control action, you can create visuals that keep the structure of the control image while leaving room for creative expression through the prompt.
Next steps could involve experimenting with different prompts and control images to see the variety of outputs possible. Happy coding!