Generate Stunning Images with the "mangeohult/tbf" Cognitive Actions

The "mangeohult/tbf" spec provides Cognitive Actions that let developers create and modify images. These pre-built actions make it easier to integrate image generation and inpainting into your applications, whether you are generating new images from text prompts or inpainting existing ones.
Prerequisites
Before diving into using the Cognitive Actions, ensure you have the following:
- API Key: You will need an API key to authenticate your requests to the Cognitive Actions platform.
- Endpoint Setup: Familiarize yourself with the endpoint for executing the actions, which will be necessary for making API calls.
Authentication typically involves passing your API key in the headers of your requests, allowing secure access to the services.
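As a minimal sketch of header-based authentication, the helper below builds the request headers from an API key. The Bearer scheme matches the conceptual usage example in this article, but the environment variable name COGNITIVE_ACTIONS_API_KEY and the helper name build_auth_headers are assumptions made for illustration, not part of the platform's documented interface.

```python
import os


def build_auth_headers(api_key=None):
    """Return request headers that carry the API key as a Bearer token.

    Hypothetical helper: falls back to the COGNITIVE_ACTIONS_API_KEY
    environment variable (an assumed name) when no key is passed in.
    """
    key = api_key or os.environ.get("COGNITIVE_ACTIONS_API_KEY")
    if not key:
        raise ValueError("No API key provided; set COGNITIVE_ACTIONS_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control; passing it explicitly is still possible for tests or scripts.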
Cognitive Actions Overview
Generate Inpainted Image
The Generate Inpainted Image action allows you to create and modify images based on specific input parameters, such as masks, prompts, dimensions, and model choices. It leverages efficient models like "dev" and "schnell," enabling both detailed and rapid image generation.
Input
The input schema for this action accepts the following fields; only prompt is required:
- prompt (required): A detailed text description of the desired image.
- mask (optional): URI of an image mask used for inpainting.
- seed (optional): An integer for reproducibility of results.
- image (optional): URI of an input image for image-to-image or inpainting mode.
- model (optional): The model to use for inference (default is "dev").
- width (optional): The width of the generated image.
- height (optional): The height of the generated image.
- outputFormat (optional): The format of the output image (default is "webp").
- guidanceScale (optional): A scale that influences the generation process.
- numberOfOutputs (optional): Number of images to generate (default is 1).
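The field list above can be turned into a small payload builder. This is a sketch under stated assumptions: build_payload is a hypothetical helper, the rule that a mask needs an accompanying image is inferred from the field descriptions rather than stated by the schema, and the example URIs are placeholders.

```python
def build_payload(prompt, **options):
    """Assemble an input payload, enforcing the one required field."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")

    payload = {"prompt": prompt}
    # Drop options left as None so the service can apply its own defaults.
    payload.update({k: v for k, v in options.items() if v is not None})

    # Assumption (not stated in the schema): a mask is only meaningful
    # together with an input image to inpaint.
    if "mask" in payload and "image" not in payload:
        raise ValueError("mask was given without an image to inpaint")
    return payload


# Text-to-image: only the prompt plus a seed for reproducibility.
text_payload = build_payload("a teddy bear in a studio", seed=42)

# Inpainting: placeholder URIs, swap in your own image and mask.
inpaint_payload = build_payload(
    "a teddy bear wearing a football jersey",
    image="https://example.com/input.png",
    mask="https://example.com/mask.png",
    model="schnell",
)
```

Unknown option names are passed through unchanged, since the action may accept fields beyond those described in the list.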
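Example Input (a text-to-image request without image or mask; it also sets several fields not described in the list above: loraScale, outputQuality, extraLoraScale, promptStrength, aspectRatioOption, and numberOfInferenceSteps):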
{
  "model": "dev",
  "prompt": "Create a realistic and playful image of tbf with soft, light brown fur, taking part in a photoshoot. The teddy bear is wearing a Brommapojkarna football jersey, featuring the team’s iconic red and black stripes with a visible team logo on the chest. The jersey is slightly oversized, giving the bear a cute and charming look as it poses for the camera. The teddy bear stands confidently on a small platform or studio set, with professional lighting and camera equipment around, as if it’s being professionally photographed. The background is a clean, minimalistic studio setting, with soft lighting that highlights the texture of the bear’s fur and the jersey. The bear’s expression is friendly, and it might be holding a small football, adding to the sporty theme of the photoshoot.",
  "loraScale": 1,
  "outputFormat": "webp",
  "guidanceScale": 3.5,
  "outputQuality": 90,
  "extraLoraScale": 1,
  "promptStrength": 0.8,
  "numberOfOutputs": 1,
  "aspectRatioOption": "1:1",
  "numberOfInferenceSteps": 28
}
Output
The action typically returns a list of generated image URLs in the specified format.
Example Output:
[
"https://assets.cognitiveactions.com/invocations/1d453de7-7140-4507-95f1-aa6e2e0947dd/6fb1de13-680e-420d-a276-3ea0ad2d313b.webp"
]
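Once the action returns its list of URLs, the images can be saved locally. The sketch below uses only the standard library. It assumes the returned URLs can be fetched without extra authentication, which this article does not confirm, and the helper names filename_from_url and download_images are hypothetical.

```python
import os
import posixpath
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url):
    """Take the last path segment of the URL as the local file name."""
    return posixpath.basename(urlparse(url).path)


def download_images(urls, dest_dir="."):
    """Fetch each URL and write it into dest_dir; return the saved paths.

    Assumption: the URLs are directly downloadable without credentials.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(dest_dir, filename_from_url(url))
        with urlopen(url, timeout=30) as resp, open(path, "wb") as fh:
            fh.write(resp.read())
        saved.append(path)
    return saved
```

Calling download_images(result, "outputs") on the example output would write one .webp file named after the last segment of its URL.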
Conceptual Usage Example (Python)
Here’s a conceptual Python code snippet showing how to call the Generate Inpainted Image action using a hypothetical endpoint:
import json

import requests

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "eee21471-299e-483c-b9f3-ab102d3872b7"  # Action ID for Generate Inpainted Image

# Construct the input payload based on the action's requirements
payload = {
    "model": "dev",
    "prompt": "Create a realistic and playful image of tbf with soft, light brown fur...",
    "loraScale": 1,
    "outputFormat": "webp",
    "guidanceScale": 3.5,
    "outputQuality": 90,
    "extraLoraScale": 1,
    "promptStrength": 0.8,
    "numberOfOutputs": 1,
    "aspectRatioOption": "1:1",
    "numberOfInferenceSteps": 28,
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
        timeout=60,  # Avoid hanging forever if the service does not respond
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    # e.response is None for connection errors and timeouts
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:
            # The error body was not valid JSON; fall back to the raw text
            print(f"Response body: {e.response.text}")
In this example, you will need to replace the API key and endpoint with your own. The action ID corresponds to the Generate Inpainted Image action, and the input payload is structured according to the action's requirements.
Conclusion
The "mangeohult/tbf" Cognitive Actions give developers a way to generate and modify images from text prompts and a small set of parameters, including inpainting with a mask and a choice between the "dev" and "schnell" models. Experiment with different prompts, seeds, and configurations to see what these actions can produce in your projects.