Create Stunning Images from Text with blkoutuk/blkout Cognitive Actions

Generating images from text prompts is one of the most visible applications of modern AI. The blkoutuk/blkout Cognitive Actions give developers pre-built tools for adding image generation to their applications. With them, you can create images from custom text inputs, which makes these actions a good fit for artistic projects, content creation, and more.
Prerequisites
Before you get started with the blkoutuk/blkout Cognitive Actions, ensure you have the following:
- An API key for the Cognitive Actions platform.
- Access to the relevant API endpoints (conceptual structure is provided below).
- Basic familiarity with JSON and making HTTP requests.
Authentication typically involves passing your API key in the headers of your requests, allowing you to securely access the Cognitive Actions services.
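As a minimal sketch of the header-based authentication described above, the helper below builds request headers from an API key. The Bearer scheme matches the usage example later in this article; the environment variable name COGNITIVE_ACTIONS_API_KEY and the helper build_headers are assumptions made for illustration, not part of the platform.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return HTTP headers that carry the API key as a Bearer token."""
    if not api_key:
        raise ValueError("API key is empty; set COGNITIVE_ACTIONS_API_KEY first")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# ASSUMPTION: the variable name is illustrative. Reading the key from the
# environment keeps it out of source control; the placeholder is a fallback.
api_key = os.environ.get("COGNITIVE_ACTIONS_API_KEY", "YOUR_COGNITIVE_ACTIONS_API_KEY")
headers = build_headers(api_key)
print(sorted(headers))
```

Keeping the key in the environment rather than in the script means the same code can run unchanged across development and production.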
Cognitive Actions Overview
Generate Image from Text Prompt
The Generate Image from Text Prompt action produces images from a text prompt. It also supports image-to-image generation and inpainting, and it exposes settings such as aspect ratio, image quality, and a choice of inference model that trades speed against detail.
Category: Image Generation
Input
The input schema for this action requires a JSON object that includes various fields. Here’s a breakdown of the required and optional parameters:
- Required:
  - prompt: Text prompt that guides the image generation.
- Optional:
  - mask: URI of an image mask for inpainting mode.
  - seed: Integer for generating reproducible outputs.
  - image: URI of an input image for image-to-image or inpainting mode.
  - width: Width of the generated image (only if aspect_ratio is custom).
  - goFast: Boolean to enable faster predictions.
  - height: Height of the generated image (only if aspect_ratio is custom).
  - extraLora: URI for additional LoRA weights.
  - loraScale: Scale for applying the main LoRA.
  - inferenceModel: Selects the model for inference.
  - externalWeights: URI for loading external LoRA weights.
  - numberOfOutputs: Number of outputs to generate.
  - approxMegapixels: Approximate number of megapixels.
  - imageAspectRatio: Aspect ratio for the generated image.
  - imageOutputFormat: Format of the output images.
  - imageOutputQuality: Quality level for output images.
  - additionalLoraScale: Scale for applying extra LoRA.
  - imagePromptStrength: Intensity of the prompt influence.
  - diffusionGuidanceScale: Guidance scale for the diffusion process.
  - numberOfInferenceSteps: Number of denoising steps.
  - disableImageSafetyChecker: Disable safety checks for generated images.
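The two rules stated in the parameter list (prompt is required; width and height apply only with a custom aspect ratio) can be checked on the client before a request is sent. The sketch below is one way to do that. The helper build_payload is hypothetical, and the literal value "custom" for imageAspectRatio is an assumption about how that mode is spelled.

```python
def build_payload(prompt, **options):
    """Assemble an input payload, applying the two rules from the parameter list."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be non-empty text")

    # Drop options left as None so only explicitly set fields are sent.
    payload = {"prompt": prompt}
    payload.update({key: value for key, value in options.items() if value is not None})

    # width/height only apply with a custom aspect ratio.
    # ASSUMPTION: the literal "custom" is a guess at how that mode is named.
    has_size = "width" in payload or "height" in payload
    if has_size and payload.get("imageAspectRatio") != "custom":
        raise ValueError("width/height require imageAspectRatio to be 'custom'")
    return payload


payload = build_payload(
    "an image of MRK as a clay human reindeer",
    imageAspectRatio="9:16",
    numberOfOutputs=2,
    seed=None,  # left unset, so it is omitted from the payload
)
print(payload)
```

Failing early on the client gives a clearer error message than waiting for the service to reject the request.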
Here’s an example input JSON payload:
{
  "image": "https://replicate.delivery/pbxt/M4ugvBRUab0QuGKEfgbh0eDUujrVOivTmDyuV7ZopQGG1Qug/Screenshot%202024-12-03%20041207.png",
  "goFast": false,
  "prompt": "an image of MRK as a clay human reindeer with a red nose fit to pull Santa's sleigh",
  "loraScale": 1,
  "inferenceModel": "dev",
  "numberOfOutputs": 2,
  "approxMegapixels": "1",
  "imageAspectRatio": "9:16",
  "imageOutputFormat": "png",
  "imageOutputQuality": 80,
  "additionalLoraScale": 1,
  "imagePromptStrength": 0.65,
  "diffusionGuidanceScale": 3,
  "numberOfInferenceSteps": 28
}
Output
Upon successful execution, the action returns an array of URLs pointing to the generated images. Here’s an example of the expected output:
[
  "https://assets.cognitiveactions.com/invocations/673173c5-c73f-4870-b407-4f4a936013bd/bc99bb44-d85f-4962-9b50-d530f23ffb70.png",
  "https://assets.cognitiveactions.com/invocations/673173c5-c73f-4870-b407-4f4a936013bd/f2dd1af3-a77a-4534-9f17-b4c486689c42.png"
]
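Because the output is a plain array of URLs, saving the images locally takes only a few lines. The sketch below derives a file name from each URL and downloads it with the standard library. The helpers filename_from_url and download_images are hypothetical, and the code assumes the URLs can be fetched without extra authentication, which the source does not confirm.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import unquote, urlparse
from urllib.request import urlopen


def filename_from_url(url: str) -> str:
    """Return the last path segment of a URL, with percent-escapes decoded."""
    return PurePosixPath(unquote(urlparse(url).path)).name


def download_images(urls, dest_dir="."):
    """Save each URL into dest_dir and return the list of written paths."""
    # ASSUMPTION: the URLs are readable without authentication headers.
    saved = []
    for url in urls:
        target = Path(dest_dir) / filename_from_url(url)
        with urlopen(url, timeout=30) as response:
            target.write_bytes(response.read())
        saved.append(target)
    return saved


urls = [
    "https://assets.cognitiveactions.com/invocations/673173c5-c73f-4870-b407-4f4a936013bd/bc99bb44-d85f-4962-9b50-d530f23ffb70.png",
    "https://assets.cognitiveactions.com/invocations/673173c5-c73f-4870-b407-4f4a936013bd/f2dd1af3-a77a-4534-9f17-b4c486689c42.png",
]
# Only the file names are computed here; call download_images(urls) to fetch.
names = [filename_from_url(u) for u in urls]
print(names)
```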
Conceptual Usage Example (Python)
Here’s how you might call the Generate Image from Text Prompt action using Python. This example illustrates how to structure your input JSON payload appropriately and send a request to a hypothetical endpoint.
import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"  # Hypothetical endpoint

action_id = "a1b3cf52-d175-4f87-8a23-fb3af470d685"  # Action ID for Generate Image from Text Prompt

# Construct the input payload based on the action's requirements
payload = {
    "image": "https://replicate.delivery/pbxt/M4ugvBRUab0QuGKEfgbh0eDUujrVOivTmDyuV7ZopQGG1Qug/Screenshot%202024-12-03%20041207.png",
    "goFast": False,
    "prompt": "an image of MRK as a clay human reindeer with a red nose fit to pull Santa's sleigh",
    "loraScale": 1,
    "inferenceModel": "dev",
    "numberOfOutputs": 2,
    "approxMegapixels": "1",
    "imageAspectRatio": "9:16",
    "imageOutputFormat": "png",
    "imageOutputQuality": 80,
    "additionalLoraScale": 1,
    "imagePromptStrength": 0.65,
    "diffusionGuidanceScale": 3,
    "numberOfInferenceSteps": 28,
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload},  # Hypothetical structure
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # Body is not valid JSON; fall back to raw text
            print(f"Response body: {e.response.text}")
In this code snippet, replace the placeholder API key and the hypothetical endpoint with your actual values. The input payload follows the action's input schema, and the JSON response, which should contain the generated image URLs, is printed to the console.
Conclusion
The blkoutuk/blkout Cognitive Actions provide a robust framework for developers looking to enhance their applications with image generation capabilities. By leveraging these actions, you can transform text prompts into beautiful images, offering endless possibilities for creativity and innovation. Explore various use cases, from game development to content creation, and start integrating these powerful tools into your projects today!