Create Stunning Images from Text with mcai/realistic-vision-v2.0 Actions

24 Apr 2025

Generating realistic images from textual descriptions is useful to developers and artists alike. The mcai/realistic-vision-v2.0 specification provides a Cognitive Action that creates images from your prompts. The action exposes parameters such as image dimensions, scheduler, guidance scale, negative prompt, and number of inference steps, so you can tune the output for your use case.

Utilizing pre-built Cognitive Actions not only accelerates development but also enhances the capabilities of your applications with advanced AI features. Let’s dive into how you can harness this technology.

Prerequisites

To get started with the Cognitive Actions, you'll need to ensure you have the following:

  • An API key for the Cognitive Actions platform to authenticate your requests.
  • Basic knowledge of making API calls and handling JSON data.

Typically, authentication is accomplished by including your API key in the request headers, allowing you to securely access the actions provided by the platform.
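
As a minimal sketch of that header-based approach, the helper below builds the request headers from an API key. The Bearer scheme mirrors the conceptual example later in this article and is an assumption about the platform; the function name build_auth_headers and the environment variable COGNITIVE_ACTIONS_API_KEY are hypothetical names chosen for illustration.

```python
import os


def build_auth_headers(api_key: str) -> dict:
    """Return request headers carrying the API key as a Bearer token (assumed scheme)."""
    if not api_key:
        raise ValueError("API key is empty; provide a valid key")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Hypothetical environment variable name; falls back to a placeholder for illustration.
api_key = os.environ.get("COGNITIVE_ACTIONS_API_KEY") or "YOUR_COGNITIVE_ACTIONS_API_KEY"
headers = build_auth_headers(api_key)
```

Reading the key from the environment keeps it out of source control; any secret store would serve the same purpose.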

Cognitive Actions Overview

Generate Image from Text with Realistic Vision

Description: This action allows you to create a new, realistic image from a textual description using the Realistic Vision V2.0 model. It offers options for image dimensions, style, and detail control for optimal results.

Category: Image Generation

Input

The input schema for this action accepts the following fields. Fields marked optional can be omitted, and most of the others have defaults:

  • seed (optional): A random seed for generating images (integer). If left blank, a random seed will be used.
  • width: Width of the output image in pixels (integer). Choose from predefined values (default: 512).
  • height: Height of the output image in pixels (integer). Choose from predefined values (default: 768).
  • prompt: Text input describing the desired image (string). For example, "A dream of a distant galaxy, by Caspar David Friedrich, matte painting trending on artstation HQ".
  • scheduler: Select the scheduling algorithm for image generation (string). Default is "EulerAncestralDiscrete".
  • guidanceScale: Scale factor for classifier-free guidance (number). Valid range is 1 to 20 (default: 7.5).
  • negativePrompt (optional): Text describing characteristics to exclude from the generated image (string), typically written as a comma-separated list.
  • numberOfOutputs: Number of image variations to generate (integer). Accepts values between 1 and 4 (default: 1).
  • numberOfInferenceSteps: Steps used in the denoising process (integer). Range is 1 to 500 (default: 30).
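
The ranges listed above can be checked on the client before a request is sent. The following is a rough sketch, not part of the platform: validate_inputs is a hypothetical helper, and it assumes prompt is required, which the schema does not state explicitly. It does not check width and height against the predefined values, since those are not listed here.

```python
def validate_inputs(inputs: dict) -> list:
    """Return a list of human-readable problems; an empty list means the inputs look valid."""
    problems = []

    # Assumption: prompt is required and must be non-empty.
    prompt = inputs.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt must be a non-empty string")

    # (field, accepted types, minimum, maximum) taken from the documented ranges
    ranged = [
        ("guidanceScale", (int, float), 1, 20),
        ("numberOfOutputs", (int,), 1, 4),
        ("numberOfInferenceSteps", (int,), 1, 500),
    ]
    for name, types, low, high in ranged:
        if name not in inputs:
            continue  # omitted fields fall back to the documented defaults
        value = inputs[name]
        if isinstance(value, bool) or not isinstance(value, types):
            problems.append(f"{name} has the wrong type")
        elif not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")

    # Integer-only fields without a documented numeric range
    for name in ("seed", "width", "height"):
        if name in inputs:
            value = inputs[name]
            if isinstance(value, bool) or not isinstance(value, int):
                problems.append(f"{name} must be an integer")

    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once.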

Example Input:

{
  "width": 512,
  "height": 768,
  "prompt": "A dream of a distant galaxy, by Caspar David Friedrich, matte painting trending on artstation HQ",
  "scheduler": "EulerAncestralDiscrete",
  "guidanceScale": 7,
  "negativePrompt": "(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime:1.4), text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck",
  "numberOfOutputs": 1,
  "numberOfInferenceSteps": 25
}

Output

The action typically returns a list of URLs pointing to the generated images.

Example Output:

[
  "https://assets.cognitiveactions.com/invocations/98eee868-acd5-4453-8d24-e6cea2a65a7a/4ac1de3c-43e1-404f-bb54-373f4a14b41c.png"
]
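
If the response is a plain list of URLs like the one above, the images can be saved locally with the standard library. This is a sketch under that assumption; filename_from_url and download_images are hypothetical helpers, and whether the URLs expire or need authentication is not covered here.

```python
import os
import posixpath
import urllib.request
from urllib.parse import urlparse


def filename_from_url(url: str) -> str:
    """Use the last path segment of the URL as the local file name."""
    name = posixpath.basename(urlparse(url).path)
    return name or "image.png"  # fallback when the URL path has no file name


def download_images(urls, target_dir="outputs"):
    """Download each URL into target_dir and return the saved paths."""
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(target_dir, filename_from_url(url))
        urllib.request.urlretrieve(url, path)  # performs a network request
        saved.append(path)
    return saved
```

Calling download_images(result) on the list returned by the action would write one file per generated image into the outputs directory.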

Conceptual Usage Example (Python)

Here’s a conceptual example of how you might invoke this action using Python:

import requests
import json

# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint

action_id = "a68265a2-2521-4d75-946e-b7c91e66341f" # Action ID for Generate Image from Text

# Construct the input payload based on the action's requirements
payload = {
    "width": 512,
    "height": 768,
    "prompt": "A dream of a distant galaxy, by Caspar David Friedrich, matte painting trending on artstation HQ",
    "scheduler": "EulerAncestralDiscrete",
    "guidanceScale": 7,
    "negativePrompt": "(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime:1.4), text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck",
    "numberOfOutputs": 1,
    "numberOfInferenceSteps": 25
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json"
}

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json={"action_id": action_id, "inputs": payload}, # Hypothetical structure
        timeout=120 # Seconds; without a timeout the call can hang indefinitely
    )
    response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)

    result = response.json()
    print("Action executed successfully:")
    print(json.dumps(result, indent=2))

except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError: # Body is not valid JSON; covers every JSON decode error requests can raise
            print(f"Response body: {e.response.text}")

In this code snippet, replace the placeholder API key with your actual key. The endpoint URL and the request body structure are hypothetical, so check them against the platform's documentation before use. The payload follows the input schema detailed earlier. The example sends the input in a single POST request, prints the JSON result on success, and prints the status code and response body on failure.

Conclusion

The mcai/realistic-vision-v2.0 Cognitive Actions provide a powerful tool for developers looking to enhance their applications with image generation capabilities. By using the provided schema and examples, you can easily integrate this functionality into your projects, allowing for exciting and creative applications. Explore the possibilities of generating stunning visuals from text and take your development to the next level!