Enhance Multilingual Applications with Text Embeddings

In today's interconnected world, multilingual applications need to process and understand text in many languages. The Multilingual E5 Small Cognitive Actions let you generate high-quality text embeddings across those languages. With these embeddings, you can extend your application's functionality, improve user experiences, and streamline natural language processing tasks.
This service simplifies the complex task of creating normalized or non-normalized embeddings in batch mode, making it easier to integrate multilingual capabilities into your projects. Whether you're developing chatbots, translation services, or content recommendation systems, these text embeddings can significantly enhance your application's performance.
Prerequisites
To get started with the Multilingual E5 Small Cognitive Actions, you'll need an API key for the Cognitive Actions service and a basic understanding of how to make API calls.
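One common way to keep the API key out of source code is to read it from an environment variable. The snippet below is a minimal sketch of that idea: the variable name COGNITIVE_ACTIONS_API_KEY and the helper names load_api_key and build_headers are assumptions made for illustration, not part of the service.

```python
import os


def load_api_key(env_var="COGNITIVE_ACTIONS_API_KEY", environ=None):
    """Return the API key stored under env_var (a hypothetical variable name).

    `environ` defaults to the process environment; passing a dict makes the
    function easy to try out without exporting a real key.
    """
    source = os.environ if environ is None else environ
    key = source.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def build_headers(api_key):
    """Build request headers using Bearer authentication and a JSON body."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Demonstration with a stand-in environment; in a real run, omit `environ`
# so the key is read from os.environ.
headers = build_headers(
    load_api_key(environ={"COGNITIVE_ACTIONS_API_KEY": "example-key"})
)
print(headers["Authorization"])
```

Failing early with a clear error when the variable is missing tends to be easier to debug than an authentication failure returned by the API.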
Generate Multilingual Text Embeddings
The Generate Multilingual Text Embeddings action produces text embeddings using the multilingual-e5-small model. This action is designed to tackle the challenges of processing text in multiple languages, providing a robust solution for applications that require language versatility.
Purpose
This action generates embeddings that capture the semantic meaning of the input text, so applications can compare phrases or sentences written in different languages. Because inputs are processed in batches, a single request can embed many texts at once.
Input Requirements
The input for this action is a JSON object that includes:
- texts: A list of strings to be embedded (e.g., ["In the water, fish are swimming.", "Fish swim in the water.", "A book lies open on the table."]).
- batchSize: An integer indicating the number of text items to process in a single batch, with a default value of 32.
- normalizeEmbeddings: A boolean value that specifies whether the embeddings should be normalized, defaulting to true.
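To make the batchSize and normalizeEmbeddings parameters more concrete, here is a rough sketch of what they plausibly control. The service's internals are not documented here, so this assumes that batching means splitting the input list into consecutive chunks and that normalization means scaling each vector to unit L2 length, which is the usual convention for embedding models. The helper names chunk and l2_normalize are illustrative only.

```python
import math


def chunk(texts, batch_size=32):
    """Split texts into consecutive batches of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (assumed meaning of
    normalizeEmbeddings); zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# 70 texts with the default batch size of 32 fall into three batches.
batches = chunk([f"text {i}" for i in range(70)], batch_size=32)
print([len(b) for b in batches])  # [32, 32, 6]

# A toy 2-dimensional vector, not real model output.
unit = l2_normalize([3.0, 4.0])
print(unit)  # [0.6, 0.8]
```

Under this assumption, normalized embeddings all have length 1, so a plain dot product between two of them equals their cosine similarity.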
Expected Output
The output will be a JSON array containing arrays of floating-point numbers, which represent the embeddings for each input text. For instance, the output might look like:
[
[0.0129, 0.0266, -0.0390, ...],
[0.0221, 0.0125, -0.0317, ...],
...
]
Each inner array is the embedding of one input text, which allows numerical comparison between different pieces of text, for example with cosine similarity.
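A typical way to compare two embeddings is cosine similarity, where values near 1 indicate similar meaning and values near 0 indicate unrelated text. The sketch below uses small made-up 3-dimensional vectors in place of real model output, and the function name cosine_similarity is our own.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # similarity is undefined for zero vectors; treat as unrelated
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings of the three example sentences.
fish_swimming = [0.6, 0.8, 0.0]
fish_swim = [0.8, 0.6, 0.0]
open_book = [0.0, 0.0, 1.0]

similar = cosine_similarity(fish_swimming, fish_swim)
unrelated = cosine_similarity(fish_swimming, open_book)
print(f"fish vs fish: {similar:.2f}")
print(f"fish vs book: {unrelated:.2f}")
```

With real embeddings, the two sentences about fish would be expected to score higher against each other than either does against the sentence about the book.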
Use Cases for this specific action
- Chatbots and Virtual Assistants: Improve the understanding of user queries in multiple languages, allowing for more accurate and context-aware responses.
- Translation Services: Check that a translation preserves the meaning and context of its source by comparing the embeddings of the two texts.
- Content Recommendation Systems: Enhance user experience by providing recommendations based on user queries in their preferred language, improving engagement and satisfaction.
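As an illustration of the recommendation use case, one possible approach is to embed catalog items ahead of time and rank them against the embedding of a user query. The sketch below assumes all vectors are already unit-length (as with normalizeEmbeddings set to true), so the dot product serves as the similarity score. The catalog contents, item identifiers, and the top_k helper are hypothetical.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit-length."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query_vec, catalog, k=2):
    """Return the k (item_id, score) pairs most similar to query_vec."""
    scored = [(dot(query_vec, vec), item_id) for item_id, vec in catalog.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(item_id, score) for score, item_id in scored[:k]]


# Toy unit-length vectors standing in for precomputed item embeddings.
catalog = {
    "article_fish": [0.8, 0.6, 0.0],
    "article_books": [0.0, 0.0, 1.0],
    "article_rivers": [0.6, 0.8, 0.0],
}
query = [1.0, 0.0, 0.0]  # stand-in for the embedding of a user query

results = top_k(query, catalog, k=2)
for item_id, score in results:
    print(f"{item_id}: {score:.2f}")
```

Because queries and items share one embedding space, the query and the catalog do not need to be in the same language for this ranking to work.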
# Example: calling the Generate Multilingual Text Embeddings action from Python
import json

import requests

# Replace with your actual Cognitive Actions API key and endpoint.
# Ensure your environment securely handles the API key.
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
# This endpoint URL is hypothetical and should be documented for users.
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute"
action_id = "6cad660b-b159-4214-9a07-95f9a5828853"  # Action ID for: Generate Multilingual Text Embeddings

# Construct the input payload based on the action's requirements.
# "texts" is sent as a list of strings, as described under Input Requirements.
payload = {
    "texts": [
        "In the water, fish are swimming.",
        "Fish swim in the water.",
        "A book lies open on the table.",
    ],
    "batchSize": 32,
    "normalizeEmbeddings": True,  # Python boolean; serialized as JSON true
}

headers = {
    "Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
    "Content-Type": "application/json",
    # Add any other required headers for the Cognitive Actions API
}

# Prepare the request body for the hypothetical execution endpoint
request_body = {
    "action_id": action_id,
    "inputs": payload,
}

print("--- Calling Cognitive Action: Generate Multilingual Text Embeddings ---")
print(f"Endpoint: {COGNITIVE_ACTIONS_EXECUTE_URL}")
print(f"Action ID: {action_id}")
print("Payload being sent:")
print(json.dumps(request_body, indent=2))
print("------------------------------------------------")

try:
    response = requests.post(
        COGNITIVE_ACTIONS_EXECUTE_URL,
        headers=headers,
        json=request_body,
        timeout=30,  # avoid waiting indefinitely on a stalled connection
    )
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    result = response.json()
    print("Action executed successfully. Result:")
    print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
    print(f"Error executing action {action_id}: {e}")
    if e.response is not None:
        print(f"Response status: {e.response.status_code}")
        try:
            print(f"Response body: {e.response.json()}")
        except ValueError:  # the body was not valid JSON
            print(f"Response body (non-JSON): {e.response.text}")
print("------------------------------------------------")
Conclusion
The Multilingual E5 Small's ability to generate text embeddings is a game-changer for developers looking to enhance their multilingual applications. By providing a robust solution for generating high-quality embeddings in various languages, this action allows you to improve user experience, streamline natural language processing tasks, and expand the functionality of your applications.
As you integrate these Cognitive Actions into your projects, consider exploring additional use cases and applications that could benefit from enhanced multilingual capabilities. The possibilities are vast, and the potential for innovation is immense.