Transform Natural Language into SQL Queries with nateraw/defog-sqlcoder-7b-2 Actions

In the realm of data analysis and manipulation, the ability to translate natural language queries into SQL code can significantly enhance productivity and accessibility. The nateraw/defog-sqlcoder-7b-2 API offers a powerful Cognitive Action designed to convert your natural language questions into SQL queries. This functionality not only speeds up the query generation process but also ensures high-quality, accurate SQL outputs.
Prerequisites
Before diving into the integration of the Cognitive Actions, ensure you have the following:
- An API key for accessing the Cognitive Actions platform.
- Familiarity with making HTTP requests and handling JSON payloads.
Authentication typically involves passing your API key in the request headers. This will allow you to securely access the API and utilize its powerful capabilities.
Cognitive Actions Overview
Generate SQL from Natural Language
The Generate SQL from Natural Language action takes natural language questions and converts them into SQL queries using a specialized language model. This action falls under the text-generation category, making it particularly useful for developers who want to streamline data retrieval processes.
Input
The action requires the following fields within its input schema:
- question (required): A string representing the natural language question to be converted into SQL.
Example:"Do we get more sales from customers in New York compared to customers in San Francisco? Give me the total sales for each city, and the difference between the two." - tableMetadata (required): A string providing the SQL schema that includes the tables and their columns relevant to the question.
Example:
CREATE TABLE products ( product_id INTEGER PRIMARY KEY, name VARCHAR(50), price DECIMAL(10,2), quantity INTEGER ); CREATE TABLE customers ( customer_id INTEGER PRIMARY KEY, name VARCHAR(50), address VARCHAR(100) ); CREATE TABLE sales ( sale_id INTEGER PRIMARY KEY, product_id INTEGER, customer_id INTEGER, sale_date DATE, quantity INTEGER ); - topK (optional): An integer to specify the number of top tokens to consider during output generation (default is 50).
- topP (optional): A number that sets the cumulative probability threshold for output generation (default is 0.9).
- temperature (optional): A number controlling the randomness of the output (default is 0.6).
- maxNewTokens (optional): The maximum number of tokens to generate in the output (default is 512).
- promptTemplate (optional): A template for structuring the input prompt.
- presencePenalty (optional): A penalty for tokens already present in the sequence (default is 0).
- frequencyPenalty (optional): A penalty to reduce the likelihood of generating tokens based on their frequency (default is 0).
Here’s an example input payload structured according to the schema:
{
"topK": 50,
"topP": 0.9,
"question": "Do we get more sales from customers in New York compared to customers in San Francisco? Give me the total sales for each city, and the difference between the two.",
"temperature": 0,
"maxNewTokens": 512,
"tableMetadata": "CREATE TABLE products (...); CREATE TABLE customers (...); CREATE TABLE sales (...);",
"promptTemplate": "### Task\nGenerate a SQL query to answer [QUESTION]{question}[/QUESTION]\n...",
"presencePenalty": 0,
"frequencyPenalty": 0
}
Output
The action typically returns a SQL query as a structured output. For the example question, the output could look like this:
SELECT c.city,
SUM(s.quantity) AS total_sales,
SUM(CASE WHEN c.city = 'New York' THEN s.quantity ELSE 0 END) -
SUM(CASE WHEN c.city = 'San Francisco' THEN s.quantity ELSE 0 END) AS difference_in_sales
FROM sales s
JOIN customers c ON s.customer_id = c.customer_id
GROUP BY c.city
ORDER BY total_sales DESC;
The output may vary based on the complexity of the question and the provided database schema.
Conceptual Usage Example (Python)
Below is a conceptual Python code snippet that demonstrates how to invoke the Generate SQL from Natural Language action using a hypothetical Cognitive Actions endpoint.
import requests
import json
# Replace with your Cognitive Actions API key and endpoint
COGNITIVE_ACTIONS_API_KEY = "YOUR_COGNITIVE_ACTIONS_API_KEY"
COGNITIVE_ACTIONS_EXECUTE_URL = "https://api.cognitiveactions.com/actions/execute" # Hypothetical endpoint
action_id = "c4d40dad-2701-4060-b293-c766292eabea" # Action ID for Generate SQL from Natural Language
# Construct the input payload based on the action's requirements
payload = {
"topK": 50,
"topP": 0.9,
"question": "Do we get more sales from customers in New York compared to customers in San Francisco? Give me the total sales for each city, and the difference between the two.",
"temperature": 0,
"maxNewTokens": 512,
"tableMetadata": "CREATE TABLE products (...); CREATE TABLE customers (...); CREATE TABLE sales (...);",
"promptTemplate": "### Task\nGenerate a SQL query to answer [QUESTION]{question}[/QUESTION]\n...",
"presencePenalty": 0,
"frequencyPenalty": 0
}
headers = {
"Authorization": f"Bearer {COGNITIVE_ACTIONS_API_KEY}",
"Content-Type": "application/json"
}
try:
response = requests.post(
COGNITIVE_ACTIONS_EXECUTE_URL,
headers=headers,
json={"action_id": action_id, "inputs": payload} # Hypothetical structure
)
response.raise_for_status() # Raise an exception for bad status codes (4xx or 5xx)
result = response.json()
print("Action executed successfully:")
print(json.dumps(result, indent=2))
except requests.exceptions.RequestException as e:
print(f"Error executing action {action_id}: {e}")
if e.response is not None:
print(f"Response status: {e.response.status_code}")
try:
print(f"Response body: {e.response.json()}")
except json.JSONDecodeError:
print(f"Response body: {e.response.text}")
In this snippet, replace the placeholders with your actual API key and endpoint. The input payload is created based on the requirements of the action, and the request is sent to execute the action.
Conclusion
The nateraw/defog-sqlcoder-7b-2 Cognitive Action for generating SQL from natural language questions is a game-changer for developers looking to simplify their data interactions. By leveraging this action, you can automate SQL query generation, reduce manual errors, and enhance the overall efficiency of your database operations. Explore various use cases, integrate this action into your applications, and unlock the full potential of your data!