Chat With Your SQL Database Using LLM

Large Language Models (LLMs) are revolutionizing data interaction. With tools like ChatGPT, you can ask complex questions and receive insightful answers in seconds, thanks to their ability to process vast amounts of information. Now, imagine applying this capability to your own database: you could effortlessly query and analyze your data, gaining valuable insights and making data-driven decisions faster than ever before.

LangChain takes this a step further with tooling that crafts prompts to interact with databases in natural language. It can generate complex queries, including join operations, without requiring SQL expertise, which makes it ideal for less tech-savvy users who want to explore and analyze a database on their own. With LangChain, anyone can unlock the full potential of their data through simple, conversational interactions.

In this blog, we will guide you through setting up LangChain and using Ollama and ChatGPT to interact with your SQL database. You'll learn how to harness the power of natural language processing to make complex queries and gain insights from your data effortlessly.

Installing Ollama

First, we need a local LLM runtime. We will assume that you already have Docker installed. If you prefer to install Ollama directly on your machine, follow the instructions at Ollama Download.

💡
If you want to use ChatGPT, you can skip this part.

For CPU-Only Setup

Pull the Ollama Docker image and run it:

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

Now we will exec into the Ollama container to pull and run the llama3 model:

docker exec -it ollama ollama run llama3
💡
For different models like Gemma or any other LLM from Ollama, refer to their library and swap in the model name. For example:

docker exec -it ollama ollama run gemma

For GPU Setup

If you prefer using a GPU, follow the instructions on Docker Hub.

Setting Up the Environment

Install the necessary Python packages:

pip install langchain langchain_community langchain_core mysql-connector-python
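
If you want to confirm the packages are importable before going further, a quick check like the following can help (a small helper of our own, not part of LangChain):

```python
import importlib.util

def is_installed(module_name: str) -> bool:
    """Return True if module_name can be found on the current Python path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "mysql") is missing entirely
        return False

# Modules used later in this post
for mod in ("langchain", "langchain_community", "langchain_core", "mysql.connector"):
    print(f"{mod}: {'ok' if is_installed(mod) else 'MISSING'}")
```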

Let's Initialize Our LLM Variable

For Ollama models you can use the following code:

from langchain_community.llms import Ollama

# Initialize the LLM with the model you choose
llm = Ollama(model="llama3", base_url="http://localhost:11434")

If you prefer to use ChatGPT, you can initialize your LLM like this:

from langchain_community.chat_models import ChatOpenAI

# Initialize the LLM with a ChatGPT chat model
llm = ChatOpenAI(model="gpt-3.5-turbo", openai_api_key="your_openai_api_key")

Interacting with the LLM

You can use the LLM directly for natural language queries. Here’s how:


# Define your question string
question = "Hi, how are you?"

# Initialize an empty string to store the result
result = ""

# Iterate over chunks of the question string generated by the llm.stream method
for chunk in llm.stream(question):
    # Append each chunk to the result string
    result += chunk

# Print the final result string
print(result)
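
The accumulation pattern above does not depend on the model: any iterable of text chunks can be joined the same way. A stand-in generator (replacing llm.stream, which needs a running Ollama server) makes the pattern easy to see:

```python
def fake_stream(text: str, size: int = 4):
    """Stand-in for llm.stream: yields the response in small chunks."""
    for i in range(0, len(text), size):
        yield text[i:i + size]

# Accumulate the chunks exactly as in the real streaming loop
result = ""
for chunk in fake_stream("Hello! I'm doing well, thanks for asking."):
    result += chunk

print(result)
```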

Now Let's Try Connecting It with MySQL

Here's a sample code to interact with your SQL database using Ollama:

Database Connection

This line connects to a MySQL database using a connection string that includes the username, password, host, and database name. Replace the placeholders with your actual values.


from langchain_community.utilities import SQLDatabase

db = SQLDatabase.from_uri("mysql+mysqlconnector://<MYSQL_USERNAME>:<MYSQL_PASSWORD>@<MYSQL_URL>/<MYSQL_DATABASE>")

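If your password contains URI-reserved characters such as @ or /, the connection string above will parse incorrectly; percent-encoding the credentials first avoids that. A small sketch (the credentials, host, and database names here are placeholders):

```python
from urllib.parse import quote_plus

user = "report_user"      # placeholder credentials
password = "p@ss/word!"   # contains URI-reserved characters
host = "localhost"
database = "sales"

# Percent-encode the username and password before building the URI
uri = f"mysql+mysqlconnector://{quote_plus(user)}:{quote_plus(password)}@{host}/{database}"
print(uri)
```

The resulting string can be passed straight to SQLDatabase.from_uri.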

Creating the SQL Agent

This creates an SQL agent using a language model (llm) and a toolkit that bundles the database connection with the language model. The agent is configured with a specific agent type (ZERO_SHOT_REACT_DESCRIPTION) and settings such as verbosity and parsing-error handling.


from langchain_community.agent_toolkits import create_sql_agent, SQLDatabaseToolkit
from langchain.agents import AgentType

agent_exec = create_sql_agent(
    llm=llm,
    toolkit=SQLDatabaseToolkit(db=db, llm=llm),
    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,
)
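
ZERO_SHOT_REACT_DESCRIPTION means the agent runs a ReAct loop: it repeatedly picks a tool based only on the tools' descriptions, observes the result, and continues until it can answer. A toy version of that loop, with a scripted "model" and fake SQL tools instead of a real LLM and database, looks roughly like this (a simplification for intuition, not LangChain's actual implementation):

```python
def run_react(question, tools, scripted_steps):
    """Toy ReAct loop. scripted_steps stands in for LLM decisions:
    (tool_name, tool_input) pairs, ending with ("final_answer", answer)."""
    observations = []
    for tool_name, tool_input in scripted_steps:
        if tool_name == "final_answer":
            return tool_input
        observation = tools[tool_name](tool_input)
        observations.append(observation)  # fed back to the model in the real loop
    return None

# Fake tools mimicking what the SQL toolkit exposes
tools = {
    "list_tables": lambda _: "products, orders",
    "run_query": lambda q: "[('Widget', 120)]" if "products" in q else "",
}

answer = run_react(
    "What is the best-selling product?",
    tools,
    [
        ("list_tables", ""),
        ("run_query", "SELECT name, sales FROM products ORDER BY sales DESC LIMIT 1"),
        ("final_answer", "The best-selling product is Widget with 120 sales."),
    ],
)
print(answer)
```

In the real agent, the LLM produces the thought/action steps itself; verbose=True lets you watch that loop in your console.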


Executing a Query

This sets an example question asking for the top 10 products by sales and invokes the agent to answer it. The ConsoleCallbackHandler logs each intermediate step to the console, which is useful for debugging.


from langchain.callbacks.tracers import ConsoleCallbackHandler

question = "What are the top 10 products by sales?"
result = agent_exec.invoke(question, config={"callbacks": [ConsoleCallbackHandler()]})


Printing the Result

Finally, we print the result to see what the LLM answered.


print(result)
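
Note that invoke returns a dictionary rather than a bare string; in recent LangChain versions it carries the original question and the final answer under an "output" key (the exact shape can vary by version). If you only want the answer, index into it, shown here with a mock result since the real one depends on your data:

```python
# Mock of the dict shape returned by agent_exec.invoke (assumed shape)
result = {
    "input": "What are the top 10 products by sales?",
    "output": "The top 10 products by sales are ...",
}

# Print only the model's final answer
print(result["output"])
```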

Using SQLite

SQLite is a popular choice for local databases. You can use the Chinook database, a sample database available for SQLite, for practice.

Setting Up Chinook Database

Download the Chinook Database: Download the SQL script for the Chinook database from here.

Windows Setup: Download and install SQLite from SQLite Download Page.

Linux Setup: Just run the following command:

sudo apt install sqlite3

Open a terminal, navigate to the directory containing Chinook_Sqlite.sql, and run:

sqlite3 Chinook.db < Chinook_Sqlite.sql

After running this command, a new file named Chinook.db will be created in your current directory.
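
If you'd rather load the script from Python than from the shell, the standard-library sqlite3 module can do the same job with executescript. A sketch with a tiny inline script standing in for Chinook_Sqlite.sql:

```python
import sqlite3

# A miniature script standing in for Chinook_Sqlite.sql
script = """
CREATE TABLE Album (AlbumId INTEGER PRIMARY KEY, Title TEXT);
INSERT INTO Album (AlbumId, Title) VALUES (1, 'For Those About To Rock');
"""

conn = sqlite3.connect(":memory:")  # use "Chinook.db" instead to write a file on disk
conn.executescript(script)

# Verify the data landed
row = conn.execute("SELECT Title FROM Album WHERE AlbumId = 1").fetchone()
print(row[0])
conn.close()
```

For the real file, replace the inline string with the script's contents, e.g. open("Chinook_Sqlite.sql").read().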

Database Connection

This line connects to the SQLite database by pointing at the location of the .db file.


from langchain_community.utilities import SQLDatabase
# Connect to the SQLite database
db = SQLDatabase.from_uri("sqlite:///path/to/Chinook.db")


Creating the SQL Agent

As before, this creates an SQL agent using a language model (llm) and a toolkit that bundles the database connection with the language model, configured with the ZERO_SHOT_REACT_DESCRIPTION agent type and settings such as verbosity and parsing-error handling.


from langchain_community.agent_toolkits import create_sql_agent, SQLDatabaseToolkit
from langchain.agents import AgentType

agent_exec = create_sql_agent(
    llm=llm,
    toolkit=SQLDatabaseToolkit(db=db, llm=llm),
    agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,
)


Executing a Query

This sets an example question asking for all tracks with their album names and invokes the agent to answer it. The ConsoleCallbackHandler logs each intermediate step to the console for debugging.


from langchain.callbacks.tracers import ConsoleCallbackHandler

question = "List all the tracks with their album names."
result = agent_exec.invoke(question, config={"callbacks": [ConsoleCallbackHandler()]})


Printing the Result

Finally, we print the result to see what the LLM answered.


print(result)

Additional Examples

Summarizing Sales Data:


question = "Give me a summary of sales data for the past year."
result = agent_exec.invoke(question, config={"callbacks": [ConsoleCallbackHandler()]})


Fetching Customer Details:


question = "Show me the details of customers who made purchases in the last month."
result = agent_exec.invoke(question, config={"callbacks": [ConsoleCallbackHandler()]})


Conclusion

Using LLMs to interact with SQL databases can simplify data querying and analysis significantly. In this blog, we've demonstrated how to set up a model served through Ollama to interact with your SQL database, and we also showed how to switch to ChatGPT by simply changing the LLM variable. Whether you're using MySQL, SQLite, or any other SQL database, integrating an LLM into your workflow can enhance your ability to gain insights and make data-driven decisions.

Feel free to experiment with different models and databases to find the best fit for your needs. If you have any questions or need further assistance, don't hesitate to reach out!