LangChain and OpenAI integration diagram showing agent workflow for SQL database interaction

Empowering LangChain with Agents and Executors

Introduction to LangChain and OpenAI for Advanced AI Applications

Using APIs from OpenAI is a no-brainer for quick AI integration into the existing project. Also, using OpenAI is very straightforward: make the API call and get a response.

Consider a scenario: We have a large SQL table with tons of data. Now, the manager is not a technical person, and he needs to query the database.

One easy way is to go to ChatGPT, provide the tables and columns information, then ask for the SQL query.

Programmatically, we can implement it by utilizing both system messages and user messages. The system message will contain the table information, and with user message will ask for the query.

With the query we got from OpenAI, the manager will execute it in the database.

Here’s the sample code for this,

import openai
from dotenv import load_dotenv

load_dotenv()

def get_sql_query(tables_columns, question):
    prompt = f"Database schema:\n{tables_columns}\n\nQuestion: {question}\n\nWrite a SQL query to answer this question:"
    
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a SQL expert. Output only the SQL query without any explanation."},
            {"role": "user", "content": prompt}
        ],
        temperature=0
    )
    
    return response.choices[0].message['content'].strip()

tables_info = """
orders(id, customer_id, order_date, status)
customers(id, name, email)
"""

query = get_sql_query(tables_info, "How many open orders are there?")
print(query)Code language: PHP (php)

Limitations

Looking at the code, the execution of the process is not very scalable. Let’s find these limitations,

  1. We are passing the tables and columns information of the database to OpenAI. What if there are a lot of tables, and it may exceed the OpenAI message limit? The more information we provide, it will increase the pricing. Additionally, the conversation-type AI model works best with a short message. It may occur, the OpenAI model will only require one or two tables; the rest of the tables are irrelevant.
  2. As output, instead of getting a direct SQL query, we may get a human-like conversation, which is not exactly a SQL query or even it is a query; it may have a prefix and a postfix. The manager has to detect the query.
  3. Considering the first and second steps go well, the manager also has to execute the query in the database.

Solution with Agents

With all these limitations, what if,

  1. We send only the table list and query to the OpenAI
  2. OpenAI will detect the relevant tables and ask for column descriptions of these tables
  3. As a follow-up response, we provide these table details
  4. Now the OpenAI will provide us with exactly the SQL query
  5. We will execute the query in our DB, and show the output as a report to the manager

From a non-technical manager’s perspective, the manager will ask, How many open orders are available?. As a response, our system will provide a table-formatted report to the manager.

We can automate this entire process with the LangChain framework.

Helper Methods From Our End

In terms of implementation, this will be the flow,

User Message: How many open orders are there?
OpenAI: Give me the user and order table information.
User Message: TABLE_INFORMATION
OpenAI: Here’s the SQL_QUERY
User Message: FETCH_DATA_USING_QUERY and ask for HTML format from OpenAI

We need three helper methods for OpenAI,

  1. We have to fetch the user and orders table details and provide them to OpenAI.
  2. After we get the Query, we have to retrieve data from the database.
  3. With data, ask OpenAI will provide a filename and HTML table data. We have to save it.

For these two tasks, we should have a method in our system.
First task, to get the table details, we will use,

def describe_tables(table_names):
  c = conn.cursor()
  tables = ', '.join("'" + table + "'" for table in table_names)
  rows = c.execute(f"SELECT sql FROM sqlite_master WHERE type='table' AND name IN ({tables});")
  return '\n'.join(row[0] for row in rows if row[0] is not None)Code language: JavaScript (javascript)

For the second task, we need a method where a SQL query will be executed and the result will be returned.

def run_sqlite_query(query):
  try:
    c = conn.cursor()
    c.execute(query)
    return c.fetchall()
  except sqlite3.OperationalError as err:
    return f"The following error occurred: {str(err)}" 
Code language: JavaScript (javascript)

The final task is when OpenAI will provide the HTML content with a file name, persist it in our system,

def write_report(filename, html):
  with open(filename, 'w') as f:
    f.write(html)Code language: JavaScript (javascript)

For system messages, we must let OpenAI know the available tables in our database, so it can detect which tables must be used for the query. After detecting the tables, it can later ask us for the table information.

def list_tables():
  c = conn.cursor()
  c.execute("SELECT name FROM sqlite_master WHERE type='table';")
  rows = c.fetchall()
  return "\n".join(row[0] for row in rows if row[0] is not None)Code language: JavaScript (javascript)

Tools

To let OpenAI use our method, we have to provide it as a tool. For instance, below is the tool for run_sql_query,

class RunQueryArgsSchema(BaseModel):
  query: str

run_query_tool = Tool.from_function(
  func=run_sqlite_query,
  name="run_sqlite_query",
  description="Run a SQLite query",
  args_schema=RunQueryArgsSchema,
)

The RunQueryArgsSchema Tell OpenAI to run the tool, we need a string-type query.

Prompt, Agent, Agent Executor

An infographic on prompt system design in openAi's langchain

Prompt is taking the system message, what this conversation is about. Also, take the question in the template.

Agent takes all the tools, prompts, and uses them Agent Executor to execute it in a sequential manner. It ensures, the history is preserved.

Our final code will be very much like the following,

chat = ChatOpenAI(
  callbacks=[ChatModelStartHandler()],
)

tables = list_tables()

prompt = ChatPromptTemplate(
  messages=[
    SystemMessage(content=(
        "You are an AI that has access to a SQLite database.\n"
        f"The database has tables of: {tables}\n"
        "Do not make assumptions about what tables exist "
        "or what columns they have.\n"
        "If you need to know the schema of a table, use 'describe_tables' function"
    )),
    MessagesPlaceholder(variable_name="chat_history"),
    HumanMessagePromptTemplate.from_template("{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
  ]
)

memory = ConversationBufferMemory(
  memory_key="chat_history",
  return_messages=True,
)
 
tools = [
  run_query_tool,
  describe_table_tool,
  write_report_tool
]

agent = OpenAIFunctionsAgent(
  llm=chat,
  prompt=prompt,
  tools=tools,
  verbose=True,
)

agent_executor = AgentExecutor(
  agent=agent,
  tools=tools,
  verbose=True,
  memory=memory,
)

agent_executor(
  "Summarize the top 5 most popular products. Write the results to a report file."
)Code language: PHP (php)

So, from the manager, after the question, summarize the top 5 most popular products. Write the results to a report file., An HTML report file will be provided to the manager.

Conclusion

A complete code is available in the github for detailed references.

Experience the iXora Solution difference as your trusted offshore software development partner. We’re here to empower your vision with dedicated, extended, and remote software development teams. Our agile processes and tailored software development services optimize your projects, ensuring efficiency and success. At iXora Solution, we thrive in a dynamic team culture and experience innovation in the field of custom-made software development.

Have specific project requirements? Personalized or customized software solutions! You can contact iXora Solution expert teams for any consultation or coordination from here. We are committed to maximizing your business growth with our expertise as a custom software development and offshore solutions provider. Let’s make your goals a reality.
Thanks for your patience!

Add a Comment

Your email address will not be published. Required fields are marked *