What is RetrievalQA?

RetrievalQA is a question-answering framework that pairs a retriever (such as a FAISS index or another vector database) with a language model (LLM). The retriever finds the documents or text snippets most relevant to a query, and the LLM then generates an answer grounded in that retrieved content. Restricting the model to relevant data, rather than relying on its general knowledge alone, improves the accuracy of its answers.

In this framework:

- A retriever searches the indexed content and returns the passages most relevant to the user's question.
- A language model receives those passages as context and generates the final answer.
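
To make the pattern concrete, here is a minimal, library-free sketch of retrieve-then-generate. The keyword-overlap retriever and the answer_with_context stub are toy placeholders for illustration only; the actual script later in this document uses vector embeddings and a real LLM call.

# Minimal sketch of the retrieve-then-generate pattern.
# The scoring function and the "LLM" below are toy placeholders.

documents = [
    "FAISS is a library for efficient similarity search over dense vectors.",
    "RetrievalQA combines a retriever with a language model to answer questions.",
    "GPT-4 is a large language model from OpenAI.",
]

def retrieve(query, docs, k=2):
    """Rank documents by naive keyword overlap and return the top k."""
    query_terms = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(query_terms & set(d.lower().split())), reverse=True)
    return ranked[:k]

def answer_with_context(query, context_docs):
    """Stand-in for an LLM call: a real system would prompt the model with this context."""
    context = "\n".join(context_docs)
    return f"Based on the retrieved context:\n{context}\n(An LLM would generate the answer here.)"

question = "What does RetrievalQA do?"
top_docs = retrieve(question, documents)
print(answer_with_context(question, top_docs))
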
FAISS Retrieval QA with OpenAI GPT-4

Description

This Python script builds an interactive question-answering system by combining a FAISS vector store with OpenAI's GPT-4 through LangChain. For each question, it retrieves the most relevant documents from the FAISS index and passes them to the model to generate an answer. Below is a step-by-step breakdown of the script's functionality:

Script Breakdown

1. Set up the API key: read OPENAI_API_KEY from the environment and fail fast with a clear error if it is missing.
2. Load the FAISS index: load a previously saved index from the faiss_index directory using OpenAI embeddings, raising an error if the index does not exist.
3. Initialize the model: create a ChatOpenAI instance configured to use GPT-4.
4. Build the RetrievalQA chain: wrap the vector store in a retriever that returns the top 3 matches and connect it to the model.
5. Run the query loop: repeatedly prompt the user, check that relevant documents exist, run the chain, and print the answer until the user types 'exit'.
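
Note that the script assumes a FAISS index has already been saved to the faiss_index directory. If you do not have one yet, the following sketch shows one way it could be created; the sample texts are placeholders to be replaced with your own documents.

#!/usr/bin/env python
# Sketch: build and save a FAISS index for the QA script to load later.
# The texts below are placeholders; replace them with your own content.

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

texts = [
    "Our product ships with a REST API and a Python SDK.",
    "Support is available Monday through Friday, 9am to 5pm.",
]

embeddings = OpenAIEmbeddings()  # reads OPENAI_API_KEY from the environment
vector_store = FAISS.from_texts(texts, embeddings)
vector_store.save_local("faiss_index")  # the same path the QA script loads from
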
Python Code


#!/usr/bin/env python

import os
import openai
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA

# Step 1: Set up API Key
openai.api_key = os.getenv("OPENAI_API_KEY")
if not openai.api_key:
    raise ValueError("OpenAI API key not found. Set OPENAI_API_KEY as an environment variable.")

# Step 2: Load the FAISS Index from Disk
faiss_index_path = "faiss_index"
embeddings = OpenAIEmbeddings()

if os.path.exists(faiss_index_path):
    print(f"Loading FAISS index from '{faiss_index_path}'...")
    vector_store = FAISS.load_local(
        faiss_index_path,
        embeddings,
        allow_dangerous_deserialization=True  # required because the index metadata is pickled; only load files you trust
    )
else:
    raise FileNotFoundError(f"No FAISS index found at '{faiss_index_path}'. Please generate the index first.")

# Step 3: Initialize ChatOpenAI Model
llm = ChatOpenAI(model="gpt-4")

# Step 4: Build the RetrievalQA Chain
retriever = vector_store.as_retriever(search_kwargs={"k": 3})  # return the 3 most similar chunks per query

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
)

# Step 5: Query Loop with Manual Handling for Empty Results
print("Ask me anything about the indexed content. Type 'exit' to quit.")
while True:
    question = input("\nYour Question: ")
    if question.lower() == 'exit':
        print("Goodbye!")
        break

    try:
        # Retrieve relevant documents manually
        docs = retriever.invoke(question)  # returns a list of Documents (replaces the deprecated get_relevant_documents)
        if not docs:
            print("No relevant information found in the index.")
            continue

        # If documents were found, run the QA chain
        response = qa_chain.invoke({"query": question})
        print(f"Answer: {response['result']}")

    except Exception as e:
        print(f"Error: {str(e)}")

Explanation of the Code

- API key check: the key is read from the environment up front, so the script fails with a clear message before making any network calls.
- Index loading: FAISS.load_local requires allow_dangerous_deserialization=True because parts of the index are stored with pickle; this flag should only be enabled for index files you created or trust.
- Retriever configuration: as_retriever(search_kwargs={"k": 3}) limits each query to the three most similar chunks, keeping the prompt small while still providing useful context.
- Empty-result handling: RetrievalQA does not signal when nothing relevant is found, so the script queries the retriever directly first and skips the LLM call if no documents come back.
- Error handling: the try/except block keeps the interactive loop running even if an individual query fails, for example due to a transient API error.
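
Both the retriever and the chain accept optional parameters that the script above leaves at their defaults. The sketch below, which continues from the objects defined in the script (llm, vector_store, and the RetrievalQA import), shows two common variations: maximal marginal relevance (MMR) retrieval and returning source documents alongside each answer.

# Sketch: optional variations, continuing from the script above.
# Assumes llm, vector_store, and RetrievalQA are already defined/imported.

# 1) MMR search trades pure similarity for diversity among the retrieved chunks.
mmr_retriever = vector_store.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 3},
)

# 2) return_source_documents=True exposes which chunks the answer was based on.
qa_with_sources = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=mmr_retriever,
    return_source_documents=True,
)

response = qa_with_sources.invoke({"query": "What is the purpose of this document?"})
print(response["result"])
for doc in response["source_documents"]:
    print("Source:", doc.metadata)
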
Output

When the script runs, a typical session looks like this:


Loading FAISS index from 'faiss_index'...
Ask me anything about the indexed content. Type 'exit' to quit.

Your Question: What is the purpose of this document?
Answer: [Answer provided by GPT-4 based on indexed content]

Your Question: exit
Goodbye!

Conclusion

This script combines FAISS retrieval with OpenAI GPT-4 into a simple but capable conversational QA system. Fast, relevant document retrieval followed by natural-language answers makes it a good fit for chatbots, knowledge bases, and other question-answering applications.