How to Build a Powerful and Intelligent Question-Answering System by Using the Tavily Search API, Chroma, Google Gemini LLMs, and the LangChain Framework


In this tutorial, we show how to build a powerful and intelligent question-answering system by combining the strengths of the Tavily Search API, Chroma, Google Gemini LLMs, and the LangChain framework. The pipeline leverages real-time web search using Tavily, semantic document caching with the Chroma vector store, and contextual response generation through the Gemini model. These tools are integrated through LangChain's modular components, such as RunnableLambda, ChatPromptTemplate, ConversationBufferMemory, and GoogleGenerativeAIEmbeddings. It goes beyond simple Q&A by introducing a hybrid retrieval system that checks for cached embeddings before invoking fresh web searches. The retrieved documents are intelligently formatted, summarized, and passed through a structured LLM prompt, with attention to source attribution, user history, and confidence scoring. Key functions such as advanced prompt engineering, sentiment and entity analysis, and dynamic vector store updates make this pipeline suitable for advanced use cases like research assistance, domain-specific summarization, and intelligent agents.

!pip install -qU langchain-community tavily-python langchain-google-genai streamlit matplotlib pandas tiktoken chromadb langchain_core pydantic langchain

We install and upgrade a comprehensive set of libraries required to build an advanced AI search assistant. It includes tools for retrieval (tavily-python, chromadb), LLM integration (langchain-google-genai, langchain), data handling (pandas, pydantic), visualization (matplotlib, streamlit), and tokenization (tiktoken). These components form the core foundation for constructing a real-time, context-aware QA system.

import os
import getpass
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import json
import time
from typing import List, Dict, Any, Optional
from datetime import datetime

We import essential Python libraries used throughout the notebook. It includes standard libraries for environment variables, secure input, time tracking, and data types (os, getpass, time, typing, datetime). Additionally, it brings in core data science tools like pandas, matplotlib, and numpy for data handling, visualization, and numerical computations, as well as json for parsing structured data.

if "TAVILY_API_KEY" not successful os.environ: os.environ["TAVILY_API_KEY"] = getpass.getpass("Enter Tavily API key: ") if "GOOGLE_API_KEY" not successful os.environ: os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter Google API key: ") import logging logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s') logger = logging.getLogger(__name__)

We securely initialize API keys for Tavily and Google Gemini by prompting users only if they're not already set in the environment, ensuring safe and repeatable access to external services. It also configures a standardized logging setup using Python's logging module, which helps monitor execution flow and capture debug or error messages throughout the notebook.

from langchain_community.retrievers import TavilySearchAPIRetriever
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains.summarize import load_summarize_chain
from langchain.memory import ConversationBufferMemory

We import key components from the LangChain ecosystem and its integrations. It brings in the TavilySearchAPIRetriever for real-time web search, Chroma for vector storage, and GoogleGenerativeAI modules for chat and embedding models. Core LangChain modules like ChatPromptTemplate, RunnableLambda, ConversationBufferMemory, and output parsers enable flexible prompt construction, memory handling, and pipeline execution.

class SearchQueryError(Exception):
    """Exception raised for errors in the search query."""
    pass


def format_docs(docs):
    formatted_content = []
    for i, doc in enumerate(docs):
        metadata = doc.metadata
        source = metadata.get('source', 'Unknown source')
        title = metadata.get('title', 'Untitled')
        score = metadata.get('score', 0)
        formatted_content.append(
            f"Document {i+1} [Score: {score:.2f}]:\n"
            f"Title: {title}\n"
            f"Source: {source}\n"
            f"Content: {doc.page_content}\n"
        )
    return "\n\n".join(formatted_content)

We define two essential components for search and document handling. The SearchQueryError class creates a custom exception to manage invalid or failed search queries gracefully. The format_docs function processes a list of retrieved documents by extracting metadata such as title, source, and relevance score and formatting them into a clean, readable string.
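To see what the retriever context looks like before it reaches the prompt, a quick sanity check like the following can help; the documents, URL, and scores here are made up for illustration and assume the Document import and format_docs definition above.

# Minimal sketch: exercise format_docs with hand-built documents (hypothetical values)
sample_docs = [
    Document(
        page_content="Breath of the Wild launched alongside the Switch.",
        metadata={"source": "https://example.com/botw", "title": "BotW Overview", "score": 0.92},
    ),
    Document(page_content="No metadata on this one."),  # falls back to the defaults
]
print(format_docs(sample_docs))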

class SearchResultsParser:
    def parse(self, text):
        try:
            if isinstance(text, str):
                import re
                import json
                json_match = re.search(r'{.*}', text, re.DOTALL)
                if json_match:
                    json_str = json_match.group(0)
                    return json.loads(json_str)
                return {"answer": text, "sources": [], "confidence": 0.5}
            elif hasattr(text, 'content'):
                return {"answer": text.content, "sources": [], "confidence": 0.5}
            else:
                return {"answer": str(text), "sources": [], "confidence": 0.5}
        except Exception as e:
            logger.warning(f"Failed to parse JSON: {e}")
            return {"answer": str(text), "sources": [], "confidence": 0.5}

The SearchResultsParser class provides a robust method for extracting structured information from LLM responses. It attempts to parse a JSON-like string from the model output, falling back to a plain-text response format if parsing fails. It gracefully handles string outputs and message objects, ensuring consistent downstream processing. In case of errors, it logs a warning and returns a fallback response containing the raw answer, empty sources, and a default confidence score, enhancing the system's fault tolerance.
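A short, self-contained check of the fallback behavior (the inputs below are invented for illustration):

parser = SearchResultsParser()

# A response with embedded JSON parses into a dict...
print(parser.parse('Here you go: {"answer": "2017", "sources": ["doc1"], "confidence": 0.9}'))

# ...while plain text falls back to the default structure.
print(parser.parse("The game was released in 2017."))
# -> {'answer': 'The game was released in 2017.', 'sources': [], 'confidence': 0.5}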

class EnhancedTavilyRetriever:
    def __init__(self, api_key=None, max_results=5, search_depth="advanced", include_domains=None, exclude_domains=None):
        self.api_key = api_key
        self.max_results = max_results
        self.search_depth = search_depth
        self.include_domains = include_domains or []
        self.exclude_domains = exclude_domains or []
        self.retriever = self._create_retriever()
        self.previous_searches = []

    def _create_retriever(self):
        try:
            return TavilySearchAPIRetriever(
                api_key=self.api_key,
                k=self.max_results,
                search_depth=self.search_depth,
                include_domains=self.include_domains,
                exclude_domains=self.exclude_domains
            )
        except Exception as e:
            logger.error(f"Failed to create Tavily retriever: {e}")
            raise

    def invoke(self, query, **kwargs):
        if not query or not query.strip():
            raise SearchQueryError("Empty search query")
        try:
            start_time = time.time()
            results = self.retriever.invoke(query, **kwargs)
            end_time = time.time()
            search_record = {
                "timestamp": datetime.now().isoformat(),
                "query": query,
                "num_results": len(results),
                "response_time": end_time - start_time
            }
            self.previous_searches.append(search_record)
            return results
        except Exception as e:
            logger.error(f"Search failed: {e}")
            raise SearchQueryError(f"Failed to execute search: {str(e)}")

    def get_search_history(self):
        return self.previous_searches

The EnhancedTavilyRetriever class is a custom wrapper around the TavilySearchAPIRetriever, adding greater flexibility, control, and traceability to search operations. It supports advanced features like limiting search depth, domain inclusion/exclusion filters, and configurable result counts. The invoke method performs web searches and tracks each query's metadata (timestamp, response time, and result count), storing it for later analysis.
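Assuming the TAVILY_API_KEY set earlier, a minimal exercise of the wrapper might look like this (the query and domain filter are illustrative, not part of the tutorial's main flow):

retriever = EnhancedTavilyRetriever(max_results=3, include_domains=["wikipedia.org"])
docs = retriever.invoke("Nintendo Switch launch date")
print(f"Got {len(docs)} documents")
print(retriever.get_search_history()[-1])  # timestamp, query, num_results, response_time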

class SearchCache:
    def __init__(self):
        self.embedding_function = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
        self.vector_store = None
        self.text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

    def add_documents(self, documents):
        if not documents:
            return
        try:
            if self.vector_store is None:
                self.vector_store = Chroma.from_documents(
                    documents=documents,
                    embedding=self.embedding_function
                )
            else:
                self.vector_store.add_documents(documents)
        except Exception as e:
            logger.error(f"Failed to add documents to cache: {e}")

    def search(self, query, k=3):
        if self.vector_store is None:
            return []
        try:
            return self.vector_store.similarity_search(query, k=k)
        except Exception as e:
            logger.error(f"Vector search failed: {e}")
            return []

The SearchCache class implements a semantic caching layer that stores and retrieves documents using vector embeddings for efficient similarity search. It uses GoogleGenerativeAIEmbeddings to convert documents into dense vectors and stores them in a Chroma vector database. The add_documents method initializes or updates the vector store, while the search method enables fast retrieval of the most relevant cached documents based on semantic similarity. This reduces redundant API calls and improves response times for repeated or related queries, serving as a lightweight hybrid memory layer in the AI assistant pipeline.
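With the GOOGLE_API_KEY in place, the cache can be exercised in isolation; the document content and URL below are invented for illustration:

cache = SearchCache()
cache.add_documents([
    Document(page_content="Breath of the Wild was released in March 2017.",
             metadata={"source": "https://example.com/botw"}),
])
hits = cache.search("When did Breath of the Wild come out?", k=1)
print(hits[0].page_content if hits else "cache miss")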

search_cache = SearchCache()
enhanced_retriever = EnhancedTavilyRetriever(max_results=5)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

system_template = """You are a research assistant that provides accurate answers based on the search results provided.
Follow these guidelines:
1. Only use the context provided to answer the question
2. If the context doesn't contain the answer, say "I don't have enough information to answer this question."
3. Cite your sources by referencing the document numbers
4. Don't make up information
5. Keep the answer concise but complete

Context: {context}
Chat History: {chat_history}
"""
system_message = SystemMessagePromptTemplate.from_template(system_template)
human_template = "Question: {question}"
human_message = HumanMessagePromptTemplate.from_template(human_template)
prompt = ChatPromptTemplate.from_messages([system_message, human_message])

We initialize the core components of the AI assistant: a semantic SearchCache, the EnhancedTavilyRetriever for web-based querying, and a ConversationBufferMemory to retain chat history across turns. It also defines a structured prompt using ChatPromptTemplate, guiding the LLM to act as a research assistant. The prompt enforces strict rules for factual accuracy, context usage, source citation, and concise answering, ensuring reliable and grounded responses.
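Before wiring the prompt into a chain, it can help to render it with dummy values and inspect the resulting messages; the context string and question here are placeholders:

# Sketch: render the prompt template with placeholder inputs
rendered = prompt.format_messages(
    context="Document 1 [Score: 0.92]: ...",
    chat_history=[],
    question="When was Breath of the Wild released?",
)
for msg in rendered:
    print(type(msg).__name__, "->", msg.content[:80])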

def get_llm(model_name="gemini-2.0-flash-lite", temperature=0.2, response_mode="json"):
    try:
        return ChatGoogleGenerativeAI(
            model=model_name,
            temperature=temperature,
            convert_system_message_to_human=True,
            top_p=0.95,
            top_k=40,
            max_output_tokens=2048
        )
    except Exception as e:
        logger.error(f"Failed to initialize LLM: {e}")
        raise


output_parser = SearchResultsParser()

We define the get_llm function, which initializes a Google Gemini language model with configurable parameters such as model name, temperature, and decoding settings (e.g., top_p, top_k, and max tokens). It ensures robustness with error handling for failed model initialization. An instance of SearchResultsParser is also created to standardize and structure the LLM's raw responses, enabling consistent downstream processing of answers and metadata.
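A minimal smoke test of the factory, assuming the GOOGLE_API_KEY set earlier (the prompt text is arbitrary):

llm = get_llm(temperature=0)       # deterministic decoding for factual queries
reply = llm.invoke("Reply with the single word: ready")
print(output_parser.parse(reply))  # normalized into the answer/sources/confidence dict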

def plot_search_metrics(search_history):
    if not search_history:
        print("No search history available")
        return
    df = pd.DataFrame(search_history)
    plt.figure(figsize=(12, 6))
    plt.subplot(1, 2, 1)
    plt.plot(range(len(df)), df['response_time'], marker='o')
    plt.title('Search Response Times')
    plt.xlabel('Search Index')
    plt.ylabel('Time (seconds)')
    plt.grid(True)
    plt.subplot(1, 2, 2)
    plt.bar(range(len(df)), df['num_results'])
    plt.title('Number of Results per Search')
    plt.xlabel('Search Index')
    plt.ylabel('Number of Results')
    plt.grid(True)
    plt.tight_layout()
    plt.show()

The plot_search_metrics function visualizes performance trends from past queries using Matplotlib. It converts the search history into a DataFrame and plots two subplots: one showing response time per search and the other displaying the number of results returned. This aids in analyzing the system's efficiency and search quality over time, helping developers fine-tune the retriever or identify bottlenecks in real-world usage.
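The function can be tried without any live searches by feeding it synthetic history records shaped like those EnhancedTavilyRetriever logs; the values below are invented:

fake_history = [
    {"timestamp": datetime.now().isoformat(), "query": f"query {i}",
     "num_results": 5 - i % 2, "response_time": 0.8 + 0.3 * i}
    for i in range(4)
]
plot_search_metrics(fake_history)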

def retrieve_with_fallback(query):
    cached_results = search_cache.search(query)
    if cached_results:
        logger.info(f"Retrieved {len(cached_results)} documents from cache")
        return cached_results
    logger.info("No cache hit, performing web search")
    search_results = enhanced_retriever.invoke(query)
    search_cache.add_documents(search_results)
    return search_results


def summarize_documents(documents, query):
    llm = get_llm(temperature=0)
    summarize_prompt = ChatPromptTemplate.from_template(
        """Create a concise summary of the following documents related to this query: {query}

{documents}

Provide a comprehensive summary that addresses the key points relevant to the query.
"""
    )
    chain = (
        {"documents": lambda docs: format_docs(docs), "query": lambda _: query}
        | summarize_prompt
        | llm
        | StrOutputParser()
    )
    return chain.invoke(documents)

These two functions enhance the assistant's intelligence and efficiency. The retrieve_with_fallback function implements a hybrid retrieval mechanism: it first attempts to fetch semantically relevant documents from the local Chroma cache and, if unsuccessful, falls back to a real-time Tavily web search, caching the new results for future use. Meanwhile, summarize_documents leverages a Gemini LLM to generate concise summaries from retrieved documents, guided by a structured prompt that ensures relevance to the query. Together, they enable low-latency, informative, and context-aware responses.
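Given both API keys, the fallback path can be observed directly; the first call should miss the cache and hit the web, and the second should then be served from Chroma (the query is illustrative):

docs = retrieve_with_fallback("Breath of the Wild release year")  # logs "No cache hit, performing web search"
docs = retrieve_with_fallback("Breath of the Wild release year")  # logs "Retrieved N documents from cache"
print(summarize_documents(docs, "Breath of the Wild release year"))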

def advanced_chain(query_engine="enhanced", model="gemini-1.5-pro", include_history=True):
    llm = get_llm(model_name=model)
    if query_engine == "enhanced":
        retriever = lambda query: retrieve_with_fallback(query)
    else:
        retriever = enhanced_retriever.invoke

    def chain_with_history(input_dict):
        query = input_dict["question"]
        chat_history = memory.load_memory_variables({})["chat_history"] if include_history else []
        docs = retriever(query)
        context = format_docs(docs)
        prompt_value = prompt.invoke({
            "context": context,
            "question": query,
            "chat_history": chat_history
        })
        response = llm.invoke(prompt_value)
        # Save the user's query and the model's answer so later turns see the history
        memory.save_context({"input": query}, {"output": response.content})
        return response

    return RunnableLambda(chain_with_history) | StrOutputParser()

The advanced_chain function defines a modular, end-to-end reasoning workflow for answering user queries using cached or real-time search. It initializes the specified Gemini model, selects the retrieval strategy (cached fallback or direct search), constructs a processing pipeline incorporating chat history (if enabled), formats documents into context, and prompts the LLM using a system-guided template. The chain also logs the interaction in memory and returns the final answer, parsed into clean text. This design enables flexible experimentation with models and retrieval strategies while maintaining conversational coherence.
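Because the chain is a standard Runnable, swapping models or retrieval engines is a one-liner; this sketch assumes both API keys are configured, and the engine/model arguments are just one possible combination:

fast_chain = advanced_chain(query_engine="direct", model="gemini-2.0-flash-lite", include_history=False)
print(fast_chain.invoke({"question": "Who developed Breath of the Wild?"}))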

qa_chain = advanced_chain()


def analyze_query(query):
    llm = get_llm(temperature=0)
    analysis_prompt = ChatPromptTemplate.from_template(
        """Analyze the following query and provide:
1. Main topic
2. Sentiment (positive, negative, neutral)
3. Key entities mentioned
4. Query type (factual, opinion, how-to, etc.)

Query: {query}

Return the analysis in JSON format with the following structure:
{{
    "topic": "main topic",
    "sentiment": "sentiment",
    "entities": ["entity1", "entity2"],
    "type": "query type"
}}
"""
    )
    # Wrap the custom parser's method so it composes as a Runnable in the chain
    chain = analysis_prompt | llm | RunnableLambda(output_parser.parse)
    return chain.invoke({"query": query})


print("Advanced Tavily-Gemini Implementation")
print("="*50)
query = "what year was breath of the wild released and what was its reception?"
print(f"Query: {query}")

We initialize the final components of the intelligent assistant. qa_chain is the assembled reasoning pipeline ready to process user queries using retrieval, memory, and Gemini-based response generation. The analyze_query function performs a lightweight semantic analysis on a query, extracting the main topic, sentiment, entities, and query type using the Gemini model and a structured JSON prompt. The example query, about Breath of the Wild's release and reception, shows how the assistant is triggered and prepared for full-stack inference and semantic interpretation. The printed heading marks the start of interactive execution.

try: print("nSearching for answer...") reply = qa_chain.invoke({"question": query}) print("nAnswer:") print(answer) print("nAnalyzing query...") try: query_analysis = analyze_query(query) print("nQuery Analysis:") print(json.dumps(query_analysis, indent=2)) isolated from Exception arsenic e: print(f"Query study correction (non-critical): {e}") except Exception arsenic e: print(f"Error successful search: {e}") history = enhanced_retriever.get_search_history() print("nSearch History:") for i, h successful enumerate(history): print(f"{i+1}. Query: {h['query']} - Results: {h['num_results']} - Time: {h['response_time']:.2f}s") print("nAdvanced hunt pinch domain filtering:") specialized_retriever = EnhancedTavilyRetriever( max_results=3, search_depth="advanced", include_domains=["nintendo.com", "zelda.com"], exclude_domains=["reddit.com", "twitter.com"] ) try: specialized_results = specialized_retriever.invoke("breath of nan chaotic sales") print(f"Found {len(specialized_results)} specialized results") summary = summarize_documents(specialized_results, "breath of nan chaotic sales") print("nSummary of specialized results:") print(summary) except Exception arsenic e: print(f"Error successful specialized search: {e}") print("nSearch Metrics:") plot_search_metrics(history)

We demonstrate the complete pipeline in action. It performs a search using the qa_chain, displays the generated answer, and then analyzes the query for sentiment, topic, entities, and type. It also retrieves and prints each query's search history, response time, and result count. Additionally, it runs a domain-filtered search focused on Nintendo-related sites, summarizes the results, and visualizes search performance using plot_search_metrics, offering a comprehensive view of the assistant's capabilities in real-time use.

In conclusion, following this tutorial gives users a comprehensive blueprint for creating a highly capable, context-aware, and scalable RAG system that bridges real-time web intelligence with conversational AI. The Tavily Search API lets users directly pull fresh and relevant content from the web. The Gemini LLM adds robust reasoning and summarization capabilities, while LangChain's abstraction layer allows seamless orchestration between memory, embeddings, and model outputs. The implementation includes advanced features such as domain-specific filtering, query analysis (sentiment, topic, and entity extraction), and fallback strategies using a semantic vector cache built with Chroma and GoogleGenerativeAIEmbeddings. Also, structured logging, error handling, and analytics dashboards provide transparency and diagnostics for real-world deployment.


Check out the Colab Notebook. All credit for this research goes to the researchers of this project. Also, feel free to follow us on Twitter and don't forget to join our 90k+ ML SubReddit.

Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.
