
Using Advanced Retrievers in LangChain

More Techniques to Improve Retrieval Quality


If you’ve ever hit the wall with basic retrievers, it’s time to gear up with some “advanced” retrievers from LangChain.

This isn’t just an upgrade; it’s a new way to think about digging through data. Picture this: instead of a single line of inquiry, you deploy a squad of queries tailored to scout out a broader intel landscape. You’re not just searching; you’re launching a multi-pronged investigation into your database. This blog will arm you with everything you need to master advanced retrievers, helping you outsmart traditional search limitations and get closer to the heart of the matter.

For anyone looking to sharpen their retrieval results, let’s dive into how these retrievers work and how they can change your game.

MultiQueryRetriever

The MultiQueryRetriever addresses the limitations of distance-based similarity search by generating multiple alternative “perspectives” on your query/question.

By generating alternative questions and retrieving documents based on those questions, you can cover a broader range of information and increase the chances of finding the most relevant content. To determine if you need to use the MultiQueryRetriever, consider the following:

• If you use a distance-based similarity search and want to improve the retrieval accuracy, the MultiQueryRetriever can be a valuable tool.

• If you want to provide a more comprehensive set of results to the user by considering different perspectives on their question, the MultiQueryRetriever is a suitable choice.

• If you have a vector database with many documents and want to retrieve the most relevant ones, the MultiQueryRetriever can help you achieve that.

In short, the MultiQueryRetriever retrieves relevant documents for several generated variants of your question at once, improving retrieval accuracy, diversifying results, and scaling to large vector databases.


Consider the MultiQueryRetriever, then, whenever distance-based similarity search alone isn’t surfacing a comprehensive set of results. Here’s the basic setup:

from langchain.chat_models import ChatOpenAI
from langchain.retrievers.multi_query import MultiQueryRetriever

question = "Why are all great things slow of growth?"

llm = ChatOpenAI(temperature=0)

# `db` is the vector store we instantiated earlier
retriever_from_llm = MultiQueryRetriever.from_llm(
    retriever=db.as_retriever(), llm=llm
)

# Set logging for the queries
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

unique_docs = retriever_from_llm.get_relevant_documents(query=question)

And you can see the queries that were generated:

INFO:langchain.retrievers.multi_query:Generated queries: ['1. What are the reasons behind the slow growth of all great things?', '2. Can you explain why great things tend to have a slow growth rate?', '3. What factors contribute to the slow pace of growth in all great things?']

To inspect the actual Document objects, just evaluate:

unique_docs
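
If you’d rather skim the text than scroll through full Document objects, a quick loop works too (a minimal sketch; page_content is the standard attribute on LangChain Document objects):

for i, doc in enumerate(unique_docs, start=1):
    # print a short preview of each retrieved document
    print(f"Document {i}: {doc.page_content[:200]}\n")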

You can also supply a prompt along with an output parser to split the results into a list of queries:

from typing import List
from langchain.chains import LLMChain
from pydantic import BaseModel, Field
from langchain.prompts import PromptTemplate
from langchain.output_parsers import PydanticOutputParser


# Output parser will split the LLM result into a list of queries
class LineList(BaseModel):
    # "lines" is the key (attribute name) of the parsed output
    lines: List[str] = Field(description="Lines of text")


class LineListOutputParser(PydanticOutputParser):
    def __init__(self) -> None:
        super().__init__(pydantic_object=LineList)

    def parse(self, text: str) -> LineList:
        lines = text.strip().split("\n")
        return LineList(lines=lines)


output_parser = LineListOutputParser()

QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are a modern Stoic philosopher who has grown up in the
    hoods of south Sacramento, California. The teachings of Epictetus helped
    you overcome ups and downs in life. You are a student and teacher of his
    teachings.
    Original question: {question}""",
)

llm = ChatOpenAI(temperature=0)

# Chain
llm_chain = LLMChain(llm=llm, prompt=QUERY_PROMPT, output_parser=output_parser)

# Run
retriever = MultiQueryRetriever(
    retriever=db.as_retriever(), llm_chain=llm_chain, parser_key="lines"
)  # "lines" is the key (attribute name) of the parsed output

# Results
unique_docs = retriever.get_relevant_documents(
    query="How can I live a meaningful life"
)

len(unique_docs)
INFO:langchain.retrievers.multi_query:Generated queries: ['As a modern Stoic philosopher who has experienced the challenges of growing up in the hoods of South Sacramento, I can understand the desire to seek a meaningful life. The teachings of Epictetus can indeed provide valuable guidance in this pursuit. Here are some principles that can help you live a meaningful life:', '', '1. Focus on what you can control: Epictetus emphasized the importance of distinguishing between what is within our control and what is not. By focusing our energy on the things we can control, such as our thoughts, actions, and attitudes, we can find meaning in our ability to shape our own lives.', '', '2. Cultivate virtue: According to Stoicism, the ultimate goal in life is to live in accordance with virtue. Virtue encompasses qualities such as wisdom, courage, justice, and temperance. By striving to cultivate these virtues in our daily lives, we can find purpose and meaning in our actions.', '', '3. Embrace adversity: Stoicism teaches us to view adversity as an opportunity for growth and self-improvement. Rather than being overwhelmed by the challenges we face, we can choose to embrace them as valuable lessons and opportunities to develop resilience and character.', '', '4. Practice gratitude: Epictetus emphasized the importance of gratitude in finding contentment and meaning in life. By cultivating a mindset of gratitude, we can learn to appreciate the simple joys and blessings that surround us, even in the midst of difficult circumstances.', '', '5. Serve others: Stoicism encourages us to live a life of service to others. By helping and supporting those around us, we can find purpose and fulfillment in making a positive impact on the lives of others.', '', '6. Live in accordance with nature: Stoicism teaches us to align our lives with the natural order of the universe. By accepting the impermanence of things and embracing the present moment, we can find meaning in living in harmony with the flow of life.', '', '7. Seek wisdom: Epictetus believed that the pursuit of wisdom is essential for living a meaningful life. By continuously seeking knowledge, reflecting on our experiences, and learning from others, we can deepen our understanding of ourselves and the world around us.', '', 'Remember, living a meaningful life is a personal journey, and it may take time to fully integrate these principles into your daily life. But by embracing the teachings of Epictetus and applying them in your own unique circumstances, you can find purpose, fulfillment, and meaning in your life, regardless of your background or upbringing.']
17

Notice that the persona prompt produced a full Stoic answer rather than rephrased questions; the output parser split that answer line by line, and each line was passed to the retriever as a query. To get genuine alternative questions, add an explicit instruction to the template asking the LLM to generate several different versions of the original question.
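
For instance, a template closer to LangChain’s built-in multi-query prompt would look something like this (illustrative wording, not the library’s exact default):

ALTERNATIVE_QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI language model assistant. Generate 3 different
    versions of the given user question to retrieve relevant documents from
    a vector database. Provide these alternative questions separated by newlines.
    Original question: {question}""",
)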

Contextual compression

Contextual compression in LangChain is a technique used to compress and filter documents based on their relevance to a given query.

It aims to extract only the relevant information from documents, reducing the need for expensive language model calls and improving response quality.

Contextual compression is achieved by using a base retriever and a document compressor.

The base retriever retrieves the initial set of documents based on the query, and the document compressor processes these documents to extract the relevant content.

You can use contextual compression when you have a document storage system and want to improve retrieval performance by returning only the most relevant information. It is particularly useful when the relevant information is buried within documents containing a lot of irrelevant text.

To determine if you need to use contextual compression, consider the following factors:

  • If your document storage system contains a large amount of data with potentially irrelevant information.
  • If you want to reduce the cost and response time of language model calls by extracting only the relevant content.
  • If you want to improve the overall retrieval performance and quality of your application.

By using contextual compression, you can enhance the efficiency and effectiveness of your document retrieval process, resulting in better user experiences and optimized resource utilization.

from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.document_loaders import TextLoader
from langchain.vectorstores import Chroma

# Helper function for printing docs
def pretty_print_docs(docs):
    print(f"\n{'-' * 100}\n".join([f"Document {i+1}:\n\n" + d.page_content for i, d in enumerate(docs)]))

documents = TextLoader('/content/golden_hymns_of_epictetus.txt').load()

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

texts = text_splitter.split_documents(documents)

retriever = Chroma.from_documents(texts, OpenAIEmbeddings()).as_retriever()

docs = retriever.get_relevant_documents("What do the Stoics say of Socrates?")

pretty_print_docs(docs)

Adding contextual compression with LLMChainExtractor

We’ll add an LLMChainExtractor, which will iterate over the initially returned documents and extract from each only the content that is relevant to the query.

from langchain.llms import OpenAI
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

llm = OpenAI(temperature=0)

compressor = LLMChainExtractor.from_llm(llm)

compression_retriever = ContextualCompressionRetriever(base_compressor=compressor,
                                                       base_retriever=retriever)

compressed_docs = compression_retriever.get_relevant_documents("What do the Stoics say of Socrates?")

pretty_print_docs(compressed_docs)

LLMChainFilter

The LLMChainFilter in LangChain is a component used for filtering and processing documents based on their relevance to a given query.

It is a simpler but more robust compressor that uses an LLM chain to decide which of the initially retrieved documents to filter out and which ones to return, without manipulating the document contents.

from langchain.retrievers.document_compressors import LLMChainFilter

llm = OpenAI(temperature=0)

_filter = LLMChainFilter.from_llm(llm)

filter_retriever = ContextualCompressionRetriever(base_compressor=_filter,
                                                  base_retriever=retriever)

compressed_docs = filter_retriever.get_relevant_documents("What do the Stoics say of Socrates?")

pretty_print_docs(compressed_docs)

EmbeddingsFilter

Making an extra LLM call over each retrieved document is expensive and slow.

The EmbeddingsFilter provides a cheaper and faster option by embedding the documents and query and only returning those documents which have sufficiently similar embeddings to the query.

It follows the same pattern we’ve seen in the last two examples.

from langchain.retrievers.document_compressors import EmbeddingsFilter

embeddings = OpenAIEmbeddings()

embeddings_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)

compression_retriever = ContextualCompressionRetriever(base_compressor=embeddings_filter, base_retriever=retriever)

compressed_docs = compression_retriever.get_relevant_documents("What does Epictetus say about being mindful of the company you keep?")

pretty_print_docs(compressed_docs)

DocumentCompressorPipeline

The DocumentCompressorPipeline is a feature in LangChain that allows you to combine multiple compressors and document transformers in sequence.

It helps in compressing and transforming documents in a contextual manner. The pipeline can include compressors like EmbeddingsRedundantFilter to remove redundant documents based on embedding similarity, and EmbeddingsFilter to filter documents based on their similarity to the query. Document transformers like TextSplitter can be used to split documents into smaller pieces.

You may need to use the DocumentCompressorPipeline when you want to perform multiple compression and transformation steps on your documents.

It provides a flexible way to customize the compression process according to your specific requirements.

from langchain.document_transformers import EmbeddingsRedundantFilter
from langchain.retrievers.document_compressors import DocumentCompressorPipeline, EmbeddingsFilter
from langchain.text_splitter import CharacterTextSplitter

splitter = CharacterTextSplitter(chunk_size=300, chunk_overlap=50, separator=". ")

redundant_filter = EmbeddingsRedundantFilter(embeddings=embeddings)

relevant_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)

pipeline_compressor = DocumentCompressorPipeline(
    transformers=[splitter, redundant_filter, relevant_filter]
)

compression_retriever = ContextualCompressionRetriever(base_compressor=pipeline_compressor, base_retriever=retriever)

compressed_docs = compression_retriever.get_relevant_documents("What waere the characteristics of Socrates?")

pretty_print_docs(compressed_docs)

Ensemble Retriever

The EnsembleRetriever in LangChain is a retrieval algorithm that combines the results of multiple retrievers and reranks them using the Reciprocal Rank Fusion algorithm.

It is used to improve the performance of retrieval by leveraging the strengths of different algorithms. You may need to use the EnsembleRetriever when you want to achieve better retrieval performance than any single algorithm can provide. It is particularly useful when combining a sparse retriever (e.g., BM25) with a dense retriever (e.g., embedding similarity) because their strengths are complementary.

The sparse retriever is good at finding relevant documents based on keywords, while the dense retriever is good at finding relevant documents based on semantic similarity.

To use the EnsembleRetriever, you need to initialize it with a list of retrievers and their corresponding weights. The retrievers can be instances of different retrieval algorithms, such as a BM25Retriever and a vector store retriever (e.g., one backed by FAISS or Chroma). The weights determine the importance of each retriever in the ensemble. The EnsembleRetriever then combines the results of the retrievers and reranks them based on the Reciprocal Rank Fusion algorithm.
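
To make the reranking concrete, here is a minimal sketch of Reciprocal Rank Fusion (not LangChain’s internal implementation; the constant k=60 is the value suggested in the original RRF paper):

from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    # each document's fused score is the sum of 1 / (k + rank)
    # over every ranked list it appears in
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# e.g., fuse a keyword ranking with an embedding ranking of the same corpus
print(reciprocal_rank_fusion([["d1", "d3", "d2"], ["d2", "d1", "d4"]]))

The EnsembleRetriever applies the same idea, additionally multiplying each retriever’s contribution by its configured weight.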

If you have multiple retrievers that perform well on different aspects of the task, combining them using the EnsembleRetriever can lead to improved performance.

Additionally, if you have a combination of sparse and dense retrievers, the EnsembleRetriever can help leverage their complementary strengths.

!pip install rank_bm25

from langchain.retrievers import BM25Retriever, EnsembleRetriever
from langchain.vectorstores import Chroma

# reuse the chunks we split earlier; from_texts expects a list of strings,
# not a file path
doc_list = [doc.page_content for doc in texts]

# initialize the sparse (keyword-based) BM25 retriever
bm25_retriever = BM25Retriever.from_texts(doc_list)
bm25_retriever.k = 2

embedding = OpenAIEmbeddings()

vectorstore = Chroma.from_texts(doc_list, embedding)

# initialize the dense (embedding-based) retriever
chroma_retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

# initialize the ensemble retriever with equal weights
ensemble_retriever = EnsembleRetriever(
    retrievers=[bm25_retriever, chroma_retriever], weights=[0.5, 0.5]
)

docs = ensemble_retriever.get_relevant_documents("Socrates")
docs

Other retrievers

This blog is already quite lengthy, but know that there are a number of other retrievers you can use. They all follow the same pattern we’ve seen here.
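
For instance, here is a quick sketch of the ParentDocumentRetriever, which searches over small chunks but returns the larger parent documents they came from (a minimal example reusing the documents and OpenAIEmbeddings from earlier; the chunk size and collection name are illustrative):

from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain.text_splitter import RecursiveCharacterTextSplitter

# small chunks get embedded and searched...
child_splitter = RecursiveCharacterTextSplitter(chunk_size=400)

vectorstore = Chroma(collection_name="epictetus", embedding_function=OpenAIEmbeddings())

# ...but the full parent documents get returned
store = InMemoryStore()

parent_retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=store,
    child_splitter=child_splitter,
)
parent_retriever.add_documents(documents)

docs = parent_retriever.get_relevant_documents("Socrates")

The interface is the same as everything above: build the retriever, then call get_relevant_documents.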

Conclusion

As we wrap up, remember that these advanced retrievers aren’t just another tool in your arsenal; they’re your ace in the hole for tackling the complexity of information retrieval.

We’ve walked through their strategic approach to broadening search perspectives and how they can finesse your search results with precision. Whether you’re managing a vast repository of documents or seeking nuanced answers, these retrievers help push the boundaries of what’s possible in data retrieval. So put them to the test and watch as they transform your quest for information into a multi-faceted discovery journey.

Happy querying!

Harpreet Sahota
