LLM-based Retrieval Augmented Generation

When text embeddings perform poorly at identifying relevant documents, we can ask an LLM to identify them instead. To do so, we provide a list of files with a summary of each file and ask the LLM which documents are relevant. We then take the content of the selected documents and assemble it into a long-context prompt.

from utilities import prompt_scadsai_llm, remove_outer_markdown, text_to_json
from IPython.display import display, Markdown

docs_root_folder = "hpc-compendium/doc.zih.tu-dresden.de/docs/"

compendium_url = "https://compendium.hpc.tu-dresden.de/"

This is again the question we aim to answer:

question = "How can I access the Jupyter Hub on the HPC system?"

Identifying relevant documents

To identify relevant documents, we first load the summary list.

# Read the content of hpc_compendium_summaries.md
with open('hpc_compendium_summaries.md', 'r', encoding='utf-8') as f:
    summaries = f.read()

# Print the first 700 characters to verify
print("First part of the content:")
print(summaries[:700], "...")

First part of the content:
* accessibility.md:
This document is an accessibility statement for the Technische Universität Dresden's websites, outlining the university's efforts to make its online presence accessible in accordance with German laws and regulations, and providing contact information for reporting accessibility issues and seeking remedies.

* data_protection_declaration.md:
This document outlines a data protection policy, stating that only IP addresses are collected for error analysis and stored temporarily, with users having the right to request information about their data and contact relevant authorities if needed.

* index.md:
This document provides an overview of the High-Performance Computing (HPC)  ...

response = prompt_scadsai_llm(f"""
Given a question and a list of document summaries, identify documents that might be helpful for answering the question.

## Question
{question} 

## Document summaries

{summaries}

## Your task:
Which of the documents above might be relevant for answering this question: {question}

Answer with a list of filenames in JSON format
""")

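# Post-process the response to get a proper Python list of file paths.
# The variable is named json_text so it does not shadow the standard json module.
json_text = remove_outer_markdown(response)
relevant_file_paths = text_to_json(json_text)
for f in relevant_file_paths:
    print(f)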

access/jupyterhub.md
access/jupyterlab.md
access/overview.md
quickstart/getting_started.md
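An LLM may wrap its JSON answer in a markdown fence or return filenames that do not appear in the summary list. The following is a minimal sketch of defensive post-processing under stated assumptions: `parse_filename_list` and `keep_known_files` are hypothetical helpers (they are not part of `utilities`, whose implementation is not shown here), and the summary list is assumed to use the `* <filename>:` entry format seen in the output above.

```python
import json
import re

FENCE = "`" * 3  # a markdown code fence, built here to keep this listing readable


def parse_filename_list(reply: str) -> list[str]:
    """Extract a JSON list of strings from an LLM reply (hypothetical helper)."""
    # If the reply contains a fenced block, only parse what is inside the fence.
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, reply, flags=re.DOTALL)
    text = match.group(1) if match else reply
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON list of filenames")
    return [item for item in data if isinstance(item, str)]


def keep_known_files(candidates: list[str], summaries_text: str) -> list[str]:
    """Keep only filenames that occur as '* <filename>:' entries in the summaries."""
    known = set(re.findall(r"^\* (.+?):\s*$", summaries_text, flags=re.MULTILINE))
    return [name for name in candidates if name in known]


# Made-up data for illustration only
example_reply = FENCE + 'json\n["access/jupyterhub.md", "made_up.md"]\n' + FENCE
example_summaries = "* access/jupyterhub.md:\nSummary.\n\n* index.md:\nOther.\n"

candidates = parse_filename_list(example_reply)
print(keep_known_files(candidates, example_summaries))
```

Filtering against the known entries means a hallucinated filename is dropped before any file is opened, instead of raising a `FileNotFoundError` later.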

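Next, we load the full text of the selected documents and assemble them into a single string for the long-context prompt.

full_texts = {}
for file in relevant_file_paths:
    with open(docs_root_folder + file, 'r', encoding='utf-8') as f:
        # Map the document's public URL (file path without the ".md" suffix) to its content
        full_texts[compendium_url + file[:-3]] = f.read()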


documents = "\n".join([f"### {file} \n\n```\n{content}\n```\n" for file, content in full_texts.items()])

documents[:500]

'### https://compendium.hpc.tu-dresden.de/access/jupyterhub \n\n```\n# JupyterHub\n\nWith our JupyterHub service, we offer you a quick and easy way to work with\nJupyter notebooks on ZIH systems. This page covers starting and stopping\nJupyterHub sessions, error handling and customizing the environment.\n\nWe also provide a comprehensive documentation on how to use\n[JupyterHub for Teaching (git-pull feature, quickstart links, direct links to notebook files)](jupyterhub_for_teaching.md).\n\n## Disclaimer\n\n!!'
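If the selected documents are long, the assembled string may exceed the context window of the LLM. A minimal sketch of assembling the documents under a total character budget follows. `assemble_documents` and `max_chars` are hypothetical names introduced for illustration, and a character count is only a rough proxy for the token limit of a given model.

```python
def assemble_documents(full_texts: dict[str, str], max_chars: int = 20000) -> str:
    """Join documents into one prompt section, truncated to a total character budget."""
    fence = "`" * 3  # markdown code fence around each document
    parts = []
    remaining = max_chars
    for url, content in full_texts.items():
        if remaining <= 0:
            break  # budget used up: skip the remaining documents
        snippet = content[:remaining]
        remaining -= len(snippet)
        parts.append(f"### {url}\n\n{fence}\n{snippet}\n{fence}\n")
    return "\n".join(parts)


# Made-up data for illustration only
example_texts = {"a": "x" * 10, "b": "y" * 10, "c": "z" * 10}
example_prompt_part = assemble_documents(example_texts, max_chars=15)
print(example_prompt_part)
```

Only the document contents count against the budget here; headings and fences add a few more characters per document.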

Finally, we ask the LLM to answer the question using the selected documents.

response = prompt_scadsai_llm(f"""
Given a question and a list of documents, answer the question using the documents.

## Question
{question} 

## Documents

{documents}

## Your task:
Answer the question: {question}
In case you used one of the documents above, cite it using markdown-formatted links to the respective document. Keep the links untouched!
""")

display(Markdown(response))

To access the Jupyter Hub on the HPC system, you can follow these steps:

Go to the JupyterHub page and click on the link to access JupyterHub.
Log in with your ZIH credentials (without @tu-dresden.de).
Choose a profile (system and resources) and start a new session.

As mentioned in the JupyterHub documentation, “JupyterHub is available at https://jupyterhub.hpc.tu-dresden.de.”

Also, note that you need to have an active HPC project to access JupyterHub, as stated in the Access to ZIH Systems document.

For more information on accessing JupyterHub and working with Jupyter notebooks on the HPC system, you can refer to the Quick Start guide, which provides an overview of the steps needed to submit a High Performance Computing (HPC) job, including accessing JupyterHub.

Exercise

Measure how long it takes to retrieve an answer using this approach, compared to long-context prompting.

Hint: Use the same LLM for both approaches. To do this with a length-limited LLM, you may have to shorten the full text.
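As a possible starting point for the measurement, here is a minimal timing sketch based on `time.perf_counter`. `timed` is a hypothetical helper, and `fake_llm` is a stand-in that lets the sketch run offline; in the exercise it would be replaced by the actual LLM call.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return its result together with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_llm(prompt: str) -> str:
    """Stand-in for an LLM call (assumption: replace with the real call)."""
    time.sleep(0.01)
    return f"answer to a prompt of {len(prompt)} characters"


answer, seconds = timed(fake_llm, "How can I access the Jupyter Hub on the HPC system?")
print(f"Elapsed: {seconds:.3f} s")
```

For a fair comparison, the time for the approach shown here should include both LLM calls: the one that selects the documents and the one that answers the question.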
