LLM-based Retrieval Augmented Generation

When text embeddings perform poorly at identifying relevant documents, one can instead ask an LLM to do the selection. To this end, we give the LLM a list of files with a short summary of each file and ask it which documents are relevant to the question. We then take the full content of the selected documents and assemble it into a long-context prompt.

from utilities import prompt_scadsai_llm, remove_outer_markdown, text_to_json
from IPython.display import display, Markdown

docs_root_folder = "hpc-compendium/doc.zih.tu-dresden.de/docs/"

compendium_url = "https://compendium.hpc.tu-dresden.de/"

This is again the question we aim to answer:

question = "How can I access the Jupyter Hub on the HPC system?"

Identifying relevant documents#

To identify relevant documents, we first load the summary list.

# Read the content of hpc_compendium_summaries.md
with open('hpc_compendium_summaries.md', 'r', encoding='utf-8') as f:
    summaries = f.read()

# Print the first 700 characters to verify
print("First part of the content:")
print(summaries[:700], "...")

First part of the content:
* accessibility.md:
This document is an accessibility statement for the Technische Universität Dresden's websites, outlining the university's efforts to make its online presence barrier-free in accordance with German law, and providing contact information for reporting accessibility issues and seeking redress.

* data_protection_declaration.md:
This document outlines a data protection policy, stating that only IP addresses are collected for error analysis and not shared with third parties unless required by law, and users have the right to request information about their personal data and contact relevant authorities.

* index.md:
This documentation provides information on the High-Performan ...
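The output above suggests that each entry in the summary list is a `* filename:` line followed by the summary text. As a minimal sketch under that assumption, the list could be parsed into a dictionary, for example to check later whether a filename returned by the LLM actually appears in the list. The function `parse_summaries` and the shortened example text are hypothetical and not part of the original notebook.

```python
def parse_summaries(text):
    """Parse '* filename:' headers followed by summary lines into a dict."""
    entries = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("* ") and stripped.endswith(":"):
            # A new entry starts: keep the filename without "* " and ":"
            current = stripped[2:-1]
            entries[current] = ""
        elif current is not None and stripped:
            # Non-empty lines belong to the summary of the current entry
            entries[current] = (entries[current] + " " + stripped).strip()
    return entries


# Shortened, made-up example in the same layout as the summary file
example = """* accessibility.md:
This document is an accessibility statement.

* index.md:
This documentation provides information on the HPC systems.
"""

summary_by_file = parse_summaries(example)
print(list(summary_by_file))
```

With the real file, `parse_summaries(summaries)` would give a lookup table of all known filenames, assuming the whole file follows the layout shown in the excerpt.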

response = prompt_scadsai_llm(f"""
Given a question and a list of document summaries, identify documents that might be helpful for answering the question.

## Question
{question} 

## Document summaries

{summaries}

## Your task:
Which of the documents above might be relevant for answering this question: {question}

Answer with a list of filenames in JSON format
""")

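# Post-process the result to get a proper list of file paths.
# The intermediate variable is not called `json` to avoid shadowing the standard-library module name.
json_text = remove_outer_markdown(response)
relevant_file_paths = text_to_json(json_text)
for file_path in relevant_file_paths:
    print(file_path)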

access/jupyterhub.md
access/overview.md
access/jupyterlab.md
quickstart/getting_started.md
software/big_data_frameworks.md
software/data_analytics_with_python.md
software/data_analytics_with_r.md
software/data_analytics_with_rstudio.md
access/desktop_cloud_visualization.md
access/graphical_applications_with_webvnc.md
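The helpers `remove_outer_markdown` and `text_to_json` come from the `utilities` module, whose implementation is not shown here. As a rough sketch of what such post-processing might involve, the following code strips an optional surrounding markdown code fence from the reply and parses the remainder as a JSON list. The names `strip_outer_code_fence` and `parse_filename_list` are hypothetical, and the actual helpers may behave differently.

```python
import json
import re

# A markdown code fence (three backticks), built programmatically
FENCE = "`" * 3


def strip_outer_code_fence(text):
    """Remove one surrounding code fence (with optional language tag), if present."""
    pattern = rf"^\s*{FENCE}[\w-]*[ \t]*\n(.*?)\n?[ \t]*{FENCE}\s*$"
    match = re.match(pattern, text, flags=re.DOTALL)
    return match.group(1).strip() if match else text.strip()


def parse_filename_list(reply):
    """Parse an LLM reply into a list of filenames, raising on unexpected shapes."""
    data = json.loads(strip_outer_code_fence(reply))
    if not isinstance(data, list):
        raise ValueError("expected a JSON list of filenames")
    return [str(item) for item in data]


# Made-up reply in the style an LLM might produce
reply = FENCE + 'json\n["access/jupyterhub.md", "access/overview.md"]\n' + FENCE
files = parse_filename_list(reply)
print(files)
```

Raising an error on anything other than a list makes a malformed reply visible early, rather than failing later when the files are opened.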

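# Load the full text of each selected document, keyed by its public URL
full_texts = {}
for file in relevant_file_paths:
    with open(docs_root_folder + file, 'r', encoding='utf-8') as f:
        # Drop the ".md" extension to form the URL of the rendered page
        full_texts[compendium_url + file.removesuffix(".md")] = f.read()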


documents = "\n".join([f"### {file} \n\n```\n{content}\n```\n" for file, content in full_texts.items()])

documents[:500]

'### https://compendium.hpc.tu-dresden.de/access/jupyterhub \n\n```\n# JupyterHub\n\nWith our JupyterHub service, we offer you a quick and easy way to work with\nJupyter notebooks on ZIH systems. This page covers starting and stopping\nJupyterHub sessions, error handling and customizing the environment.\n\nWe also provide a comprehensive documentation on how to use\n[JupyterHub for Teaching (git-pull feature, quickstart links, direct links to notebook files)](jupyterhub_for_teaching.md).\n\n## Disclaimer\n\n!!'
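Because the filenames come from an LLM, some of them may not exist on disk, in which case opening them would raise an error. The following sketch shows one possible way to load only the files that are actually present and to report the rest. The function `load_existing_documents` is a hypothetical helper, not part of the original notebook, and the demonstration uses a temporary folder instead of the real documentation tree. It assumes Python 3.9 or newer.

```python
import tempfile
from pathlib import Path


def load_existing_documents(root_folder, base_url, file_paths):
    """Read only the files that exist under root_folder; return (texts, skipped)."""
    root = Path(root_folder).resolve()
    texts, skipped = {}, []
    for rel_path in file_paths:
        path = (root / rel_path).resolve()
        # Reject paths that escape the docs folder or do not point to a file
        if not path.is_relative_to(root) or not path.is_file():
            skipped.append(rel_path)
            continue
        # Drop the ".md" extension to form the URL of the rendered page
        url = base_url + rel_path.removesuffix(".md")
        texts[url] = path.read_text(encoding="utf-8")
    return texts, skipped


# Demonstration with a temporary folder containing a single document
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "access").mkdir()
    (Path(tmp) / "access" / "jupyterhub.md").write_text("# JupyterHub\n", encoding="utf-8")
    texts, skipped = load_existing_documents(
        tmp,
        "https://compendium.hpc.tu-dresden.de/",
        ["access/jupyterhub.md", "access/does_not_exist.md"],
    )

print(list(texts))
print("Skipped:", skipped)
```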

response = prompt_scadsai_llm(f"""
Given a question and a list of documents, answer the question using the information in these documents.

## Question
{question} 

## Documents

{documents}

## Your task:
Answer the question: {question}
In case you used one of the documents above, cite it using markdown-formatted links to the respective document. Keep the links untouched!
""")

display(Markdown(response))

To access the Jupyter Hub on the HPC system, you can visit the JupyterHub page and follow the instructions provided. According to the https://compendium.hpc.tu-dresden.de/access/jupyterhub document, you can access JupyterHub at https://jupyterhub.hpc.tu-dresden.de and log in with your ZIH credentials.

Additionally, you can also find more information on how to access JupyterHub in other documents such as https://compendium.hpc.tu-dresden.de/access/overview and https://compendium.hpc.tu-dresden.de/quickstart/getting_started.

Please note that you need to have a ZIH HPC login to access JupyterHub, and you can apply for it via the HPC login application form as mentioned in the https://compendium.hpc.tu-dresden.de/quickstart/getting_started document.

It is also worth mentioning that JupyterHub is available on other clusters such as vis as mentioned in the https://compendium.hpc.tu-dresden.de/access/desktop_cloud_visualization document.

Please let me know if you need further assistance!

Exercise#

Measure how long it takes to retrieve an answer using this approach, compared to long-context prompting.

Hint: Use the same LLM for both approaches. To do this with a length-limited LLM, you may have to shorten the full text.
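As a possible starting point for the measurement, the sketch below wraps a function call with `time.perf_counter`. The helper `timed` and the stand-in `fake_llm` are hypothetical. In the notebook, `fake_llm` would be replaced by `prompt_scadsai_llm`, and for the LLM-based approach both calls (document selection and answering) would need to be timed and summed.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_llm(prompt):
    # Hypothetical stand-in for prompt_scadsai_llm so the sketch runs offline
    time.sleep(0.01)
    return f"answer based on {len(prompt)} characters"


answer, seconds = timed(fake_llm, "How can I access the Jupyter Hub on the HPC system?")
print(f"Answered in {seconds:.3f} s")
```

A single measurement can be noisy, for example due to server load, so repeating each variant a few times and comparing averages is likely to give a fairer picture.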
