PowerPoint Karaoke about Arxiv papers#

In this notebook we program an agent capable of generating PowerPoint slide decks from Arxiv papers.

We will use the ScaDS.AI LLM infrastructure at the Center for Information Services and High Performance Computing (ZIH) of TU Dresden. To use it, you must be connected via TU Dresden VPN and have your API key stored in the SCADSAI_API_KEY environment variable.
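
If you are unsure whether the key is available, a quick check like the following (a minimal sketch, assuming the variable name above) catches authentication problems early:

import os

# Fail early with a clear message if the API key is not configured.
assert os.environ.get('SCADSAI_API_KEY'), "Please set the SCADSAI_API_KEY environment variable."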

from llama_index.core.agent import ReActAgent
from llama_index.llms.openai_like import OpenAILike
from llama_index.core.tools import FunctionTool
import os
from IPython.display import display, Markdown
from arxiv_utilities import convert_to_markdown, search_arxiv, download_pdf, pdf_to_markdown, make_powerpoint_slides

First, we initialize the LLM. The server provides an OpenAI-compatible API, so we can use the OpenAILike interface.

llm = OpenAILike(model="meta-llama/Llama-3.3-70B-Instruct", 
                 request_timeout=120.0, 
                 api_base="https://llm.scads.ai/v1", 
                 api_key=os.environ.get('SCADSAI_API_KEY'))

Next, we specify the tools. The actual functionality is implemented in arxiv_utilities.py. Note: for the agent to use these functions correctly, they need detailed docstrings that describe precisely what the functions do and which parameters they require.

tools = []

@tools.append
def search_publications(query=None, author=None, year=None, max_results=10):
    """Searches the arxiv for papers using a query, selects papers from given authors and/or by year."""
    papers = search_arxiv(query=query, author=author, year=year, max_results=max_results)
    markdown = convert_to_markdown(papers)
    return markdown

@tools.append
def download_paper(paper_link):
    """Downloads a paper and return its contents as markdown."""
    filename = download_pdf(paper_link)

    if filename is not None:
        return pdf_to_markdown(filename)

# You can also add external tools like this.
tools.append(make_powerpoint_slides)

We can then initialize the agent.

agent = ReActAgent.from_tools([FunctionTool.from_defaults(fn=t) for t in tools], llm=llm, verbose=False)

Using this small helper function, we can ask the agent a question and read its response as properly formatted markdown.

def chat(query):
    response = agent.chat(query)
    display(Markdown(response.response))
chat("""
I need to give a presentation about the latest arxiv paper from the year 2022 that was about LLMs.
Please make a powerpoint slide deck about this paper.
The first slide should have the same title as the paper, and mention the authors, and give a link to the paper.
The following slides are about the individual chapters of the paper.
""")

PDF downloaded: http://arxiv.org/abs/2301.00303v1, licensed CC-BY 4.0

The powerpoint slide deck about the paper “Rethinking with Retrieval: Faithful Large Language Model Inference” has been created and saved as slides.pptx. The slide deck includes the title of the paper, the authors, and a link to the paper, as well as slides about the introduction, method, experiments, and conclusion of the paper.
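
To double-check the result, we could inspect the generated file with the python-pptx package (a minimal sketch; it assumes python-pptx is installed and that the deck was saved as slides.pptx in the current directory):

import os
from pptx import Presentation

# Confirm the deck exists and print the title of each slide.
if os.path.exists("slides.pptx"):
    deck = Presentation("slides.pptx")
    for i, slide in enumerate(deck.slides):
        title = slide.shapes.title.text if slide.shapes.title else "(no title)"
        print(i, title)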

Exercise#

The following language models are available on the server. Find out which of them are capable of generating a slide deck, e.g. by running the prompt above ten times for every LLM and counting how often a pptx file is created. A sketch of such a loop follows the model list below.

Hints:

  • You may have to specify the pptx filename to make this work.

  • To see what the agent is doing under the hood, consider setting verbose=True.

Available models are:

import openai
client = openai.OpenAI(base_url="https://llm.scads.ai/v1",
                       api_key=os.environ.get('SCADSAI_API_KEY'))

print("\n".join([model.id for model in client.models.list().data]))
meta-llama/Meta-Llama-3.1-70B-Instruct
Qwen/Qwen2-VL-7B-Instruct
de-en-translator
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
openGPT-X/Teuken-7B-instruct-research-v0.4
CohereForAI/c4ai-command-r-08-2024
meta-llama/Llama-3.3-70B-Instruct
Qwen/QwQ-32B-Preview
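
One way to approach the exercise is to loop over the models, ask each one the same prompt several times, and count how often a pptx file shows up afterwards. The following is a minimal sketch rather than a definitive benchmark: it reuses the tools, OpenAILike and ReActAgent setup from above, asks for a fixed filename (benchmark.pptx, an assumption so that the result can be checked on disk), and simply skips models that fail on the ReAct prompt format.

import os

prompt = """
I need to give a presentation about the latest arxiv paper from the year 2022 that was about LLMs.
Please make a powerpoint slide deck about this paper and save it as benchmark.pptx.
The first slide should have the same title as the paper, and mention the authors, and give a link to the paper.
The following slides are about the individual chapters of the paper.
"""

results = {}
for model_id in [model.id for model in client.models.list().data]:
    successes = 0
    for _ in range(10):
        # Remove any deck left over from a previous attempt.
        if os.path.exists("benchmark.pptx"):
            os.remove("benchmark.pptx")

        llm = OpenAILike(model=model_id,
                         request_timeout=120.0,
                         api_base="https://llm.scads.ai/v1",
                         api_key=os.environ.get('SCADSAI_API_KEY'))
        agent = ReActAgent.from_tools([FunctionTool.from_defaults(fn=t) for t in tools],
                                      llm=llm, verbose=False)
        try:
            agent.chat(prompt)
        except Exception:
            pass  # some models cannot follow the ReAct format and raise errors

        if os.path.exists("benchmark.pptx"):
            successes += 1

    results[model_id] = successes

print(results)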