PowerPoint Karaoke about arXiv papers
In this notebook we program an agent that is capable of generating PowerPoint slide decks from arXiv papers. We will use the LlamaIndex framework for programming the agent.
We will use the ScaDS.AI LLM infrastructure at the Center for Information Services and High Performance Computing (ZIH) of TU Dresden. To use it, you must be connected via the TU Dresden VPN and have your API key stored in a SCADSAI_API_KEY environment variable.
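A missing API key otherwise only shows up later as an authentication error from the server. The following is a minimal sketch of an early check; the helper name `require_api_key` is hypothetical and not part of the notebook or of any library.

```python
import os

def require_api_key(name="SCADSAI_API_KEY", env=None):
    """Return the API key stored in the environment variable `name`, or raise a clear error."""
    # `env` can be replaced by a plain dict, which makes the helper easy to test.
    env = os.environ if env is None else env
    key = env.get(name)
    if not key:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Store your API key there before running the notebook."
        )
    return key
```

Calling `require_api_key()` once before the LLM is created would then fail early with a readable message instead of a failed request.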
import os
from utilities import convert_to_markdown, search_arxiv, download_pdf, pdf_to_markdown, make_powerpoint_slides
from llama_index.core.agent import ReActAgent
from llama_index.llms.ollama import Ollama
from llama_index.llms.openai_like import OpenAILike
from llama_index.core.tools import FunctionTool
First, we initialize the LLM. The server supports the OpenAI-API.
llm = OpenAILike(model="meta-llama/Llama-3.3-70B-Instruct",
                 request_timeout=120.0,
                 api_base="https://llm.scads.ai/v1",
                 api_key=os.environ.get('SCADSAI_API_KEY'),
                 max_tokens=2048)
Next, we specify tools. The actual functionality is implemented in the utilities module imported above. Note: For these functions to work as tools, they need detailed docstrings that describe precisely which parameters they expect.
tools = []

@tools.append
def search_publications(query:str=None, author:str=None, year:str=None)->str:
    """Searches the arxiv for papers using a query, selects papers from given authors and/or by year.

    Args:
        query: Search terms
        author: Author(s) of the searched items
        year: publication year

    Returns:
        Found paper(s)
    """
    print("Searching...")
    papers = search_arxiv(query=query, author=author, year=year, max_results=3)
    markdown = convert_to_markdown(papers)
    return markdown
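
@tools.append
def download_paper(paper_link:str)->str:
    """Downloads a paper and returns its contents as markdown.

    Args:
        paper_link: url of the paper to be downloaded

    Returns:
        Content of the paper
    """
    # Print the link here; the filename is only known after the download.
    print("Downloading...", paper_link)
    filename = download_pdf(paper_link)
    if filename is not None:
        return pdf_to_markdown(filename)
    # Tell the agent explicitly when the download failed.
    return "The paper could not be downloaded."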
# You can also add external tools like this.
tools.append(make_powerpoint_slides)
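To illustrate why the docstrings matter: the language model never sees the function bodies, only a textual description of each tool. The sketch below builds such a description with Python's standard `inspect` module. It is an approximation for illustration, not LlamaIndex's actual implementation, and `describe_tool` is a hypothetical helper name.

```python
import inspect

def describe_tool(fn):
    """Build a plain-text description of a function from its name, signature and docstring."""
    signature = inspect.signature(fn)
    # inspect.getdoc removes the indentation of the docstring.
    doc = inspect.getdoc(fn) or "(no docstring: the LLM would have to guess)"
    return f"{fn.__name__}{signature}\n{doc}"

def download_paper(paper_link: str) -> str:
    """Downloads a paper and returns its contents as markdown.

    Args:
        paper_link: url of the paper to be downloaded
    """
    return ""  # placeholder body, irrelevant for the description

print(describe_tool(download_paper))
```

Without the `Args` section, the model would only see a parameter name and a type, which is often too little to decide what to pass in.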
We can then initialize the agent.
agent = ReActAgent.from_tools([FunctionTool.from_defaults(fn=t) for t in tools], llm=llm, verbose=False)
result = agent.chat("""
I need to give a presentation about the latest arxiv paper from the year 2022 that was about LLMs.
Please make a powerpoint slide deck about this paper.
The first slide should have the same title as the paper, and mention the authors, and give a link to the paper.
The following slides are about the individual chapters of the paper.
""")
result.response
Searching...
Creating PowerPoint slides LLM_presentation
The Powerpoint Presentation was saved as LLM_presentation
'The Powerpoint Presentation was saved as LLM_presentation.pptx. It contains 5 slides: the first slide is the title slide with the authors and link to the paper, and the following slides are about the introduction, methods, results, and conclusion of the paper. Please note that the content of the slides is based on the search results and may not be entirely accurate. It is recommended to verify the content with the original paper.'
Exercise
Program your own agent that creates a PowerPoint presentation for a PDF you provide.
Exercise
The following language models are available on the server. Find out which of them are capable of generating a slide deck. For example, run the prompt above 10 times for every LLM and count how often a pptx file is created.
Hints:
You may have to specify the pptx filename to make this work.
To see what the agent is doing under the hood, consider setting verbose=True.
Available models are:
import openai
client = openai.OpenAI(base_url="https://llm.scads.ai/v1",
api_key=os.environ.get('SCADSAI_API_KEY'))
print("\n".join([model.id for model in client.models.list().data]))
openGPT-X/Teuken-7B-instruct-research-v0.4
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
Qwen/Qwen2-VL-7B-Instruct
en-de-translator
meta-llama/Llama-3.3-70B-Instruct
tts-1-hd
deepseek-ai/DeepSeek-R1
Alibaba-NLP/gte-Qwen2-1.5B-instruct
CohereForAI/c4ai-command-r-08-2024
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
mistral-7b-q4
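The counting experiment suggested in the exercise could be organized roughly as follows. This is a sketch under assumptions: `count_successes` and `fake_run` are hypothetical names, and `fake_run` merely stands in for a real agent call so that the loop can be tried without server access.

```python
import os
import tempfile

def count_successes(run_once, output_path, n_trials=10):
    """Call run_once() n_trials times and count how often output_path exists afterwards."""
    successes = 0
    for _ in range(n_trials):
        # Remove leftovers so an old file is not counted as a new success.
        if os.path.exists(output_path):
            os.remove(output_path)
        try:
            run_once()
        except Exception as error:
            # A crashed run simply counts as "no slide deck created".
            print("Run failed:", error)
        if os.path.exists(output_path):
            successes += 1
    return successes

# Stand-in for a real agent call: writes a dummy file on every second call.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.pptx")
calls = {"n": 0}

def fake_run():
    calls["n"] += 1
    if calls["n"] % 2 == 0:
        with open(demo_path, "w") as f:
            f.write("dummy")

print(count_successes(fake_run, demo_path, n_trials=4))
```

In the notebook, `run_once` would instead create an agent for one of the chat-capable models listed above and send it the prompt, with the pptx filename stated explicitly in the prompt so that it matches `output_path`. Entries such as the text-to-speech, translation, and embedding models are presumably not suitable for this test.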