This notebook shows how to use GPT-4o to turn rich PDF documents, such as slide decks or web page exports, into usable content for your RAG application.
This technique is useful when you have a large amount of unstructured data containing valuable information that you want to retrieve as part of your RAG pipeline.
For example, you could build a Knowledge Assistant that answers user queries about your company or product based on information contained in PDF documents.
The example documents used in this notebook are located at data/example_pdfs. They are related to OpenAI's APIs and various techniques that can be used as part of LLM projects.
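Before processing anything, it can help to confirm which example documents are available. The snippet below is a minimal sketch, not part of the original notebook: `list_pdfs` is a hypothetical helper name, and it assumes the notebook is run from a working directory where data/example_pdfs resolves correctly.

```python
from pathlib import Path


def list_pdfs(directory):
    """Return the PDF files found directly inside `directory`, sorted by path.

    Hypothetical helper for illustration; returns an empty list if the
    directory does not exist.
    """
    folder = Path(directory)
    if not folder.is_dir():
        return []
    # Match the extension case-insensitively so "slides.PDF" is also included.
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


# Assumed location, taken from the text above.
for pdf in list_pdfs("data/example_pdfs"):
    print(pdf.name)
```

If nothing is printed, the directory is probably missing or the notebook is running from a different working directory.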