RAG - Retrieval Augmented Generation
Technique combining retrieval from external sources with language model generation to enhance output accuracy and relevance in AI-powered text generation tasks.
RAG applications (built on top of LLMs), e.g. a full-stack web RAG application: select between multiple data sources and use functions as tools to gather information
LlamaIndex
RAG -> solution to limited context windows (retrieve the most relevant data from a DB => augment the query with that context => generate response)
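A minimal sketch of that retrieve => augment => generate loop. All names here are illustrative, and the retrieval scoring and the LLM call are toy stand-ins (word overlap and a stub function), not a real vector DB or model:

```python
def retrieve(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    # Toy relevance score: number of words shared between query and doc.
    # A real system would use vector similarity over embeddings instead.
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def generate(prompt: str) -> str:
    # Stand-in for an actual LLM call.
    return f"[LLM answer based on a prompt of {len(prompt)} chars]"

def rag_query(query: str, docs: list[str]) -> str:
    # Augment the query with retrieved context before generation.
    context = "\n".join(retrieve(query, docs))
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    return generate(prompt)

docs = [
    "LlamaIndex connects LLMs to external data.",
    "Bananas are yellow.",
    "RAG retrieves relevant documents before generation.",
]
print(rag_query("What does RAG do with documents?", docs))
```

The point is only the shape of the pipeline: retrieval picks a small relevant subset so the limited context window is spent on useful data.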
Vector Search / Vector Embeddings
- embed your data into a vector space (makes it easy to locate relevant data)
- embed (DATA => embedding model => VECTOR SPACE)
- retrieve context (PROMPT => embedding model => VECTOR SPACE => nearest data)
- query ((PROMPT + retrieved context) => LLM => response)
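The embed/retrieve steps above can be sketched with cosine similarity over toy vectors. The documents and the three-dimensional "embeddings" below are hand-made for illustration; in practice an embedding model produces high-dimensional vectors:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product of the vectors over the
    # product of their lengths (1.0 = same direction).
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# DATA => embedding model => VECTOR SPACE (toy vectors, not real embeddings)
index = {
    "doc about cats":    [1.0, 0.0, 0.0],
    "doc about dogs":    [0.7, 0.7, 0.0],
    "doc about finance": [0.0, 0.0, 1.0],
}

# PROMPT => embedding model => VECTOR SPACE (pretend embedding of the query)
query_vec = [0.9, 0.1, 0.0]

# retrieve: nearest document in the vector space becomes the context
best = max(index, key=lambda doc: cosine(index[doc], query_vec))
print(best)  # → doc about cats
```

The retrieved document would then be prepended to the prompt for the final query step.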
Query Engine
a) retriever: pulls context out of the index (gets nodes)
b) post-processing: processes the nodes the retriever fetched before giving them to the LLM
c) synthesizer: combines the processed nodes, the prompt template, and the query into a single prompt for the LLM
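The three query-engine stages can be sketched as plain functions. These names and the substring-match retrieval are illustrative only, not LlamaIndex's actual API:

```python
def retriever(query: str, index: list[str]) -> list[str]:
    # a) fetch candidate nodes out of the index
    # (toy version: substring match instead of vector search)
    return [node for node in index if query.lower() in node.lower()]

def postprocess(nodes: list[str], max_nodes: int = 2) -> list[str]:
    # b) process the retrieved nodes before they reach the LLM
    # (toy version: keep only the first max_nodes)
    return nodes[:max_nodes]

def synthesize(nodes: list[str], query: str) -> str:
    # c) combine processed nodes + prompt template + query into one prompt
    context = "\n".join(nodes)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

index = [
    "RAG augments prompts with retrieved data.",
    "Vector search finds nearby embeddings.",
    "RAG reduces hallucinations.",
]
prompt = synthesize(postprocess(retriever("RAG", index)), "What is RAG?")
print(prompt)
```

The final string is what actually gets sent to the LLM; each stage can be swapped out independently (e.g. a reranker in the post-processing step).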
Created on 4/3/2024