Orca 2 RAG using TxtAI
Posted on January 28, 2024
pip install "txtai[all]" einops autoawq
Section 1: Set Up LLM
from txtai.pipeline import LLM, Textractor
from txtai import Embeddings
import os
import nltk
nltk.download('punkt')
llm = LLM("TheBloke/Orca-2-13B-AWQ", trust_remote_code=True)
print("Before RAG:")
print(llm("Tell me about Mervin Praison in one line"))
print("\n------------------\n")
Section 2: Build RAG Pipeline with Vector Search
def execute(question, context):
    # Orca 2 uses the ChatML prompt format; pad_token_id=32000 matches its added ChatML tokens
    prompt = f"""<|im_start|>system
You are a friendly assistant. You answer questions from users.<|im_end|>
<|im_start|>user
Answer the following question using only the context below. Only include information specifically discussed.

question: {question}
context: {context} <|im_end|>
<|im_start|>assistant
"""
    return llm(prompt, maxlength=6096, pad_token_id=32000)
def stream(path):
    # Iterate over supported files in the data directory and yield extracted paragraphs
    for f in sorted(os.listdir(path)):
        if f.endswith(("docx", "xlsx", "pdf", "txt")):
            fpath = os.path.join(path, f)
            for paragraph in textractor(fpath):
                yield paragraph
textractor = Textractor(paragraphs=True)
embeddings = Embeddings(content=True)
docs_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'data')
embeddings.index(stream(docs_path))
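As a quick sanity check (not in the original post; the query string is illustrative), the index can be searched directly to confirm it returns scored results:

# Optional sanity check: print top matches with similarity scores
for x in embeddings.search("Mervin Praison", limit=3):
    print(round(x["score"], 4), x["text"][:80])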
def context(question):
    return "\n".join(x["text"] for x in embeddings.search(question))

def rag(question):
    return execute(question, context(question))
print("After RAG:")
print(rag("Tell me about Mervin Praison in one line"))
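Because the index is rebuilt from the data directory on every run, it can optionally be persisted with txtai's save/load. This is a minimal sketch; the "rag-index" directory name is illustrative:

# Optional: save the index so later runs can load it instead of re-indexing
embeddings.save("rag-index")

# Later, in another run:
# embeddings = Embeddings()
# embeddings.load("rag-index")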
With Citations
Section 1: Create LLM and Textractor Pipelines
from txtai.pipeline import LLM, Textractor
import os
import nltk
nltk.download('punkt')
llm = LLM("TheBloke/Orca-2-13B-AWQ", trust_remote_code=True)
textractor = Textractor()
docs_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'data')
text = textractor(os.path.join(docs_path, "data.txt"))
question = "Tell me about Mervin Praison in one line"
prompt = f"""system
You are a friendly assistant. You answer questions from users.
user
Answer the following question using only the context below. Only include information specifically discussed.
question: {question}
context: {text}
assistant
"""
print(llm(prompt, maxlength=6096, pad_token_id=32000))
Section 2: Build RAG Pipeline with Vector Search
import os
from txtai import Embeddings
def execute(question, text):
    prompt = f"""<|im_start|>system
You are a friendly assistant. You answer questions from users.<|im_end|>
<|im_start|>user
Answer the following question using only the context below. Only include information specifically discussed.

question: {question}
context: {text} <|im_end|>
<|im_start|>assistant
"""
    return llm(prompt, maxlength=6096, pad_token_id=32000)
def stream(path):
    for f in sorted(os.listdir(path)):
        if f.endswith(("docx", "xlsx", "pdf", "txt")):
            fpath = os.path.join(path, f)
            for paragraph in textractor(fpath):
                yield paragraph
textractor = Textractor(paragraphs=True)
embeddings = Embeddings(content=True)
embeddings.index(stream(docs_path))
def context(question):
    return "\n".join(x["text"] for x in embeddings.search(question))

def rag(question):
    return execute(question, context(question))
print("RAG Pipeline Result:")
print(rag("Tell me about Mervin Praison in one line"))
Section 3: Implement Citations for LLMs
This searches the index with the generated answer itself; the closest matches serve as approximate citations.

for x in embeddings.search(rag("Tell me about Mervin Praison in one line")):
    print(x)
    print(x["text"])
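Since any passage returned this way is only an approximate citation, a score threshold can filter out weak matches. A minimal sketch; the 0.5 cutoff is an arbitrary illustrative value:

# Illustrative: keep only strong matches as citations
answer = rag("Tell me about Mervin Praison in one line")
for x in embeddings.search(answer, limit=3):
    if x["score"] >= 0.5:
        print(f'{x["score"]:.2f} -> {x["text"]}')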
Section 4: Create Extractor Pipeline for Citations
from txtai.pipeline import Extractor
def prompt(question):
    return [{
        "query": question,
        "question": f"""
Answer the following question using only the context below. Only include information specifically discussed.

question: {question}
context:
"""
    }]
llm = LLM("TheBloke/Orca-2-13B-AWQ", template="""system
You are a friendly assistant. You answer questions from users.
user
{text}
assistant
""")
extractor = Extractor(embeddings, llm, output="reference")
result = extractor(prompt("Tell me about Mervin Praison in one line"), maxlength=4096, pad_token_id=32000)[0]
print("ANSWER:", result["answer"])
print("CITATION:", embeddings.search("select id, text from txtai where id = :id", limit=1, parameters={"id": result["reference"]}))
../data/data.txt
Mervin Praison is an AI, Senior DevOps, and Site Reliability Engineer with a diverse range of technical interests and expertise. He has contributed to various projects and topics, such as the differences between keyword, sparse, and dense vector indexes, and has developed content on TensorFlow & Keras, focusing on models, optimizers, and activation functions. Praison also demonstrates proficiency in programming and system setup, evident in his guides on AutoGen Agent Training using AgentOptimizer and LocalAI Setup. Additionally, his GitHub activity reflects a consistent contribution pattern in software development and engineering.