Elastic D&D - Update 11 - Veverbot - Data Vectorization
Joe
Posted on November 4, 2023
Last week we talked about audio transcription changes. If you missed it, you can check that out here!
Veverbot
Veverbot is my own custom AI assistant that aims to help players get quick answers about things that happened during their campaign so far. This is absolutely a work-in-progress, but even the first iteration of him is very cool.
This is a fairly involved process, so today I will cover what needs to be done on the logging / Elastic configuration side for Veverbot to work.
Elastic Configuration
For Veverbot to work, we simply need to add or adjust the mappings in our index templates. Currently, I am using two component templates: one for the "dnd-notes-*" indices, and another for an index named "virtual_dm-questions_answers". The second index contains the questions that players ask Veverbot, as well as the responses that Veverbot provides back to the players.
dnd-notes-* component template
{
"name": "dnd-notes",
"component_template": {
"template": {
"mappings": {
"properties": {
"@timestamp": {
"format": "strict_date_optional_time",
"type": "date"
},
"session": {
"type": "long"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"finished": {
"type": "boolean"
},
"message": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"type": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"message_vector": {
"dims": 1536,
"similarity": "cosine",
"index": "true",
"type": "dense_vector"
}
}
}
}
}
}
virtual_dm-questions_answers component template
{
"name": "virtual_dm-questions_answers",
"component_template": {
"template": {
"mappings": {
"properties": {
"question_vector": {
"dims": 1536,
"similarity": "cosine",
"index": "true",
"type": "dense_vector"
},
"answer": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"question": {
"type": "text",
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
}
},
"answer_vector": {
"dims": 1536,
"similarity": "cosine",
"index": "true",
"type": "dense_vector"
}
}
}
}
}
}
NOTE:
The mappings and templates are created automatically via the docker-compose file! This section is purely educational; a user will not have to create any of this by hand.
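That said, if you ever want to register one of these component templates yourself, it only takes a single request to Elasticsearch's _component_template API. The sketch below does this with Python's requests library; the URL, credentials, and TLS settings are placeholders for a local dev cluster, not values from the repo, and the mapping is trimmed down to a few fields for readability.

import requests

# Hypothetical example: registering the dnd-notes component template by hand.
# ES_URL and the credentials are placeholders for a local dev cluster;
# the repo's docker-compose file creates these templates for you.
ES_URL = "https://localhost:9200"

dnd_notes_template = {
    "template": {
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date", "format": "strict_date_optional_time"},
                "session": {"type": "long"},
                "message": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
                "message_vector": {"type": "dense_vector", "dims": 1536, "index": True, "similarity": "cosine"},
            }
        }
    }
}

response = requests.put(
    f"{ES_URL}/_component_template/dnd-notes",
    json=dnd_notes_template,
    auth=("elastic", "changeme"),  # placeholder credentials
    verify=False,                  # assumes a self-signed dev certificate
)
print(response.json())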
Logging
With the mappings in place, we can now ingest logs with a dense_vector field. If you recall, this step happens on the note input page of Streamlit and is applied to every note that gets sent to Elastic.
Audio Note
st.session_state["message_vector"] = api_get_vector_object(st.session_state.transcribed_text)
Text Note
st.session_state["message_vector"] = api_get_vector_object(st.session_state.log_message)
The function that gets called simply makes a GET request to the FastAPI service that was covered in the week 9 blog post!
def api_get_vector_object(text):
    # returns a vector object (embedding) for the supplied text
    # fastapi_url is defined elsewhere in the Streamlit app; the text is passed as a path parameter
    fastapi_endpoint = "/get_vector_object/"
    full_url = fastapi_url + fastapi_endpoint + text
    response = requests.get(full_url)
    try:
        message_vector = response.json()
    except ValueError:
        # the API did not return valid JSON; log the raw response for debugging
        message_vector = None
        print(response.content)
    return message_vector
The API accepts the text as a path parameter, creates an embedding via OpenAI, and returns the vector object from the embedding. This vector object is what will allow Veverbot to compare user questions to player notes and return an answer.
@app.get("/get_vector_object/{text}")
async def get_vector_object(text):
    # creates an OpenAI embedding for the supplied text and returns the raw vector
    import openai

    openai.api_key = "API_KEY"
    embedding_model = "text-embedding-ada-002"
    openai_embedding = openai.Embedding.create(input=text, model=embedding_model)
    # a list of 1536 floats, matching the dense_vector dims in the mappings above
    return openai_embedding["data"][0]["embedding"]
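If you want to sanity-check the endpoint locally, FastAPI's TestClient makes that quick. This is just a sketch (not part of the repo) and assumes `app` is the FastAPI instance from the snippet above with a valid OpenAI key configured:

from urllib.parse import quote
from fastapi.testclient import TestClient

# quick local check of the embedding endpoint; `app` comes from the FastAPI snippet above
client = TestClient(app)
response = client.get("/get_vector_object/" + quote("the party enters the dragon's lair"))
vector = response.json()
print(len(vector))  # text-embedding-ada-002 returns 1536 floats, matching the dims in the mappings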
The log is indexed as normal, now with a dense_vector field. This field is what will allow Veverbot to compare user questions to player notes and return an answer, which we will talk about next week!
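To make that concrete, here is a rough sketch of what a note document looks like once the embedding has been attached and it is sent to Elastic. The index name, field values, and connection helpers (es_url, es_auth) are illustrative placeholders, not the exact code from the repo:

# illustrative only: the shape of a note document after the embedding is attached
note_doc = {
    "@timestamp": "2023-11-04T18:00:00Z",
    "session": 11,
    "name": "Joe",
    "type": "text",
    "finished": False,
    "message": st.session_state.log_message,
    "message_vector": st.session_state["message_vector"],  # 1536-dim dense_vector
}

# es_url and es_auth are placeholders for however the app connects to Elasticsearch
requests.post(f"{es_url}/dnd-notes-2023.11/_doc", json=note_doc, auth=es_auth, verify=False)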
Closing Remarks
As previously stated, next week I will be talking about Veverbot from the Streamlit side. I will essentially walk through the user experience and what is happening in the background to produce the "conversation" that happens on the front end.
Check out the GitHub repo below. You can also find my Twitch account in the socials link, where I will be actively working on this during the week while interacting with whoever is hanging out!
Happy Coding,
Joe