Prashant Iyer
Posted on July 28, 2024
You've probably asked a question to a language model before and then had it give you an answer. After all, this is what we most commonly use language models for.
But have you ever received a question from a language model? While less common, this application of AI has diverse use cases in areas like education, where you might want a model to generate practice questions for a test, and in sales enablement, where you might quiz your sales team about your products to improve their ability to sell.
Now, what if we had a face off⚔️ between two different models: one that asked questions about a topic and another that answered them? All without human intervention?
In this article, we're going to look at exactly that. We'll provide a sample passage about OpenAI's AI safety team as context to our models. We'll then let our models duel it out! One model will ask questions based on this passage, and another model will respond!
Our AI Models🤖
Introducing `slim-q-gen-tiny-tool`. This will be our question model, capable of generating three different types of questions:
- Multiple choice questions
- Boolean (true/false) questions
- General open-ended questions
Facing off against this will be `bling-phi-3-gguf`! This will be our answer model, giving appropriate responses to any of the above types of questions.
One important note: both of these models are GGUF-quantized, meaning they are smaller, faster versions of their original counterparts. For us, that means we can run them on just a CPU, with no need for resources like GPUs!
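If you want to follow along, the only setup you should need (assuming a recent Python environment) is the `llmware` package and a single import:

```python
# One-time install from PyPI:
#   pip install llmware

# The only import this example needs:
from llmware.models import ModelCatalog
```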
Step 1: Providing input parameters✏️
This is what our function signature for this example looks like.
```python
def ask_and_answer_game(source_passage, q_model="slim-q-gen-tiny-tool", number_of_tries=10,
                        question_type="question", temperature=0.5):
```
- `source_passage` is the text input that we will provide to our models,
- `q_model` is our questioning model,
- `number_of_tries` is the number of questions we will attempt to generate (more on this later!),
- `question_type` can be either `"multiple choice"`, `"boolean"`, or `"question"`, corresponding to each of the types of questions we saw above,
- `temperature` is a value ranging from 0 to 1 that determines how much variance we will see in our generated questions. Here, the value of 0.5 is relatively high so that we get a good variety of questions with little repetition!
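For orientation, here's roughly how you would invoke the finished function once we've built it up over the next few steps (the passage string below is just a placeholder):

```python
# Hypothetical invocation of ask_and_answer_game once it is fully defined (Steps 2-4)
sample_text = "OpenAI said Tuesday it has established a new committee ..."  # placeholder passage

ask_and_answer_game(sample_text, number_of_tries=10, question_type="question", temperature=0.5)
```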
Step 2: Loading in our models🪫🔋
With the inputs taken care of, let's now load in both our models.
```python
q_model = ModelCatalog().load_model(q_model, sample=True, temperature=temperature)
```
Notice that we have `sample=True` to increase variety in our model output (the questions generated).
Now, for the answer model.
```python
answer_model = ModelCatalog().load_model("bling-phi-3-gguf")
```
We won't mess with the `sample` or `temperature` options here because we want concise, fact-based answers from this model.
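If you'd rather make that intent explicit, you could spell out the sampling option yourself. This is a sketch, assuming `load_model` accepts `sample=False` for this model just as it accepts `sample=True` for the question model above:

```python
# Same intent, spelled out explicitly (assumption: sample=False is accepted here,
# mirroring the sample=True keyword used for the question model)
answer_model = ModelCatalog().load_model("bling-phi-3-gguf", sample=False)
```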
Step 3: Generating our questions🤔💬
We'll try to generate questions `number_of_tries` times, which in this case is 10. We'll then update our `questions` list with only the unique questions, to avoid repetition.
```python
questions = []

# Loop number_of_tries times
for x in range(0, number_of_tries):

    response = q_model.function_call(source_passage, params=[question_type])
    new_q = response["llm_response"]["question"]

    # Check to see that the question generated is unique
    if new_q and new_q not in questions:
        questions.append(new_q)
```
An important function here is `q_model.function_call()`. This is how the `llmware` library lets you prompt language models with just a single function call. Here, we pass in the source text and question type as arguments.
The function returns `response`, a dictionary with a lot of information about the call, but we're only interested in the `question` key, which is located inside the `llm_response` dictionary.
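Since that one key is all we need, a slightly more defensive version of the extraction might look like this; it's a sketch that assumes the response layout described above and simply guards against missing or malformed keys:

```python
# Defensive extraction of the generated question (sketch; assumes the
# response["llm_response"]["question"] layout described above)
llm_output = response.get("llm_response", {}) if isinstance(response, dict) else {}
new_q = llm_output.get("question") if isinstance(llm_output, dict) else None
```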
Step 4: Responding to our questions📝
Now that the questions have been generated, the duel is on! Let's use our answering model to respond to these questions. We'll loop through our `questions` list, pass in the source passage as context to the model, and ask each question.
```python
# Loop through each question
for i, question in enumerate(questions):

    # Print out the question
    print(f"\nquestion: {i} - {question}")

    # Validate the question list and run inference
    if isinstance(question, list) and len(question) > 0:
        response = answer_model.inference(question[0], add_context=source_passage)

        # Print out the answer
        print("response: ", response["llm_response"])
```
It is important to note that our question model returns each `question` as a `list`, with the first element (`question[0]`) containing the actual string corresponding to the question.
For each `question`, we then need to perform some validation:

- Check to see that the `question` is of the correct data type (`list`)
- Check to see that the `question` is not empty.
Then, the `answer_model.inference()` function will ask our model the question, passing in `source_passage` as context.
Finally, we print out the response.
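Putting Steps 1 through 4 together, the whole game fits comfortably in one small function. This is simply the snippets above assembled in one place (with the answering loop using the function's own `source_passage` argument as context):

```python
from llmware.models import ModelCatalog


def ask_and_answer_game(source_passage, q_model="slim-q-gen-tiny-tool", number_of_tries=10,
                        question_type="question", temperature=0.5):

    # Step 2: load the question model (sampling on for variety) and the answer model
    q_model = ModelCatalog().load_model(q_model, sample=True, temperature=temperature)
    answer_model = ModelCatalog().load_model("bling-phi-3-gguf")

    # Step 3: generate up to number_of_tries questions, keeping only unique ones
    questions = []
    for x in range(0, number_of_tries):
        response = q_model.function_call(source_passage, params=[question_type])
        new_q = response["llm_response"]["question"]
        if new_q and new_q not in questions:
            questions.append(new_q)

    # Step 4: answer each question, using the source passage as context
    for i, question in enumerate(questions):
        print(f"\nquestion: {i} - {question}")
        if isinstance(question, list) and len(question) > 0:
            response = answer_model.inference(question[0], add_context=source_passage)
            print("response: ", response["llm_response"])
```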
Results!✅
Let's quickly look at our sample passage. This passage was taken from a CNBC news story in May 2024 about OpenAI's work with safety and security.
"OpenAI said Tuesday it has established a new committee to make recommendations to the company’s board about safety and security, weeks after dissolving a team focused on AI safety. In a blog post, OpenAI said the new committee would be led by CEO Sam Altman as well as Bret Taylor, the company’s board chair, and board member Nicole Seligman. The announcement follows the high-profile exit this month of an OpenAI executive focused on safety, Jan Leike. Leike resigned from OpenAI leveling criticisms that the company had under-invested in AI safety work and that tensions with OpenAI’s leadership had reached a breaking point."
Now, let's see what our output looks like!
We can see all the questions that were asked about the passage, as well as concise, fact-based responses given to them!
Note that there are only 9 questions here while we provided `number_of_tries=10`. This means that one generated question was a duplicate and was ignored.
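If you need exactly ten unique questions rather than "up to ten", one option (a sketch, not part of the original example) is to keep generating until you hit the target, with a cap on total attempts so the loop always terminates:

```python
# Sketch: generate until we have `target` unique questions or reach `max_attempts`
# (not part of the original example)
target, max_attempts, attempts = 10, 30, 0
questions = []
while len(questions) < target and attempts < max_attempts:
    attempts += 1
    response = q_model.function_call(source_passage, params=[question_type])
    new_q = response["llm_response"]["question"]
    if new_q and new_q not in questions:
        questions.append(new_q)
```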
Conclusion
And with that, we're done with this example! Recall that we used the `llmware` library to:

- Load in a question model and an answer model
- Generate unique questions about a source passage
- Respond to each question accurately
And remember that we did all of this on just a CPU! 💻
Check out our YouTube video on this example!
If you made it this far, thank you for taking the time to go through this topic with us ❤️! For more content like this, make sure to visit our dev.to page.
The source code for many more examples like this one is on our GitHub. Find this example here.
Our repository also contains a notebook for this example that you can run yourself using Google Colab, Jupyter or any other platform that supports .ipynb notebooks.
Join our Discord to interact with a growing community of AI enthusiasts of all levels of experience!
Please be sure to visit our website llmware.ai for more information and updates.