Posted on April 8, 2024
Intuition
It was a Saturday noon when I sat down with an empty and relaxed mind, after waking up a bit later than usual and finishing the chores. A question had been popping up in my head for a couple of days: how had OpenAI made ChatGPT accessible anonymously to the world?
A Twitter post from Simon Willison about scraping ChatGPT made me wonder: is it possible to do that now that ChatGPT access has been made anonymous?
Building the MVP
I opened an incognito window and, to my surprise, it was not a joke despite the announcement being made on April 1st. ChatGPT was accessible without having to create an account or authenticate into the application. Anonymous users can't store or share conversations, which is the point of being anonymous, right? But the major catch is that the conversation will be used to train the model. This made me wonder: is the data from authenticated and authorized users of ChatGPT not used for training, or is that just a consolation? Who knows, but hey, ChatGPT without login is a big deal.
I opened the network tab, and the first request I saw was to chat.openai.com, as expected. This was the request that initiates the authentication and validation of the session, basically setting some necessary cookies like CSRF tokens, API-specific tokens, etc. These cookies are required by the subsequent requests that actually send the prompts and messages.
This request did not return any meaningful response body, but it set important cookies via the response, so it effectively initialized the ChatGPT client.
Then, in the browser, I sent a prompt to the ChatGPT application as usual. After some digging through the requests and responses in the network tab, I was able to find the endpoint that sends the prompt and receives the generated message.
The request was a POST to the /backend-anon/conversation endpoint with a specific structure. This looked easy enough to turn into a script for getting a quick response from GPT.
I quickly started with a simple Python script which looked like this:
import requests

url = 'https://chat.openai.com/backend-anon/conversation'

headers = {
    'accept': 'text/event-stream',
    'accept-language': 'en-GB,en;q=0.8',
    'content-type': 'application/json',
    'cookie': '*****',  # copied from the browser's network tab
    'oai-language': 'en-US',
    'openai-sentinel-chat-requirements-token': '*****',  # copied from the browser's network tab
    'origin': 'https://chat.openai.com',
    'referer': 'https://chat.openai.com/',
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36'
}

data = {
    "action": "next",
    "messages": [{
        "id": "aaa2975f-214f-4b93-ac55-77aa21aeced5",
        "author": {"role": "user"},
        "content": {"content_type": "text", "parts": ["hi"]},
        "metadata": {}
    }],
    "parent_message_id": "aaa1ee93-2920-4208-a067-599c1d544b7c",
    "model": "text-davinci-002-render-sha",
    "timezone_offset_min": -330,
    "history_and_training_disabled": True,
    "conversation_mode": {"kind": "primary_assistant"},
    "force_paragen": False,
    "force_paragen_model_slug": "",
    "force_nulligen": False,
    "force_rate_limit": False,
}

response = requests.post(url, headers=headers, json=data)
print(response.text)
And sure enough, this script was a good starting point to validate my idea of an anonymous ChatGPT client. It worked for a few minutes and then stopped, because I was hitting the endpoint too many times with the same conversation message ID, and the cookies I reused were the same too.
So this needed to be fixed; it was a hacky solution, just an MVP to tick off the feasibility of the idea.
Next up was to configure the client itself to set the message IDs and cookies from an actual request to their backend.
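For the message IDs specifically, a minimal sketch of the fix is to generate a fresh UUID per request with Python's standard uuid module, instead of hard-coding them:
import uuid

# Generate fresh IDs for every prompt instead of hard-coding them;
# reusing the same message ID is part of what got the MVP blocked.
msg_id = str(uuid.uuid4())
parent_id = str(uuid.uuid4())
print(msg_id, parent_id)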
Creating the Anonymous Client
I added the first request to chat.openai.com and created a client for sending the requests. By doing that, I could store the cookies set by the initial request and carry them over to the prompt request, i.e., the /backend-anon/conversation endpoint.
However, I found that I also needed to set the header openai-sentinel-chat-requirements-token dynamically for each chat. Sentinel tokens appear to be a way for the OpenAI API to flag a request as authentic or unauthorized. They are generated for each new chat, so they are obtained from a fresh request to the /backend-anon/sentinel/chat-requirements endpoint.
This needed two more requests: one to get the cookies authorizing the user agent, and the other to authenticate the request as genuinely coming from ChatGPT with the help of the sentinel token.
So the final script was composed of three steps:
- Sending a request to the OpenAI base URL for cookies
- Sending a request to the OpenAI sentinel endpoint for the chat-requirements token
- Constructing the body and sending the prompt to the conversation endpoint to get the message
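Before walking through each step, here is a rough, hypothetical skeleton of how the three requests chain together; get_cookies, get_sentinel_token, and send_prompt are placeholder names standing in for the code shown in the sections below:
import requests

# Hypothetical helpers, filled in by the snippets in the next sections.
def get_cookies(client):
    ...  # step 1: GET the base URL so the session picks up cookies

def get_sentinel_token(client):
    ...  # step 2: POST /backend-anon/sentinel/chat-requirements, return the token

def send_prompt(client, token, prompt):
    ...  # step 3: POST /backend-anon/conversation with the sentinel header

def run_anonymous_chat(prompt):
    client = requests.session()
    get_cookies(client)
    token = get_sentinel_token(client)
    return send_prompt(client, token, prompt)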
Getting Authentication Cookies
The first request sets the cookies from the base endpoint of ChatGPT:
import requests
import json
import uuid

BASE_URL = "https://chat.openai.com"

# First request to get the cookies
headers = {
    "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36"
}

request_client = requests.session()
response = request_client.get(BASE_URL, headers=headers)
set_cookie = response.cookies.get_dict()

# format the cookies to be sent in the headers of subsequent requests
cookies = "; ".join([f"{key}={value}" for key, value in set_cookie.items()])
For the user-agent header, we can use the fake-useragent package to generate a new fake user agent each time we create a client. Simply install the package with pip: pip install fake-useragent
from fake_useragent import UserAgent
ua = UserAgent()
# Get a random browser user-agent string
print(ua.random)
The above code will generate a new user-agent string for our client.
The requests.session() function creates a session that we can use to make requests to multiple endpoints; by doing so, we don't have to copy the cookies or credentials into each request.
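As a tiny standalone illustration of that behavior (using httpbin.org rather than OpenAI), cookies set by one response are automatically sent back on the next request made through the same session:
import requests

session = requests.session()
# The first request sets a cookie on the session...
session.get("https://httpbin.org/cookies/set/flavor/oatmeal")
# ...and the session sends it back automatically on the next request.
print(session.get("https://httpbin.org/cookies").json())
# {'cookies': {'flavor': 'oatmeal'}}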
The cookies set from this API include the following:
__cf_bm
_cfuvid
__Host-next-auth.csrf-token
__Secure-next-auth.callback-url
__cflb
The OpenAI API requires these cookies to authorize the request for the subsequent calls to other endpoints, and this is the entry point API where we can set the cookies.
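Since the subsequent calls depend on these cookies, a small sanity check right after the first request can save some debugging time. This is a hypothetical check, not part of the original script; it reuses the set_cookie dict from the earlier snippet:
# Hypothetical sanity check: warn if expected cookies were not set
required = ["__cf_bm", "_cfuvid"]
missing = [name for name in required if name not in set_cookie]
if missing:
    print("Warning: missing expected cookies:", missing)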
Getting the Sentinel token for chat-requirements
The second request gets the sentinel token for the chat:
# second request to get OpenAI's Sentinel token
chat_req_url = f"{BASE_URL}/backend-anon/sentinel/chat-requirements"

headers = {
    "accept": "text/event-stream",
    "accept-language": "en-GB,en;q=0.8",
    "content-type": "application/json",
    "oai-device-id": set_cookie.get("oai-did"),
    "oai-language": "en-US",
    "origin": "https://chat.openai.com",
    "referer": "https://chat.openai.com/",
    "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36",
}

chat_req_res = request_client.post(chat_req_url, headers=headers)

chat_req_token = ""
if chat_req_res.status_code == 200:
    # grab the sentinel token from the response body
    chat_req_token = chat_req_res.json().get("token")
The /backend-anon/sentinel/chat-requirements endpoint is the API we POST to in order to get back the sentinel token. Among the request headers, the important one is openai-device-id (or oai-device-id), which is a unique ID for the device accessing the ChatGPT app.
By sending the POST request to the sentinel API, we get back a response containing the chat-requirements token, which is OpenAI's sentinel token. We store the token for later use; it will be added as one of the headers in the conversation (prompt) request.
Constructing the body for the conversation request
The third request sends the prompt and gets back the generated message:
def stream_messages(client, url, headers, data):
    last_msg = None
    second_last_msg = None
    try:
        with client.post(url, headers=headers, json=data, stream=True) as response:
            response.raise_for_status()
            for line in response.iter_lines():
                if line:
                    line_str = line.decode('utf-8') or ""
                    if line_str.startswith("data:"):
                        # trim the "data:" prefix before parsing
                        line_str = line_str[5:]
                        second_last_msg = last_msg
                        # the terminal chunk is "data: [DONE]", which carries no
                        # JSON; the message we want is the one parsed just before it
                        if "[DONE]" in line_str:
                            break
                        last_msg = json.loads(line_str)
        return second_last_msg
    except requests.exceptions.RequestException as e:
        print("Error:", e)
# third request to get the response to the prompt
url = f"{BASE_URL}/backend-anon/conversation"

headers = {
    **headers,
    "openai-sentinel-chat-requirements-token": chat_req_token,
}

parent_id = str(uuid.uuid4())
msg_id = str(uuid.uuid4())

prompt = input("Prompt: ")

data = {
    "action": "next",
    "messages": [
        {
            "id": f"{msg_id}",
            "author": {"role": "user"},
            "content": {"content_type": "text", "parts": [f"{prompt}"]},
            "metadata": {},
        }
    ],
    "parent_message_id": f"{parent_id}",
    "model": "text-davinci-002-render-sha",
    "timezone_offset_min": -330,
    "history_and_training_disabled": True,
    "conversation_mode": {"kind": "primary_assistant"},
    "force_paragen": False,
    "force_paragen_model_slug": "",
    "force_nulligen": False,
    "force_rate_limit": False,
}
resp_message = {}
msg = stream_messages(request_client, url, headers, data) or {}
messages = msg.get("message", {}).get("content", {}).get("parts", [])
if len(messages) > 0:
    resp_message["message"] = messages[0]
print(resp_message.get("message", ""))
The stream_messages function takes in the client, URL, and headers, as well as the request body, and returns the final message after streaming the response from the POST request.
The data is sent over a stream in chunks; each chunk arrives as:
data: {"message": {"id": "e74b1d88-ea0e-4df3-86be-c2d935ad60b5", "author": {"role": "system", "name": null, "metadata": {}}, "create_time": null, "update_time": null, "content": {"content_type": "text", "parts": [""]}, "status": "finished_successfully", "end_turn": true, "weight": 0.0, "metadata": {"is_visually_hidden_from_conversation": true}, "recipient": "all"}, "conversation_id": "aad455dc-ce3f-41ec-8cc4-6ad3febc3380", "error": null}
We cannot parse this string as JSON as-is, because the data: prefix makes it invalid; we need to trim the prefix and parse the object from the remainder. Since the data is streamed in chunks (packets), we only need the final message, but not the literal last chunk, because the last chunk is data: [DONE], which is just the marker indicating that the response has been fully served. So we take the second-to-last chunk as the final data of the conversation response.
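To make that parsing concrete, here is a minimal sketch of handling a single data: line, using a shortened sample chunk (not a real response):
import json

# A shortened sample chunk from the event stream
chunk = 'data: {"message": {"content": {"content_type": "text", "parts": ["Hello!"]}}, "error": null}'

if chunk.startswith("data:"):
    payload = chunk[5:].strip()
    if payload != "[DONE]":  # the terminal chunk carries no JSON
        message = json.loads(payload)
        print(message["message"]["content"]["parts"][0])  # Hello!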
This is the whole process of getting a message back for a prompt from the ChatGPT backend. Technically there is no official API for ChatGPT's anonymous mode, but we can mimic the browser's process.
That was the gist of how I came up with this solution for creating an anonymous client for ChatGPT; it is available as a Python package/CLI under the name anonymous-chatgpt.
You can observe the parameter history_and_training_disabled in the prompt request body (the third request), which should stop the data from this conversation from being shared to train the model, but who knows what really happens? I have set it to True, but it might still pass the data along for training and tracking; that's a possibility, and it might be just a parameter without a distinct else branch. Just like the joke we have about cookies: if enabled, track data; else, track data.
Links:
Anonymous ChatGPT GitHub Repository
Python Package - PyPI
Hope you enjoyed this stroll of a post, just exploring unexplored ideas and thinking through the intuitions. Thanks for reading.
Have a nice day, Happy Coding and dreaming :)