Application for collecting data from Telegram (part 2: Chat participants)
Vadim Kolobanov
Posted on October 31, 2021
Hello everyone! In the second part, we will continue to collect chat data from the "secure and twice encrypted" Telegram server.
Before reading the article, I strongly recommend reading the first part. In it, we created a Telegram developer account and set up our project.
Right now our project consists of a single ".py" file, a settings file, and a session file.
As one well-known saying goes:
A chain is only as strong as its weakest link.
Therefore, we will make a beautiful, strong project.
In this part, we will get the list of users from a Telegram chat and see what information about its participants Telegram gives us when we are not the chat administrator but have joined as a regular user.
Let's go to PyCharm, my friends.
In order not to get confused in our code in the future, we will create several files in the project directory:
Users.py
links.txt
We will write all the code from this chapter in a separate file, Users.py. This will greatly simplify our work in the future.
Let's import this file into our main project, which we wrote in the first part. (At the end of the article I will post the code from both parts, so don't worry.)
import Users
In addition, for further work with chat users, we will need a couple more imports from our Telethon library.
from telethon.tl.functions.channels import GetParticipantsRequest
from telethon.tl.types import ChannelParticipantsSearch
For clarity and convenience, let's install the tqdm library in our project. It lets us show a readable progress bar in the console (a graphical strip that tracks the progress of our export).
Run the command pip install tqdm.
Then import the library class into our project:
from tqdm import tqdm
All imports in the main file:
import configparser
from telethon import TelegramClient
import Users
from telethon.tl.functions.channels import GetParticipantsRequest
from telethon.tl.types import ChannelParticipantsSearch
from tqdm import tqdm
We still have one file we have not explained: links.txt. We will use it as the list of links to the chats we want to parse, but more on that a little later.
In our file Users.py let's create an async function:
async def dump_all_participants(channel,
                                ChannelParticipantsSearch,
                                client,
                                GetParticipantsRequest,
                                tqdm):
channel: the Telegram chat that we pass to our function.
ChannelParticipantsSearch and GetParticipantsRequest: the classes of the Telethon library that we imported above and that our function needs.
tqdm: our library for the progress bar.
client: the connection that we created in the first part. We can't do anything without it.
Now let's set up reading chat links from our file links.txt.
In our main file Update.py, inside the main function from the last part, we will write this code:
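async def main():
    # Read one chat link per line; iterating the file stops at end of file
    with open("links.txt", "r") as f:
        for line in f:
            url = line.strip()
            if not url:
                continue  # skip blank lines
            try:
                channel = await client.get_entity(url)
                await Users.dump_all_participants(channel, ChannelParticipantsSearch,
                                                  client, GetParticipantsRequest, tqdm)
            except Exception as e:
                # Report the failing link instead of silently ignoring it
                print(f"Could not process {url}: {e}")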
Here, for each link read from the file, we call our dump_all_participants function from Users.py, passing our arguments to it.
Moving on...
Go to the file Users.py, where we created our dump_all_participants function, and write the initial settings for the Telethon request.
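Reading the links can also be isolated in a small helper. The following is only a minimal sketch: read_links is a hypothetical name that is not part of the original project. It returns the non-empty lines of the file, so the loop ends at the end of the file and blank lines never reach get_entity.

```python
def read_links(path="links.txt"):
    """Return the non-empty lines of the file, stripped of whitespace."""
    links = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:            # iteration stops at end of file
            link = line.strip()   # drop the trailing newline and spaces
            if link:              # skip blank lines
                links.append(link)
    return links
```

Inside main, the result could then be looped over with a plain for loop, calling client.get_entity on each link.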
async def dump_all_participants(channel,
                                ChannelParticipantsSearch,
                                client,
                                GetParticipantsRequest,
                                tqdm):
    print('Get info from channel', channel.title)
    OFFSET_USER = 0    # start user position
    LIMIT_USER = 200   # the maximum number of records transmitted
                       # at a time, but not more than 200
    ALL_PARTICIPANTS = []
    FILTER_USER = ChannelParticipantsSearch('')  # filter user
Let's create an infinite while loop:
    while True:
Inside our loop, the magic begins.
        participants = await client(GetParticipantsRequest(channel,
            FILTER_USER, OFFSET_USER, LIMIT_USER, hash=0))
        if not participants.users:
            break
        ALL_PARTICIPANTS.extend(participants.users)
        OFFSET_USER += len(participants.users)
We called Telegram and, using its request class and our constants, asked for the list of users of the chat we are interested in, which is stored in the channel variable.
As a result, participants.users holds a list of objects with information about each user. We add each page to ALL_PARTICIPANTS and move the offset forward until an empty page comes back.
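The offset-and-limit pattern used in this loop can be tried without a Telegram connection. The sketch below is an illustration only: fetch_all and fake_fetch are hypothetical names, and a plain list stands in for the server.

```python
def fetch_all(fetch_page, limit=200):
    """Collect items page by page until an empty page is returned."""
    offset = 0
    items = []
    while True:
        page = fetch_page(offset, limit)
        if not page:              # an empty page means we have everything
            break
        items.extend(page)
        offset += len(page)       # the next request starts after what we have
    return items


# Stand-in for the server: 450 fake user IDs served in slices.
FAKE_USERS = list(range(450))


def fake_fetch(offset, limit):
    return FAKE_USERS[offset:offset + limit]


all_users = fetch_all(fake_fetch)
print(len(all_users))  # 450
```

With 450 fake users and a limit of 200, the loop makes four requests: pages of 200, 200, and 50 items, then an empty page that ends it.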
For clarity, we will write the received information to a text file:
    with open(f"{channel.title}.txt", "w", encoding="utf-8") as file:
Be sure to specify the encoding to avoid problems with nicknames that contain non-ASCII characters.
Now let's iterate through our users in a loop, wrapping the list in tqdm to get a progress bar:
        for participant in tqdm(ALL_PARTICIPANTS):
            try:
                # write() is enough here: we pass a single string per user
                file.write(f" ID: {participant.id} "
                           f" First_Name:{participant.first_name}"
                           f" Last_Name: {participant.last_name}"
                           f" Username: {participant.username}"
                           f" Phone: {participant.phone}"
                           f" Channel: {channel.title} \n")
            except Exception as e:
                print(e)
    print('The end!')
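Some of these fields may be None for a given user, in which case the loop writes the literal text None. As a possible refinement, here is a minimal sketch of a hypothetical helper, format_participant (not part of the original code), that substitutes a placeholder instead. A SimpleNamespace stands in for a real Telethon user object.

```python
from types import SimpleNamespace


def format_participant(participant, channel_title, missing="-"):
    """Build one output line, replacing None fields with a placeholder."""
    def value(name):
        v = getattr(participant, name, None)
        return missing if v is None else v

    return (f"ID: {value('id')} "
            f"First_Name: {value('first_name')} "
            f"Last_Name: {value('last_name')} "
            f"Username: {value('username')} "
            f"Phone: {value('phone')} "
            f"Channel: {channel_title}\n")


# A stand-in object with the same attribute names used in the loop.
user = SimpleNamespace(id=1, first_name="Ada", last_name=None,
                       username="ada", phone=None)
line = format_participant(user, "Python chat")
```

In the loop, file.write(format_participant(participant, channel.title)) would then replace the long f-string.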
After running the script against the first Python chat I found, the root of our project contains a file {channel name}.txt with information about its users.
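Because the chat title is used directly as the file name, a title containing characters such as / or : may fail to open on some systems. The sketch below shows one way this could be handled. It is an assumption on my part, not part of the original code: safe_filename is a hypothetical helper, and the set of replaced characters is a conservative guess.

```python
import re


def safe_filename(title, replacement="_"):
    """Replace characters that are commonly not allowed in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', replacement, title).strip()
    # Fall back to a fixed name if nothing usable is left
    return cleaned or "channel"
```

The file could then be opened with open(f"{safe_filename(channel.title)}.txt", "w", encoding="utf-8").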
To run the parsing, insert a link to a chat or group into the file links.txt. You can find this link in the chat description.
Important! You will not be able to parse a channel, because you are not its administrator. To parse a chat or a group, you need to be subscribed to it.
You can list many chat links. It is important to put each one on a new line.
Perfect! We just downloaded the list of chat users. We found out their unique identifiers (IDs), which will help us later to find the messages of a particular user in this chat (or in any other). We will talk about exporting messages from the chat in part 3.
Thank you all for your attention! Write about your impressions in the comments.
By tradition, the full code.
File Update.py
import configparser
from telethon import TelegramClient
import Users
from telethon.tl.functions.channels import GetParticipantsRequest
from telethon.tl.types import ChannelParticipantsSearch
from tqdm import tqdm

config = configparser.ConfigParser()
config.read("config.ini")
api_id: str = config['Telegram']['api_id']
api_hash = config['Telegram']['api_hash']
username = config['Telegram']['username']
client = TelegramClient(username, api_id, api_hash)
client.start()


async def main():
    # Read one chat link per line; iterating the file stops at end of file
    with open("links.txt", "r") as f:
        for line in f:
            url = line.strip()
            if not url:
                continue  # skip blank lines
            try:
                channel = await client.get_entity(url)
                print(url)
                await Users.dump_all_participants(channel,
                                                  ChannelParticipantsSearch,
                                                  client,
                                                  GetParticipantsRequest, tqdm)
            except Exception as e:
                # Report the failing link instead of silently ignoring it
                print(f"Could not process {url}: {e}")


with client:
    client.loop.run_until_complete(main())
File Users.py
async def dump_all_participants(channel,
                                ChannelParticipantsSearch,
                                client,
                                GetParticipantsRequest, tqdm):
    print('Get info from channel', channel.title)
    OFFSET_USER = 0
    LIMIT_USER = 200
    ALL_PARTICIPANTS = []
    FILTER_USER = ChannelParticipantsSearch('')
    # Request participants page by page until an empty page is returned
    while True:
        participants = await client(GetParticipantsRequest(channel,
            FILTER_USER, OFFSET_USER, LIMIT_USER, hash=0))
        if not participants.users:
            break
        ALL_PARTICIPANTS.extend(participants.users)
        OFFSET_USER += len(participants.users)
    # Write one line per participant to "<channel title>.txt"
    with open(f"{channel.title}.txt", "w", encoding="utf-8") as file:
        for participant in tqdm(ALL_PARTICIPANTS):
            try:
                file.write(f" ID: {participant.id} "
                           f" First_Name:{participant.first_name}"
                           f" Last_Name: {participant.last_name}"
                           f" Username: {participant.username}"
                           f" Phone: {participant.phone}"
                           f" Channel: {channel.title} \n")
            except Exception as e:
                print(e)
    print('The end!')
I wish you success!