🚀 YSQL-DocDB calls with yb_debug_log_docdb_requests
Franck Pachot
Posted on July 15, 2022
YugabyteDB is a distributed SQL database:
- distributed refers to the lower half, called DocDB - a distributed document store with support for global transactions with all ACID properties
- SQL refers to one of the API, called YSQL, which reuses PostgreSQL.
I've detailed this in Distributed SQL Deep Dive: Inside YugabyteDB’s Two-Layer Architecture but if you want to learn with hands-on lab, I guess you want to see what happens between those two layers. Here is how I do it. I'm going to verify the issue #12553 which is not an issue as it has already been optimized a long time ago.
I start a single-node lab:
docker run -it yugabytedb/yugabyte:latest bash
yugabyted start
ysqlsh -h $(hostname)
I tail -f
the PostgreSQL log:
\pset tuples_only on
\o .yb-tail-logs.sql
select format('tail -F %s&',pg_current_logfile());
\pset tuples_only off
\o
\! sh .yb-tail-logs.sql
Now I'm ready to set yb_debug_log_docdb_requests
and see the result on my screen but, before, let's create a demo table:
drop table if exists demo;
create table demo (a int, b int);
create index demon on demo(a asc);
insert into demo select generate_series(1,10);
select count(*) from demo;
SeqScan
Now, I enable the logging of all requestes from YSQL to DocDB and run the same count(*)
:
explain (costs off, analyze) select count(*) from demo;
set yb_debug_log_docdb_requests=true;
select count(*) from demo;
set yb_debug_log_docdb_requests=false;
I run it once without the additional logging because I don't want to see the calls to get the metadata from the yb-master
. They are cached, and the the second run will show only the calls to the yb-tserver
(s). It is also the occasion to show the plan, this is a Seq Scan
.
The following log is displayed:
I0715 14:25:09.197710 336 pg_session.cc:240] Applying operation: { READ active: 1 read_time: { read: <invalid> local_limit
: <invalid> global_limit: <invalid> in_txn_limit: <invalid> serial_no: 0 } request: client: YQL_CLIENT_PGSQL stmt_id: 58265696 schema_version: 2 targets { tscall { opcode: 4
operands { value { int64_value: 0 } } } } column_refs { } is_aggregate: 1 limit: 1024 return_paging_state: 1 ysql_catalog_version: 22 table_id: "000033e10000300080000000000
04016" }
We have one call here. The number of calls is important in a distributed database because the YSQL location is where you are connected (the PostgreSQL backend) but DocDB is one of the yb-tserver
(s) of your YugabyteDB cluster. This can involve some latency, so you want the minimum of roundtrips. Don't worry, our engineers do the best here.
Here the operation is a READ (it counts as 1 in the Read ops/s
you see in the monitoring) and the request is YQL_CLIENT_PGSQL. YQL stands for YugabyteDB Query Language, which is a client for DocDB, and this one is from PGSQL, the PostgreSQL API, running from the PostgreSQL backend.
We recognize the table_id
which I described in a previous post. Here is how I get the list of my tables adn indexes with their DocDB table_id
:
yugabyte=#
select
'0000'||lpad(to_hex(datoid::int),4,'0')
||'0000'||'3000'||'8000'
||'00000000'||lpad(to_hex(reloid::int),4,'0')
as uuid, datname, relname, relkind, amname, nspname from
(select oid reloid, relname, relnamespace, relkind, relowner, relam from pg_class) rel
natural left join (select nspname,oid relnamespace from pg_namespace) ns
natural left join (select amname,oid relam from pg_am) am
cross join (select datname,oid datoid from pg_database where datname=current_database()) dat
where nspname not in ('pg_catalog','information_schema')
order by relkind desc, 1
;
uuid | datname | relname | relkind | amname | nspname
----------------------------------+----------+---------+---------+--------+---------
000033e1000030008000000000004016 | yugabyte | demo | r | | public
000033e1000030008000000000004019 | yugabyte | demon | i | lsm | public
(2 rows)
Index Only Scan
With a range condition on the indexed column, and querying only the indexed column, I have an Index Only Scan
:
explain (costs off, analyze) select a from demo where a>0;
set yb_debug_log_docdb_requests=true;
select a from demo where a>0;
set yb_debug_log_docdb_requests=false;
This is a range scan on the index. We can see it in the EXPLAIN, but it is clear from the call, knowing the index table_id
:
I0715 14:26:26.456454 336 pg_session.cc:240] Applying operation: { READ active: 1 read_time: { read: <invalid> local_limit: <invalid> global_limit: <invalid> in
_txn_limit: <invalid> serial_no: 0 } request: client: YQL_CLIENT_PGSQL stmt_id: 56172640 schema_version: 0 targets { column_id: 0 } targets { column_id: 1 } column_refs { id
s: 0 ids: 1 } is_forward_scan: 1 is_aggregate: 0 limit: 1024 return_paging_state: 1 ysql_catalog_version: 22 table_id: "000033e1000030008000000000004019" condition_expr { co
ndition { op: QL_OP_AND operands { condition { op: QL_OP_GREATER_THAN_EQUAL operands { column_id: 0 } operands { value { int32_value: 0 } } } } } } upper_bound { key: "~~~!"
is_inclusive: 1 } col_refs { column_id: 0 attno: 1 } col_refs { column_id: 1 attno: -101 } partition_key: "48800000000021" }
There is a range condition here, QL_OP_GREATER_THAN_EQUAL, with a lower bound (value { int32_value: 0 }
) on the first column of the index ({ column_id: 0 }
). What I wanted to verify here is that there is only one call.
Index Scan
By selecting all columns, it is to read the table in addition to the index:
explain (costs off, analyze) select * from demo where a>0;
set yb_debug_log_docdb_requests=true;
select * from demo where a>0;
set yb_debug_log_docdb_requests=false;
I have two calls to DocDB:
I0715 14:27:36.410634 336 pg_session.cc:240] Applying operation: { READ active: 1 read_time: { read: <invalid> local_limit: <invalid> global_limit: <invalid> in
_txn_limit: <invalid> serial_no: 0 } request: client: YQL_CLIENT_PGSQL stmt_id: 58462304 schema_version: 0 targets { column_id: 1 } column_refs { ids: 1 } is_forward_scan: 1
is_aggregate: 0 limit: 1024 return_paging_state: 1 table_id: "000033e1000030008000000000004019" condition_expr { condition { op: QL_OP_AND operands { condition { op: QL_OP_
GREATER_THAN_EQUAL operands { column_id: 0 } operands { value { int32_value: 0 } } } } } } upper_bound { key: "~~~!" is_inclusive: 1 } col_refs { column_id: 1 attno: -101 }
partition_key: "48800000000021" }
I0715 14:27:36.411345 336 pg_session.cc:240] Applying operation: { READ active: 1 read_time: { read: <invalid> local_limit: <invalid> global_limit: <invalid> in_txn_limit:
<invalid> serial_no: 0 } request: client: YQL_CLIENT_PGSQL stmt_id: 56172640 schema_version: 2 targets { column_id: 1 } targets { column_id: 2 } targets { column_id: -8 } c
olumn_refs { ids: 1 ids: 2 } is_forward_scan: 1 is_aggregate: 0 limit: 1024 return_paging_state: 1 ysql_catalog_version: 22 table_id: "000033e1000030008000000000004016" ybct
id_column_value { value { binary_value: "47DC00532F3E68ABEAA545879804CD13CF0C340E00002121" } } batch_arguments { order: 0 ybctid { value { binary_value: "47DC00532F3E68ABEAA
545879804CD13CF0C340E00002121" } } } batch_arguments { order: 1 ybctid { value { binary_value: "475DF9530C646C9C79FE4B1C84AC32D590FF102600002121" } } } batch_arguments { ord
er: 2 ybctid { value { binary_value: "47DDFD53A6B8D58EFAE94FFD9FC1CBD136BDBBAF00002121" } } } batch_arguments { order: 3 ybctid { value { binary_value: "4758C053A4D995B0370B
4C9FB1E97520A0992C6A00002121" } } } batch_arguments { order: 4 ybctid { value { binary_value: "4792BC536CA73C978AE34FD1A1A1432B278DC18B00002121" } } } batch_arguments { orde
r: 5 ybctid { value { binary_value: "472C1253F1CD65ABE9C64B8380E8DCF1B2FC67D400002121" } } } batch_arguments { order: 6 ybctid { value { binary_value: "470EAB53DB8CD72BFC014
44CA2E88A04287B938300002121" } } } batch_arguments { order: 7 ybctid { value { binary_value: "479B3E53BA1156DB6C9C4AF09C7352C17B8ECDDC00002121" } } } batch_arguments { order
: 8 ybctid { value { binary_value: "47786553B15A4CA138624D70824263AAE1AAE71600002121" } } } batch_arguments { order: 9 ybctid { value { binary_value: "47341353692414BBF3934B
468B103C2153519D3200002121" } } } col_refs { column_id: 1 attno: 1 } col_refs { column_id: 2 attno: 2 } partition_key: "DC00" }
In addition to the request to the index, similar to the previous one, but returning the primary key of the table, I see one call to get the additional columns from the table. I can see that the 10 primary key have been grouped as 10 batch_arguments
. So that's what I wanted to verify. In YugabyteDB secondary indexes global, consistent, and high performance.
Posted on July 15, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.