For readers who may not be familiar, what are TiDB and TiKV?
Sri Satya
Posted on May 24, 2021
(LT): TiDB and TiKV are the two biggest projects for us right now! TiDB, our flagship product, is a highly scalable, distributed, cloud-native NewSQL database. It’s totally open source, written in Go, and supports hybrid transactional/analytical processing (HTAP) workloads. While it supports MySQL well enough to operate products like WordPress and Confluence without patches, it also provides features such as horizontal scalability, strong consistency, and high availability. It has been battle-tested in production by over 500 enterprises across multiple industries.
Behind every TiDB cluster lies a TiKV cluster. TiKV is a distributed transactional key-value database originally created by PingCAP as the underlying storage engine for TiDB. It is now adopted as a Cloud Native Computing Foundation (CNCF) Incubating Project. TiKV is developed with the intention of creating a common cloud-native data substrate. It works alongside PD, our coordinator, to keep the cluster in order and tame the chaos of distributed transactions.
Over the last few years, we’ve started to see this vision come to fruition, and we’re very excited to have such vibrant community involvement in our projects. From projects like Titan, TiPrometheus, and Tidis to tools like the tikv-browser, we’re seeing more and more people adopt our ideas and technology, and we couldn’t be happier.
How does TiKV joining the CNCF affect that?
(LT): Well, we built TiKV with the intention of it being a building block for other technologies, not just TiDB and TiSpark. The CNCF’s interest only confirmed our aspiration, and we couldn’t be more excited to have them steward the project and help us foster its growth.
TiKV—and ultimately all of our projects—has benefitted immensely from all of the knowledge and mentoring we’ve received from the foundation. While some companies treat foundations like graveyards, we’re treating this as a long-term investment, and our TiKV team has only grown since the CNCF adopted TiKV. We hope to continue this trend, growing both our team at PingCAP and our community maintainership.
Continuing on the thread of TiKV, how does that compare with something like FoundationDB?
We think FoundationDB is way cool and we’ve been admiring their work for years now. We have an ongoing discussion about how and where FoundationDB and TiKV differ but, ultimately, we think our users are choosing TiKV for the vibrant ecosystem around it, the rock-solid guarantee of enterprise-level support, and our legacy of open source stewardship.
Technically, their transaction models differ: FoundationDB uses Paxos for metadata, replicating logs to all replicas, while TiKV uses Multi-Raft for all its data. While Raft is essentially just a limited form of Paxos, research has shown that Raft is considerably easier for operators to reason about, which means fewer mistakes when the network is in chaos.
TiKV’s coprocessor, when harnessed by query layers like TiDB, offers a distinct advantage for users looking for an infrastructure building block, not just a key value store.
What should users know if they want to run TiKV and TiDB themselves?
(LT): Users should be prepared for a distributed system. They’ll notice that setting up a TiDB cluster is considerably more complex than setting up, say, a MySQL master and replica, but this complexity will pay off when they’re scaling from one to hundreds of nodes.
We’d also love to invite folks to come and chat with our community! We can help you through the entire lifecycle with TiDB, TiKV, TiSpark, or anything else. PingCAP offers comprehensive support for all our products, from evaluation and design, to proof-of-concept and benchmarking, through to deploying them in production, migrating your data, and performing a seamless handoff. We also have managed TiDB clusters you can try out without provisioning any hardware!
Transactions
Transactions have been part of database systems for decades. In distributed databases they are the key mechanism for supporting properties such as isolation and consistency (well known from the ACID properties and the CAP theorem). In other words, they make it possible to manage a distributed database in a similar way to a non-distributed one.
TiKV supports distributed transactions based on Percolator, implemented using MVCC and a collaborative protocol between the TiKV server and its client. Using the transactional API, TiKV guarantees Snapshot Isolation and Linearizability. The transaction sub-system also handles the scheduling of command execution, local concurrency control, and efficient execution of reads and scans.
A strong transaction system is essential to making TiKV fast and correct.
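To make the idea of MVCC with Snapshot Isolation more concrete, here is a toy, single-process sketch in Python. It is not TiKV's implementation and does not use any TiKV API: the names ToyMvccStore, ToyTransaction, and WriteConflict are hypothetical, the integer counter merely stands in for a real timestamp source, and the Percolator-style locking and two-phase commit that a distributed system needs are deliberately left out. It only illustrates two behaviours: a transaction reads from the snapshot taken at its start timestamp, and a commit is rejected if another transaction committed to the same key after that snapshot.

```python
from typing import Dict, List, Optional, Tuple


class WriteConflict(Exception):
    """Raised when a key was committed by someone else after our snapshot."""


class ToyMvccStore:
    """Hypothetical in-memory store keeping every committed version of a key."""

    def __init__(self) -> None:
        # key -> list of (commit_ts, value), appended in increasing commit_ts order
        self._versions: Dict[str, List[Tuple[int, str]]] = {}
        self._clock = 0  # assumption: a local counter stands in for a timestamp source

    def next_ts(self) -> int:
        self._clock += 1
        return self._clock

    def read(self, key: str, ts: int) -> Optional[str]:
        """Return the newest value committed at or before ts, or None."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None

    def latest_commit_ts(self, key: str) -> int:
        versions = self._versions.get(key)
        return versions[-1][0] if versions else 0

    def append_version(self, key: str, commit_ts: int, value: str) -> None:
        self._versions.setdefault(key, []).append((commit_ts, value))

    def begin(self) -> "ToyTransaction":
        return ToyTransaction(self, self.next_ts())


class ToyTransaction:
    """Buffers writes locally and validates them at commit time."""

    def __init__(self, store: ToyMvccStore, start_ts: int) -> None:
        self.store = store
        self.start_ts = start_ts
        self.writes: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        # A transaction sees its own buffered writes, then its snapshot.
        if key in self.writes:
            return self.writes[key]
        return self.store.read(key, self.start_ts)

    def put(self, key: str, value: str) -> None:
        self.writes[key] = value

    def commit(self) -> int:
        # Reject the commit if any written key changed after our snapshot.
        for key in self.writes:
            if self.store.latest_commit_ts(key) > self.start_ts:
                raise WriteConflict(key)
        commit_ts = self.store.next_ts()
        for key, value in self.writes.items():
            self.store.append_version(key, commit_ts, value)
        return commit_ts
```

In this sketch, if two transactions begin against the same data and both write the same key, the first to commit succeeds and the second raises WriteConflict, while reads inside the second transaction keep returning the value from its own snapshot.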
The Transaction SIG
The Transaction SIG is a group for people interested in transactions in TiKV or distributed transactions in general. In addition to working on the implementation in TiKV (including testing, modelling, documenting, and bug-finding, as well as writing code), the SIG aims to be a place for people to discuss how to use transactions, understand how transactions work and how to make the best use of them, and keep up to date with transaction-related research.