Bill "The Vest Guy" Penberthy
Posted on May 2, 2022
A ledger database is a NoSQL database that provides an immutable, transparent, and cryptographically verifiable transaction log that is owned by a central authority. The key here is that you never actually change the data in the table. What we generally consider an update, which replaces the old content with the new content, is not applicable when working with a ledger database. Instead, an update adds a new version of the record. All previous versions still exist, so your update never overwrites existing data. The cryptographically verified part comes into place because this ensures that the record is immutable.
Ledger databases are generally used to record economic and financial activity within an organization, such as by tracking credits and debits within an account. Other common use cases are generally around workflows, such as tracking the various steps taken on an insurance claim or tracing the movement of an item through a supply-chain network. For those of you that ever needed to implement audit tables in a relational database – you can start to see how managing that all automatically will be of benefit.
Note: A ledger database may seem very similar to Blockchain. However, there is one significant difference and that is that the ledger database is generally a centralized ledger whereas Blockchain is a distributed ledger. There will be times when the organization, such as a bank of financial organization, is not comfortable with a distributed ledger and instead prefer a simpler architecture with the same guarantee of immutable and verifiable data.
Now that we have discussed, at a high level, the features of a ledger database, let’s take a look at AWS’ version, Amazon QLDB.
Amazon QLDB
In traditional, relational database architecture, the approach is to write data into tables as part of a transaction. This transaction is generally stored in a transaction log and includes all the database modifications that were made during the transaction. This allows you to replay the log and those transactions in the event of a system failure or for data replication. However, generally, those logs are not immutable and generally aren’t designed to allow easy access to users.
This approach is different with QLDB, as the journal is the central feature of the database. In a practical sense, the journal is structurally similar to the transaction log, however, it takes a write-only approach to storing application data with all writes, inserts, updates, and deletes, being committed to the journal first. QLDB then uses this journal to interpret the current set of your data. It does this by materializing that data in queryable, user-defined tables. These tables also provide access into the history of the data within that table, including revisions and metadata.
While a lot of the functionality seems similar, some of the terminology is different. Table 1 shows the mapping between these terms.
Relational Term | QLDB Term |
---|---|
Database | Ledger |
Table | Table |
Index | Index |
Table row | Amazon Ion Document |
Column | Document Attribute |
SQL | PartiQL |
Audit Logs | Journal |
Table 1. Terminology map between RDBMS and QLDB
The key difference is around the difference between the table row and column from a relational database that was replaced with an Amazon Ion Document. Amazon Ion is a richly-typed, self-describing, hierarchical data serialization format that offers interchangeable binary and text representations. The text format is a superset of JSON and is easy to read and work with while the binary representation is efficient for storage and transmission. Ion’s rich type system enables unambiguous semantics for data (e.g., a timestamp value can be encoded using the timestamp type). This support for rich types allows an Ion document to be able to conceptually replace a database row.
Note – The Ion Document format provides a lot of the support lacking in the Amazon Timestream SDK where columns and values are defined separately. This seems to be an unfortunate example of how various service teams seem to never talk to each other nor follow a single standard!!
Creating an Amazon QLDB ledger is very simple. In the AWS console, navigate to Amazon QLDB. Here you will be given the ability to Create ledger. Creating a ledger only requires:
Ledger name – The name of your ledger, needs to be unique by region
Ledger permissions mode – There are 2 choices here, Standard and Allow all. The “Standard” permissions mode is the default and it enables control over the ledger and its tables using IAM. The “Allow all” mode allows any user with access to the ledger to manage all tables, indexes, and data.
Encrypt Data at rest – Where you choose the key to use when encrypting data. There is no ability to opt-out – all data will be encrypted at rest.
Tags – additional label describing your resource
You can also create a table at this time, however, there is no way through the UI for you to be able to add an Index. So, instead, let’s look at how you can do that in .NET as well as add data to your newly created table.
.NET and Amazon QLDB
The first thing you will need to do to work with Amazon QLDB is to add the appropriate NuGet packages. The first one you should add is Newtonsoft.Json, which we will use in our examples to manage serialization and deserialization. The next ones to add are specific to QLDB, AWSSDK.QLDBSession, Amazon.IonObjectMapper, Amazon.QLDB.Driver, and Amazon.QLDB.Driver.Serialization. If you install the Driver.Serialization package, NuGet will install all the other required QLDB packages as dependencies.
The next step is to build an Amazon.QLDB.Driver.IQldbDriver. You do that with a Builder as shown below.
private IQldbDriver qldbDriver;
AmazonQLDBSessionConfig amazonQldbSessionConfig = new AmazonQLDBSessionConfig();
qldbDriver = QldbDriver.Builder()
.WithQLDBSessionConfig(amazonQldbSessionConfig)
.WithLedger(ledgerName)
.Build();
Note how you can not just “new” up a driver, and instead must use dot notation based upon the Builder() method of the QldbDriver. Next, you need to ensure that the table and applicable indexes are set up. See the code below that contains a constructor that sets up the client and calls the validation code.
private IQldbDriver qldbDriver;
private IValueFactory valueFactory;
private string ledgerName = "ProdDotNetOnAWSLedger";
private string tableName = "Account";
public DataManager()
{
valueFactory = new ValueFactory();
AmazonQLDBSessionConfig amazonQldbSessionConfig =
new AmazonQLDBSessionConfig();
qldbDriver = QldbDriver.Builder()
.WithQLDBSessionConfig(amazonQldbSessionConfig)
.WithLedger(ledgerName)
.Build();
ValidateTableSetup(tableName);
ValidateIndexSetup(tableName, "AccountNumber");
}`
As you can see below, checking for the existence of the table is simple as the driver has a **ListTableNames** method that you can use to determine the presence of your table. If it doesn’t exist, then process the query to create it.
Let’s pull that method out and examine it, because this is how you will do all your interactions with QLDB, by executing queries.
`private void ValidateTableSetup(string tableName)
{
if (!qldbDriver.ListTableNames().Any(x => x == tableName))
{
qldbDriver.Execute(y => { y.Execute($"CREATE TABLE {tableName}"); });
}
}
The Execute method accepts a function, in this case, we used a lambda function that executes a “CREATE TABLE” command. The code takes a similar approach to creating the index, however, you have to go through more steps to be able to determine whether the index has already been created as you have to query a schema table first and then parse through the list of indexes on that table. That code is shown below.
private void ValidateIndexSetup(string tableName, string indexField)
{
var result = qldbDriver.Execute(x =>
{
IIonValue ionTableName = this.valueFactory.NewString(tableName);
return x.Execute($"SELECT * FROM information_schema.user_tables WHERE name = ?", ionTableName);
});
var resultList = result.ToList();
bool isLIsted = false;
if (resultList.Any())
{
IIonList indexes = resultList.First().GetField("indexes");
foreach (IIonValue index in indexes)
{
string expr = index.GetField("expr").StringValue;
if (expr.Contains(indexField))
{
isLIsted = true;
break;
}
}
}
if (!isLIsted)
{
qldbDriver.Execute(y => y.Execute($"CREATE INDEX ON {tableName}({indexField})"));
}
}
As mentioned earlier, the save is basically a query that is executed. A simple save method is shown below:
public void Save(object item)
{
qldbDriver.Execute(x =>
{
x.Execute($"INSERT INTO {tableName} ?", ToIonValue(item));
}
);
}
private IIonValue ToIonValue(object obj)
{
return IonLoader.Default.Load(JsonConvert.SerializeObject(obj));
}
This example, as well as the one before this one where we did a check to see whether an index exists, both have some dependence on Ion. Remember, Ion is based upon richly-typed JSON, so it expects values to be similarly represented – which is why there are two different approaches for converting an item to an IonValue, even something as simple as the tableName, for comparison. The first of these is through the Amazon.IonDotnet.Tree.ValueFactory object as shown in the index validation snippet while the other is through the Amazon.IonDotnet.Builders.IonLoader as shown in the ToIonValue method.
Once you have the ledger, table, and any indexes you may want set up, the next step is to save some data into the table. This is done by executing a SQL-like command as shown below.
public void Save(object item)
{
qldbDriver.Execute(x =>
{
x.Execute($"INSERT INTO {tableName} ?",
ToIonValue(item));
}
);
}
Getting it out of the database is a little bit trickier as you want it to be cast into the appropriate data model. The approach for retrieving an item from the database and converting it into a plain old class object (POCO) is shown below.
public List<T> Get<T>(string accountNumber)
{
IIonValue ionKey = valueFactory.NewString(accountNumber);
return qldbDriver.Execute(x =>
{
IQuery<T> query = x.Query<T>(
$"SELECT * FROM {tableName} as d
WHERE d.AccountNumber = ?", ionKey);
var results = x.Execute(query);
return results.ToList();
});
}
You can see that this is a two-step process. First, you create a generic query that is defined with the SELECT text and the appropriate model for the return set. You then Execute that query. This brings you back an Amazon.QLDB.Driver.Generic.IResult object that can be converted to return a list of the requested items matching the AccountNumber that was passed into the function.
That’s a quick review of the general functionality of QLDB. Let’s look at one more specific case, the history of an item – one of the prime drivers of using a ledger database. You do this by using the history function, a PartiQL extension that returns revisions from the system-defined view of your table. The syntax for using the history function is
SELECT * FROM history(table,`start-time`,`end-time` ] ] ) AS h
[ WHERE h.metadata.id = 'id' ]
The start time and end time values are optional. Using a start time will select any versions that are active when the start time occurs and any subsequent versions up until the end time. If no start time is provided, then all versions of the document are retrieved, and the current time is used in the query if the end time value is not explicitly defined. However, as you can see, the optional WHERE clause is using a metadata.id value – which we have never talked about. This metadata.id value is used by QLDB to uniquely identify each document and can be accessed through the metadata of the document. One way to get this information is shown below:
public string GetMetadataId(Account item)
{
IIonValue ionKey =
valueFactory.NewString(item.AccountNumber);
IEnumerable<IIonValue> result = new List<IIonValue>();
qldbDriver.Execute(x =>
{
result = x.Execute($"SELECT d.metadata.id as id
FROM _ql_committed_{tableName} as d
WHERE d.data.AccountNumber = ?", ionKey);
});
IIonValue working = result.First();
return working.GetField("id").StringValue;
}
This code snippet accesses the “committed” view, one of the different views provided by QLDB that provides access to information about your data. Taking the result from that query, and plugging it into the previous snippet where we queried using the history function will get you the history of that document for the time period defined in the query. Thus, you can access the current version as well as have access to all the history; all by executing SQL-like commands.
There is a lot more we can go into about using Amazon QLDB and .NET together, probably a whole other book! To recap, however, Amazon QLDB is a ledger database, which means it provides an immutable, transparent, and cryptographically verifiable transaction log that you can access and evaluate independently from the active version.
Posted on May 2, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.