Storing currency values: data types, caveats, best practices

cardinalby

Cardinal

Posted on January 10, 2023

Storing currency values: data types, caveats, best practices

Intro

Repeatedly facing questions, debates, and mistakes related to storing and representing currency amounts, I decided to collect all facts and advice regarding this topic in one place. This article is not the final source of the truth, but it contains useful information that you should take into consideration when designing software.

Domain info: currencies

Just facts:

There is an ISO 4217 standard that describes currency codes and their minor units. Data are also available in XML and CSV representations (following the links from the page).

  1. A currency has a 3-letter code, a numeric code, and a name. A currency may have multiple listed locations.
  2. Some currencies have exchange rates that are pegged (fixed) to another currency.
  3. Minor unit is the smallest unit of a currency, e.g. 1 dollar equals 100 cents (with 2 decimals).
  4. Most currencies have 2 decimals; some have none, and some have 3 decimals.
  5. Mauritania and Madagascar does not use a decimal division of units, setting 1 ouguiya = 5 khoums, ariary = 5 iraimbilanja.
  6. Cryptocurrencies can have up to 18 decimals (ETH).
  7. The number of decimals can change over time due to inflation.
  8. The same can happen because of redenomination, but a new currency code should be introduced.
  9. For some currencies, there are no physical denominations for the minor unit.
  10. Storing prices of small units of goods (probably as a result of conversion from another currency) can require using more decimals than are defined for a currency.

Storage requirements

  1. Obvious one: store currency amounts along with a link to the currency specification (foreign key in databases, special class in programming languages) to interpret and operate with it correctly.
  2. Storing a specification for a currency you should include:
    • Minimum accountable unit instead of or in addition to precision (see fact 5).
    • Lowest physical denomination of the currency if you deal with cash operations (see fact 9).
  3. Ensure precision for currency amounts equals the max precision of all supported currencies.
  4. Consider adding additional precision for operational needs: accumulators and intermediate calculations or for storing prices of small units of goods. For example, you may want to accumulate a 10 % fee from 1 cent operations, sum them up until they reach the minimum accountable unit (cent) and withdraw from a client.

Data types

There are different data types that can technically store money values. Let’s see how the listed requirements can be fulfilled by utilizing different data types.

1️⃣ Integer number of minor units

One of the popular (Stripe approaches) is storing an integer number of minor units. Simply put, you store 5 $ as 500 cents. This way you can do accurate calculations and comparisons internally and then display the result formatting the number in a proper way as an amount of dollars.

Taking into consideration the requirement about additional precision (let's say 3 extra decimals) you will represent 5 $ as 500 000 of "micro units", where 1 micro unit equals 1 000 cents.

Issues and limitations:

  • It's preferable to consider the precision beforehand.
  • It's complicates the business logic of the application and introduces error-prone value conversions between units, micro-units and normal currency amounts that are presented to a user or external systems.
  • Due to the fact 7 (minor unit of a currency can change) or because of the need to add additional precision you may need to rescale all values in the future.
  • External systems you interact with can misinterpret the magnitude of an integer-represented value after rescaling:
    • 3rd-party services/customers which are not aware of the rescaling.
    • Your own services that can't be immediately updated with the new definitions of the currencies (limitation of the deployment process, caches).
    • A message queue, or an event streaming platform where you can't modify messages during the rescaling process.
  • This problem can be solved by explicitly passing the scale of a number everywhere along with the number.

Suggested type: BigInt

It's a not-native data type that handles integers of arbitrary (or big enough) size and provides math operations on it. It's the most suitable way of storing minor units or micro units. Don't confuse it with SQL bigint that in fact is Int64.

Most of the languages (
JavaScript,
PHP,
Go,
Python,
Java,
C#,
C++
) support it natively or with 3rd-party libraries.
To store these values in databases you should use Decimal column type with precision = 0 (
SQL databases,
MongoDB
).

For correct (de)serialization you will probably need to use strings for compatibility between different implementations.

Usually, more memory is required by BigInt type, compared to native integers, and computations take longer because CPUs don't have hardware support for this data type.

Actually, it can depend on the binary representation of a value in a concrete implementation.
Normally, standard libraries provide optimal
(2^32-base, or 2^64-base) representation and have only constant overhead, while 3rd-party string-based
(C++, PHP) types have linear
overhead (due to 10-base representation). In most databases the binary representation of decimal type is not
optimal (
100-base or
10000-base).

With some concerns: Int64

Some choose signed or unsigned Int64 (also referenced as BigInt in SQL, which can lead to confusion) for storing their cents or smaller subunits of currency. Even though it may seem to be sufficient for your use-case, I want to highlight the following issues with it:

  • It may not be sufficient for storing minimal units of cryptocurrencies (see fact 6) or values with extended precision (see requirement 5).
  • Some languages require casting to Float64 for math operations (Go). The problem is Float64 has only 52 bits of mantissa; it's not enough to fit arbitrary Int64 value.
  • Not all programming languages support Int64 values:
    • PHP running on x32 architectures cannot handle Int64 values.
    • Some languages (Java, PHP) do not support any unsigned integers.
    • JavaScript uses signed Float64 as an internal representation for the number data type. This means that even if one can serialize Int64 numbers to JSON in their backend application, by default a JavaScript application will overflow its number type trying to deserialize JSON containing this value.

The problem with support of Int64 in external systems can be mitigated by serializing Int64 to a string and using BigInt types to handle these values, but it reduces the benefits from using hardware-supported Int64 values.

Int64 in JavaScript

The problem with JavaScript numeric type is relevant if you use JSON.parse() and JSON.stringify() without
additional arguments - as many "HTTP request" libraries do. If you have control over these calls you can pass custom
replacer
argument for JSON.stringify() and custom
reviver
argument for
JSON.parse()
and manually handle Int64 values using a data type different from default number.

What could this "different" type be?

2️⃣ Decimal

This approach implies using special Decimal numeric type that allows to store fractional decimal numbers accurately with the specified precision (maximum precision differs in different databases).

Most languages (
JavaScript,
PHP,
Go,
Python,
Java,
C#,
C++
) have built-in support or 3rd-party libraries for handling this data type.
SQL databases offer own standard Decimal
type.

Note that there are 2 different types of decimal type implementations:

  • Decimal128 with limited number of significant digits (MongoDB, C#).
  • So-called BigDecimal type of arbitrary (or big enough) size (SQL Decimal, BigDecimal in Java) with different internal representations.

As in the case of BigInt, the binary representation can differ:

  • 2 decimal digits in each byte (base 100, like it is done in databases).
  • BigInt with exponent in a manner similar to base 2 floating point values.
  • Array or string of single decimal digits.

The performance concerns described above for BigInt are relevant for Decimal types as well.

The main advantages of using Decimals comparing to BigInts are:

  • No major overhead if values are already stored as Decimals in the database (you'll do it anyway even with BigInts). Also, formatting BigInt values requires computations similar to those within Decimal type.
  • More intuitive representation of the values simplifies the business logic and formatting numbers to a human-readable format.
  • Changing minor units can be done by altering the precision of the decimal column in the database; you don't have to rescale the values.
  • Value serialized to string naturally includes the information about precision. If you change the precision, it cannot be misinterpreted by external systems as in the case of BigInts.

As in the case of BigInt values, all serialization goes through decimal strings for compatibility between different libraries. Of course, serializing in the form of mantissa + exponent can be more efficient for performance-sensitive applications.

However, you still need to keep track of minimal accountable units. Limiting the precision of Decimal in the database requires remembering about precision reserve for intermediate operations and accumulators.

Bad choice

Float, Double

The first rule here is "never use floating-point data types for storing money amounts". Humans expect money calculations to be made in base 10, but floating-point arithmetic uses base 2 representation that can lead to results that are not expected in financial sense. The common example to illustrate the problem is 0.1 + 0.2 != 0.3. This rule is relevant for both programming languages and databases.

Even though decimal representation also can't store all amounts precisely (e.g. 1/3 - read bonus part at the end),
it's expected by people, financial institutions and regulations.

SQL Server MONEY

This proprietary type of Microsoft SQL Server stores amounts as Int64 internally and
is not recommended
to use.

Bonus: rational numbers

Rational number is a number that can be expressed as the fraction A/B. In most applications, their use is not necessary, however, for some use cases, they are the only option for obtaining the desired precision.

I can imagine a game or an expenses splitting application in which you need to take 1/3 of 10 $, save it (to a database, message queue, or simply to memory) and later multiply it by 30. Using decimals (both Decimal type and BigInt type with the number of cents) can't provide exact precision - the result will never be exactly 100. But using rational numbers, you can save the intermediate value as numerator = 10, denominator = 3 and later do simple arithmetic to get the value 100/1.

Operations on non-decimal units, such as time and distance units, historical non-decimal currencies, calculating probabilities, etc., may be candidates for introducing rational numbers when absolute precision is required.

There is standard Fractions library in Python, and 3rd-party libraries in other languages:
JavaScript,
PHP,
Go,
C#,
C++.

Unfortunately, the only database I know to be supporting rational numbers is PostgreSQL with pg_rational extension. Storing rational numbers in separate "numerator" and "denominator" columns limits the possibilities of math calculations directly in the database.

👏 Thank you for reading

Any comments, criticism, and sharing of your own experience would be appreciated!

💖 💪 🙅 🚩
cardinalby
Cardinal

Posted on January 10, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related