When it comes to configuration files, TOML (Tom’s Obvious, Minimal Language) is a popular choice because it’s simple and easy to use. It’s great for projects that need human-readable configs, and with support for various data types—including integers—it’s pretty versatile too.
TOML Integers
Integers in TOML are whole numbers. They can be:
- Positive (e.g., +99, 42)
- Negative (e.g., -17)
- Zero (e.g., 0, +0, -0 → all are the same)
int1 = +99
int2 = 42
int3 = 0
int4 = -17
Note – Positive integers can be prefixed with a plus sign, while negative integers are always prefixed with a minus sign. Additionally, the integer values -0 and +0 are valid and identical to an unprefixed zero.
Large Numbers with Underscores
For large numbers, underscores (_) may be used between digits to enhance readability. Each underscore must have at least one digit on both sides, so an underscore cannot appear at the start or end of a number, or next to another underscore (e.g., 1_000_000 is valid, but _1000 is not).
int1 = 1_000 # Same as 1000
int2 = 5_349_221 # More readable
int3 = 53_49_221 # Indian-style grouping
int4 = 1_2_3_4_5 # Allowed, but not recommended
We’ll look at a few examples of valid and invalid numbers in just a moment. But first, what exactly is Indian-style grouping? As an Indian, I’m pretty familiar with it, but I realize that not everyone is, so let me break it down. By the way, I’m really proud of my heritage, and I always enjoy sharing a bit of our culture, ethics, and values with the world whenever I can. I think we all feel that way, right?
So, what is Indian number system grouping?
The Indian number system is a way of grouping numbers that is different from the international system used in many other countries. The key difference lies in how large numbers are grouped and named.
Grouping:
- First three digits: The first three digits from the right are grouped together (ones, tens, hundreds).
- Subsequent groups: After the first three digits, the remaining digits are grouped in sets of two.
For example:
- 1,00,000 (one lakh) instead of 100,000
- 10,00,000 (ten lakh) instead of 1,000,000
- 1,00,00,000 (one crore) instead of 10,000,000
Commas are used to separate the groups of digits. The first comma is placed after the first three digits from the right, and subsequent commas are placed after every two digits.
The number 123,456,789 would be written as 12,34,56,789 in the Indian number system.
In short, the Indian number system has a unique way of grouping digits and uses specific place values to represent large numbers. It is used in India and in neighboring South Asian countries such as Nepal, Bangladesh, and Pakistan.
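The grouping rule is easy to express in code. The sketch below is only an illustration: indian_group is a hypothetical helper name, not part of TOML or any library. It keeps the last three digits together and splits everything before them into pairs, joined by whatever separator you pass in.

```python
def indian_group(n: int, sep: str = "_") -> str:
    """Group the digits of n Indian-style: last three, then pairs."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    if len(digits) <= 3:
        return sign + digits
    head, tail = digits[:-3], digits[-3:]
    # Split the remaining digits into pairs, working from the right.
    pairs = []
    while head:
        pairs.insert(0, head[-2:])
        head = head[:-2]
    return sign + sep.join(pairs + [tail])

print(indian_group(5349221))         # 53_49_221
print(indian_group(123456789, ","))  # 12,34,56,789
```

With the default underscore separator, the output can be pasted straight into a TOML file as an integer value.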
Valid Integers
positive_int = 42
negative_int = -100
large_number = 1_000_000 # Readable format
Invalid Integers
invalid_int = 1.0 # Not an integer (it's a float)
leading_zero = 007 # Leading zeros are NOT allowed
Leading Zeros: TOML does not allow integers to start with 0 unless the value is 0 itself.
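The rules covered so far (an optional sign, underscores only between digits, no leading zeros) fit into one regular expression. This is a rough sketch for checking decimal integers only, not a replacement for a real TOML parser; the names DEC_INT and is_toml_decimal_int are made up for this example.

```python
import re

# Optional sign, then either a lone 0 or a nonzero digit followed by
# more digits, each optionally preceded by a single underscore.
DEC_INT = re.compile(r"[+-]?(?:0|[1-9](?:_?[0-9])*)")

def is_toml_decimal_int(text: str) -> bool:
    """Return True if text is spelled like a valid TOML decimal integer."""
    return DEC_INT.fullmatch(text) is not None

candidates = ["42", "-100", "1_000_000", "+0", "007", "1.0", "_1000", "1000_", "1__000"]
for candidate in candidates:
    print(f"{candidate!r}: {is_toml_decimal_int(candidate)}")
```

The first four candidates are accepted and the rest are rejected. The check looks only at spelling, so it says nothing about whether the value fits in 64 bits.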
Other Number Formats
TOML supports hexadecimal, octal, and binary for non-negative integers.
- Hexadecimal (base 16) → 0x prefix
- Octal (base 8) → 0o prefix
- Binary (base 2) → 0b prefix
Key Rules:
- No + sign is allowed at the beginning.
- Leading zeros are allowed (after the prefix).
- Hexadecimal values are not case-sensitive (uppercase and lowercase letters work the same).
- Underscores (_) can be used to improve readability, but not between the prefix and the number.
# Hexadecimal (Base 16) – Prefix: 0x
hex1 = 0xDEADBEEF # Same as 3735928559 in decimal
hex2 = 0xdeadbeef # Same as above, case-insensitive
hex3 = 0xdead_beef # Same as above, underscore for readability
# Octal (Base 8) – Prefix: 0o
oct1 = 0o01234567 # Same as 342391 in decimal
oct2 = 0o755 # Common in Unix file permissions
# Binary (Base 2) – Prefix: 0b
bin1 = 0b11010110 # Same as 214 in decimal
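If you want to double-check the decimal values in those comments, a few lines of Python will do it. The sketch below is a simplified illustration of the key rules, not a full parser; parse_prefixed_int and FORMATS are hypothetical names chosen for this example.

```python
import re

# prefix -> (base, character class of allowed digits)
FORMATS = {
    "0x": (16, "0-9A-Fa-f"),
    "0o": (8, "0-7"),
    "0b": (2, "01"),
}

def parse_prefixed_int(text: str) -> int:
    """Convert a 0x / 0o / 0b integer literal to its integer value."""
    prefix, body = text[:2], text[2:]
    if prefix not in FORMATS:
        # Also rejects a leading sign such as "+0xff".
        raise ValueError(f"unknown prefix in {text!r}")
    base, digits = FORMATS[prefix]
    # At least one digit; single underscores only between digits,
    # so "0x_ff" (underscore right after the prefix) is rejected.
    if not re.fullmatch(rf"[{digits}](?:_?[{digits}])*", body):
        raise ValueError(f"invalid digits in {text!r}")
    return int(body.replace("_", ""), base)

print(parse_prefixed_int("0xdead_beef"))  # 3735928559
print(parse_prefixed_int("0o01234567"))   # 342391
print(parse_prefixed_int("0b11010110"))   # 214
```

As a bonus, parse_prefixed_int("0o755") returns 493, which is the decimal value behind the familiar Unix permission bits.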
Integer Limits
- TOML supports 64-bit signed integers (from -2^63 to 2^63 - 1), meaning numbers can range from −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
- If a number goes beyond this range, a 64-bit integer cannot store it correctly (some digits would be lost).
- In such cases, the parser must report an error instead of trying to adjust or round the number.
Valid 64-bit signed integers
small_number = -9223372036854775808 # Minimum value
large_number = 9223372036854775807 # Maximum value
Invalid (too large or too small)
too_large = 9223372036854775808 # Error! Exceeds the maximum limit
too_small = -9223372036854775809 # Error! Below the minimum limit
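If you try to use a number outside this range, a parser that cannot represent it exactly must stop and report an error instead of storing an incorrect value.
Please note that some TOML parsers accept values beyond this limit, typically because their host language has arbitrary-precision integers. The specification requires an error only when a value cannot be represented losslessly, so staying within the 64-bit range is the way to keep a file portable across parsers.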
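If your parser happens to accept oversized values, you can still enforce the 64-bit range yourself after loading. Here is a minimal sketch, assuming the value has already been parsed into a Python int; check_int64 is a hypothetical helper name, not part of any TOML library.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1

def check_int64(value: int) -> int:
    """Return value unchanged, or raise if it does not fit in 64 signed bits."""
    if not INT64_MIN <= value <= INT64_MAX:
        raise OverflowError(f"{value} does not fit in a 64-bit signed integer")
    return value

print(check_int64(9223372036854775807))   # the maximum value passes
print(check_int64(-9223372036854775808))  # the minimum value passes

try:
    check_int64(9223372036854775808)
except OverflowError as exc:
    print(f"rejected: {exc}")
```

Raising an error here, rather than clamping the value to the nearest limit, mirrors the rule that an out-of-range integer must never be silently adjusted.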
Conclusion
TOML integers are simple, yet powerful. From basic whole numbers to hexadecimal, octal, and binary representations, TOML allows a variety of formats. Underscores keep large numbers readable, while the rules stay strict when it comes to leading zeros. By following these rules, you can write clear, precise configuration files that are easy to read and maintain.
Understanding the nuances of TOML’s integer syntax allows for cleaner and more readable configuration files, making the process of managing complex systems more efficient and error-free.
If you’re working with TOML and large datasets, remember the value of good readability practices. This small detail can greatly improve the developer experience and reduce mistakes in the long run.