Master TOML Numbers: Ultimate Guide to Integers & Floats
TOML (Tom’s Obvious, Minimal Language) is a popular configuration file format known for its simplicity and readability. It is widely used in applications for defining configurations in a structured manner. Numbers are one of TOML’s fundamental data types, and they fall into two categories: integers and floats. In this post, we will explore TOML numbers in depth with clear explanations and practical examples.
Understanding TOML Numbers
TOML supports two primary numeric types:
- Integers: Whole numbers without a fractional or decimal part.
- Floats: Numbers that include decimal points or are represented in scientific notation.
These numbers are used in configuration files for various applications, such as specifying limits, settings, or numeric identifiers.
TOML Integers
Integers in TOML are whole numbers. They can be:
- Positive (e.g., +99, 42)
- Negative (e.g., -17)
- Zero (e.g., 0, +0, -0 → all are the same)
int1 = +99
int2 = 42
int3 = 0
int4 = -17
Note – Positive integers can be prefixed with a plus sign, while negative integers are always prefixed with a minus sign. Additionally, the integer values -0 and +0 are valid and identical to an unprefixed zero.
Large Numbers with Underscores
For large numbers, underscores may be placed between digits to enhance readability. Each underscore must have at least one digit on both sides, so an underscore cannot appear at the start or end of a number, and two underscores cannot sit next to each other (e.g., 1_000_000 is valid, but _1000 is not).
int1 = 1_000 # Same as 1000
int2 = 5_349_221 # More readable
int3 = 53_49_221 # Indian-style grouping
int4 = 1_2_3_4_5 # Allowed, but not recommended
We’ll look at a few examples of valid and invalid numbers in just a moment. But first, what exactly is Indian-style grouping? As an Indian, I’m pretty familiar with it, but I realize that not everyone is, so let me break it down. By the way, I’m really proud of my heritage, and I always enjoy sharing a bit of our culture, ethics, and values with the world whenever I can. I think we all feel that way, right?
So, what is Indian number system grouping?
The Indian number system is a way of grouping numbers that is different from the international system used in many other countries. The key difference lies in how large numbers are grouped and named.
Grouping:
- First three digits: The first three digits from the right are grouped together (ones, tens, hundreds).
- Subsequent groups: After the first three digits, the remaining digits are grouped in sets of two.
For example:
- 1,00,000 (one lakh) instead of 100,000
- 10,00,000 (ten lakh) instead of 1,000,000
- 1,00,00,000 (one crore) instead of 10,000,000
Commas are used to separate the groups of digits. The first comma is placed after the first three digits from the right, and subsequent commas are placed after every two digits.
The number 123,456,789 would be written as 12,34,56,789 in the Indian number system.
In short, the Indian number system has a unique way of grouping digits and uses specific place values to represent large numbers. It is also used in neighboring South Asian countries such as Nepal, Bangladesh, and Pakistan.
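The grouping rule (last three digits together, then pairs) can be sketched in a few lines of Python. The function name indian_group and its sep parameter are hypothetical; with the default underscore separator, the output is a string you could paste into a TOML file.

```python
def indian_group(n: int, sep: str = "_") -> str:
    """Group digits Indian-style: the last three digits, then pairs."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    if len(digits) <= 3:
        return sign + digits

    head, tail = digits[:-3], digits[-3:]
    # Split the remaining digits into pairs, working from the right.
    pairs = []
    while head:
        pairs.insert(0, head[-2:])
        head = head[:-2]
    return sign + sep.join(pairs + [tail])


print(indian_group(5349221))         # 53_49_221
print(indian_group(123456789, ","))  # 12,34,56,789
print(indian_group(10000000, ","))   # 1,00,00,000
```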
Valid Integers
positive_int = 42
negative_int = -100
large_number = 1_000_000 # Readable format
Invalid Integers
invalid_int = 1.0 # Not an integer (it's a float)
leading_zero = 007 # Leading zeros are NOT allowed
Leading Zeros: TOML does not allow decimal integers to start with 0 unless the value is 0 itself.
Other Number Formats
TOML supports hexadecimal, octal, and binary for non-negative integers.
- Hexadecimal (base 16) → 0x prefix
- Octal (base 8) → 0o prefix
- Binary (base 2) → 0b prefix
Key Rules:
- No + sign is allowed at the beginning (and, because these forms are for non-negative integers only, no - sign either).
- Leading zeros are allowed (after the prefix).
- Hexadecimal values are not case-sensitive (uppercase and lowercase letters work the same).
- Underscores (_) can be used to improve readability, but not between the prefix and the number.
# Hexadecimal (Base 16) – Prefix: 0x
hex1 = 0xDEADBEEF # Same as 3735928559 in decimal
hex2 = 0xdeadbeef # Same as above, case-insensitive
hex3 = 0xdead_beef # Same as above, underscore for readability
# Octal (Base 8) – Prefix: 0o
oct1 = 0o01234567 # Same as 342391 in decimal
oct2 = 0o755 # Common in Unix file permissions
# Binary (Base 2) – Prefix: 0b
bin1 = 0b11010110 # Same as 214 in decimal
Integer Limits
- TOML supports 64-bit signed integers (from -2^63 to 2^63 - 1), meaning numbers can range from −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
- If a number goes beyond this range, it cannot be stored correctly in 64 bits (some digits would be lost).
- In such cases, a TOML parser must report an error instead of trying to adjust or round the number.
Valid 64-bit signed integers
small_number = -9223372036854775808 # Minimum value
large_number = 9223372036854775807 # Maximum value
Invalid (too large or too small)
too_large = 9223372036854775808 # Error! Exceeds the maximum limit
too_small = -9223372036854775809 # Error! Below the minimum limit
If you use a number that a parser cannot represent losslessly, the parser must stop and report an error instead of storing an incorrect value.
Please note that some TOML parsers or validators accept values beyond the 64-bit range, for example those written in languages with arbitrary-precision integers. Parsers that store integers in 64 bits will report an error once the limit is exceeded, so values outside the range are not portable.
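If your loader happens to accept oversized integers (Python integers, for instance, have no fixed width), you can enforce the 64-bit range yourself after loading. The sketch below is one way to do it; check_int64 and the two constants are hypothetical names, not part of any TOML library.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def check_int64(value: int) -> int:
    """Return `value` unchanged, or raise if it does not fit in 64 signed bits."""
    if not INT64_MIN <= value <= INT64_MAX:
        raise ValueError(f"{value} is outside the 64-bit signed integer range")
    return value


print(check_int64(9223372036854775807))  # the maximum value is accepted

try:
    check_int64(9223372036854775808)  # one past the maximum
except ValueError as err:
    print("rejected:", err)
```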
TOML Floats
TOML floats follow the IEEE 754 binary64 format (same as double-precision floating-point numbers in many programming languages).
How Floats Are Written in TOML:
A float must have an integer part, followed by:
- A fractional part (a decimal point + digits) OR
- An exponent part (E or e + integer) OR
- Both (fractional part first, then exponent).
Valid Floats
# Floats with a Fractional Part
flt1 = +1.0 # Positive float
flt2 = 3.1415 # Pi approximation
flt3 = -0.01 # Negative float
# Floats with an Exponent Part (scientific notation)
flt4 = 5e+22 # 5 × 10^22
flt5 = 1e06 # 1 × 10^6 (same as 1,000,000)
flt6 = -2E-2 # -2 × 10^(-2) (same as -0.02)
# Floats with Both Fractional and Exponent Parts
flt7 = 6.626e-34 # 6.626 × 10^-34 (Planck's constant)
Invalid Floats
# INVALID examples
invalid_float_1 = .7 # Missing whole number (No digit before the decimal)
invalid_float_2 = 7. # Missing fractional part (No digit after the decimal)
invalid_float_3 = 3.e+20 # Missing fractional part before 'e' (No digit after the decimal)
Rule: A decimal point must have at least one digit on both sides.
Readability with Underscores
Similar to integers, underscores may be used to enhance readability. Each underscore must be surrounded by at least one digit.
# Readable Floats Using Underscores
flt8 = 224_617.445_991_228 # Underscores improve readability
Rule: Underscores must sit between digits, not at the start or end of the number or next to the decimal point.
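The float rules so far (an integer part, then a fraction, an exponent, or both, with underscores only between digits) can be written down as a regular expression. The pattern below is my own paraphrase of those rules, not a pattern taken from any parser, and it covers ordinary finite floats only. The names FLOAT_RE and looks_like_toml_float are hypothetical.

```python
import re

FLOAT_RE = re.compile(
    r"[+-]?(?:0|[1-9](?:_?[0-9])*)"          # integer part, no leading zeros
    r"(?P<frac>\.[0-9](?:_?[0-9])*)?"        # decimal point followed by digits
    r"(?P<exp>[eE][+-]?[0-9](?:_?[0-9])*)?"  # exponent: E or e plus an integer
)


def looks_like_toml_float(text: str) -> bool:
    """Check `text` against the float rules described in this section."""
    match = FLOAT_RE.fullmatch(text)
    # A float needs a fractional part, an exponent part, or both.
    return bool(match and (match.group("frac") or match.group("exp")))


samples = ["+1.0", "5e+22", "1e06", "-2E-2", "6.626e-34",
           "224_617.445_991_228", ".7", "7.", "3.e+20", "1_.0", "42"]
for text in samples:
    print(f"{text:>20} -> {looks_like_toml_float(text)}")
```

The last sample, 42, fails the check on purpose: it is a valid number, but it is an integer, not a float.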
Zero in TOML Floats
In TOML, the values -0.0 and +0.0 are valid and represent zero, but with a sign.
- -0.0: Negative zero
- +0.0: Positive zero
- 0.0: Zero without any sign
According to the IEEE 754 standard for floating-point numbers:
- -0.0 and +0.0 are treated as the same value (0.0), but they have different internal representations, which can be important in certain calculations.
- The sign of zero may affect certain edge cases in mathematical operations, but for most practical uses, they are treated the same as 0.0.
The distinction mostly matters in scientific computing or other cases where the sign of zero carries meaning.
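A quick way to see both sides of this (equal when compared, different underneath) is to look at IEEE 754 doubles directly. The sketch below uses Python's built-in float as a stand-in for the values a TOML loader would typically give you; that mapping is an assumption about your loader, not something TOML guarantees.

```python
import math

neg_zero = float("-0.0")
pos_zero = float("+0.0")

# The two zeros compare as equal...
print(neg_zero == pos_zero)  # True

# ...but the sign is still stored and can be recovered.
print(math.copysign(1.0, neg_zero))  # -1.0
print(math.copysign(1.0, pos_zero))  # 1.0

# One edge case where the sign changes a result: the angle of the point (x, 0).
print(math.atan2(0.0, pos_zero))  # 0.0
print(math.atan2(0.0, neg_zero))  # 3.141592653589793
```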
Special Float Values
TOML also supports special float values, which are always written in lowercase.
Infinity (inf)
- inf: Represents positive infinity.
- +inf: Same as inf, positive infinity (optional + sign).
- -inf: Represents negative infinity.
# infinity
sf1 = inf # positive infinity
sf2 = +inf # positive infinity (same as inf)
sf3 = -inf # negative infinity
Infinity is often used in mathematical operations to represent values that exceed any finite number.
Not-a-Number (nan)
- nan: Represents “Not a Number,” which is used to signal invalid or undefined numbers (like dividing zero by zero).
- +nan: Same as nan, but with an optional plus sign.
- -nan: Same as nan, but with a minus sign.
# not a number, 0 / 0 results in nan because there's no defined answer to that division.
sf4 = nan # actual sNaN/qNaN encoding is implementation-specific
sf5 = +nan # same as `nan`
sf6 = -nan # valid, actual encoding is implementation-specific
NaN is used when a result doesn’t make sense (like dividing zero by zero). How NaN is encoded in memory is implementation-specific, so it can vary with the system or language you’re using.
There are two common types of NaN:
- sNaN (Signaling NaN): Signals an error when it is used in a computation, usually by raising a floating-point exception or triggering a trap, depending on the system.
- qNaN (Quiet NaN): Propagates “quietly” through calculations without raising exceptions.
However, for TOML, it doesn’t matter whether it’s sNaN or qNaN; the value is simply written as nan, and how it’s handled is up to the implementation (such as the programming language or hardware).
Practical Example: Using Numbers in a TOML Config File
Here’s a simple example of a TOML configuration file (for demonstration purposes only) that uses both integers and floats:
# Database configuration
db_max_connections = 100
query_timeout = 30.5 # Float value in seconds
# Application settings
retry_attempts = 5
cache_expiry = 60.0 # Float value, ensures decimal representation
# Science-related values
pi_value = 3.14159
speed_of_light = 2.99792458e8 # Scientific notation
Here,
- db_max_connections is an integer representing the maximum number of connections.
- query_timeout is a float because time values often need a fractional part.
- retry_attempts is an integer because you can’t retry a fractional number of times.
- cache_expiry is written as 60.0 to explicitly indicate it’s a float. It represents time, and allowing fractional values in time settings lets you be more precise with your configuration.
- pi_value and speed_of_light demonstrate precision and scientific notation.
Best Practices for Using Numbers in TOML
- Use underscores (_) for readability in large numbers.
- Use floats where fractional values matter, such as timeouts or scientific values.
- Stick to integer values when working with counts, IDs, or whole numbers.
- Never write leading zeros in decimal integers; they are invalid.
- Use scientific notation (e) only when necessary, as it may not always be clear.
Conclusion
TOML provides a simple yet powerful way to define numerical values. By understanding integers and floats, you can create clear, well-structured TOML configuration files that are both human-readable and machine-friendly. Whether you’re configuring application settings, scientific data, or system parameters, following best practices will help maintain accuracy and readability.
By keeping these principles in mind, you’ll be able to work with TOML numbers effectively and avoid common pitfalls.