Birth Generator 1.1.1
Procedural birth date generation for C++23
Usage Guide

This guide covers every feature of the birth-generator library in detail. For a quick overview, see the README. For the full API reference, run doxygen Doxyfile from the repository root and open doc/api/html/index.html.

Quick Start

#include <dasmig/birthgen.hpp>
#include <iostream>
int main()
{
auto& gen = dasmig::bthg::instance();
// Random birth (population-weighted country selection).
auto b = gen.get_birth();
std::cout << b.date_string() << " — "
<< b.country_code << ", age " << +b.age
<< ", " << b.cohort << "\n";
// Country-specific birth.
auto jp = gen.get_birth("JP");
std::cout << "Japan: " << jp << " (age " << +jp.age << ")\n";
// Sex-specific birth.
auto m = gen.get_birth("BR", dasmig::sex::male);
std::cout << "Male from Brazil: " << m << "\n";
// Year-specific birth.
auto y = gen.get_birth("DE", dasmig::year_t{1990});
std::cout << "Born in 1990: " << y << "\n";
// Age-range birth.
auto adult = gen.get_birth("US", dasmig::age_range{18, 65});
std::cout << "Adult: " << adult << " (age " << +adult.age << ")\n";
}

Installation

  1. Copy dasmig/birthgen.hpp and dasmig/random.hpp into your include path.
  2. Copy the resources/ folder (containing full/ and/or lite/ subdirectories with their three TSV files) so it is accessible at runtime.
  3. Compile with C++23 enabled: -std=c++23.

Loading Resources

The library ships two dataset tiers:

| Tier | Enum | Countries | Description |
|------|------|-----------|-------------|
| lite | dasmig::dataset::lite | ~195 | Sovereign states only |
| full | dasmig::dataset::full | ~235 | All countries and territories (UN WPP coverage) |

Each tier stores three TSV files:

| File | Content |
|------|---------|
| countries.tsv | ISO codes, name, region, latitude, life expectancy, C-section rate |
| age_pyramid.tsv | Population by single year of age (0–100) for male and female |
| monthly_births.tsv | Seasonal birth weights by month (latitude-derived) |

Automatic loading (singleton)

On first access the singleton constructor probes these base paths:

| Priority | Base path |
|----------|-----------|
| 1 | resources/ |
| 2 | ../resources/ |
| 3 | birth-generator/resources/ |

It loads lite/ if found, otherwise falls back to full/.
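
The probing order can be pictured with a short sketch. This is not the library's code: `find_resource_dir` is a hypothetical helper, and it assumes the lite/full preference is applied per base path, which the text above does not spell out.

```cpp
#include <array>
#include <filesystem>
#include <optional>
#include <string_view>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: returns the first existing tier directory, walking the
// documented base paths in priority order and preferring lite/ over full/.
inline std::optional<fs::path> find_resource_dir(const fs::path& root = ".")
{
    constexpr std::array<std::string_view, 3> bases{
        "resources", "../resources", "birth-generator/resources"};
    constexpr std::array<std::string_view, 2> tiers{"lite", "full"};

    for (const auto base : bases) {
        for (const auto tier : tiers) {
            const fs::path candidate = root / fs::path{base} / fs::path{tier};
            std::error_code ec; // non-throwing overload: errors count as "not found"
            if (fs::is_directory(candidate, ec)) {
                return candidate;
            }
        }
    }
    return std::nullopt; // nothing found, so no data would be loaded
}
```

If the result is empty, an explicit `load()` call with a known path is the safer route.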

Explicit tier loading

gen.load(dasmig::dataset::lite); // ~195 sovereign states
gen.load(dasmig::dataset::full); // ~235 countries (can combine)

Direct path loading

gen.load("/data/births/lite"); // directory containing the 3 TSVs

Generating Births

auto& gen = dasmig::bthg::instance();
// Random birth (population-weighted country).
auto b = gen.get_birth();
// Country-specific birth.
auto us = gen.get_birth("US");
// Seeded (deterministic) birth.
auto det = gen.get_birth(std::uint64_t{42});

If the country code is not found (e.g., "XX"), get_birth() throws std::invalid_argument.

Sex-specific generation

Fix the biological sex instead of drawing it from the M:F ratio:

auto m = gen.get_birth("BR", dasmig::sex::male); // always male
auto f = gen.get_birth("US", dasmig::sex::female); // always female
// Random country, fixed sex.
auto r = gen.get_birth(dasmig::sex::male);
// Deterministic variant.
auto d = gen.get_birth("JP", dasmig::sex::female, std::uint64_t{42});

Year-specific generation

Fix the birth year instead of drawing from the age pyramid. Age is derived as ref_year − year and clamped to [0, 100]:

auto b = gen.get_birth("DE", dasmig::year_t{1990});
// Deterministic variant.
auto d = gen.get_birth("JP", dasmig::year_t{1985}, std::uint64_t{42});

Sex + year generation

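Fix both sex and year. Only the month and day are then drawn at random; the weekday follows from the resulting date, and the remaining life expectancy and cohort follow from the fixed sex and year: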

auto b = gen.get_birth("US", dasmig::sex::female, dasmig::year_t{1985});
// Deterministic variant.
auto d = gen.get_birth("IN", dasmig::sex::male,
dasmig::year_t{1970}, std::uint64_t{42});

Age-range generation

Constrain the age to an inclusive range. The age is rejection-sampled from the country's age pyramid within the bounds:

// Adults only.
auto adult = gen.get_birth("US", dasmig::age_range{18, 65});
// Narrow band.
auto young = gen.get_birth("JP", dasmig::age_range{25, 30});
// Deterministic variant.
auto d = gen.get_birth("BR", dasmig::age_range{20, 40}, std::uint64_t{42});

Throws std::invalid_argument if min > max.
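
Rejection sampling within an age band can be sketched as follows. The names `age_range_sketch` and `sample_age_in_range` are hypothetical, the pyramid is assumed to be a vector indexed by age, and the attempt cap with its fallback is an assumption added so the loop always terminates.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <stdexcept>
#include <vector>

struct age_range_sketch {
    std::uint8_t min;
    std::uint8_t max;
};

// pyramid[a] holds the population aged a (a = 0..100).
template <class Engine>
std::uint8_t sample_age_in_range(const std::vector<double>& pyramid,
                                 age_range_sketch range, Engine& rng)
{
    if (range.min > range.max) {
        throw std::invalid_argument("age_range: min must be <= max");
    }
    assert(!pyramid.empty());

    // Draw from the full pyramid and keep only ages inside the band.
    std::discrete_distribution<int> dist(pyramid.begin(), pyramid.end());
    for (int attempt = 0; attempt < 10'000; ++attempt) {
        const int age = dist(rng);
        if (age >= range.min && age <= range.max) {
            return static_cast<std::uint8_t>(age);
        }
    }
    return range.min; // assumed fallback when the band carries almost no weight
}
```

Narrow bands in sparsely populated age groups reject most draws, so the cost grows as the band's share of the pyramid shrinks.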

Birth Fields

Every dasmig::birth object exposes these fields:

| Field | Type | Description |
|-------|------|-------------|
| country_code | std::string | ISO 3166-1 alpha-2 code |
| year | std::uint16_t | Birth year |
| month | std::uint8_t | Birth month (1–12) |
| day | std::uint8_t | Birth day (1–31) |
| age | std::uint8_t | Age in completed years |
| bio_sex | dasmig::sex | sex::male or sex::female |
| weekday | std::uint8_t | Day of week (0=Sun, 1=Mon, …, 6=Sat) |
| le_remaining | double | Estimated years of life remaining |
| cohort | std::string | Generational cohort label |

String conversion

// Implicit conversion to std::string (ISO 8601 date).
std::string iso = b;
// Explicit method.
std::string date = b.date_string(); // "1987-09-15"
// Stream output.
std::cout << b; // prints "1987-09-15"
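
For readers who want the same format in their own code, here is a small sketch of zero-padded ISO 8601 formatting. `iso_date` and `age_text` are hypothetical helpers, not library functions. The second one shows why the Quick Start writes `+b.age`: a std::uint8_t is streamed as a character unless it is promoted to int first.

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Formats a date as YYYY-MM-DD with zero padding.
inline std::string iso_date(unsigned year, unsigned month, unsigned day)
{
    assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);
    std::ostringstream out;
    out << std::setfill('0') << std::setw(4) << year << '-'
        << std::setw(2) << month << '-' << std::setw(2) << day;
    return out.str();
}

// Streams an 8-bit field as a number rather than as a raw character.
inline std::string age_text(std::uint8_t age)
{
    std::ostringstream out;
    out << +age; // unary plus promotes std::uint8_t to int
    return out.str();
}
```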

Generational cohorts

| Cohort | Birth years |
|--------|-------------|
| Greatest Generation | ≤ 1927 |
| Silent Generation | 1928–1945 |
| Baby Boomer | 1946–1964 |
| Generation X | 1965–1980 |
| Millennial | 1981–1996 |
| Generation Z | 1997–2012 |
| Generation Alpha | ≥ 2013 |
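
The table translates directly into a chain of upper bounds. A minimal sketch, using the hypothetical function name `cohort_for`; the library's own mapping may be written differently:

```cpp
#include <string_view>

// Maps a birth year to the cohort label from the table above.
constexpr std::string_view cohort_for(int year)
{
    if (year <= 1927) return "Greatest Generation";
    if (year <= 1945) return "Silent Generation";
    if (year <= 1964) return "Baby Boomer";
    if (year <= 1980) return "Generation X";
    if (year <= 1996) return "Millennial";
    if (year <= 2012) return "Generation Z";
    return "Generation Alpha";
}
```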

Country-Specific Generation

Pass an ISO 3166-1 alpha-2 code to generate a birth from a specific country:

auto brazil = gen.get_birth("BR");
auto japan = gen.get_birth("JP");
auto nigeria = gen.get_birth("NG");

The age pyramid, life expectancy, seasonal model, and C-section rate are all country-specific.

Population Weighting

By default, countries are selected proportional to their total population. This means births from China, India, and the United States are far more common than births from small nations.

// Check current mode.
bool w = gen.weighted(); // true by default

Uniform Selection

Switch to equal-probability country selection:

gen.weighted(false);
auto b = gen.get_birth(); // any country equally likely
gen.weighted(true); // restore population-weighted
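
The difference between the two modes comes down to which distribution picks the country index. The sketch below is an illustration with the hypothetical function `pick_country`; it assumes one population figure per loaded country and is not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Returns the index of the selected country.
template <class Engine>
std::size_t pick_country(const std::vector<double>& populations,
                         bool weighted, Engine& rng)
{
    assert(!populations.empty());
    if (weighted) {
        // Probability proportional to each country's population.
        std::discrete_distribution<std::size_t> dist(populations.begin(),
                                                     populations.end());
        return dist(rng);
    }
    // Every country equally likely, regardless of size.
    std::uniform_int_distribution<std::size_t> dist(0, populations.size() - 1);
    return dist(rng);
}
```

With uniform selection, a territory of a few thousand people is drawn as often as India, which is useful for coverage tests but not for realistic samples.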

Seeding and Deterministic Generation

Per-call seeding

auto b = gen.get_birth(std::uint64_t{42}); // deterministic
auto b2 = gen.get_birth(std::uint64_t{42}); // identical to b

Replay via birth seed

Every birth stores the random seed used to generate it:

auto b = gen.get_birth(); // random
auto replay = gen.get_birth(std::uint64_t{b.seed()}); // exact same birth

Generator-level seeding

gen.seed(100);
auto a = gen.get_birth();
auto b = gen.get_birth();
gen.seed(100);
auto a2 = gen.get_birth(); // same as a
auto b2 = gen.get_birth(); // same as b
gen.unseed(); // restore non-deterministic state
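
The seed/unseed behaviour mirrors how a standard engine behaves when it is reseeded. A minimal sketch with the hypothetical class `engine_sketch`, assuming a std::mt19937_64 engine (the library's actual engine type is not stated here):

```cpp
#include <cstdint>
#include <random>

class engine_sketch {
public:
    // Same seed, same sequence of draws afterwards.
    engine_sketch& seed(std::uint64_t value)
    {
        engine_.seed(value);
        return *this;
    }

    // Reseed from a non-deterministic source.
    engine_sketch& unseed()
    {
        engine_.seed(std::random_device{}());
        return *this;
    }

    std::uint64_t next() { return engine_(); }

private:
    std::mt19937_64 engine_{std::random_device{}()};
};
```

Raw engine output is reproducible across standard library implementations, whereas the standard distributions are not required to be, so seeded results may differ between toolchains.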

Multi-Instance Support

Construct independent instances for isolation:

// Assumes dasmig::bthg is default-constructible.
dasmig::bthg gen1;
dasmig::bthg gen2;
gen1.load(dasmig::dataset::lite);
gen2.load("path/to/custom/data");
// gen1 and gen2 have separate data and random engines.

This is useful when embedding the generator inside other generators or when different threads need independent generators.

Generation Pipeline

Each get_birth() call runs the following pipeline:

  1. Country selection: population-weighted or uniform from loaded entries.
  2. Biological sex: Bernoulli trial weighted by the country's male:female population ratio (or fixed if a sex parameter is provided).
  3. Age: drawn from a discrete distribution over 0–100, shaped by the country's age pyramid (or derived from a fixed year_t, or rejection-sampled within an age_range).
  4. Birth year: reference_year − age (reference year = current year).
  5. Birth month: drawn from 12 seasonal weights derived from the country's latitude (sinusoidal model: NH peaks in September, SH peaks in March).
  6. Birth day: uniform within the month's valid range, then rejection-sampled for weekday deficit: if the candidate day falls on a weekend, it is rejected with probability csection_rate × 0.5 (up to 3 retries).
  7. Weekday: computed via std::chrono::weekday from the final date.
  8. Life expectancy remaining: max(0, LE_at_birth − age), using sex-specific period life expectancy from UN WPP 2024.
  9. Cohort label: mapped from birth year to generational label.

Data Pipeline

The scripts/prepare_births.py script downloads and processes data from three sources:

  1. UN WPP 2024 (via PPgp/wpp2024 GitHub): population by age + sex, life expectancy by sex.
  2. REST Countries v3.1: ISO codes, latitude, independence status, regions.
  3. WHO C-section rates: hardcoded from WHO Global Health Observatory data.

It produces three TSV files per tier:

resources/
├── full/
│   ├── countries.tsv        # 250 rows
│   ├── age_pyramid.tsv      # 235 rows × 203 columns
│   └── monthly_births.tsv   # 250 rows × 13 columns
└── lite/
    ├── countries.tsv        # 195 rows
    ├── age_pyramid.tsv      # 193 rows × 203 columns
    └── monthly_births.tsv   # 195 rows × 13 columns

Regenerate with:

python scripts/prepare_births.py

Downloaded files are cached in scripts/.cache/ to avoid repeated downloads.
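
The 203-column width of age_pyramid.tsv is consistent with one country code followed by 101 male and 101 female counts (ages 0 to 100). The parser below is a sketch under exactly that assumption; the real column order, any header row, and the names `pyramid_row` and `parse_pyramid_row` are not confirmed by this guide.

```cpp
#include <array>
#include <sstream>
#include <stdexcept>
#include <string>

struct pyramid_row {
    std::string code;                 // assumed: ISO alpha-2 code in column 1
    std::array<double, 101> male{};   // assumed: columns 2-102
    std::array<double, 101> female{}; // assumed: columns 103-203
};

// Parses one tab-separated data row into a pyramid_row.
inline pyramid_row parse_pyramid_row(const std::string& line)
{
    std::istringstream in{line};
    pyramid_row row;
    if (!std::getline(in, row.code, '\t')) {
        throw std::runtime_error("age_pyramid row is empty");
    }

    std::string cell;
    auto read_side = [&](std::array<double, 101>& side) {
        for (double& value : side) {
            if (!std::getline(in, cell, '\t')) {
                throw std::runtime_error("age_pyramid row has fewer than 203 columns");
            }
            value = std::stod(cell);
        }
    };
    read_side(row.male);
    read_side(row.female);
    return row;
}
```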

Thread Safety

The random engine inside each bthg instance is not thread-safe. If you call get_birth() from multiple threads, either:

  • Use a separate bthg instance per thread, or
  • Protect calls with a mutex.

The singleton bthg::instance() returns a single shared instance, so concurrent access must be synchronized.
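
The mutex option can be packaged in a small wrapper so that no call site forgets the lock. `locked_generator` is a hypothetical helper written for this guide, not part of the library; it works with any generator type passed by reference.

```cpp
#include <mutex>
#include <random>
#include <utility>

// Serialises every access to a shared generator behind one mutex.
template <class Generator>
class locked_generator {
public:
    explicit locked_generator(Generator& gen) : gen_{gen} {}

    // Runs fn(generator) while holding the lock and returns its result by value.
    template <class Fn>
    auto with(Fn&& fn)
    {
        const std::scoped_lock lock{mutex_};
        return std::forward<Fn>(fn)(gen_);
    }

private:
    std::mutex mutex_;
    Generator& gen_;
};
```

Wrapping the singleton would then look like `locked_generator<dasmig::bthg> shared{dasmig::bthg::instance()};` followed by `shared.with([](auto& g) { return g.get_birth("US"); });`. A separate instance per thread avoids the lock entirely, at the cost of loading the data once per instance.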

Error Reference

| Error | Cause |
|-------|-------|
| std::runtime_error("No birth data loaded. Call load() first.") | Called get_birth() before any load() and auto-probe found nothing. |
| std::invalid_argument("Unknown country code: XX") | The country code passed to get_birth("XX") is not in the loaded dataset. |
| std::invalid_argument("age_range: min must be <= max") | Passed an age_range where min > max. |
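
Callers that prefer a value-or-nothing result over exceptions could wrap the call as sketched below. `try_generate` is a hypothetical helper; it catches the two exception types listed above separately, because std::invalid_argument derives from std::logic_error rather than from std::runtime_error.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Runs generate() and converts the documented exceptions into an empty optional,
// storing the exception message in error.
template <class Fn>
auto try_generate(Fn&& generate, std::string& error)
    -> std::optional<decltype(generate())>
{
    try {
        return std::forward<Fn>(generate)();
    } catch (const std::invalid_argument& e) {
        error = e.what(); // unknown country code or inverted age range
    } catch (const std::runtime_error& e) {
        error = e.what(); // no data loaded
    }
    return std::nullopt;
}
```

A call such as `try_generate([&] { return gen.get_birth("XX"); }, message)` would then return an empty optional and leave the reason in `message`.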