
Requires C++23 (e.g., -std=c++23 for GCC/Clang, /std:c++latest for MSVC).

API Reference · Usage Guide · Releases
Features
- Demographically Plausible Characteristics. Generates random human physical traits using a multi-stage pipeline: height from country/sex-specific Gaussian distributions, BMI from log-normal distributions, and categorical sampling for phenotypic traits.
- Eight Physical Traits. Every generated profile includes: height (cm), weight (kg), BMI, eye colour, hair colour, Fitzpatrick skin type, ABO/Rh blood type, and handedness.
- Country-Specific Distributions. Height means, BMI values, eye/hair/skin distributions, blood type frequencies, and left-handedness rates all vary by country, reflecting real-world population statistics.
- Data-Driven. Height and BMI from NCD-RisC via Our World in Data, blood types from published population studies, phenotypic traits from Katsara & Nothnagel 2019, VISAGE consortium, and Papadatou-Pastou 2020.
- Deterministic Seeding. Per-call get_biodata(seed) for reproducible results, generator-level seed() / unseed() for deterministic sequences, and biodata::seed() for replaying a previous generation.
- Multi-Instance Support. Construct independent bdg instances with their own data and random engine.
- Typed Enumerations. eye_color, hair_color, skin_type, blood_type, and handedness enums with string conversion helpers.
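The seeding behaviour described above can be illustrated with a minimal sketch built only on the standard `<random>` header. The `sample` and `seeded_generator` names are hypothetical and are not part of the library; the sketch assumes that each result records the seed that produced it, which is what makes a replay possible.

```cpp
#include <cstdint>
#include <random>

// Hypothetical result type: it keeps the seed that produced it.
struct sample {
    std::uint64_t seed;
    double value;
};

// Hypothetical generator showing per-call and generator-level seeding.
class seeded_generator {
public:
    // Per-call seeding: the same seed always yields the same result.
    sample draw(std::uint64_t call_seed) const {
        std::mt19937_64 rng{call_seed};
        std::normal_distribution<double> dist{170.0, 7.0};
        return {call_seed, dist(rng)};
    }

    // Unseeded call: the per-call seed comes from the generator-level engine.
    sample draw() { return draw(engine_()); }

    // Generator-level seeding: the sequence of following draws is fixed.
    void seed(std::uint64_t s) { engine_.seed(s); }

    // Return to non-deterministic behaviour.
    void unseed() { engine_.seed(std::random_device{}()); }

private:
    std::mt19937_64 engine_{std::random_device{}()};
};
```

Calling `draw(result.seed)` on an earlier result reproduces it. The values are reproducible within one standard library implementation; the standard does not fix the exact output of `std::normal_distribution` across implementations.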
Integration
biodatagen.hpp, released here, is the only header you include directly. It also needs random.hpp in the same directory. Add an #include for biodatagen.hpp to the files in which you want to generate biodata, and set the necessary switches to enable C++23 (e.g., -std=c++23 for GCC and Clang).
You must also supply the biodata generator with the resources folder, which contains the full/ and/or lite/ subdirectories with the TSV data file. The folder is also available in the release.
Usage
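#include <cstdint>
#include <iostream>
#include "biodatagen.hpp"

// Assumes bdg names the generator class (e.g. via a using-declaration).

// Load a dataset tier: lite or full.
bdg::instance().load(dasmig::dataset::lite);
bdg::instance().load(dasmig::dataset::full);

// Generate a profile without specifying a country, then print it.
auto b = bdg::instance().get_biodata();
std::cout << b << '\n';

// Generate a profile for a specific country and read individual fields.
auto us = bdg::instance().get_biodata("US");
std::cout << "Height: " << us.height_cm << " cm\n";
std::cout << "Weight: " << us.weight_kg << " kg\n";
std::cout << "BMI: " << us.bmi << "\n";

// Force the sex of the generated profile.
auto m = bdg::instance().get_biodata("BR", dasmig::sex::male);

// Per-call seed, then a replay using the seed stored in the result.
auto seeded = bdg::instance().get_biodata("US", std::uint64_t{42});
auto replay = bdg::instance().get_biodata("US", seeded.seed());

// Generator-level seeding for a deterministic sequence, then back to random.
bdg::instance().seed(100);
bdg::instance().unseed();

// Independent instance with its own data and random engine.
bdg my_gen;
my_gen.load("path/to/resources/lite");
auto c = my_gen.get_biodata("JP");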
Selected API signatures:
- void load(const std::filesystem::path &dir): load biodata from a resource directory.
- static std::string_view blood_type_str(blood_type t): blood type label.
- static std::string_view eye_color_str(eye_color c): eye colour label.
- static std::string_view hair_color_str(hair_color c): hair colour label.
- static std::string_view handedness_str(handedness h): handedness label.
- static std::string_view skin_type_str(skin_type t): skin type label.
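As an illustration of how an enum-to-label helper of this shape can be written, here is a minimal sketch. It lives in a hypothetical `sketch` namespace, and the enumerator names are assumptions based on the blue/intermediate/brown categories mentioned in this README; the library's actual enumerators and label strings may differ.

```cpp
#include <string_view>

namespace sketch {

// Assumed enumerators; not confirmed against the library's headers.
enum class eye_color { blue, intermediate, brown };

// Map each enumerator to a human-readable label.
constexpr std::string_view eye_color_str(eye_color c) {
    switch (c) {
        case eye_color::blue:         return "blue";
        case eye_color::intermediate: return "intermediate";
        case eye_color::brown:        return "brown";
    }
    return "unknown";  // reached only for values outside the enumeration
}

}  // namespace sketch
```

Returning `std::string_view` over string literals avoids allocation, and `constexpr` lets the label be resolved at compile time when the argument is a constant.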
For the complete feature guide — fields, seeding, enums, and more — see the Usage Guide.
Generation Pipeline
Each call to get_biodata() runs this pipeline:
- Sex: 50/50, or forced via the sex parameter.
- Height: Gaussian distribution using country/sex-specific mean and standard deviation from NCD-RisC anthropometric data.
- BMI: Log-normal distribution from country/sex-specific mean, modelling the natural right-skew of BMI.
- Weight: Derived as BMI × height_m².
- Eye Colour: Categorical sampling from country-specific blue/intermediate/brown distribution.
- Hair Colour: Categorical sampling from country-specific black/brown/blond/red distribution.
- Skin Type: Categorical sampling from Fitzpatrick I–VI distribution.
- Blood Type: Categorical sampling from ABO/Rh frequencies (O+, A+, B+, AB+, O−, A−, B−, AB−).
- Handedness: Bernoulli sampling from country-specific left-handedness rate.
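The steps above can be sketched with the standard `<random>` distributions. This is an illustration under stated assumptions, not the library's implementation: every type and function in the `sketch` namespace is hypothetical, the log-scale BMI spread `bmi_log_sigma` is an assumed parameter, and only one categorical trait (eye colour) is shown because hair colour, skin type, and blood type would follow the same pattern.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

namespace sketch {

// Hypothetical parameter layout; the library's real data format may differ.
struct sex_params {
    double height_mean_cm;
    double height_sd_cm;
    double bmi_mean;
};

struct country_params {
    sex_params female;
    sex_params male;
    double bmi_log_sigma;             // assumed spread of log(BMI)
    std::vector<double> eye_weights;  // blue, intermediate, brown
    double left_handed_rate;
};

struct profile {
    bool male;
    double height_cm;
    double bmi;
    double weight_kg;
    std::size_t eye_index;  // index into eye_weights
    bool left_handed;
};

inline profile generate(const country_params& c, std::uint64_t seed) {
    assert(c.bmi_log_sigma > 0.0 && !c.eye_weights.empty());
    std::mt19937_64 rng{seed};
    profile p{};

    // Sex: fair coin.
    std::bernoulli_distribution coin{0.5};
    p.male = coin(rng);
    const sex_params& s = p.male ? c.male : c.female;

    // Height: Gaussian with sex-specific mean and standard deviation.
    std::normal_distribution<double> height{s.height_mean_cm, s.height_sd_cm};
    p.height_cm = height(rng);

    // BMI: log-normal; mu is chosen so the distribution mean equals bmi_mean,
    // using E[X] = exp(mu + sigma^2 / 2).
    const double mu =
        std::log(s.bmi_mean) - 0.5 * c.bmi_log_sigma * c.bmi_log_sigma;
    std::lognormal_distribution<double> bmi{mu, c.bmi_log_sigma};
    p.bmi = bmi(rng);

    // Weight: BMI times height in metres squared.
    const double height_m = p.height_cm / 100.0;
    p.weight_kg = p.bmi * height_m * height_m;

    // Eye colour: categorical sampling from the weight table.
    std::discrete_distribution<std::size_t> eye(c.eye_weights.begin(),
                                                c.eye_weights.end());
    p.eye_index = eye(rng);

    // Handedness: Bernoulli trial with the left-handedness rate.
    std::bernoulli_distribution left{c.left_handed_rate};
    p.left_handed = left(rng);
    return p;
}

}  // namespace sketch
```

Because weight is derived rather than sampled, the relation weight = BMI × height_m² holds exactly for every generated profile.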
Data Sources
| Trait | Source | Coverage |
|---|---|---|
| Height (mean, SD) | NCD-RisC via OWID | 202 countries |
| BMI (mean) | WHO GHO via OWID | 197 countries |
| Blood type | Published population studies (Wikipedia compilation) | 124 countries |
| Eye colour | Katsara & Nothnagel 2019 + regional estimates | 70 countries |
| Hair colour | VISAGE consortium + regional estimates | 62 countries |
| Skin tone | WHO UV guidance + ethnic composition estimates | 81 countries |
| Handedness | Papadatou-Pastou et al. 2020 meta-analysis | 75 countries |
Dataset Tiers
| Tier | Countries | Description |
|---|---|---|
| lite | ~111 | Countries with specific data for at least one phenotypic trait |
| full | ~197 | All countries with height data; phenotypic gaps filled with regional defaults |
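The idea of filling phenotypic gaps with regional defaults can be sketched as a lookup with a fallback. The `country_row` type, the `resolve_left_handed_rate` function, and the region keys are all hypothetical; the released TSV files may already contain the filled values, in which case no such lookup happens at runtime.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical record: a country whose left-handedness rate may be missing.
struct country_row {
    std::string region;
    std::optional<double> left_handed_rate;
};

// Return the country's own rate if present, otherwise its regional default.
// Yields std::nullopt when the country or its region is unknown.
inline std::optional<double> resolve_left_handed_rate(
    const std::map<std::string, country_row>& countries,
    const std::map<std::string, double>& regional_defaults,
    const std::string& country_code) {
    const auto row = countries.find(country_code);
    if (row == countries.end()) {
        return std::nullopt;
    }
    if (row->second.left_handed_rate) {
        return row->second.left_handed_rate;
    }
    const auto fallback = regional_defaults.find(row->second.region);
    if (fallback == regional_defaults.end()) {
        return std::nullopt;
    }
    return fallback->second;
}
```

Under this reading, a lite-style table would keep only rows that carry at least one country-specific value, while a full-style table would resolve every missing value through the regional defaults.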
Building
# Example
make
# Tests
make test
# Code coverage
make coverage
# API docs
make docs
Compiler Support
Tested with:
- Clang 18+ (-std=c++23)
- GCC 14+ (-std=c++23)
- MSVC 19.38+ (/std:c++latest)
Dependencies
| Dependency | Version | Bundled | Purpose |
| effolkronium/random | 1.4.1 | Yes (random.hpp) | Thread-safe RNG wrapper |
| Catch2 | 3.x | Yes (amalgamated) | Unit testing |
Related Libraries
License

This library is released under the MIT License.
MIT License
Copyright (c) 2020-2026 Diego Dasso Migotto