Biodata Generator 1.0.0
Procedural human physical characteristics generation for C++23
Biodata Generator for C++

Requires C++23 (e.g., -std=c++23 for GCC/Clang, /std:c++latest for MSVC).

API Reference · Usage Guide · Releases

Features

  • Demographically Plausible Characteristics. Generates random human physical traits using a multi-stage pipeline: height from country/sex-specific Gaussian distributions, BMI from log-normal distributions, and categorical sampling for phenotypic traits.
  • Eight Physical Traits. Every generated profile includes: height (cm), weight (kg), BMI, eye colour, hair colour, Fitzpatrick skin type, ABO/Rh blood type, and handedness.
  • Country-Specific Distributions. Height means, BMI values, eye/hair/skin distributions, blood type frequencies, and left-handedness rates all vary by country, reflecting real-world population statistics.
  • Data-Driven. Height and BMI from NCD-RisC via Our World in Data, blood types from published population studies, phenotypic traits from Katsara & Nothnagel 2019, VISAGE consortium, and Papadatou-Pastou 2020.
  • Deterministic Seeding. Per-call get_biodata(seed) for reproducible results, generator-level seed() / unseed() for deterministic sequences, and biodata::seed() for replaying a previous generation.
  • Multi-Instance Support. Construct independent bdg instances with their own data and random engine.
  • Typed Enumerations. eye_color, hair_color, skin_type, blood_type, and handedness enums with string conversion helpers.

Integration

biodatagen.hpp is the library's single header and is available in the release. It depends on the bundled random.hpp, which must sit in the same directory. Add

#include "biodatagen.hpp" // adjust the path to wherever you placed the header

// For convenience.
using bdg = dasmig::bdg;

to the files in which you want to generate biodata, and set the compiler switches needed to enable C++23 (e.g., -std=c++23 for GCC and Clang).

Additionally, you must supply the biodata generator with the resources folder, which contains full/ and/or lite/ subdirectories holding the TSV data file. The folder is also available in the release.

Usage

#include <cstdint>
#include <iostream>
#include "biodatagen.hpp" // adjust the path to wherever you placed the header
// For convenience.
using bdg = dasmig::bdg;
// Manually load a specific dataset tier if necessary.
bdg::instance().load(dasmig::dataset::lite); // ~111 countries (best coverage)
// OR
bdg::instance().load(dasmig::dataset::full); // ~197 countries (gap-filled)
// Generate random biodata (uniform country selection).
auto b = bdg::instance().get_biodata();
std::cout << b << '\n'; // implicit string conversion
// Generate biodata for a specific country.
auto us = bdg::instance().get_biodata("US");
std::cout << "Height: " << us.height_cm << " cm\n";
std::cout << "Weight: " << us.weight_kg << " kg\n";
std::cout << "BMI: " << us.bmi << "\n";
// Request a specific sex.
auto m = bdg::instance().get_biodata("BR", dasmig::sex::male);
// Access typed enum fields.
std::cout << "Eyes: " << dasmig::biodata::eye_color_str(b.eyes) << '\n';
std::cout << "Hair: " << dasmig::biodata::hair_color_str(b.hair) << '\n';
std::cout << "Skin: " << dasmig::biodata::skin_type_str(b.skin) << '\n';
std::cout << "Blood: " << dasmig::biodata::blood_type_str(b.blood) << '\n';
std::cout << "Hand: " << dasmig::biodata::handedness_str(b.hand) << '\n';
// Deterministic generation — same seed always produces the same result.
auto seeded = bdg::instance().get_biodata("US", std::uint64_t{42});
// Replay a previous generation using its seed.
auto replay = bdg::instance().get_biodata("US", seeded.seed());
// Seed the engine for a deterministic sequence.
bdg::instance().seed(100);
// ... generate biodata ...
bdg::instance().unseed(); // restore non-deterministic state
// Independent instance — separate data and random engine.
bdg my_gen;
my_gen.load("path/to/resources/lite");
auto c = my_gen.get_biodata("JP");

For the complete feature guide — fields, seeding, enums, and more — see the Usage Guide.
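
The seeding behaviour shown in the usage example (a per-call seed, and replaying a result from its recorded seed) can be illustrated with the standard library alone. The following is a minimal sketch under stated assumptions, not the library's implementation: the `sketch` namespace, the `sample` struct, `draw_height`, and the placeholder mean and SD are all hypothetical names and values chosen for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

namespace sketch {

// A generated value that remembers the seed it was drawn with (hypothetical type).
struct sample {
    double height_cm;
    std::uint64_t seed;
};

// Per-call seeding: a fresh engine is built from the seed, so the result
// depends only on the seed and not on any earlier calls.
inline sample draw_height(std::uint64_t seed)
{
    std::mt19937_64 rng(seed);
    std::normal_distribution<double> height(170.0, 7.0); // placeholder mean and SD
    return {height(rng), seed};
}

// Unseeded call: pick a seed from std::random_device and record it,
// so the draw can be replayed later.
inline sample draw_height()
{
    std::random_device rd;
    const std::uint64_t seed = (std::uint64_t{rd()} << 32) | rd();
    return draw_height(seed);
}

// Replaying the recorded seed reproduces the same value.
inline void demo()
{
    const sample first = draw_height();
    const sample again = draw_height(first.seed);
    assert(first.height_cm == again.height_cm);
}

} // namespace sketch
```

Storing the seed alongside the result is what makes a replay possible without keeping the engine state around. Note that the exact values produced by the standard distributions may differ between standard library implementations, so reproducibility in this sketch holds within a single toolchain.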

Generation Pipeline

Each call to get_biodata() runs this pipeline:

  1. Sex — 50/50 or forced via sex parameter.
  2. Height — Gaussian distribution using country/sex-specific mean and standard deviation from NCD-RisC anthropometric data.
  3. BMI — Log-normal distribution from country/sex-specific mean, modelling the natural right-skew of BMI.
  4. Weight — Derived: BMI × height_m².
  5. Eye Colour — Categorical sampling from country-specific blue/intermediate/brown distribution.
  6. Hair Colour — Categorical sampling from country-specific black/brown/blond/red distribution.
  7. Skin Type — Categorical sampling from Fitzpatrick I–VI distribution.
  8. Blood Type — Categorical sampling from ABO/Rh frequencies (O+, A+, B+, AB+, O−, A−, B−, AB−).
  9. Handedness — Bernoulli sampling from country-specific left-handedness rate.
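
As a rough illustration of the steps above, the following sketch reproduces a subset of the pipeline (sex, height, BMI, weight, eye colour, handedness) using only the standard `<random>` facilities. It is not the library's implementation: the `sketch` namespace, the `params` and `profile` structs, and the BMI standard deviation are assumptions made for the example. The conversion from a mean and SD to log-normal parameters is the usual moment-matching identity, which the source does not specify.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>

namespace sketch {

// Hypothetical per-country, per-sex parameters; callers supply placeholder values.
struct params {
    double height_mean_cm;
    double height_sd_cm;
    double bmi_mean;
    double bmi_sd;                     // assumed: the source only mentions a BMI mean
    std::array<double, 3> eye_weights; // blue, intermediate, brown
    double left_handed_rate;
};

struct profile {
    bool male;
    double height_cm;
    double bmi;
    double weight_kg;
    int eye_index; // index into eye_weights
    bool left_handed;
};

// std::lognormal_distribution takes the mean and SD of the underlying normal,
// so convert the desired arithmetic mean and SD by moment matching.
inline std::lognormal_distribution<double> lognormal_from_mean_sd(double mean, double sd)
{
    assert(mean > 0.0 && sd > 0.0);
    const double sigma2 = std::log(1.0 + (sd * sd) / (mean * mean));
    const double mu = std::log(mean) - 0.5 * sigma2;
    return std::lognormal_distribution<double>(mu, std::sqrt(sigma2));
}

inline profile generate(const params& p, std::uint64_t seed)
{
    std::mt19937_64 rng(seed);
    profile out{};
    out.male = std::bernoulli_distribution(0.5)(rng);                 // step 1: 50/50 sex
    out.height_cm =
        std::normal_distribution<double>(p.height_mean_cm, p.height_sd_cm)(rng); // step 2
    out.bmi = lognormal_from_mean_sd(p.bmi_mean, p.bmi_sd)(rng);      // step 3: right-skewed
    const double height_m = out.height_cm / 100.0;
    out.weight_kg = out.bmi * height_m * height_m;                    // step 4: derived
    std::discrete_distribution<int> eyes(p.eye_weights.begin(), p.eye_weights.end());
    out.eye_index = eyes(rng);                                        // step 5: categorical
    out.left_handed = std::bernoulli_distribution(p.left_handed_rate)(rng); // step 9
    return out;
}

} // namespace sketch
```

Hair colour, skin type, and blood type would follow the same categorical pattern as eye colour, each with its own weight vector. Because weight is derived rather than sampled, a BMI of 24 at a height of 1.75 m always gives 24 × 1.75² = 73.5 kg.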

Data Sources

| Trait | Source | Coverage |
| --- | --- | --- |
| Height (mean, SD) | NCD-RisC via OWID | 202 countries |
| BMI (mean) | WHO GHO via OWID | 197 countries |
| Blood type | Published population studies (Wikipedia compilation) | 124 countries |
| Eye colour | Katsara & Nothnagel 2019 + regional estimates | 70 countries |
| Hair colour | VISAGE consortium + regional estimates | 62 countries |
| Skin tone | WHO UV guidance + ethnic composition estimates | 81 countries |
| Handedness | Papadatou-Pastou et al. 2020 meta-analysis | 75 countries |

Dataset Tiers

| Tier | Countries | Description |
| --- | --- | --- |
| lite | ~111 | Countries with specific data for at least one phenotypic trait |
| full | ~197 | All countries with height data; phenotypic gaps filled with regional defaults |
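
The difference between the two tiers can be pictured as a lookup with a fallback. The sketch below is only an assumption about how gap-filling with regional defaults might look. The `sketch` namespace, `country_row`, `region_defaults`, and both functions are hypothetical and do not reflect the library's data layout.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>

namespace sketch {

// Hypothetical record: a country may or may not have its own measured rate.
struct country_row {
    std::string region;
    std::optional<double> left_handed_rate;
};

// Hypothetical table of per-region fallback values.
using region_defaults = std::unordered_map<std::string, double>;

// lite-style criterion: the country carries specific data for the trait.
inline bool has_specific_data(const country_row& row)
{
    return row.left_handed_rate.has_value();
}

// full-style lookup: use the country's own value, else the regional default.
inline double resolve_rate(const country_row& row, const region_defaults& defaults)
{
    if (row.left_handed_rate) {
        return *row.left_handed_rate;
    }
    const auto it = defaults.find(row.region);
    assert(it != defaults.end() && "every region needs a default");
    return it->second;
}

} // namespace sketch
```

Under this reading, the lite tier trades coverage for specificity, while the full tier always returns a value at the cost of some countries sharing a regional estimate.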

Building

# Example
make
# Tests
make test
# Code coverage
make coverage
# API docs
make docs

Compiler Support

Tested with:

  • Clang 18+ (-std=c++23)
  • GCC 14+ (-std=c++23)
  • MSVC 19.38+ (/std:c++latest)

Dependencies

| Dependency | Version | Bundled | Purpose |
| --- | --- | --- | --- |
| effolkronium/random | 1.4.1 | Yes (random.hpp) | Thread-safe RNG wrapper |
| Catch2 | 3.x | Yes (amalgamated) | Unit testing |

Related Libraries

| Library | Description |
| --- | --- |
| name-generator | Culturally appropriate full names |
| nickname-generator | Gamer-style nicknames |
| birth-generator | Demographically plausible birthdays |
| city-generator | Weighted city selection by population |
| country-generator | Weighted country selection by population |
| entity-generator | ECS-based entity generation |

License

This library is released under the MIT License.

MIT License
Copyright (c) 2020-2026 Diego Dasso Migotto