Software

AttributeRiskCalculation: Calculating Attribute Disclosure Risks in Synthetic Microdata

This package calculates attribute disclosure risks in synthetic microdata using Bayesian estimation methods. It reports the joint posterior probability that the true confidential values are inferred correctly, along with the rank of that probability among all guesses.

GitHub link

IdentificationRiskCalculation: Calculating the Identification Risk in Partially Synthetic Microdata

This package calculates the identification risk in partially synthetic microdata. It reports the expected match risk, the true match rate, and the false match rate. The calculation supports mixed data types, including both categorical and continuous variables.
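To make the three reported measures concrete, here is a minimal Python sketch. It is not the package's code or API. It assumes the definitions commonly used for partially synthetic data: for each target record, the intruder collects the synthetic records that agree with the target on the known key variables; the expected match risk sums 1/c over targets whose true record is among their c matches; the true match rate is the share of all targets with a unique, correct match; and the false match rate is the share of unique matches that are incorrect. The function name `identification_risks`, the toy data, and the restriction to exact matching on categorical keys are all illustrative assumptions. Continuous variables, which would need a notion of closeness rather than equality, are left out.

```python
from typing import Dict, Sequence, Tuple

Record = Tuple[str, ...]


def identification_risks(
    original: Sequence[Record],
    synthetic: Sequence[Record],
    keys: Sequence[int],
) -> Dict[str, float]:
    """Hypothetical helper: matching-based identification risk summaries.

    original[i] and synthetic[i] are assumed to describe the same individual.
    keys lists the column positions the intruder is assumed to know.
    """
    n = len(original)
    expected_match = 0.0  # sum of 1/c over targets whose true record is matched
    unique = 0            # targets with exactly one matching synthetic record
    true_unique = 0       # unique matches that are the correct record
    false_unique = 0      # unique matches that are the wrong record

    for i, target in enumerate(original):
        target_key = tuple(target[k] for k in keys)
        # Synthetic records that agree with the target on every known key.
        matches = [
            j for j, row in enumerate(synthetic)
            if tuple(row[k] for k in keys) == target_key
        ]
        c = len(matches)
        true_in_matches = i in matches
        if true_in_matches:
            expected_match += 1.0 / c
        if c == 1:
            unique += 1
            if true_in_matches:
                true_unique += 1
            else:
                false_unique += 1

    return {
        "expected_match_risk": expected_match,
        "true_match_rate": true_unique / n,
        # Defined as 0.0 here when no target has a unique match (an assumption).
        "false_match_rate": false_unique / unique if unique else 0.0,
    }


# Toy data: (sex, age group); the age group is treated as synthesized.
original = [
    ("F", "30-39"), ("M", "30-39"), ("F", "40-49"),
    ("M", "40-49"), ("F", "50-59"),
]
synthetic = [
    ("F", "30-39"), ("M", "30-39"), ("F", "30-39"),
    ("M", "50-59"), ("F", "40-49"),
]

risks = identification_risks(original, synthetic, keys=(0, 1))
print(risks)
```

In this toy example the first target has two candidate matches including the true one (contributing 1/2), the second has a unique correct match (contributing 1), and the third has a unique but incorrect match, so the sketch gives an expected match risk of 1.5, a true match rate of 0.2, and a false match rate of 0.5.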

GitHub link

NestedCategBayesImpute: Modeling and Generating Synthetic Versions of Nested Categorical Data in the Presence of Impossible Combinations

This package provides functions to fit the nested Dirichlet process mixture of products of multinomial distributions (NDPMPM) model to nested categorical household data in the presence of impossible combinations. It has direct applications in generating synthetic nested household data.

CRAN link

NPBayesImputeCat: Non-parametric Bayesian Multiple Imputation for Categorical Data

These routines create multiple imputations for categorical data with values missing at random, with or without structural zeros. Imputations are based on Dirichlet process mixtures of multinomial distributions, a non-parametric Bayesian modeling approach that allows flexible joint modeling.

CRAN link

Jingchen (Monika) Hu

Associate Professor of Statistics, Vassar College
