Data overview

The HiRID database contains a large selection of the routinely collected data relating to patient admissions to the Department of Intensive Care Medicine (ICU) of the Bern University Hospital, Switzerland. The data was extracted from the ICU Patient Data Management System, which is used to prospectively register patient health information, measurements of organ function parameters, results of laboratory tests, and treatment parameters from ICU admission to discharge.

HiRID contains:

Demographic data
Measurements from bedside monitoring
Measurements and settings of medical devices such as mechanical ventilation
Observations by health care providers, e.g. GCS, RASS, urine and other fluid output
Lab values
Administered drugs, fluids and nutrition

HiRID has a higher time resolution than other published datasets, most notably for bedside monitoring, where most parameters are recorded every 2 minutes.

Anonymization procedure

To ensure the anonymization of individuals in the dataset, we followed the procedures successfully applied to the MIMIC-III and AmsterdamUMCdb datasets, which in turn adopted the Health Insurance Portability and Accountability Act (HIPAA) standards and, in the case of AmsterdamUMCdb, also the European Union's General Data Protection Regulation (GDPR) standards.

All eighteen identifying data elements listed in HIPAA were removed.
Dates were shifted by a random offset such that the admission date lies between 2100 and 2200. Seasonality, time of day and day of the week were preserved.
Patient age, height and weight were binned into bins of size 5. For patient age, the highest bin is 90 years and also contains all older patients.
Measurements and medications whose units changed over time were standardized to the latest unit used. This prevents admission times from being estimated from the units used for a specific patient.
Free text was removed from the database.
k-anonymization was applied to patient age, weight, height and sex.
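
The binning, date-shifting and k-anonymity steps above can be illustrated with a minimal Python sketch. It is not the code used to build HiRID, whose implementation is not described here. The function names, the choice to label each bin by its lower edge, the use of a whole-week offset and the example values are all assumptions made for illustration.

```python
import random
from collections import Counter
from datetime import datetime, timedelta


def bin_value(value, size=5, cap=None):
    """Map a value to the lower edge of its bin (assumed labelling).

    Values whose bin lies at or above `cap` all share the top bin `cap`.
    """
    lower = int(value // size) * size
    return min(lower, cap) if cap is not None else lower


def shift_timestamp(ts, years):
    """Shift `ts` by roughly `years` years using a whole number of weeks.

    A whole-week offset leaves the weekday and the time of day unchanged and
    moves the calendar date by only a few days, so the season is kept.
    """
    weeks = round(years * 365.2425 / 7)
    return ts + timedelta(weeks=weeks)


def is_k_anonymous(records, keys, k):
    """Return True if every combination of `keys` occurs in at least k records."""
    counts = Counter(tuple(record[key] for key in keys) for record in records)
    return all(count >= k for count in counts.values())


# Hypothetical example values, not taken from the dataset.
print(bin_value(97, cap=90))                 # age 97 falls into the top bin: 90
admission = datetime(2015, 3, 9, 14, 30)
shifted = shift_timestamp(admission, years=random.randint(90, 180))
print(admission.strftime("%A %H:%M"), "->", shifted.strftime("%A %H:%M, %Y"))
```

Shifting by whole weeks is one simple way to satisfy all three constraints at once. A check such as `is_k_anonymous` would be run on the binned age, weight, height and sex columns after binning.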

Data content

The data is published both as original source data and in two pre-processed states. The source data contains all raw variables, whereas the pre-processed data contains only a small subset of aggregated variables. Further information can be found here:

🗂Structure of the published data

Patient ID / ICU admission

The dataset treats each ICU admission as a separate entity; it is not possible to identify multiple ICU admissions as belonging to the same patient. A unique "Patient ID" is generated for each ICU (re-)admission.

APACHE Group

We store an APACHE II or IV group for most stays. This table can be used to look up the encoding.

APACHE Group mapping

Variables

Variables in raw data

We provide a list of all variables as CSV files included in the downloadable dataset, as well as a continuously updated Google spreadsheet:

Raw variable spreadsheet

In addition to the variables in these files, the dataset contains the age at admission and the sex of each patient in the general table.

Variable id	Name	Type	Source variable ids	Unit
vm1	Heart rate	double	200	/min
vm3	Systolic BP (invasive)	double	100	mmHg
vm4	Diastolic BP (invasive)	double	120	mmHg
vm5	MAP	double	110	mmHg
vm13	Cardiac output	double	1000	l/min
vm20	SpO2	double	4000, 8280	%
vm28	RASS	double	15001565
vm62	peak inspiratory pressure (ventilator)	double	3110	cmH2O
vm136	Lactate arterial	double	24000524	mmol/l
vm146	Lactate venous	double	24000732, 24000485	mmol/l
vm172	INR	double	24000567
vm174	Serum glucose	double	20005110, 24000523, 24000585	mmol/l
vm176	C-reactive protein	double	20002200	mg/l
pm41	Dobutamine	double	426	flow [mg/min]
pm42	Milrinone	double	1000441	flow [mg/min]
pm43	Levosimendan	double	1000606	flow [mg/min]
pm44	Theophyllin	double	1000706, 1000707, 1000698, 1000267	flow [mg/min]
pm87	Non-opiod analgesics	long	275, 1000471, 1000472, 1000473, 1000489, 1000490, 1000683, 1000900, 225, 1000605, 1000632, 1000858	binary indication of drug presence [yes/no]
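
As an illustration of how the table above might be used, the following Python sketch builds a reverse lookup from a source variable id to the variable id that lists it. It assumes that each row names the raw source variables feeding one variable and that the table is available as tab-separated text. The function name `build_source_lookup` and the embedded three-row excerpt are hypothetical; in practice the text would come from the published variable files.

```python
import csv
import io

# Three rows copied from the table above, tab-separated.
TABLE = (
    "Variable id\tName\tType\tSource variable ids\tUnit\n"
    "vm5\tMAP\tdouble\t110\tmmHg\n"
    "vm20\tSpO2\tdouble\t4000, 8280\t%\n"
    "vm146\tLactate venous\tdouble\t24000732, 24000485\tmmol/l\n"
)


def build_source_lookup(text):
    """Map each source variable id (int) to the variable id of the row listing it."""
    lookup = {}
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    for row in reader:
        # The source ids are a comma-separated list inside a single column.
        for source_id in row["Source variable ids"].split(","):
            lookup[int(source_id.strip())] = row["Variable id"]
    return lookup


lookup = build_source_lookup(TABLE)
print(lookup[8280])      # vm20
print(lookup[24000485])  # vm146
```

Stripping whitespace around each id keeps the lookup robust to inconsistent spacing after the commas.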

Data details