
Data details

Data overview

The HiRID database contains a large selection of the routinely collected data relating to patient admissions to the Department of Intensive Care Medicine (ICU) of the Bern University Hospital, Switzerland. The data were extracted from the ICU Patient Data Management System, which is used to prospectively register patient health information, measurements of organ function parameters, results of laboratory tests, and treatment parameters from ICU admission to discharge.

HiRID contains:

  • Demographic data
  • Measurements from bedside monitoring
  • Measurements and settings of medical devices such as mechanical ventilation
  • Observations by health care providers, e.g. GCS, RASS, urine and other fluid output
  • Lab values
  • Administered drugs, fluids and nutrition

HiRID has a higher time resolution than other published datasets, most importantly for bedside monitoring, where most parameters are recorded every 2 minutes.

Anonymization procedure

To ensure the anonymization of individuals in the dataset, we followed the procedures successfully applied to the MIMIC-III and AmsterdamUMCdb datasets, which in turn adopted the Health Insurance Portability and Accountability Act (HIPAA) standards and, in the case of AmsterdamUMCdb, also the European Union's General Data Protection Regulation (GDPR) standards.

  • All eighteen identifying data elements listed in HIPAA were removed.
  • Dates were shifted by a random offset such that the admission date lies between 2100 and 2200. Seasonality, time of day and day of the week were preserved.
  • Patient age, height and weight were binned into bins of size 5. For patient age, the highest bin is 90 years, which also contains all older patients.
  • Measurements and medications whose units changed over time were standardized to the latest unit used. This prevents admission times from being estimated from the units recorded for a specific patient.
  • Free text was removed from the database.
  • k-anonymization was applied to patient age, weight, height and sex.
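
The following Python sketch illustrates three of the steps listed above: date shifting, binning and a k-anonymity check. It is not the code used to produce HiRID. The function names, the whole-week offset used to keep the day of the week, the labelling of each bin by its lower edge, and the choice of quasi-identifier tuples are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Optional


def shift_date(ts: datetime, years: float) -> datetime:
    """Shift a timestamp by roughly `years` years using a whole number of weeks.

    Assumption: a whole-week offset is one simple way to keep the time of day
    and the day of the week exactly, and the season to within a few days.
    """
    weeks = round(years * 365.2425 / 7)
    return ts + timedelta(weeks=weeks)


def bin_value(value: float, size: int = 5, cap: Optional[int] = None) -> int:
    """Return the lower edge of the bin containing `value`.

    Assumption: bins are labelled by their lower edge. If `cap` is given,
    every value at or above it falls into the top bin.
    """
    binned = int(value // size) * size
    if cap is not None:
        binned = min(binned, cap)
    return binned


def is_k_anonymous(records, k: int) -> bool:
    """Check that every combination of quasi-identifiers occurs at least k times.

    `records` is an iterable of hashable tuples, e.g. (age, weight, height, sex).
    This only verifies k-anonymity; it does not enforce it.
    """
    counts = Counter(records)
    return all(n >= k for n in counts.values())


if __name__ == "__main__":
    admission = datetime(2015, 3, 4, 13, 30)  # made-up example timestamp
    shifted = shift_date(admission, years=120)
    print(shifted.weekday() == admission.weekday())  # True
    print(bin_value(67))           # 65
    print(bin_value(93, cap=90))   # 90
```

In practice the offset would be drawn at random per admission. A check such as `is_k_anonymous` would be run on the binned age, weight, height and sex values.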

Data content

The data is published as original source data as well as in two pre-processed states. The source data contains all raw variables, whereas the pre-processed data contains only a small subset of aggregated variables. Further information can be found here:

Structure of the published data

Patient ID / ICU admission

The dataset treats each ICU admission as a separate entity, and it is not possible to identify multiple ICU admissions as belonging to the same patient. A unique "Patient ID" is generated for each ICU (re-)admission.

APACHE Group

We store an APACHE II or IV group for most stays. This table can be used to look up the encoding.

Variables

Variables in raw data

We provide a list of all variables as CSV files included in the downloadable dataset, as well as a continuously updated Google spreadsheet:

In addition to the variables in these files, the dataset contains the age at admission and the sex of each patient in the general table of the dataset.

Variables in pre-processed dataset

| Variable id | Name | Type | Source variable ids | Unit |
|---|---|---|---|---|
| vm1 | Heart rate | double | 200 | /min |
| vm3 | Systolic BP (invasive) | double | 100 | mmHg |
| vm4 | Diastolic BP (invasive) | double | 120 | mmHg |
| vm5 | MAP | double | 110 | mmHg |
| vm13 | Cardiac output | double | 1000 | l/min |
| vm20 | SpO2 | double | 4000, 8280 | % |
| vm28 | RASS | double | 15001565 | |
| vm62 | peak inspiratory pressure (ventilator) | double | 3110 | cmH2O |
| vm136 | Lactate arterial | double | 24000524 | mmol/l |
| vm146 | Lactate venous | double | 24000732, 24000485 | mmol/l |
| vm172 | INR | double | 24000567 | |
| vm174 | Serum glucose | double | 20005110, 24000523, 24000585 | mmol/l |
| vm176 | C-reactive protein | double | 20002200 | mg/l |
| pm41 | Dobutamine | double | 426 | flow [mg/min] |
| pm42 | Milrinone | double | 1000441 | flow [mg/min] |
| pm43 | Levosimendan | double | 1000606 | flow [mg/min] |
| pm44 | Theophyllin | double | 1000706, 1000707, 1000698, 1000267 | flow [mg/min] |
| pm87 | Non-opiod analgesics | long | 275, 1000471, 1000472, 1000473, 1000489, 1000490, 1000683, 1000900, 225, 1000605, 1000632, 1000858 | binary indication of drug presence [yes/no] |
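
As the variable list above shows, one pre-processed variable can combine several source variables. The Python sketch below shows one way to map raw source variable ids to their pre-processed variable id. The ids are copied from a few rows of the list. The record layout (pairs of variable id and value) and the function name are hypothetical, and the aggregation actually used in the pre-processing is not described here.

```python
# Source variable ids for a few pre-processed variables (copied from the list above).
SOURCE_IDS = {
    "vm1": [200],
    "vm20": [4000, 8280],
    "vm146": [24000732, 24000485],
    "vm174": [20005110, 24000523, 24000585],
    "pm44": [1000706, 1000707, 1000698, 1000267],
}

# Reverse lookup: raw source variable id -> pre-processed variable id.
MERGED_ID = {src: merged for merged, srcs in SOURCE_IDS.items() for src in srcs}


def group_by_merged_id(raw_records):
    """Group raw values under their pre-processed variable id.

    `raw_records` is an iterable of (source_variable_id, value) pairs; this
    layout is an assumption for illustration. Records whose source id is not
    in the lookup are skipped.
    """
    grouped = {}
    for variable_id, value in raw_records:
        merged = MERGED_ID.get(variable_id)
        if merged is None:
            continue
        grouped.setdefault(merged, []).append(value)
    return grouped


if __name__ == "__main__":
    raw = [(4000, 97.0), (8280, 96.0), (200, 72.0), (999, 1.0)]
    print(group_by_merged_id(raw))
    # {'vm20': [97.0, 96.0], 'vm1': [72.0]}
```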