
Data details

Data overview

The HiRID database contains a large selection of all routinely collected data relating to patient admissions to the Department of Intensive Care Medicine (ICU) of the Bern University Hospital, Switzerland. The data was extracted from the ICU Patient Data Management System, which is used to prospectively register patient health information, measurements of organ function parameters, results of laboratory tests, and treatment parameters from ICU admission to discharge.

HiRID contains:

  • Demographic data
  • Measurements from bedside monitoring
  • Measurements and settings of medical devices such as mechanical ventilation
  • Observations by health care providers, e.g. GCS, RASS, urine and other fluid output
  • Lab values
  • Administered drugs, fluids and nutrition

HiRID has a higher time resolution than other published datasets, most importantly for bedside monitoring, where most parameters are recorded every 2 minutes.

Anonymization procedure

To ensure the anonymization of individuals in the data set, we followed the procedures successfully applied for the MIMIC-III and AmsterdamUMCdb datasets, which in turn adopted the Health Insurance Portability and Accountability Act (HIPAA) standards and, in the case of AmsterdamUMCdb, also the European Union's General Data Protection Regulation (GDPR) standards.

  • All eighteen identifying data elements listed in HIPAA were removed.
  • Dates were shifted by a random offset such that the admission date lies between 2100 and 2200. Seasonality, time of day and day of week were preserved.
  • Patient age, height and weight were binned into bins of size 5. For patient age, the maximum bin is 90 years and also contains all older patients.
  • Measurements and medications whose units changed over time were standardized to the most recent unit. This standardization was necessary so that admission times cannot be estimated from the units used for a specific patient.
  • Free text was removed from the database.
  • k-anonymization was applied to patient age, weight, height and sex.
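
Three of the steps above (binning, date shifting and k-anonymization) can be illustrated with a short Python sketch. This is not the code used to build HiRID. It assumes that a bin is labelled by its lower edge, that the date offset is a whole number of weeks close to a whole number of years (which keeps weekday and time of day exact and the season within a few days), and that k-anonymity is checked by counting identical combinations of the quasi-identifiers. All function names and the record layout are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def bin_value(value, size=5, cap=None):
    """Return the lower edge of the bin containing `value`.

    Assumption: bins are labelled by their lower edge. If `cap` is given,
    every value at or above it falls into the cap bin (e.g. age 90+).
    """
    binned = int(value // size) * size
    return min(binned, cap) if cap is not None else binned


def week_aligned_shift(years):
    """Return an offset of roughly `years` years, rounded to whole weeks.

    A multiple of 7 days keeps the weekday; no time component is added,
    so the time of day is unchanged as well.
    """
    return timedelta(days=round(years * 365.2425 / 7) * 7)


def smallest_group(records, keys):
    """Return the size of the smallest group of records that share the
    same values for all quasi-identifiers in `keys`.

    A table is k-anonymous with respect to `keys` if this value is >= k.
    """
    counts = Counter(tuple(record[key] for key in keys) for record in records)
    return min(counts.values())


# Hypothetical example values, not taken from the dataset.
admission = datetime(2015, 3, 10, 14, 30)
shifted = admission + week_aligned_shift(120)

patients = [
    {"age": bin_value(67, cap=90), "sex": "F"},
    {"age": bin_value(69, cap=90), "sex": "F"},
    {"age": bin_value(97, cap=90), "sex": "M"},
]
k = smallest_group(patients, ["age", "sex"])
```

In a real pipeline the number of years would be drawn at random per admission, and groups smaller than the chosen k would need to be generalized further or suppressed. Neither step is shown here.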

Data content

The data is published as original source data as well as in two pre-processed states. The source data contains all raw variables, whereas the pre-processed data contains only a small subset of aggregated variables. Further information can be found here:

🗂Structure of the published data

Patient ID / ICU admission

The dataset treats each ICU admission as a separate entity, so it is not possible to identify multiple ICU admissions as originating from the same patient. A unique "Patient ID" is generated for each ICU (re-)admission.

APACHE Group

We store an APACHE II or IV group for most stays. This table can be used to look up the encoding.

APACHE Group mapping

Variables

Variables in raw data

We provide a list of all variables as CSV files included in the downloadable dataset, as well as in a continuously updated Google spreadsheet:

Raw variable spreadsheet

In addition to the variables in these files, the dataset contains the age at admission and sex of each patient in the general table of the dataset.

Variables in pre-processed dataset

| Variable id | Name | Type | Source variable ids | Unit |
|---|---|---|---|---|
| vm1 | Heart rate | double | 200 | /min |
| vm3 | Systolic BP (invasive) | double | 100 | mmHg |
| vm4 | Diastolic BP (invasive) | double | 120 | mmHg |
| vm5 | MAP | double | 110 | mmHg |
| vm13 | Cardiac output | double | 1000 | l/min |
| vm20 | SpO2 | double | 4000, 8280 | % |
| vm28 | RASS | double | 15001565 | |
| vm62 | peak inspiratory pressure (ventilator) | double | 3110 | cmH2O |
| vm136 | Lactate arterial | double | 24000524 | mmol/l |
| vm146 | Lactate venous | double | 24000732, 24000485 | mmol/l |
| vm172 | INR | double | 24000567 | |
| vm174 | Serum glucose | double | 20005110, 24000523, 24000585 | mmol/l |
| vm176 | C-reactive protein | double | 20002200 | mg/l |
| pm41 | Dobutamine | double | 426 | flow [mg/min] |
| pm42 | Milrinone | double | 1000441 | flow [mg/min] |
| pm43 | Levosimendan | double | 1000606 | flow [mg/min] |
| pm44 | Theophyllin | double | 1000706, 1000707, 1000698, 1000267 | flow [mg/min] |
| pm87 | Non-opiod analgesics | long | 275, 1000471, 1000472, 1000473, 1000489, 1000490, 1000683, 1000900, 225, 1000605, 1000632, 1000858 | binary indication of drug presence [yes/no] |
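
Several pre-processed variables combine more than one source variable (for example, vm20 draws on source ids 4000 and 8280). The Python sketch below shows one way to relabel raw observations with their pre-processed variable id. The ids are copied from a few rows of the table. The record layout of `(source_id, value)` pairs and the names `SOURCE_IDS`, `TARGET_OF` and `relabel` are assumptions made for illustration. How the published pre-processing aggregates the merged values is not described here, so the sketch only groups them.

```python
# Source variable ids per pre-processed variable (subset of the table).
SOURCE_IDS = {
    "vm1": [200],
    "vm20": [4000, 8280],
    "vm146": [24000732, 24000485],
    "vm174": [20005110, 24000523, 24000585],
    "pm41": [426],
}

# Inverted lookup: source variable id -> pre-processed variable id.
TARGET_OF = {
    source_id: target
    for target, source_ids in SOURCE_IDS.items()
    for source_id in source_ids
}


def relabel(records):
    """Group (source_id, value) pairs by pre-processed variable id.

    Records whose source id is not in the mapping are skipped.
    """
    merged = {}
    for source_id, value in records:
        target = TARGET_OF.get(source_id)
        if target is not None:
            merged.setdefault(target, []).append(value)
    return merged


# Hypothetical observations: two SpO2 sources, one heart rate, one unmapped id.
example = relabel([(4000, 97.0), (8280, 96.0), (200, 80.0), (999, 1.0)])
```

Keeping the mapping as plain data makes it easy to extend to the remaining rows or to load it from the variable CSV files shipped with the dataset.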