<html>
<head>
<title>MIT-BIH Arrhythmia Database Directory (Introduction)</title>
</head>
<body bgcolor="#FFFFFF">
<a href="mitdbdir.htm"><h1 align=center>MIT-BIH Arrhythmia Database Directory
</h1></a>
<a name="intro"><h1>Introduction</h1></a>
<p>
This introduction describes how the database records were
obtained, and discusses the characteristics of the recorded signals.
Following these notes are annotated "full disclosure" plots of the entire
database. These can be useful for obtaining an overall
impression of the contents of individual records. Following the
"full disclosure" plots are sample ECG strips. These strips
were chosen to illustrate the salient features of each record.
Next are notes on the important features of each record.
These notes also include background information on the subjects,
including their medications.
At the end of the book are tables of rhythms and annotations, which
summarize the contents of the database.
These tables can be helpful in finding a record with a specific
set of characteristics.
<a name="selection"><h2>Selection criteria</h2></a>
<p>
The source of the ECGs included in the MIT-BIH Arrhythmia Database is a
set of over 4000 long-term Holter recordings that were obtained
by the Beth Israel Hospital Arrhythmia Laboratory between 1975 and 1979.
Approximately 60% of these recordings were obtained from inpatients.
The database contains 23 records
(numbered from 100 to 124 inclusive with some numbers missing)
chosen at random from this set, and 25 records
(numbered from 200 to 234 inclusive, again with some numbers missing)
selected from the same set to include a variety of rare but clinically
important phenomena that would not be well-represented
by a small random sample of Holter recordings.
Each of the 48 records is slightly over 30 minutes long.
<p>
The first group is intended to serve as a representative sample of the
variety of waveforms and artifact that an arrhythmia detector might
encounter in routine clinical use. A table of random numbers was used
to select tapes, and then to select half-hour segments of them.
Segments selected in this way were excluded only if neither of the two
ECG signals was of adequate quality for analysis by human experts.
<p>
Records in the second group were chosen to
include complex ventricular, junctional,
and supraventricular arrhythmias and conduction abnormalities. Several
of these records were selected because features of the rhythm, QRS
morphology variation, or signal quality may be expected to present
significant difficulty to arrhythmia detectors; these records have
gained considerable notoriety among database users.
<p>
The subjects were 25 men aged 32 to 89 years, and 22 women aged 23 to 89
years. (Records 201 and 202 came from the same male subject.)
<a name="leads"><h2>ECG lead configuration</h2></a>
<p>
In most records, the upper signal is a modified limb lead II (MLII),
obtained by placing the electrodes on the chest. The lower signal is
usually a modified lead V1 (occasionally V2 or V5, and in one instance V4);
as with the upper signal, the electrodes are placed on the chest.
This configuration is routinely used by the BIH Arrhythmia Laboratory.
Normal QRS complexes are usually prominent in the upper signal.
The lead axis for the lower signal, however, may be nearly orthogonal to the
mean cardiac electrical axis (i.e., normal beats are usually
biphasic and may be nearly isoelectric).
Thus normal beats are frequently difficult to discern in the lower signal,
although ectopic beats will often be more prominent (see, for example, record
106).
A notable exception is record 114, for which the signals were reversed.
Since this happens occasionally in clinical practice, arrhythmia detectors
should be equipped to deal with this situation.
In records 102 and 104, it was not possible to use modified lead II because of
surgical dressings on the patients;
modified lead V5 was used for the upper signal in these records.
<a name="analog"><h2>Analog recording and playback</h2></a>
<p>
The original analog recordings were made using nine Del Mar Avionics
model 445 two-channel recorders, designated <i>A</i> through <i>I</i>:
<table border>
<tr><th><i>Recorder</i></th><th><i>Records</i></th></tr>
<tr><td align=center><i>A</i></td><td>102, 107, 111, 115, 121</td></tr>
<tr><td align=center><i>B</i></td><td>212</td></tr>
<tr><td align=center><i>C</i></td><td>203</td></tr>
<tr><td align=center><i>D</i></td><td>118, 124, 217</td></tr>
<tr><td align=center><i>E</i></td>
<td>101, 103, 106, 108, 112, 117, 119, 122, 209, 219, 220, 223, 233</td></tr>
<tr><td align=center><i>F</i></td>
<td>104, 109, 123, 205, 207, 210, 215, 221</td></tr>
<tr><td align=center><i>G</i></td>
<td>100, 105, 114, 116, 213, 214, 222, 228</td></tr>
<tr><td align=center><i>H</i></td><td>113, 201, 202, 231</td></tr>
<tr><td align=center><i>I</i></td><td>200, 230, 232, 234</td></tr>
</table>
<br>
(It is not known which recorder was used for record 208.)
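<p>
For scripted lookups, the recorder table above can be transcribed into a small
mapping. The Python sketch below is illustrative only: the names
<tt>RECORDER_RECORDS</tt>, <tt>RECORD_TO_RECORDER</tt>, and
<tt>recorder_for</tt> are hypothetical and are not part of any software
distributed with the database. Record 208 is deliberately absent because its
recorder is not known.
<pre>
```python
# Recorder assignments transcribed from the table above.
RECORDER_RECORDS = {
    "A": [102, 107, 111, 115, 121],
    "B": [212],
    "C": [203],
    "D": [118, 124, 217],
    "E": [101, 103, 106, 108, 112, 117, 119, 122, 209, 219, 220, 223, 233],
    "F": [104, 109, 123, 205, 207, 210, 215, 221],
    "G": [100, 105, 114, 116, 213, 214, 222, 228],
    "H": [113, 201, 202, 231],
    "I": [200, 230, 232, 234],
}

# Invert the table: record number -> recorder letter.
RECORD_TO_RECORDER = {
    record: recorder
    for recorder, records in RECORDER_RECORDS.items()
    for record in records
}


def recorder_for(record):
    """Return the recorder letter for a record, or None if it is not listed."""
    return RECORD_TO_RECORDER.get(record)
```
</pre>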
<p>
During the digitization process, the analog recordings were played back
on a Del Mar Avionics model 660 unit. The analog tapes used for records
112, 115 through 124, 205, 220, 223, and 230 through 234 were played back
and digitized at twice real time; the rest were played back at real time
using a specially constructed capstan for the model 660 unit.
Skew between the two signals was found to be as great as 40
milliseconds for some of the analog recorders.
In addition to the fixed skew that results from extremely small differences
in the orientations of the tape heads on the recorder and the playback unit,
microscopic vertical wobbling of the tape, either during recording or playback,
introduces a variable skew, which may be comparable in magnitude to the fixed
skew.
This problem (which also
occurs in the AHA database) may present difficulties for certain two-channel
analysis methods designed for real-time applications.
<p>
Minor tape speed variations should not pose problems for typical arrhythmia
detectors. It is difficult to avoid tape sticking or slippage during low-speed
playback, and several episodes of tape slippage were noted and marked with
comment annotations. Wow and flutter should be considered carefully in
heart rate variability studies, since flutter compensation
was not possible in these recordings. A number of frequency-domain artifacts
have been identified and related to specific mechanical components of the
recorders and the playback unit:
<table border>
<tr><th><i>Frequency (Hz)</i></th><th><i>Source</i></th></tr>
<tr><td align=center>0.042</td><td>Recorder pressure wheel</td></tr>
<tr><td align=center>0.083</td><td>Playback unit capstan (for twice real-time playback)</td></tr>
<tr><td align=center>0.090</td><td>Recorder capstan</td></tr>
<tr><td align=center>0.167</td>
<td>Playback unit capstan (for real-time playback)</td></tr>
<tr><td align=center>0.18-0.10</td>
<td>Takeup reel (frequency decreases over time)</td></tr>
<tr><td align=center>0.20-0.36</td>
<td>Supply reel (frequency increases over time)</td></tr>
</table>
<br>
The most significant of these artifacts by far is the 0.167 Hz artifact on
recordings that were played back at real time. The next largest is the
0.090 Hz artifact; the 0.083 Hz artifact on recordings that were played back
at twice real time is of roughly the same magnitude as the 0.090 Hz artifact.
The 0.042 Hz artifact is of much lower magnitude. Other frequencies
related to the drive train (at 0.42 Hz, 1.96 Hz, 9.1 Hz, and
42 Hz) do not appear as noticeable artifacts.
The frequencies of the last two artifacts listed in the table depend on
how much tape is on the supply and takeup reels; the supply reel causes
a much more noticeable artifact than does the takeup reel. Other
frequency-domain artifacts generated by the supply reel appear in the
0.10-0.18 Hz and 0.30-0.54 Hz bands.
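<p>
Because the playback-capstan artifact depends on the playback speed, it can be
convenient to look up which frequency to expect for a given record. The
Python sketch below encodes the list of records digitized at twice real time
and the two capstan frequencies from the table above. The names
<tt>TWICE_REAL_TIME</tt>, <tt>playback_speed</tt>, and
<tt>capstan_artifact_hz</tt> are hypothetical, and the function does not check
that a record number actually exists in the database.
<pre>
```python
# Records played back and digitized at twice real time (from the text above).
# The ranges include numbers that are not in the database; that is harmless
# for a membership test.
TWICE_REAL_TIME = (
    {112, 205, 220, 223} | set(range(115, 125)) | set(range(230, 235))
)


def playback_speed(record):
    """Return 2 for records digitized at twice real time, otherwise 1."""
    return 2 if record in TWICE_REAL_TIME else 1


def capstan_artifact_hz(record):
    """Expected playback-capstan artifact frequency, relative to real time."""
    return 0.083 if playback_speed(record) == 2 else 0.167
```
</pre>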
<p>
Four of the 48 records (102, 104, 107, and 217) include paced beats.
The original analog recordings do not represent the pacemaker artifacts
with sufficient fidelity to permit them to be recognized by pulse amplitude
(or slew rate) and duration alone, the method commonly used for real-time
processing. The database records do, however, reproduce the analog recordings
with sufficient fidelity to permit use of pacemaker artifact detectors
designed for tape analysis.
<a name="digitization"><h2>Digitization</h2></a>
<p>
The analog outputs of the playback unit
were filtered to limit analog-to-digital converter (ADC)
saturation and for anti-aliasing,
using a passband from 0.1 to 100 Hz relative to real time, well
beyond the lowest and highest frequencies recoverable from the recordings.
The bandpass-filtered signals were
digitized at 360 Hz per signal relative to real time
using hardware constructed at the MIT Biomedical Engineering Center and
at the BIH Biomedical Engineering Laboratory.
The sampling frequency was chosen to facilitate implementations of 60 Hz
(mains frequency) digital notch filters in arrhythmia detectors.
Since the recorders were battery-powered, most of the 60 Hz noise present
in the database arose during playback.
In those records that were digitized at twice real time, this noise appears
at 30 Hz (and multiples of 30 Hz) relative to real time.
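<p>
At 360 Hz, one cycle of 60 Hz interference spans exactly six samples, and one
cycle of 30 Hz interference spans exactly twelve, which is what makes simple
digital notch filters easy to build. As a minimal sketch (not the filter used
by any particular arrhythmia detector), the Python code below averages each
sample with the one half a noise cycle earlier, so that the interference
cancels. The function names are hypothetical. This comb-style filter also
nulls odd multiples of the noise frequency and attenuates nearby signal
content, so it is meant only to illustrate the arithmetic.
<pre>
```python
import math

FS = 360  # sampling frequency in Hz, relative to real time


def half_cycle_delay(noise_hz, fs=FS):
    """Samples in half a cycle of the interference; must be a whole number."""
    delay, remainder = divmod(fs, 2 * noise_hz)
    if remainder:
        raise ValueError("half a noise cycle is not a whole number of samples")
    return int(delay)


def comb_notch(samples, noise_hz=60, fs=FS):
    """Average each sample with the one half a noise cycle earlier.

    Samples half a cycle apart carry equal and opposite interference, so
    their mean cancels it. The first `delay` outputs are copied unfiltered.
    """
    delay = half_cycle_delay(noise_hz, fs)
    out = list(samples[:delay])
    for n in range(delay, len(samples)):
        out.append((samples[n] + samples[n - delay]) / 2)
    return out


# Usage: a pure 60 Hz tone is suppressed once the filter has started up.
hum = [math.sin(2 * math.pi * 60 * n / FS) for n in range(36)]
residual = max(abs(v) for v in comb_notch(hum)[3:])
```
</pre>
For records digitized at twice real time, calling
<tt>comb_notch(samples, noise_hz=30)</tt> uses a six-sample delay instead.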
<p>
Samples were acquired from each signal almost simultaneously (the intersignal
sampling skew was on the order of a few microseconds). As noted above, analog
tape skew was several orders of magnitude larger. The ADCs were
unipolar, with 11-bit resolution over a ±5 mV range. Sample values thus
range from 0 to 2047 inclusive, with a value of 1024 corresponding to zero
volts.
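<p>
The conversion from sample values to physical units follows directly from
these figures: 2048 levels spread over 10 mV give a nominal 204.8 ADC units
per millivolt. The Python sketch below works under that assumption only; the
constant and function names are hypothetical, and in practice the gain and
baseline given in each record's header file should be preferred over these
nominal values.
<pre>
```python
ADC_BITS = 11
ADC_SPAN_MV = 10.0   # the +/-5 mV input range
ADC_ZERO = 1024      # sample value corresponding to zero volts

# Nominal gain derived from the stated range (an assumption, see above).
UNITS_PER_MV = 2 ** ADC_BITS / ADC_SPAN_MV


def sample_to_mv(value):
    """Convert an 11-bit unipolar sample value to millivolts."""
    if not 0 <= value <= 2 ** ADC_BITS - 1:
        raise ValueError("sample value outside the 11-bit range")
    return (value - ADC_ZERO) / UNITS_PER_MV
```
</pre>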
<p>
The 11-bit samples were originally recorded in 8-bit first difference format
(this was necessary because of limited mass storage capacity). Given the
sampling frequency and the resolution of the ADC, the difference encoding
implies a maximum recordable slew rate of ±225 mV/s. In practice, this
limit was exceeded by the input signals very infrequently, only during severe
noise on a small number of records. The effect on the quality of the recorded
signals is negligible. On this CD-ROM, the samples have been
reconstructed from the first differences and stored as pairs of 12-bit
amplitudes packed in triplets of consecutive bytes (for details on the storage
format, see <a href="/physiotools/wag/signal-5.htm">signal(5)</a>).
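<p>
The slew-rate limit and the two storage schemes can be illustrated with a
short Python sketch. It assumes that the largest 8-bit difference is 128 ADC
units (the exact coding of the original difference format is not stated here)
and uses the nominal 204.8 units per millivolt implied by the ADC range. The
byte layout in <tt>unpack_212</tt> reflects one reading of the format
described in signal(5) and should be checked against that page. All names are
hypothetical.
<pre>
```python
from itertools import accumulate

FS = 360                    # samples per second per signal
UNITS_PER_MV = 2048 / 10.0  # nominal gain from the +/-5 mV, 11-bit range
MAX_STEP = 128              # assumed largest 8-bit first difference


def max_slew_mv_per_s():
    """Largest recordable change per second under the assumptions above."""
    return MAX_STEP * FS / UNITS_PER_MV


def reconstruct(initial, differences):
    """Rebuild sample values from an initial value and first differences."""
    return list(accumulate(differences, initial=initial))


def sign_extend_12(value):
    """Interpret a 12-bit pattern as a two's-complement number."""
    return value - 4096 if value >= 2048 else value


def unpack_212(data):
    """Unpack byte triplets into pairs of 12-bit amplitudes."""
    if len(data) % 3:
        raise ValueError("expected a whole number of 3-byte triplets")
    samples = []
    for i in range(0, len(data), 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        first = b0 | ((b1 & 0x0F) << 8)   # low nibble of b1: high bits of first
        second = b2 | ((b1 & 0xF0) << 4)  # high nibble of b1: high bits of second
        samples.append(sign_extend_12(first))
        samples.append(sign_extend_12(second))
    return samples
```
</pre>
Under these assumptions, <tt>max_slew_mv_per_s()</tt> evaluates
128 × 360 / 204.8, which agrees with the ±225 mV/s figure quoted above.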
<a name="annotations"><h2>Annotations</h2></a>
<p>
An initial set of beat labels was produced by a simple slope-sensitive
QRS detector, which marked each detected event as a normal beat. Two
identical 150-foot chart recordings were printed for each 30-minute record,
with these initial beat labels in the margin.
For each record, the two charts were given to two cardiologists, who worked
on them independently. The cardiologists added beat labels
where the detector missed beats, deleted false detections as necessary,
and changed the labels for all abnormal beats. They also added rhythm
labels, signal quality labels, and comments.
<p>
The annotations were transcribed from the paper chart recordings.
Once both sets of cardiologists' annotations for a given record
had been transcribed and verified, they were automatically compared
beat-by-beat, and another chart recording was printed. This chart showed
the cardiologists' annotations in the margin, with all discrepancies
highlighted. Each discrepancy was reviewed and resolved by consensus.
The corrections were transcribed, and the annotations were then analyzed
by an auditing program, which checked them for consistency and
located the ten longest and shortest R-R intervals in each record (to
identify possible missing or falsely detected beats).
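<p>
The R-R interval check performed by the auditing program can be sketched in a
few lines of Python. This is not the original auditing program, only an
illustration of the idea: given beat label locations as sample numbers, it
lists the shortest and longest intervals so that a reviewer can inspect them.
The function name and the returned tuple layout are hypothetical.
<pre>
```python
import heapq

FS = 360  # samples per second


def rr_extremes(beat_samples, count=10, fs=FS):
    """Return the `count` shortest and longest R-R intervals.

    Each entry is (interval_in_seconds, sample_number_of_later_beat).
    Very long intervals may indicate a missed beat; very short ones may
    indicate a false detection.
    """
    intervals = [
        ((later - earlier) / fs, later)
        for earlier, later in zip(beat_samples, beat_samples[1:])
    ]
    shortest = heapq.nsmallest(count, intervals)
    longest = heapq.nlargest(count, intervals)
    return shortest, longest
```
</pre>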
<p>
In early copies of the database, most beat labels were placed
at the R-wave peak, but manually inserted labels were not always
located precisely at the peak.
In copies of the database made since 1983, the beat labels have been shifted
from their original locations.
The ECG (usually the upper signal) was digitally bandpass-filtered to
emphasize the QRS complexes, and each beat label was moved to the major local
extremum, after correction for phase shift in the filter.
A few noisy beats were manually realigned.
This process was applied in 1983 to all records except record 117, whose
beat labels were not realigned until March 1998.
The result is that annotations generally appear at the R-wave peak, and
are located with sufficient accuracy to make the reference annotation
files usable for studies requiring waveform averaging and for
heart rate variability studies (but note the comments
with respect to analog tape wow and flutter above).
In the annotated ECG plots produced by <tt>psfd</tt> and <tt>pschart</tt>,
and in printed copies of this directory, each label is placed so that the
fiducial mark for the annotation corresponds to the left edge of the label.
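<p>
The realignment step can be pictured as a search for the largest deflection
near each original label. The Python sketch below is a simplified
illustration, not the procedure actually used: it assumes its input has
already been bandpass-filtered and corrected for filter phase shift, and the
search half-window of 18 samples (50 ms at 360 Hz) is a hypothetical choice.
<pre>
```python
def realign_label(filtered, label, half_window=18):
    """Move a beat label to the largest-magnitude sample near it.

    `filtered` is a sequence of bandpass-filtered samples and `label` is
    the original label position (a sample number within the sequence).
    """
    start = max(0, label - half_window)
    stop = min(len(filtered), label + half_window + 1)
    if start >= stop:
        raise ValueError("label lies outside the signal")
    return max(range(start, stop), key=lambda i: abs(filtered[i]))
```
</pre>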
<p>
The database contains approximately 109,000 beat labels. Sixteen
were corrected in the first seven years after the database was released in 1980
(in records 104, 108, 114, 203, 207, 217, and 222); in addition,
all of the left bundle branch block beats in record 214 were originally
labelled as normal beats. The rhythm labels
have been more substantially revised and now include notations for paced
rhythm, bigeminy, and trigeminy, which were missing in early copies.
<p>
In October 1998, a rhythm label in record 203 was corrected. In
October 2001, a seventeenth error in the beat labels was discovered
and corrected (in record 209). In April 2003, 26 PVC annotations in
record 119 were manually realigned by small amounts (up to 74 ms). In
May 2003, an eighteenth error in the beat labels was discovered and
corrected (in record 214). In April 2005, many of the episodes
previously labelled as atrial fibrillation in record 222 were
partially or completely relabelled as atrial flutter. In April 2008,
three beat labels were corrected (two in record 108, and one in record
215). In June 2010, the 22nd and 23rd errors in the beat labels were
found and corrected (both in record 203). Thanks to Bob Bruce, Pat
Hamilton, Yin Dengfeng, Roger Mark, Sebastian Vasquez, and Mariano
Llamedo Soria for finding and reporting these errors.
<hr>
<a name="symbols"><h1>Symbols used in plots</h1></a>
<p>
[An expanded and updated version of the table below can be found at
<a href="/physiobank/annotations.shtml">
<tt>http://www.physionet.org/physiobank/annotations.shtml</tt></a>.]
<p>
<table border>
<tr><th><i>Symbol</i></th><th><i>Meaning</i></th></tr>
<tr><td><b>·</b> <i>or</i> N</td><td>Normal beat</td></tr>
<tr><td>L</td><td>Left bundle branch block beat</td></tr>
<tr><td>R</td><td>Right bundle branch block beat</td></tr>
<tr><td>A</td><td>Atrial premature beat</td></tr>
<tr><td>a</td><td>Aberrated atrial premature beat</td></tr>
<tr><td>J</td><td>Nodal (junctional) premature beat</td></tr>
<tr><td>S</td><td>Supraventricular premature beat</td></tr>
<tr><td>V</td><td>Premature ventricular contraction</td></tr>
<tr><td>F</td><td>Fusion of ventricular and normal beat</td></tr>
<tr><td>[</td><td>Start of ventricular flutter/fibrillation</td></tr>
<tr><td>!</td><td>Ventricular flutter wave</td></tr>
<tr><td>]</td><td>End of ventricular flutter/fibrillation</td></tr>
<tr><td>e</td><td>Atrial escape beat</td></tr>
<tr><td>j</td><td>Nodal (junctional) escape beat</td></tr>
<tr><td>E</td><td>Ventricular escape beat</td></tr>
<tr><td>/</td><td>Paced beat</td></tr>
<tr><td>f</td><td>Fusion of paced and normal beat</td></tr>
<tr><td>x</td><td>Non-conducted P-wave (blocked APB)</td></tr>
<tr><td>Q</td><td>Unclassifiable beat</td></tr>
<tr><td>|</td><td>Isolated QRS-like artifact</td></tr>
<tr><td colspan=2 align=center>Rhythm annotations appear <i>below</i> the
level used for beat annotations:</td></tr>
<tr><td>(AB</td><td>Atrial bigeminy</td></tr>
<tr><td>(AFIB</td><td>Atrial fibrillation</td></tr>
<tr><td>(AFL</td><td>Atrial flutter</td></tr>
<tr><td>(B</td><td>Ventricular bigeminy</td></tr>
<tr><td>(BII</td><td>2° heart block</td></tr>
<tr><td>(IVR</td><td>Idioventricular rhythm</td></tr>
<tr><td>(N</td><td>Normal sinus rhythm</td></tr>
<tr><td>(NOD</td><td>Nodal (A-V junctional) rhythm</td></tr>
<tr><td>(P</td><td>Paced rhythm</td></tr>
<tr><td>(PREX</td><td>Pre-excitation (WPW)</td></tr>
<tr><td>(SBR</td><td>Sinus bradycardia</td></tr>
<tr><td>(SVTA</td><td>Supraventricular tachyarrhythmia</td></tr>
<tr><td>(T</td><td>Ventricular trigeminy</td></tr>
<tr><td>(VFL</td><td>Ventricular flutter</td></tr>
<tr><td>(VT</td><td>Ventricular tachycardia</td></tr>
<tr><td colspan=2 align=center>Signal quality and comment annotations appear <i>above</i>
the level used for beat annotations:</td></tr>
<tr><td><i>qq</i></td>
<td>
Signal quality change: the first character ('c' or 'n') indicates the quality
of the upper signal (clean or noisy), and the second character indicates the
quality of the lower signal</td></tr>
<tr><td>U</td><td>Extreme noise or signal loss in both signals: ECG is unreadable</td></tr>
<tr><td>M (<i>or</i> MISSB)</td><td>Missed beat</td></tr>
<tr><td>P (<i>or</i> PSE)</td><td>Pause</td></tr>
<tr><td>T (<i>or</i> TS)</td><td>Tape slippage</td></tr>
</table>
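<p>
As an illustration of how the two-character signal quality codes in the table
above are read, the following Python sketch splits a code into the quality of
the upper and lower signals. The function name and the returned dictionary
keys are hypothetical.
<pre>
```python
QUALITY = {"c": "clean", "n": "noisy"}


def decode_quality(code):
    """Decode a two-character signal quality code such as 'cn'."""
    if len(code) != 2 or any(ch not in QUALITY for ch in code):
        raise ValueError("expected two characters, each 'c' or 'n'")
    upper, lower = code
    return {"upper": QUALITY[upper], "lower": QUALITY[lower]}
```
</pre>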
<HR>
<P><ADDRESS>
<I><A HREF="mailto:george@mit.edu">George B. Moody (<tt>george@mit.edu</tt>)</A></I></ADDRESS><BR>
24 May 1997
<br>
<i>Revised 24 June 2010</i>
</body>
</html>