ECG Classification with Machine Learning: A Practical Overview
Heartbeat data is scarce, imbalanced, and noisy — exactly where deep models overfit. A tour of the feature pipeline, the class-imbalance trap, and kernel-based fixes.
An electrocardiogram is a deceptively simple signal — a voltage wiggling over time — that carries some of the highest-stakes information in medicine. Automating its reading is a natural target for machine learning, and the literature is enormous. It is also a field where the obvious approach, "throw a deep network at it," quietly underperforms. ECG is a scarce-data, imbalanced, noisy problem, and those three words explain almost everything about what works.
From waveform to features
A raw ECG is first cleaned and segmented into individual heartbeats, each aligned around the R peak. From there, two broad strategies:
- Hand-crafted features — RR intervals, QRS width, wave amplitudes, morphology descriptors. Compact, interpretable, and friendly to classical models.
- Learned features — let a 1D CNN or RNN read the waveform directly. Powerful when you have the data, prone to overfitting when you don't.
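The segmentation and hand-crafted route can be sketched in a few lines. This is a minimal illustration on a synthetic trace, not a production pipeline: the function names, the threshold detector, and the window sizes are all assumptions made for the example, and a real system would use an established QRS detector rather than a bare threshold.

```python
from statistics import mean


def find_r_peaks(signal, threshold, refractory):
    """Naive R-peak detector (illustrative only): local maxima above
    `threshold` that are at least `refractory` samples apart."""
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] > threshold:
            if not peaks or i - peaks[-1] >= refractory:
                peaks.append(i)
    return peaks


def segment_beats(signal, peaks, before, after):
    """Cut a fixed window around each R peak; skip peaks too close to the edges."""
    return [
        signal[p - before : p + after + 1]
        for p in peaks
        if p - before >= 0 and p + after < len(signal)
    ]


def rr_features(peaks, fs):
    """RR intervals in seconds and the mean heart rate in beats per minute."""
    rr = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    return {"rr_intervals_s": rr, "mean_hr_bpm": 60.0 / mean(rr)}


# Synthetic trace (assumed): flat baseline with a spike every 200 samples at 250 Hz.
fs = 250
signal = [0.0] * 1000
for p in (100, 300, 500, 700, 900):
    signal[p] = 1.0

peaks = find_r_peaks(signal, threshold=0.5, refractory=50)
beats = segment_beats(signal, peaks, before=50, after=75)
features = rr_features(peaks, fs)
```

Every beat comes out with the same length, which is what lets a classical model or a kernel treat beats as fixed-size vectors.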
The trap that ruins benchmarks: class imbalance
Most heartbeats are normal. The arrhythmias you actually care about, the dangerous ones, are rare: an individual class can make up well under 1% of beats. On a test set that is 99% normal, a model that predicts "normal" for everything scores 99% accuracy and detects no arrhythmias at all. This is one of the most common ways ECG papers fool themselves.
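The arithmetic is worth seeing once. The sketch below uses a made-up test set of 990 normal and 10 arrhythmic beats (the counts and label names are assumptions for illustration) and scores a model that always answers "normal".

```python
def accuracy(y_true, y_pred):
    """Fraction of beats labelled correctly, regardless of class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive):
    """Fraction of true `positive` beats that the model actually flagged."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    hits = sum(p == positive for p in predictions_for_positives)
    return hits / len(predictions_for_positives)


# Hypothetical test set: 990 normal beats, 10 arrhythmic ones.
y_true = ["normal"] * 990 + ["arrhythmia"] * 10
y_pred = ["normal"] * 1000  # the "always normal" model

acc = accuracy(y_true, y_pred)              # 990 of 1000 correct
rec = recall(y_true, y_pred, "arrhythmia")  # 0 of 10 arrhythmias caught
```

Accuracy comes out at 0.99 and arrhythmia recall at 0.0: the headline number looks excellent while the model misses every event it was built to find.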
The fixes are well known and non-negotiable:
- Report the right metrics — precision, recall, F1, and per-class scores. Never headline accuracy on imbalanced data.
- Resample or reweight — oversample minority classes (SMOTE and variants), or penalize their errors more heavily in the loss.
- Frame rare events as anomaly detection — model "normal" thoroughly and flag deviations, rather than trying to learn a class you have almost no examples of.
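The first two fixes are easy to sketch. The code below computes per-class precision, recall, and F1 without any averaging, and derives inverse-frequency class weights using the common "balanced" heuristic, n_samples / (n_classes * class_count). The labels "N" and "V" and all counts are invented for the example.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Precision, recall and F1 for every class, with no averaging."""
    report = {}
    for cls in sorted(set(y_true)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        report[cls] = {"precision": precision, "recall": recall, "f1": f1}
    return report


def balanced_class_weights(y):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    return {cls: len(y) / (len(counts) * n) for cls, n in counts.items()}


# Hypothetical labels: "N" = normal beat, "V" = a rare arrhythmia class.
y_true = ["N"] * 95 + ["V"] * 5
y_pred = ["N"] * 93 + ["V"] * 2 + ["V"] * 3 + ["N"] * 2

report = per_class_report(y_true, y_pred)
weights = balanced_class_weights(y_true)
```

Here the rare class gets a weight of 10.0 against roughly 0.53 for the normal class, so each minority error costs about nineteen times more in a weighted loss. The per-class report shows precision and recall of 0.6 on "V", a figure that an overall accuracy of 96% would hide.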
Why scarce data favours the right kernel
This is where the instinct to reach for the biggest network backfires. With limited labelled beats, a deep model tends to memorize; a kernel-based SVM with a well-chosen similarity measure tends to generalize. The whole problem reduces to one question: what makes two heartbeats "similar"?
Get that ruler right and a convex, data-efficient classifier can outperform a data-hungry deep net. Our ECG work pushes exactly here: a quantum-inspired Angle–distance kernel that encodes each beat as a fixed-length complex vector and measures similarity by angular alignment and projection gap together, improving both the accuracy and the stability of SVM-based classification and anomaly detection.
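To make the idea of a kernel as a "ruler" concrete, here is a toy similarity on complex vectors that multiplies an angular term by a distance term. It is emphatically not the formula from the paper, which this overview does not spell out: the function name, the squared-overlap angular term, the Gaussian distance term standing in for the projection gap, and the `gamma` parameter are all assumptions chosen for illustration. The beat encoding is not described here either, so the vectors are arbitrary placeholders.

```python
import math


def angle_distance_similarity(x, y, gamma=1.0):
    """Illustrative similarity for complex vectors. NOT the paper's formula.

    angular term : |<x, y>|^2 / (|x|^2 |y|^2), in [0, 1], equal to 1 when aligned
    distance term: exp(-gamma * |x - y|^2), a Gaussian on Euclidean distance
    """
    inner = sum(a * b.conjugate() for a, b in zip(x, y))
    norm_x_sq = sum(abs(a) ** 2 for a in x)
    norm_y_sq = sum(abs(b) ** 2 for b in y)
    dist_sq = sum(abs(a - b) ** 2 for a, b in zip(x, y))
    angular = abs(inner) ** 2 / (norm_x_sq * norm_y_sq)
    return angular * math.exp(-gamma * dist_sq)


def gram_matrix(vectors, gamma=1.0):
    """Pairwise similarity matrix over a list of encoded beats."""
    return [
        [angle_distance_similarity(u, v, gamma) for v in vectors]
        for u in vectors
    ]


# Placeholder "encoded beats" (assumed): three length-2 complex vectors.
encoded = [
    [1 + 0j, 0 + 1j],
    [1 + 0j, 0 + 1j],  # identical to the first
    [1 + 0j, 0 - 1j],  # orthogonal to the first
]
K = gram_matrix(encoded)
```

Identical vectors score 1.0 and orthogonal ones score 0.0, with everything else in between. Both factors are positive semidefinite kernels, and so is their product, which means a matrix like `K` could in principle be handed to an SVM implementation that accepts a precomputed kernel.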
A practical checklist
Before you trust an ECG model
- Was the split done at patient level, so that no patient appears in both train and test?
- Are minority arrhythmia classes reported separately, not hidden in an average?
- Is the metric precision/recall/F1 — not accuracy?
- Were data-dependent preprocessing steps, such as learned denoising or normalization, fit only on training data to avoid leakage?
- Does performance hold on a different dataset or lead configuration?
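The first item on the list is the one most often skipped, and it takes only a few lines to get right. The sketch below splits by patient rather than by beat. The function name, the patient IDs, and the 20% test fraction are assumptions for illustration; libraries such as scikit-learn offer group-aware splitters that do the same job.

```python
import random


def patient_level_split(patient_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that each patient lands on one side only."""
    patients = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_test = max(1, round(len(patients) * test_fraction))
    held_out = set(patients[:n_test])
    train_idx = [i for i, p in enumerate(patient_ids) if p not in held_out]
    test_idx = [i for i, p in enumerate(patient_ids) if p in held_out]
    return train_idx, test_idx


# Hypothetical dataset: one patient ID per beat.
patient_ids = ["p01"] * 4 + ["p02"] * 3 + ["p03"] * 5 + ["p04"] * 2 + ["p05"] * 6
train_idx, test_idx = patient_level_split(patient_ids, test_fraction=0.2)

train_patients = {patient_ids[i] for i in train_idx}
test_patients = {patient_ids[i] for i in test_idx}
overlap = train_patients & test_patients  # must be empty
```

Shuffling beats directly would scatter each patient's beats across both sides, and the model would be rewarded for recognising individuals rather than arrhythmias.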
The takeaway
ECG classification rewards humility about data more than enthusiasm about architecture. Respect the imbalance, split by patient, measure the right thing — and on scarce biomedical signals, a carefully designed similarity measure often beats the deepest network in the room.