We are proud to announce that the University of Arizona recently completed an independent validation of our sleep staging analytics, which was published in the Journal of Clinical Sleep Medicine. In this study, participants’ sleep was studied using polysomnography (PSG), the gold standard of sleep tracking, while also wearing WHOOP. The study showed excellent agreement between WHOOP and PSG, demonstrating WHOOP to be a reliable, non-invasive wearable for sleep tracking.
According to Dr. Sairam Parthasarathy, MD, who is a professor of medicine at the University of Arizona College of Medicine-Tucson and director of the Center for Sleep and Circadian Sciences, “The accuracy of WHOOP as a wearable and its availability compared to the limited accessibility of polysomnography may in the future facilitate better population-health management.”
Below we dive into how WHOOP was able to pull off this impressive feat, what PSG consists of, and the study’s key findings and results in greater detail.
Tracking and analyzing sleep is central to the WHOOP membership experience. Our heavy focus on sleep is motivated by decades of research and irrefutable evidence that sleep is essential for recovery and performance. Getting enough sleep is harder now than ever before: Gallup poll data shows that Americans today average over an hour less sleep per night than they did 70 years ago.
The current epidemic of sleep deprivation not only makes us sleepier, but increases our risk of cardiovascular disease, cancer and obesity, and shortens our life expectancy. WHOOP members are, on average, doing far better than the national trend, but still benefit from improving their sleep. Below we break down from the beginning how we built the WHOOP Sleep Analytics platform.
Your WHOOP strap was designed from the ground up to provide the most accurate sleep tracking possible: we collect hundreds of data points per second from our 3-axis accelerometer, 3-axis gyroscope, and PPG heart rate sensor. WHOOP also measures capacitive touch and temperature but does not use data from those sensors in its sleep algorithm.
PPG, or photoplethysmography, is a technique for measuring blood flow by assessing superficial changes in blood volume. If you’ve ever wondered what the tiny green lights on the bottom of your WHOOP are, they are the very important first part of PPG. Between the two green lights sits a small photoreceptor that measures light. When specific colors (wavelengths) of light are shined onto the skin, blood volume can be measured from the light reflected back, since blood absorbs some wavelengths and reflects others.
Once blood flow is measured, we can then derive heart rate, heart rate variability, and respiratory rate, all of which are used in our sleep detection and staging algorithms. In the recently published sleep validation study, WHOOP heart rate during sleep was shown to have excellent agreement with EKG, the gold standard, with an average precision error of one beat per minute across 32 participants. The study similarly found excellent agreement between our respiratory rate and gold-standard measurements, with an average precision error of one breath per minute. Having highly accurate heart rate and respiratory rate is essential to accurately staging sleep.
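To make the idea concrete, here is a minimal sketch of how a heart rate could be derived from a reflected-light PPG waveform. This is not the WHOOP pipeline, whose signal processing is not public; the function name, threshold, and synthetic signal are all illustrative. The principle is simply that each heartbeat briefly changes blood volume, so beats appear as peaks in the waveform, and the spacing between peaks gives beats per minute.

```python
import numpy as np

def heart_rate_from_ppg(signal, fs):
    """Estimate heart rate (bpm) from a PPG waveform sampled at fs Hz.

    Illustrative only: real PPG processing must handle motion artifact,
    baseline wander, and noise far more carefully than this.
    """
    signal = signal - np.mean(signal)            # remove the DC baseline
    thresh = 0.5 * np.max(np.abs(signal))        # simple amplitude gate
    # A sample is a beat if it exceeds both neighbors and the threshold.
    peaks = [i for i in range(1, len(signal) - 1)
             if signal[i] > signal[i - 1]
             and signal[i] > signal[i + 1]
             and signal[i] > thresh]
    if len(peaks) < 2:
        return None
    intervals = np.diff(peaks) / fs              # seconds between beats
    return 60.0 / np.mean(intervals)             # beats per minute

# Synthetic pulse train: a 1 Hz sine wave corresponds to 60 bpm.
fs = 25
t = np.arange(0, 30, 1 / fs)
ppg = np.sin(2 * np.pi * 1.0 * t)
print(round(heart_rate_from_ppg(ppg, fs)))       # → 60
```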
In order to turn our accelerometer, heart rate, heart rate variability, and respiratory rate into all the sleep analysis our members receive each day, we partnered with a local sleep center and had hundreds of subjects undergo in-laboratory polysomnography (PSG) testing while wearing WHOOP.
In a PSG sleep study, subjects undergo simultaneous electrocardiogram (EKG), electrooculogram (EOG), electroencephalogram (EEG), and electromyogram (EMG) recordings. Trained technicians then manually interpret these results, sorting each 30-second chunk of data – called an epoch – into one of four sleep stages: Wake, Light, REM, and Slow Wave. While PSG is the gold standard, and therefore the most accurate known way to determine sleep stages, it is also expensive, cumbersome, and intrusive.
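As a sketch of the bookkeeping this implies, a night of recording decomposes into 30-second epochs, each of which eventually receives one of the four stage labels. The container and function names below are illustrative, not WHOOP’s or any lab’s actual data format:

```python
from dataclasses import dataclass
from typing import Optional

STAGES = ("Wake", "Light", "REM", "Slow Wave")   # the four stages scored

@dataclass
class Epoch:
    start_s: int                 # offset from recording start, in seconds
    stage: Optional[str] = None  # filled in by a technician (or a model)

def split_into_epochs(total_seconds, epoch_len=30):
    """Return the unlabeled 30-second epochs covering a recording."""
    return [Epoch(start) for start in range(0, total_seconds, epoch_len)]

night = split_into_epochs(8 * 3600)   # an 8-hour recording
print(len(night))                     # → 960 epochs to score
```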
In our recent study with the University of Arizona, we show that WHOOP can closely approximate PSG-derived sleep staging, enabling our users to access this powerful data with virtually no friction. We achieved this by training machine learning algorithms to reproduce the sleep stages manually assigned by the polysomnography technicians, creating the automated sleep-staging experience provided today.
Having subjects wear WHOOP while undergoing PSG testing is critical to enabling the high level of accuracy we’ve attained because we are able to teach the model to recognize exactly what WHOOP data looks like during each of the four sleep stages we detect.
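A minimal illustration of this supervised-learning setup, using scikit-learn’s random forest as a stand-in classifier. The actual WHOOP model, its features, and its training data are not public, so everything here – the feature names, the random data, the choice of model – is an assumption made for the sketch:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# One row per 30-second epoch of wearable data. Hypothetical features:
# [mean heart rate, heart rate variability, respiratory rate, movement]
X_train = rng.normal(size=(1000, 4))

# Labels come from the PSG technicians who scored the same epochs:
# 0 = Wake, 1 = Light, 2 = REM, 3 = Slow Wave.
y_train = rng.integers(0, 4, size=1000)

# Fit a classifier to reproduce the technicians' stage assignments.
model = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)

# At inference time, each new epoch of sensor features gets a stage.
predicted = model.predict(rng.normal(size=(10, 4)))
print(predicted.shape)    # one predicted stage per epoch
```

With real (rather than random) data, the model learns which combinations of sensor features co-occur with each technician-assigned stage, which is exactly why collecting WHOOP data and PSG labels simultaneously matters.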
Some of the differences in your data between sleep stages are pretty easy to spot, while others can be more subtle. For example, below is about 80 minutes of respiratory rate data from one of my recent sleeps. The light blue highlighted segment shows slow wave sleep while the teal segment shows REM sleep. The unhighlighted portions are light sleep. Notice that the respiratory rate during slow wave sleep is fairly constant while during REM sleep it is increased and more variable.
The image above shows just one of many physiological differences among sleep stages that can be detected by WHOOP. In reality, our sleep staging algorithm brings together lots of physiological variables – called “features” in the algorithm development world – but not all the features are as pronounced to the naked eye.
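The respiratory-rate pattern just described can be turned into simple numeric features by summarizing each segment’s mean and variability. The values below are made up to mimic that pattern – steady breathing in slow wave sleep, faster and more variable breathing in REM – and are not real WHOOP data:

```python
import statistics

# Hypothetical per-minute respiratory rates (breaths/min) per segment.
slow_wave = [14.1, 14.0, 14.2, 14.1, 14.0, 14.1]
rem       = [15.8, 16.9, 15.2, 17.4, 16.1, 15.5]

def summarize(rates):
    """Return (mean, standard deviation) – two candidate staging features."""
    return statistics.mean(rates), statistics.stdev(rates)

for name, segment in [("slow wave", slow_wave), ("REM", rem)]:
    mean, sd = summarize(segment)
    print(f"{name}: mean={mean:.1f} breaths/min, sd={sd:.2f}")
```

Even this crude summary separates the two stages: the REM segment has both a higher mean and a much larger standard deviation, which is the kind of signal a staging algorithm can exploit.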
Dr. Sairam Parthasarathy, medical director of the Center for Sleep Disorders at The University of Arizona’s University Medical Center Tucson, and his team conducted the validation study published last week. They found that WHOOP accurately detects sleep duration with a precision of 17.8 minutes, and also reported highly accurate detection of REM and Slow Wave (deep) sleep. As of the publication of this post, the entire manuscript had not been made available by the Journal of Clinical Sleep Medicine, so we cannot yet share a more detailed breakdown of our algorithm’s performance, but we look forward to releasing more stats in the coming months.