
A new scientific study uses behavioral information derived from Apple Watch sensor data for health predictions.
Behavioral information from an Apple Watch, such as physical activity, cardiovascular fitness, and mobility metrics, may be more useful for determining a person’s health state than just raw sensor data, according to a new scientific study.
Over the years, Apple has collaborated with medical researchers on a variety of topics, ranging from menstrual cycles and hearing loss to sleep tracking and even pickleball. The iPhone maker has also examined the training and cardio exercises that marathon runners do, as part of a multi-year Heart and Movement Study that used the Apple Watch.
The Heart and Movement Study is part of a broader initiative to promote healthy movement and enhance cardiovascular health. Now, another Apple-sponsored research paper, which relies on data from the Heart and Movement Study, explains how behavioral data can often be a stronger health indicator than conventional biometric data obtained through hardware sensors.
The study, titled “Beyond Sensor Data: Foundation Models of Behavioral Data from Wearables Improve Health Predictions,” says that physical activity, cardiovascular fitness, and mobility metrics are especially useful for detecting transient and static health states.
A static health state would include, for example, whether someone is a smoker, has hypertension, or is on beta blockers. Pregnancy, meanwhile, would constitute a transient state. Sensor data is typically collected at much finer time scales: seconds, as opposed to the months a transient health state may last.
The wearable health behavior foundation model — WBM
With that information in mind, the researchers created what they call a WBM, or wearable health behavior foundation model. It was trained on “behavioral data from wearables, using 162K participants with over 15 billion hourly measurements from the Apple Heart and Movement Study.”

The wearable health behavior foundation model uses patterns derived from raw sensor data.
Rather than processing the raw biometric sensor data, however, the WBM used “27 interpretable HealthKit quantities that are calculated from lower-level sensors using validated methods.” These metrics included exercise time, standing time, blood oxygen, heart rate measurements, and more.
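To make the difference between raw samples and hourly derived quantities concrete, here is a minimal sketch that averages timestamped readings into per-hour buckets. The function name `hourly_means` and the sample values are hypothetical and only illustrate the idea; the study does not publish this code, and the validated methods HealthKit uses to compute its quantities are more involved than a simple mean.

```python
from collections import defaultdict


def hourly_means(samples):
    """Average (timestamp_in_seconds, value) samples into per-hour buckets.

    Hypothetical helper for illustration; not taken from the study.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Integer division maps every second within the same hour to one key.
        buckets[timestamp // 3600].append(value)
    # Return one averaged value per hour, ordered by hour index.
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}


# Made-up heart-rate-style readings spanning two hours.
samples = [(0, 60.0), (1800, 64.0), (3600, 70.0), (5400, 74.0), (7199, 72.0)]
hourly = hourly_means(samples)
```

Here `hourly` holds one value per hour rather than one per reading, which is the coarser granularity the WBM is described as working with.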
“Compared to modeling raw sensor data, these derived metrics are chosen by experts due to their alignment with meaningful physiological health states,” the researchers explain. In short, the WBM uses patterns derived from raw sensor data to predict a person’s health state, and the study suggests this outperforms traditional detection methods that rely on data streams from sensors.
“The model excels in behavior-driven tasks like sleep prediction, and improves further when combined with representations of raw sensor data.” The research paper also says the WBM was tested on 57 health-related tasks, and that it outperformed a traditional PPG (photoplethysmograph) model in most situations.
Specifically, WBM outperformed PPG in predicting static health states such as beta blocker use, as it more reliably detects heart rate reductions during the day. It also outperformed PPG in predicting transient health states such as pregnancy, though it was unable to predict diabetes better than PPG. “Low-level sensor data outperforms behavioral data in tasks where physiological information is sufficient,” the study says.
Why a hybrid PPG + WBM approach proved useful, and when
This is why the researchers also explored a hybrid PPG+WBM model, which significantly improved predictive performance. WBM detects behavior patterns derived from raw sensor data, which can include significant information about an individual’s health. PPG, meanwhile, can recognize immediate physiological changes. The two complement each other, but only when physiological information alone isn’t enough, and where behavior is a meaningful predictor.

The researchers compared the WBM to a typical PPG approach. Image Credit: Apple & associated researchers
“Finally, we see that across most tasks, the combination of embeddings of WBM and the PPG model results in the most accurate models,” the study says. “The combination achieves the best age prediction performance across all models considered, clearly outperforming either model in isolation.”
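One common way to combine embeddings from two models is to concatenate them and feed the result to a simple downstream predictor. The sketch below shows that idea under stated assumptions: the article does not say exactly how the researchers combined the two embeddings, and the function names, vector sizes, and weights here are all hypothetical.

```python
def combine_embeddings(wbm_embedding, ppg_embedding):
    """Concatenate a behavioral embedding with a sensor embedding.

    Concatenation is an assumption for illustration, not the paper's
    confirmed method.
    """
    return list(wbm_embedding) + list(ppg_embedding)


def linear_score(features, weights, bias=0.0):
    """Score a feature vector with a linear model (hypothetical weights)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(f * w for f, w in zip(features, weights)) + bias


# Toy vectors; real embeddings would be far larger.
wbm = [0.2, -0.1]
ppg = [0.5]
combined = combine_embeddings(wbm, ppg)
score = linear_score(combined, weights=[1.0, 2.0, -1.0])
```

Because the combined vector keeps both sets of features, a downstream predictor can draw on behavioral patterns and immediate physiological signals at the same time.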
The hybrid approach is particularly useful for pregnancy detection, as both types of data are necessary for determining this transient health state. Overall, it performed best in 42 out of 47 outcomes the researchers tested.
As for what all of this means in practice, Apple could adopt this type of hybrid approach to build upon its existing health-related technology, using a WBM-like model alongside the existing Apple Watch PPG or ECG (electrocardiogram) sensors. The company’s interest in health-related features has remained constant over the years, so further improvements down the line seem likely.