Considerations while using Fitbit Data in the All of Us Research Program

Authors: Lauren Lederer^*, Amanda Breton^*, Hayoung Jeong^, Hiral Master+,  Ali R. Roghanizad^, Jessilyn Dunn^

Affiliations: ^Duke University, +Vanderbilt University Medical Center (Data and Research Center)

*Contributed Equally

 

What is the Goal of this Support Article?

Wearable biometric monitoring technologies (BioMeTs) such as Fitbits have become increasingly popular in recent years, especially as they have improved, offering more capabilities and greater functionality to assess behaviors and physiology in free-living conditions. In this post, we focus on Fitbit devices given their wide market share [1, 2], the ongoing data collection from Fitbit users in the All of Us Research Program (e.g., Bring-Your-Own-Device Program) [3, 4], and the availability of these data to registered Researcher Workbench users [5]. Specifically, this post focuses on physical activity (steps and intensities) and heart rate data generated by Fitbits on a per-day and per-minute basis. With the increased use of wearable devices such as Fitbit in research and clinical settings, users might ask “How reliable are these devices?”, “What are the sources of bias one needs to account for when using these data for research?”, and “How does one account for said biases?”. Here, we seek to inform the All of Us community of the reliability and potential sources of bias associated with Fitbit data, as well as some strategies researchers may want to consider to alleviate these biases.

 

Background on Fitbit Device Reliability

The reliability of commercial wearables is a frequent question for both everyday users and researchers, and many investigations have tested the reliability of Fitbits against reference standards (e.g., research-grade devices). In this post, we discuss the reliability of Fitbit devices for assessing daily step count, physical activity intensities (e.g., sedentary, moderate-to-vigorous), and heart rate (HR).

 

Step Count

A 2018 systematic review by Feehan et al evaluated the accuracy of Fitbits by analyzing 67 studies where the devices were used in either controlled or free-living settings. Findings suggested that the reliability of step count in Fitbit devices varied greatly depending on the type and speed of movement and on the placement of the Fitbit. During normal walking, for example, torso placement resulted in the greatest accuracy, while ankle and wrist placement were most accurate in slow walking and jogging, respectively. Of note, Fitbits tend to underestimate steps in controlled conditions and overestimate steps in free-living conditions [6]. Similar findings from other studies demonstrate that step count accuracy is affected by device placement and/or movement type [7, 8].

 

Physical Activity Intensity

A subset of studies in Feehan et al.’s review compared Fitbit estimates of time spent in various physical activity categories (sedentary, light, moderate, vigorous, or moderate-to-vigorous) during waking hours against measurements from ActiGraph accelerometers worn on either the torso or ankle. Regardless of ActiGraph placement, Fitbit overestimated moderate-to-vigorous physical activity (MVPA) in free-living conditions and had lower error during sedentary behaviors. As with step count, the reliability of physical activity intensity measurements varies depending on the type and speed of movement [6].

 

Heart Rate

Fitbit measures heart rate using photoplethysmography (PPG) sensors, which detect light absorption under the skin. This absorption is mainly driven by blood volume, which varies with the pulse cycle. PPG is known to be susceptible to motion artifacts, as further demonstrated by a study led by Bent et al [9]. Another study by Haghayegh et al found no significant difference between Fitbit HR measurements and three-lead electrocardiography (ECG) during sleep, a condition that minimizes motion artifacts [10]. The results from these studies indicate that the reliability of Fitbit HR measurements increases under stationary conditions.

 

While the reliability of BioMeT data can vary based on the data type (e.g., step count, physical activity intensity, and heart rate) and conditions of data collection (e.g., sedentary vs high activity settings), this shouldn’t deter researchers from utilizing BioMeT data in studies, as the wide public use of these technologies offers the potential for a rich free-living data source. However, we advise that researchers understand the types of error that may be associated with the specific BioMeT they intend to use (e.g., Fitbit in the case of the AoURP) and further understand the potential sources of such error, as discussed below.

 

Sources of Error

This section is not meant to provide an exhaustive review of potential sources of error/biases that can arise in Fitbits, but rather to focus on those that are most commonly recognized, for example, skin tone, motion artifacts, and device placement. We will also discuss the inherent limitations of BioMeTs in general. 

 

Skin Tones

Skin tone may present a source of error in wearable devices that rely on optical measurements, for example, photoplethysmography or pulse oximetry [11,12]. Findings in this area have been mixed: a recent study by Bent et al of a subset of consumer smartwatches found that skin tone did not have a significant impact on HR measurement accuracy [9], while a study of older generations of consumer smartwatches found that darker skin tone was correlated with increased HR error [13]. Even clinical pulse oximeters have been reported to be affected by skin tone [14]. Fitbits rely on green light to collect HR data, which has a shorter wavelength than the red and infrared light used by clinical-grade devices that are only accurate at complete rest. While green light enables HR measurements during movement to a greater extent than red or infrared light (though still imperfectly: the Bent et al study saw a 30% increase in HR error during movement [9]), it is also more readily absorbed by melanin, the molecule that gives skin its pigmentation. More research needs to be conducted on this issue before a consensus is reached. Unfortunately, the All of Us program does not collect data on skin tone. With this dataset it is therefore not possible to account for skin tone as a potential source of error in HR, and the strength of the relationship between skin tone and ancestry suggests that using race or ethnicity as a proxy for skin tone could be very problematic [15].

 

As a result, researchers working with All of Us data may need to take extra care when interpreting or translating results that may be influenced by skin tone, particularly when working with HR data. Certain techniques may reduce the effects of skin tone-related error, such as using an individual person as their own baseline, which we discuss further in the Error Mitigation Strategies section of this post. However, even the baseline technique would not mitigate situations where accuracy is affected by a combination of the measurement values and skin tone (or other factors) [16]; in other words, it is possible that the skin tone effect only occurs or is exacerbated in certain HR zones (e.g., high HR) or under circumstances of higher motion.

 

Movement and Motion Artifacts

Motion artifacts can also be a source of error for step count and physical activity intensity data. Feehan et al.'s research reinforced previous reports of wearable devices having higher step count error during activity than during rest. They found that step count reliability varied greatly depending on movement type, possibly due to exaggerated motions during normal household activities being logged by Fitbit as exercise movements [6].

 

In terms of accuracy, wearable device HR sensors perform best under circumstances of rest, followed by physical activity, then rhythmic activity such as walking or jogging. Bent et al investigated activity conditions on the Fitbit by looking at rhythmic activities and concluded that decreased reliability during these conditions was likely due to Fitbit mistaking the periodic signal produced by the repetitive movements for the cardiovascular cycle. Additionally, the study examined HR during a typing exercise. While walking resulted in HR measurements that were higher than the true HR, typing resulted in HR measurements that were lower than the true HR [9].

 

The study concluded that activity condition was highly correlated with HR measurement error. They alluded to the idea that sampling rate could be a source of discrepancy leading Fitbits to be less accurate than research grade wearables. However, due to the downsampling/interpolation methods used in research-grade devices, no concrete conclusion on the source of discrepancy could be reached [9]. Researchers should be aware of the important role of physical activity type in Fitbit HR measurement error. When available, researchers should utilize the “Activity Type” provided by Fitbit to segment HR data and anticipate activities where HR data may be less accurate.  

 

Picture1.png

Figure 1. Adapted from Feehan et al. Step count percentage error in controlled settings. Speed (jog, normal, self-paced, slow, very slow) by body placement (torso, wrist, ankle) of the Fitbit device. Dark lines indicate mean (horizontal). Dashed lines indicate median (horizontal). Gray shading indicates ±3% measurement error [6].

 

In the same study by Bent et al [9] described above, researchers found that large motion artifacts (indicated by high accelerometer sensor values) may signal the device’s internal quality system to remove the data points potentially affected by the artifact, causing missing values in the final dataset. This study, which compared data missingness of heart rate sensors across consumer-grade wearables, found that the Fitbit Charge 2 had the highest rate of missing data during both rest (18.7%) and physical activity (10.4%) when compared to other consumer-grade wearables. The study concluded that different wearables are all reasonably accurate at resting and prolonged elevated heart rate, but that differences exist between devices in responding to changes in activity [9]. We will discuss in the following Error Mitigation section how researchers can mitigate motion artifact errors, including data missingness.

 

Adherence and Improper Fit

Another important source of error in wearable data can be attributed to user interactions with the device, including adherence and wear. Specifically, researchers should consider how long a device is worn per day in order to gauge the reliability and accuracy of the generated data. According to a systematic review by Chan et al, most studies define a valid day as at least 10 hours of wear time and a valid week as at least three valid days [17]. In other studies, valid physical activity measurement requires at least 10 hours of wear time, whereas sedentary behavior may require longer. Interestingly, a study examining step counts across a wide range of participant ages concluded that there was no significant difference in step count accuracy between devices placed on the left wrist and the right wrist [18]. Furthermore, since Fitbits need to be charged about once a week, data can be lost while the device is off the wrist for charging, especially if the wearer forgets to put it back on afterward [19].
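The wear-time rule described above (at least 10 hours for a valid day, at least three valid days for a valid week) can be sketched as follows. This is only an illustration: the input format (a mapping from date to minutes of wear), the function names, and the use of ISO calendar weeks to group days are all assumptions, and how wear minutes are derived from the raw data is left to the researcher.

```python
from datetime import date

MIN_WEAR_MINUTES = 10 * 60   # at least 10 hours of wear for a valid day
MIN_VALID_DAYS = 3           # at least three valid days for a valid week


def valid_days(wear_minutes_by_day):
    """Return the sorted dates whose wear time meets the 10-hour rule."""
    return sorted(
        day for day, minutes in wear_minutes_by_day.items()
        if minutes >= MIN_WEAR_MINUTES
    )


def valid_weeks(wear_minutes_by_day):
    """Return the (ISO year, ISO week) pairs with at least three valid days."""
    counts = {}
    for day in valid_days(wear_minutes_by_day):
        iso = day.isocalendar()
        key = (iso[0], iso[1])  # grouping by ISO week is an assumption
        counts[key] = counts.get(key, 0) + 1
    return sorted(week for week, n in counts.items() if n >= MIN_VALID_DAYS)


# Hypothetical wear time (minutes) for one participant.
wear = {
    date(2022, 8, 1): 720,
    date(2022, 8, 2): 610,
    date(2022, 8, 3): 300,   # too short: not a valid day
    date(2022, 8, 4): 600,
    date(2022, 8, 8): 900,   # following week, only one valid day
}
print(valid_days(wear))
print(valid_weeks(wear))
```

Thresholds are kept as named constants so that a study requiring longer wear time (for example, for sedentary behavior) can adjust them in one place.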

 

In addition to nonwear, improper Fitbit wear can also become a source of error. Because wrist-based devices such as Fitbits can often misclassify arm movements that occur when the entire body is not moving, they can underestimate or overestimate activity [19]. Fitbit recommends a specific placement on the wrist in order for the sensors to acquire the most reliable data, so any imprecise positioning on the wrist can result in attenuated data quality [20]. Besides improper wear leading to the sensor orientation being askew, improper wear can also lead to poor quality of skin contact [21]. Other factors that can affect Fitbit reliability include the Fitbits not being synced often enough to the smartphone app, or poor calibration of the Fitbit [22,23].

 

Inherent Measurement Inaccuracies 

In general, it is clear that there are still inherent limitations of using PPG technology in consumer wearables. Studies have shown that the best case scenario of PPG accuracy using the 95% Limits of Agreement (LoA) is anywhere from 3.89 to 3.77 bpm [24]. Table 1 summarizes some studies that have demonstrated the accuracy of Fitbit PPG when compared to reference-standard electrocardiogram (ECG) measurements. 

 

Table 1. Fitbit PPG vs Reference-Standard ECG (Limits of Accuracy) 

 

Study Name | Fitbit Charge HR Wireless Heart Rate Monitor: Validation Study Conducted Under Free-Living Conditions [25] | Accuracy of Fitbit Charge 2 PPG Technology for Assessment of Heart Rate During Sleep [10] | Assessment of the Fitbit Charge 2 for Monitoring Heart Rate [26]

Study Population | 10 healthy adults in free-living conditions. | 35 healthy adults throughout a single night of sleep. | 15 healthy adults while riding a stationary bike for 10 minutes.

Accuracy of Fitbit PPG Compared to Gold-Standard ECG | Fitbit underestimated heart rate by −5.96 bpm (standard error, SE=0.18). | Fitbit was not significantly different from ECG in asleep HR mean (0.09 bpm, P = 0.426) | The mean bias was -5.9 bpm (95% CI: -6.1 to -5.6 bpm), the limits of agreement were +16.8 to -28.5 bpm

Study Conclusion | Under free-living conditions, the Fitbit Charge is affected by significant systematic errors. | The Fitbit Charge 2 suitably tracks HR during sleep in healthy young adults. | Fitbit Charge 2 underestimates heart rate. The high limits of agreement indicate that an individual heart rate measure could plausibly be underestimated by almost 30 bpm.
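The mean bias and 95% limits of agreement reported in Table 1 are commonly obtained from a Bland-Altman style analysis of paired device and reference readings. The sketch below assumes the simplest form (bias ± 1.96 × the standard deviation of the paired differences). The cited studies may have used variants, for example to handle repeated measures, so this should not be read as their exact method. The function name and the readings are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(device_bpm, reference_bpm):
    """Return (bias, lower limit, upper limit) for paired readings in bpm."""
    if len(device_bpm) != len(reference_bpm):
        raise ValueError("readings must be paired")
    diffs = [d - r for d, r in zip(device_bpm, reference_bpm)]
    bias = mean(diffs)            # negative bias: device reads lower than ECG
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread


# Hypothetical paired heart rate readings (bpm).
fitbit = [62, 75, 88, 101, 118]
ecg = [64, 78, 90, 108, 125]
bias, lower, upper = limits_of_agreement(fitbit, ecg)
print(f"bias={bias:.1f} bpm, limits of agreement=({lower:.1f}, {upper:.1f}) bpm")
```

A small bias with wide limits, as in the third study of Table 1, means that the average error is modest but that any single reading may be far from the reference value.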

 

Even something as “simple” as step count can be compared against a research grade device. In the case of step counting, Fitbit’s 3-axis accelerometer can be compared to Actigraphs. A study by  Sushames et al in Queensland, Australia compared Fitbit’s step counter against direct observation and an Actigraph GT3X+ in both a laboratory setting and a free-living setting. Their conclusions can be seen in Table 2 below [27].

 

Table 2. Comparison of Fitbit’s 3-axis accelerometer to Actigraph GT3X+

Activity | Absolute Proportional Difference | 95% Confidence Interval

Walking | 21.2% (undercounted in laboratory) | 12.0-29.4%

Stair Stepping | 15.5% (slightly over counted) | 10.1-20.9%

Jogging | 6.4% (slightly over counted) | 3.7-9.0%

 

Modave et al compared Fitbit to Actigraph wGT3X-BT in the Mobile Device Accuracy for Step Counting Across Age Groups study and concluded that across all age groups studied, Fitbit significantly undercounted steps [28]. However, a study on the validity of the Fitbit Charge 2 and the Garmin vivosmart HR+ involved 20 participants over the age of 65 wearing consumer and research-grade devices for 24 hours. This study concluded that the Fitbit Charge 2 tended to overcount steps when compared to the New-Lifestyles NL-2000i with a mean percentage error of 12.36% [29].
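Error figures such as the proportional differences and the mean percentage error quoted above express the gap between device and reference counts relative to the reference. The sketch below assumes the most basic definitions; the exact formulas used in the cited studies may differ, and the function names and step counts are hypothetical.

```python
def percentage_error(device_steps, reference_steps):
    """Signed error as a percentage of the reference count.

    Positive values mean the device overcounted; negative, undercounted.
    """
    if reference_steps <= 0:
        raise ValueError("reference step count must be positive")
    return 100.0 * (device_steps - reference_steps) / reference_steps


def mean_percentage_error(pairs):
    """Average signed percentage error over (device, reference) pairs."""
    errors = [percentage_error(d, r) for d, r in pairs]
    return sum(errors) / len(errors)


def mean_absolute_percentage_error(pairs):
    """Average unsigned percentage error; over- and undercounts do not cancel."""
    errors = [abs(percentage_error(d, r)) for d, r in pairs]
    return sum(errors) / len(errors)


# Hypothetical (device, reference) step counts for three sessions.
sessions = [(1100, 1000), (900, 1000), (1050, 1000)]
print(mean_percentage_error(sessions))
print(mean_absolute_percentage_error(sessions))
```

Reporting both values is useful because a device that overcounts in some settings and undercounts in others can show a small signed error while still being inaccurate in each individual session.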





Error Mitigation Strategies

 

Accounting for Adherence Errors:

Because a Fitbit cannot collect data at all times, whether because of nonadherence due to forgetfulness or simply because the device is being charged, one suggestion is to use only data from certain periods of time, such as exercise, sleep, or work days, depending on the study type. A study by Claudel et al sought to compare methods to identify the best wear-time intervals for physical activity. Method 1 defined a valid day as ≥ 10 hours of wear time with heart rate data, and method 2 involved removing any minutes with heart rate ≤ 2 SDs below the mean, ≤2 steps, and nighttime. The two methods produced comparable results, with method 1 yielding an average step count per day of 7,436 (Standard Deviation = 3,543) and method 2 yielding 7,298 (Standard Deviation = 3,501). While the authors concluded that more studies are needed to improve the accuracy of physical activity data sets, either method could be a good option for researchers to process data [30].

 

Another approach is to filter participants’ nonadherent days based on the “daily adherence level” derived from minute-level heart rate data. The daily adherence level is calculated by dividing the total count of minute-level heart rate data points collected within a day by the total number of minutes in a day (1440). The optimal threshold for filtering should be selected carefully to avoid extensive data loss and to conserve variance in the data. Plots like the one we produced from the AoU data, shown in Figure 2, display the changes in total adherent days (or participants) against a varying adherence threshold. The optimal threshold is where a minimal change in the y-axis occurs (an indication of convergence) while still providing statistically significant outcomes for the study of interest.

 

Picture2.png

Figure 2. A & B. Total count of days (or participants) that meet each adherence threshold. Both sub-figures A and B show minimal change of the y-axis at a threshold slightly lower than 0.4. The optimal threshold for filtering out nonadherent days may, therefore, be determined as >= 0.4. C. Total count of days (per participant) with adherence level greater than or equal to each adherence threshold. The 95% confidence interval was calculated by μ ± σ/√n. A pattern similar to that in sub-figures A and B is visible, where minimal changes start to occur with an adherence threshold between 0.3 and 0.4. Controlled tier data v5 (C2021Q3R5) was used to generate Figure 2.
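The daily adherence level and the threshold sweep behind a plot like Figure 2 can be sketched as follows. The code assumes that the number of minutes with a heart rate record has already been counted for each day (at most one record per minute); the function names, the day labels, and the counts are hypothetical and do not reflect the Researcher Workbench schema.

```python
MINUTES_PER_DAY = 1440


def daily_adherence(hr_minutes_by_day):
    """Map each day to (minutes with heart rate data) / 1440."""
    return {
        day: count / MINUTES_PER_DAY
        for day, count in hr_minutes_by_day.items()
    }


def days_meeting_threshold(adherence_by_day, thresholds):
    """For each threshold, count the days with adherence >= threshold."""
    return {
        t: sum(1 for level in adherence_by_day.values() if level >= t)
        for t in thresholds
    }


# Hypothetical counts of minute-level heart rate records per day.
hr_minutes = {"day1": 1440, "day2": 1008, "day3": 576, "day4": 144, "day5": 0}
adherence = daily_adherence(hr_minutes)
thresholds = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
counts = days_meeting_threshold(adherence, thresholds)
for t, n in counts.items():
    print(f"threshold >= {t:.1f}: {n} adherent days")
```

Plotting the resulting counts against the thresholds gives the kind of curve described above; a flat stretch of the curve marks a range where raising the threshold costs little additional data.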

 

Accounting for Skin Tone and Motion Artifact Errors:

Skin tone and motion artifact errors can be mitigated in similar ways: by comparing an individual’s longitudinal data and by properly summarizing large individual and population datasets. To accomplish these goals, we can use normalization techniques such as calculating z-scores, establishing a baseline heart rate for individuals (changes vs absolute data), or comparing data against the individual themselves rather than against population data (personalized vs. population studies).

 

When working with large datasets and study populations, it can be helpful to incorporate standardized summary statistics such as z-scores instead of relying on the median alone. This is especially useful for motion artifact errors, where there may be short periods of incorrectly reported HR or step count. A z-score expresses how many standard deviations a value lies above or below the mean, which allows the probability of a score occurring within a normal distribution to be calculated and enables comparison of two scores that are from different normal distributions [31].
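A z-score is computed as (value − mean) / standard deviation. The minimal sketch below standardizes each participant’s values against their own mean and standard deviation; the function name and step counts are hypothetical, and the choice of the population standard deviation is an assumption that can be swapped for the sample version.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; use stdev() for the sample SD
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(x - mu) / sigma for x in values]


# Hypothetical daily step counts for two participants with different habits.
participant_a = [4000, 5000, 6000, 5000, 10000]
participant_b = [12000, 13000, 14000, 13000, 18000]
print([round(z, 2) for z in z_scores(participant_a)])
print([round(z, 2) for z in z_scores(participant_b)])
```

Although participant B takes 8,000 more steps than participant A on every day, the two sets of z-scores are identical, which illustrates how standardization places values from different distributions on a common scale.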

 

On an individual level, incorporating baseline values can be helpful for comparing an “inactive” heart rate from the wearable devices to elevated heart rates throughout the day. This can be done by sampling an individual’s measurements during periods of sleep or reported inactivity. The limitation, however, is that it may require substantial time to establish a baseline for an individual, because baseline sleep and inactivity measurements can also be influenced by other conditions on any particular day [32].
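One way to sketch the baseline idea is to summarize heart rate during sleep over several nights and then express later readings as changes from that personal baseline. The use of medians (chosen here because they are less sensitive to short bursts of erroneous readings), the function names, and the data are assumptions made for illustration only.

```python
from statistics import median


def resting_baseline(sleep_hr_by_night):
    """Median of the nightly median heart rates (bpm) during sleep."""
    nightly = [median(hr) for hr in sleep_hr_by_night if hr]
    if not nightly:
        raise ValueError("no sleep heart rate data available")
    return median(nightly)


def change_from_baseline(hr_values, baseline):
    """Express each reading as a difference from the personal baseline."""
    return [hr - baseline for hr in hr_values]


# Hypothetical sleep heart rate samples (bpm) from three nights.
nights = [[58, 60, 62], [55, 57, 59], [61, 63, 65]]
baseline = resting_baseline(nights)

# Hypothetical daytime readings for the same person.
daytime = [72, 95, 130]
print(baseline, change_from_baseline(daytime, baseline))
```

Using several nights rather than one reflects the limitation noted above: a single night can be influenced by conditions specific to that day.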

 

The final technique is to compare an individual’s data at similar time periods across many days of data. In particular with motion artifacts and data missingness, we can look across an individual’s data set and potentially find similar levels of activity during similar time periods each day (or in a pattern of days). This technique is also useful to reduce the effect of skin tone errors, because skin tone is likely to be relatively constant, with possible changes due to sun exposure, across an individual’s longitudinal data. 
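Comparing an individual’s data at similar times across days can be sketched by building a per-hour profile for one person and measuring how far a new reading falls from it. Grouping by clock hour is an assumption; a study might instead group by weekday, activity type, or another pattern of days. The function names and readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_profile(records):
    """Average heart rate per hour of day across all days for one person.

    `records` is an iterable of (timestamp, bpm) pairs.
    """
    by_hour = defaultdict(list)
    for timestamp, bpm in records:
        by_hour[timestamp.hour].append(bpm)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


def deviation_from_profile(timestamp, bpm, profile):
    """Difference between one reading and the person's usual value that hour."""
    return bpm - profile[timestamp.hour]


# Hypothetical readings (bpm) from the same person on several days.
history = [
    (datetime(2022, 8, 1, 8, 0), 70),
    (datetime(2022, 8, 2, 8, 30), 74),
    (datetime(2022, 8, 3, 8, 10), 72),
    (datetime(2022, 8, 1, 18, 0), 110),
    (datetime(2022, 8, 2, 18, 20), 114),
]
profile = hourly_profile(history)
new_reading = (datetime(2022, 8, 4, 8, 15), 95)
print(profile)
print(deviation_from_profile(new_reading[0], new_reading[1], profile))
```

A reading that deviates strongly from the person’s usual value for that hour may merit a closer look for motion artifacts, and hours with missing data can be compared against the profile rather than against population values.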

 

Table 3 below summarizes these example methods for normalizing data.

 

Table 3. Examples of Normalizing Methods for Wearable Data

 

Technique

  • Z-Scores: Calculate how many standard deviations below/above the mean that a particular data point is.
  • Baseline Comparisons: Use measurements from earlier periods in an individual’s longitudinal Fitbit data to better understand deviations in the individual’s measurements.
  • Sampling During Periods of Similar Activity: Analyze an individual’s Fitbit data during similar time periods each day with comparable activity levels.

Benefits 

  • Allows comparison of two scores from different normal distributions [31]. 
  • Very effective at normalizing signals and providing useful rules from individual participants [33].
  • Provides a starting point, or what a “normal activity” level may appear as for an individual [32].
  • Reduces effects of skin tone that may artificially inflate/deflate metrics due to the comparative approach. 
  • May help identify motion artifact errors by comparing an individual’s similar activity levels
  • Useful comparison when population data lacks diverse demographics

Limitations

  • Works better for larger study populations [34].
  • May take a long time to establish a baseline depending on the study type [32]. 
  • Baselines can be influenced by daily conditions and should be sampled over a long period of time.
  • Similar to baseline, daily condition changes will affect data
  • Recommended for large sets of longitudinal data to make accurate comparisons.
  • Not particularly useful for skin tone mitigation

 

Conclusion

 

The wide adoption of Fitbits makes them a great source of data for researchers. Like any technology, there are potential pitfalls and sources of error that researchers should be aware of before utilizing Fitbit data. We encourage the All of Us community to employ data processing techniques with consideration for the above while also considering their study goals and expected outcomes.



Source Links

[1] Laricchia, F. (n.d.). Wearables Market Share Companies 2021. Statista; www.statista.com. Retrieved August 2, 2022, from https://www.statista.com/statistics/435944/quarterly-wearables-shipments-worldwide-market-share-by-vendor/

[2] Curry, D. (2020, October 23). Fitbit Revenue and Usage Statistics (2022) - Business of Apps. Business of Apps; www.businessofapps.com. https://www.businessofapps.com/data/fitbit-statistics/

[3] Holko, M., Litwin, T. R., Munoz, F., Theisz, K. I., Salgin, L., Jenks, N. P., Holmes, B. W., Watson-McGee, P., Winford, E., & Sharma, Y. (2022). Wearable Fitness Tracker use in Federally Qualified Health Center Patients: Strategies to improve the health of all of us using Digital Health Devices. Npj Digital Medicine, 5(1). https://doi.org/10.1038/s41746-022-00593-x 

[4] All of Us Research Program Expands Data Collection Efforts with Fitbit. All of Us Research Program News and Events. (2019, January). Retrieved from https://allofus.nih.gov/news-events-and-media/announcements/all-us-research-program-expands-data-collection-efforts-fitbit

[5] Master, H., Labrecque, S., Kouame, A., Marginean, K., & Rodriguez, K. (2022). 2022Q2R2 v6 Data Characterization Report: Overall All of Us Cohort Demographics. All of Us Research Program. Retrieved from https://aousupporthelp.zendesk.com/hc/en-us/articles/7906441956244-2022Q2R2-v6-Data-Characterization-Report-Overall-All-of-Us-Cohort-Demographics 

[6] Feehan, L. M., Geldman, J., Sayre, E. C., Park, C., Ezzat, A. M., Yoo, J. Y., Hamilton, C. B., & Li, L. C. (2018). Accuracy of Fitbit Devices: Systematic Review and Narrative Syntheses of Quantitative Data. JMIR mHealth and uHealth, 6(8), e10527. https://doi.org/10.2196/10527

[7] Chow, J. J., Thom, J. M., Wewege, M. A., Ward, R. E., & Parmenter, B. J. (2017). Accuracy of step count measured by physical activity monitors: The effect of gait speed and anatomical placement site. Gait & posture, 57, 199–203. https://doi.org/10.1016/j.gaitpost.2017.06.012

[8] Jung, H. C., Kang, M., Lee, N. H., Jeon, S., & Lee, S. (2020). Impact of Placement of Fitbit HR under Laboratory and Free-Living Conditions. Sustainability, 12(16), 6306. https://doi.org/10.3390/su12166306

[9] Bent, B., Goldstein, B. A., Kibbe, W. A., & Dunn, J. P. (2020). Investigating sources of inaccuracy in wearable optical heart rate sensors. NPJ digital medicine, 3, 18. https://doi.org/10.1038/s41746-020-0226-6

[10] Shahab Haghayegh, Sepideh Khoshnevis, Michael H. Smolensky & Kenneth R. Diller (2019). Accuracy of PurePulse photoplethysmography technology of Fitbit Charge 2 for assessment of heart rate during sleep, Chronobiology International, 36:7, 927-933, DOI: 10.1080/07420528.2019.1596947

[11] Colvonen PJ. (2021). Response To: Investigating sources of inaccuracy in wearable optical heart rate sensors. NPJ Digit Med., 4: 38. doi:10.1038/s41746-021-00408-5 

[12] Bent B, Enache OM, Goldstein B, Kibbe W, Dunn JP. (2021). Reply: Matters Arising “Investigating sources of inaccuracy in wearable optical heart rate sensors.” NPJ Digit Med., 4: 39. doi:10.1038/s41746-021-00409-4 

[13] Shcherbina, A., Mattsson, C., Waggott, D., Salisbury, H., Christle, J., Hastie, T., Wheeler, M., & Ashley, E. (2017). Accuracy in Wrist-Worn, Sensor-Based Measurements of Heart Rate and Energy Expenditure in a Diverse Cohort. Journal of Personalized Medicine, 7(2), 3. https://doi.org/10.3390/jpm7020003

[14] Sjoding, M. W., Dickson, R. P., Iwashyna, T. J., Gay, S. E., & Valley, T. S. (2020). Racial bias in pulse oximetry measurement. New England Journal of Medicine, 383(25), 2477–2478. https://doi.org/10.1056/nejmc2029240

[15] Parra, E. J., Kittles, R. A., & Shriver, M. D. (2004). Implications of correlations between skin color and genetic ancestry for biomedical research. Nature Genetics, 36(S11). https://doi.org/10.1038/ng1440

[16] Louie, A., Feiner, J. R., Bickler, P. E., Rhodes, L., Bernstein, M., & Lucero, J. (2018). Four types of pulse oximeters accurately detect hypoxia during low perfusion and motion. Anesthesiology, 128(3), 520–530. https://doi.org/10.1097/aln.0000000000002002

[17] Chan, A., Chan, D., Lee, H., Ng, C. C., & Yeo, A. (2022). Reporting adherence, validity and physical activity measures of wearable activity trackers in medical research: A systematic review. International journal of medical informatics, 160, 104696. https://doi.org/10.1016/j.ijmedinf.2022.104696

[18] Modave, F., Guo, Y., Bian, J., Gurka, M. J., Parish, A., Smith, M. D., Lee, A. M., & Buford, T. W. (2017). Mobile Device Accuracy for Step Counting Across Age Groups. JMIR mHealth and uHealth, 5(6), e88. https://doi.org/10.2196/mhealth.7870

[19] Wright, S. P., Hall Brown, T. S., Collier, S. R., & Sandberg, K. (2017). How consumer physical activity monitors could transform human physiology research. American journal of physiology. Regulatory, integrative and comparative physiology, 312(3), R358–R367. https://doi.org/10.1152/ajpregu.00349.2016

[20] Düking, P., Fuss, F. K., Holmberg, H. C., & Sperlich, B. (2018). Recommendations for Assessment of the Reliability, Sensitivity, and Validity of Data Provided by Wearable Sensors Designed for Monitoring Physical Activity. JMIR mHealth and uHealth, 6(4), e102. https://doi.org/10.2196/mhealth.9341

[21] Human Interface and the Management of Information. Information and Interaction Design. (2013). In S. Yamamoto (Ed.), Lecture Notes in Computer Science. Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-39209-2

[22] Constantinou V, Felber AE, Chan JL. Applicability of consumer activity monitor data in marathon events: an exploratory study. J Med Eng Technol 2017 Oct;41(7):534-540. [doi: 10.1080/03091902.2017.1366560] [Medline: 28954563]

[23] Cleland I, Donnelly MP, Nugent CD, Hallberg J, Espinilla M, Garcia-Constantino M. Collection of a Diverse, Realistic and Annotated Dataset for Wearable Activity Recognition. 2018 Presented at: 2018 IEEE International Conference on Pervasive Computing and Communications Workshops; March 19-23, 2018; Athens, Greece p. 555-560

[24] S. Blok, M.A. Piek, I.I. Tulevski, G.A. Somsen, M.M. Winter. (2021). The accuracy of heartbeat detection using photoplethysmography technology in cardiac patients. Journal of Electrocardiology, 67, 148-157. https://doi.org/10.1016/j.jelectrocard.2021.06.009.

[25] Gorny, A. W., Liew, S. J., Tan, C. S., & Müller-Riemenschneider, F. (2017). Fitbit Charge HR Wireless Heart Rate Monitor: Validation Study Conducted Under Free-Living Conditions. JMIR mHealth and uHealth, 5(10), e157. https://doi.org/10.2196/mhealth.8233

[26] Benedetto S, Caldato C, Bazzan E, Greenwood DC, Pensabene V, et al. (2018) Assessment of the Fitbit Charge 2 for monitoring heart rate. PLOS ONE 13(2): e0192691. https://doi.org/10.1371/journal.pone.0192691

[27] Sushames, A., Edwards, A., Thompson, F., McDermott, R., & Gebel, K. (2016). Validity and Reliability of Fitbit Flex for Step Count, Moderate to Vigorous Physical Activity and Activity Energy Expenditure. PloS one, 11(9), e0161224. https://doi.org/10.1371/journal.pone.0161224

[28] Modave, F., Guo, Y., Bian, J., Gurka, M. J., Parish, A., Smith, M. D., Lee, A. M., & Buford, T. W. (2017). Mobile Device Accuracy for Step Counting Across Age Groups. JMIR mHealth and uHealth, 5(6), e88. https://doi.org/10.2196/mhealth.7870

[29] Tedesco, S., Sica, M., Ancillao, A., Timmons, S., Barton, J., & O'Flynn, B. (2019). Validity Evaluation of the Fitbit Charge2 and the Garmin vivosmart HR+ in Free-Living Environments in an Older Adult Cohort. JMIR mHealth and uHealth, 7(6), e13084. https://doi.org/10.2196/13084

[30] Claudel, S. E., Tamura, K., Troendle, J., Andrews, M. R., Ceasar, J. N., Mitchell, V. M., Vijayakumar, N., & Powell-Wiley, T. M. (2021). Comparing Methods to Identify Wear-Time Intervals for Physical Activity With the Fitbit Charge 2. Journal of aging and physical activity, 29(3), 529–535. https://doi.org/10.1123/japa.2020-0059

[31] Standard score. Standard Score - Understanding z-scores and how to use them in calculations. (n.d.). Retrieved August 2, 2022, from https://statistics.laerd.com/statistical-guides/standard-score.php

[32] Lisa A. Cadmus-Bertram, Bess H. Marcus, Ruth E. Patterson, Barbara A. Parker, Brittany L. Morey, (2015). Randomized Trial of a Fitbit-Based Physical Activity Intervention for Women. American Journal of Preventive Medicine, 49(3), 414-418. https://doi.org/10.1016/j.amepre.2015.01.020.

[33] N. Costadopoulos, M. Z. Islam and D. Tien, "Using Z-score to Extract Human Readable Logic Rules from Physiological Data," 2019 11th International Conference on Knowledge and Systems Engineering (KSE), 2019, pp. 1-6, doi: 10.1109/KSE.2019.8919473.

[34] Stephanie Glen. "T-Score vs. Z-Score: What’s the Difference?" From StatisticsHowTo.com: Elementary Statistics for the rest of us! https://www.statisticshowto.com/probability-and-statistics/hypothesis-testing/t-score-vs-z-score/
