Analyzing Data from FitBit
Fitbit is a company based in San Francisco, California, known for its products of the same name: fitness/activity trackers and, more recently, smartwatches. The devices are wireless-enabled wearables that measure the number of steps you walk, your heart rate, your quality of sleep, steps climbed, and other personal fitness metrics such as your calorie intake, weight, and calories burned.
Like all fitness activity trackers, this device isn’t perfectly accurate, but it can give you a good idea of how active you are. I am going to take some of the data collected by my FitBit tracker and try to find some fun and interesting insights about myself and my “active” life. FitBit now uses MET (metabolic equivalent of task) calculations that match recommendations from associations like the Centers for Disease Control and the US Department of Health to determine how active a person is. First I will define the problem or question, then collect the raw data from Fitbit.com. After that I will process the data (only to put the multiple date ranges together in one file). Then I will explore the data visually with graphs, perform some analysis, and communicate the results.
- Step 1: Frame the problem. The first thing you have to do before you solve a problem is to define exactly what it is.
- Step 2: Collect the raw data needed for your problem.
- Step 3: Process the data for analysis.
- Step 4: Explore the data / perform analysis.
- Step 5: Communicate results of the analysis.
Step 1: Frame the problem
I want to know what day of the week I get the most/least steps, what month I get the most/least steps and what month I got the most steps in one day.
A) Most/least steps for a given day of the week in the past 2 years?
B) Most/least steps for a given month of the year in the past 2 years?
C) In what month did I get the most steps in one day in the past 2 years?
Step 2: Collect The Raw Data From FitBit
A) Go to https://www.fitbit.com/export/user/data. This page lets you download 1 month’s worth of data as either a CSV (Comma Separated Values) file or an Excel file. For the data I want, I select only the “Activities” data option and choose “CSV” as the file format. I chose a “Custom” time period between Jul. 1, 2016 and Jul. 31, 2016. Clicking the download button then downloads that one month’s worth of data.
B) FitBit only allows you to download up to 31 days of data at a time, so I had to do this 23 more times because I wanted 2 years’ worth of data. I named my saved CSV files “fitX.csv”, where X is the order in which I downloaded the file, from July 2016 (fit1.csv) to June 2018 (fit24.csv).
The reason for this naming convention is to easily use a program to concatenate these files into one later when we are processing our data.
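The 24 download windows and their matching file names can be listed up front, which makes it harder to skip a month while clicking through the export page. This is only a sketch: the helper name download_windows is my own invention for illustration and is not part of any Fitbit tool.

```python
import calendar
from datetime import date

def download_windows(start_year, start_month, months):
    """Return (filename, first_day, last_day) for each consecutive calendar month."""
    windows = []
    year, month = start_year, start_month
    for n in range(1, months + 1):
        last_day = calendar.monthrange(year, month)[1]  # number of days in this month
        windows.append(("fit%d.csv" % n, date(year, month, 1), date(year, month, last_day)))
        # advance to the next month, rolling over the year after December
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return windows

# 24 months starting in July 2016, matching the fitX.csv naming convention
windows = download_windows(2016, 7, 24)
for name, first, last in windows:
    print(name, first, last)
```

Each printed line gives the file name to save under and the custom date range to enter for that download.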
The raw data that I’ve collected contains the following features / columns:
Date = The current date
Calories Burned = The number of calories burned for that day
Steps = The number of steps taken for that day
Distance (in miles) = The distance travelled for that day in miles
Floors = The number of floors climbed that day (approx. 10 feet of elevation = 1 floor)
Minutes Sedentary = The minutes spent seated / inactive
Minutes Lightly Active = The minutes you’re lightly active
Minutes Fairly Active = The minutes you’re fairly active
Minutes Very Active = The minutes you’re very active
Activity Calories = The number of cal
Step 3: Process the data
Now that I have collected my files, I am going to concatenate them all into one CSV file called “out.csv” using the Python programming language (Python version 3.4.4). The code is below. If you want to use it, replace <full_path> with the path of the folder that contains your Fitbit files, and if your file names aren’t fit1.csv, fit2.csv, fit3.csv, etc., change the file names as well.
Concatenate the files:
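# open the combined output file once; "w" starts from an empty file on each run
fout = open("out.csv", "w")
# first file: copy every line, including its header lines
for line in open("<full_path>/fit1.csv"):
    fout.write(line)
# now the rest:
for num in range(2, 25):  # fit2.csv through fit24.csv
    f = open("<full_path>/fit" + str(num) + ".csv")
    next(f)  # skip the first header line
    next(f)  # skip the second header line
    for line in f:
        fout.write(line)
    f.close()
fout.close()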
After concatenating the files into the single “out.csv” file, I noticed that it contained blank rows. I could go through the CSV file and delete each blank row by hand, but with lots of rows to delete I would want a more automated way to do this. So I will use another Python program to get rid of these extra rows for me.
Remove empty rows:
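import csv
input1 = open('out.csv', 'r')
output = open('FitBit.csv', 'w', newline='')
writer = csv.writer(output)
for row in csv.reader(input1):
    # keep a row only if at least one of its fields is non-empty
    if any(field.strip() for field in row):
        writer.writerow(row)
input1.close()
output.close()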
Finally, remove the “Activities” row from the CSV file. Now that we have processed this data and cleaned it up into one CSV file, we can start exploring the data! There are many tools we can use to do some analysis. We could use Excel, MySQL, R, Tableau, and Python, just to name a few.
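The concatenation, the blank-row removal, and the “Activities” row removal can also be done in a single pass. This is a sketch under assumptions, not the code I actually ran: the function name merge_exports is hypothetical, and it assumes every export starts with an “Activities” title row followed by one column-header row.

```python
import csv

def merge_exports(paths, out_path):
    """Merge Fitbit CSV exports into one file with a single header row.

    Assumes each file starts with an "Activities" title row followed by
    the column-header row (an assumption about the export layout).
    Returns the number of data rows written.
    """
    written = 0
    header_done = False
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="") as f:
                # drop rows whose fields are all empty
                rows = [r for r in csv.reader(f) if any(c.strip() for c in r)]
            if rows and rows[0][0].strip() == "Activities":
                rows = rows[1:]  # drop the title row
            if not rows:
                continue
            if not header_done:
                writer.writerow(rows[0])  # keep the column header only once
                header_done = True
            for row in rows[1:]:
                writer.writerow(row)
                written += 1
    return written
```

With the file names used in this article, the call would look like merge_exports(["<full_path>/fit%d.csv" % n for n in range(1, 25)], "FitBit.csv").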
Step 4: Exploring the Data
Using Excel I can immediately get the averages and maximum values from my data using AutoSum on the individual columns.
The overall averages for the past 2 years are below:
AVG Calories Burned: 2841.731259
AVG Steps: 10538.52475
AVG Distance: 4.767468175
AVG Floors: 7.292786421
AVG Minutes Sedentary: 1171.758133
AVG Minutes Lightly Active: 165.5374823
AVG Minutes Fairly Active: 34.06647808
AVG Minutes Very Active: 54.28712871
AVG Activity Calories: 1374.951909
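The same averages and maximums can be computed in Python instead of Excel. This is a minimal sketch, assuming the cleaned file has a single header row and that numbers may be written with thousands separators such as "10,538" (an assumption about the export format). The helper name column_stats is hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io
from statistics import mean

def column_stats(csv_file, column):
    """Return (average, maximum) of a numeric column in an open CSV file."""
    values = []
    for row in csv.DictReader(csv_file):
        text = row[column].replace(",", "")  # "10,538" -> "10538"
        if text:
            values.append(float(text))
    return mean(values), max(values)

# Made-up sample rows for illustration only; these are not my real numbers.
sample = io.StringIO(
    'Date,Steps\n'
    '2016-07-01,"10,000"\n'
    '2016-07-02,"12,500"\n'
    '2016-07-03,"7,500"\n'
)
print(column_stats(sample, "Steps"))
```

For the real data, open FitBit.csv with open("FitBit.csv", newline="") and pass the file object in place of the sample.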
To build the charts in Tableau, I first need to load the processed .csv file by connecting to it as a text file. Click on Text file and go to the location where you saved your processed .csv file.
Let the data visualization begin!
The month I got the most steps in the past 2 years was June with 712,155 steps.
The month with the lowest steps in the past 2 years was February with 489,622 steps.
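For anyone without Tableau, the per-month totals can be sketched in Python. The sketch pools the same calendar month across both years, which appears to be how the June and February totals above are counted. The function name and the sample values are made up for illustration.

```python
from collections import defaultdict
from datetime import date

def steps_by_month_of_year(days):
    """Sum steps per calendar month (1-12), pooling the same month across years."""
    totals = defaultdict(int)
    for day, steps in days:
        totals[day.month] += steps
    return dict(totals)

# Made-up (date, steps) pairs for illustration only.
days = [
    (date(2016, 7, 1), 9000),
    (date(2017, 7, 1), 11000),
    (date(2017, 2, 1), 4000),
]
totals = steps_by_month_of_year(days)
best = max(totals, key=totals.get)    # month number with the most steps
worst = min(totals, key=totals.get)   # month number with the fewest steps
print(best, totals[best], worst, totals[worst])
```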
It looks like I get most of my steps on Fridays and the fewest on Thursdays. Looking at this chart, it’s clear that I am more active on the weekends (I’m including Friday) than on weekdays (Mon, Tues, Wed, & Thurs).
In 2018 the same appears to be true: I had the most steps on Fridays and the fewest on Thursdays.
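The day-of-week comparison, including the 2018-only view, can be sketched the same way. This is an illustration under assumptions: the function name, the optional year filter, and the sample values are all my own, not something the charts were built from.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def steps_by_weekday(days, year=None):
    """Sum steps per day of the week; restrict to one year if `year` is given."""
    totals = defaultdict(int)
    for day, steps in days:
        if year is None or day.year == year:
            totals[WEEKDAYS[day.weekday()]] += steps  # weekday(): Monday is 0
    return dict(totals)

# Made-up (date, steps) pairs for illustration only.
days = [
    (date(2018, 1, 4), 8000),    # a Thursday
    (date(2018, 1, 5), 14000),   # a Friday
    (date(2017, 1, 6), 12000),   # a Friday
]
print(steps_by_weekday(days))
print(steps_by_weekday(days, year=2018))
```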
Out of the 24 months, I averaged the most steps in February 2018 (12,753 steps per day for the month).
I averaged the fewest steps in February 2017 (4,735 steps per day).
My maximum steps in a single day over the past 2 years came in June 2017: 36,082 for the day. In February 2017 I got 19,376 steps, the least amount.
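Finding the best single day in each month, and then the best of those, can be sketched as follows. The function name and the sample values are made up for illustration and are not my real data.

```python
from datetime import date

def best_day_per_month(days):
    """Map (year, month) -> (date, steps) for the highest-step day in that month."""
    best = {}
    for day, steps in days:
        key = (day.year, day.month)
        if key not in best or steps > best[key][1]:
            best[key] = (day, steps)
    return best

# Made-up (date, steps) pairs for illustration only.
days = [
    (date(2017, 6, 10), 36000),
    (date(2017, 6, 11), 9000),
    (date(2017, 2, 3), 19000),
    (date(2017, 2, 4), 3000),
]
best = best_day_per_month(days)
top = max(best.values(), key=lambda entry: entry[1])     # highest monthly best
lowest = min(best.values(), key=lambda entry: entry[1])  # lowest monthly best
print(top, lowest)
```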
In 14 of the 24 months (14 / 24 = 58.3%), I met or beat my overall average of 10,538.52475 steps per day.
I beat my average steps per day twice in the months of June, September, October, and December.
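Counting the months whose daily average meets or beats the overall daily average can be sketched like this. It assumes each (year, month) pair counts as its own month, and the function name and sample values are made up for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def months_meeting_average(days):
    """Return (months_at_or_above, total_months) versus the overall daily average."""
    overall = mean(steps for _, steps in days)
    by_month = defaultdict(list)
    for day, steps in days:
        by_month[(day.year, day.month)].append(steps)
    hits = sum(1 for values in by_month.values() if mean(values) >= overall)
    return hits, len(by_month)

# Made-up (date, steps) pairs for illustration only.
days = [
    (date(2016, 7, 1), 12000), (date(2016, 7, 2), 12000),
    (date(2016, 8, 1), 6000), (date(2016, 8, 2), 6000),
    (date(2016, 9, 1), 9000),
]
hits, total = months_meeting_average(days)
print("%d / %d = %.1f%%" % (hits, total, 100.0 * hits / total))
```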
It looks like the single month with the most steps in the past two years was January 2018, with 376,744 steps. February 2017 seems to be a bit of an outlier, with only 132,592 steps for the month. That might have been when my FitBit broke.
Step 5: Communicate Results
From the charts above, it looks like something happened to my Fitbit during February 2017. I was most consistent with my steps during February 2018. On any given day you could expect me to get about 10,538 steps. I get most of my steps on Friday, Saturday, and Sunday, probably because I am busier during the weekdays. I also get a lot of steps during the months of June, October, and December.
Lessons learned and thoughts while doing this project:
- I would like a more automated way of collecting the initial 24 CSV files. If I wanted 10 years of data on myself, collecting it all would take even more time.
- Fitbit doesn’t allow you to collect data on your heart rate, and your resting heart rate is very important for determining how healthy you are. There are some workarounds, but they are not standard tools provided by Fitbit. I don’t understand why Fitbit wouldn’t give you access to your own data.
- Sleep habits and weight seem to be important for determining how healthy you are as well. I might use the Fitbit scale to automatically track my weight and sync it with my Fitbit data, and track my sleeping patterns, to do some analysis on that data another time.
- I will start collecting data from January 1st next time to be more consistent.
Thanks for reading this article; I hope it’s helpful to you all! Keep up the learning, and if you would like more computer science, programming, and algorithm analysis videos, please visit and subscribe to my YouTube channels (randerson112358 & compsci112358). If you are interested in learning data analysis with Tableau and creating charts like the ones in this article, then Tableau Your Data!: Fast and Easy Visual Analysis with Tableau Software is a great book to get started with.