Analysing your Apple Health Data with Splunk

Did you know that, from the iPhone 5s and iOS 8 onwards, the Apple Health App automatically collects your steps, walking and running distance, and flights climbed? This post shows how to export, analyze and visualize your Apple Health data using Splunk; future posts will cover other tools like Python, R and Tableau.

The Apple Health App comes with some nice and quick standard graph views, but I want more analysis and answers to questions like:

  • What was my first step with an iPhone
  • How many steps did I take in total
  • What is the average per day
  • What was my highest day
  • How do my steps and distance compare to last month or the same month last year (if available)
  • How does my activity compare between weekdays and the weekend
  • What is my most active hour of the day
  • Is there an upward trend
  • Are there outlier days
  • How many kilometers did I walk
  • ….

Export the Apple Health Data

First of all I need to export the data from the iPhone. This can be done in a few simple steps:

  1. Start the Apple Health App
  2. Go to “Account Settings” in the upper right corner.
  3. Click “Export Health Data”
  4. Click “Export”
  5. Wait while the export is prepared
  6. Choose how to share the export; my choice was to Mail the data
  7. Save the attachment export.zip and unzip it

After unzipping the file you should see two files: export.xml and export_cda.xml. For my analysis I will use export.xml. Below is a sample of a single record for each of the three standard Apple Health data points. In the export.xml file I also see some other data sources like “Clock”, “MyFitnessPal” and “Runkeeper”. These apps share data with the Health App, which is nice to know for other analyses 😆
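To get a quick feel for which record types and sources an export contains, a minimal Python sketch like the one below can help. The sample XML is made up for illustration, and the attribute names (type, sourceName, unit, startDate, endDate, value) are assumptions about what a typical export.xml looks like, so check them against your own file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical miniature export; a real export.xml is much larger and
# carries more attributes per Record.
SAMPLE_XML = """<HealthData>
  <Record type="HKQuantityTypeIdentifierStepCount" sourceName="iPhone" unit="count"
          startDate="2016-03-01 08:00:00 +0100" endDate="2016-03-01 08:05:00 +0100" value="120"/>
  <Record type="HKQuantityTypeIdentifierDistanceWalkingRunning" sourceName="iPhone" unit="km"
          startDate="2016-03-01 08:00:00 +0100" endDate="2016-03-01 08:05:00 +0100" value="0.09"/>
  <Record type="HKQuantityTypeIdentifierFlightsClimbed" sourceName="iPhone" unit="count"
          startDate="2016-03-01 08:00:00 +0100" endDate="2016-03-01 08:05:00 +0100" value="1"/>
  <Record type="HKQuantityTypeIdentifierStepCount" sourceName="Runkeeper" unit="count"
          startDate="2016-03-01 18:00:00 +0100" endDate="2016-03-01 18:30:00 +0100" value="80"/>
</HealthData>"""


def summarise(xml_text):
    """Count Record elements per type and per source."""
    types, sources = Counter(), Counter()
    for record in ET.fromstring(xml_text).iter("Record"):
        types[record.get("type")] += 1
        sources[record.get("sourceName")] += 1
    return types, sources


types, sources = summarise(SAMPLE_XML)
for name, count in sources.items():
    print(name, count)
```

For a real export of hundreds of megabytes, reading the file incrementally with ET.iterparse is probably the kinder option for your memory.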

Transform Apple Health data from XML to CSV

To transform the Apple Health data I found a nice Python script on GitHub named applehealthdata.py, created by Test-Driven Data Analysis, which does the trick. Copy the script or clone it from GitHub and run it as below.

This is what the first rows of the StepCount.csv file look like:

Finally I have exported and transformed the data and can do some analysis.
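If you are curious what such a transformation boils down to, here is a simplified Python sketch. It is not the applehealthdata.py script, only an illustration of the idea under my own assumptions: the field list is an assumed subset of columns, and the sample records are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed subset of columns; the real script may write more fields.
FIELDS = ["sourceName", "type", "unit", "startDate", "endDate", "value"]

# Hypothetical records for illustration only.
SAMPLE_XML = """<HealthData>
<Record type="HKQuantityTypeIdentifierStepCount" sourceName="iPhone" unit="count" startDate="2016-03-01 08:00:00 +0100" endDate="2016-03-01 08:05:00 +0100" value="120"/>
<Record type="HKQuantityTypeIdentifierStepCount" sourceName="iPhone" unit="count" startDate="2016-03-01 09:00:00 +0100" endDate="2016-03-01 09:10:00 +0100" value="300"/>
<Record type="HKQuantityTypeIdentifierFlightsClimbed" sourceName="iPhone" unit="count" startDate="2016-03-01 09:00:00 +0100" endDate="2016-03-01 09:10:00 +0100" value="2"/>
</HealthData>"""


def records_to_csv(xml_text, record_type):
    """Return CSV text with one row per Record of the given type."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for record in ET.fromstring(xml_text).iter("Record"):
        if record.get("type") == record_type:
            writer.writerow(record.attrib)
    return out.getvalue()


step_csv = records_to_csv(SAMPLE_XML, "HKQuantityTypeIdentifierStepCount")
print(step_csv)
```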

This post does not go into detail on installing, setting up and configuring Splunk. If you have questions about the details, contact me via the social media icons in the lower right corner of my site.

Splunk

I created a new index (apple) and a new App (Apple Health). For now I will only add the StepCount.csv and DistanceWalkingRunning.csv data to the apple index.

New Index:
Select Settings > Indexes > New Index

  • Index name (apple)
  • Save

New App:
Select Apps > Manage Apps > Create App

  • Name (Apple Health)
  • Folder Name (apple_health)
  • Save

Upload data:

Splunk is time-based and needs a timestamp for each event when it is indexed. “endDate” is one of the field headers of StepCount.csv.

I want to set up a new sourcetype “csv:apple:health” with “endDate” as TIMESTAMP_FIELDS, because Splunk either did not recognise the timestamp or chose one of the three timestamps in the data by itself, and I want control over which timestamp is used.
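To make the "three timestamps" point concrete, here is a small Python sketch that picks one timestamp field explicitly, the same decision the sourcetype makes for Splunk. The row is hypothetical, the column names creationDate and startDate are my assumption about the CSV header, and the timestamp layout is assumed to be "YYYY-MM-DD HH:MM:SS +ZZZZ".

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # assumed timestamp layout in the CSV

# Hypothetical row with three candidate timestamps.
row = {
    "creationDate": "2016-03-01 09:00:00 +0100",
    "startDate": "2016-03-01 08:00:00 +0100",
    "endDate": "2016-03-01 08:05:00 +0100",
    "value": "120",
}


def event_time(row, field="endDate"):
    """Parse the chosen timestamp field into a timezone-aware datetime."""
    return datetime.strptime(row[field], TS_FORMAT)


print(event_time(row).isoformat())
```

Choosing the field explicitly keeps the analysis reproducible: the same row lands in the same hour no matter what a parser would have guessed.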

Select Settings > Add Data (icon) > Upload > Select File

Next > Change timestamp > Advanced >

  • Timestamp field (endDate)
  • Save as (csv:apple:health)
  • Next
  • Host field value (change to the value of the host where the data is coming from, in my case johns_iphone)
  • Index (apple)
  • Review
  • Submit

The data is indexed, so let’s run the first search in SPL (Splunk’s Search Processing Language) and answer some of the questions above.

What was my first step with the iPhone
[Image: first_step]

Let’s save it to a dashboard panel. This is the first panel we are saving, so there is no existing dashboard yet: switch to New and create it directly. Next time we only have to choose Existing.
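As a cross-check outside Splunk, the first step can also be found with a few lines of Python. This is only a sketch on hypothetical rows; the column name startDate and the timestamp layout are assumptions about the CSV.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # assumed timestamp layout

# Hypothetical rows, deliberately not in chronological order.
rows = [
    {"startDate": "2016-03-02 07:30:00 +0100", "value": "45"},
    {"startDate": "2016-03-01 08:00:00 +0100", "value": "120"},
    {"startDate": "2016-03-01 12:15:00 +0100", "value": "300"},
]


def first_step(rows):
    """Return the row with the earliest start timestamp."""
    return min(rows, key=lambda r: datetime.strptime(r["startDate"], TS_FORMAT))


print(first_step(rows)["startDate"])
```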

How many steps did I take in total
Switch to the visualisation tab and choose Single Value.

[Image: total_steps]

Save it to your dashboard.

What is the average per day
[Image: average_step_per_day]

Save it to your dashboard.
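The idea behind the daily average is to sum the steps per calendar day first and then average those daily totals. A minimal Python sketch, assuming hypothetical rows with endDate and value columns:

```python
from collections import defaultdict

# Hypothetical rows; two records on 1 March, one on 2 March.
rows = [
    {"endDate": "2016-03-01 08:05:00 +0100", "value": "120"},
    {"endDate": "2016-03-01 18:30:00 +0100", "value": "380"},
    {"endDate": "2016-03-02 09:00:00 +0100", "value": "300"},
]


def steps_per_day(rows):
    """Sum step values per calendar day (first 10 characters of endDate)."""
    daily = defaultdict(int)
    for row in rows:
        daily[row["endDate"][:10]] += int(float(row["value"]))
    return dict(daily)


def average_per_day(rows):
    """Average of the daily totals; days without any record are not counted."""
    daily = steps_per_day(rows)
    return sum(daily.values()) / len(daily)


print(average_per_day(rows))
```

Note that days without any record are left out here, which may differ from a time-bucketed chart that fills empty days with zero.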

What was my highest day

Save it to your dashboard.
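Finding the highest day works the same way: total the steps per day and take the maximum. A small Python sketch on hypothetical rows, with the endDate and value columns assumed:

```python
from collections import defaultdict

# Hypothetical rows spread over three days.
rows = [
    {"endDate": "2016-03-01 08:05:00 +0100", "value": "120"},
    {"endDate": "2016-03-01 18:30:00 +0100", "value": "380"},
    {"endDate": "2016-03-02 09:00:00 +0100", "value": "300"},
    {"endDate": "2016-03-03 12:00:00 +0100", "value": "900"},
]


def highest_day(rows):
    """Return (day, total_steps) for the day with the most steps."""
    daily = defaultdict(int)
    for row in rows:
        daily[row["endDate"][:10]] += int(float(row["value"]))
    return max(daily.items(), key=lambda item: item[1])


print(highest_day(rows))
```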

How do my steps and distance compare to last month or the same month last year (if available)

I do not have data for multiple years, so I cannot compare years. If you do, change span=1d to span=1mon and group by date_year.
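For readers who do have several years of data, the month-by-year comparison can be sketched in Python as follows. The rows are hypothetical and the endDate and value columns are assumed.

```python
from collections import defaultdict

# Hypothetical rows covering March 2015, February 2016 and March 2016.
rows = [
    {"endDate": "2015-03-10 08:00:00 +0100", "value": "1000"},
    {"endDate": "2016-02-20 08:00:00 +0100", "value": "2000"},
    {"endDate": "2016-03-01 08:00:00 +0100", "value": "500"},
    {"endDate": "2016-03-15 08:00:00 +0100", "value": "700"},
]


def steps_per_month(rows):
    """Sum step values per (year, month)."""
    monthly = defaultdict(int)
    for row in rows:
        year, month = int(row["endDate"][:4]), int(row["endDate"][5:7])
        monthly[(year, month)] += int(float(row["value"]))
    return dict(monthly)


def same_month_by_year(monthly, month):
    """Pick one calendar month and list its total for every year."""
    return {year: total for (year, m), total in monthly.items() if m == month}


monthly = steps_per_month(rows)
print(same_month_by_year(monthly, 3))
```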

How does my activity compare between weekdays and the weekend

Go to the visualisation tab, change the visualisation to Bar chart and do some formatting.

Select Format > Stack mode > stacked

[Image: weekdays]

Save it to your dashboard.
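The weekday versus weekend split comes down to bucketing each record by its day of the week. A minimal Python sketch, assuming hypothetical rows and the timestamp layout shown in the comment:

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # assumed timestamp layout

# Hypothetical rows; the weekday of each date is noted in the comment.
rows = [
    {"endDate": "2016-03-01 08:05:00 +0100", "value": "500"},  # Tuesday
    {"endDate": "2016-03-05 11:00:00 +0100", "value": "700"},  # Saturday
    {"endDate": "2016-03-06 16:20:00 +0100", "value": "200"},  # Sunday
]


def weekday_weekend_totals(rows):
    """Sum steps separately for Monday-Friday and Saturday-Sunday."""
    totals = {"weekday": 0, "weekend": 0}
    for row in rows:
        day = datetime.strptime(row["endDate"], TS_FORMAT).weekday()  # Monday == 0
        bucket = "weekend" if day >= 5 else "weekday"
        totals[bucket] += int(float(row["value"]))
    return totals


print(weekday_weekend_totals(rows))
```

Keep in mind that there are five weekdays and only two weekend days, so averages per day compare more fairly than raw totals.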

What is my most active hour of the day

This one is easier to see with a visualisation add-on.

Select App: Apple Health > Find more Apps

If your Splunk machine has an internet connection you will land on the Splunk Apps Browser. Search for Punchcard and install it with your Splunk account. After the installation we have a new visualisation that we will use.

Change your visualisation to Punchcard.

[Image: punchcard]

Save it to your dashboard.
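A punchcard is essentially a grid of day of the week against hour of the day, with the step total in each cell. The Python sketch below builds that grid from hypothetical rows; the endDate and value columns and the timestamp layout are assumptions.

```python
from collections import defaultdict
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # assumed timestamp layout
DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical rows: two on Tuesday morning, one on Saturday afternoon.
rows = [
    {"endDate": "2016-03-01 08:05:00 +0100", "value": "120"},
    {"endDate": "2016-03-01 08:40:00 +0100", "value": "200"},
    {"endDate": "2016-03-05 17:10:00 +0100", "value": "250"},
]


def punchcard(rows):
    """Sum steps per (day name, hour of day) cell."""
    grid = defaultdict(int)
    for row in rows:
        ts = datetime.strptime(row["endDate"], TS_FORMAT)
        grid[(DAY_NAMES[ts.weekday()], ts.hour)] += int(float(row["value"]))
    return dict(grid)


def most_active_hour(rows):
    """Return the hour of day with the highest step total over all days."""
    per_hour = defaultdict(int)
    for (_, hour), steps in punchcard(rows).items():
        per_hour[hour] += steps
    return max(per_hour, key=per_hour.get)


print(punchcard(rows))
print(most_active_hour(rows))
```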

Is there an upward trend

For this one we want to do a linear regression. To do that we will create a macro that hides the complexity of the underlying search. To create the macro, go to:

Select Settings > Advanced search > Search Macros > New

And follow the steps on wiki.splunk.com.

Change the visualisation back to Line Chart.

[Image: lr]
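To show what a linear regression does with the daily totals, here is an ordinary least-squares fit written out in plain Python. It is my own sketch of the technique, not the macro from the wiki, and the daily totals are hypothetical.

```python
def linear_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical daily step totals for four consecutive days.
daily_totals = [100, 200, 300, 400]
slope, intercept = linear_trend(daily_totals)
print(slope, intercept)
```

A positive slope means the daily totals tend to go up over time, and the slope itself is the average change in steps per day.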

Are there outlier days

This one also needs another visualisation, which is part of the Machine Learning Toolkit. The Python for Scientific Computing Add-on must be installed before the Machine Learning Toolkit. Choose the appropriate version on the details tab and install both.

If everything goes well you can choose from five additional visualisations on the visualisation tab.

Change the visualisation to Outliers Chart.

[Image: outliers]
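One common way to flag outlier days is the median absolute deviation (MAD). I am not claiming this is what the Outliers Chart does internally; the sketch below is a stand-in technique on hypothetical daily totals, with a threshold of 3 chosen as an assumption.

```python
from statistics import median

# Hypothetical daily step totals; 5 March is clearly unusual.
daily = {
    "2016-03-01": 4000,
    "2016-03-02": 4200,
    "2016-03-03": 3900,
    "2016-03-04": 4100,
    "2016-03-05": 15000,
}


def mad_outliers(daily, threshold=3.0):
    """Return days whose distance from the median exceeds threshold * MAD."""
    values = list(daily.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return {}  # all days (nearly) identical, nothing to flag
    return {day: v for day, v in daily.items() if abs(v - med) / mad > threshold}


print(mad_outliers(daily))
```

The median is used instead of the mean so that one extreme day does not pull the baseline towards itself.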

How many kilometers did I walk
[Image: km]
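When totalling the distance, it is worth checking the unit of each record, since an export may store distances in miles instead of kilometers depending on the settings. A minimal Python sketch on hypothetical DistanceWalkingRunning rows, with the unit and value columns assumed:

```python
KM_PER_MILE = 1.609344  # exact definition of the international mile

# Hypothetical rows; the last one is recorded in miles.
rows = [
    {"unit": "km", "value": "0.5"},
    {"unit": "km", "value": "1.25"},
    {"unit": "mi", "value": "1"},
]


def total_km(rows):
    """Sum distances in kilometers, converting miles where needed."""
    total = 0.0
    for row in rows:
        distance = float(row["value"])
        if row["unit"] == "mi":
            distance *= KM_PER_MILE
        elif row["unit"] != "km":
            raise ValueError(f"unexpected unit: {row['unit']}")
        total += distance
    return total


print(round(total_km(rows), 3))
```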

Finally

With some rearrangement of the dashboard, some extra add-ons and a few filters, you finally have a complete dashboard of your Apple Health step data.

[Images: num1, num2, num3]

Installed add-ons

Have fun playing around, as you can do a lot more with the Apple Health data; maybe you have a HeartRate.csv or SleepAnalysis.csv. If you have questions, do not hesitate to contact me via one of the social media icons on my site.

The Splunk App I created is available on my GitHub.

I will share posts that use other tools (e.g. Python, R or Tableau) with this data on my blog as soon as possible. Stay tuned for the posts.
