If you are a gym owner, and you're using Triib to manage membership, workouts, etc., you can get all your members' workout data off of the site with one simple command: yarn start. You'll be prompted for additional information about the task you want to perform and the member group you want to perform it for. Then you'll wait while the script scrapes Triib's admin pages. Repeat for each member group you're interested in and then move on to the next task. By the end you'll have all your members' information in a nice CSV file that you can open in Excel or something.
Feel free to skip to the Prerequisites section below if you want to dig in right away.
This repo is the result of trying to figure out a way to extract workout results from Triib and transform them into a useful format for importing into another fitness-tracking app. Triib bills itself as "everything you need to excite your members, manage your gym and build your business." And they do offer a lot—from workout tracking to billing to customer management. But when the owner of my gym wanted to switch to another service, the Triib folks suddenly became very unhelpful.
I suspected things might get difficult because their site and mobile app are, in my opinion, kind of janky. They've never offered an API or a way for gym members to export all of their workout results. They only had one place where we could download a CSV file of the latest result from each workout.
Our gym owner is a data geek, and he loves to monitor his members' progress. He also wanted a way for us to be able to hold on to all of our info that we painstakingly entered into the app after each workout. Unfortunately, Triib wasn't much help. In fact, they seemed almost obstructionist, telling the owner that there was "no possible way to get the data." I mean, it must be in a database, right? How hard could it be?
I didn't have direct access to the database, of course, so I needed to get at the data a different way. Using the owner's credentials, I poked around in the admin area and discovered that there was, in fact, a way to get all of the data. The service our gym is switching to now said they could import our old data if we gave it to them in a CSV file with a specific set of "columns": athlete, date, workout title, "is rx," result, set rep scheme, and notes.
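To make the target format concrete, here is a minimal sketch of turning one result into a CSV row with those seven columns. The property names (workoutTitle, isRx, setRepScheme), the sample values, and the RFC 4180 style quoting are all assumptions for illustration; the repo may name and format things differently.

```javascript
// Column order follows the list above. The property names are hypothetical.
const COLUMNS = ['athlete', 'date', 'workoutTitle', 'isRx', 'result', 'setRepScheme', 'notes'];

// Quote a field only if it contains a comma, a double quote, or a line break,
// doubling any embedded double quotes (RFC 4180 style).
function escapeCsvField(value) {
  const text = value === null || value === undefined ? '' : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build one CSV line from a result object.
function toCsvRow(record) {
  return COLUMNS.map((key) => escapeCsvField(record[key])).join(',');
}

// Made-up sample data, only to show the quoting behavior.
const row = toCsvRow({
  athlete: 'Jane Doe',
  date: '2019-03-04',
  workoutTitle: 'Fran',
  isRx: true,
  result: '4:32',
  setRepScheme: '21-15-9',
  notes: 'Felt good, "almost" unbroken',
});
console.log(row);
```

The notes field is the one most likely to need quoting, since members type free text into it.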
Because our gym has had a lot of members over the years, and those members have recorded a lot of workouts, getting all of the information was going to be a slow process, involving over 100,000 requests to Triib's server. As much as I didn't appreciate their lack of help, I didn't feel it would be right to bring the server to its knees. So, within each step, the requests are made one at a time. Here is what we do:
- Scrape the active members page for name, email, and link to workout history. When that is done, scrape the "other members" page for inactive members. Then do the same for "on-hold" and "archived" members. When that is done, we have a directory for each member status, and in those directories a JSON file for each member.
- For each member status, run a task that opens all the workout pages for each member. Add an array of workouts to the member info, along with the result details page for each one, and save it all into a new JSON file.
- Again, for each member status, run a task that opens the result details page for every workout, grabs all those details, and saves it all into yet another JSON file for every member.
- Now that we've pulled down all the info from the site, we go through all of the "results/scores" files and clean them up, removing duplicates, adjusting the "score" column so it matches the expected format, etc.
- From the "csv-ready" files, create a CSV file for each member group.
- Combine the CSV files into a single CSV of all the data.
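The "one at a time" idea behind the scraping steps above can be sketched roughly like this. It is not the repo's actual code: processSequentially and handler are hypothetical names, handler stands in for "fetch and scrape one page," and the optional pause between requests is my own addition.

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `handler` on each item strictly in order. Because each call is awaited
// inside the loop, the next request does not start until the previous one
// has finished, so the server only ever sees one request at a time.
async function processSequentially(items, handler, pauseMs = 0) {
  const results = [];
  for (const item of items) {
    results.push(await handler(item));
    if (pauseMs > 0) {
      await sleep(pauseMs); // optional breather between requests (assumption)
    }
  }
  return results;
}

// Usage with a fake handler instead of a real network request.
processSequentially(['member-1', 'member-2', 'member-3'], async (id) => {
  await sleep(10); // pretend this is a page load
  return { id, workouts: [] };
}).then((members) => console.log(`processed ${members.length} members`));
```

The trade-off is speed: with over 100,000 requests, a plain awaited loop is slow, but it is much kinder to the server than firing everything at once with Promise.all.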
That's it for the narrative part. The rest are the technical details if you'd like to try this at home.
- Node.js
- An admin account for a gym (aka "box") at triib.com
- Patience
- Clone the repo:
  git clone git@github.com:kswedberg/triib-scrape.git
- Install dependencies:
  yarn install
- Copy .env.example to .env:
  cp .env.example .env
- In .env, enter your Triib login credentials and the URL to your Triib instance:
  USER_EMAIL='[email protected]'
  USER_PWD='YOUR_TRIIB_PASSWORD'
  BASE_URL='https://YOUR_SUBDOMAIN.triib.com'
There is only one command: yarn start. All files will be stored in the project's (git-ignored) /data/ directory, in sub-directories named by task.
Here are the prompts you'll possibly see (depending on the task you choose at the first prompt):
$ yarn start
? task: (Use arrow keys)
Fetch Members
Fetch workouts for each member
Fetch scores for each workout
Prepare scores data for CSV
Create CSV
Combine CSV files into "all.csv"
? which member type are we dealing with? (Use arrow keys)
active
inactive
archived
on-hold
? where in the array of members do you want to start? (0)
? where in the array of members do you want to end?
The last two prompts are there because sometimes you need to stop a task (e.g. Fetch workouts for each member) before it completes and pick it up later. Or you might want to just test things by fetching scores for only the first 5 members, for example.
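As a rough illustration of how the start and end answers could narrow down the work, here is a small sketch. selectMembers is a hypothetical helper, and I'm assuming the start index is inclusive, the end index is exclusive, and a blank end means "through the last member." The script itself may treat the end differently.

```javascript
// Pick the slice of members to process from the answers to the two prompts.
// Prompt answers may arrive as strings, so they are converted to numbers.
function selectMembers(members, start = 0, end) {
  const from = Number(start) || 0;
  const to = end === undefined || end === null || end === '' ? members.length : Number(end);
  return members.slice(from, to);
}

const members = ['ann', 'bob', 'cat', 'dan', 'eve', 'fay', 'gus'];

// Test run: only the first 5 members.
console.log(selectMembers(members, 0, 5));

// Resume a task that was stopped after the first 5 members.
console.log(selectMembers(members, '5', ''));
```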
Feel free to post an issue if you have any questions or concerns about this insane project.