Update README.
MartinHeinz committed Mar 29, 2022
1 parent b02a723 commit 6d4bbc3
Showing 1 changed file with 60 additions and 25 deletions.
85 changes: 60 additions & 25 deletions README.md
@@ -1,6 +1,8 @@
# Google Analytics Extractor

## Google Cloud API Access
## Setup

You will need Google Cloud API access to run the CLI:

- Navigate to [Cloud Resource Manager](https://console.cloud.google.com/cloud-resource-manager) and click _Create Project_
- alternatively create project with `gcloud projects create $PROJECT_ID`
@@ -18,18 +20,24 @@
- email: SA email from earlier
- role: _Viewer_

Alternatively see <https://martinheinz.dev/blog/62>.
## Running
Alternatively see <https://martinheinz.dev/blog/62>.

To install and run:

```bash
python -m ga_extractor --help
pip install ga-extractor
ga-extractor --help
```

Setup:
## Running

```bash
python -m ga_extractor setup \
ga-extractor --help
# Usage: ga-extractor [OPTIONS] COMMAND [ARGS]...
# ...

# Create config file:
ga-extractor setup \
--sa-key-path="analytics-api-24102021-4edf0b7270c0.json" \
--table-id="123456789" \
--metrics="ga:sessions" \
@@ -39,34 +47,48 @@ python -m ga_extractor setup \

cat ~/.config/ga-extractor/config.yaml # Optionally, check config

python -m ga_extractor auth # Test authentication
Successfully authenticated with user: ...
ga-extractor auth # Test authentication
# Successfully authenticated with user: ...

ga-extractor setup --help # For options and flags
```

Value for `--table-id` can be found in GA web console - Click on _Admin_ section, _View Settings_ and see _View ID_ field
- Value for `--table-id` can be found in GA web console - Click on _Admin_ section, _View Settings_ and see _View ID_ field
- All configurations and generated extracts/reports are stored in `~/.config/ga-extractor/...`
- You can also use metrics and dimensions presets using `--preset` with `FULL` or `BASIC`, if you're not sure which data to extract
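
Because every extract is written to the same directory, a small helper can pick out the most recent one. The following is a minimal sketch, not part of `ga-extractor`: `latest_extract` is a hypothetical helper, and it assumes the `<uuid>_extract.<ext>` file naming seen in the example output of this README.

```python
from pathlib import Path
from typing import Optional

# Assumed default location, taken from the paths shown in this README.
CONFIG_DIR = Path.home() / ".config" / "ga-extractor"


def latest_extract(directory: Path = CONFIG_DIR, suffix: str = ".csv") -> Optional[Path]:
    """Return the most recently modified '*_extract<suffix>' file, or None."""
    # Hypothetical naming pattern: '<uuid>_extract.csv' or '<uuid>_extract.sql'.
    candidates = list(directory.glob(f"*_extract{suffix}"))
    if not candidates:
        return None
    # Pick the file with the newest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(latest_extract())
```

Passing `suffix=".sql"` would look for the SQL extracts instead, under the same naming assumption.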

Extract:
### Extract

```bash
python -m ga_extractor extract
ga-extractor extract
# Report written to /home/some-user/.config/ga-extractor/report.json
```

Migrate:
`extract` performs a raw extraction of dimensions and metrics using the provided config.

### Migrate

You can directly extract and transform data to various formats. Available options are:

- JSON (Default option; Default API output)
- CSV
- SQL (compatible with _Umami_ Analytics PostgreSQL backend)

```bash
python -m ga_extractor migrate --format=UMAMI
# Report written to /home/user/.config/ga-extractor/cee9e1d0-3b87-4052-a295-1b7224c5ba78_extract.sql
ga-extractor migrate --format=CSV
# Report written to /home/user/.config/ga-extractor/02c2db1a-1ff0-47af-bad3-9c8bc51c1d13_extract.csv

cat cee9e1d0-3b87-4052-a295-1b7224c5ba78_extract.sql | docker exec -i db psql -Upostgres -a blog
```
head /home/user/.config/ga-extractor/02c2db1a-1ff0-47af-bad3-9c8bc51c1d13_extract.csv
# path,browser,os,device,screen,language,country,referral_path,count,date
# /,Chrome,Android,mobile,1370x1370,zh-cn,China,(direct),1,2022-03-18
# /,Chrome,Android,mobile,340x620,en-gb,United Kingdom,t.co/,1,2022-03-18

This should be run against a clean database; consider running the following if possible:
ga-extractor migrate --format=UMAMI
# Report written to /home/user/.config/ga-extractor/cee9e1d0-3b87-4052-a295-1b7224c5ba78_extract.sql

```sql
-- THIS WILL WIPE YOUR DATA
TRUNCATE public.pageview RESTART IDENTITY CASCADE;
TRUNCATE public.session RESTART IDENTITY CASCADE;
# IMPORTANT: Verify the data against a test database before inserting into a production instance
# To insert into DB (this should be run against a clean database):
cat cee9e1d0-3b87-4052-a295-1b7224c5ba78_extract.sql | psql -Upostgres -a some-db
```

You can verify the data is correct in Umami web console and GA web console:
@@ -76,13 +98,26 @@ You can verify the data is correct in Umami web console and GA web console:

_Note: Some data in the GA and Umami web consoles might be a little off, because GA displays many metrics based on sessions (e.g. Sessions by device), while the data is extracted/migrated based on page views. You can, however, confirm that the percentage breakdown of browser or OS usage matches._
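
To compare the percentage breakdown mentioned in the note, the CSV extract can be aggregated locally. This is a rough sketch that assumes the column names shown in the CSV sample (`browser`, `os`, `count`, ...); `percentage_breakdown` and `breakdown_from_csv` are hypothetical helpers, not part of `ga-extractor`.

```python
import csv
from collections import Counter
from typing import Dict, Iterable


def percentage_breakdown(rows: Iterable[dict], column: str) -> Dict[str, float]:
    """Sum the `count` column per distinct value of `column`, as percentages."""
    totals: Counter = Counter()
    for row in rows:
        # Each CSV row carries a page view count for its dimension combination.
        totals[row[column]] += int(row["count"])
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {key: 100.0 * value / grand_total for key, value in totals.items()}


def breakdown_from_csv(path: str, column: str = "browser") -> Dict[str, float]:
    """Read a CSV extract and compute the percentage breakdown for one column."""
    with open(path, newline="", encoding="utf-8") as handle:
        return percentage_breakdown(csv.DictReader(handle), column)
```

Calling `breakdown_from_csv(path, "os")` on a CSV extract would give the OS shares to compare against the percentages shown in the two web consoles.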

## Testing
## Development

### Setup

Requirements:

- Poetry (+ virtual environment)

```bash
poetry install
python -m ga_extractor --help
```

### Testing

```bash
pytest
```

## Building Package
### Building Package

```bash
poetry install