A comprehensive toolkit for exporting and managing user data from AWS Cognito User Pools.
- 📤 Robust Cognito Data Export: Export user attributes from AWS Cognito User Pools to CSV format
- 🔄 Exponential Backoff with Jitter: Automatically handles AWS API rate limits with intelligent retry logic
- ⏸️ Checkpoint & Resume: Save progress during exports and resume from where you left off
- 🧹 CSV Deduplication: Remove duplicate user entries from exported CSV files
- 🔍 Flexible Attribute Selection: Export specific attributes or discover and export all available attributes
- 📃 Pagination Support: Efficiently handles large user pools with proper pagination
- 🎯 Attribute & Group Filtering: Export only users matching a filter or belonging to a specific group
- ☁️ Optional S3 Upload: Automatically upload exports to S3 with optional gzip compression
- Python 3.10+
- AWS credentials configured (either via environment variables, credentials file, or IAM role)
- Poetry for dependency management.
- Clone this repository:
git clone https://github.com/tblakex01/Cognito-Attribute-Exporter.git
cd Cognito-Attribute-Exporter
- Install using pip:
pip install .
This installs the package and provides the cognito-export and cognito-dedup commands.
This project uses Pytest for testing. To run the tests:
- Ensure you have installed the development dependencies:
poetry install --with dev
- Run Pytest:
poetry run pytest
Use the cognito-export command to export user data from Cognito:
cognito-export --user-pool-id YOUR_POOL_ID --export-all
- --user-pool-id: Your Cognito User Pool ID (required)
- --export-all: Export all available attributes
- -attr, --export-attributes: List specific attributes to export
- --region: AWS region (default: us-east-1)
- --profile: AWS profile to use
- -f, --file-name: Output CSV filename
- --filter: Filter expression for Cognito users
- --group-name: Only export users from this group
- --s3-bucket: Upload the resulting CSV to this bucket
- --compress: Compress the CSV before upload
- --max-retries: Maximum retry attempts for rate-limited requests
- --resume: Resume from last saved checkpoint
📋 Export all attributes:
cognito-export --user-pool-id us-east-1_abcdefghi --export-all
📋 Export specific attributes:
cognito-export --user-pool-id us-east-1_abcdefghi --export-attributes username email phone_number
📋 Resume an interrupted export:
cognito-export --user-pool-id us-east-1_abcdefghi --export-all --resume
📋 Custom retry settings:
cognito-export --user-pool-id us-east-1_abcdefghi --export-all --max-retries 10 --base-delay 1.0
📋 Upload to S3 with compression:
cognito-export --user-pool-id us-east-1_abcdefghi --export-all --s3-bucket my-bucket --compress
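As a rough picture of what the compression step before upload might look like, here is a minimal sketch using Python's standard gzip module. The function name gzip_file and the ".gz" suffix are assumptions for illustration, not the tool's actual internals, and the upload itself appears only as a comment.

```python
import gzip
import shutil


def gzip_file(src_path, dest_path=None):
    """Compress src_path with gzip and return the path of the compressed copy.

    Hypothetical helper: the real exporter may name or place the file differently.
    """
    dest_path = dest_path or src_path + ".gz"
    # Stream the file in chunks so large exports are never held in memory at once.
    with open(src_path, "rb") as src, gzip.open(dest_path, "wb") as dest:
        shutil.copyfileobj(src, dest)
    return dest_path


# An upload could then follow, for example with boto3 (assumed, not shown running):
#   boto3.client("s3").upload_file(gz_path, "my-bucket", "CognitoUsers.csv.gz")
```

Streaming with shutil.copyfileobj keeps memory use flat regardless of how large the exported CSV is.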
The deduplication tool removes duplicate entries from exported CSV files:
cognito-dedup CognitoUsers.csv
- input_file: Path to the CSV file to deduplicate (required)
- -o, --output-file: Custom output file path
- -k, --keys: Column names to use as unique keys (default: sub)
- --keep-last: Keep the last occurrence of duplicates instead of the first
- --dry-run: Check for duplicates without modifying files
📝 Basic deduplication:
cognito-dedup CognitoUsers.csv
📝 Custom key fields:
cognito-dedup CognitoUsers.csv -k username email
📝 Check for duplicates without making changes:
cognito-dedup CognitoUsers.csv --dry-run
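To make the key and keep-last behavior concrete, here is a minimal sketch of key-based deduplication over CSV rows. The deduplicate function is a hypothetical illustration, not the cognito-dedup implementation; it assumes each row is a dict as produced by csv.DictReader.

```python
import csv
import io


def deduplicate(rows, keys=("sub",), keep_last=False):
    """Return rows with duplicates removed, where a duplicate shares all `keys` values.

    With keep_last=True the later row wins, but it stays in the position
    where its key first appeared (dicts preserve insertion order).
    """
    chosen = {}
    for row in rows:
        key = tuple(row[k] for k in keys)
        if keep_last or key not in chosen:
            chosen[key] = row
    return list(chosen.values())


# Small in-memory CSV standing in for an exported file.
sample = io.StringIO(
    "sub,email\n"
    "1,a@example.com\n"
    "2,b@example.com\n"
    "1,a+new@example.com\n"
)
rows = list(csv.DictReader(sample))

unique_first = deduplicate(rows)                  # keeps a@example.com for sub=1
unique_last = deduplicate(rows, keep_last=True)   # keeps a+new@example.com for sub=1
```

Passing several column names as keys, as with -k username email, treats rows as duplicates only when every listed column matches.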
The Cognito Exporter includes built-in features to handle AWS API rate limits:
- 📈 Exponential Backoff: Automatically increases wait time between retries
- 🎲 Jitter: Adds randomness to retry intervals to prevent synchronized retries
- ⚙️ Configurable Retry Parameters: Customize max retries and delay settings
- 🛡️ Built-in Rate Limiting: Adds small delays between API calls to reduce throttling
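The retry behavior described above can be sketched as follows. This is an illustrative "full jitter" variant, not the exporter's actual code: ThrottledError, backoff_delay, call_with_retries, and the max_delay cap are all assumed names and parameters.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for an AWS throttling error."""


def backoff_delay(attempt, base_delay=1.0, max_delay=30.0, rng=random.random):
    """Full-jitter delay: a random wait between 0 and min(max_delay, base_delay * 2**attempt)."""
    ceiling = min(max_delay, base_delay * (2 ** attempt))
    return rng() * ceiling


def call_with_retries(func, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on ThrottledError with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except ThrottledError:
            if attempt == max_retries:
                raise  # out of retries, surface the error to the caller
            sleep(backoff_delay(attempt, base_delay))
```

The delay ceiling doubles with each attempt, while the random factor spreads out clients that were throttled at the same moment so they do not all retry together.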
The export process automatically saves checkpoints to allow resuming interrupted exports:
- Checkpoints are saved every 10 pages or 500 records
- Use the --resume flag to continue from the last checkpoint
- Checkpoint files are saved with the .checkpoint extension
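A checkpoint of this kind could be as simple as a small JSON file holding the last pagination token. The sketch below is an assumption about the shape of such a file, not the tool's real format; the field names and the helper functions are hypothetical, and "every 10 pages or 500 records" is read here as "whichever threshold is reached first".

```python
import json
import os


def save_checkpoint(path, pagination_token, records_written):
    """Write the checkpoint atomically so a crash cannot leave a half-written file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(
            {"pagination_token": pagination_token, "records_written": records_written},
            fh,
        )
    os.replace(tmp_path, path)  # atomic rename over any previous checkpoint


def load_checkpoint(path):
    """Return the saved state, or None when there is nothing to resume from."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def should_checkpoint(pages_done, records_since_last, every_pages=10, every_records=500):
    """True when either the page threshold or the record threshold has been reached."""
    hit_page_mark = pages_done > 0 and pages_done % every_pages == 0
    return hit_page_mark or records_since_last >= every_records
```

On resume, the saved pagination token would be passed back to the next listing call so the export continues from the page after the last one written.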
When using --export-all, the tool automatically:
- Samples users to discover all available attributes
- Includes both standard and custom attributes
- Falls back to common attributes if no users are found
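The discovery steps above can be sketched roughly like this. The user dictionaries follow the shape returned by Cognito's ListUsers API (an "Attributes" list of Name/Value pairs); the function name and the fallback list are assumptions, since the actual fallback attributes are not stated here.

```python
# Hypothetical fallback list; the real tool's "common attributes" may differ.
FALLBACK_ATTRIBUTES = ["sub", "email", "email_verified"]


def discover_attributes(sample_users, fallback=FALLBACK_ATTRIBUTES):
    """Collect every attribute name seen in a sample of users, in first-seen order.

    Standard and custom attributes (e.g. "custom:tier") are treated alike.
    Returns a copy of `fallback` when the sample yields no attributes.
    """
    seen = []
    for user in sample_users:
        for attr in user.get("Attributes", []):
            name = attr["Name"]
            if name not in seen:
                seen.append(name)
    return seen if seen else list(fallback)


sample_users = [
    {"Username": "a", "Attributes": [{"Name": "sub", "Value": "1"},
                                     {"Name": "email", "Value": "a@example.com"}]},
    {"Username": "b", "Attributes": [{"Name": "sub", "Value": "2"},
                                     {"Name": "custom:tier", "Value": "gold"}]},
]
columns = discover_attributes(sample_users)
```

Because discovery is based on a sample, an attribute that is set only on users outside the sample would be missed; naming it explicitly with --export-attributes avoids that.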
- Rate Limiting Errors: Try increasing --base-delay and --max-retries
- Memory Issues: Export specific attributes instead of all attributes
- CSV Parsing Problems: Ensure the CSV is properly encoded (UTF-8)
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.
Made with ❤️ for AWS Cognito users by Anthony Michaels.