All notable changes to our API will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Until we reach API 1.0, the following special rules apply:
- If you add a feature or fix a bug, please bump the version from 0.x.y to 0.x.(y+1).
- If you make a breaking change, please bump the version from 0.x.y to 0.(x+1).0.
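The two pre-1.0 rules above can be sketched as a small helper. This is only an illustration: the function name `bump_version` and its `breaking` flag are hypothetical and are not part of the API or its tooling.

```python
import re


def bump_version(version: str, breaking: bool) -> str:
    """Apply the pre-1.0 rules: breaking -> 0.(x+1).0, otherwise 0.x.(y+1)."""
    match = re.fullmatch(r"0\.(\d+)\.(\d+)", version)
    if match is None:
        # The special rules only cover versions below 1.0.
        raise ValueError(f"expected a pre-1.0 version like 0.x.y, got {version!r}")
    minor, patch = int(match.group(1)), int(match.group(2))
    if breaking:
        # Breaking change: bump the minor number and reset the patch number.
        return f"0.{minor + 1}.0"
    # Feature or bug fix: bump only the patch number.
    return f"0.{minor}.{patch + 1}"
```

For example, a bug fix on 0.4.2 would yield 0.4.3, while a breaking change would yield 0.5.0.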
- Add `identity_provider` field to lookup API response to describe which identity provider was used to match the identity.
- Add residence_count to GET /cohorts
- `GET /scopes/{scope_id}/analysis` to return scope payload resource probability distribution
- Enable more permissive glob patterns for `prefix` connection options.
- Make webhook secrets read-only
- Added `secret` (sha256 signing key) to webhook endpoints
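A receiver might use the `secret` roughly as follows. This sketch assumes the signature is an HMAC-SHA256 of the raw request body, hex-encoded. The entry above only says "sha256 signing key", so the exact scheme, the encoding, and how the signature is delivered are all assumptions here, and `signature_is_valid` is a hypothetical helper name.

```python
import hashlib
import hmac


def signature_is_valid(secret: str, body: bytes, received_signature: str) -> bool:
    """Check a hex-encoded HMAC-SHA256 signature (assumed scheme) for a webhook body."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, received_signature)
```

The comparison uses `hmac.compare_digest` rather than `==` so that the check takes the same time whether or not the signatures match.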
- Webhook endpoints endpoints
- `GET /usages` and `GET /accounts/x/billing` added `name` to each usage
- `GET /traits` and `GET /traits/{id}` added `life_event` category
- `GET /traits` and `GET /traits/{id}` added `deprecated` boolean field for Faraday's traits.
- `GET /traits.csv` to return information about all traits the user has access to, in csv form.
- `GET /traits` and `GET /traits/{id}` added `tier` and `permissions` for each trait.
- `GET /streams/{stream_id_or_name}/analysis` to return event stream analysis (starting with a `time_series`)
- `stream.properties[property].distribution` which is present for numeric properties and contains the distribution of values
- `stream.properties[property].values` which is present for categorical and multicategorical properties and contains the unique values and their counts
- `stream.event_contribution_by_dataset` to show which datasets contributed to the stream and their event count
- `stream.event_count` which is the total number of events in the stream
- `stream.oldest_date` and `stream.newest_date` which are the oldest and newest dates of events found in the stream
- Add yyyy_mm_dash and yyyy_mm_slash to data map date formats
- Add `identity_sets` to lookup API to allow including multiple pieces of identifying information for the same individual, e.g. addresses, emails, and phone numbers.
- `datasets.output_all_columns_as_traits.include` enables an allowlist for automatically generated traits.
- Added `merge_datasets` to `datasets`. This field contains a list of parent/merge dataset IDs and join columns that use the current dataset as a source.
- `GET /graph` returns 1 new status field for both upstream and downstream resources for a total of 2 fields: xxx_status_error (where xxx = downstream or upstream)
- Marked several fields on `/graph` as required; they were never really optional and were always included: id, status, literate, and type
- `POST /{resource_type}/{resource_id}/archive` archives the specified resource. If a resource is archived, it will no longer appear in the Faraday UI (except in the 'archived' tab), and Faraday will no longer update the resource. To archive a resource, all downstream resources must also be archived.
- `POST /{resource_type}/{resource_id}/unarchive` unarchives the specified resource. To unarchive a resource, all upstream resources must also be unarchived.
- `null_values` added to `output_to_traits` to support null transformations.
- `GET /graph` returns 4 new status fields for both upstream and downstream resources for a total of 8 fields: xxx_status_changed_at, xxx_last_updated_output_at, xxx_last_updated_config_at, xxx_last_read_input_at (where xxx = downstream or upstream)
- `GET /recommenders/{recommender_id}/analysis` to return model performance, feature importance, and more.
- `GET PATCH POST /connections`: allow slashes in s3 and gcs `bucket_name`
- `PATCH /user_places`: setting a place's `addresses` = null will now delete addresses
- `GET /datasets`: deprecate match_count. Use `enrichment` instead
- Updated pattern for header_row to match delimiter options and added tab delimiter option '\t'
- Added `dd_mm_yyyy_slash`, `dd_mm_yy_slash`, `dd_mm_yyyy_dash`, and `dd_mm_yy_dash` date formats to output_to_streams
- Add `recommenders.report_url` (like `outcomes.report_url`) to provide signed download link for technical report
- Added `conditions` option to datasets' output_to_streams, giving users the ability to filter which rows in the dataset should be added to the stream
- Support UUID as a data type for detected columns in datasets
- Add `PATCH /persona_sets/{persona_set_id}/personas/{persona_id}` to update a persona in a persona set. Supports name for now.
- Added `privacy` option to datasets, giving users the ability to treat a dataset as a do not contact list (`suppress`) or as a `deletion` list (delete and do not contact)
- datasets: reference_columns
- targets: reference: {dataset_id, column_name}
- datasets: reference_key_column. Use datasets.reference_columns instead
- targets: reference_dataset_id. Use targets.reference.dataset_id instead
- Added new identified target transformation option: address only
- Add `predictors.blocked.providers` to outcomes resource. This will allow a user to block first party data from models during outcome builds.
- add `date` as a `PrimiteDataType`
- reconcile `TraitCategory` with the actual categories used in the db
- Added `output_all_columns_as_traits` as an alternative to `output_to_traits` when configuring datasets.
- Add new dataset type `merge` - merges two or more existing datasets into a single dataset (leaves the original datasets intact)
- Added `target_filter_recommender_ranks` and `target_filter_recommender_uncalibrated_probabilities` to the `filter` object of a `target`.
- Added `schema` as an optional parameter for Microsoft SQL Server connection types.
- Added `header_row` as an optional parameter for connection types that support CSV files.
- Allow slashes in `detected_column` names
- Add `recommender_ids` to `scope.payload`
- Fixed DELETE status codes for all resources (except accounts because they have a waiting period) 202->204. This is technically a breaking change based on the spec but the API was already returning 204.
- Added include_geometry boolean to TargetModesAggregated to append optional geometries to aggregated targets.
- Added dataset enrichment rates. This reports the number of identities that were appended with additional data, grouped by provider.
- Add Recommenders endpoints for building and managing recommendation models.
- Updated lift value descriptions for outcomes
- Changed targets/{target_id}/lookup success response code from 201 to 200
- Relaxed reference key pattern to allow for leading underscore.
- Outcome analysis descriptions for bias metrics updated.
- Outcome analysis lift value is optional because not all reports have it yet.
- Add customer id option to google ads connection
- Require non-empty strings for resource names
- Score explainability as an opt-in scope payload column
- Remove target score language in favor of target percentile for lift charts.
- Add last_updated_config_at to most Resources
- Add last_read_input_at to most Resources
- Add bias analysis to `GET /outcomes/{outcome_id}/analysis`
- GET /traits/{trait_id}/analysis/dimensions - returns summary data about the trait. For example, what percentage of the US population falls into a certain age range.
- Move maximum and minimum on target filter probability and percentile down onto the component/schema
- Add bias mitigation to outcome configuration. Though the spec appears to let you combine different mitigation strategies per dimension of concern (age=equity, gender=equality), API runtime validation will prevent it for now. We may support this later, so the spec is designed to be future-proof. Note: age=none, gender=equity is allowed.
- Using Outcome Score in target filter is deprecated. Use Outcome percentile or probability instead.
- Let BigQuery dataset names start with a number
- Added dataset sample and non-null data (anonymized)
- Add `lookup_api` connection type for targets
- Add JSON types for lookup requests
- Added `GET outcomes/{outcome_id}/analysis` to return outcome report data for model performance, feature importance, and more.
- connection options that aren't required now use `*` instead of `+` in regex pattern matching
- added `yyyymmdd` to data map formats
- instead of billing via ppds (person-predictions per day), Faraday now bills via a variety of metrics (number of connections, number of known contacts, etc). The endpoint exposes this information to the users
- Change shape of `datasets.output_to_traits` from string to object to support additional configuration details.
- Add metadata to the `/traits` endpoint including literate, description, units, and emitted_by_datasets.
- Made some minor updates to managed connection types metadata.
- Added new managed connection types.
- Added a new target filter for score probabilities. Allows a user to filter a target with various operators by the following payload element: `outcome_probability`.
- `DELETE /account` no longer returns the account object - this is in line with how the other `DELETE` endpoints work.
- 'census_block_group', 'census_tract', and 'dma' are now options for geo aggregated targets.
- Allow `false` to be passed to target filter persona and cohort membership `eq`
- Target row_count limits now allow outcome_id to be optional.
- Added secondary connections (e.g. Klaviyo). These are marked as "managed" and therefore do NOT get written into the API spec.
- Added tags:string[] and blurb:string to ConnectionTypes. This is for the website only.
- Added null (_null) and not null (_nnull) operators to target filters for outcome_percentile and outcome_score.
- Added a new target transformation called filter. Allows a user to filter a target with various operators by any of the following payload elements: persona_id, attributes (traits), cohort_membership, outcome_percentile, outcome_score.
- Example values for `explore` field in cohorts and `invert` in cohort place conditions so that users of the API docs get better default values.
- GET uploads/{directory}/{filename} - download the file previously uploaded at POST uploads/{directory}/{filename}.
- GET dependencies - returns the list of edges in a dependency graph of the account's resources.
- Added 'managed' boolean to connections, datasets, and targets. Managed resources are read-only.
- Added rules for liveramp as a target transformation.
- Updated connection type descriptions to match the new literates in vannevar.
- Cohort stream name is now mutable
- Added new target transformation options for: pinterest, snapchat, klaviyo, segment, youtube, tiktok, taboola
- iterable, poplar, salesforce, google ads, facebook
- New error: EXPIRED_API_KEY
- data map format: static_date_iso8601, with column_name as the (temporary?) place you put the static value
- new date format yyyymm e.g. 201901
- `freeform_address` and `email_hash` added to identity sets on data sets.
- `number_of_clusters` parameter to PersonaSets
- New `GET /cohorts/{cohort_id}/analysis/membership` endpoint to show cohort membership counts over time.
- Change the format of event stream properties to include type, statistical type, breaks, format, and emitted_by_datasets.
- added `columns` detailed metadata for the front-end.
- added `human_readable`
- added `custom_structure` - see https://github.com/faradayio/fdy/blob/master/docs/TARGET_STRUCTURE_TRANSFORMATIONS.md
- deprecated `payload_map`
- Implement DELETE for /traits
- Remove TRIM on date parsing and implement regex, removing time, only for when we do not autodetect
- Places resource endpoint: GET, POST, PATCH, DELETE
- Added place_conditions parameter to /cohorts, used to filter a Cohort's population using a Place's geometry
- Targets: added linkedin transform preset
- Align the initial target transformation presets with the existing target update worker.
- /datasets upsert_column can now be PATCHed after the dataset is created
- Target transformation presets and compilation script.
- Add `GET /persona_sets/{persona_set_id}/analysis/flow` to return typed persona analysis info on the associated value and member count of each persona by day.
- Add `stream_conditions` field to cohorts. This allows users to specify values for stream properties to filter cohort membership.
- Added person_full_name to datasets identity sets
- Allow spaces and capital letters in identity sets
- PATCH and POST /outcomes - feature_blocklist could previously only include Faraday-provided traits. Now it can also include user-defined traits.
- GET /streams and GET /streams/{id} endpoints now return properties. Properties are set on /datasets output_to_streams.
- GET, POST, PATCH /datasets: for output_to_streams, instead of being limited to 'datetime', 'product', 'value', and 'channel', you can create any property.
- provided clearer date parsing specifications (yyyy_mm_dd_dash instead of date_iso8601, mm_dd_yy_slash instead of date_month_day_shortyear, which didn't even allow choosing a delimiter). For backwards compatibility, the old formats are still available, but deprecated.
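To illustrate what the clearer format names are meant to convey, here is a sketch that maps two of them to `strptime` patterns. The mapping is inferred from the names alone (field order, then delimiter) and is an assumption, not the API's documented parsing behavior. `ASSUMED_PATTERNS` and `parse_with_format` are hypothetical names.

```python
from datetime import date, datetime

# Assumed reading of the format names: field order followed by the delimiter.
ASSUMED_PATTERNS = {
    "yyyy_mm_dd_dash": "%Y-%m-%d",
    "mm_dd_yy_slash": "%m/%d/%y",
}


def parse_with_format(value: str, format_name: str) -> date:
    """Parse a date string using the pattern assumed for the given format name."""
    pattern = ASSUMED_PATTERNS[format_name]
    return datetime.strptime(value, pattern).date()
```

Under these assumptions, `"2019-01-31"` with `yyyy_mm_dd_dash` and `"01/31/19"` with `mm_dd_yy_slash` would parse to the same date.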
- Add `GET /persona_sets/{persona_set_id}/analysis/dimensions` to return typed persona analysis info for traits beyond the clustering/modeling_fields of the persona set. Eventually this can contain predictions and other event analysis information.
- Marked `persona_set.personas[0].details.bins` as deprecated (use `/persona_sets/{persona_set_id}/analysis/dimensions` instead, which has a typed response and contains more dimensions)
- support "upsert" for snowflake referenced targets
- Added `_matches` field to trait conditions on cohorts for regex matching.
- Added last_updated_output_at to most endpoints - this specifies when the resource last finished building.
- Make `persona.id` required (non-nullable).
- Added GET /accounts - to show your account plus all subaccounts
- Added GET /accounts/{id} - to show the details of a specific account
- Added GET /accounts/current - to show the details of the account associated with your specific API key
- Added GET /accounts/{id}/billing - to show billing details for the given account, including payments, invoices, and account usage
- New field `persona.individuals_count` returns number of members that match the persona within the cohort the set was based on.
- Allow spaces in column names
- Added "case_sensitive_columns" to Snowflake datasets
- Added `name` and `explore` fields to creating persona sets
- Added `PATCH` support for `/persona_sets/{persona_set_id}` to update `name` and `explore`
- `name` is now required for `POST /persona_sets`
- DELETE endpoints for all resources except traits.
- add matched_count to /datasets. This will show the user how many of their identified people matched a person in Faraday's data.
- fixed the calculation for row count and identified count.
- allow space character in SQL server `database` options
- support "upsert" for bigquery referenced targets
- new supported 'format' for /datasets output_to_streams - seconds since unix epoch and milliseconds since epoch
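As a rough illustration of the two epoch-based formats, the sketch below converts either kind of value to a UTC datetime. The `unit` strings used here are placeholders chosen for the example. The actual `format` identifiers accepted by `output_to_streams` are not given in this entry, and `from_epoch` is a hypothetical helper.

```python
from datetime import datetime, timezone


def from_epoch(value: float, unit: str = "seconds") -> datetime:
    """Convert seconds or milliseconds since the Unix epoch to a UTC datetime."""
    if unit == "seconds":
        seconds = value
    elif unit == "milliseconds":
        seconds = value / 1000
    else:
        raise ValueError(f"unsupported unit: {unit!r}")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

Mixing up the two units shifts dates by a factor of 1000, which is why the unit is made explicit here.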
- show whether cohort is classic (managed)
- allow bucket prefixes to start with capital letters and numbers
- allow colon in column names (e.g. "Source: Product Description")
- added dataset updates, showing the date and rows added per update. Note that we don't currently provide the row count pre-processing as this is technically difficult.
- refactor target limits to support two types of input: percentile ranges (WHERE min <= percentile_score <= max) and absolute limits (LIMIT)
- add `explore` boolean to cohorts
- add `preview` mode to datasets, facilitating "New Dataset" UI (and advanced API users who want low latency column detection)
- allow spaces in column names (e.g. "Product Description")
- move required `dataset_name` from BigQuery datasets/targets to BigQuery connection level
- add `active` to `persona_sets`
- add `contents` and `contents_error` to connections
- add `metro` as an aggregation option to targets
- for /datasets endpoints, made `output_to_streams` optional
- `/datasets` read-only, optional fields `row_count` and `identified_count`
- Create a new 403 error code MAX_RESOURCES_REACHED. When user hits the resource quota, they will get a 403 MAX_RESOURCES_REACHED error instead of 403 FORBIDDEN. The user will instead get FORBIDDEN if they do not have access to the resource.
- `/targets` endpoint re-spec
- Added support for read-only connection_types_options
- `GET /datasets[/X]` now returns `name`
- `GET /datasets[/X]` now returns `detected_columns`
- `GET /scopes/{scope_id}/datasets` for returning all datasets associated with a scope's population cohorts.
- `fig/geography` trait category to distinguish location-based fields which contain census, ZIP, etc. IDs.
- `report_url` to scopes for returning a signed url with the report location
- `output_url` to targets for returning a signed url for the default output
- New to datasets: `primary_key_column`.
- `GET /scope/{scope_id}/payload/cohorts` for returning all cohorts in the scope's payload
- Scope payload attributes will not fail validation for prefixed field names.
- `url` for Snowflake connections now uses the correct regex.
- `connections/{connection_id}/targets` endpoint, which returns all targets with the given connection_id
- `url` for redshift connections now uses the correct regex.
- Recency options to the /cohorts endpoint.
- Added `classic` options for datasets and connections to expose querying and patching of classic datasets.
- Obsolete code in this repository has been deleted.
- This repository is now completely standalone.
- Name is now required on cohorts.