Newer updates can be found here: GitHub Release Notes
bugfix: Ensure that streams with partition router are not executed concurrently
Add state migration workaround for legacy substreams
Low-code: Add jinja macros today_with_timezone
Adding parity to concurrent CDK from declarative and fixing composite primary key
add clear method to HttpMocker
Support multiple input datetime formats as part of the concurrent cursor
fix streams discover
Add option to File Transfer for File-Based sources
Introduce support for low-code incremental streams to be run within the concurrent CDK framework
Add Per Partition with Global fallback Cursor
Better structured error log messages in connector_builder module, with message / internal_message / stacktrace split into separate fields
Add new sql module for SQL-based destinations
HttpClient: Fixes issue where oauth authenticators would not refresh on backed off retries.
Fix yielding parent records in SubstreamPartitionRouter
Add extra fields to StreamSlice
Low Code: Removes deprecated class_types_registry and default_implementation_registry
Low Code: Adds XmlDecoder component
Low Code: Consolidate manifest decoder selection under SimpleRetriever, AsyncRetriever, and SessionTokenAuthenticator
concurrent-cdk: add per slice tracking of the most recent cursor
do not raise exception on missing stream by default
Remove PrintBuffer optimization due to dropped records
Async job component: improve memory usage
concurrent-cdk: add cursor partition generator
concurrent-cdk: change stream availability strategy to always available
concurrent-cdk: fix convert_to_concurrent_stream to use state from state manager
Async job component: support Salesforce
Have headers match during HTTP cache hit
Always return a connection status even if an exception was raised
fix connector builder output serialization
add transform_record() to class DefaultFileBasedStream
add python-snappy to file-based dependencies
concurrent-cdk: add cursor definition based on sync mode to ConcurrentSourceAdapter
Decouple low-code request_parameter_provider from cursor, add optional cursor_granularity to ConcurrentCursor
Fix pandas missing dependency
Bug fix: Return a connection status failure on an expected check failure
Declarative async job components
add migration of global stream_state to per_partition format
Connector builder: add flag to disable cache
Fix error in incremental sync docs
Add Global Parent State Cursor
Add limitation for number of partitions to PerPartitionCursor
Fix source-declarative-manifest
Replace pydantic BaseModel with dataclass
use orjson instead of json to speed up JSON parsing
Update json error message parser to identify additional error message fields in response bodies
Raise exceptions in file-based check, improve UI errors
add codeflash to dev environment
Cache the result of interpolated strings when the evaluated value is equal to its raw representation
CDK: refactor error handling in abstract source
Added support for RFR for Full-Refresh Substreams
Stop support for incoming legacy state message format
Move the @deprecated decorator to the class level.
Added test utils for integration tests
file-based cdk: add excel file type support
Have better fallback error message on HTTP error
Ensure at least one element returned by decoder
resumable full refresh: fix issue when live traffic regression tests pass state to connector
Add PrintBuffer to emit records in batches
Resumable full refresh: Add SubstreamResumableFullRefreshCursor to Python CDK to allow connectors to checkpoint on parent records
Align BackoffStrategy interfaces to take attempt_count as a full-fledged parameter
Add ability to stop stream when retry-after is greater than a duration
Fix case where stream won't have a state attribute and needs to resolve get_updated_state
- General performance enhancement
- Dropping Python 3.9 support
fix declarative schema refs for Decoder
Fixed: Resolved an issue in HttpClient that prevented correct error messages from being presented.
Adding text field to declarative manifest schema for general connector description.
add name property to http_client for convenience
low-code: fix record selector factory when using custom components
fix OOM on predicate for streamable responses
low code: add new Decoders: JsonlDecoder, IterableDecoder
low-code: fix overwrite for default backoff strategy
CDK: fix handling for rate limit errors when checking connection
resumable full refresh: Automatically apply RFR to streams (not including substreams) that are not incremental and implement next_page_token
Deprecate AvailabilityStrategy
CDK: add not exiting when rate limited
Add failure_type to HttpResponseFilter(retry after pypi read error)
Add failure_type to HttpResponseFilter
Remove 3.11-style union
Clean invalid fields from configured catalog
resumable full refresh: Fix bug where checkpoint reader stops syncing too early if first partition is complete
file-based cdk: add config option to limit number of files for schema discover
resumable full refresh: Fix bug where substreams depending on an RFR parent stream would not paginate over the parent
CDK: add incomplete status to availability check during read
CDK: flush buffer for each RATE_LIMITED message print
CDK: add running stream status with rate limit reason to backoff approach
CDK: add incomplete stream status to nonexistent stream handling
Integrate HttpClient into HttpStream class. See migration guide for more details.
CDK: Add support for input format parsing at jinja macro format_datetime
Add with_json_schema method to ConfiguredAirbyteStreamBuilder
Update dependency to pydantic v2, and dependency to pydantic v2 models. See migration guide for more details.
low-code: Add is_compare_strictly flag to DatetimeBasedCursor
Exclude airbyte-cdk modules from schema discovery (retry after pypi read error - take 2)
Exclude airbyte-cdk modules from schema discovery (retry after pypi read error)
Exclude airbyte-cdk modules from schema discovery
add from to
Jinja interpolation - Allow access to _partition for source-jira (re-release after pypi timeout take 2)
Jinja interpolation - Allow access to _partition for source-jira (re-release after pypi timeout)
Jinja interpolation - Allow access to _partition for source-jira
Ensure error message is the same after migration to HttpClient
PerPartitionState - setting invalid initial state should trigger a config error
Fix client_side_incremental end_datetime comparison
Python/Low Code: Updates ErrorHandler, BackoffStrategy, HttpClient. Integrates HttpClient into low-code CDK.
low-code: Add Incremental Parent State Handling to SubstreamPartitionRouter
Mock server tests: adding 'discover' as part of the entrypoint_wrapper
low-code: Added retriever type filter to stream slicer merge
Use for Jinja interpolations
Added new datetime format: %s_as_float
Python 3.11 compatibility bugfixes
add client side incremental sync
Removed experimental suffix for unstructured file type
CDK: upgrade dpath
Fix bug so that RFR streams don't resync successful streams on subsequent attempts
low-code: Add RFR support automatically for non-substreams
File-based CDK: avoid error on empty stream when running discover
Delete deprecated AirbyteLogger, AirbyteSpec, and Authenticators + move public classes to the top level init file. See migration guide for more details.
Python CDK: Adds HttpClient, ErrorHandler, and related interfaces.
low-code: Remove support for last_records and improve memory usage
HttpMocker, Adding the delete method.
Fix dependency for pytz
Fix timestamp formatting in low-code macros
file-based: Increase the maximum parseable field size for CSV files
Python CDK: Allow for configuring resumable full refresh for streams (excluding substreams)
File-based CDK: allow to merge schemas with nullable object values
Fix schemas merge for nullable object types
Expose airbyte_cdk.version and pin airbyte-protocol-models dependency to
Connector builder: read input state if it exists
Remove package which was deprecated 2021 or earlier
Concurrent CDK: if exception is AirbyteTracedException, raise this and not StreamThreadException
Low-code: Add JwtAuthenticator
Connector builder: emit state messages
Concurrent CDK: Break Python application with status 1 on exception
Concurrent CDK: Fix to update partitioned state only when partition is successful
Upgrade to recent version of langchain
Updated langchain version and add langchain_core as a dependency
Adding stream_descriptor as part of AirbyteTracedException.__init__
Republish print buffer after previous pypi attempt timed out
Fix concurrent CDK printing by flushing the print buffer for every message
Concurrent CDK: add logging on exception
Unpin airbyte-protocol-models library
Concurrent CDK: support partitioned states
Concurrent CDK: Print error messages properly so that they can be categorized
Dummy patch to test new publishing flow fixes
Update release process of airbyte-cdk and source-declarative manifest
Fix CDK version mismatch introduced in 0.78.8
Update error messaging/type for missing streams. Note: version mismatch, please use 0.78.9 instead
low-code: add backward compatibility for old close slice behavior
low-code: fix stop_condition instantiation in the cursor pagination
low-code: Add last_record and last_page_size interpolation variables to pagination
Fix dependencies for file-based extras
low-code: fix retrieving partition key for legacy state migration
connector-builder: return full url-encoded URL instead of separating parameters
low-code: Allow state migration with CustomPartitionRouter
Emit state recordCount as float instead of integer
Fix empty , , extras packages
low-code: Add string interpolation filter
Migrate Python CDK to Poetry
low-code: Add StateMigration component
Request option params are allowed to be an array
set minimum python version to 3.9
Connector Builder: have schema fields be nullable by default except from PK and cursor field
low code: add refresh_token_error handler to DeclarativeOauth2Authenticator
low-code: Allow defining custom schema loaders
Declarative datetime-based cursors now only derive state values from records that were read
low-code: remove superfluous sleep
File-based CDK: Fix tab delimiter configuration in CSV file type
testing
low-code: improve error message when a custom component cannot be found
Update mock server test entrypoint wrapper to use per-stream state
Include recordCount in stream state messages and final state message for full refresh syncs
low-code: update cartesian stream slice to emit typed StreamSlice
Low-code: adding a default value if a stream slice is None during read_records
low-code: remove parent cursor component from incremental substreams' state message
no-op republish of 0.68.0
low-code: Allow page size to be defined with string interpolation
CDK: upgrade pyarrow
File CDK: Update parquet parser to handle values that resolve to None
Fix handling of tab-separated CSVs
Low-code: Add CustomRecordFilter
Low-code: Add interpolation for request options
low-code: Allow connectors to ignore stream slicer request options on paginated requests
Low-code: Add filter to RemoveFields
Correct handling of custom max_records limits in connector_builder
File-based CDK: fix record enqueuing
Per-stream error reporting and continue syncing on error by default
mask access key when logging refresh response
[ISSUE #34910] add headers to HttpResponse for test framework
File-based CDK: functionality to make incremental syncs concurrent
[ISSUE #34755] do not propagate parameters on JSON schemas
Align version in CDK Dockerfile to be consistent. Before this change, the docker image was mistakenly pinned to version 0.58.5.
File-based CDK: log warning on no sync mode instead of raising exception
Improve error messages for concurrent CDK
Emit state when no partitions are generated for ccdk and update StateBuilder
File-based CDK: run full refresh syncs with concurrency
Fix CCDK overlapping message due to print in entrypoint
Fix concurrent CDK deadlock
Fix state message handling when running concurrent syncs
concurrent-cdk: improve resource usage when reading from substreams
CDK: HttpRequester can accept http_method in str format, which is required by custom low code components
File CDK: Added logic to emit logged RecordParseError errors and raise a single AirbyteTracebackException at the end of the sync, instead of silently skipping the parsing errors. PR: airbytehq/airbyte#32589
Handle private network exception as config error
Add POST method to HttpMocker
fix declarative oauth initialization
Integration tests: adding debug mode to improve logging
Add schema normalization to declarative stream
Concurrent CDK: add state converter for ISO timestamps with millisecond granularity
add SelectiveAuthenticator
File CDK: Support raw txt file
Adding more tooling to cover source-stripe events stream
Raise error on passing unsupported value formats as query parameters
Vector DB CDK: Refactor embedders, File based CDK: Handle 422 errors properly in document file type parser
Update airbyte-protocol
Improve integration tests tooling
low-code: cache requests sent for parent streams
File-based CDK: Add support for automatic primary key for document file type format
File-based CDK: Add support for remote parsing of document file type format via API
Vector DB CDK: Fix bug with embedding tokens with special meaning like <|endoftext|>
no-op to verify pypi publish flow
Allow for connectors to continue syncing when a stream fails
File-based CDK: hide source-defined primary key; users can define primary keys in the connection's configuration
Source Integration tests: decoupling entrypoint wrapper from pytest
First iteration of integration tests tooling (http mocker and response builder)
concurrent-cdk: factory method initializes concurrent source with default number of max tasks
Vector DB CDK: Add omit_raw_text flag
concurrent cdk: read multiple streams concurrently
low-code: fix injection of page token if first request
Fix generation of the error message using _try_get_error based on list of errors
Vector DB CDK: Remove CDC records, File CDK: Update unstructured parser
low-code: fix debug logging when using --debug flag
Increase maximum_attempts_to_acquire to avoid crashing in acquire_call
File CDK: Improve stream config appearance
Concurrent CDK: fix futures pruning
Fix spec schema generation for File CDK and Vector DB CDK and allow skipping invalid files in document file parser
Concurrent CDK: Increase connection pool size to allow for 20 max workers
Concurrent CDK: Improve handling of future to avoid memory leak and improve performances
Add call rate functionality
Fix class SessionTokenAuthenticator for CLASS_TYPES_REGISTRY mapper
File CDK: Improve file type detection in document file type parser
Concurrent CDK: incremental (missing state conversion). Outside of concurrent-specific work, this includes the following changes:
- Checkpointing state was acting on the number of records per slice. This has been changed to consider the number of records per sync
- Source.read_state and Source._emit_legacy_state_format are now classmethods to allow developers to access the state before instantiating the source
File CDK: Add pptx support
make parameter as not required for default backoff handler
use in-memory cache if no file path is provided
File CDK: Add unstructured parser
Update source-declarative-manifest base image to update Linux alpine and Python
Add max time for backoff handler
File CDK: Add CustomFileBasedException for custom errors
low-code: Allow connector developers to specify the type of an added field
concurrent cdk: fail fast if a partition raises an exception
File CDK: Avoid listing all files for check command
Vector DB CDK: Expose stream identifier logic, add field remapping to processing | File CDK: Emit analytics message for used streams
Add filters for base64 encode and decode in Jinja Interpolation
Few bug fixes for concurrent cdk
Add ability to wrap HTTP errors with specific status codes occurred during access token refresh into AirbyteTracedException
Enable debug logging when running availability check
File CDK: Allow configuring number of tested files for schema inference and parsability check
Vector DB CDK: Fix OpenAI compatible embedder when used without api key
Vector DB CDK: Improve batching process
Introduce experimental ThreadBasedConcurrentStream
Fix initialization of token_expiry_is_time_of_expiration field
Add new token_expiry_is_time_of_expiration property for AbstractOauth2Authenticator to indicate that the token's expiry_in is a time of expiration
Coerce read_records to iterable in http availability strategy
Add functionality enabling Page Number/Offset to be set on the first request
Fix parsing of UUID fields in avro files
Vector DB CDK: Fix OpenAI embedder batch size
Add configurable OpenAI embedder to cdk and add cloud environment helper
Fix previous version of request_cache clearing
Fix request_cache clearing and move it to tmp folder
Vector DB CDK: Adjust batch size for Azure embedder to current limits
Change Error message if Stream is not found
Vector DB CDK: Add text splitting options to document processing
Ensuring invalid user-provided urls do not generate sentry issues
Vector DB CDK adjustments: Prevent failures with big records and OpenAI embedder
[ISSUE #30353] File-Based CDK: remove file_type from stream config
Connector Builder: fix datetime format inference for str parsable as int but not isdecimal
Vector DB CDK: Add Azure OpenAI embedder
File-based CDK: improve error message for CSV parsing error
File-based CDK: migrated parsing error to config error to avoid sentry alerts
Add from-field embedder to vector db CDK
File-based CDK: Update spec and fix autogenerated headers with skip after
Vector DB CDK adjustments: Fix id generation, improve config spec, add base test case
[Issue #29660] Support empty keys with record selection
Add vector db CDK helpers
File-based CDK: allow user to provide column names for CSV files
File-based CDK: allow for extension mismatch
File-based CDK: Remove CSV noisy log
Source-S3 V4: feature parity rollout
File-based CDK: Do not stop processing files in slice on error
Check config against spec in embedded sources and remove list endpoint from connector builder module
low-code: allow formatting datetime as milliseconds since unix epoch
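For readers unfamiliar with the representation, the sketch below shows what "milliseconds since unix epoch" means using only the standard library. The helper name to_epoch_millis is hypothetical and is not the CDK's own formatter; how the low-code datetime format is spelled in a manifest is not shown here.

```python
from datetime import datetime, timezone


def to_epoch_millis(value: datetime) -> int:
    """Return whole milliseconds since 1970-01-01T00:00:00Z for a timezone-aware datetime."""
    # timestamp() returns seconds as a float; scale to milliseconds and truncate.
    return int(value.timestamp() * 1000)


# Exactly one day after the epoch, in UTC.
one_day_after_epoch = datetime(1970, 1, 2, tzinfo=timezone.utc)
print(to_epoch_millis(one_day_after_epoch))  # 86400000
```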
File-based CDK: handle legacy options
Fix title and description of datetime_format fields
File-based CDK cursor and entrypoint updates
Low code CDK: Decouple SimpleRetriever and HttpStream
Add utils for embedding sources in other Python applications
Relax pydantic version requirement and update to protocol models version 0.4.0
Support many formats for cursor datetime
File-based CDK updates
Connector Builder: Ensure we return when there are no slices
low-code: deduplicate query params if they are already encoded in the URL
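A minimal sketch of the idea behind this entry, assuming that "duplicate" means an identical key/value pair already present in the URL's query string. The function name dedupe_query_params and the example URL are hypothetical; the CDK's actual rule (for example, how conflicting values are handled) may differ.

```python
from urllib.parse import parse_qsl, urlsplit


def dedupe_query_params(url: str, params: dict[str, str]) -> dict[str, str]:
    """Drop params whose exact key/value pair is already encoded in the URL's query string."""
    existing = set(parse_qsl(urlsplit(url).query, keep_blank_values=True))
    return {key: value for key, value in params.items() if (key, value) not in existing}


remaining = dedupe_query_params(
    "https://api.example.com/items?page=2&limit=50",
    {"page": "2", "limit": "100", "sort": "asc"},
)
# page=2 is already in the URL and is dropped; limit differs, so it is kept.
print(remaining)  # {'limit': '100', 'sort': 'asc'}
```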
Fix RemoveFields transformation issue
Breaking change: Rename existing SessionTokenAuthenticator to LegacySessionTokenAuthenticator and make SessionTokenAuthenticator more generic
Connector builder: warn if the max number of records was reached
Remove pyarrow from main dependency and add it to extras
Fix pyyaml and cython incompatibility
Connector builder: Show all request/responses as part of the testing panel
[ISSUE #27494] allow for state to rely on transformed field
Ensuring the state value format matches the cursor value from the record
Fix issue with incremental sync following data feed release
Support data feed like incremental syncs
Fix return type of RecordFilter: changed from generator to list
Connector builder module: serialize request body as string
Fix availability check to handle HttpErrors which happen during slice extraction
Refactoring declarative state management
Error message on state per partition state discrepancy
Supporting state per partition given incremental sync and partition router
Use x-www-urlencoded for access token refresh requests
Replace with when making oauth calls
Emit messages using message repository
Add utils for inferring datetime formats
Add a metadata field to the declarative component schema
make DatetimeBasedCursor.end_datetime optional
Remove SingleUseRefreshTokenOAuthAuthenticator from low code CDK and add generic injection capabilities to ApiKeyAuthenticator
Connector builder: add latest connector config control message to read calls
Add refresh token update capabilities to OAuthAuthenticator
Make step and cursor_granularity optional
Improve connector builder error messages
Align schema generation in SchemaInferrer with Airbyte platform capabilities
Allow nested objects in request_body_json
low-code: Make refresh token in oauth authenticator optional
Unfreeze requests version and test new pipeline
low-code: use jinja sandbox and restrict some methods
pin the version of the requests library
Support parsing non UTC dates and Connector Builder set slice descriptor
low-code: fix add field transformation when running from the connector builder
Emit stream status messages
low-code: remove now_local() macro because it's too unpredictable
low-code: alias stream_interval and stream_partition to stream_slice in jinja context
Connector builder scrubs secrets from raw request and response
low-code: Add title, description, and examples for all fields in the manifest schema
low-code: simplify session token authenticator interface
low-code: fix typo in ManifestDeclarativeSource
Emit slice log messages when running the connector builder
set slice and pages limit when reading from the connector builder module
Low-Code CDK: Enable use of SingleUseRefreshTokenAuthenticator
low-code: fix duplicate stream slicer update
Low-Code CDK: make RecordFilter.filter_records as generator
Enable oauth flow for low-code connectors
Remove unexpected error swallowing on abstract source's check method
connector builder: send stacktrace when error on read
Add connector builder module for handling Connector Builder server requests
CDK's read command handler supports Connector Builder list_streams requests
Fix reset pagination issue on test reads
- Low-code CDK: Override refresh_access_token logic DeclarativeOAuthAuthenticator
Releasing using the new release flow. No change to the CDK per se
OAuth: retry refresh access token requests
Low-Code CDK: duration macro added
support python3.8
Publishing Docker image for source-declarative-manifest
Breaking changes: We have promoted the low-code CDK to Beta. This release contains a number of breaking changes intended to improve the overall usability of the language by reorganizing certain concepts, renaming, reducing some field duplication, and removing fields that are seldom used.
The changes are:
- Deprecated the concept of Stream Slicers in favor of two individual concepts: Incremental Syncs and Partition Routers:
  - Stream will define an incremental_sync field, which is responsible for defining how the connector should support incremental syncs using a cursor field. DatetimeStreamSlicer has been renamed to DatetimeBasedCursor and can be used for this field.
  - Retrievers will now define a partition_router field. The remaining slicers are now called SubstreamPartitionRouter and ListPartitionRouter, both of which can be used here as they already have been.
  - The CartesianProductStreamSlicer is no longer needed because partition_router can accept a list of values and will generate that same cartesian product by default.
- $options have been renamed to $parameters.
- Changed the notation for component references to the JSON schema notation ($ref: "#/definitions/requester").
- DefaultPaginator no longer has a url_base field. Moving forward, paginators will derive the url_base from the HttpRequester. There are some unique cases for connectors that implement a custom Retriever.
- primary_key and name no longer need to be defined on Retrievers or Requesters. They will be derived from the stream’s definition.
- Streams no longer define a stream_cursor_field and will derive it from the incremental_sync component. checkpoint_interval has also been deprecated.
- DpathExtractor field_pointer has been renamed to field_path.
- RequestOption can no longer be used with inject_into set to path. There is now a dedicated RequestPath component moving forward.
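As a rough illustration of two of the mechanical renames listed above ($options to $parameters, field_pointer to field_path), the sketch below walks a manifest that has already been loaded into plain Python dicts and lists. The helper name migrate_manifest and the sample fragment are hypothetical and not part of the CDK. The structural changes (stream slicers, url_base, RequestPath) are not handled and would still need manual edits.

```python
from typing import Any

# Key renames taken from the list above; nothing else is migrated here.
# Assumption: field_pointer only appears on DpathExtractor, so a blanket rename is safe.
RENAMED_KEYS = {"$options": "$parameters", "field_pointer": "field_path"}


def migrate_manifest(node: Any) -> Any:
    """Return a copy of a parsed manifest with the renamed keys applied (hypothetical helper)."""
    if isinstance(node, dict):
        return {RENAMED_KEYS.get(key, key): migrate_manifest(value) for key, value in node.items()}
    if isinstance(node, list):
        return [migrate_manifest(item) for item in node]
    return node


# Made-up fragment in the old notation, for illustration only.
old_manifest = {
    "streams": [
        {
            "$options": {"name": "users"},
            "retriever": {
                "record_selector": {
                    "extractor": {"type": "DpathExtractor", "field_pointer": ["data"]}
                }
            },
        }
    ]
}

new_manifest = migrate_manifest(old_manifest)
print(new_manifest["streams"][0]["$parameters"])  # {'name': 'users'}
```

The function returns a new structure rather than mutating its input, so the original manifest can be kept for comparison.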
Low-Code CDK: fix signature _parse_records_and_emit_request_and_responses
Low-Code: improve day_delta macro and MinMaxDatetime component
Make HttpAvailabilityStrategy default for HttpStreams
Low-Code CDK: make DatetimeStreamSlicer.step as InterpolatedString
Low-Code: SubstreamSlicer.parent_key - dpath support added
Fix issue when trying to log stream slices that are non-JSON-serializable
Use dpath.util.values method to parse response with nested lists
Limiting the number of HTTP requests during a test read
Surface the resolved manifest in the CDK
Add AvailabilityStrategy concept and use check_availability within CheckStream
Add missing package in previous patch release
Handle edge cases for CheckStream - checking connection to empty stream, and checking connection to substream with no parent records
Low-Code: Refactor low-code to use Pydantic model based manifest parsing and component creation
Low-code: Make documentation_url in the Spec be optional
Low-Code: Handle forward references in manifest
Allow for CustomRequester to be defined within declarative manifests
Adding cursor_granularity to the declarative API of DatetimeStreamSlicer
Add utility class to infer schemas from real records
Do not eagerly refresh access token in SingleUseRefreshTokenOauth2Authenticator #20923
Fix the naming of OAuthAuthenticator
Include declarative_component_schema.yaml in the publish to PyPi
Start validating low-code manifests using the declarative_component_schema.yaml file
Reverts additions from versions 0.13.0 and 0.13.3.
Low-code: Add token_expiry_date_format to OAuth Authenticator. Resolve ref schema
Fixed StopIteration exception for empty streams while check_availability runs.
Low-code: Enable low-code CDK users to specify schema inline in the manifest
Low-code: Add SessionTokenAuthenticator
Add Stream.check_availability and Stream.AvailabilityStrategy. Make HttpAvailabilityStrategy the default HttpStream.AvailabilityStrategy.
Lookback window should be applied when a state is supplied as well
Low-code: Finally, make OffsetIncrement.page_size interpolated string or int
Revert breaking change on read_config while keeping the improvement on the error message
Improve error readability when reading JSON config files
Low-code: Log response error message on failure
Low-code: Include the HTTP method used by the request in logging output of the airbyte-cdk
Low-code: Fix the component manifest schema to validate check instead of checker
Declare a new authenticator SingleUseRefreshTokenOauth2Authenticator that can perform connector configuration mutation and emit AirbyteControlMessage.ConnectorConfig.
Low-code: Add start_from_page option to a PageIncrement class
Low-code: Add jinja macro format_datetime
Low-code: Fix reference resolution for connector builder
Low-code: Avoid duplicate HTTP query in simple_retriever
Low-code: Make default_paginator.page_token_option optional
Low-code: Fix filtering vars in InterpolatedRequestInputProvider.eval_request_inputs
Low-code: Allow grant_type to be specified for OAuthAuthenticator
Low-code: Don't update cursor for non-record messages and fix default loader for connector builder manifests
Low-code: Allow for request and response to be emitted as log messages
Low-code: Decouple yaml manifest parsing from the declarative source implementation
Low-code: Allow connector specifications to be defined in the manifest
Low-code: Add support for monthly and yearly incremental updates for DatetimeStreamSlicer
Low-code: Get response.json in a safe way
Low-code: Replace EmptySchemaLoader with DefaultSchemaLoader to retain backwards compatibility
Low-code: Evaluate backoff strategies at runtime
Low-code: Allow for read even when schemas are not defined for a connector yet
Low-code: Fix off by one error with the stream slicers
Low-code: Fix a few bugs with the stream slicers
Low-code: Add support for custom error messages on error response filters
Publish python typehints via py.typed file.
- Propagate options to InterpolatedRequestInputProvider
- Report config validation errors as failed connection status during check.
- Report config validation errors as config_error failure type.
- Low-code: Always convert stream slices output to an iterator
- Replace caching method: VCR.py -> requests-cache with SQLite backend
- Protocol change: supported_sync_modes is now a required property on AirbyteStream. #15591
- Low-code: added hash filter to jinja template
- Low-code: Fix check for streams that do not define a stream slicer
- Low-code: $options do not overwrite parameters that are already set
- Low-code: Pass stream_slice to read_records when reading from CheckStream
- Low-code: Fix default stream schema loader
- Low-code: Expose WaitUntilTimeFromHeader strategy and WaitTimeFromHeader as component type
- Revert 0.1.96
- Improve error for returning non-iterable from connectors parse_response
- Low-code: Expose PageIncrement strategy as component type
- Low-code: Stream schema loader has a default value and can be omitted
- Low-code: Standardize slashes in url_base and path
- Low-code: Properly propagate $options to array items
- Low-code: Log request and response when running check operation in debug mode
- Low-code: Rename LimitPaginator to DefaultPaginator and move page_size field to PaginationStrategy
- Fix error when TypeTransformer tries to warn about invalid transformations in arrays
- Fix: properly emit state when a stream has empty slices, provided by an iterator
- Bugfix: Evaluate response.text only in debug mode
- During incremental syncs allow for streams to emit state messages in the per-stream format
- TypeTransformer now converts simple types to array of simple types
- TypeTransformer make warning message more informative
- Make TypeTransformer more robust to incorrect incoming records
- Emit legacy format when state is unspecified for read override connectors
- Fix per-stream to send legacy format for connectors that override read
- Freeze dataclasses-jsonschema to 2.15.1
- Fix regression in _checkpoint_state arg
- Update Airbyte Protocol model to support protocol_version
- Add NoAuth to declarative registry and auth parse bug fix
- Fix yaml schema parsing when running from docker container
- Fix yaml config parsing when running from docker container
- Add schema validation for declarative YAML connector configs
- Bugfix: Correctly set parent slice stream for sub-resource streams
- Improve filter_secrets skip empty secret
- Replace JelloRecordExtractor with DpathRecordExtractor
- Bugfix: Fix bug in DatetimeStreamSlicer's parsing method
- Bugfix: Fix bug in DatetimeStreamSlicer's format method
- Refactor declarative package to dataclasses
- Bugfix: Requester header always converted to string
- Bugfix: Reset paginator state between stream slices
- Bugfix: Record selector handles single records
- Bugfix: DatetimeStreamSlicer cast interpolated result to string before converting to datetime
- Bugfix: Set stream slicer's request options in SimpleRetriever
- AbstractSource emits a state message when reading incremental even if there were no stream slices to process.
- Replace parse-time string interpolation with run-time interpolation in YAML-based sources
- Add support declarative token authenticator.
- Call init_uncaught_exception_handler from AirbyteEntrypoint.init and Destination.run_cmd
- Add the ability to remove & add records in YAML-based sources
- Allow for detailed debug messages to be enabled using the --debug command.
- Add support for configurable oauth request payload and declarative oauth authenticator.
- Define namespace property on the Stream class inside core.py.
Bugfix: Correctly obfuscate nested secrets and secrets specified inside oneOf blocks inside the connector's spec.
- Remove legacy sentry code
- Add requests.exceptions.ChunkedEncodingError to transient errors so it can be retried
- Add Stream.get_error_display_message() to retrieve user-friendly messages from exceptions encountered while reading streams.
- Add default error message retrieval logic for HTTPStreams following common API patterns.
TypeTransformer.default_convert catch TypeError
Update protocol models to support per-stream state: #12829.
- Update protocol models to include AirbyteTraceMessage
- Emit an AirbyteTraceMessage on uncaught exceptions
- Add AirbyteTracedException
Add support for reading the spec from a YAML file (spec.yaml)
- Add ability to import IncrementalMixin from airbyte_cdk.sources.streams.
- Bumped minimum supported Python version to 3.9.
Remove a false positive error logging during the send process.
Fix BaseBackoffException constructor
Improve logging for Error handling during send process.
Add support for streams with explicit state attribute.
Fix type annotations.
Fix typing errors.
Integrate Sentry for performance and errors tracking.
Log http response status code and its content.
Fix logging of unhandled exceptions: print stacktrace.
Add base pydantic model for connector config and schemas.
Fix build error
Filter airbyte_secrets values at logger and other logging refactorings.
Add __init__.py to mark the directory airbyte_cdk/utils as a package.
Improve URL-creation in CDK. Changed to using urllib.parse.urljoin().
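For reference, a short sketch of how the standard library's urllib.parse.urljoin() resolves paths. The host and paths are made-up examples, and this only illustrates stdlib semantics, not how the CDK itself combines a base URL and a stream path.

```python
from urllib.parse import urljoin

# Example base URL (hypothetical host); note the trailing slash.
base = "https://api.example.com/v1/"

# A relative path is appended to the base directory.
relative = urljoin(base, "customers")

# A leading slash replaces the whole path of the base.
absolute_path = urljoin(base, "/customers")

# Without a trailing slash, the last segment of the base is replaced.
no_trailing_slash = urljoin("https://api.example.com/v1", "customers")

print(relative)           # https://api.example.com/v1/customers
print(absolute_path)      # https://api.example.com/customers
print(no_trailing_slash)  # https://api.example.com/customers
```

The trailing slash on the base and the leading slash on the path both change the result, which is worth checking when a request URL looks wrong.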
Fix emitted_at from seconds * 1000 to correct milliseconds.
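A minimal sketch of the difference this kind of fix makes, assuming the old value was built by truncating the timestamp to whole seconds before multiplying. The function names are hypothetical and the snippet is not taken from the CDK source.

```python
def emitted_at_from_whole_seconds(now_seconds: float) -> int:
    # Truncating first discards the sub-second part, so the result
    # always ends in 000.
    return int(now_seconds) * 1000


def emitted_at_millis(now_seconds: float) -> int:
    # Scaling first keeps millisecond precision.
    return int(now_seconds * 1000)


# Fixed example timestamp (in practice this would come from time.time()).
now = 1_600_000_000.5

print(emitted_at_from_whole_seconds(now))  # 1600000000000
print(emitted_at_millis(now))              # 1600000000500
```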
Fix broken logger in streams: add logger inheritance for streams from airbyte.
Fix false warnings on record transform.
Fix logging inside source and streams
Resolve $ref fields for discover json schema.
- Added Sphinx docs airbyte-cdk/python/reference_docs module.
- Added module documents at airbyte-cdk/python/sphinx-docs.md.
- Added Read the Docs publishing configuration at .readthedocs.yaml.
Transforming Python log levels to Airbyte protocol log levels
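As an illustration only, a level translation of this kind could look like the sketch below. The mapping table and the to_airbyte_level helper are assumptions made for the example, not the CDK's actual implementation; the target names follow the Airbyte protocol's log level names (FATAL, ERROR, WARN, INFO, DEBUG).

```python
import logging

# Assumed mapping from Python level names to Airbyte protocol level names.
PYTHON_TO_AIRBYTE_LEVEL = {
    "CRITICAL": "FATAL",
    "ERROR": "ERROR",
    "WARNING": "WARN",
    "INFO": "INFO",
    "DEBUG": "DEBUG",
}


def to_airbyte_level(levelno: int) -> str:
    """Translate a Python logging level number (hypothetical helper)."""
    name = logging.getLevelName(levelno)
    # Unknown or custom levels fall back to INFO in this sketch.
    return PYTHON_TO_AIRBYTE_LEVEL.get(name, "INFO")


print(to_airbyte_level(logging.WARNING))   # WARN
print(to_airbyte_level(logging.CRITICAL))  # FATAL
```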
Updated OAuth2Specification.rootObject type in airbyte_protocol to allow string or int
Fix import logger error
Added check_config_against_spec parameter to Connector abstract class to allow skipping validating the input config against the spec for non-check calls
Improving unit test for logger
Use python standard logging instead of custom class
Modified OAuth2Specification model, added new fields: rootObject and oauthFlowOutputParameters
Added Transform class to use for mutating record value types so they adhere to jsonschema definition.
Added the ability to use caching for efficient synchronization of nested streams.
Allow passing custom headers to request in OAuth2Authenticator.refresh_access_token(): airbytehq/airbyte#6219
Resolve nested schema references and move external references to single schema definitions.
- Allow using requests.auth.AuthBase as authenticators instead of custom CDK authenticators.
- Implement Oauth2Authenticator, MultipleTokenAuthenticator and TokenAuthenticator authenticators.
- Add support for both legacy and requests native authenticator to HttpStream class.
No longer prints full config files on validation error to prevent exposing secrets to log file: airbytehq/airbyte#5879
Fix incremental stream not saving state when the internal limit config is set.
Fix mismatching between number of records actually read and number of records in logs by 1: airbytehq/airbyte#5767
Update generated AirbyteProtocol models to contain Oauth changes.
Add _limit and _page_size as internal config parameters for SAT
If the input config file does not comply with spec schema, raise an exception instead of system.exit.
Fix defect with user defined backoff time retry attempts, number of retries logic fixed
Add raise_on_http_errors, max_retries, retry_factor properties to be able to ignore http status errors and modify retry time in HTTP stream
Add checking of the specified config against spec for read, write, check and discover commands
Add MultipleTokenAuthenticator class to allow cycling through a list of API tokens when making HTTP requests
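The idea of cycling through tokens can be sketched with itertools.cycle. RotatingTokenAuth, its get_auth_header method and the Bearer header format are hypothetical stand-ins chosen for this example; they are not the CDK class or its interface.

```python
from itertools import cycle
from typing import Dict, Iterable


class RotatingTokenAuth:
    """Hypothetical stand-in that hands out tokens round-robin."""

    def __init__(self, tokens: Iterable[str]) -> None:
        token_list = list(tokens)
        if not token_list:
            raise ValueError("at least one token is required")
        # cycle() repeats the list endlessly, one token per call.
        self._tokens = cycle(token_list)

    def get_auth_header(self) -> Dict[str, str]:
        # Each call uses the next token in the rotation.
        return {"Authorization": f"Bearer {next(self._tokens)}"}


auth = RotatingTokenAuth(["token-a", "token-b"])
headers = [auth.get_auth_header()["Authorization"] for _ in range(3)]
print(headers)  # ['Bearer token-a', 'Bearer token-b', 'Bearer token-a']
```

Rotating per request spreads calls across tokens, which can help when each token has its own rate limit.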
Allow to fetch primary key info from singer catalog
Allow to use non-JSON payloads in request body for http source
Add abstraction for creating destinations.
Fix logging of the initial state.
Allow specifying keyword arguments to be sent on a request made by an HTTP stream: airbytehq/airbyte#4493
Allow to use Python 3.7.0: airbytehq/airbyte#3566
Fix an issue that caused infinite pagination: airbytehq/airbyte#3366
Initial Release