Rebase to main package #3

Merged
merged 41 commits into from
Jan 26, 2024

Conversation

matt-fleming
Member

No description provided.

Jesse and others added 30 commits June 7, 2023 14:02
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
## Summary

Support OAuth flow for Databricks Azure

## Background

Some OAuth endpoints (e.g. the OpenID configuration) and scopes differ between Databricks on Azure and Databricks on AWS. The current code only supports the OAuth flow for Databricks on AWS.

## What changes are proposed in this pull request?

- Change `OAuthManager` to decouple Databricks AWS-specific configuration from the OAuth flow
- Add `sql/auth/endpoint.py`, which implements cloud-specific OAuth endpoint configuration
- Change `DatabricksOAuthProvider` to work with the OAuth configurations of the different Databricks clouds (AWS, Azure)
- Add the corresponding unit tests
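
As a rough illustration of the decoupling described above, the sketch below selects a cloud-specific OAuth configuration from the workspace hostname. It is not the code in `sql/auth/endpoint.py`: the names `Cloud`, `EndpointConfig`, `infer_cloud`, and `endpoint_config_for` are hypothetical, the scope strings are placeholders, and the hostname-suffix rule is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Cloud(Enum):
    AWS = "aws"
    AZURE = "azure"


@dataclass(frozen=True)
class EndpointConfig:
    """Hypothetical container for the cloud-specific parts of the OAuth flow."""
    cloud: Cloud
    scopes: Tuple[str, ...]


def infer_cloud(hostname: str) -> Cloud:
    """Guess the cloud from the workspace hostname (assumed suffix rule)."""
    # Drop an optional scheme and trailing slash before matching the suffix.
    host = hostname.lower().split("://")[-1].rstrip("/")
    if host.endswith(".azuredatabricks.net"):
        return Cloud.AZURE
    return Cloud.AWS


# Placeholder scopes: the real values are defined by the connector, not here.
_CONFIGS = {
    Cloud.AWS: EndpointConfig(Cloud.AWS, ("AWS_SCOPE_PLACEHOLDER",)),
    Cloud.AZURE: EndpointConfig(Cloud.AZURE, ("AZURE_SCOPE_PLACEHOLDER",)),
}


def endpoint_config_for(hostname: str) -> EndpointConfig:
    """Return the configuration the OAuth flow should use for this host."""
    return _CONFIGS[infer_cloud(hostname)]
```

With a lookup like this, the OAuth flow itself only receives an `EndpointConfig` and never branches on the cloud.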
Signed-off-by: Jesse Whitehouse <[email protected]>
* Cloud Fetch download handler

Signed-off-by: Matthew Kim <[email protected]>

* Issue fix: final result link compressed data has multiple LZ4 end-of-frame markers

Signed-off-by: Matthew Kim <[email protected]>

* Addressing PR comments
 - Linting
 - Type annotations
 - Use response.ok
 - Log exception
 - Remove semaphore and only use threading.event
 - reset() flags method
 - Fix tests after removing semaphore
 - Link expiry logic should be in secs
 - Decompress data static function
 - link_expiry_buffer and static public methods
 - Docstrings and comments

Signed-off-by: Matthew Kim <[email protected]>

* Changing logger.debug to remove url

Signed-off-by: Matthew Kim <[email protected]>

* _reset() comment to docstring

Signed-off-by: Matthew Kim <[email protected]>

* link_expiry_buffer -> link_expiry_buffer_secs

Signed-off-by: Matthew Kim <[email protected]>

---------

Signed-off-by: Matthew Kim <[email protected]>
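
The link-expiry items above (expiry compared in seconds, with `link_expiry_buffer_secs`) can be sketched as follows. This is a minimal illustration, not the handler's actual code: the function name `is_link_expired` and the `now_secs` parameter are hypothetical, and the expiry time is assumed to be a Unix timestamp in seconds.

```python
import time
from typing import Optional


def is_link_expired(
    expiry_time_secs: float,
    link_expiry_buffer_secs: float = 0,
    now_secs: Optional[float] = None,
) -> bool:
    """Treat a link as expired if it expires within the buffer window.

    All values are in seconds; now_secs exists only to make the check
    testable without depending on the wall clock.
    """
    current = time.time() if now_secs is None else now_secs
    return current + link_expiry_buffer_secs >= expiry_time_secs
```

The buffer makes the check conservative: a link that would expire partway through a download is refused before the download starts.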
* Cloud Fetch download manager

Signed-off-by: Matthew Kim <[email protected]>

* Bug fix: submit handler.run

Signed-off-by: Matthew Kim <[email protected]>

* Type annotations

Signed-off-by: Matthew Kim <[email protected]>

* Namedtuple -> dataclass

Signed-off-by: Matthew Kim <[email protected]>

* Shutdown thread pool and clear handlers

Signed-off-by: Matthew Kim <[email protected]>

* Docstrings and comments

Signed-off-by: Matthew Kim <[email protected]>

* handler.run is the correct call

Signed-off-by: Matthew Kim <[email protected]>

* Link expiry buffer in secs

Signed-off-by: Matthew Kim <[email protected]>

* Adding type annotations for download_handlers and downloadable_result_settings

Signed-off-by: Matthew Kim <[email protected]>

* Move DownloadableResultSettings to downloader.py to avoid circular import

Signed-off-by: Matthew Kim <[email protected]>

* Black linting

Signed-off-by: Matthew Kim <[email protected]>

* Timeout is never None

Signed-off-by: Matthew Kim <[email protected]>

---------

Signed-off-by: Matthew Kim <[email protected]>
* Cloud fetch queue and integration

Signed-off-by: Matthew Kim <[email protected]>

* Enable cloudfetch with direct results

Signed-off-by: Matthew Kim <[email protected]>

* Typing and style changes

Signed-off-by: Matthew Kim <[email protected]>

* Client-settable max_download_threads

Signed-off-by: Matthew Kim <[email protected]>

* Docstrings and comments

Signed-off-by: Matthew Kim <[email protected]>

* Increase default buffer size bytes to 104857600

Signed-off-by: Matthew Kim <[email protected]>

* Move max_download_threads to kwargs of ThriftBackend, fix unit tests

Signed-off-by: Matthew Kim <[email protected]>

* Fix tests: staticmethod make_arrow_table mock not callable

Signed-off-by: Matthew Kim <[email protected]>

* cancel_futures in shutdown() only available in python >=3.9.0

Signed-off-by: Matthew Kim <[email protected]>

* Black linting

Signed-off-by: Matthew Kim <[email protected]>

* Fix typing errors

Signed-off-by: Matthew Kim <[email protected]>

---------

Signed-off-by: Matthew Kim <[email protected]>
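
The `cancel_futures` note above refers to a keyword argument that `ThreadPoolExecutor.shutdown()` only accepts from Python 3.9 onward. One way to guard the call is sketched below; the helper name `shutdown_pool` is hypothetical and this is not necessarily how the download manager does it.

```python
import sys
from concurrent.futures import ThreadPoolExecutor


def shutdown_pool(pool: ThreadPoolExecutor) -> None:
    """Shut the pool down, cancelling queued work where the runtime allows it."""
    if sys.version_info >= (3, 9):
        # cancel_futures was added in Python 3.9.
        pool.shutdown(wait=False, cancel_futures=True)
    else:
        # Older interpreters: pending futures still run to completion.
        pool.shutdown(wait=False)
```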
* Cloud Fetch e2e tests

Signed-off-by: Matthew Kim <[email protected]>

* Test case works for e2-dogfood shared unity catalog

Signed-off-by: Matthew Kim <[email protected]>

* Moving test to LargeQueriesSuite and setting catalog to hive_metastore

Signed-off-by: Matthew Kim <[email protected]>

* Align default value of buffer_size_bytes in driver tests

Signed-off-by: Matthew Kim <[email protected]>

* Adding comment to specify what's needed to run successfully

Signed-off-by: Matthew Kim <[email protected]>

---------

Signed-off-by: Matthew Kim <[email protected]>
Signed-off-by: Sebastian Eckweiler <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Co-authored-by: Sebastian Eckweiler <[email protected]>
Co-authored-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Daniel Segesdi <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Co-authored-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Co-authored-by: Jesse Whitehouse <[email protected]>
---------
Signed-off-by: Bogdan Kyryliuk <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Co-authored-by: Jesse Whitehouse <[email protected]>
Signed-off-by: William Gentry <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Co-authored-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Behaviour is gated behind `enable_v3_retries` config. This will be removed and become the default behaviour in a subsequent release.

Signed-off-by: Jesse Whitehouse <[email protected]>
Jesse and others added 11 commits August 10, 2023 11:03
* Add note to changelog about using cloud_fetch
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jacobus Herman <[email protected]>

Co-authored-by: Jesse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
…icks#208)

snok/install-poetry@v1 installs the latest version of Poetry

The latest version of poetry was released on 20 August 2023 (four days ago
as of this commit) and drops support for Python 3.7, causing our
github action to fail.

Until we complete databricks#207 we need to conditionally install the last version
of poetry that supports Python 3.7 (poetry==1.5.1)

Signed-off-by: Jesse Whitehouse <[email protected]>
databricks#206)

* Make retry policy backwards compatible with urllib3~=1.0.0

We already implement the equivalent of backoff_max so the behaviour will
be the same for urllib3==1.x and urllib3==2.x

We do not implement backoff jitter so the behaviour for urllib3==1.x will
NOT include backoff jitter whereas urllib3==2.x WILL include jitter.

---------

Signed-off-by: Jesse Whitehouse <[email protected]>
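
To make the `backoff_max` and jitter distinction above concrete, here is a generic capped exponential backoff. It is a sketch under stated assumptions, not the connector's retry policy and not urllib3's exact formula: the function name `backoff_delays` and its parameters are hypothetical.

```python
import random
from typing import List


def backoff_delays(
    attempts: int,
    backoff_factor: float,
    backoff_max: float,
    jitter: float = 0.0,
) -> List[float]:
    """Return the delay before each retry, capped at backoff_max.

    With jitter == 0.0 the schedule is deterministic, which corresponds to
    the "no jitter" behaviour described for urllib3 1.x above.
    """
    delays = []
    for n in range(attempts):
        delay = backoff_factor * (2 ** n)
        if jitter:
            # Add a random amount in [0, jitter) before applying the cap.
            delay += random.random() * jitter
        delays.append(min(backoff_max, delay))
    return delays
```

Applying the cap after the jitter keeps every delay at or below `backoff_max` regardless of the random component.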
---------

Signed-off-by: Jesse Whitehouse <[email protected]>

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

@matt-fleming matt-fleming merged commit 56687e5 into main Jan 26, 2024
0 of 2 checks passed
@evb123

evb123 commented Jan 26, 2024

Rebase socket_timeout hack with current forked repo
