forked from databricks/databricks-sql-python
Test #1 (Open)
evb123 wants to merge 152 commits into octoenergy:main from databricks:main
Conversation
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Matthew Kim <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
--------- Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
## Summary

Support OAuth flow for Databricks Azure

## Background

Some OAuth endpoints (e.g. the OpenID Configuration endpoint) and scopes differ between Databricks Azure and AWS. The current code only supports the OAuth flow on Databricks in AWS.

## What changes are proposed in this pull request?

- Change `OAuthManager` to decouple Databricks AWS-specific configuration from the OAuth flow
- Add `sql/auth/endpoint.py`, which implements cloud-specific OAuth endpoint configuration
- Change `DatabricksOAuthProvider` to work with the OAuth configurations of different Databricks clouds (AWS, Azure)
- Add the corresponding unit tests
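For context, the cloud-specific split described above can be pictured roughly as an endpoint abstraction that the OAuth flow consumes. This is a minimal sketch with hypothetical class and method names, not the actual contents of `sql/auth/endpoint.py`:

```python
from abc import ABC, abstractmethod
from typing import List


class OAuthEndpointCollection(ABC):
    """Cloud-specific OAuth endpoints, kept separate from the OAuth flow itself."""

    @abstractmethod
    def get_openid_config_url(self, hostname: str) -> str:
        """URL of the OpenID Configuration document for this cloud."""

    @abstractmethod
    def map_scopes(self, scopes: List[str]) -> List[str]:
        """Translate generic driver scopes into cloud-specific ones."""


class InHouseOAuthEndpointCollection(OAuthEndpointCollection):
    # Hypothetical AWS-style collection: the workspace host serves its own OIDC config.
    def get_openid_config_url(self, hostname: str) -> str:
        return f"https://{hostname}/oidc/.well-known/oauth-authorization-server"

    def map_scopes(self, scopes: List[str]) -> List[str]:
        return scopes  # requested scopes are used as-is on this cloud
```

An Azure collection would override both methods with its own configuration URL and scope mapping, which is what lets `OAuthManager` stay cloud-agnostic.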
--------- Signed-off-by: Jesse Whitehouse <[email protected]>
--------- Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
* Cloud Fetch download handler
* Issue fix: final result link compressed data has multiple LZ4 end-of-frame markers
* Addressing PR comments:
  - Linting
  - Type annotations
  - Use response.ok
  - Log exception
  - Remove semaphore and only use threading.Event
  - reset() flags method
  - Fix tests after removing semaphore
  - Link expiry logic should be in secs
  - Decompress data static function
  - link_expiry_buffer and static public methods
  - Docstrings and comments
* Changing logger.debug to remove url
* _reset() comment to docstring
* link_expiry_buffer -> link_expiry_buffer_secs

Signed-off-by: Matthew Kim <[email protected]>
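The LZ4 end-of-frame issue mentioned above arises because a result link's payload can contain several concatenated LZ4 frames, while a single-frame decompress stops at the first end marker. A minimal sketch of handling that with the `lz4` package (illustrative, not the driver's exact code):

```python
import lz4.frame


def decompress_all_frames(data: bytes) -> bytes:
    """Decompress a buffer that may hold multiple concatenated LZ4 frames."""
    result = bytearray()
    while data:
        decompressor = lz4.frame.LZ4FrameDecompressor()
        result += decompressor.decompress(data)
        # unused_data holds whatever follows this frame's end-of-frame marker
        data = decompressor.unused_data
    return bytes(result)
```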
* Cloud Fetch download manager
* Bug fix: submit handler.run
* Type annotations
* Namedtuple -> dataclass
* Shutdown thread pool and clear handlers
* Docstrings and comments
* handler.run is the correct call
* Link expiry buffer in secs
* Adding type annotations for download_handlers and downloadable_result_settings
* Move DownloadableResultSettings to downloader.py to avoid circular import
* Black linting
* Timeout is never None

Signed-off-by: Matthew Kim <[email protected]>
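A rough sketch of the manager shape these bullets describe — a dataclass of settings plus a thread pool that submits `handler.run` — with hypothetical field defaults (only the names `DownloadableResultSettings`, `link_expiry_buffer_secs`, and `handler.run` come from the commit message):

```python
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass
from typing import List


@dataclass
class DownloadableResultSettings:
    link_expiry_buffer_secs: int = 0
    download_timeout_secs: int = 60  # "Timeout is never None"


class DownloadManager:
    def __init__(self, max_workers: int = 10):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._handlers: List[Future] = []

    def submit(self, handler) -> None:
        # submit the method itself: handler.run, not handler.run()
        self._handlers.append(self._pool.submit(handler.run))

    def shutdown(self) -> None:
        # shut down the thread pool and clear handlers, per the bullets above
        self._pool.shutdown(wait=False)
        self._handlers.clear()
```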
* Cloud fetch queue and integration
* Enable cloudfetch with direct results
* Typing and style changes
* Client-settable max_download_threads
* Docstrings and comments
* Increase default buffer size bytes to 104857600
* Move max_download_threads to kwargs of ThriftBackend, fix unit tests
* Fix tests: staticmethod make_arrow_table mock not callable
* cancel_futures in shutdown() only available in Python >= 3.9.0
* Black linting
* Fix typing errors

Signed-off-by: Matthew Kim <[email protected]>
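The `cancel_futures` bullet above refers to a keyword that `ThreadPoolExecutor.shutdown()` only gained in Python 3.9, so older interpreters need a version guard, roughly:

```python
import sys
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)
if sys.version_info >= (3, 9):
    pool.shutdown(wait=False, cancel_futures=True)  # keyword added in Python 3.9
else:
    pool.shutdown(wait=False)  # older Pythons cannot cancel queued futures here
```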
* Cloud Fetch e2e tests
* Test case works for e2-dogfood shared unity catalog
* Moving test to LargeQueriesSuite and setting catalog to hive_metastore
* Align default value of buffer_size_bytes in driver tests
* Adding comment to specify what's needed to run successfully

Signed-off-by: Matthew Kim <[email protected]>
Signed-off-by: Matthew Kim <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Sebastian Eckweiler <[email protected]> Signed-off-by: Jesse Whitehouse <[email protected]> Co-authored-by: Sebastian Eckweiler <[email protected]> Co-authored-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Daniel Segesdi <[email protected]> Signed-off-by: Jesse Whitehouse <[email protected]> Co-authored-by: Jesse Whitehouse <[email protected]>
--------- Signed-off-by: Jesse Whitehouse <[email protected]>
--------- Signed-off-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]> Co-authored-by: Jesse Whitehouse <[email protected]>
--------- Signed-off-by: Bogdan Kyryliuk <[email protected]> Signed-off-by: Jesse Whitehouse <[email protected]> Co-authored-by: Jesse Whitehouse <[email protected]>
Signed-off-by: William Gentry <[email protected]> Signed-off-by: Jesse Whitehouse <[email protected]> Co-authored-by: Jesse Whitehouse <[email protected]>
Signed-off-by: Jesse Whitehouse <[email protected]>
--------- Co-authored-by: Jesse <[email protected]>
Resolves #187 Signed-off-by: Jesse Whitehouse <[email protected]>
Behaviour is gated behind the `enable_v3_retries` config. This gate will be removed and the behaviour will become the default in a subsequent release. Signed-off-by: Jesse Whitehouse <[email protected]>
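Opting in would look roughly like this; the flag name comes from the commit message, while the hostname, path, and token values are placeholders:

```python
from databricks import sql

# A minimal sketch: connection parameters are illustrative, not real credentials.
connection = sql.connect(
    server_hostname="example.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/abc123",
    access_token="dapi-example-token",
    enable_v3_retries=True,  # opt-in gate for the v3 retry behaviour
)
```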
* Move py.typed to the correct places. https://peps.python.org/pep-0561/ says "For namespace packages (see PEP 420), the py.typed file should be in the submodules of the namespace, to avoid conflicts and for clarity." Previously, when I added the py.typed file to this project (#382), I was unaware this was a namespace package (although, curiously, it seems I had done it right initially and then changed to the wrong way). As PEP 561 warns us, this does create conflicts: other libraries in the databricks namespace package (such as, in my case, databricks-vectorsearch) are then treated as though they are typed, which they are not. This commit moves the py.typed file to the correct places, the submodule folders, fixing that problem.
* Change the target of mypy to src/databricks instead of src. I think this might fix the CI code-quality checks failure, but unfortunately I can't replicate that failure locally, and the error message is unhelpful.
* Possible workaround for the bad error message "error: --install-types failed (no mypy cache directory)"; see python/mypy#10768 (comment).
* Fix invalid YAML syntax.
* Best fix (#3): fixes the problem by cd-ing and supplying a flag to mypy (that mypy needs this flag is seemingly fixed/changed in later versions of mypy; but that's another PR altogether...). Also fixes a type error that was somehow in the arguments of the program (?!) (I guess this is because you guys are still using implicit optional).
* Return the old result_links default (#5): return the old result_links default and make the type optional. I'm pretty sure the original problem is that add_file_links can't take a None, so these statements should be in the body of the if-statement that ensures it is not None.
* Update src/databricks/sql/utils.py: "self.download_manager is unconditionally used later, so must be created. Looks like this part of the code is totally not covered with tests 🤔"

Signed-off-by: wyattscarpenter <[email protected]>
Co-authored-by: Levko Kravets <[email protected]>
* Upgrade mypy. This commit removes the flag (and cd step) from f53aa37, which we added to get mypy to treat namespaces correctly. This was apparently a bug in mypy, or behavior they decided to change; to get the new behavior, we must upgrade mypy. (This also allows us to remove a couple of `# type: ignore` comments that are no longer needed.) This commit changes the version of mypy and runs `poetry lock`. It also conforms the whitespace of files in this project to the expectations of various tools and standards (namely: removing trailing whitespace as expected by git, and enforcing one and only one newline at the end of a file as expected by unix and GitHub). It also uses https://github.com/hauntsaninja/no_implicit_optional to automatically upgrade the codebase due to a change in mypy behavior. For a similar reason, it also fixes new type (or other) errors:
  - "Return type 'Retry' of 'new' incompatible with return type 'DatabricksRetryPolicy' in supertype 'Retry'"
  - databricks/sql/auth/retry.py:225: error: object has no attribute update [attr-defined]
  - /test_param_escaper.py:31: DeprecationWarning: invalid escape sequence \) (as it happens, I think it was also wrong for the string not to be raw, because I'm pretty sure it wants all of its backslashed single-quotes to appear literally with the backslashes, which wasn't happening until now)
  - ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject (this is a numpy version issue, which I fixed by being stricter about the numpy version)
* Incorporate suggestion. I decided the most expedient way of dealing with this type error was just adding the type-ignore comment back in, but with an `[attr-defined]` specifier this time. Otherwise I would have to restructure the code or figure out the proper types for a TypedDict for the dict, and I don't think that's worth it at the moment.

Signed-off-by: wyattscarpenter <[email protected]>
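The no_implicit_optional change mentioned above refers to newer mypy no longer treating a `None` default as implying `Optional`; the mechanical upgrade is:

```python
from typing import Optional


def fetch_rows(timeout: Optional[int] = None) -> list:
    # Older code wrote `timeout: int = None` (implicit Optional),
    # which newer mypy rejects; the annotation must now be explicit.
    return []
```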
- Raises NonRecoverableNetworkError when a request results in a 401 status code

Signed-off-by: Tor Hødnebø <[email protected]>
Signed-off-by: Tor Hødnebø <[email protected]>
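A minimal sketch of that behaviour — the exception name comes from the commit message, while the check itself is illustrative:

```python
class NonRecoverableNetworkError(Exception):
    """Raised when a request fails in a way that retrying cannot fix."""


def raise_if_non_recoverable(status_code: int) -> None:
    if status_code == 401:
        # Credentials are wrong or expired; retrying the same request won't help.
        raise NonRecoverableNetworkError("Received 401 (Unauthorized) from server")
```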
Signed-off-by: Jacky Hu <[email protected]>
…#405)

* [PECO-1751] Refactor CloudFetch downloader: handle files sequentially; utilize Futures
* Retry failed CloudFetch downloads
* Update tests

Signed-off-by: Levko Kravets <[email protected]>
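An illustrative sketch of the "Futures per download, retry on failure" shape described above (hypothetical names, not the refactor's actual code):

```python
from concurrent.futures import ThreadPoolExecutor


def download_with_retry(pool: ThreadPoolExecutor, task, attempts: int = 3):
    """Schedule one download task; re-submit it on failure, up to `attempts` times."""
    last_error: Exception = RuntimeError("no attempts made")
    for _ in range(attempts):
        future = pool.submit(task)
        try:
            return future.result()  # block on this file before moving to the next
        except Exception as error:  # sketch only; real code would narrow this
            last_error = error
    raise last_error
```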
…ons we support (#412) Signed-off-by: Levko Kravets <[email protected]>
* Disable SSL verification for CloudFetch links
* Use existing `_tls_no_verify` option in CloudFetch downloader
* Update tests

Signed-off-by: Levko Kravets <[email protected]>
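In terms of the `requests` library, the `_tls_no_verify` option maps onto the standard `verify` flag; this only illustrates the option's meaning, not the downloader's actual HTTP stack:

```python
import requests


def fetch_cloud_fetch_link(url: str, tls_no_verify: bool) -> bytes:
    # verify=False disables certificate validation, matching `_tls_no_verify`.
    response = requests.get(url, verify=not tls_no_verify, timeout=60)
    response.raise_for_status()
    return response.content
```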
* Prepare release 3.3.0
* Remove @arikfr from CODEOWNERS

Signed-off-by: Levko Kravets <[email protected]>
* Support pandas 2.2.2. See the pandas 2.2 release note: https://pandas.pydata.org/docs/dev/whatsnew/v2.2.0.html#to-numpy-for-numpy-nullable-and-arrow-types-converts-to-suitable-numpy-dtype
* Allow pandas 2.2.2 in pyproject.toml
* Update poetry.lock (poetry lock --no-update)
* Code style

Signed-off-by: Levko Kravets <[email protected]>
Co-authored-by: Levko Kravets <[email protected]>
…ion setting is provided (#419)

* [PECO-1801] Make OAuth the default authenticator if no authentication setting is provided

Signed-off-by: Jacky Hu <[email protected]>
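The selection logic amounts to falling back to OAuth when the caller supplies no explicit credentials; schematically (a hypothetical helper, with illustrative auth-type strings):

```python
from typing import Optional


def pick_auth_type(
    access_token: Optional[str] = None,
    auth_type: Optional[str] = None,
) -> str:
    if auth_type is not None:
        return auth_type           # caller chose explicitly
    if access_token is not None:
        return "pat"               # personal access token supplied
    return "databricks-oauth"      # nothing supplied: default to OAuth
```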
* [PECO-1857] Use SSL options with HTTPS connection pool
* Some cleanup
* Resolve circular dependencies
* Update existing tests
* Fix MyPy issues
* Fix `_tls_no_verify` handling
* Add tests

Signed-off-by: Levko Kravets <[email protected]>
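Passing SSL options into an HTTPS connection pool with `urllib3` looks roughly like this (illustrative host and paths; not the driver's exact wiring):

```python
import ssl

from urllib3 import HTTPSConnectionPool

pool = HTTPSConnectionPool(
    host="example.cloud.databricks.com",
    port=443,
    cert_reqs=ssl.CERT_REQUIRED,  # ssl.CERT_NONE when `_tls_no_verify` is set
    ca_certs="/etc/ssl/certs/ca-certificates.crt",
)
```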
Prepare release 3.4.0 Signed-off-by: Levko Kravets <[email protected]>
… column set (#440)

* Implemented the columnar flow for non-arrow users
* Minor fixes
* Introduced the Column Table structure
* Added test for the new column table
* Minor fix
* Removed unnecessary files
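A column table stores result data column-wise rather than row-wise; a minimal sketch of such a structure (hypothetical field and method names):

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass
class ColumnTable:
    column_names: List[str]
    columns: List[List[Any]]  # one list per column, all of equal length

    @property
    def num_rows(self) -> int:
        return len(self.columns[0]) if self.columns else 0

    def get_row(self, index: int) -> Tuple[Any, ...]:
        # Assemble one logical row from the column-wise storage.
        return tuple(column[index] for column in self.columns)
```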
…rmation in error (#447)

* Added error info on non-retryable error
Reformatted the files using black
Prepare release 3.5.0 Signed-off-by: Jacky Hu <[email protected]>
Signed-off-by: Jacky Hu <[email protected]>
Signed-off-by: Jacky Hu <[email protected]>
…odejs drivers (#467)

* Added the exponential backoff code
* Added the exponential backoff algorithm and refactored the code
* Added jitter and added unit tests
* Reformatted
* Fixed the test_retry_exponential_backoff integration test
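Exponential backoff with jitter spaces retries out exponentially while randomising each delay so that clients don't retry in lockstep; a standard sketch of the idea (not the driver's exact parameters):

```python
import random


def backoff_with_jitter(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay in seconds before retry `attempt` (0-based): capped exponential, randomised."""
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)  # "full jitter" variant
```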
…#463)

* Built the basic flow for the async pipeline - testing is remaining
* Implemented the flow for get_execution_result, but the problem of an invalid operation handle still persists
* Missed adding some files in the previous commit
* Working prototype of execute_async, get_query_state and get_execution_result
* Added integration tests for execute_async
* Added docs for functions
* Refactored the async code
* Fixed java doc
* Reformatted
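The async flow separates submitting a query from collecting its results; usage is roughly as follows. The method names come from the commit message, while the polling loop and state strings are illustrative:

```python
import time


def run_async(cursor, statement: str):
    cursor.execute_async(statement)        # submit without blocking on results
    while cursor.get_query_state() in ("PENDING", "RUNNING"):
        time.sleep(1)                      # poll until a terminal state is reached
    return cursor.get_execution_result()   # fetch results of the finished query
```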
Fixed the check_types failure
* Remove upper caps on numpy and pyarrow versions
…supported from >=3.x connector (#477)

* Added doc update
* Raised an error when an incorrect row offset is returned
* Changed error type
* Grammar fix
* Added unit tests and modified the code
* Updated error message
* Updated the non-retrying behaviour to apply only to the inline case
* Updated fix
* Changed the flow
* Minor update
* Updated the retryable condition
* Minor test fix
* Added extra space
* Bumped up version
* Updated to version 3.7.0
* Grammar fix
* Minor fix
* Modified the .gitignore file to exclude the .idea folder
* [PECO-1803] Splitting the PySQL connector into the core and the non-core part (#417)
  - Implemented ColumnQueue to test fetchall without pyarrow; removed token
  - Corrected the order of fields in row
  - Changed the folder structure and tested the basic setup to work
  - Refactored the code to make the connector work
  - Basic setup and integration of connector, core and sqlalchemy is working
  - Set up a working dynamic change from ColumnQueue to ArrowQueue
  - Refactored the test code and moved it to the respective folders
  - Added the unit test for column_queue; fixed __version__
  - Added venv_main to .gitignore
  - Added and merged code for the columnar table
  - Fixed the retry_close session test issue with logging
  - Fixed the databricks_sqlalchemy tests and introduced pytest.ini for the sqla_testing
  - Added the pyarrow_test mark on pytest
  - Fixed databricks.sqlalchemy to databricks_sqlalchemy imports
  - Added poetry.lock and the dist folder; changed pyproject.toml
  - Added the pyarrow skip tag on unit tests and verified they work
  - Fixed the Decimal and timestamp conversion issue in the non-arrow pipeline
  - Removed files that were not required and reformatted
  - Fixed the test_retry error
  - Changed the folder structure to src/databricks
  - Moved the columnar non-arrow flow to another PR
  - Moved the README to the root
  - Removed the ColumnQueue instance and the databricks_sqlalchemy dependency in core
  - Changed the pysql_supports_arrow predicate; introduced changes in pyproject.toml
  - Ran the black formatter with the original version; removed the extra .py from all the __init__.py file names
  - A series of CI check commits, a big update, refactoring, and versioning fixes
  - Changed the folder structure so that sqlalchemy has no reference here
  - Fixed README.md and CONTRIBUTING.md
  - Added manual publish, an on-push trigger, and manually set the publish step
  - Changed versioning in pyproject.toml; bumped up the version to 4.0.0.b3 and changed the structure to have pyarrow as optional
  - Removed the sqlalchemy tests from the integration.yml file
* [PECO-1803] Print warning message if pyarrow is not installed (#468)
* [PECO-1803] Remove sqlalchemy and update README.md (#469)
* Removed all sqlalchemy-related code, regenerated the lock file, fixed failing tests, fixed the poetry numpy 2.2.2 issue, and fixed the workflows

Signed-off-by: Jacky Hu <[email protected]>
Co-authored-by: Jacky Hu <[email protected]>
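Making pyarrow optional, as described above, typically means guarding the import and switching queue implementations based on its availability; schematically (the predicate name `pysql_supports_arrow` comes from the commit message, the guard pattern is illustrative):

```python
try:
    import pyarrow
except ImportError:
    pyarrow = None  # the connector's core must keep working without it


def pysql_supports_arrow() -> bool:
    # ArrowQueue is used when pyarrow is importable; ColumnQueue otherwise.
    return pyarrow is not None
```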
* Removed Python 3.8 support
* Minor fix
Support for Python up to 3.12
Bumped up the version
Test