Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLAlchemy optional dependency #624

Open
sbrugman opened this issue May 30, 2024 · 3 comments
Open

SQLAlchemy optional dependency #624

sbrugman opened this issue May 30, 2024 · 3 comments
Labels
enhancement New feature or request question Further information is requested

Comments

@sbrugman
Copy link

sbrugman commented May 30, 2024

Is your feature request related to a problem? Please describe.
v1.5.0 introduced a dependency on SQLAlchemy >= 2.0.0. Unfortunately, this version conflicts which what recent version of Airflow require (<2).
I'm not using the SQLAlchemy-based metadata provider.

Describe the solution you'd like
The SQLAlchemy could be "feature-gated".
In setup.py we could move the dependency to extras_require.
The user can then install this dependency using pip install sqllineage[sqlalchemy] or similar.
The core dependency is even lighter!

setup(
    name=NAME,
    version=VERSION,
    author="Reata",
    author_email="[email protected]",
    description="SQL Lineage Analysis Tool powered by Python",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/reata/sqllineage",
    packages=find_packages(exclude=("tests",)),
    package_data={"": [f"{STATIC_FOLDER}/*", f"{STATIC_FOLDER}/**/**/*", "data/**/*"]},
    include_package_data=True,
    classifiers=[
        "Development Status :: 5 - Production/Stable",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
        "Programming Language :: Python :: 3.10",
        "Programming Language :: Python :: 3.11",
        "Programming Language :: Python :: 3.12",
        "Programming Language :: Python :: Implementation :: CPython",
    ],
    python_requires=">=3.8",
    install_requires=[
        "sqlparse==0.5.0",
        "networkx>=2.4",
        "sqlfluff==3.0.5",
    ],
    entry_points={"console_scripts": ["sqllineage = sqllineage.cli:main"]},
    extras_require={
        "ci": [
            "bandit",
            "black",
            "flake8",
            "flake8-blind-except",
            "flake8-builtins",
            "flake8-import-order",
            "flake8-logging-format",
            "mypy",
            "pytest",
            "pytest-cov",
            "tox",
            "twine",
            "wheel",
        ],
        "sqlalchemy": [
            "sqlalchemy>=2.0.0",
        ],
        "docs": ["Sphinx>=3.2.0", "sphinx_rtd_theme>=0.5.0"],
    },
    cmdclass={"egg_info": EggInfoWithJS},
)

sqllineage/core/metadata/sqlalchemy.py should only be imported when used, so that we get no import errors unless the module is directly imported.

Describe alternatives you've considered

Additional context
Upgrading Airflow is not an option since another team is responsible.

@sbrugman sbrugman added the enhancement New feature or request label May 30, 2024
@reata
Copy link
Owner

reata commented Jun 10, 2024

This decision is made intentionally to not support sqlalchemy v1.x (and as a result, making people suffer a bit).

Airflow and Superset still pin to sqlalchemy v1.x due to the fact that many 3rd party sqlalchemy dialect implementation still uses v1.x. Considering the maintenance effort, I don't want to build a compatible layer regarding sqlalchemy v1.x and v2.x API, though that would be more user-friendly. (this looks like what Airflow will do, see apache/airflow#28723)

At the same time, even though sqlalchemy v2.x is marked as dependency, we're coding in an OOP way that if you don't use sqlalchemy metadata provider, your code will work even with sqlalchemy v1.x installed. (pip might remind you of incompatible dependency, but that's fine).

Let me know if this works before we discuss on the 'optional' approach.

@reata reata added the question Further information is requested label Jun 10, 2024
@sbrugman
Copy link
Author

I agree with the decision not to support sqlalchemy v1.x, no problem there.

When the build pipeline includes a pip compile (or similar) step like we have, there is no resolution in the current setting. I still think the 'optional' approach is closest to the desired behaviour: if the user would like the sqlalchemy metadata provider, then the version should be >= v2.x

@cheunhong
Copy link

Agree on this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants