Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: replace selenium user with fixed user #31844

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

villebro
Copy link
Member

@villebro villebro commented Jan 14, 2025

SUMMARY

This PR does the following:

  • Remove the THUMBNAILS_SELENIUM_USER config parameter. To execute as a fixed user you can now use the new FixedUser executor that can assume the role of any user. This means you can use a different fixed user for thumbnails and Alerts & Reports if necessary.
  • Rename ALERT_REPORTS_EXECUTE_AS and THUMBNAILS_EXECUTE_AS to ALERT_REPORTS_EXECUTORS and THUMBNAILS_EXECUTORS for improved clarity.
  • Add CACHE_WARMUP_EXECUTORS to make it possible to specify executors for cache warmup tasks. With this change all async tasks can be configured to use arbitrary executors, similar to what was already possible for Thumbnails and Alerts & Reports.
  • Change the cache warmup strategy interface to have a get_tasks method instead of get_payloads. This change is needed as the cache warmup tasks now need to provide a username to execute the cache warmup tasks.
  • Update docs to reflect the changes.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Fix Detected
Functionality Insecure default admin user for async tasks ▹ view

Need a new review? Comment /korbit-review on this PR and I'll review your latest changes.

Korbit Guide: Usage and Customization

Interacting with Korbit

  • You can manually ask Korbit to review your PR using the /korbit-review command in a comment at the root of your PR.
  • You can ask Korbit to generate a new PR description using the /korbit-generate-pr-description command in any comment on your PR.
  • Too many Korbit comments? I can resolve all my comment threads if you use the /korbit-resolve command in any comment on your PR.
  • Chat with Korbit on issues we post by tagging @korbit-ai in your reply.
  • Help train Korbit to improve your reviews by giving a 👍 or 👎 on the comments Korbit posts.

Customizing Korbit

  • Check out our docs on how you can make Korbit work best for you and your team.
  • Customize Korbit for your organization through the Korbit Console.

Feedback and Support

superset/config.py Outdated Show resolved Hide resolved
Copy link

codecov bot commented Jan 14, 2025

Codecov Report

Attention: Patch coverage is 80.00000% with 19 lines in your changes missing coverage. Please review.

Project coverage is 83.46%. Comparing base (76d897e) to head (5054fac).
Report is 1339 commits behind head on master.

Files with missing lines Patch % Lines
superset/tasks/cache.py 60.86% 18 Missing ⚠️
superset/tasks/utils.py 95.65% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #31844       +/-   ##
===========================================
+ Coverage   60.48%   83.46%   +22.97%     
===========================================
  Files        1931      546     -1385     
  Lines       76236    39323    -36913     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    32822    -13292     
+ Misses      28017     6501    -21516     
+ Partials     2105        0     -2105     
Flag Coverage Δ
hive 48.81% <33.68%> (-0.36%) ⬇️
javascript ?
mysql 76.00% <73.68%> (?)
postgres 76.06% <73.68%> (?)
presto 53.30% <33.68%> (-0.51%) ⬇️
python 83.46% <80.00%> (+19.97%) ⬆️
sqlite 75.54% <73.68%> (?)
unit 61.07% <44.21%> (+3.44%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pull-request-size pull-request-size bot added size/L and removed size/M labels Jan 14, 2025
@justinpark justinpark marked this pull request as draft January 15, 2025 18:19
@villebro villebro force-pushed the villebro/selenium-user branch 6 times, most recently from 06e7f74 to ce4330b Compare January 18, 2025 17:43
@github-actions github-actions bot added the doc Namespace | Anything related to documentation label Jan 18, 2025
@villebro villebro force-pushed the villebro/selenium-user branch from 9b07b8e to 9fc1e15 Compare January 18, 2025 18:26
@villebro villebro added risk:breaking-change Issues or PRs that will introduce breaking changes v5.0 Label added by the release manager to track PRs to be included in the 5.0 branch labels Jan 18, 2025
@villebro villebro marked this pull request as ready for review January 18, 2025 20:58
Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Fix Detected
Functionality Silent Executor Error Handling Skips Cache Warmup ▹ view
Functionality Invalid FIXED_USER Executor Error ▹ view
Functionality Silent Failure on Missing Executor ▹ view
Error Handling Silent Error Handling ▹ view
Security Handling ExecutorNotFoundError in get_dashboard_digest and get_chart_digest. ▹ view
Files scanned
File Path Reviewed
superset/tasks/exceptions.py
superset/tasks/types.py
superset/thumbnails/digest.py
superset/tasks/thumbnails.py
superset/commands/report/alert.py
superset/tasks/utils.py
superset/models/slice.py
superset/models/dashboard.py
superset/tasks/cache.py
superset/commands/report/execute.py
superset/config.py

Explore our documentation to understand the languages and file types we support and the files we ignore.

Need a new review? Comment /korbit-review on this PR and I'll review your latest changes.

Korbit Guide: Usage and Customization

Interacting with Korbit

  • You can manually ask Korbit to review your PR using the /korbit-review command in a comment at the root of your PR.
  • You can ask Korbit to generate a new PR description using the /korbit-generate-pr-description command in any comment on your PR.
  • Too many Korbit comments? I can resolve all my comment threads if you use the /korbit-resolve command in any comment on your PR.
  • Chat with Korbit on issues we post by tagging @korbit-ai in your reply.
  • Help train Korbit to improve your reviews by giving a 👍 or 👎 on the comments Korbit posts.

Customizing Korbit

  • Check out our docs on how you can make Korbit work best for you and your team.
  • Customize Korbit for your organization through the Korbit Console.

Current Korbit Configuration

General Settings
Setting Value
Review Schedule Automatic excluding drafts
Max Issue Count 10
Automatic PR Descriptions
Issue Categories
Category Enabled
Naming
Database Operations
Documentation
Logging
Error Handling
Systems and Environment
Objects and Data Structures
Readability and Maintainability
Asynchronous Processing
Design Patterns
Third-Party Libraries
Performance
Security
Functionality

Feedback and Support

Note

Korbit Pro is free for open source projects 🎉

Looking to add Korbit to your team? Get started with a free 2 week trial here

Comment on lines +67 to +68
except (ExecutorNotFoundError, InvalidExecutorError):
username = None
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent Executor Error Handling Skips Cache Warmup category Functionality

Tell me more
What is the issue?

The code silently handles ExecutorNotFoundError and InvalidExecutorError by setting username to None, which leads to skipping cache warmup for affected charts.

Why this matters

This can result in some charts not being warmed up properly, reducing the effectiveness of the cache warmup strategy.

Suggested change ∙ Feature Preview

Log the specific error and consider alternative executor fallbacks:

try:
    executor = get_executor(executors, chart)
    username = executor[1]
except ExecutorNotFoundError as e:
    logger.error(f"No executor found for chart {chart.id}: {str(e)}. Using default executor.")
    username = current_app.config.get("DEFAULT_CACHE_WARMUP_USER")
except InvalidExecutorError as e:
    logger.error(f"Invalid executor for chart {chart.id}: {str(e)}. Using default executor.")
    username = current_app.config.get("DEFAULT_CACHE_WARMUP_USER")
Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

Comment on lines +71 to +72
if executor == ExecutorType.FIXED_USER:
raise InvalidExecutorError()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invalid FIXED_USER Executor Error category Functionality

Tell me more
What is the issue?

The code raises InvalidExecutorError when encountering ExecutorType.FIXED_USER, but this type should be valid according to the developer's intent of replacing selenium executor with fixed_user executor.

Why this matters

This will prevent the system from using the FIXED_USER executor type directly, contradicting the main purpose of the changes being made to replace selenium with fixed user executors.

Suggested change ∙ Feature Preview

Remove this validation check since FIXED_USER is now a valid executor type:

# Remove these lines
if executor == ExecutorType.FIXED_USER:
    raise InvalidExecutorError()
Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

Comment on lines +96 to +103
try:
executor_type, executor = get_executor(
executors=config["THUMBNAIL_EXECUTORS"],
model=dashboard,
current_user=get_current_user(),
)
except ExecutorNotFoundError:
return ""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent Failure on Missing Executor category Functionality

Tell me more
What is the issue?

Silently returning an empty string when no executor is found could lead to unexpected behavior in thumbnail generation.

Why this matters

Without proper error handling or logging, debugging issues with missing executors becomes difficult, and the system's behavior becomes opaque to administrators.

Suggested change ∙ Feature Preview

Add logging before returning empty string to help with debugging:

try:
    executor_type, executor = get_executor(
        executors=config["THUMBNAIL_EXECUTORS"],
        model=dashboard,
        current_user=get_current_user(),
    )
except ExecutorNotFoundError:
    logger.warning("No valid executor found for dashboard thumbnail generation")
    return ""
Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

Comment on lines +102 to +103
except ExecutorNotFoundError:
return ""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent Error Handling category Error Handling

Tell me more
What is the issue?

The error handling silently returns an empty string without logging the error or providing context about what went wrong.

Why this matters

Silent error handling makes it difficult to debug issues when they occur, as there's no trace of what caused the ExecutorNotFoundError.

Suggested change ∙ Feature Preview

Add error logging before returning empty string:

except ExecutorNotFoundError as ex:
    logger.warning(f"Failed to get executor for thumbnail generation: {ex}")
    return ""
Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

Comment on lines +102 to +103
except ExecutorNotFoundError:
return ""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handling ExecutorNotFoundError in get_dashboard_digest and get_chart_digest. category Security

Tell me more

In the get_dashboard_digest and get_chart_digest functions, when an ExecutorNotFoundError is caught, the code currently returns an empty string. This can potentially lead to security issues if the application does not handle the case when an executor is not found properly. Consider raising the exception or returning a default thumbnail image instead to ensure the application behaves as expected and avoids potential vulnerabilities.

Chat with Korbit by mentioning @korbit-ai, and give a 👍 or 👎 to help Korbit improve your reviews.

UPDATING.md Outdated Show resolved Hide resolved
# from superset.tasks.types import ExecutorType, FixedExecutor
#
# CACHE_WARMUP_EXECUTORS = [ExecutorType.OWNER, FixedExecutor("admin")]
CACHE_WARMUP_EXECUTORS = [ExecutorType.OWNER]
Copy link
Member

@michael-s-molina michael-s-molina Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be possible to have a more generic configuration that allows future async tasks and reduces the number of configurations? Something like:

ASYNC_TASKS_EXECUTORS: dict[AsyncTaskType, ExecutorType] = {}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michael-s-molina I kinda like this idea. However, I'm not fully clear on what the exact pattern will look like for the forthcoming AsyncTaskType. Similarly, a while back when I was implementing a custom KeyValue resource for an internal use case, I ran into issues where it was difficult to extend the existing KeyValueResource type. In the end I ended up doing a bunch of funky casting to avoid linting issues. So to get around this, we would need to make sure the type (or its equivalent) supports introducing new types beyond the built-in ones.

Thoughts? I'm ok implementing an AsyncTaskType enum (or similar) now, but we may still need to refactor it later, causing yet another breaking change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

class AsyncTaskType:
    """Base class for async task types."""
    pass

class CacheWarmUpTask(AsyncTaskType):
    pass

class AlertsReportsTask(AsyncTaskType):
    pass

# Usage
ASYNC_TASKS_EXECUTORS: dict[AsyncTaskType, ExecutorType] = {
    CacheWarmUpTask(): ExecutorType.OWNER,
    AlertsReportsTask(): FixedExecutor("admin"),
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer something that serializes more easily. String based enums are good in this regard, but introduce the pluggability issue outlined above.

Copy link
Member

@michael-s-molina michael-s-molina Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a class, with predefined values that can be extended, be serializable enough? Something like:

class AsyncTaskType:
   CACHE_WARMUP = "CACHE_WARMUP",
   ALERTS_REPORTS = "ALERTS_REPORTS",

   _additional_types = set()

   @classmethod
   def add_type(cls, type):
       if type not in cls._additional_types:
           cls._additional_types.add(type)

   @classmethod
   def get_types(cls):
       return {cls.CACHE_WARMUP, cls.ALERTS_REPORTS}.union(cls._additional_types)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would essentially mean get_types just returns set[str], which means ASYNC_TASKS_EXECUTORS would be dict[str, list[ExecutorType]]. I'm sure there's some good pattern for solving this, but I still feel that's best taken care of in a follow-up PR that implements the new async task framework.

@villebro villebro force-pushed the villebro/selenium-user branch from 9fc1e15 to ec338af Compare January 21, 2025 17:46
@villebro villebro force-pushed the villebro/selenium-user branch from ec338af to 5054fac Compare January 21, 2025 17:47
@villebro
Copy link
Member Author

@michael-s-molina is there anything else that needs to be addressed, or is this good to go, assuming the async task type change can wait until the GATF feature?

current_user=get_current_user(),
)
except ExecutorNotFoundError:
return ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe None will be a better representation? If you agree, please change all digest generation methods.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Namespace | Anything related to documentation review:draft risk:breaking-change Issues or PRs that will introduce breaking changes size/L v5.0 Label added by the release manager to track PRs to be included in the 5.0 branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

76 - Remove falling back to fixed selenium executor for thumbnails by default
3 participants