feat: replaced Beautiful Soup by selectolax to enhance performance #213

Merged
TeKrop merged 1 commit into main from feature/replace-beautifulsoup-by-selectolax
Nov 9, 2024

Conversation

TeKrop (Owner) commented Nov 8, 2024

Summary by Sourcery

Replace Beautiful Soup with Selectolax for improved performance in HTML parsing, refactor parsing logic accordingly, update dependencies, and adjust tests and documentation to reflect these changes.

New Features:

  • Replaced Beautiful Soup with Selectolax for HTML parsing to enhance performance.

Enhancements:

  • Refactored parsing logic to use Selectolax's CSS selectors instead of Beautiful Soup's find methods.

Build:

  • Updated project dependencies to remove Beautiful Soup and add Selectolax.

Documentation:

  • Updated documentation to reflect the change from Beautiful Soup to Selectolax in the project description.

Tests:

  • Modified tests to accommodate changes in parsing logic due to the switch to Selectolax.

TeKrop added the enhancement (New feature or request) label on Nov 8, 2024
TeKrop self-assigned this on Nov 8, 2024

sourcery-ai bot (Contributor) commented Nov 8, 2024

Reviewer's Guide by Sourcery

This PR replaces Beautiful Soup with Selectolax for HTML parsing to improve performance. All HTML parsing logic now uses Selectolax's CSS selector methods (css/css_first) instead of Beautiful Soup's find/find_all methods. The parsed output is meant to stay the same; only the implementation changes, with faster parsing as the goal.

Class diagram for updated HTML parsing

classDiagram
    class HTMLParser {
        - create_bs_tag(html_content: str)
        + create_parser_tag(html_content: str)
        + store_response_data(response: httpx.Response)
    }
    class PlayerCareerParser {
        - __get_title(profile_div: Tag) : str | None
        + __get_title(profile_div: LexborNode) : str | None
        - __get_endorsement(progression_div: Tag) : dict | None
        + __get_endorsement(progression_div: LexborNode) : dict | None
        - __get_competitive_ranks(progression_div: Tag) : dict | None
        + __get_competitive_ranks(progression_div: LexborNode) : dict | None
    }
    class HeroParser {
        - __get_summary(overview_section: Tag) : dict
        + __get_summary(overview_section: LexborNode) : dict
        - __get_abilities(abilities_section: Tag) : list[dict]
        + __get_abilities(abilities_section: LexborNode) : list[dict]
    }
    class RolesParser {
        - parse_data() : list[dict]
        + parse_data() : list[dict]
    }
    HTMLParser <|-- PlayerCareerParser
    HTMLParser <|-- HeroParser
    HTMLParser <|-- RolesParser
    note for HTMLParser "Replaced BeautifulSoup with Selectolax for parsing"

File-Level Changes

Change: Replaced Beautiful Soup with Selectolax for HTML parsing
Details:
  • Removed Beautiful Soup dependency and added Selectolax dependency
  • Updated HTMLParser base class to use Selectolax's parser and CSS selectors
  • Replaced find/find_all methods with css/css_first methods
  • Updated attribute access syntax from dict-style to attributes property
  • Simplified nested tag traversal using CSS combinators
Files:
  • pyproject.toml
  • app/parsers.py
  • app/main.py
  • README.md

Change: Refactored player career parser to use Selectolax
Details:
  • Updated HTML element selection to use CSS selectors
  • Modified attribute access to use Selectolax's attributes property
  • Simplified nested element traversal using CSS paths
  • Added helper method for getting heroes options
  • Removed type error test case specific to Beautiful Soup implementation
Files:
  • app/players/parsers/player_career_parser.py
  • tests/players/parsers/test_player_career_parser.py

Change: Updated hero and role parsers to use Selectolax
Details:
  • Converted find/find_all calls to CSS selectors
  • Updated attribute access patterns
  • Modified text content extraction to use text() method
Files:
  • app/heroes/parsers/hero_parser.py
  • app/heroes/parsers/heroes_parser.py
  • app/roles/parsers/roles_parser.py

sonarcloud bot commented Nov 8, 2024

sourcery-ai bot (Contributor) left a comment

Hey @TeKrop - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good


TeKrop merged commit 9916b04 into main on Nov 9, 2024
3 checks passed
TeKrop deleted the feature/replace-beautifulsoup-by-selectolax branch on November 9, 2024 at 09:44