Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Support GitHub Enterprise Server URL pattern #1270

Closed
wants to merge 2 commits into from

Conversation

yamao-latte
Copy link

@yamao-latte yamao-latte commented Oct 5, 2024

User description

  • Update _parse_pr_url and _parse_issue_url functions to use regex
  • Improve handling of various GitHub URL formats, including Enterprise Server
  • Simplify logic and reduce code duplication
  • Enhance error handling for invalid URL structures

Closes #1268

Description

When the provider is set to 'github', the current implementation fails to correctly parse URLs from GitHub Enterprise Server instances. This causes errors when users attempt to use Enterprise Server URLs with our system.

Current Behavior

  • URLs from github.com are parsed correctly
  • URLs from GitHub Enterprise Server instances throw an error or are parsed incorrectly

Expected Behavior

  • Both github.com and GitHub Enterprise Server URLs should be parsed correctly when the provider is set to 'github'

Steps to Reproduce

  1. Set the provider to 'github'
  2. Attempt to use a URL from a GitHub Enterprise Server instance (e.g., https://github.mycompany.com/org/repo/pull/123)
  3. Observe the error or incorrect parsing

Proposed Solution

Update the URL parsing logic to accommodate both github.com and GitHub Enterprise Server URL structures. This may involve:

  • Using more flexible regex patterns
  • Adjusting the logic to handle varying URL formats
  • Ensuring that repository names and PR/issue numbers are correctly extracted regardless of the exact URL structure

Additional Notes

This issue is related to the recent changes made in the PR and issue URL parsing functions. While those changes improved the situation, they didn't fully address the Enterprise Server use case.


PR Type

Enhancement


Description

  • Implemented regex-based parsing for PR and issue URLs, improving compatibility with various GitHub URL formats, including Enterprise Server
  • Updated base_url_html logic to handle different GitHub URL structures (cloud, enterprise with api/v3, and others)
  • Refactored _parse_pr_url and _parse_issue_url functions to use regex patterns, simplifying logic and reducing code duplication
  • Enhanced error handling for invalid URL structures
  • Improved overall flexibility and robustness of URL parsing for both github.com and GitHub Enterprise Server instances

Changes walkthrough 📝

Relevant files
Enhancement
github_provider.py
Enhance GitHub URL parsing and compatibility                         

pr_agent/git_providers/github_provider.py

  • Added regex import and implemented regex-based URL parsing for PRs and
    issues
  • Updated base_url_html logic to handle different GitHub URL formats
  • Refactored _parse_pr_url and _parse_issue_url functions for improved
    compatibility
  • Simplified error handling and reduced code duplication
  • +26/-34 

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    - Update _parse_pr_url and _parse_issue_url functions to use regex
    - Improve handling of various GitHub URL formats, including Enterprise Server
    - Simplify logic and reduce code duplication
    - Enhance error handling for invalid URL structures
    @codiumai-pr-agent-pro codiumai-pr-agent-pro bot added the enhancement New feature or request label Oct 5, 2024
    Copy link
    Contributor

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🏅 Score: 92
    🧪 No relevant tests
    🔒 No security concerns identified
    🔀 No multiple PR themes
    ⚡ Recommended focus areas for review

    Error Handling
    The error messages in _parse_pr_url and _parse_issue_url functions could be more specific to help with debugging.

    Code Duplication
    The _parse_pr_url and _parse_issue_url functions are very similar and could potentially be combined into a single function to reduce duplication.

    Copy link
    Contributor

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Error handling
    Add error handling for PEM file operations to improve robustness and provide informative error messages

    Consider adding error handling for the case when the PEM file specified in the
    settings does not exist or cannot be read. This will provide more informative error
    messages and improve the robustness of the code.

    pr_agent/git_providers/github_provider.py [662-663]

    -with open(get_settings().get("GITHUB.APP_PEM_FILE"), "rb") as key_file:
    -    private_key = key_file.read()
    +pem_file_path = get_settings().get("GITHUB.APP_PEM_FILE")
    +try:
    +    with open(pem_file_path, "rb") as key_file:
    +        private_key = key_file.read()
    +except FileNotFoundError:
    +    raise ValueError(f"GitHub App PEM file not found: {pem_file_path}")
    +except IOError as e:
    +    raise ValueError(f"Error reading GitHub App PEM file: {e}")
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: This suggestion significantly improves error handling for PEM file operations, providing more informative error messages and increasing the robustness of the code. It addresses potential issues that could occur during file operations.

    8
    Enhancement
    Use a more specific regex pattern for parsing PR URLs to improve accuracy and readability

    Consider using a more specific regex pattern for parsing PR and issue URLs. The
    current pattern might match unintended paths. A more specific pattern could include
    the start of the path and use non-capturing groups for better clarity.

    pr_agent/git_providers/github_provider.py [622-623]

    -pattern = r'/repos/([^/]+/[^/]+)/pulls?/(\d+)'
    +pattern = r'^/repos/(?P<repo>[^/]+/[^/]+)/pulls?/(?P<number>\d+)$'
     match = re.match(pattern, parsed_url.path)
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: The suggested regex pattern is more specific and uses named groups, which improves both accuracy and readability. This enhancement reduces the risk of false matches and makes the code more maintainable.

    7

    💡 Need additional feedback ? start a PR chat

    @mrT23
    Copy link
    Collaborator

    mrT23 commented Oct 7, 2024

    Hi @yamao-latte

    As this PR changes a critical component, the level of tests and code quality needed to approve this PR is high.

    As a general rule of thumb, as few changes as possible is the best way to go.

    I will also give code comments, but comprehensive tests, to ensure that compatibility to the GitHub cloud remains, are a must for this PR be considered to merge.

    Comment on lines +32 to +39
    if self.base_url.endswith("/api/v3"):
    # enterprise server with api/v3
    self.base_url_html = self.base_url[:len("/api/v3")]
    elif self.base_url == "https://api.github.com":
    # github cloud
    self.base_url_html = "https://github.com"
    else:
    self.base_url_html = self.base_url
    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    explain this change.
    why is it needed ?

    Give concrete examples - what didn't work before, and is working now due to this change ?

    repo_name = '/'.join(path_parts[1:3])
    path = parsed_url.path

    pattern = r'/repos/([^/]+/[^/]+)/issues/(\d+)'
    Copy link
    Collaborator

    @mrT23 mrT23 Oct 7, 2024

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    would this format support:

    https://github.com/Codium-ai/pr-agent/issues/1268

    for example
    ?

    if len(path_parts) < 5 or path_parts[3] != 'pulls':
    raise ValueError("The provided URL does not appear to be a GitHub PR URL")
    repo_name = '/'.join(path_parts[1:3])
    pattern = r'/repos/([^/]+/[^/]+)/pulls?/(\d+)'
    Copy link
    Collaborator

    @mrT23 mrT23 Oct 7, 2024

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    would this format support

    https://github.com/Codium-ai/pr-agent/pull/1270

    for example ?

    Copy link
    Author

    @yamao-latte yamao-latte Oct 7, 2024

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    We would like to support a format such as:
    https://github.enterprise-server.com/api/v3/repos/Codium-ai/pr-agent/pull/1270

    In GitHub Enterprise Server, the API URLs include the api/v3 segment. The current implementation uses a hard-coded check:

    if len(path_parts) < 5 or path_parts[3] != 'pulls':
    

    This approach does not account for the api/v3 part of the URL, leading to errors when the URL structure differs. Specifically, this issue affects the following part of the code:

    For more information on the URL structure, refer to the GitHub Enterprise Server documentation:
    GitHub Docs - Get a Pull Request

    pr_url = event_payload.get("pull_request", {}).get("url")
    if pr_url:
    # legacy - supporting both GITHUB_ACTION and GITHUB_ACTION_CONFIG
    auto_review = get_setting_or_env("GITHUB_ACTION.AUTO_REVIEW", None)
    if auto_review is None:
    auto_review = get_setting_or_env("GITHUB_ACTION_CONFIG.AUTO_REVIEW", None)
    auto_describe = get_setting_or_env("GITHUB_ACTION.AUTO_DESCRIBE", None)
    if auto_describe is None:
    auto_describe = get_setting_or_env("GITHUB_ACTION_CONFIG.AUTO_DESCRIBE", None)
    auto_improve = get_setting_or_env("GITHUB_ACTION.AUTO_IMPROVE", None)
    if auto_improve is None:
    auto_improve = get_setting_or_env("GITHUB_ACTION_CONFIG.AUTO_IMPROVE", None)
    # Set the configuration for auto actions
    get_settings().config.is_auto_command = True # Set the flag to indicate that the command is auto
    get_settings().pr_description.final_update_message = False # No final update message when auto_describe is enabled
    get_logger().info(f"Running auto actions: auto_describe={auto_describe}, auto_review={auto_review}, auto_improve={auto_improve}")
    # invoke by default all three tools
    if auto_describe is None or is_true(auto_describe):
    await PRDescription(pr_url).run()
    if auto_review is None or is_true(auto_review):
    await PRReviewer(pr_url).run()
    if auto_improve is None or is_true(auto_improve):
    await PRCodeSuggestions(pr_url).run()

    Copy link
    Collaborator

    @mrT23 mrT23 Oct 7, 2024

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    if this PR won't provide complete and total support also for regular github, it cannot be merged.

    All the issues I stated above should be addressed and fixed.

    For example, currently
    https://github.com/Codium-ai/pr-agent/pull/1266 (CLI)
    is not supported in this new code . it was supported before

    @mrT23
    Copy link
    Collaborator

    mrT23 commented Oct 13, 2024

    @yamao-latte
    If you are not going to address and improve this PR, I suggest closing it.
    It cannot be merged in its current form

    You can still utilize the code in your local deployment

    @mrT23
    Copy link
    Collaborator

    mrT23 commented Oct 14, 2024

    anyway, github serve support was added here
    https://qodo-merge-docs.qodo.ai/installation/github/#action-for-github-enterprise-server

    you are welcome to try it and give feedback

    @mrT23 mrT23 closed this Oct 14, 2024
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    Add support for GitHub Enterprise
    2 participants