Co-authored-by: Martín Gaitán <gaitan@gmail.com>
| result = re.findall(pattern, url) | ||
| if result: | ||
| extracted_string = result[0] | ||
| return extracted_string | ||
| else: | ||
| raise DocoptExit(f"Ensure your path is from github repository. (error {e})") |
Here, e in the message is not defined.
Also, the EAFP approach is considered more Pythonic.
So it should be something like:
| result = re.findall(pattern, url) | |
| if result: | |
| extracted_string = result[0] | |
| return extracted_string | |
| else: | |
| raise DocoptExit(f"Ensure your path is from github repository. (error {e})") | |
| try: | |
| return re.findall(pattern, url)[0] | |
| except IndexError: | |
| raise DocoptExit("Ensure the URL is from a GitHub repository") |
| pattern = r"https://github.com/.*?/(?:tree|blob)/(.*)" | ||
| result = re.findall(pattern, url) |
Do not use findall for a single potential match. Also, it would be a little better to compile the pattern at module level:
PATTERN = re.compile(r"https://github.com/.*?/(?:tree|blob)/(.*)")

def get_path(url):
    try:
        return PATTERN.match(url).group(1)
    except ...
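A minimal sketch of that suggestion, under two assumptions that are not in the PR: the elided except clause is filled with AttributeError (what `.group` raises when `match` returns None), and ValueError stands in for DocoptExit so the snippet runs without docopt installed.

```python
import re

# Compiled once at import time instead of on every call.
PATTERN = re.compile(r"https://github.com/.*?/(?:tree|blob)/(.*)")


def get_path(url):
    """Return everything after /tree/ or /blob/ in a GitHub URL."""
    try:
        # match() returns None when the URL does not fit the pattern,
        # so .group(1) raises AttributeError in that case.
        return PATTERN.match(url).group(1)
    except AttributeError:
        # ValueError is a stand-in for DocoptExit (assumption).
        raise ValueError("Ensure the URL is from a GitHub repository") from None
```

Note that the captured group still includes the branch, e.g. `main/hello.md` for a `/blob/main/hello.md` URL.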
| if result: | ||
| full_name_repo = result[0][0] | ||
|
|
||
| is_from_my_repo = "https://" not in url_or_path or full_name_repo == repo.full_name |
A few things here:
- The first condition should be not url_or_path.startswith("https://")
- Is it enough to assume that if it doesn't start with https:// then it is from the repo?
- Wording: "my_repo". I'm the user, but the repo could be someone else's and I just have permissions, right? Like mgaitan downloading from Shiphero/pastebin. I would use is_from_the_repo or something like that.
| if is_from_my_repo: | ||
| path = re.sub(rf"^https://github\.com/{repo.full_name}/(blob|tree)/{repo.default_branch}/", "", url_or_path) | ||
| else: | ||
| repo = get_repo(full_name_repo) | ||
| path = get_path(url_or_path) |
If we are parsing with a regex anyway, couldn't we generalize it to accept URLs or user/repo/path forms for any repo, and then check whether it's ours or not?
Suppose shbin zauberzeug/nicegui/examples
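One way this generalization could look, as a rough sketch only. The names `TARGET`, `Target`, `parse_target` and `is_from_the_repo` are hypothetical, and the pattern assumes the branch or ref is a single path segment (no slashes).

```python
import re
from typing import NamedTuple, Optional

TARGET = re.compile(
    r"""
    ^(?:https://github\.com/)?            # optional URL prefix
    (?P<full_name>[^/]+/[^/]+)            # owner/repo
    (?:/(?:blob|tree)/(?P<ref>[^/]+))?    # optional /blob/<ref> or /tree/<ref>
    (?:/(?P<path>.*))?$                   # optional path inside the repo
    """,
    re.VERBOSE,
)


class Target(NamedTuple):
    full_name: str
    ref: Optional[str]
    path: Optional[str]


def parse_target(target: str) -> Optional[Target]:
    """Parse a GitHub URL or an owner/repo/path string; None if neither fits."""
    match = TARGET.match(target)
    if match is None:
        return None
    return Target(match["full_name"], match["ref"], match["path"])


def is_from_the_repo(target: Target, repo_full_name: str) -> bool:
    """True when the parsed target points at the configured repository."""
    return target.full_name == repo_full_name
```

The short form is ambiguous with a relative path inside the configured repo (mgaitan/15e1VFq87UA.txt would parse as owner/repo), so the caller would still need a rule to decide between the two readings.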
| url = f"https://raw.githubusercontent.com/{repo.full_name}/{path}" | ||
| response = requests.get(url) | ||
| if response.status_code > 200: | ||
| raise Exception("There was a problem with your download, please check url") | ||
| content = response.content |
What if we unify own and foreign contents with requests? The only difference is that you need to pass the token in the header.
In [16]: requests.get("https://raw.githubusercontent.com/Shiphero/pastebin/main/mgaitan/15e1VFq87UA.txt",
...: headers={"Authorization": f"token {os.environ['SHBIN_GITHUB_TOKEN']}", "Accept": "application/vnd.github.v3.raw'"}
...: )
Out[16]: <Response [200]>
In [17]: Out[16].content[:144]
Out[17]: b" \n> mock_sns.push.assert_called_once_with(topic_arn, message)\nE AssertionError: expected call not found.\nE Expected: push('"
The good news is that passing a token when fetching public content doesn't affect the request, so this could be the only code:
In [20]: requests.get("https://raw.githubusercontent.com/facebook/react/main/README.md",
...: headers={"Authorization": f"token {os.environ['SHBIN_GITHUB_TOKEN']}", "Accept": "application/vnd.github.v3.raw'"}
...: )
Out[20]: <Response [200]>
| except Exception as e: | ||
| print(f"[red]x[/red] {e}") |
Which other exceptions could be raised here?
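Taking the last two comments together, a single download path with narrow exception handling might look like the sketch below. It uses the standard library's urllib rather than requests so that it runs without extra dependencies; with requests the equivalent would be catching requests.exceptions.RequestException. The names `build_request` and `download`, and the `opener` parameter, are hypothetical.

```python
import urllib.error
import urllib.request

RAW_BASE = "https://raw.githubusercontent.com"


def build_request(full_name, path, token=None):
    """Build the request for a raw file; `path` starts with the branch or ref."""
    headers = {}
    if token:
        # Per the comment above, sending the token for public content is harmless.
        headers["Authorization"] = f"token {token}"
    return urllib.request.Request(f"{RAW_BASE}/{full_name}/{path}", headers=headers)


def download(full_name, path, token=None, opener=urllib.request.urlopen):
    """Fetch raw bytes; `opener` is injectable so tests can avoid the network."""
    request = build_request(full_name, path, token)
    try:
        with opener(request) as response:
            return response.read()
    except urllib.error.HTTPError as e:
        # Must come first: HTTPError is a subclass of URLError.
        raise RuntimeError(f"GitHub answered {e.code} for {request.full_url}") from e
    except urllib.error.URLError as e:
        raise RuntimeError(f"Could not reach {request.full_url}: {e.reason}") from e
```

Catching only the network-related errors keeps programming mistakes (a KeyError, a typo) visible instead of printing them as a download problem.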
| with pytest.raises(Exception) as exc_info: | ||
| raise Exception("There was a problem with your download, please check url") | ||
| assert str(exc_info.value) == "There was a problem with your download, please check url" |
This makes no sense, Alvar. It's a self-fulfilling prophecy: "I will prove that the following code raises an exception", and the following code is... literally raise Exception().
| working_dir = tmp_path / "working_dir" | ||
| working_dir.mkdir() |
we don't need a subfolder. tmp_path is already a temporary Path instance.
| requests_mock.get("https://raw.githubusercontent.com/another_awesome/repository/main/hello.md", status_code=400) | ||
| working_dir = tmp_path / "working_dir" | ||
| working_dir.mkdir() | ||
| os.chdir(working_dir) |
It's not good practice to change the directory this way. When needed, use monkeypatch.chdir so it's undone automatically on teardown (even if the test fails or errors). https://docs.pytest.org/en/7.1.x/reference/reference.html#pytest.MonkeyPatch.chdir
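The two fixture comments above could combine into a test shaped roughly like this. `save_in_cwd` is a hypothetical stand-in for the real `main(["dl", ...])` call; `monkeypatch` and `tmp_path` are the built-in pytest fixtures.

```python
from pathlib import Path


def save_in_cwd(name, content):
    """Hypothetical stand-in for the command under test: writes into the cwd."""
    Path(name).write_bytes(content)


def test_download_lands_in_cwd(monkeypatch, tmp_path):
    # tmp_path is already a fresh temporary directory, so no subfolder is needed.
    monkeypatch.chdir(tmp_path)  # restored automatically at teardown
    save_in_cwd("hello.md", b"# hello")
    assert (tmp_path / "hello.md").read_bytes() == b"# hello"
```

Because pytest restores the working directory itself, a failure in the middle of the test does not leak the changed directory into later tests.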
| os.chdir(working_dir) | ||
| main(["dl", "https://github.com/another_awesome/repository/blob/main/hello.md"]) |
It would be nice to have an option to define where to download the contents, like
shbin dl URL [-o path]
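If such an option were added, resolving the destination might work along these lines. This is only a sketch: `resolve_destination` is a hypothetical helper, and the rule that an existing directory or a trailing slash means "save inside it" is an assumption, not something the PR defines.

```python
from pathlib import Path
from typing import Optional


def resolve_destination(remote_path: str, output: Optional[str] = None) -> Path:
    """Decide where a downloaded file should be written."""
    name = Path(remote_path).name
    if output is None:
        # No -o given: keep the current behaviour and save into the cwd.
        return Path.cwd() / name
    target = Path(output)
    # An existing directory (or a trailing slash) means "put the file inside it".
    if target.is_dir() or output.endswith("/"):
        return target / name
    return target
```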
Add the ability to download from any public github URL