-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improved detection of files and directories in GoogleStorageMixin #129
base: master
Are you sure you want to change the base?
Conversation
The listdir function returns two values, one for directory names and one for file names. These are lists of strings and thus do not have a delete method. Instead of trying to delete each array, we loop through them to build keys that we can delete.
…or_gcs Fixes issues with old methods in GoogleStorageMixin
Fix rmtree method of S3BotoStorageMixin
…ication Fix cs specification that is breaking compilemessages command
Fix issues with old methods in GoogleStorageMixin
filebrowser_safe performed oddly against Google Cloud Storage, inappropriately considering files to be directories and vice-versa. the changeset handles edge cases in ``isfile()`` and correctly iterates the file+dir listing returned by ``listdir()``.
(Build errors appear unrelated.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cleaned this up a bit and left explanatory comments.
filebrowser_safe/storage.py
Outdated
@@ -125,7 +126,7 @@ def rmtree(self, name): | |||
class GoogleStorageMixin(StorageMixin): | |||
|
|||
def isfile(self, name): | |||
return self.exists(name) | |||
return bool(name) and self.exists(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by relying only on exists()
, the method isfile()
inappropriately considered the bucket root to be a file.
filebrowser_safe/storage.py
Outdated
|
||
# Check whether the iterator is empty | ||
# Check for contents | ||
for item in dirlist: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the dirlist
returned by listdir()
is a tuple of (dirs, files)
, so the iteration check was previously not testing anything about the listing -- (it would always return True
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…And on a second pass-through, I rediscovered issues with file/dir detection with GCS, and worked to ameliorate the huge performance hit that the filebrowser takes with (this) cloud storage.
This discussion reveals some of the issue with detecting directories on GCS. The last couple comments in particular refer to the same issue I ran into – that even extant blobs might in fact just be fakes inserted by the web console, to look like "folders."
And the performance tweaks brought down browser directory listing times significantly – for example, one directory that took about 5 seconds for browse
to return now takes 2.3 seconds. (This is still much slower than before. And another directory still takes 7 seconds, which is probably going to be a problem. But, it's better than it was.)
@@ -47,7 +47,7 @@ def __len__(self): | |||
|
|||
@cached_property | |||
def filetype(self): | |||
if self.is_folder: | |||
if not default_storage.type_checks_slow and self.is_folder: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the browse
view performs certain checks – which with GCS are now network calls – over and over, for each object in the directory listing. this one in particular seemed unnecessary, (especially when GCS's concept of a "folder" is so uncertain as it is).
@@ -98,7 +98,9 @@ def browse(request): | |||
raise ImproperlyConfigured(_("Error finding Upload-Folder. Maybe it does not exist?")) | |||
redirect_url = reverse("fb_browse") + query_helper(query, "", "dir") | |||
return HttpResponseRedirect(redirect_url) | |||
abs_path = os.path.join(get_directory(), path) | |||
|
|||
base_dir = get_directory() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_directory()
also requires a network call, and simply holding onto its result, here, made a significant difference.
e04a9c1
to
c7eceb0
Compare
filebrowser_safe performed oddly against Google Cloud Storage,
inappropriately considering files to be directories and vice-versa.
the changeset handles edge cases in
isfile()
and correctly iteratesthe file+dir listing returned by
listdir()
.