Feature Request: blocklist of certain domains #7
Comments
Thanks for the report. What type of page was that, out of curiosity? I'm wondering what HTML would cause it to slow down like that. A blacklist is a good idea, though. I think this has been mentioned elsewhere, along with a way to purge things that have already been imported.
It was output from a decompiled Android app via MobSF. Mostly just random
strings pulled from smali code, but probably many MB of them. Happy
to show you more if curious.
(Sent by email in reply to Ian Sinnott's comment of Sat, Mar 2, 2024, 16:01.)
Ah yeah, it's probably a large amount of data, which is an edge case the extension doesn't handle well. There's no special logic for something like "If this page is 12 MB of binary data", so it's not surprising it's slow.
To do the blacklist, we would also need to remove existing entries that match. A simpler approach might be some kind of heuristic to stop indexing if the detected page content is unusually large.
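A minimal sketch of what such a size heuristic could look like, assuming the extension has the extracted page text available as a string before indexing. The names `MAX_INDEXABLE_CHARS`, `IndexDecision`, and `decideIndexing`, as well as the threshold value, are hypothetical and not taken from the extension's code.

```typescript
// Hypothetical threshold (in characters); the extension does not define this value.
const MAX_INDEXABLE_CHARS = 2_000_000;

// Result of the check: whether to index, plus a reason when skipping.
interface IndexDecision {
  shouldIndex: boolean;
  reason?: string;
}

// Skip indexing when the extracted text is unusually large.
function decideIndexing(
  textContent: string,
  maxChars: number = MAX_INDEXABLE_CHARS
): IndexDecision {
  if (textContent.length > maxChars) {
    return {
      shouldIndex: false,
      reason: `content length ${textContent.length} exceeds limit ${maxChars}`,
    };
  }
  return { shouldIndex: true };
}
```

A check like this only needs the length of the text, so it can run before any heavier processing of the page.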
On some websites, the Readability library can cause paragraphs to go missing from the saved text. It would be nice to also implement a user-defined "reader mode blacklist" to disable Readability on those websites.
Blacklist is coming in the next release. Not custom readability, at least not yet, but the ability to block certain URL patterns from getting indexed.
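For illustration, URL pattern blocking could be sketched roughly as below. This is an assumption about how such a feature might work, not the released implementation: the glob-style syntax (where `*` matches any run of characters) and the function names `patternToRegExp` and `isBlocked` are hypothetical.

```typescript
// Hypothetical pattern syntax: "*" matches any run of characters,
// everything else is matched literally against the full URL.
function patternToRegExp(pattern: string): RegExp {
  // Escape regex metacharacters except "*", which is handled next.
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Returns true when the URL matches at least one blocked pattern.
function isBlocked(url: string, patterns: string[]): boolean {
  return patterns.some((pattern) => patternToRegExp(pattern).test(url));
}
```

With a pattern such as `http://localhost*`, a check like `isBlocked(tabUrl, patterns)` could run before the page is indexed. The same predicate could also be applied to stored URLs to find existing entries that match.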
FTTF significantly slows down when opening large files on localhost:
It would be good to have a blocklist or ban list to prevent FTTF from running on
localhost
or other domains.