Releases: adbar/courlan

courlan-0.5.0

13 Oct 14:59
  • more complex language heuristics, use langcodes
  • extended blacklists and whitelists
  • more precise filters and more efficient code
  • support for Python 3.10

courlan-0.4.2

28 Jul 15:37
  • enhanced cleaning
  • fixed language filter

courlan-0.4.1

10 Jun 15:49
  • keep trailing slashes to avoid redirection
  • fixes: normalization and crawlable URLs

courlan-0.4.0

25 May 17:35
  • URL manipulation tools added: extract parts, fix relative URLs
  • filters added: language, navigation and crawls
  • more robust link handling and extraction
  • removed support for Python 3.4
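
The URL manipulation tools listed above (extracting parts, fixing relative URLs) can be illustrated with the standard library alone. This is a minimal sketch, not courlan's own code: the helper names `split_url` and `resolve_link` are hypothetical, and only `urllib.parse` is assumed.

```python
from urllib.parse import urljoin, urlsplit


def split_url(url):
    """Return (scheme, host, path) for an absolute URL (hypothetical helper)."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


def resolve_link(base_url, link):
    """Resolve a possibly relative link against the page it was found on."""
    # urljoin leaves absolute links untouched and resolves "../" segments.
    return urljoin(base_url, link)


page = "https://example.org/blog/post.html?x=1"
print(split_url(page))
print(resolve_link(page, "../about/"))
```

Resolving links against the URL of the page they were found on, rather than against the bare domain, is what lets segments such as `../` be interpreted correctly.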

courlan-0.3.1

19 Feb 17:28
  • improve filter precision

courlan-0.3.0

04 Jan 12:17
  • reduced dependencies: replaced requests with bare urllib3, and tldextract with tld for Python 3.6 upwards
  • better path and fragment normalization

courlan-0.2.3

20 Oct 14:56
  • Python 3.9 compatibility
  • Simplified imports
  • Bug fixes

courlan-0.2.2

21 Sep 14:41
  • English and German language filters
  • Function to detect external links
  • Support for domain blacklisting
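
As a rough illustration of what external-link detection and domain blacklisting involve, the sketch below compares hostnames using the standard library. It is a simplification under stated assumptions and not courlan's implementation: the names `hostname_of`, `is_external_link` and `is_blacklisted` are hypothetical, and comparing full hostnames (minus a leading `www.`) is cruder than comparing registered domains.

```python
from urllib.parse import urlsplit


def hostname_of(url):
    """Lower-cased hostname without a leading 'www.' (hypothetical helper)."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_external_link(url, reference_url):
    """True if the two URLs point to different hosts."""
    return hostname_of(url) != hostname_of(reference_url)


def is_blacklisted(url, blacklist):
    """True if the URL's host is in a user-supplied set of blocked hosts."""
    return hostname_of(url) in blacklist


print(is_external_link("https://other.net/", "https://example.org/"))
print(is_blacklisted("https://spam.example/page", {"spam.example"}))
```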

courlan-0.2.1

02 Sep 13:48
  • Less aggressive strict filters
  • CLI bug fixed

courlan-0.2.0

01 Sep 17:25
  • Cleaner and more efficient filtering
  • Helper functions to scrub, clean and normalize
  • Removed two dependencies with more extensive usage of urllib.parse
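
To give an idea of the kind of normalization `urllib.parse` makes possible without third-party packages, here is a minimal sketch. It is an assumption-laden illustration rather than the library's code: the function name `normalize` is hypothetical, and the chosen rules (trim whitespace, lower-case the host, drop the fragment, keep trailing slashes) are only one plausible set.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Apply a few conservative normalization steps (hypothetical helper)."""
    parts = urlsplit(url.strip())
    # urlsplit already lower-cases the scheme; the host is lower-cased here.
    netloc = parts.netloc.lower()
    # An empty path becomes "/"; existing trailing slashes are kept as-is.
    path = parts.path or "/"
    # The fragment is dropped because it does not change the resource fetched.
    return urlunsplit((parts.scheme, netloc, path, parts.query, ""))


print(normalize("  HTTP://Example.ORG/Path/?q=1#top "))
```

Note that lower-casing the whole network location would also alter any user credentials embedded in the URL, so a stricter version would treat those separately.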