Avoid spidering PDF URLs - causes crash or incomplete record #29

wswtizer · 2016-12-21T20:05:31Z

Problem: I managed to crash the devCenter Uploader by adding a link to a PDF via the 'Create New Document' tab. (The 'Create provisional document' tab behaves differently: it creates a record but doesn't populate the title, so I can't access that record via the UI. Issue #28 opened for that.)

The problem is related to the fact that the tool tries to crawl (spider) the PDF, but there is no data. Every time a document is edited with a blank body, it tries to fetch the content again.

Glynn suggested that the URL could be 'pre-fetched' to get its content-type, and if that is not text/html, the crawler could simply be skipped.
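A rough sketch of what that pre-fetch check could look like, using only the Python standard library. This is not taken from the Uploader code: the function names (`fetch_content_type`, `is_html`, `should_spider`) are made up for illustration, and it assumes the target server answers HEAD requests with a usable Content-Type header, which not every server does.

```python
from typing import Callable, Optional
from urllib.request import Request, urlopen


def fetch_content_type(url: str, timeout: float = 10.0) -> Optional[str]:
    """Send a HEAD request and return the raw Content-Type header, if any."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type")


def is_html(content_type: Optional[str]) -> bool:
    """Return True only for text/html, ignoring parameters such as charset."""
    if not content_type:
        return False
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/html"


def should_spider(
    url: str,
    get_content_type: Callable[[str], Optional[str]] = fetch_content_type,
) -> bool:
    """Decide whether the crawler should run for this URL.

    Hypothetical helper: any network failure or malformed URL is treated
    as "do not spider", so a bad link cannot take the crawler down.
    """
    try:
        return is_html(get_content_type(url))
    except (OSError, ValueError):
        return False
```

With something like this in place, the caller would only hand the URL to the crawler when `should_spider(url)` is true; a PDF link (typically served as application/pdf) would be saved with whatever title and body the user entered. If a server rejects HEAD, a fallback GET that reads only the headers would be needed, which this sketch leaves out.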

Workaround suggested: for a PDF URL, make sure the record has a title, and put some words in the body field when creating a new record with the 'Create New Document' tab - this would avoid the attempt to fetch again.
