Handle narrow no-break spaces in file names #1397

@cavis

Description

Had an RSS import where some image filenames contained raw narrow no-break spaces (U+202F): non-ASCII and not percent-encoded.

Not sure if GitHub will keep the NNBSP... but:

<image>
  <url>https://philosophytalk.org/wp-content/uploads/2022/06/Screenshot-2025-03-21-at-3.05.25 PM.png</url>
  <title>Is Optimism Rational?</title>
</image>

First up, the EpisodeImports errored, probably because URI.parse raises on these URLs (in file_name). We did end up with an Episode and an EpisodeImage, but the Uncut and the Image did not fire their porter jobs.
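A minimal sketch of the suspected failure, assuming the URL reaches `URI.parse` unmodified. The `escape_non_ascii` helper and the `example.com` URL are hypothetical and not taken from the codebase. It percent-encodes only non-ASCII characters, so a plain ASCII space in a URL would still need separate handling.

```ruby
require "uri"

# U+202F NARROW NO-BREAK SPACE, the character found in the feed's image URL
NNBSP = "\u202F"

# Hypothetical helper (not from the codebase): percent-encode every
# non-ASCII character as its UTF-8 bytes, leaving ASCII untouched.
def escape_non_ascii(url)
  url.gsub(/[^[:ascii:]]/) do |char|
    char.bytes.map { |byte| format("%%%02X", byte) }.join
  end
end

# Illustrative URL only; the host and path are made up.
raw = "https://example.com/uploads/Screenshot-3.05.25#{NNBSP}PM.png"

begin
  URI.parse(raw)
rescue URI::InvalidURIError => e
  # The raw URL is rejected because it contains a non-ASCII character.
  puts "raw URL rejected: #{e.class}"
end

# After escaping, the URL parses and the file name can be extracted.
safe = escape_non_ascii(raw)
puts File.basename(URI.parse(safe).path)
```

Whether to store the escaped or the original file name is a separate decision. The sketch only shows that escaping before parsing avoids the exception.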

Second, going to the episode page returned a 500, again because the file_name is rendered on the page.

Third, you can't put that un-encoded char into the Content-Disposition header (though you can in the object key). So we should probably drop non-ASCII characters from the Content-Disposition?
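One possible shape for that, as a sketch only: `content_disposition_for` is a hypothetical helper, not existing code. It drops non-ASCII characters from the plain `filename` parameter, as proposed above. It also adds the `filename*` parameter in the RFC 5987 / RFC 6266 form so that clients that understand it can still recover the original name. That second part is optional and goes beyond simply dropping the characters.

```ruby
require "erb"

# Hypothetical helper: build a Content-Disposition value whose plain
# `filename` is ASCII-only, plus an optional percent-encoded `filename*`.
def content_disposition_for(file_name)
  # Drop non-ASCII characters, then drop quotes and backslashes so the
  # quoted-string stays well-formed.
  ascii = file_name.gsub(/[^[:ascii:]]/, "").gsub(/["\\]/, "")

  # Percent-encode the original name as UTF-8 bytes for filename*.
  encoded = ERB::Util.url_encode(file_name)

  %(attachment; filename="#{ascii}"; filename*=UTF-8''#{encoded})
end

puts content_disposition_for("Screenshot-3.05.25\u202FPM.png")
```

Dropping the character removes the visual gap in the fallback name (`3.05.25PM.png`). Substituting a regular space instead would be an equally reasonable choice.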

Metadata


    Labels

    Level: Advanced (Most developers should anticipate friction)
    Priority: Normal (Cruising speed)
    Size: M (Consumes a day or two)
