
RSS <description></description> content rendered error. #2853

Open
jkryanchou opened this issue Sep 16, 2024 · 2 comments

jkryanchou commented Sep 16, 2024

I subscribed to a Twitter feed through RSSHub. Its items contain nested HTML inside the <description> tag, as shown below.

Original RSS Item Content

...
</item>
<item>
<title>宝玉: ↩️ @maximlott This was the prompt it used: Below is a verbal description of a puzzle, consisting of a 3x3 grid, with the lowest-right square b...</title>
<description><img width="0" height="0" hidden="true" src="https://pbs.twimg.com/media/GXjFIS5XUAEDFos?format=jpg&amp;name=orig" referrerpolicy="no-referrer"><a href="https://twitter.com/dotey" target="_blank" rel="noopener noreferrer"><img width="48" height="48" src="https://pbs.twimg.com/profile_images/561086911561736192/6_g58vEs_normal.jpeg" hspace="8" vspace="8" align="left" referrerpolicy="no-referrer"><strong>宝玉</strong></a>: ↩️ @maximlott Alright, this IQ test image was converted into text, so it doesn't reflect the actual results. At most, it can demonstrate that o1 preview's capabilities surpass those of other models, but we can't say that o1's IQ has reached 120. https://t.co/mWFsmEeIRI https://t.co/Iyo6kj5Qqu<br clear="both"><div style="clear: both"></div><a href="https://pbs.twimg.com/media/GXjFIS5XUAEDFos?format=jpg&amp;name=orig" target="_blank" rel="noopener noreferrer"><img height="150" style="height: 150px;" hspace="4" vspace="8" src="https://pbs.twimg.com/media/GXjFIS5XUAEDFos?format=jpg&amp;name=orig" referrerpolicy="no-referrer"></a><br clear="both"><div style="clear: both"></div><hr><small>Mon Sep 16 2024 05:44:45 GMT+0800 (China Standard Time)</small><br><br><img width="0" height="0" hidden="true" src="https://pbs.twimg.com/media/GXkvtzsWIAA9VMM?format=jpg&amp;name=orig" referrerpolicy="no-referrer"><a href="https://x.com/dotey" target="_blank" rel="noopener noreferrer"><img width="48" height="48" src="https://pbs.twimg.com/profile_images/561086911561736192/6_g58vEs_normal.jpeg" hspace="8" vspace="8" align="left" referrerpolicy="no-referrer"><strong>宝玉</strong></a>: ↩️ @maximlott This was the prompt it used:<br><br>Below is a verbal description of a puzzle, consisting of a 3x3 grid, with the lowest-right square being empty. Please consider the patterns and determine the appropriate answer to fill in the empty square. 
First row, first column: two lines forming a<br clear="both"><div style="clear: both"></div><a href="https://pbs.twimg.com/media/GXkvtzsWIAA9VMM?format=jpg&amp;name=orig" target="_blank" rel="noopener noreferrer"><img height="150" style="height: 150px;" hspace="4" vspace="8" src="https://pbs.twimg.com/media/GXkvtzsWIAA9VMM?format=jpg&amp;name=orig" referrerpolicy="no-referrer"></a><br clear="both"><div style="clear: both"></div><hr><small>Mon Sep 16 2024 13:27:53 GMT+0800 (China Standard Time)</small></description>
<link>https://x.com/dotey/status/1835551155528638868</link>
<guid isPermaLink="false">https://twitter.com/dotey/status/1835551155528638868</guid>
<pubDate>Mon, 16 Sep 2024 05:27:53 GMT</pubDate>
<author>宝玉</author>
</item>
...
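
For reference, here is a minimal Go sketch of how HTML carried in a `<description>` element reaches a feed reader. It assumes the feed delivers the HTML either entity-escaped or wrapped in CDATA, which is the usual convention for RSS. The `rssItem` type, the `decodeItem` helper, and the shortened sample item are hypothetical and exist only for this illustration; they are not taken from Miniflux or RSSHub.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// rssItem models only the two fields needed for this illustration.
type rssItem struct {
	Title       string `xml:"title"`
	Description string `xml:"description"`
}

// decodeItem parses a single <item> element and returns its fields.
// Escaped or CDATA-wrapped HTML in <description> comes back as a raw HTML string.
func decodeItem(raw string) (rssItem, error) {
	var item rssItem
	err := xml.Unmarshal([]byte(raw), &item)
	return item, err
}

func main() {
	// Hypothetical, shortened item: the HTML payload is wrapped in CDATA.
	raw := `<item><title>demo</title><description><![CDATA[<strong>宝玉</strong>: hello<br>]]></description></item>`

	item, err := decodeItem(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// The reader now holds plain HTML and decides how to sanitize and render it.
	fmt.Println(item.Description)
}
```

After decoding, every reader gets the same HTML string, so differences between readers most likely come from what each one does with that string afterwards.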

NetNewsWire renders it correctly, but Miniflux does not.

NetNewsWire

image

Miniflux

image

I configured the scraper rule as div.content, following the "Filter, Rewrite and Scraper Rules" section.
I have searched for a long time and found nothing that helps with this issue. I do not know whether my scraper rule is wrong. Could anyone help me figure it out?

@jkryanchou (Author)

My guess is that the code here is responsible:

...
entry.Content = sanitizer.Sanitize(pageBaseURL, entry.Content)
...

It sanitizes the original content...
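
To make the idea of sanitizing concrete, here is a toy Go sketch of an allowlist filter. It is not Miniflux's sanitizer: the `toySanitize` function, the `allowedTags` list, and the decision to strip every attribute are assumptions made purely for illustration. The regular expression is also a simplification that would be unsafe for real input, where a proper HTML parser is needed.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// tagPattern matches an opening or closing tag and captures the optional slash and the tag name.
var tagPattern = regexp.MustCompile(`<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>`)

// allowedTags is a hypothetical allowlist used only in this illustration.
var allowedTags = map[string]bool{"a": true, "strong": true, "br": true}

// toySanitize drops tags that are not on the allowlist and
// strips all attributes from the tags that are kept.
func toySanitize(input string) string {
	return tagPattern.ReplaceAllStringFunc(input, func(tag string) string {
		m := tagPattern.FindStringSubmatch(tag)
		name := strings.ToLower(m[2])
		if !allowedTags[name] {
			return "" // tag not allowed: remove it, keep surrounding text
		}
		return "<" + m[1] + name + ">"
	})
}

func main() {
	in := `<div style="clear: both"></div><strong>hi</strong><br clear="both">`
	fmt.Println(toySanitize(in))
}
```

The point of the sketch is only that sanitized output can legitimately differ from the feed's raw HTML, so comparing the two is a reasonable first debugging step.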

@jkryanchou (Author)

Could anyone help me figure out what is wrong here?
