Improve title text extraction #8

davorpa · 2022-09-17T09:10:02Z

According to the current code:

Lines 92 to 95 in dc53b8c

    
    const [link, ...otherStuff] = listItem; // head of listItem = url, the rest is "other stuff"
    entry.url = link.url;
    entry.title = link.children[0].value;
    // remember to get OTHER STUFF!! remember there may be multiple links!

The first child node, children[0], is used as the resource title without checking whether there are more meaningful tokens. The rest is stripped, which sometimes makes it difficult to search for resources by title.

Therefore, the title part of resource links must be escaped when submitting, and rebuilding the Markdown here is mandatory.
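
To illustrate the problem and one possible direction for a fix, here is a minimal sketch. It assumes the link node follows the usual mdast shape, where leaf nodes such as text and inlineCode carry a value and parent nodes such as link and emphasis carry children. The helper name collectLinkText and the sample node are hypothetical and not taken from the parser's code.

```javascript
// Hypothetical helper, not the parser's actual implementation.
// Recursively concatenates the text of every descendant of an mdast-like node.
function collectLinkText(node) {
    // Leaf nodes such as `text` and `inlineCode` carry their content in `value`.
    if (typeof node.value === "string") {
        return node.value;
    }
    // Parent nodes (`link`, `emphasis`, `strong`, ...) carry `children`.
    if (Array.isArray(node.children)) {
        return node.children.map(collectLinkText).join("");
    }
    // Anything else contributes no text.
    return "";
}

// Hand-written node (an assumption) for the Markdown:
// [The *Little* Book of `C`](https://example.com/book)
const link = {
    type: "link",
    url: "https://example.com/book",
    children: [
        { type: "text", value: "The " },
        { type: "emphasis", children: [{ type: "text", value: "Little" }] },
        { type: "text", value: " Book of " },
        { type: "inlineCode", value: "C" },
    ],
};

console.log(link.children[0].value); // "The " (what the current code keeps)
console.log(collectLinkText(link));  // "The Little Book of C"
```

With a title like this, reading only children[0] keeps the first text fragment, while walking all children recovers the full title.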

Context

See EbookFoundation/free-programming-books#7086
Related to #2 (same workaround)

eshellman · 2022-09-24T18:51:29Z

> first node children[0] is used as resource titles without check if there are more meaningful tokens.

Could you present an example from our current parsed data? Thanks

davorpa · 2022-09-24T23:15:16Z

> > first node children[0] is used as resource titles without check if there are more meaningful tokens.
>
> Could you present an example from our current parsed data? Thanks

Not any more, since EbookFoundation/free-programming-books#7086 has already been fixed. Anyway, I see you merged #11. Should I do anything more?

eshellman · 2022-09-26T21:34:43Z

I think we're good

davorpa mentioned this issue Sep 17, 2022

Parser don't take into account resources organized in sublists (fascicles/parts) #9

Open

davorpa added a commit to davorpa/free-programming-books-parser that referenced this issue Sep 18, 2022

implements getLinkTextFromLinkNodes

8cab19c

Resolves EbookFoundation#8

This was referenced Sep 18, 2022

[BUGFIX-8] Improve extraction of resource titles from Markdown links #11

Merged

improve resources found fixing EbookFoundation/free-programming-books-parser#8 EbookFoundation/free-programming-books#7090

Merged

eshellman closed this as completed in #11 Sep 24, 2022

davorpa mentioned this issue Sep 26, 2022

[FEATURE-13]: support relators role notes as part of creators text #13

Open
