Skip to content

Commit

Permalink
some cosmetic changes
Browse files Browse the repository at this point in the history
it's unlikely for an img src to contain line breaks, so it shouldn't be necessary to one_string(html) beforehand and split it afterwards
  • Loading branch information
yihui committed Dec 5, 2023
1 parent f5237af commit 0951a2f
Showing 1 changed file with 7 additions and 10 deletions.
17 changes: 7 additions & 10 deletions R/base64.R
Original file line number Diff line number Diff line change
@@ -1,33 +1,30 @@
# processes an HTML resource, given a regular expression that locates
# instances of that resource
process_html_res <- function(html, reg, processor) {
html <- one_string(html)
process_img_src <- function(img_src) {
m <- gregexpr(reg, html, perl = TRUE, ignore.case = TRUE)
regmatches(html, m) <- lapply(regmatches(html, m), function(img_src) {
src <- sub(reg, '\\1', img_src, ignore.case = TRUE)
vapply(
seq_along(img_src),
function(i) processor(img_src[[i]], src[[i]]),
character(1)
)
}
m <- gregexpr(reg, html, perl = TRUE, ignore.case = TRUE)
regmatches(html, m) <- lapply(regmatches(html, m), process_img_src)

strsplit(html, "\n", fixed = TRUE)[[1]]
})
html
}

process_images <- function(html, processor) {
process_html_res(html, "<\\s*img\\s+.*?src\\s*=\\s*[\"']([^\"']+)[\"']", processor)
}

base64_encode_images <- function(html) {
base64_encode_img <- function(img_src, src) {
encode <- function(img_src, src) {
in_file <- utils::URLdecode(src)
if (length(in_file) && file.exists(in_file)) {
img_src <- sub(src, xfun::base64_uri(in_file), img_src, fixed = TRUE)
}
img_src
}
html <- process_images(html, base64_encode_img)
process_html_res(html, "<[^>]*style=\"[^\"]*url\\(([^\\)]+)\\)", base64_encode_img)
html <- process_images(html, encode)
process_html_res(html, "<[^>]*style=\"[^\"]*url\\(([^\\)]+)\\)", encode)
}

3 comments on commit 0951a2f

@cderv
Copy link
Collaborator

@cderv cderv commented on 0951a2f Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yihui I believe this breaks some edge cases where the HTML contains <img and src= on separate lines

<p><img
src="C:/Users/chris/Documents/DEV_OTHER/00-TESTS/pkgdown.gganimate/docs/articles/gganimate_files/figure-html/unnamed-chunk-2-1.gif" /><!-- --></p>

For example, it prevents some absolute links to be made relative the html_document_base processing.

Took me a while to come to here, but I believe this is why this one_string calls was there - to avoid dealing with new lines like this.

I think what is causing the issue is the pandoc 2.17 change in --wrap=auto being the default for HTML which creates this wrapping when src= is very very long.
It happens with knitr::include_graphics() usage or any other that will write the ![](very/long/path.qmd)

> pandoc::pandoc_convert(text = "![](C:/Users/chris/Documents/DEV_OTHER/00-TESTS/pkgdown.gganimate/docs/articles/plots_files/figure-html/verylongnametohaveverylongfileagainveryveryveryveryveryveryverylong-1.png)", to = 'html', args = "--wrap=none")
<p><img src="C:/Users/chris/Documents/DEV_OTHER/00-TESTS/pkgdown.gganimate/docs/articles/plots_files/figure-html/verylongnametohaveverylongfileagainveryveryveryveryveryveryverylong-1.png" /></p>

> pandoc::pandoc_convert(text = "![](C:/Users/chris/Documents/DEV_OTHER/00-TESTS/pkgdown.gganimate/docs/articles/plots_files/figure-html/verylongnametohaveverylongfileagainveryveryveryveryveryveryverylong-1.png)", to = 'html', args = "--wrap=auto")
<p><img
src="C:/Users/chris/Documents/DEV_OTHER/00-TESTS/pkgdown.gganimate/docs/articles/plots_files/figure-html/verylongnametohaveverylongfileagainveryveryveryveryveryveryverylong-1.png" /></p>

So either we do something about --wrap, or we revert this change not doing a one_string

@yihui
Copy link
Member Author

@yihui yihui commented on 0951a2f Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the oversight!

So either we do something about --wrap, or we revert this change not doing a one_string

I'm okay with either solution. The latter may be safer, though. Do you want me to do it or you can go ahead and do it? I think it should be two lines of code.

@cderv
Copy link
Collaborator

@cderv cderv commented on 0951a2f Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do it. By the way, this goes with a second issue in knitr::include_graphics() where the absolute path is not computed using output.dir= from rmarkdown but using knitr output.dir option which is different.
fig_path will be computed to be output directory as set by R Markdown, but if someone like magick or gganimate creates image in fig_path and then attached using include_graphics() - the wrong relative path is computed.

I'll try to open an issue but this is related to yihui/knitr#2171 IMO. We can discuss live too

Please sign in to comment.