Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix fullPage rendering and loading #65

Closed
wants to merge 4 commits into from

Conversation

brandon93s
Copy link
Contributor

@brandon93s brandon93s commented Nov 25, 2020

Opening for discussion/proposal. This PR does the following:

  • Fixes the scroll implementation for fullPage
  • Implements an auto-scale-down to capture full-page without repeat

Benefits/drawbacks:

  • Will scale down the device ratio until the image fits
  • Works well for small to medium-sized websites
  • Can result in a low-quality images for massive pages
  • Single image implementation; no complex image stitching

Todo:

  • Proceed w/ simple scale-down approach?
  • Document scaling approach when using fullPage
  • Tests

Expand for sample screenshots

jquery
apple
espn

Expand for related links

Related:

#1 #49 #50 #58 #59

// Scroll one viewport at a time, pausing to let content load
const viewportHeight = viewportOptions.height;
let viewportIncrement = 0;
while (viewportIncrement + viewportHeight < bodyBoundingHeight) {
Copy link
Contributor Author

@brandon93s brandon93s Nov 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop was inaccessible before due to bodyBoundingHeight being an object. autoScroll below accomplishes the same goal of scrolling to force content load.

@brandon93s brandon93s marked this pull request as ready for review November 25, 2020 23:12
const maxTextureSize = await page.evaluate(() => {
const canvas = document.createElement('canvas');
const webGL = canvas.getContext('webgl');
return webGL.getParameter(webGL.MAX_TEXTURE_SIZE);
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAX_TEXTURE_SIZE is the number of pixels we can go in any dimmension before duplicate rendering will occur. This is determined at runtime as it can/will be different per environment. Using this, we can adjust our deviceScaleFactor to fit the page within this dimension.

Note: The same issue can technically happen horizontally. If there's a use case out there, we should scroll and scale in both dimensions ( maybe a follow-up PR? ).

@@ -119,8 +119,6 @@ const parseCookie = (url, cookie) => {
return returnValue;
};

const imagesHaveLoaded = () => [...document.images].map(element => element.complete);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was returning an array of true/false before; always truthy and thus never waited.

@sindresorhus
Copy link
Owner

Thanks for working on this 🙌🏻

@sindresorhus
Copy link
Owner

Will scale down the device ratio until the image fits

I don't see how that will work in practice. Most websites are long enough to make the quality unusable.

return !isBottom;
};

while (await autoScroll()) { /* eslint-disable-line no-await-in-loop */
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have a timeout? Otherwise, it would just go on forever on sites with infinite auto-loading.

/* eslint-disable no-undef */
window.scrollTo(0, 0);
/* eslint-enable no-undef */
// Workaround for chromium height limitations: https://bugs.chromium.org/p/chromium/issues/detail?id=770769#c12
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Workaround for chromium height limitations: https://bugs.chromium.org/p/chromium/issues/detail?id=770769#c12
// Workaround for Chromium height limitations: https://bugs.chromium.org/p/chromium/issues/detail?id=770769#c12

@brandon93s
Copy link
Contributor Author

Will scale down the device ratio until the image fits

I don't see how that will work in practice. Most websites are long enough to make the quality unusable.

Review the sample screenshots in the description; hopefully, those can inform whether this is a viable compromise. Or, at a minimum, a bridge to a proper chromium fix. Otherwise, we'd need to explore the "image stitching" approach.

@sindresorhus
Copy link
Owner

Review the sample screenshots in the description; hopefully, those can inform whether this is a viable compromise. Or, at a minimum, a bridge to a proper chromium fix. Otherwise, we'd need to explore the "image stitching" approach.

I don't think it's viable. Yes, it can work on some sites, but for most, it cannot. And I don't like that the flag makes everything nondeterministic. You have no idea of the quality of the screenshot upfront. I'd honestly rather have no --full-page flag than a broken one.

@sindresorhus
Copy link
Owner

Could we have a temporary flag that changed --full-height to mean as much as possible by scaling while still keeping the set width/height options?

@brandon93s
Copy link
Contributor Author

brandon93s commented Nov 29, 2020

Could we have a temporary flag that changed --full-height to mean as much as possible by scaling while still keeping the set width/height options?

To make sure I understand: are you proposing a new flag / option called full-height which would work as proposed in this PR as an alternative to full-page while we wait on Chromium or implement an alternative? Would we remove full-page in the interim since it's broken?

@sindresorhus
Copy link
Owner

No, just a typo, fixed:

Could we have a temporary flag that changed --full-page to mean as much as possible by scaling while still keeping the set width/height options?

@sindresorhus
Copy link
Owner

Basically, scale the site down as much as possible while still respecting the set width and then capture as much height as possible, although it won't be everything.

@sindresorhus
Copy link
Owner

Maybe you can find some inspiration in https://github.com/morteza-fsh/puppeteer-full-page-screenshot

Base automatically changed from master to main January 23, 2021 08:29
@sindresorhus
Copy link
Owner

Does https://github.com/puppeteer/puppeteer/releases/tag/v7.0.0 help at all with this?

@brandon93s
Copy link
Contributor Author

Unfortunately, the underlying chromium limitation is still outstanding and is not addressed by the puppeteer release.

@Kikobeats
Copy link
Contributor

Kikobeats commented Nov 6, 2021

I tested this PR locally with puppeteer v11 and I didn't any change compared with the default behavior.

target: https://www.apple.com/es/iphone/

before ![before](https://user-images.githubusercontent.com/2096101/140619676-a0df96d9-5306-4074-bd55-59f435513b92.png)
after ![after](https://user-images.githubusercontent.com/2096101/140619683-5f55c697-8141-48eb-acbb-2f367480ef38.png)

@sindresorhus
Copy link
Owner

Closing as this is not moving forward and we cannot keep it open forever.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants