
Cannot get r.html because session.get(url) returns a requests Session, not an HTMLSession #586

Open
e-ave opened this issue Sep 24, 2024 · 0 comments

e-ave commented Sep 24, 2024

I cannot figure out how to use the latest version of requests-html.
The README says to do this:

from requests_html import HTMLSession
session = HTMLSession()
r = session.get('https://python.org/')
rendered_html = r.html.render()

but session.get returns a requests.models.Response from the regular requests library, which has no html attribute. session.get should return a requests_html.HTMLResponse, which is the class that provides the html property.
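To pin down which class actually comes back, a small diagnostic along these lines might help. This is only a sketch: describe_response is a hypothetical helper name (it is not part of requests or requests-html), and it uses nothing beyond ordinary attribute lookup, so it should work on any response object.

```python
def describe_response(resp):
    """Summarize what kind of object a session call returned.

    Hypothetical helper, not part of requests or requests-html.
    """
    cls = type(resp)
    return {
        # Fully qualified class name, e.g. "requests.models.Response"
        "class": f"{cls.__module__}.{cls.__qualname__}",
        # True only if the object exposes an `html` attribute
        "has_html": hasattr(resp, "html"),
    }
```

Calling describe_response(r) right after r = session.get(url) would show whether the session produced a plain Response or a subclass that carries html.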

I tried the following. It raises no errors, but it does not give me the HTML of the web page; printing response.html just shows <HTML url='https://urlhere.com'>.

# First make our HTMLSession
session = HTMLSession()
# Then use it to get a regular requests Response
r = session.get(url)
# Then convert our regular Response into an HTMLResponse
response = session.response_hook(r)
print(response.html)
# Now we can access response.html
html_doc = response.html.render()
print(html_doc)

I even tried using plain requests to fetch the HTML, which works fine, but as soon as I pass the response to my HTMLSession, it gets scrubbed and turned into <HTML url='https://urlhere.com'>.
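One thing that may be worth ruling out: <HTML url='...'> looks like the printed representation of the HTML wrapper object rather than the page content. In requests-html the markup itself is, as far as I know, exposed as a string on the wrapper's own html attribute (that is, response.html.html). The sketch below is an assumption-laden helper, with extract_markup being a hypothetical name, that reads that attribute when present and falls back to the plain text body otherwise.

```python
def extract_markup(resp):
    """Return page markup as a string, or None if nothing usable is found.

    Hypothetical helper. Assumes an HTML wrapper object keeps its markup
    in a string attribute named `html`, and that plain responses expose
    the body as `text`.
    """
    html_obj = getattr(resp, "html", None)
    markup = getattr(html_obj, "html", None)
    if isinstance(markup, str):
        return markup
    # Fall back to the regular response body when there is no wrapper
    return getattr(resp, "text", None)
```

If extract_markup(response) returns the expected page source, the content was never scrubbed and only the print output was misleading. Separately, I believe render() updates the object in place rather than returning the rendered markup, so html_doc in the snippet above may simply be None.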
