WM: Pass response charset down to HTML::Form objects #225

Open
spazm opened this issue May 8, 2017 · 0 comments

spazm commented May 8, 2017

[email protected] reported on Jun 10, 2009

What steps will reproduce the problem?
1. Request a page with unusual charset.
2. Fill some input fields with non-ascii characters.
3. Submit the form.

What is the expected output? What do you see instead?

The generated HTTP request will either fail, because HTTP::Message chokes on
a Unicode string, or always use UTF-8 encoding, depending on how recent the
installed HTML::Form is.

But it should use the charset of the page from which the form was parsed.
It is known only to the user agent.
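
To make the difference concrete, here is a minimal sketch of how the same form value is percent-encoded under two charsets. It is written in Python purely as a language-neutral illustration and is not the WWW::Mechanize or HTML::Form code; the helper `encode_form`, the field name, and the ISO-8859-1 page charset are all assumptions made for the example.

```python
from urllib.parse import urlencode


def encode_form(fields, page_charset="utf-8"):
    """Percent-encode form fields in the charset of the page the form came from.

    Hypothetical helper for illustration only; page_charset stands in for the
    charset the user agent learned from the HTTP response.
    """
    return urlencode(fields, encoding=page_charset)


fields = {"name": "café"}

# Always encoding as UTF-8, regardless of the page the form came from.
as_utf8 = encode_form(fields)

# Encoding in the charset of the originating page (assumed ISO-8859-1 here).
as_latin1 = encode_form(fields, page_charset="iso-8859-1")

print(as_utf8)    # name=caf%C3%A9
print(as_latin1)  # name=caf%E9
```

A server that expects ISO-8859-1 would misread the UTF-8 bytes, which is why the charset has to travel from the response down to the form.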

Please provide any additional information below.

Our patch adds support for newer HTML::Form with accept_charset() method.
See
http://github.com/gisle/libwww-perl/tree/ff583c4b194eb7437c71f6fb659ae03b9bffce70

There are tests and POD. Compatibility with older HTML::Form versions is preserved.

Details

Imported from Google Code issue 101 via archive

  • Type: Defect
  • Date: Jun 10, 2009
  • Reporter: [email protected]
  • Owner: ----
  • Priority: Medium
  • Status: New
  • Labels: WM

Comments

[email protected] commented on Jul 6, 2009 :

(No comment was entered for this change.)
  • Summary : WM: Pass response charset down to HTML::Form objects
  • Labels : WM

[email protected] commented on Jul 15, 2009 :

I followed the guidelines in your charset patch against Mechanize-1.58 and tried to
run make test on the module. I get this error: Can't locate object method "charset" via
package "HTTP::Headers" at /usr/local/share/perl/5.10.0/HTTP/Message.pm line 627.
Do you have any explanation?

thanks,
George Veranis

[email protected] commented on Aug 18, 2009 :

I'm not sure that accept_charset is the way to go. People might expect that setting
to correspond to the accept-charset parameter of the HTML form tag.
http://www.w3.org/TR/1999/REC-html401-19991224/interact/forms.html#adef-accept-charset

Instead, it might be better to pass the charset to HTML::Form->parse which then
assigns those to the default_charset entry in the Form hash. It's rather new, though:
http://github.com/gisle/libwww-perl/commit/f13b3181f9d0140d83313233b5cbf0cb7ce4ee02
Included since libwww-perl 5.831. Looking at the timing I wonder whether that feature
was added in response to this report here...?

I guess a simple "can" check for this feature in HTML::Form won't work, so you'd have
to use the version number to determine the presence of this feature. I also think
that using accept-charset as a fallback might be desirable, but as I tend to use
bleeding-edge versions, I don't care overly much.

[email protected] commented on Oct 20, 2010 :

> But it should use the charset of the page from which the form was parsed.

I face the same problem.

I agree with [email protected] and made a new patch.
This patch works well with both libwww-perl-5.827 (in github) and 5.837.

How about this patch?

[email protected] commented on Oct 21, 2010 :

This discussion needs to happen on the mailing list so the public can easily see it.

Thanks,
Andy

[email protected] commented on Oct 21, 2010 :

OK, I'll try to post it on WWW::Mechanize users ML. Thank you for your response.

[email protected] commented on Oct 22, 2010 :

What mailing list would that be? [email protected]? Looking at its archive at http://dir.gmane.org/gmane.comp.lang.perl.modules.lwp there seems to be pretty little activity there recently. http://lists.cpan.org/showlist.cgi?name=libwww as linked from p/www-mechanize/ seems to be down at least today. http://lists.perl.org/list/libwww-perl.html has almost no information, in particular no archives at all, so none that could provide public access to more recent posts either, if there are such posts.

I'd like to be able to follow this discussion even without subscribing to a mailing list. That's the reason I like bug trackers: I can subscribe to those issues that affect me, and don't have to filter out those that don't. So please keep people subscribed to this issue informed as well.

[email protected] commented on Oct 22, 2010 :

Hmm, I sent the message to the following ML.

http://groups.google.com/group/www-mechanize-users/

I'm a new member of this ML and waiting for my message to be moderated.

[email protected] commented on Aug 13, 2011 :

Wouldn't passing HTML::Form->parse the HTTP::Response itself solve this issue? Also, it would have the added benefit of reducing memory usage a bit, because it would in effect pass the HTML by reference instead of creating another copy of it.