Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes issue where the parser was breaking URLs that contains equal sign #208

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

gabriel-felipe
Copy link

@gabriel-felipe gabriel-felipe commented May 4, 2020

Fixes broken URL when the input is:

"test":https://www.payscale.com/research/US/Degree=Bachelor%27s_Degree%2C_Business_Administration/Salary

Which resulted in

<a href="https://www.payscale.com/research/US/Degree%3DBachelor%27s_Degree%2C_Business_Administration/Salary" rel="noopener noreferrer" target="_blank">test</a>

Type of change

  • Bug fix
  • New feature
  • Breaking change
  • Security fix

Description

The result is a different and 404 URL, technically I guess it's arguable that, in this case, Payscale site could work around that on their end, but I guess it would be good to aim to change as little as possible the given URLs, nowadays an equal sign on most URLs is harmless but changing it could break several URLs that aren't so well handled on the server-side.

Checklist

  • I documented my additions and changes using PHPdoc.
  • I wrote fixtures to cover my additions.
  • $ composer update
  • $ composer test
  • $ composer cs

This is my first contribution to this project, so is there some contribute file that could show how should I organize and cover fixtures?

Also which is the ideal PHP version to use to dev this project? I tried running composer test and composer cs here but both failed.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.07%) to 99.093% when pulling 61b54e9 on entermotion:fixes-urls-with-equal-being-broken-by-parser into 1651585 on textile:master.

@gocom
Copy link
Member

gocom commented May 16, 2020

nowadays an equal sign on most URLs is harmless but changing it could break several URLs that aren't so well handled on the server-side.

According to RFC-3986, = is a reserved character. Having it encoded, or not, may cause two very different URIs and theoretically could cause existing links generated and encoded by PHP-Textile to become broken.

You can see that by the fixtures that are now failing (or a fixture, seems that we only have one URI with this sub-delimiter). Those URLs may become broken, where the server instead of receiving a request that will return a document with = in its name, will receive what it interprets as something else, well sort of...

...now, = is a sub-delimiter, so it may not really matter, but evident by PayScale mentioned, they seems not to be using it like that, but are interpreting it outside its query (and other) fragment context (like as some sort of parameter assignment or something else), or aren't just handling percent encoding at all (and treat that request as is).

Now we do handle query, and other fragments, accordingly, and do not encode = and other sub-delimiters within them, basically trying to follow RFC-3986, but that has its issues and limitations due to Textile being a human-useable formatting language that tries to do sanitization and formatting for the layman, ala =.

show how should I organize and cover fixtures?

Generally what parser returns should not change, but in cases like this the fixtures should be updated to pass, but that also means potential backwards compatibility breaks, which can not be released without major version bumps.

I'm also not sure if this a valid bug. I may be wrong, and will accept this type of change if I can get a RFC pointed to me that will say this character does not need to be encoded, and real-world examples where both encoded and not encoded URLs return the same results, and ones where are not returned. Then we can decide whether this character should be encoded or not.

Also which is the ideal PHP version to use to dev this project? I tried running composer test and composer cs here but both failed.

CI runs tests on every mainline release since 5.6 alto way to 7.4 (and recent nightlies, but those do not work due to version constraints in dependencies). CONTRIBUTING.md states:

Running tests requires PHP >=5.6.0 and PCRE with PCRE_UTF8 support

Running it on Windows may need some extra work or running it on that unix subsystem thing I hear is available on Windows 10. PHP-Textile itself works alright (library not being system dependant), but test dependencies may not, it not being tested on Windows.

Upcoming PHP-Textile 4.x (4.0 branch) the uses docker containers to have its on array of environment, cutting dependencies to just Docker and allows testing the library against all supported PHP versions locally.

@gocom gocom self-assigned this May 1, 2022
@gocom gocom added this to the 4.0.0 milestone May 1, 2022
@gocom gocom modified the milestones: 4.0.0, 5.0.0 Dec 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants