This repository has been archived by the owner on Feb 25, 2023. It is now read-only.

Deinflect できる to する #2266

Open: tomtung wants to merge 3 commits into base: master

@tomtung commented Nov 2, 2022

「できる」 should be de-inflected as the potential form of 「する」.

Otherwise, for example, Yomichan correctly matches 「全うする」 to the meaning of "to accomplish / to fulfill / to carry out", but incorrectly matches 「全うできる」 to 「真っ当・全う・真当」 with meaning "proper / respectable / decent / honest".
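To illustrate the idea, here is a minimal sketch of a suffix-rewrite rule that maps できる back to する. The rule table, its field names (`suffixIn`, `suffixOut`, `reason`), and the `deinflect` function are hypothetical and simplified for this example; they are not copied from Yomichan's actual rule file or deinflector.

```javascript
// Hypothetical, simplified rule table. Field names are illustrative only.
const rules = [
    {reason: 'potential', suffixIn: 'できる', suffixOut: 'する'},
];

// Return the source text itself plus one candidate per matching suffix rule.
function deinflect(text) {
    const results = [{term: text, reasons: []}];
    for (const {reason, suffixIn, suffixOut} of rules) {
        if (text.endsWith(suffixIn)) {
            const stem = text.slice(0, text.length - suffixIn.length);
            results.push({term: stem + suffixOut, reasons: [reason]});
        }
    }
    return results;
}

// Logs the two candidate terms: 全うできる and 全うする
console.log(deinflect('全うできる').map((r) => r.term));
```

With a rule like this, the candidate 全うする becomes available for dictionary lookup, so the "to accomplish" entry can be matched instead of only the entry for 全う.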

@toasted-nutbread
Collaborator

I think I had considered adding this at some point, but didn't for whatever reason, maybe because I thought it might have false positives or something. One thing that comes to mind is that scanning できる in isolation might result in する being the first result, but testing your branch, there doesn't seem to be any issue.

If this is added, there are a few things we'd probably want:

  • Rebasing your branch since it's out of date.
  • 出来る probably also needs to be handled.
  • Test cases updated to validate the changes, added in test-deinflector.js. I can help with this.

@tomtung
Author

tomtung commented Nov 12, 2022

> I think I had considered adding this at some point, but didn't for whatever reason, maybe because I thought it might have false positives or something.

Yeah, totally understand. I also thought that leaving out the de-inflection of できる to する didn't matter, until I encountered the relatively uncommon cases where attaching 「する」 somewhat changes the meaning of a word. 「全う」 vs 「全うする」 is one such example, as mentioned above; 「糊」 vs 「糊する」 (as in 「口を糊する」) is another. Without de-inflection of できる to する, we wouldn't be able to match 「口を糊できる」 to 「口を糊する」.

> Rebasing your branch since it's out of date.

Can you clarify what you are referring to here? I don't see any merge conflicts, and the unit tests are passing. I think when you choose to accept a pull request, you can choose to rebase instead of merge if that's what you mean, but I don't have access to that.

> 出来る probably also needs to be handled.

Done. Although, coming back to the concern over potential false positives, if this turns out to be too noisy, only de-inflecting できる while leaving 出来る alone might be a reasonable compromise, since the use of the latter to denote the potential form of する seems a lot less common.
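As a rough sketch of that compromise, the kanji form could sit behind a switch so it can be dropped if it proves too noisy. The `buildSuruPotentialRules` helper, the `includeKanji` flag, and the rule field names are all hypothetical and invented for this illustration; they are not existing Yomichan options or APIs.

```javascript
// Hypothetical helper: builds the potential-form rules for する.
// `includeKanji` is an illustrative switch, not an existing Yomichan option.
function buildSuruPotentialRules({includeKanji = true} = {}) {
    const rules = [{suffixIn: 'できる', suffixOut: 'する'}];
    if (includeKanji) {
        rules.push({suffixIn: '出来る', suffixOut: 'する'});
    }
    return rules;
}

// Apply every rule whose input suffix matches the end of the text.
function candidates(text, rules) {
    return rules
        .filter(({suffixIn}) => text.endsWith(suffixIn))
        .map(({suffixIn, suffixOut}) => text.slice(0, text.length - suffixIn.length) + suffixOut);
}

// With the kanji rule enabled, 勉強出来る yields the candidate 勉強する.
console.log(candidates('勉強出来る', buildSuruPotentialRules()));
// With it disabled, no candidate is produced for the kanji spelling.
console.log(candidates('勉強出来る', buildSuruPotentialRules({includeKanji: false})));
```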

> Test cases updated to validate the changes, added in test-deinflector.js.

Done. Confirmed that `npm test` passes after the change.
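For readers unfamiliar with this kind of test, a data-driven check might look roughly like the sketch below. The case format, the `canDeinflect` helper, and the rule list are assumptions made for this example; they do not reproduce the actual structure of test-deinflector.js.

```javascript
// Hypothetical rule list covering both spellings discussed in this PR.
const suruPotentialSuffixes = ['できる', '出来る'];

// True if `source` can be rewritten to `term` by swapping a potential suffix for する.
function canDeinflect(source, term) {
    return suruPotentialSuffixes.some((suffix) =>
        source.endsWith(suffix) &&
        source.slice(0, source.length - suffix.length) + 'する' === term
    );
}

// Illustrative cases: positive matches plus one that must not match.
const cases = [
    {source: '全うできる', term: '全うする', valid: true},
    {source: '口を糊できる', term: '口を糊する', valid: true},
    {source: '全う出来る', term: '全うする', valid: true},
    {source: '全うする', term: '全うできる', valid: false},
];

// Collect every case whose actual result differs from the expected one.
const failures = cases.filter(({source, term, valid}) => canDeinflect(source, term) !== valid);
console.log(failures.length === 0 ? 'all cases pass' : failures);
```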
