-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issues with ToC parsing and importing pages. #249
Comments
Issue 2 resolved: We had not deleted the |
RE: Issue 1 I think this might be due to the logic in the When checking if the current item has a lower whitespace count than expected, it seems to assume the item is just one level away, so it always corrects the hierarchy by subtracting 1. In a case where it is two or more levels away, its hierarchy value will be incorrect, so the I've had some success by modifying while item:
# All items in the current directory have been processed
if item.whitespace_count < whitespace_expectation_per_level[hierarchy]:
+ if item.whitespace_count == 0:
+ hierarchy = 0
+ aggregate_dir = Path('.')
+ else:
+ hierarchy = hierarchy - 1
+ parent = parents.pop()
+ aggregate_dir = Path(parent.reference_value).parent
- hierarchy = hierarchy - 1
- parent = parents.pop()
- aggregate_dir = Path(parent.reference_value).parent This passes the unit tests in I also added a new test case to simulate this scenario: pytest.param(
(
item_1 := factories.IndexParsedListItemFactory(
whitespace_count=0, reference_value=(value_1 := "dir1")
),
item_2 := factories.IndexParsedListItemFactory(
whitespace_count=1, reference_value=(value_2 := f"{value_1}/dir2")
),
item_3 := factories.IndexParsedListItemFactory(
whitespace_count=2, reference_value=(value_3 := f"{value_2}/file3.md")
),
item_4 := factories.IndexParsedListItemFactory(
whitespace_count=0, reference_value=(value_4 := "dir4")
),
item_5 := factories.IndexParsedListItemFactory(
whitespace_count=1, reference_value=(value_5 :=f"{value_4}/file5.md")
),
),
("dir", "dir", "file", "dir", "file"),
(
factories.IndexContentsListItemFactory(
hierarchy=1,
reference_title=item_1.reference_title,
reference_value=value_1,
rank=item_1.rank,
),
factories.IndexContentsListItemFactory(
hierarchy=2,
reference_title=item_2.reference_title,
reference_value=value_2,
rank=item_2.rank,
),
factories.IndexContentsListItemFactory(
hierarchy=3,
reference_title=item_3.reference_title,
reference_value=value_3,
rank=item_3.rank,
),
factories.IndexContentsListItemFactory(
hierarchy=1,
reference_title=item_4.reference_title,
reference_value=value_4,
rank=item_4.rank,
),
factories.IndexContentsListItemFactory(
hierarchy=2,
reference_title=item_5.reference_title,
reference_value=value_5,
rank=item_5.rank,
),
),
id="directory following a file nested twice",
) @jdkandersson I would love to hear your thoughts about this and get some more insights into how the hierarchy calculation is designed! I haven't raised a PR because I'm not sure if this is the best approach, and I feel like there are probably more considerations to take into account when modifying this logic. |
I'm taking a look now |
I think the reduction of level by 1 shouldn't be the problem since, if the whitespace count is still lower than expected, on the next loop it should be popped again. I haven't had the time yet, I would write a test potentially here:
|
actually, you already wrote the test, I'll use that as a starting point and see if I can fix the issue |
Ok, also got the test failure, now need to check why |
Part of the reason why I'm not sure we should use the proposed code is that it only takes care of the case where the whitespace count is 0, it should also be solved for cases where there is a 2 or more drop in hierarchy when the parent isn't at the root level |
@jdkandersson Thanks a lot for taking a look! I wrote that code to check whether this was the part of the logic that was failing for multiple level jumps, but if you agree that this is where the hierarchy mis-calculation is happening, then I definitely agree that it should be generalized for other cases. I'll try to come up with a more clean and general solution, and submit a PR. |
The solution is quite simple: if item.whitespace_count < whitespace_expectation_per_level[hierarchy]:
hierarchy = hierarchy - 1
parent = parents.pop()
aggregate_dir = Path(parent.reference_value).parent
continue Just needed to add the |
Would you like to raise a PR with this and the new test case you have written? |
Thanks for finding, reporting and even creating a test case for this! |
Ahh no way, I reached almost the same solution but thought the aggregate directory calculation had to be different. Thanks! I'll be happy to raise the PR with the solution + test case :) |
Issues
index.md
table of contents.Running the action on a repo with noSolved: see this comment./docs
folder creates a complete table of contents inindex.md
, but only imports some of those pages into the repo.Steps taken
The repository has an existing
/docs
folder with all the pages published to Charmhub. This migration was done by an old fork of the discourse-gatekeeper action.Issue 1: Table of contents nesting
We updated our workflow from the fork to
canonical/discourse-gatekeeper@stable
. Then, we re-triggered the workflow with no changes to discourse or GitHub.The action failed due to an index/ToC related error (link to logs):
It fails on this item of the contents table.
The same error happens with
canonical/discourse-gatekeeper@main
.Issue 2 (solved): Action does not import all documentation pages
See this comment
We deleted the entire
/docs
directory to allow the action to re-generate all docs from scratch with the expected structure. However, the action only downloaded a select few pages.This older revision shows the expected pages.
This newer revision shows the result of deleting
/docs
and running the action.This might be an entirely different issue altogether. It happens with both
main
andstable
branches of discourse-gatekeeper.After this, subsequent runs of the action (example run) yield the following error:
The item referenced in the error above is the first page that exists in the table of contents but was not correctly imported to GitHub, so I guess that is why it could not be parsed.
The text was updated successfully, but these errors were encountered: