@Jamerson-santos Jamerson-santos commented Jan 27, 2026

Description

This PR addresses the technical debt tracked in issue #2179 by fixing the root cause of trailing <p> paragraphs being persisted in the report editor content stored in the database.

Previously, the frontend had to apply a temporary workaround (e.g., removeTrailingParagraph) to sanitize API responses before loading them into the editor, because some reports were saved with an extra trailing <p> node. This PR prevents saving reports with trailing paragraphs at the persistence layer and includes a data cleanup to normalize existing records.

As a result:

  • Reports are stored without trailing paragraphs
  • Frontend filtering/workaround is no longer required
  • The editor works correctly with TrailingNodeExtension
  • Existing reports remain compatible after cleanup/migration
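
A minimal sketch of the kind of cleanup described above, assuming the editor content is stored as Remirror/ProseMirror-style JSON in which a `<p>` is a node of type `"paragraph"`. The `EditorNode` shape and the `stripTrailingParagraph` name are hypothetical and used for illustration only; this is not the PR's actual `removeTrailingParagraph` implementation.

```typescript
// Hypothetical minimal shape of a ProseMirror/Remirror-style JSON node.
interface EditorNode {
  type: string;
  content?: EditorNode[];
  text?: string;
}

/** True when the node is a paragraph with no children. */
function isEmptyParagraph(node: EditorNode): boolean {
  return node.type === "paragraph" && (node.content ?? []).length === 0;
}

/**
 * Returns a copy of the document without its last top-level node when that
 * node is an empty paragraph. The input is never mutated, and a document
 * without a trailing empty paragraph is returned unchanged.
 */
function stripTrailingParagraph(doc: EditorNode): EditorNode {
  const children = doc.content ?? [];
  const last = children[children.length - 1];
  if (last === undefined || !isEmptyParagraph(last)) {
    return doc;
  }
  return { ...doc, content: children.slice(0, -1) };
}
```

Only a single trailing node is removed per call, and only when it is empty, so a final paragraph that carries text is left alone. Applying the function to already-clean content is a no-op, which would make it safe to run on every save.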

Related Ticket #2179

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Existing feature enhancement (non-breaking change which modifies existing functionality)

Testing

To validate this change:

Backend / Data validation

  1. Create or update a report and save it.
  2. Verify the persisted editor content in the database no longer contains a trailing <p> paragraph.
  3. Fetch the report via the API and confirm the returned content does not include an extra trailing paragraph.
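
The checks in steps 2 and 3 could be automated with a small predicate such as the one below. This is a sketch under the assumption that the persisted content is Remirror/ProseMirror-style JSON with `"paragraph"` nodes; `StoredNode` and `countTrailingEmptyParagraphs` are hypothetical names, not part of the codebase.

```typescript
// Hypothetical minimal shape of a stored editor node.
interface StoredNode {
  type: string;
  content?: StoredNode[];
}

/**
 * Counts how many consecutive empty paragraphs sit at the end of the
 * document's top-level content. A correctly persisted report is expected
 * to yield 0.
 */
function countTrailingEmptyParagraphs(doc: StoredNode): number {
  const children = doc.content ?? [];
  let count = 0;
  // Walk backwards from the last node until a non-empty node is found.
  for (let i = children.length - 1; i >= 0; i--) {
    const node = children[i];
    const isEmpty =
      node.type === "paragraph" && (node.content ?? []).length === 0;
    if (!isEmpty) {
      break;
    }
    count++;
  }
  return count;
}
```

A count above 1 would also point to the duplicate trailing paragraphs that the editor regression checks look for.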

Backward compatibility

  1. Test loading reports created before this fix (including reports that previously contained trailing <p> elements).
  2. Confirm the editor loads successfully and allows inserting new nodes (e.g., questions) normally.

Editor regression checks (core workflow)

  1. Go to the Claim page.
  2. Click on any sentence to open the drawer.
  3. In “Which questions should the verification answer?”, try adding/removing/reinserting question nodes multiple times.
  4. Confirm nodes are inserted at the correct position and no nested/duplicate trailing paragraphs appear in the resulting HTML structure.

Impacted scenarios:

  • Saving report editor content
  • Loading editor content from API
  • Compatibility with historical reports
  • Interaction with TrailingNodeExtension and insertion logic for new nodes

No special build is required beyond the standard environment. If a migration/script is included, ensure it runs in the target environment before validating.

Developer Checklist

General

  • Code is appropriately commented, particularly in hard-to-understand areas
  • Repository documentation has been updated (Readme.md) with additional steps required for a local environment setup.
  • No console.log or related logging is added.
  • No code is repeated/duplicated in violation of DRY.
  • All new library and controller functions are documented with TSDoc

Frontend Changes

  • No new styling is added through CSS files (Unless it's a bugfix/hotfix)
  • All types are added correctly

Backend Changes

  • All endpoints are appropriately secured with Middleware authentication
  • All new endpoints have an interface schema defined

Tests

  • All existing unit and end to end tests pass across all services
  • Unit and end to end tests have been added to ensure backend APIs behave as expected

Test IDs

  • Include the test ID when adding new tasks or components.
  • Check that test IDs are present in the modified components.

Merge Request Review Checklist

  • An issue is linked to this PR and these changes meet the requirements outlined in the linked issue(s)
  • High risk and core workflows have been tested and verified in a local environment.
  • Enhancements or opportunities to improve performance, stability, security or code readability have been noted and documented in the project's GitHub issues if not being addressed.
  • Any dependent changes have been merged and published in downstream modules
  • Changes to multiple services can be deployed in parallel and independently. If not, changes should be broken out into separate merge requests and deployed in order.

"machine.context.reviewData.visualEditor.content":
newContent,
},
$unset: {

Collaborator

issue: While safe, this unconditionally unsets fields that may not exist. Consider checking for them first.

Collaborator Author

I added field validations before unsetting the data.
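
One way such a validation could look is sketched below: the update document only receives a `$unset` entry for paths that are actually present in the stored record. The helper names (`hasPath`, `buildUpdate`) and the legacy paths in the usage example are hypothetical; only the `machine.context.reviewData.visualEditor.content` path comes from the diff above. MongoDB treats `$unset` on a missing field as a no-op, so the check mainly keeps the migration explicit about what it touches.

```typescript
/** Hypothetical helper: true when a dotted path such as "a.b.c" exists in obj. */
function hasPath(obj: Record<string, unknown>, path: string): boolean {
  let current: unknown = obj;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null || !(key in current)) {
      return false;
    }
    current = (current as Record<string, unknown>)[key];
  }
  return true;
}

interface UpdateSpec {
  $set: Record<string, unknown>;
  $unset?: Record<string, "">;
}

/**
 * Builds an update document that sets the cleaned content and unsets only
 * the legacy paths that exist on the given record.
 */
function buildUpdate(
  record: Record<string, unknown>,
  contentPath: string,
  newContent: unknown,
  legacyPaths: string[]
): UpdateSpec {
  const update: UpdateSpec = { $set: { [contentPath]: newContent } };
  const unset: Record<string, ""> = {};
  for (const path of legacyPaths) {
    if (hasPath(record, path)) {
      unset[path] = "";
    }
  }
  // MongoDB rejects an empty $unset, so it is attached only when needed.
  if (Object.keys(unset).length > 0) {
    update.$unset = unset;
  }
  return update;
}
```

The resulting object could then be passed as the update argument of `collection.updateOne(filter, update)` inside the migration.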

const schema = editorParser.editor2schema(
event.reviewData.visualEditor.toJSON()
const visualEditorJSON = event.reviewData.visualEditor.toJSON();
const cleanedVisualEditor = editorParser.removeTrailingParagraph(

Collaborator

question: Is there a way to prevent the <p> from being added instead of having to remove them afterward?

Collaborator Author

I investigated alternatives, but couldn’t find a native way to prevent the persistence of the trailing paragraph. The TrailingNodeExtension is required for the UX (it allows insertion at the end), but it does not provide an option to exclude it during serialization.

@Jamerson-santos Jamerson-santos self-assigned this Jan 28, 2026

@@ -0,0 +1,171 @@
import { Db } from "mongodb";

Collaborator

@thesocialdev thesocialdev Feb 2, 2026

I advise against this data migration, so that we do not risk corrupting the existing data.

The trailing space is not an issue for published reports, and we could just let the new ones be created appropriately.

Collaborator Author

Thanks for the feedback — your concern about the risk of modifying existing data totally makes sense.

Just to provide a bit more context from the previous fixes: in PR #2160, I fixed a structural issue in the editor by ensuring that the Remirror extensions, especially TrailingNodeExtension, were always properly initialized. Previously, extension instances were being reused and could become inconsistent or overwritten after document mutations, which led to unexpected behavior.

Then, in PR #2173, I added a frontend sanitization step that removed the trailing <p> from the content before passing it to the editor. Since TrailingNodeExtension already adds this node automatically, having an extra <p> caused conflicts when inserting new nodes into the document.

This solved the issue at runtime, but it also made the editor depend on this cleanup step every time data was loaded. To avoid relying on this permanent frontend sanitization, I considered adding a migration to normalize the data at the source, so both new and existing reports would be consistent and I could remove that extra logic from the editor.

That said, I agree that if the extra <p> doesn’t impact already published reports, it might not be worth the risk of modifying existing data. In that case, I can simply keep the fix to prevent the <p> from being persisted in new reports and continue handling older ones in the frontend, as is currently done.

How do you think we should proceed?
