-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PE-219] chore: new live server endpoint to convert description_html to all other formats #6310
Conversation
WalkthroughThis pull request introduces a comprehensive document conversion system across multiple files. The core functionality revolves around a new Changes
Sequence DiagramsequenceDiagram
participant Client
participant Server
participant ConvertFunction
participant YjsUtils
Client->>Server: POST /convert-document
Server->>ConvertFunction: convertHTMLDocumentToAllFormats
ConvertFunction->>YjsUtils: getBinaryDataFromHTMLString
YjsUtils-->>ConvertFunction: Binary Data
ConvertFunction->>YjsUtils: getAllDocumentFormatsFromBinaryData
YjsUtils-->>ConvertFunction: Document Formats
ConvertFunction-->>Server: Formatted Document
Server-->>Client: JSON Response
Possibly related PRs
Suggested reviewers
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
packages/editor/src/core/helpers/yjs-utils.ts (4)
55-63
: Validate HTML to JSON transformations.
generateJSON(descriptionHTML, RICH_TEXT_EDITOR_EXTENSIONS)
may produce unexpected output if the HTML is malformed or includes unsupported tags. Consider adding fallback logic or logs to diagnose issues stemming from invalid HTML.
70-78
: Isolate shared logic for both editor variants.
getBinaryDataFromDocumentEditorHTMLString
shares a similar pattern to the rich text variant. Consider a helper function to avoid repetition and reduce any duplication of code.
85-107
: Optimize repeated conversions for performance.Repeatedly converting the same data to JSON and then to HTML might be expensive if done in a tight loop. Caching or batching repeated conversions can help optimize performance if these methods are frequently invoked.
109-136
: Maintain consistent naming between editor types.
getAllDocumentFormatsFromDocumentEditorBinaryData
andgetAllDocumentFormatsFromRichTextEditorBinaryData
share similar logic but differ in naming. Consider renaming them to align with a unified naming convention across the codebase (e.g., a standard prefix or suffix).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
live/src/core/helpers/convert-document.ts
(1 hunks)live/src/core/types/common.d.ts
(1 hunks)live/src/server.ts
(4 hunks)packages/editor/src/core/helpers/yjs-utils.ts
(1 hunks)packages/editor/src/core/helpers/yjs.ts
(0 hunks)packages/editor/src/index.ts
(1 hunks)packages/editor/src/lib.ts
(1 hunks)
💤 Files with no reviewable changes (1)
- packages/editor/src/core/helpers/yjs.ts
🔇 Additional comments (12)
packages/editor/src/core/helpers/yjs-utils.ts (2)
24-33
: Ensure robust error handling and validation for Y.Doc updates.While applying Yjs updates, consider scenarios where the
document
orupdates
arguments might be invalid or corrupted. Implementing additional error handling or validations could help prevent potential runtime errors or unexpected states.
40-42
: Confirm environment compatibility for Buffer usage.
Buffer
is Node.js-specific. If this code executes in a browser environment, ensure that a suitable polyfill or bundler configuration is in place.packages/editor/src/lib.ts (1)
4-4
: Validate re-export scope.
export * from "@/helpers/yjs-utils";
re-exports all utilities; ensure no unintentional exports are exposed. If certain internal functions are not meant for public consumption, consider explicitly exporting only required members.live/src/core/types/common.d.ts (1)
9-13
: Confirm variant string alignment with backend logic.
variant: "rich" | "document"
matches usage inconvert-document.ts
, but ensure that all server-side checks, validations, and code paths properly handle these variants and do not expect any additional ones.packages/editor/src/index.ts (1)
27-27
: Confirm naming consistency with new module.Renaming the export from
"yjs"
to"yjs-utils"
supports the restructured code. Verify that all references to the old module name have been updated throughout the codebase.live/src/core/helpers/convert-document.ts (1)
16-44
: Validate unsupported variants.The function throws if given an invalid
variant
, which is good. For further robustness, consider logging relevant information on invalid attempts or implementing structured error handling to unify error responses across the server endpoints.live/src/server.ts (6)
4-6
: Imports are organized and well-chosen.
Usinghelmet
to address security concerns is a good practice.
7-9
: Good grouping of configuration import statements and clear inline documentation.
This helps maintain clarity in the server setup flow.
12-12
: ImportingconvertHTMLDocumentToAllFormats
No issues spotted with this import.
15-16
: Typed request body usage.
ImportingTConvertDocumentRequestBody
enforces a structured request body, reducing the risk of malformed data.
31-31
: No issues.
This line simply closes configuration for compression middleware.
109-109
: Quality logging message.
Providing clarity on which server connections have closed helps with debugging and graceful shutdown procedures.
router.post("/convert-document", (req, res) => { | ||
const { description_html, variant } = req.body as TConvertDocumentRequestBody; | ||
try { | ||
if (description_html === undefined || variant === undefined) { | ||
res.status(400).send({ | ||
message: "Missing required fields", | ||
}); | ||
return; | ||
} | ||
const { description, description_binary } = convertHTMLDocumentToAllFormats({ | ||
document_html: description_html, | ||
variant, | ||
}); | ||
res.status(200).json({ | ||
description, | ||
description_binary, | ||
}); | ||
} catch (error) { | ||
manualLogger.error("Error in /convert-document endpoint:", error); | ||
res.status(500).send({ | ||
message: `Internal server error. ${error}`, | ||
}); | ||
} | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Based on the analysis of the codebase, I can now provide the final response:
HTML input requires additional security measures
The current implementation lacks proper HTML sanitization for the description_html
input, which could lead to XSS vulnerabilities. While the codebase includes DOMPurify for sanitization in other areas (via @plane/utils
), it's not being used in this endpoint. Additionally, the variant
parameter needs type validation.
Required changes:
- Import and use the
sanitizeHTML
utility from@plane/utils
to sanitize thedescription_html
input before processing - Add type validation for the
variant
parameter using TypeScript's union type"rich" | "document"
(as seen in theconvertHTMLDocumentToAllFormats
implementation)
Example implementation:
import { sanitizeHTML } from "@plane/utils";
router.post("/convert-document", (req, res) => {
const { description_html, variant } = req.body as TConvertDocumentRequestBody;
try {
if (description_html === undefined || variant === undefined) {
res.status(400).send({
message: "Missing required fields",
});
return;
}
// Validate variant
if (variant !== "rich" && variant !== "document") {
res.status(400).send({
message: "Invalid variant value. Must be 'rich' or 'document'",
});
return;
}
// Sanitize HTML input
const sanitizedHTML = sanitizeHTML(description_html);
const { description, description_binary } = convertHTMLDocumentToAllFormats({
document_html: sanitizedHTML,
variant,
});
res.status(200).json({
description,
description_binary,
});
} catch (error) {
manualLogger.error("Error in /convert-document endpoint:", error);
res.status(500).send({
message: `Internal server error. ${error}`,
});
}
});
🔗 Analysis chain
Ensure robust sanitization and validation for HTML input.
While you verify that description_html
and variant
are defined, consider:
- Sanitizing
description_html
to prevent XSS vulnerabilities and potential malicious injection. - Validating the string to ensure only expected HTML is passed.
- Restricting valid
variant
values to “rich” or “document” to avoid unexpected behavior.
Here’s a sample snippet (no direct diff as it spans multiple lines) illustrating a possible sanitization approach using DOMPurify:
import createDOMPurify from "dompurify";
import { JSDOM } from "jsdom";
/** ... inside your conversion flow ... */
const DOMPurify = createDOMPurify(new JSDOM("").window);
const sanitizedHTML = DOMPurify.sanitize(description_html);
Would you like a more in-depth solution that includes advanced sanitization and variant validation?
Let me verify the concerns by checking the codebase for existing sanitization, validation mechanisms, and the allowed variant values.
Let me check the implementation details of the conversion function and the type definitions to better understand the validation needs.
Let me check the type definition for TConvertDocumentRequestBody and examine the editor package's HTML processing functions since they handle the actual conversion.
Let me check the editor extensions and components that might be handling HTML processing and security.
Let me check the core editor extensions to understand how HTML content is processed and secured.
🏁 Scripts executed
The following scripts were executed for the analysis:
Error: No shell scripts found
Script:
#!/bin/bash
# Check for any existing sanitization or validation logic
rg -l "sanitize|validate|purify|clean|xss" --type ts
# Look for the type definition of variant
ast-grep --pattern 'type TConvertDocumentRequestBody'
# Check for DOMPurify or similar security libraries in package.json
cat package.json
# Look for the implementation of convertHTMLDocumentToAllFormats to understand variant handling
ast-grep --pattern 'convertHTMLDocumentToAllFormats'
Length of output: 4675
Script:
#!/bin/bash
# Check the implementation of convertHTMLDocumentToAllFormats and related types
cat live/src/core/helpers/convert-document.ts
# Look for any existing input validation in the server file
rg -A 5 -B 5 "TConvertDocumentRequestBody" live/src/server.ts
# Check if there are any security-related utilities or sanitization helpers
rg -l "sanitizeHTML|cleanHTML|purifyHTML" --type ts
# Look for any HTML processing or security-related dependencies in live/package.json
cat live/package.json
Length of output: 4335
Script:
#!/bin/bash
# Check the type definition
cat live/src/core/types/common.ts
# Look into the editor package's HTML processing
rg -A 10 "getBinaryDataFromRichTextEditorHTMLString|getBinaryDataFromDocumentEditorHTMLString" packages/editor/src/
# Check if there's any sanitization in the editor package
rg -l "sanitize|xss|clean|purify" packages/editor/src/
Length of output: 2374
Script:
#!/bin/bash
# Check the editor extensions and components for HTML handling
cat packages/editor/src/core/extensions/callout/utils.ts
cat packages/editor/src/core/components/editors/document/page-renderer.tsx
# Look for the RICH_TEXT_EDITOR_EXTENSIONS and DOCUMENT_EDITOR_EXTENSIONS configuration
rg -A 5 "RICH_TEXT_EDITOR_EXTENSIONS|DOCUMENT_EDITOR_EXTENSIONS" packages/editor/src/
# Check the TiptapEditor configuration since it's being used
rg -A 10 "generateJSON|generateHTML" packages/editor/src/
Length of output: 15585
Script:
#!/bin/bash
# Check the core editor extensions
cat packages/editor/src/extensions/core-without-props.ts
# Look for any HTML sanitization utilities in @plane/utils
cat packages/utils/src/string.ts
# Check if there are any security-related configurations in the editor
rg -A 5 "allowedTags|sanitize|security" packages/editor/src/
Length of output: 12207
Description
This PR creates a new endpoint in the live server to handle conversion of html description to other formats safely.
Type of Change
Summary by CodeRabbit
New Features
Refactor
yjs-utils.ts
moduleChores