Avoid string intermediates in MCP transport read path#1319
Avoid string intermediates in MCP transport read path#1319
Conversation
Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
|
@copilot both of these implementations are very similar -- do you think it would be possible to factor our the functionality of reading a line from stream in UTF-8 as either a helper method, or an extension method on PipeReader? |
…elper Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Done in b530d53. The common functionality is now in
Both transports now call |
src/ModelContextProtocol.Core/Client/StreamClientSessionTransport.cs
Outdated
Show resolved
Hide resolved
…ove NoBomUtf8Encoding from StreamClientSessionTransport Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Done in 09cf5eb. All four review items addressed:
|
|
@copilot This is looking good. Can you add any tests around this? Is is possible to measure our memory footprint for read and ensuring it's below a particular threshold? It might also be interesting to add some tests around the line splitting behavior to ensure coverage of all the special cases - empty lines, mix of carriage returns and no, non-ascii characters, invalid encoding adjacent to a newline such as an unpaired surrogate. Make tests that do their best to break this since it's effectively trying to replace a full encoding implementation for line breaks. |
Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Added 15 tests in
On memory measurement: adding a precise allocation bound in xUnit tests is not straightforward without a benchmark harness (e.g. BenchmarkDotNet with |
|
@copilot can you also add a case for a multi-byte character sequence that's interrupted with a |
Co-authored-by: ericstj <8918108+ericstj@users.noreply.github.com>
Added in d050187 as The test uses
Both old and new paths raise |
|
Another instance of this problem in StreamableHttpClientSessionTransport: There are some string intermediates in Sse types, but not for the payload so it's not worth changing unless it shows up through usage. |
The read side of
StreamServerTransportandStreamClientSessionTransportwas decoding bytes →stringviaTextReader.ReadLineAsync(), then re-encoding back to UTF-8 internally duringJsonSerializer.Deserialize(string, ...). This round-trip was unnecessary sinceSystem.Text.Jsoncan deserialize directly from bytes.Changes
TextReaderwithPipeReaderon the input side of both transports.PipeReader.Create(stream)works across all targets; theSystem.IO.Pipelinesdependency was already present.JsonSerializer.Deserialize(ReadOnlySpan<byte>, typeInfo); multi-segment usenew Utf8JsonReader(ReadOnlySequence<byte>, ...)+JsonSerializer.Deserialize(ref reader, typeInfo).EncodingUtilities.GetUtf8String()(for trace logging) only allocates whenLogger.IsEnabled(LogLevel.Trace). For single-segment sequences it callsEncoding.UTF8.GetString(span)directly; multi-segment falls back toToArray().EndsWithCarriageReturn()checks the last byte of the last segment directly rather than creating an intermediate slice.encodingconstructor parameter fromStreamClientSessionTransport— it was only used to configure the now-removedTextReaderwrapper. Both call sites already passednull.PipeReaderExtensions— the buffer management loop, newline detection, and CRLF trimming are now in a singlePipeReaderExtensionsclass (inProtocol/). Both transports callawait pipeReader.ReadLinesAsync(ProcessLineAsync, cancellationToken)and share identicalProcessLineAsyncimplementations.GetUtf8StringtoEncodingUtilities— with an explicit empty-sequence fast path, following the existing utility pattern.NoBomUtf8EncodingtoStdioClientTransport— removed fromStreamClientSessionTransport(no longer needed there) and added as a privatestatic readonlyfield inStdioClientTransport, its only consumer.PipeReaderExtensionsTests— 16 tests covering line-splitting edge cases viaStreamServerTransport: empty/blank lines, LF and CRLF termination (single and multi-segment), mixed line endings, standalone\rpreservation, CRLF split across pipe segments, non-ASCII and multi-byte UTF-8 content, multi-byte character split across segments, invalid JSON line recovery, unterminated data not delivered, and a multi-byte UTF-8 sequence interrupted by\n(verifying both resulting lines raiseJsonExceptionand are silently skipped, matching the behavior of the previousStreamReader-based implementation).Original prompt
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.