Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[out-of-proc] GRPC serializer does not respect Activity return type when [JsonPolymorphic] is used #3013

Open
Ilia-Kosenkov opened this issue Jan 21, 2025 · 7 comments
Labels

Comments

@Ilia-Kosenkov
Copy link

Ilia-Kosenkov commented Jan 21, 2025

Description

If activity is defined to return a base type of some hierarchy that supports polymorphic (de-)serialization, the actual serialization of activity output happens according to its runtime type rather than declared type, which breaks polymorphic feature of System.Text.Json.

We were able to trace it to Microsoft.Azure.Functions.Worker.Rpc.RpcExtensions.ToRpcDefault, which calls serializer.Serialize(value)?.ToString(); (here serializer is JsonObjectSerializer). This overload has the following signature:

BinaryData Serialize(object? value, Type? inputType = null, CancellationToken cancellationToken = default)

When called with one parameter, inputType is null. The implementation itself calls

 JsonSerializer.SerializeToUtf8Bytes(value, inputType ?? value?.GetType() ?? typeof(object), _options);

thus resolving runtime type of the return value, and serializing accordingly, instead of decalred type of activity.

Expected behavior

GRPC serializer passes return type as inputType, performing correct JSON serialization of the result.

Actual behavior

Serialization happens according to runtime type, which breaks polymorphic serialization of System.Text.Json

Relevant source code snippets

Program.cs
using System.Text.Json.Serialization;
using Azure.Core.Serialization;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Http;
using Microsoft.DurableTask;
using Microsoft.DurableTask.Client;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

var host = new HostBuilder()
    .ConfigureFunctionsWorkerDefaults(builder => builder.Serializer = new CustomSerializer())
    .Build();

await host.RunAsync();

[JsonPolymorphic]
[JsonDerivedType(typeof(DerivedResponse), nameof(DerivedResponse))]
public abstract record BaseResponse(int Field1);

public sealed record DerivedResponse(int Field1, int Field2) : BaseResponse(Field1);

public static class DurableFunctions
{
    [Function(nameof(TestActivity))]
    public static Task<BaseResponse> TestActivity([ActivityTrigger] object? input)
    {
        return Task.FromResult<BaseResponse>(new DerivedResponse(42, 99));
    }

    [Function(nameof(TestOrchestration))]
    public static async Task TestOrchestration([OrchestrationTrigger] TaskOrchestrationContext context)
    {
        var logger = context.CreateReplaySafeLogger(nameof(TestOrchestration));
        var result = await context.CallActivityAsync<BaseResponse>(nameof(TestActivity));
        logger.LogInformation("Received {Result}", result);
    }

    [Function(nameof(HttpEntryPoint))]
    public static async Task HttpEntryPoint([HttpTrigger(
        AuthorizationLevel.Anonymous, "GET", Route = "debug")] HttpRequestData req,
        [DurableClient] DurableTaskClient client
    )
    {
        await client.ScheduleNewOrchestrationInstanceAsync(nameof(TestOrchestration));
    }
}

public sealed class CustomSerializer : JsonObjectSerializer
{
    // Called from Microsoft.Azure.Functions.Worker.Rpc.RpcExtensions.ToRpcDefault
    public override BinaryData Serialize(object? value, Type? inputType = null, CancellationToken cancellationToken = default)
    {
        // Breakpoint here will reveal the issue
        var result = base.Serialize(value, inputType, cancellationToken);

        return result;
    }
}
.csproj
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>net8.0</TargetFramework>
    <AzureFunctionsVersion>v4</AzureFunctionsVersion>
    <OutputType>Exe</OutputType>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.Azure.Functions.Worker" Version="2.0.0" />
    <PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.DurableTask" Version="1.2.2" />
    <PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.Http" Version="3.2.0"/>
    <PackageReference Include="Microsoft.Azure.Functions.Worker.Sdk" Version="2.0.0" />
  </ItemGroup>
  <ItemGroup>
    <None Update="host.json">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
    </None>
    <None Update="local.settings.json">
      <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
      <CopyToPublishDirectory>Never</CopyToPublishDirectory>
    </None>
  </ItemGroup>
  <ItemGroup>
    <Using Include="System.Threading.ExecutionContext" Alias="ExecutionContext"/>
  </ItemGroup>
</Project>

Known workarounds

There is no reliable workaround because ObjectSerializer does not get type information or any other required information. If there are multiple functions that return both base and derived types, then universal solution is impossible.
Half-solutions:

  • Wrap return into a trivial container, which contains polymorphically serialized property.
  • If all activities deal with base types only, object serializer can be tweaked to scan assembly and substitute missing inputType when it detects it is working with incorrect activity serialization output (introduces serialization penalty due to constant type checks and additional startup overhead)
  • Ditch built-in [JsonPolymorphic] and build a custom converter, which defeats the purpose of relying on well-established frameworks
  • ???

App Details

  • Microsoft.Azure.Functions.Worker.Extensions.DurableTask v1.2.2
  • (not sure what runtime here means, but AzureFunctionsVersion is v4)
  • C#

Screenshots

Local state of CustomSerializer when activity return type is serialized. Notice that inputType is null, while typeof(BaseResponse) is required her to function properly.
Image

@cgillum
Copy link
Member

cgillum commented Jan 21, 2025

Does this problem exist for both activity and orchestration triggers or just activity triggers? Activity triggers go down a different serialization path owned by the .NET worker, whereas orchestration triggers use a serialization mechanism that's specific to Durable Functions. Knowing which scenarios are impacted will help us know where to focus efforts on investigating the root cause.

/cc @jviau

@Ilia-Kosenkov
Copy link
Author

Ilia-Kosenkov commented Jan 21, 2025

It is quite easy to check if I modify my example like so:

using System.Text.Json.Serialization;
using Azure.Core.Serialization;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Azure.Functions.Worker.Http;
using Microsoft.DurableTask;
using Microsoft.DurableTask.Client;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

var host = new HostBuilder()
    .ConfigureFunctionsWorkerDefaults(builder => builder.Serializer = new CustomSerializer())
    .Build();

await host.RunAsync();

[JsonPolymorphic]
[JsonDerivedType(typeof(DerivedResponse), nameof(DerivedResponse))]
public abstract record BaseResponse(int Field1);

public sealed record DerivedResponse(int Field1, int Field2) : BaseResponse(Field1);

public static class DurableFunctions
{
    [Function(nameof(TestActivity))]
    public static Task<BaseResponse> TestActivity([ActivityTrigger] object? input)
    {
        return Task.FromResult<BaseResponse>(new DerivedResponse(42, 99));
    }
    
    [Function(nameof(TestAnotherOrchestration))]
    public static Task<BaseResponse> TestAnotherOrchestration([OrchestrationTrigger] TaskOrchestrationContext context)
    {
        return Task.FromResult<BaseResponse>(new DerivedResponse(42, 99));
    }

    [Function(nameof(TestOrchestration))]
    public static async Task TestOrchestration([OrchestrationTrigger] TaskOrchestrationContext context)
    {
        var logger = context.CreateReplaySafeLogger(nameof(TestOrchestration));
        var result = await context.CallSubOrchestratorAsync<BaseResponse>(nameof(TestAnotherOrchestration));
        logger.LogInformation("Received {Result}", result);
    }

    [Function(nameof(HttpEntryPoint))]
    public static async Task HttpEntryPoint([HttpTrigger(AuthorizationLevel.Anonymous, "GET", Route = "debug")] HttpRequestData req, [DurableClient] DurableTaskClient client)
    {
        await client.ScheduleNewOrchestrationInstanceAsync(nameof(TestOrchestration));
    }
}

public sealed class CustomSerializer : JsonObjectSerializer
{
    // Called from Microsoft.Azure.Functions.Worker.Extensions.DurableTask.ObjectConverterSlim.Serialize
    public override BinaryData Serialize(object? value, Type? inputType = null, CancellationToken cancellationToken = default)
    {
        // Breakpoint here will reveal the issue
        var result = base.Serialize(value, inputType, cancellationToken);

        return result;
    }
}

Funny thing! Same issue but slightly different root cause, this time it is Microsoft.Azure.Functions.Worker.Extensions.DurableTask.ObjectConverterSlim, and it does

public override string? Serialize(object? value)
{
    if (value is null)
    {
        return null;
    }

    BinaryData data = this.serializer.Serialize(value, value.GetType(), default);
    return data.ToString();
}

which again uses runtime type. So both do not work, conceptually the problem is the same, but implementation differs.

@jviau
Copy link
Contributor

jviau commented Jan 21, 2025

I have a vague recollection of encountering something like this myself. I think it has to do with the API used on System.Text.Json via Azure.Core.ObjectSerializer doesn't respect polymorphic serialization attributes. This is over a year ago, so my memory is vague. But if that is the case, then we can only document it as a known issue and wait for STJ or Azure.Core to fix it (if possible) - not sure which one. Customer's would need to workaround this themselves for now. Best way would be to supply their own ObjectSerializer which addresses the issue.

@Ilia-Kosenkov seeing as the issue lies in the Azure.Core's serializer, have you opened an issue or see if there is an existing one there? https://github.com/azure/azure-sdk-for-net/

@Ilia-Kosenkov
Copy link
Author

Hey @jviau , thanks for the reply. If this is by-design, it would be nice to document it explicitly (unless it is already and we just missed it). Unfortunately, this is not the issue with ObjectSerializer (which can be used to work with STJ properly), but rather that it is called without providing information about activity return type. We cannot really handle this by improving ObjectSerializer, because at runtime you cannot distinguish if DerivedResponse is returned as DerivedResponse (and is expected to be deserialized as such), which requires plain serialization, or if DerivedResponse is returned as BaseResponse, which requires polymorphic logic to kick in.
Right now it appears to be a fundamental issue in the serialization of function return types, and I guess we will just add some DI tests to catch cases when activity/orchestration return type is base type marked with JsonPolymorphic attribute.

As for Azure.Core, no, I have not talked to them yet, but I will check their issues tomorrow.

@jviau
Copy link
Contributor

jviau commented Jan 21, 2025

@Ilia-Kosenkov ah I see. The fix would be to add a new overload string? Serialize(object? value, Type? inputTarget = null) to DataConverter.cs, so that we can pass the declared type down to ObjectSerializer.

@cgillum - this work seems valuable and fairly easy to do (shouldn't require any breaking changes). Not sure how you want to prioritize it.

@Ilia-Kosenkov
Copy link
Author

Ilia-Kosenkov commented Jan 21, 2025

We did not debug deep enough into the framework, but assuming that at the moment of calling DataConverter you also have access to declared return type of a durable function, then properly passing it down from DataConverter to ObjectSerializer might just do the trick.

@jviau
Copy link
Contributor

jviau commented Jan 21, 2025

Looked into it a bit more. While adding the overload is trivial, all the call sites for .Serialize(object) today do not have access to the declared type. We would need to go through all those APIs and plumb type info all the way through from the user-facing public API.

If done via generics, this has a side benefit of avoiding boxing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants