CodeGen - Self referencing array in json schema results in stackoverflow #236

Open
windowslucker1121 opened this issue May 16, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@windowslucker1121

Hey Folks,
can someone provide me a helping hand in this scenario:

I have a schema that holds some data and can contain multiple instances of itself in an array, as defined below:

{
    "$schema": "http://json-schema.org/draft-04/schema",
    "title": "ProcessNode Schema",
    "type": "object",
    "description": "glTF extension for ProcessNode.",
    "properties": {
        "Process": {
            "type": "array",
            "items": {
                "$ref": "#"
            },
            "description": "A list of Process elements.",
            "minItems": 0
        },
        "EventHandler": {
            "description": "A list of EventHandler elements",
            "type": "array",
            "items": {
                "method": {
                    "type": "string"
                }
            },
            "minItems": 0
        },
        "Function": {
            "type": "string"
        },
        "id": {
            "type": "string",
            "description": "Unique identifier for the ProcessNode."
        }
    },
    "required": [
        "id"
    ]
}

But running the CodeGen tool results in a stack overflow:

Stack overflow.
   at System.Linq.Enumerable.Select[[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Collections.Generic.IEnumerable`1<System.__Canon>, System.Func`2<System.__Canon,System.__Canon>)

.
.
.
.
   at SharpGLTF.SchemaReflection.SchemaTypesReader._UseType(Context, NJsonSchema.JsonSchema, Boolean)
   at SharpGLTF.SchemaReflection.SchemaTypesReader._UseType(Context, NJsonSchema.JsonSchema, Boolean)
   at SharpGLTF.SchemaReflection.SchemaTypesReader._UseType(Context, NJsonSchema.JsonSchema, Boolean)
   at SharpGLTF.SchemaReflection.SchemaTypesReader._UseType(Context, NJsonSchema.JsonSchema, Boolean)
   at SharpGLTF.SchemaReflection.SchemaTypesReader._UseType(Context, NJsonSchema.JsonSchema, Boolean)
   at SharpGLTF.SchemaReflection.SchemaTypesReader.Generate(NJsonSchema.CodeGeneration.CSharp.CSharpTypeResolver)
   at SharpGLTF.SchemaProcessing.LoadSchemaContext(System.String)

How can I prevent that, or tell the CodeGen tool to process this node only once at the top level and then reference it?

I can't find anything like my schema among the already defined schemas, and I can't find an appropriate function in the CodeGen tool, so I'm out of my depth here.

Thanks in advance!

@vpenades
Owner

This is probably a bug, since self references are something I was not expecting when I wrote the generator.

Most probably the solution is to put a barrier somewhere that prevents reentrancy when it detects that a type is already being processed.

Basically, _UseType needs to cache its result in a dictionary, and if _UseType is called again with the same parameters, it should return the cached value instead of doing a full reprocessing.
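
The barrier described above can be sketched roughly as follows. This is only an illustration under assumptions: `Schema`, `TypeDecl` and `Resolver` are hypothetical stand-ins, not the real NJsonSchema or SharpGLTF classes. The detail that matters is that the cache entry is stored before the children are visited, so a self reference met further down finds it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a parsed schema node (not NJsonSchema.JsonSchema).
sealed class Schema
{
    public string Title = "";
    public List<Schema> Children = new List<Schema>();
}

// Hypothetical stand-in for a generated type declaration (not SharpGLTF's SchemaType).
sealed class TypeDecl
{
    public string Name = "";
    public List<TypeDecl> Fields = new List<TypeDecl>();
}

static class Resolver
{
    // Returns the declaration for 'schema', building it at most once.
    public static TypeDecl UseType(Schema schema, Dictionary<Schema, TypeDecl> cache)
    {
        // Barrier: a schema that is already known, even if it is still being
        // built, is returned as-is instead of being processed again.
        if (cache.TryGetValue(schema, out var known)) return known;

        var decl = new TypeDecl { Name = schema.Title };

        // Register BEFORE descending, so a self reference found while visiting
        // the children hits the cache and the recursion stops.
        cache[schema] = decl;

        foreach (var child in schema.Children)
        {
            decl.Fields.Add(UseType(child, cache));
        }

        return decl;
    }
}
```

With a schema whose child list contains itself, `Resolver.UseType` returns a declaration whose field points back at the same object instead of recursing without end. Caching the result only after the children have been processed would not help, because the self reference is reached before that point.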

I am extremely busy lately, so I don't know when I'll have time to look into it. If you're in a hurry, I would suggest trying to fix it yourself and maybe creating a pull request with the solution.

@windowslucker1121
Author

I investigated this error, and it is exactly what you described.
After three hours of trying, I don't think I'm able to fix this issue myself.

What I did was keep the already processed schemas in a cache-like dictionary.
Then I tried reusing the cached SchemaTypes if they had already been processed, but I found that I can't reuse them: we are inside a recursive loop, so the schema isn't fully processed yet at that point.

Then I tried registering a placeholder in the cache for the schema that is currently being processed,
but now I'm stuck with it remaining a placeholder.

This is how it looks currently:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using Newtonsoft.Json.Schema;
using JSONSCHEMA = NJsonSchema.JsonSchema;

namespace SharpGLTF.SchemaReflection
{
    public class SchemaTypePlaceholder : SchemaType
    {
        public override string PersistentName => "Placeholder";

        public string PlaceholderTarget;
        public SchemaTypePlaceholder(Context ctx) : base(ctx)
        {
        }
    }

    static class SchemaTypesReader
    {
        public static SchemaType.Context Generate(NJsonSchema.CodeGeneration.CSharp.CSharpTypeResolver types)
        {
            var context = new SchemaType.Context();
            var schemasProcessed = new Dictionary<JSONSCHEMA, SchemaType>();

            foreach (var t in types.Types.Keys)
            {
                Console.WriteLine(t.DocumentPath);
                context._UseType(t, schemasProcessed, new HashSet<JSONSCHEMA>());
            }

            return context;
        }

        private static SchemaType _UseType(this SchemaType.Context ctx, JSONSCHEMA schema, Dictionary<JSONSCHEMA, SchemaType> schemasProcessed, HashSet<JSONSCHEMA> schemaStack, bool isRequired = true)
        {
            if (ctx == null) throw new ArgumentNullException(nameof(ctx));
            if (schema == null) throw new ArgumentNullException(nameof(schema));

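            // Already known: either fully processed, or still in progress, in which
            // case this is the placeholder registered further down.
            if (schemasProcessed.TryGetValue(schema, out var existingType))
            {
                return existingType;
            }

            // Unreachable in practice: every schema on the stack also has an entry
            // in schemasProcessed, so the lookup above returns first.
            if (schemaStack.Contains(schema))
            {
                throw new InvalidOperationException("Recursive schema reference detected.");
            }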

            schemaStack.Add(schema);
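            // Register a placeholder so re-entrant calls for this schema return early.
            // Pass the context instead of null so the base constructor receives a real context.
            var placeholder = new SchemaTypePlaceholder(ctx);
            placeholder.PlaceholderTarget = schema.DocumentPath;
            schemasProcessed[schema] = placeholder;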
            SchemaType result = null;

            try
            {
                if (schema is NJsonSchema.JsonSchemaProperty prop)
                {
                    isRequired &= prop.IsRequired;
                }

                if (_IsStringType(schema))
                {
                    result = ctx.UseString();
                }
                else if (_IsBlittableType(schema))
                {
                    bool isNullable = !isRequired;

                    if (schema.Type == NJsonSchema.JsonObjectType.Integer) result = ctx.UseBlittable(typeof(Int32).GetTypeInfo(), isNullable);
                    else if (schema.Type == NJsonSchema.JsonObjectType.Number) result = ctx.UseBlittable(typeof(Double).GetTypeInfo(), isNullable);
                    else if (schema.Type == NJsonSchema.JsonObjectType.Boolean) result = ctx.UseBlittable(typeof(Boolean).GetTypeInfo(), isNullable);
                    else throw new NotImplementedException();
                }
                else if (schema.HasReference)
                {
                    result = ctx._UseType(schema.ActualTypeSchema, schemasProcessed, schemaStack, isRequired);
                }
                else if (schema.IsArray)
                {
                    var elementType = ctx._UseType(schema.Item.ActualSchema, schemasProcessed, schemaStack);
                    result = ctx.UseArray(elementType);
                }
                else if (_IsEnumeration(schema))
                {
                    if (schema is NJsonSchema.JsonSchemaProperty property)
                    {
                        bool isNullable = !isRequired;

                        var dict = new Dictionary<string, Int64>();

                        foreach (var v in property.AnyOf)
                        {
                            var key = v.Description;
                            var val = v.Enumeration?.FirstOrDefault();
                            var ext = v.ExtensionData?.FirstOrDefault() ?? default;

                            if (val is String txt)
                            {
                                System.Diagnostics.Debug.Assert(v.Type == NJsonSchema.JsonObjectType.None);

                                key = txt; val = (Int64)0;
                            }

                            if (v.Type == NJsonSchema.JsonObjectType.None && ext.Key == "const")
                            {
                                key = (string)ext.Value; val = (Int64)0;
                            }

                            if (v.Type == NJsonSchema.JsonObjectType.Integer && ext.Key == "const")
                            {
                                val = (Int64)ext.Value;
                            }

                            System.Diagnostics.Debug.Assert(key != null || dict.Count > 0);

                            if (string.IsNullOrWhiteSpace(key)) continue;

                            dict[key] = (Int64)val;
                        }

                        var name = string.Join("-", dict.Keys.OrderBy(item => item));

                        var etype = ctx.UseEnum(name, isNullable);

                        etype.Description = schema.Description;

                        foreach (var kvp in dict) etype.SetValue(kvp.Key, (int)kvp.Value);

                        if (dict.Values.Distinct().Count() > 1) etype.UseIntegers = true;

                        result = etype;
                    }
                    else
                    {
                        throw new NotImplementedException();
                    }
                }
                else if (_IsDictionary(schema))
                {
                    var key = ctx.UseString();
                    var val = ctx._UseType(_GetDictionaryValue(schema), schemasProcessed, schemaStack);

                    result = ctx.UseDictionary(key, val);
                }
                else if (_IsClass(schema))
                {
                    var classDecl = ctx.UseClass(schema.Title);

                    classDecl.Description = schema.Description;

                    if (schema.InheritedSchema != null)
                    {
                        classDecl.BaseClass = ctx._UseType(schema.InheritedSchema, schemasProcessed, schemaStack) as ClassType;
                    }

                    var keys = _GetProperyNames(schema);
                    if (schema.InheritedSchema != null)
                    {
                        var baseKeys = _GetInheritedPropertyNames(schema).ToArray();
                        keys = keys.Except(baseKeys).ToArray();
                    }

                    var props = keys.Select(key => schema.Properties.Values.FirstOrDefault(item => item.Name == key));

                    var required = schema.RequiredProperties;

                    foreach (var p in props)
                    {
                        var field = classDecl.UseField(p.Name);

                        field.Description = p.Description;

                        field.FieldType = ctx._UseType(p, schemasProcessed, schemaStack, required.Contains(p.Name));

                        field.ExclusiveMinimumValue = p.ExclusiveMinimum ?? (p.IsExclusiveMinimum ? p.Minimum : null);
                        field.InclusiveMinimumValue = p.IsExclusiveMinimum ? null : p.Minimum;
                        field.DefaultValue = p.Default;
                        field.InclusiveMaximumValue = p.IsExclusiveMaximum ? null : p.Maximum;
                        field.ExclusiveMaximumValue = p.ExclusiveMaximum ?? (p.IsExclusiveMaximum ? p.Maximum : null);

                        field.MinItems = p.MinItems;
                        field.MaxItems = p.MaxItems;
                    }

                    result = classDecl;
                }
                else if (schema.Type == NJsonSchema.JsonObjectType.Object)
                {
                    result = ctx.UseAnyType();
                }
                else if (schema.Type == NJsonSchema.JsonObjectType.None)
                {
                    result = ctx.UseAnyType();
                }
                else
                {
                    throw new NotImplementedException();
                }

                schemasProcessed[schema] = result;
                schemaStack.Remove(schema);

                return result;
            }
            catch
            {
                schemasProcessed.Remove(schema);
                schemaStack.Remove(schema);
                throw;
            }
        }

        private static bool _IsBlittableType(JSONSCHEMA schema)
        {
            if (schema == null) return false;
            if (schema.Type == NJsonSchema.JsonObjectType.Boolean) return true;
            if (schema.Type == NJsonSchema.JsonObjectType.Number) return true;
            if (schema.Type == NJsonSchema.JsonObjectType.Integer) return true;

            return false;
        }

        private static bool _IsStringType(JSONSCHEMA schema)
        {
            return schema.Type == NJsonSchema.JsonObjectType.String;
        }

        private static bool _IsEnumeration(JSONSCHEMA schema)
        {
            if (schema.Type != NJsonSchema.JsonObjectType.None) return false;

            if (schema.IsArray || schema.IsDictionary) return false;

            if (schema.AnyOf.Count == 0) return false;

            return true;
        }

        private static bool _IsDictionary(JSONSCHEMA schema)
        {
            if (schema.AdditionalPropertiesSchema != null) return true;
            if (schema.AllowAdditionalProperties == false && schema.PatternProperties.Any()) return true;

            return false;
        }

        private static JSONSCHEMA _GetDictionaryValue(JSONSCHEMA schema)
        {
            if (schema.AdditionalPropertiesSchema != null)
            {
                return schema.AdditionalPropertiesSchema;
            }

            if (schema.AllowAdditionalProperties == false && schema.PatternProperties.Any())
            {
                var valueTypes = schema.PatternProperties.Values.ToArray();

                if (valueTypes.Length == 1) return valueTypes.First();
            }

            throw new NotImplementedException();
        }

        private static bool _IsClass(JSONSCHEMA schema)
        {
            if (schema.Type != NJsonSchema.JsonObjectType.Object) return false;

            return !string.IsNullOrWhiteSpace(schema.Title);
        }

        private static string[] _GetProperyNames(JSONSCHEMA schema)
        {
            return schema
                    .Properties
                    .Values
                    .Select(item => item.Name)
                    .ToArray();
        }

        private static string[] _GetInheritedPropertyNames(JSONSCHEMA schema)
        {
            if (schema?.InheritedSchema == null) return Enumerable.Empty<string>().ToArray();

            return _GetInheritedPropertyNames(schema.InheritedSchema)
                .Concat(_GetProperyNames(schema.InheritedSchema))
                .ToArray();
        }
    }
}
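
One possible way out of the "stuck with a placeholder" state is a second pass that swaps each placeholder for the finished type once the traversal has completed. The sketch below assumes a tiny hypothetical type model (`TypeNode`, `NamedType`, `ArrayOf`, `Placeholder`, `PlaceholderFixup`); none of these names come from SharpGLTF, and the real SchemaType classes may need a different approach.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal type model; none of these names come from SharpGLTF.
abstract class TypeNode { }

sealed class NamedType : TypeNode
{
    public string Name = "";
    public Dictionary<string, TypeNode> Fields = new Dictionary<string, TypeNode>();
}

sealed class ArrayOf : TypeNode
{
    public TypeNode Element;
    public ArrayOf(TypeNode element) { Element = element; }
}

// Stands in for a type that was still being built when it was referenced.
sealed class Placeholder : TypeNode
{
    public string Target = "";
}

static class PlaceholderFixup
{
    // Replaces every Placeholder reachable from 'root' with the finished type
    // registered under the placeholder's target name.
    public static void Resolve(TypeNode root, IReadOnlyDictionary<string, NamedType> finished)
    {
        Visit(root, finished, new HashSet<TypeNode>());
    }

    private static TypeNode Visit(TypeNode node, IReadOnlyDictionary<string, NamedType> finished, HashSet<TypeNode> seen)
    {
        // Swap a placeholder for the real, finished type.
        if (node is Placeholder p) node = finished[p.Target];

        // Each node is patched once; this also stops the walk on cycles.
        if (!seen.Add(node)) return node;

        if (node is ArrayOf arr)
        {
            arr.Element = Visit(arr.Element, finished, seen);
        }
        else if (node is NamedType named)
        {
            // Copy the keys so the dictionary is not modified while being enumerated.
            foreach (var key in new List<string>(named.Fields.Keys))
            {
                named.Fields[key] = Visit(named.Fields[key], finished, seen);
            }
        }

        return node;
    }
}
```

Calling `PlaceholderFixup.Resolve` on a `NamedType` whose `Process` field is an `ArrayOf` wrapping a `Placeholder` for that same type leaves the array element pointing at the type itself. The trade-off is an extra pass over every generated type, and every place that can hold a type reference has to be visited.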

@vpenades vpenades added the bug Something isn't working label Jun 5, 2024
@vpenades
Owner

vpenades commented Jun 8, 2024

I've found this: https://stackoverflow.com/questions/35250621/recursive-self-referencing-json-schema

Not sure if it's relevant to the issue
