-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Warn about Multiple Child Elements with Same Name #1321
base: main
Are you sure you want to change the base?
Warn about Multiple Child Elements with Same Name #1321
Conversation
) | ||
} | ||
|
||
private lazy val checkMembersHaveUniqueNamesInNamespaces: Map[Boolean, Seq[Term]] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might suggest a different name for this since it isn't really checking anything, it's just grouping things.
|
||
private lazy val checkIfMultipleChildrenWithSameName: Unit = { | ||
val nonUniqueNameChildren = | ||
checkMembersHaveUniqueNamesInNamespaces.filter(_._1 == true).values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this needs a different grouping than with the unorderd sequences?
The goal of this check is to warn in case the schema is used with an infoset that don't support duplicate fields, and those kinds of infosets probably also don't support namespaces. So we probably want to warn even if they have different namespaces, which means we need to group by name only and not by QName.
Also, we kindof already have a similar check with WarnID.NamespaceDifferencesOnly
. I wonder if that check should be removed in place of this one, since this new check is really just a more general version of that check? The existing one is about same-name elements next to each other, this one is about same-name elements that are siblings. Anything that this new check warns about will also warn for the existing check.
checkMembersHaveUniqueNamesInNamespaces.filter(_._1 == true).values | ||
nonUniqueNameChildren.foreach { children => | ||
children.head.SDW( | ||
WarnID.MultipleChildElementsWithSameName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this wants to be WarnID.NamespaceDifferencesOnly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I prefer the idea of MultipleChildElementsWithSameName subsuming NamespaceDifferencesOnly instead of the other way around, because in sequence group, the namespaces could be different or the same with the same name and it should still trigger MultipleChildElementsWithSameName. Actually I'm wondering if we ougt to leave NamespaceDifferenceOnly alone since it's very clear that it's for the case where the name is the same and the namespace is different, and MultipleChildElementsWithSameName is for the case where the name is the same regardless of whether the namespace is different or the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there are kindof few different cases that are are being considered for warnings here:
- Infoset representations that do not support namespaces. We warn here if immediate siblings have the same name (but different namespaces). This is the existing NamespaceDifferencesOnly check.
- Infoset representations that do not support siblings with the same name (but do support namespaces). We warn here if any siblings have the same QName.
- Infoset representations that do not support namespaces OR siblings with the same name. We warn here if any siblings have the same name (but different namespaces).
I think we definitely need to warn for 3 (infosets that support neither namespaces nor same sibling names are common). If we implement 3, that will automatically match 1 and 2. If we have separate checks for 1, 2, and 3, we would end up with duplicate warnings. For example, something that warns for 3, will also warn for 2 and maybe 1 if the siblings are adjacent. Maybe this is fine, which would allow you to suppress 3 warnings but not 1, for example. Or maybe we only warn for 3, and the warning message can be descriptive enough to mention the different implications depending on what an infoset does and does no support?
Agreed NamespaceDifferencesOnly probably isn't the right id for this warning. There are a number of other kinds of things we'll eventually want to warn about that are all about related to infosets that don't support certain things, for example anonymous choices, nested choices, nullable elements. Do we want to eventually plan to warn for all of these kinds of tings under a single warn ID (e.g. LimitedInfosetRepresentation)? Or do we want specific IDs for each type of limitation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Examples just to make sure im following
val type1 =
<xs:element>
<xs:complexType>
<xs:sequence>
<xs:element name="a" type="xs:string"/>
<xs:element ref="u:a"/>
<xs:element name="b" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
val type2 = {
<xs:element>
<xs:complexType>
<xs:sequence>
<xs:element name="a" type="xs:string"/>
<xs:element name="b" type="xs:string"/>
<xs:choice>
<xs:element ref="tns:a"/>
<xs:element name="a" type="xs:string"/>
</xs:choice>
</xs:sequence>
</xs:complexType>
</xs:element>
}
val type3 = {
<xs:element>
<xs:complexType>
<xs:sequence>
<xs:element name="a" type="xs:string"/>
<xs:element name="b" type="xs:string"/>
<xs:element ref="u:a"/>
</xs:sequence>
</xs:complexType>
</xs:element>
}
I think the existing checkIfMultipleChildrenWithSameName takes care of 2 and 3. 2 will warn if they have the same namespace and 3 will warn if they have different namespaces. I don't think we're supposed to be checking for what the current infoset supports so I think we can use the same warn id for them. I am inclined to leave 1 with the warnId NamespaceDifferencesOnly rather than deprecating it since it's a well established warning. It might be confusing for the users to suddenly change the warning?
I'm inclined to have separate warnIds for different limitations. @mbeckerle thoughts?
Assert.invariant(gm.isInstanceOf[ElementBase]) | ||
gm.asInstanceOf[ElementBase].namedQName | ||
private lazy val checkUnorderedSequenceMembersHaveUniqueNamesInNamespaces: Unit = { | ||
groupedMembersWithSameName().values.foreach { children => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just from the name groupMembersWithSameName
, it's not clear that this includes the namespace. If I just read this I think I would assume it only looks at the name. Maybe adding includesNamespae = true
just to make it explicit. Or maybe alternatively name it groupMembersWithSameQName
, and change the parameter to excludeNamespace
which defaults to false, and the other use of this would be groupmembersWithSameQName(excludeNamespace = true)
.
Another option, you could do something like
def groupMembersBy[T](func: (ElementBase) => T): Map[T, Seq[Term]] = {
...
.groupBy { func(gm.asIsntanceOf[ElementBase]) }
}
And then the usage could be groupMembersBy(_.namedQName)
and groupMembersBy(_.name)
. This makes it much more clear what your grouping by, and could be used for other things in the future.
includeNamespace: Boolean = true | ||
): Map[Boolean, Seq[Term]] = { | ||
val childrenGroupedByName = groupMembers | ||
.filter { m => m.isInstanceOf[LocalElementDecl] || m.isInstanceOf[ElementRef] } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This filter is new from the old implementation. I think it's fine for the unordered sequences check since unordered sequences only allow local element decls or refs. But I don't think it works for the checkIfMultipleChildrenWithSameName
check. Normal ordered sequences are allowed to have things other than elements (e.g. choice, group, sequence), and their immediate children need to be considered for the multiple children with same name check, since they all end up as siblings in the infoset of local element decl's and references and cause potential ambiguities for some infoset types.
I think we really need to consider all children regardless of model groups. We probably already have a member for this, we have lots of members about siblings and children. Maybe elementChildren
is the right thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think elementChildren will work for the generic groupMembersBy since we do checkMembersAreAllElementOrElementRef before checkUnorderedSequenceMembersHaveUniqueNamesInNamespaces, so we won't get to the latter if the former fails. So I removed the filter and changed groupMembers to elementChildren
checkMembersHaveUniqueNamesInNamespaces.filter(_._1 == true).values | ||
nonUniqueNameChildren.foreach { children => | ||
children.head.SDW( | ||
WarnID.MultipleChildElementsWithSameName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there are kindof few different cases that are are being considered for warnings here:
- Infoset representations that do not support namespaces. We warn here if immediate siblings have the same name (but different namespaces). This is the existing NamespaceDifferencesOnly check.
- Infoset representations that do not support siblings with the same name (but do support namespaces). We warn here if any siblings have the same QName.
- Infoset representations that do not support namespaces OR siblings with the same name. We warn here if any siblings have the same name (but different namespaces).
I think we definitely need to warn for 3 (infosets that support neither namespaces nor same sibling names are common). If we implement 3, that will automatically match 1 and 2. If we have separate checks for 1, 2, and 3, we would end up with duplicate warnings. For example, something that warns for 3, will also warn for 2 and maybe 1 if the siblings are adjacent. Maybe this is fine, which would allow you to suppress 3 warnings but not 1, for example. Or maybe we only warn for 3, and the warning message can be descriptive enough to mention the different implications depending on what an infoset does and does no support?
Agreed NamespaceDifferencesOnly probably isn't the right id for this warning. There are a number of other kinds of things we'll eventually want to warn about that are all about related to infosets that don't support certain things, for example anonymous choices, nested choices, nullable elements. Do we want to eventually plan to warn for all of these kinds of tings under a single warn ID (e.g. LimitedInfosetRepresentation)? Or do we want specific IDs for each type of limitation?
2b01108
to
7c4a219
Compare
- currently when we have multiple child elements in a sequence with the same name, daffodil allows it, while choice branches return a UPA error and unordered sequences return an SDE. We want to generate a warning when this happens in an ordered sequence so we refactor code to warn if the local names are the same. This has the effect of warning whether namespaces are supported or unsupported - we group using elementChildren instead of groupMembers since we want to consider all children since regardless of the model groups, their immediate children end up as siblings in the infoset and can cause potential ambiguities for some infoset types - add test to check we get UPA error with choices - add tests to exercise new warning (same/different namespaces, but same local name) DAFFODIL-2736
7c4a219
to
97e98bc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 👍 just a minor comment
private lazy val checkUnorderedSequenceMembersHaveUniqueNamesInNamespaces: Unit = { | ||
// previous checks should ensure that all group members for unordered sequences are either local | ||
// elements or element references | ||
groupChildrenBy(_.namedQName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forget why we added the groupChildrenBy
function, but it only replaces elementChildren.groupBy
. The former isn't significantly shorter and I don't think it is any more clear. Is it worth removing groupChildrenBy
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It used to do alot more than elementChildren.groupBy, but I moved out the filter stuff, since that wasn't related to grouping
- remove groupChildrenBy DAFFODIL-2736
DAFFODIL-2736