Support for secondary edges #6

Open
amir-zeldes opened this issue Jan 10, 2023 · 5 comments

@amir-zeldes
Collaborator

The latest rstpp format supports secondary, treebreaking edges, which rstWeb represents as follows. A minimal example (not necessarily linguistically correct!):

<rst>
	<header>
		<relations>
			<rel name="cause" type="rst"/>
			<rel name="evaluation" type="rst"/>
		</relations>
		<sigtypes>
			<sig type="DM" subtypes="DM"/>
			<sig type="Reference" subtypes="Comparative reference;Demonstrative reference;Personal reference;Propositional reference"/>
		</sigtypes>
	</header>
	<body>
		<segment id="1" parent="4" relname="span">Lin wants to go</segment>
		<segment id="2" parent="3" relname="cause">because Beth knows Kim -</segment>
		<segment id="3" parent="6" relname="span">this won’t happen</segment>
		<group id="4" type="span" />
		<group id="6" type="span" parent="1" relname="evaluation"/>
		<secedges>
			<secedge id="2-1" source="2" target="1" relname="cause"/>
		</secedges>
		<signals>
			<signal source="6" type="Reference" subtype="Demonstrative reference" tokens="10"/>
			<signal source="2-1" type="DM" subtype="DM" tokens="5"/>
		</signals>
	</body>
</rst>


Note that secondary edges support signals just like regular edges. A signal on a secondary edge sets its 'source' field to the edge's ID in 'src-trg' syntax (2-1 in the example above), which is the same value used for the id of the <secedge> element. Since these structures are treebreaking and can form cycles, I propose using pointing relations to represent them on top of the existing dominance edge tree, and using a second pointing relation type if cycles occur. Cycles should be very rare, if they are attested at all, but the importer should ideally be able to handle them by checking for cycles before generating Salt.
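To make the cycle check concrete, here is a minimal sketch in Python (not the importer's actual code, which targets Salt in Java). It reads the <secedge> elements, assigns each one to a first pointing relation type unless adding it would close a cycle among the secondary edges already accepted, and lists which signals point at a secondary edge. The type names `rst_sec` and `rst_sec_cyclic`, the function names, and the second <secedge> in the sample data are all hypothetical and only there for illustration. The sketch also assumes that only secondary edges, not the primary tree, matter for the cycle check.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example, plus one HYPOTHETICAL edge (1-2) that closes a cycle.
RS4 = """<rst>
  <body>
    <segment id="1" parent="4" relname="span">Lin wants to go</segment>
    <segment id="2" parent="3" relname="cause">because Beth knows Kim -</segment>
    <segment id="3" parent="6" relname="span">this won't happen</segment>
    <group id="4" type="span"/>
    <group id="6" type="span" parent="1" relname="evaluation"/>
    <secedges>
      <secedge id="2-1" source="2" target="1" relname="cause"/>
      <!-- hypothetical edge, not in the original example -->
      <secedge id="1-2" source="1" target="2" relname="evaluation"/>
    </secedges>
    <signals>
      <signal source="6" type="Reference" subtype="Demonstrative reference" tokens="10"/>
      <signal source="2-1" type="DM" subtype="DM" tokens="5"/>
    </signals>
  </body>
</rst>"""


def reaches(graph, start, goal):
    """Depth-first search: is `goal` reachable from `start` along `graph` edges?"""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False


def assign_edge_types(root):
    """Map each secedge id to a pointing relation type (type names are made up)."""
    graph, assigned = {}, {}
    for sec in root.iter("secedge"):
        src, trg = sec.get("source"), sec.get("target")
        if reaches(graph, trg, src):
            # Adding src -> trg would close a cycle: use the second type instead.
            assigned[sec.get("id")] = "rst_sec_cyclic"
        else:
            graph.setdefault(src, []).append(trg)
            assigned[sec.get("id")] = "rst_sec"
    return assigned


def secedge_signal_sources(root):
    """Return the 'source' values of signals that refer to a secedge id."""
    sec_ids = {sec.get("id") for sec in root.iter("secedge")}
    return [sig.get("source") for sig in root.iter("signal")
            if sig.get("source") in sec_ids]


root = ET.fromstring(RS4)
print(assign_edge_types(root))
print(secedge_signal_sources(root))
```

Checking signal sources against the set of secedge ids, rather than just looking for a hyphen, avoids guessing about what other ID shapes the format may allow.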

@lgessler
Collaborator

This new structure is a little awkward to handle with Salt. It would be most straightforward to treat a <secedge /> as an SPointingRelation, but a <signal /> needs to reference a <secedge /> in its source attribute, and all SRelations are constrained to hold between descendants of SNode, which an SPointingRelation is not. A signal therefore cannot be attached directly to a pointing relation.

It's probably best to do the following:

  1. Represent each <secedge /> as an SStructure with an SPointingRelation on either side of it: in the example above, there would be one pointing relation inbound from the source (2) to this node (2-1), and another outbound from this node (2-1) to its target (1).
  2. Signal handling then stays the same as usual: make an SStructure for the signal and have it dominate the node the signal refers to (2-1).
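The two steps above can be sketched as a plain data structure. This is a toy model in Python, not Salt's API: `Edge`, `reify_secedge`, and the signal id `sig_dm` are hypothetical names, and putting the relation name on both pointing relations is an arbitrary choice that the proposal does not specify.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    kind: str          # "pointing" or "dominance"
    source: str
    target: str
    relname: str = ""


def reify_secedge(source, target, relname, signal_ids=()):
    """Turn one secedge into an intermediate node plus two pointing relations."""
    mid = f"{source}-{target}"          # the extra node, e.g. "2-1"
    nodes = [source, mid, target]
    edges = [
        Edge("pointing", source, mid, relname),   # inbound from the source
        Edge("pointing", mid, target, relname),   # outbound to the target
    ]
    for sig in signal_ids:
        nodes.append(sig)
        edges.append(Edge("dominance", sig, mid))  # signal dominates the extra node
    return nodes, edges


nodes, edges = reify_secedge("2", "1", "cause", signal_ids=["sig_dm"])
for edge in edges:
    print(edge)
```

Without any signals, this still yields three nodes and two pointing relations for what is conceptually a single edge between two EDUs.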

@lgessler lgessler mentioned this issue Jan 12, 2023
@amir-zeldes
Collaborator Author

Hm, I see the problem. But wouldn't that mean that a query for a simple secedge between two EDUs would end up requiring two relation edges and three nodes? Or what would AQL look like for finding (2) ->cause (1)?

@lgessler
Collaborator

lgessler commented Jan 12, 2023

Yeah, I think that's right. I don't see a viable workaround, though.

The only other solution I can think of that keeps a single SPointingRelation per <secedge /> is to link each <signal /> to the <secedge /> named in its source not by means of an SRelation, but by storing an ID that identifies the <secedge /> as an annotation on the signal. To query this, you would then need to assert equality between this annotation and the annotation on the secedge's SPointingRelation. This is probably even more awkward, right?

@amir-zeldes
Collaborator Author

I agree it's awkward, but I like option 1 even less, especially considering that some people might implement secedges without using signals, in which case the very existence of the SStructure would be unmotivated and bewildering to them. I think secedges are so intuitively single-edge-like that we have to keep that isomorphism somehow.

How about this third option: we use a single pointing relation (PR) between the source and target node, but attach the signal node to both the source and the target node to indicate that it 'belongs' to that edge? We could even give it an attribute that marks it as a secedge signal, so it's not confused with a primary edge signal. This would lead to queries like this:

  • edu_num="2" ->rst[relname="cause"] edu_num="1" (plain secedge query)

Then for a secedge signal by itself you could do:

  • signal_kind="sec" _=_ signal_type="dm" >* lemma="because"

And now the awkward part with both:

edu_num="2" ->rst[relname="cause"] edu_num="1" &
signal_kind="sec" _=_ signal_type="dm" >* lemma="because" & 
#3 > #1 & #3 > #2

The last line ensures that the signal node (#3) 'straddles' both the source and the target by dominating each of them. Not fun, but maybe better than options 1 and 2? And for people not using signals, or using signals but not secedges, maximum simplicity is retained.
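For comparison, the third option can be sketched in the same toy style. This is Python, not Salt's API and not the code in the referenced commit: `Edge`, `encode_secedge`, `straddling_signals`, and the signal id `sig1` are hypothetical names. The annotation names `signal_kind` and `signal_type` are taken from the queries above. The dominance edges from the signal to its tokens are left out for brevity.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    kind: str          # "pointing" or "dominance"
    source: str
    target: str
    relname: str = ""


def encode_secedge(source, target, relname, signals=()):
    """One pointing relation per secedge; each signal dominates both ends."""
    edges = [Edge("pointing", source, target, relname)]
    annotations = {}
    for sig_id, sig_type in signals:
        annotations[sig_id] = {"signal_kind": "sec", "signal_type": sig_type}
        edges.append(Edge("dominance", sig_id, source))
        edges.append(Edge("dominance", sig_id, target))
    return edges, annotations


def straddling_signals(edges, annotations, source, target):
    """Mirror of '#3 > #1 & #3 > #2': sec signals dominating source and target."""
    dominated = {}
    for edge in edges:
        if edge.kind == "dominance":
            dominated.setdefault(edge.source, set()).add(edge.target)
    return [sig for sig, anno in annotations.items()
            if anno.get("signal_kind") == "sec"
            and {source, target} <= dominated.get(sig, set())]


edges, annotations = encode_secedge("2", "1", "cause", signals=[("sig1", "dm")])
print([e for e in edges if e.kind == "pointing"])
print(straddling_signals(edges, annotations, "2", "1"))
```

With no signals, `encode_secedge` returns exactly one pointing relation and nothing else, which is the simplicity this option is meant to preserve.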

lgessler added a commit to lgessler/pepperModules-RSTModules that referenced this issue Jan 12, 2023
@lgessler
Collaborator

OK, implemented that
