New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Draft: Implementation for LibraryElementHasher #597

Open

FRoith wants to merge 1 commit into eclipse-4diac:develop from FRoith:hashing

+696 −1

FRoith commented Oct 7, 2024

This change will add hashing functionality to the IDE, and automatically adds computed hash values to the output of the Forte NG exporter.


          Initial working LibraryElementHasher, output still in debug mode

71e3705

github-actions bot commented Oct 13, 2024

Test Results

100 files - 10 100 suites - 10 43s ⏱️ -5s
27 474 tests - 57 27 474 ✅ - 57 0 💤 ±0 0 ❌ ±0
27 475 runs - 57 27 475 ✅ - 57 0 💤 ±0 0 ❌ ±0

Results for commit 71e3705. ± Comparison against base commit 987566a.

This pull request removes 57 tests.

org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testAdd
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testAnd
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testCalc
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testDivInt
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testDivReal
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testEq
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testGt
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testLe
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testMod
org.eclipse.fordiac.ide.test.export.forte_lua.st.ForteLuaStOperatorTest ‑ testMul
…

azoitl requested changes

View reviewed changes

Contributor

azoitl left a comment •

edited

Loading

The IHasherInterface, the extension point, and the helper for getting a hasher should be moved to the model plugin, best to the typelibrary. Because we will needed it for different use cases. I would even expect that the TypeEntry will cash the hash values as they are rather expensive to create.

Please also add the copyright headers to the new files. This is the reason the build didn't run through.

....export.forte_ng.st/src/org/eclipse/fordiac/ide/export/forte_ng/st/LibraryElementHasher.java

+              		if (il.getVisibleOutputVars().size() > 0) {
+              			parts.add("DATAOUTPUTS{"); //$NON-NLS-1$
+              			for (final VarDeclaration vdec : il.getVisibleOutputVars()) { // TODO getVisibleOutputVars == DataOutputs?
+              				parts.add(String.format("%s:%s", vdec.getName(), vdec.getTypeName())); //$NON-NLS-1$

Contributor

azoitl Oct 13, 2024

Why are you not using your vardecl method here?

....export.forte_ng.st/src/org/eclipse/fordiac/ide/export/forte_ng/st/LibraryElementHasher.java

+              		if (il.getVisibleInputVars().size() > 0) {
+              			parts.add("DATAINPUTS{"); //$NON-NLS-1$
+              			for (final VarDeclaration vdec : il.getVisibleInputVars()) { // TODO VisibleInputVars == DataInputs?
+              				parts.add(String.format("%s:%s", vdec.getName(), vdec.getTypeName())); //$NON-NLS-1$

Contributor

azoitl Oct 13, 2024

Why are you not using your vardecl method here?

....export.forte_ng.st/src/org/eclipse/fordiac/ide/export/forte_ng/st/LibraryElementHasher.java

+              		if (il.getEventOutputs().size() > 0) {
+              			parts.add("OUTPUTEVENTS{"); //$NON-NLS-1$
+              			for (final Event event : il.getEventOutputs()) {
+              				parts.add(String.format("%s:%s", event.getName(), event.getTypeName())); //$NON-NLS-1$

Contributor

azoitl Oct 13, 2024

Why are you not using your event method here?

....export.forte_ng.st/src/org/eclipse/fordiac/ide/export/forte_ng/st/LibraryElementHasher.java

+              		if (il.getEventInputs().size() > 0) {
+              			parts.add("INPUTEVENTS{"); //$NON-NLS-1$
+              			for (final Event event : il.getEventInputs()) {
+              				parts.add(String.format("%s:%s", event.getName(), event.getTypeName())); //$NON-NLS-1$

Contributor

azoitl Oct 13, 2024

Why are you not using your event method here?

....export.forte_ng.st/src/org/eclipse/fordiac/ide/export/forte_ng/st/LibraryElementHasher.java

+              	private void process(final VarDeclaration vdec, final List<String> parts, final String version) {
+              		// TODO handle LocalVariable or MemberVarDeclaration
+              		parts.add(String.format("VDEC:%s:%s", vdec.getName(), //$NON-NLS-1$

Contributor

azoitl Oct 13, 2024

data type name is missing

....export.forte_ng.st/src/org/eclipse/fordiac/ide/export/forte_ng/st/LibraryElementHasher.java

+              		parts.add(String.format("ECC{")); //$NON-NLS-1$
+              		parts.add(String.format("ECStates{")); //$NON-NLS-1$
+              		for (final ECState ecs : ecc.getECState()) {
+              			parts.add(ecs.getName());

Contributor

azoitl Oct 13, 2024

actions are missing

azoitl requested a review from ernstblechaPT

October 13, 2024 14:46

mx990 self-requested a review

October 13, 2024 15:50

Member

mx990 commented Oct 13, 2024

Aside from what @azoitl has already mentioned, I have a few general remarks regarding your current approach:

You are hashing ST algorithms and methods by generating C++ code and using that for your hash. This means your hash:
- is specific to the forte_ng export,
- is sensitive to any future changes in the exported C++ code,
- is sensitive to values of global constants in array ranges, which also makes it difficult to cache,
- requires you to fully parse the ST code and generate C++ code, which makes it particularly expensive.
I believe your hash is also currently sensitive to any added or removed lines in the ST code, since the generated C++ code contains #line directives (with the line number from the ST code). You would need to suppress them, for example, by overriding generateLineDirective(EObject) to return an empty string. Of course, that is easily done in this specific case, but something similar may come up again in the future and break your hash.
You are explicitly iterating over all relevant model elements for your hash. I believe this is quite fragile and error-prone, since this makes it easy to accidentially omit relevant information. This is particularly problematic when someone else may extend the model in the future, without realising the need to extend the hash alongside it.

I would thus like to propose a different approach:

use an XMI or binary EMF resource to generate the input for hashing,
use the plain ST code, stripped of any extra whitespace, pragmas, and comments, for the hash,
use meta-model annotations to specify features to be excluded from the hash.

I believe this has the following advantages over the current approach:

all features are included in the hash by default,
based on a known format instead of running our own,
exceptions can be specified declaratively on the meta-model,
follows any future changes to the model without significant additional work,
does not require parsing of the ST code.

Contributor

azoitl commented Oct 14, 2024

@mx990 how is your xmi based approach not breaking any existing XMI export?

Member

mx990 commented Oct 15, 2024

@mx990 how is your xmi based approach not breaking any existing XMI export?

I believe the hashing implementation should be entirely independent of the XMI export, although it would certainly be quite similar in how it works. I would propose the following:

create custom annotations for the EMF meta-model [1] to:
- exclude a feature from hashing (e.g., for comments, etc.),
- specify a class for pre-processing elements (e.g., for ST algorithms to strip comments, etc.)
create a custom EcoreUtil.Copier that:
- skips excluded features,
- calls the pre-processor on copied elements if specified [2],
create a generic hasher that works similar to the XMI export [3]:
- copy the type (with the custom Copier),
- add the result into a plain org.eclipse.emf.ecore.xmi.impl.XMIResourceImpl,
- save the resource into an appropriate OutputStream,
- use the result for hashing.

[1] see org.eclipse.fordiac.ide.model.emf.FordiacMetaData
[2] see also org.eclipse.fordiac.ide.structuredtextcore.resource.LibraryElementXtextResource.ShallowCopier
[3] see org.eclipse.fordiac.ide.export.xmi.XMIExportFilter

Author

FRoith commented Oct 16, 2024

I do really like the approach, I'll look into implementing it that way, and see if I hit any snags on the way. The only potential issue I see, is that I don't yet have an idea of how I would introduce detecting "old versions" of the hash with this approach, as that was one of the requested features, but I'll see what I can find.

Contributor

oberlehner commented Oct 22, 2024

Please add unit tests.
This is really important, because we want to detect early if any implementation inside the IDE (e.g change in the parser) might potentially changes the hash values.

So please add test cases for every type that can hashed:
-SubappType
-SimpleType
-CFB
-SIFB

Are you hashing Structured Data Type ? i cannot find that in the code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet