Feature/faster python build #91

staleyLANL · 2023-03-20T21:11:53Z

This change may raise some eyebrows. It arguably thumbs its nose at the normal build-system concept of fine-scale recompilation: compile only the individual translation units that have changed since the last build. However, I believe the new system is much better, given the circumstances. (More below.)

The old system is still available with: make -j <threads> GNDStk.python.old.

For the new system, just use this: make GNDStk.python.

No -j is necessary for the new system. It doesn't help.

Timings on the author's office machine, with make -j 24:

 * old: 735.93 seconds.
 * new: 221.25 seconds.

Additional benefits:

 * Less memory, overall, appears to be used.
 * Only one thread is needed. (This also frees up memory.)

Moreover, the resulting .so file is less than half of its original size! On the author's office machine:

 * old size: 221,451,008 bytes.
 * new size: 106,036,816 bytes.

Why does the size improve so much? My guess is that this relates, in part, to lambdas. Consider that two classes, X and Y, may both contain an object of class Foo. Foo's functionality may use lambdas. Internally, compilers handle lambdas by generating anonymous classes with call operators. Compile files individually,and multiple anonymous classes (and thus operators) for Foo may be created. As they're anonymous, we won't get multiply-defined symbol errors when linking. But, the same thing may be repeated in multiple files, whereas more efficiency can be obtained with the new system. (This theory about lambdas being the cause may or may not be correct. Regardless, the new .so file's size is much smaller, at least on my machine.)

So, what's the new system? Well, we made a file,

 python/src/all.python.cpp

that includes all of the *.python.cpp files - a total of 280 files - that were previously compiled individually. Stuffing these together is fine, from a technical standpoint, because of course we have macro guards on all the header files that those source files include, and we don't have file-scope statics or other things that might cause issues.

This process - stuffing C++ files together - flies in the face of the normal build concept, but (if the above stats aren't convincing enough), we believe it may be the best approach here.

 * Users are likely to download GNDStk, compile once, and then not bother
   doing so again until a substantial new release comes along. Enough
   things are likely to change, between substantial releases, that most
   individual files - not just a few - may have changes, and thus need
   recompilation, anyway.

 * Users may now see, for instance, 6 minutes instead of 20. Yay!

 * Most `*.python.cpp` files are autogenerated. If one such files changes,
   it's likely that most others have changed similarly, due to some change
   in the code generator's behavior. In this case, there's no big benefit
   to a fine-scale recompilation.

The big issue is this: GNDStk is a large header-only library, and its header is included in all of the *.python.cpp files. The author's long experience with such situations has been that the processing of the header (including what it additionally includes, such as the JSON library) - translating from code to an AST (abstract syntax tree) - eats up most of the compilation time. Basically, in our old system here, the same processing was being done 280 times, each time followed by typically just a page or two of code in the *.python.cpp file itself! So, a fine-scale-compilation system is dominated by overhead. The same overhead, 280 times!! :-( Stuff the files together, and the header processing happens only once.

I'd suggest that in all but rare (and probably contrived) circumstances, the new system will be faster than the old system, and also use less memory. And, it'll always need only one thread, and give a smaller-sized .so file.

Note: the code generator can be modified so that it automatically puts together the individual source files that currently appear in python/src/all.python.cpp, except for the non-autogenerated ones at the top. In this case, we could then have python/src/all.python.cpp include the non-autogenerated ones, as they do now, plus a single extra file. At the moment, I haven't modified the code generator to do this.

An earlier branch introduced the ability to write HDF5 files in different ways, according to two boolean flags: *reduced* and *typed*. This branch enhances our JSON capabilities in the same manner, reflecting, with JSON, an analog of the options we have for HDF5. The above was done in respect to GNDStk's general philosophy of making its support of different file types as consistent with one another as reasonably possible. Similarly to HDF5, the JSON class now has two static booleans: "reduced" and "typed". The "reduced" flag tells the JSON writer whether or not to reduce -- basically, to fold into a shorter and simpler representation -- certain special constructs that GNDStk's Node uses internally in order to handle what are called cdata, pcdata, and comment nodes in XML. Basically, if reduced == true, then we make this simplification, which shortens the JSON output. If reduced == false, then the JSON will closely reflect what Node uses internally. The "typed" flag tells the JSON writer whether or not it should use our type-guessing code to guess whether "strings" in certain contexts (in particular, metadata values and pcdata) actually contain what look like numbers. Examples: "12" looks like an int, "321 476" looks like a vector of ints, "3.14" looks like a double, and "3.14 2.72 1.41" looks like a vector of doubles. If typed == false, we use the original strings, with no type guessing. If typed == true, we apply the type guesser, and, where appropriate, get JSON numbers in place of strings in the JSON output. This is a work in progress, with a couple of things that still need doing... Still to do: modify the JSON *reading* code so that it recognizes the various different ways that the JSON *writing* code writes things, and can reliably reverse the writing process and recover a GNDStk Node. Also still to do: at present, the nlohmann JSON library, which we use under the hood, doesn't provide a way to write JSON numbers (unquoted, as opposed to quoted JSON strings) beginning from an existing string representation that we might provide for the number. Consider this discussion that I started: nlohmann/json#3460 This is relevant to us because GNDStk provides very fine control over exactly how numbers, in particular floating-point numbers, are formatted, in terms of the number of significant digits, fixed vs. scientific form, etc. At the time of this writing, if we use typed == true, so that certain strings that look like numbers are written as numbers, the numbers will, unfortunately, be formatted by the JSON library itself. We'd like to have the capability of writing an original string (one that we've already determined looks like a number!) - but writing it as a JSON number (so, without quotes) rather than a JSON string. In this commit, we also Modified an old Tree/ test that depended on the previous default (but still available, via appropriate flags, even though no longer the default) JSON writing behavior. And, we changed a variable name in json2class.cpp. We refactored detail-node2hdf5.hpp (for HDF5) considerably, and then reflected this in the modified detail-node2json.hpp (for JSON) that we're primarily working on. And, we've enhanced the JSON test code so that it tries out the new flags. This is a work in progress as well. In order to finish it, we need to finish the ability to read JSON files that were written not just in our original manner, but with any variations of the above-described flags.

Also some small tweaks and comment changes here and there.

Small improvements.

…into feature/better-json

Some tweaks to HDF5-related code. Extended and improved some comments, to clarify things.

Additional tests for those, too.

There are still some instances of the strings in question, typically appearing amongst other text inside of string literals. At some point I'll look into fixing up those cases too, but that's not a priority right now.

Updated the HDF5 test suite to ensure that we test the modifications.

We'll update the code generator later, and ensure that it does the right thing. More feature-packed classes Field and FieldPart.

I'm not sure why I made them be converting constructors to begin with. Whatever the reason, it seems (according to the test suite!) that they don't need to be converting constructors. And, I don't think that they should be. Indeed, some new code modifications uncovered this issue, as converting constructors for Node triggered certain ambiguities.

Code generator itself still needs updating.

…in name isn't known. This was important for the new generated-class Field system to work well, without carrying along a std::string (expensive!) in each class field. On the other hand, such a thing wouldn't necessarily be a memory hog in a well-designed library that mostly contained vectors and such - not a huge number of individual scalar fields.

Improved some SFINAE. Added more comments, and did a bit of rearrangement.

It now generates classes that use the new system.

into feature/component-field

… feature/component-field

Note that it doesn't yet add shortcuts to generated classes; that'll come soon. Changed the way a map was done. Faster, and leads to better printing of shortcut information.

Some refactoring related to the above. Miscellaneous cosmetic and clarity-related changes.

Updated various test codes.

Some small printing-related changes and fixes as well.

… offers. There's lots more to be done in the above respect; this is just a start. Also, provided some preliminary support for another approach to customization. The original (and still in place) "customization" system involves inserting code from custom.hpp files directly into generated classes. Inserting additional bits and pieces of code into already-existing classes is actually a somewhat goofy concept. (Imagine someone customizing, say, std::complex by adding a new piece of information. They really should derive from it, or contain it in another class. Not try to mess with the existing class.) I'm fleshing out *one* new way, at least (with possibly others to come) for doing customizations. There's more to be done, but I'll work with a particular user right now, to see how this works. JSON specs now allow metadata to have converters. It turns out that this will help with the new customization business. Briefly: we want an ability to insert our own fields into classes (say, to contain data that we'll compute as functions of other fields), but without those new fields playing a role in I/O from/to files. A new (and simple) "Noop" (no-op[eration]) class fills the necessary role of representing the "no conversion is taking place" for I/O on the new fields. Some new if-constexprs, elsewhere in the code, allow convert() functions to return a bool (but they can still return void, as before). If a convert() returns bool, and is used in the context of Node add() metadatum or child node, then a return value of false means, "don't actually add the new metadatum or children." This is all in support of our new customization approach, because we don't necessarily want auxiliary computed fields to be written to files. (But still want them in the multi-query, so they work with print() and such.) Tweaked some SFINAE. This detail:: stuff really ought to be made more internally consistent, even though it's in detail::. Note: the _v and _t business is modeled after what's in std::. (Examples: enable_if_t, is_convertible_v.) Adjusted some comments to reflect recent changes.

…terface. Some code cleanup and minor rearrangements. Lots more to come in the Python interface, as time allows. The next thing will probably be access to our extensive capabilities for controlling formatting, in particular as it relates to floating-point precision.

staleyLANL · 2023-03-23T01:23:46Z

Remark. The CI is still failing. With the new system, we could see if a plain make, rather than a make -j 2, succeeds. Only one thread is necessary for GNDStk.python. It may or may not use less memory than the -j 2 case, though.

Another remark. Other NJOY codes that are similar in certain respects (have python bindings, binding files unlikely to change on a fine scale, many such files, each including a relatively long and complex header for which processing overhead is large) may show similar build-time improvements if we do something similar to what I did here. We'd just need to try, in order to know. Aside from build-time improvements, other improvements may include needing only one thread, using less memory, and producing substantially shorter archive files.

The code generator builds C++ code, a C-language interface, and Python bindings. Prior to this change, the code that generated the C interface handled only the old "BlockData" manner of handling data nodes. That was a construct that handled the "valueType"-based "run-time defined data type" business that exists in some GNDS nodes. But our format for format specs allows for a fixed data type as well, and it's handled in a different way. For a user who has been using GNDStk's code generator to generate an entirely different format, I needed to update the generation of the C interface in order to handle this properly.

These are largely in preparation for some upcoming, more-substantive changes; ones that should somewhat improve and clarify the interface.

Some work on Component::has() and Component::operator(). Small change: class to struct in a few places.

Additional work regarding Lookup objects, Field, Component, etc.

Cleaned up some SFINAE, and removed some that was no longer needed. A bit of formatting work in some CMakeLists.txt files, just to make them arguably more readable.

As with: https://stackoverflow.com/questions/57770972

- Clarified "auto" in some places. - Ditto for constness.

See: https://stackoverflow.com/questions/42797279

Minor changes in some SFINAE to more-closely resemble certain std:: constructs. Sfome SFINAE sfimplifications. Removed unnecessary "inline"s. The functions in question were in fact function *templates*, so that "inline" was unnecessary in order to still achieve header-only-ness. Used "struct" instead of "class" in a few places. Used "size_t" instead of "int" in a few places. The difference wasn't really important in these places, but I decided that size_t was correct in principle. C Interface: fixed some previously unnoticed discrepancies that arose after an earlier change. Some function names were changed a while back: color to colors, note to notes, and warning to warnings. However, the *declarations* (for C) hadn't been updated accordingly. Function name change: detail::getter() to detail::compGetter(). (comp, for class Component.) Too many things were called "getter." This change should make code auditing/analysis a bit easier, going forward. Miscellaneous other changes. There are still some relevant files to be uploaded, but I need to review them more carefully first.

Split up some longer files.

The new overload and SFINAE were added after I switched from clang 9 to clang 17, and the latter reported overload ambiguities in some old code that hadn't had problems with earlier compilers. It's not clear to me that ambiguities really exist, but in any event, the new overload and the SFINAE disappears the new errors. Casted to (void) in certain places in some test files, after noticing that a new g++ emitted warnings about the results of some function calls being unused. (We did those calls to test that they're valid and to check out-parameters, not because we intended to use the return values of the functions.)

staleyLANL added 30 commits April 29, 2022 14:43

Uploading some unfinished JSON-related work-in-progress.

50c42c6

Also some small tweaks and comment changes here and there.

Added some vetter JSON files, to be used in the test suite.

c36ceff

Small improvements.

Merge branch 'feature/better-json' of https://github.com/njoy/GNDStk …

b6f942d

…into feature/better-json

More work on some JSON-related code.

faf2e67

Some tweaks to HDF5-related code. Extended and improved some comments, to clarify things.

Some extra material in resources for the JSON tests.

86d5527

Saving some work.

0f346b7

More updates to the work on JSON improvements.

3ee65d7

Additional tests for those, too.

More JSON code updates.

91da1bf

Finishing up some JSON work.

7669c83

Put most instances of certain reappearing strings into constants.

5caca0f

There are still some instances of the strings in question, typically appearing amongst other text inside of string literals. At some point I'll look into fixing up those cases too, but that's not a priority right now.

Small but worthwhile improvements in the HDF5 files GNDStk creates.

7f84bbd

Updated the HDF5 test suite to ensure that we test the modifications.

Detail detail.

5e932dc

Additional work on simplifying autogenerated classes.

93acc2d

Direct modifications of certain generated files.

c11ea62

We'll update the code generator later, and ensure that it does the right thing. More feature-packed classes Field and FieldPart.

More generated-code updates.

621995a

Code generator itself still needs updating.

Some new test material, for better generated-code getter coverage.

ed83af4

Corrected some verbiage. The word "data" is plural, not singular. :-D

673c516

Shorten constructors in generated classes.

9111a5e

Additional autogenerated C++ and Python code updates.

fa7234a

Shortened some things.

6cd2ba7

Improved some SFINAE. Added more comments, and did a bit of rearrangement.

Updating generated code.

9c783f0

Major update to code generator.

f96a289

It now generates classes that use the new system.

Merge branch 'feature/component-field' of https://github.com/njoy/GNDStk

a09ecfa

into feature/component-field

Temporary fix for broken HighFive library master branch. :-/

057485b

Shortened some things.

9c9bcfb

Some work mostly in Component's field.hpp. Work-in-progress.

d93a560

Merge branch 'feature/component-field' of github.com:njoy/GNDStk into…

3243a9f

… feature/component-field

staleyLANL added 8 commits February 22, 2023 00:37

Made the shortcut generator more flexible.

da70eb2

Note that it doesn't yet add shortcuts to generated classes; that'll come soon. Changed the way a map was done. Faster, and leads to better printing of shortcut information.

Updating generated code.

ddab317

Shortcuts now written to C++ and to Python-binding code.

4bee8e0

Some refactoring related to the above. Miscellaneous cosmetic and clarity-related changes.

Updates to Component printing, used for Python __repr__.

ce81699

Updated various test codes.

print() details.

f5fc4d4

Python mode for printing.

82a928a

Some small printing-related changes and fixes as well.

Improve python-bindings build speed by a factor > 3.

ef2e240

staleyLANL requested a review from whaeck March 20, 2023 21:11

staleyLANL self-assigned this Mar 20, 2023

staleyLANL and others added 18 commits March 30, 2023 14:31

Reviewing and improving class Field.

5557527

These are largely in preparation for some upcoming, more-substantive changes; ones that should somewhat improve and clarify the interface.

Work on the Field and Lookup classes.

8f925ec

Some work on Component::has() and Component::operator(). Small change: class to struct in a few places.

Regenerate some generated files.

c23d29a

Some cleanup, in particular removing old files that won't be used.

95b6546

Additional work regarding Lookup objects, Field, Component, etc.

Work regarding better handling re: index vs. label vs. class Lookup.

bab98b9

Cleaned up some SFINAE, and removed some that was no longer needed. A bit of formatting work in some CMakeLists.txt files, just to make them arguably more readable.

Small cleanup: ensure newline at EOF of each text file.

dda53ef

As with: https://stackoverflow.com/questions/57770972

A couple of small but worthwhile code improvements:

3343ae2

- Clarified "auto" in some places. - Ditto for constness.

Cleanup regarding size_t.

db4afb5

See: https://stackoverflow.com/questions/42797279

Updated autogenerated file.

f94f790

Some refactoring and rearrangement of Component-related material.

8ccb129

Updated files, reflecting some recent cleanup/simplification.

3ebd840

Workaround for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85282

d09c337

Some improved exposition.

491f75d

Split up some longer files.

Updating CI config to make it run again

c7400da

Changing OS for HDF5 library installation in CI

97bb802

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature/faster python build #91

Feature/faster python build #91

staleyLANL commented Mar 20, 2023

staleyLANL commented Mar 23, 2023

Feature/faster python build #91

Are you sure you want to change the base?

Feature/faster python build #91

Conversation

staleyLANL commented Mar 20, 2023

staleyLANL commented Mar 23, 2023