Get rid of global variable ourDIR #4
Comments
Passing the pointers around seems the only viable option here. Trying to implement this I have run into a few roadblocks:
I have some work in progress, but I am reluctant to push it to GitHub just yet, since it mainly breaks everything and does nothing productive.
Hi Markus, great that you are interested in this. Regarding your points:
Hi Max, On Tue, Nov 11, 2014 at 07:11:29AM -0800, Max Horn wrote:
I think we agree on this one. I think the two parameter version for one or two
I didn't even realise that T_DATOBJ existed until now. I was looking for such
Actually there already is T_SINGULAR and T_POLYMAKE, of which I don't know
As to address the memory leaks one might think
We could even sub-type the pointer bags.
People who do C-level kernel programming will have to know what they are
Ok, I'll make a disjoint API then. It's a bit icky that IO comes along with
Oops, my point 3 was incomplete -- I was about to write something there, then got interrupted, and when I resumed later I forgot that I had been editing that spot (sigh). I was actually planning to write something like the following:
We do this already in the packages SingularInterface (formerly "libsing") and NormalizInterface. The way this is done right now is the following: For each package that needs it, a TNUM is reserved (SingularInterface uses
GAP itself never creates objects (or rather, "bags") with these TNUMs, but external packages do this. The cool thing about this is that it allows those external packages to do proper memory management, as the GAP garbage collector GASMAN provides hooks for this, set via
This system is not perfect, but works reasonably well. The only problem is that it currently requires a modification to the GAP kernel for each package that wants to do this. This is clearly a problematic model that does not scale well. And I really hope it will have to scale, as I'd like to see wrappers for e.g. libcurl (full-blown HTTP client support), libz/libzlib/libarchive (various libraries for dealing with compressed files or even archives), pcre (or any other regular expression library), GMP (exposing the high-level GMP functions, which we currently can't use in GAP -- this should be rather easy to achieve, and I'd like to do it in a way that integrates well with SingularInterface and NormalizInterface).
The simplest ad-hoc solution for this is to reserve a bunch of TNUMs for packages and let packages register a TNUM. I have been using this internally for quite some time now, and it boils down to a rather short code change, essentially this (plus some header modifications):
Packages then would call
so that even workspaces should work, provided that packages are loaded in the exact same order as before. (To fix the latter constraint, one could store which TNUM was assigned to which package; alas, for now I deemed it not worth the effort, as other things are likely to break when the load order changes, such as which types get assigned which flag values etc. -- but it certainly would be possible to improve this.)
Note that I deliberately did not include a mechanism for querying TNUMs. This keeps the code very simple and makes it easy to keep it bug free, while not preventing packages from exchanging TNUMs. For example, I imagine we'll have a GMPInterface package at some point, and both SingularInterface and NormalizInterface will want to use its "external GMP integers". Well, no problem, it can simply assign the TNUM integer value to a global GAP variable, say
So, instead of adding new code which somehow uses "pointers to destructors" stored in T_POINTER objects, we simply use the existing infrastructure as it was meant to be. Perhaps we can eventually come up with something better, but for now, this works quite well (at least in my limited, local tests -- which notably do not cover workspaces, a feature I rarely use).
I wasn't aware that this infrastructure already existed. I was assuming that there can only be a limited number of TNUMs (256), but browsing the code I didn't see a reason why there couldn't be more. Are you aware of a deeply ingrained limitation that prevents more than 256? Other than that, you are quite right that having explicit TNUMs assigned for types is the better way to go, but we have to keep in mind that this might lead to an inflationary use of them.
Adding more TNUMs raises two issues. First, we need to have enough bits in the object header to store the TNUM of each object. This is not a problem assuming we are OK with limiting object sizes to (say) 2^48 words. Secondly, we use memory proportional to the square of the number of TNUMs for some jump tables. This isn't a problem for RAM but might increase pressure on cache or memory bandwidth if it means touching more cache lines.
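To put rough numbers on both points, here is a small sketch. It assumes a purely hypothetical 64-bit header that keeps the size in the low 48 bits and the TNUM in the remaining 16; this is not GAP's actual header layout, and all names below are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT GAP's real one: low 48 bits hold the size in
   words, the upper 16 bits hold the TNUM. */
#define SIZE_BITS 48
#define SIZE_MASK ((UINT64_C(1) << SIZE_BITS) - 1)

static uint64_t pack_header(uint16_t tnum, uint64_t size_words)
{
    assert(size_words <= SIZE_MASK);   /* sizes are capped below 2^48 words */
    return ((uint64_t)tnum << SIZE_BITS) | size_words;
}

static uint16_t header_tnum(uint64_t header)
{
    return (uint16_t)(header >> SIZE_BITS);
}

static uint64_t header_size(uint64_t header)
{
    return header & SIZE_MASK;
}

/* Bytes used by one two-argument dispatch table indexed by
   (TNUM of left operand, TNUM of right operand). */
static size_t dispatch_table_bytes(size_t num_tnums)
{
    return num_tnums * num_tnums * sizeof(void (*)(void));
}
```

With 8-byte function pointers, 256 TNUMs cost 256 * 256 * 8 = 524288 bytes (512 KiB) per such table, and doubling to 512 TNUMs quadruples that to 2 MiB, which is where the cache concern comes from.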
Let me clarify: I am not proposing to allow adding an arbitrary number of TNUMs. Nor do I think packages should use TNUMs in abundance. Rather, I envision that any given package registers at most one TNUM. For now, in my patches I set aside space for 50 package TNUMs. One limiting factor here is that this actually ends up blocking 2*50=100 TNUMs, as we add COPYING variants for each. However, I think we could even avoid that (by revamping how COPYING / TESTING works), if we wanted to. Anyway, it's better to discuss actual code than thin air, so I'll put my code into a branch and we can discuss that.
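For readers following along, the "reserve a block, hand out one TNUM per package, no query mechanism" idea can be modelled in a few lines. This is a standalone toy, not the patch referred to in this thread; the names register_package_tnum, FIRST_PACKAGE_TNUM and NUM_PACKAGE_TNUMS are hypothetical, and the starting value 200 is an arbitrary placeholder.

```c
#include <assert.h>

enum {
    FIRST_PACKAGE_TNUM = 200,  /* arbitrary placeholder, not GAP's value */
    NUM_PACKAGE_TNUMS  = 50    /* the 50 reserved slots discussed above */
};

/* Number of reserved TNUMs already handed out. */
static int next_free_slot = 0;

/* Hand out the next reserved TNUM, or -1 once the block is exhausted.
   Allocation is strictly sequential, so loading packages in the same
   order always yields the same assignment. */
static int register_package_tnum(void)
{
    assert(next_free_slot >= 0 && next_free_slot <= NUM_PACKAGE_TNUMS);
    if (next_free_slot == NUM_PACKAGE_TNUMS)
        return -1;
    return FIRST_PACKAGE_TNUM + next_free_slot++;
}
```

Because the counter only ever moves forward, the assignment depends on nothing but the load order, which is exactly the workspace constraint mentioned earlier in the thread.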
I was wondering how IO wraps POSIX functions returning pointer values. Turns out, it doesn't really. Instead, IO_opendir returns a boolean, and stores the actual return value of opendir in a global variable ourDIR. This could lead to a problem if one was not aware of this and tried to read from two directories at once. In the future, with HPC-GAP this might become a real problem.
So, we should try to get rid of such global state. In this particular case, one might return the pointer as a GAP integer (perhaps wrapped inside a record, or even component object or positional object to "hide" it from the user), then rely on the user not messing with it. This would be vaguely similar to what IO_stat does.
There are several more global variables but they are all only used to temporarily hold data; as such, they should be safe for now, but they will become a problem in HPC-GAP. These are ourdirent, ourstatbuf, ourfstatbuf, ourlstatbuf, argv, envp and of course all the globals related to signal handling.
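As a plain-C illustration of why a per-call handle avoids the problem, the sketch below reads two directories in lockstep, each through its own DIR pointer. The helper count_two_dirs is hypothetical and not part of the IO package; it only uses the POSIX calls opendir, readdir and closedir.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* Hypothetical helper (not part of IO): count the entries of two
   directories while reading them alternately. Each DIR* is a local handle,
   so the two streams cannot clobber each other the way a single global
   such as ourDIR would. Returns 0 on success, -1 if either open fails. */
static int count_two_dirs(const char *path_a, const char *path_b,
                          long *count_a, long *count_b)
{
    assert(count_a != NULL && count_b != NULL);

    DIR *a = opendir(path_a);
    if (a == NULL)
        return -1;
    DIR *b = opendir(path_b);
    if (b == NULL) {
        closedir(a);
        return -1;
    }

    *count_a = 0;
    *count_b = 0;
    int more_a = 1, more_b = 1;
    while (more_a || more_b) {
        /* Take one entry from each stream per iteration. */
        if (more_a) {
            if (readdir(a) != NULL) (*count_a)++; else more_a = 0;
        }
        if (more_b) {
            if (readdir(b) != NULL) (*count_b)++; else more_b = 0;
        }
    }

    closedir(a);
    closedir(b);
    return 0;
}
```

A GAP-level wrapper would additionally need to hand the pointer back to the caller in some opaque form, as suggested above; that part is left out here.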