Get rid of global variable ourDIR #4

fingolfin · 2014-02-21T14:12:30Z

I was wondering how IO wraps POSIX functions returning pointer values. Turns out, it doesn't really. Instead, IO_opendir returns a boolean, and stores the actual return value of opendir in a global variable ourDIR. This could lead to a problem if one was not aware of this and tried to read from two directories at once. In the future, with HPC-GAP this might become a real problem.

So, we should try to get rid of such global state. In this particular case, one might return the pointer as a GAP integer (perhaps wrapped inside a record, or even component object or positional object to "hide" it from the user), then rely on the user not messing with it. This would be vaguely similar to what IO_stat does.

There are several more global variables but they are all only used to temporarily hold data; as such, they should be safe for now, but they will become a problem in HPC-GAP. These are ourdirent, ourstatbuf, ourfstatbuf, ourlstatbuf, argv, envp and of course all the globals related to signal handling.

The text was updated successfully, but these errors were encountered:

markuspf · 2014-11-10T23:33:06Z

Passing the pointers around seems the only viable option here. Trying to implement this I have run into a few roadblocks:

If I change functions like IO_readdir, I break backwards compatibility. Would you prefer new variants and leaving the old unsafe versions in place, or to just break people's code who decided to call into the C functions directly?
Once a GMP integer, I cannot see a function that turns it back into an integer. While this is certainly doable, I think maybe a T_POINTER type for bags could be introduced. All this of course opens a can of worms wrt memory leaks, because we are allocating memory which is not under control of the garbage collector (calling IO_opendir twice will leak memory now already)
Families, types and representations for directories are defined in the library. Would you want a disjoint implementation for IO, or hook into the library here (for example defining IsDirectoryRepIO and IO_DirectoryType, but having IO_DirectoryType have IsDirectory in its filter)

I have some work in progress, but am reluctant to push it to github just yet since it mainly breaks everything and does nothing productive.

fingolfin · 2014-11-11T15:11:28Z

Hi Markus, great that you are interested in this. Regarding your points:

By default I prefer not breaking backwards compatibility -- at least not at once, because that makes transition harder. Instead, it would be nice to support for a time both the old and new API, and then after some time, phase out the old API. However, in this particular case, I am not sure if anybody is using IO_opendir and IO_readdir, so perhaps a clean break would not be so bad. Hmm. In any case, IO_readdir would be easy enough to generalize, by having two versions: a parameterless one (the current, "bad" one), and a new one taking a "DIR" object, whatever it exactly would be. Similar for IO_closedir. The annoying part is IO_opendir, which currently returns just a boolean. We could have IO_opendir2 returning a dir object. Ugly, though, and in the long run, if the old code is removed, it looks weird. Better perhaps, then, to change IO_opendir to optionally take two arguments: the first a string with the directory name, as it is right now, the second a boolean for choosing between new and old behavior. Omitting it for now would lead to the old behavior. Using the old behavior variants of any of the three functions would print a warning about obsolete behavior; in a year or so, we could remove the old behavior.
Well, or we just remove it right away if we think nobody is using this anyway ;-).
I don't see the problem with turning a GAP integer back into a pointer? Well, OK, it is a bit tricky, because a 32 / 64 bit doesn't always fit into a GAP immediate integer, so we'd have to deal with converting both (small) immediate integers and (big) GMP integers; but for the former, (DIR *)INT_INTOBJ(obj) should do the trick, for the latter the value should be stored in the first (and only) limb of the GMP integer. That said, it might be easier to simply use a T_DATOBJ to store the raw pointer data. Regardless of which type we use, we should wrap it inside a high-level GAP component object, for user convenience and safety.
I am rather skeptical of the T_POINTER idea. Without any information about what a pointer is pointing at, this is very tricky to use and has a big potential for crashes and indeed memory leaks. However, more and more packages will want to use and manage data allocated externally. We do this
Calling IO_opendir twice leaks memory? Hm, looks like you are right. See issue Calling IO_opendir twice without IO_closedir leaks memory. #24.
The type here should be disjoint for the GAP library, and fairly low-level. That said, GAP is badly lacking a proper, portable and comprehensive file system abstraction layer; the existing Directory code in the GAP library is rather insufficient. But such an API is out of scope for IO, I'd say.

markuspf · 2014-11-11T16:27:16Z

Hi Max,

On Tue, Nov 11, 2014 at 07:11:29AM -0800, Max Horn wrote:

Hi Markus, great that you are interested in this. Regarding your points:

By default I prefer not breaking backwards compatibility

I think we agree on this one. I think the two parameter version for one or two
releases and then removing the old behaviour is a safe bet. For HPC-GAP
compatibility the old behaviour doesn't even make any sense anymore, so we
have a natural step at which we can (hopefully) retire the old behaviour.

I don't see the problem with turning a GAP integer back into a pointer?
Well, OK, it is a bit tricky, because a 32 / 64 bit doesn't always fit into
a GAP immediate integer, so we'd have to deal with converting both (small)
immediate integers and (big) GMP integers; but for the former, (DIR *)
INT_INTOBJ(obj) should do the trick, for the latter the value should be
stored in the first (and only) limb of the GMP integer. That said, it might
be easier to simply use a T_DATOBJ to store the raw pointer data.

I didn't even realise that T_DATOBJ existed until now. I was looking for such
a type last night and must have overlooked it.

I am rather skeptical of the T_POINTER idea. Without any information about
what a pointer is pointing at, this is very tricky to use and has a big
potential for crashes and indeed memory leaks. However, more and more
packages will want to use and manage data allocated externally. We do this

Actually there already is T_SINGULAR and T_POLYMAKE, of which I don't know
exactly what they are doing. I think packages might always want to have some
opaque external data that they manage and we cannot protect them from doing
things wrong (At least not with having a C kernel). Having appropriate GAP
level wrappers is one way of ensuring that things do not go horribly wrong.

As to address the memory leaks one might think
about introducing managed pointers which install a destructor (as a function
pointer in the bag), but I don't know about the implications of this and how
it'd play with GASMAN or the HPC-GAP garbage collector. I wanted to ask Reimer
about this.

We could even sub-type the pointer bags.

People who do C-level kernel programming will have to know what they are
getting themselves into in any case ;).

The type here should be disjoint for the GAP library, and fairly low-level.
That said, GAP is badly lacking a proper, portable and comprehensive file
system abstraction layer; the existing Directory code in the GAP library is
rather insufficient. But such an API is out of scope for IO, I'd say.

Ok, I'll make a disjoint API then. It's a bit icky that IO comes along with
IsFile, but I agree that ti's a topic for a ticket in a different project how
to fix external IO for GAP.

fingolfin · 2014-11-11T18:24:27Z

Oops, my point 3 was incomplete -- I was about to write something there, then got disrupted, and when I resumed later on forgot that I had been editing that sport sigh. I was actually planning to write something like the following:

We do this already in the packages SingularInterface (formerly "libsing") and NormalizInterface. The way this is done right now is the following: For each package that needs it, a TNUM is reserved (SingularInterface uses T_SINGULAR; NormalizInterface currently uses T_SPARE1; and there is also T_POLYMAKE, but I think it is not in use right now).

GAP itself never creates objects (or rather, "bags") with these TNUMs, but external packages do this. The cool thing about this is that it allows those external packages to do proper memory management, as the GAP garbage collector GASMAN provides hooks for this, set via InitFreeFuncBag and InitMarkFuncBags. With this, wrapping e.g. C++ objects becomes relatively easy; the code for NormalizInterface is rather short, see normaliz.cc.

This system is not perfect, but works reasonably well. The only problem is that it currently requires a modification to the GAP kernel for each package that wants to do this. This is clearly a problematic model that does not scale well. And I really hope it will have to scale, as I'd like to see wrappers for e.g. libcurl (full-blown HTTP client support), libz/libzlib/libarchive (various libraries for dealing with compressed files or even archives), pcre (or any other regular expression library), GMP (exposing the high-level GMP functions, which we currently can't use in GAP -- this should be rather easy to achieve, and I'd like to do it in a way that integrates well with SingularInterface and NormalizInterface).

The simplest ad-hoc solution for this is to reserve a bunch of TNUMs for packages and let packages register a TNUM. I have been using this internally for quite some time now, and it boils down to a rather short code change, essentially this (plus some header modifications):

static Int lastFreePackageTNUM = FIRST_PACKAGE_TNUM;

/****************************************************************************
**
*F  RegisterPackageTNUM( <name>, <typeObjFunc> )
**
**  Allocates a TNUM for use by a package. The parameters <name> and
**  <typeObjFunc> are used to initialize the relevant entries in the
**  InfoBags and TypeObjFuncs arrays.
**
**  If allocation fails (e.g. because no more TNUMs are available),
**  a negative value is returned.
*/
Int RegisterPackageTNUM( const char *name, Obj (*typeObjFunc)(Obj obj) )
{
    if (lastFreePackageTNUM > LAST_PACKAGE_TNUM)
        return -1;

    Int tnum = lastFreePackageTNUM++;

    InfoBags[tnum].name = name;
    TypeObjFuncs[tnum] = typeObjFunc;

    return tnum;
}

Packages then would call RegisterPackageTNUM from their postRestore function, like this:

extern Obj T_SINGULAR = 0;

static Int PostRestore(StructInitInfo* module)
{
...
    T_SINGULAR = RegisterPackageTNUM("singular wrapper object", _SI_TypeObj);
    if (T_SINGULAR == -1)
        return 1; /* signal an error */
...

so that even workspaces should work, provided that packages are loaded in the exact same order as before. (To fix the latter constraint, one could store which TNUM was assigned to which package; alas, for now I deemed it not worth the effort, as other things are likely to break when the load order changes, such as types get assigned which flag values etc. -- but it certainly would be possible to improve this).

Note that I deliberately did not include a mechanism for querying TNUMs. This keeps the code very simple and makes it easy to keep it bug free, while not preventing packages from exchanging TNUM. For example, I imagine we'll have a GMPInterface package at some point, and both SingularInterface and NormalizInterface will want to use its "external GMP integers". Well, no problem, it can simply assign the TNUM integer value to a global gap variable, say GMPINTERFACE_TNUM, and then SingularInterface and NormalizInterface can use that

fingolfin · 2014-11-11T18:26:16Z

So, instead of adding new code which somehow uses "pointers to destructors" stored in T_POINTER objects, we simply use the existing infrastructure as it was meant to be. Perhaps we can eventually come up with something better, but for now, this works quite well (at least in my limited, local tests -- which notably do not cover workspaces, a feature I rarely use).

markuspf · 2014-11-19T23:53:58Z

I wasn't aware that this infrastructure already existed.

I was assuming that there can only be a limited number of TNUMs (256), but browsing the code I didn't see a reason why there couldn't be more. Are you aware of a deeply ingrained limitation that prevents more than 256?

Other than that you are quite right in that having explicit TNUMs assigned for types is the better way to go, but we have to keep in mind that this might lead to an inflationary use of these.

stevelinton · 2014-11-20T07:41:58Z

Adding more TNUMs raises two issues. First we need to have enough bits in the object header to store the TNUM of each object. This is not a problem assuming we are OK with limiting object sizes to (say) 2^48 words. Secondly, we use memory proportional to the square of the number of TNUMs for some jump tables. This isn't a problem for RAM but might increase pressure on cache or memory bandwidth if it means touching more cache lines.

fingolfin · 2014-11-21T00:16:20Z

Let me clarify: I am not proposing to allow adding an arbitrary number of TNUMs. Nor do I think packages should use TNUMS in abundance. Rather, I envision that any given package registers at most one TNUM. For now, in my patches I set aside space for 50 package TNUMs. One limiting factor here is that this actually ends up blocking 2*50=100 TNUMs, as we add COPYING variants for each. However, I think we could even avoid that (by revamping how COPYING / TESTING works), if we wanted to.

Anyway, it's better to discuss actual code than thin air, so I'll put my code into a branch and we can discuss that.

fingolfin added the enhancement label Nov 8, 2014

markuspf mentioned this issue Mar 4, 2015

Add RegisterTNUM functionality gap-system/gap#15

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Get rid of global variable ourDIR #4

Get rid of global variable ourDIR #4

fingolfin commented Feb 21, 2014

markuspf commented Nov 10, 2014

fingolfin commented Nov 11, 2014

markuspf commented Nov 11, 2014

fingolfin commented Nov 11, 2014

fingolfin commented Nov 11, 2014

markuspf commented Nov 19, 2014

stevelinton commented Nov 20, 2014

fingolfin commented Nov 21, 2014

Get rid of global variable ourDIR #4

Get rid of global variable ourDIR #4

Comments

fingolfin commented Feb 21, 2014

markuspf commented Nov 10, 2014

fingolfin commented Nov 11, 2014

markuspf commented Nov 11, 2014

fingolfin commented Nov 11, 2014

fingolfin commented Nov 11, 2014

markuspf commented Nov 19, 2014

stevelinton commented Nov 20, 2014

fingolfin commented Nov 21, 2014