
Problem with large BP vector #1

Open
ShaneDouglas opened this issue Feb 18, 2014 · 2 comments

@ShaneDouglas

Firstly, let me congratulate you on your excellent work. I especially like the ability to memory-map the structures. I had a small problem, though, when trying to create a large BP vector of size > 2^32. I am not sure if it is something I have done incorrectly, but in this section of the code, bp_vector::build_min_tree():

    for (size_t superblock = 0; superblock < n_superblocks; ++superblock) {
        excess_t cur_super_min = static_cast<excess_t>(size()); // truncates when size() > 2^31 and excess_t is int32_t
        excess_t superblock_excess = get_block_excess(superblock * superblock_size);

        for (size_t block = superblock * superblock_size;
             block < std::min((superblock + 1) * superblock_size, n_blocks);
             ++block) {
            cur_super_min = std::min(cur_super_min, superblock_excess + block_excess_min[block]);
        }
        assert(cur_super_min >= 0 && cur_super_min < excess_t(size()));

        superblock_excess_min[m_internal_nodes + superblock] = cur_super_min;
    }

You appear to set cur_super_min to the size of the vector, which in my case is > 2^32, but excess_t is an int32_t, which obviously causes a problem. I'm not sure if it is a config issue or what, but I changed it to int64_t and it seemed to work. Since I am not really acquainted with your code, I am worried about the side effects of such an action, and it would be nice to get your opinion on what is going on. Perhaps I am not allowed to create such a large BP vector?

Shane.


ot commented Feb 23, 2014

Hi, I haven't touched that code for a while, but there should be no limitation on the size of the vectors, just on the maximum excess inside the vector, which is numeric_limits<excess_t>::max().
I should put a large comment there, but I thought that the assumption of having at most 2^31 excess would be reasonable (that means a tree of height 2^32).

I haven't tested it, but changing the definition of excess_t to int64_t should remove the problem (the typedef is put there for this purpose). Of course the data structure size becomes slightly larger and you lose binary compatibility, but the succinct file format is not designed to be portable, for performance and memory-mappability reasons.

BTW, thanks for your kind words!

@iamjw0911

Thank you.
