No JAVA_HOME at run time makes Pydoop very slow #344

@simleo

#338 added JAVA_HOME auto detection. That's convenient, especially at compile time, since it makes the installation process easier. It also allows Pydoop to work with no JAVA_HOME set at run time, which is convenient too, but it turns out that things can be much slower in that case. Running the entire unit test suite (minus the Avro tests) with no JAVA_HOME is almost 5 times slower. HADOOP_HOME also has an effect, though not nearly as big: a quick comparison on my laptop gave 344s with both unset, 75s with JAVA_HOME set and 70s with both set.

Reviewing our caching of these variables (or lack thereof) might help, although not in the case where one is running several Python processes that use Pydoop, since auto detection needs to be performed at least once in each process. We do need to document this properly though, so that users make sure they have the most efficient run time setup.
