Add user definable functions aggregates and collations #155

Open
wants to merge 2 commits into
base: master


@coddingtonbear coddingtonbear commented Oct 17, 2017

What does this do and why?

This lets users define and use custom aggregates, functions, and collations without requiring that they be included in the base q distribution. q finds and registers them through setuptools' 'Entry Points' system (available by default with all Python installations newer than 2.7.9).

Creating Custom Aggregates, Functions, or Collations

Given that I have a concrete example available for only one of the three possibilities below, I've leaned toward verbosity; apologies for the repetitiveness:

Creating a Custom Aggregate

To define a custom aggregate (example repository: https://github.com/coddingtonbear/q-stdev), just add a group q_aggregates to your project's setup.py:

    ...
    entry_points={
        'q_aggregates': [
            'stdev = dot.path.to.module:Class'
        ]
    }
    ...

The class referenced should meet the requirements defined by Sqlite3's documentation for create_aggregate. The argument count will be inferred by inspecting the referenced function's signature using inspect.
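As a rough illustration of what such a class could look like, here is a minimal sketch of a sample standard deviation aggregate registered directly with the standard library's sqlite3 module. The class name `Stdev`, the table `t`, and the way the argument count is derived from `step` are assumptions made for this example; they are not taken from q-stdev or from this PR's implementation.

```python
import inspect
import math
import sqlite3


class Stdev:
    """Sample standard deviation, shaped for sqlite3's create_aggregate (hypothetical example)."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # Called once per row; NULLs are skipped, as SQL aggregates usually do.
        if value is not None:
            self.values.append(float(value))

    def finalize(self):
        # Called once at the end; returns NULL when fewer than two values were seen.
        n = len(self.values)
        if n < 2:
            return None
        mean = sum(self.values) / n
        return math.sqrt(sum((v - mean) ** 2 for v in self.values) / (n - 1))


# One plausible way to infer the argument count: inspect step() and ignore self.
n_args = len(inspect.signature(Stdev.step).parameters) - 1

conn = sqlite3.connect(":memory:")
conn.create_aggregate("stdev", n_args, Stdev)
conn.execute("create table t (x)")
conn.executemany(
    "insert into t values (?)",
    [(2,), (4,), (4,), (4,), (5,), (5,), (7,), (9,)],
)
result = conn.execute("select stdev(x) from t").fetchone()[0]
print(result)
```

The two methods sqlite3 needs are `step`, called per row, and `finalize`, called once to produce the result.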

After defining your custom aggregate, you can use it in any query as a normal aggregation:

select stdev(some_field) from ./some_file.csv;

Creating a Custom Function

To define a custom function, add a group q_functions to your project's setup.py:

    ...
    entry_points={
        'q_functions': [
            'md5_maybe = dot.path.to.module:function'
        ]
    }

The function referenced should meet the requirements defined by Sqlite3's documentation for create_function. The argument count will be inferred by inspecting the referenced function's signature using inspect.
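For illustration, a function of this kind might look like the sketch below. The behaviour of `md5_maybe` (pass NULL through, otherwise hash the text form of the value) is a guess based on its name, and counting parameters with `inspect.signature` is only one plausible reading of how the argument count could be inferred.

```python
import hashlib
import inspect
import sqlite3


def md5_maybe(value):
    # Hypothetical behaviour: pass NULL through, otherwise hash the text form.
    if value is None:
        return None
    return hashlib.md5(str(value).encode("utf-8")).hexdigest()


# Infer the number of SQL arguments from the Python signature.
n_args = len(inspect.signature(md5_maybe).parameters)

conn = sqlite3.connect(":memory:")
conn.create_function("md5_maybe", n_args, md5_maybe)
rows = conn.execute("select md5_maybe('abc'), md5_maybe(NULL)").fetchall()
print(rows)
```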

After defining your custom function, you can use it in any query as a normal function:

select md5_maybe(some_field) as some_field_md5 from ./some_file.csv;

Creating a Custom Collation

To define a custom collation, add a group q_collations to your project's setup.py:

    ...
    entry_points={
        'q_collations': [
            'reversed = dot.path.to.module:function'
        ]
    }

The function referenced should meet the requirements defined by Sqlite3's documentation for create_collation.
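A collation callable takes two strings and returns a negative number, zero, or a positive number, like a comparator. The sketch below shows what a `reversed` collation could look like when registered directly with sqlite3; the function name `reversed_collation` and the sample table are assumptions made for this example.

```python
import sqlite3


def reversed_collation(a, b):
    # Positive when a sorts before b in the natural order, so the order is flipped.
    return (a < b) - (a > b)


conn = sqlite3.connect(":memory:")
conn.create_collation("reversed", reversed_collation)
conn.execute("create table t (some_field text)")
conn.executemany(
    "insert into t values (?)",
    [("apple",), ("cherry",), ("banana",)],
)
rows = [
    r[0]
    for r in conn.execute(
        "select some_field from t order by some_field collate reversed"
    )
]
print(rows)
```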

After defining your custom collation, you can use it in any query as a normal collation:

select some_field from ./some_file.csv order by some_field collate reversed;

Installing Custom Aggregates, Functions, or Collations

Just install the package providing the custom aggregate, function, or collation using whatever method you normally use for installing third-party packages. This might be just running pip install q-stdev, or it might be running python setup.py install or pip install . from a package directory.
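Once a package is installed, its entry points become discoverable at runtime. The following is a sketch of how discovery and registration could work on a modern Python using `importlib.metadata`; it is not this PR's actual code (which, given its date, may well use `pkg_resources` instead), and the helper names `iter_group` and `register_plugins` are hypothetical.

```python
import inspect
import sqlite3
from importlib.metadata import entry_points


def iter_group(group):
    """Return the installed entry points for one group, across Python versions."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a select() API.
        return list(eps.select(group=group))
    # Python 3.8/3.9 return a plain dict keyed by group name.
    return list(eps.get(group, []))


def register_plugins(conn):
    """Register every discovered plugin on a sqlite3 connection (illustrative only)."""
    registered = []
    for ep in iter_group("q_functions"):
        func = ep.load()
        conn.create_function(ep.name, len(inspect.signature(func).parameters), func)
        registered.append(ep.name)
    for ep in iter_group("q_aggregates"):
        cls = ep.load()
        # Assumed convention: count step()'s parameters, ignoring self.
        n_args = len(inspect.signature(cls.step).parameters) - 1
        conn.create_aggregate(ep.name, n_args, cls)
        registered.append(ep.name)
    for ep in iter_group("q_collations"):
        conn.create_collation(ep.name, ep.load())
        registered.append(ep.name)
    return registered


conn = sqlite3.connect(":memory:")
names = register_plugins(conn)
print(names)
```

With no plugin packages installed, `register_plugins` simply returns an empty list.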


I think this'll be helpful for people like me who need to run specialized calculations via q without having to submit a patch upstream.

Cheers, and thank you for creating such a useful library that I use almost every single day.

@coddingtonbear
Author

Incidentally -- if you were looking for reasonable things people might want to use the above functionality for: there was recently an excellent article on Hacker News that touched on the neat things you can do with sqlite's user-defined functions: http://tech.marksblogg.com/sqlite3-tutorial-and-guide.html. In their example they were extracting the hostname from stored values that are full URLs.

@harelba
Owner

harelba commented Oct 22, 2021

I'm sorry, I missed this PR for some reason.

I've been thinking about something similar, although I would like to provide a more generic mechanism that allows q to use such custom functions programmatically rather than through setup.py. The reason is so that people can dynamically add their own functions, or modules of functions, without a new q release being needed to support them.

Leaving this PR open for reference.

Thanks!


2 participants