Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #7 via sketch module. #19

Merged
merged 7 commits into from
May 26, 2018
Merged

Fixes #7 via sketch module. #19

merged 7 commits into from
May 26, 2018

Conversation

jknox13
Copy link
Collaborator

@jknox13 jknox13 commented May 25, 2018

There are 2 main updates with this PR:

  1. Fixes Supported Distributions #7 by reimplenting the sketch module as a subpackage containing sketching routines (_sketch) as well as transform algorithms (transforms)
  2. Continues to build on the last PRs by improving PEP standard compliance through renaming function parameter names (ie q -> n_subspace)

Also, I have removed the single_pass algorithms, because their use (only accessing A a single time) was not yet implemented: we computed A.dot(Omega) and Psi.dot(A), instead of doing both in a single pass over A. We can look back into this later, though it would require writing a C/Cython or Fortran subroutine.

The next step I think is implementing sparse options for sparse sketching inside the factorization algorithms to solve #11, but I would like to make a separate PR for takling this.

…odule that incorporates additional sketching aglorithms
Work in progress. ``sketch`` module is implemented, but old
compatibility funcitons in ``sketch.range_finders`` still need to be
phased out.
Still need to better incorporate within random methods, perhaps giving
transform options. Also still need to standardize parameter names.

Additionally, single_pass methods have been depricated (masked with a
single subspace iteration) for now because the previous implementation
was not in fact 'single_pass'.
For the most part package now conforms with PEP parameter name
guidelines.

Still to do:
- `pca` : factor pca/spca/robspca
- `nmf` : factor and PEP parameter name eval
- `sketch` : ensure row wise sketches are indeed row wise and write
numerical accuracy checking tests

Additionally, in general there can be a lot of improvement as far as
code reproduction reduction in the testing modules.
The calling of `perform_subspace_iterations` now resides outside of the
transform functions so that it can be used independent of which
transform is used.

Additionally, default n_subspace has been set to 2.
Copy link
Owner

@erichson erichson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all very good to me. Can we fix The calling of perform_subspace_iterations so that we pass the travis-ci test?

# most definitely to sparsely sample the nnz elements of Omega, but
# is using random_state in data_rvs redundant?
values = (-sqrt(1. / density), sqrt(1. / density))
data_rvs = partial(random_state.choice, values)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random_state is not redundant here.


Parameters
----------
A : array_like, shape `(n, n)`.
Positive-definite matrix (PSD) input matrix.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should mention that a PSD matrix is required, maybe just us a note.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we should. We should also check by default that the matrix is indeed PSD, but have an option not to (similar to check_finite in scipy.linalg.qr)

return safe_sparse_dot(A, Omega)


def fast_johnson_lindenstrauss(A, l, axis=1, random_state=None):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!! Excited to see the speedups!

@jknox13
Copy link
Collaborator Author

jknox13 commented May 26, 2018

It looks like the Travis build failed because of a failed install of Python 3.5 (some weird travis bug), but after rerunning, the build passes!

@jknox13
Copy link
Collaborator Author

jknox13 commented May 26, 2018

@erichson Does this look good to you?

@erichson erichson merged commit 11f755f into erichson:master May 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants