Update docs #7

Closed
4 tasks done
dobraczka opened this issue Jul 28, 2021 · 5 comments

@dobraczka
Owner

dobraczka commented Jul 28, 2021

  • Explain the new possibilities enabled by class-resolver
  • Incorporate architecture picture with explanation
  • Link to Read the Docs in the README
  • Give new examples (including for single-source)
@dobraczka added the 📜 documentation label (Improvements or additions to documentation) Jul 28, 2021
@dobraczka self-assigned this Jul 28, 2021
@cthoyt
Contributor

cthoyt commented Jul 28, 2021

It would be great to make an example where Kiez is applied to the embeddings coming from PyKEEN. This tutorial shows how to get embeddings out after training. The following example shows it for TransE:

from pykeen.pipeline import pipeline

results = pipeline(
    model='TransE',
    dataset='Nations',
    epochs=1,  # change this to ~25 for real usage on Nations
)

entity_embeddings   = results.model.entity_representations[0]()    # is torch.Size([14, 50])
relation_embeddings = results.model.relation_representations[0]()  # is torch.Size([55, 50])

These are torch tensors, which can be converted to numpy ndarrays of the same dimensions with

# .cpu() is a no-op on CPU and avoids an error if the model was trained on a GPU
entity_embeddings   = results.model.entity_representations[0]().detach().cpu().numpy()
relation_embeddings = results.model.relation_representations[0]().detach().cpu().numpy()

A kiez API function could probably do some logic to accept either a pipeline result, model, or pykeen embedding, like:

from pykeen.models import Model
from pykeen.pipeline import PipelineResult
from pykeen.nn import Embedding
from typing import Union

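def from_pykeen(model: Union[Model, PipelineResult, Embedding]):
    # Resolve whichever object was passed down to a single entity representation
    if isinstance(model, Embedding):
        representation = model
    elif isinstance(model, Model):
        representation = model.entity_representations[0]
    elif isinstance(model, PipelineResult):
        representation = model.model.entity_representations[0]
    else:
        raise TypeError(f"unsupported type: {type(model).__name__}")
    # Move to CPU before converting, in case training ran on a GPU
    arr = representation().detach().cpu().numpy()
    ...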

@dobraczka
Owner Author

The main purpose of Kiez (for now) is entity resolution: I have entity embeddings from two data sources (source and target) and want to find the nearest neighbors of the source entities in the target space.
However, using it to find nearest neighbors within a single source is technically already possible, e.g.

from kiez import Kiez
source = ... # get embeddings from somewhere
k_inst = Kiez()
k_inst.fit(source, source)
k_inst.kneighbors()

I want to adapt the API to make this use case more intuitive. As part of that, I would add the example you mentioned to the docs and implement the convenience function for PyKEEN.
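
To make the difference between the two use cases concrete, here is a minimal, dependency-free sketch of exact nearest-neighbor search. It is not how Kiez is implemented and says nothing about its internals; `k_nearest`, `euclidean`, and the `exclude_self` flag are hypothetical names used only for illustration, and the vectors are made-up toy values. It shows why passing the same embeddings as both source and target needs some care: without further handling, every entity is its own nearest neighbor at distance 0.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest(
    source: Sequence[Vector],
    target: Sequence[Vector],
    k: int,
    exclude_self: bool = False,
) -> List[List[Tuple[int, float]]]:
    """For each source vector, return the k closest target rows as (index, distance).

    exclude_self is only meaningful when source and target are the same
    collection: it skips the row with the same index, which would otherwise
    always match at distance 0.
    """
    result = []
    for i, s in enumerate(source):
        candidates = [
            (j, euclidean(s, t))
            for j, t in enumerate(target)
            if not (exclude_self and i == j)
        ]
        candidates.sort(key=lambda pair: pair[1])  # closest first
        result.append(candidates[:k])
    return result


# Toy 2-d "embeddings" (made-up values, not taken from any real model).
source = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
target = [(0.1, 0.0), (4.9, 5.2), (1.2, 0.1)]

cross = k_nearest(source, target, k=1)                      # entity-resolution style
naive_single = k_nearest(source, source, k=1)               # every point finds itself
single = k_nearest(source, source, k=1, exclude_self=True)  # self-matches skipped
```

On this toy data the cross-source matches are target rows 0, 2, and 1, while the naive single-source call returns each row's own index. A more intuitive single-source API would presumably hide this detail from the caller.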

@cthoyt
Contributor

cthoyt commented Jul 30, 2021

Oh, I see. There are a few PyKEEN datasets that are constructed such that they contain two knowledge graphs with support edges linking the same entity in each (like an English and a German version of the same graph, with different completeness), but none of them are directly accessible enough to give a really good example at the moment.
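
As a rough illustration of that structure, the sketch below uses a handful of made-up triples. The `en:`/`de:` prefixes, the `same_as` relation name, and the `split_alignment` helper are all assumptions for this example and do not reflect how any actual PyKEEN dataset is stored. It only shows how linking edges could be separated from the ordinary triples to obtain entity pairs.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

# Hypothetical toy data: two small graphs plus linking edges between them.
triples: List[Triple] = [
    ("en:Berlin", "capital_of", "en:Germany"),
    ("de:Berlin", "hauptstadt_von", "de:Deutschland"),
    ("en:Berlin", "same_as", "de:Berlin"),
    ("en:Germany", "same_as", "de:Deutschland"),
    ("en:Paris", "capital_of", "en:France"),  # no German counterpart
]


def split_alignment(
    triples: List[Triple], link_relation: str = "same_as"
) -> Tuple[List[Tuple[str, str]], List[Triple]]:
    """Separate linking edges (as entity pairs) from ordinary triples."""
    links = [(h, t) for h, r, t in triples if r == link_relation]
    rest = [triple for triple in triples if triple[1] != link_relation]
    return links, rest


links, graph_triples = split_alignment(triples)
```

Pairs like these would give the source/target correspondence needed to check whether a nearest-neighbor lookup finds the right counterpart.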

@dobraczka
Owner Author

I will close this, and we can continue to talk about use cases in #11.

@cthoyt
Contributor

cthoyt commented Aug 9, 2021

@dobraczka great! Looking forward to it.
