The graph's incremental update seems not support vertex property's update #1601

Open
songqing opened this issue Oct 24, 2023 · 11 comments

@songqing
Contributor

Describe your problem

#1563 added support for incremental updates of graph data. However, it seems that vertex properties cannot be updated. For example,
the full data is:
vid value
1 2.0
2 3.0
and the incremental data is:
vid value
1 4.0
After the incremental update, the value of vid 1 is still 2.0, not 4.0.

So, can we support vertex property's update?

@sighingnow sighingnow added enhancement New feature or request component:graph labels Oct 24, 2023
@dashanji
Member

dashanji commented Oct 24, 2023

Thanks @songqing.

Fixed in #1600.

@songqing
Contributor Author

> Thanks @songqing.
>
> Fixed in #1600.

Sorry, there is a mistake: #1600 is another small fix, and this issue is still unresolved.

@dashanji
Member

Oh, my fault. Sorry for the noise.

Reopened.

@dashanji dashanji reopened this Oct 24, 2023
@sighingnow
Member

> So, can we support vertex property's update?

Technically we can, but we define vineyard's objects as immutable (to make concurrency control simpler). The incremental update APIs are designed for bulk data loading as well. We currently support only adding data, which keeps multi-versioned immutable objects simple.

For scenarios like continuous incremental graph updating, I would suggest GART, a graph store that supports streaming updates and is more suitable for cases like yours, such as updating properties (via updating records in tables). GART is built upon vineyard as well.

@songqing
Contributor Author

> > So, can we support vertex property's update?
>
> Technically we can, but we define vineyard's objects as immutable (to make concurrency control simpler). The incremental update APIs are designed for bulk data loading as well. We currently support only adding data, which keeps multi-versioned immutable objects simple.
>
> For scenarios like continuous incremental graph updating, I would suggest GART, a graph store that supports streaming updates and is more suitable for cases like yours, such as updating properties (via updating records in tables). GART is built upon vineyard as well.

OK, I see, thanks for your reply.
Here is our scenario: the graph data is updated daily. For now, we can only reload the full data every day. If incremental update also supported modifying existing data, we could load the full data once and then load only the incremental data on the following days. Data importing would then be more efficient and consume fewer resources.
Also, this may need only a small change on top of the current incremental update implementation, whereas with GART the query performance would be somewhat worse in this scenario.

@sighingnow
Member

It can be implemented as follows:

  • for vertices: maintain a copy of the vtable in the involved fragment (graphs in vineyard are edge-cut), and update the table.
  • for edges: append the new properties to the end of the current vtable/etable (just like what we already have for adding data), maintain a copy of the CSR, and update the "edge_id" field of the corresponding edge.

As a first step, we could support only the vertices part or only the edges part.
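
The two bullets above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not vineyard's actual code: `PropertyTable`, `CsrEntry`, `Fragment`, `UpdateVertexProperty` and `UpdateEdgeProperty` are hypothetical names, each table holds a single `double` column, and the edge table is copied before appending (a real columnar table could append a new chunk instead). The point it tries to show is that every update produces a new fragment version while the old version stays untouched.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT vineyard's classes.
struct PropertyTable {
  std::vector<double> values;  // a single property column; index = row id
};

struct CsrEntry {
  int64_t neighbor;  // destination vertex of the edge
  size_t edge_id;    // row of this edge in the edge property table
};

// One immutable version of a fragment. Parts are shared between versions.
struct Fragment {
  std::shared_ptr<const PropertyTable> vtable;
  std::shared_ptr<const PropertyTable> etable;
  std::shared_ptr<const std::vector<CsrEntry>> csr;
};

// Vertices: copy the vertex table and update the row in the copy.
Fragment UpdateVertexProperty(const Fragment& old_frag, size_t vertex_row,
                              double new_value) {
  auto vtable = std::make_shared<PropertyTable>(*old_frag.vtable);
  vtable->values.at(vertex_row) = new_value;
  Fragment next = old_frag;  // etable and csr are shared, not copied
  next.vtable = std::move(vtable);
  return next;
}

// Edges: append the new property row, copy the CSR, and re-point the
// "edge_id" of the affected entry to the appended row.
Fragment UpdateEdgeProperty(const Fragment& old_frag, size_t csr_index,
                            double new_value) {
  auto etable = std::make_shared<PropertyTable>(*old_frag.etable);
  etable->values.push_back(new_value);  // the old row remains in place
  auto csr = std::make_shared<std::vector<CsrEntry>>(*old_frag.csr);
  csr->at(csr_index).edge_id = etable->values.size() - 1;
  Fragment next = old_frag;  // vtable is shared, not copied
  next.etable = std::move(etable);
  next.csr = std::move(csr);
  return next;
}
```

In this sketch a reader holding the old `Fragment` keeps seeing the old values, which is the property the immutable, multi-versioned design is meant to preserve.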

@sighingnow
Member

I may not have enough bandwidth on Vineyard in the next two months. Would you folks @songqing (or @SighingSnow) like to implement such features?

@songqing
Contributor Author

> I may not have enough bandwidth on Vineyard in the next two months. Would you folks @songqing (or @SighingSnow) like to implement such features?

OK, thanks. It's not an urgent issue; I'll give it a try later.

@SighingSnow
Contributor

SighingSnow commented Oct 25, 2023

> > I may not have enough bandwidth on Vineyard in the next two months. Would you folks @songqing (or @SighingSnow) like to implement such features?
>
> OK, thanks, it's not an urgent issue, I'll have a try later.

Hi, could you please check this code block: https://github.com/v6d-io/v6d/blob/main/modules/graph/loader/basic_ev_fragment_loader_impl.h#L344~L406. That block is where the original data wins: we check the incrementally added vertices, and if one duplicates an existing vertex, we deliberately keep the original table data.
Previously, users were not expected to add duplicates, so when a duplicate did appear we simply kept the original data.

So if this behavior is needed, you can revise the code above to update the table data instead.
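
To make the difference concrete, here is a minimal sketch of the two duplicate-handling policies, applied to the data from the issue description. It is not the loader's actual code: `VertexRow`, `DuplicatePolicy` and `MergeVertices` are hypothetical names, and a plain `std::vector` stands in for the vertex table.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one row of a vertex table: (vid, value).
struct VertexRow {
  int64_t vid;
  double value;
};

// What to do when an incremental row has a vid that already exists.
enum class DuplicatePolicy {
  kKeepOriginal,  // the behavior described above: original data wins
  kOverwrite,     // the behavior requested in this issue
};

// Merges incremental rows into a copy of the original table.
// The original table itself is never modified.
std::vector<VertexRow> MergeVertices(const std::vector<VertexRow>& original,
                                     const std::vector<VertexRow>& incremental,
                                     DuplicatePolicy policy) {
  std::vector<VertexRow> merged = original;
  std::unordered_map<int64_t, size_t> row_of;  // vid -> row index in merged
  for (size_t i = 0; i < merged.size(); ++i) {
    row_of[merged[i].vid] = i;
  }
  for (const VertexRow& row : incremental) {
    auto it = row_of.find(row.vid);
    if (it == row_of.end()) {
      // New vertex: append it, as the incremental update already does.
      row_of[row.vid] = merged.size();
      merged.push_back(row);
    } else if (policy == DuplicatePolicy::kOverwrite) {
      // Duplicate vertex: replace the stored property value.
      merged[it->second].value = row.value;
    }
    // With kKeepOriginal, a duplicate row is silently ignored.
  }
  return merged;
}
```

With the full data `{1: 2.0, 2: 3.0}` and the incremental data `{1: 4.0}`, `kKeepOriginal` leaves vid 1 at 2.0 (the behavior reported in this issue), while `kOverwrite` sets it to 4.0.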

@siyuan0322 could you please evaluate this issue?

@siyuan0322
Member

Yeah, seems it's a good fit here.

@songqing
Contributor Author

> > > I may not have enough bandwidth on Vineyard in the next two months. Would you folks @songqing (or @SighingSnow) like to implement such features?
> >
> > OK, thanks, it's not an urgent issue, I'll have a try later.
>
> Hi, could you please check this code block https://github.com/v6d-io/v6d/blob/main/modules/graph/loader/basic_ev_fragment_loader_impl.h#L344~L406. The code block mentioned is to use the origin data. We check the incremental added vertices, and if there is a duplicate, we use the origin table data deliberately. Previously, expected user behaviors' are not to add duplicates, and if there is a duplicate, we will use the origin data.
>
> So if this property is needed, you can revise the code above to update the table data.
>
> @siyuan0322 could you please evaluate this issue

Yes, based on the current implementation, only a small change is needed to solve this issue.
Besides the code you mentioned, https://github.com/v6d-io/v6d/blob/main/modules/graph/vertex_map/arrow_vertex_map_impl.h#L487~L500 may also need to change.

@github-actions github-actions bot added the stale label Feb 27, 2024