Directly specify the halo width #680

Open
Goon83 opened this issue Jan 21, 2020 · 32 comments

Comments

Goon83 commented Jan 21, 2020

I am following some examples to learn the halo functionality in DASH. One question is how to specify the width of the halo layer without a StencilSpec. Most of the example codes I found use the steps below to declare an array with a halo.

    using StencilSpec_t = dash::halo::StencilSpec<StencilP_t, 4>;
    using HaloWrapper_t = dash::halo::HaloMatrixWrapper<Array_t>;

    StencilSpec_t stencil_spec(StencilP_t(-1, 0), StencilP_t(1, 0), StencilP_t(0, -1), StencilP_t(0, 1));
    HaloWrapper_t halo_array(data_array, stencil_spec);

After studying the API of HaloWrapper_t, it seems that the stencil_spec is the only way to ask for a halo layer.
https://codedocs.xyz/dash-project/dash/a01194.html

Thanks.
Bin

Member

dhinf commented Jan 22, 2020

What should a direct halo width specification look like?
Let's say you want a halo width of two elements in each direction: does that include the diagonal elements or not? The thing is, you later need the stencil spec to access the halo elements.

Member

dhinf commented Jan 22, 2020

One possibility is a stencil spec generator for typical stencil patterns,
e.g. a full stencil with a specified width.
Maybe this is what you are looking for.

Author

Goon83 commented Jan 22, 2020

Hi dhinf,

Let's say you want a halo width of two elements in each direction: does that include the diagonal elements or not? The thing is, you later need the stencil spec to access the halo elements.

--> Yes, it includes the diagonal elements. The general idea is to separate the halo spec from the stencil operator. For users with their own stencil implementation (such as our ArrayUDF), DASH could then also work as a storage layer. Anyway, I guess it is just an API effort. I assume that DASH currently infers the width from the stencil spec (but I might be wrong), so it could simply expose this to the user directly. ArrayUDF, which works on HDF5 files, provides the API below for a 2D array:

     std::vector<int> halo(2);
     halo[0] = 2, halo[1] = 2;
     std::vector<int> chunk(2);
     chunk[0] = 4, chunk[1] = 4;
     Array(chunk, halo, ...)
     // The Array has a halo layer (of width 2) in each direction.
     // Each chunk has a size of 6 by 6 (except boundary chunks).

One possibility is a stencil spec generator for typical stencil patterns,
e.g. a full stencil with a specified width.
Maybe this is what you are looking for.

--> This could be a possible solution, but as said above, the stencil width could also be exposed to the user.

Member

dhinf commented Jan 23, 2020

Hi Goon83,

The halo width is derived from the stencil spec, but I think it should be possible to provide the width directly as well. I'm interested in your use case. The halo wrapper takes the local part of a dash::Matrix or NArray (the naming is a little bit misleading) and adds the halo part, which is a separate chunk of memory. How does ArrayUDF handle the halo memory?

As I said, I'm really interested in how you want to use DASH with the halo wrapper.

Author

Goon83 commented Jan 23, 2020

Hi dhinf,
When data is stored in memory, ArrayUDF treats the halo layer as non-halo data too. It basically merges halo and non-halo points into one contiguous, normal chunk. ArrayUDF only knows about the halo when it performs I/O, i.e., when reading data into its memory.

I came up with a simple, naive example (below) to simulate what ArrayUDF needs. It creates a 2D array on 4 MPI processes, with a total size of 8 by 8 and a tile size of 4 by 4. Each process writes a subset of the 2D array (note: without halo). Then each process reads a subset of the array including a halo layer. This example code works so far. But I think the read with halo would be better served by an array created with a halo layer. Otherwise, each access to the halo layer may trigger communication and thus perform badly.

Any thoughts on this? Thanks.

#include <unistd.h>
#include <iostream>
#include <cstddef>
#include <iomanip>

#include <libdash.h>
//#include <mpi.h>

using namespace std;

using std::cout;

int main(int argc, char *argv[])
{
  dash::init(&argc, &argv);

  size_t team_size = dash::Team::All().size();
  dash::TeamSpec<2> teamspec;
  teamspec.balance_extents();

  dash::global_unit_t myid = dash::myid();
  size_t num_units = dash::Team::All().size();

  if (num_units != 4)
  {
    cout << "Please run with mpirun -n 4" << endl;
    exit(-1);
  }

  size_t tilesize_x = 4;
  size_t tilesize_y = 4;
  size_t rows = tilesize_x * (num_units / 2);
  size_t cols = tilesize_y * (num_units / 2);
  dash::Matrix<int, 2> matrix(
      dash::SizeSpec<2>(
          rows,
          cols),
      dash::DistributionSpec<2>(),
      dash::Team::All(),
      teamspec);
  size_t matrix_size = rows * cols;
  DASH_ASSERT(matrix_size == matrix.size());
  DASH_ASSERT(rows == matrix.extent(0));
  DASH_ASSERT(cols == matrix.extent(1));

  if (0 == myid)
  {
    cout << "Matrix size: " << rows
         << " x " << cols
         << " == " << matrix_size
         << endl;
  }

  std::vector<size_t> my_start(2);
  std::vector<size_t> my_end(2);

  switch (myid)
  {
  case 0:
    my_start[0] = 0;
    my_start[1] = 0;
    break;
  case 1:
    my_start[0] = 0;
    my_start[1] = tilesize_y;
    break;
  case 2:
    my_start[0] = tilesize_x;
    my_start[1] = 0;
    break;
  case 3:
    my_start[0] = tilesize_x;
    my_start[1] = tilesize_y;
    break;
  default:
    break;
  }

  my_end[0] = my_start[0] + tilesize_x - 1;
  my_end[1] = my_start[1] + tilesize_y - 1;

  if (!myid)
    cout << "Assigning matrix values" << endl;

  cout << myid << "'s start = (" << my_start[0] << ", " << my_start[1] << "), my_end = " << my_end[0] << ", " << my_end[1] << " )" << endl;
  for (size_t i = my_start[0]; i <= my_end[0]; i++)
  {
    for (size_t k = my_start[1]; k <= my_end[1]; k++)
    {
      matrix[i][k] = myid;
    }
  }

  // Units waiting for value initialization
  dash::Team::All().barrier();

  // Read and assert values in matrix
  for (size_t i = my_start[0]; i <= my_end[0]; i++)
  {
    for (size_t k = my_start[1]; k <= my_end[1]; k++)
    {
      int value = matrix[i][k];
      int expected = myid;
      DASH_ASSERT(expected == value);
    }
  }

  if (!myid)
  {
    cout << "Print matrix values at rank 0: " << endl;
    for (size_t i = 0; i < matrix.extent(0); i++)
    {
      for (size_t k = 0; k < matrix.extent(1); k++)
      {
        int value = matrix[i][k];
        cout << value;
        cout << "   ";
      }
      cout << endl;
    }
  }

  std::vector<size_t> my_start_halo(2);
  std::vector<size_t> my_end_halo(2);

  switch (myid)
  {
  case 0:
    my_start_halo[0] = 0;
    my_start_halo[1] = 0;
    my_end_halo[0] = my_start_halo[0] + tilesize_x;
    my_end_halo[1] = my_start_halo[1] + tilesize_y;
    break;
  case 1:
    my_start_halo[0] = 0;
    my_start_halo[1] = tilesize_y - 1;
    my_end_halo[0] = my_start_halo[0] + tilesize_x;
    my_end_halo[1] = my_start_halo[1] + tilesize_y;
    break;
  case 2:
    my_start_halo[0] = tilesize_x - 1;
    my_start_halo[1] = 0;
    my_end_halo[0] = my_start_halo[0] + tilesize_x;
    my_end_halo[1] = my_start_halo[1] + tilesize_y;
    break;
  case 3:
    my_start_halo[0] = tilesize_x - 1;
    my_start_halo[1] = tilesize_y - 1;
    my_end_halo[0] = my_start_halo[0] + tilesize_x;
    my_end_halo[1] = my_start_halo[1] + tilesize_y;
    break;
  default:
    break;
  }

  sleep(myid * 3);

  cout << "myid =" << myid << "'s output (" << my_start_halo[0] << "," << my_start_halo[1] << ")->(" << my_end_halo[0] << "," << my_end_halo[1] << "):" << endl;
  // Read and print the tile together with its neighboring halo cells
  for (size_t i = my_start_halo[0]; i <= my_end_halo[0]; i++)
  {
    for (size_t k = my_start_halo[1]; k <= my_end_halo[1]; k++)
    {
      int value = matrix[i][k];
      cout << value << "  ";
    }
    cout << endl;
  }

  dash::Team::All().barrier();

  dash::finalize();
}

Bests,
Bin

Member

dhinf commented Jan 24, 2020

But, I think the read with halo might be better served by the array created with halo layer. Otherwise, each access to halo layer may trigger a communication and thus bad performance.

Indeed, each access to a non-local element triggers a blocking communication call, so the performance is really bad. The halo wrapper takes the local part of the distributed array and the stencil spec. It then calculates the memory size required for the halo elements and allocates new memory. At the moment it is not possible to use the subscript operator to access the halo elements; that's why you need the stencil iterator. But maybe a subscript-operator-based access layer for the halos can be added in the future.

The idea behind the stencil operator is to distinguish between inner and boundary calculations, so that the halo elements can be loaded asynchronously while the inner part is calculated. Also, we wanted to be able to use the NArray both with and without halo support (therefore we decided to use separate memory for the halo elements).

best
Denis

Author

Goon83 commented Jan 27, 2020

Hi dhinf,
Thanks for the clarification. I am a little confused about how to use the stencil iterator to access halo elements. Could you point me to some sample code or documentation? It would be even more helpful if you could modify the example code above to use the halo wrapper and the stencil iterator.

Thanks.
Bin

Member

dhinf commented Jan 28, 2020

Hi Bin,

Here is an example:
https://github.com/dash-project/dash/blob/development/dash/examples/ex.02.matrix.halo.heat_equation/main.cpp

Within the iteration loop you can see how to use the iterator. If you use the feat-halo branch, the comment in line 145 (slow version) isn't true anymore; the performance is more competitive in this branch.

Member

devreal commented Jan 28, 2020

@dhinf Can we merge this branch into development?

Author

Goon83 commented Jan 28, 2020

Hi dhinf,
Could you help explain the code blocks below? I am studying the slow version but have some difficulty locating the right documentation to understand them. Maybe I don't have the background of the code right.

Following the example code, it is a 2D matrix. The stencil is defined as
StencilSpecT stencil_spec( StencilT(-1, 0), StencilT(1, 0), StencilT( 0, -1), StencilT(0, 1));
I assume this line forces HaloMatrixWrapperT to have one layer of halo in each direction when it is used here:

  HaloMatrixWrapperT halomat(matrix, bound_spec, stencil_spec);
  HaloMatrixWrapperT halomat2(matrix2, bound_spec, stencil_spec);

Then, the following two lines create an iterator/operator based on the stencil_spec again.

  auto stencil_op = halomat.stencil_operator(stencil_spec);
  auto stencil_op2 = halomat2.stencil_operator(stencil_spec);

So far, I think the stencil should be 2D on 2D data, i.e., accessing the points to the left/right/up/down of each point and doing some calculation.

The part that confuses me is the code block below.

  1. The iterator (value_at(...)) on both inner/boundary is 1D-index based. How is the 2D stencil mapped onto the 1D index used by value_at? Are there any layout changes, or does duplication exist?

  2. Why do the two separate loops work? I am not sure how the points on the border between inner and boundary are processed. Say an inner.value_at(1) accesses a cell outside of inner (within boundary): does DASH handle that automatically?

    // slow version
    auto it_end = current_op->inner.end();
    for(auto it = current_op->inner.begin(); it != it_end; ++it)
    {
      auto core = *it;
      auto dtheta = (it.value_at(0) + it.value_at(1) - 2 * core) / (dx * dx) +
                    (it.value_at(2) + it.value_at(3) - 2 * core) / (dy * dy);
      new_begin[it.lpos()] = core + k * dtheta * dt;
    }

    // Wait until all Halo updates ready
    current_halo->wait();

    // Calculation of boundary Halo elements
    auto it_bend = current_op->boundary.end();
    for (auto it = current_op->boundary.begin(); it != it_bend; ++it) {
      auto core = *it;
      double dtheta =
          (it.value_at(0) + it.value_at(1) - 2 * core) / (dx * dx) +
          (it.value_at(2) + it.value_at(3) - 2 * core) / (dy * dy);
      new_begin[it.lpos()] = core + k * dtheta * dt;
    }

Member

dhinf commented Jan 29, 2020

@dhinf Can we merge this branch into development?

I need to clean up some code. I think at the end of this week or next week we can merge the branch into development.

Member

dhinf commented Jan 30, 2020

1. The iterator (value_at(...)) on both inner/boundary is 1D-index based. How is the 2D stencil mapped onto the 1D index used by value_at? Are there any layout changes, or does duplication exist?

The index is the position of the stencil point within the stencil spec:
StencilT(-1, 0) -> value_at(0)
StencilT(1, 0) -> value_at(1)
StencilT( 0, -1) -> value_at(2)
StencilT(0, 1) -> value_at(3)

You can also use value_at(StencilT(-1,0)). In that case the iterator first looks up the position of the stencil point within the given stencil spec and then calls value_at(position). That indirection results in slightly lower performance, so
I recommend value_at with the position as argument.

2. Why do the two separate loops work? I am not sure how the points on the border between inner and boundary are processed. Say an inner.value_at(1) accesses a cell outside of inner (within boundary): does DASH handle that automatically?

Depending on the stencil width, the iterator for the inner part only iterates over elements for which all stencil points access non-halo elements. The boundary iterator iterates over all center elements with at least one halo element access. In that case the method value_at tests whether the stencil point is located in the halo or in the inner part.

Author

Goon83 commented Feb 7, 2020

Hi dhinf,
Thanks for the explanations. For the second question, I am still confused. Take the example below with a size of 5 by 5. "0" and "A" are inner elements and the non-zero ones are halo. Will the inner iterator only work on the element "A"? Which loop in the code above deals with the "0" elements?

 1  2  3  4  5
 6  0  0  0  7
 8  0  A  0  9
10  0  0  0 11
12 13 14 15 16

BTW,

  1. I would like to raise the same question again for your consideration: could we separate the stencil and the halo layer support in NArray, and also provide a generic API to access both inner and boundary elements?

  2. As previously discussed, a stencil spec generator may help with the first case. Any idea (sample code) how to do that in DASH? Some of our use cases can have hundreds of halo layers. A stencil spec generator may be a temporary solution for that.

Bests,
Bin

Member

dhinf commented Feb 7, 2020

Hi Bin,

for(auto it = current_op->inner.begin(); it != it_end; ++it)

works on A

for (auto it = current_op->boundary.begin(); it != it_bend; ++it) {

works on all 0 elements

I will look into your other questions next week.

Best
Denis

Author

Goon83 commented Feb 11, 2020

Hi Denis,
Thanks for the answers. Could you elaborate on the iteration order (for both inner and boundary), and on the cell layout order in memory (e.g., row major)?

Let me rephrase my question with the code below again. As the code shows, it accesses halo elements like normal (inner) elements through the [] operator or value_at. Having a halo layer can boost the performance. However, the code must be rewritten to use HaloMatrixWrapper and the boundary/inner iterators. I think it is fine to use the HaloMatrixWrapper, but the boundary/inner iterators become too complex. Could we just add a value_at operator to HaloMatrixWrapper? This operator would access both boundary and inner elements using global coordinates, and value_at would decide internally which boundary/inner iterator to use to access the data. Of course, users could still explicitly control the update of the boundary elements. Thanks.

Bests,
Bin

#include <unistd.h>
#include <iostream>
#include <cstddef>
#include <iomanip>

#include <libdash.h>
//#include <mpi.h>

using namespace std;

using std::cout;

int main(int argc, char *argv[])
{
  dash::init(&argc, &argv);

  size_t team_size = dash::Team::All().size();
  dash::TeamSpec<2> teamspec;
  teamspec.balance_extents();

  dash::global_unit_t myid = dash::myid();
  size_t num_units = dash::Team::All().size();

  if (num_units != 4)
  {
    cout << "Please run with mpirun -n 4" << endl;
    exit(-1);
  }

  size_t tilesize_x = 4;
  size_t tilesize_y = 4;
  size_t rows = tilesize_x * (num_units / 2);
  size_t cols = tilesize_y * (num_units / 2);
  dash::Matrix<int, 2> matrix(
      dash::SizeSpec<2>(
          rows,
          cols),
      dash::DistributionSpec<2>(),
      dash::Team::All(),
      teamspec);
  size_t matrix_size = rows * cols;
  DASH_ASSERT(matrix_size == matrix.size());
  DASH_ASSERT(rows == matrix.extent(0));
  DASH_ASSERT(cols == matrix.extent(1));

  if (0 == myid)
  {
    cout << "Matrix size: " << rows
         << " x " << cols
         << " == " << matrix_size
         << endl;
  }

  std::vector<size_t> my_start(2);
  std::vector<size_t> my_end(2);

  switch (myid)
  {
  case 0:
    my_start[0] = 0;
    my_start[1] = 0;
    break;
  case 1:
    my_start[0] = 0;
    my_start[1] = tilesize_y;
    break;
  case 2:
    my_start[0] = tilesize_x;
    my_start[1] = 0;
    break;
  case 3:
    my_start[0] = tilesize_x;
    my_start[1] = tilesize_y;
    break;
  default:
    break;
  }

  my_end[0] = my_start[0] + tilesize_x - 1;
  my_end[1] = my_start[1] + tilesize_y - 1;

  if (!myid)
    cout << "Assigning matrix values" << endl;

  cout << myid << "'s start = (" << my_start[0] << ", " << my_start[1] << "), my_end = " << my_end[0] << ", " << my_end[1] << " )" << endl;
  for (size_t i = my_start[0]; i <= my_end[0]; i++)
  {
    for (size_t k = my_start[1]; k <= my_end[1]; k++)
    {
      matrix[i][k] = myid;
    }
  }

  // Units waiting for value initialization
  dash::Team::All().barrier();

  // Read and assert values in matrix
  for (size_t i = my_start[0]; i <= my_end[0]; i++)
  {
    for (size_t k = my_start[1]; k <= my_end[1]; k++)
    {
      int value = matrix[i][k];
      int expected = myid;
      DASH_ASSERT(expected == value);
    }
  }

  if (!myid)
  {
    cout << "Print matrix values at rank 0: " << endl;
    for (size_t i = 0; i < matrix.extent(0); i++)
    {
      for (size_t k = 0; k < matrix.extent(1); k++)
      {
        int value = matrix[i][k];
        cout << value;
        cout << "   ";
      }
      cout << endl;
    }
  }

  std::vector<size_t> my_start_halo(2);
  std::vector<size_t> my_end_halo(2);

  switch (myid)
  {
  case 0:
    my_start_halo[0] = 0;
    my_start_halo[1] = 0;
    my_end_halo[0] = my_start_halo[0] + tilesize_x;
    my_end_halo[1] = my_start_halo[1] + tilesize_y;
    break;
  case 1:
    my_start_halo[0] = 0;
    my_start_halo[1] = tilesize_y - 1;
    my_end_halo[0] = my_start_halo[0] + tilesize_x;
    my_end_halo[1] = my_start_halo[1] + tilesize_y;
    break;
  case 2:
    my_start_halo[0] = tilesize_x - 1;
    my_start_halo[1] = 0;
    my_end_halo[0] = my_start_halo[0] + tilesize_x;
    my_end_halo[1] = my_start_halo[1] + tilesize_y;
    break;
  case 3:
    my_start_halo[0] = tilesize_x - 1;
    my_start_halo[1] = tilesize_y - 1;
    my_end_halo[0] = my_start_halo[0] + tilesize_x;
    my_end_halo[1] = my_start_halo[1] + tilesize_y;
    break;
  default:
    break;
  }

  sleep(myid * 3);

  cout << "myid =" << myid << "'s output (" << my_start_halo[0] << "," << my_start_halo[1] << ")->(" << my_end_halo[0] << "," << my_end_halo[1] << "):" << endl;
  // Read and print the tile together with its neighboring halo cells
  for (size_t i = my_start_halo[0]; i <= my_end_halo[0]; i++)
  {
    for (size_t k = my_start_halo[1]; k <= my_end_halo[1]; k++)
    {
      int value = matrix[i][k];
      cout << value << "  ";
    }
    cout << endl;
  }

  dash::Team::All().barrier();

  dash::finalize();
}

Member

dhinf commented Feb 18, 2020

Hi Bin,
I will look into it next week. The problem with using the subscript operator is the performance: the operator has to check on every access whether it is a halo or a local element. It is definitely possible, but costly compared to the current solution.

best
Denis

Author

Goon83 commented Feb 18, 2020

Hi Denis,
Sounds great. I think the cost of a simple check should be acceptable given the simplicity of the code. Using a subscript on a matrix is the more natural way to access it, without having to consider the data layout.

Bests,
Bin

Author

Goon83 commented Mar 4, 2020

Hi @dhinf and @devreal,
Have you had a chance to make progress on this line of work? Sorry for the many questions and tickets for DASH. So far I have built a DASH plugin for our ArrayUDF and delivered it to some of our users for testing. It works perfectly. If there is anything I can help with on the subscript operator for accessing both inner and halo elements, please let me know. This feature is quite important for expanding the use of DASH in ArrayUDF.

Bests,
Bin

Member

dhinf commented Mar 4, 2020

Hi Bin,
do you need access to inner and boundary elements separately via the subscript operator, or one access for all elements?
Something like:

for(int i = 0; i < extent_first_dim; ++i) {
  for(int j = 0; j < extent_second_dim; ++j) {
    dst[i][j] = src[i][j] + src[i-1][j] .....
  }
}

I will try to implement this, but you need to allocate a full stencil (or at least all necessary stencil points), otherwise you will get wrong results. I will do it alongside my current tasks, so it will take a while. Hopefully not longer than the end of next week.

Author

Goon83 commented Mar 4, 2020

Thanks dhinf,
Thanks for the hard work. I'd be glad to assist if needed.
Ideally, it would be one subscript operator attached to HaloMatrixWrapper that accesses both inner and halo elements, such as:

HaloMatrixWrapper m (4 by 4) with 1 layer halo, making a 5 by 5 matrix;

m[0][0] access 1st element (halo)
m[0][1] access 2nd element (inner)
m[0][2] access 3rd element (inner)
.....
m[0][4] access 5th element (halo)
m[1][0] access 6th element (halo)
m[1][1] access 7th element (inner)
....

Bests,
Bin

Member

dhinf commented Mar 29, 2020

Hi Bin,
the feat-halo branch now includes a coordinate-based access class. It is a prototype and has no access checks.
Also, the HaloWrapper now has a constructor that creates a halo environment with a full stencil.
The halo in the north-west is (-1,-1); this differs from your example.

I adapted your code to the new functionality:

#include <unistd.h>
#include <iostream>
#include <cstddef>
#include <iomanip>

#include <libdash.h>

using Matrix_t = dash::Matrix<int, 2>;
using HaloMatrixWrapper_t = dash::halo::HaloMatrixWrapper<Matrix_t>;

using namespace std;

using std::cout;

int main(int argc, char *argv[])
{
  dash::init(&argc, &argv);

  size_t team_size = dash::Team::All().size();
  dash::TeamSpec<2> teamspec;
  teamspec.balance_extents();

  dash::global_unit_t myid = dash::myid();
  size_t num_units = dash::Team::All().size();

  if (num_units != 4)
  {
    cout << "Please run with mpirun -n 4" << endl;
    exit(-1);
  }

  size_t tilesize_x = 4;
  size_t tilesize_y = 4;
  size_t rows = tilesize_x * (num_units / 2);
  size_t cols = tilesize_y * (num_units / 2);
  Matrix_t matrix(
      dash::SizeSpec<2>(
          rows,
          cols),
      dash::DistributionSpec<2>(dash::BLOCKED, dash::BLOCKED),
      dash::Team::All(),
      teamspec);
  size_t matrix_size = rows * cols;
  DASH_ASSERT(matrix_size == matrix.size());
  DASH_ASSERT(rows == matrix.extent(0));
  DASH_ASSERT(cols == matrix.extent(1));

  HaloMatrixWrapper_t wrapper(matrix, 1);

  if (0 == myid)
  {
    cout << "Matrix size: " << rows
         << " x " << cols
         << " == " << matrix_size
         << endl;
  }
  if (!myid) {
    cout << "Assigning matrix values" << endl;
  }

  auto access = wrapper.coordinate_access();

  auto lranges = access.ranges_local();
  cout << myid
       << "'s start = (" << lranges[0].begin << ", " << lranges[1].begin
       << "), my_end = " << lranges[0].end << ", " << lranges[1].end << " )"
       << endl << std::flush;

  for (size_t i = lranges[0].begin; i < lranges[0].end; ++i) {
    for (size_t k = lranges[1].begin; k < lranges[1].end; ++k) {
      access[i][k] = myid;
    }
  }

  // Units waiting for value initialization

  // Read and assert values in matrix
  for (size_t i = lranges[0].begin; i < lranges[0].end; ++i) {
    for (size_t k = lranges[1].begin; k < lranges[1].end; ++k) {
      auto value = access[i][k];
      auto expected = myid;
      DASH_ASSERT(expected == value);
    }
  }

  matrix.barrier();

  if (!myid)
  {
    cout << "Print matrix values at rank 0: " << endl;
    for (size_t i = 0; i < matrix.extent(0); ++i) {
      for (size_t k = 0; k < matrix.extent(1); ++k) {
        int value = matrix[i][k];
        cout << value;
        cout << "   ";
      }
      cout << endl;
    }
  }

  wrapper.update();
  dash::Team::All().barrier();

  sleep(myid*2);

  auto hranges = access.ranges_halo();
  cout << myid
       << "'s start = (" << hranges[0].begin << ", " << hranges[1].begin
       << "), my_end = " << hranges[0].end<< ", " << hranges[1].end<< " )"
       << endl << std::flush;
  // Read and assert values in matrix
  for (auto i = hranges[0].begin; i < hranges[0].end; ++i) {
    for (auto k = hranges[1].begin; k < hranges[1].end; ++k) {
      cout << access[i][k] << "  ";
    }
    cout << endl;
  }
  dash::Team::All().barrier();

  dash::finalize();
}

Best
Denis

Author

Goon83 commented Apr 1, 2020

Hi Denis,
Thanks for the hard work on this. I will test the example soon and post an update here once it is done.

Bests,
Bin

Author

Goon83 commented May 26, 2020

Hi @dhinf,
I tried the code on the development branch and it did not compile.

PS: I also tried to install the bug-dash-halo/bug-halo-wrapper branches for testing, but none of them work on Mac.

Bests,
Bin

test-ghost.cpp:62:27: error: no member named 'coordinate_access' in
      'dash::halo::HaloMatrixWrapper<dash::Matrix<int, 2, long,
      dash::TilePattern<2, dash::ROW_MAJOR, long>, dash::HostSpace> >'
    auto access = wrapper.coordinate_access();

opt/dash-0.4.0//include/dash/halo/Halo.h:733:43: error:
      member reference base type 'const int' is not a structure or union
    for(const auto& stencil : stencil_spec.specs()) {

Member

dhinf commented May 26, 2020

Hi Bin,
I forgot to mention that you need to use the feat-halo branch.

Please try it with this branch.

best
Denis

Author

Goon83 commented May 26, 2020

@dhinf thanks for the information.

I checked out the feat-halo branch, but it still reports the errors below:

/Users/dbin/work/soft/dash-ghost/build/opt/dash-0.4.0//include/dash/halo/Stencil.h:303:50: error:
      no type named 'point_value_t' in 'dash::halo::StencilPoint<2, double>'
  using stencil_dist_t = typename StencilPointT::point_value_t;
                         ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~
/Users/dbin/work/soft/dash-ghost/build/opt/dash-0.4.0//include/dash/halo/HaloMatrixWrapper.h:122:50: note:
      in instantiation of template class
      'dash::halo::StencilSpecFactory<dash::halo::StencilPoint<2, double> >'
      requested here
  : HaloMatrixWrapper(matrix, GlobBoundSpec_t(), StencilSpecFactory<St...
                                                 ^
test-halo.cpp:48:25: note: in instantiation of function template
      specialization 'dash::halo::HaloMatrixWrapper<dash::Matrix<int, 2, long,
      dash::TilePattern<2, dash::ROW_MAJOR, long>, dash::HostSpace>
      >::HaloMatrixWrapper<dash::halo::StencilPoint<2, double> >' requested
      here
    HaloMatrixWrapper_t wrapper(matrix, 1);

In file included from test-halo.cpp:6:
In file included from /Users/dbin/work/soft/dash-ghost/build/opt/dash-0.4.0//include/libdash.h:56:
/Users/dbin/work/soft/dash-ghost/build/opt/dash-0.4.0//include/dash/halo/HaloMatrixWrapper.h:122:83: error:
      incomplete definition of type
      'dash::halo::StencilSpecFactory<dash::halo::StencilPoint<2, double> >'
  ...GlobBoundSpec_t(), StencilSpecFactory<StencilPointT>::full_stencil_spe...
                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
test-halo.cpp:48:25: note: in instantiation of function template
      specialization 'dash::halo::HaloMatrixWrapper<dash::Matrix<int, 2, long,
      dash::TilePattern<2, dash::ROW_MAJOR, long>, dash::HostSpace>
      >::HaloMatrixWrapper<dash::halo::StencilPoint<2, double> >' requested
      here
    HaloMatrixWrapper_t wrapper(matrix, 1);
                        ^
2 errors generated.

Member

dhinf commented May 27, 2020

Hi Bin,

I fixed it. After I sent you the example, I changed the internal structure and forgot to update that alias in Stencil.h.

best
Denis

Author

Goon83 commented May 28, 2020

Hi @dhinf,
Thanks for resolving the issue. It works now.

Will you merge it into the development branch?

Bests,
Bin

Member

dhinf commented Jun 16, 2020

Hi Bin,
I forgot to answer that question. Hopefully I will find some time to clean up the code and then merge it.

best
Denis

Author

Goon83 commented Jan 6, 2021

@dhinf
I just revisited some old issue posts and found that this new feature is still not in the development branch.

Could you kindly help to merge it into development?

Thanks and happy new year!

Bests,
Bin

Member

dhinf commented Jan 7, 2021

Hi Goon83,
I will merge it in the next few days.

Happy new year and bests
Denis

Member

dhinf commented Jan 15, 2021

Hi Goon83,
I need to fix some bugs for small NArrays. I will merge the fix and the feat-halo branch into development next week.

best
Denis

Member

dhinf commented Feb 1, 2021

It is merged now.
