Commit 3c02462

Changed initialization to reduce the magnitude of the L2 norm of the kernel results. Updated sample_output.
1 parent fa0e173 commit 3c02462

17 files changed: 167 additions & 35 deletions

sample_output/OUT_WHAMO_CUDA

Lines changed: 9 additions & 9 deletions
@@ -2,14 +2,14 @@ srun -n 1 -p nvidia bin/sw4ck ../src/sw4ck.in
 Reading from file ../src/sw4ck.in
 Launching sw4 kernels

-Kernel 1 time 4.69094
-Kernel 2 time 1.25952
-Kernel 3 time 1.26464
-Kernel 4 time 1.11923
-Kernel 5 time 5.38419
+Kernel 1 time 4.72678
+Kernel 2 time 1.26259
+Kernel 3 time 1.26054
+Kernel 4 time 1.11821
+Kernel 5 time 5.41491

-Total kernel runtime = 14
+Total kernel runtime = 13

-Norm of output 0x1.b5fa52fe0079dp+59
-Norm of output 986238393426103936
-Error = -1.3e-14 %
+Norm of output 0x1.941a40aec142ep+7
+Norm of output 202.0512747393526638
+Error = 0 %

sample_output/OUT_WHAMO_CUDA_RAJA

Lines changed: 9 additions & 9 deletions
@@ -2,14 +2,14 @@ srun -n 1 -p nvidia bin/sw4ck ../src/sw4ck.in
 Reading from file ../src/sw4ck.in
 Launching sw4 kernels

-Kernel 1 time 6.92634
-Kernel 2 time 1.70496
-Kernel 3 time 1.74182
-Kernel 4 time 1.36806
-Kernel 5 time 8.98355
+Kernel 1 time 6.94067
+Kernel 2 time 1.7152
+Kernel 3 time 1.69062
+Kernel 4 time 1.33939
+Kernel 5 time 8.64973

-Total kernel runtime = 21
+Total kernel runtime = 20

-Norm of output 0x1.b5fa52fe0079fp+59
-Norm of output 986238393426104192
-Error = 1.3e-14 %
+Norm of output 0x1.941a40aec142ep+7
+Norm of output 202.0512747393526638
+Error = 0 %

sample_output/OUT_WHAMO_HIP_CC

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+srun -n 1 -p amd ./sw4ck sw4ck.in
+Reading from file sw4ck.in
+Launching sw4 kernels
+
+Kernel 1 time 58.0426
+Kernel 2 time 24.1525
+Kernel 3 time 23.6964
+Kernel 4 time 17.5566
+Kernel 5 time 50.5733
+
+Total kernel runtime = 174
+
+Norm of output 0x1.941a40ae9d6fap+7
+Norm of output 202.05127473518206216
+Error = -2.1e-09 %
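
The `Error` line reads like a signed relative difference between the computed norm and a stored reference, expressed in percent. The commit shows neither the formula nor the reference value, so the sketch below is an assumption on both counts: it uses `100 * (norm - ref) / ref` and takes as reference the norm printed by the runs that report `Error = 0 %`. `error_percent` is a hypothetical helper, not part of sw4ck.

```python
def error_percent(norm_hex: str, ref_hex: str) -> float:
    """Signed relative difference in percent (assumed formula, not taken from sw4ck)."""
    norm = float.fromhex(norm_hex)  # hex-float strings decode to the exact double
    ref = float.fromhex(ref_hex)
    return 100.0 * (norm - ref) / ref


# Assumed reference: the norm printed by the runs that report "Error = 0 %".
REF_NEW = "0x1.941a40aec142ep+7"
# Norm printed in sample_output/OUT_WHAMO_HIP_CC.
HIP_CC = "0x1.941a40ae9d6fap+7"

print(f"Error = {error_percent(HIP_CC, REF_NEW):.2g} %")
```

Under these assumptions the script prints `Error = -2.1e-09 %`, matching the value in this file. Comparing the hexadecimal form avoids any dependence on how many decimal digits were printed.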

sample_output/OUT_WHAMO_HIP_HIPCC

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+srun -n 1 -p amd ./sw4ck sw4ck.in
+Reading from file sw4ck.in
+Launching sw4 kernels
+
+Kernel 1 time 15.3942
+Kernel 2 time 4.6632
+Kernel 3 time 3.43617
+Kernel 4 time 3.74249
+Kernel 5 time 15.4687
+
+Total kernel runtime = 42
+
+Norm of output 0x1.941a40aec142ep+7
+Norm of output 202.0512747393526638
+Error = 0 %
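
To compare the two HIP builds, the per-kernel times can be pulled out of the sample output and set side by side. This is a minimal sketch that assumes the `Kernel N time T` line format shown in these files; `kernel_times` is a hypothetical helper, and the commit does not state the unit of the times. The summed value need not equal the printed `Total kernel runtime`, which is reported as an integer.

```python
import re

# Matches lines such as "Kernel 3 time 23.6964" (format assumed from the sample output).
KERNEL_LINE = re.compile(r"^Kernel\s+(\d+)\s+time\s+([0-9.eE+-]+)$")


def kernel_times(text: str) -> dict[int, float]:
    """Return {kernel number: time} for every 'Kernel N time T' line in text."""
    times = {}
    for line in text.splitlines():
        match = KERNEL_LINE.match(line.strip())
        if match:
            times[int(match.group(1))] = float(match.group(2))
    return times


# Timing lines copied from OUT_WHAMO_HIP_CC and OUT_WHAMO_HIP_HIPCC.
HIP_CC_OUT = """\
Kernel 1 time 58.0426
Kernel 2 time 24.1525
Kernel 3 time 23.6964
Kernel 4 time 17.5566
Kernel 5 time 50.5733
"""
HIP_HIPCC_OUT = """\
Kernel 1 time 15.3942
Kernel 2 time 4.6632
Kernel 3 time 3.43617
Kernel 4 time 3.74249
Kernel 5 time 15.4687
"""

cc = kernel_times(HIP_CC_OUT)
hipcc = kernel_times(HIP_HIPCC_OUT)
for k in sorted(cc):
    print(f"Kernel {k}: HIP_CC / HIP_HIPCC time ratio = {cc[k] / hipcc[k]:.2f}")
print(f"Sum HIP_CC = {sum(cc.values()):.4f}, sum HIP_HIPCC = {sum(hipcc.values()):.4f}")
```

Keying the result by kernel number rather than by position keeps the comparison correct even if a file lists the kernels in a different order.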

sample_output/OUT_WHAMO_HIP_RAJA

Lines changed: 8 additions & 8 deletions
@@ -2,14 +2,14 @@ srun -n 1 -p amd bin/sw4ck ../src/sw4ck.in
 Reading from file ../src/sw4ck.in
 Launching sw4 kernels

-Kernel 1 time 21.2905
-Kernel 2 time 30.6422
-Kernel 3 time 37.4966
-Kernel 4 time 28.6283
-Kernel 5 time 22.5596
+Kernel 1 time 21.0319
+Kernel 2 time 30.7601
+Kernel 3 time 37.5654
+Kernel 4 time 28.6686
+Kernel 5 time 22.607

 Total kernel runtime = 140

-Norm of output 0x1.b5fa52fe0079fp+59
-Norm of output 986238393426104192
-Error = 1.3e-14 %
+Norm of output 0x1.941a40aec142ep+7
+Norm of output 202.0512747393526638
+Error = 0 %

sample_output/init/OUT_WHAMO_CUDA

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+srun -n 1 -p nvidia bin/sw4ck ../src/sw4ck.in
+Reading from file ../src/sw4ck.in
+Launching sw4 kernels
+
+Kernel 1 time 4.69094
+Kernel 2 time 1.25952
+Kernel 3 time 1.26464
+Kernel 4 time 1.11923
+Kernel 5 time 5.38419
+
+Total kernel runtime = 14
+
+Norm of output 0x1.b5fa52fe0079dp+59
+Norm of output 986238393426103936
+Error = -1.3e-14 %
