Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang][acc] Add a missing acc.delete generation for the copyin clause #122539

Merged
merged 3 commits into from
Jan 11, 2025

Conversation

khaki3
Copy link
Contributor

@khaki3 khaki3 commented Jan 10, 2025

We are missing the deletion part of the copyin clause after a region or in a destructor. This PR completes its implementation for data regions, compute regions, and global declarations.

Example:

subroutine sub()
  real :: x(1:10)
  !$acc data copyin(x)
  !$acc end data
end subroutine sub

We are getting the following:

    %5 = acc.copyin varPtr(%2#0 : !fir.ref<!fir.array<10xf32>>) bounds(%4) -> !fir.ref<!fir.array<10xf32>> {name = "x"}
    acc.data dataOperands(%5 : !fir.ref<!fir.array<10xf32>>) {
      acc.terminator
    }
    return

With this PR, we'll get:

    %5 = acc.copyin varPtr(%2#0 : !fir.ref<!fir.array<10xf32>>) bounds(%4) -> !fir.ref<!fir.array<10xf32>> {name = "x"}
    acc.data dataOperands(%5 : !fir.ref<!fir.array<10xf32>>) {
      acc.terminator
    }
    acc.delete accPtr(%5 : !fir.ref<!fir.array<10xf32>>) bounds(%4) {dataClause = #acc<data_clause acc_copyin>, name = "x"}
    return

@khaki3 khaki3 requested a review from clementval January 10, 2025 22:11
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir openacc labels Jan 10, 2025
@khaki3 khaki3 requested a review from razvanlupusoru January 10, 2025 22:11
@llvmbot
Copy link
Member

llvmbot commented Jan 10, 2025

@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-openacc

Author: None (khaki3)

Changes

We are missing the deletion part of the copyin clause after a region or in a destructor. This PR completes its implementation for data regions, compute regions, and global declarations.

Example:

subroutine sub()
  real :: x(1:10)
  !$acc data copyin(x)
  !$acc end data
end subroutine sub

We are getting the following:

    %5 = acc.copyin varPtr(%2#<!-- -->0 : !fir.ref&lt;!fir.array&lt;10xf32&gt;&gt;) bounds(%4) -&gt; !fir.ref&lt;!fir.array&lt;10xf32&gt;&gt; {name = "x"}
    acc.data dataOperands(%5 : !fir.ref&lt;!fir.array&lt;10xf32&gt;&gt;) {
      acc.terminator
    }
    return

With this PR, we'll get:

    %5 = acc.copyin varPtr(%2#<!-- -->0 : !fir.ref&lt;!fir.array&lt;10xf32&gt;&gt;) bounds(%4) -&gt; !fir.ref&lt;!fir.array&lt;10xf32&gt;&gt; {name = "x"}
    acc.data dataOperands(%5 : !fir.ref&lt;!fir.array&lt;10xf32&gt;&gt;) {
      acc.terminator
    }
    acc.delete accPtr(%5 : !fir.ref&lt;!fir.array&lt;10xf32&gt;&gt;) bounds(%4) {dataClause = #acc&lt;data_clause acc_copyin&gt;, name = "x"}
    return

Patch is 23.59 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/122539.diff

11 Files Affected:

  • (modified) flang/lib/Lower/OpenACC.cpp (+27-8)
  • (modified) flang/test/Lower/OpenACC/acc-data-operands.f90 (+8-5)
  • (modified) flang/test/Lower/OpenACC/acc-data.f90 (+3)
  • (modified) flang/test/Lower/OpenACC/acc-declare-globals.f90 (+8)
  • (modified) flang/test/Lower/OpenACC/acc-declare.f90 (+6-4)
  • (modified) flang/test/Lower/OpenACC/acc-kernels-loop.f90 (+2)
  • (modified) flang/test/Lower/OpenACC/acc-kernels.f90 (+3)
  • (modified) flang/test/Lower/OpenACC/acc-parallel-loop.f90 (+2)
  • (modified) flang/test/Lower/OpenACC/acc-parallel.f90 (+3)
  • (modified) flang/test/Lower/OpenACC/acc-serial-loop.f90 (+2)
  • (modified) flang/test/Lower/OpenACC/acc-serial.f90 (+3)
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index 8155c36396b112..86d8a549331fb0 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -2183,8 +2183,9 @@ static Op createComputeOp(
   mlir::Value ifCond;
   mlir::Value selfCond;
   llvm::SmallVector<mlir::Value> waitOperands, attachEntryOperands,
-      copyEntryOperands, copyoutEntryOperands, createEntryOperands,
-      dataClauseOperands, numGangs, numWorkers, vectorLength, async;
+      copyEntryOperands, copyinEntryOperands, copyoutEntryOperands,
+      createEntryOperands, dataClauseOperands, numGangs, numWorkers,
+      vectorLength, async;
   llvm::SmallVector<mlir::Attribute> numGangsDeviceTypes, numWorkersDeviceTypes,
       vectorLengthDeviceTypes, asyncDeviceTypes, asyncOnlyDeviceTypes,
       waitOperandsDeviceTypes, waitOnlyDeviceTypes;
@@ -2321,6 +2322,7 @@ static Op createComputeOp(
                                dataClauseOperands.end());
     } else if (const auto *copyinClause =
                    std::get_if<Fortran::parser::AccClause::Copyin>(&clause.u)) {
+      auto crtDataStart = dataClauseOperands.size();
       genDataOperandOperationsWithModifier<mlir::acc::CopyinOp,
                                            Fortran::parser::AccClause::Copyin>(
           copyinClause, converter, semanticsContext, stmtCtx,
@@ -2328,6 +2330,8 @@ static Op createComputeOp(
           dataClauseOperands, mlir::acc::DataClause::acc_copyin,
           mlir::acc::DataClause::acc_copyin_readonly, async, asyncDeviceTypes,
           asyncOnlyDeviceTypes);
+      copyinEntryOperands.append(dataClauseOperands.begin() + crtDataStart,
+                                 dataClauseOperands.end());
     } else if (const auto *copyoutClause =
                    std::get_if<Fortran::parser::AccClause::Copyout>(
                        &clause.u)) {
@@ -2525,6 +2529,8 @@ static Op createComputeOp(
   // Create the exit operations after the region.
   genDataExitOperations<mlir::acc::CopyinOp, mlir::acc::CopyoutOp>(
       builder, copyEntryOperands, /*structured=*/true);
+  genDataExitOperations<mlir::acc::CopyinOp, mlir::acc::DeleteOp>(
+      builder, copyinEntryOperands, /*structured=*/true);
   genDataExitOperations<mlir::acc::CreateOp, mlir::acc::CopyoutOp>(
       builder, copyoutEntryOperands, /*structured=*/true);
   genDataExitOperations<mlir::acc::AttachOp, mlir::acc::DetachOp>(
@@ -2544,8 +2550,8 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter,
                          const Fortran::parser::AccClauseList &accClauseList) {
   mlir::Value ifCond;
   llvm::SmallVector<mlir::Value> attachEntryOperands, createEntryOperands,
-      copyEntryOperands, copyoutEntryOperands, dataClauseOperands, waitOperands,
-      async;
+      copyEntryOperands, copyinEntryOperands, copyoutEntryOperands,
+      dataClauseOperands, waitOperands, async;
   llvm::SmallVector<mlir::Attribute> asyncDeviceTypes, asyncOnlyDeviceTypes,
       waitOperandsDeviceTypes, waitOnlyDeviceTypes;
   llvm::SmallVector<int32_t> waitOperandsSegments;
@@ -2604,6 +2610,7 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter,
                                dataClauseOperands.end());
     } else if (const auto *copyinClause =
                    std::get_if<Fortran::parser::AccClause::Copyin>(&clause.u)) {
+      auto crtDataStart = dataClauseOperands.size();
       genDataOperandOperationsWithModifier<mlir::acc::CopyinOp,
                                            Fortran::parser::AccClause::Copyin>(
           copyinClause, converter, semanticsContext, stmtCtx,
@@ -2611,6 +2618,8 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter,
           dataClauseOperands, mlir::acc::DataClause::acc_copyin,
           mlir::acc::DataClause::acc_copyin_readonly, async, asyncDeviceTypes,
           asyncOnlyDeviceTypes);
+      copyinEntryOperands.append(dataClauseOperands.begin() + crtDataStart,
+                                 dataClauseOperands.end());
     } else if (const auto *copyoutClause =
                    std::get_if<Fortran::parser::AccClause::Copyout>(
                        &clause.u)) {
@@ -2723,6 +2732,8 @@ static void genACCDataOp(Fortran::lower::AbstractConverter &converter,
   // Create the exit operations after the region.
   genDataExitOperations<mlir::acc::CopyinOp, mlir::acc::CopyoutOp>(
       builder, copyEntryOperands, /*structured=*/true);
+  genDataExitOperations<mlir::acc::CopyinOp, mlir::acc::DeleteOp>(
+      builder, copyinEntryOperands, /*structured=*/true);
   genDataExitOperations<mlir::acc::CreateOp, mlir::acc::CopyoutOp>(
       builder, copyoutEntryOperands, /*structured=*/true);
   genDataExitOperations<mlir::acc::AttachOp, mlir::acc::DetachOp>(
@@ -3691,7 +3702,8 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter,
                      mlir::Location loc,
                      const Fortran::parser::AccClauseList &accClauseList) {
   llvm::SmallVector<mlir::Value> dataClauseOperands, copyEntryOperands,
-      createEntryOperands, copyoutEntryOperands, deviceResidentEntryOperands;
+      copyinEntryOperands, createEntryOperands, copyoutEntryOperands,
+      deviceResidentEntryOperands;
   Fortran::lower::StatementContext stmtCtx;
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
 
@@ -3729,12 +3741,15 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter,
           /*structured=*/true, /*implicit=*/false);
     } else if (const auto *copyinClause =
                    std::get_if<Fortran::parser::AccClause::Copyin>(&clause.u)) {
+      auto crtDataStart = dataClauseOperands.size();
       genDeclareDataOperandOperationsWithModifier<mlir::acc::CopyinOp,
                                                   mlir::acc::DeleteOp>(
           copyinClause, converter, semanticsContext, stmtCtx,
           Fortran::parser::AccDataModifier::Modifier::ReadOnly,
           dataClauseOperands, mlir::acc::DataClause::acc_copyin,
           mlir::acc::DataClause::acc_copyin_readonly);
+      copyinEntryOperands.append(dataClauseOperands.begin() + crtDataStart,
+                                 dataClauseOperands.end());
     } else if (const auto *copyoutClause =
                    std::get_if<Fortran::parser::AccClause::Copyout>(
                        &clause.u)) {
@@ -3801,12 +3816,14 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter,
   }
 
   openAccCtx.attachCleanup([&builder, loc, createEntryOperands,
-                            copyEntryOperands, copyoutEntryOperands,
-                            deviceResidentEntryOperands, declareToken]() {
+                            copyEntryOperands, copyinEntryOperands,
+                            copyoutEntryOperands, deviceResidentEntryOperands,
+                            declareToken]() {
     llvm::SmallVector<mlir::Value> operands;
     operands.append(createEntryOperands);
     operands.append(deviceResidentEntryOperands);
     operands.append(copyEntryOperands);
+    operands.append(copyinEntryOperands);
     operands.append(copyoutEntryOperands);
 
     mlir::func::FuncOp funcOp = builder.getFunction();
@@ -3825,6 +3842,8 @@ genDeclareInFunction(Fortran::lower::AbstractConverter &converter,
         builder, deviceResidentEntryOperands, /*structured=*/true);
     genDataExitOperations<mlir::acc::CopyinOp, mlir::acc::CopyoutOp>(
         builder, copyEntryOperands, /*structured=*/true);
+    genDataExitOperations<mlir::acc::CopyinOp, mlir::acc::DeleteOp>(
+        builder, copyinEntryOperands, /*structured=*/true);
     genDataExitOperations<mlir::acc::CreateOp, mlir::acc::CopyoutOp>(
         builder, copyoutEntryOperands, /*structured=*/true);
   });
@@ -3848,7 +3867,7 @@ genDeclareInModule(Fortran::lower::AbstractConverter &converter,
     } else if (const auto *copyinClause =
                    std::get_if<Fortran::parser::AccClause::Copyin>(&clause.u)) {
       genGlobalCtorsWithModifier<Fortran::parser::AccClause::Copyin,
-                                 mlir::acc::CopyinOp, mlir::acc::CopyinOp>(
+                                 mlir::acc::CopyinOp, mlir::acc::DeleteOp>(
           converter, modBuilder, copyinClause,
           Fortran::parser::AccDataModifier::Modifier::ReadOnly,
           mlir::acc::DataClause::acc_copyin,
diff --git a/flang/test/Lower/OpenACC/acc-data-operands.f90 b/flang/test/Lower/OpenACC/acc-data-operands.f90
index 5151e8f6a135cd..4341ea687070b6 100644
--- a/flang/test/Lower/OpenACC/acc-data-operands.f90
+++ b/flang/test/Lower/OpenACC/acc-data-operands.f90
@@ -35,6 +35,7 @@ subroutine acc_operand_array_section()
 ! CHECK: acc.data dataOperands(%[[COPYIN]], %[[COPYOUT_CREATE]] : !fir.ref<!fir.array<100xf32>>, !fir.ref<!fir.array<100xf32>>) {
 ! CHECK:   acc.terminator
 ! CHECK: }
+! CHECK: acc.delete accPtr(%[[COPYIN]] : !fir.ref<!fir.array<100xf32>>) bounds(%[[BOUND_1_50]]) {dataClause = #acc<data_clause acc_copyin>, name = "a(1:50)"}
 ! CHECK: acc.copyout accPtr(%[[COPYOUT_CREATE]] : !fir.ref<!fir.array<100xf32>>) bounds(%[[BOUND_51_100]]) to varPtr(%[[DECL]]#0 : !fir.ref<!fir.array<100xf32>>) {name = "a(51:100)"}
 
 ! Testing array sections of a derived-type component
@@ -138,9 +139,9 @@ subroutine acc_operand_array_section_allocatable()
 ! CHECK: %[[LOAD_BOX_A_2:.*]] = fir.load %[[DECLA]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
 ! CHECK: %[[C0:.*]] = arith.constant 0 : index
 ! CHECK: %[[DIMS0_2:.*]]:3 = fir.box_dims %[[LOAD_BOX_A_2]], %[[C0]] : (!fir.box<!fir.heap<!fir.array<?xf32>>>, index) -> (index, index, index)
-! CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS0_2]]#1 : index) stride(%[[DIMS0_1]]#2 : index) startIdx(%[[DIMS0_0]]#0 : index) {strideInBytes = true}
+! CHECK: %[[BOUND_1_50:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS0_2]]#1 : index) stride(%[[DIMS0_1]]#2 : index) startIdx(%[[DIMS0_0]]#0 : index) {strideInBytes = true}
 ! CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[LOAD_BOX_A_0]] : (!fir.box<!fir.heap<!fir.array<?xf32>>>) -> !fir.heap<!fir.array<?xf32>>
-! CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[BOX_ADDR]] : !fir.heap<!fir.array<?xf32>>) bounds(%[[BOUND]]) -> !fir.heap<!fir.array<?xf32>> {name = "a(1:50)"}
+! CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[BOX_ADDR]] : !fir.heap<!fir.array<?xf32>>) bounds(%[[BOUND_1_50]]) -> !fir.heap<!fir.array<?xf32>> {name = "a(1:50)"}
 ! CHECK: %[[LOAD_BOX_A_0:.*]] = fir.load %[[DECLA]]#0 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
 ! CHECK: %[[LOAD_BOX_A_1:.*]] = fir.load %[[DECLA]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
 ! CHECK: %[[C0:.*]] = arith.constant 0 : index
@@ -154,13 +155,14 @@ subroutine acc_operand_array_section_allocatable()
 ! CHECK: %[[LOAD_BOX_A_2:.*]] = fir.load %[[DECLA]]#1 : !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
 ! CHECK: %[[C0:.*]] = arith.constant 0 : index
 ! CHECK: %[[DIMS0_2:.*]]:3 = fir.box_dims %[[LOAD_BOX_A_2]], %[[C0]] : (!fir.box<!fir.heap<!fir.array<?xf32>>>, index) -> (index, index, index)
-! CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS0_2]]#1 : index) stride(%[[DIMS0_1]]#2 : index) startIdx(%[[DIMS0_0]]#0 : index) {strideInBytes = true}
+! CHECK: %[[BOUND_51_100:.*]] = acc.bounds lowerbound(%[[LB]] : index) upperbound(%[[UB]] : index) extent(%[[DIMS0_2]]#1 : index) stride(%[[DIMS0_1]]#2 : index) startIdx(%[[DIMS0_0]]#0 : index) {strideInBytes = true}
 ! CHECK: %[[BOX_ADDR:.*]] = fir.box_addr %[[LOAD_BOX_A_0]] : (!fir.box<!fir.heap<!fir.array<?xf32>>>) -> !fir.heap<!fir.array<?xf32>>
-! CHECK: %[[COPYOUT_CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap<!fir.array<?xf32>>) bounds(%[[BOUND]]) -> !fir.heap<!fir.array<?xf32>> {dataClause = #acc<data_clause acc_copyout>, name = "a(51:100)"}
+! CHECK: %[[COPYOUT_CREATE:.*]] = acc.create varPtr(%[[BOX_ADDR]] : !fir.heap<!fir.array<?xf32>>) bounds(%[[BOUND_51_100]]) -> !fir.heap<!fir.array<?xf32>> {dataClause = #acc<data_clause acc_copyout>, name = "a(51:100)"}
 ! CHECK: acc.data dataOperands(%[[COPYIN]], %[[COPYOUT_CREATE]] : !fir.heap<!fir.array<?xf32>>, !fir.heap<!fir.array<?xf32>>) {
 ! CHECK:   acc.terminator
 ! CHECK: }
-! CHECK: acc.copyout accPtr(%[[COPYOUT_CREATE]] : !fir.heap<!fir.array<?xf32>>) bounds(%[[BOUND]]) to varPtr(%[[BOX_ADDR]] : !fir.heap<!fir.array<?xf32>>) {name = "a(51:100)"}
+! CHECK: acc.delete accPtr(%[[COPYIN]] : !fir.heap<!fir.array<?xf32>>) bounds(%[[BOUND_1_50]]) {dataClause = #acc<data_clause acc_copyin>, name = "a(1:50)"}
+! CHECK: acc.copyout accPtr(%[[COPYOUT_CREATE]] : !fir.heap<!fir.array<?xf32>>) bounds(%[[BOUND_51_100]]) to varPtr(%[[BOX_ADDR]] : !fir.heap<!fir.array<?xf32>>) {name = "a(51:100)"}
 
 
 ! Testing array sections on pointer array
@@ -196,6 +198,7 @@ subroutine acc_operand_array_section_pointer()
 ! CHECK: acc.data dataOperands(%[[COPYIN]] : !fir.ptr<!fir.array<?xf32>>) {
 ! CHECK:   acc.terminator
 ! CHECK: }
+! CHECK: acc.delete accPtr(%[[COPYIN]] : !fir.ptr<!fir.array<?xf32>>) bounds(%[[BOUND]]) {dataClause = #acc<data_clause acc_copyin>, name = "p(1:50)"}
 
 
 end module
diff --git a/flang/test/Lower/OpenACC/acc-data.f90 b/flang/test/Lower/OpenACC/acc-data.f90
index 6e0ecb9129061c..074f4d1135f109 100644
--- a/flang/test/Lower/OpenACC/acc-data.f90
+++ b/flang/test/Lower/OpenACC/acc-data.f90
@@ -74,6 +74,9 @@ subroutine acc_data
 ! CHECK:      acc.data dataOperands(%[[COPYIN_A]], %[[COPYIN_B]], %[[COPYIN_C]] : !fir.ref<!fir.array<10x10xf32>>, !fir.ref<!fir.array<10x10xf32>>, !fir.ref<!fir.array<10x10xf32>>) {
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
+! CHECK: acc.delete accPtr(%[[COPYIN_A]] : !fir.ref<!fir.array<10x10xf32>>) bounds(%{{.*}}, %{{.*}}) {dataClause = #acc<data_clause acc_copyin>, name = "a"}
+! CHECK: acc.delete accPtr(%[[COPYIN_B]] : !fir.ref<!fir.array<10x10xf32>>) bounds(%{{.*}}, %{{.*}}) {dataClause = #acc<data_clause acc_copyin_readonly>, name = "b"}
+! CHECK: acc.delete accPtr(%[[COPYIN_C]] : !fir.ref<!fir.array<10x10xf32>>) bounds(%{{.*}}, %{{.*}}) {dataClause = #acc<data_clause acc_copyin_readonly>, name = "c"}
 
   !$acc data copyout(a) copyout(zero: b) copyout(c)
   !$acc end data
diff --git a/flang/test/Lower/OpenACC/acc-declare-globals.f90 b/flang/test/Lower/OpenACC/acc-declare-globals.f90
index 1c54010dc108a2..b64bbbcc0d12f0 100644
--- a/flang/test/Lower/OpenACC/acc-declare-globals.f90
+++ b/flang/test/Lower/OpenACC/acc-declare-globals.f90
@@ -39,6 +39,14 @@ module acc_declare_copyin_test
 ! CHECK:         acc.terminator
 ! CHECK:       }
 
+! CHECK-LABEL: acc.global_dtor @_QMacc_declare_copyin_testEdata1_acc_dtor {
+! CHECK:         %[[GLOBAL_ADDR:.*]] = fir.address_of(@_QMacc_declare_copyin_testEdata1) {acc.declare = #acc.declare<dataClause = acc_copyin>} : !fir.ref<!fir.array<100000xf32>>
+! CHECK:         %[[DEVICEPTR:.*]] = acc.getdeviceptr varPtr(%[[GLOBAL_ADDR]] : !fir.ref<!fir.array<100000xf32>>) -> !fir.ref<!fir.array<100000xf32>> {dataClause = #acc<data_clause acc_copyin>, name = "data1", structured = false}
+! CHECK:         acc.declare_exit dataOperands(%[[DEVICEPTR]] : !fir.ref<!fir.array<100000xf32>>)
+! CHECK:         acc.delete accPtr(%[[DEVICEPTR]] : !fir.ref<!fir.array<100000xf32>>) {dataClause = #acc<data_clause acc_copyin>, name = "data1", structured = false}
+! CHECK:         acc.terminator
+! CHECK:       }
+
 module acc_declare_device_resident_test
  integer, parameter :: n = 5000
  integer, dimension(n) :: data1
diff --git a/flang/test/Lower/OpenACC/acc-declare.f90 b/flang/test/Lower/OpenACC/acc-declare.f90
index 0066e712fbdcce..f40216e2c9fe8e 100644
--- a/flang/test/Lower/OpenACC/acc-declare.f90
+++ b/flang/test/Lower/OpenACC/acc-declare.f90
@@ -82,12 +82,14 @@ subroutine acc_declare_copyin()
 ! CHECK: %[[ADECL:.*]]:2 = hlfir.declare %[[A]](%{{.*}}) {acc.declare = #acc.declare<dataClause =  acc_copyin>, uniq_name = "_QMacc_declareFacc_declare_copyinEa"} : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>) -> (!fir.ref<!fir.array<100xi32>>, !fir.ref<!fir.array<100xi32>>)
 ! CHECK: %[[B:.*]] = fir.alloca !fir.array<10xi32> {bindc_name = "b", uniq_name = "_QMacc_declareFacc_declare_copyinEb"}
 ! CHECK: %[[BDECL:.*]]:2 = hlfir.declare %[[B]](%{{.*}}) {acc.declare = #acc.declare<dataClause =  acc_copyin_readonly>, uniq_name = "_QMacc_declareFacc_declare_copyinEb"} : (!fir.ref<!fir.array<10xi32>>, !fir.shape<1>) -> (!fir.ref<!fir.array<10xi32>>, !fir.ref<!fir.array<10xi32>>)
-! CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%{{.*}} : index) stride(%{{.*}} : index) startIdx(%{{.*}} : index)
-! CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[ADECL]]#0 : !fir.ref<!fir.array<100xi32>>)   bounds(%[[BOUND]]) -> !fir.ref<!fir.array<100xi32>> {name = "a"}
-! CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%{{.*}} : index) stride(%{{.*}} : index) startIdx(%{{.*}} : index)
-! CHECK: %[[COPYIN_B:.*]] = acc.copyin varPtr(%[[BDECL]]#0 : !fir.ref<!fir.array<10xi32>>)   bounds(%[[BOUND]]) -> !fir.ref<!fir.array<10xi32>> {dataClause = #acc<data_clause acc_copyin_readonly>, name = "b"}
+! CHECK: %[[BOUND_A:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%{{.*}} : index) stride(%{{.*}} : index) startIdx(%{{.*}} : index)
+! CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[ADECL]]#0 : !fir.ref<!fir.array<100xi32>>)   bounds(%[[BOUND_A]]) -> !fir.ref<!fir.array<100xi32>> {name = "a"}
+! CHECK: %[[BOUND_B:.*]] = acc.bounds lowerbound(%{{.*}} : index) upperbound(%{{.*}} : index) extent(%{{.*}} : index) stride(%{{.*}} : index) startIdx(%{{.*}} : index)
+! CHECK: %[[COPYIN_B:.*]] = acc.copyin varPtr(%[[BDECL]]#0 : !fir.ref<!fir.array<10xi32>>)   bounds(%[[BOUND_B]]) -> !fir.ref<!fir.array<10xi32>> {dataClause = #acc<data_clause acc_copyin_readonly>, name = "b"}
 ! CHECK: acc.declare_enter dataOperands(%[[COPYIN_A]], %[[COPYIN_B]] : !fir.ref<!fir.array<100xi32>>, !fir.ref<!fir.array<10xi32>>)
 ! CHECK: %{{.*}}:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (index, i32)
+! CHECK: acc.delete accPtr(%[[COPYIN_A]] : !fir.ref<!fir.array<100xi32>>)   bounds(%[[BOUND_A]]) {dataClause = #acc<data_clause acc_copyin>, name = "a"}
+! CHECK: acc.delete accPtr(%[[COPYIN_B]] : !fir.ref<!fir.array<10xi32>>)   bounds(%[[BOUND_B]]) {dataClause = #acc<data_clause acc_copyin_readonly>, name = "b"}
 
   subroutine acc_declare_copyout()
     integer :: a(100), i
diff --git a/flang/test/Lower/OpenACC/acc-kernels-loop.f90 b/flang/test/Lower/OpenACC/acc-kernels-loop.f90
index e5791f0e5b3921..388a14d278d6c6 100644
--- a/flang/test/Lower/OpenACC/acc-kernels-loop.f90
+++ b/flang/test/Lower/OpenACC/acc-kernels-loop.f90
@@ -345,6 +345,8 @@ subroutine acc_kernels_loop
 ! CHECK-NEXT:   }{{$}}
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
+! CHECK:      acc.delete accPtr(%[[COPYIN_A]] : !fir.ref<!fir.array<10xf32>>) bounds(%{{.*}}) {dataClause = #acc<data_clause acc_copyin>, name = "a"}
+! CHECK:      acc.delete accPtr(%[[COPYIN_B]] : !fir.ref<!fir.array<10xf32>>) bounds(%{{.*}}) {dataClause = #acc<data_clause acc_copyin_readonly>, name = "b"}
 
   !$acc kernels loop copyout(a) copyout(zero: b)
   DO i = 1, n
diff --git a/flang/test/Lower/OpenACC/acc-kernels.f90 b/flang/test/Lower/OpenACC/acc-kernels.f90
index ff4f1d3b545911..7282fee689cadb 100644
--- a/flang/test/Lower/OpenACC/acc-kernels.f90
+++ b/flang/test/Lower/OpenACC/acc-kernels.f90
@@ -214,6 +214,9 @@ subroutine acc_kernels
 ! CHECK:      acc.kernels dataOperands(%[[COPYIN_A]], %[[COPYIN_B]], %[[COPYIN_C]] : !fir.ref<!fir.array<10x10xf32>>, !fir.ref<!fir.array<10x10xf32>>, !fir.ref<!fir.array<10x10xf32>>) {
 ! CHECK:        acc.terminator
 ! CHECK-NEXT: }{{$}}
+! CHECK:      acc.delete accPtr(%[[COPYIN_A]] : !fir.ref<!fir.array<10x10xf32>>) bounds(%{{.*}}, %{{.*}}) {dataClause = #acc<data_clause acc_copyin>, name = "a"}
+! CHECK:      acc.delete accPtr(%[[COPYIN_B]] : !fir.ref<!fir.array<10x10xf32>>) bounds(%{{.*}}, %{{.*}}) {dataClause = #acc<data_clause acc_copyin_readonly>, name = "b"}
+! CHECK:      acc.delete accPtr(%[[COPYIN_C]] : !fir.ref<!fir.array<10x10xf32>>) bounds(%{{.*}}, %{{.*}}) {dataClause = #acc<data_clause acc_copyin_readonly>, name = "c"}
 
   !$acc kernels copyout(a) copyout(zero: b) copyout(c)
   !$acc end kernels
diff --git a/flang/t...
[truncated]

Copy link
Contributor

@razvanlupusoru razvanlupusoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thank you!

@khaki3 khaki3 merged commit f3d3ec8 into llvm:main Jan 11, 2025
12 checks passed
BaiXilin pushed a commit to BaiXilin/llvm-fix-vnni-instr-types that referenced this pull request Jan 12, 2025
llvm#122539)

We are missing the deletion part of the copyin clause after a region or
in a destructor. This PR completes its implementation for data regions,
compute regions, and global declarations.

Example:
```f90
subroutine sub()
  real :: x(1:10)
  !$acc data copyin(x)
  !$acc end data
end subroutine sub
```
We are getting the following:
```mlir
    %5 = acc.copyin varPtr(%2#0 : !fir.ref<!fir.array<10xf32>>) bounds(%4) -> !fir.ref<!fir.array<10xf32>> {name = "x"}
    acc.data dataOperands(%5 : !fir.ref<!fir.array<10xf32>>) {
      acc.terminator
    }
    return
```
With this PR, we'll get:
```mlir
    %5 = acc.copyin varPtr(%2#0 : !fir.ref<!fir.array<10xf32>>) bounds(%4) -> !fir.ref<!fir.array<10xf32>> {name = "x"}
    acc.data dataOperands(%5 : !fir.ref<!fir.array<10xf32>>) {
      acc.terminator
    }
    acc.delete accPtr(%5 : !fir.ref<!fir.array<10xf32>>) bounds(%4) {dataClause = #acc<data_clause acc_copyin>, name = "x"}
    return
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category openacc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants