gemm_operation_profiler.cu #1848

Closed
scm-later opened this issue Sep 30, 2024 · 1 comment
Comments

@scm-later

What is your question?
I ported serial split-K from CUTLASS 2.x to CUTLASS 3.x, passing slice_k as an extra parameter of problem_size. Now I want to test it with cutlass_profiler. Should I add a parameter to problem_size in gemm_operation_profiler, or use the existing split_k_slices parameter? The catch is that split_k_slices is not a parameter of problem_size.

/***************************************************************************************************
 * Copyright (c) 2017 - 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: BSD-3-Clause
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions are met:
 *
 * 1. Redistributions of source code must retain the above copyright notice, this
 * list of conditions and the following disclaimer.
 *
 * 2. Redistributions in binary form must reproduce the above copyright notice,
 * this list of conditions and the following disclaimer in the documentation
 * and/or other materials provided with the distribution.
 *
 * 3. Neither the name of the copyright holder nor the names of its
 * contributors may be used to endorse or promote products derived from
 * this software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 * CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 **************************************************************************************************/
/*! \file
\brief Tests for device-wide GEMM interface
*/

#include <cstdlib>  // atoi

#include "cutlass/cutlass.h"
#include "cute/tensor.hpp"
#include "cute/atom/mma_atom.hpp"

#include "cutlass/numeric_types.h"

#include "cutlass/gemm/device/gemm_universal_adapter.h"
#include "default_gemm_configuration.hpp"

#include "common/gemm_test_3x.hpp"

using namespace cute;

/////////////////////////////////////////////////////////////////////////////////////////////////
int main(int argc, char* argv[]) {
  // Problem extents and split-K slice count, overridable from the command line:
  //   argv[1] = m, argv[2] = n, argv[3] = k, argv[4] = slice_k
  int m = 256;
  int n = 256;
  int k = 32;
  int slice_k = 1;
  if (argc >= 2) { m = atoi(argv[1]); }
  if (argc >= 3) { n = atoi(argv[2]); }
  if (argc >= 4) { k = atoi(argv[3]); }
  if (argc >= 5) { slice_k = atoi(argv[4]); }

  using Config = cutlass::gemm::device::DefaultGemmConfigurationToCutlass3Types<
      cutlass::arch::OpClassTensorOp, cutlass::arch::Sm75,
      half_t, cutlass::layout::ColumnMajor,
      half_t, cutlass::layout::RowMajor,
      half_t, cutlass::layout::ColumnMajor,
      float>;

  // Rank-4 problem shape; the fourth mode is used here for slice_k.
  using GemmKernel = cutlass::gemm::kernel::GemmUniversal<
      Shape<int,int,int,int>,
      Config::CollectiveMainloop,
      Config::CollectiveEpilogue
  >;

  using Gemm = cutlass::gemm::device::GemmUniversalAdapter<GemmKernel>;
  using ElementScalar = typename Gemm::EpilogueOutputOp::ElementScalar;
  using ProblemShapeType = typename Gemm::GemmKernel::ProblemShape;
  using Testbed = test::gemm::device::Testbed3x<Gemm, cutlass::epilogue::thread::Identity>;

  Testbed testbed = {};
  ProblemShapeType problem_size = ProblemShapeType{m, n, k, slice_k};
  double alpha = 1.0;
  double beta = 0.0;

  // Correctness check against the reference implementation.
  bool passed = testbed.run(
      problem_size,
      cutlass::from_real<ElementScalar>(alpha),
      cutlass::from_real<ElementScalar>(beta));

  // Optional performance run, disabled by default.
  int iterations = 100;
  char perf = 'N';
  if (perf == 'Y') {
    test::gemm::device::TestGemm3x<Gemm>(iterations);
  }

  return passed ? 0 : 1;
}

/////////////////////////////////////////////////////////////////////////////////////////////////

@thakkarV
Collaborator

> but split_k_slices is not a parameter of problem_size

The "right" way to do split-K in 3.x is to write a custom tile scheduler similar to StreamK. The scheduler's arguments would then contain the number of splits, so problem_size stays unchanged.

Aside: our StreamK scheduler supports a pure split-K mode as well. Curious why you are implementing your own split-K-only scheduler when we support it out of the box?
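
To illustrate the separation suggested above, here is a minimal CPU-only sketch in which the problem shape stays (M, N, K) and the split count travels in a separate scheduler-arguments struct. Every name in it (`ProblemShape`, `SplitKSchedulerArgs`, `KRange`, `k_range_for_split`, `gemm_split_k_serial`) is hypothetical and chosen for illustration; none of it is CUTLASS API, and it says nothing about how the StreamK scheduler partitions work internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; these are not CUTLASS types.
struct ProblemShape { int m, n, k; };      // GEMM extents only
struct SplitKSchedulerArgs { int splits; };  // scheduling knob, kept out of the shape
struct KRange { int begin, end; };         // half-open range [begin, end)

// Map a split index to its slice of K. Any remainder is spread over the
// first (k % splits) slices so slice lengths differ by at most one.
inline KRange k_range_for_split(int k, int splits, int split_idx) {
  assert(splits >= 1 && split_idx >= 0 && split_idx < splits);
  int base = k / splits;
  int rem = k % splits;
  int begin = split_idx * base + std::min(split_idx, rem);
  int len = base + (split_idx < rem ? 1 : 0);
  return KRange{begin, begin + len};
}

// Serial split-K on the CPU: D = A * B with row-major A (m x k), B (k x n).
// Each split computes a partial product over its K slice and accumulates
// it into D, one split after another.
inline std::vector<float> gemm_split_k_serial(const ProblemShape& p,
                                              const SplitKSchedulerArgs& args,
                                              const std::vector<float>& A,
                                              const std::vector<float>& B) {
  std::vector<float> D(static_cast<std::size_t>(p.m) * p.n, 0.0f);
  for (int s = 0; s < args.splits; ++s) {
    KRange r = k_range_for_split(p.k, args.splits, s);
    for (int i = 0; i < p.m; ++i) {
      for (int j = 0; j < p.n; ++j) {
        float partial = 0.0f;
        for (int kk = r.begin; kk < r.end; ++kk) {
          partial += A[i * p.k + kk] * B[kk * p.n + j];
        }
        D[i * p.n + j] += partial;  // serial reduction across splits
      }
    }
  }
  return D;
}
```

With this layout, changing the number of splits never touches the problem shape: calling `gemm_split_k_serial(p, SplitKSchedulerArgs{1}, A, B)` and `gemm_split_k_serial(p, SplitKSchedulerArgs{4}, A, B)` describes the same GEMM with a different schedule.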

@thakkarV closed this as completed on Nov 6, 2024