Makefile for cross-compile for arm mali gpu #22

Open: 4 commits to be merged into base: master
52 changes: 52 additions & 0 deletions Makefile
@@ -0,0 +1,52 @@
SRCDIR=src

HDR=$(wildcard src/*.hpp)
OBJS = $(patsubst %.cpp, %.o, $(wildcard src/*.cpp))

PLATFORM=$(shell uname -s)

ifeq ($(PLATFORM),Darwin)
LDLIBS=-framework OpenCL
else
LDLIBS=-lOpenCL
endif

#CROSS_COMPILE=/usr/local/bin/aarch64-linux-gnu-

ifneq ($(CROSS_COMPILE),)
#LDFLAGS+=-L/usr/local/gcc_9/lib
#CXXFLAGS+=-I/usr/local/include/OpenCL-Headers
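else ifeq ($(PLATFORM),Linux)
# -L is a linker search path, so it belongs in LDFLAGS (CXXFLAGS is only used with -c).
LDFLAGS+=-L./src/OpenCL/lib
else ifeq ($(PLATFORM),Android)
# NOTE: `uname -s` typically reports Linux on Android, so this branch may never be taken.
LDFLAGS+=-L/system/vendor/lib64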
endif

LDFLAGS+=-L$(SRCDIR)/OpenCL/lib
CXXFLAGS+=-I$(SRCDIR)/OpenCL/include
CXXFLAGS+=-std=c++17 -pthread -O -Wno-comment
CXX=$(CROSS_COMPILE)g++
LD=$(CROSS_COMPILE)ld
AR=$(CROSS_COMPILE)ar
CC=$(CROSS_COMPILE)gcc

# bin/OpenCL-Benchmark is a real file, so it is not declared phony (that would force a relink on every run).
.PHONY: all clean run

all: bin/OpenCL-Benchmark

%.o: %.cpp $(HDR) Makefile
	@printf " CXX $<\n"
	@$(CXX) $(CXXFLAGS) -c $< -o $@

bin/OpenCL-Benchmark: $(OBJS)
	@mkdir -p bin
	@printf " LINK $@\n"
	@$(CXX) -o $@ $^ $(LDFLAGS) $(LDLIBS)

clean:
	@printf " CLEAN bin/OpenCL-Benchmark\n"
	@rm -f $(OBJS)
	@rm -rf bin

run: bin/OpenCL-Benchmark
	@bin/OpenCL-Benchmark
42 changes: 41 additions & 1 deletion README.md
@@ -44,12 +44,52 @@ Works with any GPU in Windows, Linux, macOS and Android.
bin/OpenCL-Benchmark
```

### Cross Compile
- In the Makefile, set `CROSS_COMPILE` to the cross toolchain prefix, and point `CXXFLAGS` and `LDFLAGS` at the target's OpenCL header files and OpenCL shared library (`.so` file). Alternatively, export these variables before running `make`.
- Compile
```
make
```
- Copy the resulting binary to the target board and run it there as described above
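
The cross-compile steps above can be sketched as a small shell helper. This is only an illustration under stated assumptions: `cross_make` is a hypothetical name that is not part of this project, and the `aarch64-linux-gnu-` prefix mentioned below is taken from the commented-out example in the Makefile and may differ on your host.

```shell
# Hypothetical helper (not part of the project): build with a given
# toolchain prefix, for example "aarch64-linux-gnu-".
cross_make() {
    prefix="$1"
    shift
    # Fail early if the cross compiler cannot be found.
    if ! command -v "${prefix}g++" >/dev/null 2>&1; then
        echo "cross compiler ${prefix}g++ not found" >&2
        return 1
    fi
    # A variable given on the make command line overrides the
    # assignment of the same name inside the Makefile.
    make CROSS_COMPILE="$prefix" "$@"
}
```

For instance, `cross_make aarch64-linux-gnu-` builds with that toolchain if it is installed. If the OpenCL headers and library live elsewhere, exporting `CXXFLAGS` and `LDFLAGS` in the environment lets the Makefile's `+=` lines append to them, whereas passing `CXXFLAGS=...` on the `make` command line would replace the Makefile's own additions such as `-std=c++17`.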

### Run only for a specified list of devices
- call `bin\OpenCL-Benchmark.exe 0 2 5` (Windows) or `bin/OpenCL-Benchmark 0 2 5` (Linux/macOS) with the number(s) being the device IDs to be benchmarked



## Examples
```
.-----------------------------------------------------------------------------.
|----------------.------------------------------------------------------------|
| Device ID 0 | Mali-G78AE r0p1 |
|----------------'------------------------------------------------------------|
|----------------.------------------------------------------------------------|
| Device ID | 0 |
| Device Name | Mali-G78AE r0p1 |
| Device Vendor | ARM |
| Device Driver | 3.0 (Linux) |
| OpenCL Version | OpenCL C 3.0 |
| Compute Units | 2 at 800 MHz (16 cores, 0.026 TFLOPs/s) |
| Memory, Cache | 6975 MB RAM, 256 KB global / 32 KB local |
| Buffer Limits | 4095 MB global, 4193792 KB constant |
|----------------'------------------------------------------------------------|
| Info: OpenCL C code successfully compiled. |
| FP64 compute not supported |
| FP32 compute 0.051 TFLOPs/s ( 2x ) |
| FP16 compute 0.102 TFLOPs/s ( 4x ) |
| INT64 compute 0.006 TIOPs/s (1/4 ) |
| INT32 compute 0.025 TIOPs/s ( 1x ) |
| INT16 compute 0.050 TIOPs/s ( 2x ) |
| INT8 compute 0.101 TIOPs/s ( 4x ) |
| Memory Bandwidth ( coalesced read ) 11.67 GB/s |
| Memory Bandwidth ( coalesced write) 12.36 GB/s |
| Memory Bandwidth (misaligned read ) 5.79 GB/s |
| Memory Bandwidth (misaligned write) 5.68 GB/s |
|-----------------------------------------------------------------------------|
'-----------------------------------------------------------------------------'
```

```
.-----------------------------------------------------------------------------.
|----------------.------------------------------------------------------------|
@@ -387,4 +427,4 @@ Works with any GPU in Windows, Linux, macOS and Android.
|-----------------------------------------------------------------------------|
| Done. Press Enter to exit. |
'-----------------------------------------------------------------------------'
```
```