spark-bench

Build

    $ git clone https://github.com/mrsrinivas/spark-bench.git
    $ cd spark-bench
    $ mvn install

Run

Run the DataGen Spark application on a YARN cluster:

    $ nohup spark2-submit \
        --master yarn \
        --executor-cores 2 \
        --num-executors 30 \
        --driver-memory 2g \
        --executor-memory 4g \
        --class com.mrsrinivas.app.DataGen \
        ./target/spark-bench-1.0-fat.jar \
        100G \
        30 \
        file:///scratch/username/datagen_in > spark-submit.log &
    
    [1] 11069
    nohup: ignoring input and redirecting stderr to stdout
    
    $ tail -f spark-submit.log

Once the job succeeds, the output directory should contain the following subdirectories:

    $ cd /scratch/username/datagen_in
    $ ls
    employees	stage-metrics
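
The check above can also be scripted. The following is a minimal sketch, not part of spark-bench itself: `check_datagen_output` is a hypothetical helper name, and it assumes the output was written to a local `file://` path and that `employees` and `stage-metrics` are the only subdirectories worth verifying.

```shell
# Hypothetical helper (not shipped with spark-bench): report whether the
# DataGen output directory contains the expected subdirectories.
check_datagen_output() {
    out_dir="$1"
    status=0
    for sub in employees stage-metrics; do
        if [ -d "$out_dir/$sub" ]; then
            echo "found: $sub"
        else
            echo "missing: $sub"
            status=1
        fi
    done
    # Non-zero exit status if any expected subdirectory is absent.
    return "$status"
}
```

For the run shown above, this would be invoked as `check_datagen_output /scratch/username/datagen_in`; a non-zero exit status means at least one subdirectory is missing.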
