Apache Tez

Apache Tez is a generic data-processing pipeline engine, envisioned as a low-level engine for higher-level abstractions such as Apache Hadoop MapReduce, Apache Pig, and Apache Hive.

At its heart, Tez is very simple and has just two components:

  • The data-processing pipeline engine, into which one can plug input, processing, and output implementations to perform arbitrary data processing. Every 'task' in Tez has the following:
      • Input to consume key/value pairs from.
      • Processor to process them.
      • Output to collect the processed key/value pairs.
  • A master for the data-processing application, with which one can assemble the 'tasks' described above into a task DAG to process data as desired. The generic master is implemented as an Apache Hadoop YARN ApplicationMaster.
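The Input / Processor / Output split and the chaining of tasks can be illustrated with a small, self-contained Java sketch. This is a toy model only: the `Input`, `Output`, `Processor`, `Buffer`, and `Task` types below are hypothetical stand-ins invented for this example and are not the Tez API, and the "DAG" is just a two-task chain run in dependency order inside one JVM rather than by a YARN ApplicationMaster.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the task structure described above.
// All types here are hypothetical; none of them belong to Tez.
final class TaskDagSketch {

    /** Source of key/value pairs for a task. */
    interface Input { Iterable<Map.Entry<String, Integer>> read(); }

    /** Sink that collects the key/value pairs a task produces. */
    interface Output { void collect(String key, int value); }

    /** Logic that turns the pairs read from an Input into pairs on an Output. */
    interface Processor { void process(Input in, Output out); }

    /** In-memory buffer: the Output of one task and the Input of the next. */
    static final class Buffer implements Input, Output {
        private final List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        @Override public Iterable<Map.Entry<String, Integer>> read() { return pairs; }
        @Override public void collect(String key, int value) {
            pairs.add(new SimpleEntry<>(key, value));
        }
    }

    /** One task: an Input, a Processor, and an Output. */
    static final class Task {
        private final Input in;
        private final Processor processor;
        private final Output out;
        Task(Input in, Processor processor, Output out) {
            this.in = in;
            this.processor = processor;
            this.out = out;
        }
        void run() { processor.process(in, out); }
    }

    /** Builds a two-task chain (tokenize, then sum) and runs it in order. */
    static Map<String, Integer> wordCount(List<String> lines) {
        Buffer source = new Buffer();   // input of the first task
        Buffer tokens = new Buffer();   // edge between the two tasks
        Buffer counts = new Buffer();   // output of the last task
        for (String line : lines) {
            source.collect(line, 1);
        }

        // First task: split each line into words and emit (word, 1).
        Processor tokenizer = (in, out) -> {
            for (Map.Entry<String, Integer> e : in.read()) {
                for (String word : e.getKey().trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        out.collect(word, 1);
                    }
                }
            }
        };

        // Second task: sum the values seen for each key.
        Processor summer = (in, out) -> {
            Map<String, Integer> totals = new TreeMap<>();
            for (Map.Entry<String, Integer> e : in.read()) {
                totals.merge(e.getKey(), e.getValue(), Integer::sum);
            }
            totals.forEach((k, v) -> out.collect(k, v));
        };

        // A chain is the simplest DAG; tasks are listed in dependency order.
        List<Task> dag = Arrays.asList(
                new Task(source, tokenizer, tokens),
                new Task(tokens, summer, counts));
        for (Task task : dag) {
            task.run();
        }

        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, Integer> e : counts.read()) {
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

Calling `TaskDagSketch.wordCount(Arrays.asList("a b a", "b c"))` runs the tokenizer task and then the summing task, with the `tokens` buffer playing the role of the edge between them. In Tez itself the master, not a local loop, decides where and when each task runs.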