-
Notifications
You must be signed in to change notification settings - Fork 425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TEZ-3331: Add operation specific HDFS counters for Tez UI #379
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
@ayushtkn, can you please review this one? we're using this downstream for a long time, haven't ported back to upstream yet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanx Laszlo for the PR, minor stuff rest looks good
FILE_BYTES_READ("fileBytesRead"), | ||
FILE_BYTES_WRITTEN("fileBytesWritten"), | ||
|
||
// Additional counters from HADOOP-13305 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be some more counters as well, like OP_CREATE_NON_RECURSIVE
, OP_EXISTS
, OP_IS_FILE
etc
Should we include them as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I'm refreshing it according to the current state of CommonStatisticNames
if (!statisticUpdaters.containsKey(stats.getScheme())) { | ||
Map<String, FileSystemStatisticUpdater> updaterSet = new TreeMap<>(); | ||
statisticUpdaters.put(stats.getScheme(), updaterSet); | ||
} | ||
FileSystemStatisticUpdater updater = statisticUpdaters.get(stats.getScheme()) | ||
.computeIfAbsent(stats.getName(), k -> new FileSystemStatisticUpdater(tezCounters, stats)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks like computeIfAbsent
maybe can do
// Fetch or initialize the updater set for the scheme
Map<String, FileSystemStatisticUpdater> updaterSet = statisticUpdaters
.computeIfAbsent(stats.getScheme(), k -> new TreeMap<>());
// Fetch or create the updater for the specific statistic
FileSystemStatisticUpdater updater = updaterSet
.computeIfAbsent(stats.getName(), k -> new FileSystemStatisticUpdater(tezCounters, stats));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
absolutely, thanks!
|
||
private static MiniDFSCluster dfsCluster; | ||
|
||
private static Configuration conf = new Configuration(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be final
public static void setup() throws IOException { | ||
try { | ||
conf.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, TEST_ROOT_DIR); | ||
dfsCluster = new MiniDFSCluster.Builder(conf).numDataNodes(2).format(true).racks(null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.format(true).racks(null)
ain't required IMO, they are by default true
& null
FSDataOutputStream out = remoteFs.create(new Path("/tmp/foo/abc.txt")); | ||
out.writeBytes("xyz"); | ||
out.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can use this
DFSTestUtil.writeFile(remoteFs, new Path("/tmp/foo/abc.txt"), "xyz");
and below as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see there is a difference between how these approaches work, which is reflected in the counters
no matter which one we choose, the counters can be asserted accordingly
original approach (mkdir, create)
HDFS_BYTES_WRITTEN=3
HDFS_WRITE_OPS=2
HDFS_OP_CREATE=1
HDFS_OP_MKDIRS=1
DFSTestUtil.writeFile
HDFS_BYTES_WRITTEN=3
HDFS_READ_OPS=1
HDFS_WRITE_OPS=1
HDFS_OP_CREATE=1
HDFS_OP_GET_FILE_STATUS=1
|
||
private static final Logger LOG = LoggerFactory.getLogger( | ||
TestTaskCounterUpdater.class); | ||
private static Configuration conf = new Configuration(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be final
private static final Logger LOG = LoggerFactory.getLogger( | ||
TestTaskCounterUpdater.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
line break wasn't required I think
TaskCounterUpdater updater = new TaskCounterUpdater(counters, conf, "pid"); | ||
|
||
updater.updateCounters(); | ||
LOG.info("Counters: " + counters); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use {}
Assert.assertTrue("Counter not updated, old=" + oldVal | ||
+ ", new=" + cpuCounter.getValue(), cpuCounter.getValue() > oldVal); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we put some sleep before updateCounters
, just thinking in extreme conditions, this check won't go flaky, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure, it seems quite unlikely, I would let it be decided by the future's precommit runs (it will be obvious once it fails)
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avoid
OP_SET_ACL(CommonStatisticNames.OP_SET_ACL), | ||
OP_SET_OWNER(CommonStatisticNames.OP_SET_OWNER), | ||
OP_SET_PERMISSION(CommonStatisticNames.OP_SET_PERMISSION), | ||
OP_GET_FILE_BLOCK_LOCATIONS("op_get_file_block_locations"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this, no enum in Hadoop? where is this coming from then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's from DFSOpsCountStatistics, but I'm removing it as I intend to include and expose only stats that are defined in CommonStatisticNames
thanks @ayushtkn for the initial comments! I need to rework this area, I think some of the comments are related to the fact that I simply adapted an old patch, I need to think this over again! I'll let you know |
60db8f1
to
532adec
Compare
This comment was marked as outdated.
This comment was marked as outdated.
532adec
to
d642287
Compare
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanx @abstractdog for the changes, Minor Comments, rest looks good
// Additional counters from HADOOP-13305 | ||
// Additional counters from HADOOP-13305 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate line
// /** | ||
// * A Map where Key-> URIScheme and value->FileSystemStatisticUpdater | ||
// */ | ||
// private Map<String, FileSystemStatisticUpdater> statisticUpdaters = | ||
// new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you commenting this out? Can't we just delete it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh right, leftover, removing it
private Map<String, FileSystemStatisticUpdater> statisticUpdaters = | ||
new HashMap<String, FileSystemStatisticUpdater>(); | ||
private Map<String, Map<String, FileSystemStatisticUpdater>> statisticUpdaters = | ||
new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this we can make final
StorageStatistics stats = iter.next(); | ||
// Fetch or initialize the updater set for the scheme | ||
Map<String, FileSystemStatisticUpdater> updaterSet = statisticUpdaters | ||
.computeIfAbsent(stats.getScheme(), k -> new TreeMap<>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we using TreeMap
now? If I decode right, earlier it was HashMap
, it would be some cost using it right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, I cannot see the point of ordering while adding different statistic updaters for the same scheme, let me revert back to HashMap
|
||
DFSTestUtil.writeFile(remoteFs, new Path("/tmp/foo/abc.txt"), "xyz"); | ||
|
||
updater.updateCounters(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we first do
FileSystem.clearStatistics();
In case there is any test added in future which does FS operations, I believe this test will screw up. So better to reset everything to 0, before we start testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
taking care of that in a @before method
tested it with copying basicTest to basicTest2 and tried until the problems went away :)
thanks @ayushtkn, addressed your comments |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This a rebased, reworked version of the last patch on TEZ-3331:
https://issues.apache.org/jira/secure/attachment/12926702/TEZ-3331.8.patch