Refactor merge.py and add tests for it#165
Conversation
so it's easier to read what's going on
Add tests for some of the important functions in merge.py. Since I wrote the tests for pytest, I switched to using pytest as the runner for the rest of the tests.
5fbc5c6 to
6d219b4
Compare
QuanMPhm
left a comment
There was a problem hiding this comment.
I have a few small questions before I approve
| if cluster_name is None: | ||
| cluster_name = metrics_from_file.get("cluster_name") |
There was a problem hiding this comment.
Is there any concern that cluster_name is not being checked? What if the provided files are from different clusters. It seems this behavior has been in the code prior to this refactoring, but wanted to ask just in case
There was a problem hiding this comment.
I don't think we checked that cluster_name could be different in different files, so this behavior is unchanged. But it doesn't hurt to add that additional check. I'll add that in a different PR.
openshift_metrics/merge.py
Outdated
| gpu_a100sxm4=rates_data.get_value_at( | ||
| cpu=rates_data.get_value_at("CPU SU Rate", report_month, Decimal), # type: ignore | ||
| gpu_a100=rates_data.get_value_at("GPUA100 SU Rate", report_month, Decimal), # type: ignore | ||
| gpu_a100sxm4=rates_data.get_value_at( # type: ignore |
There was a problem hiding this comment.
Are you using a linter additional to the ruff that we use in the CI? I didn't got any pre-commit errors when removing these comments
There was a problem hiding this comment.
I think it was vscode yelling at me so I put these, but I am going to remove these.
| with open(file, "r") as jsonfile: | ||
| metrics_from_file = json.load(jsonfile) | ||
| cpu_request_metrics = metrics_from_file["cpu_metrics"] | ||
| memory_request_metrics = metrics_from_file["memory_metrics"] | ||
| gpu_request_metrics = metrics_from_file.get("gpu_metrics", None) | ||
| processor.merge_metrics("cpu_request", cpu_request_metrics) | ||
| processor.merge_metrics("memory_request", memory_request_metrics) | ||
| if gpu_request_metrics is not None: | ||
| processor.merge_metrics("gpu_request", gpu_request_metrics) |
There was a problem hiding this comment.
Minor suggestion, but I think this is more concise:
| with open(file, "r") as jsonfile: | |
| metrics_from_file = json.load(jsonfile) | |
| cpu_request_metrics = metrics_from_file["cpu_metrics"] | |
| memory_request_metrics = metrics_from_file["memory_metrics"] | |
| gpu_request_metrics = metrics_from_file.get("gpu_metrics", None) | |
| processor.merge_metrics("cpu_request", cpu_request_metrics) | |
| processor.merge_metrics("memory_request", memory_request_metrics) | |
| if gpu_request_metrics is not None: | |
| processor.merge_metrics("gpu_request", gpu_request_metrics) | |
| for resource in ["cpu_metrics", "memory_metrics", "gpu_metrics"]: | |
| if resource == "gpu_metrics": | |
| if gpu_request_metrics := metrics_from_file.get(resource): | |
| processor.merge_metrics(resource, gpu_request_metrics) | |
| else: | |
| request_metrics = metrics_from_file[resource] | |
| processor.merge_metrics(resource, request_metrics) |
If cpu_metrics and memory_metrics is always present, is it fine to make the loop even simpler?
for resource in ["cpu_metrics", "memory_metrics", "gpu_metrics"]:
if request_metrics := metrics_from_file.get(resource):
processor.merge_metrics(resource, request_metrics)There was a problem hiding this comment.
uh, while this is a good suggestion, due to the old clumsy naming of things it'll break stuff. See, the files put things in cpu_metrics but the processors call it cpu_request so our neat little loop won't work. I am going to leave this as is.
There was a problem hiding this comment.
Ah sorry, the strings looked similar and I thought they were the same
| return processor | ||
|
|
||
|
|
||
| def load_metadata(files: List[str]) -> MetricsMetadata: |
There was a problem hiding this comment.
I think I could load data and metadata in a single loop instead of loading files twice. And for that I reason I don't like what I've done. I am going to refactor it again later.
This refactors merge.py a bit, by moving some things into functions. Additionally it adds some basic tests and this time I switched to using pytest.
I ended up working on this because I realized I was adding more stuff in https://github.com/CCI-MOC/openshift-usage-scripts/pull/164/files and there were no tests.