Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add test for cgroups_relative_memory #2686

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

omprakaash
Copy link
Contributor

@omprakaash omprakaash commented Feb 15, 2024

Adds an integration test for cgroups_relative_memory. The only difference from the non-relative test seems to be the cgroup_path of the specified spec while testing.

@codecov-commenter
Copy link

codecov-commenter commented Feb 15, 2024

Codecov Report

Merging #2686 (44e7b59) into main (04f8f2d) will decrease coverage by 0.37%.
Report is 64 commits behind head on main.
The diff coverage is 0.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2686      +/-   ##
==========================================
- Coverage   65.50%   65.13%   -0.37%     
==========================================
  Files         133      133              
  Lines       16916    17012      +96     
==========================================
  Hits        11081    11081              
- Misses       5835     5931      +96     

@lengrongfu
Copy link
Collaborator

LGTM

@YJDoc2
Copy link
Collaborator

YJDoc2 commented Feb 20, 2024

Hey, as @omprakaash had mentioned in discord we seem to be skipping a check for memory cgroups validation, as done here in runtime-tools. @tsturzl do you have any idea, why we might be intentionally skipping this, or we might have just missed it originally?

If so, @omprakaash may I ask you to add that in both this and the other test? (You can do it in separate PR, we can merge this one before it)

@tsturzl
Copy link
Collaborator

tsturzl commented Feb 20, 2024

@YJDoc2 Hmm, if you compare any of the integration tests to runtime-tools it seems like we're missing a similar step a few cgroups integration test I think I might have assumed there was something in check_container_created that was checking the container against the spec somehow. It's been 3 years, so I can't recall my thinking from back then.

A simple solution would be to just read the respective cgroups files and comparing them. It looks like cpu, memory, and network are all missing these checks.

There is a lot of opportunity here to create a read/write abstraction for cgroups files. Right now we just have a constant for each file name and use a write_cgroup_file utility function to write to a given path. If reading (or writing) cgroups file is going to become more useful outside the context of libcgroups crate then maybe we want a generalized abstraction to handle read/write? This would clean up the implementation and unit tests, as well as provide obvious utility to these integration tests in checking cgroups values without having to implement a lot of boilerplate. It's something I thought about introducing to make it easier to more easily swap out the kernel IO API being used which is something I've experimented with in the past. I don't know if that effort is too large to block fixing the issues with the current cgroups integration tests, but it's something I'd be interested in working on if no one else is immediately interested.

@omprakaash
Copy link
Contributor Author

I could just implement the additional checks for this test with current existing methods in this PR. Can the new abstractions be part of a separate PR later on ?

@tsturzl
Copy link
Collaborator

tsturzl commented Feb 21, 2024

@omprakaash that's probably a better immediate solution. If we can just implement something similar to what runtime tools is doing.

@omprakaash omprakaash force-pushed the main branch 4 times, most recently from 8332eb0 to 44e7b59 Compare February 25, 2024 23:37
@YJDoc2
Copy link
Collaborator

YJDoc2 commented Mar 11, 2024

I think I have some changes to request in this, I'll try to get to a full review soon.

Copy link
Collaborator

@YJDoc2 YJDoc2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, Apologies it took me long for the review. There are some changed needed, so I have added comments, please take a look.

Is this test supposed to be cgroups v1 specific? If not we should not use the v1 functions explicitly, and use a generic version that can work on both : v1 and v2.

Also, for the functions added in libcgroup : if we are using them in only tests, and they might not be as useful otherwise, I'd prefer to move them in the test crate itself rather then libcgroup.

@@ -41,6 +46,71 @@ pub fn list_supported_mount_points() -> Result<HashMap<ControllerType, PathBuf>,
Ok(mount_paths)
}

pub fn get_memory_data(pid: i32) -> Result<LinuxMemory, Box<dyn std::error::Error>> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not use dyn error, instead create a proper error type similar to how we have made specialized errors in enum for other functions and use that (or alternatively reuse some existing error type if that fits)

Another thing is that , I would like to confirm if we are similarly exposing the oci_spec objects from any other existing functions : as this fn is pub, it will be part of our public api, and if we are using another crate (oci_spec) type here, we need to be careful for this. If any other function already does this, then fine, else need to think on how to properly expose this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see any other functions that do this. Would it be a better idea to just move this into the testing crate.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. While it is not a bad idea to have this sort of function, I'd prefer not to add this to a core crate in a PR related to tests. If you are interested, we can open another issue and discuss the API for this ; but for this case, I'll request you to move the functions into the tests itself.

Comment on lines 59 to 67
let cgroup_memory_files = vec![
"memory.limit_in_bytes",
"memory.soft_limit_in_bytes",
"memory.memsw.limit_in_bytes",
"memory.kmem.limit_in_bytes",
"memory.kmem.tcp.limit_in_bytes",
"memory.swappiness",
"memory.oom_control",
];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to not hardcode these (I think we probably can't, but just confirming )?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't think so.

.parse::<u64>()?;
memory_data = memory_data.disable_oom_killer(oom_control == 1);
}
_ => {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as we are controlling the files by hardcoding, we should add unreachable!() in the remaining case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Will change it

Comment on lines +153 to +106
pub fn get_subsystem_path(pid: i32, subsystem: &str) -> Result<PathBuf, io::Error> {
let contents = fs::read_to_string(Path::new(&format!("/proc/{}/cgroup", pid)))
.unwrap_or_else(|_| panic!("failed to read /proc/{}/cgroup", pid));
for line in contents.lines() {
let parts: Vec<&str> = line.splitn(3, ':').collect();
if parts.len() < 3 {
continue;
}
let subparts: Vec<&str> = parts[1].split(',').collect();
for subpart in subparts {
if subpart == subsystem {
return Ok(PathBuf::from(parts[2].to_string()));
}
}
}
Err(io::Error::new(
io::ErrorKind::Other,
format!("subsystem {} not found", subsystem),
))
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we already have something similar to this existing somewhere in the libcgroup crate. Can you check and confirm if we really need a new function for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes there is, but it is tied to V1Manager and is for the current process only.


use crate::utils::{test_outside_container, test_utils::check_container_created};
use crate::utils::{self, test_outside_container};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we combine this import with the utils import above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep !

let spec = SpecBuilder::default()
.linux(
LinuxBuilder::default()
.cgroups_path(Path::new("/testdir/runtime-test/container").join(cgroup_name))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extract this path at top as a const

Comment on lines +52 to +55
fn can_run() -> bool {
Path::new(CGROUP_MEMORY_LIMIT).exists() && Path::new(CGROUP_MEMORY_SWAPPINESS).exists()
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is cgroups v1 specific test, we should also check that cgroups is v1 here

Comment on lines +7 to +18
let expected_memory = spec
.linux()
.as_ref()
.unwrap()
.resources()
.as_ref()
.unwrap()
.memory()
.as_ref();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do without the as_ref calls here, as spec is already a immutable ref? Please check once.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ,the linux() function call returns a &Option, so as_ref() was needed to make it a Option<&Linux>

Comment on lines 23 to 28
if memory.is_err() {
bail!("failed to get memory data: {:?}", memory.err().unwrap());
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if memory.is_err() {
bail!("failed to get memory data: {:?}", memory.err().unwrap());
}
if let Err(e) = memory {
bail!("failed to get memory data: {:?}", e);
}

Comment on lines 27 to 40
if expected_memory.limit().unwrap() != memory.as_ref().unwrap().limit().unwrap() {
bail!("expected memory {:?}, got {:?}", expected_memory, memory);
}

if expected_memory.swappiness().unwrap() != memory.as_ref().unwrap().swappiness().unwrap() {
bail!("expected memory {:?}, got {:?}", expected_memory, memory);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For both of these, can we extract the corresponding params in vars, and compare those? Additionally we should also report specifically what param mismatch was there, as potentially (although it shouldn't) some other param in memory data can be diff, making it confusing as to exactly why this failed. So in error message it should be something like expected memory limit {expected_limit} but got {actual_limit} instead .

@omprakaash
Copy link
Contributor Author

Thank you for the review ! Will make the required changes.

@YJDoc2
Copy link
Collaborator

YJDoc2 commented Apr 29, 2024

@omprakaash ping! There are still some review changes pending, so pinging you

Signed-off-by: omprakaash <[email protected]>
Signed-off-by: om prakaash <[email protected]>
@omprakaash
Copy link
Contributor Author

omprakaash commented May 27, 2024

Sorry for the delay, finally got a chance to work on this again.

@utam0k
Copy link
Member

utam0k commented May 28, 2024

@omprakaash There seems to be a conflict. Please need to rebase.

@YJDoc2
Copy link
Collaborator

YJDoc2 commented Sep 18, 2024

@omprakaash Ping!
You'll also need to rebase on latest main.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants