[HuggingFace][Neuronx] Training - Optimum Neuron 0.0.25 - Neuron sdk 2.20.0 - Transformers to 4.43.2 #4365
"71670": "[Package: torch] Core torch package version 2.1 affected, cannot be changed in PyTorch 2.1 DLC advisory='A vulnerability in the PyTorch's torch.distributed.rpc framework, specifically in versions prior to 2.2.2, allows for remote code execution (RCE). The framework, which is used in distributed training scenarios, does not properly verify the functions being called during RPC (Remote Procedure Call) operations. This oversight permits attackers to execute arbitrary commands by leveraging built-in Python functions such as eval during multi-cpu RPC communication. The vulnerability arises from the lack of restriction on function calls when a worker node serializes and sends a PythonUDF (User Defined Function) to the master node, which then deserializes and executes the function without validation. This flaw can be exploited to compromise master nodes initiating distributed training, potentially leading to the theft of sensitive AI-related data.'",
"71671": "[Package: torch] Core torch package version 2.1 affected, cannot be changed in PyTorch 2.1 DLC advisory='PyTorch before v2.2.0 was discovered to contain a heap buffer overflow vulnerability in the component /runtime/vararg_functions.cpp. This vulnerability allows attackers to cause a Denial of Service (DoS) via a crafted input.'",
"71672": "[Package: torch] Core torch package version 2.1 affected, cannot be changed in PyTorch 2.1 DLC advisory='Pytorch before version v2.2.0 was discovered to contain a use-after-free vulnerability in torch/csrc/jit/mobile/interpreter.cpp.'",
"71064": "Affected versions of Requests, when making requests through a Requests `Session`, if the first request is made with `verify=False` to disable cert verification, all subsequent requests to the same host will continue to ignore cert verification regardless of changes to the value of `verify`. This behavior will continue for the lifecycle of the connection in the connection pool. Requests 2.32.0 fixes the issue, but versions 2.32.0 and 2.32.1 were yanked due to conflicts with CVE-2024-35195 mitigation."
Missing a `,` on this line. CodeBuild logs:
Traceback (most recent call last):
  File "src/main.py", line 132, in <module>
    main()
  File "src/main.py", line 128, in main
    image_builder(buildspec_file, image_types, device_types)
  File "/codebuild/output/src2844226945/src/github.com/aws/deep-learning-containers/src/image_builder.py", line 370, in image_builder
    pushed_images += process_images(parent_images, "Parent/Independent", buildspec_path=buildspec)
  File "/codebuild/output/src2844226945/src/github.com/aws/deep-learning-containers/src/image_builder.py", line 434, in process_images
    build_images(common_stage_image_list, make_dummy_boto_client=True)
  File "/codebuild/output/src2844226945/src/github.com/aws/deep-learning-containers/src/image_builder.py", line 581, in build_images
    FORMATTER.progress(THREADS)
  File "/codebuild/output/src2844226945/src/github.com/aws/deep-learning-containers/src/output.py", line 103, in progress
    output[i] += "." * 10 + constants.STATUS_MESSAGE[futures[image].result()]
  File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/local/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/codebuild/output/src2844226945/src/github.com/aws/deep-learning-containers/src/image.py", line 164, in build
    self.update_pre_build_configuration()
  File "/codebuild/output/src2844226945/src/github.com/aws/deep-learning-containers/src/common_stage_image.py", line 54, in update_pre_build_configuration
    generate_safety_report_for_image(
  File "/codebuild/output/src2844226945/src/github.com/aws/deep-learning-containers/src/utils.py", line 383, in generate_safety_report_for_image
    ignore_dict = get_safety_ignore_dict(
  File "/codebuild/output/src2844226945/src/github.com/aws/deep-learning-containers/src/utils.py", line 324, in get_safety_ignore_dict
    get_safety_ignore_dict_from_image_specific_safety_allowlists(image_uri)
  File "/codebuild/output/src2844226945/src/github.com/aws/deep-learning-containers/src/utils.py", line 265, in get_safety_ignore_dict_from_image_specific_safety_allowlists
    ignore_dict_from_image_specific_allowlist = json.load(f)
  File "/usr/local/lib/python3.8/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/local/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/usr/local/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/lib/python3.8/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 8 column 5 (char 2593)
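A quick local check can catch this kind of error before pushing. The following is only a minimal sketch, not part of the repository's tooling: `check_allowlist_json` is a hypothetical helper, and the two sample strings are made-up allowlists that differ only by the separating comma. It relies solely on the standard-library `json` module, whose `JSONDecodeError` exposes `lineno`, `colno`, and `msg`.

```python
import json


def check_allowlist_json(text):
    """Return (ok, message) for a JSON allowlist passed as a string.

    Hypothetical helper for illustration; not a function from this repository.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno are the 1-based position reported by the decoder.
        return False, f"line {err.lineno} column {err.colno}: {err.msg}"
    if not isinstance(data, dict):
        return False, "top-level value must be an object mapping IDs to reasons"
    return True, f"{len(data)} entries"


# Made-up two-entry allowlists: the first omits the comma between entries.
broken = '{\n  "71064": "reason A"\n  "71670": "reason B"\n}'
fixed = '{\n  "71064": "reason A",\n  "71670": "reason B"\n}'

print(check_allowlist_json(broken))
print(check_allowlist_json(fixed))
```

The decoder reports the position where it expected the delimiter, which is the start of the entry following the missing comma, so the line to fix is usually the one just before the reported line.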
@Captainia a Python vulnerability was detected for gevent.
It does not seem to have been detected for other images in this repository, so I don't know whether it can be ignored.
There is one new critical vulnerability in the image:
{"apparmor": [{"description": "It was discovered that the AppArmor policy compiler incorrectly generated looser restrictions than expected for rules allowing mount operations. A local attacker could possibly use this to bypass AppArmor restrictions in applications where some mount operations were permitted.", "vulnerability_id": "CVE-2016-1585", "name": "CVE-2016-1585", "package_name": "apparmor", "package_details": {"file_path": null, "name": "apparmor", "package_manager": "OS", "version": "2.13.3", "release": "7ubuntu5.3build2"}, "remediation": {"recommendation": {"text": "None Provided"}}, "cvss_v3_score": 9.8, "cvss_v30_score": 9.8, "cvss_v31_score": 0.0, "cvss_v2_score": 0.0, "cvss_v3_severity": "CRITICAL", "source_url": "https://people.canonical.com/~ubuntu-security/cve/2016/CVE-2016-1585.html", "source": "UBUNTU_CVE", "severity": "CRITICAL", "status": "ACTIVE", "title": "CVE-2016-1585 - apparmor", "reason_to_ignore": "N/A"}]}
Shall we add it to `apt-get update && apt-get upgrade` in the Dockerfile?
It is already installed in the Dockerfile.
It seems this vulnerability is not exploitable in a Docker environment, but that is worth confirming, and then we can add it to the ignore list.
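For triage, the scan output quoted above can be filtered down to the critical findings that are not yet on an ignore list. This is a rough sketch under stated assumptions: `critical_findings` is a hypothetical helper, the report shape (package name mapped to a list of findings) is taken from the quoted JSON, and the sample below keeps only a few of its fields.

```python
import json


def critical_findings(report, allowlist_ids=()):
    """List (package, vulnerability_id, cvss_v3_score) for CRITICAL findings
    whose ID is not in allowlist_ids.

    Hypothetical helper; the report layout is assumed from the quoted scan output.
    """
    found = []
    for package, findings in report.items():
        for finding in findings:
            if finding.get("severity") != "CRITICAL":
                continue
            vuln_id = finding.get("vulnerability_id")
            if vuln_id in allowlist_ids:
                continue
            found.append((package, vuln_id, finding.get("cvss_v3_score")))
    return found


# Trimmed copy of the finding quoted above (only a subset of its fields).
report = json.loads(
    '{"apparmor": [{"vulnerability_id": "CVE-2016-1585", '
    '"package_name": "apparmor", "cvss_v3_score": 9.8, '
    '"severity": "CRITICAL", "status": "ACTIVE"}]}'
)

print(critical_findings(report))
print(critical_findings(report, allowlist_ids={"CVE-2016-1585"}))
```

Once the ID is passed as allowlisted, the second call returns an empty list, which mirrors what adding the CVE to the ignore list is meant to achieve.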
Could you update gevent in the same way as in the PR here? #4367 (comment)
This PR looks good. Feel free to revert the changes in `dlc_developer_config.toml` after the hook passes.
There is one vulnerability.
Please patch it or add it to the ignore list.
It seems there is another vulnerability ID associated with werkzeug.
Could you add it as well?
Hi David, thanks for making the updates. It looks like the image has an incompatible version of a package installed; could you take a look?
One last vulnerability and we should be good to go.
The other CodeBuild jobs are newly added, so once the security test passes we can merge.
These vulnerabilities were already added for the pytorch training DLCs.
There's one remaining mlflow vulnerability which you might need to allowlist. Can you please address that?
I just added one... It is impossible to complete this if a new vuln pops out every time I push! You must improve this process somehow!
This reverts commit 7cff21f.
Issue #4307
Description
This PR creates Hugging Face's PyTorch DLC for training on neuron-v2 devices (Trainium).
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.