
Memory tracked and collected by Memray far exceeds the resource limits set by Docker #691

Closed
1 task done
jalr4ever opened this issue Oct 8, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@jalr4ever

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Hi, I am currently investigating a memory usage issue. I started a Python 3.8 container using the following command and profiled my program's memory usage with memray run --live:

docker run -itd \
  --name python38_limit \
  --cpus 8 \
  --memory 16g \
  --gpus all \
  python:3.8 /bin/bash

After a while, I found that Memray reported a maximum memory usage of 34G, which far exceeds the 16G memory limit of my container. Meanwhile, my task is still running.

[Screenshot: memray run --live output reporting a peak memory usage of ~34 GB]

When I use docker stats to check my container's resource usage, it shows far less than 34G:

CONTAINER ID   NAME             CPU %     MEM USAGE / LIMIT   MEM %     NET I/O          BLOCK I/O        PIDS
5b9183ee1a7c   python38_limit   216.84%   2.305GiB / 16GiB    14.40%    11.6MB / 365kB   1.47GB / 140MB   31

Why is this happening?

Expected Behavior

Memray should report the program's actual physical (resident) memory usage.

Steps To Reproduce

Create a container on a GPU server (Nvidia V100) with the following command:

docker run -itd \
  --name python38_limit \
  --cpus 8 \
  --memory 16g \
  --gpus all \
  python:3.8 /bin/bash

Then run the code below against the CSV file (20240821152459_100000_sample40000.csv):

from datetime import datetime

import pandas as pd
from sdv.evaluation.single_table import evaluate_quality, run_diagnostic
from sdv.metadata import SingleTableMetadata
from sdv.single_table import CTGANSynthesizer

diagnose = False
export = False
file_name = '../datasets/guests.csv'
current_time = datetime.now().strftime("%Y%m%d%H%M%S")
file_output_name = f'{file_name}_{current_time}_CTGAN_gen_synthetic.csv'

real_data = pd.read_csv(file_name)
print(real_data.head(10))

# Create metadata
metadata = SingleTableMetadata()
metadata.detect_from_dataframe(real_data)

# Train a CTGAN synthesizer on the real data
synthesizer = CTGANSynthesizer(
    metadata,
    epochs=300,
    verbose=True
)
synthesizer.fit(real_data)
# sample
synthetic_data = synthesizer.sample(num_rows=1000)
print(synthetic_data.head())
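
For completeness, the script was profiled with Memray's live mode as mentioned above; assuming the script is saved as repro.py (a hypothetical filename), the invocation would look like:

memray run --live repro.py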

Memray Version

1.14.0

Python Version

3.8

Operating System

Linux

Anything else?

No response

@jalr4ever jalr4ever added the bug Something isn't working label Oct 8, 2024
@pablogsal
Member

When a program allocates memory (e.g., using malloc/new in C++ or creating large objects in Python), it initially just reserves address space in the virtual memory. This virtual memory allocation doesn't immediately translate to physical memory usage - the pages are only mapped to physical memory when they're actually accessed (written to). This is known as "lazy allocation" or "demand paging."

In your case:

  • Memray reports 34GB because it's tracking all virtual memory allocations
  • Docker shows 2.3GB because that's the actual physical memory being used
  • Even though your code might request 34GB of address space, if it's not actively using all that memory, it won't count against Docker's 16GB limit

This is why your program continues running despite Memray showing memory usage exceeding the container limit. The virtual memory allocations only become an issue when the program actually tries to use (write to) more physical memory than is available.
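
To make the distinction concrete, here is a minimal Linux-only sketch (not part of the original thread) that reserves a large anonymous mapping and compares VmSize (virtual) against VmRSS (resident) from /proc/self/status before and after the pages are written. The 8 GiB / 256 MiB sizes are arbitrary illustration values.

# A minimal Linux-only sketch illustrating demand paging:
# reserving address space inflates virtual memory (VmSize) immediately, but
# resident memory (VmRSS) only grows once pages are actually written to.
import mmap


def vm_stats():
    """Return (VmSize, VmRSS) in kB, read from /proc/self/status."""
    stats = {}
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(("VmSize:", "VmRSS:")):
                key, value = line.split(":", 1)
                stats[key] = int(value.split()[0])  # value looks like "  1234 kB"
    return stats["VmSize"], stats["VmRSS"]


RESERVE = 8 * 1024**3   # reserve 8 GiB of anonymous address space
TOUCH = 256 * 1024**2   # but write to only 256 MiB of it

print("start       (VmSize, VmRSS) kB:", vm_stats())

buf = mmap.mmap(-1, RESERVE)             # allocation is virtual; no physical pages yet
print("after mmap  (VmSize, VmRSS) kB:", vm_stats())

buf[:TOUCH] = b"\x00" * TOUCH            # touching pages forces them into physical RAM
print("after write (VmSize, VmRSS) kB:", vm_stats())

Running this inside the container should show VmSize jumping by roughly 8 GiB right after the mmap call while VmRSS barely moves; only after the write does resident usage grow, and it is that resident usage which counts against Docker's cgroup memory limit.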

@pablogsal closed this as not planned (won't fix, can't repro, duplicate, stale) on Feb 13, 2025
2 participants