
Memory tracked and collected by Memray far exceeds the resource limits set by Docker #691

Closed
1 task done
jalr4ever opened this issue Oct 8, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@jalr4ever

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Hi, I am currently investigating a memory usage issue. I started a Python 3.8 container using the following command and profiled my program's memory usage with memray run --live:

docker run -itd \
  --name python38_limit \
  --cpus 8 \
  --memory 16g \
  --gpus all \
  python:3.8 /bin/bash

After a while, I found that Memray reported a maximum memory usage of 34G, which far exceeds the 16G memory limit of my container. Meanwhile, my task is still running.

[Screenshot: memray run --live output reporting a peak memory usage of ~34 GB]

When I use docker stats to check my container's resource usage, it shows far less than 34G:

CONTAINER ID   NAME             CPU %     MEM USAGE / LIMIT   MEM %     NET I/O          BLOCK I/O        PIDS
5b9183ee1a7c   python38_limit   216.84%   2.305GiB / 16GiB    14.40%    11.6MB / 365kB   1.47GB / 140MB   31

Why is this happening?

Expected Behavior

Memray should report the program's actual physical (resident) memory usage.

Steps To Reproduce

Create a container on a GPU server (Nvidia V100) with the following command:

docker run -itd \
  --name python38_limit \
  --cpus 8 \
  --memory 16g \
  --gpus all \
  python:3.8 /bin/bash

Then run the code below against the CSV file (20240821152459_100000_sample40000.csv):

from datetime import datetime

import pandas as pd
from sdv.evaluation.single_table import evaluate_quality, run_diagnostic
from sdv.metadata import SingleTableMetadata
from sdv.single_table import CTGANSynthesizer

diagnose = False
export = False
file_name = '../datasets/guests.csv'
current_time = datetime.now().strftime("%Y%m%d%H%M%S")
file_output_name = f'{file_name}_{current_time}_CTGAN_gen_synthetic.csv'

real_data = pd.read_csv(file_name)
print(real_data.head(10))

# Create metadata
metadata = SingleTableMetadata()
metadata.detect_from_dataframe(real_data)

# Train a CTGAN synthesizer on the real data
synthesizer = CTGANSynthesizer(
    metadata,
    epochs=300,
    verbose=True
)
synthesizer.fit(real_data)
# sample
synthetic_data = synthesizer.sample(num_rows=1000)
print(synthetic_data.head())
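
For completeness, the script was profiled with Memray's live mode as mentioned above; assuming the script is saved as repro.py (a hypothetical filename), the invocation would look like:

memray run --live repro.py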

Memray Version

1.14.0

Python Version

3.8

Operating System

Linux

Anything else?

No response

@jalr4ever jalr4ever added the bug Something isn't working label Oct 8, 2024
@pablogsal
Member

When a program allocates memory (e.g., using malloc/new in C++ or creating large objects in Python), it initially just reserves address space in the virtual memory. This virtual memory allocation doesn't immediately translate to physical memory usage - the pages are only mapped to physical memory when they're actually accessed (written to). This is known as "lazy allocation" or "demand paging."

In your case:

  • Memray reports 34GB because it's tracking all virtual memory allocations
  • Docker shows 2.3GB because that's the actual physical memory being used
  • Even though your code might request 34GB of address space, if it's not actively using all that memory, it won't count against Docker's 16GB limit

This is why your program continues running despite Memray showing memory usage exceeding the container limit. The virtual memory allocations only become an issue when the program actually tries to use (write to) more physical memory than is available.
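
To make the distinction concrete, here is a minimal Linux-only sketch (not part of the original thread) that reserves a large anonymous mapping and compares VmSize (virtual) against VmRSS (resident) from /proc/self/status before and after the pages are written. The 8 GiB / 256 MiB sizes are arbitrary illustration values.

# A minimal Linux-only sketch illustrating demand paging:
# reserving address space inflates virtual memory (VmSize) immediately, but
# resident memory (VmRSS) only grows once pages are actually written to.
import mmap


def vm_stats():
    """Return (VmSize, VmRSS) in kB, read from /proc/self/status."""
    stats = {}
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith(("VmSize:", "VmRSS:")):
                key, value = line.split(":", 1)
                stats[key] = int(value.split()[0])  # value looks like "  1234 kB"
    return stats["VmSize"], stats["VmRSS"]


RESERVE = 8 * 1024**3   # reserve 8 GiB of anonymous address space
TOUCH = 256 * 1024**2   # but write to only 256 MiB of it

print("start       (VmSize, VmRSS) kB:", vm_stats())

buf = mmap.mmap(-1, RESERVE)             # allocation is virtual; no physical pages yet
print("after mmap  (VmSize, VmRSS) kB:", vm_stats())

buf[:TOUCH] = b"\x00" * TOUCH            # touching pages forces them into physical RAM
print("after write (VmSize, VmRSS) kB:", vm_stats())

Running this inside the container should show VmSize jumping by roughly 8 GiB right after the mmap call while VmRSS barely moves; only after the write does resident usage grow, and it is that resident usage which counts against Docker's cgroup memory limit.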

@pablogsal closed this as not planned (won't fix, can't repro, duplicate, stale) on Feb 13, 2025
2 participants