[BUG]: Allocation profiler can not record allocations larger than 512B #16300

@yanglong1010

Tracer Version(s)

Latest

Python Version(s)

All versions

Pip Version(s)

All versions

Bug Report

Hi,

The test code below allocates the same total amount of memory twice: once as many small allocations and once as a single large allocation.
When dd-trace-py is used to profile allocations, only the small allocations show up.

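a = []  # keeps every allocation alive while the profiler runs

def alloc(n):
    return [None] * n

x = 1024 * 1024

def small():
    # many small lists: x // 32 lists with 32 slots each
    for _ in range(x // 32):
        a.append(alloc(32))

def large():
    # one large list with x slots
    a.append(alloc(x))

def test():
    for _ in range(5):
        small()
        large()

test()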

[Image: dd-trace-py allocation profile]

When memray is used to profile the same code, both small and large allocations are visible.

[Image: memray allocation profile]

After investigating the code of dd-trace-py, CPython, and memray, I think the reason is that dd-trace-py only hooks PYMEM_DOMAIN_OBJ, and this domain is only responsible for objects no larger than 512 bytes.
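To put numbers on that limit, here is a rough sketch of the sizes involved in the test code. It only counts the pointer array behind each list (not the list header), and `SMALL_REQUEST_THRESHOLD` and `item_array_bytes` are names chosen for illustration, with the 512-byte value taken from the statement above.

```python
import struct

# Assumption: the small-allocation limit is 512 bytes, as described above.
SMALL_REQUEST_THRESHOLD = 512
# Size of one PyObject* slot on this build (8 on a typical 64-bit platform).
POINTER_SIZE = struct.calcsize("P")

def item_array_bytes(n):
    """Bytes needed for the pointer array backing a list of n items."""
    return n * POINTER_SIZE

x = 1024 * 1024
small_each = item_array_bytes(32)      # one alloc(32) call
small_total = small_each * (x // 32)   # all alloc(32) calls in small()
large_each = item_array_bytes(x)       # the single alloc(x) call in large()

print("small array fits under the limit:", small_each <= SMALL_REQUEST_THRESHOLD)
print("large array exceeds the limit:", large_each > SMALL_REQUEST_THRESHOLD)
print("same total bytes either way:", small_total == large_each)
```

So each call to small() and large() requests the same number of bytes for list storage, but only the small() requests stay under the threshold.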

I think PYMEM_DOMAIN_MEM and PYMEM_DOMAIN_RAW need to be hooked too, since they are responsible for large objects.
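As a cross-check that does not depend on either profiler, the standard library's tracemalloc installs its hooks on all three allocator domains, so it can serve as a reference for what a hook on every domain observes. The sketch below is only an illustration, not a proposed fix for dd-trace-py; `traced_growth` is a hypothetical helper name.

```python
import struct
import tracemalloc

POINTER_SIZE = struct.calcsize("P")  # size of one list slot on this build

def alloc(n):
    return [None] * n

def traced_growth(n):
    """Bytes of traced memory gained while one alloc(n) result is alive."""
    tracemalloc.start()
    try:
        before, _ = tracemalloc.get_traced_memory()
        block = alloc(n)
        after, _ = tracemalloc.get_traced_memory()
        del block
    finally:
        tracemalloc.stop()
    return after - before

large_growth = traced_growth(1024 * 1024)
small_growth = traced_growth(32)

# Both requests should be accounted for, including the large one.
print("large allocation traced:", large_growth >= 1024 * 1024 * POINTER_SIZE)
print("small allocation traced:", small_growth >= 32 * POINTER_SIZE)
```

If the large allocation is reported here but not in the dd-trace-py profile, that is consistent with the difference being which domains are hooked.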

The CPython source makes it clear that small and large allocations take different paths.
cpython/Objects/obmalloc.c

1958 static void *
1959 _PyObject_Malloc(void *ctx, size_t nbytes)
1960 {
1961     void* ptr = pymalloc_alloc(ctx, nbytes); // my comment: small allocations
1962     if (LIKELY(ptr != NULL)) {
1963         return ptr;
1964     }
1965
1966     ptr = PyMem_RawMalloc(nbytes);  // my comment: large allocations
1967     if (ptr != NULL) {
1968         raw_allocated_blocks++;
1969     }
1970     return ptr;
1971 }
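The two-step fallback in the C function above can be modeled in a few lines of Python. This is a toy model under stated assumptions: it only reproduces the size check (the real pymalloc_alloc can also fail for other reasons), the 512-byte limit is the one described earlier, and the returned strings are labels invented for illustration.

```python
# Assumption: requests above this size are not served by pymalloc.
SMALL_REQUEST_THRESHOLD = 512

def pymalloc_alloc(nbytes):
    """Toy stand-in: a label if pymalloc would serve the request, else None."""
    if nbytes > SMALL_REQUEST_THRESHOLD:
        return None  # corresponds to the C function returning NULL
    return "pymalloc"

def object_malloc(nbytes):
    """Mirror the control flow of _PyObject_Malloc: try pymalloc, then fall back."""
    ptr = pymalloc_alloc(nbytes)
    if ptr is not None:
        return ptr          # small allocations
    return "raw malloc"     # large allocations go to PyMem_RawMalloc

for size in (8, 512, 513, 8 * 1024 * 1024):
    print(size, "->", object_malloc(size))
```

The sizes 8 and 8 * 1024 * 1024 correspond to the two requests in the backtraces that follow, assuming 8-byte pointers.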

The stack for small allocations:
···
(gdb) bt
#0 _PyObject_Malloc (ctx=0x0, nbytes=8) at Objects/obmalloc.c:1960
#1 0x000000000044dd55 in list_new_prealloc (size=1) at Objects/listobject.c:201
#2 0x000000000044ea84 in list_repeat (a=0x7ffff06b8b80, n=1) at Objects/listobject.c:572
#3 0x0000000000429341 in _PyEval_EvalFrameDefault (tstate=, f=, throwflag=) at Python/ceval.c:2003
#4 0x00000000004f7eff in _PyEval_EvalFrame (throwflag=0, f=0x7ffff7fa4fc0, tstate=0x9c5440) at ./Include/internal/pycore_ceval.h:46
#5 _PyEval_Vector (tstate=0x9c5440, con=0x7ffff7f4be30, locals=, args=, argcount=, kwnames=) at Python/ceval.c:5067
#6 0x0000000000423a3f in _PyObject_VectorcallTstate (kwnames=, nargsf=, args=, callable=, tstate=) at ./Include/cpython/abstract.h:114
#7 PyObject_Vectorcall (kwnames=, nargsf=, args=, callable=) at ./Include/cpython/abstract.h:123
#8 call_function (tstate=, trace_info=, pp_stack=0x7fffffffdd70, oparg=, kwnames=) at Python/ceval.c:5893
#9 0x0000000000427aae in _PyEval_EvalFrameDefault (tstate=, f=, throwflag=) at Python/ceval.c:4213
#10 0x00000000004f752f in _PyEval_EvalFrame (throwflag=0, f=, tstate=0x9c5440) at ./Include/internal/pycore_ceval.h:46
#11 _PyEval_Vector (args=0x0, argcount=0, kwnames=0x0, locals=0x9c5440, con=0x7fffffffdea0, tstate=0x9c5440) at Python/ceval.c:5067
#12 PyEval_EvalCode (co=co@entry=0x7ffff06f26b0, globals=globals@entry=0x7ffff06ad300, locals=locals@entry=0x7ffff06ad300) at Python/ceval.c:1134
#13 0x000000000053f20d in run_eval_code_obj (locals=0x7ffff06ad300, globals=0x7ffff06ad300, co=0x7ffff06f26b0, tstate=0x9c5440) at Python/pythonrun.c:1291
#14 run_mod (mod=, filename=, globals=0x7ffff06ad300, locals=0x7ffff06ad300, flags=, arena=) at Python/pythonrun.c:1312
#15 0x0000000000540e98 in pyrun_file (flags=0x7fffffffe040, closeit=, locals=0x7ffff06ad300, globals=0x7ffff06ad300, start=257, filename=0x7ffff069b550, fp=) at Python/pythonrun.c:1208
#16 _PyRun_SimpleFileObject (fp=, filename=0x7ffff069b550, closeit=, flags=0x7fffffffe040) at Python/pythonrun.c:456
#17 0x0000000000541380 in _PyRun_AnyFileObject (fp=0xa300c0, filename=filename@entry=0x7ffff069b550, closeit=closeit@entry=1, flags=flags@entry=0x7fffffffe040) at Python/pythonrun.c:90
#18 0x000000000042ed7b in pymain_run_file_obj (skip_source_first_line=0, filename=0x7ffff069b550, program_name=0x7ffff06ef5a0) at Modules/main.c:353
#19 pymain_run_file (config=0x9a9a40) at Modules/main.c:372
#20 pymain_run_python (exitcode=0x7fffffffe030) at Modules/main.c:591
#21 Py_RunMain () at Modules/main.c:670
#22 0x000000000042f19f in pymain_main (args=0x7fffffffe140) at Modules/main.c:700
#23 Py_BytesMain (argc=, argv=) at Modules/main.c:724
#24 0x00007ffff6ed3555 in __libc_start_main () from /lib64/libc.so.6
#25 0x000000000042df49 in _start ()
···

The stack for large allocations:
···
Breakpoint 1, 0x00007ffff6f36740 in malloc () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff6f36740 in malloc () from /lib64/libc.so.6
#1 0x00000000004811cc in PyMem_RawMalloc (size=) at Objects/obmalloc.c:572
#2 _PyObject_Malloc (ctx=, nbytes=) at Objects/obmalloc.c:1966
#3 0x000000000044dd55 in list_new_prealloc (size=1048576) at Objects/listobject.c:201
#4 0x000000000044ea84 in list_repeat (a=0x7ffff7f9f300, n=1048576) at Objects/listobject.c:572
#5 0x0000000000429341 in _PyEval_EvalFrameDefault (tstate=, f=, throwflag=) at Python/ceval.c:2003
#6 0x00000000004f7eff in _PyEval_EvalFrame (throwflag=0, f=0x7ffff7fa4fc0, tstate=0x9c5440) at ./Include/internal/pycore_ceval.h:46
#7 _PyEval_Vector (tstate=0x9c5440, con=0x7ffff7f4be30, locals=, args=, argcount=, kwnames=) at Python/ceval.c:5067
#8 0x0000000000423a3f in _PyObject_VectorcallTstate (kwnames=, nargsf=, args=, callable=, tstate=) at ./Include/cpython/abstract.h:114
#9 PyObject_Vectorcall (kwnames=, nargsf=, args=, callable=) at ./Include/cpython/abstract.h:123
#10 call_function (tstate=, trace_info=, pp_stack=0x7fffffffdd70, oparg=, kwnames=) at Python/ceval.c:5893
#11 0x0000000000427aae in _PyEval_EvalFrameDefault (tstate=, f=, throwflag=) at Python/ceval.c:4213
#12 0x00000000004f752f in _PyEval_EvalFrame (throwflag=0, f=, tstate=0x9c5440) at ./Include/internal/pycore_ceval.h:46
#13 _PyEval_Vector (args=0x0, argcount=0, kwnames=0x0, locals=0x9c5440, con=0x7fffffffdea0, tstate=0x9c5440) at Python/ceval.c:5067
#14 PyEval_EvalCode (co=co@entry=0x7ffff06f26b0, globals=globals@entry=0x7ffff06ad300, locals=locals@entry=0x7ffff06ad300) at Python/ceval.c:1134
#15 0x000000000053f20d in run_eval_code_obj (locals=0x7ffff06ad300, globals=0x7ffff06ad300, co=0x7ffff06f26b0, tstate=0x9c5440) at Python/pythonrun.c:1291
#16 run_mod (mod=, filename=, globals=0x7ffff06ad300, locals=0x7ffff06ad300, flags=, arena=) at Python/pythonrun.c:1312
#17 0x0000000000540e98 in pyrun_file (flags=0x7fffffffe040, closeit=, locals=0x7ffff06ad300, globals=0x7ffff06ad300, start=257, filename=0x7ffff069b550, fp=) at Python/pythonrun.c:1208
#18 _PyRun_SimpleFileObject (fp=, filename=0x7ffff069b550, closeit=, flags=0x7fffffffe040) at Python/pythonrun.c:456
#19 0x0000000000541380 in _PyRun_AnyFileObject (fp=0xa300c0, filename=filename@entry=0x7ffff069b550, closeit=closeit@entry=1, flags=flags@entry=0x7fffffffe040) at Python/pythonrun.c:90
#20 0x000000000042ed7b in pymain_run_file_obj (skip_source_first_line=0, filename=0x7ffff069b550, program_name=0x7ffff06ef5a0) at Modules/main.c:353
#21 pymain_run_file (config=0x9a9a40) at Modules/main.c:372
#22 pymain_run_python (exitcode=0x7fffffffe030) at Modules/main.c:591
#23 Py_RunMain () at Modules/main.c:670
#24 0x000000000042f19f in pymain_main (args=0x7fffffffe140) at Modules/main.c:700
#25 Py_BytesMain (argc=, argv=) at Modules/main.c:724
#26 0x00007ffff6ed3555 in __libc_start_main () from /lib64/libc.so.6
#27 0x000000000042df49 in _start ()
···
