-
Notifications
You must be signed in to change notification settings - Fork 397
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inverted flame graph #439
Inverted flame graph #439
Conversation
src/memray/reporters/flamegraph.py
Outdated
@@ -75,6 +75,21 @@ def __init__( | |||
self.data = data | |||
self.memory_records = memory_records | |||
|
|||
@classmethod | |||
def update_import_system_frames_for_inverted_flamegraph( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may have two problems:
- For deep trees, recursion can be quite slow.
- In Python there is a recursion limit to protect from stack overflows. This means that if the tree is deep enough this may blow up with a
RecursionError
.
We could transform this function into something iterative that keeps the stack manually but on the other hand we are already limiting the stack depth here:
memray/src/memray/reporters/flamegraph.py
Lines 164 to 167 in 0154228
if index - num_skipped_frames > MAX_STACKS: | |
current_frame["name"] = "<STACK TOO DEEP>" | |
current_frame["location"] = ["...", "...", 0] | |
break |
It may be good to add a test to see that we don't crash for a stack that's too deep. The test would just see what's the generated data for an input that's very nested.
src/memray/reporters/flamegraph.py
Outdated
@@ -178,6 +195,11 @@ def _from_any_snapshot( | |||
for interval in intervals | |||
) | |||
|
|||
if inverted: | |||
frames[0][ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the assignment necessary? update_import_system_frames_for_inverted_flamegraph
does this:
nodes[node_index]["import_system"] |= result
and on the first call node_index
is 0 so it should update frames[0] already, no?
@@ -28,7 +28,9 @@ def include_file(name: str) -> Markup: | |||
return env | |||
|
|||
|
|||
def get_report_title(*, kind: str, show_memory_leaks: bool) -> str: | |||
def get_report_title(*, kind: str, show_memory_leaks: bool, inverted: bool) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you missed some updates of calls to this function for example here: tests/unit/test_templates.py
node.value += same_node.value; | ||
} | ||
|
||
root.children = root.children.filter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hummmm. isn't this invalidating the array root.children
and therefore making the indexing incorrect?
@@ -192,6 +192,32 @@ function filterFramesByFunc(root, func) { | |||
return _.defaults({ children: children }, root); | |||
} | |||
|
|||
function mergeNodes(root) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function will likely need some tests in src/memray/reporters/assets/common.test.js
553d821
to
2abf102
Compare
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #439 +/- ##
==========================================
+ Coverage 91.72% 91.89% +0.16%
==========================================
Files 90 90
Lines 10532 10766 +234
Branches 1450 1472 +22
==========================================
+ Hits 9661 9893 +232
- Misses 868 871 +3
+ Partials 3 2 -1
Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry. |
2abf102
to
a65cd31
Compare
src/memray/reporters/flamegraph.py
Outdated
from typing import TextIO | ||
from typing import Tuple | ||
from typing import TypedDict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI is failing because of this. We still support 3.7, and typing.TypedDict
wasn't added until Python 3.8. We either need to use from typing_extensions import TypedDict
on Python 3.7, or drop support for Python 3.7 - or maybe make this conditional on Python version... or maybe only use TypedDict
when typing.TYPE_CHECKING
is set.
Or maybe just use a dict for now, not worry about typing it since it's an internal API, and add an issue to remind us to add typing for it after we drop 3.7 support 😄
news/439.feature.rst
Outdated
@@ -0,0 +1,4 @@ | |||
Add a new ``inverted`` flag to ``memray flamegraph``. In an inverted flame graph, the roots are the functions that allocated memory, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a new ``inverted`` flag to ``memray flamegraph``. In an inverted flame graph, the roots are the functions that allocated memory, | |
Add a new ``--inverted`` flag to ``memray flamegraph``. In an inverted flame graph, the roots are the functions that allocated memory, |
news/439.feature.rst
Outdated
@@ -0,0 +1,4 @@ | |||
Add a new ``inverted`` flag to ``memray flamegraph``. In an inverted flame graph, the roots are the functions that allocated memory, | |||
and the children of any given node represent the percentage of any of that node's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and the children of any given node represent the percentage of any of that node's | |
and the children of any given node represent the percentage of that node's |
news/439.feature.rst
Outdated
@@ -0,0 +1,4 @@ | |||
Add a new ``inverted`` flag to ``memray flamegraph``. In an inverted flame graph, the roots are the functions that allocated memory, | |||
and the children of any given node represent the percentage of any of that node's | |||
allocations that can be attributed to a particular caller.The inverted flame graph is very helpful in analyzing where memory is being |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allocations that can be attributed to a particular caller.The inverted flame graph is very helpful in analyzing where memory is being | |
allocations that can be attributed to a particular caller. The inverted flame graph is very helpful in analyzing where memory is being |
src/memray/_ipython/flamegraph.py
Outdated
@@ -90,6 +90,13 @@ def argument_parser() -> argparse.ArgumentParser: | |||
default=False, | |||
) | |||
|
|||
parser.add_argument( | |||
"--inverted", | |||
help="Invert flame graph", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should explain this in more detail, I think. Something like:
help="Invert flame graph", | |
help=( | |
"Invert the flame graph: " | |
"use allocators as roots instead of thread entry points", | |
) |
src/memray/reporters/flamegraph.py
Outdated
@@ -132,14 +178,15 @@ def _from_any_snapshot( | |||
) | |||
num_skipped_frames = 0 | |||
is_import_system = False | |||
for index, stack_frame in enumerate(reversed(stack)): | |||
stack_it = enumerate((stack)) if inverted else enumerate(reversed(stack)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stack_it = enumerate((stack)) if inverted else enumerate(reversed(stack)) | |
stack_it = enumerate(stack) if inverted else enumerate(reversed(stack)) |
src/memray/reporters/flamegraph.py
Outdated
result = cls.update_import_system_frames_for_inverted_flamegraph( | ||
nodes, child_index | ||
) | ||
nodes[node_index]["import_system"] |= result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The |=
here is essentially saying: if any of this node's callers are part of the import system, then this is an import system node. That doesn't seem correct to me. Consider a test case like this:
def test_inverted_allocator_sometimes_called_under_importlib(self):
# GIVEN
peak_allocations = [
MockAllocationRecord(
tid=1,
address=0x1000000,
size=1024,
allocator=AllocatorType.MALLOC,
stack_id=1,
n_allocations=1,
_stack=[
("malloc", "malloc.c", 12),
("parent_one", "fun.py", 8),
("main", "main.py", 4),
],
),
MockAllocationRecord(
tid=1,
address=0x1000000,
size=1024,
allocator=AllocatorType.MALLOC,
stack_id=1,
n_allocations=1,
_stack=[
("malloc", "malloc.c", 12),
("importer", "<frozen importlib>", 4),
("parent_two", "fun.py", 10),
("main", "main.py", 4),
],
),
]
In that example, malloc
can be called from two different paths - either by main
calling parent_one
calling malloc
, or by main
calling parent_two
calling importer
calling malloc
. importer
is part of the import system, so there's one path to malloc
that goes through the import system and one that doesn't. Each of those paths are responsible for half the bytes allocated by malloc
.
Our non-inverted flame graph displays these stacks for that:
| root size=2048 |
| main size=2048 |
| parent_one size=1024 | parent_two size=1024 |
| malloc size=1024 | importer size=1024 |
| | malloc size=1024 |
When you click the "hide import system frames" checkbox, importer
and the frames below it are hidden, and parent_two
becomes a leaf:
| root size=2048 |
| main size=2048 |
| parent_one size=1024 | parent_two size=1024 |
| malloc size=1024 | |
For our inverted flame graphs, I expect the exact same stacks to be shown, just with leaves and roots swapped. So, when import system frames are not hidden, I expect us to show this:
| root size=2048 |
| malloc size=2048 |
| parent_one size=1024 | importer size=1024 |
| main size=1024 | parent_two size=1024 |
| | main size=1024 |
But when import system frames are hidden, I expect it to instead show:
| root size=2048 |
| malloc size=1024 | parent_two size=1024 |
| parent_one size=1024 | main size=1024 |
| main size=1024 | |
The main -> parent_one -> malloc
path remains unchanged by "hide import system frames", since that path doesn't go through the import system, but the main -> parent_two -> importer -> malloc
stack gets truncated to main -> parent_two
when importer
is reached (just like it is for non-inverted flame graphs), and the stack we show now ends with parent_two
as a child of the root (because parent_two
is a leaf in the non-inverted hide-import-system case, and inverted mode swaps leaves and roots). Which means that the malloc
node shrinks from 2048 bytes to 1024 bytes, since half of its allocations are attributable to a path through the import system, and half aren't.
There's no way we can achieve that desired behavior if we update the malloc
node to say that it's an import system frame because one of its callers is an import system frame, even though the other isn't. As this function is currently implemented, I think it would result in us hiding malloc
entirely, effectively truncating the main -> parent_one -> malloc
call stack even though it didn't pass through the import system.
I can think of 3 reasonable ways to handle this:
- Build two separate trees in
--inverted
mode. For one, use full stacks as the input. For the other, truncate the stacks at import system calls before processing them into a tree. Include both trees in the inverted-mode HTML, and have the checkbox switch which one is rendered. - In the
FrameNodeDict
, instead of recording just abool
for whether or not the node was reached by the import system, record anint
: the number of bytes allocated at that node through a path that went through the import system. That would change this|=
into a+=
. When the checkbox is unset the inverted-mode JS code would ignore that field. When the checkbox is checked the JS code would shrink every node by treating that node's size asvalue - import_system_bytes
, instead of asvalue
bytes. Any node wherevalue - import_system_bytes
is 0 would be removed, and its children would be hoisted up to the root. - Pre-process each stack to identify which frames are below a call into the import system. Include the is-import-system flag as part of the node key, so that we have two different
malloc
nodes in the dumped tree, each associated with 1024 bytes. When the checkbox is checked, our JS code drops nodes with is-import-system set, and hoists their children up a level. When the checkbox is unchecked, our JS code would need to traverse the tree and merge nodes together if they have the same parent and same location, but one is flagged is-import-system and the other isn't.
Of these options, I think 1 is probably the simplest, and 3 is probably the most complex. 2 might require the fewest changes to what's been implemented so far...
@pablogsal Please check what I've said here, and see if a) you agree with me about what the checkbox should do, and b) if my proposed solutions seem reasonable.
if inverted and kind == "flamegraph": | ||
kind = f"inverted {kind}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we want to return "inverted flamegraph report (memory leaks)" if show_memory_leaks
is true? I think what we want is:
parts = []
if inverted:
parts.append("inverted")
parts.append(kind)
parts.append("report")
if show_memory_leaks:
parts.append("(memory leaks)")
return " ".join(parts)
Also: Don't worry about doing the and kind == "flamegraph"
here. I think it's fine to return "inverted table report" if someone calls this with kind="table"
. We know they won't, but if they for some reason do, that would be sane behavior, so there's no reason to avoid it.
As a heads up, I haven't looked at the JS or the tests yet, so the above comments are what I have to say about the Python side of the implementation. |
a65cd31
to
1c399a6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great! I've taken a look at the Javascript this time around (though I still haven't looked closely at the tests), and most of the concerns I have now are pretty minor.
src/memray/reporters/flamegraph.py
Outdated
@@ -20,16 +29,43 @@ | |||
from memray.reporters.frame_tools import is_cpython_internal | |||
from memray.reporters.frame_tools import is_frame_from_import_system | |||
from memray.reporters.frame_tools import is_frame_interesting | |||
from memray.reporters.stats import PythonStackElement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not have one reporter importing things from another if we can avoid it. I'd rather copy-paste this type declaration, instead:
from memray.reporters.stats import PythonStackElement | |
PythonStackElement = Tuple[str, str, int] |
That helps maintain the separation of responsibilities between different reporters. We might eventually want to move that definition into a common place that different reporters can share - maybe create src/memray/_types.py
or something like that - but I wouldn't bother until we need it in yet another place.
src/memray/reporters/flamegraph.py
Outdated
@classmethod | ||
def update_import_system_frames_for_inverted_flamegraph( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is no longer used and should be removed
src/memray/reporters/flamegraph.py
Outdated
if temporal and inverted: | ||
raise NotImplementedError( | ||
"FlamegraphReporter does not support temporal inverted flame graphs." | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should handle this in src/memray/commands/common.py
in run
instead, and do a parser.error()
if the user provides both --temporal
and --inverted
. That'll give a nicer error message for users, I believe.
src/memray/reporters/flamegraph.py
Outdated
frames: List[FrameNodeDict], | ||
node_index_by_key: Dict[Tuple[int, StackFrame, str], int], | ||
record: Union[AllocationRecord, TemporalAllocationRecord], | ||
current_frame_id: int, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Callers always provide 0 for this argument. Since it doesn't vary from one call to the next, current_frame_id
should be a local variable instead of a parameter.
src/memray/reporters/flamegraph.py
Outdated
current_frame["location"] = ("...", "...", 0) | ||
break | ||
|
||
return frames |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to return frames
here. The caller passed frames
to us, so they already have a reference to it. Returning it is misleading, since it isn't clear from the signature that we always return the same list, never a different one.
src/memray/reporters/flamegraph.py
Outdated
import_system_stack = list( | ||
reversed( | ||
list( | ||
itertools.takewhile( | ||
lambda e: not is_frame_from_import_system(e), | ||
reversed(stack), | ||
) | ||
) | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this! Very elegant.
src/memray/reporters/flamegraph.py
Outdated
"value": 0, | ||
"children": [], | ||
"n_allocations": 0, | ||
"thread_id": "0x0", | ||
"interesting": True, | ||
"import_system": False, | ||
} | ||
inverted_import_system_root = deepcopy(root) if inverted else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think every variable name in this function with _import_system_
in the name should be renamed to _no_import_system_
(or just _no_imports_
) instead.
inverted_import_system_root
makes it sound like this is the root of an inverted tree that does contain import system nodes, when actually its the root of an inverted tree that contains only non-import-system nodes.
src/memray/reporters/flamegraph.py
Outdated
if temporal and inverted: | ||
raise NotImplementedError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you given any thought to what it would take to extend this to work for temporal mode? It should definitely be a separate PR, this one is long enough - but I don't think it would be too tough to implement.
@@ -139,8 +139,11 @@ <h5 class="modal-title" id="helpModalLabel">How to interpret {{ kind }} reports< | |||
<script type="text/javascript"> | |||
const packed_data = {{ data|tojson }}; | |||
var data = null; | |||
var flamegraphData = null; | |||
var invertedImportSystemData = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, this makes it sound like this one does include import system nodes, rather than does not. I'd call it something like importSystemHiddenFlamegraphData
or noImportSystemData
or flamegraphDataNoImportSystem
or something.
nodes: inverted_import_system_nodes, | ||
unique_threads: unique_threads, | ||
}) | ||
: {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do null
here instead of an empty object. That's a better indicator for the particular sort of invalid that this is.
: {}; | |
: null; |
1c399a6
to
fd0b4f9
Compare
91bc90e
to
935184b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made a few small changes, but this LGTM!
src/memray/reporters/flamegraph.py
Outdated
size: Union[int, None] | ||
n_allocations: Union[int, None] | ||
intervals: Union[List[Interval], None] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union[X, None]
is the same thing as Optional[X]
, and we're already using Optional
elsewhere in this file, so I'm going to update these 3 to match.
src/memray/reporters/flamegraph.py
Outdated
cls, | ||
frames: List[FrameNodeDict], | ||
all_strings: StringRegistry, | ||
interval_list: List[Tuple[int, Union[int, None], int, int, int]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this function doesn't actually touch interval_list
beyond seeing whether it exists, and since it exists if and only if we're in temporal
mode, I'm going to switch this to just accepting temporal: bool
instead.
src/memray/reporters/flamegraph.py
Outdated
inverted_no_imports_frames: List[FrameNodeDict] = [] | ||
|
||
if inverted: | ||
inverted_no_imports_root = deepcopy(root) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like having root
and inverted_no_imports_root
as local variables in this function, since they're only used here at the very top of the fairly big function. Limiting the amount of local state will make this function easier to understand. I'm going to factor creation of the root node out into a function, and call it once for each tree we're making.
src/memray/reporters/flamegraph.py
Outdated
node_index_by_key: Dict[Tuple[int, StackFrame, str], int] = {} | ||
inverted_no_imports_node_index_by_key: Dict[ | ||
Tuple[int, StackFrame, str], int | ||
] = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm gonna switch this to use a type alias, to make things slightly easier to read:
NodeKey = Tuple[int, StackFrame, str]
node_index_by_key: Dict[NodeKey, int] = {}
inverted_no_imports_node_index_by_key: Dict[NodeKey, int] = {}
In particular, it's tough to see that the two variables have the same type given the way that Black wrapped the lines, and this change makes that more recognizable.
src/memray/reporters/flamegraph.py
Outdated
record_data: RecordData = { | ||
"thread_name": thread_id, | ||
"size": size, | ||
"n_allocations": n_allocations, | ||
"intervals": intervals, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm gonna move this to be defined in each branch of the if
above, so that we don't need intervals
and size
and n_allocations
and thread_id
as local variables in addition to values in this dict, for the same reason as above - limiting the number of local variables here will help future readers.
src/memray/reporters/flamegraph.py
Outdated
stack_it = enumerate(reversed(stack)) | ||
cls.generate_frames( | ||
stack_it=stack_it, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm gonna drop this local variable as well, and just construct it in the call to the function below, since it's only used in one spot, and to keep limiting local state.
src/memray/reporters/flamegraph.py
Outdated
no_imports_stack = list( | ||
reversed( | ||
list( | ||
itertools.takewhile( | ||
lambda e: not is_frame_from_import_system(e), | ||
reversed(stack), | ||
) | ||
) | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm gonna split this into a function, so that we can assign a nice name to it for the benefit of future maintainers.
935184b
to
f99c6a8
Compare
Added inverted flame graph which is generated with --inverted command line argument. Signed-off-by: Ivona Stojanovic <[email protected]>
Added tests for inverted flame graph. Signed-off-by: Ivona Stojanovic <[email protected]>
f99c6a8
to
e16d9b2
Compare
Added inverted flame graph which is generated with --inverted command line argument. The tree's nodes with the same parent and location are merged when we check 'Hide Import System Frames' checkbox.
Fixes: #293
Describe your changes
A clear and concise description of the changes you have made.
Testing performed
Describe the testing you have performed to ensure that the bug has been addressed, or that the new feature works as planned.
Additional context
Add any other context about your contribution here.