High level profiling / tracing #8588

marcins · 2022-10-31T00:20:47Z

💬 RFC

🔦 Context

Parcel currently provides a way to generate sampling CPU profiles using --profile, or using Ctrl-E during the build. These CPU sample based profiles can be used to identify bottlenecks and understand where Parcel is spending it's time at the code level, and are useful for these low-level purposes.

This proposal is to introduce the concept of a high-level trace, where Parcel can report on time spent at a higher level - for example, to be able to identify which transform is the most expensive in your build, or identify if there is a particular source file that causes additional overhead.

This is particularly useful for us internally when determining where to focus our efforts on build time improvements, in order to identify where time is being spent and where we should focus our optimisation efforts.

Prior art

When I initially spiked this, @mischnic mentioned this had been previously raised by him in the Parcel project management tool (issue T-1131).

This RFC builds upon work originally created by @thebriando in our internal Parcel fork - Brian's implementation only ran on the main thread, and measured time spent in transform/bundling/packaging at a high level, however when I tried to adapt this approach to a more broad high level tracing (e.g. per transform) I ran into the following issues: a) passing a Tracer around through the various layers would involve a lot of changes in a lot of places, b) it would not be able to cross thread/serialisation boundaries.

Spike approach

There is a very rough spike branch here: https://github.com/marcins/parcel/tree/mszczepanski/spike-system-tracing which at this stage is not comprehensive, nor particularly clean, but was created in order to help socialise the potential approach.

In this branch I have added additional "trace" functionality to the logger, mostly because I wanted to follow a similar approach - have a singleton SystemTracer that could be used to access the measurement functionality from anywhere. This SystemTracer lets you measure any code within Parcel with a pattern like:

 let measurement = SystemTracer.createMeasurement(transformer.name, {
  categories: ["transform"],
  args: {
    name: fromProjectPathRelative(initialAsset.value.filePath),
  },
});

// .. transformation happens here

measurement.end();

These trace events are caught by a TraceReporter which uses chrome-trace-event to produce an appropriate output.

This will produce a Chrome Trace event similar to:

{"ts":2726640,"pid":34523,"tid":3,"ph":"X","name":"@parcel/transformer-babel","cat":"transform","args":{"name":"node_modules/react/jsx-runtime.js"},"dur":288},

A trivial but complete trace is available here: https://gist.github.com/marcins/a1a0ceaf3f3fd0d4771e0e611ebc360f

Analysing results

While these traces can be loaded into Chrome DevTools Performance tab, like Parcel's other profiles, they are of limited value here due to the lack of analysis tools. However, where this data can really be useful is by loading it into https://ui.perfetto.dev/. Example screenshot from the trace above:

The visual timeline itself isn't necessarily useful except at a very high level, however the SQL queries built into Perfetto let us easily ask questions like "What is the most expensive transform?"

select 
  s.category, name, SUM(CAST(dur AS double)/1000/1000) as dur_ms
from 
  slice s 
left join 
  args using(arg_set_id)
where 
  args.flat_key = "args.name" AND
  s.category = "transform"
group by name
order by dur_ms desc
limit 100;

category	name	dur_ms
transform	@parcel/transformer-babel	93.15300000000002
transform	@parcel/transformer-js	68.70400000000001
transform	@parcel/transformer-html	27.017
transform	@parcel/transformer-css	4.612
transform	@parcel/transformer-react-refresh-wrap	1.6889999999999998
transform	@parcel/transformer-postcss	0.382
transform	@parcel/transformer-posthtml	0.28

Using Perfetto's SQL you can come up with all sorts of other queries, whether it's by source file, category, etc.

Proposed implementation details

Create a new package @parcel/tracing to encapsulate tracing/profiling. The existing Trace and Profiler from packages/core/workers/src/Profiler.js could be moved here as well as they share a dependency on chrome-trace-event / deal with tracing.
Tracer / tracer should be analogous to Logger / logger (i.e. the class implementing the thing, a singleton instance of the thing exported for use in other code)
Follow the same pattern as for logging, a child will handle Tracer.onTrace events and re-emit them using bus.emit('traceEvent'), which get re-fired on the main thread by a listener.
Currently in the work Brian did there is a TraceReporter to handle creating the trace, but we should probably just handle this in core similar to how / where existing profiling traces are created?
Thought that needs more thinking.. --profile=application vs --profile=sampling (where sampling is the default current behaviour?)
need to measure the impact of sending measurement messages across workers when the result is unused - it would be easier to disable listening / writing to a file when the feature isn't enabled, rather than disabling at the sending end, but if there's a non-trivial impact then definitely work on the latter (maybe can just add an .enable()/.disable() to Tracer that can be called when config/CLI options are processed?)
measurement of most transforms is handled at the Transformer pipeline level, but for the BabelTransformer implement a wrapPluginVisitorMethod that will measure time spent in individual Babel plugins (marcins@51695a0#diff-8555dae9dfcfcf7ddcaab1c7cb7ebde8b3bcdecef9b54dc032c654bb0e2c5c21R69-R86)

Status / Questions / etc

Pending implementation.

The text was updated successfully, but these errors were encountered:

devongovett · 2022-10-31T06:32:42Z

This all makes sense to me, and it sounds useful!

In my spike I've just added a new trace method to Logger in order to let me easily send these trace events across the worker boundary.

You could do this using the bus from @parcel/workers. It's an event emitter, but it sends events in workers back to the main thread. The logger and reporters use it already.

Where should control of whether this feature is enabled be?

We could add a new CLI option like we have for --profile perhaps?

If we implement this, we should probably consider unifying it with profiling

We could make --profile take a value, like --profile=high-level (idk what to call it), but I don't think a separate --trace option would be too confusing either.

It would be useful to also capture time spent in Babel plugins - this might be possible to do in the Babel transform, as Babel config provides the ability to provide a wrapPluginVisitorMethod

Sounds like it would work.

marcins · 2022-10-31T23:34:43Z

Thanks for the feedback Devon, I've removed the questions from the original post and added some proposed implementation details based on the answers and some more thinking.

mischnic · 2023-10-27T12:05:40Z

@marcins is this completed with #8695?

marcins · 2023-10-27T23:13:08Z

Implemented in #8695

mischnic added the 💬 RFC Request For Comments label Oct 31, 2022

marcins mentioned this issue Dec 16, 2022

Application profiling #8695

Merged

5 tasks

github-actions bot added the Stale Inactive issues label Apr 30, 2023

devongovett removed the Stale Inactive issues label Apr 30, 2023

parcel-bundler deleted a comment from github-actions bot Apr 30, 2023

github-actions bot added the Stale Inactive issues label Oct 27, 2023

mischnic removed the Stale Inactive issues label Oct 27, 2023

parcel-bundler deleted a comment from github-actions bot Oct 27, 2023

marcins closed this as completed Oct 27, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

High level profiling / tracing #8588

High level profiling / tracing #8588

marcins commented Oct 31, 2022 •

edited

Loading

devongovett commented Oct 31, 2022

marcins commented Oct 31, 2022

mischnic commented Oct 27, 2023

marcins commented Oct 27, 2023

High level profiling / tracing #8588

High level profiling / tracing #8588

Comments

marcins commented Oct 31, 2022 • edited Loading

💬 RFC

🔦 Context

Prior art

Spike approach

Analysing results

Proposed implementation details

Status / Questions / etc

devongovett commented Oct 31, 2022

marcins commented Oct 31, 2022

mischnic commented Oct 27, 2023

marcins commented Oct 27, 2023

marcins commented Oct 31, 2022 •

edited

Loading