Commit Graph

14 Commits

cyy
8967d55b01 [18/N] Fix clang-tidy warnings in jit (#132963)
Follows #132753

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132963
Approved by: https://github.com/Skylion007
2024-08-09 01:27:32 +00:00
cyy
bfe5e1258b avoid unnecessary static_cast (#93898)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93898
Approved by: https://github.com/Skylion007
2023-02-03 03:44:43 +00:00
Salil Desai
e2dc60c6cb [Vulkan + Profiler] Add Timestamp Adjustment Algorithm (#90672)
@bypass-github-export-checks

This change ensures that Vulkan event start/end times are correctly synced with their parent CPU times.

This sometimes requires increasing CPU event durations (to fully contain their child events) and delaying CPU event start times (to prevent overlaps), so it should not be used unless Vulkan events are being profiled and it is acceptable to use this modified timestamp/duration information instead of the original information.
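
For intuition, a minimal sketch of the two adjustments (a hedged illustration; all names here are hypothetical, the real algorithm lives in the PR):

```
#include <algorithm>
#include <cstdint>

struct Event {
  int64_t start_us = 0;
  int64_t duration_us = 0;
  int64_t end_us() const { return start_us + duration_us; }
};

// Grow a CPU parent's duration so it fully contains its (Vulkan) child.
void adjustParentToContainChild(Event& parent, const Event& child) {
  parent.duration_us =
      std::max(parent.end_us(), child.end_us()) - parent.start_us;
}

// Delay an event's start until the previous one has finished, so the
// adjusted events do not overlap.
void delayToAvoidOverlap(const Event& prev, Event& next) {
  next.start_us = std::max(next.start_us, prev.end_us());
}
```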

Differential Revision: [D39893109](https://our.internmc.facebook.com/intern/diff/D39893109/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39893109/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90672
Approved by: https://github.com/kimishpatel
2022-12-19 20:01:07 +00:00
Digant Desai
03346296db [edge profiler] Add support for performance events counting (#87876)
* Add support in the lite_predictor benchmark binary to select event lists
* Uses Linux perf through the Kineto profiler (see the sketch below)
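
For background, a minimal sketch of the raw Linux mechanism underneath (perf_event_open(2) directly, not the actual Kineto integration), counting retired instructions around a measured region:

```
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
  perf_event_attr attr;
  std::memset(&attr, 0, sizeof(attr));
  attr.size = sizeof(attr);
  attr.type = PERF_TYPE_HARDWARE;
  attr.config = PERF_COUNT_HW_INSTRUCTIONS;
  attr.disabled = 1;       // start disabled; enable around the region
  attr.exclude_kernel = 1; // count user-space instructions only

  int fd = static_cast<int>(syscall(
      SYS_perf_event_open, &attr, 0 /*pid: self*/, -1 /*any cpu*/,
      -1 /*no group*/, 0 /*flags*/));
  if (fd < 0) {
    std::perror("perf_event_open");
    return 1;
  }

  ioctl(fd, PERF_EVENT_IOC_RESET, 0);
  ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
  volatile int64_t sink = 0;
  for (int i = 0; i < 1000000; ++i) sink += i;  // region being measured
  ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

  uint64_t count = 0;
  read(fd, &count, sizeof(count));
  std::printf("instructions: %llu\n", static_cast<unsigned long long>(count));
  close(fd);
  return 0;
}
```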

Differential Revision: [D39837216](https://our.internmc.facebook.com/intern/diff/D39837216/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39837216/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87876
Approved by: https://github.com/SS-JIA
2022-11-02 14:47:44 +00:00
Max Ren
8ea033a009 [profiling] API For profiling backend memory events (#80350)
Summary:
Adding a new API call

```
mobile::getCurrentEdgeProfiler()->recordBackendMemoryEvent(
        ptr, alloc_size, total_allocated, total_reserved, device);
```

This also adds another macro for recording backend memory events. These memory events will be captured in the trace file when we create the profiler:

```
{
  KinetoEdgeCPUProfiler profiler(
      module,
      trace_file_name,
      false, // record input_shapes
      true,  // profile memory
      true,  // record callstack
      false, // record flops
      true); // record module hierarchy
  module.forward(inputs);
}
```
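
As an illustration of where that call might sit, a hedged sketch of a backend allocator reporting its allocations; only the recordBackendMemoryEvent() call is from this PR, while the counters, device id, and header path are assumptions:

```
#include <torch/csrc/jit/mobile/profiler_edge.h>  // assumed location of the API

#include <cstdint>
#include <cstdlib>

static int64_t total_allocated = 0;  // hypothetical running counters
static int64_t total_reserved = 0;
static const int64_t device = 0;     // hypothetical backend device id

void* backend_alloc(int64_t alloc_size) {
  void* ptr = std::malloc(alloc_size);
  total_allocated += alloc_size;
  total_reserved += alloc_size;
  // Report the allocation so it appears in the profiler's trace file.
  mobile::getCurrentEdgeProfiler()->recordBackendMemoryEvent(
      ptr, alloc_size, total_allocated, total_reserved, device);
  return ptr;
}
```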

Test Plan: Testing to do in the next diff

Differential Revision: D37116111

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80350
Approved by: https://github.com/kimishpatel
2022-06-29 23:32:50 +00:00
Taylor Robie
bd34636b13 [pytorch][PR] [Profiler] Add EventFieldsVisitor
One source of complexity in profiler_kineto is that we do most things twice: once to set a field in `kineto_events_.back()`, and once for the metadata json. These have historically been chained, with the KinetoEvent used to populate the metadata fields. However, this is hard to read and error-prone, as we have one giant block of assignments followed by another giant block. It also means that the logic about whether a field is present or not is duplicated.

This PR replaces this logic with a visitor that writes both together. E.g.
```
    auto& dtypes = result_.get().inputs_.dtypes_;
    if (!dtypes.empty()) {
      kineto_event_.get().dtypes(dtypes);
      out.emplace_back("Input type", dtypesToStr(dtypes));
    }
```
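
For context, a hedged sketch of the shape of that pattern, with simplified stand-in types (not the PR's actual classes): one visitor method fills both sinks, so the presence check is written exactly once.

```
#include <string>
#include <utility>
#include <vector>

struct KinetoEventStub {
  std::vector<std::string> dtypes;
};

class FieldsVisitorSketch {
 public:
  explicit FieldsVisitorSketch(KinetoEventStub& event) : event_(event) {}

  // Writes the structured field and its json metadata together.
  void visitDtypes(const std::vector<std::string>& dtypes) {
    if (!dtypes.empty()) {
      event_.dtypes = dtypes;                               // structured field
      metadata_.emplace_back("Input type", join(dtypes));   // json metadata
    }
  }

  const std::vector<std::pair<std::string, std::string>>& metadata() const {
    return metadata_;
  }

 private:
  static std::string join(const std::vector<std::string>& v) {
    std::string out;
    for (const auto& s : v) {
      out += (out.empty() ? "" : ", ") + s;
    }
    return out;
  }

  KinetoEventStub& event_;
  std::vector<std::pair<std::string, std::string>> metadata_;
};
```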

Differential Revision: [D36070202](https://our.internmc.facebook.com/intern/diff/D36070202/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77691

Approved by: https://github.com/aaronenyeshi
2022-05-18 03:49:47 +00:00
Taylor Robie
24bc3be146 [Profiler] Clean up profiler includes. (#69421)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69421

I've hit a lot of build issues in D32671972, and I've come to realize that a lot of it boils down to header hygiene. `function.h` includes `profiler.h` *solely* to transitively include `record_function.h`, which winds up leaking the profiler symbols. Moreover, several files are relying on transitive includes to get access to `getTime`. As long as I have to touch all the places that use `getTime`, I may as well also move them to the new namespace.

Test Plan: Unit tests and CI.

Reviewed By: aaronenyeshi, albanD

Differential Revision: D32865907

fbshipit-source-id: f87d6fd5afb784dca2146436e72c69e34623020e
2021-12-15 12:50:24 -08:00
Kimish Patel
c6216b2a43 Back out "Revert D30710710: [Pytorch Edge] Support profiling kineto events from external source" (#66421)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66421

Original commit changeset: ab6bb8fe4e83

Plus, this includes the BUILD.bazel changes that were the reason for the revert.

Test Plan: See original diff

Reviewed By: gdankel

Differential Revision: D31542513

fbshipit-source-id: ee30aca2d6705638f97e04b77a9ae31fe5cc4ebb
2021-10-12 10:55:29 -07:00
Jane Xu
c62ed96496 Revert D30710710: [Pytorch Edge] Support profiling kineto events from external source
Test Plan: revert-hammer

Differential Revision:
D30710710 (c1343ff706)

Original commit changeset: 51399f9b0b64

fbshipit-source-id: ab6bb8fe4e83ed1052e621e427259192a4f0f540
2021-10-08 17:46:18 -07:00
Kimish Patel
c1343ff706 [Pytorch Edge] Support profiling kineto events from external source (#64397)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64397

This diff exposes a way to add events to the kineto profiler from an
external source. This can be a backend that executes a subgraph and wants
to record this execution in the kineto profiler.
This diff also adds "backend" metadata to identify the backend an event
would have executed on.
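
A hedged usage sketch from a delegate backend's point of view; the entry-point name, signature, and header below are assumptions based on the OSS profiler headers this diff touches, so verify them against the PR:

```
#include <torch/csrc/autograd/profiler_kineto.h>  // assumed header

#include <cstdint>

// Report a completed delegate-subgraph execution to the active profiler.
void reportDelegateExecution(
    int64_t start_us,
    int64_t end_us,
    int64_t debug_handle) {
  torch::autograd::profiler::reportBackendEventToActiveKinetoProfiler(
      start_us,
      end_us,
      debug_handle,
      at::RecordScope::LITE_INTERPRETER,
      "delegate_subgraph",  // event name shown in the trace
      "my_backend");        // the new "backend" metadata
}
```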

Test Plan:
test_lite_interpreter

Imported from OSS

Reviewed By: raziel

Differential Revision: D30710710

fbshipit-source-id: 51399f9b0b647bc2d0076074ad4ea9286d0ef3e2
2021-10-08 15:59:42 -07:00
Kimish Patel
468001600c Back out "Revert D30327514: [Pytorch lite predictor] Use KinetoEdgeCPUProfiler for operator profiling." (#64307)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64307

Original commit changeset: 0b2aa7c57d08

Restores the original changes.
This diff changes the way operator profiling is done in the lite predictor
benchmarking binary.
Instead of using custom callbacks, it uses KinetoEdgeCPUProfiler to profile
events and then generates operator-level metrics from them.
Since KinetoEvents do not contain CPU clock time, we now report only wallclock
time.
This unifies the various profiling efforts we have for benchmarking purposes. In
production we will still use the observer-based mechanism, but the advantage of
using the kineto profiler is that we get a few other things for free, such as:
- chrome trace generation.
- operator-level memory profiling (to be added)
- flop counts (to be added)

Furthermore, we can possibly use a Python post-processing script to parse the
chrome trace and generate output similar to torch.profiler. (To be done)

Furthermore, this removes some tests from test_lite_interpreter.cpp which were testing module hierarchy in debug info. They should be covered by test_mobile_profiler.cpp.

Test Plan:
aibench run
Model without debug info:
https://www.internalfb.com/intern/aibench/details/219598441154763
Model with debug info and --print_module_info true (see Operator summary has now module hierarchy information).
https://www.internalfb.com/intern/aibench/details/617154236292985

Reviewed By: raziel

Differential Revision: D30680354

fbshipit-source-id: b6ba0d59c510c13d13d9935b1d8051cc82ffa4e9
2021-09-01 13:29:35 -07:00
Kimish Patel
67cb131458 Revert D30327514: [Pytorch lite predictor] Use KinetoEdgeCPUProfiler for operator profiling.
Test Plan: revert-hammer

Differential Revision:
D30327514 (bc9277dca3)

Original commit changeset: 3bb2f2daaaed

fbshipit-source-id: 0b2aa7c57d08de77c9aaa75e546a7d0938610f64
2021-08-31 08:30:36 -07:00
Kimish Patel
bc9277dca3 [Pytorch lite predictor] Use KinetoEdgeCPUProfiler for operator profiling. (#63367)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63367

This diff changes the way operator profiling is done in the lite predictor
benchmarking binary.
Instead of using custom callbacks, it uses KinetoEdgeCPUProfiler to profile
events and then generates operator-level metrics from them.
Since KinetoEvents do not contain CPU clock time, we now report only wallclock
time.
This unifies the various profiling efforts we have for benchmarking purposes. In
production we will still use the observer-based mechanism, but the advantage of
using the kineto profiler is that we get a few other things for free, such as:
- chrome trace generation.
- operator-level memory profiling (to be added)
- flop counts (to be added)

Furthermore, we can possibly use a Python post-processing script to parse the
chrome trace and generate output similar to torch.profiler. (To be done)

Test Plan:
aibench run
Model without debug info:
https://www.internalfb.com/intern/aibench/details/219598441154763
Model with debug info and `--print_module_info true` (see Operator summary has now module hierarchy information).
https://www.internalfb.com/intern/aibench/details/617154236292985

Reviewed By: raziel

Differential Revision: D30327514

fbshipit-source-id: 3bb2f2daaaedfb04bd6f5d9c91292783f9c4344f
2021-08-30 20:54:51 -07:00
Kimish Patel
38c185189c [Pytorch Edge] Enable kineto profiler on mobile via EdgeKinetoProfiler (#62419)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62419

This diff adds support for the CPU-only kineto profiler on mobile, thus
enabling chrome trace generation on mobile. This brings the C++ API for
mobile profiling on par with TorchScript.
This is done via:
1. Utilizing debug handle annotations in KinetoEvent.
2. Adding post-processing capability, via callbacks, to
KinetoThreadLocalState.
3. Creating a new RAII-style profiler, KinetoEdgeCPUProfiler, which can be
used in the surrounding scope of model execution. This will write a chrome
trace to the location specified in the profiler constructor (see the usage
sketch below).
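
A minimal usage sketch of (3), assuming the header location and that the remaining constructor flags default as in the #80350 example above (the trace path is arbitrary):

```
#include <torch/csrc/jit/mobile/profiler_edge.h>  // assumed header

#include <utility>
#include <vector>

c10::IValue runWithTrace(
    torch::jit::mobile::Module& module,
    std::vector<c10::IValue> inputs) {
  // RAII: the profiler is scoped around execution and writes the chrome
  // trace to the given path when it goes out of scope.
  torch::jit::mobile::KinetoEdgeCPUProfiler profiler(
      module, "/tmp/trace.json");
  return module.forward(std::move(inputs));
}
```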

Test Plan:
MobileProfiler.ModuleHierarchy

Imported from OSS

Reviewed By: raziel

Differential Revision: D29993660

fbshipit-source-id: 0b44f52f9e9c5f5aff81ebbd9273c254c3c03299
2021-08-13 21:40:19 -07:00