Commit Graph

145 Commits

Author SHA1 Message Date
DanilBaibak
8cf1a02e80 Revert [Profiler] Improve the docstring for export_memory_timeline (#110978)
Revert [Profiler] Improve the docstring for export_memory_timeline
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110978
Approved by: https://github.com/huydhn, https://github.com/aaronenyeshi
2023-10-10 19:57:25 +00:00
Aaron Shi
52b1470935 [Profiler] Improve the docstring for export_memory_timeline (#110949)
Summary: Add more details about the export_memory_timeline API, as we've landed new representations of the memory timeline data.

Test Plan: CI, should be no functional change, as we only changed comments.

Differential Revision: D50123450

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110949
Approved by: https://github.com/davidberard98
2023-10-10 17:53:56 +00:00
Kazuaki Ishizaki
b5f9696d81 Fix typo under torch directory (#110824)
This PR fixes typo `the the` of comments and exception messages in files under `torch` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110824
Approved by: https://github.com/H-Huang
2023-10-09 19:16:43 +00:00
David Berard
8c66f97c9b [profiler] move _enable_dynamo_cache_lookup_profiler (#107720)
_enable_dynamo_cache_lookup_profiler used to get turned on when running `__enter__` or `__exit__` with the profiler. But it's possible to turn the profiler on and off without the context manager (e.g. with a schedule and calling `.step()`). Instead, we should put these calls (which are supposed to be executed when the profiler turns on/off) where `_enable_profiler()` and `_disable_profiler()` are called.

This puts `_enable_dynamo_cache_lookup_profiler` and `_set_is_profiler_enabled` into `_run_on_profiler_(start|stop)` and calls that on the 3 places where `_(enable|disable)_profiler` get called.

Differential Revision: [D48619818](https://our.internmc.facebook.com/intern/diff/D48619818)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107720
Approved by: https://github.com/wconstab
2023-08-23 23:41:35 +00:00
David Berard
cb107c74bb [profiler] DISABLE_CUPTI_LAZY_REINIT for CUDA 12 as well (#107744)
Summary:
Apparently CUDA 12 + CUPTI can fail with an illegal memory access, similar to what we saw with CUDA 11 (https://github.com/pytorch/pytorch/issues/75504).

For now we'll just turn on DISABLE_CUPTI_LAZY_REINIT, which will fix this internally. In OSS, this will probably still break - which will hopefully give us a repro.

Differential Revision: D48568888

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107744
Approved by: https://github.com/aaronenyeshi
2023-08-23 23:28:15 +00:00
Anupam Bhatnagar
3336aa191c Adding allocated and reserved memory values to memory timeline view. (#107056)
Summary: This diff adds the max allocated and max reserved memory values to the memory timeline plot.

Test Plan:
Executed

`buck run mode/dev-nosan kineto/libkineto/fb/integration_tests:pytorch_resnet_integration_test -- --enable_profiling --profile_memory --trace_handler=auto_trace --with_stack --record_shapes` on my devgpu.

The generated output is at
https://www.internalfb.com/manifold/explorer/ai_efficiency/tree/traces/dynocli/devgpu020.odn1.facebook.com/rank-0/rank-0.Aug_10_16_50_50.236946.pt.memorytl.html

 {F1067885545}
Screenshot of the html above
 {F1067886350}

Reviewed By: aaronenyeshi

Differential Revision: D48251791

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107056
Approved by: https://github.com/aaronenyeshi, https://github.com/davidberard98
2023-08-21 17:20:13 +00:00
Aaron Gokaslan
b1e8e01e50 [BE]: Apply PYI autofixes to various types (#107521)
Applies some autofixes from the ruff PYI rules to improve the typing of PyTorch. I haven't enabled most of these ruff rules yet as they do not have autofixes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107521
Approved by: https://github.com/ezyang
2023-08-20 02:42:21 +00:00
MooYeh
fb6652b56e [profiler] add profiler parsing support for custom device. (#106142)
We hope PyTorch's profiling parsing ability can also apply to custom devices. Building on previous work https://github.com/pytorch/pytorch/pull/101554, we have made supplementary updates to PyTorch profiling to extend its parsing capabilities to custom devices. These modifications do not affect the original logic of the code and mainly include the following aspects:
1. Added the relevant logic for use_device in torch.profiler.profiler._KinetoProfile.
2. In torch.autograd.profiler and torch.autograd.profiler_util, parsing of custom-device profiling data has been added using the privateuse1 and use_device attributes.
3. In torch._C._autograd.pyi, custom-device-related attributes have been added. The underlying C++
logic will be added in subsequent pull requests.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106142
Approved by: https://github.com/aaronenyeshi
2023-08-02 20:23:22 +00:00
Edward Z. Yang
3bf922a6ce Apply UFMT to low traffic torch modules (#106249)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106249
Approved by: https://github.com/Skylion007
2023-07-29 23:37:30 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow-up in the pyupgrade series, converting more strings to f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`
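As an illustration of the kind of rewrite `flynt` performs (this example is not from the PR):

```python
name, count = "profiler", 3

# Before: percent- or .format()-style formatting
old = "%s ran %d times" % (name, count)

# After flynt: an equivalent f-string
new = f"{name} ran {count} times"

assert old == new
```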

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
Howard Cheng
3dacc8e847 [PyTorch] [Memory profiler] Early return if qualified name is invalid (#105495)
Summary: Return early if we can easily determine the operator qualified name is invalid before attempting to retrieve the schema. In particular "::" should always be present. Quick estimate shows that this is >50x faster (100 us -> 2 us).
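The check can be sketched as follows; the schema table and function names here are illustrative, not the actual profiler code:

```python
# Made-up schema table standing in for the real registry.
_SCHEMAS = {"aten::add": "add(Tensor a, Tensor b) -> Tensor"}

def get_schema(qualified_name):
    # A valid operator qualified name always contains "::" (e.g. "aten::add"),
    # so return early and skip the comparatively expensive lookup otherwise.
    if "::" not in qualified_name:
        return None
    return _SCHEMAS.get(qualified_name)
```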

Test Plan: CI

Differential Revision: D47562587

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105495
Approved by: https://github.com/aaronenyeshi
2023-07-20 00:58:32 +00:00
Justin Chu
3721fa5612 [BE] Enable ruff's UP rules and autoformat optim/ (#105426)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105426
Approved by: https://github.com/malfet, https://github.com/albanD, https://github.com/aaronenyeshi, https://github.com/janeyx99
2023-07-18 21:07:43 +00:00
Aaron Enye Shi
e0d2ad1a21 [Profiler][Memory] Export raw timestamped events in export_memory_timeline_raw (#105094)
Summary:
Rather than processing the events into a time and sizes plot, dump the actual events as (timestamp, action, num of bytes, category) when output file ends in `raw.json.gz`.

This can allow downstream analysis tools to process these events. It also avoids having to control the granularity of the previous json.gz in memory profiler.
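A minimal sketch of the raw format described above, assuming events are (timestamp, action, num of bytes, category) tuples; the event values and file name here are made up:

```python
import gzip
import json

# Illustrative events in the (timestamp, action, num of bytes, category) shape.
events = [
    (1690000000000, "alloc", 1024, "PARAMETER"),
    (1690000000100, "free", 1024, "PARAMETER"),
]

# Dump the raw events instead of a processed times/sizes plot.
with gzip.open("example.raw.json.gz", "wt") as f:
    json.dump(events, f)
```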

Test Plan: CI Tests

Differential Revision: D47416544

Pulled By: aaronenyeshi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105094
Approved by: https://github.com/davidberard98
2023-07-17 17:39:37 +00:00
Nikita Shulga
5837e95d30 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

That were reverted due to the conflict with internal source repo.

Mostly fixes for PEP-484 violations (i.e. when a default arg is set to None, but the type is not annotated as Optional)
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`

Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add hack to squash the older libstdc++ from the conda environment in favor of the one from the OS to `.ci/docker/install_conda.sh`
- Update bazel cuda builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link with cupti statically, but I could not find where that is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-15 20:30:20 +00:00
PyTorch MergeBot
15fd1ea118 Revert "[Reland] Update mypy to 1.4.1 (#105227)"
This reverts commit c9c4f8efc3.

Reverted https://github.com/pytorch/pytorch/pull/105227 on behalf of https://github.com/atalman due to trying to mitigate ci sev #105248 ([comment](https://github.com/pytorch/pytorch/pull/105227#issuecomment-1636510935))
2023-07-14 22:28:35 +00:00
Nikita Shulga
c9c4f8efc3 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

That were reverted due to the conflict with internal source repo.

Mostly fixes for PEP-484 violations (i.e. when a default arg is set to None, but the type is not annotated as Optional)
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-14 20:45:12 +00:00
PyTorch MergeBot
3c5a494d7a Revert "Update mypy to 1.4.1 (#91983)"
This reverts commit 634659e262.

Reverted https://github.com/pytorch/pytorch/pull/91983 on behalf of https://github.com/malfet due to It's dependent change was reverted, so reverting this one as well, to keep CI clean ([comment](https://github.com/pytorch/pytorch/pull/91983#issuecomment-1636059709))
2023-07-14 15:59:16 +00:00
PyTorch MergeBot
b4d91b1c5b Revert "[Typing] Fix PEP 484 Violation (#105022)"
This reverts commit 4148b7bada.

Reverted https://github.com/pytorch/pytorch/pull/105022 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/105022#issuecomment-1635967734))
2023-07-14 14:45:09 +00:00
Nikita Shulga
634659e262 Update mypy to 1.4.1 (#91983)
Mostly fixes for PEP-484 violations (i.e. when a default arg is set to None, but the type is not annotated as Optional)
Plus a few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91983
Approved by: https://github.com/kit1980, https://github.com/ZainRizvi, https://github.com/huydhn, https://github.com/thiagocrepaldi, https://github.com/aaronenyeshi
2023-07-13 16:30:36 +00:00
Nikita Shulga
4148b7bada [Typing] Fix PEP 484 Violation (#105022)
Not sure how it worked before, but arguments that default to None must be annotated as Optional

Towards enabling mypy-1.4.1 in lintrunner
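The violation in question can be illustrated with a small sketch (the function name and body are made up):

```python
from typing import Optional

# Rejected by newer mypy: `closure: dict = None` is an implicit Optional.
# PEP 484 requires the annotation to say so explicitly:
def step(closure: Optional[dict] = None) -> int:
    return 0 if closure is None else len(closure)
```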

### <samp>🤖 Generated by Copilot at 5e1b9f4</samp>

> _We annotate the arguments of doom_
> _To show the `None` values of gloom_
> _We improve the type checking and readability_
> _With `Optional` annotations of metal-ity_

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105022
Approved by: https://github.com/izaitsevfb, https://github.com/huydhn, https://github.com/Skylion007
2023-07-12 10:20:48 +00:00
Huy Do
b3e60ee052 Fix broken torch._inductor.config import (#104477)
This fixes a bug in profiler code exposed by https://github.com/pytorch/pytorch/pull/104368, which relied on the fact that `import torch._dynamo` also imports `torch._inductor.config`:
```
$ python -c "import torch._inductor;print(torch._inductor.config)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: module 'torch._inductor' has no attribute 'config'
(base) $ python -c "import torch._dynamo;print(torch._inductor.config)"
<module 'torch._inductor.config' from '/home/nshulga/git/pytorch/pytorch/torch/_inductor/config.py'>
```
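The underlying Python behavior is that importing a package does not import its submodules; a hedged sketch of the robust pattern, using `email.mime.text` as a stand-in for `torch._inductor.config`:

```python
import importlib

def load_config(name):
    # Attribute access like pkg.sub fails with AttributeError unless something
    # has imported the submodule; import it explicitly instead of relying on
    # another module's import side effect.
    return importlib.import_module(name)

cfg = load_config("email.mime.text")  # stand-in for "torch._inductor.config"
```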

### Testing
D47159397

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104477
Approved by: https://github.com/aaronenyeshi, https://github.com/malfet
2023-07-01 02:23:44 +00:00
Louis Feng
5847cb55e4 [PyPer][ET] Refactor EG to ET (#99694)
Summary:
Change execution graph to execution trace.
See post: https://fb.workplace.com/groups/873291503156329/permalink/1529496217535851/

Test Plan: Run a job.

Reviewed By: chaekit

Differential Revision: D44121392

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99694
Approved by: https://github.com/chaekit
2023-06-22 19:41:54 +00:00
Aaron Enye Shi
2a4fa25109 [Profiler] Include more uncategorized events in memory profile (#101200)
Summary: This PR adds handling for allocations / frees which we cannot prove are for Tensors. (And thus aren't assigned an ID.) These events are still important for judging overall utilization.

Test Plan: CI and Unit tests.

Differential Revision: D45458885

Pulled By: aaronenyeshi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101200
Approved by: https://github.com/anupambhatnagar, https://github.com/davidberard98
2023-06-08 16:22:49 +00:00
Richard Li
f1f57e1e54 trigger tracing for MTIA events (#102288)
Summary: trigger tracing for MTIA events on python side when ProfilerActivity.MTIA is specified

Test Plan:
Test diff: D45437426

```
hg graft D45437426
```
- in one terminal

```
cd ~/fbsource/fbcode
buck2 run -j 8 \
    //infra_asic_fpga/firmware/tools/mad/service:mad_service
```
- in another terminal

Pytorch profiler
```
buck run mode/dev-nosan -j 8 //caffe2/torch/fb/acc_runtime/afg/tests:test_afg  -- -m kernel_add
```

Differential Revision: D46122853

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102288
Approved by: https://github.com/aaronenyeshi
2023-06-05 15:10:31 +00:00
Aaron Enye Shi
fa7ad77ac9 [Profiler] Workaround CUPTI Lazy Reinit and CUDA Graphs crash in CUDA 11 (#101879)
Summary: Since CUPTI lazy re-init crashes with CUDA Graphs in CUDA 11, we should disable this. Remove this item once majority of workloads move to CUDA 12.

Test Plan: CI Tests

Reviewed By: xw285cornell

Differential Revision: D45921028

Pulled By: aaronenyeshi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101879
Approved by: https://github.com/xw285cornell
2023-05-19 21:47:07 +00:00
Aaron Enye Shi
e35323d6a7 [Profiler] Fix HTML plot output for profiler export_memory_timeline (#101316)
Summary: Wrap the PNG image of the memory plot inside an HTML body, so that the file can be easily opened or embedded in other frontends.

Test Plan:
CI Tests

# Ran locally on Resnet50:
{F988498243}
{F988498789}
https://www.internalfb.com/manifold/explorer/trace_stats/tree/749163530321413/tmpj3ifzs7r.pt.memorytl.html

Differential Revision: D45827509

Pulled By: aaronenyeshi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101316
Approved by: https://github.com/xuzhao9
2023-05-15 16:31:06 +00:00
David Berard
447a20fdb1 [profiler] provide torch.profiler._utils._init_for_cuda_graphs() as a workaround (#100441)
There are known issues with profiling cuda graphs - particularly, if you create a cuda graph before the first use of the profiler, and then run that cuda graph during profiling.

One workaround is to add `with profile(): pass` before creating the cuda graph that you want to profile later.

For convenience, we provide this function to apply the workaround. This also adds a test for the workaround, to ensure that it continues working.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100441
Approved by: https://github.com/Chillee, https://github.com/aaronenyeshi
2023-05-05 19:25:37 +00:00
Aaron Enye Shi
87b71e570e [Profiler] Support HTML plot output for profiler export_memory_timeline API (#99751)
Summary:
Support the file extension .html, which will include a PNG image of the plot embedded into an HTML file.

This allows users to avoid processing the timeline manually in their own frontend UI.

Test Plan:
CI Tests

Ran on resnet50 model and generated this html file w/ plot:
See attached html file: {F954232276}
Screenshot: {F954232469}

Differential Revision: D45152735

Pulled By: aaronenyeshi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99751
Approved by: https://github.com/davidberard98
2023-04-22 04:21:58 +00:00
Edward Z. Yang
9a8f71f23e Convert logging f-strings to use % format (#98697)
Codemod done with
https://gist.github.com/ezyang/2e8b0463cdc6be278478495b23ff0530 with
assistance from ChatGPT.
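The kind of rewrite the codemod applies can be sketched as follows (the logger name and capturing handler are illustrative):

```python
import logging

records = []

class Capture(logging.Handler):
    def emit(self, record):
        records.append(record.getMessage())

log = logging.getLogger("demo")
log.addHandler(Capture())
log.setLevel(logging.INFO)

x = 42
# Before: log.info(f"value={x}")  -- the f-string is built eagerly even when
# the level is disabled.  After the codemod, %-style args format lazily:
log.info("value=%s", x)
```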

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98697
Approved by: https://github.com/voznesenskym
2023-04-10 12:19:31 +00:00
Aaron Gokaslan
9c3fbe7475 [BE] Enable flake8-simplify checks (#97984)
Enable some sensible flake8-simplify rules. Mainly wanted to enable the SIM101, and `yield from` SIM103 checks. @kit1980 since you wanted to be tagged on this CI check.

Enabling this check also helped flag one logical bug so it's definitely beneficial (also fixed in this PR).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97984
Approved by: https://github.com/ezyang
2023-03-31 03:40:21 +00:00
Sergii Dymchenko
477f3f555f Simplify by using yield from (#97831)
The issues were found by SIM104 flake8-simplify in a local run.

I'll take a look on adding the check to the CI separately.
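The SIM104 simplification looks like this (function names are made up):

```python
def via_loop(items):
    for x in items:      # SIM104 flags this pattern
        yield x

def via_yield_from(items):
    yield from items     # the simplified, equivalent form
```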

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97831
Approved by: https://github.com/Skylion007
2023-03-29 19:15:24 +00:00
Huy Do
4a88f71f65 Fix potential naming clash when writing traces with tensorboard_trace_handler (#97392)
Fixes https://github.com/pytorch/pytorch/issues/82915

This rare flaky issue caught my attention today when it failed flakily on MacOS in https://github.com/pytorch/pytorch/actions/runs/4494182574/jobs/7906827531.  The test expected 3 traces to be written but got only 2 of them.

Looking a bit closer into the `tensorboard_trace_handler` function, it looks like there is a potential filename clash here.  The millisecond since epoch `"{}.{}.pt.trace.json".format(worker_name, int(time.time() * 1000))` is used as part of the name.  As `tensorboard_trace_handler` is used as a callback handle in the test, the names look too close to each other (1-millisecond apart), i.e.

```
huydo-mbp_13494.1679526197252.pt.trace.json
huydo-mbp_13494.1679526197253.pt.trace.json
huydo-mbp_13494.1679526197250.pt.trace.json
```

Switching to nanosecond reduces the chance of two or more of them having the same timestamp while keeping the naming convention intact, i.e. `huydo-mbp_13804.1679526325182878000.pt.trace.json`

I suspect that this is also the cause of Windows flakiness.
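A sketch of the naming scheme after the fix, assuming the format string quoted above:

```python
import time

def trace_file_name(worker_name):
    # time.time_ns() gives nanosecond resolution, so back-to-back traces are
    # far less likely to collide than with int(time.time() * 1000).
    return "{}.{}.pt.trace.json".format(worker_name, time.time_ns())

name = trace_file_name("huydo-mbp_13804")
```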

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97392
Approved by: https://github.com/malfet, https://github.com/aaronenyeshi
2023-03-23 16:53:11 +00:00
Will Constable
e8a722b9cb Fix missing dynamo cache lookup registration in profiler.profiler (#97305)
This follows https://github.com/pytorch/pytorch/pull/96199 and supports the 'other' profiler.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97305
Approved by: https://github.com/voznesenskym
2023-03-22 21:09:16 +00:00
Aaron Gokaslan
5471621497 [BE] Remove unnecessary dict comprehensions (#97116)
Removes unnecessary dict comprehensions, optimizing the creation of dicts from iterables
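An example of the pattern being removed (the data is made up):

```python
pairs = [("a", 1), ("b", 2)]

d1 = {k: v for k, v in pairs}  # flagged: unnecessary dict comprehension
d2 = dict(pairs)               # the dict constructor does the same directly

assert d1 == d2
```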

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97116
Approved by: https://github.com/kit1980
2023-03-20 00:56:57 +00:00
Aaron Enye Shi
1e6961586b [Profiler] Memory timeline to show actual timestamps (#96535)
Summary: Rather than starting the timeline at t=0, keep the actual timestamps of the memory events.

Test Plan: CI Tests

Reviewed By: leitian, chaekit

Differential Revision: D43807624

Pulled By: aaronenyeshi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96535
Approved by: https://github.com/davidberard98
2023-03-11 00:25:30 +00:00
Aaron Enye Shi
e948ba07d4 [Profiler] Add export_memory_timeline to save memory timeline plot to file (#96137)
Summary: Added the functionality to export the memory timeline plot as a list of times and sizes, which the post processing visualization can parse and plot.

Test Plan: CI Tests

Reviewed By: leitian, fengxizhou

Differential Revision: D43680760

Pulled By: aaronenyeshi

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96137
Approved by: https://github.com/chaekit
2023-03-10 18:20:25 +00:00
Horace He
5bbec680d7 Fix usages of contextmanager without finally (#96170)
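A hedged sketch of the pattern being fixed: a `@contextmanager` generator must wrap its `yield` in `try/finally`, otherwise cleanup is skipped when the with-body raises. The names below are illustrative:

```python
from contextlib import contextmanager

calls = []

@contextmanager
def profiled():  # illustrative stand-in for the fixed context managers
    calls.append("enter")
    try:
        yield
    finally:
        # Without the try/finally, an exception in the body would skip this.
        calls.append("exit")

try:
    with profiled():
        raise RuntimeError("boom")
except RuntimeError:
    pass
```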
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96170
Approved by: https://github.com/ngimel, https://github.com/malfet
2023-03-08 20:59:27 +00:00
David Berard
ed4b6d2113 [profiler] update docs with repeat=1 (#95085)
Specifying number of times to repeat is now required when defining the schedule.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95085
Approved by: https://github.com/aaronenyeshi
2023-02-21 21:09:10 +00:00
Aaron Gokaslan
0444a6c90a [BE] Remove deprecated logging warn method (#94708)
Swaps all logging.warn calls to logging.warning since the former is deprecated and even raises a deprecation warning now.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94708
Approved by: https://github.com/ezyang
2023-02-13 18:24:52 +00:00
Xiaodong Wang
88e16849db [pt2] Fix multiple races in log folder (#93407)
Summary:
There are a few races/permission errors in file creation; this fixes them.
OSS:
1. caffe2/torch/_dynamo/utils.py, get_debug_dir: multiple processes may conflict on it even though it uses microsecond timestamps. Adding the pid to it.
2. caffe2/torch/_dynamo/config.py: it may not be a correct assumption that we have write permission to the cwd.

Test Plan: sandcastle

Differential Revision: D42905908

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93407
Approved by: https://github.com/soumith, https://github.com/mlazos
2023-02-09 21:10:14 +00:00
Aaron Gokaslan
8fce9a09cd [BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308)
Apply parts of pyupgrade to torch (starting with the safest changes).
This PR only does two things: removes the need to inherit from object and removes unused future imports.
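The two rewrites amount to the following (the class is illustrative):

```python
# Before (Python 2 compatible):
#   from __future__ import print_function  # removed: redundant on Python 3
#   class Profiler(object): ...            # explicit `object` base removed
class Profiler:
    pass
```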

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-07 21:10:56 +00:00
albanD
e52786f3d1 Silence profiler error (#94013)
This is not 3.11-specific, but is much more likely in 3.11, I guess.
You can find other reports of it failing in 3.8 at https://github.com/pytorch/pytorch/issues/64345 as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94013
Approved by: https://github.com/malfet
2023-02-03 17:33:47 +00:00
Anupam Bhatnagar
f4b804eeaa Call profiler step via optimizer post hook (#90101)
This PR adds the `_profile_using_dynolog` function to `torch/__init__.py`. The `_profile_using_dynolog` method allows registering the optimizer step post hook. This is required to collect iteration based traces using dynolog.

Other related changes for tests to pass:
1. Updated `optimizer.pyi`
1. Updated `overrides.py`
1. The test `test_kineto_profiler_multiple_steppers` in `test_profiler.py` has been broken down into two cases:
     - `test_kineto_profiler_multiple_steppers_with_override_True` : this test uses the override argument
     - `test_kineto_profiler_multiple_steppers_with_override_False` : this test uses the environment variable
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90101
Approved by: https://github.com/albanD
2023-01-13 18:07:40 +00:00
Brian Coutinho
1d3e7fcc3b [pytorch profiler] Add step tracker logic to handle multiple sources of step increments (#90880)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90880

# Summary
Enables multiple step trackers. Previously we had only one place to mark that a step() had occurred in the program: the pytorch profiler step().
We are now working on adding an Optimizer step hook - https://github.com/pytorch/pytorch/issues/88446
- This could mean programs that already call profiler.step() every iteration can end up double-incrementing steps
- If a model uses multiple optimizers we can also count the step twice or more.

## Solution
We fix this by adding a layer of abstraction before calling step() to the kineto library. The idea is to maintain steps per requester in a dictionary
```
{
   "ProfilerStep": 100,  # triggered by profiler step() call
   "Optimizer1Step": 100,   # Optimizer 1 or 2 are just examples, could be SGD, Adam etc
   "Optimizer2Step": 100,
}
```
To figure out the global step count just take max on the dict values (100).
```
{
   "ProfilerStep": 100,
   "Optimizer1Step": 101,   # Optimizer1 got incremented first say
   "Optimizer2Step": 100,
}
```
Then global step count is 101

## Calling kineto
We only call the kineto step() function when global count increments.
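The dictionary scheme above can be sketched as follows; the class and method names are illustrative, not the actual tracker API:

```python
class StepTracker:
    def __init__(self):
        self.steps = {}    # steps per requester, e.g. {"ProfilerStep": 100}
        self.reported = 0  # last global count forwarded to kineto

    def increment(self, requester):
        self.steps[requester] = self.steps.get(requester, 0) + 1
        new_global = max(self.steps.values())  # global count = max over requesters
        if new_global > self.reported:
            self.reported = new_global
            return True    # stands in for the kineto step() call
        return False

tracker = StepTracker()
```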

# Test Plan:
Added a unit test
   buck2 run mode/dev-nosan caffe2/test:profiler

Differential Revision: D41751157

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90880
Approved by: https://github.com/chaekit
2022-12-20 00:48:01 +00:00
Edward Z. Yang
eef019c14a Lint rule to forbid direct use of logging.info/etc APIs (#90907)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90907
Approved by: https://github.com/jansel
2022-12-16 05:13:51 +00:00
Jagadish Krishnamoorthy
0a4e4de525 [ROCm] add case for FP32MatMulPattern skip property (#84077)
TF32 is not supported on ROCm, so the torch/profiler/_pattern_matcher.py FP32MatMulPattern should return False for ROCm instead of checking the results of torch.cuda.get_arch_list(). Otherwise, depending on the gfx arch running the test, test_profiler.py's test_profiler_fp32_matmul_pattern (__main__.TestExperimentalUtils) will fail.

Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84077
Approved by: https://github.com/jeffdaily, https://github.com/kit1980
2022-12-13 20:27:35 +00:00
Ram Rachum
351d73b97f Fix exception causes all over the codebase (#90271)
This is the continuation to #90134 and hopefully the final PR in this series.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90271
Approved by: https://github.com/kit1980
2022-12-07 04:29:00 +00:00
Taylor Robie
63e57280fc [Profiler] Memory profiler part 13: Add sizes to timeline. (#89356)
If we see an allocation the size is unambiguous. Otherwise we have to use sizes and strides to bound the underlying storage.
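The bound from sizes and strides can be sketched as: the storage must hold at least 1 + Σ (size_i - 1) * stride_i elements. A small illustrative helper (not the profiler's actual code):

```python
def storage_bound_numel(sizes, strides):
    # Lower bound on the number of elements the backing storage must hold for
    # a tensor viewed with these sizes/strides; the largest reachable offset
    # is sum((size - 1) * stride).  Zero-sized tensors need no storage.
    if any(s == 0 for s in sizes):
        return 0
    return 1 + sum((size - 1) * stride for size, stride in zip(sizes, strides))
```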

Differential Revision: [D40868660](https://our.internmc.facebook.com/intern/diff/D40868660/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89356
Approved by: https://github.com/chaekit
2022-12-02 03:55:22 +00:00
Taylor Robie
6727e537a7 [Profiler] Memory profiler part 12: Emit timeline of memory events. (#89355)
Add a simple interface to get a flat representation of the memory profile.

Differential Revision: [D40868663](https://our.internmc.facebook.com/intern/diff/D40868663/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89355
Approved by: https://github.com/chaekit
2022-12-02 03:55:22 +00:00
Taylor Robie
b709078dc6 [Profiler] Memory profiler part 11: Mark tensors created in the backward pass which don't correspond to parameters. (#88926)
There are various Tensors created in the backward pass which do not correspond to parameters. We don't want to mark these as gradients, but we do still want to convey as much information as possible. Thus, this PR introduces an AUTOGRAD_DETAIL category. (Which can be grouped with GRADIENT in visualization if one wishes to take a coarse grained view of the world.)

Differential Revision: [D40868661](https://our.internmc.facebook.com/intern/diff/D40868661/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88926
Approved by: https://github.com/chaekit
2022-11-27 12:20:30 +00:00