Summary: Add more details about the export_memory_timeline API, as we've landed new representations of the memory timeline data.
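For context, a minimal usage sketch (model, inputs, and the output path are placeholders; the export format is chosen by the file extension, per the commits below):
```
import torch
from torch.profiler import profile, ProfilerActivity

# memory profiling needs these three flags enabled
with profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
    profile_memory=True,
    record_shapes=True,
    with_stack=True,
) as prof:
    model(inputs)  # placeholder workload

# writes an HTML page with an embedded plot; other suffixes
# (e.g. raw.json.gz) select other representations
prof.export_memory_timeline("memory_timeline.html", device="cuda:0")
```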
Test Plan: CI, should be no functional change, as we only changed comments.
Differential Revision: D50123450
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110949
Approved by: https://github.com/davidberard98
`_enable_dynamo_cache_lookup_profiler` used to be toggled when running `__enter__` or `__exit__` on the profiler. But it's possible to turn the profiler on and off without the context manager (e.g. with a schedule and calling `.step()`). Instead, we should put these calls (which are supposed to run when the profiler turns on/off) where `_enable_profiler()` and `_disable_profiler()` are called.
This puts `_enable_dynamo_cache_lookup_profiler` and `_set_is_profiler_enabled` into `_run_on_profiler_(start|stop)` and calls those in the 3 places where `_(enable|disable)_profiler` is called.
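A rough sketch of the resulting shape (the bodies are illustrative, not the exact implementation):
```
# hooks that must run whenever the profiler toggles, regardless of whether
# the context manager or a schedule triggered it
def _run_on_profiler_start():
    _enable_dynamo_cache_lookup_profiler(True)
    _set_is_profiler_enabled(True)

def _run_on_profiler_stop():
    _enable_dynamo_cache_lookup_profiler(False)
    _set_is_profiler_enabled(False)

# each is called alongside the three _enable_profiler()/_disable_profiler() sites
```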
Differential Revision: [D48619818](https://our.internmc.facebook.com/intern/diff/D48619818)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107720
Approved by: https://github.com/wconstab
We hope PyTorch's profiling parsing capabilities can also apply to custom devices. Building on the previous work in https://github.com/pytorch/pytorch/pull/101554, we have made supplementary updates to PyTorch profiling to extend its parsing capabilities to custom devices. These modifications do not affect the original logic of the code and mainly cover the following aspects (a usage sketch follows the list):
1. Added the relevant logic for use_device in torch.profiler.profiler._KinetoProfile.
2. In torch.autograd.profiler and torch.autograd.profiler_util, the ability to parse custom-device profiling data has been added, based on the privateuse1 and use_device attributes.
3. In torch._C._autograd.pyi, custom-device-related attributes have been added. The underlying C++
logic will be added in subsequent pull requests.
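A hedged usage sketch, assuming a registered privateuse1 backend named "foo" (the backend name and workload are hypothetical):
```
import torch

# hypothetical: a custom backend registered under the privateuse1 key
torch.utils.rename_privateuse1_backend("foo")

# use_device routes parsing of the collected events to the custom device
with torch.autograd.profiler.profile(use_device="foo") as prof:
    run_model()  # placeholder workload on the custom device

print(prof.key_averages().table(sort_by="self_cpu_time_total"))
```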
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106142
Approved by: https://github.com/aaronenyeshi
Summary: Return early if we can easily determine that the operator's qualified name is invalid before attempting to retrieve the schema. In particular, "::" should always be present. A quick estimate shows that this is >50x faster (100 us -> 2 us).
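A sketch of the idea (names are illustrative):
```
def get_schema_or_none(qualified_op_name: str):
    # cheap validity check: a qualified name is always "namespace::op"
    if "::" not in qualified_op_name:
        return None
    # only fall through to the expensive retrieval for plausible names
    return lookup_schema(qualified_op_name)  # hypothetical expensive call
```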
Test Plan: CI
Differential Revision: D47562587
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105495
Approved by: https://github.com/aaronenyeshi
Summary:
Rather than processing the events into a time-and-sizes plot, dump the actual events as (timestamp, action, num of bytes, category) when the output file name ends in `raw.json.gz`.
This allows downstream analysis tools to process these events. It also avoids having to control the granularity of the earlier json.gz output in the memory profiler.
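For example, a minimal sketch (model and inputs are placeholders; the suffix selects the raw event dump):
```
import torch

with torch.profiler.profile(
    profile_memory=True, record_shapes=True, with_stack=True
) as prof:
    model(inputs)  # placeholder workload

# each dumped event is (timestamp, action, num of bytes, category)
prof.export_memory_timeline("memory_events.raw.json.gz", device="cuda:0")
```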
Test Plan: CI Tests
Differential Revision: D47416544
Pulled By: aaronenyeshi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105094
Approved by: https://github.com/davidberard98
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)
that were reverted due to a conflict with the internal source repo.
Mostly fixes for PEP-484 violations (i.e. when a default arg is set to None but the type is not annotated as Optional); see the sketch after the TODO list below.
Plus a few real fixes:
- Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
- Add missing return statement to `torch._export.deserialize_graph`
- Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
- Add an assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
- Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
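For reference, the typical PEP-484 fix looks like this (hypothetical function):
```
from typing import Optional

# before: default is None, but the annotation does not allow it
def load_checkpoint(path: str = None): ...

# after: the implicit Optional is made explicit
def load_checkpoint(path: Optional[str] = None): ...
```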
Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add a hack to `.ci/docker/install_conda.sh` to squash the older libstdc++ from the conda environment in favor of the one from the OS
- Update bazel CUDA builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link with cupti statically, but I could not find where that is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
Not sure how it worked before, but arguments must be annotated as Optional if they are defaulted to None.
Towards enabling mypy-1.4.1 in lintrunner
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105022
Approved by: https://github.com/izaitsevfb, https://github.com/huydhn, https://github.com/Skylion007
This fixes a bug in profiler code, exposed by https://github.com/pytorch/pytorch/pull/104368, that relied on the fact that `import torch._dynamo` also imports `torch._inductor.config`:
```
$ python -c "import torch._inductor;print(torch._inductor.config)"
Traceback (most recent call last):
File "<string>", line 1, in <module>
AttributeError: module 'torch._inductor' has no attribute 'config'
(base) $ python -c "import torch._dynamo;print(torch._inductor.config)"
<module 'torch._inductor.config' from '/home/nshulga/git/pytorch/pytorch/torch/_inductor/config.py'>
```
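A minimal sketch of the defensive fix, assuming the profiler code imports the submodule explicitly rather than relying on the side effect shown above:
```
# import the config submodule directly instead of assuming that something
# else (e.g. `import torch._dynamo`) has already pulled it in as a side effect
import torch._inductor.config as inductor_config  # noqa: F401
```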
### Testing
D47159397
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104477
Approved by: https://github.com/aaronenyeshi, https://github.com/malfet
Summary: Trigger tracing for MTIA events on the Python side when ProfilerActivity.MTIA is specified.
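A hedged usage sketch (assumes a build where the MTIA activity is available; the workload is a placeholder):
```
from torch.profiler import profile, ProfilerActivity

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.MTIA]) as prof:
    run_kernel_add()  # placeholder for the MTIA workload

print(prof.key_averages().table())
```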
Test Plan:
Test diff: D45437426
```
hg graft D45437426
```
- in one terminal
```
cd ~/fbsource/fbcode
buck2 run -j 8 \
//infra_asic_fpga/firmware/tools/mad/service:mad_service
```
- in another terminal
PyTorch profiler
```
buck run mode/dev-nosan -j 8 //caffe2/torch/fb/acc_runtime/afg/tests:test_afg -- -m kernel_add
```
Differential Revision: D46122853
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102288
Approved by: https://github.com/aaronenyeshi
Summary: Since CUPTI lazy re-init crashes with CUDA Graphs in CUDA 11, we should disable it. Remove this workaround once the majority of workloads move to CUDA 12.
Test Plan: CI Tests
Reviewed By: xw285cornell
Differential Revision: D45921028
Pulled By: aaronenyeshi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101879
Approved by: https://github.com/xw285cornell
There are known issues with profiling cuda graphs - particularly, if you create a cuda graph before the first use of the profiler, and then run that cuda graph during profiling.
One workaround is to add `with profile(): pass` before creating the cuda graph that you want to profile later.
For convenience, we provide this function to apply the workaround. This also adds a test for the workaround, to ensure that it continues working.
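The workaround itself is small; a sketch (model and inputs are placeholders):
```
import torch
from torch.profiler import profile

# warm up the profiler once before any CUDA graph is captured
with profile():
    pass

g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    static_out = model(static_input)  # placeholder capture

# profiling a later replay of `g` now works
with profile() as prof:
    g.replay()
```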
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100441
Approved by: https://github.com/Chillee, https://github.com/aaronenyeshi
Summary:
Support the file extension .html, which will include a PNG image of the plot embedded into an HTML file.
This allows users to avoid processing the timeline manually in their own frontend UI.
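For instance (a sketch, assuming `prof` is a completed memory-profiling run on a CUDA device):
```
# embeds a PNG image of the memory timeline plot into a standalone HTML file
prof.export_memory_timeline("memory_timeline.html", device="cuda:0")
```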
Test Plan:
CI Tests
Ran on resnet50 model and generated this html file w/ plot:
See attached html file: {F954232276}
Screenshot: {F954232469}
Differential Revision: D45152735
Pulled By: aaronenyeshi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99751
Approved by: https://github.com/davidberard98
Enable some sensible flake8-simplify rules. I mainly wanted to enable the SIM101 and `yield from` (SIM103) checks. @kit1980 since you wanted to be tagged on this CI check.
Enabling this check also helped flag one logical bug so it's definitely beneficial (also fixed in this PR).
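For example, SIM101 flags repeated `isinstance` calls that can be merged (illustrative snippet):
```
# before (SIM101): duplicate isinstance calls on the same variable
if isinstance(x, int) or isinstance(x, float):
    handle_number(x)

# after: a single call with a tuple of types
if isinstance(x, (int, float)):
    handle_number(x)
```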
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97984
Approved by: https://github.com/ezyang
Fixes https://github.com/pytorch/pytorch/issues/82915
This rare flaky issue caught my attention today when it failed flakily on MacOS in https://github.com/pytorch/pytorch/actions/runs/4494182574/jobs/7906827531. The test expected 3 traces to be written but got only 2 of them.
Looking a bit closer at the `tensorboard_trace_handler` function, there is a potential filename clash: the millisecond-since-epoch timestamp `"{}.{}.pt.trace.json".format(worker_name, int(time.time() * 1000))` is used as part of the name. As `tensorboard_trace_handler` is used as a callback handle in the test, the names end up too close to each other (1 millisecond apart), i.e.
```
huydo-mbp_13494.1679526197252.pt.trace.json
huydo-mbp_13494.1679526197253.pt.trace.json
huydo-mbp_13494.1679526197250.pt.trace.json
```
Switching to nanoseconds reduces the chance of two or more of them having the same timestamp while keeping the naming convention intact, i.e. `huydo-mbp_13804.1679526325182878000.pt.trace.json`
I suspect that this is also the cause of Windows flakiness.
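In code, the change is essentially (a sketch, assuming `worker_name` as built by the handler):
```
import time

# before: millisecond resolution; back-to-back traces can collide
file_name = "{}.{}.pt.trace.json".format(worker_name, int(time.time() * 1000))

# after: nanosecond resolution keeps the naming convention but makes
# collisions between consecutive traces far less likely
file_name = "{}.{}.pt.trace.json".format(worker_name, time.time_ns())
```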
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97392
Approved by: https://github.com/malfet, https://github.com/aaronenyeshi
Summary: Rather than starting the timeline at t=0, keep the actual timestamps of the memory events.
Test Plan: CI Tests
Reviewed By: leitian, chaekit
Differential Revision: D43807624
Pulled By: aaronenyeshi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96535
Approved by: https://github.com/davidberard98
Summary: Added the functionality to export the memory timeline plot as a list of times and sizes, which the post processing visualization can parse and plot.
Test Plan: CI Tests
Reviewed By: leitian, fengxizhou
Differential Revision: D43680760
Pulled By: aaronenyeshi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96137
Approved by: https://github.com/chaekit
Summary:
There are a few races/permission errors in file creation; fixing:
OSS:
1. caffe2/torch/_dynamo/utils.py, get_debug_dir: multiple processes may conflict on the directory name even though it uses microsecond timestamps; add the pid to it (a sketch follows the list)
2. caffe2/torch/_dynamo/config.py: it may not be a correct assumption that we have write permission to the cwd
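A sketch of the first fix (names and timestamp format are illustrative):
```
import os
import time

def get_debug_dir(root: str) -> str:
    # include the pid so concurrent processes never race on the same
    # directory, even if they start within the same timestamp tick
    dir_name = f"run_{time.strftime('%Y_%m_%d_%H_%M_%S')}_{os.getpid()}"
    return os.path.join(root, dir_name)
```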
Test Plan: sandcastle
Differential Revision: D42905908
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93407
Approved by: https://github.com/soumith, https://github.com/mlazos
This PR adds the `_profile_using_dynolog` function to `torch/__init__.py`. It allows registering the optimizer step post-hook, which is required to collect iteration-based traces using dynolog.
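A hedged sketch of the hook registration (assumes the global post-hook helper in torch.optim; the hook body is illustrative):
```
from torch.optim.optimizer import register_optimizer_step_post_hook

def _on_optimizer_step(optimizer, args, kwargs):
    # e.g. tell the external tracer (dynolog) that one iteration finished
    ...

handle = register_optimizer_step_post_hook(_on_optimizer_step)
# handle.remove() unregisters the hook
```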
Other related changes for tests to pass:
1. Updated `optimizer.pyi`
2. Updated `overrides.py`
3. The test `test_kineto_profiler_multiple_steppers` in `test_profiler.py` has been broken down into two cases:
- `test_kineto_profiler_multiple_steppers_with_override_True` : this test uses the override argument
- `test_kineto_profiler_multiple_steppers_with_override_False` : this test uses the environment variable
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90101
Approved by: https://github.com/albanD
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90880
# Summary
Enables multiple step trackers. Previously we only had one place to mark that a step() had occurred in the program: the PyTorch profiler's step().
We are now working on adding an Optimizer step hook - https://github.com/pytorch/pytorch/issues/88446
- This could mean programs that already call profiler.step() every iteration can end up double incrementing steps
- If a model uses multiple optimizers we can also have double or more counting of the step.
## Solution
We fix this by adding a layer of abstraction before calling step() on the kineto library. The idea is to maintain steps per requester in a dictionary (see the sketch after the examples):
```
{
"ProfilerStep": 100, # triggered by profiler step() call
"Optimizer1Step": 100, # Optimizer 1 or 2 are just examples, could be SGD, Adam etc
"Optimizer2Step": 100,
}
```
To figure out the global step count, just take the max of the dict values (100).
```
{
"ProfilerStep": 100,
"Optimizer1Step": 101, # Optimizer1 got incremented first say
"Optimizer2Step": 100,
}
```
Then the global step count is 101.
## Calling kineto
We only call the kineto step() function when global count increments.
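A minimal sketch of this abstraction (illustrative, not the exact class):
```
class StepTracker:
    """Per-requester step counts; the global step is their max."""
    _step_dict: dict = {}
    _global_step: int = 0

    @classmethod
    def increment_step(cls, requester: str) -> int:
        cls._step_dict[requester] = cls._step_dict.get(requester, 0) + 1
        new_global = max(cls._step_dict.values())
        if new_global > cls._global_step:
            cls._global_step = new_global
            kineto_step()  # hypothetical: notify kineto only on a real advance
        return cls._global_step
```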
# Test Plan:
Added a unit test
buck2 run mode/dev-nosan caffe2/test:profiler
Differential Revision: D41751157
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90880
Approved by: https://github.com/chaekit
TF32 is not supported on ROCm, hence torch/profiler/_pattern_matcher.py's FP32MatMulPattern should return False for ROCm instead of checking the results of torch.cuda.get_arch_list(). Otherwise, depending on the gfx arch running the test, test_profiler.py's test_profiler_fp32_matmul_pattern (__main__.TestExperimentalUtils) will fail.
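A sketch of the guard (illustrative; the non-ROCm branch stands in for the existing arch check):
```
import torch

def fp32_matmul_pattern_skip() -> bool:
    # TF32 is not supported on ROCm, so skip the pattern there outright
    if torch.version.hip is not None:
        return True
    # otherwise fall back to the arch-list check (illustrative)
    return not any("sm_8" in arch for arch in torch.cuda.get_arch_list())
```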
Signed-off-by: Jagadish Krishnamoorthy <jagdish.krishna@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84077
Approved by: https://github.com/jeffdaily, https://github.com/kit1980
There are various Tensors created in the backward pass which do not correspond to parameters. We don't want to mark these as gradients, but we do still want to convey as much information as possible. Thus, this PR introduces an AUTOGRAD_DETAIL category. (Which can be grouped with GRADIENT in visualization if one wishes to take a coarse grained view of the world.)
Differential Revision: [D40868661](https://our.internmc.facebook.com/intern/diff/D40868661/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88926
Approved by: https://github.com/chaekit