Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57397
Introduces two main classes in C++ runtime:
ScriptProfile is the implementation for enabling and disabling interpreter
profiling in C++. It should only be used from Python; the corresponding
Python API will be added in the next diff.
InstructionSpan is a utility class that instruments the execution of each
single instruction. A start timestamp is recorded in the constructor and an
end timestamp in the destructor; during destruction, the collected runtime
data is sent to all enabled ScriptProfile instances.
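A minimal sketch of the RAII pattern described above, with illustrative names (ProfileSink, enabledSinks) that are assumptions rather than the actual interpreter code:
```cpp
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical sink standing in for an enabled ScriptProfile instance.
struct ProfileSink {
  void record(int64_t start_ns, int64_t end_ns);
};

std::vector<ProfileSink*>& enabledSinks(); // assumed registry of enabled profiles

class InstructionSpanSketch {
 public:
  // Start timestamp is taken when the instruction begins executing.
  InstructionSpanSketch() : start_(std::chrono::steady_clock::now()) {}

  // End timestamp is taken in the destructor, which then reports the
  // interval to every enabled profile.
  ~InstructionSpanSketch() {
    const auto end = std::chrono::steady_clock::now();
    for (ProfileSink* sink : enabledSinks()) {
      sink->record(toNs(start_), toNs(end));
    }
  }

 private:
  static int64_t toNs(std::chrono::steady_clock::time_point tp) {
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
               tp.time_since_epoch())
        .count();
  }
  std::chrono::steady_clock::time_point start_;
};
```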
Test Plan:
build/bin/test_jit --gtest_filter='ScriptProfileTest.Basic'
Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D28133579
fbshipit-source-id: e7e30e96151367022793ab3ad323f01c51ad4a3b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57635
Note: this PR looks massive, but it's just one simple change, codemodded many times.
In many cases, a callback needs to access the value/error produced by the parent future. In Python this was easy because the callback was invoked with the parent future as an argument and could thus inspect it. In C++ the callbacks took no arguments, so in many cases we worked around this by capturing the future in its own callback. This is risky (it creates a reference cycle and thus a memory leak) and must be done carefully (spoiler: sometimes we weren't careful).
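A minimal sketch of the hazard and of the codemod's shape, using a hypothetical MiniFuture rather than the real ivalue::Future API:
```cpp
#include <functional>
#include <memory>

// Hypothetical stand-in for a future; not the real c10::ivalue::Future.
struct MiniFuture {
  int value = 0;
  std::function<void()> cb_old;            // old style: no arguments
  std::function<void(MiniFuture&)> cb_new; // new style: parent passed in
};

void risky(const std::shared_ptr<MiniFuture>& fut) {
  // The future stores a callback that captures the future itself:
  // a reference cycle, so neither is ever freed.
  fut->cb_old = [fut] { int v = fut->value; (void)v; };
}

void safe(const std::shared_ptr<MiniFuture>& fut) {
  // No self-capture needed: whoever invokes the callback passes the
  // parent future in as an argument, as in Python.
  fut->cb_new = [](MiniFuture& f) { int v = f.value; (void)v; };
}
```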
ghstack-source-id: 128296580
Test Plan: CI
Reviewed By: wanchaol
Differential Revision: D28178783
fbshipit-source-id: 6de02c4568be42123372edc008f630d5ddae0081
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56546
A code move of CodeImpl and Frame to the runtime/interpreter subdirectory, so
that they are easier to reuse and the interpreter code is easier to navigate.
Test Plan: Imported from OSS
Reviewed By: nikithamalgifb
Differential Revision: D28133580
fbshipit-source-id: 8de89a4e8e637836625e1ac1db95f0a3353da670
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os
def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname, "-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit", "--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56652
Previously, the interpreter did not drop prim::Constant values even when they were marked to be dropped.
Test Plan: Imported from OSS
Reviewed By: iseeyuan
Differential Revision: D27927413
fbshipit-source-id: 67cd52cf292e111be2830ccf93b0e7b089e49001
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54627
This is the simplest little fix to get the interpreter to preserve
NotImplementedError, so that the test suite doesn't start choking
on meta tensors not working in the interpreter. It is sound and correct,
but doesn't work for other c10::Error subclasses with special handling.
A more proper fix is requested at
https://github.com/pytorch/pytorch/issues/54612
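An illustrative shape of the fix (not the interpreter's exact code): catch the subclass before the generic c10::Error case so its type survives the rethrow.
```cpp
#include <stdexcept>

#include <c10/util/Exception.h>

void runFrame(); // stand-in for executing interpreter instructions

void runFrameGuarded() {
  try {
    runFrame();
  } catch (const c10::NotImplementedError&) {
    throw; // rethrown as-is, so callers can still catch NotImplementedError
  } catch (const c10::Error& e) {
    // other c10::Error subclasses with special handling still collapse here
    throw std::runtime_error(e.what());
  }
}
```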
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: wenleix, ngimel
Differential Revision: D27328666
Pulled By: ezyang
fbshipit-source-id: 483bef062de5a907d20e2d9e25eafe2d5197cf8d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54110
dictConstruct doesn't need to force its caller to hold a `shared_ptr<DictType>`. It also doesn't need to make extra `shared_ptr` copies into the `key_type` and `value_type` locals.
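The flavor of the change, as a sketch over a hypothetical DictTypeLike rather than the real dictConstruct:
```cpp
#include <memory>

struct DictTypeLike {
  std::shared_ptr<int> key_type;   // element types, simplified
  std::shared_ptr<int> value_type;
};

// before: copies the argument and both element types; three refcount bumps
void useCopies(std::shared_ptr<DictTypeLike> t) {
  std::shared_ptr<int> k = t->key_type;
  std::shared_ptr<int> v = t->value_type;
  (void)k; (void)v;
}

// after: borrow everything; zero refcount traffic
void useRefs(const DictTypeLike& t) {
  const std::shared_ptr<int>& k = t.key_type;
  const std::shared_ptr<int>& v = t.value_type;
  (void)k; (void)v;
}
```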
ghstack-source-id: 124150642
Test Plan: fitsships
Reviewed By: ezyang
Differential Revision: D27101782
fbshipit-source-id: 3c632ad9d8f1bd7bdf37f517a86aca27bd41548a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54076
If we don't constrain ourselves to use `torch::jit::pop`, we can avoid copying a string or moving IValues around.
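A hedged sketch of the difference (helper shapes per ATen's stack.h, but simplified):
```cpp
#include <string>

#include <ATen/core/stack.h>

using torch::jit::Stack;

void doSomething(const std::string& s); // stand-in consumer

// pop() must move the IValue off the stack into a local first.
void opWithPop(Stack& stack) {
  c10::IValue v = torch::jit::pop(stack);
  doSomething(v.toStringRef());
}

// peek() borrows the value in place; drop() discards it after use.
void opWithPeek(Stack& stack) {
  doSomething(torch::jit::peek(stack, 0, 1).toStringRef());
  torch::jit::drop(stack, 1);
}
```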
ghstack-source-id: 124040891
Test Plan:
existing tests
spot-checked regular interpreter assembly; seems better
Reviewed By: dhruvbird, walterddr
Differential Revision: D27087204
fbshipit-source-id: 7cf355dbcec31409bdb37afa09d7df85cf2a7e4b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54029
I found what appear to be some missed moves and/or extra copies in the JIT interpreter.
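The flavor of the fix, as an illustrative sketch rather than the interpreter's actual code:
```cpp
#include <string>
#include <utility>
#include <vector>

// When the source slot is dead after the instruction, moving steals its
// payload instead of copying it.
void pushCopy(std::vector<std::string>& stack, std::string& reg) {
  stack.push_back(reg); // copies the payload
}

void pushMove(std::vector<std::string>& stack, std::string& reg) {
  stack.push_back(std::move(reg)); // steals it; reg is left empty
}
```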
ghstack-source-id: 123958682
Test Plan:
Existing CI for correctness
Ran AdIndexer inline_cvr local_ro model benchmark with static_runtime off via
`env bin=/tmp/ptvsc2_predictor_bench.StaticDispatchModeFile static_runtime=0 caffe2=0 scripts/swolchok/static_runtime/inline_cvr/run_local_ro.sh`
before:
```
I0315 14:25:23.916893 3075680 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.01635. Iters per second: 983.914
I0315 14:26:05.536207 3080560 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.01689. Iters per second: 983.395
I0315 14:26:47.510561 3083335 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.02697. Iters per second: 973.737
I0315 14:27:29.024830 3086767 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.01326. Iters per second: 986.918
I0315 14:28:10.849496 3091323 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.023. Iters per second: 977.517
```
after:
```
I0315 14:17:43.280469 3046242 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 0.997838. Iters per second: 1002.17
I0315 14:18:24.244606 3046861 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.00173. Iters per second: 998.269
I0315 14:19:05.208899 3051998 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.00187. Iters per second: 998.136
I0315 14:19:46.103854 3055392 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 1.00073. Iters per second: 999.27
I0315 14:20:27.011411 3056062 PyTorchPredictorBenchLib.cpp:215] PyTorch run finished. Milliseconds per iter: 0.999121. Iters per second: 1000.88
```
(This was just a convenient workload I had handy; the plan of record is to use static runtime for inline_cvr inference AIUI.)
Reviewed By: dhruvbird, walterddr
Differential Revision: D27060762
fbshipit-source-id: 5567206d7c2d9ae99776ce5524caf09ec2035e87
Summary:
This is a second attempt to use the graph executor to run forward on a gradient. This gives a second chance to profile intermediate tensors introduced by autodiff.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52136
Reviewed By: pbelevich
Differential Revision: D26693978
Pulled By: Krovatkin
fbshipit-source-id: 91dde8009a210950af8e5173668ada241e16dd52
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50228
`fastmod -m 'expect(<((at|c10)::)?\w+Type>\(\)\s*)->' 'expectRef${1}.'`
Presuming it builds, this is a safe change: the result of `expect()`
wasn't being saved anywhere, so we didn't need the new `shared_ptr` and can
take a reference instead.
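The shape of the codemod, as a hedged example (assuming the expectRef accessor this change targets):
```cpp
#include <ATen/core/jit_type.h>

void example(const c10::TypePtr& t) {
  // before: expect<T>() returns a new shared_ptr<T> that is immediately
  // discarded after the call
  auto dtype_before = t->expect<c10::TensorType>()->scalarType();
  // after: expectRef<T>() returns a T&, so no refcount is touched
  auto dtype_after = t->expectRef<c10::TensorType>().scalarType();
  (void)dtype_before; (void)dtype_after;
}
```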
ghstack-source-id: 119782961
Test Plan: CI
Reviewed By: SplitInfinity
Differential Revision: D25837374
fbshipit-source-id: 86757b70b1520e3dbaa141001e7976400cdd3b08
Summary:
This adds guarding for DifferentiableGraph nodes in order to not depend on
It also bails out on required gradients for the CUDA fuser.
Fixes https://github.com/pytorch/pytorch/issues/49299
I still need to look into a handful of failing tests, but maybe it can serve as a basis for discussion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49433
Reviewed By: ngimel
Differential Revision: D25681374
Pulled By: Krovatkin
fbshipit-source-id: 8e7be53a335c845560436c0cceeb5e154c9cf296
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48868
Building on the previous diff, we can make `toTensor()` return a
`const Tensor&`, which should make it easier to avoid reference
counting.
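A small illustration of what the new return type enables (assuming the const-lvalue overload described above):
```cpp
#include <ATen/core/ivalue.h>

int64_t firstUse(const c10::IValue& iv) {
  // before: at::Tensor t = iv.toTensor(); bumped the tensor's refcount
  // after: borrow the tensor the IValue already owns
  const at::Tensor& t = iv.toTensor();
  return t.dim(); // use it without any atomic refcount traffic
}
```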
ghstack-source-id: 119327372
Test Plan: internal benchmarks.
Reviewed By: bwasti
Differential Revision: D25325379
fbshipit-source-id: ca699632901691bcee432f595f75b0a4416d55dd
Summary:
Adding a flag torch_jit_disable_warning_prints to optimize interpreter performance by suppressing a (potentially large) number of warnings.warn calls.
This works around TorchScript's warning behavior mismatch with Python. Python by default triggers a warning once per location, but TorchScript doesn't support that, so the same warning triggers and prints once per inference run, hurting performance.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49313
Reviewed By: SplitInfinity
Differential Revision: D25534274
Pulled By: gmagogsfm
fbshipit-source-id: eaeb57a335c3e6c7eb259671645db05d781e80a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47810
`bindSymbolicShapes` wasn't checking device or dtype at all, so it wasn't correct. It also isn't being used anywhere (num_profiles is always 1 and we don't use symbolic shapes). We shouldn't have it on until we are actually using symbolic shapes.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D25286214
Pulled By: eellison
fbshipit-source-id: 10fb175d0c75bd0159fb63aafc3b59cc5fd6c5af
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47550
I saw over 5% time spent in RecordFunction's ctor during one
of our framework overhead benchmarks in `perf`. Inspecting assembly,
it looks like we just create a lot of RecordFunctions and the
constructor has to initialize a relatively large number of member
variables.
This diff takes advantage of the observation that RecordFunction does
nothing most of the time by moving its state onto the heap and only
allocating it if needed. It does add the requirement that profiling is
actually active to use RecordFunction accessors, which I hope won't be
a problem.
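A minimal sketch of the lazy-state idea, with names that are assumptions rather than the actual RecordFunction members:
```cpp
#include <memory>

#include <c10/util/Exception.h>

bool profilingActive(); // assumed global check

struct ProfilerState { /* the many members moved off the hot path */ };

class RecordFunctionSketch {
 public:
  RecordFunctionSketch() {
    // Pay for the allocation only when profiling is actually on; in the
    // common inactive case the constructor just writes one null pointer.
    if (profilingActive()) {
      state_ = std::make_unique<ProfilerState>();
    }
  }

  // Accessors now require profiling to be active.
  const ProfilerState& state() const {
    TORCH_INTERNAL_ASSERT(state_, "only valid while profiling is active");
    return *state_;
  }

 private:
  std::unique_ptr<ProfilerState> state_; // nullptr when inactive
};
```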
ghstack-source-id: 117498489
Test Plan: Run framework overhead benchmarks. Savings ranging from 3% (InPlace_ndim_1) to 7.5% (empty_ndim_3) wall time.
Reviewed By: ilia-cher
Differential Revision: D24812213
fbshipit-source-id: 823a1e2ca573d9a8d7c5b7bb3972987faaacd11a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47549
In preparation for moving state onto the heap.
ghstack-source-id: 117027862
Test Plan: CI
Reviewed By: ilia-cher
Differential Revision: D24812214
fbshipit-source-id: 1455c2782b66f6a59c4d45ba58e1c4c92402a323
Summary:
By default, TorchScript execution is single-threaded and uses the caller's thread pool. For the use case of distributed inference, we want a way to customize this behavior so that the TorchScript interpreter can execute elsewhere. This diff allows passing an explicit taskLauncher to the TorchScript interpreter.
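A minimal sketch, assuming a std::function-based launcher shape like the one this diff describes (the pool API here is hypothetical):
```cpp
#include <functional>
#include <utility>

// Assumed launcher type: a callable that accepts interpreter continuations.
using TaskLauncher = std::function<void(std::function<void()>)>;

struct MyThreadPool {
  void enqueue(std::function<void()> fn); // hypothetical pool API
};

TaskLauncher makeLauncher(MyThreadPool& pool) {
  // Instead of running continuations on the caller's thread pool, hand them
  // to a pool the embedder controls (e.g. for distributed inference).
  return [&pool](std::function<void()> work) { pool.enqueue(std::move(work)); };
}
```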
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46865
Test Plan:
Unit tests pass.
Reviewed By: houseroad
Differential Revision: D24616102
Pulled By: garroud
fbshipit-source-id: 79202b62f92d0b0baf72e4bf7aa3f05e0da91d59
Summary:
This diff restores the previous behavior of silently allowing overflow when inserting instructions. The behavior was changed recently in https://github.com/pytorch/pytorch/issues/45382, but the change started to break some existing use cases that have overflow problems.
This restores the original behavior but emits a warning, to unblock existing use cases where overflow happens.
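An illustrative sketch of the restored behavior (the names and the 32-bit operand width are assumptions, not the interpreter's exact code):
```cpp
#include <cstdint>
#include <cstdio>
#include <limits>
#include <vector>

struct Instr {
  int32_t op;
  int32_t X; // operand field that can overflow
};

void insertInstruction(std::vector<Instr>& code, int32_t op, int64_t X) {
  if (X > std::numeric_limits<int32_t>::max() ||
      X < std::numeric_limits<int32_t>::min()) {
    // previously a hard error; now warn and keep the old silent truncation
    std::fprintf(stderr, "WARNING: instruction operand overflows int32\n");
  }
  code.push_back({op, static_cast<int32_t>(X)});
}
```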
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46369
Reviewed By: kwanmacher, wanchaol, fbhuba
Differential Revision: D24324345
Pulled By: gmagogsfm
fbshipit-source-id: 1c0fac421d4de38f070e21059bbdc1b788575bdf
Summary:
* Add a pass at the end of runCleanupPasses to annotate `aten::warn` so that each has its unique id
* Enhanced the interpreter so that it tracks which `aten::warn` instances have been executed before and skips them (see the sketch below)
* Improved insertInstruction so that it correctly checks for overflow
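An illustrative sketch of the once-per-warning tracking (names here are assumptions, not the interpreter's own): each annotated `aten::warn` carries an integer id, and ids that already fired are skipped.
```cpp
#include <cstdint>
#include <unordered_set>

struct WarnTracker {
  std::unordered_set<int64_t> seen;
  // Returns true only the first time a given warn id is encountered.
  bool shouldWarn(int64_t warn_id) {
    return seen.insert(warn_id).second;
  }
};
```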
Fixes https://github.com/pytorch/pytorch/issues/45108
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45382
Reviewed By: mrshenli
Differential Revision: D24060677
Pulled By: gmagogsfm
fbshipit-source-id: 9221bc55b9ce36b374bdf614da3fe47496b481c1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43684
This PR attempts to address #42560 by capturing the appropriate
exception_ptr in the autograd engine and passing it over to the Future.
As part of this change, there is a significant change to the Future API: we
now only accept an exception_ptr as part of setError.
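A sketch of the pattern this PR describes, with stand-in types (MiniFuture, runBackwardNode) rather than the real autograd/Future classes:
```cpp
#include <exception>
#include <memory>
#include <utility>

// Minimal stand-ins for this sketch only.
struct MiniFuture {
  std::exception_ptr err;
  void setError(std::exception_ptr e) { err = std::move(e); }
};

void runBackwardNode(); // assume this may throw

void runNodeAndReport(const std::shared_ptr<MiniFuture>& fut) {
  try {
    runBackwardNode();
  } catch (...) {
    // std::current_exception() preserves the original exception's type and
    // message, so the backtrace-relevant information survives the thread hop.
    fut->setError(std::current_exception());
  }
}
```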
For the example in #42560, the exception trace would now look like:
```
> Traceback (most recent call last):
> File "test_autograd.py", line 6914, in test_preserve_backtrace
> Foo.apply(t).sum().backward()
> File "torch/tensor.py", line 214, in backward
> torch.autograd.backward(self, gradient, retain_graph, create_graph)
> File "torch/autograd/__init__.py", line 127, in backward
> allow_unreachable=True) # allow_unreachable flag
> File "torch/autograd/function.py", line 87, in apply
> return self._forward_cls.backward(self, *args)
> File "test_autograd.py", line 6910, in backward
> raise ValueError("something")
> ValueError: something
```
ghstack-source-id: 111109637
Test Plan: waitforbuildbot
Reviewed By: albanD
Differential Revision: D23365408
fbshipit-source-id: 1470c4776ec8053ea92a6ee1663460a3bae6edc5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43635
Intern the symbol; no functional changes. Aliasing needs to be looked at, but that should be done in a separate PR; this PR just changes the symbol.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D23358806
Pulled By: eellison
fbshipit-source-id: f18bcd142a0daf514136f019ae607e4c3f45d9f8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43633
In the backward graph, _grad_sum_to_size is inserted whenever a possibly broadcasting op is called:
`"aten::_grad_sum_to_size(Tensor(a) self, int[]? size) -> Tensor(a)"`
If a broadcast occurred, a sum is called; otherwise the second input is None and it is a no-op. Most of the time it's a no-op (in the fast RNNs benchmark, > 90% of the time).
We can get rid of this op by profiling the optionality of the second input. I added `prim::profile_optional` to do this, which counts the number of times it saw a None value and the number of times it saw a value present. When specializing the backward graph, we insert checks for values we profiled as None, and in the optimized block can remove the grad_sum_to_size calls that use those values.
In the future we may revisit this when NNC supports reductions and we want to replace grad_sum_to_size with sums as well, but I think this is worth landing now.
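A sketch of what profiling the optionality could look like (illustrative, not the actual prim::profile_optional implementation):
```cpp
#include <cstddef>

#include <ATen/core/ivalue.h>

struct OptionalityProfile {
  size_t none_count = 0;
  size_t present_count = 0;

  void record(const c10::IValue& v) {
    v.isNone() ? ++none_count : ++present_count;
  }

  // If the input was always None, the specialized backward graph can guard
  // on None-ness and drop the _grad_sum_to_size call entirely.
  bool alwaysNone() const {
    return present_count == 0 && none_count > 0;
  }
};
```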
Test Plan: Imported from OSS
Reviewed By: bwasti, ZolotukhinM
Differential Revision: D23358809
Pulled By: eellison
fbshipit-source-id: a30a148ca581370789d57ba082d23cbf7ef2cd4d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37034
c10 takes a Stack* in boxed functions while JIT took a Stack&.
c10 returns nothing while JIT returned an int that was always zero.
This changes JIT to follow the c10 behavior.
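The two conventions side by side, as declarations only (shapes hedged from the description above):
```cpp
#include <vector>

#include <ATen/core/ivalue.h>

using Stack = std::vector<c10::IValue>;

// JIT's old boxed-function signature: reference in, int out (always zero).
int oldStyleOp(Stack& stack);

// c10's signature, which JIT now adopts: pointer in, nothing out.
void newStyleOp(Stack* stack);
```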
ghstack-source-id: 106834069
Test Plan: unit tests
Differential Revision: D20567950
fbshipit-source-id: 1a7aea291023afc52ae706957e9a5ca576fbb53b
Summary:
**Summary**
This commit adds support for with statements to PyTorch JIT. Each
of the with items in a with statement is represented in the JIT IR
as a pair of `prim::Enter` and `prim::Exit` nodes that call the
`__enter__` and `__exit__` methods defined on the context manager objects
returned by the expressions in the with item.
**Testing**
This commit adds unit tests for with statements with named with items,
nameless with items, and with statements that encounter exceptions.
```
$ python test/test_jit.py TestWith.test_with_as
Fail to import hypothesis in common_utils, tests are not derandomized
.
----------------------------------------------------------------------
Ran 1 test in 0.430s
OK
```
```
$ python test/test_jit.py TestWith.test_with_no_as
Fail to import hypothesis in common_utils, tests are not derandomized
.
----------------------------------------------------------------------
Ran 1 test in 0.264s
OK
```
```
$ python test/test_jit.py TestWith.test_with_exceptions
Fail to import hypothesis in common_utils, tests are not derandomized
Couldn't download test skip set, leaving all tests enabled...
.
----------------------------------------------------------------------
Ran 1 test in 1.053s
OK
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34705
Differential Revision: D22095945
Pulled By: SplitInfinity
fbshipit-source-id: f661565a834786725259b8ea014b4d7532f9419d