Without profiled outputs, autodiff can't tell whether the outputs of a DifferentiableGraph should have requires_grad set. With no profiled information, autodiff defaulted to requires_grad=True, marking tensors as requiring grad when they shouldn't. This adds requires_grad info onto the type of the output when it can be found in later uses of the output.
Adds a test for correct autodiff requires_grad behavior, and a test that the output type is correctly annotated in create_autodiff_subgraphs.
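As a rough illustration (not the PR's actual test; the function and warm-up count below are made up), the behavior being fixed looks like this:
```
import torch

@torch.jit.script
def fn(x, y):
    # x requires grad, y does not; the second output should not require grad.
    return x * 2, y * 2

x = torch.randn(4, requires_grad=True)
y = torch.randn(4, requires_grad=False)

# Warm up so the profiling executor records requires_grad info and forms
# DifferentiableGraph nodes.
for _ in range(3):
    a, b = fn(x, y)

assert a.requires_grad        # depends on x
assert not b.requires_grad    # depends only on y; without the fix this could wrongly be True
```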
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79498
Approved by: https://github.com/eellison
Without profiled outputs, autodiff can't tell whether the outputs of a DifferentiableGraph should have requires_grad set. With no profiled information, autodiff defaulted to requires_grad=True, marking tensors as requiring grad when they shouldn't. This adds requires_grad info onto the type of the output when it can be found in later uses of the output.
Adds a test for correct autodiff requires_grad behavior, and a test that the output type is correctly annotated in create_autodiff_subgraphs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78875
Approved by: https://github.com/eellison
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57575
This PR does two things:
1. reverts "Manual revert of D27369251 (f88a3fff65) (#56080)" in commit
92a09fb87a.
2. fixes DifferentiableGraph outputs carrying the wrong requires_grad flag.
To fix requires_grad on outputs from a DifferentiableGraph, the proper flag is
retrieved from profiling information. Previously we only retrieved the profiling
information from the first profile node among all uses of an output. However, when
control flow is present, we need to iteratively search for a profile node whose
profiling information is available, since the first use might be in an inactive
code path.
e.g.
```
graph(%0 : Tensor,
      %1 : Bool):
  ..., %2 : Tensor = prim::DifferentiableGraph_0(%0)
  %3 : Tensor = prim::If(%1)
    block0():
      %4 : Tensor = prim::DifferentiableGraph_1(%2)
      -> (%4)
    block1():
      %5 : Tensor = prim::DifferentiableGraph_2(%2)
      -> (%5)
  -> (%3)
with prim::DifferentiableGraph_0 = graph(%0 : Tensor):
  ...
  %out : Tensor = aten::operation(...)
  ...
  return (..., %out)
with prim::DifferentiableGraph_1 = graph(%0 : Tensor):
  %temp : Tensor = prim::profile[profiled_type=Tensor](%0)
  ...
with prim::DifferentiableGraph_2 = graph(%0 : Tensor):
  %temp : Tensor = prim::profile[profiled_type=Float(...)](%0)
  ...
```
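For reference, a hypothetical TorchScript function shaped like the IR above (not taken from the PR's tests): the output of the first differentiable block is used in both branches of an `if`, so its first use may sit in a branch that was never executed during profiling.
```
import torch

@torch.jit.script
def fn(x: torch.Tensor, flag: bool) -> torch.Tensor:
    out = x.relu() * 2        # ends up in DifferentiableGraph_0
    if flag:                  # prim::If
        return out.sin() + 1  # DifferentiableGraph_1, first use of %out
    else:
        return out.cos() + 1  # DifferentiableGraph_2, second use of %out

# If flag is always False while profiling, only the else-branch's prim::profile
# node carries real requires_grad information, so the pass has to keep searching
# past the unprofiled first use.
for _ in range(3):
    fn(torch.randn(4, requires_grad=True), False)
```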
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D29038773
Pulled By: Krovatkin
fbshipit-source-id: 6c0a851119f6b8f2f1afae5c74532407aae238fe
Summary:
Fixes https://github.com/pytorch/pytorch/issues/54783
We need to be extra careful with the pattern in order to legitimately use `unchecked_unwrap_optional` in autodiff.
This at least allows us to start supporting `Optional[Tensor]` in autodiff, which is quite common in composite layers.
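A hedged sketch of the kind of composite pattern this is aimed at (names are illustrative, not from the PR): an `Optional[Tensor]` argument, such as an optional bias, that is unwrapped behind a None-check before use.
```
from typing import Optional

import torch

@torch.jit.script
def linear_like(x: torch.Tensor, w: torch.Tensor, b: Optional[torch.Tensor]) -> torch.Tensor:
    out = x.matmul(w.t())
    if b is not None:   # the None-check refines Optional[Tensor] to Tensor
        out = out + b
    return out

x = torch.randn(2, 3, requires_grad=True)
w = torch.randn(4, 3, requires_grad=True)
print(linear_like(x, w, None).shape)            # works without a bias
print(linear_like(x, w, torch.randn(4)).shape)  # and with one
```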
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55565
Reviewed By: ejguan
Differential Revision: D27825336
Pulled By: Krovatkin
fbshipit-source-id: a8562eb10ea741effff430d7417d313b1eb53dfe
Summary:
Retrieving the profile node is much easier prior to inserting the guard node.
Test cases are updated to reflect the patch on previously failing cases.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55701
Reviewed By: pbelevich
Differential Revision: D27701216
Pulled By: Krovatkin
fbshipit-source-id: e2e6b64b682377e622b75c762e85ff7967e45118
Summary:
Fixes https://github.com/pytorch/pytorch/issues/54040
`prim::RequiresGradCheck` guarantees that the requires_grad properties
of input tensors match the profiled ones; otherwise a fallback path
is triggered. This allows us to prune gradients in the backward
graph for inputs that don't need them. We transfer requires_grad
properties from the inputs of the `prim::DifferentiableGraph` node onto the inputs of the
differentiable graph. Autodiff inspects these properties and prunes
off gradients that aren't required.
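An illustrative sketch of the effect (not the PR's test; the warm-up count is an assumption): once requires_grad is profiled and guarded, no gradient should be produced for an input that doesn't require grad.
```
import torch

@torch.jit.script
def fn(a, b):
    return (a * b).sum()

a = torch.randn(5, requires_grad=True)
b = torch.randn(5, requires_grad=False)

for _ in range(3):   # warm up the profiling executor
    out = fn(a, b)
out.backward()

assert a.grad is not None   # gradient still flows to a
assert b.grad is None       # pruned: b never required grad
```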
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54374
Reviewed By: H-Huang
Differential Revision: D27369251
Pulled By: Krovatkin
fbshipit-source-id: 2bce7a2d7f2ec091db9bf4c4b91d8b29edd5be11
Summary:
This adds guarding for DifferentiableGraph nodes in order to not depend on
It also bails out on required gradients for the CUDA fuser.
Fixes https://github.com/pytorch/pytorch/issues/49299
A handful of failing tests still need investigation, but this can serve as a basis for discussion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49433
Reviewed By: ngimel
Differential Revision: D25681374
Pulled By: Krovatkin
fbshipit-source-id: 8e7be53a335c845560436c0cceeb5e154c9cf296
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42141
Update the alias db in place instead of constructing it from scratch on each change, which caused O(n^2) behavior.
Description from https://github.com/pytorch/pytorch/pull/37106 holds pretty well:
"""
Recomputing the aliasdb on every fusion iteration + in every subblock
is hugely expensive. Instead, update it in-place when doing fusion.
The graph fuser pass operates by pushing nodes into a fusion group. So
we start with
`x, y = f(a, b, c)`
and end with:
```
x_out, y_out = prim::fusionGroup(a, b, c)
x_in, y_in = f(a_in, b_in, c_in)
-> x_in, y_in
```
We destroy the x and y Value*s in the process. This operation is
easy to express as an update to the aliasDb--x_out just takes on all
the aliasing information x used to have. In particular, since we know
f and prim::fusionGroup are purely functional, we don't have to mess
with any write information.
"""
The one difficulty here is that mapping x, y to x_out, y_out is not trivial when merging nodes into the autodiff subgraph node.
There are a few options:
- attempt to make all subgraph utils & ir cloning logic update a map
- mirror the subgraph utils implementation in create_autodiff_subgraph
- uniquely map x, y and x_in, y_in so you can back out the correspondence.
I went with the third option.
This shouldn't affect the results of the pass at all. Let me know if you think there's anything else I should be doing to test; I was considering exposing an option to run create_autodiff_subgraphs without the post processor and checking that the alias db was correctly updated.
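Purely as an illustration of option 3 (this is not the pass's C++ code; every name here is hypothetical), the idea of tagging values with unique ids so the old-to-new correspondence can be backed out afterwards looks roughly like:
```
import itertools

_uid = itertools.count()

def tag(values):
    # Attach a fresh unique id to every value we are about to clone.
    return {next(_uid): v for v in values}

def back_out_correspondence(old_by_id, new_by_id):
    # Pair old and new values that carry the same unique id.
    return {old_by_id[i]: new_by_id[i] for i in old_by_id}

def transfer_alias_info(alias_db, correspondence):
    # Each new value simply takes on all the aliasing information the old value
    # had; since the ops involved are purely functional, no write info changes.
    for old, new in correspondence.items():
        alias_db[new] = alias_db.pop(old)
```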
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision: D22798377
Pulled By: eellison
fbshipit-source-id: 9a133bcaa3b051c0fb565afb23a3eed56dbe71f9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39497
Previously, we didn't consider side effects at all when moving nodes in alias analysis. It is never valid to reorder a node with a side effect. This has led to bugs when used with Bailouts.
Unfortunately this might cause regressions, but the prior behavior wasn't correct :/
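A minimal example of the constraint (hypothetical, not from the PR): the print below has no data dependency on its neighbours, but it still must not be reordered relative to them.
```
import torch

@torch.jit.script
def fn(x):
    y = x * 2
    print("about to add")   # side-effecting node: never valid to reorder
    return y + 1

fn(torch.randn(3))
```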
Test Plan: Imported from OSS
Differential Revision: D21963774
Pulled By: eellison
fbshipit-source-id: 656995d1b82534eca65437ed4e397b2bf08a4dec
Summary:
The existing context manager only conditionally enabled profiling mode, which was counterintuitive. When we changed the default executor, this broke internal benchmarking as a result.
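A minimal sketch of the unconditional behavior, assuming the internal `torch._C._jit_set_profiling_executor` / `torch._C._jit_set_profiling_mode` toggles of that era (the real helper lives in PyTorch's JIT test utilities and may differ):
```
from contextlib import contextmanager

import torch

@contextmanager
def enable_profiling_mode():
    # Always turn profiling on, and restore the previous settings on exit.
    old_executor = torch._C._jit_set_profiling_executor(True)
    old_mode = torch._C._jit_set_profiling_mode(True)
    try:
        yield
    finally:
        torch._C._jit_set_profiling_executor(old_executor)
        torch._C._jit_set_profiling_mode(old_mode)
```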
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37825
Differential Revision: D21404611
Pulled By: eellison
fbshipit-source-id: 306b3c333ef4eb44ab6a6e5ab4e0682e5ce312ce
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445
Create distributed and rpc directories under caffe/test for better management
of unit tests.
Differential Revision: D18702786
fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29249
This splits out all the tests that are "easy", leaving `TestJit`,
`TestScript`, the autogenerated tests, and a small docs test.
Splitting those into reasonable chunks takes more effort and is less mechanical.
Differential Revision: D18339007
Test Plan: Imported from OSS
Pulled By: suo
fbshipit-source-id: 69164b9f9a2c379fe8923a846c98dd3c37ccb70e