pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Elias Ellison	ad5be26b2f	Small changes/cleanup (#46950 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46950 Make sure that we're fusing in a fuse tests, and refactor to more concise API to check if fusions have happened. Test Plan: Imported from OSS Reviewed By: ansley Differential Revision: D24805250 Pulled By: eellison fbshipit-source-id: f898008a64b74e761bb5fe85f91b3cdf2dbdf878	2020-11-12 11:13:38 -08:00
Elias Ellison	f221a19a7f	Force LLVM Compilation for CPU Tests (#46949 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46949 Test Plan: Imported from OSS Reviewed By: ansley Differential Revision: D24805247 Pulled By: eellison fbshipit-source-id: 4fcaf02d8a78cc5cbcbde36940d0a2c85fba3fc5	2020-11-12 11:12:08 -08:00
Bert Maher	c4892c8efe	[pytorch][tensorexpr] Promote integer arguments to sin/cos/tan to float (#46776 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46776 Following numpy and (now) eager mode Fixes #46458 Test Plan: test_jit_fuser_te Reviewed By: navahgar Differential Revision: D24509884 fbshipit-source-id: c063030fc609ba4aefcd9abd25b50f082fef1548	2020-10-23 17:32:54 -07:00
kshitij12345	8e13fe6c44	[numpy] `torch.sin` : support and promote integer inputs to float (#45733 ) Summary: References https://github.com/pytorch/pytorch/issues/42515 > Enable integer -> float unary type promotion for ops like sin Will follow-up for other such Ops once this PR is merged. cc: mruberry Pull Request resolved: https://github.com/pytorch/pytorch/pull/45733 Reviewed By: zou3519 Differential Revision: D24431194 Pulled By: mruberry fbshipit-source-id: db600bc5de0e535b538d2aa301c3526b7c75ed17	2020-10-22 01:58:57 -07:00
Elias Ellison	1b97ffa07a	[1/3] [JIT] Make sure fusion occurs in test_tensorexpr file (#45788 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45788 We were only running the traced graph once, which would not yet have been fused at that point. We should run for num_profiled_runs + 1, and also assert that all nodes in the graph were fused. Test Plan: Imported from OSS Reviewed By: bertmaher Differential Revision: D24169537 Pulled By: eellison fbshipit-source-id: 8499bb1a5bd9d2221b1f1c54d6352558cf07ba9a	2020-10-08 12:02:57 -07:00
Nikolay Korovaiko	993628c74a	Build shape expressions and remove outputs that are only used by `aten::size`s (#45080 ) Summary: Currently, TE materializes all intermediate results even if they are only used for computing their shapes. This diff ports the approach the OF (Old Fuser) took to deal with this issue. Namely, given the structure of a fusion group we infer all the sizes outside a fusion group based on fusion group's inputs. A simple example would be: ``` def test_fuse(a, b): c = a + b d = c + b return d ``` Here we don't need to cache `c` as computing a gradient for `b` in `d = c + b` doesn't need it. We do need to compute sizes for all arguments here in case broadcasts happen. Without this optimization, TE would need to materialize `c` so we can get its size ``` [DUMP profiling_graph_executor_impl.cpp:499] Optimized Graph: [DUMP profiling_graph_executor_impl.cpp:499] graph(%a.1 : Tensor, [DUMP profiling_graph_executor_impl.cpp:499] %b.1 : Tensor): [DUMP profiling_graph_executor_impl.cpp:499] %11 : Tensor = prim::DifferentiableGraph_0(%b.1, %a.1) [DUMP profiling_graph_executor_impl.cpp:499] return (%11) [DUMP profiling_graph_executor_impl.cpp:499] with prim::DifferentiableGraph_0 = graph(%11 : Tensor, [DUMP profiling_graph_executor_impl.cpp:499] %13 : Tensor): [DUMP profiling_graph_executor_impl.cpp:499] %59 : int[] = aten::size(%13) # <string>:3:44 [DUMP profiling_graph_executor_impl.cpp:499] %62 : int[] = aten::size(%11) # <string>:3:93 [DUMP profiling_graph_executor_impl.cpp:499] %83 : Double(1:1, requires_grad=0, device=cuda:0), %84 : Double(1:1, requires_grad=0, device=cuda:0), %85 : bool = prim::TypeCheck(%11, %13) [DUMP profiling_graph_executor_impl.cpp:499] %86 : Tensor, %87 : Tensor = prim::If(%85) [DUMP profiling_graph_executor_impl.cpp:499] block0(): [DUMP profiling_graph_executor_impl.cpp:499] %d.4 : Double(1:1, requires_grad=0, device=cuda:0), %c.4 : Double(1:1, requires_grad=0, device=cuda:0) = prim::TensorExprGroup_0(%83, %84) [DUMP 
profiling_graph_executor_impl.cpp:499] -> (%d.4, %c.4) [DUMP profiling_graph_executor_impl.cpp:499] block1(): [DUMP profiling_graph_executor_impl.cpp:499] %94 : Function = prim::Constant[name="fallback_function", fallback=1]() [DUMP profiling_graph_executor_impl.cpp:499] %95 : (Tensor, Tensor) = prim::CallFunction(%94, %11, %13) [DUMP profiling_graph_executor_impl.cpp:499] %96 : Tensor, %97 : Tensor = prim::TupleUnpack(%95) [DUMP profiling_graph_executor_impl.cpp:499] -> (%96, %97) [DUMP profiling_graph_executor_impl.cpp:499] %60 : int[] = aten::size(%87) # <string>:3:55 [DUMP profiling_graph_executor_impl.cpp:499] %61 : int[]? = aten::_size_if_not_equal(%59, %60) # <string>:3:19 [DUMP profiling_graph_executor_impl.cpp:499] %64 : int[]? = aten::_size_if_not_equal(%62, %60) # <string>:3:68 [DUMP profiling_graph_executor_impl.cpp:499] %67 : int[] = aten::size(%86) # <string>:3:55 [DUMP profiling_graph_executor_impl.cpp:499] %68 : int[]? = aten::_size_if_not_equal(%60, %67) # <string>:3:19 [DUMP profiling_graph_executor_impl.cpp:499] %71 : int[]? 
= aten::_size_if_not_equal(%62, %67) # <string>:3:68 [DUMP profiling_graph_executor_impl.cpp:499] return (%86, %61, %64, %68, %71) [DUMP profiling_graph_executor_impl.cpp:499] with prim::TensorExprGroup_0 = graph(%1 : Double(1:1, requires_grad=0, device=cuda:0), [DUMP profiling_graph_executor_impl.cpp:499] %4 : Double(1:1, requires_grad=0, device=cuda:0)): [DUMP profiling_graph_executor_impl.cpp:499] %5 : int = prim::Constant[value=1]() [DUMP profiling_graph_executor_impl.cpp:499] %c.3 : Double(1:1, requires_grad=0, device=cuda:0) = aten::add(%4, %1, %5) # /scratch/villedepommes/pytorches/bench/test/test_jit.py:2872:16 [DUMP profiling_graph_executor_impl.cpp:499] %2 : int = prim::Constant[value=1]() [DUMP profiling_graph_executor_impl.cpp:499] %d.3 : Double(1:1, requires_grad=0, device=cuda:0) = aten::add(%c.3, %1, %2) # /scratch/villedepommes/pytorches/bench/test/test_jit.py:2873:16 [DUMP profiling_graph_executor_impl.cpp:499] return (%d.3, %c.3) ``` With this optimization we use `prim::BroadcastSizes` to compute the size of `c`. No need to materialize it. 
``` [DUMP profiling_graph_executor_impl.cpp:499] Optimized Graph: [DUMP profiling_graph_executor_impl.cpp:499] graph(%a.1 : Tensor, [DUMP profiling_graph_executor_impl.cpp:499] %b.1 : Tensor): [DUMP profiling_graph_executor_impl.cpp:499] %11 : Tensor = prim::DifferentiableGraph_0(%b.1, %a.1) [DUMP profiling_graph_executor_impl.cpp:499] return (%11) [DUMP profiling_graph_executor_impl.cpp:499] with prim::DifferentiableGraph_0 = graph(%11 : Tensor, [DUMP profiling_graph_executor_impl.cpp:499] %13 : Tensor): [DUMP profiling_graph_executor_impl.cpp:499] %59 : int[] = aten::size(%13) # <string>:3:44 [DUMP profiling_graph_executor_impl.cpp:499] %62 : int[] = aten::size(%11) # <string>:3:93 [DUMP profiling_graph_executor_impl.cpp:499] %88 : Double(1:1, requires_grad=0, device=cuda:0), %89 : Double(1:1, requires_grad=0, device=cuda:0), %90 : bool = prim::TypeCheck(%11, %13) [DUMP profiling_graph_executor_impl.cpp:499] %91 : Tensor = prim::If(%90) [DUMP profiling_graph_executor_impl.cpp:499] block0(): [DUMP profiling_graph_executor_impl.cpp:499] %d.4 : Double(1:1, requires_grad=0, device=cuda:0) = prim::TensorExprGroup_0(%88, %89) [DUMP profiling_graph_executor_impl.cpp:499] -> (%d.4) [DUMP profiling_graph_executor_impl.cpp:499] block1(): [DUMP profiling_graph_executor_impl.cpp:499] %97 : Function = prim::Constant[name="fallback_function", fallback=1]() [DUMP profiling_graph_executor_impl.cpp:499] %98 : (Tensor) = prim::CallFunction(%97, %11, %13) [DUMP profiling_graph_executor_impl.cpp:499] %99 : Tensor = prim::TupleUnpack(%98) [DUMP profiling_graph_executor_impl.cpp:499] -> (%99) [DUMP profiling_graph_executor_impl.cpp:499] %85 : int[] = aten::size(%91) [DUMP profiling_graph_executor_impl.cpp:499] %86 : int[] = prim::BroadcastSizes(%59, %62) [DUMP profiling_graph_executor_impl.cpp:499] %61 : int[]? = aten::_size_if_not_equal(%59, %86) # <string>:3:19 [DUMP profiling_graph_executor_impl.cpp:499] %64 : int[]? 
= aten::_size_if_not_equal(%62, %86) # <string>:3:68 [DUMP profiling_graph_executor_impl.cpp:499] %68 : int[]? = aten::_size_if_not_equal(%86, %85) # <string>:3:19 [DUMP profiling_graph_executor_impl.cpp:499] %71 : int[]? = aten::_size_if_not_equal(%62, %85) # <string>:3:68 [DUMP profiling_graph_executor_impl.cpp:499] return (%91, %61, %64, %68, %71) [DUMP profiling_graph_executor_impl.cpp:499] with prim::TensorExprGroup_0 = graph(%1 : Double(1:1, requires_grad=0, device=cuda:0), [DUMP profiling_graph_executor_impl.cpp:499] %4 : Double(1:1, requires_grad=0, device=cuda:0)): [DUMP profiling_graph_executor_impl.cpp:499] %5 : int = prim::Constant[value=1]() [DUMP profiling_graph_executor_impl.cpp:499] %c.3 : Double(1:1, requires_grad=0, device=cuda:0) = aten::add(%4, %1, %5) # /scratch/villedepommes/pytorches/bench/test/test_jit.py:2872:16 [DUMP profiling_graph_executor_impl.cpp:499] %2 : int = prim::Constant[value=1]() [DUMP profiling_graph_executor_impl.cpp:499] %d.3 : Double(1:1, requires_grad=0, device=cuda:0) = aten::add(%c.3, %1, %2) # /scratch/villedepommes/pytorches/bench/test/test_jit.py:2873:16 [DUMP profiling_graph_executor_impl.cpp:499] return (%d.3) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/45080 Reviewed By: bertmaher Differential Revision: D23856410 Pulled By: Krovatkin fbshipit-source-id: 2956286eb03a4894a5baa151c35e6092466322b1	2020-09-28 10:45:56 -07:00
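The optimized graph above relies on `prim::BroadcastSizes` to derive the size of `c` from the sizes of its inputs instead of materializing `c`. A minimal pure-Python sketch of that shape arithmetic follows, assuming the usual NumPy-style broadcasting rules. `broadcast_sizes` is a hypothetical helper written for illustration, not the JIT implementation.

```python
from itertools import zip_longest


def broadcast_sizes(a, b):
    """Return the broadcast shape of two size lists (hypothetical helper)."""
    result = []
    # Align the shapes from the trailing dimension; a missing dim counts as 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            raise ValueError(f"sizes {a} and {b} are not broadcastable")
    return list(reversed(result))


# The size of c = a + b, computed from the input sizes only.
size_c = broadcast_sizes([4, 1, 3], [5, 3])
print(size_c)
```

Only the two input size lists are needed, which is why the intermediate tensor no longer has to be an output of the fusion group.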
Nick Gibson	d1d9017a66	[NNC] fix Half conversion of immediates in Cuda backend (#45213 ) Summary: The Cuda HalfChecker casts up all loads and stores of Half to Float, so we do math in Float on the device. It didn't cast up HalfImmediate (i.e. constants), so they could introduce mixed-size ops. The fix is to cast those up as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/45213 Reviewed By: ezyang Differential Revision: D23885287 Pulled By: nickgg fbshipit-source-id: 912991d85cc06ebb282625cfa5080d7525c8eba9	2020-09-25 10:53:36 -07:00
Alex Suhan	3dd0e362db	[TensorExpr] Fix min and max for integral inputs in CUDA backend (#44984 ) Summary: For integral types, isnan is meaningless. Provide specializations for maximum and minimum which don't call it. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44984 Test Plan: python test/test_jit_fuser_te.py -k TestTEFuser.test_minmax_int_ops Reviewed By: ezyang Differential Revision: D23885259 Pulled By: asuhan fbshipit-source-id: 2e6da2c43c0ed18f0b648a2383d510894c574437	2020-09-23 23:19:12 -07:00
Bert Maher	2d00ebd29f	Failing test demonstrating problems with mixed output shapes (#44455 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44455 Test Plan: Imported from OSS Reviewed By: gmagogsfm Differential Revision: D23886119 Pulled By: bertmaher fbshipit-source-id: 41787930f154cf4e8a1766613c4cf33b18246555	2020-09-23 21:15:37 -07:00
Alex Suhan	0495998862	[TensorExpr] Disallow arithmetic binary operations on Bool (#44677 ) Summary: Arithmetic operations on Bool aren't fully supported in the evaluator. Moreover, such semantics can be implemented by the client code through insertion of explicit casts to widen and narrow to the desired types. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44677 Test Plan: test_tensorexpr --gtest_filter=TensorExprTest.ExprDisallowBoolArithmetic python test/test_jit_fuser_te.py Reviewed By: agolynski Differential Revision: D23801412 Pulled By: asuhan fbshipit-source-id: fff5284e3a216655dbf5a9a64d1cb1efda271a36	2020-09-23 14:59:11 -07:00
Xiang Gao	20ac736200	Remove py2 compatible future imports (#44735 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44735 Reviewed By: mruberry Differential Revision: D23731306 Pulled By: ezyang fbshipit-source-id: 0ba009a99e475ddbe22981be8ac636f8a1c8b02f	2020-09-16 12:55:57 -07:00
Mikhail Zolotukhin	d66520ba08	[TensorExpr] Fuser: try merging adjacent fusion groups. (#43671 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43671 Test Plan: Imported from OSS Reviewed By: eellison Differential Revision: D23360796 Pulled By: ZolotukhinM fbshipit-source-id: 60ec318fe77ae9f2c821d9c4d106281845266e0f	2020-09-15 21:31:02 -07:00
Akihiro Nitta	84949672bf	Fix exception chaining in `test/` (#44193 ) Summary: ## Motivation This PR fixes https://github.com/pytorch/pytorch/issues/43770 and is the continuation of https://github.com/pytorch/pytorch/issues/43836. ## Description of the change This PR fixes exception chaining only in files under `test/` where appropriate. To fix exception chaining, I used either: 1. `raise new_exception from old_exception` where `new_exception` itself seems not descriptive enough to debug or `old_exception` delivers valuable information. 2. `raise new_exception from None` where raising both of `new_exception` and `old_exception` seems a bit noisy and redundant. ## List of lines containing `raise` in `except` clause: I wrote [this simple script](https://gist.github.com/akihironitta/4223c1b32404b36c1b349d70c4c93b4d) using [ast](https://docs.python.org/3.8/library/ast.html#module-ast) to list lines where `raise`ing in `except` clause. - [x] `f8f35fddd4/test/test_cpp_extensions_aot.py (L16)` - [x] `f8f35fddd4/test/test_jit.py (L2503)` - [x] `f8f35fddd4/test/onnx/model_defs/word_language_model.py (L22)` - [x] `f8f35fddd4/test/onnx/verify.py (L73)` - [x] `f8f35fddd4/test/onnx/verify.py (L110)` - [x] `f8f35fddd4/test/onnx/test_verify.py (L31)` - [x] `f8f35fddd4/test/distributed/test_c10d.py (L255)` - [x] `f8f35fddd4/test/distributed/test_c10d.py (L2992)` - [x] `f8f35fddd4/test/distributed/test_c10d.py (L3025)` - [x] `f8f35fddd4/test/distributed/test_c10d.py (L3712)` - [x] `f8f35fddd4/test/distributed/test_distributed.py (L3180)` - [x] `f8f35fddd4/test/distributed/test_distributed.py (L3198)` - [x] `f8f35fddd4/test/distributed/test_data_parallel.py (L752)` - [x] `f8f35fddd4/test/distributed/test_data_parallel.py (L776)` - [x] `f8f35fddd4/test/test_type_hints.py (L151)` - [x] `f8f35fddd4/test/test_jit_fuser.py (L771)` - [x] `f8f35fddd4/test/test_jit_fuser.py (L773)` - [x] `f8f35fddd4/test/test_dispatch.py (L105)` - [x] `f8f35fddd4/test/test_distributions.py (L4738)` - [x] 
`f8f35fddd4/test/test_nn.py (L9824)` - [x] `f8f35fddd4/test/test_namedtensor.py (L843)` - [x] `f8f35fddd4/test/test_jit_fuser_te.py (L875)` - [x] `f8f35fddd4/test/test_jit_fuser_te.py (L877)` - [x] `f8f35fddd4/test/test_dataloader.py (L31)` - [x] `f8f35fddd4/test/test_dataloader.py (L43)` - [x] `f8f35fddd4/test/test_dataloader.py (L365)` - [x] `f8f35fddd4/test/test_dataloader.py (L391)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/44193 Reviewed By: albanD Differential Revision: D23681529 Pulled By: malfet fbshipit-source-id: 7c2256ff17334625081137b35baeb816c1e53e0b	2020-09-14 14:20:16 -07:00
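The two chaining styles described in the commit above can be sketched in plain Python. The function names and error messages here are hypothetical and are not taken from the PyTorch test files.

```python
def parse_port(text):
    try:
        return int(text)
    except ValueError as err:
        # Style 1: keep the original error attached as __cause__,
        # because it carries information useful for debugging.
        raise RuntimeError(f"invalid port: {text!r}") from err


def lookup(table, key):
    try:
        return table[key]
    except KeyError:
        # Style 2: suppress the original error, because showing both
        # exceptions would be noisy and redundant.
        raise AttributeError(f"no such field: {key}") from None


try:
    parse_port("http")
except RuntimeError as exc:
    print(type(exc.__cause__).__name__)

try:
    lookup({}, "dtype")
except AttributeError as exc:
    print(exc.__cause__, exc.__suppress_context__)
```

With `from err` the traceback shows both exceptions linked by "The above exception was the direct cause of the following exception"; with `from None` only the new exception is displayed.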
Bert Maher	350130a69d	Prevent the TE fuser from getting datatypes it can't handle (#44160 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44160 Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D23528508 Pulled By: bertmaher fbshipit-source-id: 03b22725fb2666f441cb504b35397ea6d155bb85	2020-09-09 11:10:04 -07:00
Bert Maher	960c088a58	[te] Fix casting of unsigned char, and abs(int) (#44157 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44157 Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D23528507 Pulled By: bertmaher fbshipit-source-id: c5ef0422a91a4665b616601bed8b7cd137be39f9	2020-09-09 11:08:36 -07:00
Nikolay Korovaiko	f044b17ae2	Disable a test (#44348 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/44348 Reviewed By: mrshenli Differential Revision: D23592524 Pulled By: Krovatkin fbshipit-source-id: 349057606ce39dd5de24314c9ba8f40516d2ae1c	2020-09-09 08:36:19 -07:00
Nick Gibson	be94dba429	[NNC] fix support for FP16 in CudaCodegen (#44209 ) Summary: Fixes a bug where FP16 values could be incorrectly cast to a half type that doesn't have a cast operator by inserting the cuda specific cast to float during handling of the Cast node, not as a wrapper around printing Loads and Stores. Two main changes: the HalfChecker now inserts the casts to float explicitly in the IR, and the PrioritizeLoad mutator now consumes both Loads and a Cast which immediately preceded a load. Tested with test_jit_fuser_te.py and test_tensorexpr.py, plus the C++ tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44209 Reviewed By: izdeby Differential Revision: D23575577 Pulled By: nickgg fbshipit-source-id: 808605aeb2af812758f96f9fdc11b07e08053b46	2020-09-08 18:00:39 -07:00
Nikolay Korovaiko	47ac9bb105	Enable temp disabled tests in test_jit_fuser_te.py (#44222 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/44222 Reviewed By: izdeby Differential Revision: D23582214 Pulled By: Krovatkin fbshipit-source-id: 27caa3ea02ce10b163212f6a45a81b446898953d	2020-09-08 14:40:32 -07:00
Bert Maher	98ad5ff41f	[te] Disable reductions by default (#44122 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44122 Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D23504769 Pulled By: bertmaher fbshipit-source-id: 1889217cd22da529e46ab30c9319a5646267e4ec	2020-09-03 23:37:45 -07:00
Bert Maher	55ff9aa185	Test TE fuser unary ops and fix sigmoid(half) (#44094 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44094 Test Plan: Imported from OSS Reviewed By: SplitInfinity Differential Revision: D23494950 Pulled By: bertmaher fbshipit-source-id: 676c4e57267c4ad92065ea90b06323918dd5b0de	2020-09-03 12:48:46 -07:00
Mikhail Zolotukhin	40fec4e739	[TensorExpr] Fuser: do not fuse ops with 0-dim tensors. (#44073 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44073 We don't yet have proper support for it on the NNC and JIT IR->NNC lowering side. Test Plan: Imported from OSS Reviewed By: SplitInfinity Differential Revision: D23487905 Pulled By: ZolotukhinM fbshipit-source-id: da0da7478fc8ce7b455176c95d8fd610c94352c1	2020-09-02 22:59:04 -07:00
Bert Maher	33d51a9b32	Respect canFuseOn{CPU,GPU} in TE fuser (#43967 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43967 Test Plan: Imported from OSS Reviewed By: asuhan Differential Revision: D23469048 Pulled By: bertmaher fbshipit-source-id: 1005a7ae08974059ff9d467492caa3a388070eeb	2020-09-02 18:00:25 -07:00
Bert Maher	c14a3613a8	Fix NaN propagation in TE fuser's min/max implementation (#43609 ) Summary: Per eager mode source-of-truth, NaNs shall be propagated by min/max. Pull Request resolved: https://github.com/pytorch/pytorch/pull/43609 Reviewed By: ZolotukhinM Differential Revision: D23349184 Pulled By: bertmaher fbshipit-source-id: 094eb8b89a02b27d5ecf3988d0f473c0f91e4afb	2020-09-01 02:10:13 -07:00
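The NaN-propagation rule named in the commit above (if either operand is NaN, the result is NaN) can be sketched in plain Python. This is a scalar illustration under that stated rule, not the TE fuser's code; `nan_max` and `nan_min` are hypothetical names.

```python
import math


def nan_max(a, b):
    """Maximum of two floats that returns NaN if either input is NaN."""
    if math.isnan(a):
        return a
    if math.isnan(b):
        return b
    return a if a > b else b


def nan_min(a, b):
    """Minimum of two floats that returns NaN if either input is NaN."""
    if math.isnan(a):
        return a
    if math.isnan(b):
        return b
    return a if a < b else b


nan = float("nan")
print(nan_max(1.0, nan), nan_max(nan, 1.0), nan_min(2.0, 3.0))
```

A plain comparison such as `a if a > b else b` is not enough on its own, because every comparison against NaN is false and the result would then depend on argument order.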
Elias Ellison	a7e7981c0b	Use prim::TensorExprGroup interned symbol (#43635 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43635 Intern the symbol, no functional changes. Aliasing needs to be looked at, but this should be done in a separate PR; this PR is just changing the symbol. Test Plan: Imported from OSS Reviewed By: bertmaher Differential Revision: D23358806 Pulled By: eellison fbshipit-source-id: f18bcd142a0daf514136f019ae607e4c3f45d9f8	2020-08-31 11:52:16 -07:00
Alex Suhan	60ad7e9c04	[TensorExpr] Make sum available from Python (#43730 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43730 Test Plan: python test/test_jit_fuser_te.py -k TestTEFuser.test_sum test_tensorexpr --gtest_filter=TensorExprTest.KernelSum* Reviewed By: ZolotukhinM Differential Revision: D23407600 Pulled By: asuhan fbshipit-source-id: e6da4690ae6d802f9be012e39e61b7467aa5285c	2020-08-29 10:38:21 -07:00
Elias Ellison	a4cf4c2437	refactor tests (#43631 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43631 I added a new test for just profiler stuff - I don't think the test should go in test_jit.py. Maybe this should just go in test_tensorexpr_fuser, but I'm not really testing tensorexpr stuff either... LMK Test Plan: Imported from OSS Reviewed By: bertmaher Differential Revision: D23358810 Pulled By: eellison fbshipit-source-id: 074238e1b60e4c4a919a052b7a5312b790ad5d82	2020-08-27 14:35:33 -07:00
Mikhail Zolotukhin	3ec24f02af	[TensorExpr] Start using typecheck in the fuser. (#43173 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43173 With this change the fuser starts to generate typechecks for inputs of fusion group. For each fusion group we generate a typecheck and an if node: the true block contains the fused subgraph, the false block contains unoptimized original subgraph. Differential Revision: D23178230 Test Plan: Imported from OSS Reviewed By: eellison Pulled By: ZolotukhinM fbshipit-source-id: f56e9529613263fb3e6575869fdb49973c7a520b	2020-08-25 18:13:32 -07:00
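The typecheck-plus-`if` structure described in the commit above (the true block runs the fused subgraph, the false block runs the unoptimized original) can be mimicked in plain Python. This is a rough analogy that checks Python types rather than tensor dtype, shape, and device; `make_guarded` and both callables are hypothetical.

```python
def make_guarded(fused_fn, fallback_fn, expected_types):
    """Wrap a specialized function with a type guard and a fallback."""

    def guarded(*args):
        # The "typecheck": every input must match the profiled type exactly.
        types_match = len(args) == len(expected_types) and all(
            type(arg) is expected
            for arg, expected in zip(args, expected_types)
        )
        if types_match:
            return fused_fn(*args)   # true block: specialized path
        return fallback_fn(*args)    # false block: original path

    return guarded


def fused(a, b):
    return ("fused", a + b + b)


def original(a, b):
    return ("fallback", a + b + b)


add_twice = make_guarded(fused, original, (float, float))
print(add_twice(1.0, 2.0))
print(add_twice(1, 2))
```

Both paths compute the same value; the guard only decides whether the specialized version is safe to use for the inputs seen at run time.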
Yujun Zhao	e5adf45dde	Add python unittest target to `caffe2/test/TARGETS` (#42766 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42766 Some python tests are missing in `caffe2/test/TARGETS`; add them to make it more comprehensive. According to [run_test.py](https://github.com/pytorch/pytorch/blob/master/test/run_test.py#L125), some tests are slower. Slow tests are added as independent targets and the others are put together into one `others` target. The reason is that we want to reduce overhead, especially for code coverage collection. Tests in one target can be run as a bundle, and then coverage can be collected together. Coverage collection is typically time-expensive, so this helps us save time. Test Plan: Run all the new test targets locally in dev server and record the time they cost. Statistics ``` # jit target real 33m7.694s user 653m1.181s sys 58m14.160s --------- Compare to Initial Jit Target runtime: ---------------- real 32m13.057s user 613m52.843s sys 54m58.678s ``` ``` # others target real 9m2.920s user 164m21.927s sys 12m54.840s ``` ``` # serialization target real 4m21.090s user 23m33.501s sys 1m53.308s ``` ``` # tensorexpr real 11m28.187s user 33m36.420s sys 1m15.925s ``` ``` # type target real 3m36.197s user 51m47.912s sys 4m14.149s ``` Reviewed By: malfet Differential Revision: D22979219 fbshipit-source-id: 12a30839bb76a64871359bc024e4bff670c5ca8b	2020-08-10 09:48:59 -07:00
Nikolay Korovaiko	47c57e8804	rename TestFuser to TestTEFuser (#41542 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41542 Reviewed By: jamesr66a Differential Revision: D22579606 Pulled By: Krovatkin fbshipit-source-id: f65b2cae996b42d55ef864bc0b424d9d43d8a2e2	2020-07-22 13:37:27 -07:00
Michael Suo	ca1b8ebbcb	move misc implementation out of `jit/__init__.py` (#41154 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41154 Test Plan: Imported from OSS Reviewed By: ailzhang Differential Revision: D22445213 Pulled By: suo fbshipit-source-id: 200545715c5ef13beb1437f49e01efb21498ddb7	2020-07-13 16:59:55 -07:00
Jeff Daily	ac8c8b028d	[ROCm] restore jit tests (#40447 ) Summary: Remove `skipIfRocm` from most jit tests and enable `RUN_CUDA_HALF` tests for ROCm. These changes passed more than three rounds of CI testing against the ROCm CI. CC ezyang xw285cornell sunway513 Pull Request resolved: https://github.com/pytorch/pytorch/pull/40447 Differential Revision: D22190711 Pulled By: xw285cornell fbshipit-source-id: bac44825a2675d247b3abe2ec2f80420a95348a3	2020-06-27 01:03:59 -07:00
Nikolay Korovaiko	5036c94a6e	properly skip legacy tests regardless of the default executor (#40381 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40381 Differential Revision: D22173938 Pulled By: Krovatkin fbshipit-source-id: 305fc4484977e828cc4cee6e053a1e1ab9f0d6c7	2020-06-26 11:13:50 -07:00
Wanchao Liang	27d789500b	[test] split tracer related tests out of test_jit (#40142 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40142 test_jit is becoming huge again, which makes it hard for editors to load and hard to write new tests in; this splits out the tracer-related tests. Test Plan: Imported from OSS Reviewed By: ailzhang Differential Revision: D22085035 Pulled By: wanchaol fbshipit-source-id: 696bee84985ecfbfeac8e2ee5c27f1bdda8de394	2020-06-17 17:26:33 -07:00
Elias Ellison	daa85cfe2e	[JIT] Exit Transform Rewrite (#38282 ) Summary: After an early return, we conditionalize all further execution. This means that currently the pattern of `if return elif return elif return` generates better code than `if return if return if return`. It's obviously not good to have semantically equivalent code generate worse IR, so we should rewrite the graph to handle this case. This came up in https://github.com/pytorch/pytorch/pull/37171 ``` @torch.jit.script def test_foo(x: bool, y: bool): if x: return 1 return 2 print(test_foo.code) ``` generates: ``` def test_foo(x: bool, y: bool) -> int: _0 = uninitialized(int) if x: _1, _2 = True, 1 else: _1, _2 = False, _0 if _1: _3 = _2 else: _3 = 2 return _3 ``` while ``` @torch.jit.script def test_foo(x: bool, y: bool): if x: return 1 else: return 2 print(test_foo.code) ``` generates: ``` def test_foo(x: bool, y: bool) -> int: if x: _0 = 1 else: _0 = 2 return _0 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/38282 Differential Revision: D21576733 Pulled By: eellison fbshipit-source-id: 80cf1ad7fbda6d8d58557abbfb21c90eafae7488	2020-05-15 12:22:28 -07:00
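The two generated listings in the commit above can be restated as ordinary Python to show that they compute the same result. This sketch only mirrors the shape of the generated code; `None` stands in for `uninitialized(int)`, and the function names are hypothetical.

```python
def early_return(x):
    # Source pattern: early return followed by a plain return.
    if x:
        return 1
    return 2


def lowered_without_rewrite(x):
    # Shape of the first generated listing: a "did return" flag
    # guards everything after the early return.
    if x:
        did_return, value = True, 1
    else:
        did_return, value = False, None  # placeholder for uninitialized(int)
    if did_return:
        result = value
    else:
        result = 2
    return result


def lowered_with_rewrite(x):
    # Shape of the second generated listing: a single if/else.
    if x:
        result = 1
    else:
        result = 2
    return result


for flag in (True, False):
    print(flag, early_return(flag), lowered_without_rewrite(flag),
          lowered_with_rewrite(flag))
```

All three functions agree on every input, so the rewrite only changes how much control flow the generated code carries.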
Vitaly Fedyunin	57d01be92b	Replacing assertEqual with assertEqualIgnoreType wherever types mismatch (#38102 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38102 Test Plan: Imported from OSS Differential Revision: D21477060 Pulled By: VitalyFedyunin fbshipit-source-id: 25e0fd837ca9bfccf0ce994c80f7790c894096d4	2020-05-09 14:48:55 -07:00
Mikhail Zolotukhin	4784af1d78	[TensorExpr] Don't include aten::rand_like to TE fusion groups since we can't handle rand+broadcast case yet. (#38132 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38132 Test Plan: Imported from OSS Reviewed By: resistor Differential Revision: D21479256 Pulled By: ZolotukhinM fbshipit-source-id: 2678cfd6ad2feea132efb5eec09e5f41bbd54487	2020-05-08 13:37:13 -07:00
Elias Ellison	0e3a05ec00	[JIT] rename enable_profiling_mode to enable_profiling_mode_for_profiling_tests (#37825 ) Summary: The existing contextmanager only conditionally enabled profiling mode, which was counterintuitive. When we changed the default executor, it broke internal benchmarking as a result. Pull Request resolved: https://github.com/pytorch/pytorch/pull/37825 Differential Revision: D21404611 Pulled By: eellison fbshipit-source-id: 306b3c333ef4eb44ab6a6e5ab4e0682e5ce312ce	2020-05-06 11:30:02 -07:00
Nikolay Korovaiko	edc5ef1afb	run the simple executor for jit tests by default, add profiling jobs … (#37017 ) Summary: …for fusion tests fix flake8 warnings fix ci failures fix test_determination.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/37017 Differential Revision: D21238446 Pulled By: Krovatkin fbshipit-source-id: 393e6135883dc5ac57bdff580de96c66829d454c	2020-04-28 19:16:52 -07:00
Nikolay Korovaiko	a80a438e37	correctly set and restore states in te tests (#37210 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37210 Differential Revision: D21238634 Pulled By: Krovatkin fbshipit-source-id: 6462239753399c10c871baa5d5fdff5465cf2544	2020-04-24 20:16:51 -07:00
Mikhail Zolotukhin	af5121f62a	Invoke TensorExpr fuser pass from a graph executor. (#35913 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35913 The pass itself is still disabled by default, but with this change we don't need to register it as a custom pass anymore. It allows us to control its behavior with env variables more easily. Test Plan: Imported from OSS Reviewed By: suo Differential Revision: D20827189 Pulled By: ZolotukhinM fbshipit-source-id: e74d90b5e46422e7ab7bc40974a805220da50fbc	2020-04-03 12:20:26 -07:00
Christian Sarofeen	6d24f8fe21	Infrastructure for a new CUDA Fuser (#34785 ) Summary: This PR contains the infrastructure of a new CUDA fuser. This CUDA fuser is based on many of the same principles as TensorExpressions and Halide; however, the implementation is written from the ground up. The fusion pass itself is similar to the default CUDA fuser; however, it has undergone some refactoring and is using the new code generation infrastructure. For those who are interested in how the code generation in this PR works, I would recommend reviewing _test/cpp/jit/test_gpu_fusion.cpp_ as well as the long comment section at the beginning of _torch/csrc/jit/codegen/cuda/transform_replay.h_ One of the largest differences between our approach and that of TVM/Halide is the concept of "TensorView". TensorView from a high level should be thought of similarly to how we think of working with Tensors in PyTorch. It's an N-D object which can undergo transformations that change its dimensionality. Dimensionality changes are done through the operations split/merge/reorder/computeAt. These transformations are similar to split/fuse/reorder/compute_at of TVM; they modify how a tensor is iterated over to generate GPU code. Interestingly, in our scheme these transformations are applied to tensors and only impact how that tensor is generated. Warning: This PR is purposefully not feature complete with the current fuser. We wanted to separate out the infrastructure from the fusion capabilities. Once in, smaller incremental PRs will be submitted to expand capabilities of the fuser. Short term goals: Parity with current CUDA fuser (including performance): - Dynamic shapes (no recompilation) - Implicit handling of broadcast (broadcasted tensors are treated as tensors of the broadcasted size in the generated code) - Dropout Mid-term goals: - Transposes fused with pointwise operations where transpose involves only 2 axes (across the fused operation). 
- 1-D reductions fused with pointwise operations Pull Request resolved: https://github.com/pytorch/pytorch/pull/34785 Reviewed By: ZolotukhinM Differential Revision: D20650977 Pulled By: soumith fbshipit-source-id: ee39c95a880e1b9822e874ed4cc180971572bf63	2020-04-02 09:22:42 -07:00
Bram Wasti	a3e10d2a17	Expose enablement of TensorExpr fuser as env variable (#35341 ) Summary: This commit allows one to use an environment variable to enable the fuser in torch/csrc/jit/tensorexpr/ ``` PYTORCH_TENSOREXPR=1 python benchmark.py ``` This commit also changes the registration to happen by default, removing the requirement for the python exposed "_jit_register_tensorexpr_fuser" Pull Request resolved: https://github.com/pytorch/pytorch/pull/35341 Reviewed By: ZolotukhinM Differential Revision: D20676348 Pulled By: bwasti fbshipit-source-id: 4c997cdc310e7567c03905ebff72b3e8a4c2f464	2020-03-26 14:31:57 -07:00
Johannes M Dieterich	d807292c4a	[ROCm] Hotfix disable tests (#35396 ) Summary: Regressions were introduced sometime in the last few days; disable the affected tests for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35396 Differential Revision: D20656744 Pulled By: xw285cornell fbshipit-source-id: 386e4e5d50fb81a1d44e8f3558b81cb69299fe92	2020-03-26 00:21:40 -07:00
Mikhail Zolotukhin	6bcf0b407b	[TensorExpr] Disable fuser-te cuda tests when run on ROCm. (#35388 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35388 Test Plan: Imported from OSS Differential Revision: D20648735 Pulled By: ZolotukhinM fbshipit-source-id: 27bd776fbb84ec81034ace4b874522413d9e5643	2020-03-25 16:04:15 -07:00
Mikhail Zolotukhin	12f0052eee	Add TensorExpr Fuser tests (resubmit). (#35085 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35085 Test Plan: Imported from OSS Differential Revision: D20552334 Pulled By: ZolotukhinM fbshipit-source-id: 628fcf4719a879f18978ff8a0a64afbb045df645	2020-03-20 13:19:31 -07:00
Natalia Gimelshein	3c90a90730	Revert D20540599: Add TensorExpr Fuser tests. Test Plan: revert-hammer Differential Revision: D20540599 Original commit changeset: ced9b6657fe7 fbshipit-source-id: e8fa11f20207c35f39b3fbe6f45fc627715377c1	2020-03-19 18:37:32 -07:00
Mikhail Zolotukhin	7b59f41009	Add TensorExpr Fuser tests. (#35052 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35052 Differential Revision: D20540599 Test Plan: Imported from OSS Pulled By: ZolotukhinM fbshipit-source-id: ced9b6657fe72bca61833ab5d59bdaddcacd114b	2020-03-19 14:31:54 -07:00


247 Commits