Summary: This diff added support for fusing "dq - reshape - q" into a single reshape op; the fused op is needed in the wakeword model.
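For intuition, a minimal eager-mode sketch of why the fusion is sound (this is not the pass itself, which rewrites the pattern in the graph; the shapes and quantization parameters below are made up):
```python
import torch

x_fp32 = torch.randn(2, 8)
scale, zero_point = 0.1, 0
x_q = torch.quantize_per_tensor(x_fp32, scale, zero_point, torch.quint8)

# Unfused: dq -> reshape -> q round-trips through float.
y_q_unfused = torch.quantize_per_tensor(
    torch.dequantize(x_q).reshape(4, 4), scale, zero_point, torch.quint8
)

# Fused: reshape directly on the quantized tensor. This is sound because
# reshape changes only metadata, never values.
y_q_fused = x_q.reshape(4, 4)

assert torch.equal(y_q_fused.int_repr(), y_q_unfused.int_repr())
```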
Test Plan: buck test executorch/exir/tests:quant_fusion_pass
Reviewed By: qihqi, JacobSzwejbka
Differential Revision: D41111069
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88858
Approved by: https://github.com/JacobSzwejbka
This is an API change, so please review carefully.
With this PR, torchdynamo returns an `OptimizedModule` class object, a subclass of `torch.nn.Module`, when asked to optimize a `nn.Module` object. Most of the methods are redirected to the original `nn.Module`, which is installed as `_mod` in the `OptimizedModule`.
This is helpful in many cases:
```python
import torch

mod = MockModule()  # MockModule is any nn.Module
opt_mod = torch._dynamo.optimize()(mod)
print(opt_mod)  # Works
opt_mod = opt_mod.to(device="cuda")
print(opt_mod)  # Works
opt_mod(input)  # Triggers a recompile if necessary; earlier we were shedding the TorchDynamo wrapper
opt_mod.parameters()  # Refers to the original module
```
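For intuition, a minimal sketch of the delegation idea (illustrative only, not the actual implementation, which overrides many more methods):
```python
import torch

class OptimizedModuleSketch(torch.nn.Module):
    """Illustrative wrapper: redirects most attribute access to the
    original module while routing calls through an optimized forward."""

    def __init__(self, mod: torch.nn.Module, optimized_forward):
        super().__init__()
        self._mod = mod  # original module, installed as `_mod` as in the PR
        self._optimized_forward = optimized_forward

    def __getattr__(self, name):
        try:
            # nn.Module.__getattr__ resolves params/buffers/submodules,
            # including `_mod` itself.
            return super().__getattr__(name)
        except AttributeError:
            return getattr(self._mod, name)  # fall back to the wrapped module

    def forward(self, *args, **kwargs):
        # Calling the wrapper keeps the dynamo context alive, so a
        # recompile can be triggered instead of shedding the wrapper.
        return self._optimized_forward(*args, **kwargs)
```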
Topics unclear to me
* I have overridden many methods to raise `NotImplementedError`. A careful review of those would be good.
* How to handle hooks
* For the optimized forward, should we apply the TorchDynamo optimization to `__call__` or to `forward`?
* What else to test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88629
Approved by: https://github.com/Chillee, https://github.com/jansel, https://github.com/msaroufim
**BC Breaking Change**
This renames `unwrapped_params` to `nonwrapped_numel`. I prefer `nonwrapped` over `unwrapped` because "unwrap" suggests that some wrapping has been undone. I prefer `numel` over `params` because that is the unit of measurement; I think we should keep "params" to refer to `nn.Parameter`s themselves.
This only breaks anything that passes `unwrapped_params` as a keyword argument, but I did not see anything that did that (except one internal benchmark file, which does not actually depend on our `pytorch` code).
In a follow-up, I want to rename `min_num_params` to `min_nonwrapped_numel` in `size_based_auto_wrap_policy`, which is also BC breaking. Again, this is to differentiate between "params" being `nn.Parameter`s and "numel" being the unit for `param.numel()`.
**Overview**
This PR introduces `ModuleWrapPolicy` as a lightweight layer over the existing `transformer_auto_wrap_policy`. The most common auto wrapping paradigm is:
```python
import functools
from typing import Set, Type

import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy

module_classes: Set[Type[nn.Module]] = ...
auto_wrap_policy = functools.partial(
    transformer_auto_wrap_policy,
    transformer_layer_cls=module_classes,
)
fsdp_model = FSDP(model, auto_wrap_policy=auto_wrap_policy, ...)
```
Now, users can instead write:
```python
from torch.distributed.fsdp.wrap import ModuleWrapPolicy

auto_wrap_policy = ModuleWrapPolicy(module_classes)
fsdp_model = FSDP(model, auto_wrap_policy=auto_wrap_policy, ...)
```
This hides the unused arguments expected from the callable (`recurse` and `unwrapped_params`/`nonwrapped_numel`).
`ModuleWrapPolicy` inherits from an abstract base class `FSDPPolicy` that expects a `policy` property. This decouples the construction of such `FSDPPolicy` classes from their actual `policy`, which must abide by the `_recursive_wrap` interface. Any existing auto wrap policy can be rewritten as a class that inherits from `FSDPPolicy`, so this approach is fully backward compatible from a functionality perspective.
I call this base class `FSDPPolicy` to generalize over the cases where we may not want to actually perform any nested wrapping. In reality, the policy is meant for constructing `FlatParameter`s, which just happened to be induced by a nested wrapping before. Given this, I am changing the constructor argument in `fully_shard()` to simply `policy` instead of `auto_wrap_policy`.
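A minimal sketch of the shape of this abstraction (the names and bodies here are illustrative, apart from the `policy` property and callable signature described above):
```python
import abc
from typing import Callable, Iterable, Type

import torch.nn as nn


class FSDPPolicySketch(abc.ABC):
    """Sketch of the abstract base: subclasses expose a `policy` callable
    that abides by the `_recursive_wrap` interface."""

    @property
    @abc.abstractmethod
    def policy(self) -> Callable:
        ...


class ModuleWrapPolicySketch(FSDPPolicySketch):
    """Wrap a module iff it is an instance of one of the given classes."""

    def __init__(self, module_classes: Iterable[Type[nn.Module]]):
        self._module_classes = tuple(module_classes)

    @property
    def policy(self) -> Callable:
        def _policy(module: nn.Module, recurse: bool, nonwrapped_numel: int) -> bool:
            # `recurse=True` means "keep traversing"; otherwise decide
            # whether to wrap this module. `nonwrapped_numel` is unused.
            return True if recurse else isinstance(module, self._module_classes)

        return _policy
```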
This PR migrates usages of `transformer_auto_wrap_policy` within our unit test suite to `ModuleWrapPolicy` as much as possible.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88450
Approved by: https://github.com/zhaojuanmao
I'm not sure why I thought this assert was valid in the first place, and there's no comment about it.
The assert is tantamount to saying, "no tensor objects should become dead via SafePyObject when hermetic mode is on." But suppose we run a Python GC while we're inside hermetic mode. This could result in us disposing non-hermetic tensors, which would hit decref. So the assert seems invalid.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88885
Approved by: https://github.com/anjali411, https://github.com/malfet
Dynamo+AotAutograd needs a way to wrap all tensors (whether inputs or params/buffers) in FakeTensor wrappers, and FSDP's mangling of parameters hides them from this wrapping.
This PR unblocks running hf_Bert and hf_T5 with FSDP under dynamo, whether using recursive wrapping around transformer layers or only applying FSDP around the whole model. Perf/memory validation, and possibly optimization, is the next step.
`python benchmarks/dynamo/distributed.py --torchbench_model hf_Bert --fsdp --dynamo aot_eager`
`python benchmarks/dynamo/distributed.py --torchbench_model hf_Bert --fsdp --dynamo aot_eager --fsdp_wrap`
`python benchmarks/dynamo/distributed.py --torchbench_model hf_T5 --fsdp --dynamo aot_eager`
`python benchmarks/dynamo/distributed.py --torchbench_model hf_T5 --fsdp --dynamo aot_eager --fsdp_wrap`
The problem:
Dynamo (actually AotAutograd) trips up with FSDP because it must wrap all input tensors in FakeTensor wrappers, and it only knows to wrap graph inputs or `named_parameters()`/`named_buffers()`. FSDP's pre-forward hook sets views into the flat parameter (which are not `nn.Parameter`s) as attributes on the module with the same names as the original params, but they will not show up in `named_parameters()`.
- In `use_orig_params` mode, FSDP still de-registers params during the pre-forward hook, then re-registers them post-forward.
- During forward (between the hooks), the params are setattr'd on the module as regular view tensors, not `nn.Parameter`s.
- Note: `use_orig_params=True` is the recommended way to use FSDP, and `use_orig_params=False` is being deprecated, so I only consider `use_orig_params=True` for this enablement.
The solution:
- Adding them to `named_buffers` is not possible because it interferes with how FSDP's `_apply` works.
- Since they are not actual `nn.Parameter`s, `register_parameter` will complain about registering them.
- Simply setting `module._parameters[name] = view` seems to be a viable workaround, despite being hacky; FSDP code already modifies `_parameters` directly (see the sketch below).
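A minimal sketch of the workaround, with a hypothetical helper name (`_reinstall_views` is not from the PR):
```python
from typing import Dict

import torch
import torch.nn as nn

def _reinstall_views(module: nn.Module, views: Dict[str, torch.Tensor]) -> None:
    """Hypothetical helper: expose flat-param views through `_parameters`
    so tracing sees them via named_parameters(), even though they are
    plain view tensors rather than nn.Parameters."""
    for name, view in views.items():
        # register_parameter() would reject a non-Parameter tensor, so
        # write into the internal dict directly (hacky but viable; FSDP
        # already mutates `_parameters` elsewhere).
        module._parameters[name] = view
```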
Note: Manual checkpointing still isn't working with FSDP+dynamo, so that will have to be addressed in a follow-up.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88781
Approved by: https://github.com/ezyang, https://github.com/awgu
This comes up if you use inplace operators on a slice, e.g.
```python
import torch
a = torch.rand(1000000, device="cuda")
a[::2] *= 2
```
The last line looks as if it should be fully inplace, but is actually equivalent to:
```python
tmp = a[::2]
tmp *= 2
a[::2] = tmp
```
This results in both `mul_` and `copy_` being called. With this PR, the redundant copy becomes a no-op and the above example is 2x faster.
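A small illustration of why the `copy_` is redundant here: the source of the copy is already a view of the destination, aliasing exactly the same memory (the check shown is illustrative, not necessarily the exact condition the PR tests):
```python
import torch

a = torch.rand(1000000)
tmp = a[::2]   # a view into `a`
tmp *= 2       # in-place, so `tmp` still aliases `a`
# The write-back copies a tensor onto itself:
assert tmp.data_ptr() == a[::2].data_ptr()
assert tmp.stride() == a[::2].stride()
a[::2] = tmp   # with this PR, this copy_ is detected as a no-op
```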
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88884
Approved by: https://github.com/ngimel
Summary:
Usage of fast math in the BatchBoxCox kernel produced different numerical results between the dev and optimized versions, which caused a few internal tests to fail.
For now, disable the compiler-optimized version and rely on the ATen vectorized implementation.
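For context, a sketch of the standard Box-Cox transform that this kernel batches (the kernel's exact parameterization may differ):
```python
import math

def box_cox(x: float, lmbda: float) -> float:
    # Standard Box-Cox transform. Under fast math the compiler may
    # reassociate or contract the pow/log/division below, changing
    # low-order bits relative to the unoptimized build.
    if lmbda == 0.0:
        return math.log(x)
    return (math.pow(x, lmbda) - 1.0) / lmbda
```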
Differential Revision: D41211784
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88875
Approved by: https://github.com/hyuen
**What**
This PR completely removes the `FullyShardedDataParallel` dependency from `_state_dict_utils` -- `_state_dict_utils` now depends only on `_FSDPState` and all the utils modules.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88637
Approved by: https://github.com/awgu
**What**
`_summon_full_params` is required for state_dict. To enable composable FSDP state_dict, `_summon_full_params` must be accessible without `FullyShardedDataParallel`. This PR moves the core logic of `_summon_full_params` to `_unshard_params_utils`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88636
Approved by: https://github.com/awgu
Follow-up for #87735
Once again, because `BUILD_CAFFE2=0` is not tested for the ONNX exporter, one scenario slipped through: a use case in which the model can be exported without ATen fallback when `operator_export_type=ONNX_ATEN_FALLBACK` and `BUILD_CAFFE2=0`.
A new unit test has been added, but it won't prevent regressions if `BUILD_CAFFE2=0` is not exercised on CI again.
Fixes #87313
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88504
Approved by: https://github.com/justinchuby, https://github.com/BowenBao
**What This PR Does**
`_state_dict_utils` currently accesses the FSDP states through `module`. To enable composable FSDP state_dict, these accesses need to go through `_FSDPState`. `module` is still required for most APIs, as state_dict has to access per-module information.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88635
Approved by: https://github.com/awgu
WIP to fix an extremely slow `scatter_add` case relative to fp16. The current changes seem to improve performance, but it still appears to lag behind the fp16 equivalent.
CC @ngimel @ptrblck
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84981
Approved by: https://github.com/ngimel
In `FakeTensorMode.__torch_dispatch__`, the output is always computed by meta kernels in:
```python
try:
    with in_kernel_invocation_manager(self):
        r = func(*args, **kwargs)  # <----- "r" can be a real tensor.
except NotImplementedError as not_implemented_error:
    # no meta kernel registered, fallback to kernel for the device
    if not self.allow_fallback_kernels:
        raise not_implemented_error
    return run_fallback_kernel(self, func, args, kwargs, not_implemented_error)
return self.wrap_meta_outputs_with_default_device_logic(r, func, args, kwargs)
```
For example, I observed that a CPU tensor is generated when executing `aten.addmm` while running `FakeTensorProp`. Therefore, I'd like to allow `FakeTensorMode` to wrap a real tensor as a `FakeTensor` during the computation. Does this PR look like a good direction to fix this problem? If yes, I can go ahead and add some tests.
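For reference, a minimal repro sketch of the scenario described above (the module and shapes are made up):
```python
import torch
from torch.fx import symbolic_trace
from torch.fx.passes.fake_tensor_prop import FakeTensorProp

class M(torch.nn.Module):
    def forward(self, b, x, w):
        return torch.addmm(b, x, w)  # lowers to aten.addmm

gm = symbolic_trace(M())
# Propagates FakeTensors through the graph; the issue above is that an
# op falling outside the meta path could surface a real CPU tensor here.
FakeTensorProp(gm).propagate(torch.randn(4), torch.randn(2, 3), torch.randn(3, 4))
```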
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88700
Approved by: https://github.com/eellison, https://github.com/ezyang
This is one step toward the ultimate goal: remove the overwritten `state_dict` in FSDP. All the logic should live in either `pre_state_dict_hook` or `post_state_dict_hook`.
Since the current `nn.Module` does not support `pre_state_dict_hook`, this PR mimics `pre_state_dict_hook` by calling the pre-hook inside the post-hook, effectively discarding all the work done by `nn.Module.state_dict`. Once `pre_state_dict_hook` is supported by `nn.Module`, these pre-hook calls can be moved out of the post-hooks and registered via `nn.Module.pre_state_dict_hook`.
The major issue with this temporary solution is that `post_state_dict_hook` is called from the leaf nodes to the root node. This invalidates `module._lazy_init()`, as FSDP assumes `_lazy_init()` is called from the root. As a result, `FSDP.state_dict` currently contains only one piece of logic -- calling `module._lazy_init()`.
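A minimal sketch of the pre-hook-inside-post-hook pattern (hook names and bodies are illustrative; `_register_state_dict_hook` is the existing private `nn.Module` post-hook mechanism):
```python
import torch.nn as nn

def _pre_state_dict_hook(module: nn.Module, prefix: str) -> None:
    # Illustrative pre-hook work, e.g. preparing full parameters.
    pass

def _post_state_dict_hook(module, state_dict, prefix, local_metadata):
    # nn.Module only invokes post-hooks today, so the pre-hook runs at
    # the top of the post-hook, effectively redoing (and so discarding)
    # what nn.Module.state_dict already produced for this module.
    _pre_state_dict_hook(module, prefix)
    # ...transform the entries of `state_dict` under `prefix` here...
    return state_dict

mod = nn.Linear(2, 2)
mod._register_state_dict_hook(_post_state_dict_hook)
sd = mod.state_dict()
```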
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87900
Approved by: https://github.com/rohan-varma