Pull Request resolved: https://github.com/pytorch/pytorch/pull/77699
The current machinery to connect libtorch to libtorch_python for profiling is... meh. Adequate for separate components that mostly just need to send a trigger, but not really clean. This PR makes an abstract interface class that the python tracer subclasses so the profiler can actually get at the tracer singleton, albeit through a restricted interface. This will help fold Python tracing into the new unified event structure.
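For illustration, the shape of the pattern sketched in Python (the real implementation is C++ spanning libtorch and libtorch_python; all names here are illustrative, not the actual API):
```python
from abc import ABC, abstractmethod
from typing import Optional

class PythonTracerBase(ABC):
    """Restricted interface the profiler is allowed to see (names illustrative)."""

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...

_tracer: Optional[PythonTracerBase] = None

def register_tracer(tracer: PythonTracerBase) -> None:
    # The tracer side registers its singleton behind the interface.
    global _tracer
    _tracer = tracer

def get_tracer() -> Optional[PythonTracerBase]:
    # The profiler reaches the tracer only through the restricted interface.
    return _tracer
```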
Differential Revision: [D36325739](https://our.internmc.facebook.com/intern/diff/D36325739/)
Approved by: https://github.com/aaronenyeshi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77698
This PR adds tree building to the profiler's post processing. The basic algorithm is to sort the events, maintain a stack and a priority queue of event ends, and push/pop accordingly. The logic for merging Python events is still separate in `profiler_kineto.cpp`. That can be removed when Python events have an `EventType`.
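A minimal sketch of the stack-based replay for a single, properly nested stream (the real post processing also coordinates a priority queue of event ends across streams; all names here are illustrative):
```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Event:  # stand-in for the profiler's internal event type
    start: int
    end: int
    children: List["Event"] = field(default_factory=list)

def build_tree(events: List[Event]) -> List[Event]:
    events = sorted(events, key=lambda e: e.start)
    stack: List[Event] = []
    roots: List[Event] = []
    for e in events:
        # Pop everything that has finished by the time this event starts.
        while stack and stack[-1].end <= e.start:
            stack.pop()
        # Whatever remains on top of the stack encloses the current event.
        (stack[-1].children if stack else roots).append(e)
        stack.append(e)
    return roots
```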
Differential Revision: [D36321105](https://our.internmc.facebook.com/intern/diff/D36321105/)
Approved by: https://github.com/aaronenyeshi
Summary:
After https://github.com/pytorch/pytorch/pull/77608, `example_inputs` is a required input for `prepare_fx` and `prepare_qat_fx`.
This makes quantizing submodules harder, so we added this utility function to get a dictionary from fqn (fully qualified name) to submodule example inputs.
Example Call:
```
example_inputs = (tensor0,)
get_fqn_to_example_inputs(m, example_inputs)
```
Example output:
```
{
"linear1": (tensor1,),
"linear2": (tensor2,),
"sub": (tensor3,),
"sub.linear1": (tensor4,),
...
}
```
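A hedged sketch of how the mapping might be used to quantize a single submodule, assuming the utility is importable from `torch.ao.quantization.utils` and that a `qconfig_dict` is defined by the caller:
```python
from torch.ao.quantization.quantize_fx import prepare_fx
from torch.ao.quantization.utils import get_fqn_to_example_inputs

fqn_to_example_inputs = get_fqn_to_example_inputs(m, example_inputs)
# Prepare one submodule with the example inputs captured for it.
m.sub.linear1 = prepare_fx(
    m.sub.linear1,
    qconfig_dict,  # assumed to be defined elsewhere
    fqn_to_example_inputs["sub.linear1"],
)
```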
Test Plan:
python test/test_quantization.py TestUtils
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78146
Approved by: https://github.com/vkuzo
Summary: Add an argument to ``torch.cuda.make_graphed_callables`` specifying the number of warm-up iterations. By default it needs 3 warm-up iterations; to work with NCCL, it needs 11.
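A hedged usage sketch with the argument name this PR adds (the module and inputs are illustrative):
```python
import torch

model = torch.nn.Linear(8, 8).cuda()
sample_args = (torch.randn(4, 8, device="cuda"),)

# Request extra warm-up iterations, e.g. 11 when NCCL is involved.
graphed_model = torch.cuda.make_graphed_callables(
    model, sample_args, num_warmup_iters=11
)
```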
Differential Revision: D36606758
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78124
Approved by: https://github.com/jianyuh
Use pyupgrade (https://github.com/asottile/pyupgrade) and flynt to modernize Python syntax:
```sh
pyupgrade --py36-plus --keep-runtime-typing torch/onnx/**/*.py
pyupgrade --py36-plus --keep-runtime-typing test/onnx/**/*.py
flynt torch/onnx/ --line-length 120
```
- Use f-strings for string formatting
- Use the new `super()` syntax for class initialization
- Use dictionary / set comprehension
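For illustration, the kinds of rewrites these tools apply (examples are illustrative, not taken from the actual diff):
```python
name = "Add"
message = f"exporting {name}"   # was: "exporting %s" % name

class Exporter:
    def __init__(self):
        super().__init__()      # was: super(Exporter, self).__init__()

ops = ["Add", "Mul", "Add"]
op_names = {op for op in ops}   # was: set([op for op in ops])
```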
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77935
Approved by: https://github.com/BowenBao
Euler beta function:
```Python
torch.special.beta(input, other, *, out=None) → Tensor
```
I started working on this before realizing we were missing a gamma implementation (despite providing incomplete gamma implementations), so `reentrant_gamma` and `reentrant_ln_gamma` implementations (using Stirling's approximation) are provided. They use the coefficients computed by Steve Moshier to replicate SciPy's implementation, and likewise mimic SciPy's behavior (instead of the behavior in Cephes).
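As a sanity check, B(x, y) = Γ(x)Γ(y) / Γ(x + y), so the op this PR adds should agree with a composition of existing gamma ops (a hedged sketch; exact numerics can differ at extreme inputs):
```python
import torch

x = torch.tensor([0.5, 2.0, 3.5])
y = torch.tensor([1.5, 4.0, 0.25])

# B(x, y) = exp(lgamma(x) + lgamma(y) - lgamma(x + y))
via_lgamma = torch.exp(torch.lgamma(x) + torch.lgamma(y) - torch.lgamma(x + y))
result = torch.special.beta(x, y)  # the op proposed by this PR
```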
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78031
Approved by: https://github.com/mruberry
Add codegen infrastructure to generate IR nodes for non-native ops.
The proposed change is to add a `non_native` key to the `{backend}_native_functions.yaml` file that contains schema definitions similar to what is found in `native_functions.yaml`, e.g.
```
non_native:
...
- func: expand(Tensor input, int[] size, bool is_scalar_expand) -> Tensor
...
```
These definitions are parsed into a `LazyIrSchema` that can be used for generating IR nodes using `GenLazyIR`.
Fixes #74628
CC: @wconstab @desertfire @henrytwo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76535
Approved by: https://github.com/wconstab
(reopening due to botched merge)
The cuDNN V8 API (main support merged in https://github.com/pytorch/pytorch/pull/60755) potentially exposes many more kernels with `benchmark=True`. While these additional kernels can improve performance, it is often unnecessary to run every kernel returned by the heuristic, and doing so may degrade the user experience by causing the first model iteration to be very slow. To alleviate this issue, this PR introduces `torch.backends.cudnn.benchmark_limit`. `benchmark_limit` specifies the maximum number of working cuDNN kernels to try for a given workload, with the default being 10 (similar to what TensorFlow does). `benchmark_limit = 0` yields the current behavior of trying every kernel returned by the heuristic.
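For example (a minimal usage sketch):
```python
import torch

torch.backends.cudnn.benchmark = True
# Try at most 10 working cuDNN kernels per workload; 0 restores
# the old behavior of trying every kernel the heuristic returns.
torch.backends.cudnn.benchmark_limit = 10
```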
CC @ptrblck @ngimel @xwang233
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77002
Approved by: https://github.com/ngimel
Feels good to delete it from `torch._decomps`. This is mainly to clarify the process for me -
It seems there are still some components missing from the `torch <-> refs` mapping? For example, methods don't seem to work yet for the torch <-> refs mapping, and neither do the meta tests? (cc: @ezyang)
If I replace `torch` with `refs`, then the tests seem to pass.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78041
Approved by: https://github.com/mruberry
Fixes #68172. Generally, this corrects multiple flaky convolution unit tests seen on ROCm.
The MIOpen integration has been forcing `benchmark=True` even when `torch._C._set_cudnn_benchmark(False)` is called, typically via `torch.backends.cudnn.set_flags(enabled=True, benchmark=False)`. We now add support for MIOpen immediate mode to avoid benchmarking during MIOpen solution selection.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77438
Approved by: https://github.com/ngimel, https://github.com/malfet
This PR adds the item, equal, any, and all references.
While doing this I found the following issues:
- https://github.com/pytorch/pytorch/issues/78070
- https://github.com/pytorch/pytorch/issues/78071
And I fixed a bug where the `convert_element_type` prim could not convert tensors requiring grad to datatypes that don't require grad.
Creating the item reference required adding item as a prim, but per @ngimel's suggestion I removed the prims for any and all and implemented them as references, so this is net negative one prim.
Reference OpInfos are added for any and all, but item and equal don't even have regular OpInfos.
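A minimal sketch of expressing `all` as a reference in terms of `any` (illustrative only; the actual refs implementation may differ):
```python
import torch

def ref_all(a: torch.Tensor, dim=None, keepdim: bool = False) -> torch.Tensor:
    # all(a) == not any(not a), reducing over the same dimensions
    neg = torch.logical_not(a)
    any_neg = torch.any(neg) if dim is None else torch.any(neg, dim=dim, keepdim=keepdim)
    return torch.logical_not(any_neg)

a = torch.tensor([[True, False], [True, True]])
assert torch.equal(ref_all(a, dim=1), torch.all(a, dim=1))
```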
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78072
Approved by: https://github.com/ngimel
This PR...
**Issues Found**
- https://github.com/pytorch/pytorch/issues/78058
- https://github.com/pytorch/pytorch/issues/78054
- https://github.com/pytorch/pytorch/issues/78053
- https://github.com/pytorch/pytorch/issues/78050
- https://github.com/pytorch/pytorch/issues/77932
**Testing**
- disables stride consistency checks in test_ops and test_meta pending resolution of https://github.com/pytorch/pytorch/issues/78050
- skips chalf in reference tests (addressing https://github.com/pytorch/pytorch/issues/78054)
- splits test_python_reference_consistency into one test for the context where torch.foo is torch.foo, and another for when torch.foo is refs.foo
- updates test names to be more natural and consistent:
  - test_python_reference_errors -> test_python_ref_errors
  - test_python_reference_consistency -> test_python_ref and test_python_ref_torch_fallback
  - test_python_reference_meta_functions -> test_python_ref_meta
  - test_reference_testing -> test_numpy_ref
- updates test_python_ref and test_python_ref_torch_fallback to check that the reference is more accurate than the torch op when their results are not close; a warning is raised when this occurs (addressing https://github.com/pytorch/pytorch/issues/77687)
- adds reference inputs for broadcast_tensors
- Updates the "fill_" OpInfo to "fill", adding a NumPy reference and making it an elementwise unary operator
- Adds 1D sample inputs with no elements to the cat OpInfo and updates the NumPy reference to handle them and type promotion correctly
- Adds reference inputs for elementwise ternary operations, like clamp
- Adds a NumPy reference for clamp
- Adds reference inputs to where's OpInfo
- Makes softplus an elementwise unary OpInfo
- Removes the great majority of Python reference OpInfo skips and xfails due to the above test changes
- Adds Python reference OpInfos for fill, dropout, clamp, broadcast_tensors, and where
**Prims**
- adds the fill, empty_strided, and uniform prims
- removes the empty, empty_like, full, and full_like prims -- these are now references that use empty_strided and fill
- renames the "concatenate" and "select" prims to "cat" and "where", respectively, to be consistent with PyTorch
- extends the `_elementwise_meta` operation to accept tensors that don't participate in type promotion, like the `cond` tensor in `where`
- fixes a bug in the stride propagation of broadcast_in_dim
- moves some error checks from prims.cat and prims.where to refs.cat and refs.where, respectively, consistent with our new policy of doing as much error checking in the ref as possible
**Utils**
- adds the canonicalize_device, extract_shape, and extract_shape_from_varargs helpers
- adds the elementwise_unary_scalar_wrapper -- this allows elementwise unary operators to take and return scalar values (e.g., refs.sin(1) returns 0.84...)
**Refs**
- adds the fill, broadcast_tensors, clamp, empty_strided, ones, zeros, and uniform references
- adds the nn.functional.dropout reference
- fixes refs.cat to handle 1D tensors with no elements, consistent with eager mode
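A hedged usage sketch of a couple of the new references (the private `torch._refs` module path is assumed):
```python
import torch
import torch._refs as refs  # private namespace; module path assumed

a = torch.randn(3)
clamped = refs.clamp(a, min=-0.5, max=0.5)  # new clamp reference
s = refs.sin(1)  # scalar wrapper: accepts and returns a scalar (~0.841)
```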
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78026
Approved by: https://github.com/ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77696
https://github.com/pytorch/pytorch/pull/63619 added a RECORD_FUNCTION guard to make calls to `Engine::evaluate_function` visible regardless of the underlying op. While useful, this creates a call that looks like a forward call, which somewhat complicates stitching forward and backward ops. I don't want to add complexity (and therefore work) on the hot path; instead it's fairly straightforward to stitch things back together in post. This PR simply propagates sequence number and forward tid info up to the `evaluate_function` event.
Differential Revision: [D36302562](https://our.internmc.facebook.com/intern/diff/D36302562/)
Approved by: https://github.com/aaronenyeshi
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77693
Right now the profiler internals are rather ad-hoc and disjoint. As we move towards a unified experience this needs to be addressed. This PR adds an enum specifying the various types of events that can be profiled and specializes the `ExtraFields` struct on the values of the `EventType` enum. This lets us punt more of the heterogeneity onto the type system and allows a caller to simply think in terms of `ExtraFields<EventType::...>`. (No more "X field is always present but only makes sense for Y". e.g. inputs)
For now only ops and backend events are transitioned since they are already in a weird union state. Changes planned for subsequent diffs in the stack:
1) Allocations
2) Python tracer events
3) Kineto (e.g. Cupti) events
4) Use unified event type for more post processing
One rather pleasant observation was that this change exposed several minor bugs in the current implementation:
1) We just didn't plumb `end_thread_id_` from `OpEvent` to `Result`. Switching to using ctors rather than setting fields in `getRecords` fixes this.
2) We were calling `fn.threadId()` to get start TID, but that is wasteful because it is already stored in the `ThreadLocalSubqueue`.
So that gives me some confidence that this is a step in the right direction.
Differential Revision: [D36189044](https://our.internmc.facebook.com/intern/diff/D36189044/)
**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36189044/)!
Approved by: https://github.com/aaronenyeshi
Adds `exp2` and `log10` to the prims (both also exist in the C++ standard library, and Intel SIMD intrinsics include `exp2`).
Adds `exp2`, `log10`, and `frac` to the refs, with corresponding OpInfo entries.
Tried to decompose `exp2` (before adding it as a prim) as:
* `exp(log(2) * x)`, but it wasn't stable for large values.
* `pow(2, x)`, in which case there was a stride mismatch. At a cursory look, `pow` tries to preserve the stride of its first argument when possible.
Tried to decompose `log10` (before adding it as a prim) as:
* `log(x) / log(10)`, which passed for real dtypes but failed for complex at extremals. Probably related to https://github.com/pytorch/pytorch/issues/52332 (not 100% sure).
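A hedged sketch of the kind of check that surfaces the instability (values illustrative):
```python
import torch

x = torch.tensor([10.0, 50.0, 100.0])

# Candidate decomposition vs. the dedicated kernel; the relative error
# can grow with magnitude, which is why exp2 was kept as a prim.
via_exp = torch.exp(torch.log(torch.tensor(2.0)) * x)
direct = torch.exp2(x)
rel_err = (direct - via_exp).abs() / direct
```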
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78046
Approved by: https://github.com/mruberry
cc: @jansel @bertmaher
More or less identical in spirit to the layer norm and batch norm ones.
One annoying thing about all 3 of these is that layer_norm has slightly different `mean/var` semantics than batch norm and group norm. After normalization, `layer_norm` keeps them unsqueezed (so they're something like [1, 5, 1, 1]) while batch norm and group norm squeeze out the 1-dims.
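A minimal illustration of the shape difference, with plain reductions standing in for the decompositions' statistics:
```python
import torch

x = torch.randn(8, 5, 4, 4)

# Unsqueezed statistics, as the layer_norm decomposition keeps them:
mean_keep = x.mean(dim=(0, 2, 3), keepdim=True)  # shape [1, 5, 1, 1]
# Squeezed statistics, as batch norm and group norm return them:
mean_squeezed = x.mean(dim=(0, 2, 3))            # shape [5]
```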
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78029
Approved by: https://github.com/bertmaher
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77996
An issue recently surfaced internally highlighting that removing KinetoThreadLocalState from the TLS at the end of post processing means we are profiling memory during post processing (which violates a whole bunch of invariants in the system). This change switches the global profiling context to a shared_ptr, introduces a class to manage it (`init`, `get`, and `pop` methods), and moves the `pop` call to the beginning of `disableProfiler`.
Differential Revision: [D36555738](https://our.internmc.facebook.com/intern/diff/D36555738/)
Approved by: https://github.com/aaronenyeshi
Summary: Move storage saving to the last step; otherwise, tensors saved after the storages have already been written will not have storage.
Test Plan: Tested by loading the file from `clowder get GLDGLQnKrIsQFg8DAPxq9vg59ZwZbmQwAAAA orig.pt`, converting to flatbuffer, and loading again
Differential Revision: D36552645
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78024
Approved by: https://github.com/Jack-Khuu
In general, if we expect users to use a base class such as `_ConvNd`, we should rename it to something like `BaseConv`. However, because this base class is only used inside the AO packages, there is no need to expose it to the users.
Test Plan:
```
python test/test_quantization.py
python test/test_module_init.py
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77344
Approved by: https://github.com/jerryzh168
Because AO depends on the torch packages, but very few (if any) torch packages depend on AO, we are moving the AO imports lower. This reduces the probability of cyclic imports: by the time AO starts importing, the rest of torch will already be imported.
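An illustrative ordering sketch (not the actual `torch/__init__.py` contents):
```python
# Illustrative import ordering only; not the actual torch/__init__.py.
from torch import nn      # core packages imported first
from torch import optim
# ...
from torch import ao      # AO last: it imports from the packages above
```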
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77065
Approved by: https://github.com/albanD
Cleaning up onnx module imports to prepare for updating `__init__`.
- Simplify importing the `_C` and `_C._onnx` namespaces
- Remove the alias of the symbolic_helper module in imports
- Remove any module-level function imports; import modules instead
- Alias `symbolic_opsetx` as `opsetx`
- Fix some docstrings
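For illustration, the resulting import style might look like this (a sketch following the list above):
```python
from torch import _C
from torch._C import _onnx as _C_onnx
from torch.onnx import symbolic_helper            # module import, no alias
from torch.onnx import symbolic_opset9 as opset9  # opset modules aliased as opsetN
```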
Requires:
- https://github.com/pytorch/pytorch/pull/77448
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77423
Approved by: https://github.com/BowenBao