Commit Graph

21846 Commits

Taylor Robie
e0a071a47e [Profiler] Abstract interface for Python tracer
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77699

The current machinery to connect libtorch to libtorch_python for profiling is... meh. Adequate for separate components that mostly just need to send a trigger, but not really clean. This PR makes an abstract interface class that the python tracer subclasses so the profiler can actually get at the tracer singleton, albeit through a restricted interface. This will help fold Python tracing into the new unified event structure.
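The actual code is C++ in libtorch/libtorch_python; the sketch below only illustrates the shape of the design in Python, and all names are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class PythonTracerBase(ABC):
    """Restricted interface the profiler is allowed to see (hypothetical names)."""

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...

    @abstractmethod
    def get_events(self) -> List[dict]: ...


# The core library only knows about the base class; the Python tracer
# registers its singleton when it is available.
_active_tracer: Optional[PythonTracerBase] = None


def register_tracer(tracer: PythonTracerBase) -> None:
    global _active_tracer
    _active_tracer = tracer


def get_tracer() -> Optional[PythonTracerBase]:
    return _active_tracer
```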

Differential Revision: [D36325739](https://our.internmc.facebook.com/intern/diff/D36325739/)

Approved by: https://github.com/aaronenyeshi
2022-05-25 16:11:01 +00:00
Taylor Robie
34d160b1fa [Profiler] Build call tree in collection.cpp
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77698

This PR adds tree building to the post processing of profiler. The basic algorithm is to sort the events, maintain a stack and a priority queue of event ends, and push/pop accordingly. The logic for merging Python events is still separate in `profiler_kineto.cpp`. That can be removed when Python events have an `EventType`.
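As a rough sketch of that algorithm in Python (event fields and the nesting assumption are mine, not the profiler's actual data structures):

```python
import heapq
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    start: int                       # e.g. nanoseconds
    end: int
    name: str
    parent: Optional["Event"] = field(default=None, repr=False)
    children: List["Event"] = field(default_factory=list)


def build_call_tree(events: List[Event]) -> List[Event]:
    """Sort by start time; keep a stack of open events and a heap of their end times.

    Assumes events are properly nested (a child ends before its parent),
    so the heap top always corresponds to the top of the stack.
    """
    roots: List[Event] = []
    stack: List[Event] = []
    ends: List[int] = []             # min-heap of end times of open events
    for ev in sorted(events, key=lambda e: e.start):
        # Close every open event that finished before this one starts.
        while ends and ends[0] <= ev.start:
            heapq.heappop(ends)
            stack.pop()
        if stack:
            ev.parent = stack[-1]
            stack[-1].children.append(ev)
        else:
            roots.append(ev)
        stack.append(ev)
        heapq.heappush(ends, ev.end)
    return roots
```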

Differential Revision: [D36321105](https://our.internmc.facebook.com/intern/diff/D36321105/)

Approved by: https://github.com/aaronenyeshi
2022-05-25 16:11:01 +00:00
PyTorch MergeBot
87148f2b59 Revert "[quant] Add utility function get_fqn_to_example_inputs"
This reverts commit 50a44fe461.

Reverted https://github.com/pytorch/pytorch/pull/78146 on behalf of https://github.com/suo because it broke master
2022-05-25 06:37:32 +00:00
kshitij12345
17c1aed2b5 remove torch.no_grad from sample_inputs (#78076)
As per title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78076
Approved by: https://github.com/Lezcano, https://github.com/ngimel
2022-05-25 06:10:32 +00:00
Jerry Zhang
50a44fe461 [quant] Add utility function get_fqn_to_example_inputs
Summary:
After https://github.com/pytorch/pytorch/pull/77608, `example_inputs` is a required input for `prepare_fx` and `prepare_qat_fx`.
This makes quantizing submodules harder, so we added this utility function to get a dictionary mapping FQN (fully qualified name) to submodule example_inputs.

Example Call:

```
example_inputs = (tensor0,)
get_fqn_to_example_inputs(m, example_inputs)
```

Example output:
```
{
   "linear1": (tensor1,),
   "linear2": (tensor2,),
   "sub": (tensor3,),
   "sub.linear1": (tensor4,),
   ...
}
```

Test Plan:
python test/test_quantization.py TestUtils

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78146

Approved by: https://github.com/vkuzo
2022-05-25 03:07:16 +00:00
Edward Z. Yang
a1765f0176 addr ref
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78014

Approved by: https://github.com/ngimel
2022-05-25 01:40:11 +00:00
Xinfeng Xie
72a4f6773d Add an argument to specify warmup iterations (#78124)
Summary: Add an argument to the API ``torch.cuda.make_graphed_callables`` that specifies the number of warm-up iterations. By default it needs 3 warm-up iterations; to work with NCCL, it needs 11.
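A hedged usage sketch (the keyword name `num_warmup_iters` follows this PR's description and should be checked against the actual `torch.cuda.make_graphed_callables` signature):

```python
import torch

# Requires a CUDA device; CUDA graph capture only works on GPU.
model = torch.nn.Linear(64, 64).cuda()
sample = torch.randn(8, 64, device="cuda")

# Default is 3 warm-up iterations; NCCL workloads reportedly need 11.
graphed_model = torch.cuda.make_graphed_callables(model, (sample,), num_warmup_iters=11)
out = graphed_model(sample)
```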

Differential Revision: D36606758

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78124
Approved by: https://github.com/jianyuh
2022-05-25 01:21:15 +00:00
PyTorch MergeBot
d450034f24 Revert "Beta function (#78031)"
This reverts commit da16450360.

Reverted https://github.com/pytorch/pytorch/pull/78031 on behalf of https://github.com/suo because it broke trunk; see the message above
2022-05-24 22:55:06 +00:00
Justin Chu
161e931156 [ONNX] Modernize python syntax (#77935)
Use pyupgrade (https://github.com/asottile/pyupgrade) and flynt to modernize Python syntax; an illustrative before/after sketch follows the list below.

```sh
pyupgrade --py36-plus --keep-runtime-typing torch/onnx/**/*.py
pyupgrade --py36-plus --keep-runtime-typing test/onnx/**/*.py
flynt torch/onnx/ --line-length 120
```

- Use f-strings for string formatting
- Use the new `super()` syntax for class initialization
- Use dictionary / set comprehension
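For illustration, this is the kind of rewrite those tools perform (a made-up example, not code from torch/onnx):

```python
class BaseExporter:
    pass


# Before: percent formatting, old-style super(), dict() over a generator
class ExporterOld(BaseExporter):
    def __init__(self, opset):
        super(ExporterOld, self).__init__()
        self.banner = "exporting with opset %d" % opset
        self.ops = dict((name, True) for name in ("Add", "Mul"))


# After pyupgrade/flynt: f-string, bare super(), dict comprehension
class ExporterNew(BaseExporter):
    def __init__(self, opset):
        super().__init__()
        self.banner = f"exporting with opset {opset}"
        self.ops = {name: True for name in ("Add", "Mul")}
```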
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77935
Approved by: https://github.com/BowenBao
2022-05-24 22:52:37 +00:00
soulitzer
f3af51069d Modernize LoggingTensorMode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77667

Approved by: https://github.com/malfet
2022-05-24 22:41:49 +00:00
soulitzer
588826b389 Fix gradcheck when outputs that don't require grad precede those that do
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77743

Approved by: https://github.com/malfet
2022-05-24 22:41:49 +00:00
Brian Hirsh
07e4533403 reland of as_strided support for functionalization; introduce as_strided_scatter
This reverts commit a95f1edd85.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78199

Approved by: https://github.com/ezyang
2022-05-24 22:40:44 +00:00
Elias Ellison
2d93e1fada Add slow path for device
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77684

Approved by: https://github.com/ezyang
2022-05-24 21:56:01 +00:00
Sherlock Huang
6db8440f35 Python Jiterator supports multiple outputs (#78139)
This PR is part 3.
Part 1: https://github.com/pytorch/pytorch/pull/77902
Part 2: https://github.com/pytorch/pytorch/pull/77921

Python Jiterator now supports returning multiple outputs

```
fn = torch.cuda.jiterator._create_multi_output_jit_fn(
"""
template <typename T>
T binary_2outputs(T i0, T i1, T& out0, T& out1) {
    out0 = i0 + i1;
    out1 = i0 - i1;
}
""",
num_outputs=2)

x = torch.rand(3, device='cuda')
y = torch.rand(3, device='cuda')
out0, out1 = fn(x, y)

torch.allclose(out0, x+y)
torch.allclose(out1, x-y)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78139
Approved by: https://github.com/ngimel
2022-05-24 21:52:56 +00:00
PyTorch MergeBot
b994ce359e Revert "[cuDNN V8 API] (reopen) Allow the number of kernels profiled under torch.backends.cudnn.benchmark = True to be limited (#77002)"
This reverts commit c274f2ad52.

Reverted https://github.com/pytorch/pytorch/pull/77002 on behalf of https://github.com/malfet because it breaks internal CI; also, no CUDA headers should be included from `torch/csrc/Module.cpp`; they should instead be implemented/registered in `torch/csrc/cuda/Module.cpp`
2022-05-24 21:52:35 +00:00
Allen Goodman
da16450360 Beta function (#78031)
Euler beta function:

```Python
torch.special.beta(input, other, *, out=None) → Tensor
```

`reentrant_gamma` and `reentrant_ln_gamma` implementations (using Stirling’s approximation) are provided. I started working on this before I realized we were missing a gamma implementation (despite providing incomplete gamma implementations). It uses the coefficients computed by Steve Moshier to replicate SciPy’s implementation, and likewise mimics SciPy’s behavior (instead of the behavior in Cephes).
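Since this PR was later reverted, the proposed `torch.special.beta` may not be available; as a hedged illustration, the values it targets can be cross-checked with the existing `lgamma`, using B(a, b) = exp(lgamma(a) + lgamma(b) - lgamma(a + b)):

```python
import torch

a = torch.tensor([0.5, 2.0, 5.0])
b = torch.tensor([0.5, 3.0, 1.5])

# B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b), computed in log space for stability.
beta_via_lgamma = torch.exp(torch.lgamma(a) + torch.lgamma(b) - torch.lgamma(a + b))
print(beta_via_lgamma)  # approximately tensor([3.1416, 0.0833, 0.0739])

# With the API proposed by this PR (hypothetical if the revert stands):
# torch.testing.assert_close(torch.special.beta(a, b), beta_via_lgamma)
```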
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78031
Approved by: https://github.com/mruberry
2022-05-24 21:07:25 +00:00
Aidyn-A
f37ce948ff add bfloat16 support for kl_div_backward_cuda (#77676)
This PR adds a feature requested in issue #77375.
`kl_div_backward_cuda` now supports `bfloat16`

cc @ngimel @ptrblck @rosrad

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77676
Approved by: https://github.com/jbschlosser
2022-05-24 20:46:30 +00:00
PyTorch MergeBot
a95f1edd85 Revert "as_strided support for functionalization; introduce as_strided_scatter"
This reverts commit 3a921f2d26.

Reverted https://github.com/pytorch/pytorch/pull/77128 on behalf of https://github.com/suo because it broke ROCm tests on master (3a921f2d26). ROCm tests are no longer run on PRs; add a `ciflow/trunk` label if you want to run them
2022-05-24 20:19:12 +00:00
Scott Wolchok
c083489f46 [kineto] Optimize getStepCallbacks for common case of no active callbacks
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77804

IIUC, the result of this function will be empty and unused if there are no sampled callbacks, which is the common case. We can accelerate this case by wrapping the result in an optional to save initializing an empty SmallVector.

Differential Revision: [D36497279](https://our.internmc.facebook.com/intern/diff/D36497279/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36497279/)!

Approved by: https://github.com/robieta
2022-05-24 19:38:01 +00:00
Antonio Kim
02c4d877b4 Codegen Non-Native IR Nodes (#76535)
Add codegen infrastructure to generate IR nodes for non-native ops.

The proposed change is to add a `non_native` key to the `{backend}_native_functions.yaml` file that contains schema definitions similar to what is found in `native_functions.yaml`. e.g.
```
non_native:
    ...
    - func: expand(Tensor input, int[] size, bool is_scalar_expand) -> Tensor
    ...
```
these definitions are parsed into a `LazyIrSchema` that can be used for generating IR nodes using `GenLazyIR`.

Fixes #74628

CC: @wconstab @desertfire @henrytwo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76535
Approved by: https://github.com/wconstab
2022-05-24 19:29:23 +00:00
kshitij12345
ab5e6f0915 [chalf] enable testing for multiple ops (#78171)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78171
Approved by: https://github.com/ngimel
2022-05-24 19:11:10 +00:00
Brian Hirsh
3a921f2d26 as_strided support for functionalization; introduce as_strided_scatter
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77128

Approved by: https://github.com/ezyang
2022-05-24 18:20:31 +00:00
lezcano
0c8c39fa71 Fix derivatives of norm(p=inf)
Following up on https://github.com/pytorch/pytorch/pull/51099#discussion_r583323915, we fix these derivatives, as they were incorrect until now.

As described in the note, the better solution would be to use vectorised operations on the preprocessing operation when reducing on CPU. It's not clear how difficult that may be.

Fixes https://github.com/pytorch/pytorch/issues/67517
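As a quick sanity check of what the fixed derivative should produce (standard calculus, not code from the PR): at a unique max-magnitude entry the gradient of the infinity norm is the sign of that entry and zero elsewhere, with a subgradient at ties.

```python
import torch

x = torch.tensor([1.0, -3.0, 2.0], requires_grad=True)
torch.linalg.vector_norm(x, ord=float("inf")).backward()
print(x.grad)  # expected: tensor([ 0., -1.,  0.])  -- sign of the unique max-|x_i| entry
```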

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78105

Approved by: https://github.com/ngimel
2022-05-24 17:16:16 +00:00
Kshiteej K
664bb4de49 [composite compliance] backward: cummin, cummax (#77872)
Reference : https://github.com/pytorch/pytorch/issues/69991
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77872
Approved by: https://github.com/zou3519
2022-05-24 17:10:09 +00:00
PyTorch MergeBot
821c711baf Revert "Move THPStorage definitions out of torch/csrc/generic (#78032)"
This reverts commit f012152836.

Reverted https://github.com/pytorch/pytorch/pull/78032 on behalf of https://github.com/suo because it broke Windows binary builds; see f012152836
2022-05-24 16:37:35 +00:00
PyTorch MergeBot
ee4034ed0d Revert "masked logsumexp/logaddexp"
This reverts commit 49e15b578a.

Reverted https://github.com/pytorch/pytorch/pull/77876 on behalf of https://github.com/suo because it broke master by adding a stray file; the attempt to delete it by committing on top of the head branch does not work with ghstack
2022-05-24 16:12:35 +00:00
Khushi Agrawal
6f4d200725 [complex32, jiterator] sin, asin (#77606)
Follows #74537 and #74748

cc @kshitij12345
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77606
Approved by: https://github.com/kshitij12345, https://github.com/ngimel
2022-05-24 16:08:44 +00:00
Natalia Gimelshein
4ea176ea57 expose fast get_current_stream (#78165)
Expose a fast, no-frills way of getting the raw `cudaStream_t` in Python (200 ns instead of 4 us)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78165
Approved by: https://github.com/SherlockNoMad, https://github.com/soumith, https://github.com/gchanan
2022-05-24 15:54:47 +00:00
George Qi
49e15b578a masked logsumexp/logaddexp
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77876

Approved by: https://github.com/cpuhrsch
2022-05-24 15:33:59 +00:00
sliorde
9b8abff4ac fix typo in docstring of Transformer.forward() (#78167)
Fixed the word "decode" to be "decoder".

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78167
Approved by: https://github.com/jbschlosser
2022-05-24 14:27:42 +00:00
kshitij12345
f9e346d5ac [opinfo] transpose_conv, conv, adaptive_{max, avg}_pool unbatched samples (#73002)
As per title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73002
Approved by: https://github.com/jbschlosser
2022-05-24 14:26:11 +00:00
Kurt Mohler
f012152836 Move THPStorage definitions out of torch/csrc/generic (#78032)
Fixes #77908

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78032
Approved by: https://github.com/ezyang
2022-05-24 13:42:14 +00:00
Nikita Shulga
6244daa6a9 [MPS] Fix torch.mps.is_available() (#78121)
By introducing `at::mps::is_available()` and changing `torch._C._is_mps_available` from a property to a memoizable callable

Also, if `_mtl_device` is released in the MPSDevice destructor, shouldn't it be retained in the constructor?

Looks like GitHubActions Mac runner does not have any Metal devices available, according to https://github.com/malfet/deleteme/runs/6560871657?check_suite_focus=true#step:3:15

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78121
Approved by: https://github.com/albanD
2022-05-24 05:10:38 +00:00
Wanchao Liang
8eb62bd7ba [shard] make ShardedTensor a torch.Tensor subclass
This is the reland of PR https://github.com/pytorch/pytorch/pull/74695, which was reverted due to some internal failures.

It also removes the ShardedTensorInterface change; we can make that change later if we find a need for it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78027

Approved by: https://github.com/pritamdamania87, https://github.com/fduwjj
2022-05-24 01:20:45 +00:00
Eddie Yan
c274f2ad52 [cuDNN V8 API] (reopen) Allow the number of kernels profiled under torch.backends.cudnn.benchmark = True to be limited (#77002)
(reopening due to botched merge)
The cuDNN V8 API (main support merged in https://github.com/pytorch/pytorch/pull/60755) potentially exposes many more kernels with benchmark=True. While these additional kernels can improve performance, it is often unnecessary to run every kernel returned by the heuristic and doing so may degrade the user experience by causing the first model iteration to be very slow. To alleviate this issue, this PR introduces torch.backends.cudnn.benchmark_limit. benchmark_limit specifies the maximum number of working cuDNN kernels to try for a given workload, with the default being 10 (similar to what TensorFlow does). benchmark_limit = 0 yields the current behavior of trying every kernel returned by the heuristic.

CC @ptrblck @ngimel @xwang233
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77002
Approved by: https://github.com/ngimel
2022-05-24 00:11:47 +00:00
mikeiovine
2ae3c59e4b [SR] Remove linear/relu fusion
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77620

Apparently, this is not implemented in fbgemm, so it's strictly worse than using NNC.

Differential Revision: [D36431811](https://our.internmc.facebook.com/intern/diff/D36431811/)

Approved by: https://github.com/hlu1
2022-05-23 21:46:27 +00:00
Ryan Spring
bb4653e736 Add i0, i1, zeta refs (#78111)
Add reference implementations for i0, i1, zeta
Add prim operations for i0, i1, zeta
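The eager ops these references and prims mirror are already exposed under `torch.special`; for example:

```python
import torch

x = torch.linspace(0.1, 2.0, 4)

print(torch.special.i0(x))                            # modified Bessel function of the first kind, order 0
print(torch.special.i1(x))                            # same, order 1
print(torch.special.zeta(x + 1, torch.ones_like(x)))  # Hurwitz zeta(s, q); q = 1 gives the Riemann zeta
```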
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78111
Approved by: https://github.com/mruberry
2022-05-23 21:33:56 +00:00
Khushi Agrawal
a136408ada [complex32, jiterator] tan, atan (#77802)
Follows #74537 and #74748

cc @kshitij12345
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77802
Approved by: https://github.com/kshitij12345, https://github.com/ngimel
2022-05-23 21:01:19 +00:00
yuguo68
c186250d95 raise error when groups is not positive in Conv modules
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77919

Approved by: https://github.com/jbschlosser
2022-05-23 20:35:00 +00:00
Ilya Persky
317d601e8d Fix docstring for nn.Hardswish (#70993)
Fixes nn.Hardswish's docstring problem reported at #70498.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70993
Approved by: https://github.com/jbschlosser
2022-05-23 18:52:19 +00:00
Horace He
ea5d01e629 [Primtorch] Tried porting leaky_relu into a ref (#78041)
Feels good to delete it from `torch._decomps`. This is mainly to clarify the process for me -

It seems like there are still some components missing from the `torch <-> refs` mapping? For example, methods don't seem to work yet for mapping from torch to refs, and neither do the meta tests (cc: @ezyang).

If I replace the `torch` with `refs`, then the tests seem to pass.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78041
Approved by: https://github.com/mruberry
2022-05-23 18:00:21 +00:00
Justin Chu
652ecc9ad9 [ONNX] Fix typo when comparing DeviceObjType (#78085)
#77423 Introduced a typo in

1db9be70a7/torch/onnx/symbolic_opset9.py (L5012-L5017)

where the string `DeviceObjType` was replaced with `_C.DeviceObjType`. This PR reverts the changes to the strings.

**Tested:**

With torchvision,

```
pytest test/test_onnx.py::TestONNXExporter::test_mask_rcnn
pytest -n auto test/test_onnx.py::TestONNXExporter
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78085
Approved by: https://github.com/datumbox, https://github.com/BowenBao, https://github.com/ezyang
2022-05-23 17:29:36 +00:00
Jeff Daily
9aed30d3ad [ROCm] support benchmark flag for MIOpen (#77438)
Fixes #68172. Generally, this corrects flaky behavior seen in multiple convolution unit tests on ROCm.

The MIOpen integration has been forcing benchmark=True even when `torch._C._set_cudnn_benchmark(False)` is called, typically via `torch.backends.cudnn.set_flags(enabled=True, benchmark=False)`. We now add support for MIOpen immediate mode to avoid benchmarking during MIOpen solution selection.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77438
Approved by: https://github.com/ngimel, https://github.com/malfet
2022-05-23 17:10:24 +00:00
vitrioil
b2d1104471 Fixed numpy bool check (#77857)
Fixes #75704

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77857
Approved by: https://github.com/jbschlosser
2022-05-23 15:42:16 +00:00
Mike Ruberry
2738405a76 [primTorch] Adds any, all, equal, item references (#78072)
This PR adds the item, equal, any, and all references.

While doing this I found the following issues:
- https://github.com/pytorch/pytorch/issues/78070
- https://github.com/pytorch/pytorch/issues/78071

And I fixed a bug where the `convert_element_type` prim could not convert tensors requiring grad to datatypes that don't require grad.

Creating the item reference required adding item as a prim, but per @ngimel's suggestion I removed the prims for any and all and implemented them as references, so this is net negative one prim.

Reference OpInfos are added for any and all, but item and equal don't even have regular OpInfos.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78072
Approved by: https://github.com/ngimel
2022-05-23 12:49:04 +00:00
Kshiteej K
88fca3be59 [reland][complex32] conv1d, conv2d : enable test (#77999)
Reland: #77239
Ref: #74537
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77999
Approved by: https://github.com/anjali411
2022-05-23 05:49:03 +00:00
kshitij12345
2676931d3e [composite compliance] forward_ad: linear (#77950)
Reference: #69991

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77950
Approved by: https://github.com/soulitzer
2022-05-23 05:43:53 +00:00
Natalia Gimelshein
141ea86c33 reduce overhead of get_current_stream (#78066)
This reduces the overhead of `torch.cuda.current_stream()` from a ridiculous 8.7 us to a still-ridiculous 4.4 us.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78066
Approved by: https://github.com/mruberry
2022-05-23 03:14:01 +00:00
Mike Ruberry
d4345ed0a6 [primTorch] Adds random operations (#78026)
This PR...

**Issues Found**
- https://github.com/pytorch/pytorch/issues/78058
- https://github.com/pytorch/pytorch/issues/78054
- https://github.com/pytorch/pytorch/issues/78053
- https://github.com/pytorch/pytorch/issues/78050
- https://github.com/pytorch/pytorch/issues/77932

**Testing**
- disables stride consistency checks in test_ops and test_meta pending resolution of https://github.com/pytorch/pytorch/issues/78050
- skips chalf in reference tests (addressing https://github.com/pytorch/pytorch/issues/78054)
- splits test_python_reference_consistency into one test for the context where torch.foo is torch.foo, and another for when torch.foo is refs.foo
- updates test names to be more natural and consistent:
  - test_python_reference_errors -> test_python_ref_errors
  - test_python_reference_consistency -> test_python_ref and test_python_ref_torch_fallback
  - test_python_reference_meta_functions -> test_python_ref_meta
  - test_reference_testing -> test_numpy_ref
- updates test_python_ref and test_python_ref_torch_fallback to check that the reference is more accurate than the torch op when the reference and torch op results are not close; a warning is raised when this occurs (addressing https://github.com/pytorch/pytorch/issues/77687)
- adds reference inputs for broadcast_tensors
- Updates the "fill_" OpInfo to "fill", adding a NumPy reference and making it an elementwise unary operator
- Adds 1D no element sample inputs to the cat OpInfo and updates the NumPy reference to handle them and type promotion correctly
- Adds reference inputs for elementwise ternary operations, like clamp
- Adds a NumPy reference for clamp
- Adds reference inputs to where's OpInfo
- Makes softplus an elementwise unary OpInfo
- Removes the great majority of Python reference OpInfo skips and xfails due to the above test changes
- Adds Python reference OpInfos for fill, dropout, clamp, broadcast_tensors, and where

**Prims**
- adds the fill, empty_strided, and uniform prims
- removes the empty, empty_like, full, and full_like prims -- these are now references that use empty_strided and fill
- renames the "concatenate" and "select" prims to "cat" and "where", respectively, to be consistent with PyTorch
- extends the `_elementwise_meta` operation to accept tensors that don't participate in type promotion, like the `cond` tensor in `where`
- fixes a bug in the stride propagation of broadcast_in_dim
- moves some error checks from prims.cat and prims.where to refs.cat and refs.where, respectively, consistent with our new policy of doing as much error checking in the ref as possible

**Utils**
- adds the canonicalize_device, extract_shape, and extract_shape_from_varargs helpers
- adds the elementwise_unary_scalar_wrapper -- this allows elementwise unary operators to take and return scalar values (e.g. refs.sin(1) will return 0.84...)

**Refs**
- adds the fill, broadcast_tensors, clamp, empty_strided, ones, zeros, and uniform references
- adds the nn.functional.dropout reference
- fixes refs.cat to handle 1D tensors with no inputs consistent with eager mode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78026
Approved by: https://github.com/ngimel
2022-05-23 01:56:28 +00:00
jjsjann123
735ab79168 Static initializer update (#78052)
Code cleanup: call_once on a static initializer shouldn't be needed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78052
Approved by: https://github.com/suo
2022-05-22 23:14:03 +00:00
Taylor Robie
d7680cb7f0 [Profiler][Trivial] Switch to nanoseconds for Result's internal representation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77697

Certain steps in building the call tree rely on sorting, so we want to retain as much precision as possible. `profiler_kineto.cpp` and KinetoEvent still use microseconds.

Differential Revision: [D36302563](https://our.internmc.facebook.com/intern/diff/D36302563/)

Approved by: https://github.com/aaronenyeshi
2022-05-22 22:39:13 +00:00
Taylor Robie
e17f14fab2 [Profiler] Propagate metadata into Engine::evaluate_function event.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77696

https://github.com/pytorch/pytorch/pull/63619 added a RECORD_FUNCTION guard to make calls to `Engine::evaluate_function` visible regardless of the underlying op. While useful, this creates a call that looks like a forward call that somewhat complicates stitching forward and backward ops. I don't want to add complexity (and therefore work) on the hot path; instead it's fairly straightforward to stitch things back together in post. This PR simply propagates sequence number and forward tid info up to the `evaluate_function` event.

Differential Revision: [D36302562](https://our.internmc.facebook.com/intern/diff/D36302562/)

Approved by: https://github.com/aaronenyeshi
2022-05-22 22:39:13 +00:00
Taylor Robie
71b94b09ae [Profiler][Trivial] Force Result to be a shared_ptr
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77695

A lot of the graph manipulation in later changes will rely on the ability to hold stable references, both in C++ and Python.

Differential Revision: [D36302564](https://our.internmc.facebook.com/intern/diff/D36302564/)

Approved by: https://github.com/aaronenyeshi
2022-05-22 22:39:12 +00:00
Taylor Robie
33dc5d1a39 [Profiler] Move Allocation into EventType.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77694

Continuing the trend of unification, this PR removes the special path for allocation tracking. The overall delta is pretty minimal; it's mostly just extending and unifying visitors. This also comes with the added benefit that memory profiling now gets to take advantage of the lock-free machinery.

Differential Revision: [D36189043](https://our.internmc.facebook.com/intern/diff/D36189043/)

Approved by: https://github.com/aaronenyeshi
2022-05-22 22:39:11 +00:00
PyTorch MergeBot
acfbc16b1c Revert "[primTorch] Adds random operations (#78026)"
This reverts commit 043cf1f9c7.

Reverted https://github.com/pytorch/pytorch/pull/78026 on behalf of https://github.com/suo because it broke trunk: 043cf1f9c7
2022-05-22 18:11:14 +00:00
Taylor Robie
59be76c6cf [Profiler] Introduce torch::profiler::impl::EventType
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77693

Right now the profiler internals are rather ad-hoc and disjoint. As we move towards a unified experience this needs to be addressed. This PR adds an enum specifying the various types of events that can be profiled and specializes the `ExtraFields` struct on the values of the `EventType` enum. This lets us punt more of the heterogeneity onto the type system and allows a caller to simply think in terms of `ExtraFields<EventType::...>`. (No more "X field is always present but only makes sense for Y". e.g. inputs)

For now only ops and backend events are transitioned since they are already in a weird union state. Changes planned for subsequent diffs in the stack:
1) Allocations
2) Python tracer events
3) Kineto (e.g. Cupti) events
4) Use unified event type for more post processing

One rather pleasant observation was that this change exposed several minor bugs in the current implementation:
1) We just didn't plumb `end_thread_id_` from `OpEvent` to `Result`. Switching to using ctors rather than setting fields in `getRecords` fixes this.
2) We were calling `fn.threadId()` to get start TID, but that is wasteful because it is already stored in the `ThreadLocalSubqueue`.
So that gives me some confidence that this is a step in the right direction.

Differential Revision: [D36189044](https://our.internmc.facebook.com/intern/diff/D36189044/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36189044/)!

Approved by: https://github.com/aaronenyeshi
2022-05-22 16:26:23 +00:00
Mike Ruberry
043cf1f9c7 [primTorch] Adds random operations (#78026)
This PR...

**Issues Found**
- https://github.com/pytorch/pytorch/issues/78058
- https://github.com/pytorch/pytorch/issues/78054
- https://github.com/pytorch/pytorch/issues/78053
- https://github.com/pytorch/pytorch/issues/78050
- https://github.com/pytorch/pytorch/issues/77932

**Testing**
- disables stride consistency checks in test_ops and test_meta pending resolution of https://github.com/pytorch/pytorch/issues/78050
- skips chalf in reference tests (addressing https://github.com/pytorch/pytorch/issues/78054)
- splits test_python_reference_consistency into one test for the context where torch.foo is torch.foo, and another for when torch.foo is refs.foo
- updates test names to be more natural and consistent:
  - test_python_reference_errors -> test_python_ref_errors
  - test_python_reference_consistency -> test_python_ref and test_python_ref_torch_fallback
  - test_python_reference_meta_functions -> test_python_ref_meta
  - test_reference_testing -> test_numpy_ref
- updates test_python_ref and test_python_ref_torch_fallback to check that the reference is more accurate than the torch op when the reference and torch op results are not close; a warning is raised when this occurs (addressing https://github.com/pytorch/pytorch/issues/77687)
- adds reference inputs for broadcast_tensors
- Updates the "fill_" OpInfo to "fill", adding a NumPy reference and making it an elementwise unary operator
- Adds 1D no element sample inputs to the cat OpInfo and updates the NumPy reference to handle them and type promotion correctly
- Adds reference inputs for elementwise ternary operations, like clamp
- Adds a NumPy reference for clamp
- Adds reference inputs to where's OpInfo
- Makes softplus an elementwise unary OpInfo
- Removes the great majority of Python reference OpInfo skips and xfails due to the above test changes
- Adds Python reference OpInfos for fill, dropout, clamp, broadcast_tensors, and where

**Prims**
- adds the fill, empty_strided, and uniform prims
- removes the empty, empty_like, full, and full_like prims -- these are now references that use empty_strided and fill
- renames the "concatenate" and "select" prims to "cat" and "where", respectively, to be consistent with PyTorch
- extends the `_elementwise_meta` operation to accept tensors that don't participate in type promotion, like the `cond` tensor in `where`
- fixes a bug in the stride propagation of broadcast_in_dim
- moves some error checks from prims.cat and prims.where to refs.cat and refs.where, respectively, consistent with our new policy of doing as much error checking in the ref as possible

**Utils**
- adds the canonicalize_device, extract_shape, and extract_shape_from_varargs helpers
- adds the elementwise_unary_scalar_wrapper -- this allows elementwise unary operators to take and return scalar values (e.g. refs.sin(1) will return 0.84...)

**Refs**
- adds the fill, broadcast_tensors, clamp, empty_strided, ones, zeros, and uniform references
- adds the nn.functional.dropout reference
- fixes refs.cat to handle 1D tensors with no inputs consistent with eager mode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78026
Approved by: https://github.com/ngimel
2022-05-22 10:06:24 +00:00
Natalia Gimelshein
192aa3ad5f adds std and var refs and var prim (#77948)
Per title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77948
Approved by: https://github.com/mruberry
2022-05-22 04:01:21 +00:00
kshitij12345
5f1b0a4f48 [primTorch] add exp2 (prim and ref), log10 (prim and ref), frac (ref) (#78046)
Adds `exp2`, `log10` to the prims (both also exist in the C++ standard library, and Intel SIMD intrinsics include `exp2`)

Adds `exp2`, `log10`, `frac` to refs with corresponding entries to OpInfo.

Tried to decompose `exp2` (before adding it as prim) as
* `exp(log(2) * x)` but it wasn't stable at large numbers.
* `pow(2, x)`, in which case there was a stride mismatch. At a cursory look, `pow` tries to preserve the strides of its first argument if possible.

Tried to decompose `log10` (before adding it as prim) as
* `log(x) / log(10)` passed for real dtypes but failed for complex at extremals. Probably related to https://github.com/pytorch/pytorch/issues/52332 (not 100% sure)
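A rough illustration (my own sketch, not the primTorch test) of why the `exp(log(2) * x)` decomposition loses accuracy as `x` grows:

```python
import math
import torch

x = torch.tensor([10.0, 50.0, 100.0], dtype=torch.float32)

direct = torch.exp2(x)
decomposed = torch.exp(math.log(2.0) * x)  # log(2) is rounded, and the error is amplified by x

rel_err = (decomposed - direct).abs() / direct
print(rel_err)  # relative error typically grows with the magnitude of x
```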
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78046
Approved by: https://github.com/mruberry
2022-05-22 03:43:54 +00:00
Kshiteej K
57fab66fdc [primTorch] add refs fliplr, flipud (#78049)
Add refs for `fliplr, flipud` with corresponding OpInfo entries.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78049
Approved by: https://github.com/mruberry
2022-05-22 01:04:01 +00:00
pritam
37eb31599c [reland] Add sharding tests to multigpu-test.sh and fix custom operator decorator (#77987)
1. Enabled multigpu tests.
2. Fixed failing multigpu tests.
3. Fixed custom operator decorator to be first preference in operator dispatch.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77987
Approved by: https://github.com/fduwjj, https://github.com/wanchaol, https://github.com/janeyx99
2022-05-21 22:33:58 +00:00
Jerry Zhang
416899d1a9 [quant][fx][bc-breaking] Add required example_args argument to prepare_fx and prepare_qat_fx (#249) (#77608)
Summary:
X-link: https://github.com/facebookresearch/d2go/pull/249

X-link: https://github.com/fairinternal/ClassyVision/pull/104

X-link: https://github.com/pytorch/benchmark/pull/916

X-link: https://github.com/facebookresearch/ClassyVision/pull/791

X-link: https://github.com/facebookresearch/mobile-vision/pull/68

FX Graph Mode Quantization needs to know whether an fx node is a floating point Tensor before it can decide whether to
insert observer/fake_quantize module or not, since we only insert observer/fake_quantize module for floating point Tensors.
Currently we have some hacks to support this by defining some rules like NON_OBSERVABLE_ARG_DICT (https://github.com/pytorch/pytorch/blob/master/torch/ao/quantization/fx/utils.py#L496), but this approach is fragile and we do not plan to maintain it long term in the pytorch code base.

As we discussed in the design review, we'd need to ask users to provide sample args and sample keyword args
so that we can infer the type in a more robust way. This PR starts by changing the prepare_fx and prepare_qat_fx APIs to require users to provide
example arguments through example_inputs. Note that this API doesn't support kwargs. Kwargs would make https://github.com/pytorch/pytorch/pull/76496#discussion_r861230047 (comment) simpler, but
that case will be rare, and even then we can still work around it with positional arguments. Also, torch.jit.trace (https://pytorch.org/docs/stable/generated/torch.jit.trace.html) and ShapeProp (https://github.com/pytorch/pytorch/blob/master/torch/fx/passes/shape_prop.py#L140) take a single positional args argument, so we'll just use a single example_inputs argument for now.

If needed, we can extend the API with an optional example_kwargs, e.g. when forward takes a lot of arguments and it makes more sense to pass them by keyword.

BC-breaking Note:
Before:
```python
m = resnet18(...)
m = prepare_fx(m, qconfig_dict)
# or
m = prepare_qat_fx(m, qconfig_dict)
```
After:
```python
m = resnet18(...)
m = prepare_fx(m, qconfig_dict, example_inputs=(torch.randn(1, 3, 224, 224),))
# or
m = prepare_qat_fx(m, qconfig_dict, example_inputs=(torch.randn(1, 3, 224, 224),))
```

Test Plan:
python test/test_quantization.py TestQuantizeFx
python test/test_quantization.py TestQuantizeFxOps
python test/test_quantization.py TestQuantizeFxModels

Imported from OSS

Reviewed By: vkuzo, andrewor14

Differential Revision: D35984526

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77608
Approved by: https://github.com/dzdang
2022-05-21 21:03:48 +00:00
Horace He
4428218945 [primtorch] Added native_group_norm decomp (#78029)
cc: @jansel @bertmaher

More or less identical in spirit to the layer norm and batch norm ones.

One annoying thing about all 3 of these is that layer_norm has slightly different `mean/var` semantics than batch norm and group norm. After normalization, `layer_norm` keeps them unsqueezed (so they're something like [1, 5, 1, 1]) while batch norm and group norm squeeze out the 1-dims.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78029
Approved by: https://github.com/bertmaher
2022-05-21 08:07:02 +00:00
Taylor Robie
580e6583d5 [Profiler] Fix segfault in AppendOnlyList
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77997

`buffer_last_` is supposed to start at buffer_.before_begin(). It is correctly set in the ctor, but incorrectly set in `clear()`. This causes a segfault in `maybe_grow()` (Specifically, `buffer_.emplace_after(buffer_last_)`) for an AppendOnlyList which has been cleared.

Differential Revision: [D36555737](https://our.internmc.facebook.com/intern/diff/D36555737/)

Approved by: https://github.com/aaronenyeshi
2022-05-21 02:38:26 +00:00
Taylor Robie
673346b350 [Profiler] Pop KinetoThreadLocalState at the start of post processing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77996

An issue recently surfaced internally which highlighted the fact that removing KinetoThreadLocalState from the TLS at the end of post processing means that we are profiling memory during post processing. (Which violates a whole bunch of invariants in the system.) This change switches the global profiling ctx to a shared_ptr, introduces a class to manage it (`init`, `get`, and `pop` methods) and moves the `pop` call to the beginning of `disableProfiler`.

Differential Revision: [D36555738](https://our.internmc.facebook.com/intern/diff/D36555738/)

Approved by: https://github.com/aaronenyeshi
2022-05-21 02:38:26 +00:00
Edward Z. Yang
6b273444c4 Add logit ref; allow non-refs to be called in refs.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77816

Approved by: https://github.com/mruberry
2022-05-21 02:35:14 +00:00
Horace He
50cadfae10 Add strictness check and made tensors into leaves if input tensors were leaves (#77474)
I think this makes sense to do? Otherwise, if you call `backward()` in your traced function, you can't get gradients out of any tensors that should have been leaves.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77474
Approved by: https://github.com/ezyang
2022-05-21 01:16:39 +00:00
John Clow
c82fb7a67f Adding support for upper and lower bound functions in SSA
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77389

Approved by: https://github.com/eellison
2022-05-20 23:58:40 +00:00
Han Qi (qihqi)
9432be9b8c [flatbuffer] Move saving storage to the last step. (#78024)
Summary: Move storage saving to the last step, because otherwise tensors that are saved after the storages have already been saved will not have storage.

Test Plan: Tested by loading the file from `clowder get GLDGLQnKrIsQFg8DAPxq9vg59ZwZbmQwAAAA orig.pt`, converting it to flatbuffer, and loading it again

Differential Revision: D36552645

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78024
Approved by: https://github.com/Jack-Khuu
2022-05-20 23:48:44 +00:00
Horace He
64b4bb4b01 Fix meta tests on norm (and relanding norm fixes) (#77930)
Had a land race with meta tests.

Will also be relanding https://github.com/pytorch/pytorch/pull/77407
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77930
Approved by: https://github.com/malfet, https://github.com/ezyang
2022-05-20 23:15:53 +00:00
Alban Desmaison
04ac80c73a Fix a few issues on assert/double error/legacy constructor (#77966)
Fixes https://github.com/pytorch/pytorch/issues/77960, https://github.com/pytorch/pytorch/issues/77957, https://github.com/pytorch/pytorch/issues/77781
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77966
Approved by: https://github.com/soulitzer, https://github.com/kulinseth
2022-05-20 20:25:12 +00:00
PyTorch MergeBot
53b30579b7 Revert "[complex32] conv1d, conv2d : enable test (#77239)"
This reverts commit 2d3a6d7274.

Reverted https://github.com/pytorch/pytorch/pull/77239 on behalf of https://github.com/suo because it broke nvfuser tests on master; see 2d3a6d7274
2022-05-20 19:10:42 +00:00
John Clow
dbee7e5499 Adding SSA support for convolution_backward
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77283

Approved by: https://github.com/Krovatkin
2022-05-20 18:39:47 +00:00
PyTorch MergeBot
0f74b44f1a Revert "Add sharding tests to multigpu-test.sh and fix custom operator decorator (#77825)"
This reverts commit 8d4c8df33a.

Reverted https://github.com/pytorch/pytorch/pull/77825 on behalf of https://github.com/janeyx99 because it will break multigpu test reporting
2022-05-20 17:59:03 +00:00
Zafar
9d44b3d110 [quant][refactor] Remove the base class from __all__
In general, if we are expecting the users to use the base class,
such as `_ConvNd`, we should rename it to something like
`BaseConv`. However, because this base class is only used inside of the
AO packages, there is no need to expose it to the users.

Test Plan:

```
python test/test_quantization.py
python test/test_module_init.py
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77344

Approved by: https://github.com/jerryzh168
2022-05-20 17:56:22 +00:00
pritam
8d4c8df33a Add sharding tests to multigpu-test.sh and fix custom operator decorator (#77825)
1. Enabled multigpu tests.
2. Fixed failing multigpu tests.
3. Fixed custom operator decorator to be first preference in operator dispatch.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77825
Approved by: https://github.com/wanchaol, https://github.com/fduwjj
2022-05-20 16:53:27 +00:00
kshitij12345
2d3a6d7274 [complex32] conv1d, conv2d : enable test (#77239)
Ref: #74537
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77239
Approved by: https://github.com/anjali411
2022-05-20 15:54:30 +00:00
David Berard
38bc10ae25 retry - enable NVFuser by default
Enable NVFuser in OSS.

Retry of #77213, because it was breaking torchvision tests.

Fix in #77471 has been verified by jjsjann123

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77579

Approved by: https://github.com/eellison, https://github.com/malfet, https://github.com/atalman, https://github.com/seemethere
2022-05-20 14:21:18 +00:00
zrphercule
734a97a7c8 Revert "Revert "Switch to use nested tensor by-default in TransformerEncoder (#77217)"" (#77924)

This reverts commit 0d6fa91d1b.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77924
Approved by: https://github.com/atalman
2022-05-20 11:44:03 +00:00
Philip Meier
dd313d7338 support TestCase.longMessage in TestCase.assertEqual
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77602

Approved by: https://github.com/mruberry
2022-05-20 11:09:28 +00:00
Philip Meier
63e9fdd92f re-add dynamic error messages to assert_close
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77601

Approved by: https://github.com/mruberry
2022-05-20 11:09:28 +00:00
kshitij12345
efb2c093fc [fix] complex type promotion (#77524)
Fixes https://github.com/pytorch/pytorch/issues/76803

Before Fix:
```python
>> a = torch.randn((2, 2), dtype=torch.float)
>> b = torch.tensor(1, dtype=torch.cdouble)
>> (a + b).dtype
torch.complex128
```

After Fix:
```python
>> a = torch.randn((2, 2), dtype=torch.float)
>> b = torch.tensor(1, dtype=torch.cdouble)
>> (a + b).dtype
torch.complex64
```

**Note**: This is a BC Breaking change.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77524
Approved by: https://github.com/anjali411, https://github.com/mruberry
2022-05-20 10:23:56 +00:00
Michael Andreas Dagitses
6dae1e419e remove unnecessary ATen/core/Macros.h
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76376

This is only used in a few places and only aliases the c10 macros
header.

Differential Revision: [D35904936](https://our.internmc.facebook.com/intern/diff/D35904936/)

Approved by: https://github.com/dreiss, https://github.com/malfet
2022-05-20 09:07:32 +00:00
Nikolay Korovaiko
df1f9b9840 Implement sym_sizes to create proper IR for sym ints representing tensor sizes (#77756)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77756
Approved by: https://github.com/desertfire
2022-05-20 05:39:03 +00:00
Zafar
a8c929b0a6 [quant] Reordering the imports in the torch/__init__.py
Because the AO stuff depends on the torch packages, while very few (if any)
torch packages depend on AO, we are moving the imports lower.
That will reduce the probability of cyclic imports, since by the time
AO starts importing, the rest of torch will already be imported.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77065

Approved by: https://github.com/albanD
2022-05-20 03:51:15 +00:00
Drazen Borkovic
f54098cd3e Create JSON from new FX IR and lower to LLVM (#77765)
Summary:
Replace TensorView objects with maps for JSONing.
Lower to LLVM.

Reviewed By: jaybean-dev, jfix71

Differential Revision: D36318989

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77765
Approved by: https://github.com/jfix71, https://github.com/jamesr66a
2022-05-20 03:20:57 +00:00
Hao Lu
c60d2ef4eb [StaticRuntime] Replace Permute with copy version only when it's followed by reshape or flatten (#77832)
Reviewed By: mikeiovine

Differential Revision: D36466622

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77832
Approved by: https://github.com/mikeiovine
2022-05-20 03:14:01 +00:00
PyTorch MergeBot
03546e9c07 Revert "Fixed type promotion semantics for native_batch_norm and native_layer_norm (#77407)"
This reverts commit 70d80fb424.

Reverted https://github.com/pytorch/pytorch/pull/77407 on behalf of https://github.com/malfet because it broke meta tests (likely due to a land race); see 70d80fb424
2022-05-20 02:31:57 +00:00
Kurt Mohler
cecb2ad95e Restore old names for private funcs in legacy storages (#77861)
Followup from #75459

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77861
Approved by: https://github.com/ezyang
2022-05-20 02:03:34 +00:00
jjsjann123
6583c0384b fixing trivial reduction & broadcast scheduling (#77884)
cherry-picked fixes from https://github.com/csarofeen/pytorch/pull/1714

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77884
Approved by: https://github.com/csarofeen, https://github.com/davidberard98
2022-05-20 02:00:42 +00:00
Justin Chu
0d76299ff7 [ONNX] Clean up module imports (#77423)
Cleaning up onnx module imports to prepare for updating `__init__`.

- Simplify importing the `_C` and `_C._onnx` name spaces
- Remove alias of the symbolic_helper module in imports
- Remove any module level function imports. Import modules instead
    - Alias `symbolic_opsetx` as `opsetx`
- Fix some docstrings

Requires:
- https://github.com/pytorch/pytorch/pull/77448
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77423
Approved by: https://github.com/BowenBao
2022-05-20 01:56:24 +00:00
Milad Mohammadi
e67284d9ee Added support for slogdet in LazyTensor shape inference (#77904)
Fixes https://github.com/pytorch/xla/pull/3576

Added support for `slogdet` in LazyTensor shape inference
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77904
Approved by: https://github.com/wconstab, https://github.com/JackCaoG
2022-05-20 01:34:56 +00:00
Milad Mohammadi
d6ae650738 Added support for inverse in LazyTensor shape inference (#77888)
Fixes https://github.com/pytorch/xla/pull/3575

Added support for `inverse` in LazyTensor shape inference
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77888
Approved by: https://github.com/wconstab
2022-05-20 01:31:13 +00:00
Eli Uriegas
1bec7f8468 torch: Fix black linter
Fixes formatting issues when trying to import diff train

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77925

Approved by: https://github.com/mehtanirav, https://github.com/osalpekar
2022-05-20 01:14:08 +00:00
Rohit Goswami
c915fbe201 ENH: Convert finfo.tiny to finfo.smallest_normal (#76292)
Fixes #70909 via the straightforward search-and-replace discussed in #70909.
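The change in a nutshell (both attributes are exposed in recent PyTorch; `smallest_normal` is the preferred, NumPy-aligned name):

```python
import torch

finfo = torch.finfo(torch.float32)
# Both names refer to the smallest positive normal number, 2**-126 for float32.
print(finfo.tiny)             # 1.1754943508222875e-38
print(finfo.smallest_normal)  # 1.1754943508222875e-38
```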

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76292
Approved by: https://github.com/mruberry
2022-05-20 00:59:48 +00:00
Kurt Mohler
7892a45741 Add missing decref to createStorageGetType (#77860)
Followup from PR #75459

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77860
Approved by: https://github.com/ezyang
2022-05-20 00:53:19 +00:00
Kevin Stephano
11daf200e8 Adding activation references for celu, mish, selu, softplus, and tanh (#77473)
Adding activation references for celu, softplus, mish, selu.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77473
Approved by: https://github.com/mruberry
2022-05-20 00:47:31 +00:00
Andrew Gu
e69d13b8b3 [FSDP][Easy] Update state_dict() docstring
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77853

Approved by: https://github.com/rohan-varma
2022-05-19 23:59:03 +00:00
Andrew Gu
d9b3feb27d [FSDP][Easy] Reword device placement warning
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77850

Approved by: https://github.com/rohan-varma
2022-05-19 23:57:40 +00:00
Andrew Gu
36bf8007f7 [FSDP][Easy] Fix state_dict_type() docstring example
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77848

Approved by: https://github.com/rohan-varma
2022-05-19 23:53:15 +00:00