Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32659
Applies linter to RPC test files so that we can use linter shortcuts
without getting unnecessary changes to the whole file.
ghstack-source-id: 97361237
Test Plan: No actual changes.
Differential Revision: D19584742
fbshipit-source-id: a11ce74ee0e2817e6f774fff7c39bcab06e99307
Summary:
Stacked PRs
* #32244 - Make zip serialization the default
* **#32241 - Split serialization tests to their own file**
This makes them all easier to run as a batch. This PR is just a code move / fixing up imports. There are still some serialization tests in `test_torch.py` as part of `TestDeviceType`.
](https://our.intern.facebook.com/intern/diff/19415826/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32241
Pulled By: driazati
Differential Revision: D19415826
fbshipit-source-id: a3f6cfe1626ff2f9b9631c409bf525bd32e4639b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32656
Fixes these flaky tests.
Test Plan: Run the test 500 times and verify that it succeeds every time.
Differential Revision: D19584453
fbshipit-source-id: 07cbc4914211f274182ac0fa74bb5ef6d43392d1
Summary:
Both `test_wait_all_workers` and `test_wait_all_workers_and_shutdown` test the same pattern of initialize RPC, call `_wait_all_workers`, and `rpc.shutdown(graceful=False)`.
`test_wait_all_workers` seems to be more thorough since it tests one worker driving and the others waiting on it as well.
We shouldn't have duplicate test so removing this `test_wait_all_workers_and_shutdown`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32588
Differential Revision: D19566294
Pulled By: rohan-varma
fbshipit-source-id: b69519d169b3964649d47ad75532bda5de538241
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32633
There were 2 sources of current RPC agent.
- One is in Python world, `torch.distributedrpc.api._agent`.
- The other is in C++ world, `RpcAgent::defaultRpcAgent_`
Setting Python `_agent` to `None`, does not necessarily reset the C++ `defaultRpcAgent_` to `nullptr`.
i.e.
```
torch.distributedrpc.api._agent = None
```
does not translate to
```
RpcAgent::defaultRpcAgent_ = nullptr
```
This PR is to remove this ambiguity, and use the C++ pointer as source of truth.
The solution is to leverage a pybind11 behavior that it implicitly casts C++ `shared_ptr<RpcAgent>(nullptr)` to Python `None`.
ghstack-source-id: 97293315
Test Plan:
```
buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork -- test_duplicate_name
buck build mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork
buck-out/gen/caffe2/test/distributed/rpc/rpc_fork\#binary.par -r test_process_group_debug_info
```
```
buck test mode/dev-nosan //caffe2/torch/fb/distributed/pytorch/tests:test_remote_module
buck test mode/dev-nosan //caffe2/torch/fb/distributed/modules/tests:test_sharded_embedding
buck test mode/dev-nosan //caffe2/torch/fb/distributed/modules/tests:test_sharded_pairwise_attention_pooling
buck test mode/dev-nosan //caffe2/torch/fb/distributed/pytorch/tests:test_rpc
```
Differential Revision: D5733066
fbshipit-source-id: b3e6032ee975f19ca556497edbbf40b517b25be8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32624
We need this PR to resolve the issue mentioned in https://github.com/pytorch/pytorch/issues/31325#issuecomment-574918917.
The solution is for each `_wait_all_workers()` call, there is a sequence ID added, to identify different calls.
ghstack-source-id: 97277591
Test Plan:
```
buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork -- test_wait_all_workers
buck build mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork
buck-out/gen/caffe2/test/distributed/rpc/rpc_fork\#binary.par -r test_wait_all_workers
```
Differential Revision: D5739520
fbshipit-source-id: a64131e09c365179624700514422f5375afe803f
Summary:
This PR adds support for 0-dim batch size input for `torch.nn.functional.interpolate` for various modes of interpolation.
Fixes part of gh-12013
CC: rgommers ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32400
Differential Revision: D19557090
Pulled By: ezyang
fbshipit-source-id: 6822f148bb47bfbcacb5e03798bf2744f24a2a32
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32466
It's a follow-up work of https://github.com/pytorch/pytorch/pull/32197.
In https://github.com/pytorch/pytorch/pull/32197, `rpc.sync_rpc(..) `and `rpc.rpc_async(..)` support taking a TorchScript annotated Python function as the user function for RPC.
This PR extend along this direction by making `rpc.remote(..)` support taking a TorchScript annotated Python function as well.
ghstack-source-id: 97211168
Test Plan:
# Unit tests
```
buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork -- test_script_function_exception
buck build mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork
buck-out/gen/caffe2/test/distributed/rpc/rpc_fork\#binary.par -r test_script_function_exception
```
```
buck test mode/dev-nosan //caffe2/test/distributed/rpc:dist_autograd_fork -- test_backward_simple_script_call
buck build mode/dev-nosan //caffe2/test/distributed/rpc:dist_autograd_fork
buck-out/gen/caffe2/test/distributed/rpc/dist_autograd_fork\#binary.par -r test_backward_simple_script_call
```
Differential Revision: D19440633
fbshipit-source-id: d37f6dcdc0b80d35ac7bcba46ad6f9b831c3779b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32479
Run dynamic quantization on mobile (similar to FBGEMM). Currently only implemented on linear operator
Test Plan:
python test/test_quantized.py TestDynamicQuantizedLinear.test_qlinear
Imported from OSS
Differential Revision: D19542980
fbshipit-source-id: c9f6e5e8ded4d62ae0f2ed99e478c8307dde22ed
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445
Create distributed and rpc directories under caffe/test for better management
of unit tests.
Differential Revision: D18702786
fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606
Summary:
This PR fixes the invalid None return when calling get_all_math_dtype(device='cuda').
Issue came from the __append__ method which doesn't have any return value used in `return dtypes.append(...)`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23028
Differential Revision: D16362732
Pulled By: colesbury
fbshipit-source-id: 0bbc30a0c663749d768159f1bc37b99f7263297b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18166
ghimport-source-id: a8e2ba2d966e49747a55701c4f6863c5e24d6f14
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18166 Bool Tensor for CUDA**
* #18165 Resolved comments from Bool Tensor for CPU PR
------
This PR enables bool tensor creation and some basic operations for the CPU backend. This is a part of Bool Tensor feature implementation work. The whole plan looks like this:
1. Storage Implementation [Done]
2. Tensor Creation.
a) CPU [Done]
b) CUDA [This PR]
3. Tensor Conversions.
4. Tensor Indexing.
5. Tensor Operations.
6. Back compatibility related changes.
Change:
Enable bool tensor in CUDA with the following operations:
torch.zeros
torch.tensor
torch.ones
torch.rand/rand_like/randint/randint_like
torch.full
torch.full_like
torch.empty
torch.empty_like
Tested via unit tests and local scripts.
Differential Revision: D14605104
fbshipit-source-id: b7d7340a7d70edd03a109222d271e68becba762c
Summary:
light weight implementation of LLVM filecheck utility. Currently only handles string matching - regexes & saving a regex to a variable name can be added as needed.
Current intended usage is through FileCheckBuilder python handle, and is shown in the tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16858
Differential Revision: D14096244
Pulled By: eellison
fbshipit-source-id: c7c8d1457691c105e6ccbb3c1a378d96baac2569
* Codemod to update our codebase to 0.4 standard
* Update some of the test scri[ts
* remove Variable in test_clip_grad_value
* fix _symbolic_override_wrapper_maker
* Separate cuda-ness from dtype.
There are no longer torch.cuda.int64, etc; only torch.int64 that correspond to at::ScalarType.
At the python arg parser level, the corresponding ATen type is selected from the combination of (ScalarType, Layout, Device).
There is also currently unused code in here for support ScalarType in native_functions; this will be used for specifying aggregate types
on reduction functions.
* Fix test_autograd.
* Add defaults to randint_like.
* Track is_cuda in py tensor types.
* Fix test_sparse.
* Fix multiprocessing.
* Fix rnn.
* Fix test_nn.
* Fix flake8.
This compares the torch function against the reference math funciton
against a relative small set of inputs, including integers, extremes
of some common functions, zero, a few numbers from randn and a few
numbers near 1e6.
The idea here is not to be completely exhaustive, but rather quickly
expose the most common bugs. For exhaustive checks, we should evaluate
torch functions against all ~4e9 possible float32 value.
We compare the torch function evaluated against contiguous
and non-contiguous inputs and large vs. small tensors.
Also:
- Make torch.allclose work with nan and +/-inf
- Add torch.isclose (like numpy.isclose)
- Add torch.testing.assert_allclose (like
numpy.testing.assert_allclose)
* Introduce torch.layout and split layout from dtypes.
Tensors (and tensor types) now have a 'layout' attribute that returns either 'torch.strided' or 'torch.sparse_coo'.
Previously, dtypes were 1-to-1 with ATen types/PyTensorTypes; the impetus behind this decision was to make things easy in the common case
(i.e. specifying a type in a factory function). But this doesn't really follow for sparity, which isn't a common case.
It also doesn't properly represent the concept or a dtype, which in numpy are proper scalar types (i.e. roughly the type returned from indexing the
last dimension of an n-d array). But this should be the same whether or not the tensor is represented via strides, sparsity, etc.
This is accomplished by:
1) having the dtype of tensor return the (device-type, scalar-type) combination, i.e. torch.cuda.float32, so both
torch.cuda.FloatTensor and torch.cuda.sparse.FloatTensor have the same dtype
2) Adding a layout parameter to python functions, where the combination of (dtype, layout) maps to an ATen type that is used for dispatch.
* Formatting, make init throw python_error.
* Fix cuda not enabled error message.
* Fix test.
* Add torch.empty, torch.full and new_ size Tensor factory methods.
This adds torch.full, torch.empty equivalents of np.full, np.empty.
In addition, this adds size-based Tensor factory methods new_empty, new_ones, new_full, new_zeros,
which is meant to complete the separation of the legacy "new" method into data-based and size-based
functions.
This also fixes an issue in sparse zeros_like when the dtype didn't match the argument dtype.
* Get rid of unnecessary zero in sparse tensor zeros_like.
* Fix test if only 1 cuda device.
* Add kwarg-only 'requires_grad' parameter to Variable factories.
Functions that create variables, e.g. torch.ones_like currently always return Variables with requires_grad=False;
this is less convenient than the existing Variable constructor that has a requires_grad parameter. This commit
adds the parameter at the python binding level.
* Fix flake8.
* Address review comments.
* Match set_requires_grad implementation with tensor_new version.
* Various testing and utility improvements including torch.testing module.
1) Remove method definition for randn_like since ones_like, zeros_like do not have methods.
2) Add an empty_like native function for creating a tensor with uninitialized values.
3) Add an is_floating_point() native function, similar to is_signed().
4) Add a torch.testing module loosely modeled after numpy.testing; currently it contains
make_non_contiguous (moved from test_autograd) and randn_like (wrapper around the VariableFunction).
5) Remove code from test_autograd and test_nn that is responsible for generating grad_outputs to use
with gradgradcheck. These now use gradgradcheck's own generating code. This fixes
test_nn.py with scalars because gradgradcheck does the right thing here already.
* Rename parameter.
* Fix parameter usages.