Summary:
Fix https://github.com/pytorch/pytorch/issues/46242
This ensures that `check_inplace()` runs the proper checks even if the Tensor being modified in place does not require gradients, since the Tensor written into it might require gradients, making the in-place modification actually differentiable.
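A minimal sketch of the scenario, assuming the simplest view-based repro (see the linked issue for the exact case):
```
import torch

base = torch.zeros(3)                    # does not require grad
view = base[:]                           # a view of the non-differentiable base
src = torch.ones(3, requires_grad=True)
view.copy_(src)                          # writing a grad-requiring Tensor in
                                         # place makes the op differentiable
view.sum().backward()
print(src.grad)                          # tensor([1., 1., 1.])
```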
This contains:
- Codegen changes to tell `check_inplace()` if the inplace will be differentiable
- Changes in `handle_view_on_rebase` to work properly even when called for an input that does not require gradients (which was previously assumed to always be the case)
- Corresponding tests (both the warnings and the error would raise internal assert errors without this fix)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46296
Reviewed By: ezyang
Differential Revision: D24903770
Pulled By: albanD
fbshipit-source-id: 74e65dad3d2e3b9f762cbb7b39f92f19d9a0b094
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47952
We don't actually generate a TE kernel, so there is no need to use the
arena-allocation guard.
Test Plan:
```
buck test //caffe2/test/cpp/tensorexpr -- FuserPass
```
Reviewed By: ZolotukhinM
Differential Revision: D24967107
fbshipit-source-id: 302f65b2fcff704079e8b51b942b7b3baff95585
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47896
Per title
ghstack-source-id: 116710141
Test Plan: CI
Reviewed By: osalpekar
Differential Revision: D24943323
fbshipit-source-id: 7bf33ce3a021b9750b65e0c08f602c465cd81d28
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47818
This is another relatively small codegen.
Ideally we should use CppSignature.decl() to generate the C++ function declaration.
We didn't because it would need to add 'at::' to the types defined in the ATen namespace.
E.g.:
- standard declaration:
```
Tensor eye(int64_t n, int64_t m, const TensorOptions & options={})
```
- expected:
```
at::Tensor eye(int64_t n, int64_t m, const at::TensorOptions & options = {})
```
Kept the hacky fully_qualified_type() method for compatibility with the old codegen.
We could clean this up by:
- using these types in the torch namespace - but this is a user-facing header file,
and it's not clear whether that would cause problems;
- updating the cpp.argument_type() method to take an optional namespace argument.
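For illustration, a hypothetical sketch of the kind of rewriting fully_qualified_type() performs (the type list and regex here are illustrative, not the actual implementation):
```
import re

# Longer names first so the alternation matches greedily.
ATEN_TYPES = ('TensorOptions', 'Tensor', 'Scalar', 'Generator')

def fully_qualified_type(cpp_type: str) -> str:
    """Prefix bare ATen type names in a C++ declaration with 'at::'."""
    pattern = r'\b(?<!::)(' + '|'.join(ATEN_TYPES) + r')\b'
    return re.sub(pattern, r'at::\1', cpp_type)

print(fully_qualified_type('Tensor eye(int64_t n, const TensorOptions & options={})'))
# -> at::Tensor eye(int64_t n, const at::TensorOptions & options={})
```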
Confirmed byte-for-byte compatible with the old codegen:
```
Run it before and after this PR:
.jenkins/pytorch/codegen-test.sh <baseline_output_dir>
.jenkins/pytorch/codegen-test.sh <test_output_dir>
Then run diff to compare the generated files:
diff -Naur <baseline_output_dir> <test_output_dir>
```
Test Plan: Imported from OSS
Reviewed By: bhosmer
Differential Revision: D24909478
Pulled By: ljk53
fbshipit-source-id: a0ceaa60cc765c526908fee39f151cd7ed5ec923
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47746
- Removed the integration hack in gen_python_functions.py. It now directly
loads native_functions.yaml. All dependencies on Declarations.yaml
have been removed or moved elsewhere.
- Rewrote the deprecated.yaml parsing logic to work with the new data model directly.
Confirmed byte-for-byte compatible with the old codegen:
```
Run it before and after this PR:
.jenkins/pytorch/codegen-test.sh <baseline_output_dir>
.jenkins/pytorch/codegen-test.sh <test_output_dir>
Then run diff to compare the generated files:
diff -Naur <baseline_output_dir> <test_output_dir>
```
Differential Revision: D24885067
Test Plan: Imported from OSS
Reviewed By: bhosmer
Pulled By: ljk53
fbshipit-source-id: 8e906b7dd36a64395087bd290f6f54596485ceb4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47745
This is a relatively small codegen. Reintroduced 'simple_type' to preserve
the old codegen output.
It depends on some methods defined in gen_python_functions.py - the next PR will
clean up the remaining Declarations.yaml methods in gen_python_functions.py.
Confirmed byte-for-byte compatible with the old codegen:
```
Run it before and after this PR:
.jenkins/pytorch/codegen-test.sh <baseline_output_dir>
.jenkins/pytorch/codegen-test.sh <test_output_dir>
Then run diff to compare the generated files:
diff -Naur <baseline_output_dir> <test_output_dir>
```
Differential Revision: D24885068
Test Plan: Imported from OSS
Reviewed By: ezyang
Pulled By: ljk53
fbshipit-source-id: c0fbd726bcc450c3c7fe232c23e5b31779d0b65f
Summary:
- Renamed Partitioner.py to partitioner.py
- Renamed GraphManipulation.py to graph_manipulation.py
- Moved test_replace_target_nodes_with() to test_fx_experimental.py
- Removed the unnecessary argument from size_based_partition() in the Partitioner class
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47914
Reviewed By: gcatron
Differential Revision: D24956653
Pulled By: scottxu0730
fbshipit-source-id: 25b65be7dc7d64e90ffdc59cf394446fee83c3e6
Summary:
If world_size is less than or equal to the number of available GPUs,
then each rank can be mapped directly to the corresponding GPU.
This fixes the issue referenced in https://github.com/pytorch/pytorch/issues/45435 and https://github.com/pytorch/pytorch/issues/47629
For world_size = 3 and 8 GPUs, the rank-to-GPU mapping used to be 0, 2, 4.
Due to the barrier introduced in PR https://github.com/pytorch/pytorch/issues/45181,
the tensors in the barrier were mapped to cuda:0, 1, 2 while the tensors in the
actual test cases were mapped to cuda:0, 2, 4, resulting in different streams and
leading to a timeout. This issue is specific to the default process group;
it is not observed in a new process group, since the streams are created again
after the initial barrier call.
This patch maps each rank to the corresponding GPU when world_size is
less than or equal to the number of GPUs, in this case 0, 1, 2.
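A hypothetical helper illustrating the mapping change (not the actual test-suite code; the old spread formula is an assumption inferred from the 0, 2, 4 example above):
```
import torch

def rank_to_gpu(rank: int, world_size: int) -> int:
    n_gpus = torch.cuda.device_count()
    if world_size <= n_gpus:
        return rank                       # this patch: direct mapping -> 0, 1, 2
    return rank * (n_gpus // world_size)  # spread mapping -> 0, 2, 4 for 3 ranks on 8 GPUs
```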
Note: The barrier function in distributed_c10d.py should include a new parameter
to specify the tensor or rank-to-GPU mapping. In that case, this patch would be
redundant but harmless, since the tests could specify tensors with the appropriate
GPU rankings.
Fixes https://github.com/pytorch/pytorch/issues/47629
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47898
Reviewed By: smessmer
Differential Revision: D24956021
Pulled By: rohan-varma
fbshipit-source-id: a88257f22a7991ba36566329766c106d3360bb4e
Summary:
I think these can be safely removed since the minimum supported Python version is now 3.6
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47822
Reviewed By: smessmer
Differential Revision: D24954936
Pulled By: ezyang
fbshipit-source-id: 5d4b2aeb78fc97d7ee4abaf5fb2aae21bf765e8b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47246
We crash the process in NCCL Async Error Handling if a collective
has been running for longer than a set timeout. This PR adds more
information about the rank and the duration for which the collective ran.
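A hedged usage sketch (not code from this PR) of the feature in question; it assumes RANK/WORLD_SIZE/MASTER_ADDR/MASTER_PORT are set by the launcher:
```
import os
from datetime import timedelta
import torch.distributed as dist

# With async error handling on, a collective that exceeds the process group
# timeout aborts the process; the error now reports the rank and how long
# the collective ran.
os.environ["NCCL_ASYNC_ERROR_HANDLING"] = "1"
dist.init_process_group("nccl", timeout=timedelta(seconds=30))
```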
ghstack-source-id: 116676182
Test Plan: Run desync tests and flow.
Reviewed By: pritamdamania87
Differential Revision: D24695126
fbshipit-source-id: 61ae46477065a1a451dc46fb29c3ac0073ca531b
Summary:
Fix for https://github.com/pytorch/pytorch/issues/46122
For `Any`, we infer the type of the ivalue to set the ivalue's type tag. When we saw a Tensor, we would use a specialized Tensor type, so when `Dict[str, Tensor]` was passed in as an `Any` arg it would be inferred as `Dict[str, Float(2, 2, 2, 2)]`, which breaks runtime `isinstance` checking.
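A hedged repro sketch of the kind of check that breaks (the exact form of the original issue may differ, and `torch.jit.isinstance` availability in this era is assumed):
```
import torch
from typing import Any, Dict

@torch.jit.script
def fn(x: Any) -> str:
    # With the bad type tag, this container isinstance check fails.
    if torch.jit.isinstance(x, Dict[str, torch.Tensor]):
        return "dict of tensors"
    return "something else"

print(fn({"a": torch.rand(2, 2)}))  # expected after the fix: "dict of tensors"
```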
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46130
Reviewed By: glaringlee
Differential Revision: D24261447
Pulled By: eellison
fbshipit-source-id: 8a2bb26ce5b6c56c8dcd8db79e420f4b5ed83ed5
Summary:
This is an automated pull request to update the first-party submodule for [pytorch/FBGEMM](https://github.com/pytorch/FBGEMM).
New submodule commit: 9b0131179f
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47929
Test Plan: Ensure that CI jobs succeed on GitHub before landing.
Reviewed By: smessmer
Differential Revision: D24957361
fbshipit-source-id: 72fe80a784f10ddca52ee99fcf67cf6448a93012
Summary: D24747035 (1478e5ec2a) removes the entry point of `nnq.functional.relu`. Adjust the op benchmark to use `torch.nn.ReLU` accordingly.
Test Plan: buck run caffe2/benchmarks/operator_benchmark/pt:qactivation_test -- --use_jit --iterations 1 --warmup_iterations 1
Reviewed By: mingzhe09088
Differential Revision: D24961625
fbshipit-source-id: 5ed0ec7fa6d8cfefc8e7fc8324cf9a2a3e59de90
Summary:
Inside a container, the user is often root. We should allow this use case so that people can easily run `run_test.py` inside a container.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43794
Reviewed By: ezyang
Differential Revision: D24904469
Pulled By: malfet
fbshipit-source-id: f96cb9dda3e7bd18b29801cde4c5b0616c750016
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47076
Pull Request resolved: https://github.com/pytorch/glow/pull/5038
Eliminate double casting in glow when submitting fp16 per-sample weights
Test Plan:
buck test glow/glow/torch_glow/tests:embedding_bag_test
Due to dependency conflicts between glow and caffe2, the test has been reverted from this diff and landed separately.
Reviewed By: allwu
Differential Revision: D24421367
fbshipit-source-id: eb3615144a2cad3d593543428dfdec165ad301df
Summary:
* Enable ONNX shape inference by default.
* ONNX could potentially set the inferred shape on the output instead of in value_infos; check both to be sure.
* Small fix in symbol_map to avoid overlooking duplicate symbols.
* Fix scalar_type_analysis to be consistent with PyTorch's scalar type promotion logic (see the example below).
* Correctly handle a None dim_param from an ONNX inferred shape.
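For reference, a quick check of the promotion behavior referred to above (standard PyTorch behavior, nothing PR-specific):
```
import torch

t = torch.ones(2, dtype=torch.int32)
print((t + 1).dtype)    # torch.int32  - an int scalar keeps the integer dtype
print((t + 1.5).dtype)  # torch.float32 - a float scalar promotes to the default dtype
```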
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46629
Reviewed By: ailzhang
Differential Revision: D24900171
Pulled By: bzinodev
fbshipit-source-id: 83d37fb9daf83a2c5969d8383e4c8aac986c35fb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47926
Given that we're soon enabling async error handling in PET, we should make the behavior explicit when users have set NCCL_BLOCKING_WAIT in their own code while also using PET. This PR essentially gives blocking wait precedence (for now). This way the blast radius of the PET change is smaller, while we continue working with blocking wait users and discussing whether moving to async error handling may be a good fit.
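A minimal sketch of the resulting precedence (both environment variables are real; the values are illustrative):
```
import os

os.environ["NCCL_BLOCKING_WAIT"] = "1"
os.environ["NCCL_ASYNC_ERROR_HANDLING"] = "1"
# With this PR, blocking wait takes precedence: async error handling is
# effectively ignored while NCCL_BLOCKING_WAIT is set.
```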
ghstack-source-id: 116553583
Test Plan: Simple FBL run/CI
Reviewed By: jiayisuse
Differential Revision: D24928149
fbshipit-source-id: d42c038ad44607feb3d46dd65925237c564ff7a3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47718
Distributed Inference splits a predict net into multiple parts, part0 being the main part, which contains ops that make remote calls to the other parts. The part0 predict net may contain AsyncIf ops to optimize RPC call usage. AsyncIf ops have internal nets which may refer to memongered blobs. This change handles AsyncIf ops by updating their internal nets to refer to memongered blobs.
As part of this change, I am also updating the dag memonger traversal to always start from root ops, i.e. ops with zero in-degree. The earlier logic would start traversing ops based on the head input blobs, and if one of the head inputs was used in a non-root op that got visited before its parent, the traversal would throw an assertion error here: https://fburl.com/diffusion/ob110s9z . For almost all distributed inference part0 nets, it was throwing this assertion error.
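A hypothetical sketch of the new starting point for the traversal (the helper name is illustrative, not memonger's actual code; `input`/`output` follow caffe2's OperatorDef fields):
```
def find_root_ops(ops):
    # Blobs produced by any op in the net.
    produced = {blob for op in ops for blob in op.output}
    # Root ops have zero in-degree: no input is produced by another op.
    return [op for op in ops
            if not any(blob in produced for blob in op.input)]
```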
Test Plan: Added corresponding tests in memonger_test.py. Could not find unit tests for the C++ version of memonger.
Reviewed By: hlu1
Differential Revision: D24872010
fbshipit-source-id: 1dc99b2fb52b2bc692fa4fc0aff6b7e4c5e4f5b0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46454
we stopped syncing this folder to fbcode, and it hasn't been used. AIbench will use the ones in xplat.
Test Plan: `zbgs fbcode/caffe2/mode/` finds nothing
Reviewed By: xta0
Differential Revision: D24356743
fbshipit-source-id: 7e70a2181a49b8ff3f87e5be3b8c808135f4c527
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47731
**Summary**
This commit modifies `ScriptTypeParser::parseTypeFromExpr` so that
string literal type annotations are resolved using
`Resolver::resolveType`. At present, they are parsed in
`parseBaseTypeName`, which inadvertently allows any key from
`string_to_type_lut` to be used as a string literal type annotation.
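For context, a hedged illustration of the main feature these annotations support (the class here is illustrative, not from the test suite): a TorchScript class referring to itself by name via a string literal annotation.
```
import torch
from typing import List

@torch.jit.script
class Meta(object):
    def __init__(self, n: int):
        self.n = n

    # String literal annotations let the class refer to its own type:
    def combine(self, others: List["Meta"]) -> "Meta":
        return Meta(self.n + len(others))
```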
**Test Plan**
Existing unit tests (most notably
`TestClassType.test_self_referential_method` which tests the main
feature, self-referential class type annotations, that make use of
string literal type annotations).
**Fixes**
This commit fixes #47570.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D24934717
Pulled By: SplitInfinity
fbshipit-source-id: b915b2c08272566b63b3cf5ff4a07ad43bdc381a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47884
We need to know the output types of everything in a fusion group to ensure
that we generate correctly typed tensors. We were incorrectly starting a
fusion group with an output of unknown type.
Test Plan:
New unit tests:
```
buck test //caffe2/test:jit //caffe2/test/cpp/tensorexpr:tensorexpr
```
Reviewed By: eellison
Differential Revision: D24932786
fbshipit-source-id: 83978a951f32c1207bbc3555a7d3bd94fe4e70fb
Summary:
This is a second attempt at 8304c25c67, since the first attempt did not work, as shown by b05f3571fe and c59015f21d. This time the idea is to embed the commit hash itself directly into the generated command that is fed to `docker exec`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47922
Reviewed By: zou3519
Differential Revision: D24953734
Pulled By: samestep
fbshipit-source-id: 35b14d1266ef039e8c1bdf3648275af812a2e57b