## Motivation
The DLPack device type kDLOneAPI represents Unified Shared Memory allocated on a oneAPI device. The corresponding PyTorch backend type is XPU.
Support exporting/importing a PyTorch XPU tensor as a DLPack tensor of the kDLOneAPI device type.
## Solution
1. Update the DLPack protocol to v0.7.
2. Add XPU hooks to map between the ATen device and the DLPack device using the address value and device information.
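As an illustrative sketch of the resulting round trip (assuming a PyTorch build with XPU support and an available oneAPI device):
```python
import torch
from torch.utils.dlpack import from_dlpack, to_dlpack

# Assumes a PyTorch build with XPU enabled and a oneAPI device present.
x = torch.arange(4, device="xpu")

capsule = to_dlpack(x)    # export: DLPack capsule with device type kDLOneAPI
y = from_dlpack(capsule)  # import: shares the same USM allocation (zero-copy)

assert y.device.type == "xpu"
```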
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82867
Approved by: https://github.com/kit1980
## Motivation
The DLPack device type kDLOneAPI represents Unified Shared Memory allocated on a oneAPI device. The corresponding PyTorch backend type is XPU.
Support exporting/importing a PyTorch XPU tensor as a DLPack tensor of the kDLOneAPI device type.
## Solution
1. Update the DLPack protocol to v0.7.
2. Add XPU hooks to map between the ATen device and the DLPack device using the address value and device information.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81021
Approved by: https://github.com/ezyang
There are small typos in:
- caffe2/python/recurrent.py
- test/distributed/test_c10d_nccl.py
- test/test_fx.py
- torch/csrc/jit/runtime/autodiff.cpp
- torchgen/gen.py
Fixes:
- Should read `propagation` rather than `propogation`.
- Should read `multiplied` rather than `multuplied`.
- Should read `eliminate` rather than `elminate`.
- Should read `dispatcher` rather than `disaptcher`.
Semi-automated pull request generated by
https://github.com/timgates42/meticulous/blob/master/docs/NOTE.md
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81435
Approved by: https://github.com/ngimel
## Motivation
The DLPack device type kDLOneAPI represents Unified Shared Memory allocated on a oneAPI device. The corresponding PyTorch backend type is XPU.
Support exporting/importing a PyTorch XPU tensor as a DLPack tensor of the kDLOneAPI device type.
## Solution
1. Update the DLPack protocol to v0.7.
2. Add XPU hooks to map between the ATen device and the DLPack device using the address value and device information.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78154
Approved by: https://github.com/ezyang
Summary: The `create_if_missing` parameter is optional and defaults to `None`.
Test Plan:
Confirmed that Pyre no longer complains about calling `SwitchWorkspace` with a
single string argument.
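For illustration, a minimal sketch of the two call patterns (workspace names are arbitrary):
```python
from caffe2.python import workspace

# Create the workspace on first use by passing the flag explicitly.
workspace.SwitchWorkspace("my_workspace", True)
workspace.SwitchWorkspace("default")

# With create_if_missing optional and defaulting to None, a single
# string argument now type-checks cleanly (the workspace must exist).
workspace.SwitchWorkspace("my_workspace")
```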
Differential Revision: D36366987
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77464
Approved by: https://github.com/voznesenskym
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73284
Some important ops won't support optional type until opset 16,
so we can't fully test things end-to-end, but I believe this should
be all that's needed. Once ONNX Runtime supports opset 16,
we can do more testing and fix any remaining bugs.
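As a rough sketch of the kind of model this enables (end-to-end behavior still depends on opset 16 support in the runtime; names are illustrative):
```python
from typing import Optional
import torch

class MaybeAdd(torch.nn.Module):
    def forward(self, x: torch.Tensor, flag: bool) -> Optional[torch.Tensor]:
        # The Optional return value maps onto the ONNX Optional type.
        if flag:
            return x + 1
        return None

# Scripting preserves the Optional annotation for the exporter.
scripted = torch.jit.script(MaybeAdd())
torch.onnx.export(scripted, (torch.randn(2), True), "maybe_add.onnx",
                  opset_version=16)
```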
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D34625646
Pulled By: malfet
fbshipit-source-id: 537fcbc1e9d87686cc61f5bd66a997e99cec287b
Co-authored-by: BowenBao <bowbao@microsoft.com>
Co-authored-by: neginraoof <neginmr@utexas.edu>
Co-authored-by: Nikita Shulga <nshulga@fb.com>
(cherry picked from commit 822e79f31ae54d73407f34f166b654f4ba115ea5)
This PR introduces 3 BC changes:
First, this PR propagates the `BUILD_CAFFE2` flag to `libtorch` and `libtorch_python`, which is necessary for non-Caffe2 ONNX runtimes when using the `ONNX_ATEN_FALLBACK` operator export type.
Second, as a complement to https://github.com/pytorch/pytorch/pull/68490, this PR refactors Caffe2's ATen op symbolics to consider not only the `operator_export_type` (aka `ONNX_ATEN_FALLBACK`) when emitting Caffe2 ATen ops, but also whether `BUILD_CAFFE2` (exposed as `torch.onnx._CAFFE2_ATEN_FALLBACK` in the Python binding) is set.
Lastly, it renames `onnx::ATen` to `aten::ATen` for ONNX spec consistency in a BC fashion.
ONNX doesn't have an `ATen` op in its spec, but the PyTorch ONNX converter emits them. Non-Caffe2 backend engines would be misled by such an operator's name/domain. A non-ideal workaround would be to handle ATen ops based on their name and ignore the (non-compliant) domain. Moreover, users could incorrectly file bugs against either ONNX or ONNX Runtime when they inspect the model and notice the presence of an unspecified ONNX operator.
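For context, a minimal sketch of an export that can exercise the ATen fallback path (the model and op choice are illustrative):
```python
import torch

class Model(torch.nn.Module):
    def forward(self, x):
        # An op without a native ONNX symbolic may be emitted as an
        # ATen fallback node under this export type.
        return torch.det(x)

torch.onnx.export(
    Model(), (torch.randn(3, 3),), "model.onnx",
    operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK,
)
```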
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73954
Approved by: https://github.com/BowenBao, https://github.com/malfet, https://github.com/garymm, https://github.com/jiafatom
Fixes https://github.com/pytorch/pytorch/issues/69674
The fix is backward compatible with any Caffe2 build. It simply tries to use the `onnxoptimizer` module when `onnx.optimizer` is not available.
`onnx.optimizer` no longer exists as of ONNX 1.9 (April 2021), as the code was moved to a separate [repo](https://github.com/onnx/onnxoptimizer).
If neither `onnx<1.9` nor `onnxoptimizer` is found, the current fallback behavior is maintained (no ONNX optimization happens). Otherwise, the ONNX optimization pass runs from whichever module is found.
This PR does not require or enforce a direct package dependency to work.
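A minimal sketch of the import fallback (hedged; the actual implementation lives in Caffe2's ONNX frontend, and `maybe_optimize` is a hypothetical helper):
```python
try:
    # onnx < 1.9 ships the optimizer inside the main package.
    import onnx.optimizer as optimizer
except ImportError:
    try:
        # Newer setups provide it as the standalone onnxoptimizer package.
        import onnxoptimizer as optimizer
    except ImportError:
        optimizer = None  # old fallback behavior: skip ONNX optimization

def maybe_optimize(onnx_model, passes):
    # Run the optimization pass from whichever module was found, if any.
    if optimizer is None:
        return onnx_model
    return optimizer.optimize(onnx_model, passes)
```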
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75718
Approved by: https://github.com/BowenBao, https://github.com/malfet
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73148
Makes a number of things const and eliminates extraneous variables.
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D34365183
fbshipit-source-id: 56e4c43e0c14d28f9d18903e9b05f993637489b1
(cherry picked from commit 51520edd16084270aefe8f8143799f918d7ae22d)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73086
I'm wrapping up the conversion of type comments to type annotations
in caffe2. The last remaining "bulk" codemod has test failures that
are hard for me to understand, so I'm going to submit PRs for each
module individually, which makes it easier to see what's causing
problems.
All the codemods were produced via LibCST and then manually cleaned up.
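An illustrative before/after for this kind of codemod (function names are hypothetical):
```python
from torch import Tensor

# Before: a Python 2-style type comment.
def scale_old(x, factor):
    # type: (Tensor, float) -> Tensor
    return x * factor

# After: the equivalent inline annotation produced by the LibCST codemod.
def scale_new(x: Tensor, factor: float) -> Tensor:
    return x * factor
```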
Test Plan: Wait for github CI
Reviewed By: shannonzhu
Differential Revision: D34344202
fbshipit-source-id: 8342267cd27a90ad91a65db858bfbd3675281c9a
(cherry picked from commit 3d0658d8cf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72741
as titled.
Context:
This is useful for quickly mitigating feature-induced overfitting: we can do omni-transfer on a trained model and apply dropout with ratio = 1 to the features that cause overfitting. Directly removing those features would not be feasible in omni-transfer scenarios, since the downstream FC sizes would change.
Experimental records:
https://fb.quip.com/npIkAgRc8jl9#temp:C:DWC050ceaba14424d23a78462c01
Applying dropout = 1 to selected features improves the eval NE over the next few hours (compared to the v0 baseline), as shown in the figures.
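A minimal sketch with the Caffe2 Python API (blob names are illustrative; assumes ratio = 1 zeroes the feature while preserving its shape):
```python
import numpy as np
from caffe2.python import core, workspace

workspace.FeedBlob("X", np.random.randn(4, 8).astype(np.float32))

# Dropout with ratio = 1 zeroes the input while keeping its shape, so
# downstream FC sizes stay valid, unlike removing the feature outright.
op = core.CreateOperator("Dropout", ["X"], ["Y", "mask"], ratio=1.0, is_test=0)
workspace.RunOperatorOnce(op)

print(workspace.FetchBlob("Y"))  # all zeros, same shape as X
```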
Test Plan:
```
buck test caffe2/caffe2/python/operator_test:dropout_op_test
```
Reviewed By: ustctf
Differential Revision: D34178732
fbshipit-source-id: 533feebe21bc582eefd756de397d5c7807c7438d
(cherry picked from commit 5dabf9c484)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72660
This can happen when a model gets an empty input.
For consistency with numpy and torch, we should just return 0 without averaging, or NaN with averaging.
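The numpy behavior being matched, for reference:
```python
import numpy as np

x = np.array([], dtype=np.float32)

print(np.sum(x))   # 0.0 -- non-averaged reduction of an empty input
print(np.mean(x))  # nan -- averaging over zero elements (warns at runtime)
```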
Test Plan: Modified unittest
Differential Revision: D33782786
fbshipit-source-id: 90d8d63d685c96acc903c08c59eb39fad39e493c
(cherry picked from commit ca85779a4e)
Summary: Add a session-based margin loss operator to Caffe2. This is the first diff to make these two losses available to dper3.
Test Plan:
Unit tests succeed with gradient checks for both new loss functions:
buck test //caffe2/caffe2/python/operator_test:softmax_l2r_operator_test
buck test //caffe2/caffe2/python/operator_test:margin_loss_l2r_operator_test
E2E test in bento notebook with model training in N1488923
margin loss model: f318207967 f318207399
Note that the E2E test is run with the dper change in D33532976 to change a full model.
Reviewed By: devashisht
Differential Revision: D32902460
fbshipit-source-id: 8f21b9109f500583431156908b632e503ed90dbd
(cherry picked from commit 1592111aa4)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70248
Modified loops in files under fbsource/fbcode/caffe2/ from the format
```
for (TYPE var = x0; var < x_max; var++)
```
to the format
```
for (const auto var : irange(x_max))
```
This was achieved by running r-barnes's loop upgrader script (D28874212) with some modification to exclude all files under /torch/jit and a number of reversions or unused variable suppression warnings added by hand.
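For reference, a minimal sketch of the transformation using `c10::irange` (the function is a hypothetical example):
```cpp
#include <c10/util/irange.h>
#include <cstdio>

void print_indices(int n) {
  // Before the codemod: a classic index loop.
  for (int i = 0; i < n; i++) {
    std::printf("%d\n", i);
  }
  // After the codemod: the equivalent irange-based loop.
  for (const auto i : c10::irange(n)) {
    std::printf("%d\n", i);
  }
}
```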
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D32813863
fbshipit-source-id: 527244b4a2b220fdfe7f17dee3599603f492a2ca
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68485
In OSS, the only change is that we make the predict_net field of PredictorExporterMeta nullable.
Test Plan: sandcastle, let CI run
Reviewed By: boryiingsu
Differential Revision: D32467138
fbshipit-source-id: 81bd5fca695462f6a186bcfa927073874cc9c26a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66743
Modified loops in files under fbsource/fbcode/caffe2/ from the format
`for (TYPE var = x0; var < x_max; var++)`
to the format
`for (const auto var : irange(x_max))`
This was achieved by running r-barnes's loop upgrader script (D28874212) with some modification to exclude all files under /torch/jit and a number of reversions or unused variable suppression warnings added by hand.
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D31705359
fbshipit-source-id: c9ea2fbc0f9cd29e97a52dcb203addc5f2abb09b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66234
Modified loops in files under fbsource/fbcode/caffe2/ from the format
`for (TYPE var = x0; var < x_max; var++)`
to the format
`for (const auto var : irange(x_max))`
This was achieved by running r-barnes's loop upgrader script (D28874212) with some modification to exclude all files under /torch/jit and a number of reversions or unused variable suppression warnings added by hand.
bypass_size_limit
allow-large-files
Test Plan: Sandcastle
Reviewed By: ngimel
Differential Revision: D30652629
fbshipit-source-id: 0ae6c4bbbb554bad42e372792a6430e1acf15e3e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66443
For some reason, this logging is adding noise to a lot of flow jobs. I am not sure if this is actually needed.
This is called from `__init__`, so it is logged every time and logs all key:value pairs in the current local symbols.
Test Plan: N/A
Reviewed By: chowarfb
Differential Revision: D31534372
fbshipit-source-id: bed032b66fed548c97a6f66b1b9e905fd2738851
Summary:
Addresses this network risk mitigation mentioned in https://github.com/pytorch/pytorch/issues/65439#issuecomment-924627239.
I didn't include any mobile app/benchmarking changes because I think the pretrained weights matter there.
I ended up removing the changes in test_utils because those were sensitive to the pretrained variable.
I am saving the quantization test changes for another PR because they are currently disabled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66312
Reviewed By: ejguan
Differential Revision: D31542992
Pulled By: janeyx99
fbshipit-source-id: 57b4f70247af25cc96c57abd9e689c34641672ff
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66056
I keep running into this unrelated failure when landing diffs for the GPU inference project, so I am disabling this operator unit test on GPU because the operator doesn't exist there:
```
RuntimeError: [enforce fail at operator.cc:277] op. Cannot create operator of type 'SmartDecaySparseAdam' on the device 'CUDA'. Verify that implementation for the corresponding device exist. It might also happen if the binary is not linked with the operator implementation code. If Python frontend is used it might happen if dyndep.InitOpsLibrary call is missing. Operator def: input: "param" input: "mom1" input: "mom2" input: "last_seen" input: "indices" input: "grad" input: "lr" input: "iter" output: "param" output: "mom1" output: "mom2" output: "last_seen" name: "" type: "SmartDecaySparseAdam" arg { name: "beta1" f: 0 } arg { name: "beta2" f: 0.9 } arg { name: "epsilon" f: 1e-05 } device_option { device_type: 1 }
```
https://www.internalfb.com/intern/testinfra/diagnostics/5910974579962988.562949996565057.1633122845/
Test Plan: sandcastle
Reviewed By: jianyuh
Differential Revision: D31364731
fbshipit-source-id: 7fbd994cbe7f6ca116f5f34506a1ed7f14759bdf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65610
- Replace HIP_PLATFORM_HCC with USE_ROCM.
- Don't rely on CUDA_VERSION or HIP_VERSION; use USE_ROCM and ROCM_VERSION instead.
- In the next PR:
  - Remove the mapping from CUDA_VERSION to HIP_VERSION and from CUDA to HIP in hipify.
  - HIP_PLATFORM_HCC is deprecated, so add HIP_PLATFORM_AMD to support HIP host code compilation on gcc.
A sketch of the guard change appears below.
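An illustrative sketch of the guard change in C++ (the declaration and version number are hypothetical):
```cpp
// Before: ROCm-specific code guarded by the compiler-defined platform macro.
#ifdef __HIP_PLATFORM_HCC__
void rocm_only_path();  // hypothetical declaration
#endif

// After: guarded by the build-system flag, with an explicit version check
// where one is needed.
#if defined(USE_ROCM) && (ROCM_VERSION >= 40300)
void rocm_only_path();
#endif
```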
cc jeffdaily sunway513 jithunnair-amd ROCmSupport amathews-amd
Reviewed By: jbschlosser
Differential Revision: D30909053
Pulled By: ezyang
fbshipit-source-id: 224a966ebf1aaec79beccbbd686fdf3d49267e06