Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31517
This is going to be used by upsample (which currently uses magic values to represent optionals).
For now, we just introduce a fake function for testing (torch._test_optional_float(x)).
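A minimal Python sketch of the contrast (the helper names here are hypothetical, not the real upsample signature):
```python
from typing import Optional

# Magic-value style: a sentinel such as -1.0 stands in for "not provided",
# so callers can never legitimately pass -1.0 itself.
def upsample_magic(scale_factor: float = -1.0) -> bool:
    return scale_factor != -1.0

# Optional style: None is unambiguous and the full float range stays usable.
def upsample_optional(scale_factor: Optional[float] = None) -> bool:
    return scale_factor is not None
```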
Test Plan: Imported from OSS
Differential Revision: D19198721
Pulled By: gchanan
fbshipit-source-id: 0a1382fde0927c5d277d02d62bfb31fb574b8c74
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30144
Create a script to produce a libtorch that only contains the ops needed by specific
models. Developers can use this workflow to further optimize mobile build size.
We need to keep a dummy stub for unused (stripped) ops because some JIT-side
logic requires certain function schemas to exist in the JIT op registry.
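(As an aside, a hedged Python sketch of producing the same kind of op list without the C++ binary, assuming torch.jit.export_opnames is available in this build; the dump_operator_names binary below is the canonical path.)
```python
import torch
import yaml  # assumes PyYAML is available

model = torch.jit.load("mobilenetv2.pk")
op_names = torch.jit.export_opnames(model)  # e.g. ["aten::_convolution", ...]
with open("mobilenetv2.yaml", "w") as f:
    yaml.safe_dump(sorted(op_names), f)     # writes one "- op" entry per line
```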
Test Steps:
1. Build "dump_operator_names" binary and use it to dump root ops needed
by a specific model:
```
build/bin/dump_operator_names --model=mobilenetv2.pk --output=mobilenetv2.yaml
```
2. The MobileNetV2 model should use the following ops:
```
- aten::t
- aten::dropout
- aten::mean.dim
- aten::add.Tensor
- prim::ListConstruct
- aten::addmm
- aten::_convolution
- aten::batch_norm
- aten::hardtanh_
- aten::mm
```
NOTE that for some reason it outputs "aten::addmm" but actually uses "aten::mm".
You need to fix it manually for now.
3. Run custom build script locally (use Android as an example):
```
SELECTED_OP_LIST=mobilenetv2.yaml scripts/build_pytorch_android.sh armeabi-v7a
```
4. Check out the demo app that uses the locally built library instead of
downloading from the jcenter repo:
```
git clone --single-branch --branch custom_build git@github.com:ljk53/android-demo-app.git
```
5. Copy locally built libraries to demo app folder:
```
find ${HOME}/src/pytorch/android -name '*.aar' -exec cp {} ${HOME}/src/android-demo-app/HelloWorldApp/app/libs/ \;
```
6. Build demo app with locally built libtorch:
```
cd ${HOME}/src/android-demo-app/HelloWorldApp
./gradlew clean && ./gradlew assembleDebug
```
7. Install and run the demo app.
In-APK arm-v7 libpytorch_jni.so build size reduced from 5.5M to 2.9M.
Test Plan: Imported from OSS
Differential Revision: D18612127
Pulled By: ljk53
fbshipit-source-id: fa8d5e1d3259143c7346abd1c862773be8c7e29a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28402
Revert PR #27274 as it's absorbed by PR #28398.
Test Plan: - make sure all mobile models can load and run
Differential Revision: D18055993
Pulled By: ljk53
fbshipit-source-id: 0d0ffdf2cfae18577189d3b69de15fa892210916
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27294
Fixes #27291
I'm a little annoyed that I have to reintroduce manual binding code. But it's
probably not a good idea to teach the codegen how to do fastpath functions
(is it?)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D17763486
Pulled By: ezyang
fbshipit-source-id: 5793b53e2db80b044e57faae325a95c649d9d459
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27274
This is yet another fix to address #26764.
PR #26908 toggles NonVariableTypeMode in the ATen dispatcher, which is where
USE_STATIC_DISPATCH takes place and is thus the most logically sound place to do
such tweaks.
However, we observed nontrivial perf regression due to this fix. It turns out
the numel() tensor method gets called in several for-loops, incurring ~7M
thread_local updates in a single forward call:
```
7173330 numel
558 size
416 q_scale
302 _empty_affine_quantized
288 contiguous
257 q_zero_point
216 qscheme
173 empty
110 set_
105 as_strided
104 permute
...
```
As numel() is not called from a single place, a natural workaround is to
update function_wrapper.py so that it only adds the guard in the gen_namespace_function()
case and ignores the gen_tensor_method() case. But some tensor methods are actually
called from the JIT side directly (e.g. "aten::eq_" -> "(self).eq_"), so the
only "band aid" left on the table is to insert the guard on the JIT->ATen path as originally
done in #26868 - this is a simplified version of it, as it doesn't hurt to extend the
NonVariableTypeMode scope a little bit to also cover stack drop/pack calls.
On Android we only expose the JIT API so we don't need to worry about TensorMethods being
called directly. On iOS we don't provide a wrapper yet, but we can mention this caveat
in the doc. Hopefully by the time it's widely used we will have finished the Variable/Tensor
unification and removed all these hacks.
Test Plan:
- Verified it runs quantized/fp32 MobileNetV2 models;
- Verified it fixes the perf regression (revert #26908 separately);
Differential Revision: D17732489
Pulled By: ljk53
fbshipit-source-id: c14ca66aebc6b6f17ad6efac7ca47f9487c98de5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26851
Add a codegen option to remove backward ops from the jit-op-registry, as they are not
likely to be needed for an inference-only mobile build.
Measured ARM-v7 AAR build size change: 5,804,182 -> 5,331,219.
Test Plan: - build and integrate with demo app;
Differential Revision: D17587422
Pulled By: ljk53
fbshipit-source-id: 08c0fc7a710698a0d4baaf16bbb73cb812b1126a
Summary:
We need this to be able to register them with the c10 dispatcher.
The overload names are based on one-letter-per-argument-type.
Script used to change native_functions.yaml and derivatives.yaml: P75630718
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23532
ghstack-source-id: 87539687
Differential Revision: D16553437
fbshipit-source-id: a1d0f10c42d284eba07e2a40641f71baa4f82ecf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22175
- Rename AliasAnalysisKind::DEFAULT to AliasAnalysisKind::CONSERVATIVE
- Introduce AliasAnalysisKind::FROM_SCHEMA that means the alias annotations of the schema should be honored
- Introduce AliasAnalysisKind::INTERNAL_SPECIAL_CASE to be able to run assertions that internal special cased ops are treated correctly
- aten:: and prim:: ops are not treated as special cases anymore, but just use AliasAnalysisKind::FROM_SCHEMA
- There's a set of assertions to ensure that aten:: and prim:: ops are all correctly set up to use AliasAnalysisKind::FROM_SCHEMA. Once this PR lands and passes all tests, we will remove those assertions and open up for the possibility of different AliasAnalysisKind settings for aten:: and prim:: ops
Differential Revision: D15929595
fbshipit-source-id: 7c6a9d4d29e13b8c9a856062cd6fb3f8a46a2e0d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21937
This changes call sites to use the new naming scheme
Reviewed By: zdevito
Differential Revision: D15892404
fbshipit-source-id: 8d32aa90a0ead1066688166478f299fde9c2c133
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21852
To enable changing q_scale and q_zero_point in `copy_`.
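A hedged sketch of what this enables, written against the present-day quantization API (the factory function may have had a different name at the time):
```python
import torch

src = torch.quantize_per_tensor(torch.rand(3), scale=0.25, zero_point=1, dtype=torch.quint8)
dst = torch.quantize_per_tensor(torch.rand(3), scale=0.50, zero_point=0, dtype=torch.quint8)

dst.copy_(src)  # with this change, dst should pick up src's quantizer parameters
print(dst.q_scale(), dst.q_zero_point())  # expected: 0.25 1
```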
Differential Revision: D15793427
fbshipit-source-id: a7040b5b956d161fd6af6176287f4a4aa877c9be
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21177
- Integrate c10::ListPtr into IValue and the c10 dispatcher.
- Streamline conversion to/from IValue. Before, we had IValue::to<> and kernel_functor.h had its own ivalue_to_arg_type and return_type_to_ivalue. They are now unified. Also, this means that nested types like Dicts of Lists of Optional of Dict of ... do work as expected now
Differential Revision: D15476433
fbshipit-source-id: bde9df80df20091aa8e6ae17ba7e90abd149b954
Summary:
#19975 was split into 2 PRs.
This one:
Introduces the MemoryFormat argument to the `x.is_contiguous(memory_format=torch.channels_last)` and `y = x.contiguous(memory_format=torch.channels_last)` functions.
At this moment both functions just operate on strides and don't store any tensor state.
(Original RFC #19092)
-----
Expands functionality of two tensor functions `.is_contiguous` and `.contiguous` (both python and c++ api).
Note: We had several complaints about `.to(memory_format)` function, and decided not to support it.
1. `.contiguous` now supports an optional keyword-only argument - `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.
- Using `torch.contiguous_format` preserves the existing `.contiguous()` behavior.
- Calling `x.contiguous(memory_format=torch.channels_last)` returns a new tensor which maintains the same semantic layout (NCHW), but has a different memory allocation pattern.
`x.contiguous(memory_format=torch.channels_last)` expects the input tensor to be 3d, 4d or 5d, and fails otherwise.
2. `.is_contiguous` now supports an optional keyword-only argument - `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.
- `x.is_contiguous(memory_format=torch.contiguous_format)` preserves the same functionality as `x.is_contiguous()` and remains unchanged.
- `x.is_contiguous(memory_format=torch.channels_last)` returns true if A) the input tensor is contiguous in memory AND B) it is allocated in memory in NHWC (or similar for 3d, 5d) format.
Note: by the end of phase one, `x.is_contiguous(memory_format=torch.channels_last)` will calculate the state of the Tensor on every call. This functionality is going to be updated later.
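A minimal usage sketch of the two calls described above:
```python
import torch

x = torch.randn(2, 3, 4, 4)                          # contiguous NCHW tensor
y = x.contiguous(memory_format=torch.channels_last)  # same NCHW semantics, NHWC-style strides

print(x.is_contiguous())                                   # True
print(x.is_contiguous(memory_format=torch.channels_last))  # False
print(y.is_contiguous(memory_format=torch.channels_last))  # True
print(y.shape == x.shape)                                  # True; only the strides differ
```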
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20455
Differential Revision: D15341577
Pulled By: VitalyFedyunin
fbshipit-source-id: bbb6b4159a8a49149110ad321109a3742383185d
Summary:
Make it possible to construct a pinned memory tensor without creating a storage first and without calling the pin_memory() function. It is also faster, as the copy operation is unnecessary.
Supported functions:
```python
torch.rand_like(t, pin_memory=True)
torch.randn_like(t, pin_memory=True)
torch.empty_like(t, pin_memory=True)
torch.full_like(t, 4, pin_memory=True)
torch.zeros_like(t, pin_memory=True)
torch.ones_like(t, pin_memory=True)
torch.tensor([10,11], pin_memory=True)
torch.randn(3, 5, pin_memory=True)
torch.rand(3, pin_memory=True)
torch.zeros(3, pin_memory=True)
torch.randperm(3, pin_memory=True)
torch.empty(6, pin_memory=True)
torch.ones(6, pin_memory=True)
torch.eye(6, pin_memory=True)
torch.arange(3, 5, pin_memory=True)
```
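A quick check that the allocation really is pinned (a sketch; it assumes a CUDA-enabled build, since pinning is rejected otherwise):
```python
import torch

t = torch.empty(6, pin_memory=True)
print(t.is_pinned())  # True
```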
Part of the bigger `Remove Storage` plan.
Now compatible with both torch scripts:
` _1 = torch.zeros([10], dtype=6, layout=0, device=torch.device("cpu"), pin_memory=False)`
and
` _1 = torch.zeros([10], dtype=6, layout=0, device=torch.device("cpu"))`
The same is checked for all similar functions: `rand_like`, `empty_like`, and others.
This is a fixed version of #18455.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18952
Differential Revision: D14801792
Pulled By: VitalyFedyunin
fbshipit-source-id: 8dbc61078ff7a637d0ecdb95d4e98f704d5450ba
Summary:
Add input information into generated RecordFunction calls in
VariableType wrappers, JIT operators and a few more locations
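A hedged sketch of observing the recorded inputs from Python, assuming the profiler's record_shapes flag is the user-facing switch for this information:
```python
import torch

x = torch.randn(8, 8)
with torch.autograd.profiler.profile(record_shapes=True) as prof:
    torch.mm(x, x)

# Input shapes recorded through RecordFunction show up grouped in the profiler table.
print(prof.key_averages(group_by_input_shape=True).table(sort_by="cpu_time_total"))
```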
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18717
Differential Revision: D14729156
Pulled By: ilia-cher
fbshipit-source-id: 811ac4cbfd85af5c389ef030a7e82ef454afadec
Summary:
Make it possible to construct a pinned memory tensor without creating a storage first and without calling the pin_memory() function. It is also faster, as the copy operation is unnecessary.
Supported functions:
```python
torch.rand_like(t, pin_memory=True)
torch.randn_like(t, pin_memory=True)
torch.empty_like(t, pin_memory=True)
torch.full_like(t, 4, pin_memory=True)
torch.zeros_like(t, pin_memory=True)
torch.ones_like(t, pin_memory=True)
torch.tensor([10,11], pin_memory=True)
torch.randn(3, 5, pin_memory=True)
torch.rand(3, pin_memory=True)
torch.zeros(3, pin_memory=True)
torch.randperm(3, pin_memory=True)
torch.empty(6, pin_memory=True)
torch.ones(6, pin_memory=True)
torch.eye(6, pin_memory=True)
torch.arange(3, 5, pin_memory=True)
```
Part of the bigger `Remove Storage` plan.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18455
Reviewed By: ezyang
Differential Revision: D14672084
Pulled By: VitalyFedyunin
fbshipit-source-id: 9d0997ec00f59500ee018f8b851934d334012124
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**
This was requested by someone at Facebook; this lint is turned
on for Facebook by default. "Sure, why not."
I had to noqa a number of imports in __init__. Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it. Left for future work.
Be careful! flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments. flake8-3 will
report an import unused; flake8-2 will not. For now, I just
noqa'd all these sites.
All the changes were done by hand.
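A minimal illustration of the two workarounds mentioned above (module names are arbitrary):
```python
# Workaround used in this PR: silence the lint at the import site.
from math import pi  # noqa: F401

# The "proper" alternative left for future work: re-export via __all__.
from math import tau
__all__ = ["tau"]
```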
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14687478
fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
Summary:
In Python 2, float values get truncated. We are storing default float values as floats (not 100% sure why?), which results in the defaults being truncated in the JIT and not matching the (specified) native function signatures.
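A minimal illustration, assuming the truncation comes from Python 2's 12-significant-digit str() formatting:
```python
value = 1.0 / 3.0
# Python 2:  str(value) == '0.333333333333'      (12 significant digits)
# Python 3:  str(value) == '0.3333333333333333'  (full round-trip precision)
print(str(value))
```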
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18044
Reviewed By: ezyang
Differential Revision: D14469868
Pulled By: gchanan
fbshipit-source-id: a456de599e8dab106966bcac7a6033f02ce3cdd2
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/16233
The following changes are made:
- Modify `TupleType` to store optional field names
- Modify schema matching to fill in those field names when creating `TupleType` as the return type.
- Modify codegen of JIT to copy field names to schema string
- Modify `SchemaParser` to set field names of returned schema.
- Modify `SimpleValue::attr` to emit tuple indexing for named tuple.
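A hedged sketch of what this unlocks in TorchScript, assuming torch.max already returns a named tuple (values, indices) in eager mode:
```python
import torch

@torch.jit.script
def row_max(x):
    result = torch.max(x, dim=0)          # operator returning a named tuple
    return result.values, result.indices  # field access instead of tuple indexing

print(row_max(torch.randn(3, 4)))
```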
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16253
Reviewed By: ezyang
Differential Revision: D13954298
Pulled By: zdevito
fbshipit-source-id: 247d483d78a0c9c12d1ba36e1f1ec6c3f1a3007b
Summary:
This PR is a follow-up of #15460; it does the following things:
* remove the undefined tensor semantic in jit script/tracing mode
* change ATen/JIT schema for at::index and other index related ops with `Tensor?[]` to align with what at::index is really doing and to adopt `optional[tensor]` in JIT
* change python_print to correctly print the exported script
* register both TensorList and ListOfOptionalTensor in JIT ATen ops to support both
* Backward compatibility for `torch.jit.annotate(Tensor, None)`
List of follow ups:
* remove the undefined tensor semantic in jit autograd, autodiff and grad_of
* remove prim::Undefined fully
For easy reviews, please turn on `hide white space changes` in diff settings.
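A hedged sketch of the Optional[Tensor] handling described in the list above, written with present-day annotation syntax:
```python
from typing import Optional
import torch

@torch.jit.script
def masked(x: torch.Tensor, mask: Optional[torch.Tensor]) -> torch.Tensor:
    if mask is None:
        return x
    return x[mask]  # mask is refined to Tensor after the None check

print(masked(torch.arange(4), None))
print(masked(torch.arange(4), torch.tensor([True, False, True, False])))
```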
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16379
Differential Revision: D13855677
Pulled By: wanchaol
fbshipit-source-id: 0e21c14d7de250c62731227c81bfbfb7b7da20ab
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16751
This was made more complicated by the fact that ivalue::IntList
is a thing. So I had to fix all of the sites where we were referring
to IValue, post facto.
The following codemods were run, in this order:
```
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntList IntArrayRef
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntArrayRef::create IntList::create
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in ivalue::IntArrayRef ivalue::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in Tag::IntArrayRef Tag::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in isIntArrayRef isIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in toIntArrayRef toIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'Shared<IntArrayRef>' 'Shared<IntList>'
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'intrusive_ptr<IntArrayRef>' 'intrusive_ptr<IntList>'
```
Some manual fixups were done afterwards; they can be reviewed separately
at https://github.com/pytorch/pytorch/pull/16752
Reviewed By: dzhulgakov
Differential Revision: D13954363
fbshipit-source-id: b5c40aacba042402155a2f5a229fa6db7992ac64
Summary:
I went through my build log and did what I thought were reasonable fixes to all the C++ compilation warnings that came up
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16411
Differential Revision: D13901006
Pulled By: jamesr66a
fbshipit-source-id: 02df4e3e5a5c8dd9e69ac9f065cd3f2a80645033
Summary:
If "matches_jit_signature" is set to True for a particular function, we will assume that the func syntax follows the JIT signature syntax. This is a temporary attribute and doesn't need to be set by developers outside the core team. It serves as a means of tracking an ongoing schema unification with the goal of aligning func syntax with other components of PyTorch in order to reduce overall complexity and match coverage of different function descriptions.
Followup PRs might be about removing _out from native_functions.yaml and using Tensor annotations instead, etc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16040
Reviewed By: ezyang
Differential Revision: D13703176
Pulled By: cpuhrsch
fbshipit-source-id: ce248e1823a6f18efa95502f9f3eebf023b4a46c
Summary:
Optional clean up. This PR removes python_default_init from the yaml files and the code-gen, and utilizes the optional type to do the work.
This also fixes the bug in #13149 to correctly adopt as_strided backward.
Fixes #9941
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15234
Differential Revision: D13502044
Pulled By: wanchaol
fbshipit-source-id: 774b61fc4414482cf11d56e22bd0275aefb352a4
Summary:
For #6593 and #9515
This completes the support for optional<ScalarType> in native, JIT and autograd.
Note: Mostly following the existing implementation for optional<Scalar> that was added in https://github.com/pytorch/pytorch/pull/12582.
This PR introduces a way to make functions accept an optional dtype and it will unblock #9515 by allowing the `dtype` param for type promotion interface:
```
func: name(inputs, *, ScalarType? dtype=None, Casting casting=same_kind)
```
An alternative approach could have been using `ScalarType::Undefined` for the same purpose but without optional, though it would have been a bit hacky.
```
func: name(inputs, *, ScalarType dtype=Undefined, Casting casting=same_kind)
```
Here's an example use of this in action: 971f69eac6
There are already a bunch of native functions that were getting optional `dtype` through function overloading. https://github.com/pytorch/pytorch/pull/15133 is the attempt to migrate all of those. I will send those changes separately after this since some functions (e.g. sum) need quite a bit of change in the codebase. See the commits over there.
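A usage sketch of an optional dtype from the Python side (sum is one of the overloaded functions mentioned above):
```python
import torch

x = torch.ones(4, dtype=torch.int32)
print(torch.sum(x).dtype)                       # torch.int64 (dtype=None falls back to default promotion)
print(torch.sum(x, dtype=torch.float64).dtype)  # torch.float64 (explicit dtype sets the accumulation type)
```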
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15154
Differential Revision: D13457760
Pulled By: tugrulates
fbshipit-source-id: 706134f0bd578683edd416b96329b49a1ba8ab48
Summary:
Removing the deprecated functions in `torch/csrc/variable_tensor_functions.h` (like `torch::CPU`) and corresponding implementations from `torch/csrc/torch.cpp` from master after the release.
ezyang gchanan soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15003
Differential Revision: D13418086
Pulled By: goldsborough
fbshipit-source-id: a0accdf6f7b0efa1ec07ac7b74b86ff2da37543f
Summary:
[ note: stacked on expect files changes, will unstack once they land ]
This adds DeviceObjType (we cannot use DeviceType, as it is already an enum)
to the type hierarchy and an isDevice/toDevice pair to IValue.
Previous hacks which used an int[] to represent Device are removed
and at::Device is used instead.
Note: the behavior of .to is only a subset of Python; we need to
fix the aten op so that it accepts Optional[Device] and Optional[ScalarType].
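A hedged sketch of Device flowing through TorchScript as a first-class value, in present-day syntax:
```python
import torch

@torch.jit.script
def to_cpu(x: torch.Tensor) -> torch.Tensor:
    d = torch.device("cpu")  # a first-class Device value, not an int[] encoding
    return x.to(d)

print(to_cpu(torch.ones(2)).device)  # cpu
```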
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14666
Reviewed By: suo
Differential Revision: D13290405
Pulled By: zdevito
fbshipit-source-id: 68b4381b292f5418a6a46aaa077f1c902750b134
Summary:
This PR is a part of the task to unblock standard library export.
* We treat None differently from Tensor and other types: when passing None as a Tensor, it becomes an undefined tensor rather than the None IValue.
* Refine the type system so that we have a correct tensor type hierarchy (Dynamic/Tensor/CompleteTensor); Dynamic should be at the top of the inheritance hierarchy.
* It also tries to export bilinear as an example of an undefined tensor (None) input.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13650
Differential Revision: D12967026
Pulled By: wanchaol
fbshipit-source-id: 6aedccc7ce2a12fadd13d9e620c03e1260103a5a
Summary:
Some extra documentation for other bits too.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13621
Differential Revision: D12943416
Pulled By: ezyang
fbshipit-source-id: c922995e420d38c2698ce59c5bf4ffa9eb68da83