Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54103
The goal is to reduce the spread of static casts in the autograd code as per the comment in https://github.com/pytorch/pytorch/pull/49097#discussion_r543695091
I wasn't sure how to use a virtual method here but a simple method in impl clean it up quite nicely.
Test Plan: Imported from OSS
Reviewed By: agolynski
Differential Revision: D27117840
Pulled By: albanD
fbshipit-source-id: 5f277dde34ccf6bc20f76583b906ff3528cde5aa
Summary:
Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857
These are the only hand-written parts of this diff:
- the addition to `.github/workflows/lint.yml`
- the file endings changed in these four files (to appease FB-internal land-blocking lints):
- `GLOSSARY.md`
- `aten/src/ATen/core/op_registration/README.md`
- `scripts/README.md`
- `torch/csrc/jit/codegen/fuser/README.md`
The rest was generated by running this command (on macOS):
```
git grep -I -l ' $' -- . ':(exclude)**/contrib/**' ':(exclude)third_party' | xargs gsed -i 's/ *$//'
```
I looked over the auto-generated changes and didn't see anything that looked problematic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406
Test Plan:
This run (after adding the lint but before removing existing trailing spaces) failed:
- https://github.com/pytorch/pytorch/runs/2043032377
This run (on the tip of this PR) succeeded:
- https://github.com/pytorch/pytorch/runs/2043296348
Reviewed By: walterddr, seemethere
Differential Revision: D26856620
Pulled By: samestep
fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97
Summary:
Fix https://github.com/pytorch/pytorch/issues/46242
This ensures that the `check_inplace()` run the proper checks even if the Tensor that is being modified inplace does not requires gradient. As the Tensor written into it might require gradient and will make this inplace modification actually differentiable.
This contains:
- Codegen changes to tell `check_inplace()` if the inplace will be differentiable
- Changes in `handle_view_on_rebase` to work properly even when called for an input that does not require gradients (which was assumed to be true before)
- Corresponding tests (both warnings and the error raise internal assert errors without this fix)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46296
Reviewed By: ezyang
Differential Revision: D24903770
Pulled By: albanD
fbshipit-source-id: 74e65dad3d2e3b9f762cbb7b39f92f19d9a0b094
Summary:
Fixes https://github.com/pytorch/pytorch/issues/36403
Copy-paste of the issue description:
* Escape hatch: Introduce unsafe_* version of the three functions above that have the current behavior (outputs not tracked as views). The documentation will explain in detail why they are unsafe and when it is safe to use them. (basically, only the outputs OR the input can be modified inplace but not both. Otherwise, you will get wrong gradients).
* Deprecation: Use the CreationMeta on views to track views created by these three ops and throw warning when any of the views is modified inplace saying that this is deprecated and will raise an error soon. For users that really need to modify these views inplace, they should look at the doc of the unsafe_* version to make sure their usecase is valid:
* If it is not, then pytorch is computing wrong gradients for their use case and they should not do inplace anymore.
* If it is, then they can use the unsafe_* version to keep the current behavior.
* Removal: Use the CreationMeta on view to prevent any inplace on these views (like we do for all other views coming from multi-output Nodes). The users will still be able to use the unsafe_ versions if they really need to do this.
Note about BC-breaking:
- This PR changes the behavior of the regular function by making them return proper views now. This is a modification that the user will be able to see.
- We skip all the view logic for these views and so the code should behave the same as before (except the change in the `._is_view()` value).
- Even though the view logic is not performed, we do raise deprecation warnings for the cases where doing these ops would throw an error.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39299
Differential Revision: D22432885
Pulled By: albanD
fbshipit-source-id: 324aef091b32ce69dd067fe9b13a3f17d85d0f12
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32839
As mentioned in the updated comment in `variable.h`, this disambiguate code like:
```python
base = torch.rand(10, requires_grad=True)
with torch.no_grad():
view = base[1]
view.copy_(var)
torch.autograd.grad(base.sum(), var) # <- what should it return?
```
Given that there is no consensus of what should happen here (does the gradient flow through the view in the no_grad or not). This special case is detected and forbidden.
As mentionned in the error message:
- If you want it to be tracked: move both out of the no_grad
- If do not want them to be tracked, move both inside the no_grad
This implies that any custom Function that returns views does not allow inplace modification on its output. I'll add a PR to the stack to relax this to be a DeprecationWarning for now. And we will make it into an actual error for 1.6
This replaces https://github.com/pytorch/pytorch/pull/26607
cc sublee
Test Plan: Imported from OSS
Differential Revision: D19814114
Pulled By: albanD
fbshipit-source-id: ff2c9d97c8f876d9c31773a2170e37b06d88bed7
Summary:
This PR renames `at::Tensor::base()` to `at::Tensor::_base()`, to achieve parity with Python `torch.Tensor._base` API.
----
This PR is BC-breaking in the following way:
Previously, to get the tensor that this tensor is a view of, the user would call `tensor.base()` in C++. Now, they must call `tensor._base()`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33316
Differential Revision: D19905687
Pulled By: yf225
fbshipit-source-id: 949d97b707b2c82becb99ac89e9ac24359d183e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31990
This PR does three things:
- Add a new `allow_rebase_history` flag to the differentiable views. If set, trying to rebase their history will raise an error.
- Make sure that the codegen functions verify this flag before doing inplace operations so that they fail before doing the inplace modification.
- Make sure the codegen functions set this flag properly when we don't support rebasing the history of the output.
The codegen change can be found [here](4bf180caa0).
Test Plan: Imported from OSS
Differential Revision: D19409649
Pulled By: albanD
fbshipit-source-id: a2b41c2d231e952ecfe162bdb6bad620ac595703
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32044
Fix the list of views in the codegen:
- Move `narrow` out of the autograd functions since it's now implemented with slice.
- Add `split_with_sizes` that was missing from the list
- Remove special formulas for both `split` and `split_with_sizes`. Both used not to be considered as views. When they are, all the rnn code breaks because it uses them in an invalid way. The generic formula will generate one `narrow` Node for each output. Which is always valid.
The diff for the generated code can be found [here](https://github.com/pytorch/pytorch/compare/16eff6e...albanD:06d6e85) (outdated for last commit)
Test Plan: Imported from OSS
Differential Revision: D19409648
Pulled By: albanD
fbshipit-source-id: 5ebc4c978af500403f7f008c0231b7db0cabab26
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30915
Since we now have C++14, we don't need these c10::guts helpers anymore
ghstack-source-id: 95777609
Test Plan: waitforsandcastle
Differential Revision: D18869639
fbshipit-source-id: 97716f932297c64c6e814410ac47b444c33d4e2e
Summary:
Fixes https://github.com/pytorch/pytorch/issues/29161.
I looked a bit at the code changes related to this and think I have all of the use cases of `DeprecatedTypeProperties` covered in the message, but suggestions from someone with more context on this would be very much appreciated :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30281
Differential Revision: D18830818
Pulled By: ezyang
fbshipit-source-id: 1a7fcee15354ae09e6644577e7fa33bd26acfe20
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28287
This PR eliminates the static distinction between
Tensor and Variable. Every Variable is a Tensor, no need to static_cast
or call the Variable constructor.
To do this, I need Tensor to have API parity with Variable. I have already
moved most of the methods I don't want in Tensor off Variable.
These implementations are all placed in Tensor.cpp.
One API difference is that all Variable methods now have const, so we no longer
have faux const-correctness (see https://github.com/zdevito/ATen/issues/27 for
back story)
This diff is BC breaking in a few ways:
- Because torch::autograd::Variable is now just an alias of at::Tensor, ADL for
`torch::autograd` functions no longer works, you have to explicitly qualify
them with `torch::autograd` (examples: `torch/nn/parallel/data_parallel.h`)
- Because Variable and Tensor are now the same type, code which assumes that
they are different types (e.g., for the purposes of templating, or enable_if checks)
will not work until you delete the (now) redundant overload/specialization.
(examples: `torch/nn/modules/container/any.h`, `torch/csrc/utils/pybind.h`)
Some other notes:
- I'm not sure what was going with the old template implementation of `extract_vars`,
but I couldn't get the sfinae version to work. Replacing it with an overloading based version
made it work.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D18571426
Pulled By: ezyang
fbshipit-source-id: 2ea8151e5f1d8512cdebf1345399642e68b707b8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29667
Some previous implementations are defined in native_functions.yaml.
In this case, I don't define them explicitly in Tensor; instead
they are placed in VariableTypeManual.cpp. When I did this, I would
have deleted documentation; instead, this documentation was moved
to native_functions.yaml
This also replaces `current_version` with just `_version`.
This is a carved out portion of #28287, rebased past Tensor-Variable
merge.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D18504934
Pulled By: ezyang
fbshipit-source-id: be7adf45b637daffe2b0b1631eb31d967525fc31
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29665
Our intention is to merge the static distinction between Tensor and
Variable. Ordinarily, this would entail merging the methods of Tensor
and Variable. But there are a lot of "private"-ish methods on Variable
that we don't actually want to dump onto the Tensor class. So, as prep
work, we move all of those methods off of Variable and into
the torch::autograd::impl namespace (impl as in, please don't use this
end users). This ends up being a fairly large patch because all of
the call sites have to play ball too.
While I was on the topic, I also moved any of the touched functions into
the C++ file, so that modifying them would not trigger a recompilation of
all of torch.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D18496169
Pulled By: ezyang
fbshipit-source-id: afb203252620ec274be596b3e7b1d84d321bad3a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28610
The basic idea is, in some cases where we stored a pointer to a full AutogradMeta object, instead store a nullptr. We let a nullptr represent a default-constructed AutogradMeta object, and simply populate it with a real AutogradMeta if there is ever a situation where we need to modify it.
The primary technical contrivance in this diff is I have to use AutogradMetaFactory to lazily initialize the AutogradMeta, as it is not available in the dynamic library that TensorImpl is in. (I spent a while trying to put them in the same compilation unit, but gave up in the end as it pushed us over the Windows linking binary size limit. Eep.)
Some other notes:
- `set_autograd_meta` now unconditionally turns a tensor into a variable. I audited all call sites and observed there are no occurrences where nullptr is passed (after this patch, there are now!)
- `copy_tensor_metadata` is updated to unconditionally preserve the VariableTensorId-ness of the destination tensor. I think this is the more correct semantics; we can't do the old semantics anymore.
- There's a bunch of places in the API where we return const references to objects. This is pretty weird to me, but I didn't feel like cleaning it up. But sometimes I don't conveniently have something that's the right lifetime, so I introduced a number of singletons to handle this correctly.
You might wonder why I'm doing the optimization before the variable-tensor dynamic merge. The reason is simple: this change is semantics preserving, while variable-tensor dynamic merge is not. So it is easier to get right, and prevents us from regressing performance if we do it the other way.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D18171162
Pulled By: ezyang
fbshipit-source-id: 580df729e4d04881b2b9caa0f0c00785b3afbb92
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28602
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D18171163
Pulled By: ezyang
fbshipit-source-id: 3f3d4cf0bd05c302f502795a04ecace0fc064255
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28593
When I turn on Variable everywhere, I will need to be able to construct
AutogradMetas from TensorImpl. But I cannot call the constructor directly
as it lives in another dynamic library. So I need another virtual factory interface
to be able to do this.
I also adjust the AutogradMeta constructor so that the TensorImpl argument is
optional. This argument is only needed if `requires_grad == True`, as we use it
to test if the variable is valid (only floating point tensors can have requires grad true).
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D18171161
Pulled By: ezyang
fbshipit-source-id: 3f2e86720899b3bda36ddd90244c2624645cc519
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28601
In the process, I moved AutogradMeta out of the Variable class. The
intent here is that I'm going to delete Variable class entirely,
so I had better not be putting stuff in it!
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D18171160
Pulled By: ezyang
fbshipit-source-id: 9c0bcdc82797eca0577d1b0745b4a2ae962f3010
Summary:
This is the first of a series of changes to reduce build size by cutting
autograd functions from mobile build.
When INTERN_DISABLE_AUTOGRAD is set:
* On CMake side we exclude Functions.h/cpp, VariableType*.h/cpp,
VariableTypeManual.cpp from the build process. Still keep variable_factories.h
as we rely on it to create variables instead of tensors.
* In source code we gate a couple autograd references (in autograd/variable.cpp)
with C10_MOBILE (technically we should use a dedicated c macro but its
maintenance cost is higher than cmake macro as we have several build systems
to change).
* Pass --disable-autograd flag to codegen script, which will stop generating
Functions/VariableType code. And for variable_factories.h it will stop
generating tracing code.
Edit: in this diff we will keep Functions.h/cpp to avoid changing source code.
Why we need this change if it's already not calling VariableType and autograd
stuff with USE_STATIC_DISPATCH=ON for mobile?
It's trying to reduce static library size for iOS build, for which it's
relatively harder to strip size with linker approach.
Why we need make involved change into codegen script?
There isn't a global config system in codegen - autograd/env.py provides similar
functionality but it says not adding anything there.
Test Plan:
- will check CI;
- test mobile build in sample app;
Differential Revision: D17202733
Pulled By: ljk53
fbshipit-source-id: 5701c6639b39ce58aba9bf5489a08d30d1dcd299
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24393
Ability to register hook on a variable, similar to python autograd API. register_hook will take a function as argument and create a CppFunctionPreHook similar to PyFunctionPreHook.
It will return the index of the hook which can be passed to remove_hook to disable the hook.
Test Plan: Added tests.
Differential Revision: D16861722
fbshipit-source-id: d08047f932e38c7bde04283a18b2d0311c8ad604
Summary:
We introduced RTTI in recent change: https://github.com/pytorch/pytorch/pull/21613
For internal mobile build we don't enable '-frtti' yet. This diff is trying to replace
RTTI with alternative approach.
According to dzhulgakov we could compare two tensors' type_id directly in most cases -
which is more strict than comparing TensorImpl subclass type as TensorImpl -> type_id
mapping is 1-to-n but it's more proper for this use case.
The only two cases where we can relax direct type comparison (for legacy reason) are:
1. CPUTensor <-> CUDATensor;
2. SparseCPUTensor <-> SparseCUDATensor;
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22773
Differential Revision: D16277696
Pulled By: ljk53
fbshipit-source-id: 043e264fbacc37b7a11af2046983c70ddb62a599
Summary:
https://github.com/pytorch/pytorch/pull/17072 breaks `model.to(xla_device)`, because moving `model` to XLA device involves changing its parameters' TensorImpl type, and the current implementation of `nn.Module.to()` doesn't support changing module parameters' TensorImpl type:
```python
# 6dc445e1a8/torch/nn/modules/module.py (L192-L208)
def _apply(self, fn):
...
for param in self._parameters.values():
if param is not None:
# Tensors stored in modules are graph leaves, and we don't
# want to create copy nodes, so we have to unpack the data.
param.data = fn(param.data) # NOTE: this doesn't allow changing `param.data`'s TensorImpl type
if param._grad is not None:
param._grad.data = fn(param._grad.data) # NOTE: this doesn't allow changing `param._grad.data`'s TensorImpl type
...
```
yf225 TODO: fix the description here when we finish the implementation
To fix this problem, we introduce a new API `model.to_()` that always assign new tensors to the parameters (thus supporting changing the parameters to any TensorImpl type), and also bump the version counter of the original parameters correctly so that they are invalidated in any autograd graph they participate in.
We also add warning to the current `model.to()` API to inform users about the upcoming behavior change of `model.to()`: in future releases, it would create and return a new model instead of in-place updating the current model.
This unblocks adding XLA to our CI test suite, which also allows XLA to catch up with other changes in our codebase, notably the c10 dispatcher.
[xla ci]
cc. resistor ailzhang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21613
Differential Revision: D15895387
Pulled By: yf225
fbshipit-source-id: b79f230fb06019122a37fdf0711bf2130a016fe6
Summary:
As part of the Variable/Tensor merge work: https://github.com/pytorch/pytorch/issues/13638, we make the following changes in this PR:
1. Remove the `Variable::Impl` class and the `DifferentiableViewImpl` class
2. Change all `Variable.data()` call sites to either use `Variable` directly, or use `Variable.tensor_data()`
3. Remove `Variable.data()` API
3. Add `Variable.variable_data()` that matches `tensor.data` in Python API, which creates a new `Variable` that shares the same storage and tensor metadata with the original `Variable`, but with a completely new autograd history.
After this PR, Variable doesn't wrap a Tensor internally anymore, and both Variable and Tensor use the same TensorImpl class as its `impl_`. The only difference is that Variable always has AutogradMeta in its TensorImpl, but Tensor doesn't.
**Note that this PR is BC-breaking in the following use cases:**
**Use Case 1:**
Previously, `x.data = y` works even if `x` and `y` are of different TensorImpl type (e.g. `x` is a CPU dense tensor whose impl is of type TensorImpl, while `y` is a CPU sparse tensor whose impl is of type SparseTensorImpl). However, after this PR, `x.data = y` doesn't work anymore if `x` and `y` are of different TensorImpl type, because the underlying implementation `variable.set_data(tensor)` no longer works if `variable` and `tensor` have different TensorImpl type.
**Use Case 2:**
If a tensor `x`'s `grad` is sparse, accumulating dense gradients to `x` will change the tensor that `x.grad` is pointing to. This is better illustrated with the following example:
```python
params = torch.tensor([1.5, 1.5]).requires_grad_()
with torch.no_grad():
# Change gradient to a sparse tensor
params.grad = torch.sparse_coo_tensor(torch.tensor([[1, 1]]).long(), torch.tensor([1., 1.]))
grad_saved = params.grad
params.backward(torch.tensor([1.5, 1.5]))
assert id(grad_saved) == id(params.grad) # This will fail after this PR
```
The assertion in the last line will fail after this PR, because adding dense gradients to sparse gradients will change the `params.grad` tensor reference.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17072
Differential Revision: D14075257
Pulled By: yf225
fbshipit-source-id: 0e681df641270dea586042dd26db59f2e76b5957
Summary:
#19975 was separated by 2 PRs.
This one:
Introduce MemoryFormat argument to the `x.is_contiguous(memory_format=torch.channels_last)` and to the `y = x.contiguous(memory_format=torch.channels_last)` functions.
At this moment both functions just operate with strides and doesn't store any tensor state.
(Original RFC #19092)
-----
Expands functionality of two tensor functions `.is_contiguous` and `.contiguous` (both python and c++ api).
Note: We had several complaints about `.to(memory_format)` function, and decided not to support it.
1. `.contiguous` now support optional keyword-only argument - `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.
- Using `torch.contiguous_format` will preserve existing `.contiguous()` behavior.
- Calling `x.contiguous(memory_format=torch.channels_last)` returns new tensor which maintain same semantical layout (NCHW), but have different memory allocation pattern.
`x.contiguous(memory_format=torch.channels_last)` expects input tensor to be 3d, 4d or 5d; and fails otherwise.
2. `.is_contiguous` now support optional keyword-only argument - `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.
- `x.is_contiguous(memory_format=torch.contiguous_format)` preserves same functionality as `x.is_contiguous()` and remains unchanged.
- `x.is_contiguous(memory_format=torch.channels_last)` returns true if A) input tensor is contiguous in memory AND B) allocated in the memory in NWHC (or similar for 3d,5d) format.
Note: By the end of the phase one `x.is_contiguous(memory_format=torch.channels_last)` will calculate state of the Tensor on every call. This functionality going to be updated later.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20455
Differential Revision: D15341577
Pulled By: VitalyFedyunin
fbshipit-source-id: bbb6b4159a8a49149110ad321109a3742383185d
Summary:
Previously, the caller of `shallow_copy_and_detach()` is responsible for deciding whether the shallow-copy should share the source TensorImpl's version counter, or have its own new version counter. However, since this decision is crucial for ensuring the correctness of the shallow-copy's version counter, we want to enforce users of `shallow_copy_and_detach()` to pass a version counter to the function call, so that they are required to make the decision at the time of API usage, not as an afterthought.
For similar reasons, we want to enforce users of `shallow_copy_and_detach()` to pass `allow_tensor_metadata_change` to the function call, so that they are required to decide "whether the TensorImpl shallow-copy should allow tensor metadata change" at the time of API usage, not as an afterthought.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20496
Differential Revision: D15363620
Pulled By: yf225
fbshipit-source-id: a65e74738b10452668d6dc644b43aad5b3d8c9e6
Summary:
In https://github.com/pytorch/pytorch/pull/18223/files#diff-77a6f3462f2233b921d3042412fed6d3R178, we used `auto saved_version_ = data_.unsafeGetTensorImpl()->version_counter().current_version()` and then `new_data_impl_copy->set_version_counter(saved_version_)`, which actually doesn't preserve the original semantics that `var.set_data(tensor)` should keep `var`'s version counter object intact. This PR fixes the bug and adds test to make sure it doesn't happen again.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20391
Differential Revision: D15323430
Pulled By: yf225
fbshipit-source-id: e3ba49b51ec8ccecd51c80cb182387f74cfd2b2b
Summary:
According to https://github.com/pytorch/pytorch/issues/13638#issuecomment-468055428, after the Variable/Tensor merge, we may capture variables without autograd metadata inside an autograd function, and we need a working version counter in these cases. This PR makes it possible by moving `version_counter_` out of autograd metadata and into TensorImpl, so that variables without autograd metadata still have version counters.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18223
Differential Revision: D14735123
Pulled By: yf225
fbshipit-source-id: 15f690311393ffd5a53522a226da82f5abb6c65b
Summary:
Currently, a TensorImpl's `is_variable_` is true if and only if the TensorImpl has AutogradMeta. This PR unifies these two concepts by removing `is_variable_` and change `is_variable()` to check existence of AutogradMeta instead.
Removing `is_variable_` is part of the work in Variable/Tensor merge.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19139
Differential Revision: D14893339
Pulled By: yf225
fbshipit-source-id: ceb5e22c3c01f79b5d21d5bdbf4a7d1bc397796a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18833
ghimport-source-id: 6f2be25fcc5e6be3ffe20582e604bd2c1fbab66b
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors.**
* #18832 [STACK] Disallow changing the device of a tensor via set_.
* #18831 [STACK] Stop swapping in Storages of the wrong device for Tensors.
1) We cache device on TensorImpl. This means we can access the device without a virtual function and allows us to more easily extend TensorImpls (because they don't need to figure out how to store the Device for themselves).
2) Clean up TensorImpl APIs. We had a constructor that took a TensorTypeId and an allocator and would allocate a Storage based on the recognized types of TensorTypeIds. Instead, we just have two different constructors: one for types with a storage, one without.
Reviewed By: dzhulgakov
Differential Revision: D14766230
fbshipit-source-id: 745b8db84dcd6cb58f1a8675ad3ff8d033bc50df
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17991
changes:
-Breaks bc: Tensor::type() now returns DeprecatedTypeProperties& rather than Type&.
-Added DeprecatedTypeProperties, it serves as a temporary replacement for Type as the return value of Tensor::type(). This contributes to making Type just for dispatch purposes so that we can make it dtype agnostic.
-Tensor::dispatch_type() now returns Type& like Tensor::type() used to do.
-Changed callsites of Tensor::type() appropriately.
Reviewed By: ezyang
Differential Revision: D14443117
fbshipit-source-id: 239ccb7a09626279a71d1a37f8f82e7f57bf7d9e
Summary:
Currently, `THPVariable_Wrap(…)` and `THPVariable_NewWithVar(…)` depend on the existence of `pyobj_` in the autograd metadata of a Variable to convert the Variable to a Python tensor. However, after the Variable/Tensor merge, there will be Variables that don't contain autograd metadata, and to allow the conversion from non-autograd-meta Variable to a Python tensor we need to store the `pyobj_` outside of autograd metadata and in a place where it will always be available.
This PR makes it possible by moving `pyobj_` into TensorImpl, so that `THPVariable_Wrap(…)` and `THPVariable_NewWithVar(…)` can always access a Variable's `pyobj_` and convert the Variable to a Python tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18225
Differential Revision: D14562616
Pulled By: yf225
fbshipit-source-id: 18d4aaace70eee6120abaf9276036d1f8f51b18d
Summary:
In VariableType.cpp, when a function modifies its input tensors, it should only change the input tensors' storage data in-place, and should never change the input tensors' storage pointers. This PR adds checks for this, and also fixes functions that fail this test.
This is part of the Variable/Tensor merge work (https://github.com/pytorch/pytorch/issues/13638).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16305
Differential Revision: D13897855
Pulled By: yf225
fbshipit-source-id: 0c4fc7eb530d30db88037b1f0981f6f8454d3b79
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16751
This was made more complicated by the fact that ivalue::IntList
is a thing. So I had to fix all of the sites where we referring
to IValue post facto.
The following codemods were run, in this order:
```
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntList IntArrayRef
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntArrayRef::create IntList::create
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in ivalue::IntArrayRef ivalue::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in Tag::IntArrayRef Tag::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in isIntArrayRef isIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in toIntArrayRef toIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'Shared<IntArrayRef>' 'Shared<IntList>'
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'intrusive_ptr<IntArrayRef>' 'intrusive_ptr<IntList>'
```
Some manual fixups were done afterwards; they can be reviewed separately
at https://github.com/pytorch/pytorch/pull/16752
Reviewed By: dzhulgakov
Differential Revision: D13954363
fbshipit-source-id: b5c40aacba042402155a2f5a229fa6db7992ac64
Summary:
In this PR, we are moving all functions away from `Variable::Impl`, in order to get rid of `Variable::Impl` (and the `data_` Tensor in it) in the next PR. Some of the functions (such as `set_requires_grad` / `requires_grad` / `grad`) will be living in `AutogradMeta` class, while others (such as `backward()` / `rebase_history()` / `grad_accumulator()` / `grad_fn()`) will be living in `Variable` class.
This is the 2nd PR mentioned in https://github.com/pytorch/pytorch/issues/13638.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15487
Differential Revision: D13553173
Pulled By: yf225
fbshipit-source-id: 691f9432d0cd0640af380c757f3e3a2f64f8851c
Summary:
Changes originally in this PR:
1. Move Variable::Impl data members into TensorImpl as `AutogradMeta` struct
2. Change Variable::Impl functions to use data members in `AutogradMeta` struct
3. Add `shallow_copy_and_detach()` function to each subclass of TensorImpl
4. Do shallow copy when the user calls `make_variable(tensor)` / `make_variable_view(tensor)` / `variable.set_data(tensor)` / `variable.detach()`
Changes moved from https://github.com/pytorch/pytorch/pull/13645:
1. Add a flag to Variable to disallow size/stride/storage_ptr changes from in-place operations such as `resize_` / `resize_as_` / `set_` / `transpose_`, and set this flag to true when people call `tensor.data` in Python.
2. Write text in the docs to actively discourage changing the shape or storage of `tensor_detached` and expecting `tensor` to also be updated.
This is the 1st+2nd PR mentioned in https://github.com/pytorch/pytorch/issues/13638.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13827
Differential Revision: D13507173
Pulled By: yf225
fbshipit-source-id: b177b08438d534a8197e34e1ad4a837e2db0ed6a
Summary:
Anywhere we used #include "foo.h", we now say #include <foo.h>
Paths are adjusted to be rooted out of aten/src, torch/lib, or
the root level directory.
I modified CMakeLists.txt by hand to remove TH and THC from
the include paths.
I used the following script to do the canonicalization:
```
import subprocess
import re
import os.path
files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
for fn in files:
if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']):
continue
if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
continue
with open(fn, 'r') as f:
c = f.read()
def fmt(p):
return "#include <{}>".format(p)
def repl(m):
p = m.group(1)
if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
return fmt(p)
if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
return fmt(p)
for root in ["aten/src", "torch/lib", ""]:
for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
new_p = os.path.relpath(os.path.join(bad_root, p), root)
if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
return fmt(new_p)
print("ERROR: ", fn, p)
return m.group(0)
new_c = re.sub(r'#include "([^"]+)"', repl, c)
if new_c != c:
print(fn)
with open(fn, 'w') as f:
f.write(new_c)
```
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849
Reviewed By: dzhulgakov
Differential Revision: D13363445
Pulled By: ezyang
fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68
Summary:
This should improve the performance of wrapping a tensor in a Variable
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13630
Reviewed By: ezyang
Differential Revision: D12944960
Pulled By: zou3519
fbshipit-source-id: 89fa78a563e46a747d851a90ffd1b5cf3cd2d0d7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12792
This is a follow up diff after D10238910.
Only non-codemod change is the removal of ATen/Error.h and ATen/core/Error.h. Other files are basically changing the inclusion path + clang format for inclusion order.
Reviewed By: bddppq
Differential Revision: D10437824
fbshipit-source-id: 7f885f80ab5827468d1351cfb2765d0e3f555a69
Summary:
There are still a few work to be done:
- Move logging and unify AT_WARN with LOG(ERROR).
- A few header files are still being plumbed through, need cleaning.
- caffe2::EnforceNotMet aliasing is not done yet.
- need to unify the macros. See c10/util/Exception.h
This is mainly a codemod and not causing functional changes. If you find your job failing and trace back to this diff, usually it can be fixed by the following approaches:
(1) add //caffe2/c10:c10 to your dependency (or transitive dependency).
(2) change objects such as at::Error, at::Optional to the c10 namespace.
(3) change functions to the c10 namespace. Especially, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes.
Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354
Reviewed By: orionr
Differential Revision: D10238910
Pulled By: Yangqing
fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11294
The Tensor(ptr, retain) constructor is error prone and circumvents the intrusive_ptr safety.
This diff removes that and pushes the responsibility to callers.
Step by step, manual refcounting can be pushed back and possibly eliminated in the end.
Reviewed By: ezyang
Differential Revision: D9663476
fbshipit-source-id: 7f010e5e47b137a9575960201c5bf5d552c5c2f5
Summary:
This is so that TensorImpl does not have to depend on Tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11337
Differential Revision: D9684421
Pulled By: gchanan
fbshipit-source-id: d2af93420ca6d493429c251cfe5a34e9289c4484
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11205
Our short term plan for supporting out of tree complex development requires an
external library to add a custom subclass of Type without access to the
code generation facilities in ATen. This commit reorganizes Type so
as to minimize the amount of boilerplate you have to write when making
a subclass of Type.
In particular, it:
- Creates a new CPUTypeDefault/CUDATypeDefault class, which you are
intended to inherit from, which provides default implementations
of CPU/CUDA that is layout/dtype agnostic.
- Adds new getCPUAllocator() and getCUDAAllocator() functions, as
a more public API to get your hands on Allocator
- Adds allocator() and getDeviceFromPtr(), abstracting the device
specific parts of storage() methods; these methods are now
implemented in base TypeDefault.
- Delete the static typeString() method, which is now dead.
- Move is_cuda/is_sparse/is_distributed to TypeDefault.
Reviewed By: SsnL
Differential Revision: D9631619
fbshipit-source-id: 40b600d99691230e36e03eb56434c351cbc2aa3a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10824
API additions:
- Tensor(c10::intrusive_ptr<TensorImpl,UndefinedTensor>&&)
- Tensor(const c10::intrusive_ptr<TensorImpl,UndefinedTensor>&)
- Tensor::operator=(Tensor&&) && (for completeness sake)
- TensorBase::unsafeGetTensorImpl()
- TensorBase::unsafeReleaseTensorImpl()
- TensorBase::getIntrusivePtr()
- TensorImpl::type_id()
- Tensor::set_data()
- Tensor::is_same(Tensor)
- Tensor::use_count()
- Tensor::type_id()
- Tensor::scalar_type()
- WeakTensor::is_same(WeakTensor)
- intrusive_ptr::weak_use_count()
- weak_intrusive_ptr::weak_use_count()
- c10::raw::intrusive_ptr::{incref,decref,make_weak}
- c10::raw::weak_intrusive_ptr::{incref,decref,lock}
API changes:
- Tensor::pImpl is no longer public (and now named tensor_impl_)
- Most methods accessed this way are now accessible on Tensor
maybe_zero_dim() and set_wrapped_number() being prominent exceptions
(they are now accessed through unsafeGetTensorImpl())
- Type is no longer friend of Tensor
- TensorBase::reset(TensorImpl*) is deleted
- TensorBase::reset(TensorImpl*, bool should_retain) is deleted
- TensorBase::swap(TensorBaseImpl&) is deleted; use std::swap instead
- TensorBase::get() is deleted; use unsafeGetTensorImpl() instead
- TensorBase::detach() is deleted; use unsafeReleaseTensorImpl() instead
- TensorBase::retain() is deleted; use _raw_incref() instead
- TensorBase::release() is deleted; use _raw_decref() instead
- WeakTensor lost most of its methods (it no longer inherits from
TensorBase)
- TensorImpl::storage() is now a const method
- Tensor(TensorBase) constructor removed, instead
we go through getIntrusivePtr(). I'm not sure about
this change; I happened to have accidentally removed the
TensorBase constructor and decided to fix call sites,
but I could go the other way.
- detail::set_data() is deleted; use Tensor::set_data() instead
- c10::raw_intrusive_ptr_target removed; use the functions in c10::raw instead.
(The reason for this change, is that it is invalid to cast an intrusive_ptr_target*
to a raw_intrusive_ptr_target* to take advantage of the methods. But there is
no reason the incref/decref methods shouldn't also work on intrusive_ptr_target;
it is primarily an API consideration. We can be more standards compliant by
keeping them as functions, which are universally applicable.)
- intrusive_ptr::reclaim() and weak_intrusive_ptr::reclaim() now work on
pointers of the NullType. (This counts as a bug fix, because the documentation
specified that pointers produced by release() are valid to reclaim(), and
a release() on a null intrusive_ptr produces the NullType::singleton())
Bug fixes:
- Dispatch code for mutable references incorrectly returned
a reference to a value argument (which would immediately
go out of scope). They now correctly return a tensor by
value.
- intrusive_ptr copy/move assignment did not work correctly when
an object was assigned to itself. We now check for this case and
no-op if so. (This bug manifested itself as a Tensor mysteriously
becoming an UndefinedTensor after lines of code like
'x = x.mul_(y)')
Other changes:
- The checked cast functions in Utils.h have now been
renamed and detemplatized into checked unwrap functions.
- Added type_id() and scalar_type() methods to Tensor
- pImpl is no longer public
- Documented what the && overloads are doing
- All occurrences of 'new TensorImpl' (and similar spellings, like 'new THTensor')
have been expunged. This is NO LONGER a valid way to create a new
tensor, and if you do this, upon your first incref, you will catch an ASSERT
failure saying that only tensors created by intrusive_ptr::release() are valid
to reclaim(). Use c10::make_intrusive instead in this situation.
- IValue is adjusted to use intrusive_ptr instead of Retainable, and all
other sub-classes of Retainable were modified to use intrusive_ptr.
When doing this, I had to make the constructors of sub-classes like
ConstantList public, so that c10::make_intrusive could invoke them. Fortunately,
if you incorrectly stack allocate a ConstantList, and then try to get an
intrusive_ptr to it, it will fail, as stack allocated ConstantLists have refcount 0.
- IValue very narrowly sidesteps the problem of handling NullType, as it
considers intrusive_ptr<TensorImpl> identical to intrusive_ptr<TensorImpl, UndefinedTensor>
which is not always true. This was always the case, but there's now a comment
explaining what's going on.
Some MSVC bugs were uncovered during the preparation of this patch.
They are documented as comments in the code.
Reviewed By: gchanan
Differential Revision: D9481140
fbshipit-source-id: 14a8ea0c231ed88b5715fb86d92730926f9f92fc
Summary:
```
Use intrusive_ptr in Storage; replace unique_ptr<Storage> with Storage
This patch does two major changes:
- It replaces the use of Retainable in Storage with a new implementation
based on intrusive_ptr. This will be necessary because Caffe2 will
be using this class to implement intrusive_ptrs, and we need to
line these up for the merge. One good thing about the new implementation is
that the default copy/move constructors/assignment operators and destructor
work automatically, instead of needing to be hardcoded into Storage/Tensor.
- It replaces all places where we returned std::unique_ptr<Storage> with
Storage, collapsing an unnecessary double indirection that is no longer
necessary now that we have correctly working copy/move constructors.
I didn't initially want to do step (2), but it was very important to
eliminate all bare uses of new Storage and new StorageImpl, and this making
the API change was the most straightforward way to do this.
HOW TO FIX YOUR CODE IN THE NEW API
- You no longer need to dereference the result of tensor.storage() to pass
it to set. So, instead of:
x.set_(*y.storage());
just write:
x.set_(y.storage());
- If you were accessing methods on StorageImpl via the pImpl() method, you
must use the dot operator to run pImpl(). Even better; just drop pImpl,
we now have method forwarding. So, instead of:
storage->pImpl()->data();
just do:
storage->data();
// storage.pImpl()->data() works too but is not as recommended
- storage->getDevice() is no more; instead use storage->device().index()
MISC CODE UPDATES
- retain, release, weak_retain, weak_release and weak_lock are now
reimplemented using the "blessed API", and renamed to make it
clearer that their use is discouraged.
- nvcc OS X and general OS X portability improvements to intrusive_ptr
- A new comment in intrusive_ptr describing how stack allocated
intrusive_ptr_targets work differently than heap allocated ones
from c10::make_intrusive
CAVEAT EMPTOR
- THStorage_weakRetain used to work on strong pointers, but it NO LONGER
works with intrusive_ptr. You must reclaim the strong pointer into a
real strong pointer, construct a weak pointer from it, and then release
the strong and weak pointers. See StorageSharing.cpp for an example.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10488
Reviewed By: gchanan
Differential Revision: D9306134
Pulled By: ezyang
fbshipit-source-id: 02d58ef62dab8e4da6131e1a24834a65c21048e2
Summary:
This PR removes the `using Tensor = autograd::Variable;` alias from `torch/tensor.h`, which means `torch::Tensor` is now `at::Tensor`. This PR fixes up some last uses of `.data()` and tidies up the resulting code. For example, I was able to remove `TensorListView` such that code like
```
auto loss = torch::stack(torch::TensorListView(policy_loss)).sum() +
torch::stack(torch::TensorListView(value_loss)).sum();
```
is now
```
auto loss = torch::stack(policy_loss).sum() + torch::stack(value_loss).sum();
```
CC jgehring
ebetica
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10516
Differential Revision: D9324691
Pulled By: goldsborough
fbshipit-source-id: a7c1cb779c9c829f89cea55f07ac539b00c78449
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10278
Translation to Backend happens immediately before we go into the
Type universe; otherwise we use TensorTypeId.
I allocated TensorTypeId corresponding exactly to existing ATen
Backend. Only CPUTensorId and CUDATensorId are relevant in the
Caffe2 universe.
Reviewed By: gchanan
Differential Revision: D9184060
fbshipit-source-id: 9d3989c26f70b90f1bbf98b2a96c57e2b0a46597
Summary:
This PR provides 4 fixes / features:
1. torch::nn::Cloneable inherits virtually from torch::nn::Module. We want to pass around a module with new functions, and the best way to do this is to do a diamond inheritance pattern, i.e.
```c++
struct MySuperModuleImpl : virtual public torch::nn::Module {
virtual void myFunction() = 0;
}
struct MySuperModule : public torch::nn::Cloneable<MySuperModule>, MySuperModuleImple {};
struct MyModule : public MySuperModule<MyModule> {
void myFunction() override;
};
```
This way, we can simply pass around MySuperModuleImpl around instead of torch::nn::Module.
2. Optimizer options are public now, since there's no way to decay the LR or modify it during training otherwise
3. Serialization functions creates autograd history and calls copy_! Bad!
4. Optimizers did not create buffers after add_parameters was called.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9837
Reviewed By: goldsborough
Differential Revision: D9199746
Pulled By: ebetica
fbshipit-source-id: 76d6b22e589a42637b7cc0b5bcd3c6b6662fb299
Summary:
This PR extends the existing type and shape metadata tracing and verification done in autograd with device information. This expansion of tracing is required for #8354, is likely useful in other scenarios, and is a healthy sanity check, just like type and shape tracing.
The precise changes are:
- TypeAndShape -> InputMetadata, now includes device()
- Creating InputMetadata is simplified to just require a tensor, and callers were updated to use this simpler invocation wherever possible
- The gradient accumulator of a variable is now reset when set_data() is called if either the type or device changes, and this reset now locks to avoid contention with acquiring the gradient accumulator
- Mismatched devices during backward() will throw a runtime error, just like mismatched type and shape
- (Bonus!) Two uninitialized pointers in THCReduce are now initialized (to nullptr) to prevent build warnings
fyi colesbury
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9796
Reviewed By: goldsborough
Differential Revision: D9119325
Pulled By: ezyang
fbshipit-source-id: 76d1861b8d4f74db0575ff1f3bd965e18f9463de
Summary:
The basic game plan is to stop accessing the type_ field directly,
and instead using the stored backend_, scalar_type_ and
is_variable_ to look up the appropriate Type from Context.
Storage of backend_ and scalar_type_ are new.
At some future point in time, I'd like to look at this code
carefully to see if I can get everything in this codepath inlining.
I didn't do it in this patch because there are circular include
problems making things difficult.
Some other details:
- Added Device::backend() which does what it says on the tin
- SparseTensorImpl is temporarily hard-coded to root in at::Context
for the appropriate context. If/when we put this in shared code,
we'll have to break this dep too, but for now it should be OK.
- There's a stupid problem with globalContext() deadlocking if
you didn't actually initialize it before loading libtorch.so
(which is bringing along the variable hooks). I fixed this by
reordering the static initializers. Fixes#9784
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10210
Differential Revision: D9150697
Pulled By: ezyang
fbshipit-source-id: 89e2006c88688bcfab0dcee82dc369127c198c35
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10130
Update some include paths to make them internally consistent
Reviewed By: ezyang
Differential Revision: D9119906
fbshipit-source-id: b44e5cab8e8e795ee18afe9ffc6caf1f2b413467
Summary:
…e_/is_variable_
The basic game plan is to stop accessing the type_ field directly,
and instead using the stored backend_, scalar_type_ and
is_variable_ to look up the appropriate Type from Context.
Storage of backend_ and scalar_type_ are new.
At some future point in time, I'd like to look at this code
carefully to see if I can get everything in this codepath inlining.
I didn't do it in this patch because there are circular include
problems making things difficult.
Some other details:
- Added Device::backend() which does what it says on the tin
- SparseTensorImpl is temporarily hard-coded to root in at::Context
for the appropriate context. If/when we put this in shared code,
we'll have to break this dep too, but for now it should be OK.
- There's a stupid problem with globalContext() deadlocking if
you didn't actually initialize it before loading libtorch.so
(which is bringing along the variable hooks). I didn't fix
it in this PR; it's tracked in #9784
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9787
Reviewed By: cpuhrsch
Differential Revision: D8980971
Pulled By: ezyang
fbshipit-source-id: 2b4d867abfdc3999a836a220c638c109053145a8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9740
- Remove implicit ArrayRef -> vector conversion
- Fix 4 call sites that accidentally did an implicit expensive vector conversion but wouldn't have needed to
- Remove explicit vector conversion from 4 call sites that also didn't need to do that
Reviewed By: ezyang
Differential Revision: D8961693
fbshipit-source-id: 980da9f988083c0072497f9dbcbbf6f516fa311c
Summary:
I split it into two parts, _local_scalar and _local_scalar_dense (unchecked)
so I could reuse the sparse logic in both paths.
_local_scalar became a method on Tensor to work around a circular
include problem.
This is resurrected copy of #9652
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9762
Differential Revision: D8972348
Pulled By: ezyang
fbshipit-source-id: 2232dbfc8e1286b8a4a1c67d285c13a7771aad4c
Summary:
I got some tensor->variable conversion exceptions from `torch/csrc/autograd/variable.h`, which used the `TORCH_ASSERTM` macros instead of `AT_CHECK`, so they didn't have backtraces. This was such a substantial loss for debugability that I decided to update the whole codebase to use the backtrace-enabled ATen macros instead of `TORCH_ASSERT` and `JIT_ASSERT`, the latter having been an alias of the former.
ezyang apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9575
Differential Revision: D8924566
Pulled By: goldsborough
fbshipit-source-id: 7a4013b13eec9dbf024cef94cf49fca72f61d441
Summary:
This can hardly be called an improvement (we now print
CPUFloatType instead of CPUFloatTensor) but it was the
simplest way I could think of devirtualizing this function in
the short term. Probably need some sort of native function
that gives string information about a tensor.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Approved in #9710
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9758
Differential Revision: D8966935
Pulled By: ezyang
fbshipit-source-id: a4641affe0a6153f90cdd9f4f2a1100e46d1a2db
Summary:
As in the title. Lets us simplify a lot of code.
Depends on #9363, so please review only the last commit.
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9414
Reviewed By: zdevito
Differential Revision: D8836496
Pulled By: apaszke
fbshipit-source-id: 9b3c3d1f001a9dc522f8478abc005b6b86cfa3e3
Summary:
Add `WeakTensor` - a `Tensor` counterpart which doesn't keep the data (or any other expensive resources) alive. They can be `.lock()`ed and return `at::optional<Tensor>` if they're still alive.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9363
Reviewed By: ezyang
Differential Revision: D8815434
Pulled By: apaszke
fbshipit-source-id: 1b3e96503c1285d78ef124c585e65c7630f3253e
* Created TensorOptions
Storing the type in TensorOptions to solve the Variable problem
Created convenience creation functions for TensorOptions and added tests
Converted zeros to TensorOptions
Converted rand to TensorOptions
Fix codegen for TensorOptions and multiple arguments
Put TensorOptions convenience functions into torch namespace too
All factory functions except *_like support TensorOptions
Integrated with recent JIT changes
Support *_like functions
Fix in place modification
Some cleanups and fixes
Support sparse_coo_tensor
Fix bug in Type.cpp
Fix .empty calls in C++ API
Fix bug in Type.cpp
Trying to fix device placement
Make AutoGPU CPU compatible
Remove some auto_gpu.h uses
Fixing some headers
Fix some remaining CUDA/AutoGPU issues
Fix some AutoGPU uses
Fixes to dispatch_tensor_conversion
Reset version of new variables to zero
Implemented parsing device strings
Random fixes to tests
Self review cleanups
flake8
Undo changes to variable.{h,cpp} because they fail on gcc7.2
Add [cuda] tag to tensor_options_cuda.cpp
Move AutoGPU::set_index_from into .cpp file because Windows is stupid and sucks
Fix linker error in AutoGPU.cpp
Fix bad merge conflict in native_functions.yaml
Fixed caffe2/contrib/aten
Fix new window functions added to TensorFactories.cpp
* Removed torch::TensorOptions
Added code to generate wrapper functions for factory methods
Add implicit constructor from Backend to TensorOptions
Remove Var() from C++ API and use torch:: functions
Use torch:: functions more subtly in C++ API
Make AutoGPU::set_device more exception safe
Check status directly in DynamicCUDAHooksInterface
Rename AutoGPU to DeviceGuard
Removed set_requires_grad from python_variables.h and warn appropriately in Variable::set_requires_grad
remove python_default_init: self.type()
Add back original factory functions, but with deprecation warnings
Disable DeviceGuard for a couple functions in ATen
Remove print statement
Fix DeviceGuard construction from undefined tensor
Fixing CUDA device compiler issues
Moved as many methods as possible into header files
Dont generate python functions for deprecated factories
Remove merge conflict artefact
Fix tensor_options_cuda.cpp
Fix set_requires_grad not being checked
Fix tensor_new.h
TEMPORARILY put some methods in .cpp files to see if it solves issues on windows and mac
Fix bug in DeviceGuard.h
Missing includes
TEMPORARILY moving a few more methods into .cpp to see if it fixes windows
Fixing linker errors
* Fix up SummaryOps to use new factories
Undo device agnostic behavior of DeviceGuard
Use -1 instead of optional for default device index
Also move DeviceGuard methods into header
Fixes around device index after optional -> int32_t switch
Fix use of DeviceGuard in new_with_tensor_copy
Fix tensor_options.cpp
* Fix Type::copy(
* Remove test_non_float_params from ONNX tests
* Set requires_grad=False in ONNX tests that use ints
* Put layout/dtype/device on Tensor
* Post merge fixes
* Change behavior of DeviceGuard to match AutoGPU
* Fix C++ API integration tests
* Fix flip functions
* Factor python dependency out of interpreter
* Remove NO_PYTHON for the autograd engine
If there is no python bindings, then a default Engine is constructed
the first time it is requested.
If the python libraries are loaded, then they override the default
accessor and the default engine becomes a python Engine.
Note: it is possible for two engines to be generated if a non-python
one gets created before the python bindings are loaded. This case
is rare, and just results in additional threads being spawned.
* Fixing AlexNet test which is skipped in CI
* Add backward() to Tensor and Variable
* Add at:: in front of Tensor
* Trying to not move optional to appease windows?
* Move implementation into cpp file
* Undo some formatting changes