Commit Graph

444 Commits

Author SHA1 Message Date
Tugsbayasgalan Manlaibaatar
d23bf9cef0 Add fake impl for aten.unique2 (#124306)
Reapply of: https://github.com/pytorch/pytorch/pull/121571
Differential Revision: [D56258431](https://our.internmc.facebook.com/intern/diff/D56258431)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124306
Approved by: https://github.com/gmagogsfm
2024-04-17 22:55:27 +00:00
statelesshz
2216068559 Enable UFMT on test/test_ops* (#123935)
Part of https://github.com/pytorch/pytorch/issues/123062

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123935
Approved by: https://github.com/ezyang
2024-04-13 03:31:56 +00:00
Kurt Mohler
db895ace1d Only run backward part of COW test if results are strided (#123870)
Fixes #123792

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123870
Approved by: https://github.com/ezyang
2024-04-12 04:43:02 +00:00
Kurt Mohler
3908ebca86 Test COW materialization in backward ops (#123593)
Part of #97856

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123593
Approved by: https://github.com/ezyang
2024-04-09 22:31:50 +00:00
Kurt Mohler
ca9606f809 Update COW OpInfo test to include kwargs and expected materialization (#122437)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122437
Approved by: https://github.com/ezyang
2024-03-24 06:07:30 +00:00
PyTorch MergeBot
c80601f35a Revert "Avoid COW materialize in conv, log sigmoid, repeat, group_norm, batch_norm (#121537)"
This reverts commit a2a88f39ee.

Reverted https://github.com/pytorch/pytorch/pull/121537 on behalf of https://github.com/kurtamohler due to flaky CI failures ([comment](https://github.com/pytorch/pytorch/pull/121537#issuecomment-2010937226))
2024-03-21 00:03:30 +00:00
Kurt Mohler
a2a88f39ee Avoid COW materialize in conv, log sigmoid, repeat, group_norm, batch_norm (#121537)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121537
Approved by: https://github.com/ezyang
2024-03-19 06:15:00 +00:00
Peter Bell
34a28f01dd [Autograd] Improve error for leaf tensors as out argument to fallback (#121089)
Closes  #120988

Currently operators that hit the autograd fallback call `check_inplace`
on all mutated inputs, including out arguments. This leads to a slightly
confusing error message:
```
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
```

Compared to functions that don't fallback, which raise
```
RuntimeError: add(): functions with out=... arguments don't support automatic differentiation, but one of the arguments requires grad.
```

This changes the error message to make clear the issue is with the out argument,
but does not tighten the check to outright ban out arguments that require grad.
Instead, I use the same checks from `check_inplace` which allows non-leaf tensors
that require grad to pass without error.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121089
Approved by: https://github.com/lezcano, https://github.com/soulitzer
ghstack dependencies: #121142
2024-03-05 21:13:27 +00:00
Kurt Mohler
77aea289ae Add test to check that COW inputs are not materialized (#119507)
Part of #97856

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119507
Approved by: https://github.com/ezyang
ghstack dependencies: #120455
2024-03-01 05:05:28 +00:00
Sergii Dymchenko
09aefe1502 Fix ouput typos (#120870)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120870
Approved by: https://github.com/clee2000
2024-02-29 08:29:14 +00:00
David Berard
df1e855313 [fake_impls] fix max_seqlen return values in efficient_attention_forward (#120842)
To match the actual implementation, we should return the max_seqlen_q/k, not M, N, when in the sparse case

7e185277cd/aten/src/ATen/native/transformers/cuda/attention.cu (L981-L996)

Note that although the .cu file sets max_seqlen_k = 0 in the sparse case, it actually returns max_seqlen_k or N:

7e185277cd/aten/src/ATen/native/transformers/cuda/attention.cu (L1224-L1231)

Tests - added in the next PR (#102839, which also fixes other parts of the test_fake tests so that we can un-xfail them and actually run the tests)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120842
Approved by: https://github.com/YuqingJ
ghstack dependencies: #120682
2024-02-29 07:12:27 +00:00
PyTorch MergeBot
dbe0967a0a Revert "Add test to check that COW inputs are not materialized (#119507)"
This reverts commit 2ebf2c88ba.

Reverted https://github.com/pytorch/pytorch/pull/119507 on behalf of https://github.com/izaitsevfb due to breaks xla jobs ([comment](https://github.com/pytorch/pytorch/pull/119507#issuecomment-1970022840))
2024-02-28 22:26:59 +00:00
Kurt Mohler
2ebf2c88ba Add test to check that COW inputs are not materialized (#119507)
Part of #97856

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119507
Approved by: https://github.com/ezyang
ghstack dependencies: #120455
2024-02-28 00:37:33 +00:00
Isuru Fernando
b7df3bba62 add decomposition for frexp (#119217)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119217
Approved by: https://github.com/peterbell10
ghstack dependencies: #119284, #120027
2024-02-23 21:52:42 +00:00
Joel Schlosser
9ec8dd2467 Reify view_func() closures as ViewFuncs (#118404)
Replaces `view_func()` closures with a reified `ViewFunc` data structure. Codegen generates a `ViewFunc` subclass for each view op (e.g. `NarrowViewFunc`) containing state needed to reconstruct the view. The `ViewFunc` API allows for querying and hot-swapping any `SymInt`s or `Tensors` in the state through `get_symints()` / `get_tensors()` / `clone_and_set()`, which will be essential for fake-ification later on.

```cpp
/// Base class for view functions, providing reapplication of a view on a new base.
/// Each view op should get a codegenerated subclass of this class containing
/// any state needed to reconstruct the view. The class also provides convenience
/// accessors for saved SymInts / tensor state. This is useful for e.g. fake-ification,
/// where we want to use symbolic values or fake tensors instead.
struct TORCH_API ViewFunc {
  virtual ~ViewFunc() {}
  /// Returns any SymInts in the saved state.
  virtual std::vector<c10::SymInt> get_symints() const { return {}; }
  /// Returns the number of SymInts in the saved state.
  virtual size_t num_symints() const { return 0; }
  /// Returns any tensors in the saved state.
  virtual std::vector<at::Tensor> get_tensors() const { return {}; }
  /// Returns the number of tensors in the saved state.
  virtual size_t num_tensors() const { return 0; }
  /// Reapplies the view on the given base using the saved state.
  virtual at::Tensor operator()(const at::Tensor&) const = 0;
  /// Returns a clone of this ViewFunc, optionally with the specified saved state.
  virtual std::unique_ptr<ViewFunc> clone_and_set(
      std::optional<std::vector<c10::SymInt>> = c10::nullopt,
      std::optional<std::vector<at::Tensor>> = c10::nullopt) const = 0;

protected:
  /// Sets the values of any SymInts in the saved state. The input vector size must
  /// match the number of SymInts in the saved state (i.e. the size of the list
  /// returned by get_symints()).
  virtual void set_symints(std::vector<c10::SymInt>) {}
  /// Sets the values of any Tensors in the saved state. The input vector size must
  /// match the number of Tensors in the saved state (i.e. the size of the list
  /// returned by get_tensors()).
  virtual void set_tensors(std::vector<at::Tensor>) {}
};
```

New codegen files:
* `torch/csrc/autograd/generated/ViewFunc.h`
* `torch/csrc/autograd/generated/ViewFuncs.cpp`

The templates for these also contains impls for `ChainedViewFunc` and `ErroringViewFunc` which are used in a few places within autograd.

Example codegen for `slice.Tensor`:
```cpp
// torch/csrc/autograd/generated/ViewFuncs.h
#define SLICE_TENSOR_VIEW_FUNC_AVAILABLE
struct SliceTensorViewFunc : public torch::autograd::ViewFunc {
  SliceTensorViewFunc(int64_t dim, c10::optional<c10::SymInt> start, c10::optional<c10::SymInt> end, c10::SymInt step) : dim(dim), start(start), end(end), step(step)
  {};
  virtual ~SliceTensorViewFunc() override {};
  virtual std::vector<c10::SymInt> get_symints() const override;
  virtual size_t num_symints() const override;
  virtual std::vector<at::Tensor> get_tensors() const override;
  virtual size_t num_tensors() const override;
  virtual at::Tensor operator()(const at::Tensor&) const override;
  virtual std::unique_ptr<ViewFunc> clone_and_set(
      std::optional<std::vector<c10::SymInt>> = c10::nullopt,
      std::optional<std::vector<at::Tensor>> = c10::nullopt) const override;

protected:
  virtual void set_symints(std::vector<c10::SymInt>) override;
  virtual void set_tensors(std::vector<at::Tensor>) override;

private:
  int64_t dim;
  c10::optional<c10::SymInt> start;
  c10::optional<c10::SymInt> end;
  c10::SymInt step;
};
...

// torch/csrc/autograd/generated/ViewFuncs.cpp
std::vector<c10::SymInt> SliceTensorViewFunc::get_symints() const {
  ::std::vector<c10::SymInt> symints;
  symints.reserve((start.has_value() ? 1 : 0) + (end.has_value() ? 1 : 0) + 1);
  if(start.has_value()) symints.insert(symints.end(), *(start));
  if(end.has_value()) symints.insert(symints.end(), *(end));
  symints.push_back(step);
  return symints;
}

size_t SliceTensorViewFunc::num_symints() const {
  return static_cast<size_t>((start.has_value() ? 1 : 0) + (end.has_value() ? 1 : 0) + 1);
}

void SliceTensorViewFunc::set_symints(std::vector<c10::SymInt> symints) {
  TORCH_INTERNAL_ASSERT(symints.size() == num_symints());
  auto i = 0;
  if(start.has_value()) start = symints[i];
  i += (start.has_value() ? 1 : 0);
  if(end.has_value()) end = symints[i];
  i += (end.has_value() ? 1 : 0);
  step = symints[i];
}

std::vector<at::Tensor> SliceTensorViewFunc::get_tensors() const {
  ::std::vector<at::Tensor> tensors;
  return tensors;
}

size_t SliceTensorViewFunc::num_tensors() const {
  return static_cast<size_t>(0);
}

void SliceTensorViewFunc::set_tensors(std::vector<at::Tensor> tensors) {
  TORCH_INTERNAL_ASSERT(tensors.size() == num_tensors());

}

at::Tensor SliceTensorViewFunc::operator()(const at::Tensor& input_base) const {
  return at::_ops::slice_Tensor::call(input_base, dim, start, end, step);
}

std::unique_ptr<ViewFunc> SliceTensorViewFunc::clone_and_set(
    std::optional<std::vector<c10::SymInt>> symints,
    std::optional<std::vector<at::Tensor>> tensors) const {
  auto output = std::make_unique<SliceTensorViewFunc>(dim, start, end, step);
  if (symints.has_value()) {
    output->set_symints(std::move(*(symints)));
  }
  if (tensors.has_value()) {
    output->set_tensors(std::move(*(tensors)));
  }
  return output;
}
```

The `_view_func()` / `_view_func_unsafe()` methods now accept two additional (optional) args for `symint_visitor_fn` / `tensor_visitor_fn`. If these are defined, they are expected to be python callables that operate on a single SymInt / tensor and return a new one. This allows for the hot-swapping needed during fake-ification.

For testing, there are extensive pre-existing tests, and I added a test to ensure that hot-swapping functions correctly.
```sh
python test/test_autograd.py -k test_view_func_replay
python test/test_ops.py -k test_view_replay
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118404
Approved by: https://github.com/ezyang
2024-02-14 22:00:43 +00:00
blorange-amd
df9b44436a [ROCm] Enable float16/complex32 fft tests on ROCm (#117296)
This PR is to enable float16/complex32 fft tests on ROCm.
Sample results are attached here:
[test_spectral_ops_results.log](https://github.com/pytorch/pytorch/files/13908533/test_spectral_ops_results.log)

test_decomp::TestDecompCUDA::test_comprehensive_fft*
test_decomp::TestDecompCUDA::test_quick_fft*
test_jit_fuser_te::TestNNCOpInfoCUDA::test_nnc_correctness_fft*
test_meta::TestMetaCUDA::test_dispatch_meta_inplace_fft*
test_meta::TestMetaCUDA::test_dispatch_meta_outplace_fft*
test_meta::TestMetaCUDA::test_dispatch_symbolic_meta_inplace_fft*
test_meta::TestMetaCUDA::test_dispatch_symbolic_meta_outplace_fft*
test_meta::TestMetaCUDA::test_meta_inplace_fft*
test_meta::TestMetaCUDA::test_meta_outplace_fft*
test_ops::TestCommonCUDA::test_complex_half_reference_testing_fft*
test_ops::TestCommonCUDA::test_python_ref__refs_fft*
test_ops::TestCommonCUDA::test_python_ref_executor__refs_fft*
test_ops::TestCommonCUDA::test_python_ref_meta__refs*
test_ops::TestCommonCUDA::test_python_ref_torch_fallback__refs_fft*
test_schema_check::TestSchemaCheckModeOpInfoCUDA::test_schema_correctness_fft*
test_spectral_ops::TestFFTCUDA::test_empty_fft__refs_fft*
test_spectral_ops::TestFFTCUDA::test_empty_fft_fft*
test_spectral_ops::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error__refs_fft*
test_spectral_ops::TestFFTCUDA::test_fft_half_and_chalf_not_power_of_two_error_fft*
test_spectral_ops::TestFFTCUDA::test_fft_round_trip_cuda*
test_spectral_ops::TestFFTCUDA::test_fft_type_promotion_cuda*
test_spectral_ops::TestFFTCUDA::test_fftn_round_trip_cuda*
test_spectral_ops::TestFFTCUDA::test_hfftn_cuda_float16
test_spectral_ops::TestFFTCUDA::test_ihfftn_cuda_float16
test_utils::TestDeviceUtilsCUDA::test_device_mode_ops_fft

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117296
Approved by: https://github.com/pruthvistony, https://github.com/malfet
2024-02-13 22:35:32 +00:00
PyTorch MergeBot
24bdd03d23 Revert "Reify view_func() closures as ViewFuncs (#118404)"
This reverts commit d5a6762263.

Reverted https://github.com/pytorch/pytorch/pull/118404 on behalf of https://github.com/DanilBaibak due to Broken trunk ([comment](https://github.com/pytorch/pytorch/pull/118404#issuecomment-1938600260))
2024-02-12 12:38:51 +00:00
Joel Schlosser
d5a6762263 Reify view_func() closures as ViewFuncs (#118404)
Replaces `view_func()` closures with a reified `ViewFunc` data structure. Codegen generates a `ViewFunc` subclass for each view op (e.g. `NarrowViewFunc`) containing state needed to reconstruct the view. The `ViewFunc` API allows for querying and hot-swapping any `SymInt`s or `Tensors` in the state through `get_symints()` / `get_tensors()` / `clone_and_set()`, which will be essential for fake-ification later on.

```cpp
/// Base class for view functions, providing reapplication of a view on a new base.
/// Each view op should get a codegenerated subclass of this class containing
/// any state needed to reconstruct the view. The class also provides convenience
/// accessors for saved SymInts / tensor state. This is useful for e.g. fake-ification,
/// where we want to use symbolic values or fake tensors instead.
struct TORCH_API ViewFunc {
  virtual ~ViewFunc() {}
  /// Returns any SymInts in the saved state.
  virtual std::vector<c10::SymInt> get_symints() const { return {}; }
  /// Returns the number of SymInts in the saved state.
  virtual size_t num_symints() const { return 0; }
  /// Returns any tensors in the saved state.
  virtual std::vector<at::Tensor> get_tensors() const { return {}; }
  /// Returns the number of tensors in the saved state.
  virtual size_t num_tensors() const { return 0; }
  /// Reapplies the view on the given base using the saved state.
  virtual at::Tensor operator()(const at::Tensor&) const = 0;
  /// Returns a clone of this ViewFunc, optionally with the specified saved state.
  virtual std::unique_ptr<ViewFunc> clone_and_set(
      std::optional<std::vector<c10::SymInt>> = c10::nullopt,
      std::optional<std::vector<at::Tensor>> = c10::nullopt) const = 0;

protected:
  /// Sets the values of any SymInts in the saved state. The input vector size must
  /// match the number of SymInts in the saved state (i.e. the size of the list
  /// returned by get_symints()).
  virtual void set_symints(std::vector<c10::SymInt>) {}
  /// Sets the values of any Tensors in the saved state. The input vector size must
  /// match the number of Tensors in the saved state (i.e. the size of the list
  /// returned by get_tensors()).
  virtual void set_tensors(std::vector<at::Tensor>) {}
};
```

New codegen files:
* `torch/csrc/autograd/generated/ViewFunc.h`
* `torch/csrc/autograd/generated/ViewFuncs.cpp`

The templates for these also contains impls for `ChainedViewFunc` and `ErroringViewFunc` which are used in a few places within autograd.

Example codegen for `slice.Tensor`:
```cpp
// torch/csrc/autograd/generated/ViewFuncs.h
#define SLICE_TENSOR_VIEW_FUNC_AVAILABLE
struct SliceTensorViewFunc : public torch::autograd::ViewFunc {
  SliceTensorViewFunc(int64_t dim, c10::optional<c10::SymInt> start, c10::optional<c10::SymInt> end, c10::SymInt step) : dim(dim), start(start), end(end), step(step)
  {};
  virtual ~SliceTensorViewFunc() override {};
  virtual std::vector<c10::SymInt> get_symints() const override;
  virtual size_t num_symints() const override;
  virtual std::vector<at::Tensor> get_tensors() const override;
  virtual size_t num_tensors() const override;
  virtual at::Tensor operator()(const at::Tensor&) const override;
  virtual std::unique_ptr<ViewFunc> clone_and_set(
      std::optional<std::vector<c10::SymInt>> = c10::nullopt,
      std::optional<std::vector<at::Tensor>> = c10::nullopt) const override;

protected:
  virtual void set_symints(std::vector<c10::SymInt>) override;
  virtual void set_tensors(std::vector<at::Tensor>) override;

private:
  int64_t dim;
  c10::optional<c10::SymInt> start;
  c10::optional<c10::SymInt> end;
  c10::SymInt step;
};
...

// torch/csrc/autograd/generated/ViewFuncs.cpp
std::vector<c10::SymInt> SliceTensorViewFunc::get_symints() const {
  ::std::vector<c10::SymInt> symints;
  symints.reserve((start.has_value() ? 1 : 0) + (end.has_value() ? 1 : 0) + 1);
  if(start.has_value()) symints.insert(symints.end(), *(start));
  if(end.has_value()) symints.insert(symints.end(), *(end));
  symints.push_back(step);
  return symints;
}

size_t SliceTensorViewFunc::num_symints() const {
  return static_cast<size_t>((start.has_value() ? 1 : 0) + (end.has_value() ? 1 : 0) + 1);
}

void SliceTensorViewFunc::set_symints(std::vector<c10::SymInt> symints) {
  TORCH_INTERNAL_ASSERT(symints.size() == num_symints());
  auto i = 0;
  if(start.has_value()) start = symints[i];
  i += (start.has_value() ? 1 : 0);
  if(end.has_value()) end = symints[i];
  i += (end.has_value() ? 1 : 0);
  step = symints[i];
}

std::vector<at::Tensor> SliceTensorViewFunc::get_tensors() const {
  ::std::vector<at::Tensor> tensors;
  return tensors;
}

size_t SliceTensorViewFunc::num_tensors() const {
  return static_cast<size_t>(0);
}

void SliceTensorViewFunc::set_tensors(std::vector<at::Tensor> tensors) {
  TORCH_INTERNAL_ASSERT(tensors.size() == num_tensors());

}

at::Tensor SliceTensorViewFunc::operator()(const at::Tensor& input_base) const {
  return at::_ops::slice_Tensor::call(input_base, dim, start, end, step);
}

std::unique_ptr<ViewFunc> SliceTensorViewFunc::clone_and_set(
    std::optional<std::vector<c10::SymInt>> symints,
    std::optional<std::vector<at::Tensor>> tensors) const {
  auto output = std::make_unique<SliceTensorViewFunc>(dim, start, end, step);
  if (symints.has_value()) {
    output->set_symints(std::move(*(symints)));
  }
  if (tensors.has_value()) {
    output->set_tensors(std::move(*(tensors)));
  }
  return output;
}
```

The `_view_func()` / `_view_func_unsafe()` methods now accept two additional (optional) args for `symint_visitor_fn` / `tensor_visitor_fn`. If these are defined, they are expected to be python callables that operate on a single SymInt / tensor and return a new one. This allows for the hot-swapping needed during fake-ification.

For testing, there are extensive pre-existing tests, and I added a test to ensure that hot-swapping functions correctly.
```sh
python test/test_autograd.py -k test_view_func_replay
python test/test_ops.py -k test_view_replay
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118404
Approved by: https://github.com/ezyang
2024-02-09 18:51:36 +00:00
Isuru Fernando
3e79ef6db8 Complete decomposition for aten.round (#118635)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118635
Approved by: https://github.com/peterbell10
2024-02-01 17:14:44 +00:00
Isuru Fernando
2f7839e6db register decomposition for rsub in torch._refs (#118288)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118288
Approved by: https://github.com/lezcano
ghstack dependencies: #118398
2024-01-30 22:18:15 +00:00
Alexander Grund
f1aef2c094 Don't check is_conj for _refs.linalg.svd (#117972)
The flag is not correctly set when PyTorch is compiled with GPU support resulting in failures in
`test_ops.py::test_python_ref_meta__refs_linalg_svd_cpu_complex`.

Use a similar approach to test_meta and skip the check for this function.

Workaround for #105068

Pull Request resolved: https://github.com/pytorch/pytorch/pull/117972
Approved by: https://github.com/lezcano
2024-01-26 15:24:29 +00:00
Sam Larsen
208e64a9ba Initial implementation of FakeTensor caching (#113873)
Summary: Cache the result of FakeTensor dispatch and skip re-evaluation on cache hits.

Test Plan: New unit tests. Caching is enabled in this diff, so all existing tests exercise the cache as well.

Differential Revision: [D52841637](https://our.internmc.facebook.com/intern/diff/D52841637)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113873
Approved by: https://github.com/eellison
2024-01-17 20:38:54 +00:00
Joel Schlosser
3c21264c9b Introduce reverse view_funcs (#115894)
Part 2 of implementation for general [subclass view fake-ification](https://docs.google.com/document/d/1C5taWiplmX7nKiURXDOAZG2W5VNJ2iV0fQFq92H0Cxw).

Details:
* Codegen `rev_view_func()` alongside `view_func()`
    * Reverse view_func gives you a "base" from a "view": `rev_view_func(new_view) -> new_base` AKA it plays the original view backwards
* Utilizes the functional inverses defined in `FunctionalInverses.cpp`, passing `InverseReturnMode::AlwaysView`
* Manually implements functional inverses for `narrow()` and `chunk()`
* **NB: Multi-output views now set view_func() / rev_view_func() for each of the output views!**
    * Due to this, the `as_view()` overload that operates on a list of views is scrapped in favor of iteration via codegen

Example codegen in `ADInplaceOrViewTypeN.cpp`:
```cpp
at::Tensor narrow(c10::DispatchKeySet ks, const at::Tensor & self, int64_t dim, c10::SymInt start, c10::SymInt length) {
  auto _tmp = ([&]() {
    at::AutoDispatchBelowADInplaceOrView guard;
    return at::_ops::narrow::redispatch(ks & c10::after_ADInplaceOrView_keyset, self, dim, start, length);
  })();
  std::function<at::Tensor(const at::Tensor&)> func=nullptr;
  std::function<at::Tensor(const at::Tensor&)> rev_func=nullptr;
  if (false || !self.unsafeGetTensorImpl()->support_as_strided() ||
      c10::AutogradState::get_tls_state().get_view_replay_enabled()) {
    func = [=](const at::Tensor& input_base) {
      return at::_ops::narrow::call(input_base, dim, start, length);
    };
    rev_func = [=](const at::Tensor& input_view) {
      // NB: args from narrow() signature are passed along to the inverse
      return at::functionalization::FunctionalInverses::narrow_copy_inverse(self, input_view, at::functionalization::InverseReturnMode::AlwaysView, dim, start, length);
    };
  }
  auto result = as_view(/* base */ self, /* output */ _tmp, /* is_bw_differentiable */ true, /* is_fw_differentiable */ true, /* view_func */ func, /* rev_view_func */ rev_func, /* creation_meta */ InferenceMode::is_enabled() ? CreationMeta::INFERENCE_MODE : (at::GradMode::is_enabled() ? CreationMeta::DEFAULT : CreationMeta::NO_GRAD_MODE));
  return result;
}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115894
Approved by: https://github.com/soulitzer
2024-01-05 16:48:12 +00:00
Aaron Gokaslan
3fe437b24b [BE]: Update flake8 to v6.1.0 and fix lints (#116591)
Updates flake8 to v6.1.0 and fixes a few lints using sed and some ruff tooling.
- Replace `assert(0)` with `raise AssertionError()`
- Remove extraneous parenthesis i.e.
  - `assert(a == b)` -> `assert a == b`
  - `if(x > y or y < z):`->`if x > y or y < z:`
  - And `return('...')` -> `return '...'`

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116591
Approved by: https://github.com/albanD, https://github.com/malfet
2024-01-03 06:04:44 +00:00
kflu
c5dcb50c00 [easy] aten ops: support passing all args as kwargs, including self (#114920)
Summary:
This is important for writing aten IR based graph transformation.

```
In [4]: [x.name for x in torch.ops.aten.reshape.default._schema.arguments]
Out[4]: ['self', 'shape']

In [8]: torch.ops.aten.reshape.default(torch.rand(1,2), shape=[2])
Out[8]: tensor([0.7584, 0.4834])

# === CANNOT CALL `self` BY KWARGS ===

In [7]: torch.ops.aten.reshape.default(self=torch.rand(1,2), shape=[2])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[7], line 1
----> 1 torch.ops.aten.reshape.default(self=torch.rand(1,2), shape=[2])

TypeError: OpOverload.__call__() got multiple values for argument 'self'

```

# Where's the problem?

1. the aten ops first arg is usually named `self` (aten/src/ATen/native/native_functions.yaml)
2. Unfortunately, in `torch._ops.{OpOverload, OpOverloadPacket}.__call__()`, the first arg is (by python convention) named `self` too.

So when call `self` by kwargs, `OpOverloadPacket.__call__` received:

```
OpOverloadPacket.__call__(self, {"self": ...})
```

It is Python that does not allow some argument named "arg" to appear twice. and hence

> TypeError: OpOverload.__call__() got multiple values for argument 'self'

# How to fix?

**Note that**, in above, `self` is an instance of `OpOverloadPacket`, and the "self" kwarg is the input tensor to the aten op. To fix, we only need to differentiate the two `self`s.

In Python, first arg of a method does not need to be named `self`. So we change the `__call__` definition to:

```
def __call__(_self, ...):
```

Now the call becomes:

```
OpOverloadPacket.__call__(_self, {"self": ...})
```

where:
* `_self` is the instance to the `OpOverloadPacket`
* `"self"` is the input tensor to the aten op.

Test Plan:
```
In [4]: [x.name for x in torch.ops.aten.reshape.default._schema.arguments]
Out[4]: ['self', 'shape']

In [3]: torch.ops.aten.reshape.default(self=torch.rand(1,2), shape=[2])
Out[3]: tensor([0.5127, 0.3051])
```

Differential Revision: D51731996

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114920
Approved by: https://github.com/houseroad
2023-12-16 18:32:58 +00:00
rzou
3477a2ee03 unMarkDynamoStrictTest on OpInfo-based tests (#115856)
These take too long to run under strict mode. We'll worry about them
later. Note that these decorators don't do anything yet (unless we flip
the default from non-strict to strict).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115856
Approved by: https://github.com/voznesenskym
ghstack dependencies: #115845, #115855
2023-12-15 01:22:31 +00:00
Isuru Fernando
505574c46a Add decomposition for torch.block_diag (#115096)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115096
Approved by: https://github.com/peterbell10
2023-12-11 20:04:22 +00:00
Aaron Gokaslan
794545c11f [BE]: Enable RUF015 codebase wide (#115507)
Constant time access of first value in collection. This is a constant time operation instead of converting the item to a list to get the first item which is linear. The rule is turned on which automatically autofixes and enforces this.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115507
Approved by: https://github.com/malfet
2023-12-11 15:51:01 +00:00
Isuru Fernando
e4a88d9581 Convert SymInts to SymFloats with SymPy (#113683)
Fixes #109365

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113683
Approved by: https://github.com/ezyang, https://github.com/lezcano
2023-11-20 23:35:40 +00:00
Evgeni Burovski
237cbd5be6 BUG: trace frames with numpy scalar -> ndarray functions (#112959)
Fixes #112951

Make dynamo detect that `np.arange(3)` returns a FakeTensor, so the frame needs to be traced.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112959
Approved by: https://github.com/lezcano
2023-11-17 03:00:24 +00:00
Aryan Gupta
8cee0a25bd fix: Flake8-BugBear code B-026 for PyTorch (#111362)
Fixes #106571

I have fixed the B-026 error codes for Flake8 tests on the codebase. Please review and tell me anything else to do.
Thanks and excited for this first contribution to PyTorch.

Also I refer this issue which introduced [B-026](https://github.com/PyCQA/flake8-bugbear/issues/286) in `pytest-bugbear` and discuss the error code.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111362
Approved by: https://github.com/Skylion007
2023-11-07 21:38:18 +00:00
Peter Bell
66c32d099a Use pytree.arg_tree_leaves everywhere (#112394)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112394
Approved by: https://github.com/lezcano
ghstack dependencies: #112391, #112392, #112393
2023-10-31 15:57:06 +00:00
Peter Bell
bbd5b935e4 Use pytree.tree_leaves everywhere (#112324)
This changes all the instances I could find of `tree_flatten(...)[0]` or
`x, _ = tree_flatten` to use `tree_leaves`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112324
Approved by: https://github.com/lezcano
ghstack dependencies: #112327, #112323
2023-10-30 03:39:04 +00:00
William Wen
a380bf3297 [dynamo, test] skip flaky dynamo-wrapped tests (#112310)
ghstack-source-id: 7a87e33e7513e7924e4513b6473284562989ed4c
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112309

Skip flaky tests reported by
- https://github.com/pytorch/pytorch/issues/111825
- https://github.com/pytorch/pytorch/issues/111826
- https://github.com/pytorch/pytorch/issues/111909
- https://github.com/pytorch/pytorch/issues/112142
- https://github.com/pytorch/pytorch/issues/112220

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112310
Approved by: https://github.com/xmfan
2023-10-28 04:14:57 +00:00
Isuru Fernando
c120e5606e Use ops_and_refs in test_ops.py instead of _ops_and_refs (#112022)
`ops_and_refs` and `_ops_and_refs` have the same definition.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112022
Approved by: https://github.com/lezcano
2023-10-27 18:37:05 +00:00
Isuru Fernando
fdbb73fa4e Check both ops and refs in test_strided_layout (#112160)
Trying #112023 again to see if CLA issue is fixed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112160
Approved by: https://github.com/lezcano, https://github.com/Neilblaze
2023-10-27 15:35:34 +00:00
alhridoy
0c64ac0d3a Add tests for strided layout in factory functions (#111463)
Fixes #111222
This pull request adds tests for factory functions that create tensors with a strided layout. The tests are added to the `test_ops.py` file and check the behavior of the `empty`, `zeros`, `ones`, and `rand` factory functions when used with the `layout=torch.strided` argument.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111463
Approved by: https://github.com/lezcano
2023-10-24 17:05:44 +00:00
Philip Meier
973c87b320 raise instead of skip in test/test_meta.py (#110939)
Supersedes #109004.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110939
Approved by: https://github.com/lezcano, https://github.com/kurtamohler
2023-10-17 10:17:43 +00:00
Jez Ng
ddb0c26511 [inductor] Re-enable more fixed tests (#110798)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110798
Approved by: https://github.com/Skylion007
2023-10-09 04:36:51 +00:00
Jez Ng
dddf581da7 [dynamo] Add graph break on requires_grad_() (#110053)
Fixes #107861.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110053
Approved by: https://github.com/eellison
2023-10-04 06:22:16 +00:00
SS-JIA
5df8aca994 [core IR] Add a core decomposition for floor_divide (#110046)
## Context

Introduce a core decomposition for `aten.floor_divide` into other `aten` ops, and add it to the core ATen decomposition table.

This replaces the decomposition of `floor_divide` that was used by Inductor. I noticed there was a note on that decomposition

```
# TorchInductor-only decomposition. It should not be taken to core.
# See https://github.com/pytorch/torchdynamo/pull/1120
```

but couldn't discern the reason why this is the case. cc: @lezcano

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110046
Approved by: https://github.com/peterbell10
2023-09-26 08:39:21 +00:00
SS-JIA
7de669f2f9 [core IR] Remove trunc decomp and add trunc to core (#109902)
Following up from [this comment](https://github.com/pytorch/pytorch/pull/109319#discussion_r1330803226). Remove the decomposition for `trunc`, and add it as a core operator.

Going forward, provide similar treatment for operators that map cleanly to hardware instructions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109902
Approved by: https://github.com/peterbell10
2023-09-25 18:18:06 +00:00
Mwiza Kunda
6b7b9c796e Fix registering jit decompositions for jvp for out wrapped decomps (#109367)
Python decompositions wrapped by `out_wrapper` need to be unwrapped before compiling with TorchScript since:
- `out_wrapper` extends the decompositions signature with an out parameter, however this `out` parameter is not present in the source code of the original decomposition so the resulting `ScriptFunction` will not have an `out` parameter
- `out_wrapper` is in the `torch._prims_common.wrappers` module so its `globals()` are different to the globals of the decomposition to be wrapped. This may cause symbol resolution to fail with the TorchScript compiler since it is compiling the unwrapped decomps source code rather than the wrapper

The python decomposition for `aten.trace` is wrapped as an example, other decompositions are to be fixed in https://github.com/pytorch/pytorch/pull/107707
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109367
Approved by: https://github.com/lezcano
2023-09-21 16:36:51 +00:00
Salil Desai
2e721aab98 [Decomposition] Trunc (#109319)
Summary:
Add Decomp for Trunc and add it to core_aten_decompositions

Differential Revision: D49042033

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109319
Approved by: https://github.com/SherlockNoMad
2023-09-19 13:30:13 +00:00
Jez Ng
7f3885137f Add meta function for _segment_reduce (#109359)
This fixes numerous tests which were xfailing. For instance, the
`_segment_reduce.lengths` OpInfo test, which was previously relying on
the fallback kernel to determine the shape of the meta tensor. The
fallback kernel would fail with

    segment_reduce(): Expected all rows of lengths along axis to sum to data.size(lengths.dim()-1) when !unsafe.

as it was trying to read the values of a meta tensor.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109359
Approved by: https://github.com/ezyang
2023-09-16 13:31:03 +00:00
PyTorch MergeBot
41bd0fde7e Revert "Remove fixed skips (#108674)"
This reverts commit ab9fb03d6f.

Reverted https://github.com/pytorch/pytorch/pull/108674 on behalf of https://github.com/huydhn due to Sorry for picking this up a bit late, but with https://github.com/pytorch/pytorch/pull/108647 reverted, these tests are failing again. So we need to wait for the PR to reland before we can land this change ([comment](https://github.com/pytorch/pytorch/pull/108674#issuecomment-1715202692))
2023-09-12 08:04:32 +00:00
Ken Jin
c458fa0d35 Decompose/add reference for view_as_complex (#108005)
Aten source: d4a99631dd/aten/src/ATen/native/ComplexHelper.h (L78)

Documentation reference:
https://pytorch.org/docs/stable/generated/torch.view_as_complex.html

Note: this adds a new primitive `view_of_dtype`, which is trivially implemented, as its meta function is already implemented elsewhere.

Finally, this is not registered as a decomposition (yet), because TorchInductor does not yet support complex types. It should be added once we do.

Closes https://github.com/pytorch/pytorch/issues/108020 as well.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108005
Approved by: https://github.com/peterbell10, https://github.com/ezyang
2023-09-07 23:49:20 +00:00
eellison
ab9fb03d6f Remove fixed skips (#108674)
These no longer fail with TEST_WITH_TORCHINDUCTOR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108674
Approved by: https://github.com/desertfire
2023-09-07 17:36:56 +00:00
Kurt Mohler
3f88e3105f Reland: Remove remaining global set_default_dtype calls from tests (#108088)
Fixes #68972

Relands #107246

To avoid causing Meta-internal CI failures, this PR avoids always asserting that the default dtype is float in the `TestCase.setUp/tearDown` methods. Instead, the assert is only done if `TestCase._default_dtype_check_enabled == True`. `_default_dtype_check_enabled` is set to True in the `if __name__ == "__main__":` blocks of all the relevant test files that have required changes for this issue

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108088
Approved by: https://github.com/ezyang
2023-09-07 03:04:34 +00:00
PyTorch MergeBot
43527d41a2 Revert "Remove fixed skips (#108674)"
This reverts commit 518cfda2dd.

Reverted https://github.com/pytorch/pytorch/pull/108674 on behalf of https://github.com/huydhn due to Sorry for reverting this, but one test is failing on inductor 518cfda2dd, and it seems easier to revert this than disabling the test ([comment](https://github.com/pytorch/pytorch/pull/108674#issuecomment-1709310192))
2023-09-07 00:56:46 +00:00
eellison
518cfda2dd Remove fixed skips (#108674)
These no longer fail with TEST_WITH_TORCHINDUCTOR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108674
Approved by: https://github.com/desertfire
2023-09-06 22:33:43 +00:00
Guilherme Leobas
7e878c9d10 Add decomposition for aten.take_along_dim (#108185)
xref #107875

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108185
Approved by: https://github.com/lezcano
2023-09-04 13:49:53 +00:00
lezcano
239ee76177 Add refs/decomps for dot/vdot (#108194)
Follow-up on https://github.com/pytorch/pytorch/issues/108127#issuecomment-1698142427

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108194
Approved by: https://github.com/peterbell10
ghstack dependencies: #108188
2023-08-31 15:30:23 +00:00
Sherlock Huang
ee4b99cc3a Decomp for aten.dropout (#106274)
When exporting dropout with cpu tensor, we get following graph module
```
    class GraphModule(torch.nn.Module):
        def forward(self, arg0_1: f32[512, 10]):
            empty_memory_format: f32[512, 10] = torch.ops.aten.empty.memory_format([512, 10], dtype = torch.float32, layout = torch.strided, device = device(type='cpu'), pin_memory = False, memory_format = torch.contiguous_format)
            bernoulli_p: f32[512, 10] = torch.ops.aten.bernoulli.p(empty_memory_format, 0.9);  empty_memory_format = None
            div_scalar: f32[512, 10] = torch.ops.aten.div.Scalar(bernoulli_p, 0.9);  bernoulli_p = None
            mul_tensor: f32[512, 10] = torch.ops.aten.mul.Tensor(arg0_1, div_scalar);  arg0_1 = div_scalar = None
            return (mul_tensor,)
```

In addition, if we export with eval() mode, we will have an empty graph.

However, when exporting with cuda tensor, we got
```
    class GraphModule(torch.nn.Module):
        def forward(self, arg0_1: f32[512, 10]):
            native_dropout_default = torch.ops.aten.native_dropout.default(arg0_1, 0.1, True);  arg0_1 = None
            getitem: f32[512, 10] = native_dropout_default[0];  native_dropout_default = None
            return (getitem,)
```
and exporting under eval() mode will still have a dropout node in graph.

This PR make exporting with CPU tensor also produce aten.native_dropout.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106274
Approved by: https://github.com/ezyang
2023-08-23 21:12:37 +00:00
Aaron Gokaslan
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings.

I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
PyTorch MergeBot
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e4322.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please hep them get unblocked? It seems like one of the strings was prob accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
Aaron Gokaslan
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285 which is faster, better, and have fixes a bunch of false negatives with regards to fstrings.

I also enabled RUF017 which looks for accidental quadratic list summation. Luckily, seems like there are no instances of it in our codebase, so enabling it so that it stays like that. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
Ivan Yashchuk
c913f3857f Remove dynamo+nvfuser (#105789)
This PR removes unmaintained Dynamo+nvFuser.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105789
Approved by: https://github.com/jansel, https://github.com/jjsjann123, https://github.com/albanD
2023-08-08 22:29:32 +00:00
PyTorch MergeBot
891bb259f8 Revert "Remove dynamo+nvfuser (#105789)"
This reverts commit 6030151d37.

Reverted https://github.com/pytorch/pytorch/pull/105789 on behalf of https://github.com/DanilBaibak due to Break a lot of tests on main. ([comment](https://github.com/pytorch/pytorch/pull/105789#issuecomment-1669710571))
2023-08-08 14:20:32 +00:00
Ivan Yashchuk
6030151d37 Remove dynamo+nvfuser (#105789)
This PR removes unmaintained Dynamo+nvFuser.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105789
Approved by: https://github.com/jansel, https://github.com/jjsjann123, https://github.com/albanD
2023-08-08 13:29:31 +00:00
Peter Bell
ab6efb1649 [pt2] Add reference implementations of torch.{stft,istft} (#106400)
This allows symbolic shapes to be traced through `torch.stft` and `torch.istft`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106400
Approved by: https://github.com/lezcano
ghstack dependencies: #106319
2023-08-07 20:59:30 +00:00
Peter Bell
d4d090e2da [FakeTensor] Workaround FFT ops with incorrect meta strides (#106319)
Currently there are FFT operators which raise `UnsupportedOperatorException`
because their meta implementations sometimes give incorrect strides. This works
around the problem for static shapes by falling back to eager. Though we still
don't support calls with dynamic shapes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106319
Approved by: https://github.com/ezyang
2023-08-07 20:59:30 +00:00
Nikita Karetnikov
0ee3b84021 [pt2] add meta for cholesky_inverse (#106120)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106120
Approved by: https://github.com/ezyang
2023-07-29 17:16:20 +00:00
Nikita Karetnikov
80755884be [pt2] add meta for cholesky (#106115)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106115
Approved by: https://github.com/Skylion007, https://github.com/ezyang
2023-07-29 17:16:20 +00:00
Aaron Gokaslan
6d43c89f37 [BE]: Update Ruff to 0.0.280 (#105724)
Removes unusued loop values in python dictionary iteration. Automated fix from Ruff master

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105724
Approved by: https://github.com/ezyang, https://github.com/janeyx99
2023-07-22 23:03:34 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
Justin Chu
73e1455327 [BE] Enable ruff's UP rules and autoformat test/ (#105434)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105434
Approved by: https://github.com/albanD
2023-07-19 20:36:06 +00:00
Kurt Mohler
ffce2492af Remove set_default_dtype calls from jit and ops tests (#105072)
Part of #68972

This only attempts to avoid setting the default dtype for `test_jit.py` and `test_ops.py`. There are other tests, like `test_nn.py`, which will be addressed in follow up PRs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105072
Approved by: https://github.com/ezyang
2023-07-15 03:18:33 +00:00
cyy
54cb61f7d9 enable ASAN on some tests (#103647)
Enabling more tests on ASAN, meanwhile we disable float-divide-by-zero and float-cast-overflow, both are disabled because they are also disabled by default in latest clang.
The following cited doc explains the reasons.
```
-fsanitize=float-cast-overflow: Conversion to, from, or between floating-point types
which would overflow the destination. Because the range of representable values
for all floating-point types supported by Clang is [-inf, +inf], the only cases detected are
conversions from floating point to integer types.
-fsanitize=float-divide-by-zero: Floating point division by zero.
This is undefined per the C and C++ standards,
 but is defined by Clang (and by ISO/IEC/IEEE 60559 / IEEE 754) as producing
either an infinity or NaN value,
so is not included in -fsanitize=undefined.
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103647
Approved by: https://github.com/kit1980
2023-06-28 02:17:14 +00:00
Nikita Karetnikov
c40fa8b614 [inductor] remove fft and svd ops from fake_incorrect_kernels (#103616)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103616
Approved by: https://github.com/eellison
2023-06-22 03:01:43 +00:00
Aleksandar Samardžić
09fdea8564 Fix autograd issue with identity conversions (#92022)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92022
Approved by: https://github.com/pearu, https://github.com/mtaaooby, https://github.com/amjames, https://github.com/cpuhrsch
2023-06-21 21:23:03 +00:00
BowenBao
724a1ba2de Tidy __all__ under torch._refs (#103712)
- Added ops that were missing under `__all__`.
- Some misc changes to helper functions to make them private.
- Set correct `fn.__module__` for `fn` created by `_make_alias`, when called in another module.

All modification largely references results from a hacked version of `test_public_bindings::test_correct_module_names`.
By default `torch._refs` is not included in the test because it is technically a private package.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103712
Approved by: https://github.com/lezcano
2023-06-20 00:04:58 +00:00
ekkapricious
5d34656fd7 Update dynamo sum dtype handling to match eager (#103037)
The current behaviour for dynamo is to set the dtype to torch.int64 for integral types if the dtype is not specified explicitly which results in mismatched behaviour as compared to eager mode. In eager mode the semantics are:
- If both out is specified and dtype is specified then they have to match
- If dtype is not specified but out is specified then the dtype is set to match the out dtype
- If neither dtype nor out is set then the dtype is set to kLong if it is a bool or an integral type

Fixes #100698

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103037
Approved by: https://github.com/ngimel
2023-06-19 22:26:37 +00:00
vfdev-5
e3d97b6213 [inductor] Added smooth_l1_loss refs (#102077)
Added `smooth_l1_loss` to refs + tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102077
Approved by: https://github.com/lezcano, https://github.com/ngimel
2023-05-24 15:07:08 +00:00
Khushi
51fe53e619 [opinfo] item (#100313)
Follows #100223

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100313
Approved by: https://github.com/ezyang
2023-05-10 11:32:45 +00:00
Pearu Peterson
3ae0e23b90 Fix sum OpInfo for sparse sample inputs and assert coverage for sparse-enabled operators (#100391)
This PR enables sum tests for sparse sample inputs. Previously, the tests existed but were never run because the sum OpInfo instance was created without specifying `supports_sparse_*=True`. To avoid such mistakes in the future, the following PR https://github.com/pytorch/pytorch/pull/100392 enables the `supports_sparse_*` flags automatically when OpInfo creation specifies `sample_inputs_sparse_*_func`.

In addition, the PR applies several fixes to sum tests for sparse sample inputs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100391
Approved by: https://github.com/cpuhrsch
2023-05-03 02:04:39 +00:00
Elias Ellison
638feec4e3 Turn on meta converter for complex (#98869)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98869
Approved by: https://github.com/ngimel
2023-04-20 16:42:38 +00:00
blorange-amd
455795c799 Enable fake_crossref unit tests on rocm (#97368)
This PR should enable 900+ fake_crossref unit tests for ROCm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97368
Approved by: https://github.com/jeffdaily, https://github.com/malfet
2023-04-12 02:38:35 +00:00
eqy
2fddcf0fc0 [CUDA][CUDA 11] Remove more CUDA 11 version checks (#92934)
Working on removing stragglers missed in previous CUDA version < 11.0 cleanup PRs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92934
Approved by: https://github.com/ngimel
2023-03-30 19:49:52 +00:00
Aaron Gokaslan
47dca20d80 [BE] Enable flake8-comprehension rule C417 (#97880)
Enables flake8-comprehension rule C417. Ruff autogenerated these fixes to the codebase.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97880
Approved by: https://github.com/ezyang, https://github.com/kit1980, https://github.com/albanD
2023-03-30 14:34:24 +00:00
Nikita Karetnikov
cb7c796b4b Enable min.unary_out (#96441)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96441
Approved by: https://github.com/ngimel
2023-03-11 19:23:33 +00:00
Edward Z. Yang
4833e47feb Add support for nonzero, some improvements to reduce guards (#95387)
This takes the strategy described in https://docs.google.com/document/d/1lFRYAJo5nrfxRhwIzGnfi2pbLpU6T4ytSRSuLJ5qebI/edit#

It is essentially https://github.com/pytorch/pytorch/pull/95222 but squashed and with changes that are unnecessary given that we assume nonzero returns > 1.

What's in the PR:

* nonzero now supports meta propagation. When `capture_dynamic_output_shape_ops`, it will return a tensor with an unbacked SymInt representing the size in question.
* The unbacked SymInt is UNSOUNDLY assumed to be not equal to 0/1. We will still error if you guard otherwise.
* PrimTorch pointwise operators are updated to use empty_permuted, to avoid guarding on unbacked SymInt from empty_strided (tested in `test_dynamic_pointwise_scalar`)
* Convolution is updated to skip backend selection if batch is unbacked, to avoid guarding on unbacked SymInt (tested in `test_unbacked_batch_resnet`)
* I kept the helper utilities like `definitely_true` for working with possibly unbacked SymInts. They're not used right now but maybe someone will find them useful.
* Added `constrain_unify` to let you specify two unbacked SymInts must have the same value

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95387
Approved by: https://github.com/voznesenskym
2023-02-24 00:27:45 +00:00
kshitij12345
3b966a6ce3 [autograd] disable backward/grad for complex scalar output (#92753)
Fixes https://github.com/pytorch/pytorch/issues/92750

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92753
Approved by: https://github.com/ezyang
2023-02-23 11:38:27 +00:00
Edward Z. Yang
f20c4d2345 Stop printing giant container in test failure message (#95226)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95226
Approved by: https://github.com/albanD
2023-02-21 21:15:02 +00:00
Fabio Rocha
b652577d8e Change test_torchinductor_opinfo.py to mark skips/xfails in a better way (#94813)
With this change, expected failures will be correctly reported as such by pytest (instead of passes as before).
It was sometimes a little confusing to see operators you did not expect to work in inductor reported as passing their tests.

One downside is that expected failures/skips for test variants have now to be identified by tuples. I.e., `("max", "reduction_no_dim"): {f16},` instead of just `"max.reduction_no_dim": {f16}`. It seems to me it is worth it.

This change would also allow to simplify `TestInductorOpInfo` class a little, since it doesn't have to handle the skips/xfails anymore, but that might require dropping support for things like `PYTORCH_COLLECT_EXPECT` and `PYTORCH_FAIL_ON_SUCCESS` so I didn't do it.

Also couple of other minor changes:

 - Got rid of c32, c64, c128 in torchinductor_opinfo. We don't support complex numbers, so they shouldn't be necessary.
 - Renamed TestExpect Enum to ExpectedTestResult to get rid of a pytest warning that thinks it is a class that has tests.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94813
Approved by: https://github.com/lezcano, https://github.com/jansel
2023-02-16 18:57:01 +00:00
Edward Z. Yang
ef5de0a4cf Don't use PrimTorch decomposition for empty (#94512)
This PR removes the unnecessary == 0 guard when constructing empty tensors, by ensuring that when we create a contiguous tensor we go directly to the C++ torch.empty implementation (instead of indirecting through empty_strided), where we can bypass doing zero tests when computing the size of the storage. This probably also speeds up trace time.

When I did this, I found out that `empty_tensor_restride_symint` was flagrantly wrong (we had never exercised it before because we redirected to `empty_strided` in PrimTorch decomp, which doesn't hit this codepath.) The bugs:

* Stride computation was wrong (only `last_idx` was ever written to)
* Using set_sizes_and_strides with `sym_sizes` input doesn't work, because there is some sort of ordering problem where `clone_symvec` isn't safe when you clone a vector into itself. Probably should fix this.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94512
Approved by: https://github.com/ngimel
2023-02-16 16:04:41 +00:00
PyTorch MergeBot
a049bbb100 Revert "Change test_torchinductor_opinfo.py to mark skips/xfails in a better way (#94813)"
This reverts commit bfc0d5e22c.

Reverted https://github.com/pytorch/pytorch/pull/94813 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but it causes failures on trunk bfc0d5e22c due to a landrace with b6df987671
2023-02-16 05:08:23 +00:00
Fabio Rocha
bfc0d5e22c Change test_torchinductor_opinfo.py to mark skips/xfails in a better way (#94813)
With this change, expected failures will be correctly reported as such by pytest (instead of passes as before).
It was sometimes a little confusing to see operators you did not expect to work in inductor reported as passing their tests.

One downside is that expected failures/skips for test variants have now to be identified by tuples. I.e., `("max", "reduction_no_dim"): {f16},` instead of just `"max.reduction_no_dim": {f16}`. It seems to me it is worth it.

This change would also allow to simplify `TestInductorOpInfo` class a little, since it doesn't have to handle the skips/xfails anymore, but that might require dropping support for things like `PYTORCH_COLLECT_EXPECT` and `PYTORCH_FAIL_ON_SUCCESS` so I didn't do it.

Also couple of other minor changes:

 - Got rid of c32, c64, c128 in torchinductor_opinfo. We don't support complex numbers, so they shouldn't be necessary.
 - Renamed TestExpect Enum to ExpectedTestResult to get rid of a pytest warning that thinks it is a class that has tests.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94813
Approved by: https://github.com/lezcano, https://github.com/jansel
2023-02-16 03:32:01 +00:00
Aaron Gokaslan
67d9790985 [BE] Apply almost all remaining flake8-comprehension checks (#94676)
Applies the remaining flake8-comprehension fixes and checks. This changes replace all remaining unnecessary generator expressions with list/dict/set comprehensions which are more succinct, performant, and better supported by our torch.jit compiler. It also removes useless generators such as 'set(a for a in b)`, resolving it into just the set call.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676
Approved by: https://github.com/ezyang
2023-02-12 01:01:25 +00:00
albanD
496c0a207b Make segment_reduce properly private. (#93166)
I am attempting not to change the aten function to reduce the amount of BC issues on the torchscript side.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93166
Approved by: https://github.com/ngimel
2023-02-06 18:32:23 +00:00
Elias Ellison
e4f11e01bd [Fake Tensor] Allow fake meta by default, delete unused ctor args (#93993)
Two small changes that I'm bundling together because one of them needs to touch fbcode and I'm not sure how to do stacked diffs + internal changes + land before release cut.

Remove allow_meta from ctor, and allow by default: we should be able to trace through meta with fake tensors, so in some senses it's a bit weird to expose to user to disallow this. However, it's still useful debug wise to error from time to time, so I've added an option to the config that will get back previous behavior.

Remove `throw_on_data_dependent_ops=True`: this was intended as a temporary behavior as we were smoothing things turning on the erroring. There are no uses anywhere of `throw_on_data_dependent_ops=False` I could find.

These are technically backward-incompatble, but fake tensor is new since the last release / in a private namespace, and I don't want to release it with baggage that would be hard to remove later.

Fix for https://github.com/pytorch/pytorch/issues/92877.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93993
Approved by: https://github.com/bdhirsh, https://github.com/ezyang
2023-02-03 09:23:38 +00:00
Yanbo Liang
a6b51448f5 [Dynamo] Supports if condition on user defined object (#90892)
Fixes Meta internal user case, see the pattern in unit test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90892
Approved by: https://github.com/jansel, https://github.com/mlazos
2023-01-26 04:19:32 +00:00
lezcano
8b861544f9 Remove lowering and decompositions of zero_, zero, zeros_like... in favour of their references (#92071)
The generated triton code is identical.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92071
Approved by: https://github.com/ngimel
2023-01-18 23:22:36 +00:00
lezcano
da58f9eb8f Rewrite out-of-place decompositions in terms of out-of-place ops (#92003)
Fixes https://github.com/pytorch/torchdynamo/issues/1863

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92003
Approved by: https://github.com/ngimel
2023-01-17 16:53:27 +00:00
Elias Ellison
b651e06049 Add Pointwise Tag from pointwise set in DTensor, use in aot_autograd partitioner (#90029)
Takes the pointwise op list from [DTensor](https://github.com/pytorch/pytorch/blob/master/torch/distributed/_tensor/ops/pointwise_ops.py#L36) as an initially starting point for pointwise ops, and feeds them to the aot autograd partitioner.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90029
Approved by: https://github.com/ezyang
2022-12-08 20:21:17 +00:00
Jane Xu
8695f0cced Rectify native_batch_norm schema by splitting it into two legit schemas (#88697)
Using the same repro from the issue (but with BatchNorm2D)

Rectifies native_batch_norm schema by splitting the schema into 2:
1. one will have NON-optional alias-able running_mean and running_var inputs
2. the other will just not have those parameters at all (no_stats variation)

**Calling for name suggestions!**

## test plan
I've added tests in test_functionalization.py as well as an entry in common_method_invocations.py for `native_batch_norm_legit`
CI should pass.

## next steps
Because of bc/fc reasons, we reroute native_batch_norm to call our new schemas ONLY through the python dispatcher, but in 2 weeks or so, we should make `native_batch_norm_legit` the official batch_norm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88697
Approved by: https://github.com/albanD
2022-11-23 23:23:17 +00:00
lezcano
c2cf0bde1f Move the OpInfo same-storage error to the autograd test (#88306)
This check was previously located at the `non_contiguous` test (quite
and odd location). Even more, at https://github.com/pytorch/pytorch/pull/86378#discussion_r993658395, Kshiteej found that this assert was not doing anything really.

We move it to the autograd test and make it a proper `self.assert`. We also disallow returning 1-tuples from sample_input functions, as they were breaking this assert.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88306
Approved by: https://github.com/mruberry
2022-11-21 13:59:03 +00:00
lezcano
154e58c032 Add most in-place references/decompositions (#88117)
We add most in-place references in a generic way. We also implement a
wrapper to implement the annoying interface that `nn.functional`
nonlinearities have.

We fix along the way a couple decompositions for some non-linearities by
extending the arguments that the references have.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88117
Approved by: https://github.com/mruberry
2022-11-18 14:59:46 +00:00
Bin Bao
d0130cd21e Enable test_ops for inductor (#88994)
Summary: skip several unsupported test cases
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88994
Approved by: https://github.com/Krovatkin
2022-11-15 21:40:36 +00:00
Pruthvi Madugundu
2819df9a19 [ROCm] Enable python ref executor UTs for ROCm (#88981)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88981
Approved by: https://github.com/mruberry
2022-11-15 17:49:00 +00:00