Commit Graph

40 Commits

Author SHA1 Message Date
cyy
419a7e197d [6/N] Fix Wextra-semi warning (#139605)
Fixes #ISSUE_NUMBER
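
For context, what -Wextra-semi flags (illustrative example, not code from the PR):
```
// -Wextra-semi flags redundant semicolons, e.g. after a member function body:
struct Foo {
  void bar() {}; // warning: extra ';' after member function definition
  void baz() {}  // fixed: no trailing semicolon
};
```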

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139605
Approved by: https://github.com/ezyang
2024-11-04 13:43:16 +00:00
cyy
3ef45e5669 Fix ODR (#131032)
Fixes ODR violation
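
For context, the shape of an ODR (One Definition Rule) violation (a hedged sketch; the actual offending symbols are in the PR):
```
// The ODR requires the same entity to have identical definitions across
// translation units. A classic violation (illustrative):
//
//   a.cpp:  inline int answer() { return 42; }
//   b.cpp:  inline int answer() { return 43; } // same name, different body
//
// The program is ill-formed, no diagnostic required; behavior depends on
// which definition the linker happens to keep.
```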

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131032
Approved by: https://github.com/ezyang
2024-08-05 23:19:49 +00:00
cyy
c99adce9a1 [12/N] Fix clang-tidy warnings in jit (#132209)
Follows #132131

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132209
Approved by: https://github.com/Skylion007
2024-08-01 15:12:12 +00:00
cyy
eccbd408e5 [10/N] Fix clang-tidy warnings in jit (#132122)
Follows #132010

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132122
Approved by: https://github.com/Skylion007
2024-07-30 12:56:31 +00:00
cyy
f4dcf2ae93 [1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128301
Approved by: https://github.com/ezyang, https://github.com/r-barnes
2024-07-08 07:03:53 +00:00
PyTorch MergeBot
846bb30e13 Revert "[1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301)"
This reverts commit bd72e28314.

Reverted https://github.com/pytorch/pytorch/pull/128301 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it fails XLA build bd72e28314. Please rebase your PR before relanding because I think the failure is hidden by an unrelated broken trunk XLA failure from your current base commit ([comment](https://github.com/pytorch/pytorch/pull/128301#issuecomment-2169035822))
2024-06-15 01:58:20 +00:00
cyy
bd72e28314 [1/N] Change #include <c10/util/Optional.h> to #include <optional> (#128301)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128301
Approved by: https://github.com/ezyang
2024-06-14 23:21:01 +00:00
Richard Barnes
ed327876f5 [codemod] c10:optional -> std::optional (#126135)
Generated by running the following from PyTorch root:
```
find . -regex ".*\.\(cpp\|h\|cu\|hpp\|cc\|cxx\)$" | grep -v "build/" | xargs -n 50 -P 4 perl -pi -e 's/c10::optional/std::optional/'
```

`c10::optional` is just an alias for `std::optional`. This removes usages of that alias in preparation for eliminating it entirely.
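
Since the alias makes the rewrite type-preserving, here is a minimal sketch of the relationship (illustrative; the real `c10/util/Optional.h` contains more machinery):
```
#include <optional>

namespace c10 {
template <class T>
using optional = std::optional<T>; // an alias, not a distinct type
} // namespace c10

// Before the codemod:
c10::optional<int> before = 42;
// After the codemod -- the exact same type, spelled directly:
std::optional<int> after = 42;
```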

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126135
Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/albanD, https://github.com/aaronenyeshi
2024-05-14 19:35:51 +00:00
cyy
77f2883c41 [Reland2] fix missing-prototypes warnings in torch_cpu (Part 4) (#102228)
This PR relands the changes introduced in PR https://github.com/pytorch/pytorch/pull/100849. The old PR turned nnc_* functions into static ones. We now add declarations for them and hope that internal builds will pass.
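
A hedged sketch of the contrast described above (the function name is illustrative, not one of the real nnc_* symbols):
```
// Option 1 (the reverted approach): internal linkage silences the warning
// but breaks external callers at link time.
// static void nnc_helper() {}

// Option 2 (this PR): keep external linkage and declare the function first.
void nnc_helper();   // declaration, normally placed in a header
void nnc_helper() {} // definition no longer triggers -Wmissing-prototypes
```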

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102228
Approved by: https://github.com/albanD
2023-06-02 22:04:44 +00:00
PyTorch MergeBot
32ce06a5ab Revert "[Reland] fix missing-prototypes warnings in torch_cpu (Part 4) (#101949)"
This reverts commit 4f2c007a1b.

Reverted https://github.com/pytorch/pytorch/pull/101949 on behalf of https://github.com/osalpekar due to As noted in @izaitsevfb's comment, we are still seeing linker errors, this time due to `nnc_prepacked_linear_clamp_run` being made a static function. ([comment](https://github.com/pytorch/pytorch/pull/101949#issuecomment-1560226880))
2023-05-23 22:53:47 +00:00
cyy
4f2c007a1b [Reland] fix missing-prototypes warnings in torch_cpu (Part 4) (#101949)
This PR relands the changes introduced in PR #100849. The old PR turned nnc_aten_embedding into a static function; however, it is actually used in torch/csrc/jit/tensorexpr/operators/misc.cpp.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101949
Approved by: https://github.com/albanD
2023-05-22 10:53:07 +00:00
PyTorch MergeBot
498c34e8e8 Revert " fix missing-prototypes warnings in torch_cpu (Part 4) (#100849)"
This reverts commit c2f28d1c1d.

Reverted https://github.com/pytorch/pytorch/pull/100849 on behalf of https://github.com/izaitsevfb due to fails internal Meta builds, including fbcode and android, see D46009888: ld.lld: error: undefined symbol: nnc_aten_embedding ([comment](https://github.com/pytorch/pytorch/pull/100849#issuecomment-1555105800))
2023-05-19 19:05:15 +00:00
cyy
c2f28d1c1d fix missing-prototypes warnings in torch_cpu (Part 4) (#100849)
This PR fixes more missing-prototypes violations in the torch_cpu source following PRs #100053, #100147 and #100245

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100849
Approved by: https://github.com/albanD
2023-05-18 03:49:45 +00:00
Kazuaki Ishizaki
88234540e7 Fix typo under torch/csrc/jit/tensorexpr directory (#97218)
This PR fixes typo in comments and messages under `torch/csrc/jit/tensorexpr` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97218
Approved by: https://github.com/davidberard98, https://github.com/jgong5, https://github.com/EikanWang, https://github.com/kit1980
2023-03-30 04:21:24 +00:00
Aaron Gokaslan
0247ed27cc Apply Clang-Tidy readability-container-size-empty (#93236)
Not only is this change usually shorter and more readable, it can also yield better performance: size() is not always a constant-time operation (for example, on linked lists), but empty() always is.
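
A minimal before/after of the pattern the check rewrites (illustrative, not code from the PR):
```
#include <list>

void drain(std::list<int>& work) {
  // Flagged by the check: comparing size() against zero. empty() is
  // guaranteed O(1); size() was permitted to be O(n) for std::list
  // before C++11, and is never cheaper than empty().
  // if (work.size() == 0) { return; }

  if (work.empty()) { return; } // shorter, clearer, never slower
  work.pop_front();
}
```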

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236
Approved by: https://github.com/malfet
2023-01-29 23:28:19 +00:00
Nikita Shulga
8f1c3c68d3 [BE] Use nested namespaces in .cpp/.cu files (#92100)
As we live in a C++17 world

This is a functional no-op, just
- `s/namespace at { namespace native {/namespace at::native {/`
- `s/namespace torch { namespace jit {/namespace torch::jit {/`
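
For reference, one such substitution in practice (illustrative sketch):
```
// Before (C++98-style nesting):
namespace torch { namespace jit {
void example_fn();
}} // namespace torch::jit

// After (C++17 nested namespace definition); behavior is unchanged:
namespace torch::jit {
void example_fn();
} // namespace torch::jit
```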

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92100
Approved by: https://github.com/izaitsevfb
2023-01-13 16:32:34 +00:00
Aaron Gokaslan
3916d7a575 Apply modernize-use-emplace to aten, c10, torch (#91077)
Apply the clang-tidy check modernize-use-emplace. This is slightly more efficient because it uses an in-place constructor, and it is the recommended style in the parts of the codebase covered by clang-tidy. This change just manually applies the check to the rest of the codebase. Pinging @ezyang as this is related to my other PRs he reviewed, like #89000
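
A minimal example of the pattern this check rewrites (illustrative, not code from the PR):
```
#include <string>
#include <utility>
#include <vector>

void fill(std::vector<std::pair<int, std::string>>& v) {
  // Flagged: constructs a temporary pair, then moves it into the vector.
  // v.push_back(std::make_pair(1, std::string("one")));

  // Preferred: the element is constructed in place in the vector's storage.
  v.emplace_back(1, "one");
}
```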

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91077
Approved by: https://github.com/ezyang
2022-12-19 07:49:56 +00:00
Aaron Gokaslan
7541c9f8be [Fix]: remove unnecessary copies in aten, c10, and torch bindings (#90629)
Applies various automated fixes that reduce the number of spurious copies in torch, aten, and c10. I also inlined any default dtors that would have made the type trivially destructible.
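
One common shape of such a fix (a hedged, illustrative example, not code from the PR):
```
#include <string>
#include <vector>

// The kind of spurious copy these automated fixes remove (illustrative):
void print_lengths(const std::vector<std::string>& names) {
  // Before: for (std::string s : names) { ... } // copies every element
  for (const std::string& s : names) {           // binds by reference
    (void)s.size();
  }
}
```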

Follow up to #89000

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90629
Approved by: https://github.com/ezyang
2022-12-12 17:05:52 +00:00
PyTorch MergeBot
41b54c303d Revert "Fix crash on unload torch cpu dll (#67632)"
This reverts commit a54c9a419e.

Reverted https://github.com/pytorch/pytorch/pull/67632 on behalf of https://github.com/ezyang due to crashing in fbcode
2022-08-02 00:56:18 +00:00
David Braun
a54c9a419e Fix crash on unload torch cpu dll (#67632)
Trying to rebase https://github.com/pytorch/pytorch/pull/61290 onto the latest pytorch:master
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67632
Approved by: https://github.com/ezyang
2022-07-31 21:37:56 +00:00
Nikita Shulga
f6c275f55d Remove -Wno-unused-variable from utils.cmake (take 2) (#75538)
Summary:
[Comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims it got added for consistency with the top-level CMakeLists.txt, but `-Wno-unused-variable` is not mentioned there.

Fix the violations in 50+ files that were added in the interim, either by removing unused variables or by decorating the code with `C10_UNUSED` when a local variable is likely used to extend an object's lifetime until the end of the block.
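
A sketch of the two kinds of fix (illustrative; the real `C10_UNUSED` macro lives in c10/macros/Macros.h and has per-compiler branches):
```
#define C10_UNUSED __attribute__((__unused__)) // GCC/Clang spelling, simplified

struct ScopedGuard {
  ScopedGuard() {}
  ~ScopedGuard() {} // the side effect at end of scope is the point
};

void example() {
  // Fix 1: a genuinely unused local is simply removed.
  // int dead = 0;

  // Fix 2: a local kept alive for its RAII side effects is annotated,
  // instead of suppressing the warning for the whole build:
  C10_UNUSED ScopedGuard guard;
}
```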

Caused preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538

Reviewed By: anjali411

Differential Revision: D35747333

Pulled By: malfet

fbshipit-source-id: 3fc5828e44a4c05ba0e89e92613e6ebbdb260626
(cherry picked from commit c179fba21cfa2a0093fad50ccad5a22dd7cff52c)
2022-04-20 17:41:59 +00:00
PyTorch MergeBot
5c56b2286b Revert "Remove -Wno-unused-variable from utils.cmake"
This reverts commit 018cbe1f5c.

Reverted https://github.com/pytorch/pytorch/pull/75538 on behalf of https://github.com/seemethere
2022-04-19 17:19:09 +00:00
Nikita Shulga
018cbe1f5c Remove -Wno-unused-variable from utils.cmake
[Comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims it got added for consistency with the top-level CMakeLists.txt, but `-Wno-unused-variable` is not mentioned there.

Fix the violations in 50+ files that were added in the interim, either by removing unused variables or by decorating the code with `C10_UNUSED` when a local variable is likely used to extend an object's lifetime until the end of the block.

Caused preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538
Approved by: https://github.com/cpuhrsch
2022-04-19 15:26:55 +00:00
Ivan Kobzarev
519e226b66 [tensorexp] ExternalCall2 without memcpy (#72225)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72225

Test Plan: Imported from OSS

Reviewed By: dagitses

Differential Revision: D33960933

Pulled By: IvanKobzarev

fbshipit-source-id: fc73a3de9e5150919e3806516065b4a6c8316000
(cherry picked from commit f637842c341e0ba94906a0c8a1efc81691dc512c)
2022-03-09 21:19:26 +00:00
Hui Guo
7abb7667a6 [tensorexpr] Add memory planning to reuse intermediate buffers (#66452)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66452

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D31557188

Pulled By: huiguoo

fbshipit-source-id: f18dfeba1df20d5d4f118640fc10782534eb9219
2021-12-17 01:38:02 -08:00
Hui Guo
bbfd7b75ca [tensorexpr] Move the allocation of intermediate buffers from TEK to CodeGen (#67143)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67143

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D31881151

Pulled By: huiguoo

fbshipit-source-id: 457e5d4ff8a15f70af9c797c9ab4803d8e779abe
2021-12-17 01:37:56 -08:00
Bert Maher
8cf047afac [nnc] Add call_with_numel interface for fast CUDA calls (#65213)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65213

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D31319012

Pulled By: bertmaher

fbshipit-source-id: 93fee80f956795470f5a2ce3b33c2ea2f132036f
2021-10-01 06:58:37 -07:00
Bert Maher
e7fb35021a [nnc] Enable fusion of bfloat16 ops (#64196)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64196

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D30643864

Pulled By: bertmaher

fbshipit-source-id: e95edeaf7089464d713ea1d1f951743d3e5f61c5
2021-08-30 20:09:36 -07:00
Mikhail Zolotukhin
1dc2b52764 [TensorExpr] Add a wrapper for all expr and stmt pointers. (#63195)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63195

This helps us to later switch from using KernelArena with raw pointers
to shared pointers without having to change all our source files at
once.

The changes are mechanical and should not affect any functionality.

With this PR, we're changing the following:
 * `Add*` --> `AddPtr`
 * `new Add(...)` --> `alloc<Add>(...)`
 * `dynamic_cast<Add*>` --> `to<Add>`
 * `static_cast<Add*>` --> `static_to<Add>`

Due to some complications with args forwarding, some places became more
verbose, e.g.:
 * `new Block({})` --> `new Block(std::vector<ExprPtr>())`
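
Put together, a toy model of the wrapper scheme (heavily simplified; the names `alloc` and `to` match the list above, everything else is illustrative):
```
#include <utility>

struct Expr { virtual ~Expr() = default; };
struct Add : Expr {};
using ExprPtr = Expr*; // later: std::shared_ptr<Expr>
using AddPtr = Add*;   // later: std::shared_ptr<Add>

template <class T, class... Args>
T* alloc(Args&&... args) { // stands in for `new T(...)`
  return new T(std::forward<Args>(args)...);
}

template <class T>
T* to(ExprPtr e) { // stands in for `dynamic_cast<T*>(...)`
  return dynamic_cast<T*>(e);
}

void demo() {
  ExprPtr e = alloc<Add>();
  if (AddPtr a = to<Add>(e)) { (void)a; }
  delete e;
}
```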

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D30292779

Pulled By: ZolotukhinM

fbshipit-source-id: 150301c7d2df56b608b035827b6a9a87f5e2d9e9
2021-08-17 13:44:45 -07:00
Raghavan Raman
59dd12042e [nnc] Removed const from all fields in IR. (#62336)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62336

This PR was generated by removing `const` from all node types in the NNC IR and fixing the compilation errors that resulted from this change.

This is the first step in making all NNC mutations in-place.
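
A toy contrast of what the change enables (names are hypothetical, not NNC's actual API):
```
// Previously nodes were handled through pointers-to-const, so a
// "mutation" had to allocate a replacement node:
struct Expr { int value = 0; };

// Before: const everywhere, rebuild on every change.
int read_only(const Expr* e) { return e->value; }

// After: const removed, mutators can update the node in place.
void bump(Expr* e) { e->value += 1; }
```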

Test Plan: Imported from OSS

Reviewed By: iramazanli

Differential Revision: D30049829

Pulled By: navahgar

fbshipit-source-id: ed14e2d2ca0559ffc0b92ac371f405579c85dd63
2021-08-03 11:44:36 -07:00
Mikhail Zolotukhin
c751e53800 [TensorExpr] Implement 'call_raw' in IREval. (#57882)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57882

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D28306752

Pulled By: ZolotukhinM

fbshipit-source-id: 11d0034f9bfbadf8483de90c457f952a2161f10b
2021-05-12 14:08:18 -07:00
Bert Maher
fce059d4ff [te] Don't throw when re-registering a CodeGen factory (#49174)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49174

We've seen this happening when libtorch is loaded repeatedly on macOS. Tbh I'm not sure I understand why this happens: why do we re-construct these static objects but re-use the static registry itself? But it's fairly straightforward to just overwrite the factory method, and there's no harm in doing so.
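
A hedged sketch of the behavior change (names and signatures are illustrative, not the real NNC registry):
```
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

using Factory = std::function<void()>; // placeholder signature

std::unordered_map<std::string, Factory>& registry() {
  // The registry itself is reused across repeated library loads.
  static std::unordered_map<std::string, Factory> r;
  return r;
}

void register_factory(const std::string& name, Factory f) {
  // Before: a duplicate registration threw.
  // After: the newer (equivalent) factory simply overwrites the older one.
  registry()[name] = std::move(f);
}
```
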
ghstack-source-id: 118306581

Test Plan: compile

Reviewed By: ZolotukhinM

Differential Revision: D25466642

fbshipit-source-id: 4c456a57407f23fa0c9f4e74975ed1186e790c74
2020-12-10 23:37:29 -08:00
Elias Ellison
664d2f48cf [NNC] Enable unary op cpu testing (#47374)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47374

A few small fixes needed to enable unary op CPU testing. If reviewers would prefer that I split them up, let me know.

Test Plan: Imported from OSS

Reviewed By: ansley

Differential Revision: D24805248

Pulled By: eellison

fbshipit-source-id: c2cfe2e3319a633e64da3366e68f5bf21d390cb7
2020-11-12 11:14:03 -08:00
Mikhail Zolotukhin
9b168a1fed [TensorExpr] Pick meaningful names for functions in TE codegen. (#47255)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47255

As a result of this change, the generated CUDA code for the following fusion group:
```
graph(%0 : Float(32, 32, 1, 1, strides=[32, 1, 1, 1], requires_grad=0, device=cuda:0),
      %1 : Float(32, 32, strides=[32, 1], requires_grad=0, device=cuda:0),
      %2 : Float(32, 32, 1, strides=[32, 1, 1], requires_grad=0, device=cuda:0)):
  %3 : int = prim::Constant[value=1]()
  %v1.1 : Float(32, 32, 32, strides=[1024, 32, 1], requires_grad=0, device=cuda:0) = aten::add(%1, %2, %3) # test/test_tensorexpr.py:155:0
  %5 : int = prim::Constant[value=1]()
  %6 : Float(32, 32, 32, 32, strides=[32768, 1024, 32, 1], requires_grad=0, device=cuda:0) = aten::add(%v1.1, %0, %5) # test/test_tensorexpr.py:156:0
  return (%6)
```

Would look like the following:
```
extern "C" __global__
void fused_add_add(float* t0, float* t1, float* t2, float* aten_add) {
{
  float v = __ldg(t1 + 32 * (((512 * blockIdx.x + threadIdx.x) / 32) % 32) + (512 * blockIdx.x + threadIdx.x) % 32);
  float v_1 = __ldg(t2 + ((512 * blockIdx.x + threadIdx.x) / 32) % 32 + 32 * (((512 * blockIdx.x + threadIdx.x) / 1024) % 32));
  float v_2 = __ldg(t0 + ((512 * blockIdx.x + threadIdx.x) / 1024) % 32 + 32 * ((512 * blockIdx.x + threadIdx.x) / 32768));
  aten_add[((((512 * blockIdx.x + threadIdx.x) / 32768) * 32768 + 32 * (((512 * blockIdx.x + threadIdx.x) / 32) % 32)) + 1024 * (((512 * blockIdx.x + threadIdx.x) / 1024) % 32)) + (512 * blockIdx.x + threadIdx.x) % 32] = (v + v_1) + v_2;
}
}
```

Previously we generated:
```
extern "C" __global__
void func(float* t0, float* t1, float* t2, float* aten_add) {
{
  float v = __ldg(t1 + 32 * (((512 * blockIdx.x + threadIdx.x) / 32) % 32) + (512 * blockIdx.x + threadIdx.x) % 32);
  float v_1 = __ldg(t2 + ((512 * blockIdx.x + threadIdx.x) / 32) % 32 + 32 * (((512 * blockIdx.x + threadIdx.x) / 1024) % 32));
  float v_2 = __ldg(t0 + ((512 * blockIdx.x + threadIdx.x) / 1024) % 32 + 32 * ((512 * blockIdx.x + threadIdx.x) / 32768));
  aten_add[((((512 * blockIdx.x + threadIdx.x) / 32768) * 32768 + 32 * (((512 * blockIdx.x + threadIdx.x) / 32) % 32)) + 1024 * (((512 * blockIdx.x + threadIdx.x) / 1024) % 32)) + (512 * blockIdx.x + threadIdx.x) % 32] = (v + v_1) + v_2;
}
}
```

Differential Revision: D24698273

Test Plan: Imported from OSS

Reviewed By: bertmaher

Pulled By: ZolotukhinM

fbshipit-source-id: 6da95c6ac3d5155ebfaaab4f84f55a24deb6d10d
2020-11-03 16:41:22 -08:00
Bert Maher
55ff9aa185 Test TE fuser unary ops and fix sigmoid(half) (#44094)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44094

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D23494950

Pulled By: bertmaher

fbshipit-source-id: 676c4e57267c4ad92065ea90b06323918dd5b0de
2020-09-03 12:48:46 -07:00
Mikhail Zolotukhin
ef50694d44 [TensorExpr] Apply GenericIntrinsicExpander recursively. (#42567)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42567

Before this change we didn't expand arguments, so in an expression like
`sigmoid(sigmoid(x))` only the outer call was expanded.
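
A self-contained toy of the bug and fix (not NNC's real classes; `lower` stands in for the generic intrinsic expansion):
```
#include <memory>
#include <vector>

struct Node {
  bool is_call = false;
  std::vector<std::shared_ptr<Node>> args;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr lower(NodePtr call); // stand-in for expanding one intrinsic call

NodePtr expand(NodePtr e) {
  if (!e->is_call) return e;
  for (auto& a : e->args) {
    a = expand(a); // the recursive step this change adds
  }
  return lower(e);
}
```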

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D22936177

Pulled By: ZolotukhinM

fbshipit-source-id: 9c05dc96561225bab9a90a407d7bcf9a89b078a1
2020-08-05 14:13:46 -07:00
Xiaoqiang Zheng
6bdfd6ae1a [TensorExpr] Fast sigmoid for LLVM (#39717)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39717

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D21949849

Pulled By: zheng-xq

fbshipit-source-id: f918bb2cb0ea647ce254fc51258af6fd01325f2d
2020-06-09 20:11:35 -07:00
Mikhail Zolotukhin
95833a49e6 [TensorExpr] Pull changes from bertmaher/pytorch_fusion. (#34842)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34842

This PR (hopefully the last one of its kind) merges changes from a
side branch where the tensor-expressions-based fuser work has been done so
far. It is a squashed version of the changes in that branch,
which is available here: https://github.com/bertmaher/pytorch

Differential Revision: D20478208

Test Plan: Imported from OSS

Pulled By: ZolotukhinM

fbshipit-source-id: 21556e009f1fd88099944732edba72ac40e9b9c0
2020-03-17 11:02:48 -07:00
Mikhail Zolotukhin
e31d462e92 [TensorExpr] Pull changes to core classes for representing expressions and statements from the side branch. (#34224)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34224

Our development has been happening on a side branch `pytorch_fusion` in
the `bertmaher/pytorch` fork. This PR moves over the changes to the core
classes representing expressions and the transformations on them.

At this moment, the tensor expressions are only used in tests.
Subsequent PRs add LLVM and CUDA codegen for tensor expressions and
implement fuser on top of these.

This PR is huge because it is a squashed version of the changes in the side
branch; pulling the changes in one by one is not practical. If you're
interested in the history of changes, please refer to https://github.com/bertmaher/pytorch

Differential Revision: D20251835

Test Plan: Imported from OSS

Pulled By: ZolotukhinM

fbshipit-source-id: 1a871acc09cf3c6f7fb4af40d408cdbb82dc7dab
2020-03-16 11:47:47 -07:00
Mikhail Zolotukhin
fc70fc3610 [TensorExpr] Add IR visitor, IR mutator, and IR evaluator. (#33219)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33219

Test Plan: Imported from OSS

Differential Revision: D19848381

Pulled By: ZolotukhinM

fbshipit-source-id: 44ca7cd99c25e290a8ffd8146785c19f9c785dfd
2020-02-21 13:10:22 -08:00