Commit Graph

412 Commits

Kazuaki Ishizaki
105f3b5f91 Fix typo under caffe2 directory (#110825)
This PR fixes the typo `the the` in comments in files under the `caffe2` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110825
Approved by: https://github.com/Skylion007
2023-10-08 20:48:12 +00:00
cyy
ac603bc2f8 [Reland] Eliminate invocations of c10::stoi,c10::stod,c10::stoull,c10::stoll (#109566)
This is a reland of #87603, with the definitions of c10::stoXX kept for further investigation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109566
Approved by: https://github.com/huydhn
2023-09-19 07:15:25 +00:00
PyTorch MergeBot
4d44d8c00a Revert "Eliminate c10::stoi,c10::stod,c10::stoull,c10::stoll (#109179)"
This reverts commit 852f1b8417.

Reverted https://github.com/pytorch/pytorch/pull/109179 on behalf of https://github.com/huydhn due to Sorry for reverting your change but this is breaking periodic buck build, so please fix the issue and reland the change https://github.com/pytorch/pytorch/actions/runs/6207458526/job/16852695272 ([comment](https://github.com/pytorch/pytorch/pull/109179#issuecomment-1724168571))
2023-09-18 18:41:12 +00:00
cyy
852f1b8417 Eliminate c10::stoi,c10::stod,c10::stoull,c10::stoll (#109179)
We can remove these functions in favor of std ones.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109179
Approved by: https://github.com/colesbury
2023-09-16 07:22:50 +00:00
Alan Ji
70b0f1b248 fix some typos (#106018)
Fix typos in `test_static_module.cc`, `backend_cutting_test.cc` and `types_base.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106018
Approved by: https://github.com/awgu
2023-07-26 18:14:44 +00:00
cyy
483f748dd5 [BE] Enforce missing override keyword (#104032)
This PR enables `-Winconsistent-missing-destructor-override` and `-Winconsistent-missing-override`
and fixes violations.

### <samp>🤖 Generated by Copilot at 47e904e</samp>

This pull request updates the code of various classes and operators in the `caffe2` and `aten` subdirectories to use the `override` specifier instead of the `virtual` keyword for destructors and other virtual functions that override a base class function. This improves the code readability, quality, and consistency with C++ best practices. It also modifies the `./CMakeLists.txt` file to enable warnings for these specifiers, but disable errors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104032
Approved by: https://github.com/malfet
2023-06-24 02:34:24 +00:00
PyTorch MergeBot
b5594f7df0 Revert "Use missing-prototypes in torch_cpu (#103725)"
This reverts commit 716b3b893d.

Reverted https://github.com/pytorch/pytorch/pull/103725 on behalf of https://github.com/osalpekar due to broken caffe2 builds. More info at [D46920675](https://www.internalfb.com/diff/D46920675) ([comment](https://github.com/pytorch/pytorch/pull/103725#issuecomment-1603129273))
2023-06-22 18:30:31 +00:00
cyy
716b3b893d Use missing-prototypes in torch_cpu (#103725)
This PR enables `-Wmissing-prototypes` in torch_cpu, except for some generated cpp files and the mps and metal backends.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103725
Approved by: https://github.com/albanD
2023-06-21 13:19:55 +00:00
Aleksei Nikiforov
eaa00017c8 S390x tests (#99871)
Disable tests that use quantized operators when QNNPACK is not available.

The two disabled tests use Int8FC operators, which are unavailable
without QNNPACK, and fail only for that reason.

Disable cpuid_test on s390x
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99871
Approved by: https://github.com/albanD
2023-04-26 21:48:03 +00:00
cyy
d0e4ca233e some reference and move fixes (#95942)
This PR introduces some modifications:
1. Const function parameters that can be passed by reference now are.
2. We identify more opportunities for passing by value and change the code accordingly.
3. Some use-after-move errors are fixed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95942
Approved by: https://github.com/Skylion007
2023-03-10 03:44:09 +00:00
Will Constable
4f34cd6d1e Replace all CHECK_ and DCHECK_ with TORCH_* macros (#82032)
Avoid exposing defines that conflict with google logging, since this blocks external usage of libtorch in certain cases.

All the 'interesting' changes should be in these two files, and the rest should just be mechanical changes via sed.
c10/util/logging_is_not_google_glog.h
c10/util/logging_is_google_glog.h

Fixes https://github.com/pytorch/pytorch/issues/81415

cc @miladm @malfet
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82032
Approved by: https://github.com/soumith, https://github.com/miladm
2022-07-26 01:20:44 +00:00
zhang, xiaobing
86b86202b5 fix torch.config can't respect USE_MKLDNN flag issue (#75001)
Fixes https://github.com/pytorch/pytorch/issues/74949, which reports that torch.config does not respect the USE_MKLDNN flag.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75001
Approved by: https://github.com/malfet
2022-07-17 15:00:48 +00:00
Michael Andreas Dagitses
606b234336 turn on -Werror=unused-function in our Bazel CPU build
Summary:
We also fix any existing issues. Note that we only do this for the CPU
build because nvcc is considered a C++ toolchain but it does not have
the same flag support. Adding flags to the GPU build will cause nvcc
errors.

Test Plan: Built locally, rely on CI to confirm.

Reviewers: malfet

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79154

Approved by: https://github.com/seemethere, https://github.com/osalpekar, https://github.com/albanD
2022-06-10 22:11:54 +00:00
PyTorch MergeBot
bcd7a20953 Revert "turn on -Werror=unused-function in our Bazel CPU build"
This reverts commit 67d313a032.

Reverted https://github.com/pytorch/pytorch/pull/79154 on behalf of https://github.com/malfet due to Breaks bazel build: 67d313a032
2022-06-10 20:43:03 +00:00
Michael Andreas Dagitses
67d313a032 turn on -Werror=unused-function in our Bazel CPU build
Summary:
We also fix any existing issues. Note that we only do this for the CPU
build because nvcc is considered a C++ toolchain but it does not have
the same flag support. Adding flags to the GPU build will cause nvcc
errors.

Test Plan: Built locally, rely on CI to confirm.

Reviewers: malfet

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79154

Approved by: https://github.com/seemethere, https://github.com/osalpekar, https://github.com/albanD
2022-06-10 18:30:08 +00:00
Yulv-git
ac2d2e3a3d Fix some typos.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75561
Approved by: https://github.com/albanD
2022-04-11 21:55:59 +00:00
Nikita Shulga
f6e7a2ab64 Fix sign-compare in caffe2 cpp tests
Prerequisite change for enabling `-Werror=sign-compare` across PyTorch repo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75084

Approved by: https://github.com/ngimel
2022-04-05 00:08:05 +00:00
Ankur Singla
2539b6a984 [DistributedInference] Relax the assertion for uniqueness of blob name across external inputs and outputs (#72492)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72492

Having the same blob name present in both external inputs and external outputs is a valid case, so we relax the validation accordingly.

Reviewed By: yyetim

Differential Revision: D34062055

fbshipit-source-id: 6772ef9c3259da221207d14e5cc93a7777002ef2
(cherry picked from commit 0de66a2941)
2022-02-09 21:50:47 +00:00
Richard Barnes
1622546050 use irange for loops (#70248)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70248

Modified loops in files under fbsource/fbcode/caffe2/ from the format
```
for(TYPE var=x0;var<x_max;x++)
```
to the format
```
for(const auto var: irange(xmax))
```

This was achieved by running r-barnes's loop upgrader script (D28874212) with some modification to exclude all files under /torch/jit and a number of reversions or unused variable suppression warnings added by hand.

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D32813863

fbshipit-source-id: 527244b4a2b220fdfe7f17dee3599603f492a2ca
2022-01-06 23:14:29 -08:00
Ramanpreet Nara
f587267dc7 Revert D31705359: use irange for loops 8
Test Plan: revert-hammer

Differential Revision:
D31705359 (17e5200441)

Original commit changeset: c9ea2fbc0f9c

fbshipit-source-id: 08fff2d12beca953ad30dd0baabf86e39ac84f14
2021-12-02 12:55:08 -08:00
Richard Barnes
17e5200441 use irange for loops 8 (#66743)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66743

Modified loops in files under fbsource/fbcode/caffe2/ from the format

`for(TYPE var=x0;var<x_max;x++)`

to the format

`for(const auto var: irange(xmax))`

This was achieved by running r-barnes's loop upgrader script (D28874212) with some modification to exclude all files under /torch/jit and a number of reversions or unused variable suppression warnings added by hand.

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D31705359

fbshipit-source-id: c9ea2fbc0f9cd29e97a52dcb203addc5f2abb09b
2021-12-02 10:21:29 -08:00
Xue Li
2f099c7555 Revert D30652629: use irange for loops
Test Plan: revert-hammer

Differential Revision:
D30652629 (687c2267d4)

Original commit changeset: 0ae6c4bbbb55

fbshipit-source-id: 5c4f067b584a021c8c9656454d1ee60999600fb3
2021-10-15 15:23:10 -07:00
Richard Barnes
687c2267d4 use irange for loops (#66234)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66234

Modified loops in files under fbsource/fbcode/caffe2/ from the format

`for(TYPE var=x0;var<x_max;x++)`

to the format

`for(const auto var: irange(xmax))`

This was achieved by running r-barnes's loop upgrader script (D28874212) with some modification to exclude all files under /torch/jit and a number of reversions or unused variable suppression warnings added by hand.

bypass_size_limit
allow-large-files

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D30652629

fbshipit-source-id: 0ae6c4bbbb554bad42e372792a6430e1acf15e3e
2021-10-15 13:50:33 -07:00
Nikita Shulga
4c4525fa5c Compile without -Wno-unused-variable (take 2) (#66041)
Summary:
Delete `-Wno-unused-variable` from top level `CMakeLists.txt`
Still suppress those warnings for tests and `torch_python`

Delete a number of unused variables from caffe2 code
Use `(void)var;` to suppress unused variable in range loops
Use `C10_UNUSED` for global constructors and use `constexpr` instead of `static` for global constants

Do not delete `caffe2::OperatorBase::Output` calls as they have side effects

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66041

Reviewed By: ngimel

Differential Revision: D31360142

Pulled By: malfet

fbshipit-source-id: 6fdfb9f91efdc49ca984a2f2a17ee377d28210c8
2021-10-04 20:39:39 -07:00
Nikita Shulga
e4ee5ca698 Revert D31326599: [pytorch][PR] Compile without -Wno-unused-variable
Test Plan: revert-hammer

Differential Revision:
D31326599 (a6280ab653)

Original commit changeset: 924155f1257a

fbshipit-source-id: b8ee5bc0298637443232f5ee9ec79e51ed256faf
2021-10-01 20:40:47 -07:00
Nikita Shulga
a6280ab653 Compile without -Wno-unused-variable (#65954)
Summary:
Delete `-Wno-unused-variable` from top level `CMakeLists.txt`
Still suppress those warnings for tests and `torch_python`

Delete a number of unused variables from caffe2 code
Use `(void)var;` to suppress unused variable in range loops
Use `C10_UNUSED` for global constructors and use `constexpr` instead of `static` for global constants

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65954

Reviewed By: ngimel

Differential Revision: D31326599

Pulled By: malfet

fbshipit-source-id: 924155f1257a2ba1896c50512f615e45ca1f61f3
2021-10-01 17:40:47 -07:00
Janet Yang
10f6294281 Fix shape inference dim_type for Clip, Mean, Div (#65996)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65996

Test Plan:
Facebook
```
buck build caffe2/caffe2/opt:bound_shape_inference_test && ./buck-out/gen/caffe2/caffe2/opt/bound_shape_inference_test --gtest_filter=*Clip*
```
```
buck build caffe2/caffe2/opt:bound_shape_inference_test && ./buck-out/gen/caffe2/caffe2/opt/bound_shape_inference_test --gtest_filter=*Div*
```
```
buck build caffe2/caffe2/opt:bound_shape_inference_test && ./buck-out/gen/caffe2/caffe2/opt/bound_shape_inference_test --gtest_filter=*Mean*
```

Reviewed By: yinghai

Differential Revision: D31121298

fbshipit-source-id: f366d8f4d4d0be159b62bfaafc42ca924c05e022
2021-10-01 17:34:34 -07:00
Nikita Shulga
709ac6853a Fix warnings (#62930)
Summary:
Add `-Wno-writable-strings` (which is clang's flavor of `-Wwrite-strings`) to the list of warnings ignored while compiling torch_python.
Avoid unnecessary copies in range loops.
Fix a number of signed-unsigned comparisons.

Found while building locally on M1

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62930

Reviewed By: albanD

Differential Revision: D30171981

Pulled By: malfet

fbshipit-source-id: 25bd43dab5675f927ca707e32737ed178b04651e
2021-08-11 14:07:10 -07:00
Nikita Shulga
30214aef2d [BE] irangefy (#62928)
Summary:
Replace plain for loops with `irange` loops. Also fix some unused-variable warnings in range-loop cases.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62928

Reviewed By: driazati

Differential Revision: D30171904

Pulled By: malfet

fbshipit-source-id: 1b437a0f7e3515f4a2e324f3450e93312f1933ae
2021-08-07 13:34:13 -07:00
Janet Yang
962841b532 Fix subnet counting and re-enable check for multiple onnxifi ops in AOT (#62033)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62033

Count the number of onnxifi ops rather than just the number of subnets, since a subnet smaller than min_ops is not turned into an onnxifi op.

Test Plan:
Runs which ran into the "Did not find a partition with an SLS node" error now report "multiple onnxifi ops found"
From https://fb.workplace.com/groups/527892364588452/permalink/807802049930814/:
```
buck run mode/opt-clang -c python.package_style=inplace sigrid/predictor/scripts:rerun_aot -- --manifold_url="https://manifold.facebook.net/v0/read/tree/2021-06-30/onnxifi_caffe2_net_aot_input_arguments_01-55-32_711d9476?bucketName=dper3_job_meta&apiKey=dper3_job_meta-key&timeoutMsec=5000&withPayload=1"

```
Reran some failures from last week which now pass AOT:
From https://fb.workplace.com/groups/527892364588452/permalink/807802049930814/,
https://fb.workplace.com/groups/243933520351820/permalink/572715897473579/

```
buck run mode/opt-clang -c python.package_style=inplace sigrid/predictor/scripts:rerun_aot -- --manifold_url="https://manifold.facebook.net/v0/read/tree/2021-07-09/onnxifi_caffe2_net_aot_input_arguments_05-31-08_ef5393a6?bucketName=dper3_job_meta&apiKey=dper3_job_meta-key&timeoutMsec=5000&withPayload=1"
```
```
buck run mode/opt-clang -c python.package_style=inplace sigrid/predictor/scripts:rerun_aot -- --manifold_url="https://manifold.facebook.net/v0/read/tree/2021-07-12/onnxifi_caffe2_net_aot_input_arguments_14-44-34_cfdf3053?bucketName=dper3_job_meta&apiKey=dper3_job_meta-key&timeoutMsec=5000&withPayload=1"
```
```
buck run mode/opt-clang -c python.package_style=inplace sigrid/predictor/scripts:rerun_aot -- --manifold_url="https://manifold.facebook.net/v0/read/tree/2021-07-13/onnxifi_caffe2_net_aot_input_arguments_04-03-30_162e7e53?bucketName=dper3_job_meta&apiKey=dper3_job_meta-key&timeoutMsec=5000&withPayload=1"
```

Reviewed By: khabinov

Differential Revision: D29796893

fbshipit-source-id: e9de7529ef86745207d41643d0fbe932fa166437
2021-07-26 16:08:51 -07:00
Richard Barnes
ee44d73e59 Modernize override (#61744)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61744

Test Plan: Sandcastle

Reviewed By: malfet

Differential Revision: D29717320

fbshipit-source-id: 6eea4295ee2e5572ab337620be412376fcc2f3cc
2021-07-23 23:04:46 -07:00
Nikita Shulga
a9b0a921d5 Disable avoid-non-const-global-variables lint check (#62008)
Summary:
The GoogleTest `TEST` macro is non-compliant with this check, as is `DEFINE_DISPATCH`.

All changes but the ones to `.clang-tidy` are generated using following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`;  do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008

Reviewed By: driazati, r-barnes

Differential Revision: D29838584

Pulled By: malfet

fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
2021-07-22 18:04:40 -07:00
Janet Yang
86eac5b456 [caffe2] Check for number of created subnets and optionally throw an error (#57366)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57366

We often get error messages such as
```
Model failed AOT (glow ahead-of-time compilation) with exception: Error during AOT optimization (non-provisioned addNetwork):
Non-recoverable device error when adding network:
Error code: PARTITIONER_ERROR
Error message: Did not find a partition with an SLS node

Error return stack:
--------------------------------------------------------------------------------
glow/glow/lib/Partitioner/Partitioner.cpp:1244
--------------------------------------------------------------------------------
glow/glow/lib/Runtime/HostManager/HostManager.cpp:375
--------------------------------------------------------------------------------
```
This makes the error message clearer by checking the number of OnnxifiOps created before going into Glow. The check is enabled with the `verify_only_single_subnet` flag and is disabled by default.

Test Plan: Unit tests pass

Reviewed By: khabinov

Differential Revision: D28097674

fbshipit-source-id: 0eefd8f6ec1a82546b759be8e541256bf271a673
2021-07-08 14:29:03 -07:00
Venkata Chintapalli
580831bfbb Add support for MatMul to BatchMatMulFP16Acc{16,32}Fake Op Mapping
Test Plan: f276981395

Reviewed By: hx89

Differential Revision: D28815646

fbshipit-source-id: c16b081bf3da2b157b9d42ea67b03dae88e82c6d
2021-06-02 08:32:21 -07:00
Oleg Khabinov
cab4849463 [caffe2][glow] Share info about current batch_size (#58902)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58902

Pull Request resolved: https://github.com/pytorch/glow/pull/5681

Reviewed By: ChunliF

Differential Revision: D28665162

fbshipit-source-id: 39e173a24ee247bc6fee44009798c74dddb27648
2021-06-01 01:21:42 -07:00
Edmund Chang
dd3bd0286b T89509943 - Improve error message during Glow ONNXIFI (#58069)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58069

We want the error message to tell the user that status 5821 means ONNXIFI_EVENT_STATE_NONSIGNALLED.

Added that status code to the mapping and the error message output.

Reviewed By: hl475

Differential Revision: D28359864

fbshipit-source-id: 87f50ddd4ded9ced03ec6af6a1a4ef85bd2195d6
2021-05-13 09:02:36 -07:00
Nikita Shulga
3a66a1cb99 [clang-tidy] Exclude cppcoreguidelines-avoid-magic-numbers (#57841)
Summary:
Add cppcoreguidelines-avoid-magic-numbers exclusion to clang-tidy
Remove existing nolint warnings using following script:
```
for file in `git ls-files | grep -v \.py`; do gsed '/^ *\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-magic-numbers)/d' -i  $file; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57841

Reviewed By: samestep

Differential Revision: D28295045

Pulled By: malfet

fbshipit-source-id: 7c6e8d1213c9593f169ed3df6a916498f1a97163
2021-05-07 20:02:33 -07:00
Han Li
6d3bb01b1a Sequence Blob NVM Reader to Selectively NVMify Ads Embeddings in A*
Summary:
This diff enables mapping a selected set of Ads embeddings to the T17 host on hierarchical memory (nvmify). To achieve that, the following is implemented:

- Allow the OTHER net to be both onnxified and nvmified
  - For that, an allowlist placement policy is added to the nvmify stack
  - onnxifi_transform is lightly updated to accept a blacklist of operators based on name
  - The nvm transform is broken into two parts: op replacement and blob update.
  - A derived class `SeqBlobNVMReader` is defined, which adds the functionality to load blobs to the card or nvm.

Test Plan:
* Unit test
* Run predictor replayer: selectively load the following ads embedding to NVM as in `--caffe2_nvm_dram_placement_file=/home/hanli/nvm_allowlist`:
```
SPARSE_AD_ACCOUNT_ID
SPARSE_NEW_AD_ID_COARSE
SPARSE_NEW_AD_ID_REFINED
SPARSE_NEW_CAMPAIGN_ID
SPARSE_NEW_TARGET_ID
SPARSE_NEW_AD_CLUSTER_ID
SPARSE_NEW_PAGE_ID
SPARSE_NEW_STORY_ID
SPARSE_NEW_VIDEO_ID
SPARSE_ENTITY_EQUIVALENCE_KEY
SPARSE_ENTITY_EQUIVALENCE_KEY_NO_CREATIVE
```
major parameter change in sigrid_remote_predictor_glow_nnpi:
```
--caffe2_nets_to_nvmify=DISAGG_ACC_REMOTE_OTHER \
--caffe2_nvm_sls_ops=SparseLengthsSumFused8BitRowwise,SparseLengthsWeightedSumFused8BitRowwise,SparseLengthsSumFused4BitRowwise,SparseLengthsWeightedSumFused4BitRowwise,SparseLengthsSum4BitRowwiseSparse \
--caffe2_nvm_table_path=/home/hanli/tables/225412100_2870/ \
--caffe2_nvm_dram_placement_file=/home/hanli/nvm_allowlist \
--caffe2_nvm_dram_placement_policy=by_file_allowlist \
--caffe2_predictor_nets_to_load=DISAGG_ACC_REMOTE_OTHER
```
In predictor log, observe that the blobs to be NVMified are transformed in op types, skipped in Onnxifi transform, and deferred loaded and do NVM net transform:
```
I0416 09:59:29.550690 662344 Nvmifier.cpp:142] ^[[92mReplacing SparseLengthsSumFused4BitRowwise with NVM variant.^[[0m
I0416 09:59:29.550701 662344 Nvmifier.cpp:142] ^[[92mReplacing SparseLengthsSumFused4BitRowwise with NVM variant.^[[0m
I0416 09:59:29.550705 662344 Nvmifier.cpp:142] ^[[92mReplacing SparseLengthsSumFused4BitRowwise with NVM variant.^[[0m
I0416 09:59:29.550712 662344 Nvmifier.cpp:142] ^[[92mReplacing SparseLengthsSumFused4BitRowwise with NVM variant.^[[0m
I0416 09:59:29.550715 662344 Nvmifier.cpp:142] ^[[92mReplacing SparseLengthsSumFused4BitRowwise with NVM variant.^[[0m
I0416 09:59:29.550721 662344 Nvmifier.cpp:142] ^[[92mReplacing SparseLengthsSumFused4BitRowwise with NVM variant.^[[0m

...
I0416 09:59:31.665369 662344 onnxifi_transformer.cc:1097] Skipping blocklisted op SparseLengthsSumFused4BitRowwiseNVM at pos 770
I0416 09:59:31.667042 662344 onnxifi_transformer.cc:1097] Skipping blocklisted op SparseLengthsSumFused4BitRowwiseNVM at pos 777
I0416 09:59:31.667294 662344 onnxifi_transformer.cc:1097] Skipping blocklisted op SparseLengthsSumFused4BitRowwiseNVM at pos 779
I0416 09:59:31.668828 662344 onnxifi_transformer.cc:1097] Skipping blocklisted op SparseLengthsSumFused4BitRowwiseNVM at pos 786
I0416 09:59:31.668843 662344 onnxifi_transformer.cc:1097] Skipping blocklisted op SparseLengthsSumFused4BitRowwiseNVM at pos 787
I0416 09:59:31.669909 662344 onnxifi_transformer.cc:1097] Skipping blocklisted op SparseLengthsSumFused4BitRowwiseNVM at pos 792

...

I0416 10:01:09.087282 662344 Nvmifier.cpp:346]  found the name: table0
I0416 10:01:09.373975 662344 Nvmifier.cpp:374] ^[[96mSaved /home/hanli/tables/225412100_2870/table0^[[0m
I0416 10:01:09.376008 662344 Nvmifier.cpp:343]  filename: sparse_nn_sparse_arch_SPARSE_NEW_AD_ID_COARSE_dedicated_13_w_EmbeddingFusedUint4Quantization
..

I0416 10:11:05.310854 662344 Nvmifier.cpp:161] ^[[95mNVMifying the model.^[[0m
I0416 10:11:05.310887 662344 Nvmifier.cpp:185]  found the name: table0 for sparse_nn_sparse_arch_SPARSE_NEW_AD_ID_COARSE_dedicated_13_w_EmbeddingFusedUint4Quantization
I0416 10:11:07.580587 662344 Nvmifier.cpp:185]  found the name: table4 for sparse_nn_sparse_arch_SPARSE_AD_ACCOUNT_ID_dedicated_20_w_EmbeddingFusedUint4Quantization
I0416 10:11:07.580648 662344 Nvmifier.cpp:185]  found the name: table3 for sparse_nn_sparse_arch_SPARSE_ENTITY_EQUIVALENCE_KEY_dedicated_22_w_EmbeddingFusedUint4Quantization
I0416 10:11:07.580667 662344 Nvmifier.cpp:185]  found the name: table5 for sparse_nn_sparse_arch_SPARSE_NEW_TARGET_ID_dedicated_29_w_EmbeddingFusedUint4Quantization
I0416 10:11:07.580682 662344 Nvmifier.cpp:185]  found the name: table2 for sparse_nn_sparse_arch_SPARSE_NEW_AD_ID_REFINED_dedicated_30_w_EmbeddingFusedUint4Quantization
I0416 10:11:07.580695 662344 Nvmifier.cpp:185]  found the name: table1 for sparse_nn_sparse_arch_SPARSE_NEW_STORY_ID_dedicated_35_w_EmbeddingFusedUint4Quantization

```
Make sure model is properly loaded:
```
I0415 21:42:48.400249 873685 ModelManagerBase.cpp:806] Loaded 225412100_2870 in 730944 ms (63800 ms of IO)  memory used 8744167456 byte(s)
```
* Only load user embedding to NVM to make sure baseline use case is not broken by this diff:
```
--caffe2_nets_to_nvmify=DISAGG_ACC_REMOTE_REQUEST_ONLY \
--caffe2_nvm_sls_ops=SparseLengthsSumFused8BitRowwise,SparseLengthsWeightedSumFused8BitRowwise,SparseLengthsSumFused4BitRowwise,SparseLengthsWeightedSumFused4BitRowwise,SparseLengthsSum4BitRowwiseSparse \
--caffe2_nvm_table_path=/home/hanli/tables/225412100_2870/
```
Make sure model is loaded:
```
Loaded 225412100_2870 in 381139 ms (56313 ms of IO)  memory used 7043933560 byte(s)
```
* Run feed replayer: `buck-out/gen/sigrid/feed/prediction_replayer/fully_remote_replayer_main --use_new_encoding_for_ads_services --use_new_encoding_from_model_id_to_shard_id --request_file_path /data/users/hanli/f266405843.requests --model_id=265540157_0 --replayer_thread_count=30 --sigrid_predictor_single_host=2401:db00:272c:602e:face:0:10:0 --sigrid_predictor_single_port=7444 --num_iterations=5 --qps=100 --client_name=predictor_v1` (load predictor as in P411172400)
Output:
```
I0428 21:20:25.106635 1396182 FullyRemoteReplayer.cpp:107] Loading requests from /data/users/hanli/f266405843.requests
I0428 21:20:25.547982 1396182 FullyRemoteReplayer.cpp:109] Requests size : 6699
I0428 21:20:25.548146 1396182 Client.cpp:274] V1 tier name:  V2 tier name: sigrid.predictor.fully_remote_test V2 fully remote tier name:
I0428 21:20:25.548153 1396182 Client.cpp:282] [MF] Migration Framework (traffic routing) enabled: false
I0428 21:20:25.548172 1396182 ModelRemoteStatus.cpp:206] Selection probabilities znode path: /configerator-gz/.prn
I0428 21:20:25.674162 1396265 ModelRemoteStatus.cpp:612] Found 0 host, 0 shards in predictor tier
I0428 21:20:25.674181 1396182 ModelRemoteStatus.cpp:557] Refresh sigrid model succeeded: 1
I0428 21:21:26.252820 1396265 ModelRemoteStatus.cpp:612] Found 0 host, 0 shards in predictor tier
I0428 21:21:26.252851 1396265 ModelRemoteStatus.cpp:557] Refresh sigrid model succeeded: 1
I0428 21:22:22.225976 1396182 PredictionReplayer.cpp:67] Previous request took too long, not reaching target QPS
I0428 21:22:26.252643 1396265 ModelRemoteStatus.cpp:612] Found 0 host, 0 shards in predictor tier
I0428 21:22:26.252678 1396265 ModelRemoteStatus.cpp:557] Refresh sigrid model succeeded: 1
I0428 21:23:26.252959 1396265 ModelRemoteStatus.cpp:612] Found 0 host, 0 shards in predictor tier
I0428 21:23:26.252987 1396265 ModelRemoteStatus.cpp:557] Refresh sigrid model succeeded: 1
I0428 21:24:26.253135 1396265 ModelRemoteStatus.cpp:612] Found 0 host, 0 shards in predictor tier
I0428 21:24:26.253166 1396265 ModelRemoteStatus.cpp:557] Refresh sigrid model succeeded: 1
I0428 21:25:27.252734 1396265 ModelRemoteStatus.cpp:612] Found 0 host, 0 shards in predictor tier
I0428 21:25:27.252763 1396265 ModelRemoteStatus.cpp:557] Refresh sigrid model succeeded: 1
I0428 21:26:03.172894 1396182 FullyRemoteReplayer.cpp:59] cpu time p25, p50, p75, p95, p99 9570 13011 16218 20788 24840
I0428 21:26:03.172927 1396182 FullyRemoteReplayer.cpp:61] wait time p25, p50, p75, p95, p99 11845 15958 19946 26579 31842
I0428 21:26:03.172940 1396182 FullyRemoteReplayer.cpp:63] wall time p25, p50, p75, p95, p99 16194 20888 25303 31692 37387
```

Reviewed By: ehsanardestani

Differential Revision: D27701121

fbshipit-source-id: e898abc6957c839e402a9763172cf85d9bb84cbd
2021-05-03 15:21:13 -07:00
Nikita Shulga
4cb534f92e Make PyTorch code-base clang-tidy compliant (#56892)
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892

Reviewed By: H-Huang

Differential Revision: D27991944

Pulled By: malfet

fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
2021-04-28 14:10:25 -07:00
Igor Sugak
e1a7ec3c4f [caffe2] fix -Wrange-loop-construct
Test Plan:
```
% jf get -u D27943111
% buck build mode/dev-nosan admarket/adfinder:adfinder admarket/adindexer:adindexer \
  -c cxx.extra_cxxflags='-Wno-implicit-const-int-float-conversion -Wno-sign-compare -Wno-deprecated-copy -Wno-deprecated-declarations -Wno-pass-failed' \
  -c cxx.compiler_variant=clang-12 \
  -c cxx.modules=false
```

Reviewed By: hlu1

Differential Revision: D27988238

fbshipit-source-id: 304e44bfa141a1bcb291f9434fed514bbb568f8f
2021-04-26 13:27:59 -07:00
Chunli Fu
7929bc76a0 [shape inference] Fix dim type for Cast
Summary: ATT

Test Plan: unit test

Reviewed By: yinghai

Differential Revision: D27904584

fbshipit-source-id: b62d2eb5da0be79091c82e6300dd0c075a0bf2fe
2021-04-21 03:21:56 -07:00
Chunli Fu
00737efdb2 [shape inference] Add shape inference func for Bucketize
Summary: ATT, to ensure the output has the same dim type as the input. We need to find a more generic way, though...

Test Plan: unit test

Reviewed By: ipiszy, khabinov

Differential Revision: D27690748

fbshipit-source-id: e53832c67b8ac86973c288d2d6b76ef8e5db14b9
2021-04-13 05:59:40 -07:00
Oleg Khabinov
28531c97b2 [caffe2] Shape inference for Transpose (#55188)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55188

We need to make sure dim types are preserved after applying Transpose.

Test Plan:
```
$ buck build caffe2/caffe2/opt:bound_shape_inference_test && ./buck-out/gen/caffe2/caffe2/opt/bound_shape_inference_test --gtest_filter=*Transpose*
```

Reviewed By: yinghai

Differential Revision: D27514487

fbshipit-source-id: 431b7f2d08664f2ec311a733c926dbb52c63a7d4
2021-04-02 17:43:27 -07:00
Sam Estep
5bcbbf5373 Lint trailing newlines (#54737)
Summary:
*Context:* https://github.com/pytorch/pytorch/issues/53406 added a lint for trailing whitespace at the ends of lines. However, in order to pass FB-internal lints, that PR also had to normalize the trailing newlines in four of the files it touched. This PR adds an OSS lint to normalize trailing newlines.

The changes to the following files (made in 54847d0adb9be71be4979cead3d9d4c02160e4cd) are the only manually-written parts of this PR:

- `.github/workflows/lint.yml`
- `mypy-strict.ini`
- `tools/README.md`
- `tools/test/test_trailing_newlines.py`
- `tools/trailing_newlines.py`

I would have liked to make this just a shell one-liner like the other three similar lints, but nothing I could find quite fit the bill. Specifically, all the answers I tried from the following Stack Overflow questions were far too slow (at least a minute and a half to run on this entire repository):

- [How to detect file ends in newline?](https://stackoverflow.com/q/38746)
- [How do I find files that do not end with a newline/linefeed?](https://stackoverflow.com/q/4631068)
- [How to list all files in the Git index without newline at end of file](https://stackoverflow.com/q/27624800)
- [Linux - check if there is an empty line at the end of a file [duplicate]](https://stackoverflow.com/q/34943632)
- [git ensure newline at end of each file](https://stackoverflow.com/q/57770972)
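A checker along these lines can be fast because it only needs to read the last two bytes of each file rather than scan it. This is an illustrative sketch of the idea, not the actual `tools/trailing_newlines.py`:

```python
import os

def correct_trailing_newline(path):
    """Return True iff the file is empty or ends in exactly one '\\n'
    (no missing final newline, no trailing blank line). Reads at most
    the last two bytes, so it stays fast on large files."""
    size = os.path.getsize(path)
    if size == 0:
        return True
    with open(path, "rb") as f:
        f.seek(max(size - 2, 0))
        tail = f.read()
    if not tail.endswith(b"\n"):
        return False                 # missing final newline
    return tail[-2:] != b"\n\n"      # reject trailing blank line
```
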

To avoid giving false positives during the few days after this PR is merged, we should probably only merge it after https://github.com/pytorch/pytorch/issues/54967.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54737

Test Plan:
Running the shell script from the "Ensure correct trailing newlines" step in the `quick-checks` job of `.github/workflows/lint.yml` should print no output and exit in a fraction of a second with a status of 0. That was not the case prior to this PR, as shown by this failing GHA workflow run on an earlier draft of this PR:

- https://github.com/pytorch/pytorch/runs/2197446987?check_suite_focus=true

In contrast, this run (after correcting the trailing newlines in this PR) succeeded:

- https://github.com/pytorch/pytorch/pull/54737/checks?check_run_id=2197553241

To unit-test `tools/trailing_newlines.py` itself (this is run as part of our "Test tools" GitHub Actions workflow):
```
python tools/test/test_trailing_newlines.py
```

Reviewed By: malfet

Differential Revision: D27409736

Pulled By: samestep

fbshipit-source-id: 46f565227046b39f68349bbd5633105b2d2e9b19
2021-03-30 13:09:52 -07:00
Huamin Li
21f9a6da7d Avoid creating a copy of statusString on every inference (#53756)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53756

as title

Reviewed By: yinghai

Differential Revision: D26949450

fbshipit-source-id: a737ce1ed25cf53faef8cdc94912542769a1008f
2021-03-10 16:58:02 -08:00
Eric Zeng
8a6df06a0e Print onnxifi failed status code in readable format (#53648)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53648

Reviewed By: hl475

Differential Revision: D26838564

fbshipit-source-id: 6e0e5695a58422d573f9c97bfb241bce2688f13b
2021-03-09 21:34:57 -08:00
Oleg Khabinov
0d04e51233 [caffe2] Add an optimization to avoid extra fp32->fp16 conversions in Onnxifi (#53560)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53560

If an op like Fused8BitRowwiseQuantizedToFloat ends up on CPU, Tile ends up on an accelerator, and only FP16 is supported, then we want to make sure the conversion from FP32 to FP16 is done on the CPU to save cycles on the accelerator.
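The idea can be sketched as a pass over the partitioned net: wherever a tensor produced on CPU is consumed by an op placed on an FP16-only accelerator, emit the FP32→FP16 cast as a CPU op so the accelerator never pays for it. The `(name, inputs, outputs)` tuples, the `FloatToHalf` placement, and the device map below are simplified assumptions, not caffe2's real NetDef structures:

```python
def place_casts(ops, device_of, fp16_only_devices):
    """Insert an FP32->FP16 cast (run on CPU) in front of every tensor
    that crosses from a CPU producer to an FP16-only accelerator consumer.
    `ops` is an ordered list of (name, inputs, outputs) tuples."""
    producer_device = {}
    out = []
    for name, inputs, outputs in ops:
        dev = device_of[name]
        for t in list(inputs):
            if dev in fp16_only_devices and producer_device.get(t) == "CPU":
                # The cast itself runs on CPU, before the tensor is shipped.
                out.append(("FloatToHalf", [t], [t + "_fp16"]))
                inputs = [t + "_fp16" if x == t else x for x in inputs]
        out.append((name, inputs, outputs))
        for t in outputs:
            producer_device[t] = dev
    return out
```
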

Reviewed By: ChunliF

Differential Revision: D26862322

fbshipit-source-id: a7af162f2537ee9e4a78e6ef3f587129de410b07
2021-03-08 16:36:12 -08:00
Sam Estep
8c798e0622 Forbid trailing whitespace (#53406)
Summary:
Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857

These are the only hand-written parts of this diff:
- the addition to `.github/workflows/lint.yml`
- the file endings changed in these four files (to appease FB-internal land-blocking lints):
  - `GLOSSARY.md`
  - `aten/src/ATen/core/op_registration/README.md`
  - `scripts/README.md`
  - `torch/csrc/jit/codegen/fuser/README.md`

The rest was generated by running this command (on macOS):
```
git grep -I -l ' $' -- . ':(exclude)**/contrib/**' ':(exclude)third_party' | xargs gsed -i 's/ *$//'
```

I looked over the auto-generated changes and didn't see anything that looked problematic.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406

Test Plan:
This run (after adding the lint but before removing existing trailing spaces) failed:
- https://github.com/pytorch/pytorch/runs/2043032377

This run (on the tip of this PR) succeeded:
- https://github.com/pytorch/pytorch/runs/2043296348

Reviewed By: walterddr, seemethere

Differential Revision: D26856620

Pulled By: samestep

fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97
2021-03-05 17:22:55 -08:00
Oleg Khabinov
00bd0e9862 [caffe2] Fix shape inference for LpNorm (#53332)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53332

This is to make sure we don't get `BATCH` dim type for the output.
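As a sketch of the rule (assuming, as the operator catalogue suggests, that LpNorm reduces its input to a single value; the function name and list-based shapes are illustrative):

```python
def infer_lp_norm(input_dims, input_dim_types):
    """LpNorm reduces the whole input to one value, so the output is a
    1-element tensor whose dim type is CONSTANT -- never BATCH, even
    when the input's leading dimension is marked BATCH."""
    return [1], ["CONSTANT"]
```
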

Reviewed By: ChunliF

Differential Revision: D26836902

fbshipit-source-id: bedbd12330c608406e3466b240015235a28d2c4a
2021-03-05 13:35:32 -08:00
Oleg Khabinov
fdd074e806 [caffe2] Fix shape inference for Softmax (#53132)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53132

Input and output should have the same shape for Softmax https://caffe2.ai/docs/operators-catalogue.html#softmax.
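The property the fix encodes — softmax never changes the tensor's shape — can be seen from a reference implementation over one row (pure-Python sketch, numerically stabilized by subtracting the max):

```python
import math

def softmax(row):
    """Reference softmax over one row. The output has the same length
    as the input -- exactly the shape invariant the inference fix
    encodes: Softmax normalizes values but never reshapes them."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]
```
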

Reviewed By: walterddr, yinghai, ChunliF

Differential Revision: D26536592

fbshipit-source-id: 8b50794803aeadcb75d8f370c77f4fef98a1f2ad
2021-03-04 19:37:43 -08:00