`libshm.so` depends on the torch library exclusively for `at::RefcountedMapAllocator`,
so it makes sense to move that allocator to c10 alongside the other memory allocators.
This way `libshm.so` depends only on `c10`, and we don't need to relink
`libshm.so` for every ATen change.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109881
Approved by: https://github.com/albanD
For some weird reason, the batch file swallows the `exit /b 1` inside the for loop, so failures never actually get surfaced. Add skips for the tests that were failing.
Also, don't run the Windows CPU build on main, since it already runs in trunk. This matches what currently works for the ROCm build.
The temp file failure originates from https://github.com/pytorch/pytorch/pull/108508 (got fixed before I merged this PR)
I'm not sure when the ChunkRecordIteratorTest started failing, but it was after the above.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109393
Approved by: https://github.com/malfet
Enables two ruff rules derived from pylint:
* PLR1722 replaces any `exit()` calls with `sys.exit()`. `exit()` is only designed for REPL contexts and may not always be available by default; this change always uses the version in the `sys` module, which is more reliable.
* PLW3301 replaces nested `min` / `max` calls with flattened versions (i.e. `min(a, min(b, c))` => `min(a, b, c)`). The flattened version is more idiomatic and more efficient. A small before/after sketch follows.
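For illustration, a minimal sketch of what the two rules rewrite (the code below is illustrative, not taken from the PR):
```python
import sys

a, b, c = 3, 1, 4

# PLW3301: flatten nested min()/max() calls.
smallest_before = min(a, min(b, c))  # before: nested call
smallest_after = min(a, b, c)        # after: one call, same result
assert smallest_before == smallest_after

# PLR1722: prefer sys.exit() over the REPL helper exit().
if smallest_after < 0:
    # exit(1)    # before: `exit` is injected by the `site` module
    sys.exit(1)  # after: always available once `sys` is imported
```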
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109461
Approved by: https://github.com/ezyang
Summary: As pointed out in https://github.com/pytorch/pytorch/pull/107479, using a set prevents collisions like "a" => "a", "a" => "a_1", and then "a_1" => "a_1" (which should instead go to "a_1_1"). We can combine counters and a set to avoid this problem. This still gets us the performance benefit in the case of collisions, with a very minor penalty in the no-collision case.
Test Plan:
Extract this code and run:
```
# New version
from typing import Dict, Set

class Net:
    _net_names_used_counters: Dict[str, int] = {}
    _net_names_used: Set[str] = set()

    @staticmethod
    def current_prefix():
        return "test_prefix"

    @staticmethod
    def _get_next_net_name(basename):
        basename = "/".join(x for x in [Net.current_prefix(), basename] if x)
        idx = Net._net_names_used_counters.get(basename, 0)
        while (name := basename if idx == 0 else f"{basename}_{idx}") in Net._net_names_used:
            idx += 1
        Net._net_names_used_counters[basename] = idx + 1
        Net._net_names_used.add(name)
        return name

print(Net._get_next_net_name("basename"))
print(Net._get_next_net_name("x_basename"))
print(Net._get_next_net_name("basename"))
print(Net._get_next_net_name("basename"))
print(Net._get_next_net_name("x_basename"))
print(Net._get_next_net_name("basename_1"))

> test_prefix/basename
> test_prefix/x_basename
> test_prefix/basename_1
> test_prefix/basename_2
> test_prefix/x_basename_1
> test_prefix/basename_1_1
```
Differential Revision: D48576516
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107743
Approved by: https://github.com/zdevito
Reland of PR #94924. The purpose of this PR is to deal with the complicated interactions between MKL and OpenMP.
There are two improvements:
1. It uses a flag to avoid infinite mutual recursion between `find_package(MKL)` and `find_package(OpenMP)` in some cases (the guard-flag idea is sketched below).
2. The logic for finding iomp5 is improved, so we can now test MKLDNN under ASAN.
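For illustration only, a minimal Python sketch of the guard-flag pattern (the actual fix lives in the CMake find modules; the function names below are hypothetical):
```python
# Guard flag: find_mkl() may call find_openmp(), which may call find_mkl()
# again; the flag turns the second entry into a no-op and breaks the cycle.
_finding_mkl = False

def find_openmp():
    # OpenMP discovery may want MKL's iomp5, re-entering find_mkl().
    mkl = find_mkl()
    return "iomp5" if mkl else "system-omp"

def find_mkl():
    global _finding_mkl
    if _finding_mkl:           # already resolving MKL: stop the recursion
        return None
    _finding_mkl = True
    try:
        omp = find_openmp()    # would recurse forever without the guard
        return f"mkl (with {omp})"
    finally:
        _finding_mkl = False

print(find_mkl())              # -> mkl (with system-omp)
```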
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104224
Approved by: https://github.com/malfet
**Summary**
Update oneDNN from v2.7.3 to v3.1.1.
This is BC-breaking, as some APIs have changed on the oneDNN side. Changes include:
- PyTorch code where oneDNN is directly called
- The `third_party/ideep` submodule, to adapt to oneDNN's new API
- CMake files, to fix build issues
**Test plan**
Building issues and correctness are covered by CI checks.
For performance, we have run TorchBench models to ensure there is no regression. Below is the setup used for the before/after comparison.

Note:
- Base commit of PyTorch: da322ea
- CPU: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Ice Lake)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97957
Approved by: https://github.com/jgong5, https://github.com/jerryzh168
PR #90689 replaces NVTX with NVTX3. However, the `torch::nvtoolsext` target is created only when the third-party NVTX is used.
This is clearly a logical error. We now move the creation code out of the branch to cover all cases. This should fix the issues reported in the comments of #90689.
It would be better to move the configurations of the failed FRL jobs to CI tests, so that we can catch such issues early, before merging.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97582
Approved by: https://github.com/peterbell10
Summary: Adding an enforce gives better error information than raising SIGFPE when division by zero happens. We'll get the actual BlobRef names as well as the error categories.
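A minimal Python sketch of the enforce idea (the actual change uses caffe2's enforce macro in the C++ op; the helper name below is hypothetical):
```python
import torch

def enforce_nonempty(tensor: torch.Tensor, blob_name: str) -> None:
    # Fail up front with the offending blob's name in the message instead
    # of letting a later division by zero crash the process with SIGFPE.
    if tensor.numel() <= 0:
        raise RuntimeError(
            f"[enforce fail] input.numel() > 0. {tensor.numel()} vs 0. "
            f"tensor has to be nonempty (blob: {blob_name})"
        )

enforce_nonempty(torch.ones(3), "some/blob")     # passes silently
# enforce_nonempty(torch.empty(0), "some/blob")  # raises RuntimeError
```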
Test Plan:
Ran a local worker and client using DPP session with empty tensors and checked the error:
`../buck-out/v2/gen/fbcode/data_preproc/perf_test/client --sr2_event_base_pool_size=24`
`../buck-out/v2/gen/fbcode/data_preproc/perf_test/worker --dpp_session_id=5D49F56C98CC95BD97027BC0DDB38D8F`
```
{dpp_internal_errorcategory : user_error,
ONCALL : MLDP_CONTROL,
CATEGORY : INPUT_ERROR,
errorsubsystemtags : [DPP_WORKER],
errorcause : USER_ERROR,
RETRYABILITY : 0}
F0806 17:47:52.607200 2280375 SchedRuntimeEnv.cpp:385] facebook::data_preproc::NonRetryableGenericUserError: User preprocessing error c10::Error: [enforce fail at utility_ops.h:730] input.numel() > 0. 0 vs 0. tensor has to be nonempty (Error from operator:
input: "preproc_data_pipeline/preproc/features/default_feature_preproc/normalization/dper_feature_normalization/sparse_features_processor_1/sparse_feature_transform/F3_ADFINDER_USER_ADS_COFFEE_LSF_FLEXIBLE_BATCH_USER_FB_UIP_FEATURE_IDSCORELIST_ENCODED_FB_UIP_TOP100_IDSCORELIST_ENCODED_1/sequential_1019/id_score_list_quantization_decode_1/Concat:0"
input: "preproc_data_pipeline/preproc/features/default_feature_preproc/normalization/dper_feature_normalization/sparse_features_processor_1/sparse_feature_transform/F3_ADFINDER_USER_ADS_COFFEE_LSF_FLEXIBLE_BATCH_USER_FB_UIP_FEATURE_IDSCORELIST_ENCODED_FB_UIP_TOP100_IDSCORELIST_ENCODED_1/sequential_1019/id_score_list_quantization_decode_1/Mul_2"
input: "preproc_data_pipeline/preproc/features/default_feature_preproc/normalization/dper_feature_normalization/sparse_features_processor_1/sparse_feature_transform/F3_ADFINDER_USER_ADS_COFFEE_LSF_FLEXIBLE_BATCH_USER_FB_UIP_FEATURE_IDSCORELIST_ENCODED_FB_UIP_TOP100_IDSCORELIST_ENCODED_1/sequential_1019/id_score_list_quantization_decode_1/encoded_id_lengths"
output: "preproc_data_pipeline/preproc/features/default_feature_preproc/normalization/dper_feature_normalization/sparse_features_processor_1/sparse_feature_transform/F3_ADFINDER_USER_ADS_COFFEE_LSF_FLEXIBLE_BATCH_USER_FB_UIP_FEATURE_IDSCORELIST_ENCODED_FB_UIP_TOP100_IDSCORELIST
```
Differential Revision: D48104430
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106882
Approved by: https://github.com/kit1980
Summary: Rename static tracepoint macros to better describe their targeted usage.
Test Plan:
Same as for D47159249:
Tested the following macros on test scripts with libbpf USDTs:
* `CAFFE_SDT`
* `CAFFE_DISABLE_SDT`
* `CAFFE_SDT_WITH_SEMAPHORE`
Reviewed By: chaekit
Differential Revision: D47727339
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106380
Approved by: https://github.com/chaekit
Summary:
This stack of PR's integrates cuSPARSELt into PyTorch.
This PR adds support for cuSPARSELt into the build process.
It adds a new flag, `USE_CUSPARSELT`, that defaults to false.
When `USE_CUSPARSELT=1` is specified, the user can also specify
`CUSPARSELT_ROOT`, which defines the path to the library.
Compiling PyTorch with cuSPARSELt support can be done as follows:
```
USE_CUSPARSELT=1 CUSPARSELT_ROOT=/path/to/cusparselt python setup.py develop
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103700
Approved by: https://github.com/albanD
Summary: Moving the static tracepoint macros header to a location where it can be easily used by various PyTorch components (`c10/util`).
Test Plan:
Same as for D47159249:
Tested the following macros on test scripts with libbpf USDTs:
* `CAFFE_SDT`
* `CAFFE_DISABLE_SDT`
* `CAFFE_SDT_WITH_SEMAPHORE`
Reviewed By: EDG-GH
Differential Revision: D47636258
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105856
Approved by: https://github.com/EDG-GH, https://github.com/chaekit
- BatchLinearAlgebraLib.cpp is now split, with the cublas code moved into one additional file
- BatchLinearAlgebraLib.cpp uses only cusolver APIs
- BatchLinearAlgebraLibBlas.cpp uses only cublas APIs
- This split is needed because hipify operates at the file level and cannot mix cusolver and cublas APIs within the same file
- cmake changes to link against hipblas instead of rocblas
- hipify mappings changes to map cublas -> hipblas instead of rocblas
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105881
Approved by: https://github.com/albanD
Summary:
Fix existing CAFFE static tracepoint macros and make them match the latest FOLLY version.
Per anakryiko, the current `CAFFE_SDT` definition is broken. Quote:
```
"Arguments: -5@-16(%rbp) -4@$100
Arguments: -8@-16(%rbp) -4@$100
#define FOLLY_SDT_IS_ARRAY_POINTER(x) ((__builtin_classify_type(x) == 14) || \
(__builtin_classify_type(x) == 5))
vs
#define CAFFE_SDT_ISARRAY(x) (__builtin_classify_type(x) == 14)
https://github.com/atgreen/gcc/blob/master/gcc/typeclass.h
that 5 is "pointer_type_class"
so you were right, it's just fixed up version of header
I think it should be 8, not 5
5 is the size of literal, but you don't pass string literal as an argument, you pass its address, so actual argument is a pointer, and so 8 byte long
you can try just fixing up CAFFE_SDT macro
```
Test Plan:
Tested the following macros on test scripts with libbpf USDTs:
* `CAFFE_SDT`
* `CAFFE_DISABLE_SDT`
* `CAFFE_SDT_WITH_SEMAPHORE`
Reviewed By: RihamSelim
Differential Revision: D47159249
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105232
Approved by: https://github.com/chaekit, https://github.com/malfet
Summary:
Running any EgoOCR workflow in non-opt modes was breaking with https://fburl.com/strict-weak-ordering
Painstakingly found out that the `stable_sort` comparator in the generate_proposals caffe2 op was the issue, due to numerical imprecision. This was causing the Word Detector model to barf with the error. Adding explicit handling for the [irreflexivity property](https://www.boost.org/sgi/stl/StrictWeakOrdering.html) fixes this annoying strict-weak-ordering issue that has bugged me and several others (https://fb.workplace.com/groups/1405155842844877/permalink/7079705785389826/) for a while.
We can finally run all OCR workflows in non-opt mode! :)
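For intuition, here is a minimal Python sketch of the irreflexivity bug and the fix (illustrative only; the real comparator is C++ inside the caffe2 op, and the names and tolerance below are made up):
```python
from functools import cmp_to_key

EPS = 1e-6

def cmp_broken(a, b):
    # With a tolerance folded into "less than", cmp_broken(x, x) reports
    # that x precedes itself. comp(x, x) must be false for a strict weak
    # ordering, and libstdc++ debug builds abort when it isn't.
    return -1 if a < b + EPS else 1

def cmp_fixed(a, b):
    # Handle (near-)equality explicitly so cmp_fixed(x, x) == 0,
    # restoring irreflexivity.
    if abs(a - b) <= EPS:
        return 0
    return -1 if a < b else 1

x = 0.3
print(cmp_broken(x, x))  # -1: x "precedes" itself -> not a strict weak ordering
print(cmp_fixed(x, x))   #  0: irreflexive, safe for stable_sort
scores = [0.9, 0.3, 0.3000001, 0.1]
print(sorted(scores, key=cmp_to_key(cmp_fixed)))
```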
Test Plan:
Debugged this with `fdb --disable-auto-breakpoints --secondary-debugger=lldb buck2 run mode/dev-sand ai_demos/server_model_zoo/models/ego_ocr_e2e_prod:ego_ocr_e2e_prod_binary`
and running `breakpoint set -E c++` in the lldb terminal.
Differential Revision: D47446816
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105189
Approved by: https://github.com/malfet, https://github.com/atalman
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)
That were reverted due to the conflict with internal source repo.
Mostly fixes for PEP-484 violations (i.e. when the default arg is set to None, but the type is not annotated as Optional), as sketched below.
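A minimal sketch of the PEP-484 violation being fixed (function names here are hypothetical):
```python
from typing import Optional

# Before (PEP-484 violation): the default is None but the annotation is
# plain `int`; mypy 1.4.1 rejects this implicit Optional.
def pad(size: int = None):
    return size or 0

# After: the Optional is spelled out explicitly.
def pad_fixed(size: Optional[int] = None) -> int:
    return size if size is not None else 0
```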
Plus a few real fixes:
- Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
- Add missing return statement to `torch._export.deserialize_graph`
- Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
- Add assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
- Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add hack to `.ci/docker/install_conda.sh` to squash the older libstdc++ from the conda environment in favor of the one from the OS
- Update bazel CUDA builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link with cupti statically, but I could not find where that is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007