Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76366
caffe2 is not currently being built for XROS.
Test Plan: CI
Reviewed By: kimishpatel
Differential Revision: D35923922
fbshipit-source-id: 260dacadf0bd5b6bab7833a4ce81e896d280b053
(cherry picked from commit 8370b8dd2519d55a79fa8d45e7951ca8dc0b21a8)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71007
A string copy at Line 417 is currently consuming 125,749,287,000 cycles/day. I suspect the issue is a copy-on-return, but we can experiment with introducing a reference in the middle (sketched below) to see whether that produces a worthwhile saving without changing the interface.
Reference
```
["Inline caffe2::ArgumentHelper::GetSingleArgument @ caffe2/caffe2/utils/proto_utils.cc:417"]
```
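A minimal sketch of the "reference in the middle" experiment, using an invented map-based stand-in rather than the real `caffe2::ArgumentHelper` internals: the by-value return is kept for interface compatibility, but the lookup result is bound to a const reference internally so the only remaining copy is the one made on return.
```
#include <map>
#include <string>

// Hypothetical stand-in for the argument lookup; the real
// caffe2::ArgumentHelper::GetSingleArgument differs.
std::string GetSingleArgumentSketch(
    const std::map<std::string, std::string>& args,
    const std::string& name,
    const std::string& default_value) {
  const auto it = args.find(name);
  // "Reference in the middle": no intermediate std::string is materialized.
  const std::string& found = (it == args.end()) ? default_value : it->second;
  return found;  // the single remaining copy happens here
}
```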
Test Plan: Sandcastle
Reviewed By: xw285cornell
Differential Revision: D33478883
fbshipit-source-id: e863e359c0c718fcd0d52fd4b3c7858067de0670
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69533
Modified loops in files under fbsource/fbcode/caffe2/ from the format
```
for(TYPE var=x0;var<x_max;var++)
```
to the format
```
for(const auto var: irange(x_max))
```
This was achieved by running r-barnes's loop upgrader script (D28874212), with some modifications to exclude all files under /torch/jit and with a number of reversions and unused-variable warning suppressions added by hand.
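For concreteness, a small before/after sketch of the conversion (the loop body is invented; `c10::irange` is assumed to come from `c10/util/irange.h`):
```
#include <c10/util/irange.h>
#include <vector>

void scale(std::vector<float>& v, float k) {
  // Before: for (size_t i = 0; i < v.size(); i++) { v[i] *= k; }
  for (const auto i : c10::irange(v.size())) {
    v[i] *= k;
  }
}
```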
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D32837942
fbshipit-source-id: 8663037a38ade8f81bd5e983a614d197ea11f0d1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66743
Modified loops in files under fbsource/fbcode/caffe2/ from the format
`for(TYPE var=x0;var<x_max;var++)`
to the format
`for(const auto var: irange(x_max))`
This was achieved by running r-barnes's loop upgrader script (D28874212), with some modifications to exclude all files under /torch/jit and with a number of reversions and unused-variable warning suppressions added by hand.
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D31705359
fbshipit-source-id: c9ea2fbc0f9cd29e97a52dcb203addc5f2abb09b
Summary:
This PR is to update PyTorch with the following cub changes:
- Starting with cub 1.13.1, cub requires users to define `CUB_NS_QUALIFIER` if `CUB_NS_PREFIX` is also defined. Besides that, a new mechanism, `CUB_WRAPPED_NAMESPACE`, is added.
And I make the following changes to PyTorch:
- Starting with CUDA 11.5, define `CUB_WRAPPED_NAMESPACE` globally as an nvcc flag (see the sketch below).
- Fix caffe2 failures caused by the above change.
- Add an `aten/src/ATen/cuda/cub_definitions.cuh` header that defines helper macros for feature availability.
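A rough sketch of what the wrapped namespace buys (the wrapper namespace name below is an assumption for illustration, not taken from this diff):
```
// Defining CUB_WRAPPED_NAMESPACE=at_cuda_detail on the nvcc command line is
// roughly equivalent to spelling out the older macro trio by hand:
#define CUB_NS_PREFIX namespace at_cuda_detail {
#define CUB_NS_POSTFIX }
#define CUB_NS_QUALIFIER ::at_cuda_detail::cub
// cub symbols are then referenced as at_cuda_detail::cub::... instead of
// cub::..., so the copy of cub built into PyTorch cannot clash with another
// cub linked into the same binary.
```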
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66219
Reviewed By: bdhirsh
Differential Revision: D31626931
Pulled By: ngimel
fbshipit-source-id: 97ebf5ef671ade8bf46d0860edc317f22660f26d
Summary:
CAFFE2 has been deprecated for a while, but is still included in every PyTorch build.
We should stop building it by default, although CI should still validate that caffe2 code is buildable.
Build even fewer dependencies when compiling mobile builds without Caffe2
Introduce `TEST_CAFFE2` in torch.common.utils
Skip `TestQuantizedEmbeddingOps` and `TestJit.test_old_models_bc` if the code is compiled without Caffe2
Should be landed after https://github.com/pytorch/builder/pull/864
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66658
Reviewed By: driazati, seemethere, janeyx99
Differential Revision: D31669156
Pulled By: malfet
fbshipit-source-id: 1cc45e2d402daf913a4685eb9f841cc3863e458d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66234
Modified loops in files under fbsource/fbcode/caffe2/ from the format
`for(TYPE var=x0;var<x_max;var++)`
to the format
`for(const auto var: irange(x_max))`
This was achieved by running r-barnes's loop upgrader script (D28874212), with some modifications to exclude all files under /torch/jit and with a number of reversions and unused-variable warning suppressions added by hand.
bypass_size_limit
allow-large-files
Test Plan: Sandcastle
Reviewed By: ngimel
Differential Revision: D30652629
fbshipit-source-id: 0ae6c4bbbb554bad42e372792a6430e1acf15e3e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65245
Building and running c10 and qnnpack tests on XROS.
Notable changes:
- Adding `#if defined(_XROS_)` guards in a few places not supported by XROS
- Changing Threadpool to an abstract class (see the sketch below)
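A hypothetical sketch of the abstract-interface shape (invented names and signatures, not the actual caffe2 Threadpool declaration): making the type abstract lets XROS plug in its own implementation behind the same interface.
```
#include <cstddef>
#include <functional>

class ThreadPoolInterface {
 public:
  virtual ~ThreadPoolInterface() = default;
  virtual size_t getNumThreads() const = 0;
  // Run fn(thread_id, task_id) for task_id in [0, range).
  virtual void run(const std::function<void(int, size_t)>& fn,
                   size_t range) = 0;
};
```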
ghstack-source-id: 139513579
Test Plan: Run c10 and qnnpack tests on XROS.
Reviewed By: veselinp, iseeyuan
Differential Revision: D30137333
fbshipit-source-id: bb6239b935187fac712834341fe5a8d3377762b1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65610
- Replace HIP_PLATFORM_HCC with USE_ROCM (see the sketch below).
- Don't rely on CUDA_VERSION or HIP_VERSION; use USE_ROCM and ROCM_VERSION instead.
- In the next PR:
  - Remove the mapping from CUDA_VERSION to HIP_VERSION and from CUDA to HIP in hipify.
  - Since HIP_PLATFORM_HCC is deprecated, add HIP_PLATFORM_AMD to support HIP host code compilation on gcc.
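An illustrative sketch of the guard change (the function is invented; the exact files touched differ): platform checks move from the HIP-compiler macro to the build-level flag.
```
bool built_for_rocm() {
#if defined(USE_ROCM)
  // Previously this would have been guarded with __HIP_PLATFORM_HCC__.
  return true;
#else
  return false;
#endif
}
```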
cc jeffdaily sunway513 jithunnair-amd ROCmSupport amathews-amd
Reviewed By: jbschlosser
Differential Revision: D30909053
Pulled By: ezyang
fbshipit-source-id: 224a966ebf1aaec79beccbbd686fdf3d49267e06
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64285
With C++14 heterogeneous ordered container lookup, it is no longer necessary to create a `std::string` in order to look up elements of a `CaffeMap` keyed by std::string. Accordingly, this diff reworks the argument-getting operator functions to avoid that in favor of `c10::string_view`.
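A minimal sketch of heterogeneous lookup, illustrated here with C++17 `std::string_view` and `std::less<>` rather than `c10::string_view` and the actual CaffeMap type: a transparent comparator lets `find()` take a string view directly, so no temporary `std::string` is constructed per lookup.
```
#include <map>
#include <string>
#include <string_view>

int get_arg(const std::map<std::string, int, std::less<>>& args,
            std::string_view name,
            int default_value) {
  auto it = args.find(name);  // no std::string materialized here
  return it == args.end() ? default_value : it->second;
}
```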
ghstack-source-id: 137139818
Test Plan: buildsizebot iOS apps -- code size win. Fewer strings is probably marginally good for perf, but this only happens at setup time anyway.
Reviewed By: dzhulgakov
Differential Revision: D26826676
fbshipit-source-id: ee653b14dc2c528bae8c90f0fc6a7a419cbca1d6
Summary:
- HIP_VERSION semantic versioning will change in ROCm 4.3. These changes essentially remove the dependency on the HIP_VERSION provided in the hip header, to keep the code compatible with both older and newer versions of ROCm.
- TORCH_HIP_VERSION is derived from HIP_VERSION_MAJOR and HIP_VERSION_MINOR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62786
Reviewed By: bdhirsh
Differential Revision: D30281682
Pulled By: seemethere
fbshipit-source-id: e41e69fb9e13de5ddd1af99ba5bbdcbb7b64b673
Summary:
The cases were found by compiling with clang on Windows.
Those functions would still be exported in that case, which is a waste of space in the symbol table.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62952
Reviewed By: gchanan
Differential Revision: D30191291
Pulled By: ezyang
fbshipit-source-id: 3319b0ec4f5fb02e0fe1b81dbbcedcf12a0c795e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62493
This diff adds a broadcast fastpath for the caffe2 broadcast utility function, which just copies the contents of a smaller tensor into a larger one. We also update the tests to exercise the new functionality.
Test Plan: unit tests + let CI run
Differential Revision: D29938285
fbshipit-source-id: 543ecc548500380e307be91902696033454964a2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62428
In this diff we add a broadcast fastpath for reduce utility functions. These functions are used by various elementwise ops, whose tests we update to exercise the new functionality.
Test Plan: Added test cases to elementwise ops (which will exercise the new reducer functionality) that will be run by CI. It's worth noting there's still no code (outside of the new test cases) that takes the new code paths added -- the user must explicitly request `allow_broadcast_fastpath=True`, and nothing outside of the added tests currently does so.
Differential Revision: D29938264
fbshipit-source-id: 5d5542bd93afb85fd9f7a4073f766adc07eb3b65
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62369
This diff is a big no-op that just sets up scaffolding for passing the "allow_broadcast_fastpath" from caffe2 operator protos created in Python down to C++. To facilitate this, we create helper template wrappers that pass a flag for "allow_broadcast_fastpath" down to elementwise functors. This flag will determine whether to try and take the broadcast fastpath, which we will add in subsequent diffs.
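A hypothetical sketch of the wrapper idea (names and signatures are invented for illustration; the real caffe2 functor interfaces differ): a thin template carries the allow_broadcast_fastpath flag alongside an existing elementwise functor, so the flag can be threaded through without changing every call site at once.
```
#include <utility>

template <typename Functor>
struct WithBroadcastFastpath {
  Functor functor;
  bool allow_broadcast_fastpath = false;

  template <typename... Args>
  auto operator()(Args&&... args) const {
    // For now the flag only selects an execution path; the result is the
    // same whether or not the fastpath is taken.
    return functor(allow_broadcast_fastpath, std::forward<Args>(args)...);
  }
};
```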
Test Plan: sandcastle + let github CI run
Differential Revision: D28154475
fbshipit-source-id: 15750a0bcd2994fbc6a61fb5653d8cae6b0177dd
Summary:
Disables the `cppcoreguidelines-avoid-non-const-global-variables` clang-tidy check, as the GoogleTest `TEST` macro is non-compliant with it, as is `DEFINE_DISPATCH`.
All changes but the ones to `.clang-tidy` are generated using the following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`; do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008
Reviewed By: driazati, r-barnes
Differential Revision: D29838584
Pulled By: malfet
fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60402
Add float64 data type support for ScatterWeightedSum, for cases where float32's ~10^7 precision is not sufficient.
Test Plan: buck test caffe2/caffe2/python/operator_test:sparse_ops_test -- testScatterWeightedSum
Reviewed By: jianyuh
Differential Revision: D29190324
fbshipit-source-id: 871a60744694e901a2c7685a67350860745d6729
Summary:
Enables an important performance optimization for ROCm, in light of the discussion in https://github.com/pytorch/pytorch/issues/41028.
CC jithunnair-amd sunway513
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60607
Reviewed By: jbschlosser
Differential Revision: D29409894
Pulled By: ngimel
fbshipit-source-id: effca258a0f37eaefa35674a7fd19459ca7dc95b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60677
Add a rule to wrap conversions.h and depend on that, rather than
relying on a glob which violates package boundaries.
Test Plan: `buck2 build fbcode//caffe2/caffe2:caffe2_core`
Reviewed By: mzlee
Differential Revision: D29370841
fbshipit-source-id: d4dd383eb8457d4f5118574e34e6f17c32fde647
Summary:
Add a rule to wrap proto_utils.h and depend on that, rather than
relying on a glob which violates package boundaries.
Reviewed By: igorsugak
Differential Revision: D29273453
fbshipit-source-id: 08f198a03d06ee2fdf61f5dbe1d0087db22aec8b
Summary:
Add a rule to wrap simple_queue.h and depend on that, rather than
relying on a glob which violates package boundaries.
Test Plan: `buck2 build fbcode//caffe2/caffe2:caffe2_core`
Reviewed By: igorsugak
Differential Revision: D29273415
fbshipit-source-id: f2b62a82cd6478bd71a8194d661d1c8b023c0953
Summary:
Fixes https://github.com/pytorch/pytorch/issues/57273.
Some users reported that they dislike the Caffe2 thread-pool leak warning, as it floods their logs, and have requested disabling it, or have asked for a way to filter it.
In the binary distribution, the caffe2 pthreadpool apparently already exists because of some dependency, so a `torch.set_num_threads()` invocation isn't required to reproduce the issue (unlike when building from the master branch).
The test script in https://github.com/pytorch/pytorch/issues/60171 does have a `set_num_threads` invocation, which is why I was able to reproduce the issue after building from the master branch's source.
cc malfet & ejguan, who have the authority to make a decision.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60318
Reviewed By: albanD
Differential Revision: D29265771
Pulled By: ezyang
fbshipit-source-id: 26f678af2fec45ef8f7e1d39a57559790eb9e94b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59355
Add a `CheckKnob()` function for doing run-time checks of feature roll-out
knobs. This provides an API for safely controlling the roll-out of new
functionality in the code.
Test Plan: Included some basic unit tests.
Reviewed By: voznesenskym
Differential Revision: D26536430
fbshipit-source-id: 2e53234c6d9ce624848fc8b2c76f6833f344f48b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58759
* Makes `pthreadpool()->run` respect `_NoPThreadPoolGuard`: when the guard is present, tasks run on the calling thread instead of being parallelized (see the usage sketch below).
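A usage sketch: the guard and pool accessor names come from this diff's summary, but the header path, namespace qualification, and `run()` signature are assumptions rather than taken from the code.
```
#include <caffe2/utils/threadpool/pthreadpool-cpp.h>

#include <cstddef>
#include <functional>

void run_serially(const std::function<void(size_t)>& fn, size_t range) {
  // While the guard is alive, run() executes the whole range on the calling
  // thread instead of dispatching work to the pool's threads.
  caffe2::_NoPThreadPoolGuard guard;
  caffe2::pthreadpool()->run(fn, range);
}
```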
Test Plan:
buck build //xplat/caffe2:aten_test_test_thread_pool_guard
./buck-out/last/aten_test_test_thread_pool_guard
Reviewed By: kimishpatel
Differential Revision: D28597425
fbshipit-source-id: 0365ad9947c239f5b37ce682802d4d401b8b0a48
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname, "-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit", "--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56717
The signal_handler was under the caffe2 namespace but was being used
by PyTorch as well.
I've fixed this by moving it to the c10 namespace, where both C2 and PyTorch
can now use it.
The signal_handler interface in caffe2/utils/signal_handler.h is kept the same
for backward compatibility for C2, but most of the common code has moved to c10.
ghstack-source-id: 127446929
Test Plan: waitforbuildbot
Reviewed By: ezyang
Differential Revision: D27946738
fbshipit-source-id: d6228d1a0108f4c807d405e7a0bb799c5375388f
Summary:
This cuts out caffe2's old backtrace generation in favor of the one already in c10.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56198
Pulled By: driazati
Reviewed By: nikithamalgifb
Differential Revision: D27868282
fbshipit-source-id: aa9b9691271eaa3f95baab48773ffefebd924ae2
Summary:
This guards some deprecated usages of the Protobuf API behind an `#ifdef` (this is how onnx does it as well).
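A hedged illustration of the `#ifdef` pattern: the version threshold and the specific call below are assumptions for the sake of a concrete example, not necessarily what this diff guards; `SetTotalBytesLimit`'s two-argument overload is one Protobuf API that was deprecated in favor of a single-argument form.
```
#include <climits>
#include <google/protobuf/io/coded_stream.h>

void set_read_limit(google::protobuf::io::CodedInputStream& stream) {
  // Select the call based on the protobuf version at compile time.
#if GOOGLE_PROTOBUF_VERSION >= 3008000
  stream.SetTotalBytesLimit(INT_MAX);
#else
  stream.SetTotalBytesLimit(INT_MAX, INT_MAX);
#endif
}
```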
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56186
Pulled By: driazati
Reviewed By: bertmaher, dzhulgakov
Differential Revision: D27803121
fbshipit-source-id: 2d3a348ec1ab9879a0d8f2dff17c5444fd4baf2c
Summary:
Following up on https://github.com/pytorch/pytorch/pull/54895#discussion_r606402656.
A race-condition wouldn't arise because `leak_corrupted_threadpool` can be set to true only after fork via the `pthread_atfork` handler, when a (child) process would be single-threaded. It's set to false also when the process is still single-threaded (`pthreadpool` is called during an invocation to `set_num_threads`, prior to which a child process would remain single-threaded). All threads (if & when multiple threads would be created) would always see `leak_corrupted_threadpool` as false if it would be accessed concurrently.
Since no reader threads can exist while a writer thread changes its value (false->true and true->false), `leak_corrupted_threadpool` might as well be a non-atomic bool.
### Pros
1. No thread-synchronization is required for `leak_corrupted_threadpool`, as it's a non-atomic bool.
2. The call to `compare_exchange_strong` has been removed.
cc: malfet VitalyFedyunin ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55341
Reviewed By: albanD
Differential Revision: D27669442
Pulled By: ezyang
fbshipit-source-id: 926cb5c1b0a537c1c2ab164b0d51d37c1f1b67f0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55435
We've seen reports from the macos skylight app that PyTorch is very slow due to the lack of cap support in pthreadpool. For mac builds, we set the thread count to `#threads/2`.
ghstack-source-id: 125900852
Test Plan:
- Sandcastle CI
- CircleCI
Reviewed By: kimishpatel
Differential Revision: D27578871
fbshipit-source-id: 7b947bc5d6cf289378abf5f479575e112325d02b
Summary:
As titled: make shape inference work for a model with only distributed parts.
Previously, we relied on a full_predictor net to do shape inference. For very large models, the full_predictor net won't be generated, so we have to do shape inference based on the distributed parts. Surprisingly, the PredictorCall op does tensor name mapping, so it also needs a shape inference function.
Test Plan: Added unittests.
Reviewed By: khabinov
Differential Revision: D27250956
fbshipit-source-id: 3ebd36ba1eb020bb5d00358cffb8f038a6a996e8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55003
Using the `caffe2::setPrintStackTracesOnFatalSignal` utility in
distributed tests to set a signal handler that dumps the state of all threads
for all processes when it receives a FATAL signal. This would help in debugging
tests further.
I had to revert all the python faulthandler code, since only one signal handler
function is supported and running the python faulthandler together with
`setPrintStackTracesOnFatalSignal` doesn't work.
Sample output:
```
SIGSEGV(11), PID: 3492872, Thread 3492872:
[0] ???(0x7fa7b2d1d61b) in libcaffe2_caffe2_caffe2_cpu.so
[1] ???(0x7fa7b2d1d3fb) in libcaffe2_caffe2_caffe2_cpu.so
[2] ???(0x7fa7b2d1d33d) in libcaffe2_caffe2_caffe2_cpu.so
[3] ???(0x7fa7b2d1d167) in libcaffe2_caffe2_caffe2_cpu.so
[4] ???(0x7fa7ce683150) in libpthread.so.0
[5] ???(0x7fa7be2b233c) in libcaffe2__C_impl_cuda.so
[6] ???(0x7fa7be2ce80c) in libcaffe2__C_impl_cuda.so
[7] ???(0x7fa7be2a0512) in libcaffe2__C_impl_cuda.so
[8] torch::distributed::rpc::TensorPipeAgent::send(torch::distributed::rpc::WorkerInfo const&, torch::distributed::rpc::Message&&, float, std::unordered_map<signed char, signed char, std::hash<signed char>, std::equal_to<signed char>, std::allocator<std::pair<signed char const, signed char> > > const&)+0x24f(0x7fa7be29f71f) in libcaffe2__C_impl_cuda.so
[9] torch::distributed::autograd::sendMessageWithAutograd(torch::distributed::rpc::RpcAgent&, torch::distributed::rpc::WorkerInfo const&, torch::distributed::rpc::Message&&, bool, float, bool)+0x393(0x7fa7b602b203) in libcaffe2_libtorch.so
[10] torch::distributed::rpc::pyRpcPythonUdf(torch::distributed::rpc::WorkerInfo const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::vector<at::Tensor, std::allocator<at::Tensor> >&, float, bool)+0x201(0x7fa7bd844971) in libcaffe2__C_impl_cuda.so
```
ghstack-source-id: 125630551
Test Plan: waitforbuildbot
Reviewed By: SciPioneer
Differential Revision: D27419714
fbshipit-source-id: 8aca9a14ef688004053d8798124d9c3a3fbe3489
Summary:
## Problem summary
Fixes https://github.com/pytorch/pytorch/issues/54752 - when the number of threads is more than 3 and at least one `set_num_threads` invocation has taken place before the dataloader forks child processes, calling `set_num_threads(1)` in a child process causes a segfault. During that invocation, the child process ends up handling the data structures of the parent process's Caffe2 thread-pool, which it inherits from the parent via fork's copy-on-write semantics (the threads themselves don't exist in the child process, but some of the thread-pool's data structures do).
## Solution
malfet [advised](https://github.com/pytorch/pytorch/issues/54752#issuecomment-810315302) & [authored code](https://github.com/pytorch/pytorch/pull/54895#pullrequestreview-625670122) for adding a `pthread_atfork` handler in `pytorch/caffe2/utils/threadpool/pthreadpool-cpp.cc`, which is invoked in the child process right after fork, to leak the Caffe2 thread-pool (the child inherits the thread-pool's data structures from its parent process, but doesn't actually have those threads, since after `fork` a child process only has one thread).
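A minimal sketch of the mechanism (the flag here is a simplified stand-in; the real handler deals with the actual thread-pool object):
```
#include <pthread.h>

namespace {
// Simplified stand-in for the inherited pool state.
bool leak_corrupted_threadpool = false;

void child_after_fork() {
  // Called in the child right after fork(): abandon ("leak") the inherited
  // pool state instead of letting the single-threaded child try to use it.
  leak_corrupted_threadpool = true;
}
}  // namespace

void install_fork_handler() {
  pthread_atfork(/*prepare=*/nullptr, /*parent=*/nullptr,
                 /*child=*/child_after_fork);
}
```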
## Additional changes
Added unittest `test_no_segfault` to test for this issue in `test_dataloader.py`
Also enabled `test_segfault` (which actually makes sure that segfaults happen in worker processes in a particular case).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54895
Reviewed By: zhangguanheng66
Differential Revision: D27542253
Pulled By: malfet
fbshipit-source-id: 10f9c67ce1ff1aa37d3efebf405bd93f7f9d2489
Summary:
*Context:* https://github.com/pytorch/pytorch/issues/53406 added a lint for trailing whitespace at the ends of lines. However, in order to pass FB-internal lints, that PR also had to normalize the trailing newlines in four of the files it touched. This PR adds an OSS lint to normalize trailing newlines.
The changes to the following files (made in 54847d0adb9be71be4979cead3d9d4c02160e4cd) are the only manually-written parts of this PR:
- `.github/workflows/lint.yml`
- `mypy-strict.ini`
- `tools/README.md`
- `tools/test/test_trailing_newlines.py`
- `tools/trailing_newlines.py`
I would have liked to make this just a shell one-liner like the other three similar lints, but nothing I could find quite fit the bill. Specifically, all the answers I tried from the following Stack Overflow questions were far too slow (at least a minute and a half to run on this entire repository):
- [How to detect file ends in newline?](https://stackoverflow.com/q/38746)
- [How do I find files that do not end with a newline/linefeed?](https://stackoverflow.com/q/4631068)
- [How to list all files in the Git index without newline at end of file](https://stackoverflow.com/q/27624800)
- [Linux - check if there is an empty line at the end of a file [duplicate]](https://stackoverflow.com/q/34943632)
- [git ensure newline at end of each file](https://stackoverflow.com/q/57770972)
To avoid giving false positives during the few days after this PR is merged, we should probably only merge it after https://github.com/pytorch/pytorch/issues/54967.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54737
Test Plan:
Running the shell script from the "Ensure correct trailing newlines" step in the `quick-checks` job of `.github/workflows/lint.yml` should print no output and exit in a fraction of a second with a status of 0. That was not the case prior to this PR, as shown by this failing GHA workflow run on an earlier draft of this PR:
- https://github.com/pytorch/pytorch/runs/2197446987?check_suite_focus=true
In contrast, this run (after correcting the trailing newlines in this PR) succeeded:
- https://github.com/pytorch/pytorch/pull/54737/checks?check_run_id=2197553241
To unit-test `tools/trailing_newlines.py` itself (this is run as part of our "Test tools" GitHub Actions workflow):
```
python tools/test/test_trailing_newlines.py
```
Reviewed By: malfet
Differential Revision: D27409736
Pulled By: samestep
fbshipit-source-id: 46f565227046b39f68349bbd5633105b2d2e9b19
Summary:
Fix Semmle warning: comparison of a narrow type with a wide type in a loop condition.
For example, consider the following piece of code:
```
for (int i=0; i<array.size(); ++i) {}
```
The problem is that `array.size()` returns `size_t`, which can be a wider type than `int` depending on the implementation, so there is a chance that `i` overflows (for a very large array whose size is beyond the range of `int`) and the loop never terminates.
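One common fix (shown here as a sketch, not necessarily the exact change made in this PR) is to give the induction variable the same width as the container's size type:
```
#include <cstddef>
#include <vector>

long long sum(const std::vector<int>& array) {
  long long total = 0;
  // The comparison no longer mixes a narrow int with a wide size_t.
  for (std::size_t i = 0; i < array.size(); ++i) {
    total += array[i];
  }
  return total;
}
```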
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53951
Reviewed By: zou3519
Differential Revision: D27181495
Pulled By: malfet
fbshipit-source-id: 0612c5cedcdc656c193085e7fbb87dd163f20688
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50862
Add all missing kernel launch checks for all .cu and .cuh files under caffe2/caffe2/utils.
Test Plan:
Building with ```buck build //caffe2/caffe2:``` gives no errors.
All tests pass with ```buck test //caffe2/caffe2:```.
Ran the kernel launch check to ensure nothing shows up under `fbcode/caffe2/caffe2/utils`.
The PR on GitHub shows all tests passing: https://github.com/pytorch/pytorch/actions/runs/500036434
Reviewed By: r-barnes
Differential Revision: D25987367
fbshipit-source-id: 52add63a14f2da855c784ab24468f64056c93836
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50238
Added `C10_CUDA_KERNEL_LAUNCH_CHECK();` after all kernel launches in caffe2/caffe2/utils/math
Test Plan:
```
buck build //caffe2/caffe2
```
files in caffe2/caffe2/utils/math no longer show up when running
```
python3 caffe2/torch/testing/check_kernel_launches.py
```
Reviewed By: r-barnes
Differential Revision: D25773299
fbshipit-source-id: 28d67b4b9f57f1fa1e8699e43e9202bad4d42c5f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49574
Adds support for additional Eigen Utils for custom type defs.
Reviewed By: linbinyu
Differential Revision: D25624556
fbshipit-source-id: 0ffa90aaf8cbf1d08825e95156fb40d966ca7042
Summary:
Since caffe2 and torch have been consolidated, CAFFE2_API should be merged with TORCH_API. Addresses a TODO.
Manually edited some references to the removed `CAFFE2_API`:
* `CONTRIBUTING.md`
* `caffe2/proto/CMakeLists.txt`
* `cmake/ProtoBuf.cmake`
* `c10/macros/Export.h`
* `torch/csrc/WindowsTorchApiMacro.h`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49496
Reviewed By: malfet, samestep
Differential Revision: D25600726
Pulled By: janeyx99
fbshipit-source-id: 7e068d959e397ac183c097d7e9a9afeca5ddd782