Summary:
A [comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims it was added for consistency with the top level `CMakeLists.txt`, but `-Wno-unused-variable` is not mentioned there.
Fix violations in the 50+ files that were added in the interim, either by removing unused variables or by decorating the code with `C10_UNUSED` when a local variable is likely used to extend an object's lifetime until the end of the block.
This warning suppression caused a preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787
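A minimal sketch of the `C10_UNUSED` lifetime pattern described above (illustrative code, not a hunk from the diff; `bump` and `g_mu` are hypothetical):
```
// C10_UNUSED (from c10/macros/Macros.h) silences -Wunused-variable for a
// local that is never read but whose destructor must run at end of scope.
#include <c10/macros/Macros.h>
#include <mutex>

std::mutex g_mu;
int g_counter = 0;

void bump() {
  // The guard is not referenced again, but its lifetime protects g_counter.
  C10_UNUSED std::lock_guard<std::mutex> guard(g_mu);
  ++g_counter;
}
```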
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538
Reviewed By: anjali411
Differential Revision: D35747333
Pulled By: malfet
fbshipit-source-id: 3fc5828e44a4c05ba0e89e92613e6ebbdb260626
(cherry picked from commit c179fba21cfa2a0093fad50ccad5a22dd7cff52c)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74169
Alias DB was being way too conservative about the semantics of exported Caffe2 ops - it thought some pure functions were writing to their inputs, which caused `ReplaceWithMaybeCopy` to fail. This in turn led to a huge decrease in out-variant coverage and regressions in many models.
I've extended the export macro to let the user specify an `AliasAnalysisKind` and marked all of the quantization compression ops as pure functions.
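The extended Caffe2 export macro itself is internal; for reference, marking an op as a pure function looks roughly like this sketch using the generic `TORCH_LIBRARY` registration API (the namespace, schema, and kernel here are made up):
```
#include <torch/library.h>

// AliasAnalysisKind::PURE_FUNCTION tells Alias DB that the op neither
// reads from nor writes to any aliased memory.
TORCH_LIBRARY(mynamespace, m) {
  m.def(torch::schema(
      "decompress(Tensor input) -> Tensor",
      c10::AliasAnalysisKind::PURE_FUNCTION));
  m.impl("decompress", [](const at::Tensor& input) { return input; });
}
```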
ghstack-source-id: 151394133
Reviewed By: hlu1
Differential Revision: D34733630
fbshipit-source-id: e968812e052f14261c10f9a280abe1d910de1f2f
(cherry picked from commit 5e9de49b98caff57be13e8bd101144ae2475b6b5)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71196
`caffe2` headers contain code that can elicit warnings when built with strict compiler flags. Rather than force downstream/consuming code to weaken their compiler flags, suppress those warnings in the header using `#pragma clang diagnostic` suppressions.
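The suppression in question is the standard clang pragma sandwich; a generic sketch (the warning and function here are illustrative, not from the diff):
```
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wshadow"
inline int header_example(int x) {
  int total = 0;
  for (int i = 0; i < x; ++i) {
    int x = i;  // would trip -Wshadow in strict downstream builds
    total += x;
  }
  return total;
}
#pragma clang diagnostic pop
```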
Test Plan: CI Pass
Reviewed By: malfet
Differential Revision: D33536233
fbshipit-source-id: 74404e7a5edaf244f79f7a0addd991a84442a31f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66746
Modified loops in files under fbsource/fbcode/caffe2/ from the format
`for (TYPE var = x0; var < x_max; var++)`
to the format
`for (const auto var : irange(x_max))`
This was achieved by running r-barnes's loop upgrader script (D28874212), with some modifications to exclude all files under /torch/jit; a number of reversions and unused-variable suppressions were added by hand.
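A before/after sketch of the rewrite (assuming a zero-based loop; `c10::irange` lives in c10/util/irange.h):
```
#include <c10/util/irange.h>
#include <cstddef>

int sum_before(const int* data, size_t n) {
  int total = 0;
  for (size_t i = 0; i < n; i++) {  // old style
    total += data[i];
  }
  return total;
}

int sum_after(const int* data, size_t n) {
  int total = 0;
  for (const auto i : c10::irange(n)) {  // new style
    total += data[i];
  }
  return total;
}
```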
Test Plan: Sandcastle
Reviewed By: malfet
Differential Revision: D31705361
fbshipit-source-id: 33fd22eb03086d114e2c98e56703e8ec84460268
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67436
This information is useful for comparing static runtime to Caffe2.
Reviewed By: d1jang
Differential Revision: D31991571
fbshipit-source-id: eb83bc4564b05d56fb9a550863eea3f6312f3f6c
Summary:
This PR is to update PyTorch with the following cub changes:
- Starting with cub 1.13.1, cub requires users to define `CUB_NS_QUALIFIER` if `CUB_NS_PREFIX` is also defined. Besides that, a new mechanism, `CUB_WRAPPED_NAMESPACE`, is added.
Accordingly, I make the following changes to PyTorch:
- Starting with CUDA 11.5, define `CUB_WRAPPED_NAMESPACE` globally as an nvcc flag.
- Fix caffe2 failures caused by the above change.
- Add an `aten/src/ATen/cuda/cub_definitions.cuh` header that defines helper macros about feature availability (see the sketch below).
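A hedged sketch of what the wrapped namespace does and what such a helper macro might look like (the macro name here is illustrative, not the one in `cub_definitions.cuh`):
```
// With the nvcc flag -DCUB_WRAPPED_NAMESPACE=at_cuda_detail, all of cub
// is wrapped in that namespace, so its symbols live under
// ::at_cuda_detail::cub and cannot collide with another copy of cub.

// Feature-availability helper in the spirit of cub_definitions.cuh:
#if defined(CUDA_VERSION) && CUDA_VERSION >= 11050
#define MY_CUB_WRAPPED_NAMESPACE_AVAILABLE() 1
#else
#define MY_CUB_WRAPPED_NAMESPACE_AVAILABLE() 0
#endif
```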
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66219
Reviewed By: bdhirsh
Differential Revision: D31626931
Pulled By: ngimel
fbshipit-source-id: 97ebf5ef671ade8bf46d0860edc317f22660f26d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66234
Modified loops in files under fbsource/fbcode/caffe2/ from the format
`for (TYPE var = x0; var < x_max; var++)`
to the format
`for (const auto var : irange(x_max))`
This was achieved by running r-barnes's loop upgrader script (D28874212), with some modifications to exclude all files under /torch/jit; a number of reversions and unused-variable suppressions were added by hand.
bypass_size_limit
allow-large-files
Test Plan: Sandcastle
Reviewed By: ngimel
Differential Revision: D30652629
fbshipit-source-id: 0ae6c4bbbb554bad42e372792a6430e1acf15e3e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65345
`FooType::get()` can return a const reference. Inconveniently, converting `shared_ptr<FooType>` to `shared_ptr<Type>` requires a copy and a refcount bump, so to properly take advantage of this in `unshapedType()` we need to take a `const Type&` in `isSubtypeOf()`, which is good practice anyway: don't require a `shared_ptr` if you don't need to take ownership.
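A minimal sketch of the signature direction (heavily simplified; the real hierarchy is `c10::Type` and the subtype logic is much richer):
```
#include <memory>

struct Type {
  virtual ~Type() = default;
  // Takes a const reference instead of shared_ptr<Type>: callers that
  // merely inspect a type pay no copy or atomic refcount bump.
  bool isSubtypeOf(const Type& rhs) const { return this == &rhs; }
};

bool check(const std::shared_ptr<Type>& a, const Type& b) {
  return a->isSubtypeOf(b);  // no shared_ptr conversion needed
}
```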
ghstack-source-id: 140044165
Test Plan:
CI
perf says c10::unshapedType time decreased from 2.8% to 2.2% during static runtime startup, though I expect this to be generally beneficial.
Reviewed By: hlu1
Differential Revision: D31027361
fbshipit-source-id: 676feb81db9f74ad7b8651d8774f4ecb4cfa6ab8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66113
For a benchmark compiled in opt-mode, in which the lookup items were shuffled and then looked up in round-robin fashion 10M times (for a total of 140M lookups), we see:
```
Function            Container            Time (ms)   Multiplier
TypeMetaToDataType  if-chain                   233           1x
TypeMetaToDataType  std::vector                795        3.41x
TypeMetaToDataType  std::map                  1566        6.72x
TypeMetaToDataType  std::unordered_map        2136        9.17x
DataTypeToTypeMeta  switch                     102           1x
DataTypeToTypeMeta  std::vector                666        6.53x
DataTypeToTypeMeta  std::map                  1212        11.9x
DataTypeToTypeMeta  std::unordered_map        1539        15.1x
DataTypeToTypeMeta  folly::F14FastMap         1789        17.5x
```
From this, we draw two conclusions:
1. Using a complex container like `std::map` is worse than using a simple vector lookup here (there aren't enough items for the Big-O to assert itself).
2. Using any container at all is a mistake. (Unless we pull in more exotic reasoning like invalidating the code cache or preventing inlining.)
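A sketch of the container-free approach (illustrative names, not the actual caffe2 mapping code):
```
#include <cstddef>
#include <cstdint>

enum class DataType : int32_t { FLOAT, DOUBLE, INT32, UNDEFINED };

// A plain switch compiles to a compare chain or jump table: no hashing,
// no heap-allocated nodes, and it inlines cleanly at call sites.
constexpr size_t ItemSize(DataType dt) {
  switch (dt) {
    case DataType::FLOAT:  return 4;
    case DataType::INT32:  return 4;
    case DataType::DOUBLE: return 8;
    default:               return 0;
  }
}
```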
Test Plan: Sandcastle
Reviewed By: dzhulgakov
Differential Revision: D31375117
fbshipit-source-id: 0b310c6c2e94080d125c82fb7c2b43ab869adbcb
Summary:
Delete `-Wno-unused-variable` from top level `CMakeLists.txt`
Still suppress those warnings for tests and `torch_python`
Delete a number of unused variables from caffe2 code
Use `(void)var;` to suppress unused variables in range loops (see the sketch below)
Use `C10_UNUSED` for global constructors and use `constexpr` instead of `static` for global constants
Do not delete `caffe2::OperatorBase::Output` calls as they have side effects
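A minimal sketch of the range-loop suppression (assuming `c10::irange`):
```
#include <c10/util/irange.h>

int count_to(int n) {
  int total = 0;
  for (const auto i : c10::irange(n)) {
    (void)i;  // loop index intentionally unused; silences -Wunused-variable
    ++total;
  }
  return total;
}
```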
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66041
Reviewed By: ngimel
Differential Revision: D31360142
Pulled By: malfet
fbshipit-source-id: 6fdfb9f91efdc49ca984a2f2a17ee377d28210c8
Summary:
Delete `-Wno-unused-variable` from top level `CMakeLists.txt`
Still suppress those warnings for tests and `torch_python`
Delete a number of unused variables from caffe2 code
Use `(void)var;` to suppress unused variables in range loops
Use `C10_UNUSED` for global constructors and use `constexpr` instead of `static` for global constants
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65954
Reviewed By: ngimel
Differential Revision: D31326599
Pulled By: malfet
fbshipit-source-id: 924155f1257a2ba1896c50512f615e45ca1f61f3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65610
- Replace HIP_PLATFORM_HCC with USE_ROCM (a before/after sketch follows below)
- Don't rely on CUDA_VERSION or HIP_VERSION; use USE_ROCM and ROCM_VERSION instead.
- In the next PR:
  - Remove the mapping from CUDA_VERSION to HIP_VERSION and from CUDA to HIP in hipify.
  - HIP_PLATFORM_HCC is deprecated, so add HIP_PLATFORM_AMD to support HIP host code compilation on gcc.
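A hedged before/after sketch of the guard replacement (illustrative code, not a hunk from the diff):
```
// Before: keyed off the deprecated platform macro.
// #ifdef __HIP_PLATFORM_HCC__
//   ... ROCm-only code ...
// #endif

// After: keyed off the build-system flag.
#ifdef USE_ROCM
// ... ROCm-only code ...
#endif
```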
cc jeffdaily sunway513 jithunnair-amd ROCmSupport amathews-amd
Reviewed By: jbschlosser
Differential Revision: D30909053
Pulled By: ezyang
fbshipit-source-id: 224a966ebf1aaec79beccbbd686fdf3d49267e06
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63612
This makes Tensor inherit from a new class TensorBase that provides a subset of Tensor that doesn't
directly depend on native_functions.yaml. Code that only includes TensorBase.h will thus not need to
be rebuilt every time someone changes an operator signature.
Making `Tensor` inherit from this class means that `const TensorBase&` parameters will be callable
with an ordinary `Tensor`. I've also made `Tensor` constructible and assignable from `TensorBase` to
minimize friction in code mixing the two types.
To help enforce that `Tensor.h` and `Functions.h` aren't accidentally included, I've added an error
into `Operators.h` if `TORCH_ASSERT_NO_OPERATORS` is defined. We can either set this in the build
system for certain folders, or just define it at the top of any file.
I've also included an example of manually special-casing the commonly used `contiguous` operator.
The inline function's slow path defers to `TensorBase::__dispatch_contiguous` which is defined in
`Tensor.cpp`. I've made it so `OptionalTensorRef` is constructible from `TensorBase`, so I can
materialize a `Tensor` for use in dispatch without actually increasing its refcount.
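A minimal sketch of consuming the new header under the enforcement macro (the free function here is illustrative):
```
// Opting a file into the enforcement described above: including any
// generated-operator header below this line becomes a compile error.
#define TORCH_ASSERT_NO_OPERATORS
#include <ATen/core/TensorBase.h>

// Metadata-only code can take TensorBase and stays decoupled from
// native_functions.yaml.
int64_t numel_of(const at::TensorBase& t) {
  return t.numel();
}
```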
Test Plan: Imported from OSS
Reviewed By: gchanan
Differential Revision: D30728580
Pulled By: ezyang
fbshipit-source-id: 2cbc8eee08043382ee6904ea8e743b1286921c03
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64401
PlanExecutorTest.BlockingErrorPlan uses `ASSERT_DEATH` which internally performs a `fork()`. This can cause problems under certain configurations that use threads. This change updates this test to use the "threadsafe" style for GTest death tests in order to improve its quality in multithreaded environments.
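The style switch is standard GoogleTest; a minimal sketch of the pattern:
```
#include <gtest/gtest.h>
#include <cstdlib>

TEST(PlanExecutorTestSketch, BlockingErrorPlan) {
  // "threadsafe" re-executes the test binary in the child process instead
  // of running the statement after a bare fork(), which is safer when
  // other threads exist.
  testing::FLAGS_gtest_death_test_style = "threadsafe";
  ASSERT_DEATH({ std::abort(); }, "");
}
```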
Test Plan:
I confirmed that this change fixes the issue on my devvm with the following command:
```
buck test mode/dev //caffe2/caffe2:caffe2_test_cpu -- PlanExecutorTest.BlockingErrorPlan
```
Reviewed By: praihan
Differential Revision: D30709447
fbshipit-source-id: 12ffd9ad0371e2e5b43a9873c80568e5ab02d246
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64244
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64040
In operator cost inference functions, in many places we were using `sizeof(x.data_type())`. Since `data_type()` returns a 32-bit integer from [this enum](https://www.internalfb.com/code/fbsource/[15e7ffe4073cf08c61077c7c24a4839504b964a2]/fbcode/caffe2/caffe2/proto/caffe2.proto?lines=20), `sizeof(x.data_type())` always yields 4 no matter what actual data type x has. Big thanks to Jack Langman for specifically pointing out this bug.
We now instead use the size in bytes of the actual data type.
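A hedged sketch of the bug and the fix (the enum, struct, and helper are illustrative stand-ins for the proto types):
```
#include <cstddef>
#include <cstdint>

enum DataType : int32_t { DT_FLOAT = 1, DT_DOUBLE = 2, DT_UINT8 = 3 };
struct TensorShape {
  DataType dt;
  DataType data_type() const { return dt; }
};

size_t element_size(DataType dt) {
  if (dt == DT_FLOAT) return 4;
  if (dt == DT_DOUBLE) return 8;
  if (dt == DT_UINT8) return 1;
  return 0;
}

size_t bytes(const TensorShape& x, size_t numel) {
  // Bug: sizeof(x.data_type()) == sizeof(int32_t) == 4, always.
  // Fix: use the byte width of the actual element type:
  return numel * element_size(x.data_type());
}
```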
Test Plan:
Added unit tests BatchMatMulMemCostTest:
buck test //caffe2/caffe2/fb/fbgemm:batch_matmul_op_test -- BatchMatMulMemCostTest
Extended existing unit test test_columnwise_concat for different data types:
buck test //caffe2/caffe2/python/operator_test:concat_op_cost_test -- test_columnwise_concat
Reviewed By: CrazySherman
Differential Revision: D30656698
fbshipit-source-id: d42c0c9a0c5b0ddc5dba39e4994f1f85a5e618bf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64285
With C++14 heterogeneous ordered container lookup, it is no longer necessary to create a `std::string` in order to look up elements of a `CaffeMap` keyed by std::string. Accordingly, this diff reworks the argument-getting operator functions to avoid that in favor of `c10::string_view`.
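For reference, heterogeneous lookup works as in this standalone sketch (using `std::string_view` as a stand-in for `c10::string_view`):
```
#include <map>
#include <string>
#include <string_view>

// std::less<> is a transparent comparator, so find() accepts any type
// comparable with the key -- no temporary std::string is materialized.
std::map<std::string, int, std::less<>> args{{"axis", 1}, {"keepdims", 0}};

int lookup(std::string_view name) {
  auto it = args.find(name);
  return it == args.end() ? -1 : it->second;
}
```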
ghstack-source-id: 137139818
Test Plan: buildsizebot iOS apps -- code size win. Fewer strings is probably marginally good for perf, but this only happens at setup time anyway.
Reviewed By: dzhulgakov
Differential Revision: D26826676
fbshipit-source-id: ee653b14dc2c528bae8c90f0fc6a7a419cbca1d6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64040
In operator cost inference functions, in many places we were using `sizeof(x.data_type())`. Since `data_type()` returns a 32-bit integer from [this enum](https://www.internalfb.com/code/fbsource/[15e7ffe4073cf08c61077c7c24a4839504b964a2]/fbcode/caffe2/caffe2/proto/caffe2.proto?lines=20), `sizeof(x.data_type())` always yields 4 no matter what actual data type x has. Big thanks to Jack Langman for specifically pointing out this bug.
We now instead use the size in bytes of the actual data type.
Test Plan:
Added unit tests BatchMatMulMemCostTest:
buck test //caffe2/caffe2/fb/fbgemm:batch_matmul_op_test -- BatchMatMulMemCostTest
Extended existing unit test test_columnwise_concat for different data types:
buck test //caffe2/caffe2/python/operator_test:concat_op_cost_test -- test_columnwise_concat
Differential Revision: D30561459
fbshipit-source-id: 976fa5167097a35af548498480001aafd7851d93
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61500
libstdc++ defines a static variable called `std::__ioinit` in iostream that adds global constructor size overhead to each translation unit that includes iostream. To reduce the size overhead from that, we can often include ostream instead.
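A sketch of the substitution (ordinary standard-library usage; `Point` is illustrative):
```
// Before: #include <iostream>  // pulls in std::__ioinit's static init
// After: streaming declarations only.
#include <ostream>

struct Point { int x, y; };

inline std::ostream& operator<<(std::ostream& os, const Point& p) {
  return os << '(' << p.x << ", " << p.y << ')';
}
```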
ghstack-source-id: 136163529
Test Plan: buildsizebot some mobile apps
Reviewed By: dhruvbird
Differential Revision: D29648016
fbshipit-source-id: 9c3139712c71248513cc5032d21e77f3ecbae8fe
Summary:
Add `-Wno-writable-strings` (which is clang's flavor of `-Wwrite-strings`) to the list of warnings ignored while compiling torch_python.
Avoid unnecessary copies in range loops (see the sketch below)
Fix a number of signed/unsigned comparisons
Found while building locally on M1
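A minimal sketch of the range-loop copy fix:
```
#include <string>
#include <vector>

size_t total_length(const std::vector<std::string>& names) {
  size_t n = 0;
  // Binding by const reference avoids copying each string per iteration
  // (was: for (auto name : names)).
  for (const auto& name : names) {
    n += name.size();
  }
  return n;
}
```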
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62930
Reviewed By: albanD
Differential Revision: D30171981
Pulled By: malfet
fbshipit-source-id: 25bd43dab5675f927ca707e32737ed178b04651e
Summary:
These cases were found by compiling with clang on Windows.
In those cases the functions would still be exported, which is a waste of space in the symbol table.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62952
Reviewed By: gchanan
Differential Revision: D30191291
Pulled By: ezyang
fbshipit-source-id: 3319b0ec4f5fb02e0fe1b81dbbcedcf12a0c795e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62632
Update the caffe2/core/context.h to directly use `at::mt19937` instead of the
`at::CPUGeneratorImpl` wrapper class from the ATen-cpu library.
Using `at::CPUGeneratorImpl` causes circular dependencies between the ATen and
caffe2 code. In particular the `at::CPUGeneratorImpl::get_state()` logic
depends on CPU Tensor functionality that currently depends on code from
caffe2.
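Direct use of the engine looks like this minimal sketch (assuming `at::mt19937` from ATen/core/MT19937RNGEngine.h):
```
#include <ATen/core/MT19937RNGEngine.h>
#include <cstdint>

uint32_t draw(uint64_t seed) {
  at::mt19937 engine(seed);   // header-only Mersenne Twister
  return engine();            // next 32-bit value, no virtual dispatch
}
```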
Test Plan:
The RNG behavior should be identical to the previous code (perhaps even
faster, since we now avoid virtual function calls).
buck test //caffe2/caffe2:caffe2_test_cpu \
//caffe2/caffe2/python: //caffe2/caffe2/fb/operators:
Differential Revision: D29915701
fbshipit-source-id: f9b2eab8d3b21b2224d30bcf52be9c0e7eb7cd0a
Summary:
As the GoogleTest `TEST` macro, as well as `DEFINE_DISPATCH`, is non-compliant with the `cppcoreguidelines-avoid-non-const-global-variables` clang-tidy check, drop the per-line suppressions.
All changes but the ones to `.clang-tidy` are generated using following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h" | \
          xargs grep cppcoreguidelines-avoid-non-const-global-variables | \
          cut -f1 -d: | sort | uniq`; do
  sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i
done
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008
Reviewed By: driazati, r-barnes
Differential Revision: D29838584
Pulled By: malfet
fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
Summary:
Follow-up to https://github.com/pytorch/pytorch/issues/18584. This PR covers the remaining places where an event or stream query might result in a "not ready" error.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61554
Reviewed By: mrshenli
Differential Revision: D29763973
Pulled By: ezyang
fbshipit-source-id: 41d988d1826b2309cc6b01a81144094b353abdf9
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61006
Test Plan: Modified existing unit test to test for eps = 0. It would fail without the equality test first.
Reviewed By: ajyu
Differential Revision: D29423770
fbshipit-source-id: 168e7de00d8522c4b646a8335d0120700915f260
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59775
This operator is similar to `GetAllBlobNames` but also returns the estimated
size required to serialize each blob.
One goal of this operator is to allow checkpoint saving logic to estimate the
amount of space/bandwidth required to save a checkpoint when first starting
training, without actually serializing any blobs yet. Currently the
checkpointing logic uses `GetAllBlobNames` to determine the blobs to
checkpoint. It can instead be updated to use `EstimateAllBlobSizes` to also
get an estimate for how much space will be required for the checkpoint.
ghstack-source-id: 132275153
Test Plan: Included a new unit test.
Reviewed By: mraway
Differential Revision: D29020227
fbshipit-source-id: 811e5d86c4b59183e84e6424c48c97739be09043
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60208
Update the DB APIs so that `db::Transaction::Put()` accepts the value by
rvalue reference. This allows DB implementations to write data asynchronously
without being forced to make an additional copy of the data in memory.
`Put()` implementations can now use the string move constructor or assignment
operator to get the string data and continue performing the write
asynchronously after returning from `Put()`.
Note that I chose to entirely replace the existing `Put()`, removing the
ability for callers to call `Put()` with a `const std::string&` argument for
the value, rather than simply adding another overloaded version of `Put()`.
This was done because in practice there were no call sites using `Put()` that
cannot move in their data. Eliminating the `const std::string&` API entirely
simplifies the DB implementations: DBs that wish to support move semantics do
not have to implement both the move and the copy versions of `Put()`.
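A minimal sketch of the resulting interface (simplified from the description above, not the exact caffe2/core/db.h declaration):
```
#include <string>

class Transaction {
 public:
  virtual ~Transaction() = default;
  // Value taken by rvalue reference: an implementation may move it into
  // an internal queue and finish the write asynchronously after Put()
  // returns, without copying the payload.
  virtual void Put(const std::string& key, std::string&& value) = 0;
};

// Callers hand off ownership explicitly:
//   txn->Put("blob_name", std::move(serialized));
```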
Test Plan:
Searched through fbcode to try and make sure I found all `db::Transaction`
subclasses, and will check sandcastle results to help confirm.
Ran the modelstore checkpointing unit tests.
Differential Revision: D29204425
fbshipit-source-id: 28be6646e92e5df71954d4bb3dc0c8add30ed041
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60207
Update the `BlobSerializerBase` API so that the serialized blob data is
passed as a `std::string&&` rather than a `const std::string&`. This allows the
acceptor to take ownership of the string data and, for example, queue it for
storing asynchronously, rather than having to make a copy of the data if it
needs to remain valid after returning.
All existing `BlobSerializerBase` implementations already pass in a valid
rvalue reference to the data, so this change did not require updating any of
the existing serializer implementations.
ghstack-source-id: 132216750
Test Plan:
Examined all ~46 `BlobSerializerBase` subclasses in fbsource to confirm they
already pass in an rvalue reference for this argument. Also searched for
`BlobSerializerBase` on google and did not find any external references to
this class in other open source projects that might be affected.
Differential Revision: D29204426
fbshipit-source-id: b1d567e52a5c17a01d651c70bbfa2fddbaea6cd9
Summary:
The previous PR is https://github.com/pytorch/pytorch/issues/57781
We now add two CUDA bindings to avoid using ctypes, which fixes a Windows issue.
However, we still use ctypes to allocate the stream and create its pointer
(we could do this with a 0-dim tensor too if that feels better).
CC. ezyang rgommers ngimel mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59527
Reviewed By: albanD
Differential Revision: D29053062
Pulled By: ezyang
fbshipit-source-id: 661e7e58de98b1bdb7a0871808cd41d91fe8f13f