Summary:
On a CPU-only build of PyTorch, `torch._C._jit_set_nvfuser_enabled(False)` would throw an error even though it is a no-op there. With this fix:
```
>>> torch._C._jit_set_nvfuser_enabled(False)
False
>>> torch._C._jit_set_nvfuser_enabled(True)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: Running CUDA fuser is only supported on CUDA builds.
>>>
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71358
Reviewed By: eellison
Differential Revision: D33601135
Pulled By: jansel
fbshipit-source-id: c764df2fa197ce7b4f71e5df0a91cd988766e99c
(cherry picked from commit a801df9321)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71091
Fixes https://github.com/pytorch/pytorch/issues/65394
The masked sum on a full input tensor (of any layout) with an all-true mask is the same as the sum on the strided input tensor (after applying `to_dense` to sparse inputs).
Since masked sum uses `torch.sparse.sum`, its reduction behavior ought, for the simplicity of the masked reduction implementations, to be defined by the behavior of `torch.sum`. This PR implements that behavioral connection with respect to the directional summation of empty sparse tensors, which correspond to all-zero strided tensors.
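A minimal sketch of the resulting invariant, assuming the post-fix behavior (shapes and values are illustrative):
```python
import torch

# A directional sum over an empty sparse tensor (nnz == 0, i.e. the sparse
# form of an all-zero strided tensor) should agree with torch.sum on the
# dense equivalent, matching the all-true-mask case described above.
t = torch.zeros(2, 3).to_sparse()              # sparse COO tensor, nnz == 0
dense_result = torch.sum(t.to_dense(), dim=0)  # tensor([0., 0., 0.])
sparse_result = torch.sparse.sum(t, dim=0)     # sparse: one sparse dim remains
assert torch.equal(sparse_result.to_dense(), dense_result)
```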
cc nikitaved pearu cpuhrsch
Test Plan: Imported from OSS
Reviewed By: davidberard98
Differential Revision: D33651750
Pulled By: cpuhrsch
fbshipit-source-id: 703891bff88c8da6270b4272f5d2da81688db67d
(cherry picked from commit 53f97e80f7)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69060
Saved-variable-hooks checkpointing was added in https://github.com/pytorch/pytorch/pull/69508; this PR adds some tests for DDP.
Specifically, we can support almost all DDP use cases with this new API, such as dynamic module with find_unused_parameters=True. One case remains to be supported, which is static_graph + non-reentrant based checkpointing. The underlying reason this does not work is https://github.com/pytorch/pytorch/issues/58111.
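A hedged sketch of the now-supported use case (module and names are illustrative; assumes a process group is already initialized):
```python
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(10, 10)

    def forward(self, x):
        # use_reentrant=False selects the saved-variable-hooks based
        # (non-reentrant) checkpointing added in #69508
        return checkpoint(self.layer, x, use_reentrant=False)

# With a process group initialized, dynamic modules now work under DDP:
# model = torch.nn.parallel.DistributedDataParallel(
#     Model(), find_unused_parameters=True)
```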
ghstack-source-id: 147219887
Test Plan: CI
Reviewed By: zhaojuanmao
Differential Revision: D32712126
fbshipit-source-id: ba5ae9ca77fd8929ee020c7dc97838bae9a1931b
(cherry picked from commit 9c7f93e217)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71462
Fixes
```
6 aienv/aienv_ig_reels_base:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
6 deep_entity_classification/si_dec_gnn:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
6 feed_recommendation_infra/multifeed_execution_graph_service_nosan:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
12 mobile_cv/mobile-vision_experimental:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
30 mobile_cv/mobile-vision_xraymobilev2_detection_caffe2:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
42 aienv/aienv:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
128 feed_recommendation_infra/multifeed_recagg_dev:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
136 fluent2/fblearner_flow_projects_fluent2_nosan:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
1338 f6/f6_nosan:caffe2/modules/detectron/upsample_nearest_op.h:65:1: error: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering [-Werror,-Wpass-failed=transform-warning]
```
Test Plan: Sandcastle
Reviewed By: luciang
Differential Revision: D33641869
fbshipit-source-id: 8424849cfac5cb0109272dec2086863067bbde66
(cherry picked from commit d18429905c)
Summary:
Reference https://github.com/pytorch/pytorch/issues/69991
Refactored such that only the `out` variant copies the result into `out`; otherwise we just return the result of the composite functions as-is.
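A hedged sketch of the pattern (operator names are illustrative, not the actual functions touched by this PR):
```python
import torch

def my_op(x: torch.Tensor) -> torch.Tensor:
    # stand-in for the composite computation; its result is returned as-is
    return x * 2

def my_op_out(x: torch.Tensor, out: torch.Tensor) -> torch.Tensor:
    result = my_op(x)
    out.resize_(result.shape)
    out.copy_(result)  # only the out= variant performs a copy
    return out
```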
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70894
Reviewed By: samdow
Differential Revision: D33641742
Pulled By: zou3519
fbshipit-source-id: 671be13b31a7fff3afc0b7976706a5ecfc51ccac
(cherry picked from commit e7d5ac9af3)
Summary:
The sccache compilation log is often misleading.
We can move it to its own group so people don't see it right away.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71444
Reviewed By: atalman
Differential Revision: D33659650
Pulled By: janeyx99
fbshipit-source-id: f22fd21640a8747beeacce8857bbb8281efd76f4
(cherry picked from commit e25970abf9)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70266
Addresses some of the issues mentioned in
https://github.com/pytorch/pytorch/issues/65638. The ShardedLinear
implementation only supports 2D inputs, whereas `nn.Linear` supports
arbitrary dimensions for inputs and outputs. As a result, in this PR I've
added support to ensure that ShardedLinear handles arbitrary input dims as
well.
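For reference, the `nn.Linear` behavior being matched:
```python
import torch
import torch.nn as nn

# nn.Linear accepts any number of leading dimensions; only the last
# dimension must equal in_features.
linear = nn.Linear(16, 32)
x = torch.randn(4, 8, 5, 16)
assert linear(x).shape == (4, 8, 5, 32)
```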
ghstack-source-id: 147206607
Test Plan: waitforbuildbot
Reviewed By: wanchaol
Differential Revision: D33267630
fbshipit-source-id: 0460994c3aa33348b80547d9274206ef90cb29b6
(cherry picked from commit 7c289e1dbf)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71461
After the operator versioning work, the version in the model file is used for operator versioning, while bytecode_version is used for bytecode versioning (for the bytecode schema). They are two separate things now, and this comparison is not needed.
ghstack-source-id: 147209286
Test Plan: CI
Reviewed By: iseeyuan, tugsbayasgalan
Differential Revision: D33648592
fbshipit-source-id: beaa136a728f88435176a00c07b2d521210f107f
(cherry picked from commit e90e650e1a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70615
This adds `at::detail::empty_meta` and
`at::detail::empty_strided_meta` to complement the CPU API.
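These helpers back the Python-level meta-device factories; a quick illustration of what they construct:
```python
import torch

# A meta tensor carries shape, stride, and dtype but allocates no storage.
t = torch.empty(3, 4, device="meta")
print(t.shape, t.device)  # torch.Size([3, 4]) meta
```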
Test Plan: Imported from OSS
Reviewed By: samdow
Differential Revision: D33623678
Pulled By: ngimel
fbshipit-source-id: 59e003116361fb547ec2c633bbc15a7973e21d0e
(cherry picked from commit b4f5836fa1)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70614
This creates an `empty_strided_generic` function which, similar to
`empty_generic`, is a device-independent tensor constructor. This also
adds `at::detail::empty_strided_cpu` to complement
`at::detail::empty_cpu`.
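The Python-level counterpart of the strided factory, for context:
```python
import torch

# Allocate an uninitialized tensor with an explicit size and stride
# (here a column-major 2x3 layout).
t = torch.empty_strided((2, 3), (1, 2))
print(t.stride())  # (1, 2)
```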
Test Plan: Imported from OSS
Reviewed By: samdow
Differential Revision: D33623679
Pulled By: ngimel
fbshipit-source-id: 85994e88d664870bf425f398dfcdfc467885c694
(cherry picked from commit 2ff2a89df5)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71447
Changes the nightly build trigger to be based on pushes to the `nightly`
branch instead of on tagged pushes. This aligns it with our current
CircleCI trigger and should make it easily viewable using tools like
https://hud.pytorch.org/ci/pytorch/pytorch/nightly
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D33647102
Pulled By: seemethere
fbshipit-source-id: c6757da35b7ec2d68bf36160dd7f3cb9ed040899
(cherry picked from commit 99b7b22650)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70613
This refactors `at::detail::empty_cpu` to use only `TensorBase` so you
can construct tensors without including `Tensor.h`. It also adds a
`TensorOptions` version to reduce friction in operators moving from
the `at::empty` API.
Test Plan: Imported from OSS
Reviewed By: samdow
Differential Revision: D33623682
Pulled By: ngimel
fbshipit-source-id: 7a7b08bc2ed06830a3d698197a0c8389a096dc1d
(cherry picked from commit 2e17ad0bbd)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71443
The cogwheel test inline_cvr_infer_canary_pyper_model_publish is timing out.
The convert_fx call takes > 20 mins for the local and local_ro submodules, which used to take ~2 mins.
Test Plan:
FBLearner flow run
* the following cmd took 1113 seconds before the diff and 5002 seconds after:
```
flow-cli clone-locally 320014219 --run-as-secure-group pytorch_at_scale --operators pyper_model_publish_workflow.pyper_model_publish_workflow.process_torch_package_model_files.process_non_sparse_parameters[0]
```
Cogwheel test
* Cogwheel test with packages in B3588 (the last good run) took 4694.48s
* Cogwheel test with packages in B3590 (the first timeout) took 13975.83s
* Cogwheel test with the following packages took 4535.04s
  * all packages in B3588 except the model publish
  * the model publish built with D33469839 (043e84b3d2) reversed (created D33633570)
Reviewed By: albanD, jerryzh168
Differential Revision: D33633570
fbshipit-source-id: dc5e777c48a90c551641a3f79126461f6a60449e
(cherry picked from commit 03ab65023a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71417
I accidentally changed CPU_INSTANT_EVENT to CPU_OP, which broke TensorBoard.
Test Plan: Make memory profiling unit test check this case.
Reviewed By: aaronenyeshi
Differential Revision: D33637286
fbshipit-source-id: c95945f6b85cd4168820bd4d2a9203274a0a5bd6
(cherry picked from commit b1e258672a)
Summary:
In graph_executor.cpp, line 963, a '\n' is missing in GRAPH_DEBUG, which all the other GRAPH_DEBUG calls here include. The output in GRAPH_DEBUG looks off without it:
```
[DEBUG graph_executor.cpp:963] After CheckInplace (end of runOptimization)graph(%0 : Float(*, *, *, *, requires_grad=0, device=cpu),
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70421
Reviewed By: Gamrix
Differential Revision: D33596430
Pulled By: davidberard98
fbshipit-source-id: 0e7c3c02ce44bf925f0c45e96a382104059fe397
(cherry picked from commit 55899528a2)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71431
Adds a PR trigger, based on paths, to the binary build workflows to make
it easier to test and verify changes to those workflows without adding a
bunch of skipped checks to the majority of our workflows.
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS
Reviewed By: atalman
Differential Revision: D33641276
Pulled By: seemethere
fbshipit-source-id: 0ed65cbcebf06dfe998f81d67df817250dd1a716
(cherry picked from commit 598b55fd18)
Summary:
* Use `pytorchmergebot` credentials to do the merge
* Infer the sync branch name from the workflow rather than hardcoding it
* Move common functions from `syncbranches.py` to `gitutils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71420
Reviewed By: bigfootjon
Differential Revision: D33638846
Pulled By: malfet
fbshipit-source-id: a568fd9ca04f4f142a7f5f64363e9516f5f4ef1c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69407
This generates aten_interned_strings.h from `native_functions.yaml`,
which is more like how it was originally done. The items deleted from
`interned_strings.h` are duplicates that need to be removed in order
for the code to compile; some of the remaining items may still be out
of date, but even if that's the case it is fairly benign.
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D32923636
Pulled By: albanD
fbshipit-source-id: a0fd6b3714e70454c5f4ea9b19da5e047d2a4687
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70850
We support both, so we want to ensure both continue to work.
ghstack-source-id: 146960552
Test Plan: Tested manually. A subsequent diff adds this test configuration to CI.
Reviewed By: malfet
Differential Revision: D33297464
fbshipit-source-id: 70e1431d0907d480c576239af93ef57036d5e4d7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70848
This is the C10 library, and it's the main lib we are building
here. While here, use `local_defines` instead of `copts` for this
definition. Both `copts` and `local_defines` apply only to the
compilation units in the library, and not transitively.
ghstack-source-id: 146998039
Test Plan: We are relying on CI to verify this doesn't cause any problems.
Reviewed By: malfet
Differential Revision: D33429420
fbshipit-source-id: b3fc84c0588bd43346e3f9f77e851d293bde9428
Summary:
This PR adds a persistent filesystem cache for jitted kernels. The cache is disabled on Windows because it relies on POSIX headers.
The cache writes, by default, to `~/.cache/torch/kernels`, but the location can be controlled by setting the `PYTORCH_KERNEL_CACHE_PATH` environment variable. A separate environment variable, `USE_PYTORCH_KERNEL_CACHE`, disables all caching logic when set to zero.
The use of a persistent filesystem cache dramatically lowers the "first call time" for an operator after it has been compiled, because it skips (most of) the jit compilation process. On systems where we compile only to ptx, that ptx still has to be just-in-time compiled by the driver API, so an additional latency of around 10 milliseconds is expected at first call time. On systems which compile to SASS, the additional first-call latency is about one millisecond. This compares with times of 150+ milliseconds for just-in-time kernel compilation.
Files in the cache use a mostly human-readable name that includes an SHA1 hash of the CUDA C string used to generate them. Note that this is not an SHA1 hash of the file's contents, because the contents are the compiled ptx or SASS. No verification is done when the file is loaded to ensure the kernel is what's expected, but it's far more likely you'll be struck by a meteor than observe two file names conflicting. Using SHA1 hashes to generate unique ids this way is a common practice (GitHub does it, too).
This cache design could be reused by other fusion systems and should allow us to jiterate more operations without fear of regressing the "incremental development" scenario where users tweak or extend their programs slightly, rerun them, and repeat that process again and again. Without a cache, each run of the program would have to recompile every jitted kernel; with this cache we expect a negligible impact on the user experience.
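A hedged sketch of the naming scheme described above (the exact file-name format is illustrative, not the actual implementation):
```python
import hashlib

def cache_file_name(kernel_name: str, cuda_c_source: str) -> str:
    # Mostly human-readable prefix, plus a SHA1 hash of the generating
    # CUDA C source string (not of the compiled ptx/SASS contents).
    digest = hashlib.sha1(cuda_c_source.encode("utf-8")).hexdigest()
    return f"{kernel_name}_{digest}"

print(cache_file_name("my_fused_kernel", "__global__ void k() {}"))
```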
cc kshitij12345, xwang233
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71350
Reviewed By: ngimel
Differential Revision: D33626671
Pulled By: mruberry
fbshipit-source-id: d55df53416fbe46348623846f699f9b998e6c318
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70612
The device information is embedded in the `DataPtr` returned from the
allocator, so this argument is completely ignored.
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision: D33623681
Pulled By: ngimel
fbshipit-source-id: bea64707bb17d46debb0ed7c1175493df56fee77
Summary:
This PR enables the `test_block_triangular` tests on the CPU.
These tests revealed a problem with how the nnz == 0 case was handled. Now we return a tensor filled with NaNs on both CUDA and CPU.
cc nikitaved pearu cpuhrsch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71304
Reviewed By: davidberard98
Differential Revision: D33600482
Pulled By: cpuhrsch
fbshipit-source-id: d09cb619f8b6e54b9f07eb16765ad1c183c42487
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71369
Suppress `-Wimplicit-int-float-conversion` in `TypeSafeSignMath.h` when building with clang
Test Plan: CI check
Reviewed By: r-barnes
Differential Revision: D33612983
fbshipit-source-id: cff1239bc252d4a2f54a50a2bbcd48aeb8bf31ca
Summary:
This PR removes the PyTorch nightly dependencies of TorchBench CI. Instead, it relies on the bisection script to install TorchBench dependencies (https://github.com/pytorch/benchmark/pull/694).
This will unblock TorchBench CI users when the nightly build fails (e.g., https://github.com/pytorch/pytorch/issues/71260)
RUN_TORCHBENCH: resnet18
TORCHBENCH_BRANCH: xz9/optimize-bisection
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71323
Reviewed By: wconstab
Differential Revision: D33591713
Pulled By: xuzhao9
fbshipit-source-id: f1308ea33ece1f18196c993b40978351160ccc0c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71356
Suppress the remaining header-based warnings in `caffe2/c10` when building with `clang`
Test Plan: CI pass
Reviewed By: r-barnes
Differential Revision: D33600097
fbshipit-source-id: e1c0d84a0bad768eb03e047d62b5379cf28b48e2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71072
This PR replaces the old logic of loading frozen torch through CPython by loading zipped torch modules directly onto the deploy interpreter. We embed the zip file as a section of the ELF binary and read it back from the interpreter executable. Then we insert the zip file directly into the `sys.path` of each initialized interpreter. Python's implicit ZipImporter module can load modules from a zip file as long as it is on `sys.path`.
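The standard-library mechanism this relies on, in miniature:
```python
import sys
import zipfile

# Python's built-in zipimport machinery imports modules from any zip
# archive that appears on sys.path.
with zipfile.ZipFile("modules.zip", "w") as zf:
    zf.writestr("hello.py", "MESSAGE = 'loaded from a zip'\n")

sys.path.insert(0, "modules.zip")
import hello
print(hello.MESSAGE)  # loaded from a zip
```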
Test Plan: buck test //caffe2/torch/csrc/deploy:test_deploy
Reviewed By: shunting314
Differential Revision: D32442552
fbshipit-source-id: 627f0e91e40e72217f3ceac79002e1d8308735d5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71254
When we configure linear and relu with the same qconfig, we currently have utility functions to also
generate a qconfig for the fused linear-relu module, but this code was not called in the correct order,
which resulted in unexpected behavior. This PR fixes the issue. Please see the test case for more details.
(The test case is from Supriya.)
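A hedged sketch of the configuration involved, using the FX graph-mode quantization API of this era (the model and backend choice are illustrative):
```python
import torch
from torch.ao.quantization import get_default_qat_qconfig

# Linear and ReLU share one qconfig; the fix ensures the fused LinearReLU
# module generated during QAT fusion is swapped using that same qconfig.
qconfig = get_default_qat_qconfig("fbgemm")
qconfig_dict = {"object_type": [
    (torch.nn.Linear, qconfig),
    (torch.nn.ReLU, qconfig),
]}
# prepared = prepare_qat_fx(model.train(), qconfig_dict)
```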
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_fused_module_qat_swap
Imported from OSS
Reviewed By: supriyar
Differential Revision: D33558321
fbshipit-source-id: d95114dc4b77264e603c262c2da02a3de4acba69
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71290
The existing code called an out-of-line hash function on a constant. This is just going to get the same random-looking 64-bit integer every time, so I just changed the constant to an integer I generated with `hex(random.randint(0x1000000000000000, 0xFFFFFFFFFFFFFFFF))` to get the same effect but without the runtime hashing.
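The generator one-liner, runnable as-is:
```python
import random

# Produce a fixed random-looking 64-bit constant once, offline, instead of
# hashing a constant at runtime.
print(hex(random.randint(0x1000000000000000, 0xFFFFFFFFFFFFFFFF)))
```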
ghstack-source-id: 146991945
Test Plan: CI
Reviewed By: wconstab
Differential Revision: D33574676
fbshipit-source-id: d6ce1e1cc0db67dfede148b7e3173508ec311ea8