Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62200
This commit brings back the `RemoveInplaceOps` pass removed in D29523283 (dec5aa2260) that apparently had a bunch of internal users.
Test Plan: danthe3rd
Reviewed By: danthe3rd
Differential Revision: D29833316
fbshipit-source-id: 6cf13d463ab0a5e50ba3eb3243f79a9c51623809
Summary:
This PR adds a **private** squid proxy (note that the internal ELB is only accessible from the private VPC subnets of the GitHub runners) that is deployed specifically for PyTorch CI on GitHub runners.
```
dig $SQUID_PROXY
10.0.x.x
10.0.x.x
```
http_proxy and https_proxy are compatible with the following http clients:
- curl
- wget
- python
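For example, a minimal Python sketch of routing a download through the proxy (illustrative only; `SQUID_PROXY` is read from the environment, as in the curl examples below):
```
import os
import urllib.request

# Minimal sketch: build an opener that sends http/https traffic through the
# squid egress cache whose address is taken from $SQUID_PROXY.
proxy = os.environ["SQUID_PROXY"]
opener = urllib.request.build_opener(
    urllib.request.ProxyHandler({"http": proxy, "https": proxy})
)
with opener.open("http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz") as resp:
    data = resp.read()
```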
Existing cache policy:
refresh_pattern -i .(7z|deb|rpm|exe|zip|tar|tgz|gz|ram|rar|bin|tiff|bz2|run|csv|sh)$ 1440 80% 2880
It uses the standard squid `refresh_pattern` directive for caching requests. In our setup, objects are cached for at least 1440 minutes (1 day) and at most 2880 minutes (2 days), with a last-modified factor of 80% (see the squid docs). Please refer to pytorch/test-infra for details.
Right now, the proxy only applies to the build and test steps, to limit the scope and make build and test more reliable with an egress cache.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62244
Test Plan:
```
# first time, cache miss (4min20s)
http_proxy=$SQUID_PROXY https_proxy=$SQUID_PROXY curl -v -L http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz --output /tmp/tmp_mnist.zip
100 9680k 100 9680k 0 0 37836 0 0:04:21 0:04:21 --:--:-- 29908
# second time, cache hit (0s)
http_proxy=$SQUID_PROXY https_proxy=$SQUID_PROXY curl -v -L http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz --output /tmp/tmp_mnist.zip
100 9680k 100 9680k 0 0 103M 0 --:--:-- --:--:-- --:--:-- 103M
```
Load Test Plan:
```
# ab load test with `-n 100` requests
ab -X $SQUID_PROXY -n 100 http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Concurrency Level: 1
Time taken for tests: 9.044 seconds
Complete requests: 100
Failed requests: 0
Total transferred: 991326300 bytes
HTML transferred: 991242200 bytes
Requests per second: 11.06 [#/sec] (mean)
Time per request: 90.442 [ms] (mean)
Time per request: 90.442 [ms] (mean, across all concurrent requests)
Transfer rate: 107040.50 [Kbytes/sec] received
```
Reviewed By: malfet
Differential Revision: D29928698
Pulled By: zhouzhuojie
fbshipit-source-id: 4ee78be0abe35411666c6121991b0addded57106
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62111
This base class will be passed to the post-localSGD optimizer in the next PR. This way, the same post-localSGD optimizer can choose different model averaging algorithms.
Proposal: https://github.com/pytorch/pytorch/issues/59699
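As a rough usage sketch (assuming an already-initialized process group and that this module path and `PeriodicModelAverager` signature match the related PRs), the base class lets a post-localSGD optimizer accept any averager that implements the same interface:
```
import torch
from torch.distributed.algorithms.model_averaging.averagers import PeriodicModelAverager

# Rough sketch only: assumes torch.distributed.init_process_group(...) was already called.
# PeriodicModelAverager is one concrete subclass of the base averager class introduced here.
model = torch.nn.Linear(10, 10)
averager = PeriodicModelAverager(period=4, warmup_steps=100)
for step in range(20):
    # ... run the local optimizer step here ...
    averager.average_parameters(model.parameters())  # all-reduces params every `period` steps after warmup
```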
ghstack-source-id: 134489187
Test Plan: buck test mode/dev-nosan caffe2/test/distributed:distributed_nccl_fork -- test_periodic_model_averager
Reviewed By: rohan-varma
Differential Revision: D29884954
fbshipit-source-id: 1dc5e35c58895902991567f633afd621c7108938
Summary:
This PR enables the softmax calculation with the `bfloat16` data type when the softmax is not computed along the last dim.
* Use a bf16 specialization for the forward calculation to reduce the bf16/fp32 casts in the vec template.
* Lift the bf16 limitation for the backward calculation.
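For example, a small sketch of the newly supported case:
```
import torch

# Softmax on a bfloat16 tensor reduced along a non-last dimension (dim=0 here),
# exercising both the forward and the backward path.
x = torch.randn(8, 16, dtype=torch.bfloat16, requires_grad=True)
y = torch.softmax(x, dim=0)
y.sum().backward()
print(x.grad.dtype)  # torch.bfloat16
```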
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60371
Reviewed By: ejguan
Differential Revision: D29563109
Pulled By: cpuhrsch
fbshipit-source-id: f6b439fa3850a6c633f35db65ea3d735b747863e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61992
This test was previously not enabled for static graph. To ensure this feature is supported with DDPSink, enable it for static graph, which currently passes outputs through DDPSink.
ghstack-source-id: 134471406
Test Plan: CI
Reviewed By: zhaojuanmao
Differential Revision: D29830887
fbshipit-source-id: 2d3f750d9eb4289558ed21acccd172d83d9b82cc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61942
This PR changes is_reference=True for conv to produce a pattern consisting of dequant - float conv - quant instead of a reference conv module. This is useful for future transformations to custom backends, and it also helps simplify the implementation of convert in the future.
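For context, an illustrative sketch of the resulting pattern (not the exact FX graph emitted by convert; scale/zero_point values are arbitrary):
```
import torch

# dequant - float conv - quant reference pattern, written out eagerly for illustration.
x_q = torch.quantize_per_tensor(torch.randn(1, 3, 8, 8), scale=0.1, zero_point=0, dtype=torch.quint8)
w = torch.randn(4, 3, 3, 3)

x = x_q.dequantize()                                      # dequant
y = torch.nn.functional.conv2d(x, w)                      # float conv
y_q = torch.quantize_per_tensor(y, 0.1, 0, torch.quint8)  # quant
```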
Test Plan:
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D29810656
fbshipit-source-id: 549237a62bfda4341a2a7474c124f5e33350e267
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62292
This PR adds pytree support for namedtuples. The challenge about namedtuple
is that each namedtuple class is actually different. This PR does the
following:
- It adds a namedtuple flatten/unflatten. The flatten function returns
a context that is the actual type of the namedtuple subclass. The
unflatten function uses that type to reconstruct the namedtuple.
- Special-cases all pytree logic to consider all namedtuples the same.
This is done by creating a `_get_node_type(pytree)` helper function that
returns `namedtuple` if `pytree` is any namedtuple subclass. The effect
of this is that all namedtuple subclasses will go through the namedtuple
flatten/unflatten functions.
- Adds a `_namedtuple_flatten_spec` function for FX pytrees. This function
flattens the namedtuple based on the spec and is equivalent to the
`_tuple_flatten_spec`.
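For illustration, a small sketch of the resulting behavior (torch.utils._pytree is a private module, so the exact API may differ across versions):
```
import collections
import torch.utils._pytree as pytree

# Any namedtuple subclass is flattened to its fields and reconstructed
# as the same subclass on unflatten.
Point = collections.namedtuple("Point", ["x", "y"])
leaves, spec = pytree.tree_flatten(Point(1, 2))   # leaves == [1, 2]
restored = pytree.tree_unflatten(leaves, spec)
assert isinstance(restored, Point) and restored == Point(1, 2)
```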
Test Plan
- new tests in test/test_pytree.py and test/test_fx.py
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D29947302
Pulled By: zou3519
fbshipit-source-id: 19c00665b13546642c315df0f243ad99b8e7ff7c
Summary:
As MKL is only available on the x86_64 platform, clone the header-only PocketFFT
library and use it as the FFT provider.
Fixes https://github.com/pytorch/pytorch/issues/62107
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62222
Reviewed By: ejguan
Differential Revision: D29938718
Pulled By: malfet
fbshipit-source-id: ac0bd98b5090d6c8a26c36c4e34a4d6e1d9f1a92
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62225
Rewrote the preprocess function for Android NNAPI delegate.
Previously, `preprocess()` called `convert_model_to_nnapi()` using Pybind and returned a NnapiModule that is serialized for mobile. Now, `preprocess()` calls a sub-function of `convert_model_to_nnapi()` and returns several preprocessed items (that were previously components of NnapiModule).
The returned dictionary contains:
- "shape_compute_module": torch::jit::Module
- "ser_model": torch::Tensor
- "weights": List[torch.Tensor]
- "inp_mem_fmts": List[int]
- "out_mem_fmts": List[int]
**Purpose and Future:**
The purpose of these changes is to move more of the implementation from bytecode and TorchScript to the delegate API, since bytecode is less efficient.
Now, only the shape computation uses bytecode. In the future, shape computation will be moved out of TorchScript as well.
**nnapi_backend_preprocess.cpp:** preprocess implementation
**prepare.py**: refactored a portion of `convert_model_to_nnapi()` to `process_for_nnapi()`, so preprocess can get components of NnapiModule
**Test:**
Ran `python test/test_jit.py TestNnapiBackend` and `python test/test_nnapi.py` on OSS successfully
ghstack-source-id: 134444190
Test Plan: Ran `python test/test_jit.py TestNnapiBackend` and `python test/test_nnapi.py` on OSS successfully
Reviewed By: raziel
Differential Revision: D29922279
fbshipit-source-id: cadcf8908d8a745dc7abbe286e97d6ead937d4ab
Summary:
Which, at the time of creating this PR, points to 7e51592129
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62276
Reviewed By: ngimel
Differential Revision: D29940950
Pulled By: malfet
fbshipit-source-id: 59c6fda76a9023af3adbfb5a96b83ca50950df6c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62249
Parameters and grads passed to torch.optim.functional should always match; we should skip the parameters that have None gradients to avoid a size mismatch.
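For illustration, a minimal sketch of the skipping logic (not the actual distributed optimizer code):
```
import torch

# Skip parameters whose gradient is None so that the parameter and gradient
# lists passed to a functional optimizer stay the same length.
params = [torch.nn.Parameter(torch.randn(2)) for _ in range(3)]
params[1].grad = torch.ones(2)  # only one parameter received a gradient

params_with_grad = [p for p in params if p.grad is not None]
grads = [p.grad for p in params_with_grad]
assert len(params_with_grad) == len(grads) == 1
```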
ghstack-source-id: 134452467
Test Plan: test_dist_optim_none_grads
Reviewed By: mrshenli
Differential Revision: D29929653
fbshipit-source-id: 4ca6167fecdfe1db422236655edee3aa59b8b044
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62224
The underlying operator allows both args and kwargs, but we only expose args in this convenience method. This brings them in line while not changing any existing programs.
Test Plan: CI
Reviewed By: gunchu
Differential Revision: D29920830
fbshipit-source-id: f4b2aa88d4a679e33595625b7ef355e4d14e54c4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62281
Closes gh-24646, Closes gh-24647
There is no `TensorIterator` equivalent to these kernels so this is just
migrating the existing kernels over to the ATen style.
I've benchmarked for contiguous tensors with this script:
```
import torch
shape = (10, 10, 100, 100)
x = torch.randn(*shape, device='cuda')
w = torch.randn((10, 1, 5, 5), device='cuda')
for _ in range(100):
    torch.nn.functional.conv2d(x, w, groups=10)
```
and similarly for backwards. I see these as the same to within measurement error.
| | Master (us) | This PR (us) |
|------------------:|:-------------------:|:--------------------:|
| Forward | 133.5 | 133.6 |
| Backward (input) | 1,102 | 1,119 |
| Backward (weight) | 2,220 | 2,217 |
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision: D29943062
Pulled By: ngimel
fbshipit-source-id: fc5d16496eb733743face7c5a14e532d7b8ee26a
Summary:
PowKernel.cu is the single slowest file to compile in all of pytorch, taking
7 m 34 s on my machine. After investigating, I discovered that the case with
complex inputs and a cpu scalar for the first argument takes more than half that
time just on its own.
Noting that [`thrust::pow`] for complex is just `exp(log(base) * exponent)`,
we can improve this kernel by precomputing `log(base)` on cpu and computing
only the `exp` on CUDA. This is faster in both runtime and compile time.
For 1 million elements, master takes 61.6 us vs 56.9 us with this PR.
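For reference, a small sketch of the identity this relies on:
```
import torch

# For complex numbers, base ** exponent == exp(log(base) * exponent),
# so log(base) can be computed once on CPU when the base is a cpu scalar
# and only the exp needs to run on CUDA.
base = torch.tensor(2.0 + 1.0j)
exponent = torch.randn(4, dtype=torch.complex64)
lhs = base ** exponent
rhs = torch.exp(torch.log(base) * exponent)
print(torch.allclose(lhs, rhs, rtol=1e-4, atol=1e-5))  # True, up to floating-point error
```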
I also noticed that the constant exponent case is implemented twice, once in
`gpu_kernel_with_scalars` and again in `pow_tensor_scalar_kernel`. Further, the
`Pow.cpp` code detects cpu-scalar exponents and redispatches to the `tensor_scalar`
overload, making the `gpu_kernel_with_scalars` version dead code. Now instead,
we unconditionally run `tensor_tensor` and it will call into `tensor_scalar` if appropriate.
With these changes, PowKernel.cu takes just 2 m 30 s to compile.
[`thrust::pow`]: 368266e80e/thrust/detail/complex/cpow.h (L33)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62260
Reviewed By: ejguan
Differential Revision: D29938789
Pulled By: ngimel
fbshipit-source-id: 7ab7d81ececc92a9e6e62e60b0a4f2e6e3146df8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62277
This PR changes is_reference=True for linear to produce a pattern consisting of dequant - float linear - quant instead of a reference linear module. This is useful for future transformations to custom backends, and it also helps simplify the implementation of convert in the future.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: ejguan
Differential Revision: D29941079
fbshipit-source-id: 84bdfc0bb872c34fc345875e545c8b323e77c41e
Summary:
When coming across the short runtime of a periodic job on this PR, I realized the current setup for running smoke tests on PRs was flawed. Previously, as an attempt at better future compatibility, our conditional for running only smoke tests was USE_CUDA=1 on Windows.
This is BAD and has unintended consequences, such as misleading results when a ci/scheduled workflow is triggered but fails to run the full test suite, e.g., with PR https://github.com/pytorch/pytorch/issues/62266: https://github.com/pytorch/pytorch/actions/runs/1071698069
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62288
Reviewed By: seemethere, ejguan
Differential Revision: D29945540
Pulled By: janeyx99
fbshipit-source-id: 3cc91511c151f7348872b039c94d7752b6ea4692
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62290
No longer needed.
Fixes nightly failures that we're observing as well:
```
Jul 27 07:33:02 Found conflicts! Looking for incompatible packages.
Jul 27 07:33:02 This can take several minutes. Press CTRL-C to abort.
Jul 27 07:33:02 failed
Jul 27 07:33:02
Jul 27 07:33:02 UnsatisfiableError: The following specifications were found
Jul 27 07:33:02 to be incompatible with the existing python installation in your environment:
Jul 27 07:33:02
Jul 27 07:33:02 Specifications:
Jul 27 07:33:02
Jul 27 07:33:02 - conda-package-handling=1.6.0 -> python[version='>=2.7,<2.8.0a0|>=3.6,<3.7.0a0|>=3.7,<3.8.0a0|>=3.8,<3.9.0a0']
Jul 27 07:33:02
Jul 27 07:33:02 Your python: python=3.9
```
From: https://app.circleci.com/pipelines/github/pytorch/pytorch/356478/workflows/2102acf1-c92a-4a59-919c-61d32d3bcd71/jobs/15027876
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS
Reviewed By: driazati
Differential Revision: D29946501
Pulled By: seemethere
fbshipit-source-id: 3e9182f4cbcf2aab185dbbc21b7a6171746e2281
Summary:
Following up on https://github.com/pytorch/pytorch/issues/61768.
Currently the printout is hugely long because each test case logs a status code OK even when no exception was raised.
This should be avoided when no exception is raised from send_to_scribe.
This removes the log printing when the response contains no error.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62285
Reviewed By: zhouzhuojie
Differential Revision: D29944461
Pulled By: walterddr
fbshipit-source-id: fc3c2b88bba27c68521cef7079ca2b6197d2d58b
Summary:
Part of the fix for https://github.com/pytorch/pytorch/issues/12013
Checks whether the input and output sizes are non-zero in order to allow the Bilinear layer to accept batches of size 0. The if-check covers both input and output dim sizes since the `_trilinear` function is written to work for both the forward and backward of Bilinear.
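For illustration, a minimal sketch of the case this enables:
```
import torch

# A Bilinear layer applied to inputs with batch size 0.
m = torch.nn.Bilinear(20, 30, 40)
x1 = torch.randn(0, 20)
x2 = torch.randn(0, 30)
out = m(x1, x2)
print(out.shape)  # torch.Size([0, 40])
```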
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47106
Reviewed By: ejguan
Differential Revision: D29935589
Pulled By: jbschlosser
fbshipit-source-id: 607d3352bd4f88e2528c64408f04999960be049d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62059
GetOperatorCost in Workspace exposes flops and bytes_written only. Make an additional piece, bytes_read, available from OperatorSchema::Cost.
Test Plan:
Added the two additional pieces in the unit test testGetOperatorCost in workspace_test
buck test caffe2/caffe2/python:workspace_test -- testGetOperatorCost
buck test //aml/ml_foundation/exp_platform/large_scale_training/distributed_hogwild/auto_device_placement/tests/...
buck test //aiplatform/training/autotuning/tests/...
buck test //aiplatform/training/pipelining/tests/...
buck test //deeplearning/fblsim/tests/...
Flow tests:
ADP Greedy: f288078287
ADP MILP: f288079278
Reviewed By: CrazySherman, xtaofb
Differential Revision: D29860676
fbshipit-source-id: 8b3a9f2bf17c0dae48cfe2800e8821bf441e0b03
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62207
The workspace was getting held back due to permission-denied errors; let's
ensure we have a chown'd / clean workspace for all render_test_results
runs.
Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Test Plan: Imported from OSS
Reviewed By: walterddr, janeyx99
Differential Revision: D29915232
Pulled By: seemethere
fbshipit-source-id: dd9fcc9c00d9665569bd8cfa57e5d2d8da965aac
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61887
1) Introduced a `sandcastle_skip_if` decorator that ensures these
tests simply pass on Sandcastle.
2) Fixed all test files under `test/distributed` to not use `unittest.skip`.
The overall goal is to avoid using skips, since Sandcastle flags these tests as
continuously skipping.
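For context, a minimal sketch of what such a decorator can look like (not the exact PyTorch implementation):
```
import functools

# Instead of calling unittest.skip, turn the test into a trivially passing one
# when the condition holds, so internal CI does not flag it as continuously skipping.
def sandcastle_skip_if(condition, reason):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if condition:
                print(f"Skipping {fn.__name__}: {reason}")  # report, but let the test pass
                return None
            return fn(*args, **kwargs)
        return wrapper
    return decorator
```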
ghstack-source-id: 134382237
Test Plan: waitforbuildbot
Reviewed By: SciPioneer
Differential Revision: D29784152
fbshipit-source-id: 17b4df6c5a55ff1d1e8e1de128fa679c3dfbcb7d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62234
There was a typo that we didn't catch until recently, hence this fix.
Reviewed By: 842974287
Differential Revision: D29924190
fbshipit-source-id: ee6259fcd41358aefe9680b419acc87c0c2821cb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62006
Closes gh-24646, gh-24647
There is no `TensorIterator` equivalent to these kernels so this is just
migrating the existing kernels over to the ATen style.
I've benchmarked for contiguous tensors with this script:
```
import torch
shape = (10, 10, 100, 100)
x = torch.randn(*shape, device='cuda')
w = torch.randn((10, 1, 5, 5), device='cuda')
for _ in range(100):
    torch.nn.functional.conv2d(x, w, groups=10)
```
and similarly for backwards. I see these as the same to within measurement error.
| | Master (us) | This PR (us) |
|------------------:|:-------------------:|:--------------------:|
| Forward | 133.5 | 133.6 |
| Backward (input) | 1,102 | 1,119 |
| Backward (weight) | 2,220 | 2,217 |
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D29883676
Pulled By: ngimel
fbshipit-source-id: 9b2ac62cdd8a84e1a23ffcd66035b2b2fe2374d8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61892
This PR changes is_reference=True for linear to produce a pattern consisting of dequant - float linear - quant instead of a reference linear module. This is useful for future transformations to custom backends, and it also helps simplify the implementation of convert in the future.
Test Plan:
python test/test_quantization.py TestQuantizeFxOps
Imported from OSS
Reviewed By: vkuzo
Differential Revision: D29810657
fbshipit-source-id: 949615bbc017bc454d81c8a6b2bdec53badaab19
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62213
Added sanity checks in preprocess function for Android NNAPI delegate.
`preprocess()` requires some input metadata passed through its `method_compile_spec` function argument.
`preprocess()` now throws specific error messages, if it cannot find the correct input arguments.
Example error message:
```
RuntimeError: method_compile_spec does not contain the "forward" key.
method_compile_spec should contain a Tensor or Tensor List which bundles input parameters: shape, dtype, quantization, and dimorder.
For input shapes, use 0 for run/load time flexible input.
method_compile_spec must use the following format: {"forward": {"inputs": at::Tensor}} OR {"forward": {"inputs": c10::List<at::Tensor>}}
```
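For illustration, a rough sketch of lowering with a compile spec in the expected format (assumptions noted in the comments):
```
import torch

# Rough sketch only: assumes the NNAPI delegate backend is registered in this build and
# that torch._C._jit_to_backend is the (private) lowering entry point; the PReLU module
# is just an illustrative example.
module = torch.jit.script(torch.nn.PReLU())
example_input = torch.zeros(1, 3, 224, 224)  # bundles shape/dtype/memory-format metadata
compile_spec = {"forward": {"inputs": example_input}}
lowered = torch._C._jit_to_backend("nnapi", module, compile_spec)
```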
nnapi_backend_preprocess.cpp: contains sanity check implementation
test_backend_nnapi.py: sanity check unit tests
Test: Ran `python test/test_jit.py TestNnapiBackend` in OSS successfully.
TODO: Using Tensors to pass input parameters is a temporary hack. When a dedicated object is implemented, update the sanity check error message.
ghstack-source-id: 134339282
Test Plan: Ran `python test/test_jit.py TestNnapiBackend` in OSS successfully.
Reviewed By: raziel, iseeyuan
Differential Revision: D29917004
fbshipit-source-id: 0d5c6b35889c556cda905ffc29c25c5422ae9ee4