Commit Graph

31000 Commits

Heitor Schueroff
ddeacf1565 Fix median bug on discontiguous tensors (#46917)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46917

fixes https://github.com/pytorch/pytorch/issues/46814
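
A hypothetical repro sketch along these lines (the exact failing case is in the linked issue; this just runs median on a non-contiguous view):

```python
import torch

a = torch.randn(4, 5)
b = a.t()                            # transposed view, not contiguous
print(b.is_contiguous())             # False
print(torch.median(b))               # should now match the contiguous result
print(torch.median(b.contiguous()))
```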

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D24633412

Pulled By: heitorschueroff

fbshipit-source-id: 54732671b298bdc2b04b13ab3a373892ee0933c3
2020-10-29 17:12:22 -07:00
James Reed
9bc8f071a3 [WIP] Move torch.fx into its own target (#46658)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46658

ghstack-source-id: 115213192

Test Plan: waitforsandcastle

Reviewed By: zdevito, vkuzo

Differential Revision: D24374723

fbshipit-source-id: 2b5708001f5df2ffb21ea5e586e26030653ccdcf
2020-10-29 17:03:08 -07:00
Leon Gao
7190155408 [Transposed Conv]add ConvTranspose3d with FBGEMM as backend (#46608)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46608

Introduce a frontend API for quantized transposed convolution, with FBGEMM as the only backend.
ghstack-source-id: 115289210
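
A rough usage sketch of the new frontend API (the module path, default qparams, and an FBGEMM-enabled build are assumed here, not taken from the PR):

```python
import torch

torch.backends.quantized.engine = "fbgemm"     # FBGEMM is the only supported backend

x = torch.quantize_per_tensor(torch.randn(1, 4, 8, 8, 8),
                              scale=0.1, zero_point=0, dtype=torch.quint8)
m = torch.nn.quantized.ConvTranspose3d(4, 2, kernel_size=3)
y = m(x)                                       # quantized output tensor
print(y.shape)
```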

Test Plan: https://www.internalfb.com/intern/testinfra/testconsole/testrun/7599824394184104/

Reviewed By: z-a-f

Differential Revision: D24369831

fbshipit-source-id: b8babd3ddbe0df8f4c8bc652bb745f85e0813797
2020-10-29 16:18:43 -07:00
Michael Carilli
3c643d112e Pin destination memory for cuda_tensor.to("cpu", non_blocking=True) (#46878)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/39694.

[`torch.cuda._sleep(int(100 * get_cycles_per_ms()))`](https://github.com/pytorch/pytorch/pull/46878/files#diff-893b1eea27352f336f4cd832919e48d721e4e90186e63400b8596db6b82e7450R511-R513) in the test helps avoid flakiness noted by ngimel (https://github.com/pytorch/pytorch/pull/35144#issuecomment-602103631).
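
A minimal sketch of the pattern this change targets (assuming a CUDA device); with the destination now pinned the copy can be genuinely asynchronous, but the host still has to synchronize before reading the result:

```python
import torch

if torch.cuda.is_available():
    src = torch.randn(1 << 20, device="cuda")
    dst = src.to("cpu", non_blocking=True)   # async device-to-host copy
    torch.cuda.synchronize()                 # wait for the copy before using dst
    print(dst[:3])
```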

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46878

Reviewed By: izdeby

Differential Revision: D24550403

Pulled By: xw285cornell

fbshipit-source-id: 1ecc35ef75f9a38ab332aacdf4835955105edafc
2020-10-29 15:42:55 -07:00
Natalia Gimelshein
e17b8dea1d fix calculation of number of elements to not overflow (#46997)
Summary:
Possibly fixes https://github.com/pytorch/pytorch/issues/46764.
Computing number of tensor elements in many cases is written as
```
int64_t numel = std::accumulate(oldshape.begin(), oldshape.end(), 1,
                                  std::multiplies<int64_t>());
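// Note: the `1` init literal above is an `int`, so the running product is
// computed in `int` and overflows past INT_MAX; the fix described below is
// to use an int64_t init value, e.g. static_cast<int64_t>(1).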
```
This computes the product with the type of the `1` literal, which is `int`. When there are more than INT_MAX elements, the result overflows. In https://github.com/pytorch/pytorch/issues/46746, the tensor that was sent to reshape had 256^4 elements, and that count was computed as `0`, so the reshape was not done correctly.
I've audited the usages of std::accumulate and changed them to use int64_t as the `init` type.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46997

Reviewed By: albanD

Differential Revision: D24624654

Pulled By: ngimel

fbshipit-source-id: 3d9c5e6355531a9df6b10500eec140e020aac77e
2020-10-29 15:37:16 -07:00
Pritam Damania
78de12f588 Replace -f with -x for pytest tests. (#46967)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46967

Tests under `tests/distributed/_pipeline/sync` use pytest, and specifying the
`-f` option for such tests, as in `python test/run_test.py
-i distributed/_pipeline/sync/skip/test_api -- -f`, doesn't work.

The equivalent option for pytest is `-x`. To resolve this issue, I've updated
`run_test.py` to replace `-f` with `-x` for pytest tests.
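
A minimal sketch of the substitution (hypothetical helper, not the actual `run_test.py` code):

```python
def translate_extra_args(extra_args, uses_pytest):
    # unittest's -f/--failfast flag has no pytest counterpart spelled -f;
    # pytest's equivalent "stop on first failure" flag is -x.
    if not uses_pytest:
        return extra_args
    return ['-x' if arg == '-f' else arg for arg in extra_args]

print(translate_extra_args(['-f', '-v'], uses_pytest=True))  # ['-x', '-v']
```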

More details in https://github.com/pytorch/pytorch/issues/46782

#Closes: https://github.com/pytorch/pytorch/issues/46782
ghstack-source-id: 115440558

Test Plan:
1) waitforbuildbot
2) `python test/run_test.py -i distributed/_pipeline/sync/skip/test_api -- -f`

Reviewed By: malfet

Differential Revision: D24584556

fbshipit-source-id: bd87f5b4953504e5659fe72fc8615e126e5490ff
2020-10-29 15:28:06 -07:00
BowenBao
a4caa3f596 [ONNX] bump CI ort to 1.5.2 rel for stability (#46595)
Summary:
Recently the ort-nightly package has become unstable and is causing issues with CI tests. Switching to the release package for now for stability, until the situation improves.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46595

Reviewed By: houseroad

Differential Revision: D24566175

Pulled By: bzinodev

fbshipit-source-id: dcf36e976daeeb17465df88f28bc9673eebbb7b7
2020-10-29 14:51:38 -07:00
Edward Yang
843cab3f2e Delete TypeDefault.h and TypeDerived.h codegen entirely. (#47002)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47002

There was no good reason for the TypeDerived.h (CPUType.h) codegen
to exist after static dispatch was deleted, and now that we
have the Math alias key, the TypeDefault.h header is not needed either.
Sorry to anyone who was using these out of tree.

I didn't entirely delete TypeDefault.h, as it has a use in
a file that I can't conveniently compile-test locally.  Will
kill it entirely in a follow-up.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D24596583

Pulled By: ezyang

fbshipit-source-id: b5095d3509098ff74f836c5d0c272db0b2d226aa
2020-10-29 14:43:53 -07:00
Edward Yang
c689b4d491 Delete TypeDefault call code generation logic in VariableType (#47000)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47000

There is a new invariant that emit_body is only ever called when
the strategy is 'use_derived', which means we can delete a bunch of code.
This removes the last use of the TypeXXX.h headers.

Note that this change makes sense, as the TypeDefault entries are
registered as Math entries, which means they automatically populate
Autograd (and we no longer have to register them ourselves).  Ailing
did all the hard work; this is just the payoff.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D24596584

Pulled By: ezyang

fbshipit-source-id: 6fa754b5f16e75cf2dcbf437887c0fdfda5e44b1
2020-10-29 14:43:50 -07:00
Edward Yang
41f8641f1e Delete SchemaRegister.cpp, make flag operate on TypeDefault.cpp (#46991)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46991

This change is motivated by a problem bdhirsh observed: in internal
builds that include both SchemaRegister.cpp and TypeDefault.cpp,
some operators have their schemas defined multiple times.  Instead
of dumping schema registrations in multiple files, it seems better
to just toggle how many schemas we write into TypeDefault.cpp.

ljk53 observes that technically SchemaRegister.cpp is only needed by the
full-JIT frontend, and not by the lite interpreter (to resolve schema
lookups).  However, in practice, the registration file seems to be
unconditionally loaded.  This change will make it harder to do the
optimization where we drop schemas in the lite interpreter, but you
probably want to architect this differently (similar to per-op
registrations: DON'T do any registrations in ATen, and instead write out
the schema registrations in a separate library).

I took this opportunity to also simplify the TypeDefault generation
logic by reworking things so that we only ever call with None argument
when registering.  Soon, we should be able to just split these
files up entirely.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: ljk53

Differential Revision: D24593704

Pulled By: ezyang

fbshipit-source-id: f01ea22a3999493da77b6e254d188da0ce9adf2f
2020-10-29 14:43:47 -07:00
Edward Yang
54d83296a9 Desugar missing dispatch field into singleton Math entry (#46970)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46970

Now that catchall declarations are reinterpreted as registrations to
the dispatch key Math, we can simplify the code generation logic by
directly generating to Math and bypassing the logic for catchall.  This
also helps avoid bugs where we incorrectly classify some kernels as Math
and others as not, even though they get registered in the same way.

Bill of changes:
- Give Math its own unique TORCH_LIBRARY_IMPL
- Make it so NativeFunction.dispatch is always non-None.  Simplify
  downstream conditionals accordingly
- When parsing NativeFunction, fill in missing dispatch with a
  singleton Math entry (pointing to the cpp.name!)

One thing that makes this change a little big is that a lot of kernels
which previously didn't report as "math" now report as math.  I picked
a setting for these booleans that made sense to me, but I'm not sure
if e.g. XLA will handle it 100% correctly.
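
A minimal sketch of the "fill in missing dispatch" rule from the bill of changes above, using hypothetical names rather than the real codegen model:

```python
def fill_in_dispatch(dispatch, cpp_name):
    # A native function with no explicit dispatch table is treated as a
    # single Math kernel pointing at its cpp name.
    if not dispatch:
        return {"Math": cpp_name}
    return dispatch

print(fill_in_dispatch(None, "add"))                # {'Math': 'add'}
print(fill_in_dispatch({"CPU": "add_cpu"}, "add"))  # unchanged
```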

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D24592391

Pulled By: ezyang

fbshipit-source-id: 2e3355f19f9525698864312418df08411f30a85d
2020-10-29 14:43:44 -07:00
Edward Yang
87e86fa84c Some miscellaneous cleanup in codegen (#46940)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46940

- Remove inaccurate generated comments
- Delete some dead code
- Delete some unused headers
- Delete unnecessary SparseTypeDerived.cpp template

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D24573971

Pulled By: ezyang

fbshipit-source-id: 3de05d9cd9bada4c73f01d6cfaf51f16ada66013
2020-10-29 14:43:41 -07:00
Edward Yang
dc6f723cb4 Delete Vulkan from code generator. (#46938)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46938

It turns out that after https://github.com/pytorch/pytorch/pull/42194
landed we no longer actually generate any registrations into this
file.  That means it's completely unnecessary.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: IvanKobzarev

Differential Revision: D24573518

Pulled By: ezyang

fbshipit-source-id: b41ada9e394b780f037f5977596a36b896b5648c
2020-10-29 14:40:54 -07:00
Ailing Zhang
156c08b0d9 view_as_real doesn't work for all backends since it relies on strides. (#47018)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47018

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D24607340

Pulled By: ailzhang

fbshipit-source-id: c7fd85cd636ae9aebb22321f8f1a255af81a473f
2020-10-29 14:33:19 -07:00
Nikolay Korovaiko
71c0133e23 enable PE everywhere but mobile (#47001)
Summary:
enable PE everywhere but mobile

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47001

Reviewed By: eellison

Differential Revision: D24596252

Pulled By: Krovatkin

fbshipit-source-id: 3e3093a43287e1ff838cb03ec0e53c11c82c8dd2
2020-10-29 14:22:56 -07:00
Basil Hosmer
377a09c8e8 reland fast TypeMeta/ScalarType conversion (#45544)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45544

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D24006482

Pulled By: bhosmer

fbshipit-source-id: 5da2401ab40bbf58da27a5d969e00bcee7562ed6
2020-10-29 14:07:39 -07:00
shubhambhokare1
1ea14e30f5 [ONNX] Enable NoneType inputs to export API (#45792)
Summary:
Enables the use of NoneType arguments in the inputs tuple of the export API.
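
A hypothetical sketch of what this enables (illustrative module, not from the PR):

```python
import torch

class AddOrIncrement(torch.nn.Module):
    def forward(self, x, y=None):
        return x + 1 if y is None else x + y

# Passing None inside the inputs tuple is now accepted by the export API.
torch.onnx.export(AddOrIncrement(), (torch.randn(2, 3), None),
                  "add_or_increment.onnx")
```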

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45792

Reviewed By: heitorschueroff

Differential Revision: D24312784

Pulled By: bzinodev

fbshipit-source-id: 1717e856b56062add371af7dc09cdd9c7b5646da
2020-10-29 13:56:52 -07:00
Wang Xu
c556d4550c fix_combine_two_partition_size (#47053)
Summary:
Fix combine_two_partitions in Partitioner.py so that it recalculates the new partition's used memory size after combining two partitions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47053

Reviewed By: gcatron

Differential Revision: D24624270

Pulled By: scottxu0730

fbshipit-source-id: a0e2a8486e012d02ea797d6ba36ab304d27cc93f
2020-10-29 13:40:44 -07:00
BowenBao
129b41226e [ONNX] Support nd mask index in opset >= 11 (#45252)
Summary:
Fixes below pattern for opset >= 11

`return tensor[tensor > 0]`

where rank of `tensor` > 1.
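
A hypothetical export sketch for that pattern (illustrative module, not from the PR):

```python
import torch

class MaskSelect(torch.nn.Module):
    def forward(self, x):
        return x[x > 0]        # boolean-mask indexing on a rank-3 tensor

torch.onnx.export(MaskSelect(), (torch.randn(2, 3, 4),),
                  "mask_select.onnx", opset_version=11)
```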

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45252

Reviewed By: VitalyFedyunin

Differential Revision: D24116945

Pulled By: bzinodev

fbshipit-source-id: 384026cded1eb831bb5469e31ece4fcfb6ae8f2a
2020-10-29 13:32:59 -07:00
kshitij12345
1d233d7d1f [fix] torch.nn.functional.embedding -> padding_idx behavior (#46714)
Summary:
Reference https://github.com/pytorch/pytorch/issues/46585

Fix for the second snippet in the mentioned issue:
```python
predefined_weights = torch.rand(10, 3)
result = torch.nn.functional.embedding(torch.LongTensor([1,2,0]), predefined_weights, padding_idx=0)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46714

Reviewed By: VitalyFedyunin

Differential Revision: D24593352

Pulled By: albanD

fbshipit-source-id: 655b69d9ec57891871e26feeda2aa0dcff73beba
2020-10-29 13:29:00 -07:00
Mingzhe Li
3e499e490a Bump up NCCL to v2.8 (#46742)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46742

Use NCCL v2.8

Test Plan: waitforsandcastle

Reviewed By: mrshenli

Differential Revision: D24488800

fbshipit-source-id: d39897da1499e63ca783a81aec1ce707606423a3
2020-10-29 13:17:58 -07:00
Rohan Varma
d850b5c98c Fix DDP issue where parameters share same grad_accumulator (#46755)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46755

As reported in https://github.com/pytorch/pytorch/issues/41324, there is a bug in DDP when `find_unused_parameters=True` and 2 or more parameters share the same gradient accumulator.

In the reducer, we currently keep a mapping of grad accumulator to index and populate it with map[accumulator] = index, but this overwrites earlier indices when the accumulator is shared. To fix this, switch the mapping values to a vector of indices so that all indices sharing the same accumulator are kept.
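
An illustrative Python sketch of the data-structure change (the actual reducer is C++; the names here are made up):

```python
from collections import defaultdict

# Two parameter indices (0 and 1) that share the same grad accumulator "A".
shared = [("A", 0), ("A", 1)]

before = {}                      # old mapping: accumulator -> single index
for acc, idx in shared:
    before[acc] = idx            # index 0 is silently overwritten -> {"A": 1}

after = defaultdict(list)        # new mapping: accumulator -> list of indices
for acc, idx in shared:
    after[acc].append(idx)       # -> {"A": [0, 1]}

print(before, dict(after))
```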
ghstack-source-id: 115453567

Test Plan: Added UT

Reviewed By: pritamdamania87

Differential Revision: D24497388

fbshipit-source-id: d32dfa9c5cd0b7a8df13c7873d5d28917b766640
2020-10-29 12:23:06 -07:00
Martin Yuan
680571533b [RFC] Decouple fast pass functions (#46469)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46469

There are some "fast_pass" function calls where symbols in `ATen/native` are directly referenced from outside of native at the linking stage. This PR decouples one of these fast passes from native while keeping the same functionality. `scalar_to_tensor` is included through `ATen/ATen.h`, so it can be referenced by any cpp file that includes this header.

ghstack-source-id: 114485740

Test Plan: CI

Reviewed By: ezyang

Differential Revision: D24361863

fbshipit-source-id: 28d658688687b6cde286a6e6933ab33a4b3cf9ec
2020-10-29 12:18:50 -07:00
Xiong Wei
74d730c0b5 implement NumPy-like functionality column_stack, row_stack (#46313)
Summary:
Related https://github.com/pytorch/pytorch/issues/38349

This PR implements `column_stack` as a composite of `torch.reshape` and `torch.hstack`, and makes `row_stack` an alias of `torch.vstack`.

Todo

- [x] docs
- [x] alias pattern for `row_stack`
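
A quick usage sketch of the two new functions:

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

print(torch.column_stack((a, b)))   # 1-D inputs become columns -> shape (3, 2)
print(torch.row_stack((a, b)))      # alias of torch.vstack     -> shape (2, 3)
```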

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46313

Reviewed By: ngimel

Differential Revision: D24585471

Pulled By: mruberry

fbshipit-source-id: 62fc0ffd43d051dc3ecf386a3e9c0b89086c1d1c
2020-10-29 12:14:39 -07:00
tmanlaibaatar
fee585b5a3 Correctly mark unannotated NamedTuple field to be inferred TensorType (#46969)
Summary:
If there is no annotation given, we want to show users that the type is inferred

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46969

Test Plan:
Added a new test case that throws an error with the expected error message

Fixes https://github.com/pytorch/pytorch/issues/46326

Reviewed By: ZolotukhinM

Differential Revision: D24614450

Pulled By: gmagogsfm

fbshipit-source-id: dec555a53bfaa9cdefd3b21b5142f5e522847504
2020-10-29 12:07:40 -07:00
Sam Estep
1e275bc1a6 Show Flake8 errors in GitHub CI again (#46990)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46985.

Can someone comment on whether the "Run flake8" step should fail if `flake8` produces errors? This PR makes sure the errors are still shown, but [the job linked from the issue](https://github.com/pytorch/pytorch/runs/1320258832) also shows that the failure of that step seems to have caused the "Add annotations" step not to run.

Is this what we want, or should I instead revert back to the `--exit-zero` behavior (in this case by just removing the `-o pipefail` from this PR) that we had before https://github.com/pytorch/pytorch/issues/46740? And if the latter, then (how) should I modify this `flake8-py3` job to make sure it fails when `flake8` fails (assuming it didn't already do that?)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46990

Reviewed By: VitalyFedyunin

Differential Revision: D24593573

Pulled By: samestep

fbshipit-source-id: 361392846de9fadda1c87d2046cf8d26861524ca
2020-10-29 11:59:30 -07:00
mfkasim91
6eaa324c9f Implement torch.igamma (#46183)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/41637
This is the regularized lower incomplete gamma function, equivalent to SciPy's `gammainc` and TensorFlow's `igamma`.
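
A minimal usage sketch (values are arbitrary):

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0])
x = torch.tensor([0.5, 1.0, 2.0])
# Regularized lower incomplete gamma P(a, x), like scipy.special.gammainc.
print(torch.igamma(a, x))
```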

cc fritzo mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46183

Reviewed By: gchanan

Differential Revision: D24479126

Pulled By: mruberry

fbshipit-source-id: fdf8ea289fe4ca1b408810732192411e948fcdfe
2020-10-29 11:40:18 -07:00
Blaise Sanouillet
dd95bf65b6 [caffe2/FC DNNLOWP] Shrink Y_int32_ vector capacity when appropriate
Summary:
The FullyConnectedDNNLowPOp::Y_int32_ vectors consume between 1GB and 2GB on one of FB's larger applications. By adding tracing, I noticed that the number of elements in each instance oscillates wildly over time. As the buffer backing a vector can only be extended in a resize operation, this means there is wasted memory space. So, as a simple optimization, I added code to right-size the buffer backing the vector when the number of elements is less than half the vector capacity at that point; this doesn't affect the existing elements.

There is of course a memory/CPU tradeoff here: with the change we are doing more mallocs and frees. I added tracing to measure how many times we grow or shrink per second: it's about 100 per second on average, which is not a great deal.

Test Plan:
Memory growth impact: over 24 hours and after the startup period, the memory consumed by this code grows from 0.85GB to 1.20GB vs 0.95GB to 1.75GB in the baseline. [ source: https://fburl.com/scuba/heap_profiles/wm47kpfe ]
https://pxl.cl/1pHlJ

Reviewed By: jspark1105

Differential Revision: D24592098

fbshipit-source-id: 7892b35f24e42403653a74a1a9d06cbc7ee866b9
2020-10-29 11:19:45 -07:00
Stephen Jia
38265acfbe Add Mul op for Vulkan (#47021)
Summary:
Updates mul_scalar shader to support the new Vulkan API, and adds a new op for it using the new API.

Also adds an in-place version for the op.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47021

Test Plan:
Unit test included. To build & run:
```
BUILD_CUSTOM_PROTOBUF=OFF \
  BUILD_TEST=ON \
  USE_EIGEN_FOR_BLAS=OFF \
  USE_FBGEMM=OFF \
  USE_MKLDNN=OFF \
  USE_NNPACK=OFF \
  USE_NUMPY=OFF \
  USE_OBSERVERS=OFF \
  USE_PYTORCH_QNNPACK=OFF \
  USE_QNNPACK=OFF \
  USE_VULKAN=ON \
  USE_VULKAN_API=ON \
  USE_VULKAN_SHADERC_RUNTIME=ON \
  USE_VULKAN_WRAPPER=OFF \
  MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python3 setup.py develop --cmake && ./build/bin/vulkan_api_test
```

Reviewed By: AshkanAliabadi

Differential Revision: D24624729

Pulled By: SS-JIA

fbshipit-source-id: 97e76e4060307a9a24311ac51dca8812e4471249
2020-10-29 11:14:25 -07:00
Nikita Shulga
2b6a720eb1 Update pybind to 2.6.0 (#46415)
Summary:
Preserve PYBIND11 configuration options in `torch._C._PYBIND11_COMPILER_TYPE` and use them when building extensions

Also, use f-strings in `torch.utils.cpp_extension`

"Fixes" https://github.com/pytorch/pytorch/issues/46367

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46415

Reviewed By: VitalyFedyunin

Differential Revision: D24605949

Pulled By: malfet

fbshipit-source-id: 87340f2ed5308266a46ef8f0317316227dab9d4d
2020-10-29 10:53:47 -07:00
Sameer Deshmukh
2249a293b7 Fix segfault with torch.orgqr. (#46700)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/41768

The fault was that a NULL `tau` would get passed to the LAPACK function. This PR fixes that by checking whether `tau` contains 0 elements at the beginning of the function.
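
A hypothetical repro sketch for the crash (the offending call is left commented out):

```python
import torch

a = torch.randn(3, 3)
tau = torch.empty(0)      # a zero-element tau previously reached LAPACK as NULL
# torch.orgqr(a, tau)     # used to segfault; now guarded by the empty-tau check
```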

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46700

Reviewed By: albanD

Differential Revision: D24616427

Pulled By: mruberry

fbshipit-source-id: 92e8f1489b113c0ceeca6e54dea8b810a51a63c3
2020-10-29 10:34:39 -07:00
Ivan Yashchuk
f629fbe235 Added torch.linalg.tensorsolve (#46142)
Summary:
This PR adds `torch.linalg.tensorsolve` function that matches `numpy.linalg.tensorsolve`.

Ref https://github.com/pytorch/pytorch/issues/42666.
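
A minimal usage sketch, mirroring the NumPy documentation example:

```python
import torch

a = torch.eye(2 * 3 * 4).reshape(2 * 3, 4, 2, 3, 4)
b = torch.randn(2 * 3, 4)
x = torch.linalg.tensorsolve(a, b)   # solves tensordot(a, x, dims=x.ndim) == b
print(x.shape)                       # torch.Size([2, 3, 4])
```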

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46142

Reviewed By: izdeby

Differential Revision: D24539400

Pulled By: mruberry

fbshipit-source-id: 6e38364fe0bc511e739036deb274d9307df119b2
2020-10-29 10:29:28 -07:00
Richard Barnes
13b4127c95 Fix implicit conversion (#46833)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46833

Implicit integer conversions are causing compiler warnings. Since in this case the logs make it pretty clear that the `unsigned` types won't overflow despite 64-bit inputs, we fix the issue by making the downconversion explicit.

Test Plan: Standard test rig.

Reviewed By: malfet

Differential Revision: D24481377

fbshipit-source-id: 4422538286d8ed2beb65065544016fd430394ff8
2020-10-29 10:22:37 -07:00
Rohan Varma
ecdbea77bc Fix DDP documentation (#46861)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46861

Noticed that in the DDP documentation:
https://pytorch.org/docs/master/generated/torch.nn.parallel.DistributedDataParallel.html?highlight=distributeddataparallel
there were some examples using `torch.nn.DistributedDataParallel`; fix these to
read `torch.nn.parallel.DistributedDataParallel`.
ghstack-source-id: 115453703

Test Plan: ci

Reviewed By: pritamdamania87, SciPioneer

Differential Revision: D24534486

fbshipit-source-id: 64b92dc8a55136c23313f7926251fe825a2cb7d5
2020-10-29 09:13:47 -07:00
Sebastian Messmer
262bd6437a Show old kernel location when there are mismatches (#46850)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46850

So far, in the error messages when kernel signatures mismatched, we showed the location where the second kernel came from,
but we didn't show the location of the first kernel. This PR now shows the location of both.
ghstack-source-id: 115468616

Test Plan: waitforsandcastle

Reviewed By: ezyang

Differential Revision: D24540368

fbshipit-source-id: 3b4474062879d17f9bb7870ad3814343edc1b755
2020-10-29 08:30:49 -07:00
ashish
dfdc1dbee4 Disable softmax tests on ROCm (#46793)
Summary:
This PR disables test_softmax and test_softmax_results in test_nn.py, which were enabled in https://github.com/pytorch/pytorch/issues/46363. The softmax tests are causing failures on gfx906 machines. Disabling them until we root-cause and fix them on 906.

cc: jeffdaily ezyang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46793

Reviewed By: izdeby

Differential Revision: D24539211

Pulled By: ezyang

fbshipit-source-id: 633cb9dc497ad6359af85b85a711c4549d772b2a
2020-10-29 08:05:36 -07:00
Brandon Lin
4a581ba6c2 Implement LengthsToOffsets operator in Caffe2 (#46590)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46590

This operator is very similar to LengthsToRanges but doesn't pack the offsets next to the original lengths.

Reviewed By: yf225

Differential Revision: D24419746

fbshipit-source-id: aa8b014588bb22eced324853c545f8684086c4e4
2020-10-29 07:03:34 -07:00
Kunal Bhalla
18d273dc0e [RFC][LocalSession] Fix workspace type
Summary: I was reading/looking into how LocalSession works and realized that the workspace type being passed around was the bound function on TaskGroup instead of the actual type. This meant that all workspaces for localsession would always be global, because they'd never match the private workspace type.

Test Plan: <not sure, could use some suggestions>

Reviewed By: cryptopic

Differential Revision: D24458428

fbshipit-source-id: 0f87874babe9c1ddff25b5363b443f9ca37e03c1
2020-10-29 04:12:17 -07:00
James Reed
d0df29ac22 [FX] Put inf and nan in globals instead of with an import string (#47035)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47035

Chillee thought the `from math import inf, nan` string at the top of `.code` was annoying, so here's an alternative way to do it: put those values in `globals` before we `exec`.
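
A minimal sketch of the idea (plain Python, not the actual torch.fx code):

```python
import math

generated_code = "y = x * inf"       # generated code can reference inf/nan freely
g = {"inf": math.inf, "nan": math.nan, "x": 2.0}
exec(generated_code, g)              # seed globals instead of emitting an import line
print(g["y"])                        # inf
```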

Test Plan: Imported from OSS

Reviewed By: dzhulgakov

Differential Revision: D24611278

Pulled By: jamesr66a

fbshipit-source-id: c25ef89e649bdd3e79fe91aea945a30fa7106961
2020-10-29 00:35:41 -07:00
Yi Wang
cab32d9cdf [RPC Framework] Support remote device format "<workername>/<device>" (#46773)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46773

Changed the constructor of RemoteModule to accept a `remote_device` arg in the following format:
"<workername>/<device>" (e.g., "trainer0/cpu", "ps0/cuda:0")

This arg merges the original `on` and `device` args.
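
A hypothetical sketch of how such a string could be split (illustrative only, not the actual RemoteModule implementation):

```python
def parse_remote_device(remote_device: str):
    worker_name, _, device = remote_device.partition("/")
    return worker_name, device or "cpu"

print(parse_remote_device("trainer0/cpu"))   # ('trainer0', 'cpu')
print(parse_remote_device("ps0/cuda:0"))     # ('ps0', 'cuda:0')
```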

Original PR issue: RemoteDevice Format #46554
ghstack-source-id: 115448051

Test Plan: buck test mode/dev-nosan caffe2/test/distributed/rpc:process_group_agent -- RemoteModule

Reviewed By: pritamdamania87

Differential Revision: D24482562

fbshipit-source-id: 5acfc73772576a4b674df27625bf560b8f8e67c1
2020-10-29 00:14:56 -07:00
Martin Yuan
b553c06abb Throw an exception in the constructor of torchscript serialization to avoid double-exception (#44266)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44266

If PyTorchStreamWriter writes to a file in a non-existent path, it throws an exception. During unwinding, the destructor calls writeEndOfFile() and throws again. To avoid this double exception, a check-and-throw is added in the constructor. In that case the destructor will not be called and the exception can propagate through the unwinding.

Test Plan: python test/test_jit.py TestSaveLoad.test_save_nonexit_file

Reviewed By: dreiss

Differential Revision: D23560770

Pulled By: iseeyuan

fbshipit-source-id: 51b24403500bdab3578c7fd5e017780467a5d06a
2020-10-28 22:41:19 -07:00
Dhruv Matani
9c1a41b724 [RFC] Add OperatorHandle overload to the RecordFunction::before() method (#46401)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46401

Broader context about selective/custom build available at https://fb.quip.com/2oEzAR5MKqbD and https://fb.workplace.com/groups/pytorch.mobile.team/permalink/735794523641956/

Basically, we want to be able to trace full operator names (with overload name). The current observer infra picks up the operator name from the schema, which doesn't seem to include the overload name. To ensure consistency with the existing uses and to accommodate the new use-case, this diff adds a new overload to accept an `OperatorHandle` object, and the code in `before()` eagerly resolves it to an `OperatorName` object (which can be cached in a member variable) as well as a string (view) operator-name which has the same semantics as before.

Why do we pass in an `OperatorHandle` but then resolve it to an `OperatorName`? This might come across as a strange design choice (and it is), but it is grounded in practicality.

It is not reasonable to cache an `OperatorHandle` object but caching an `OperatorName` object is reasonable since it holds all the data itself.

An initial version of this change tried to test it in the `xplat` repo, which didn't work. Thanks to ilia-cher for pointing out that the dispatcher observing mechanism is disabled under a compile-time flag (macro) for xplat.
ghstack-source-id: 114360747

Test Plan:
`buck test fbcode/caffe2/fb/test:record_function_test` succeeds. Also replicated this test in OSS in the file `test_misc.cpp` where the rest of the `RecordFunction` subsystem is being tested.

Ran benchmark as requested by ilia-cher

{P146511280}

Reviewed By: ilia-cher

Differential Revision: D24315241

fbshipit-source-id: 239f3081e6aa2e26c3021a7dd61f328b723b03d9
2020-10-28 22:38:26 -07:00
Xiong Wei
604e1b301a Fix negative column numbers for the torch.eye (#46841)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46757

Error out on negative column numbers and add the corresponding tests in `test/test_tensor_creation_ops.py`.
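
A quick illustration (the rejected call is left commented out):

```python
import torch

print(torch.eye(3, 4))   # valid: 3 x 4 matrix with ones on the main diagonal
# torch.eye(3, -1)       # negative column count is now rejected with an error
```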

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46841

Reviewed By: VitalyFedyunin

Differential Revision: D24593839

Pulled By: ngimel

fbshipit-source-id: b8988207911453de7811cf3ceb43747192cd689d
2020-10-28 22:29:25 -07:00
Kshiteej K
5c8aad1141 [numpy] torch.cos, torch.tan : promote integer inputs to float (#46706)
Summary:
References https://github.com/pytorch/pytorch/issues/42515

cc: mruberry
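
A minimal illustration of the promoted behavior (assumed, per the tracking issue):

```python
import torch

t = torch.arange(4)        # int64 tensor
print(torch.cos(t))        # integer inputs are now promoted to float
print(torch.tan(t))
```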

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46706

Reviewed By: izdeby

Differential Revision: D24537262

Pulled By: mruberry

fbshipit-source-id: e57377a625814a3f34a765ce6bfd63a33c02a5d9
2020-10-28 22:02:52 -07:00
Nikita Shulga
42a51148c1 Use f-strings in torch.utils.cpp_extension (#47025)
Summary:
Plus two minor fixes to `torch/csrc/Module.cpp`:
 - Use iterator of type `Py_ssize_t` for array indexing in `THPModule_initNames`
 - Fix clang-tidy warning of unneeded defaultGenerator copy by capturing it as `const auto&`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47025

Reviewed By: samestep

Differential Revision: D24605907

Pulled By: malfet

fbshipit-source-id: c276567d320758fa8b6f4bd64ff46d2ea5d40eff
2020-10-28 21:32:33 -07:00
Jiakai Liu
9d23fd5c00 [pytorch] get rid of cpp_type_str from pybind codegen (#46977)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46977

Clean up a few TODOs in the new python binding codegen.
Get rid of the _simple_type() hack and the uses of cpp_type_str.
Now python argument type strings and PythonArgParser unpacking methods
are directly generated from the original Type model.

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D24589209

Pulled By: ljk53

fbshipit-source-id: b2a6c3911d58eae49c031d319c8ea6f804e2cfde
2020-10-28 21:25:55 -07:00
Jiakai Liu
79474a1928 [pytorch] simplify tensor options logic in pybinding codegen (#46976)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46976

Technically, it's not semantics-preserving, e.g. emission of
'requires_grad' is no longer gated by 'has_tensor_return'; there is no
guarantee that is_like_or_new_function cases all have a tensor return.
But the output is identical, so there might be some invariant; we could
also add an assertion to fail loudly when it's broken.

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D24589211

Pulled By: ljk53

fbshipit-source-id: 47c7e43b080e4e67a526fde1a8a53aae99df4432
2020-10-28 21:22:59 -07:00
Wang Xu
a86b3438eb add support for different memory sizes on size_based_partition (#46919)
Summary:
WIP: add support for different memory sizes in size_based_partition, so that size_based_partition can support logical devices with different memory sizes. Compared to the original size_based_partition, the new one also supports partition-to-logical-device mapping. Multiple partitions can be mapped onto one device if its memory size allows. A unit test, test_different_size_partition, is also added.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46919

Reviewed By: gcatron, VitalyFedyunin

Differential Revision: D24603511

Pulled By: scottxu0730

fbshipit-source-id: 1ba37338ae054ad846b425fbb7e631d3b6c500b6
2020-10-28 21:11:41 -07:00
Jerry Zhang
c2a3951352 [quant][graphmode][fx] Remove inplace option for convert_fx (#46955)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46955

Initially we were thinking of adding an `invalidate_quantized_float_parameters` option to free the memory
of the floating point parameters of quantized modules, but it turns out we do a module swap, just like in eager mode, for the modules
that are quantized, so the old floating point module will not be referenced after quantization. Therefore this feature
is only needed for functionals, and since most people use quantization with modules, we may not need it.

We'll revisit if we find there is a need for this.

Test Plan: Imported from OSS

Reviewed By: supriyar

Differential Revision: D24579400

fbshipit-source-id: fbb0e567405dc0604a2089fc001573affdade986
2020-10-28 21:07:19 -07:00
Pritam Damania
ad260ae7fd Disable test_joing_running_workers for TSAN. (#46966)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46966

These tests had false positives in TSAN for modifying thread local
variables:

```
WARNING: ThreadSanitizer: data race (pid=5364)
  Write of size 8 at 0x7b2c0004ff70 by thread T2:
    #0 free <null> (libtools_build_sanitizers_tsan-py.so+0xde6ad)
    #1 __GI__dl_deallocate_tls

  Previous write of size 1 at 0x7b2c0004ff71 by thread T3:
    #0 at::GradMode::set_enabled(bool) caffe2/aten/src/ATen/core/grad_mode.cpp:20 (libcaffe2_ATen-core.so+0x40e013)
    #1 torch::autograd::set_grad_enabled(_object*, _object*) caffe2/torch/csrc/autograd/init.cpp:143 (libcaffe2__C_impl_cuda.so+0x115ef0e)
    #2 _PyMethodDef_RawFastCallKeywords

  Thread T3 (tid=5385, finished) created by main thread at:
    #0 pthread_create <null> (libtools_build_sanitizers_tsan-py.so+0xc5a86)
    #1 PyThread_start_new_thread
```
ghstack-source-id: 115330433

Test Plan: waitforbuildbot

Reviewed By: mrshenli

Differential Revision: D24584411

fbshipit-source-id: e35f704dfcb7b161a13a4902beaf8b1e41ccd596
2020-10-28 19:28:04 -07:00