
573 Commits

Author SHA1 Message Date
Taylor Robie
0225d3dc9d Add support for timing C++ snippets. (#47864)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47864

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D25199262

Pulled By: robieta

fbshipit-source-id: 1c2114628ed543fba4f403bf49c065f4d71388e2
2020-12-01 20:03:14 -08:00
Taylor Robie
17ea11259a Rework compat bindings. (#47863)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/47863

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D25199261

Pulled By: robieta

fbshipit-source-id: 0a4a0409ddb75c1bf66cd31d67b55080227b1679
2020-12-01 20:03:11 -08:00
Nikita Shulga
2dff0b3e91 Fix typos in comments (#48316)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/48316

Reviewed By: walterddr, mrshenli

Differential Revision: D25125123

Pulled By: malfet

fbshipit-source-id: 6f31e5456cc078cc61b288191f1933711acebba0
2020-11-24 10:56:40 -08:00
Ilia Cherniavskii
f2da18af14 Add USE_KINETO build option (#45888)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45888

Adding USE_LIBKINETO build option

Test Plan:
USE_KINETO=1 USE_CUDA=1 USE_MKLDNN=1 BLAS=MKL BUILD_BINARY=1 python setup.py develop install --cmake

Reviewed By: Chillee

Differential Revision: D25142221

Pulled By: ilia-cher

fbshipit-source-id: d1634a8f9599604ff511fac59b9072854289510c
2020-11-21 20:20:32 -08:00
Nikita Shulga
d7c8d3cccb Remove references to typing module from setup.py (#47677)
Summary:
It is part of core Python-3.6.2+

Fixes https://github.com/pytorch/pytorch/issues/47596

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47677

Reviewed By: walterddr

Differential Revision: D24860188

Pulled By: malfet

fbshipit-source-id: ad72b433a4493ebe5caca97c2e8a9d4b3c8172d4
2020-11-12 10:04:38 -08:00
peter
a08e8dd70c Fix python 3.9 builds on Windows (#47602)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/47460.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47602

Reviewed By: heitorschueroff

Differential Revision: D24832487

Pulled By: malfet

fbshipit-source-id: 8846caeac5e767e8066470d5c981218f147c88dc
2020-11-09 12:39:28 -08:00
Nikita Shulga
6f6025183f Skip iomp5 embedding if torch_cpu could not be found (#47390)
Summary:
This would be the case when the package is built for local development rather than for installation

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47390

Reviewed By: janeyx99

Differential Revision: D24738416

Pulled By: malfet

fbshipit-source-id: 22bd676bc46e5d50a09539c969ce56d37cfe5952
2020-11-04 14:22:53 -08:00
Nikita Shulga
3a0024574d Do not delete rpath from torch.dylib on Darwin (#47337)
Summary:
Fixes CI regressions introduced by https://github.com/pytorch/pytorch/issues/47262

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47337

Reviewed By: ngimel

Differential Revision: D24721954

Pulled By: malfet

fbshipit-source-id: 395b037b29c0fc3b62ca50bba9be940ad72e0c5b
2020-11-03 22:36:35 -08:00
Nikita Shulga
ca61b061f3 Update minimum supported Python version to 3.6.2 (#47314)
Summary:
As typing.NoReturn is used in the codebase

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47314

Reviewed By: seemethere

Differential Revision: D24712847

Pulled By: malfet

fbshipit-source-id: f0692d408316d630bc11f1ee881b695437fb47d4
2020-11-03 13:32:07 -08:00
Nikita Shulga
14194e4f23 Embed libiomp5.dylib into wheel package (#47262)
Summary:
The libiomp runtime is the only external dependency the OS X package has if compiled with MKL.
Copy it to the stage directory from one of the available rpaths
and remove all absolute rpaths, since the project should have none.

Fixes https://github.com/pytorch/pytorch/issues/38607

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47262

Reviewed By: walterddr

Differential Revision: D24705094

Pulled By: malfet

fbshipit-source-id: 9f588a3ec3c6c836c8986d858fb53df815a506c8
2020-11-03 13:00:30 -08:00
Nikita Shulga
8c39f198b4 Fix typo in setup.py (#46921)
Summary:
Also, be a bit future-proof in the supported version list

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46921

Reviewed By: seemethere

Differential Revision: D24568733

Pulled By: malfet

fbshipit-source-id: ae34f8da1ed39b80dc34db0b06e4ef142104a3ff
2020-10-27 13:14:41 -07:00
Nikita Shulga
a38eeeff5c Make setup.py python 2 friendly (#46317)
Summary:
Import print_function so that setup.py, when invoked by Python 2, prints a human-readable error:
```
% python2 setup.py
Python 2 has reached end-of-life and is no longer supported by PyTorch.
```
Also, remove `future` from the list of the PyTorch package install dependencies

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46317

Reviewed By: walterddr, bugra

Differential Revision: D24305004

Pulled By: malfet

fbshipit-source-id: 9181186170562384dd2c0e6a8ff0b1e93508f221
2020-10-14 16:37:06 -07:00
Nikita Shulga
45de2ee3ac Remove Python version upper boundary check (#46315)
Summary:
This prevents setup.py from erroring out when Python-3.9 is used

Fixes https://github.com/pytorch/pytorch/issues/46314

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46315

Reviewed By: heitorschueroff

Differential Revision: D24304846

Pulled By: malfet

fbshipit-source-id: 573a88ea8c1572d7d8a9991539effb3c228bffc9
2020-10-14 07:36:55 -07:00
Eli Uriegas
615013edcb setup: Dataclasses only when < 3.7 (#45844)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45844

Someone pointed out that dataclasses were actually added to the python
stdlib in 3.7 and not 3.8, so bumping down the dependency on dataclasses
from 3.8 -> 3.7 makes sense here
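
The conditional dependency described above can be sketched as follows. This is a minimal illustration, not the actual `setup.py` code: the helper name `conditional_requirements` and its parameters are hypothetical.

```python
import sys


def conditional_requirements(version_info=None, base=()):
    """Return install requirements, adding the dataclasses backport below 3.7.

    Hypothetical helper for illustration; `base` stands in for whatever
    unconditional requirements the package already has.
    """
    if version_info is None:
        version_info = sys.version_info
    requirements = list(base)
    # dataclasses joined the standard library in Python 3.7, so the
    # backport from PyPI is only needed on older interpreters.
    if tuple(version_info[:2]) < (3, 7):
        requirements.append("dataclasses")
    return requirements
```

An alternative with the same effect would be a PEP 508 environment marker, for example `dataclasses; python_version < "3.7"`, placed directly in `install_requires`.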

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: walterddr, malfet

Differential Revision: D24113367

Pulled By: seemethere

fbshipit-source-id: 03d2d93f7d966d48a30a8e2545fd07dfe63b4fb3
2020-10-05 13:29:21 -07:00
Michael Suo
18253f4a48 Fix BUILD_CAFFE2 if FBGEMM and NNPACK are not built (#45610)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45610

Also add to the usual documentation places that this option exists.

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D24058199

Pulled By: suo

fbshipit-source-id: 81574fbd042f47587e2c7820c726fac0f68af2a7
2020-10-01 14:58:55 -07:00
Eli Uriegas
5959de3aeb setup: Only include dataclasses for py < 3.8 (#45611)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45611

dataclasses was made a standard library item in 3.8

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Test Plan: Imported from OSS

Reviewed By: walterddr

Differential Revision: D24031740

Pulled By: seemethere

fbshipit-source-id: 15bdf1fe0d8de9b8ba7912e4a651f06b18d516ee
2020-10-01 14:52:28 -07:00
Bugra Akyildiz
27c7158166 Remove __future__ imports for legacy Python2 supports (#45033)
Summary:
The `2to3` tool has a `future` fixer that removes these imports; the `caffe2` directory has the most redundant ones:

```2to3 -f future -w caffe2```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033

Reviewed By: seemethere

Differential Revision: D23808648

Pulled By: bugra

fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38
2020-09-23 17:57:02 -07:00
Daily, Jeff
b98ac20849 install ATen/native/cuda and hip headers (#45097)
Summary:
The ATen/native/cuda headers were copied to torch/include, but then not included in the final package.  Further, add ATen/native/hip headers to the installation, as well.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45097

Reviewed By: mruberry

Differential Revision: D23831006

Pulled By: malfet

fbshipit-source-id: ab527928185faaa912fd8cab208733a9b11a097b
2020-09-22 17:43:47 -07:00
Michael Suo
161490d441 Move torch/version.py generation to cmake (#44577)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44577

I would like to move this to cmake so that I can depend on it
happening from other parts of the build.

This PR pulls out the logic for determining the version string and
writing the version file into its own module. `setup.py` still receives
the version string and uses it as before, but now the code for writing
out `torch/version.py` lives in a custom command in torch/CMakeLists.txt

I noticed a small inconsistency in how version info is populated.
`TORCH_BUILD_VERSION` is populated from `setup.py` at configuration
time, while `torch/version.py` is written at build time. So if, e.g. you
configured cmake on a certain git rev, then built it on another, the
two versions would be inconsistent.

This does not appear to matter, so I opted to preserve the existing
behavior.
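
The split between determining the version string and writing the version file might look roughly like the sketch below. It is only an illustration under assumptions: the function names, the `+<short hash>` suffix format, and the placeholder values are hypothetical and not taken from the PR.

```python
def make_version(base_version, git_sha=None):
    """Append a short git hash as a local version label when a hash is known."""
    if git_sha:
        return "{}+{}".format(base_version, git_sha[:7])
    return base_version


def render_version_module(version, git_sha):
    """Return the text of a generated version module (assumed layout)."""
    return "__version__ = {!r}\ngit_version = {!r}\n".format(version, git_sha)


# Placeholder inputs; a real build would read these from the source tree.
sha = "0123456789abcdef0123456789abcdef01234567"
version = make_version("1.0.0a0", sha)
module_text = render_version_module(version, sha)
```

Because the string is computed in one place and rendered in another, the two steps can run at different times, which is how the configure-time and build-time values mentioned above can drift apart.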

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D23734781

Pulled By: suo

fbshipit-source-id: 4002c9ec8058503dc0550f8eece2256bc98c03a4
2020-09-16 15:49:22 -07:00
Alexander Grund
d23f3170ef Remove pybind11 from required submodules (#44278)
Summary:
This can be taken from the system in which case it is not used from the submodule. Hence the check here limits the usage unnecessarily

ccing malfet

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44278

Reviewed By: malfet

Differential Revision: D23568552

Pulled By: ezyang

fbshipit-source-id: 7fd2613251567f649b12eca0b1fe7663db9cb58d
2020-09-09 08:07:13 -07:00
Edward Yang
6ea89166bd Rewrite of ATen code generator (#42629)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42629

How to approach reviewing this diff:

- The new codegen itself lives in `tools/codegen`. Start with `gen.py`, then read `model.py` and then the `api/` folder. The comments at the top of the files describe what is going on. The CLI interface of the new codegen is similar to the old one, but (1) it is no longer necessary to explicitly specify cwrap inputs (and now we will error if you do so) and (2) the default settings for source and install dir are much better; to the extent that if you run the codegen from the root source directory as just `python -m tools.codegen.gen`, something reasonable will happen.
- The old codegen is (nearly) entirely deleted; every Python file in `aten/src/ATen` was deleted except for `common_with_cwrap.py`, which now permanently finds its home in `tools/shared/cwrap_common.py` (previously cmake copied the file there), and `code_template.py`, which now lives in `tools/codegen/code_template.py`. We remove the copying logic for `common_with_cwrap.py`.
- All of the inputs to the old codegen are deleted.
- Build rules now have to be adjusted to not refer to files that no longer exist, and to abide by the (slightly modified) CLI.
- LegacyTHFunctions files have been generated and checked in. We expect these to be deleted as these final functions get ported to ATen. The deletion process is straightforward; just delete the functions of the ones you are porting. There are 39 more functions left to port.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D23183978

Pulled By: ezyang

fbshipit-source-id: 6073ba432ad182c7284a97147b05f0574a02f763
2020-08-31 09:00:22 -07:00
Hong Xu
9063bcee04 Don't proceed into setup.py too far if Python version is unsupported (#42870)
Summary:
This prevents confusing errors when the interpreter encounters some
syntax errors in the middle.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42870

Reviewed By: albanD

Differential Revision: D23269265

Pulled By: ezyang

fbshipit-source-id: 61f62cbe294078ad4a909fa87aa93abd08c26344
2020-08-28 09:04:55 -07:00
Luca Wehrstedt
c30bc6d4d7 Update TensorPipe submodule (#42522)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42522

Main changes:
- Consolidated CMake files to have a single entry point, rather than having a specialized one for PyTorch.
- Changed the way the preprocessor flags are provided, and changed their name.

There were a few instances in PyTorch's CMake files where we were directly adding TensorPipe's source directory as an include path, which however doesn't contain the auto-generated header we now added. We fix that by adding the `tensorpipe` CMake target as a dependency, so that the include paths defined by TensorPipe are used, which contain that auto-generated header. So instead we link those targets to the tensorpipe target in order for them to pick up the correct include directories.

I'm turning off SHM and CMA for now because they have never been covered by the CI. I'll enable them in a separate PR so that if they turn out to be flaky we can revert that change without reverting this one.

Test Plan: CI

Reviewed By: malfet

Differential Revision: D22959472

fbshipit-source-id: 1959a41c4a66ef78bf0f3bd5e3964969a2a1bf67
2020-08-06 02:14:58 -07:00
Ralf Gommers
dc1f87c254 Add typing_extensions as a dependency. (#42431)
Summary:
Closes gh-38221.

The related pytorch/builder PR: https://github.com/pytorch/builder/pull/475

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42431

Reviewed By: malfet

Differential Revision: D22916499

Pulled By: ezyang

fbshipit-source-id: c8fe9413b62fc7a6b829fc82aaf32531b55994d1
2020-08-03 20:06:16 -07:00
Nikita Shulga
f00a37dd71 Make setup.py Python-2 syntactically correct (#41960)
Summary:
Import __future__ to make `print(*args)` a syntactically correct statement under Python-2
Otherwise, if one accidentally invokes setup.py using a Python-2 interpreter, they will be greeted by:
```
  File "setup.py", line 229
    print(*args)
          ^
SyntaxError: invalid syntax
```
instead of:
```
Python 2 has reached end-of-life and is no longer supported by PyTorch.
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41960

Reviewed By: orionr, seemethere

Differential Revision: D22710174

Pulled By: malfet

fbshipit-source-id: ffde3ddd585707ba1d39e57e0c6bc9c4c53f8004
2020-07-23 19:10:20 -07:00
Nikita Shulga
883e4c44b2 Raise exception when trying to build PyTorch on 32-bit Windows system (#40321)
Summary:
Makes errors in cases described in https://github.com/pytorch/pytorch/issues/27815 more obvious
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40321
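
A check of this kind could be sketched as below. This is an assumption-laden illustration rather than the code from the PR: the function names and the error text are hypothetical, and the check inspects the bitness of the running interpreter via `platform.architecture()`.

```python
import platform


def is_32bit_windows(system=None, bits=None):
    """Return True when running a 32-bit interpreter on Windows.

    Arguments default to the current platform; they are parameters only
    so the logic can be exercised on any machine.
    """
    system = platform.system() if system is None else system
    bits = platform.architecture()[0] if bits is None else bits
    return system == "Windows" and bits == "32bit"


def ensure_supported_platform(system=None, bits=None):
    """Raise early with a clear message instead of failing deep in the build."""
    if is_32bit_windows(system, bits):
        raise RuntimeError(
            "Building on 32-bit Windows is not supported; use a 64-bit Python."
        )
```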

Differential Revision: D22198352

Pulled By: malfet

fbshipit-source-id: 327d81103c066048dcf5f900fd9083b09942af0e
2020-06-23 16:54:20 -07:00
peter
0f39ed86a7 Cleanup debug info switches with MSVC (#39703)
Summary:
Switch off `/Z7` so that we don't generate debug info in Release and MinSizeRel builds, so that we will probably get smaller static libraries and object files and faster build time
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39703

Differential Revision: D21960684

Pulled By: ezyang

fbshipit-source-id: 909a237a138183591d667885b13fc311470eed65
2020-06-09 14:11:40 -07:00
Eli Uriegas
b7b7433561 setup: Add long description to wheel packages (#39676)
Summary:
Closes out https://github.com/pytorch/pytorch/issues/38354

For reference: https://packaging.python.org/guides/making-a-pypi-friendly-readme/

Should fill out the PyPI description as well.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39676

Reviewed By: malfet

Differential Revision: D21940656

Pulled By: seemethere

fbshipit-source-id: 6c39500404227047d8f24936db0697fe44a6b9e8
2020-06-08 16:25:39 -07:00
Nikita Shulga
a864dbb360 Make _C extension a thin C wrapper (#39375)
Summary:
It just depends on a single `torch_python` library.
The C library does not depend on the standard C++ library, and as a result it closes https://github.com/pytorch/pytorch/issues/36941
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39375

Reviewed By: orionr

Differential Revision: D21840645

Pulled By: malfet

fbshipit-source-id: 777c189feee9d6fc686816d92cb9f109b8aac7ca
2020-06-02 13:11:59 -07:00
Meghan Lele
dd7eed5ae4 [JIT] Export JIT backend extension headers in setup.py (#38525)
Summary:
**Summary**
This commit adds the headers required to define and use JIT backends to
`package_data` in `setup.py` so that they are exported and copied to the
same place as the rest of the headers when PyTorch is installed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38525

Differential Revision: D21601806

Pulled By: SplitInfinity

fbshipit-source-id: 1615dd4047777926e013d7dd14fe427d5ffb8b70
2020-05-15 14:45:08 -07:00
David Reiss
328fc70b84 Remove (most) Python 2 support from setup.py (#35617)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35617

Python 2 has reached end-of-life and is no longer supported by PyTorch.
Now we can clean up some cruft that we put in place to support it.

Test Plan: CI

Differential Revision: D20842883

Pulled By: dreiss

fbshipit-source-id: 18dc5219ba99658c0ca7e2f26863df008c420e6a
2020-05-14 10:06:20 -07:00
Edward Yang
6edf340338 Delete torch/__init__.pyi, deferring to direct extension stubs (#38157)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38157

This removes the error prone process of assembling `torch/__init__.pyi`
(and frequently forgetting to expose things), since now we can simply
rely on the true source file to get things done.  Most of the old
codegen in gen_pyi.py is now rerouted to various files:

- `torch/_C/__init__.pyi` (the dumping pile of all misc bindings)
- `torch/_C/_nn.pyi` (NN function bindings)
- `torch/_C/_VariableFunctions.pyi` (torch function bindings)

`torch.types` grew a bunch more definitions that were previously
defined in `torch/__init__.pyi`

Some miscellaneous changes

- Fixed a bug where we treat single TensorList argument as implying
  varargs are accepted. This is actually only supported on IntList.
  This means we can correctly generate a stub for dequantize.
- Add missing manual stub for nonzero
- Switched torch/onnx/operators.py to directly refer to _C module,
  since apparently mypy doesn't think that methods prefixed with
  underscores get reexported.  This may be a recurring theme; maybe
  we need to find a better way to solve it.

Because I was really lazy, I dumped namedtuple definitions in both
`torch._C` and `torch._C._VariableFunctions`.  This is definitely wrong.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Differential Revision: D21497400

Pulled By: ezyang

fbshipit-source-id: 07b126141c82efaca37be27c07255cb2b9b3f064
2020-05-11 07:20:13 -07:00
Jerry Zhang
0ed7fc581c [quant][graphmode][refactor] Split quantization.cpp (#37975)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37975

Test Plan:
.

Imported from OSS

Differential Revision: D21468497

fbshipit-source-id: 35cbf98a344ca6e4094d616a4040eacf017fd2de
2020-05-08 12:24:50 -07:00
peter
c5d6f59ab1 Replacing EHa with EHsc (#37235)
Summary:
We should not rely on the async exceptions. Catching only C++ exceptions is more sensible and may give a boost in both space (1163 MB -> 1073 MB, 0.92x) and performance (51m -> 49m, 0.96x).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37235

Differential Revision: D21256918

Pulled By: ezyang

fbshipit-source-id: 572ee96f2e4c48ad13f83409e4e113483b3a457a
2020-04-28 08:20:37 -07:00
Mo Zhou
5b9f7f7b0e [cmake] Add USE_SYSTEM_{GLOO,FP16,PTHREADPOOL,PSIMD,FXDIV,BENCHMARK} options (#14699) (#37277)
Summary:
These options are disabled by default, and are supposed to be used by
linux distro developers. With the existing shortcut option
USE_SYSTEM_LIBS toggled, these new options will be enabled as well.

Additionally, when USE_SYSTEM_LIBS is toggled, setup.py should
no longer check the existence of git submodules.
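
The idea of skipping the submodule check can be sketched as follows. The helper name, the accepted truthy spellings, and the example folder are hypothetical; only the `USE_SYSTEM_LIBS` option name comes from the description above.

```python
import os

# Spellings treated as "enabled" in this sketch (an assumption).
TRUTHY = ("1", "ON", "YES", "TRUE", "Y")


def missing_submodules(root, folders, env=None):
    """Return the submodule folders that are absent or empty under `root`.

    When USE_SYSTEM_LIBS is enabled, system copies are expected to be
    used instead, so nothing is reported as missing.
    """
    env = os.environ if env is None else env
    if env.get("USE_SYSTEM_LIBS", "").upper() in TRUTHY:
        return []
    missing = []
    for folder in folders:
        path = os.path.join(root, folder)
        if not os.path.isdir(path) or not os.listdir(path):
            missing.append(folder)
    return missing
```

A caller would typically abort with a hint to run `git submodule update --init --recursive` when the returned list is non-empty.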

ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37277

Differential Revision: D21256999

Pulled By: ezyang

fbshipit-source-id: 84f97d008db5a5e41a289cb7bce94906de3c52cf
2020-04-27 09:37:27 -07:00
Mo Zhou
ff21b15624 cmake: add USE_SYSTEM_{LIBS,CPUINFO,SLEEF} options (#14699) (#37137)
Summary:
ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37137

Differential Revision: D21222632

Pulled By: ezyang

fbshipit-source-id: 47624b30f8d07b31a40a26edf665bbec39e45202
2020-04-23 20:43:36 -07:00
Christian Kastner
6df90bcecc setup.py: Remove conflicting double documentation of USE_FBGEMM (#36993)
Summary:
Line 33+ contains instructions on how to disable use, 108+ on how to enable it.
The default in CMakeLists.txt is enabled, so drop the latter.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36993

Differential Revision: D21161793

Pulled By: ngimel

fbshipit-source-id: 08c5eecaf8768491f90d4a52c338ecea32a0c35e
2020-04-21 22:33:49 -07:00
David Reiss
3c85f44ce8 Fail setup.py if trying to set up with Python 2 (#35613)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35613

Python 2 has reached end-of-life and is no longer supported by PyTorch.
To spare users from a long, doomed setup when trying to use PyTorch with
Python 2, detect this case early and fail with a clear message.  This
commit covers setup.py.
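
An early check of this sort might look like the sketch below. It is a minimal illustration, assuming the check sits at the very top of the script before any expensive work; the helper name `python2_error` is hypothetical, while the message text is the one quoted in this log.

```python
import sys

PY2_MESSAGE = "Python 2 has reached end-of-life and is no longer supported by PyTorch."


def python2_error(version_info=None):
    """Return the error message for a Python 2 interpreter, else None."""
    version_info = sys.version_info if version_info is None else version_info
    if version_info[0] == 2:
        return PY2_MESSAGE
    return None


# Run the check before anything else so the failure is immediate.
message = python2_error()
if message is not None:
    print(message)
    sys.exit(-1)
```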

Test Plan: Attempted to build PyTorch with Python 2 and saw a clear error *quickly*.

Differential Revision: D20842881

Pulled By: dreiss

fbshipit-source-id: caaaa0dbff83145ff668bd25df6d7d4b3ce12e47
2020-04-16 10:24:03 -07:00
peter
b9260bdb7b Don't build deps for python setup.py egg_info (#36208)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/36207.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36208

Differential Revision: D20919649

Pulled By: ezyang

fbshipit-source-id: b5242a540181b29dba8987fb5f00332e1e81ca98
2020-04-08 09:02:01 -07:00
Sebastian Messmer
7ee88d61f7 Rename boxing/unboxing files and utilities (#35411)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35411

The file and class names in ATen/core/boxing were quite confusing.
Let's rename them for readability.

Also move function schema inference out of the boxing logic into op_registration.h where it belongs.
ghstack-source-id: 101539206

Test Plan: waitforsandcastle

Differential Revision: D20653621

fbshipit-source-id: 6a79c73d5758bee1e072d543c030913b18a69c7c
2020-04-04 14:13:28 -07:00
Feng Tian
762270c51f add c10d dynamic loading mechanism and unit test (#28068)
Summary:
The original behavior of pytorch c10d only supports built-in c10d backends, such as
nccl/gloo/mpi. This patch is used to extend the c10d capability to support dynamically
loading 3rd party communication libraries which are derived from ProcessGroup base class.

related RFC is in: https://github.com/pytorch/pytorch/issues/27955

This way, the user just needs to specify a 3rd party c10d backend name when invoking
torch.distributed.init_process_group(). The proposed logic will try to load the corresponding
c10d backend cpp extension automatically. For how to develop a new 3rd party c10d backend
through a cpp extension, please refer to test/cpp_extensions/cpp_c10d_extension.cpp
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28068

Differential Revision: D19174838

Pulled By: agolynski

fbshipit-source-id: 3409a504a43ce7260e6f9d1207c00e87471fac62
2020-04-02 15:46:51 -07:00
Orion Reblitz-Richardson
f101949390 Remove python2 support from setup.py (#35539)
Summary:
As a followup to https://github.com/pytorch/pytorch/pull/35042 this removes python2 from setup.py and adds Python 3.8 to the list of supported versions. We're already testing this in CircleCI.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35539

Differential Revision: D20709060

Pulled By: orionr

fbshipit-source-id: 5d40bc14cb885374fec370fc7c5d3cde8769039a
2020-03-27 14:33:11 -07:00
pinzhenx
bd604cb5b7 Upgrade MKL-DNN to DNNL v1.2 (#32422)
Summary:
## Motivation

This PR upgrades MKL-DNN from v0.20 to DNNL v1.2 and resolves https://github.com/pytorch/pytorch/issues/30300.

DNNL (Deep Neural Network Library) is the new brand of MKL-DNN, which improves performance, quality, and usability over the old version.

This PR focuses on the migration of all existing functionalities, including minor fixes, performance improvement and code clean up. It serves as the cornerstone of our future efforts to accommodate new features like OpenCL support, BF16 training, INT8 inference, etc. and to let the Pytorch community derive more benefits from the Intel Architecture.


## What's included?

Even though DNNL has many breaking changes to the API, we managed to absorb most of them in ideep. This PR contains minimal changes to the integration code in pytorch. Below is a summary of the changes:


**General:**

1. Replace op-level allocator with global-registered allocator

```
// before
ideep::sum::compute<AllocForMKLDNN>(scales, {x, y}, z);

// after
ideep::sum::compute(scales, {x, y}, z);
```

The allocator is now registered at `aten/src/ATen/native/mkldnn/IDeepRegistration.cpp`. Thereafter all tensors derived from the `cpu_engine` (by default) will use the c10 allocator.

```
RegisterEngineAllocator cpu_alloc(
  ideep::engine::cpu_engine(),
  [](size_t size) {
    return c10::GetAllocator(c10::DeviceType::CPU)->raw_allocate(size);
  },
  [](void* p) {
    c10::GetAllocator(c10::DeviceType::CPU)->raw_deallocate(p);
  }
);
```
------

2. Simplify group convolution

We had such a scenario in convolution where ideep tensor shape mismatched aten tensor: when `groups > 1`, DNNL expects weights tensors to be 5-d with an extra group dimension, e.g. `goihw` instead of `oihw` in 2d conv case.

As shown below, a lot of extra checks came with this difference in shape before. Now we've completely hidden this difference in ideep and all tensors are going to align with pytorch's definition. So we could safely remove these checks from both aten and c2 integration code.

```
// aten/src/ATen/native/mkldnn/Conv.cpp

if (w.ndims() == x.ndims() + 1) {
  AT_ASSERTM(
      groups > 1,
      "Only group _mkldnn_conv2d weights could have been reordered to 5d");
  kernel_size[0] = w.get_dim(0) * w.get_dim(1);
  std::copy_n(
      w.get_dims().cbegin() + 2, x.ndims() - 1, kernel_size.begin() + 1);
} else {
  std::copy_n(w.get_dims().cbegin(), x.ndims(), kernel_size.begin());
}
```

------

3. Enable DNNL built-in cache

Previously, we stored DNNL jitted kernels along with intermediate buffers inside ideep using an LRU cache. Now we are switching to the newly added DNNL built-in cache, and **no longer** caching buffers in order to reduce memory footprint.

This change will be mainly reflected in lower memory usage from memory profiling results. On the code side, we removed a couple of lines of `op_key_` that depended on the ideep cache before.

------

4. Use 64-bit integer to denote dimensions

We changed the type of `ideep::dims` from `vector<int32_t>` to `vector<int64_t>`. This renders ideep dims no longer compatible with the 32-bit dims used by caffe2. So we use something like `{stride_.begin(), stride_.end()}` to cast the parameter `stride_` into an int64 vector.


**Misc changes in each commit:**

**Commit:** change build options

Some build options were slightly changed, mainly to avoid name collisions with other projects that include DNNL as a subproject. In addition, DNNL built-in cache is enabled by option `DNNL_ENABLE_PRIMITIVE_CACHE`.

Old | New
-- | --
WITH_EXAMPLE | MKLDNN_BUILD_EXAMPLES
WITH_TEST | MKLDNN_BUILD_TESTS
MKLDNN_THREADING | MKLDNN_CPU_RUNTIME
MKLDNN_USE_MKL | N/A (MKL is no longer used)

------

**Commit:** aten reintegration

- aten/src/ATen/native/mkldnn/BinaryOps.cpp

    Implement binary ops using new operation `binary` provided by DNNL

- aten/src/ATen/native/mkldnn/Conv.cpp

    Clean up group convolution checks
    Simplify conv backward integration

- aten/src/ATen/native/mkldnn/MKLDNNConversions.cpp

    Simplify prepacking convolution weights

- test/test_mkldnn.py

    Fixed an issue in conv2d unit test: it didn't check conv results between mkldnn and aten implementation before. Instead, it compared the mkldnn with mkldnn as the default cpu path will also go into mkldnn. Now we use `torch.backends.mkldnn.flags` to fix this issue

- torch/utils/mkldnn.py

    Prepack weight tensor on module `__init__` to achieve significantly better performance

------

**Commit:** caffe2 reintegration

- caffe2/ideep/ideep_utils.h

    Clean up unused type definitions

- caffe2/ideep/operators/adam_op.cc & caffe2/ideep/operators/momentum_sgd_op.cc

   Unify tensor initialization with `ideep::tensor::init`. Obsolete `ideep::tensor::reinit`

- caffe2/ideep/operators/conv_op.cc & caffe2/ideep/operators/quantization/int8_conv_op.cc

    Clean up group convolution checks
    Revamp convolution API

- caffe2/ideep/operators/conv_transpose_op.cc

    Clean up group convolution checks
    Clean up deconv workaround code

------

**Commit:** custom allocator

- Register c10 allocator as mentioned above


## Performance

We tested inference on some common models based on user scenarios, and most performance numbers are either better than or on par with DNNL 0.20.

ratio: new / old | Latency (batch=1 4T) | Throughput (batch=64 56T)
-- | -- | --
pytorch resnet18 | 121.4% | 99.7%
pytorch resnet50 | 123.1% | 106.9%
pytorch resnext101_32x8d | 116.3% | 100.1%
pytorch resnext50_32x4d | 141.9% | 104.4%
pytorch mobilenet_v2 | 163.0% | 105.8%
caffe2 alexnet | 303.0% | 99.2%
caffe2 googlenet-v3 | 101.1% | 99.2%
caffe2 inception-v1 | 102.2% | 101.7%
caffe2 mobilenet-v1 | 356.1% | 253.7%
caffe2 resnet101 | 100.4% | 99.8%
caffe2 resnet152 | 99.8% | 99.8%
caffe2 shufflenet | 141.1% | 69.0% †
caffe2 squeezenet | 98.5% | 99.2%
caffe2 vgg16 | 136.8% | 100.6%
caffe2 googlenet-v3 int8 | 100.0% | 100.7%
caffe2 mobilenet-v1 int8 | 779.2% | 943.0%
caffe2 resnet50 int8 | 99.5% | 95.5%

_Configuration:
Platform: Skylake 8180
Latency Test: 4 threads, warmup 30, iteration 500, batch size 1
Throughput Test: 56 threads, warmup 30, iteration 200, batch size 64_

† Shufflenet is one of the few models that require temp buffers during inference. The performance degradation is expected, since we no longer cache any buffers in ideep. As a workaround, we suggest users opt for a caching allocator such as **jemalloc** as a drop-in replacement for the system allocator in such heavy workloads.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32422

Test Plan:
Perf results: https://our.intern.facebook.com/intern/fblearner/details/177790608?tab=Experiment%20Results

10% improvement for ResNext with avx512, neutral on avx2

More results: https://fb.quip.com/ob10AL0bCDXW#NNNACAUoHJP

Reviewed By: yinghai

Differential Revision: D20381325

Pulled By: dzhulgakov

fbshipit-source-id: 803b906fd89ed8b723c5fcab55039efe3e4bcb77
2020-03-26 22:07:59 -07:00
Pavel Belevich
11a40410e7 pybind11 type_caster for at::Generator and custom RNG python test (#34774)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34774

This PR provides pybind11's `type_caster<at::Generator>`, which allows mapping an `at::Generator` instance returned from a user-defined method to a Python `torch::Generator`, defined as the `THPGenerator` C++ class.

This allows 1) defining custom RNG in c++ extension 2) using custom RNG in python code.

`TestRNGExtension.test_rng` shows how to use custom RNG defined in `rng_extension.cpp`

Test Plan: Imported from OSS

Differential Revision: D20549451

Pulled By: pbelevich

fbshipit-source-id: 312a6deccf8228f7f60695bbf95834620d52f5eb
2020-03-22 10:57:35 -07:00
Nikita Shulga
d3f5045bf5 PyTorch should always depend on future (#35057)
Summary:
Because `past` is used in `caffe2.python.core`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35057

Test Plan: CI

Differential Revision: D20547042

Pulled By: malfet

fbshipit-source-id: cad2123c7b88271fea37f21e616df551075383a8
2020-03-19 17:31:47 -07:00
Eli Uriegas
275f5c8049 setup.py: Add numpy as required for install_requires (#34510)
Summary:
Was originally not a requirement but we should add it back here since
it's required on import and we require it anyways for our conda
packages.

Tested with:

```
❯ pkginfo -f requires_dist *.whl
requires_dist: ['numpy']
```
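The same check can be sketched without `pkginfo`, assuming the wheel's `METADATA` text has already been read into a string. The function name and the sample metadata below are hypothetical and are not taken from a real build; the sketch relies only on the fact that core metadata uses RFC 822 style headers such as `Requires-Dist`.

```python
from email.parser import Parser


def requires_dist(metadata_text):
    """Return the Requires-Dist entries found in core-metadata text."""
    headers = Parser().parsestr(metadata_text, headersonly=True)
    # get_all returns None when the header is absent, so fall back to [].
    return headers.get_all("Requires-Dist") or []


# Hypothetical METADATA contents, for illustration only.
SAMPLE_METADATA = (
    "Metadata-Version: 2.1\n"
    "Name: torch\n"
    "Version: 0.0.0\n"
    "Requires-Dist: numpy\n"
)

print(requires_dist(SAMPLE_METADATA))
```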

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34510

Differential Revision: D20352125

Pulled By: seemethere

fbshipit-source-id: 383e396fe500ed7043d83c3df57d1772d0fff1e6
2020-03-17 13:31:55 -07:00
Nikita Shulga
6d790c3611 Mark PyTorch incompatible with python-3.6.0 (#34724)
Summary:
Per https://github.com/pytorch/pytorch/issues/19161, PyTorch is incompatible with Python 3.6.0 because that release lacks `PySlice_Unpack`.
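A minimal sketch of what such a guard could look like, assuming 3.6.1 as the lower bound because the CPython documentation lists `PySlice_Unpack` as new in 3.6.1. The names `MIN_PYTHON` and `is_supported` are hypothetical, and this is not the change made in this commit.

```python
import sys

# Assumption: 3.6.1 is the first release that provides PySlice_Unpack.
MIN_PYTHON = (3, 6, 1)


def is_supported(version_info=None):
    """Return True if the interpreter version is at least MIN_PYTHON."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor, micro); ignore release level and serial.
    return tuple(version_info[:3]) >= MIN_PYTHON


if not is_supported():
    print("This interpreter is older than %d.%d.%d" % MIN_PYTHON)
```

A packaging-level alternative would be to declare the bound in the package metadata so that installers refuse incompatible interpreters before any code runs.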
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34724

Test Plan: CI + try to load pytorch binary using python-3.6.0

Differential Revision: D20449052

Pulled By: malfet

fbshipit-source-id: 2c787fc64f5d1377c7f935ad2f3c77f46723d7dd
2020-03-13 15:22:34 -07:00
Nikita Shulga
dd7cec680c Do not use clang if it can not parse system extensions (#34549)
Summary:
Attempts to build PyTorch with ASAN on a system with gcc-8 fail due to mismatched system compilation flags.
Address the issue by using the original compiler to build the `torch._C` extension.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34549

Test Plan: Run `.jenkins/pytorch/build-asan.sh` on FC-30

Differential Revision: D20373781

Pulled By: malfet

fbshipit-source-id: 041c8d25f96b4436385a5e0eb6fc46e9b5fdf3f1
2020-03-10 15:40:08 -07:00
xiaobing.zhang
b678256bfb Move glu to Aten(CPU) (#33179)
Summary:
This PR moves glu to ATen (CPU).
Test script:
```
import torch
import torch.nn.functional as F
import time

torch.manual_seed(0)

def _time():
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return time.time()

device = "cpu"

# Warm up
for n in [10, 100, 1000, 10000]:
    input = torch.randn(128, n, requires_grad=True, device=device)
    grad_output = torch.ones(128, n // 2, device=device)
    for i in range(1000):
        output = F.glu(input)
        output.backward(grad_output)

for n in [10, 100, 1000, 10000]:
    fwd_t = 0
    bwd_t = 0
    input = torch.randn(128, n, requires_grad=True, device=device)
    grad_output = torch.ones(128, n // 2, device=device)
    for i in range(10000):
        t1 = _time()
        output = F.glu(input)
        t2 = _time()
        output.backward(grad_output)
        t3 = _time()
        fwd_t += t2 - t1
        bwd_t += t3 - t2
    fwd_avg = fwd_t / 10000 * 1000
    bwd_avg = bwd_t / 10000 * 1000
    print("input size(128, %d) forward time is %.2f (ms); backward avg time is %.2f (ms)."
          % (n, fwd_avg, bwd_avg))
```
Test device: **skx-8180.**
Before:
```
input size(128, 10) forward time is 0.04 (ms); backward avg time is 0.08 (ms).
input size(128, 100) forward time is 0.06 (ms); backward avg time is 0.14 (ms).
input size(128, 1000) forward time is 0.11 (ms); backward avg time is 0.31 (ms).
input size(128, 10000) forward time is 1.52 (ms); backward avg time is 2.04 (ms).
```
After:
```
input size(128, 10) forward time is 0.02 (ms); backward avg time is 0.05 (ms).
input size(128, 100) forward time is 0.04 (ms); backward avg time is 0.09 (ms).
input size(128, 1000) forward time is 0.07 (ms); backward avg time is 0.17 (ms).
input size(128, 10000) forward time is 0.13 (ms); backward avg time is 1.03 (ms).
```
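For reference, the operation being timed splits its input in half along the chosen dimension and multiplies the first half by the sigmoid of the second half, which is why `grad_output` in the test script has `n // 2` columns. The sketch below is a plain-Python illustration over a single flat row; it is not the ATen kernel, and the function name is only illustrative.

```python
import math


def glu(row):
    """Gated linear unit over a flat list: first half * sigmoid(second half)."""
    if len(row) % 2 != 0:
        raise ValueError("GLU needs an even number of elements")
    half = len(row) // 2
    values, gates = row[:half], row[half:]
    # sigmoid(g) = 1 / (1 + exp(-g)); no overflow guard for very negative gates.
    return [v / (1.0 + math.exp(-g)) for v, g in zip(values, gates)]


# A gate of 0.0 gives sigmoid(0) = 0.5, so each value is halved.
print(glu([1.0, 3.0, 0.0, 0.0]))
```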
Fixes https://github.com/pytorch/pytorch/issues/24707, https://github.com/pytorch/pytorch/issues/24708.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33179

Differential Revision: D19839835

Pulled By: VitalyFedyunin

fbshipit-source-id: e4d3438556a1068da2c4a7e573d6bbf8d2a6e2b9
2020-02-28 14:54:38 -08:00
Michael Suo
dbe850af5b [jit] do the code reorg (#33851)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33851

Rationale and context described in #33828.

Script to reproduce the move:
https://gist.github.com/suo/16cbefaaeb67ca5a7c6caffd49b7f6e9
ghstack-source-id: 99079645

Test Plan: Make sure CI passes

Reviewed By: jamesr66a

Differential Revision: D20133869

fbshipit-source-id: 390e9241a9c85366d9005c492ac31f10aa96488e
2020-02-27 13:02:51 -08:00