Commit Graph

227 Commits

Author SHA1 Message Date
Jerry Zhang
4f7292f7ee Add NoQEngine to QEngine and refactor the name of set/get qengine (#26330)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26330

att

Test Plan:
.

Imported from OSS

Differential Revision: D17464904

fbshipit-source-id: d8f2cebb978fcbc478bc7e111ba24bc71a6f8915
2019-09-18 19:38:59 -07:00
Supriya Rao
bb1efb3bee Adding quantized::linear function for pytorch mobile in c10 (#26135)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26135

This change adds the support to call QNNPACK using the refactored API for Linear operators (Fully Connected)
It also has certain cmake changes to enable builing and using pytorch_qnnpack inside aten
I have disabled USE_QNNPACK in CMakeLists.txt. Enabling it results in picking kernels from third_party/QNNPACK during runtime since the function names are the same.

Test Plan:
python test/test_quantized.py TestQNNPackOps.test_qlinear_qnnpack

Imported from OSS

Differential Revision: D17434885

fbshipit-source-id: 084698026938f4529f61d12e86dfe82534ec73dd
2019-09-17 16:16:39 -07:00
Richard Zou
caed485873 Turn on BUILD_NAMEDTENSOR permanently (#26060)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26060

This PR enables BUILD_NAMEDTENSOR by default. This is done via including
a header, `c10/core/EnableNamedTensor`, that sets `BUILD_NAMEDTENSOR`.
In the future, the plan is to get rid of the flag entirely: we can
incrementally delete usages after this PR goes in.

This PR also maintains the namedtensor ci vs regular ci distinction.
`test/test_namedtensor.py` only runs if TEST_NAMEDTENSOR=1 is specified.
TEST_NAMEDTENSOR=1 is set on the namedtensor ci. I'll remove this
distinction later and send out an announcement about it; devs will be
responsible for named tensor failures after that.

The initial reason why we had the BUILD_NAMEDTENSOR flag was so that we
could quickly prototype named tensor features without worrying about
adding overhead to the framework. The overheads can be categorized as
memory overhead and performance overhead.

Memory overhead: named tensors adds 1 additional word per Tensor. This
is because TensorImpl stores a `unique_ptr<NamedTensorMetaInterface>`
field. This is not a lot of overhead.

Performance overhead: At all entry points to name inference, we check
if inputs to an op are named. If inputs are not named, we short-circuit
and don't do name inference. These calls should therefore be as
efficient as error-checking code and not take up a lot of time.

My plan is to benchmark a few functions and then post the results in a
comment to this PR.

Test Plan: - [namedtensor ci]

Differential Revision: D17331635

Pulled By: zou3519

fbshipit-source-id: deed901347448ae2c26066c1fa432e3dc0cadb92
2019-09-17 08:25:00 -07:00
Ralf Gommers
1b4951d3a5 Fix remaining invalid function cast warnings that show up with GCC 8/9 (#26104)
Summary:
Follow-up to gh-25483, more of the same fixes for warnings like:

```
../torch/csrc/autograd/python_variable.cpp:503:31: warning: cast between incompatible function types from ‘PyObject* (*)(THPVariable*)’ {aka ‘_object* (*)(THPVariable*)’} to ‘getter’ {aka ‘_object* (*)(_object*, void*)’} [-Wcast-function-type]
  503 |   {"_backward_hooks", (getter)THPVariable_get_backwards_hooks, (setter)THPVariable_set_backwards_hooks, nullptr, nullptr},
      |                               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```

This takes the build log output for a full rebuild with GCC 9.1 from ~10,000 to ~7,000 lines.

`clang-tidy` is going to complain, no way around that - see discussion at the end of gh-25483.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26104

Differential Revision: D17396831

Pulled By: ezyang

fbshipit-source-id: d71696bfe4dbe25519e4bcb7753151c118bd39f7
2019-09-17 07:43:37 -07:00
Supriya Rao
24d5b5f5f9 Add Runtime flag for quantized backend. (#25680)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25680

Add a runtime flag to choose between FBGEMM and QNNPACK when compiled with both.

The flag can be set by using torch.backends.quantized.engine = torch.fbgemm/torch.qnnpack or ctx::setPreferredQuantizedEngine(at::QEngine)
ghstack-source-id: 89935643

Test Plan: Verified torch.backends.quantized.engine works

Differential Revision: D17198233

fbshipit-source-id: e5449d06f4136385e0e6d18bd4237f8654a61672
2019-09-11 21:37:36 -07:00
jiayisun
b9bf91feb8 Add torch.backends.mkldnn.enabled flag (#25459)
Summary:
This PR is about add torch.backends.mkldnn.enabled flag said in https://github.com/pytorch/pytorch/issues/25186 which can be used disable mkldnn at runtime step as torch.backends.cudnn.enabled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25459

Differential Revision: D17258926

Pulled By: ezyang

fbshipit-source-id: e179ad364cc608fdaa7d0f37e2e762ceb5eda598
2019-09-11 12:09:40 -07:00
Gregory Chanan
716815e3de Stop initializing THNN backend. (#25352)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25352

It doesn't appear to be necessary anymore; assuming this works I'll kill the codegen in a follow-up PR.

Test Plan: Imported from OSS

Differential Revision: D17101573

Pulled By: gchanan

fbshipit-source-id: bd3d1724ee5c659185a161b1e291e30af52f0a8a
2019-08-30 07:42:17 -07:00
Pritam Damania
7818e7e5d4 Basic framework for Distributed Autograd context. (#24875)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24875

As per https://github.com/pytorch/pytorch/issues/23110, each autograd pass
would be assigned a unique autograd_context_id. In this change we introduce a
DistAutogradContainer per worker which holds information for each autograd pass
currently running.

DistAutogradContainer has a map from the autograd_context_id to
DistAutogradContext (which holds all the relevant information for the autograd
pass). DistAutogradContext currently only stores the autograd_context_id and
more information would be added to it later as we build out the rest of the
framework.

The autograd_context_id is a 64 bit globally unique integer where the first 16
bits are the worker_id and next 48 bits are auto-incrementing for uniqueness.

Sample python code on how this would be used for distributed autograd:

```
import torch.distributed.autograd as dist_autograd
worker_id = 0
dist_autograd.init(worker_id)
with dist_autograd.context() as context_id:
     # forward pass...
     # backward pass...
     # optimizer step...
```
ghstack-source-id: 89119248

Test Plan: unit tests.

Differential Revision: D16356694

fbshipit-source-id: d1a8678da0c2af611758dbb5d624d554212330ce
2019-08-28 18:51:56 -07:00
Shen Li
02d3c302d8 Fix build failure on OSX (#23998)
Summary:
https://github.com/pytorch/pytorch/pull/23228 caused build failure on OSX, because rpc.h is included as long as USE_DISTRIBUTED=1, but rpc/init.cpp (and others) is only included when NOT APPLE. So, it cannot find python_functions defined in init.cpp on MacOS. This PR attempt to fix it by wrapping rpc.h with USE_C10D, which is only set when NOT APPLE.

I tried this fix locally and it works.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23998

Differential Revision: D16706087

Pulled By: mrshenli

fbshipit-source-id: d04fe6717a181a3198289cdef51439708c2e291d
2019-08-07 22:05:41 -07:00
Shen Li
8b349073ce sync and async torch.distributed.rpc for builtin operators (#23228)
Summary:
Features:

* sync and async RPC for builtin operators
* RpcAgent API
* ProcessGroupAgent implementation

Goal:

* have a minimum working and testable RPC implementation
* make sure the RpcAgent API is sufficient for future ThriftAgent and TensorPipeAgent implementation
  * For tensor pipe implementation, it might allocate multiple underlying communication channels with different types, and might also use streaming serialization/deserialization for large tensors. To support this requirement, the current implementation only convert a BuiltinOp into a Message which contains a byte vector and a tensor table. It is up to the RpcAgent implementation to determine how it would like to serialize a Message object.
  * For ThriftAgent, as Thrift has it own request/response matching solution, the Message.id is no longer necessary. Hence the id can be dropped during serialization. All it needs to do is to pass the response Message object to the Future returned by send(...).
* support blocking and non-blocking RequestCallback
  * blocking means the callback won't return before sending out the response
  * non-blocking can be achieved by enqueue the `(from, request, RpcAgent&)` tuple and use a different thread to process them. That is why there is an `RpcAgent&` arg in the param list.

We are not exporting this diff until we finalize distributed autograd design and publish the API review publicly.

https://fb.quip.com/FabTAZKVgQpf

Pull Request resolved: https://github.com/pytorch/pytorch/pull/23228
ghstack-source-id: 87816717

Reviewed By: zhaojuanmao

Differential Revision: D15194693

fbshipit-source-id: 7adb600796613cde6073db6c227451b89940ecaf
2019-08-06 16:03:01 -07:00
Richard Zou
8e466b7e21 Add torch._C._BUILD_NAMEDTENSOR() (#23623)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23623

This is a quick, not-user-facing check for if pytorch was built with BUILD_NAMEDTENSOR=1.

Test Plan:
- run tests [namedtensor ci]

gh-metadata: pytorch pytorch 23623 gh/zou3519/85/head

Differential Revision: D16621829

Pulled By: zou3519

fbshipit-source-id: d7e1161dc176bab2c1f953265722daeba1e63102
2019-08-02 11:37:25 -07:00
Iurii Zdebskyi
3a8d7463bd Enabled BFloat16 storage (#21523)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21523
ghimport-source-id: 698b3cbd6b21c09b9ff8bf8011980df8e35c33b0

Test Plan: Imported from OSS

Differential Revision: D15819368

Pulled By: izdeby

fbshipit-source-id: f6b3bba7b3ca8ee677bd80a231dbb3920c07d61c
2019-07-09 21:51:06 -07:00
Roy Li
9c8f9f0ecb Remove many usages of Type (#21941)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21941
ghimport-source-id: f20cca6229daba9eb8652adb3d959266ae081ef1

Test Plan: Imported from OSS

Differential Revision: D15893331

Pulled By: li-roy

fbshipit-source-id: c988b16008ff0e2725a88c6025afd4aabdaca45a
2019-06-30 04:11:28 -07:00
Alexander Sidorov
f51de8b61a Back out "Revert D15435461: [pytorch][PR] PyTorch ThroughputBenchmark" (#22185)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22185

Original commit changeset: 72a0eac1658b

Differential Revision: D15981928

fbshipit-source-id: d2455d79e81c26ee90d41414cde8ac0f9b703bc3
2019-06-26 16:05:51 -07:00
Ailing Zhang
e8bc992b03 print device when it's not on default device (#22094)
Summary:
we used to not print device when it's on xla. It's sometimes confusing as it looks the same as cpu tensor...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22094

Differential Revision: D15975405

Pulled By: ailzhang

fbshipit-source-id: f19ceb9e26f5f2f6e7d659de12716f0dfe065f42
2019-06-25 20:28:50 -07:00
Pieter Noordhuis
6ff0c6ca3f Remove THD (#22065)
Summary:
It's been ~9 months since moving THD to the `torch.distributed.deprecated` namespace (see https://github.com/pytorch/pytorch/issues/11405) and we haven't seen issues related to it, so it's time to remove it.

Closes https://github.com/pytorch/pytorch/issues/18967.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22065

Reviewed By: mrshenli

Differential Revision: D15983669

Pulled By: pietern

fbshipit-source-id: 2a2f5866f9a63040bc7cef3956d5fd215aba7165
2019-06-25 12:19:13 -07:00
Soumith Chintala
08060e898b Revert D15435461: [pytorch][PR] PyTorch ThroughputBenchmark
Differential Revision:
D15435461

Original commit changeset: db08829dc3f4

fbshipit-source-id: 72a0eac1658b2d3f885bc9a21c49fcc23030ae3e
2019-06-23 22:55:05 -07:00
Alexander Sidorov
9b45237618 PyTorch ThroughputBenchmark (#20766)
Summary:
This is useful for measuring inference performance of your
models. This is a very basic benchmark for now. We don't support
batching on the benchmark side, no inter and intra op parallelizm is
supported yet, just caller based parallelizm.

Main phylosophy here is that user should be able to provide inputs
from python and just stack them within the benchmark. API should be
exactly the same as passing inputs to module.forward.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20766

Test Plan: Added a new unit test

Differential Revision: D15435461

Pulled By: salexspb

fbshipit-source-id: db08829dc3f4398bb1d8aa16cc4a58b6c72f16c6
2019-06-23 13:03:18 -07:00
Jerry Zhang
94f903654c Add qscheme() method (#20608)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20608

Exposing QScheme in python as Python objects like `torch.qscheme.per_tensor_affine` etc.

Reviewed By: zafartahirov

Differential Revision: D15364354

fbshipit-source-id: 4d6a96d67e9ead051cf4a8f934553a8c7232fdb7
2019-06-14 16:29:29 -07:00
Syed Tousif Ahmed
ae342fd076 Refactor Random Number Generators in ATen (#21364)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21364
ghimport-source-id: ca7d37e10190ba46dc8512f437404ca9216d3369

Differential Revision: D15696497

Pulled By: ezyang

fbshipit-source-id: 2e713b8566ae915e175b5a79ac1dd9b86cc2a23d
2019-06-12 13:01:30 -07:00
davidriazati
f172fadd80 Make warnings be UserWarnings with source file info (#21231)
Summary:
Redo of #15201, this makes `warnings.warn` calls match their Python
behavior
](https://our.intern.facebook.com/intern/diff/15605266/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21231

Pulled By: driazati

Differential Revision: D15605266

fbshipit-source-id: 5931fd720b0c40d52dd492fbd1f5a76abefaab5c
2019-06-05 11:09:11 -07:00
Jerry Zhang
277bf69fa0 Add torch.load/torch.save for QTensor (#20830)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20830

att

Reviewed By: dzhulgakov

Differential Revision: D15340701

fbshipit-source-id: 677038c8101f66dec4856c2eccf9f9e394012226
2019-05-30 20:52:19 -07:00
Dmytro Dzhulgakov
c25e33789e Lightweight at-most-once logging for API usage (#20745)
Summary:
Resubmit #20698 which got messed up.

Idea is that when PyTorch is used in a custom build environment (e.g. Facebook), it's useful to track usage of various APIs centrally. This PR introduces a simple very lightweight mechanism to do so - only first invocation of a trigger point would be logged. This is significantly more lightweight than #18235 and thus we can allow to put logging in e.g. TensorImpl.

Also adds an initial list of trigger points. Trigger points are added in such a way that no static initialization triggers them, i.e. just linking with libtorch.so will not cause any logging. Further suggestions of what to log are welcomed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20745

Differential Revision: D15429196

Pulled By: dzhulgakov

fbshipit-source-id: a5e41a709a65b7ebccc6b95f93854e583cf20aca
2019-05-23 23:17:59 -07:00
Will Feng
8cde4c4d22 Remove Variable::Impl and DifferentiableViewImpl (#17072)
Summary:
As part of the Variable/Tensor merge work: https://github.com/pytorch/pytorch/issues/13638, we make the following changes in this PR:
1. Remove the `Variable::Impl` class and the `DifferentiableViewImpl` class
2. Change all `Variable.data()` call sites to either use `Variable` directly, or use `Variable.tensor_data()`
3. Remove `Variable.data()` API
3. Add `Variable.variable_data()` that matches `tensor.data` in Python API, which creates a new `Variable` that shares the same storage and tensor metadata with the original `Variable`, but with a completely new autograd history.

After this PR, Variable doesn't wrap a Tensor internally anymore, and both Variable and Tensor use the same TensorImpl class as its `impl_`. The only difference is that Variable always has AutogradMeta in its TensorImpl, but Tensor doesn't.

**Note that this PR is BC-breaking in the following use cases:**

**Use Case 1:**
Previously, `x.data = y` works even if `x` and `y` are of different TensorImpl type (e.g. `x` is a CPU dense tensor whose impl is of type TensorImpl, while `y` is a CPU sparse tensor whose impl is of type SparseTensorImpl). However, after this PR, `x.data = y` doesn't work anymore if `x` and `y` are of different TensorImpl type, because the underlying implementation `variable.set_data(tensor)` no longer works if `variable` and `tensor` have different TensorImpl type.

**Use Case 2:**
If a tensor `x`'s `grad` is sparse, accumulating dense gradients to `x` will change the tensor that `x.grad` is pointing to. This is better illustrated with the following example:
```python
params = torch.tensor([1.5, 1.5]).requires_grad_()
with torch.no_grad():
    # Change gradient to a sparse tensor
    params.grad = torch.sparse_coo_tensor(torch.tensor([[1, 1]]).long(), torch.tensor([1., 1.]))

grad_saved = params.grad
params.backward(torch.tensor([1.5, 1.5]))
assert id(grad_saved) == id(params.grad)  # This will fail after this PR
```
The assertion in the last line will fail after this PR, because adding dense gradients to sparse gradients will change the `params.grad` tensor reference.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17072

Differential Revision: D14075257

Pulled By: yf225

fbshipit-source-id: 0e681df641270dea586042dd26db59f2e76b5957
2019-05-23 21:09:04 -07:00
Edward Z. Yang
9b1dbffba5
Re-sync with internal repository (#20702) 2019-05-20 09:22:57 -04:00
Dmytro Dzhulgakov
d3059b9c49 Lightweight logging for once-only API usage 2019-05-19 23:04:40 -07:00
Ilia Cherniavskii
409200df59 Move inter-op settings into ATen/Parallel (#20050)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20050
ghimport-source-id: cc102bab8abf3e56c099245976786317ed63ea14

Differential Revision: D15248576

Pulled By: ilia-cher

fbshipit-source-id: 55ddcb7af387ddfc68a42ac7167de07ea648e249
2019-05-17 03:12:02 -07:00
Vitaly Fedyunin
5b78a5eadb Memory format support for contiguous and is_contiguous (#20455)
Summary:
#19975 was separated by 2 PRs.

This one:

Introduce MemoryFormat argument to the `x.is_contiguous(memory_format=torch.channels_last)` and to the `y = x.contiguous(memory_format=torch.channels_last)` functions.

At this moment both functions just operate with strides and doesn't store any tensor state.

(Original RFC #19092)

-----

Expands functionality of two tensor functions `.is_contiguous` and `.contiguous` (both python and c++ api).

Note: We had several complaints about `.to(memory_format)` function, and decided not to support it.

1.  `.contiguous` now support optional keyword-only argument - `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.

    - Using `torch.contiguous_format` will preserve existing `.contiguous()` behavior.

    - Calling `x.contiguous(memory_format=torch.channels_last)` returns new tensor which maintain same semantical layout (NCHW), but have different memory allocation pattern.

        `x.contiguous(memory_format=torch.channels_last)` expects input tensor to be 3d, 4d or 5d; and fails otherwise.

2. `.is_contiguous` now support optional keyword-only argument - `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.

    - `x.is_contiguous(memory_format=torch.contiguous_format)` preserves same functionality as `x.is_contiguous()` and remains unchanged.

    - `x.is_contiguous(memory_format=torch.channels_last)` returns true if A) input tensor is contiguous in memory AND B) allocated in the memory in NWHC (or similar for 3d,5d) format.

Note: By the end of the phase one `x.is_contiguous(memory_format=torch.channels_last)` will calculate state of the Tensor on every call. This functionality going to be updated later.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20455

Differential Revision: D15341577

Pulled By: VitalyFedyunin

fbshipit-source-id: bbb6b4159a8a49149110ad321109a3742383185d
2019-05-16 07:18:24 -07:00
Ilia Cherniavskii
481b6d0268 Allow a non-OpenMP based build (#19749)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19749
ghimport-source-id: a6636c0acddbdc5fd5b0dcb20b9f80cbdb9159b9

Differential Revision: D15141993

Pulled By: ilia-cher

fbshipit-source-id: 96085608398b2a4c97c68b2948f5184d07f9ad3d
2019-05-06 19:34:48 -07:00
Roy Li
689dd800ed Generate only one Type class per backend (#19295)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19295
ghimport-source-id: 9345110f91f044a449804ddd5116cc9179444a00

Differential Revision: D14948581

Pulled By: li-roy

fbshipit-source-id: a317b03d58d621e8df162918038f7543bfb13ba2
2019-04-21 21:16:14 -07:00
Ilia Cherniavskii
646cb6157d Move OMP/MKL thread initialization into ATen/Parallel (#19011)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19011
ghimport-source-id: 432e31eccfd0e59fa21a790f861e6b2ff4fdbac6

Differential Revision: D14846034

Pulled By: ilia-cher

fbshipit-source-id: d9d03c761d34bac80e09ce776e41c20fd3b04389
2019-04-16 00:16:32 -07:00
Edward Yang
29ea08616b Add torch.__config__.show(), reporting detailed version of all libraries. (#18579)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18579
ghimport-source-id: 65124c95e49423de4ad1008c65e75057fea09b94

Differential Revision: D14778507

Pulled By: ezyang

fbshipit-source-id: 1e4bb79f4800a116ce8fb7af2fefbd34da8d102c
2019-04-09 11:13:24 -07:00
Edward Yang
50df3e5e2e Add ability to query if built with CUDA and MKL-DNN. (#18362)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18362
ghimport-source-id: 374b7ab97e2d6a894368007133201f510539296f

Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18242 Test running a CUDA build on CPU machine.
* **#18362 Add ability to query if built with CUDA and MKL-DNN.**

Fixes #18108.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14584430

fbshipit-source-id: 7605a1ac4e8f2a7c70d52e5a43ad7f03f0457473
2019-03-25 10:39:09 -07:00
Iurii Zdebskyi
444039c47b Bool tensor. Part 0: Boolean storage implementation (#16810)
Summary:
This is the first commit from a series of planned changes in order to add boolean tensors to PyTorch. The whole plan looks like this:

0. Storage Implementation (this change)
1. Tensor Creation.
2. Tensor Conversions.
3. Tensor Indexing.
4. Tensor Operations.
5. Back compatibility related changes.

This feature was requested by the community:
https://github.com/pytorch/pytorch/issues/4764
https://github.com/pytorch/pytorch/issues/4219
https://github.com/pytorch/pytorch/issues/4288

**Change**:
Added boolean type to the Storage class for CPU and CUDA backends.

**Tested via**:
1. unit tests
2. running this:
-> import torch
-> torch.BoolStorage
<class 'torch.BoolStorage'>
-> torch.cuda.BoolStorage
<class 'torch.cuda.BoolStorage'>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16810

Reviewed By: gchanan

Differential Revision: D14087246

Pulled By: izdeby

fbshipit-source-id: 042642ced1cb0fd1bb6bff05f9ca871a5c54ee5e
2019-02-19 08:22:13 -08:00
SsnL
13422fca32 Add torch.backends.openmp.is_available(); fix some cmake messages (#16425)
Summary:
1. add `torch.backends.openmp.is_available()`
2. Improve various `cmake` outputs
3. Fix LDFLAGS not respected by `caffe2_pybind11_state_*` targets
4. Fix `MKL` warning message, and QUIET flag.
5. Fix various typos
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16425

Differential Revision: D13903395

Pulled By: soumith

fbshipit-source-id: d15c5d46f53e1ff1c27fca2887b9d23d0bd85b4d
2019-01-31 16:15:46 -08:00
Shen Li
24f4d3987e Move all Stream and Event Python implementation to C++ (#15937)
Summary:
1. Added `torch/csrc/cuda/Event.h` and `torch/csrc/cuda/Event.cpp` to bind Python Event class to C++ implementation.
2. Move all CUDA runtime invocations from `torch/cuda/streams.py` to C++
3. Added tests to cover Stream and Event APIs. ~(event IPC handle tests is introduced in #15974)~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15937

Differential Revision: D13649001

Pulled By: mrshenli

fbshipit-source-id: 84ca58f35f6ba679a4ba33150ceba678d760d240
2019-01-17 07:29:22 -08:00
Peter Goldsborough
0bf1383f0a Python <-> C++ Frontend inter-op (#13481)
Summary:
This PR enables C++ frontend modules to be bound into Python and added as submodules of Python modules. For this, I added lots of pybind11 bindings for the `torch::nn::Module` class, and modified the `torch.nn.Module` class in Python to have a new Metaclass that makes `isinstance(m, torch.nn.Module)` return true when `m` is a C++ frontend module. The methods and fields of C++ modules are bound in such a way that they work seamlessly as submodules of Python modules for most operations (one exception I know of: calling `.to()` ends up calling `.apply()` on each submodule with a Python lambda, which cannot be used in C++ -- this may require small changes on Python side).

I've added quite a bunch of tests to verify the bindings and equality with Python. I think I should also try out adding a C++ module as part of some large PyTorch module, like a WLM or something, and see if everything works smoothly.

The next step for inter-op across our system is ScriptModule <-> C++ Frontend Module inter-op. I think this will then also allow using C++ frontend modules from TorchScript.

apaszke zdevito

CC dzhulgakov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13481

Differential Revision: D12981996

Pulled By: goldsborough

fbshipit-source-id: 147370d3596ebb0e94c82cec92993a148fee50a7
2018-12-13 08:04:02 -08:00
Edward Yang
517c7c9861 Canonicalize all includes in PyTorch. (#14849)
Summary:
Anywhere we used #include "foo.h", we now say #include <foo.h>
Paths are adjusted to be rooted out of aten/src, torch/lib, or
the root level directory.

I modified CMakeLists.txt by hand to remove TH and THC from
the include paths.

I used the following script to do the canonicalization:

```
  import subprocess
  import re
  import os.path

  files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
  for fn in files:
      if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']):
          continue
      if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
          continue
      with open(fn, 'r') as f:
          c = f.read()
      def fmt(p):
          return "#include <{}>".format(p)
      def repl(m):
          p = m.group(1)
          if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
              return fmt(p)
          if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
              return fmt(p)
          for root in ["aten/src", "torch/lib", ""]:
              for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
                  new_p = os.path.relpath(os.path.join(bad_root, p), root)
                  if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
                      return fmt(new_p)
          print("ERROR: ", fn, p)
          return m.group(0)
      new_c = re.sub(r'#include "([^"]+)"', repl, c)
      if new_c != c:
          print(fn)
          with open(fn, 'w') as f:
              f.write(new_c)
```

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849

Reviewed By: dzhulgakov

Differential Revision: D13363445

Pulled By: ezyang

fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68
2018-12-08 19:38:30 -08:00
Peter Goldsborough
d6c53328f9 Large scale fix of python-related files in torch/csrc/
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14515

Differential Revision: D13247966

Pulled By: goldsborough

fbshipit-source-id: 7a127c508fc576a7a92626dd6b729f660162d628
2018-12-07 13:04:46 -08:00
Pieter Noordhuis
220ce8046e Binding for prctl(PR_SET_PDEATHSIG) (#14491)
Summary:
If torch.multiprocessing.spawn is used to launch non-daemonic
processes (the default since #14391), the spawned children won't be
automatically terminated when the parent terminates.

On Linux, we can address this by setting PR_SET_PDEATHSIG, which
delivers a configurable signal to child processes when their parent
terminates.

Fixes #14394.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14491

Differential Revision: D13270374

Pulled By: pietern

fbshipit-source-id: 092c9d3c3cea2622c3766b467957bc27a1bd500c
2018-11-29 20:09:19 -08:00
albanD
f80d34a1c8 Update Tensor doc (#14339)
Summary:
Add to the Tensor doc info about `.device`, `.is_cuda`, `.requires_grad`, `.is_leaf` and `.grad`.
Update the `register_backward_hook` doc with a warning stating that it does not work in all cases.
Add support in the `_add_docstr` function to add docstring to attributes.

There is an explicit cast here but I am not sure how to handle it properly. The thing is that the doc field for getsetdescr is written as being a const char * (as all other doc fields in descriptors objects) in cpython online documentation. But in the code, it is the only one that is not const.
I assumed here that it is a bug in the code because it does not follow the doc and the convention of the others descriptors and so I cast out the const.
EDIT: the online doc I was looking at is for 3.7 and in that version both the code and the doc are const. For older versions, both are non const.
Please let me know if this should not be done. And if it should be done if there is a cleaner way to do it !
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14339

Differential Revision: D13243266

Pulled By: ezyang

fbshipit-source-id: 75b7838f7cd6c8dc72b0c61950e7a971baefaeeb
2018-11-28 15:28:17 -08:00
Anders Papitto
2983998bb3 add torch-python target (#12742)
Summary:
This is the next minimal step towards moving _C into cmake. For now,
leave _C in setup.py, but reduce it to an empty stub file. All of its
sources are now part of the new torch-python cmake target.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12742

Reviewed By: soumith

Differential Revision: D13089691

Pulled By: anderspapitto

fbshipit-source-id: 1c746fda33cfebb26e02a7f0781fefa8b0d86385
2018-11-16 11:43:48 -08:00
Benoit Steiner
bbe6ef3864 torch.finfo and torch.iinfo to mimic the numpy equivalent (#12472)
Summary:
This pull request intends to provide the functionality requested in https://github.com/pytorch/pytorch/issues/10742 by adding a new torch.finfo and torch.iinfo API.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12472

Differential Revision: D10250829

Pulled By: benoitsteiner

fbshipit-source-id: eb22ca55d5b0064bef381fa7f1eb75989977df30
2018-10-15 13:43:52 -07:00
Yangqing Jia
713e706618 Move exception to C10 (#12354)
Summary:
There are still a few work to be done:

- Move logging and unify AT_WARN with LOG(ERROR).
- A few header files are still being plumbed through, need cleaning.
- caffe2::EnforceNotMet aliasing is not done yet.
- need to unify the macros. See c10/util/Exception.h

This is mainly a codemod and not causing functional changes. If you find your job failing and trace back to this diff, usually it can be fixed by the following approaches:

(1) add //caffe2/c10:c10 to your dependency (or transitive dependency).
(2) change objects such as at::Error, at::Optional to the c10 namespace.
(3) change functions to the c10 namespace. Especially, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes.

Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354

Reviewed By: orionr

Differential Revision: D10238910

Pulled By: Yangqing

fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32
2018-10-15 13:33:18 -07:00
Adam Paszke
8c3a94eaf2 Improve autograd profiler performance (#11773)
Summary:
To illustrate the benefits of this commit, I'll use the time/iter I got from one of the JIT benchmarks on my machine.

| Run                                          | Time                    |
|----------------------------------------------|-------------------------|
| No profiler                                  | 45ms                    |
| With profiler                                | 56ms                    |
| Use `clock_gettime` instead of `std::chrono` | 48ms                    |
| Touch all pages on block allocation          | 48ms (less jitter)      |
| Use `const char*` instead of `std::string`   | 47ms (even less jitter) |
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11773

Differential Revision: D9886858

Pulled By: apaszke

fbshipit-source-id: 58f926f09e95df0b11ec687763a72b06b66991d0
2018-09-19 09:25:43 -07:00
Teng Li
020501b7b0 Getting rid of USE_C10D for build (#11237)
Summary:
Will use USE_DISTRIBUTED for both c10d and THD
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11237

Differential Revision: D9647825

Pulled By: teng-li

fbshipit-source-id: 06e0ec9b5e2f8f38780fc88718f8499463e9e969
2018-09-04 17:27:53 -07:00
Pieter Noordhuis
033499cf56 Remove mention of USE_DISTRIBUTED_MW (#11240)
Summary:
This was lingering after #10731.

cc orionr
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11240

Differential Revision: D9645437

Pulled By: pietern

fbshipit-source-id: d02c33354b094be3bb0872cf54a45721e20c4e7d
2018-09-04 16:10:20 -07:00
Peter Goldsborough
7ddc6f84c4 NULL -> nullptr (#11047)
Summary:
How did we get so many uses of `NULL` again?

ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11047

Differential Revision: D9566799

Pulled By: goldsborough

fbshipit-source-id: 83469f352ac69aa65bdaf1a1a21f922d892e0db3
2018-08-30 16:25:42 -07:00
Tongzhou Wang
23af7deea7 Add has_lapack flag (#11024)
Summary:
Currently our `skipIfLapack` has uses a try-catch block and regex match the error message. It is highly unreliable. This PR adds `hasLAPACK` and `hasMAGMA` on ATen context, and expose the flags to python.

Also fixes refcounting bug with `PyModule_AddObject`. The method steals reference, but we didn't `Py_INCREF` in some places before calling it with `Py_True` or `Py_False`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11024

Differential Revision: D9564898

Pulled By: SsnL

fbshipit-source-id: f46862ec3558d7e0058ef48991cd9c720cb317e2
2018-08-29 22:41:16 -07:00
Richard Zou
ad6d62250a Add torch.compiled_with_cxx11_abi(). (#10071)
Summary:
It returns whether PyTorch was built with _GLIBCXX_USE_CXX11_ABI=1.

Fixes #8385
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10071

Differential Revision: D9088946

Pulled By: zou3519

fbshipit-source-id: b00fd92ee340ef34f60bdd6027ceaf46dd7442c0
2018-08-01 15:34:48 -07:00