pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Jerry Zhang	4f7292f7ee	Add NoQEngine to QEngine and refactor the name of set/get qengine (#26330 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26330 att Test Plan: . Imported from OSS Differential Revision: D17464904 fbshipit-source-id: d8f2cebb978fcbc478bc7e111ba24bc71a6f8915	2019-09-18 19:38:59 -07:00
Supriya Rao	bb1efb3bee	Adding quantized::linear function for pytorch mobile in c10 (#26135 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26135 This change adds the support to call QNNPACK using the refactored API for Linear operators (Fully Connected). It also has certain cmake changes to enable building and using pytorch_qnnpack inside aten. I have disabled USE_QNNPACK in CMakeLists.txt. Enabling it results in picking kernels from third_party/QNNPACK during runtime since the function names are the same. Test Plan: python test/test_quantized.py TestQNNPackOps.test_qlinear_qnnpack Imported from OSS Differential Revision: D17434885 fbshipit-source-id: 084698026938f4529f61d12e86dfe82534ec73dd	2019-09-17 16:16:39 -07:00
Richard Zou	caed485873	Turn on BUILD_NAMEDTENSOR permanently (#26060 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26060 This PR enables BUILD_NAMEDTENSOR by default. This is done via including a header, `c10/core/EnableNamedTensor`, that sets `BUILD_NAMEDTENSOR`. In the future, the plan is to get rid of the flag entirely: we can incrementally delete usages after this PR goes in. This PR also maintains the namedtensor ci vs regular ci distinction. `test/test_namedtensor.py` only runs if TEST_NAMEDTENSOR=1 is specified. TEST_NAMEDTENSOR=1 is set on the namedtensor ci. I'll remove this distinction later and send out an announcement about it; devs will be responsible for named tensor failures after that. The initial reason why we had the BUILD_NAMEDTENSOR flag was so that we could quickly prototype named tensor features without worrying about adding overhead to the framework. The overheads can be categorized as memory overhead and performance overhead. Memory overhead: named tensors adds 1 additional word per Tensor. This is because TensorImpl stores a `unique_ptr<NamedTensorMetaInterface>` field. This is not a lot of overhead. Performance overhead: At all entry points to name inference, we check if inputs to an op are named. If inputs are not named, we short-circuit and don't do name inference. These calls should therefore be as efficient as error-checking code and not take up a lot of time. My plan is to benchmark a few functions and then post the results in a comment to this PR. Test Plan: - [namedtensor ci] Differential Revision: D17331635 Pulled By: zou3519 fbshipit-source-id: deed901347448ae2c26066c1fa432e3dc0cadb92	2019-09-17 08:25:00 -07:00
Ralf Gommers	1b4951d3a5	Fix remaining invalid function cast warnings that show up with GCC 8/9 (#26104 ) Summary: Follow-up to gh-25483, more of the same fixes for warnings like: ``` ../torch/csrc/autograd/python_variable.cpp:503:31: warning: cast between incompatible function types from ‘PyObject* ()(THPVariable)’ {aka ‘_object* ()(THPVariable)’} to ‘getter’ {aka ‘_object* ()(_object, void*)’} [-Wcast-function-type] 503 \| {"_backward_hooks", (getter)THPVariable_get_backwards_hooks, (setter)THPVariable_set_backwards_hooks, nullptr, nullptr}, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` This takes the build log output for a full rebuild with GCC 9.1 from ~10,000 to ~7,000 lines. `clang-tidy` is going to complain, no way around that - see discussion at the end of gh-25483. Pull Request resolved: https://github.com/pytorch/pytorch/pull/26104 Differential Revision: D17396831 Pulled By: ezyang fbshipit-source-id: d71696bfe4dbe25519e4bcb7753151c118bd39f7	2019-09-17 07:43:37 -07:00
Supriya Rao	24d5b5f5f9	Add Runtime flag for quantized backend. (#25680 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25680 Add a runtime flag to choose between FBGEMM and QNNPACK when compiled with both. The flag can be set by using torch.backends.quantized.engine = torch.fbgemm/torch.qnnpack or ctx::setPreferredQuantizedEngine(at::QEngine) ghstack-source-id: 89935643 Test Plan: Verified torch.backends.quantized.engine works Differential Revision: D17198233 fbshipit-source-id: e5449d06f4136385e0e6d18bd4237f8654a61672	2019-09-11 21:37:36 -07:00
jiayisun	b9bf91feb8	Add torch.backends.mkldnn.enabled flag (#25459 ) Summary: This PR adds the torch.backends.mkldnn.enabled flag requested in https://github.com/pytorch/pytorch/issues/25186, which can be used to disable mkldnn at runtime, in the same way as torch.backends.cudnn.enabled. Pull Request resolved: https://github.com/pytorch/pytorch/pull/25459 Differential Revision: D17258926 Pulled By: ezyang fbshipit-source-id: e179ad364cc608fdaa7d0f37e2e762ceb5eda598	2019-09-11 12:09:40 -07:00
Gregory Chanan	716815e3de	Stop initializing THNN backend. (#25352 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25352 It doesn't appear to be necessary anymore; assuming this works I'll kill the codegen in a follow-up PR. Test Plan: Imported from OSS Differential Revision: D17101573 Pulled By: gchanan fbshipit-source-id: bd3d1724ee5c659185a161b1e291e30af52f0a8a	2019-08-30 07:42:17 -07:00
Pritam Damania	7818e7e5d4	Basic framework for Distributed Autograd context. (#24875 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24875 As per https://github.com/pytorch/pytorch/issues/23110, each autograd pass would be assigned a unique autograd_context_id. In this change we introduce a DistAutogradContainer per worker which holds information for each autograd pass currently running. DistAutogradContainer has a map from the autograd_context_id to DistAutogradContext (which holds all the relevant information for the autograd pass). DistAutogradContext currently only stores the autograd_context_id and more information would be added to it later as we build out the rest of the framework. The autograd_context_id is a 64 bit globally unique integer where the first 16 bits are the worker_id and next 48 bits are auto-incrementing for uniqueness. Sample python code on how this would be used for distributed autograd:
```
import torch.distributed.autograd as dist_autograd
worker_id = 0
dist_autograd.init(worker_id)
with dist_autograd.context() as context_id:
    # forward pass...
    # backward pass...
    # optimizer step...
```
ghstack-source-id: 89119248 Test Plan: unit tests. Differential Revision: D16356694 fbshipit-source-id: d1a8678da0c2af611758dbb5d624d554212330ce	2019-08-28 18:51:56 -07:00
Shen Li	02d3c302d8	Fix build failure on OSX (#23998 ) Summary: https://github.com/pytorch/pytorch/pull/23228 caused build failure on OSX, because rpc.h is included as long as USE_DISTRIBUTED=1, but rpc/init.cpp (and others) is only included when NOT APPLE. So, it cannot find python_functions defined in init.cpp on MacOS. This PR attempts to fix it by wrapping rpc.h with USE_C10D, which is only set when NOT APPLE. I tried this fix locally and it works. Pull Request resolved: https://github.com/pytorch/pytorch/pull/23998 Differential Revision: D16706087 Pulled By: mrshenli fbshipit-source-id: d04fe6717a181a3198289cdef51439708c2e291d	2019-08-07 22:05:41 -07:00
Shen Li	8b349073ce	sync and async torch.distributed.rpc for builtin operators (#23228 ) Summary: Features: * sync and async RPC for builtin operators * RpcAgent API * ProcessGroupAgent implementation Goal: * have a minimum working and testable RPC implementation * make sure the RpcAgent API is sufficient for future ThriftAgent and TensorPipeAgent implementation * For tensor pipe implementation, it might allocate multiple underlying communication channels with different types, and might also use streaming serialization/deserialization for large tensors. To support this requirement, the current implementation only converts a BuiltinOp into a Message which contains a byte vector and a tensor table. It is up to the RpcAgent implementation to determine how it would like to serialize a Message object. * For ThriftAgent, as Thrift has its own request/response matching solution, the Message.id is no longer necessary. Hence the id can be dropped during serialization. All it needs to do is to pass the response Message object to the Future returned by send(...). * support blocking and non-blocking RequestCallback * blocking means the callback won't return before sending out the response * non-blocking can be achieved by enqueuing the `(from, request, RpcAgent&)` tuple and using a different thread to process them. That is why there is an `RpcAgent&` arg in the param list. We are not exporting this diff until we finalize distributed autograd design and publish the API review publicly. https://fb.quip.com/FabTAZKVgQpf Pull Request resolved: https://github.com/pytorch/pytorch/pull/23228 ghstack-source-id: 87816717 Reviewed By: zhaojuanmao Differential Revision: D15194693 fbshipit-source-id: 7adb600796613cde6073db6c227451b89940ecaf	2019-08-06 16:03:01 -07:00
Richard Zou	8e466b7e21	Add torch._C._BUILD_NAMEDTENSOR() (#23623 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23623 This is a quick, not-user-facing check for if pytorch was built with BUILD_NAMEDTENSOR=1. Test Plan: - run tests [namedtensor ci] gh-metadata: pytorch pytorch 23623 gh/zou3519/85/head Differential Revision: D16621829 Pulled By: zou3519 fbshipit-source-id: d7e1161dc176bab2c1f953265722daeba1e63102	2019-08-02 11:37:25 -07:00
Iurii Zdebskyi	3a8d7463bd	Enabled BFloat16 storage (#21523 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21523 ghimport-source-id: 698b3cbd6b21c09b9ff8bf8011980df8e35c33b0 Test Plan: Imported from OSS Differential Revision: D15819368 Pulled By: izdeby fbshipit-source-id: f6b3bba7b3ca8ee677bd80a231dbb3920c07d61c	2019-07-09 21:51:06 -07:00
Roy Li	9c8f9f0ecb	Remove many usages of Type (#21941 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21941 ghimport-source-id: f20cca6229daba9eb8652adb3d959266ae081ef1 Test Plan: Imported from OSS Differential Revision: D15893331 Pulled By: li-roy fbshipit-source-id: c988b16008ff0e2725a88c6025afd4aabdaca45a	2019-06-30 04:11:28 -07:00
Alexander Sidorov	f51de8b61a	Back out "Revert D15435461: [pytorch][PR] PyTorch ThroughputBenchmark" (#22185 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22185 Original commit changeset: 72a0eac1658b Differential Revision: D15981928 fbshipit-source-id: d2455d79e81c26ee90d41414cde8ac0f9b703bc3	2019-06-26 16:05:51 -07:00
Ailing Zhang	e8bc992b03	print device when it's not on default device (#22094 ) Summary: we used to not print device when it's on xla. It's sometimes confusing as it looks the same as cpu tensor... Pull Request resolved: https://github.com/pytorch/pytorch/pull/22094 Differential Revision: D15975405 Pulled By: ailzhang fbshipit-source-id: f19ceb9e26f5f2f6e7d659de12716f0dfe065f42	2019-06-25 20:28:50 -07:00
Pieter Noordhuis	6ff0c6ca3f	Remove THD (#22065 ) Summary: It's been ~9 months since moving THD to the `torch.distributed.deprecated` namespace (see https://github.com/pytorch/pytorch/issues/11405) and we haven't seen issues related to it, so it's time to remove it. Closes https://github.com/pytorch/pytorch/issues/18967. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22065 Reviewed By: mrshenli Differential Revision: D15983669 Pulled By: pietern fbshipit-source-id: 2a2f5866f9a63040bc7cef3956d5fd215aba7165	2019-06-25 12:19:13 -07:00
Soumith Chintala	08060e898b	Revert D15435461: [pytorch][PR] PyTorch ThroughputBenchmark Differential Revision: D15435461 Original commit changeset: db08829dc3f4 fbshipit-source-id: 72a0eac1658b2d3f885bc9a21c49fcc23030ae3e	2019-06-23 22:55:05 -07:00
Alexander Sidorov	9b45237618	PyTorch ThroughputBenchmark (#20766 ) Summary: This is useful for measuring inference performance of your models. This is a very basic benchmark for now. We don't support batching on the benchmark side, no inter and intra op parallelism is supported yet, just caller based parallelism. Main philosophy here is that user should be able to provide inputs from python and just stack them within the benchmark. API should be exactly the same as passing inputs to module.forward. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20766 Test Plan: Added a new unit test Differential Revision: D15435461 Pulled By: salexspb fbshipit-source-id: db08829dc3f4398bb1d8aa16cc4a58b6c72f16c6	2019-06-23 13:03:18 -07:00
Jerry Zhang	94f903654c	Add qscheme() method (#20608 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20608 Exposing QScheme in python as Python objects like `torch.qscheme.per_tensor_affine` etc. Reviewed By: zafartahirov Differential Revision: D15364354 fbshipit-source-id: 4d6a96d67e9ead051cf4a8f934553a8c7232fdb7	2019-06-14 16:29:29 -07:00
Syed Tousif Ahmed	ae342fd076	Refactor Random Number Generators in ATen (#21364 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21364 ghimport-source-id: ca7d37e10190ba46dc8512f437404ca9216d3369 Differential Revision: D15696497 Pulled By: ezyang fbshipit-source-id: 2e713b8566ae915e175b5a79ac1dd9b86cc2a23d	2019-06-12 13:01:30 -07:00
davidriazati	f172fadd80	Make warnings be UserWarnings with source file info (#21231 ) Summary: Redo of #15201, this makes `warnings.warn` calls match their Python behavior. Pull Request resolved: https://github.com/pytorch/pytorch/pull/21231 Pulled By: driazati Differential Revision: D15605266 fbshipit-source-id: 5931fd720b0c40d52dd492fbd1f5a76abefaab5c	2019-06-05 11:09:11 -07:00
Jerry Zhang	277bf69fa0	Add torch.load/torch.save for QTensor (#20830 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20830 att Reviewed By: dzhulgakov Differential Revision: D15340701 fbshipit-source-id: 677038c8101f66dec4856c2eccf9f9e394012226	2019-05-30 20:52:19 -07:00
Dmytro Dzhulgakov	c25e33789e	Lightweight at-most-once logging for API usage (#20745 ) Summary: Resubmit #20698 which got messed up. Idea is that when PyTorch is used in a custom build environment (e.g. Facebook), it's useful to track usage of various APIs centrally. This PR introduces a simple very lightweight mechanism to do so - only first invocation of a trigger point would be logged. This is significantly more lightweight than #18235 and thus we can allow to put logging in e.g. TensorImpl. Also adds an initial list of trigger points. Trigger points are added in such a way that no static initialization triggers them, i.e. just linking with libtorch.so will not cause any logging. Further suggestions of what to log are welcomed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20745 Differential Revision: D15429196 Pulled By: dzhulgakov fbshipit-source-id: a5e41a709a65b7ebccc6b95f93854e583cf20aca	2019-05-23 23:17:59 -07:00
Will Feng	8cde4c4d22	Remove Variable::Impl and DifferentiableViewImpl (#17072 ) Summary: As part of the Variable/Tensor merge work: https://github.com/pytorch/pytorch/issues/13638, we make the following changes in this PR: 1. Remove the `Variable::Impl` class and the `DifferentiableViewImpl` class 2. Change all `Variable.data()` call sites to either use `Variable` directly, or use `Variable.tensor_data()` 3. Remove `Variable.data()` API 4. Add `Variable.variable_data()` that matches `tensor.data` in Python API, which creates a new `Variable` that shares the same storage and tensor metadata with the original `Variable`, but with a completely new autograd history. After this PR, Variable doesn't wrap a Tensor internally anymore, and both Variable and Tensor use the same TensorImpl class as its `impl_`. The only difference is that Variable always has AutogradMeta in its TensorImpl, but Tensor doesn't. Note that this PR is BC-breaking in the following use cases: Use Case 1: Previously, `x.data = y` works even if `x` and `y` are of different TensorImpl type (e.g. `x` is a CPU dense tensor whose impl is of type TensorImpl, while `y` is a CPU sparse tensor whose impl is of type SparseTensorImpl). However, after this PR, `x.data = y` doesn't work anymore if `x` and `y` are of different TensorImpl type, because the underlying implementation `variable.set_data(tensor)` no longer works if `variable` and `tensor` have different TensorImpl type. Use Case 2: If a tensor `x`'s `grad` is sparse, accumulating dense gradients to `x` will change the tensor that `x.grad` is pointing to.
This is better illustrated with the following example:
```python
import torch

params = torch.tensor([1.5, 1.5]).requires_grad_()
with torch.no_grad():
    # Change gradient to a sparse tensor
    params.grad = torch.sparse_coo_tensor(torch.tensor([[1, 1]]).long(), torch.tensor([1., 1.]))

grad_saved = params.grad
params.backward(torch.tensor([1.5, 1.5]))
assert id(grad_saved) == id(params.grad)  # This will fail after this PR
```
The assertion in the last line will fail after this PR, because adding dense gradients to sparse gradients will change the `params.grad` tensor reference. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17072 Differential Revision: D14075257 Pulled By: yf225 fbshipit-source-id: 0e681df641270dea586042dd26db59f2e76b5957	2019-05-23 21:09:04 -07:00
Edward Z. Yang	9b1dbffba5	Re-sync with internal repository (#20702 )	2019-05-20 09:22:57 -04:00
Dmytro Dzhulgakov	d3059b9c49	Lightweight logging for once-only API usage	2019-05-19 23:04:40 -07:00
Ilia Cherniavskii	409200df59	Move inter-op settings into ATen/Parallel (#20050 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20050 ghimport-source-id: cc102bab8abf3e56c099245976786317ed63ea14 Differential Revision: D15248576 Pulled By: ilia-cher fbshipit-source-id: 55ddcb7af387ddfc68a42ac7167de07ea648e249	2019-05-17 03:12:02 -07:00
Vitaly Fedyunin	5b78a5eadb	Memory format support for contiguous and is_contiguous (#20455 ) Summary: #19975 was split into 2 PRs. This one: Introduce MemoryFormat argument to the `x.is_contiguous(memory_format=torch.channels_last)` and to the `y = x.contiguous(memory_format=torch.channels_last)` functions. At this moment both functions just operate with strides and don't store any tensor state. (Original RFC #19092) ----- Expands functionality of two tensor functions `.is_contiguous` and `.contiguous` (both python and c++ api). Note: We had several complaints about `.to(memory_format)` function, and decided not to support it. 1. `.contiguous` now supports an optional keyword-only argument - `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`. - Using `torch.contiguous_format` will preserve existing `.contiguous()` behavior. - Calling `x.contiguous(memory_format=torch.channels_last)` returns a new tensor which maintains the same semantic layout (NCHW), but has a different memory allocation pattern. `x.contiguous(memory_format=torch.channels_last)` expects the input tensor to be 3d, 4d or 5d; and fails otherwise. 2. `.is_contiguous` now supports an optional keyword-only argument - `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`. - `x.is_contiguous(memory_format=torch.contiguous_format)` preserves same functionality as `x.is_contiguous()` and remains unchanged. - `x.is_contiguous(memory_format=torch.channels_last)` returns true if A) input tensor is contiguous in memory AND B) allocated in the memory in NHWC (or similar for 3d,5d) format. Note: By the end of the phase one `x.is_contiguous(memory_format=torch.channels_last)` will calculate state of the Tensor on every call. This functionality is going to be updated later. 
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20455 Differential Revision: D15341577 Pulled By: VitalyFedyunin fbshipit-source-id: bbb6b4159a8a49149110ad321109a3742383185d	2019-05-16 07:18:24 -07:00
Ilia Cherniavskii	481b6d0268	Allow a non-OpenMP based build (#19749 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19749 ghimport-source-id: a6636c0acddbdc5fd5b0dcb20b9f80cbdb9159b9 Differential Revision: D15141993 Pulled By: ilia-cher fbshipit-source-id: 96085608398b2a4c97c68b2948f5184d07f9ad3d	2019-05-06 19:34:48 -07:00
Roy Li	689dd800ed	Generate only one Type class per backend (#19295 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19295 ghimport-source-id: 9345110f91f044a449804ddd5116cc9179444a00 Differential Revision: D14948581 Pulled By: li-roy fbshipit-source-id: a317b03d58d621e8df162918038f7543bfb13ba2	2019-04-21 21:16:14 -07:00
Ilia Cherniavskii	646cb6157d	Move OMP/MKL thread initialization into ATen/Parallel (#19011 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19011 ghimport-source-id: 432e31eccfd0e59fa21a790f861e6b2ff4fdbac6 Differential Revision: D14846034 Pulled By: ilia-cher fbshipit-source-id: d9d03c761d34bac80e09ce776e41c20fd3b04389	2019-04-16 00:16:32 -07:00
Edward Yang	29ea08616b	Add torch.__config__.show(), reporting detailed version of all libraries. (#18579 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18579 ghimport-source-id: 65124c95e49423de4ad1008c65e75057fea09b94 Differential Revision: D14778507 Pulled By: ezyang fbshipit-source-id: 1e4bb79f4800a116ce8fb7af2fefbd34da8d102c	2019-04-09 11:13:24 -07:00
Edward Yang	50df3e5e2e	Add ability to query if built with CUDA and MKL-DNN. (#18362 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18362 ghimport-source-id: 374b7ab97e2d6a894368007133201f510539296f Stack from [ghstack](https://github.com/ezyang/ghstack): * #18242 Test running a CUDA build on CPU machine. * #18362 Add ability to query if built with CUDA and MKL-DNN. Fixes #18108. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: D14584430 fbshipit-source-id: 7605a1ac4e8f2a7c70d52e5a43ad7f03f0457473	2019-03-25 10:39:09 -07:00
Iurii Zdebskyi	444039c47b	Bool tensor. Part 0: Boolean storage implementation (#16810 ) Summary: This is the first commit from a series of planned changes in order to add boolean tensors to PyTorch. The whole plan looks like this: 0. Storage Implementation (this change) 1. Tensor Creation. 2. Tensor Conversions. 3. Tensor Indexing. 4. Tensor Operations. 5. Back compatibility related changes. This feature was requested by the community: https://github.com/pytorch/pytorch/issues/4764 https://github.com/pytorch/pytorch/issues/4219 https://github.com/pytorch/pytorch/issues/4288 Change: Added boolean type to the Storage class for CPU and CUDA backends. Tested via: 1. unit tests 2. running this: -> import torch -> torch.BoolStorage <class 'torch.BoolStorage'> -> torch.cuda.BoolStorage <class 'torch.cuda.BoolStorage'> Pull Request resolved: https://github.com/pytorch/pytorch/pull/16810 Reviewed By: gchanan Differential Revision: D14087246 Pulled By: izdeby fbshipit-source-id: 042642ced1cb0fd1bb6bff05f9ca871a5c54ee5e	2019-02-19 08:22:13 -08:00
SsnL	13422fca32	Add torch.backends.openmp.is_available(); fix some cmake messages (#16425 ) Summary: 1. add `torch.backends.openmp.is_available()` 2. Improve various `cmake` outputs 3. Fix LDFLAGS not respected by `caffe2_pybind11_state_*` targets 4. Fix `MKL` warning message, and QUIET flag. 5. Fix various typos Pull Request resolved: https://github.com/pytorch/pytorch/pull/16425 Differential Revision: D13903395 Pulled By: soumith fbshipit-source-id: d15c5d46f53e1ff1c27fca2887b9d23d0bd85b4d	2019-01-31 16:15:46 -08:00
Shen Li	24f4d3987e	Move all Stream and Event Python implementation to C++ (#15937 ) Summary: 1. Added `torch/csrc/cuda/Event.h` and `torch/csrc/cuda/Event.cpp` to bind Python Event class to C++ implementation. 2. Move all CUDA runtime invocations from `torch/cuda/streams.py` to C++ 3. Added tests to cover Stream and Event APIs. ~(event IPC handle tests is introduced in #15974)~ Pull Request resolved: https://github.com/pytorch/pytorch/pull/15937 Differential Revision: D13649001 Pulled By: mrshenli fbshipit-source-id: 84ca58f35f6ba679a4ba33150ceba678d760d240	2019-01-17 07:29:22 -08:00
Peter Goldsborough	0bf1383f0a	Python <-> C++ Frontend inter-op (#13481 ) Summary: This PR enables C++ frontend modules to be bound into Python and added as submodules of Python modules. For this, I added lots of pybind11 bindings for the `torch::nn::Module` class, and modified the `torch.nn.Module` class in Python to have a new Metaclass that makes `isinstance(m, torch.nn.Module)` return true when `m` is a C++ frontend module. The methods and fields of C++ modules are bound in such a way that they work seamlessly as submodules of Python modules for most operations (one exception I know of: calling `.to()` ends up calling `.apply()` on each submodule with a Python lambda, which cannot be used in C++ -- this may require small changes on Python side). I've added quite a bunch of tests to verify the bindings and equality with Python. I think I should also try out adding a C++ module as part of some large PyTorch module, like a WLM or something, and see if everything works smoothly. The next step for inter-op across our system is ScriptModule <-> C++ Frontend Module inter-op. I think this will then also allow using C++ frontend modules from TorchScript. apaszke zdevito CC dzhulgakov Pull Request resolved: https://github.com/pytorch/pytorch/pull/13481 Differential Revision: D12981996 Pulled By: goldsborough fbshipit-source-id: 147370d3596ebb0e94c82cec92993a148fee50a7	2018-12-13 08:04:02 -08:00
Edward Yang	517c7c9861	Canonicalize all includes in PyTorch. (#14849 ) Summary: Anywhere we used #include "foo.h", we now say #include <foo.h> Paths are adjusted to be rooted out of aten/src, torch/lib, or the root level directory. I modified CMakeLists.txt by hand to remove TH and THC from the include paths. I used the following script to do the canonicalization:
```
import subprocess
import re
import os.path

files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
for fn in files:
    if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']):
        continue
    if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
        continue
    with open(fn, 'r') as f:
        c = f.read()
    def fmt(p):
        return "#include <{}>".format(p)
    def repl(m):
        p = m.group(1)
        if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
            return fmt(p)
        if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
            return fmt(p)
        for root in ["aten/src", "torch/lib", ""]:
            for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
                new_p = os.path.relpath(os.path.join(bad_root, p), root)
                if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
                    return fmt(new_p)
        print("ERROR: ", fn, p)
        return m.group(0)
    new_c = re.sub(r'#include "([^"]+)"', repl, c)
    if new_c != c:
        print(fn)
        with open(fn, 'w') as f:
            f.write(new_c)
```
Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849 Reviewed By: dzhulgakov Differential Revision: D13363445 Pulled By: ezyang fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68	2018-12-08 19:38:30 -08:00
Peter Goldsborough	d6c53328f9	Large scale fix of python-related files in torch/csrc/ Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14515 Differential Revision: D13247966 Pulled By: goldsborough fbshipit-source-id: 7a127c508fc576a7a92626dd6b729f660162d628	2018-12-07 13:04:46 -08:00
Pieter Noordhuis	220ce8046e	Binding for prctl(PR_SET_PDEATHSIG) (#14491 ) Summary: If torch.multiprocessing.spawn is used to launch non-daemonic processes (the default since #14391), the spawned children won't be automatically terminated when the parent terminates. On Linux, we can address this by setting PR_SET_PDEATHSIG, which delivers a configurable signal to child processes when their parent terminates. Fixes #14394. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14491 Differential Revision: D13270374 Pulled By: pietern fbshipit-source-id: 092c9d3c3cea2622c3766b467957bc27a1bd500c	2018-11-29 20:09:19 -08:00
albanD	f80d34a1c8	Update Tensor doc (#14339 ) Summary: Add to the Tensor doc info about `.device`, `.is_cuda`, `.requires_grad`, `.is_leaf` and `.grad`. Update the `register_backward_hook` doc with a warning stating that it does not work in all cases. Add support in the `_add_docstr` function to add docstring to attributes. There is an explicit cast here but I am not sure how to handle it properly. The thing is that the doc field for getsetdescr is written as being a const char * (as all other doc fields in descriptors objects) in cpython online documentation. But in the code, it is the only one that is not const. I assumed here that it is a bug in the code because it does not follow the doc and the convention of the others descriptors and so I cast out the const. EDIT: the online doc I was looking at is for 3.7 and in that version both the code and the doc are const. For older versions, both are non const. Please let me know if this should not be done. And if it should be done if there is a cleaner way to do it ! Pull Request resolved: https://github.com/pytorch/pytorch/pull/14339 Differential Revision: D13243266 Pulled By: ezyang fbshipit-source-id: 75b7838f7cd6c8dc72b0c61950e7a971baefaeeb	2018-11-28 15:28:17 -08:00
Anders Papitto	2983998bb3	add torch-python target (#12742 ) Summary: This is the next minimal step towards moving _C into cmake. For now, leave _C in setup.py, but reduce it to an empty stub file. All of its sources are now part of the new torch-python cmake target. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12742 Reviewed By: soumith Differential Revision: D13089691 Pulled By: anderspapitto fbshipit-source-id: 1c746fda33cfebb26e02a7f0781fefa8b0d86385	2018-11-16 11:43:48 -08:00
Benoit Steiner	bbe6ef3864	torch.finfo and torch.iinfo to mimic the numpy equivalent (#12472 ) Summary: This pull request intends to provide the functionality requested in https://github.com/pytorch/pytorch/issues/10742 by adding a new torch.finfo and torch.iinfo API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12472 Differential Revision: D10250829 Pulled By: benoitsteiner fbshipit-source-id: eb22ca55d5b0064bef381fa7f1eb75989977df30	2018-10-15 13:43:52 -07:00
Yangqing Jia	713e706618	Move exception to C10 (#12354 ) Summary: There is still some work to be done: - Move logging and unify AT_WARN with LOG(ERROR). - A few header files are still being plumbed through, need cleaning. - caffe2::EnforceNotMet aliasing is not done yet. - need to unify the macros. See c10/util/Exception.h This is mainly a codemod and not causing functional changes. If you find your job failing and trace back to this diff, usually it can be fixed by the following approaches: (1) add //caffe2/c10:c10 to your dependency (or transitive dependency). (2) change objects such as at::Error, at::Optional to the c10 namespace. (3) change functions to the c10 namespace. Especially, caffe2::MakeString is not overridden by the unified c10::str function. Nothing else changes. Please kindly consider not reverting this diff - it involves multiple rounds of rebasing and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354 Reviewed By: orionr Differential Revision: D10238910 Pulled By: Yangqing fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32	2018-10-15 13:33:18 -07:00
Adam Paszke	8c3a94eaf2	Improve autograd profiler performance (#11773 ) Summary: To illustrate the benefits of this commit, I'll use the time/iter I got from one of the JIT benchmarks on my machine.

| Run | Time |
|----------------------------------------------|-------------------------|
| No profiler | 45ms |
| With profiler | 56ms |
| Use `clock_gettime` instead of `std::chrono` | 48ms |
| Touch all pages on block allocation | 48ms (less jitter) |
| Use `const char*` instead of `std::string` | 47ms (even less jitter) |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/11773 Differential Revision: D9886858 Pulled By: apaszke fbshipit-source-id: 58f926f09e95df0b11ec687763a72b06b66991d0	2018-09-19 09:25:43 -07:00
Teng Li	020501b7b0	Getting rid of USE_C10D for build (#11237 ) Summary: Will use USE_DISTRIBUTED for both c10d and THD Pull Request resolved: https://github.com/pytorch/pytorch/pull/11237 Differential Revision: D9647825 Pulled By: teng-li fbshipit-source-id: 06e0ec9b5e2f8f38780fc88718f8499463e9e969	2018-09-04 17:27:53 -07:00
Pieter Noordhuis	033499cf56	Remove mention of USE_DISTRIBUTED_MW (#11240 ) Summary: This was lingering after #10731. cc orionr Pull Request resolved: https://github.com/pytorch/pytorch/pull/11240 Differential Revision: D9645437 Pulled By: pietern fbshipit-source-id: d02c33354b094be3bb0872cf54a45721e20c4e7d	2018-09-04 16:10:20 -07:00
Peter Goldsborough	7ddc6f84c4	NULL -> nullptr (#11047 ) Summary: How did we get so many uses of `NULL` again? ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/11047 Differential Revision: D9566799 Pulled By: goldsborough fbshipit-source-id: 83469f352ac69aa65bdaf1a1a21f922d892e0db3	2018-08-30 16:25:42 -07:00
Tongzhou Wang	23af7deea7	Add has_lapack flag (#11024 ) Summary: Currently our `skipIfLapack` uses a try-catch block and regex-matches the error message. It is highly unreliable. This PR adds `hasLAPACK` and `hasMAGMA` on ATen context, and exposes the flags to python. Also fixes refcounting bug with `PyModule_AddObject`. The method steals reference, but we didn't `Py_INCREF` in some places before calling it with `Py_True` or `Py_False`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11024 Differential Revision: D9564898 Pulled By: SsnL fbshipit-source-id: f46862ec3558d7e0058ef48991cd9c720cb317e2	2018-08-29 22:41:16 -07:00
Richard Zou	ad6d62250a	Add torch.compiled_with_cxx11_abi(). (#10071 ) Summary: It returns whether PyTorch was built with _GLIBCXX_USE_CXX11_ABI=1. Fixes #8385 Pull Request resolved: https://github.com/pytorch/pytorch/pull/10071 Differential Revision: D9088946 Pulled By: zou3519 fbshipit-source-id: b00fd92ee340ef34f60bdd6027ceaf46dd7442c0	2018-08-01 15:34:48 -07:00
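
Note on 7818e7e5d4 (#24875): the commit message describes autograd_context_id as a 64 bit integer with 16 bits of worker_id followed by 48 auto-incrementing bits. The sketch below shows one way such an id could be packed and unpacked. It assumes "first 16 bits" means the high-order bits; the helper names `make_context_id` and `split_context_id` are hypothetical and are not part of PyTorch.

```python
# Hypothetical illustration of the id layout described in #24875.
# Assumption: the worker_id occupies the 16 high-order bits and the
# auto-incrementing counter occupies the 48 low-order bits.
WORKER_ID_BITS = 16
COUNTER_BITS = 48
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_context_id(worker_id: int, counter: int) -> int:
    """Pack a worker id and a per-worker counter into one 64 bit integer."""
    if not 0 <= worker_id < (1 << WORKER_ID_BITS):
        raise ValueError("worker_id must fit in 16 bits")
    if not 0 <= counter <= COUNTER_MASK:
        raise ValueError("counter must fit in 48 bits")
    return (worker_id << COUNTER_BITS) | counter


def split_context_id(context_id: int) -> tuple:
    """Recover (worker_id, counter) from a packed context id."""
    return context_id >> COUNTER_BITS, context_id & COUNTER_MASK
```

Under this layout, ids produced by different workers cannot collide as long as each worker keeps its own counter within 48 bits.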

227 Commits
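
Note on 5b78a5eadb (#20455): the commit message says both memory-format functions "just operate with strides". The sketch below computes, in plain Python, the strides a 4d tensor of logical shape (N, C, H, W) would have in the default contiguous layout and in a channels-last layout, assuming the usual NHWC ordering in memory. The function names are hypothetical helpers for illustration, not PyTorch APIs.

```python
# Hypothetical stride arithmetic for a 4d tensor with logical shape (N, C, H, W).
# Assumption: channels-last means elements are ordered N, H, W, C in memory.


def contiguous_strides(shape):
    """Row-major strides: the last dimension varies fastest."""
    strides = []
    step = 1
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))


def channels_last_strides(shape):
    """Strides for the same logical (N, C, H, W) shape stored as NHWC."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)


def offset(index, strides):
    """Flat memory offset of a logical (n, c, h, w) index."""
    return sum(i * s for i, s in zip(index, strides))
```

With these strides, stepping along the channel dimension moves by a single element in the channels-last layout, while the logical indexing order (N, C, H, W) is unchanged.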