pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Pritam Damania	2b221a9599	Remove PyCFunction casts as much as possible. (#46227 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46227 Follow up from https://github.com/pytorch/pytorch/issues/45419, in this PR I've removed as many PyCFunction casts as I could from the codebase. The only ones I didn't remove were the ones with `METH_VARARGS | METH_KEYWORDS` which have 3 parameters instead of 2 and had to be casted. Example: ` {"copy_", (PyCFunction)(void(*)(void))THPStorage_(copy_), METH_VARARGS | METH_KEYWORDS, nullptr},` ghstack-source-id: 114632704 Test Plan: waitforbuildbot Reviewed By: albanD Differential Revision: D24269435 fbshipit-source-id: 025cfd43a9a2a3e59f6b2951c1a78749193d77cf	2020-10-20 15:01:51 -07:00
Gregory Chanan	2070834b9e	Improve error checking of Storage._writeFile. (#46036 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46036 Previously, this function didn't do error-bounds checking on the GetItem (GET_ITEM) calls, which led to issues like https://github.com/pytorch/pytorch/issues/46020. A better solution would be to use pybind, but given writing the file is going to dominate bounds checking, this is strictly better. Test Plan: Imported from OSS Reviewed By: mruberry Differential Revision: D24228370 Pulled By: gchanan fbshipit-source-id: f5d0a3d21ff12b4380beefe1e9954fa81ea2f567	2020-10-12 11:10:04 -07:00
Nikita Shulga	4066022146	Do not use `PRId64` in torch/csrc (#44767 ) Summary: Instead use `fmt::format()` or `%lld` and cast argument to `(long long)` Fix typos and add helper `PyErr_SetString()` method in torch/csrc/Exceptions.h Pull Request resolved: https://github.com/pytorch/pytorch/pull/44767 Reviewed By: ezyang Differential Revision: D23723671 Pulled By: malfet fbshipit-source-id: c0101aed222184aa436b1e8768480d1531dff232	2020-09-17 14:00:02 -07:00
Kurt Mohler	f9eb8824f1	Remove datatype from Storage and StorageImpl (#38870 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38870 * Removed dtype data member from StorageImpl * Removed any methods or method arguments in Storage/StorageImpl that deal with dtypes * Update all callers of the changed API Part of issue https://github.com/pytorch/pytorch/issues/33950 Original PR: https://github.com/pytorch/pytorch/pull/38038 Reviewed By: albanD Differential Revision: D21549645 Pulled By: ezyang fbshipit-source-id: 4289b356c55ff6b9530376a79343b99b540ee3de	2020-05-21 15:26:08 -07:00
Edward Yang	fe88806784	Back out "Revert D21171334: [pytorch][PR] Change StorageImpl to track byte count rather than element count" (#37893 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37893 Original commit changeset: 50746043acf3 Test Plan: sandcastle and ossci Reviewed By: malfet, seemethere, ngimel Differential Revision: D21416509 fbshipit-source-id: 735ec4e61f9d36d4537f52dd2dc6267751aeb94b	2020-05-05 22:43:15 -07:00
Edward Yang	a2fc7f787a	Revert D21171334: [pytorch][PR] Change StorageImpl to track byte count rather than element count Test Plan: revert-hammer Differential Revision: D21171334 Original commit changeset: 37329a379de9 fbshipit-source-id: 50746043acf3c76754688de0fe6f1cc12437ea2f	2020-05-05 16:36:15 -07:00
Kurt Mohler	3706803b60	Change StorageImpl to track byte count rather than element count (#37776 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37776 * Remove type-specific size tracking in favor of byte size tracking in Storage and StorageImpl * Changed numel() and set_numel() to nbytes() and set_nbytes() * Added enum argument to Storage/StorageImpl constructor to indicate new meaning of the size parameter * Update all callers of the changed API Part of issue https://github.com/pytorch/pytorch/issues/33950 Pull Request resolved: https://github.com/pytorch/pytorch/pull/37028 Differential Revision: D21171334 Pulled By: ezyang fbshipit-source-id: 37329a379de9a3a83cc5e9007e455a3e1c2d10b8	2020-05-05 14:20:51 -07:00
anjali411	1f09f7ea44	Python API for Complex Storage and storage copy logic (#35771 ) Summary: Following up on this: https://github.com/pytorch/pytorch/pull/35851 cross dtype storage copy is not being used internally, so I have not included cross dtype copy for complex. Pull Request resolved: https://github.com/pytorch/pytorch/pull/35771 Differential Revision: D21319650 Pulled By: anjali411 fbshipit-source-id: 07c72996ee598eba0cf401ad61534494d6f5b5b3	2020-05-01 11:47:22 -07:00
Gregory Chanan	287f3b746e	Remove Backend -> THPLayout mapping. (#37527 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37527 This is yet another place that needs to be updated for adding a new "Backend" and is unnecessary. Instead, just use layout_from_backend and have a map from Layout -> THPLayout. Other changes: - rename torch::getDtype and torch::getLayout to torch::getTHPDtype and torch::getTHPLayout since e.g. for layout you are both passing in and returning a "layout" type. - add NumOptions to Layout to match the dtype/ScalarType formulation. Test Plan: Imported from OSS Differential Revision: D21309836 Pulled By: gchanan fbshipit-source-id: ede0e4f3bf7ff2cd04a9b17df020f0d4fd654ba3	2020-04-30 11:11:09 -07:00
davidriazati	74ce3a032c	Fix some bugs with zipfile serialization (#32244 ) Summary: Stacked PRs * #32958 - Make zip serialization the default * #32244 - Fix some bugs with zipfile serialization It includes the following changes: * Split up tests so that we can test both serialization methods * Loading something within a buffer doesn't work anymore, so those tests are only on the old serialization method (it's possible but introduces a big slowdown since it requires a linear scan of the entire zipfile to find the magic number at the end) * Call `readinto` on a buffer if possible instead of `read` + a copy * Disable CRC-32 checks on read (there was some issue where miniz said the CRC was wrong but `zipinfo` and `unzip` said the zip file was fine) Pull Request resolved: https://github.com/pytorch/pytorch/pull/32244 Pulled By: driazati Reviewed By: eellison Differential Revision: D19418935 fbshipit-source-id: df140854f52ecd04236225417d625374fd99f573	2020-02-05 15:32:14 -08:00
Brian Wignall	e7fe64f6a6	Fix typos (#30606 ) Summary: Should be non-semantic. Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30606 Differential Revision: D18763028 Pulled By: mrshenli fbshipit-source-id: 896515a2156d062653408852e6c04b429fc5955c	2019-12-02 20:17:42 -08:00
Davide Libenzi	cc6af45944	Fix writeable strings warnings. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29512 Differential Revision: D18417715 Pulled By: mruberry fbshipit-source-id: 7029f0a73bcdf0ce8594b90b6f5af8be4e8b5e02	2019-11-09 21:16:35 -08:00
vishwakftw	86c64440c9	Make PyTorch Python 3.8 compatible (#29302 ) Summary: PEP 590 modifies the `tp_print` offset to `tp_vectorcall_offset` - which requires a Py_ssize_t object. Passing a nullptr caused compatibility issues for Python 3.8. Changelog: - Modify all occurrences of `nullptr /* tp_print */` to `0 /* tp_vectorcall_offset */` - Minor formatting changes Pull Request resolved: https://github.com/pytorch/pytorch/pull/29302 Test Plan: - Local fresh build with Python 3.8 completed successfully. Fixes https://github.com/pytorch/pytorch/issues/28060. Fixes https://github.com/pytorch/pytorch/issues/29162. Supersedes https://github.com/pytorch/pytorch/pull/28364 Differential Revision: D18372022 Pulled By: ezyang fbshipit-source-id: 8e9a15b0d0f72101ccc69bd489f5efa216b880bb	2019-11-07 09:20:19 -08:00
Edward Yang	a5d356cb39	Delete THP_CORE macro; partially replace with THP_BUILD_MAIN_LIB (#29143 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29143 THP_CORE macro is a very old macro that appeared to have served two purposes: 1. The torch-python equivalent of CAFFE2_BUILD_MAIN_LIB, to toggle symbol visibility headers 2. Some sort of ad hoc way of hiding certain definitions from headers so external clients can't get at them. It did (2) in a very confusing manner, because we set THP_CORE in both torch and torch-python (it shouldn't do anything in torch). In this PR I just get rid of use case (2) entirely (so everything shows up in headers all the time), and then redo (1) using a new THP_BUILD_MAIN_LIB macro. This cleans up some of the macro definitions and makes my life easier for working on #27215. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Differential Revision: D18309594 Pulled By: ezyang fbshipit-source-id: adcb6d7cb387cd818480137e2b94e5e761dbfefc	2019-11-06 15:02:02 -08:00
Jerry Zhang	23193c155f	Quantized Tensor support copy (#28612 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28612 att Test Plan: python test/test_quantized_tensor.py Imported from OSS Differential Revision: D18255247 fbshipit-source-id: 814b12640fdf9d79b27482ee642ce430dbaeea68	2019-11-01 17:40:17 -07:00
Pritam Damania	fe4170bda8	Add send and recv backward functions for builtin operators RPC. (#25527 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25527 Master GH issue: https://github.com/pytorch/pytorch/issues/23110. This change builds upon https://github.com/pytorch/pytorch/pull/24876 and provides all the autograd hooks needed for a forward pass with distributed rpc for builtin operators. This change does not address distributed rpc for python UDFs and that will be addressed in follow up PRs. Summary of changes: 1. Attach send autograd functions when a request is sent from the client and response is sent from the server. 2. Attach receive autograd functions when a request is received on the server and a response is received on the client. 3. Generate a globally unique autograd_message_id for each send/recv autograd function pair to uniquely identify them. ghstack-source-id: 91240466 Test Plan: unit tests. Differential Revision: D17148077 fbshipit-source-id: 192d8a3f552ed7cc939f55dcca332965c9bd3233	2019-10-03 01:18:46 -07:00
peter	ec07d144ba	Fixed seek offset size to 64bit. (#27125 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/26998. Pull Request resolved: https://github.com/pytorch/pytorch/pull/27125 Differential Revision: D17687154 Pulled By: ezyang fbshipit-source-id: 6784f4fd799130ac72a25884f120a0ba96bd4f51	2019-10-01 08:50:32 -07:00
Edward Yang	b16358b251	Revert D17666050: [pytorch][PR] Fixed seek offset size to 64bit. Test Plan: revert-hammer Differential Revision: D17666050 Original commit changeset: f02ebd5320ae fbshipit-source-id: 6bc8fe583e350e2b573f767af85d1287dd048d1f	2019-09-30 11:07:35 -07:00
Yoshiaki Nakamura	1afe3fc01e	Fixed seek offset size to 64bit. (#27047 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/26998 Pull Request resolved: https://github.com/pytorch/pytorch/pull/27047 Differential Revision: D17666050 Pulled By: ezyang fbshipit-source-id: f02ebd5320ae25f8949be20d0744fe3cd3e2fee9	2019-09-30 07:52:15 -07:00
Bulent Abali	afa5d0823b	Fixes big endian arch bugs. (#26383 ) Summary: Serialization.cpp fails on big endian machines. This patch fixes the endian bugs and also makes the pytorch model files portable across different endian architectures. x86 generated model file can be read on s390 arch. First problem, is serialization.cpp forgets to convert "size" value of the storage elements to the native byte order. torch.load throws an assertion as a result (see the first stack trace below). Second problem is when it reads the model from storage (doRead) it decodes values to little endian which is the wrong order on a big endian machine. The decode should be to THP_nativeByteOrder() instead (see the model dump below) ```loaded_model = torch.load( opt.model_file, map_location=torch.device("cpu")) File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 422, in load return _load(f, map_location, pickle_module, **pickle_load_args) File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 616, in _load deserialized_objects[key]._set_from_file(f, offset, f_should_read_directly) RuntimeError: storage has wrong size: expected 2305843009213693952 got 32 (the very long number is actually 32 in the wrong endianness) ``` Model file load on x86 (correct output) ```>>> import torch >>> torch.load('400f2k_best.model', map_location=torch.device("cpu")) {'epoch': 24, 'model_type': 'emb_aec', 'classifier_model': OrderedDict([('model.0.weight', tensor([[ 2.4608e-01, -1.1174e-01, -1.0854e-01, 4.0124e-01, -1.5261e-02, -1.2206e-01, 1.3229e-01, -1.2615e-01, -5.2773e-01, 2.6333e-01, -3.1462e-03, -1.4902e-01, 9.8545e-02, -1.5789e-01, -2.2625e-01, -1.0776e-01, -9.0895e-02, -3.8530e-01, 9.1152e-01, -3.9720e-01, -8.5848e-01, -4.7837e-02, -1.5178e-01, 8.5023e-02, 1.5013e-01, -9.9294e-02, -2.7422e-01, -4.3986e-01, -4.4297e-01, -3.9570e-01, ``` Model file load on s390x (wrong endianness; notice the exponents) ```>>> import torch >>> torch.load( "400f2k_best.model", map_location=torch.device("cpu")) {'epoch': 24, 'model_type': 'emb_aec', 'classifier_model': OrderedDict([('model.0.weight', tensor([[ 9.2780e+21, -9.7722e-11, 4.1350e+33, 7.782e+34, 4.2056e-31, 9.0784e+18, 1.1846e-32, 3.3320e-32, -4.8288e-28, -7.2679e+12, 1.5379e-16, -5.2604e+12, -4.7240e+17, 4.6092e-21, -1.8360e-20, -2.7712e-31, 1.4548e-16, -2.5089e-27, 7.9094e-10, 7.1977e+34, 1.1930e+26, 8.4536e+15, 2.7757e+23, -5.8455e-10, -1.5611e+09, -1.1311e-23, 6.6451e+19, -2.0970e+20, 3.4878e-19, -1.0857e-12, 7.8098e+22, 5.3998e-35], ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/26383 Differential Revision: D17480891 fbshipit-source-id: f40569c7b9c4a1935dceb41f1a2508ce21ea3491	2019-09-19 19:58:02 -07:00
Ralf Gommers	1b4951d3a5	Fix remaining invalid function cast warnings that show up with GCC 8/9 (#26104 ) Summary: Follow-up to gh-25483, more of the same fixes for warnings like: ``` ../torch/csrc/autograd/python_variable.cpp:503:31: warning: cast between incompatible function types from ‘PyObject* (*)(THPVariable*)’ {aka ‘_object* (*)(THPVariable*)’} to ‘getter’ {aka ‘_object* (*)(_object*, void*)’} [-Wcast-function-type] 503 | {"_backward_hooks", (getter)THPVariable_get_backwards_hooks, (setter)THPVariable_set_backwards_hooks, nullptr, nullptr}, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` This takes the build log output for a full rebuild with GCC 9.1 from ~10,000 to ~7,000 lines. `clang-tidy` is going to complain, no way around that - see discussion at the end of gh-25483. Pull Request resolved: https://github.com/pytorch/pytorch/pull/26104 Differential Revision: D17396831 Pulled By: ezyang fbshipit-source-id: d71696bfe4dbe25519e4bcb7753151c118bd39f7	2019-09-17 07:43:37 -07:00
Ralf Gommers	4299faa10b	Fix invalid function cast warnings that show up with GCC 8/9 (#25483 ) Summary: Fixes ~5000 lines of warnings like: ``` In file included from ../aten/src/TH/TH.h:4, from ../torch/csrc/Storage.cpp:11: ../torch/csrc/Storage.h:6:39: warning: cast between incompatible function types from ‘PyObject* (*)(THPStorage*)’ {aka ‘_object* (*)(THPStorage*)’} to ‘getter’ {aka ‘_object* (*)(_object*, void*)’} [-Wcast-function-type] 6 | #define THPStorage_(NAME) TH_CONCAT_4(THP,Real,Storage_,NAME) | ^~~ caffe2/aten/src/TH/THGeneral.h:154:37: note: in definition of macro ‘TH_CONCAT_4_EXPAND’ 154 | #define TH_CONCAT_4_EXPAND(x,y,z,w) x ## y ## z ## w | ^ ../torch/csrc/Storage.h:6:27: note: in expansion of macro ‘TH_CONCAT_4’ 6 | #define THPStorage_(NAME) TH_CONCAT_4(THP,Real,Storage_,NAME) | ^~~~~~~~~~~ ../torch/csrc/generic/Storage.cpp:299:22: note: in expansion of macro ‘THPStorage_’ 299 | {"device", (getter)THPStorage_(device), nullptr, nullptr, nullptr}, | ^~~~~~~~~~~ ../torch/csrc/Storage.h:6:39: warning: cast between incompatible function types from ‘PyObject* (*)(THPStorage*)’ {aka ‘_object* (*)(THPStorage*)’} to ‘getter’ {aka ‘_object* (*)(_object*, void*)’} [-Wcast-function-type] 6 | #define THPStorage_(NAME) TH_CONCAT_4(THP,Real,Storage_,NAME) | ^~~ caffe2/aten/src/TH/THGeneral.h:154:37: note: in definition of macro ‘TH_CONCAT_4_EXPAND’ 154 | #define TH_CONCAT_4_EXPAND(x,y,z,w) x ## y ## z ## w | ^ ../torch/csrc/Storage.h:6:27: note: in expansion of macro ‘TH_CONCAT_4’ 6 | #define THPStorage_(NAME) TH_CONCAT_4(THP,Real,Storage_,NAME) | ^~~~~~~~~~~ ``` This issue and the fix is very similar to how CPython fixed it, see https://bugs.python.org/issue33012. There's still more of these warnings left, but this fixes the majority of them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/25483 Differential Revision: D17149824 Pulled By: ezyang fbshipit-source-id: 353560a4f76070fa7482608e9532b60205d16798	2019-09-09 06:35:11 -07:00
Tongzhou Wang	af638ad5d7	pin_memory should not copy on already pinned tensors (#23484 ) Summary: fixes https://github.com/pytorch/pytorch/issues/21076 Pull Request resolved: https://github.com/pytorch/pytorch/pull/23484 Differential Revision: D16546264 Pulled By: ezyang fbshipit-source-id: 8058e0bbc6336751f36b884d71234feef498a982	2019-07-30 21:16:23 -07:00
Iurii Zdebskyi	3a8d7463bd	Enabled BFloat16 storage (#21523 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21523 ghimport-source-id: 698b3cbd6b21c09b9ff8bf8011980df8e35c33b0 Test Plan: Imported from OSS Differential Revision: D15819368 Pulled By: izdeby fbshipit-source-id: f6b3bba7b3ca8ee677bd80a231dbb3920c07d61c	2019-07-09 21:51:06 -07:00
Pieter Noordhuis	6ff0c6ca3f	Remove THD (#22065 ) Summary: It's been ~9 months since moving THD to the `torch.distributed.deprecated` namespace (see https://github.com/pytorch/pytorch/issues/11405) and we haven't seen issues related to it, so it's time to remove it. Closes https://github.com/pytorch/pytorch/issues/18967. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22065 Reviewed By: mrshenli Differential Revision: D15983669 Pulled By: pietern fbshipit-source-id: 2a2f5866f9a63040bc7cef3956d5fd215aba7165	2019-06-25 12:19:13 -07:00
Vitaly Fedyunin	b46e87cd3d	Fix catch block to fix 'error: catching polymorphic type' (#21637 ) Summary: Fix catch block to fix 'error: catching polymorphic type `class c10::Error` by value [-Werror=catch-value=]' Pull Request resolved: https://github.com/pytorch/pytorch/pull/21637 Differential Revision: D15761860 Pulled By: VitalyFedyunin fbshipit-source-id: befc18a9c217440381cdb50a1319b0b5db5710e9	2019-06-11 12:30:52 -07:00
Yoshiaki Nakamura	52596164d4	Fix 32-bit env. model load issue (#20900 ) Summary: Fixed an issue where models can not be loaded in a 32-bit environment like Raspbian. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20900 Differential Revision: D15696709 Pulled By: ezyang fbshipit-source-id: 37a81f05f235d3b9fc6244e12d3320ced3d1465e	2019-06-06 10:30:29 -07:00
Jerry Zhang	277bf69fa0	Add torch.load/torch.save for QTensor (#20830 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20830 att Reviewed By: dzhulgakov Differential Revision: D15340701 fbshipit-source-id: 677038c8101f66dec4856c2eccf9f9e394012226	2019-05-30 20:52:19 -07:00
Jerry Zhang	56fb5e03b5	refactor registerStoragePyTypeObject (#20467 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20467 for upcoming changes in Storage for QInt8 Reviewed By: ezyang Differential Revision: D15330865 fbshipit-source-id: 2840e59c0bf088983f792fd724de41b3bb3dec55	2019-05-14 18:22:33 -07:00
Philipp Lang	f23fb66e6e	Fix in file position logic: file descriptor and Python-side handle (#20270 ) Summary: This addresses #18436 The logic replicates the essence of closing file descriptors in numpy: `bf20e30340/numpy/core/include/numpy/npy_3kcompat.h (L278)` This stores the position of the file descriptor before resetting it to the Python handle offset, then resets to the original position before exit. The Python-side handle is then updated to reflect the new position. Also added somewhat more demanding tests to cover this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20270 Differential Revision: D15275902 Pulled By: soumith fbshipit-source-id: 5ca8a52b61c7718d2e69571f72f80b1350b0acdb	2019-05-09 08:20:01 -07:00
Gregory Chanan	2113ea6fbf	Add device and dtype to storage. (#18749 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18749 ghimport-source-id: 9026a037f5e11cdb9ccd386f4b6b5768b9c3259b Stack from [ghstack](https://github.com/ezyang/ghstack): * #18751 Disallow changing the device of a tensor via set_. * #18750 Use non-legacy constructors for tensor deserialization. * #18749 Add device and dtype to storage. The goal here is to fix our serialization, which currently depends on the legacy constructors. Having dtype and device on Storage allows us to use the non-legacy constructors. This fits somewhat along our goal of removing Storage, my having Storage act like a Tensor. Differential Revision: D14729516 fbshipit-source-id: bf4a3e8669ad4859931f4a3fa56df605cbc08dcb	2019-04-03 07:59:02 -07:00
Vitaly Fedyunin	5653a914f7	Implement reference counting for shared IPC CUDA tensors (#16854 ) Summary: This is to fix #16141 and similar issues. The idea is to track a reference to every shared CUDA Storage and deallocate memory only after a consumer process deallocates received Storage. ezyang Done with cleanup. Same (insignificantly better) performance as in file-per-share solution, but handles millions of shared tensors easily. Note [ ] documentation in progress. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16854 Differential Revision: D13994490 Pulled By: VitalyFedyunin fbshipit-source-id: 565148ec3ac4fafb32d37fde0486b325bed6fbd1	2019-03-25 10:24:38 -07:00
Iurii Zdebskyi	444039c47b	Bool tensor. Part 0: Boolean storage implementation (#16810 ) Summary: This is the first commit from a series of planned changes in order to add boolean tensors to PyTorch. The whole plan looks like this: 0. Storage Implementation (this change) 1. Tensor Creation. 2. Tensor Conversions. 3. Tensor Indexing. 4. Tensor Operations. 5. Back compatibility related changes. This feature was requested by the community: https://github.com/pytorch/pytorch/issues/4764 https://github.com/pytorch/pytorch/issues/4219 https://github.com/pytorch/pytorch/issues/4288 Change: Added boolean type to the Storage class for CPU and CUDA backends. Tested via: 1. unit tests 2. running this: -> import torch -> torch.BoolStorage <class 'torch.BoolStorage'> -> torch.cuda.BoolStorage <class 'torch.cuda.BoolStorage'> Pull Request resolved: https://github.com/pytorch/pytorch/pull/16810 Reviewed By: gchanan Differential Revision: D14087246 Pulled By: izdeby fbshipit-source-id: 042642ced1cb0fd1bb6bff05f9ca871a5c54ee5e	2019-02-19 08:22:13 -08:00
Edward Yang	e936a69085	Move THCCachingAllocator to c10_cuda. (#16119 ) Summary: Some renaming and renamespacing also took place. I was originally planning not to do anything, but it turns out that it was easier to make HIPify work by using a namespace CUDACachingAllocator:: rather than THCCachingAllocator_, since :: is a word boundary but _ is not. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16119 Reviewed By: smessmer Differential Revision: D13718768 fbshipit-source-id: 884a481d99027fd3e34471c020f826aa12225656	2019-01-24 12:06:56 -08:00
Edward Yang	24b50f1411	Remove unnecessary includes and headers from THCCachingAllocator, move to at::cuda:: namespace (#16117 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16117 This means I can move it to c10_cuda with minimal fuss. Reviewed By: smessmer Differential Revision: D13717836 fbshipit-source-id: a94c7dc649af64542480fc1c226b289588886c00	2019-01-24 12:06:54 -08:00
Edward Yang	411173757e	Rename away uses of THAllocator and THCDeviceAllocator (#16061 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16061 I discovered I needed to delete these names in preparation of moving THCCachingAllocator to c10_cuda; might as well also fix all the other sites too. Reviewed By: dzhulgakov Differential Revision: D13686869 fbshipit-source-id: e8cc55d39ac4bfd3e3a22c761f89a7a111ce5f5e	2019-01-16 05:36:47 -08:00
Syed Tousif Ahmed	86af14b0c7	Resolves ptxas warnings when compiling for CUDA_ARCH 750 and a memoryType deprecation warning (#15461 ) Summary: When compiling for `TORCH_CUDA_ARCH_LIST=7.5` we were getting ptxas warnings (https://github.com/pytorch/pytorch/issues/14310). This was because we had some hardcoded values when using launch_bounds in kernels. The maximum number of threads per multiprocessor is 1024 for Turing architecture (7.5) but 2048 for previous architectures. The hardcoded launch_bounds in the kernel were requesting for 2048 threads when compiling for Turing and hence were generating the warning. This PR adds a macro that checks for the bounds on the launch bounds value supplied. The max number of threads per block across all architectures is 1024. If a user supplies more than 1024, I just clamp it down to 512. Depending on this value, I set the minimum number of blocks per sm. This PR should resolve https://github.com/pytorch/pytorch/issues/14310. The gradient computation being wrong reported in that PR is probably due to the faulty card. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15461 Differential Revision: D13633952 Pulled By: soumith fbshipit-source-id: 795aa151109f343ab5433bf3cb070cb6ec896fff	2019-01-10 21:44:39 -08:00
Richard Zou	8f11147d43	Use CUDAGuard when serializing CUDA Tensors (#15807 ) Summary: Fixes #15308. Before this change, `torch.save` and `torch.load` would initialize the CUDA context on GPU 0 if it hadn't been initialized already, even if the serialized tensors are only on GPU 1. This PR fixes that bug by using CUDAGuard in the storage serialization path. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15807 Differential Revision: D13593201 Pulled By: zou3519 fbshipit-source-id: 4addc91ea5a5278d56a03f3d422577ee39e99897	2019-01-08 07:31:50 -08:00
Edward Yang	2d485ffb17	Move CUDAGuard, CUDAStream and CUDAGuardImpl to c10/cuda (#14248 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14248 This diff also introduces a horrifying hack to override CUDA's DeviceGuardImpl with a HIPGuardImplMasqueradingAsCUDA, to accommodate PyTorch's current behavior of pretending CUDA is HIP when you build with ROCm enabled. Reviewed By: bddppq Differential Revision: D13145293 fbshipit-source-id: ee0e207b6fd132f0d435512957424a002d588f02	2018-12-12 11:24:26 -08:00
Edward Yang	517c7c9861	Canonicalize all includes in PyTorch. (#14849 ) Summary: Anywhere we used #include "foo.h", we now say #include <foo.h> Paths are adjusted to be rooted out of aten/src, torch/lib, or the root level directory. I modified CMakeLists.txt by hand to remove TH and THC from the include paths. I used the following script to do the canonicalization: ``` import subprocess import re import os.path files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n') for fn in files: if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']): continue if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]): continue with open(fn, 'r') as f: c = f.read() def fmt(p): return "#include <{}>".format(p) def repl(m): p = m.group(1) if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]: return fmt(p) if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]): return fmt(p) for root in ["aten/src", "torch/lib", ""]: for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]: new_p = os.path.relpath(os.path.join(bad_root, p), root) if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))): return fmt(new_p) print("ERROR: ", fn, p) return m.group(0) new_c = re.sub(r'#include "([^"]+)"', repl, c) if new_c != c: print(fn) with open(fn, 'w') as f: f.write(new_c) ``` Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849 Reviewed By: dzhulgakov Differential Revision: D13363445 Pulled By: ezyang fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68	2018-12-08 19:38:30 -08:00
Peter Goldsborough	d6c53328f9	Large scale fix of python-related files in torch/csrc/ Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14515 Differential Revision: D13247966 Pulled By: goldsborough fbshipit-source-id: 7a127c508fc576a7a92626dd6b729f660162d628	2018-12-07 13:04:46 -08:00
Lin Huang	524574ab73	Define THPStorage struct only once (rather than N times) (#14802 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14802 The definetion of THPStorage does not depend on any Real, its macro defintion is unnecessary, refactor the code so that THPStorage is not macro defined. Reviewed By: ezyang Differential Revision: D13340445 fbshipit-source-id: 343393d0a36c868b9a06eea2ad9b80f5e395e947	2018-12-05 13:19:29 -08:00
Ailing Zhang	be47470c91	Fix cuda multiprocessing cached memory (#14736 ) Summary: This PR fixes #11422 In the old world of CUDA IPC, when we want to share a tensor T from A to B, we have to share the whole CUDA mem allocation where T's storage sit in. And we casted it to the same type of storage of T's. This causes problem when two different types of storage got allocated to the same CUDA mem block. When we try to reconstruct the second tensor, it will complain about wrong storage type. In this PR we reconstruct the storage only (not the entire mem block). However, CUDA only allows one open memHandle once per process, we have to save the device pointer in a global cache so that we can reconstruct tensors as they come. Thanks a ton to ezyang who helped design the solution and debugged the issue! Pull Request resolved: https://github.com/pytorch/pytorch/pull/14736 Differential Revision: D13335899 Pulled By: ailzhang fbshipit-source-id: cad69db392ed6f8fdc2b93a9dc2899f6d378c371	2018-12-05 10:55:43 -08:00
albanD	6c8ac50753	Fix exception catching to catch c10::Error properly (#13665 ) Summary: In particular, this was breaking the logic for cudnn algorithm to fall back to a less memory hungry algorithm if the selected one OOM when creating the workspace. c10::Error are subclass of `std::exception` and not `std::runtime_error`. I removed `runtime_error` in all places in our code and replaced them with `const exception`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13665 Differential Revision: D12958396 Pulled By: soumith fbshipit-source-id: af557efd9887b013140113d3067de157ffcf8465	2018-11-07 11:22:48 -08:00
Edward Yang	e5d56659ec	Delete DeviceGuard(int64_t) constructor. (#13232 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13232 DeviceGuard should be device agnostic, which means that it shouldn't assume that int64_t means select the CUDA device. Reviewed By: gchanan Differential Revision: D10858024 fbshipit-source-id: b40e8337e4046906fd8f83a95e6206367fb29dbe	2018-10-31 07:55:11 -07:00
Edward Z. Yang	a5818047c4	Rewrite serialization to correctly handle partial reads/writes in all cases (#12143 ) Summary: Previously, doRead/doWrite were functions that could return partial reads/writes, and we checked for this case inconsistently in the call sites of serialization.cpp. Now, these functions do NOT return the amount of bytes read/written, and instead handle the necessary checking loop themselves. Fixes #12042. Maybe. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/12143 Differential Revision: D10097027 Pulled By: ezyang fbshipit-source-id: fd222ab8a825bed352153648ad396acfe124a3e1	2018-09-27 19:09:53 -07:00
Roy Li	f00f99ebcc	use at::Half in THC (#11322 ) Summary: - use Half instead of half in THC - clean up TH_float2half, TH_half2float, etc. conversions Pull Request resolved: https://github.com/pytorch/pytorch/pull/11322 Differential Revision: D9799553 Pulled By: li-roy fbshipit-source-id: 9aa3e003bff73d9df6224a393f3ec0624b1f44ed	2018-09-12 17:39:37 -07:00
Edward Yang	49231ab0a8	Reimplement storage slicing. (#11314 ) Summary: In #9466 I got rid of storage views and eliminated all places where they were used... OR SO I THOUGHT. In actuality, under certain conditions (specifically, if you trained a CUDA multiprocessing model shared over CUDA IPC and then serialized your parameters), you could also serialize storage slices to the saved model format. In #9466, I "fixed" the case when you loaded the legacy model format (really, just unshared the storages--not strictly kosher but if you aren't updating the parameters, shouldn't matter), but NOT the modern model format, so such models would fail. So, I could have applied the legacy model format fix too, but hyperfraise remarked that he had applied a fix that was effectively the same as unsharing the storages, but it had caused his model to behave differently. So I looked into it again, and realized that using a custom deleter, I could simulate the same behavior as old storage slices. So back they come. In principle, I could also reimplement storage views entirely using our allocators, but I'm not going to do that unless someone really really wants it. Fixes #10120. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/11314 Reviewed By: ailzhang Differential Revision: D9671966 Pulled By: ezyang fbshipit-source-id: fd863783d03b6a6421d6b9ae21ce2f0e44a0dcce	2018-09-06 16:11:59 -07:00
Edward Yang	0a8c8c1dbe	Rename real to scalar_t. (#11163 ) Summary: This is necessary to allow us to use the complex header which defines real (and is very sad if real is macro'ed). We should also fix accreal, ureal, Real and REAL, but only 'real' is the real blocker. ``` codemod -d aten/src/TH --extensions c,cc,cpp,cu,cuh,h,TARGETS,py,hpp '\breal\b' scalar_t codemod -d aten/src/THC --extensions c,cc,cpp,cu,cuh,h,TARGETS,py,hpp '\breal\b' scalar_t codemod -d aten/src/THNN --extensions c,cc,cpp,cu,cuh,h,TARGETS,py,hpp '\breal\b' scalar_t codemod -d aten/src/THCUNN --extensions c,cc,cpp,cu,cuh,h,TARGETS,py,hpp '\breal\b' scalar_t ``` Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/11163 Reviewed By: SsnL Differential Revision: D9619906 Pulled By: ezyang fbshipit-source-id: 922cb3a763c0bffecbd81200c1cefc6b8ea70942	2018-09-02 15:26:01 -07:00
Peter Goldsborough	7ddc6f84c4	NULL -> nullptr (#11047 ) Summary: How did we get so many uses of `NULL` again? ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/11047 Differential Revision: D9566799 Pulled By: goldsborough fbshipit-source-id: 83469f352ac69aa65bdaf1a1a21f922d892e0db3	2018-08-30 16:25:42 -07:00
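
Note on afa5d0823b (big endian fix): the error quoted in that commit, "expected 2305843009213693952 got 32", can be reproduced with a few lines of Python. This is only an illustrative sketch, not the C++ code in serialization.cpp. It assumes the size field is an 8-byte signed integer written in little-endian order, and the helper names `decode_with_wrong_order` and `decode_size` are hypothetical.

```python
import struct


def decode_with_wrong_order(value: int) -> int:
    """Write value as a little-endian int64, then misread the bytes as big-endian."""
    raw = struct.pack("<q", value)       # bytes as an x86 machine would write them
    return struct.unpack(">q", raw)[0]   # what a reader sees if it skips the byte swap


def decode_size(raw: bytes) -> int:
    """Decode an 8-byte size field with an explicit byte order.

    Naming the order ("<") makes the result independent of the host's
    native endianness, so the same bytes decode the same way everywhere.
    """
    return struct.unpack("<q", raw)[0]


stored = struct.pack("<q", 32)
print(decode_with_wrong_order(32))  # 2305843009213693952
print(decode_size(stored))          # 32
```

The misread value is 32 shifted into the top byte of the 64-bit word (0x20 followed by seven zero bytes), which matches the "very long number" in the traceback.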

399 Commits