pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Will Feng	c67d3533a7	Update C++ torch::nn parity table, and temporarily disable C++ API parity test (#28117 ) Summary: This PR updates `test/cpp_api_parity/parity-tracker.md` to reflect our progress on C++ `torch::nn` parity. It also disables the C++ API parity test temporarily, and as the next step I will refactor the parity test to make it simpler. Pull Request resolved: https://github.com/pytorch/pytorch/pull/28117 Differential Revision: D17957948 Pulled By: yf225 fbshipit-source-id: 1dd836c25665f57ba8efc6d1abf671a95c03eff7	2019-10-16 11:54:13 -07:00
Jithun Nair	6eef469074	Enable mgpu unit tests for rocm Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27518 Differential Revision: D17880153 Pulled By: bddppq fbshipit-source-id: 5b6210104ec66747558a08f97dda1e7796f681df	2019-10-11 14:35:36 -07:00
Pieter Noordhuis	c5ec0a7ede	Don't run dist_autograd_fork on Python 2 (#27612 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27612 The file imports from torch.distributed.rpc, which won't be initialized when running on Python 2. Test Plan: Imported from OSS Differential Revision: D17855033 Pulled By: pietern fbshipit-source-id: 6e6b0ca248d0512dac5a44e10e153c710cefe02c	2019-10-11 11:18:46 -07:00
Yanli Zhao	fc249c7924	skip all rpc and dist autograd spawn tests for <PY36 (#27191 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27191 skip rpc and distautograd spawns tests for <python 3.6 ghstack-source-id: 91231565 close #27157 Test Plan: unit tests Differential Revision: D17697368 fbshipit-source-id: bb8cf1f47de41f9d350fd60afe37fece293d8680	2019-10-02 23:05:51 -07:00
Shihao Xu	00e588290b	Add test case for init_rpc_backend (#26997 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26997 Reverting accidental change in https://github.com/pytorch/pytorch/pull/26919 ghstack-source-id: 91126906 Reviewed By: zhaojuanmao Differential Revision: D17637468 fbshipit-source-id: 9ffcf4b15b37effe6b5d5f82338ff89298c82a52	2019-10-01 15:44:34 -07:00
Shen Li	bb8983e936	Revert D17694691: Enable distributed autograd tests for >py36 Test Plan: revert-hammer Differential Revision: D17694691 Original commit changeset: 6e7b74064589 fbshipit-source-id: 7da10f478adbbde05f16eb6095acb000d7945c99	2019-10-01 15:00:33 -07:00
Shen Li	7bbb2df6d9	Enable distributed autograd tests for >py36 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27166 Test Plan: Imported from OSS Reviewed By: zhaojuanmao Differential Revision: D17694691 Pulled By: mrshenli fbshipit-source-id: 6e7b740645891fd3cc67600de26346f7b336773b	2019-10-01 14:46:06 -07:00
Yanli Zhao	1d2d59dd79	make rpc and dist-autograd multiprocess test to use both fork and spawn (#25656 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25656 spawn multiprocessing can catch some issues that fork multiprocessing can not catch, meanwhile fork can work properly with asan tests, but spawn multiprocessing can not work with asan tests for some use cases right now. so this diff adding support to launch both spawn and fork tests in multiProcessingTestCase class, also let test_rpc and test_dist_autograd to run both spawn and fork tests ghstack-source-id: 91096705 Test Plan: unit tests Reviewed By: xush6528 Differential Revision: D17086007 fbshipit-source-id: af2446e7abe948c37081cff24ed060fd87f84922	2019-10-01 11:15:22 -07:00
Mike Ruberry	a9a9d362e2	Makes test_indexing.py device generic (#26634 ) Summary: - Makes test_indexing.py device generic - Removes test_indexing_cuda.py Note: a couple tests in test_indexing.py were already CPU and CUDA tests, meaning these tests were run multiple times when CUDA was available. Genericizing test_indexing.py corrects this and lets these tests be run on other device types, like XLA, too. Pull Request resolved: https://github.com/pytorch/pytorch/pull/26634 Differential Revision: D17529001 Pulled By: mruberry fbshipit-source-id: e71ba28d947749255a0aceeb7b77a42c4811439d	2019-09-23 11:52:48 -07:00
peter	2ce8c83f67	Enable CPU fused kernel on Windows Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25578 Differential Revision: D17397156 Pulled By: ezyang fbshipit-source-id: b243528c2bfd5a0d401897833048429e67fe40ef	2019-09-17 07:29:40 -07:00
Pieter Noordhuis	e4cd807cdb	Make running Gloo tests conditional on availability Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25913 Test Plan: Imported from OSS Differential Revision: D17313283 Pulled By: pietern fbshipit-source-id: f07cb456e79a0067eac0f7abbc378fbd05c5565f	2019-09-11 02:20:32 -07:00
Lu Fang	75cac0fe69	expose parse_schema and __eq__ function to python and add round trip tests (#23208 ) Summary: expose necessary functions to python, and add round-way tests for function schema str() and parsing functions. We iterate over all the registered function schemas and get the string, then parse the string. We compare the schema generated from parsing with the original one, and make sure they are equal. Pull Request resolved: https://github.com/pytorch/pytorch/pull/23208 ghstack-source-id: 89638026 Test Plan: buck test //caffe2/test:function_schema Reviewed By: zrphercule Differential Revision: D16435471 fbshipit-source-id: 6961ab096335eb88a96b132575996c24090fd4c0	2019-09-06 15:50:56 -07:00
Brian Vaughan	88e4cee3e7	Improve handling of mixed-type tensor operations (#22273 ) Summary: Improve handling of mixed-type tensor operations. This PR affects the arithmetic (add, sub, mul, and div) operators implemented via TensorIterator (so dense but not sparse tensor ops). For these operators, we will now promote to reasonable types where possible, following the rules defined in https://github.com/pytorch/pytorch/issues/9515, and error in cases where the cast would require floating point -> integral or non-boolean to boolean downcasts. The details of the promotion rules are described here: https://github.com/nairbv/pytorch/blob/promote_types_strict/docs/source/tensor_attributes.rst Some specific backwards incompatible examples: * now `int_tensor * float` will result in a float tensor, whereas previously the floating point operand was first cast to an int. Previously `torch.tensor(10) * 1.9` => `tensor(10)` because the 1.9 was downcast to `1`. Now the result will be the more intuitive `tensor(19)` * Now `int_tensor *= float` will error, since the floating point result of this operation can't be cast into the in-place integral type result. See more examples/detail in the original issue (https://github.com/pytorch/pytorch/issues/9515), in the above linked tensor_attributes.rst doc, or in the test_type_promotion.py tests added in this PR: https://github.com/nairbv/pytorch/blob/promote_types_strict/test/test_type_promotion.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/22273 Reviewed By: gchanan Differential Revision: D16582230 Pulled By: nairbv fbshipit-source-id: 4029cca891908cdbf4253e4513c617bba7306cb3	2019-09-05 18:26:09 -07:00
Pritam Damania	7818e7e5d4	Basic framework for Distributed Autograd context. (#24875 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24875 As per https://github.com/pytorch/pytorch/issues/23110, each autograd pass would be assigned a unique autograd_context_id. In this change we introduce a DistAutogradContainer per worker which holds information for each autograd pass currently running. DistAutogradContainer has a map from the autograd_context_id to DistAutogradContext (which holds all the relevant information for the autograd pass). DistAutogradContext currently only stores the autograd_context_id and more information would be added to it later as we build out the rest of the framework. The autograd_context_id is a 64 bit globally unique integer where the first 16 bits are the worker_id and next 48 bits are auto-incrementing for uniqueness. Sample python code on how this would be used for distributed autograd: ``` import torch.distributed.autograd as dist_autograd worker_id = 0 dist_autograd.init(worker_id) with dist_autograd.context() as context_id: # forward pass... # backward pass... # optimizer step... ``` ghstack-source-id: 89119248 Test Plan: unit tests. Differential Revision: D16356694 fbshipit-source-id: d1a8678da0c2af611758dbb5d624d554212330ce	2019-08-28 18:51:56 -07:00
Raghuraman Krishnamoorthi	9945c0cea6	Work around for bias quantization for conv and linear operators (#25212 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25212 In eager mode, all modules need to work with input tensors that can change qparams dynamically. This issue https://github.com/pytorch/pytorch/issues/23874 will address this via FBGEMM modifications. This is a work around before that. ghstack-source-id: 89118038 Test Plan: buck test caffe2/test:quantized -- 'test_conv_api $test_quantized_nn_mods\.ModuleAPITest$' --print-passing-details Summary (total time 65.86s): PASS: 1 FAIL: 0 SKIP: 0 FATAL: 0 TIMEOUT: 0 OMIT: 0 Differential Revision: D17064471 fbshipit-source-id: 3c192442b19bf2d9d88d4e52de6c24dc134a846f	2019-08-28 07:24:03 -07:00
Elias Ellison	277cd748f9	skip fstrings test if not py36 (#25184 ) Summary: Fixes py35 job on master Pull Request resolved: https://github.com/pytorch/pytorch/pull/25184 Differential Revision: D17057957 Pulled By: eellison fbshipit-source-id: 53decc408680d9436395698cbd4b4ede98933159	2019-08-26 13:58:45 -07:00
Will Feng	1bf1970fe2	Add Python/C++ torch.nn API parity test harness (#23852 ) Summary: This PR adds test harness for checking Python / C++ API parity for `torch.nn.Module` subclasses. Under the hood, we use JIT tracing to transfer `nn.Module` state from Python to C++, so that we can test initialization / forward / backward on Python / C++ modules with the same parameters and buffers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/23852 Differential Revision: D16830204 Pulled By: yf225 fbshipit-source-id: 9b5298c0e8cd30e341a9f026e6f05604a82d6002	2019-08-26 08:02:25 -07:00
Elias Ellison	ab38059bc7	fix annotated assignment (#25094 ) Summary: Fixing parsing for annotated assignment `List[int] a = []`. See https://github.com/pytorch/pytorch/pull/24989/files?file-filters%5B%5D=.py for changes to the test_jit_py3 & run_test files. follow up to https://github.com/pytorch/pytorch/pull/24477 and fix for https://github.com/pytorch/pytorch/issues/25086 Pull Request resolved: https://github.com/pytorch/pytorch/pull/25094 Differential Revision: D16985016 Pulled By: eellison fbshipit-source-id: 6be1363f2503303b96bd2e6a9f188ad72441f4eb	2019-08-23 13:14:38 -07:00
Zachary DeVito	f9f5af0ed7	Revert D16949314: [jit] Fix bugs in assignment to optionals Test Plan: revert-hammer Differential Revision: D16949314 Original commit changeset: 7f63d88b30a3 fbshipit-source-id: d1f00de2ad9c3484b731ad1b24205ca60024355d	2019-08-22 16:50:48 -07:00
Zachary DeVito	bb79b61ce7	Fix bugs in assignment to optionals (#24989 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24989 This fixes the cases where a type annotated with optional cannot be conditionally assigned to none: ``` x : Optional[int] = 4 if ...: x = None ``` Test Plan: Imported from OSS Differential Revision: D16949314 Pulled By: zdevito fbshipit-source-id: 7f63d88b30a3f5b024c2a539aa74967c9202af00	2019-08-22 16:27:46 -07:00
Michael Suo	ef14d88f27	Make torch.jit.Attribute work with PYTORCH_ENABLED=0 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23851 Test Plan: Imported from OSS Differential Revision: D16840394 Pulled By: suo fbshipit-source-id: b72e081513de73f565f3aeaa61125b7d3aa9c3e7	2019-08-19 15:23:21 -07:00
Michael Suo	0ce7264ed6	Don't require slow test reporting in `run_tests.py --pytest` (#24448 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24448 The setting `--durations=10` was hard-coded, which is annoying as I don't necessarily care. A good alternative to get the same behavior is: ``` python run_tests.py --pytest -- --durations=10 ``` Test Plan: Imported from OSS Differential Revision: D16876380 Pulled By: suo fbshipit-source-id: 1e14d366db45b6b9bf4a4ab1633b0f6ece29f6bc	2019-08-17 01:26:07 -07:00
James Reed	7597741159	Run quantization tests first Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24366 Test Plan: Imported from OSS Differential Revision: D16815295 Pulled By: jamesr66a fbshipit-source-id: 01478ce2fcbe0620cd5cf9854121602e0663c057	2019-08-14 18:09:32 -07:00
James Reed	e7f1977bae	test_nn_quantized -> test_quantized_nn_mods (#24201 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24201 It turns out that the `run_test` script uses a blacklist of "exclude" tests and tests if the test name [starts with](https://github.com/pytorch/pytorch/blob/master/test/run_test.py#L342) the given blacklist item. `nn` was passed as a blacklist item in CI, and that meant that not only was test_nn skipped, but also test_nn_quantized. This renames the test to avoid this situation, and imo puts it in a better position lexicographically next to the other quantization tests. Test Plan: Imported from OSS Differential Revision: D16772820 Pulled By: jamesr66a fbshipit-source-id: 4cde0729b48ae3e36fcedab9c98197831af82dde	2019-08-13 17:07:15 -07:00
Shen Li	8b349073ce	sync and async torch.distributed.rpc for builtin operators (#23228 ) Summary: Features: * sync and async RPC for builtin operators * RpcAgent API * ProcessGroupAgent implementation Goal: * have a minimum working and testable RPC implementation * make sure the RpcAgent API is sufficient for future ThriftAgent and TensorPipeAgent implementation * For tensor pipe implementation, it might allocate multiple underlying communication channels with different types, and might also use streaming serialization/deserialization for large tensors. To support this requirement, the current implementation only convert a BuiltinOp into a Message which contains a byte vector and a tensor table. It is up to the RpcAgent implementation to determine how it would like to serialize a Message object. * For ThriftAgent, as Thrift has it own request/response matching solution, the Message.id is no longer necessary. Hence the id can be dropped during serialization. All it needs to do is to pass the response Message object to the Future returned by send(...). * support blocking and non-blocking RequestCallback * blocking means the callback won't return before sending out the response * non-blocking can be achieved by enqueue the `(from, request, RpcAgent&)` tuple and use a different thread to process them. That is why there is an `RpcAgent&` arg in the param list. We are not exporting this diff until we finalize distributed autograd design and publish the API review publicly. https://fb.quip.com/FabTAZKVgQpf Pull Request resolved: https://github.com/pytorch/pytorch/pull/23228 ghstack-source-id: 87816717 Reviewed By: zhaojuanmao Differential Revision: D15194693 fbshipit-source-id: 7adb600796613cde6073db6c227451b89940ecaf	2019-08-06 16:03:01 -07:00
James Reed	40f0b1c844	Enable OSS quantization tests (#23858 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23858 Pull Request resolved: https://github.com/pytorch/pytorch/pull/23718 Changes: - Enable tests for quantization test files in `run_tests.py` - Remove `__future__` imports from `torch/nn/qat/modules/__init__.py`, since `unicode_literals` messes up imports on python2 because the elements in `__all__` will be Unicode and not string - Skip PostTrainingQuantTests if the build doesn't have FBGEMM (only a small subset of targets in tests) or if testing under UBSAN (the suppression file doesn't seem to work) Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D16639467 Pulled By: jamesr66a fbshipit-source-id: 532766797c216976dd7e07d751f768ff8e0fc207	2019-08-06 11:20:30 -07:00
SsnL	8482efb203	pin_memory malloc now uses existing context if available. (#22229 ) Summary: This is achieved by using `cuDevicePrimaryCtxGetState` as a way to check whether a primary context exists on a device. It is not too slow, from this benchmark of a single call to it on CUDA 10.1, Titan Xp, driver 415.27: ``` --------------------------------------------------------------------- Benchmark Time CPU Iterations --------------------------------------------------------------------- BM_cuDevicePrimaryCtxGetState 301 ns 301 ns 2319746 ``` Commits: 1. Add `CUDAHooks::getDeviceWithPrimaryContext` which returns a device index with primary context (if exists). Link `c10/cuda` against `libcuda` for device API calls. 2. Use `getDeviceWithPrimaryContext` to check primary context in `pin_memory`. Fix `OptionalDeviceGuard` doc. 3. Refactor `test_cuda_primary_ctx.py` to support multiple tests. Add test for this in that file. Fixes https://github.com/pytorch/pytorch/issues/21081. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22229 Differential Revision: D16170194 Pulled By: zou3519 fbshipit-source-id: 485a45f211b7844c9e69c63f3b3b75194a796c5d	2019-07-16 10:18:30 -07:00
Pieter Noordhuis	6ff0c6ca3f	Remove THD (#22065 ) Summary: It's been ~9 months since moving THD to the `torch.distributed.deprecated` namespace (see https://github.com/pytorch/pytorch/issues/11405) and we haven't seen issues related to it, so it's time to remove it. Closes https://github.com/pytorch/pytorch/issues/18967. Pull Request resolved: https://github.com/pytorch/pytorch/pull/22065 Reviewed By: mrshenli Differential Revision: D15983669 Pulled By: pietern fbshipit-source-id: 2a2f5866f9a63040bc7cef3956d5fd215aba7165	2019-06-25 12:19:13 -07:00
Shen Li	25d1496d58	Fix Process Group for tensors shared across processes (#21449 ) Summary: Ops on a Process Group (pg) instance will hit an error when input/output tensors are created on a different process, because, pg calls `recordStream` on `CUDACachingAllocator` which only knows tensors created within the same process. The proposed solution is to add a `suppressError` arg (suggestions for better names?) to `recordStream`. See comments in code for arguments. CC pichuang1984 Pull Request resolved: https://github.com/pytorch/pytorch/pull/21449 Differential Revision: D15689736 Pulled By: mrshenli fbshipit-source-id: e7fc81b167868f8666536067eaa7ae2c8584d88e	2019-06-11 11:50:25 -07:00
Elias Ellison	f6e5846a67	add handle to run all jit tests (#21161 ) Summary: Now you can run `python test/run_tests --jit` to run all jit tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/21161 Differential Revision: D15563912 Pulled By: eellison fbshipit-source-id: 4bb0285cda4168b72a3dc4bba471485566a59873	2019-05-30 14:12:21 -07:00
Dmytro Dzhulgakov	c25e33789e	Lightweight at-most-once logging for API usage (#20745 ) Summary: Resubmit #20698 which got messed up. Idea is that when PyTorch is used in a custom build environment (e.g. Facebook), it's useful to track usage of various APIs centrally. This PR introduces a simple very lightweight mechanism to do so - only first invocation of a trigger point would be logged. This is significantly more lightweight than #18235 and thus we can allow to put logging in e.g. TensorImpl. Also adds an initial list of trigger points. Trigger points are added in such a way that no static initialization triggers them, i.e. just linking with libtorch.so will not cause any logging. Further suggestions of what to log are welcomed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20745 Differential Revision: D15429196 Pulled By: dzhulgakov fbshipit-source-id: a5e41a709a65b7ebccc6b95f93854e583cf20aca	2019-05-23 23:17:59 -07:00
Richard Zou	83a80d2b31	Add test/test_namedtensor.py (#20168 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20168 ghimport-source-id: 78bd3c4b6bc87c216ce33dba13b61feb87e5fe53 Reviewed By: gchanan Differential Revision: D15278222 Pulled By: zou3519 fbshipit-source-id: 3bcdb1cb654400350d42464dd9e0d5e0a7116e1e	2019-05-09 09:09:22 -07:00
Tzu-Wei Huang	98e312cf96	TensorBoard support within PyTorch (#16196 ) Summary: This PR adds TensorBoard logging support natively within PyTorch. It is based on the tensorboardX code developed by lanpa and relies on changes inside the tensorflow/tensorboard repo landing at https://github.com/tensorflow/tensorboard/pull/2065. With these changes users can simply `pip install tensorboard; pip install torch` and then log PyTorch data directly to the TensorBoard protobuf format using ``` import torch from torch.utils.tensorboard import SummaryWriter writer = SummaryWriter() s1 = torch.rand(1) writer.add_scalar('data/scalar1', s1[0], 0) writer.close() ``` Design: - `EventFileWriter` and `RecordWriter` from tensorboardX now live in tensorflow/tensorboard - `SummaryWriter` and PyTorch-specific conversion from tensors, nn modules, etc. now live in pytorch/pytorch. We also support Caffe2 blobs and nets. Action items: - [x] `from torch.utils.tensorboard import SummaryWriter` - [x] rename functions - [x] unittests - [x] move actual writing function to tensorflow/tensorboard in https://github.com/tensorflow/tensorboard/pull/2065 Review: - Please review for PyTorch standard formatting, code usage, etc. - Please verify unittest usage is correct and executing in CI Any significant changes made here will likely be synced back to github.com/lanpa/tensorboardX/ in the future. cc orionr, ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/16196 Differential Revision: D15062901 Pulled By: orionr fbshipit-source-id: 3812eb6aa07a2811979c5c7b70810261f9ea169e	2019-04-25 21:30:23 -07:00
Junjie Bai	ef499cd567	Remove no-fork workaround for running tests with ROCm (#19436 ) Summary: This should have been fixed in newest ROCm version. Pull Request resolved: https://github.com/pytorch/pytorch/pull/19436 Reviewed By: ezyang Differential Revision: D15004685 Pulled By: bddppq fbshipit-source-id: 19fd4cca94c914dc54aabfbb4e62b328aa348a35	2019-04-19 09:51:03 -07:00
Zafar Takhirov	c145c34a7b	Basic implementation of QRelu in C10 (#19091 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19091 Implements a basic quantized ReLU (uint8). This is a temporary solution before using the `QTensor` type instead of the tuple. Reviewed By: dzhulgakov Differential Revision: D14565413 fbshipit-source-id: 7d53cf5628cf9ec135603d6a1fb7c79cd9383019	2019-04-11 08:47:56 -07:00
jgong5	3ad710b837	Add MKL-DNN Tensor (#17748 ) Summary: This is a minimalist PR to add MKL-DNN tensor per discussion from Github issue: https://github.com/pytorch/pytorch/issues/16038 Ops with MKL-DNN tensor will be supported in following-up PRs to speed up imperative path. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17748 Reviewed By: dzhulgakov Differential Revision: D14614640 Pulled By: bddppq fbshipit-source-id: c58de98e244b0c63ae11e10d752a8e8ed920c533	2019-04-08 21:41:38 -07:00
Elias Ellison	a5ddecd00c	Move fuser to test_jit_fuser (#18590 ) Summary: Start of breaking up test_jit.py New files will have the format test_jit_* so they are easily grepable but remain in the same directory so we don't have to go through multiple sources for imports. I am adding a test that's expected to fail to be sure it's running. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18590 Reviewed By: wanchaol Differential Revision: D14677094 Pulled By: eellison fbshipit-source-id: 9782c6aa9525bb6f332fc75cfff004c83a417522	2019-03-29 18:13:26 -07:00
Edward Yang	4bea15f580	Fix lint in run_test.py Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17815 Reviewed By: eellison Differential Revision: D14390308 fbshipit-source-id: 22efd62a1bbd1fc8155a942d7160d5b7d3158e6b	2019-03-08 14:41:36 -08:00
peter	c78da0c6ed	Enable using CMD when building cpp extensions on Windows Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17706 Differential Revision: D14346482 Pulled By: ezyang fbshipit-source-id: 7c85e51c701f6c0947ad324ef19fafda40ae1cb9	2019-03-06 14:45:31 -08:00
Gao, Xiang	b6b99fd7d3	Add namedtuple return for min, median, mode, kthvalue, add test for namedtuple return API (#16186 ) Summary: This partially fixes https://github.com/pytorch/pytorch/issues/394 and depend on https://github.com/pytorch/pytorch/pull/15429. I suggest to review this only after https://github.com/pytorch/pytorch/pull/15429 get landed, otherwise the diff might be large to review. The test only allows explicitly whitelisted operators to have named return. Differential Revision: D14070735 Pulled By: ezyang fbshipit-source-id: ace2a672998b4e4a8094f52cbda5aa1cea6e3b42	2019-02-16 00:01:33 -08:00
Xiang Gao	07b5782ff7	Add some missing docs to torch.rst, new unittest to enforce torch.rst no longer miss anything (#16039 ) Summary: This prevent people (reviewer, PR author) from forgetting adding things to `torch.rst`. When something new is added to `_torch_doc.py` or `functional.py` but intentionally not in `torch.rst`, people should manually whitelist it in `test_docs_coverage.py`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16039 Differential Revision: D14070903 Pulled By: ezyang fbshipit-source-id: 60f2a42eb5efe81be073ed64e54525d143eb643e	2019-02-15 07:02:31 -08:00
Thomas Viehmann	6a6983ed7f	create type hint stub files for module torch (#12500 ) Summary: We have: - This is an initial stab at creating a type stub `torch/__init__.pyi` . - This is only tested on Python 3, since that's the only Python version mypy works on. - So far, we only aim at doing this for torch functions and torch.Tensor. - Quite a few methods and functions have to be typed manually. These are done in `torch/__init__.pyi.in` For me, PyCharm (the non-paid one) didn't seem to indicate errors in the .pyi when opening and seemed to be able to get the type hint for the few functions I tried, but I don't use PyCharm for my usual PyTorch activities, so I didn't extensively try this out. An example of a generated PYI is at [this gist](https://gist.github.com/ezyang/bf9b6a5fa8827c52152858169bcb61b1). Pull Request resolved: https://github.com/pytorch/pytorch/pull/12500 Differential Revision: D13695553 Pulled By: ezyang fbshipit-source-id: 4566c71913ede4e4c23ebc4a72c17151f94e8e21	2019-01-29 12:14:17 -08:00
Syed Tousif Ahmed	17e3ab957a	Report the slowest 10 tests when using pytest (#16423 ) Summary: This flag is useful in identifying if a test is taking way too long like the ones in the following snippet when running the test suite with pytest. `9757ad35b0/test/common_utils.py (L814-L835)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/16423 Differential Revision: D13843507 Pulled By: ezyang fbshipit-source-id: 643e1766a85905b3b112ea5ca562135a17896a72	2019-01-28 10:33:05 -08:00
SsnL	ffd613800f	Add IS_PYTORCH_CI flag for testing (#16006 ) Summary: Use case: Some data loader tests rely on `psutil` (a third party lib). So they are guarded by `skipIf`. But we want to always test them on CI envs. With `IS_PYTORCH_CI`, we can raise if `psutil` is not found. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16006 Reviewed By: ezyang Differential Revision: D13673957 Pulled By: yf225 fbshipit-source-id: c63a7138093f45333c0b371fed0bcc88b67f2a22	2019-01-16 23:07:38 -08:00
Mickaël Schoentgen	71c6e24373	Fix several ResourceWarning: unclosed file (#15746 ) Summary: Hello, This is a patch to fix `ResourceWarning: unclosed file`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15746 Differential Revision: D13587286 Pulled By: soumith fbshipit-source-id: 08ac34c5b51d9334867f65a2927bff11511553f3	2019-01-09 15:36:53 -08:00
bddppq	2db742fc95	Do not use fork to invoke test scripts in pytorch rocm CI Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14600 Differential Revision: D13523937 Pulled By: bddppq fbshipit-source-id: 1493fdd051283650081d7944bb2bd7f0c4c44990	2018-12-19 21:35:16 -08:00
Peter Goldsborough	6f2307ba6a	Allow building libraries with setuptools that dont have abi suffix (#14130 ) Summary: When using `setuptools` to build a Python extension, setuptools will automatically add an ABI suffix like `cpython-37m-x86_64-linux-gnu` to the shared library name when using Python 3. This is required for extensions meant to be imported as Python modules. When we use setuptools to build shared libraries not meant as Python modules, for example libraries that define and register TorchScript custom ops, having your library called `my_ops.cpython-37m-x86_64-linux-gnu.so` is a bit annoying compared to just `my_ops.so`, especially since you have to reference the library name when loading it with `torch.ops.load_library` in Python. This PR fixes this by adding a `with_options` class method to the `torch.utils.cpp_extension.BuildExtension` which allows configuring the `BuildExtension`. In this case, the first option we add is `no_python_abi_suffix`, which we then use in `get_ext_filename` (override from `setuptools.build_ext`) to throw away the ABI suffix. I've added a test `setup.py` in a `no_python_abi_suffix_test` folder. Fixes https://github.com/pytorch/pytorch/issues/14188 t-vi fmassa soumith Pull Request resolved: https://github.com/pytorch/pytorch/pull/14130 Differential Revision: D13216575 Pulled By: goldsborough fbshipit-source-id: 67dc345c1278a1a4ee4ca907d848bc1fb4956cfa	2018-11-27 17:35:53 -08:00
Sam Gross	006505bb8f	Speed-up "advanced" indexing operations (#13420 ) Summary: This speeds-up "advanced" indexing (indexing a tensor by a tensor) on CPU and GPU. There's still a bunch of work to do, including speeding up indexing by a byte (boolean) mask and speeding up the derivative calculation for advanced indexing. Here's some speed comparisons to indexing on master using a little [benchmark script](https://gist.github.com/colesbury/c369db72aad594e5e032c8fda557d909) with 16 OpenMP threads and on a P100. The test cases are listed as (input shape -> output shape). \| Test case \| CPU (old vs. new) \| CUDA (old vs. new) \| \|-----------------------\|---------------------\|------------------------\| \| 1024x1024 -> 512x1024 \| 225 us vs. 57 us \| 297 us vs. 47 us \| \| 1024x1024 -> 1024x512 \| 208 us vs. 153 us \| 335 us vs. 54 us \| \| 50x50 -> 20000x50 \| 617 us vs. 77 us \| 239 us vs. 54 us \| \| 50x50 -> 50x20000 \| 575 us vs. 236 us \| 262 us vs. 58 us \| \| 2x5x10 -> 10 \| 65 us vs. 18 us \| 612 us vs. 93 us \| See #11647 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13420 Reviewed By: soumith Differential Revision: D13088936 Pulled By: colesbury fbshipit-source-id: 0a5c2ee9aa54e15f96d06692d1694c3b24b924e2	2018-11-27 15:23:59 -08:00
Johannes M Dieterich	ce48958606	enable more unit tests (#13166 ) Summary: This enables the distributions and utils test sets for ROCm. Individual tests are enabled that now pass due to fixes in HIP/HCC/libraries versions in white rabbit. For attention: bddppq ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/13166 Differential Revision: D12814759 Pulled By: bddppq fbshipit-source-id: ea70e775c707d7a8d2776fede6154a755adef43e	2018-11-12 18:49:52 -08:00
Peter Goldsborough	7978ba45ba	Update path in CI script to access ninja (#13646 ) Summary: We weren't running C++ extensions tests in CI. Also, let's error hard when `ninja` is not available instead of skipping C++ extensions tests. Fixes https://github.com/pytorch/pytorch/issues/13622 ezyang soumith yf225 Pull Request resolved: https://github.com/pytorch/pytorch/pull/13646 Differential Revision: D12961468 Pulled By: goldsborough fbshipit-source-id: 917c8a14063dc40e6ab79a0f7d345ae2d3566ba4	2018-11-07 14:31:29 -08:00
Pieter Noordhuis	be424de869	Add torch.multiprocessing.spawn helper (#13518 ) Summary: This helper addresses a common pattern where one spawns N processes to work on some common task (e.g. parallel preprocessing or multiple training loops). A straightforward approach is to use the multiprocessing API directly and then consecutively call join on the resulting processes. This pattern breaks down in the face of errors. If one of the processes terminates with an exception or via some signal, and it is not the first process that was launched, the join call on the first process won't be affected. This helper seeks to solve this by waiting on termination from any of the spawned processes. When any process terminates with a non-zero exit status, it terminates the remaining processes, and raises an exception in the parent process. If the process terminated with an exception, it is propagated to the parent. If the process terminated via a signal (e.g. SIGINT, SIGSEGV), this is mentioned in the exception as well. Requires Python >= 3.4. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13518 Reviewed By: orionr Differential Revision: D12929045 Pulled By: pietern fbshipit-source-id: 00df19fa16a568d1e22f37a2ba65677ab0cce3fd	2018-11-06 14:08:37 -08:00
Tongzhou Wang	6d2b3cc869	Fix pytest, make it work with run_test.py (#13416 ) Summary: Fixes #13326 Also now you can use `run_test.py` with `pytest`. E.g., ``` python run_test.py -vci distributed -pt ``` Yes it works with `distributed` and `cpp_extension`. cc zou3519 vishwakftw Pull Request resolved: https://github.com/pytorch/pytorch/pull/13416 Differential Revision: D12895622 Pulled By: SsnL fbshipit-source-id: 2d18106f3a118d642a666bfb1318f41c859c3df7	2018-11-01 19:08:06 -07:00
verhoek	33b00bdbb8	cwd arg in shell function of run_test set to optional (#13247 ) Summary: Tiny fix. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13247 Differential Revision: D12830311 Pulled By: soumith fbshipit-source-id: 405620e3a1de5bfc7e039f9aaf2f7cb7a3bca1b1	2018-10-29 15:17:00 -07:00
Jesse Hellemn	448a32e0ee	Adding timestamps to the beginning of every test file in run_test Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12994 Reviewed By: anderspapitto Differential Revision: D10515291 Pulled By: pjh5 fbshipit-source-id: 191054cdacff308b63e9063d22d62314398e4f88	2018-10-24 11:42:31 -07:00
Edward Yang	bc1d96ca98	Add support for inline expect tests. (#12825 ) Summary: expecttest and test_expecttest are the implementation and tests for this functionality. I wired it up to the --accept flag, but there's also a new environment variable EXPECTTEST_ACCEPT which may be more convenient to trigger. Haven't tested if this works in fbcode. There may be a few expect tests which will benefit from inline treatment, but I just did one to show it works. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/12825 Reviewed By: teng-li Differential Revision: D10448630 Pulled By: ezyang fbshipit-source-id: 3d339f82e2d00891309620a60e13039fa1ed8b46	2018-10-22 19:29:04 -07:00
James Sun	f4944f0f8a	Rename test/common.py to test/common_utils.py (#12794 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12794 common.py is used in base_module for almost all tests in test/. The name of this file is so common that can easily conflict with other dependencies if they happen to have another common.py in the base module. Rename the file to avoid conflict. Reviewed By: orionr Differential Revision: D10438204 fbshipit-source-id: 6a996c14980722330be0a9fd3a54c20af4b3d380	2018-10-17 23:04:29 -07:00
Benoit Steiner	bbe6ef3864	torch.finfo and torch.iinfo to mimic the numpy equivalent (#12472 ) Summary: This pull request intends to provide the functionality requested in https://github.com/pytorch/pytorch/issues/10742 by adding a new torch.finfo and torch.iinfo API. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12472 Differential Revision: D10250829 Pulled By: benoitsteiner fbshipit-source-id: eb22ca55d5b0064bef381fa7f1eb75989977df30	2018-10-15 13:43:52 -07:00
Alex Ford	7a1b668283	Implement Tensor.__cuda_array_interface__. (#11984 ) Summary: _Implements pytorch/pytorch#11914, cc: ezyang_ Implements `__cuda_array_interface__` for non-sparse cuda tensors, providing compatibility with numba (and other cuda projects...). Adds `numba` installation to the `xenial-cuda9` jenkins test environments via direct installation in `.jenkins/pytorch/test.sh` and numba-oriented test suite in `test/test_numba_integration.py`. See interface reference at: https://numba.pydata.org/numba-doc/latest/cuda/cuda_array_interface.html Pull Request resolved: https://github.com/pytorch/pytorch/pull/11984 Differential Revision: D10361430 Pulled By: ezyang fbshipit-source-id: 6e7742a7ae4e8d5f534afd794ab6f54f67808b63	2018-10-12 13:41:05 -07:00
Christian Puhrsch	d8f6be686d	Remove torch/legacy (#11823 ) Summary: Largely unused and hinders current development Pull Request resolved: https://github.com/pytorch/pytorch/pull/11823 Differential Revision: D9925094 Pulled By: cpuhrsch fbshipit-source-id: c797f62180e2128f9a567b0c57c8347957470ea5	2018-09-20 14:00:54 -07:00
Gregory Chanan	85ff72348d	Only involve tensor device in CUDA -> CPU copy, not current device. (#11592 ) Summary: This also unifies the device usage between the async and sync case. Fixes https://github.com/pytorch/pytorch/issues/10832. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11592 Differential Revision: D9797355 Pulled By: gchanan fbshipit-source-id: e496cd371111cfaf9a6c664167967b395e3d72e9	2018-09-13 16:32:46 -07:00
Teng Li	0988bbad2d	C10d release to torch.distributed for PT1 (#11405 ) Summary: The old `torch.distributed` will go to `torch.distributed.deprecated` The old DDP will go to `torch.nn.parallel.deprecated` Now `torch.nn.parallel.DDP` will use c10d DDP Now `torch.distributed` will use C10d frontend API Pull Request resolved: https://github.com/pytorch/pytorch/pull/11405 Reviewed By: pietern Differential Revision: D9733733 Pulled By: teng-li fbshipit-source-id: d6a3f3e73f8d3a7fcb1f4baef53c78063b8cbb08	2018-09-10 23:27:22 -07:00
Tongzhou Wang	0d5e4a2c66	Allow passing through arguments to unittest (#11209 ) Summary: Example: ```sh python run_test.py -i sparse -- TestSparse.test_factory_size_check -f ``` With this, the `--verbose` option is redundant (one can call `python run_test.py -- -v` instead of `python run_test.py -v`. But since this is (probably) a frequently used flag, I didn't remove the existing easier-to-use option. cc ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/11209 Differential Revision: D9632215 Pulled By: SsnL fbshipit-source-id: ff522802da11ef0a0714578be46e4a44f6343d44	2018-09-03 20:09:08 -07:00
iotamudelta	33c7cc13ca	improve docker packages, fix bugs, enable tests, enable FFT (#10893 ) Summary: * improve docker packages (install OpenBLAS to have at-compile-time LAPACK functionality w/ optimizations for both Intel and AMD CPUs) * integrate rocFFT (i.e., enable Fourier functionality) * fix bugs in ROCm caused by wrong warp size * enable more test sets, skip the tests that don't work on ROCm yet * don't disable asserts any longer in hipification * small improvements Pull Request resolved: https://github.com/pytorch/pytorch/pull/10893 Differential Revision: D9615053 Pulled By: ezyang fbshipit-source-id: 864b4d27bf089421f7dfd8065e5017f9ea2f7b3b	2018-09-02 08:54:42 -07:00
Teng Li	56539f5fe1	PT1 Distributed Release MileStone No.1 - Completed Distributed Package and CI tests (#10871 ) Summary: The PR includes: (1) torch.distributed.c10d, which now includes the complete backward compatible frontend API for `torch.distributed` (2) `env://` init method functionality (3) Minor change to `test_distributed.py`, which is now a test for `torch.distributed.c10d`. (4) The old `test_distributed.py' is now moved to `test_distributed_thd` (5) Miscellaneous bug fixes. (6) DDP CPU test is removed since c10d doesn't have this support yet, but this is a very easy test after moving DDP CPU's dependency to torch.distributed.c10d. (7) CI config to test MPI, NCCL, and Gloo backend of c10d Now all the distributed test including c10d DDP can pass with the c10d frontend API TODO: (in a separate PR) MPI subgroup support, once this is added, CI group test will be enabled. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10871 Differential Revision: D9554514 Pulled By: teng-li fbshipit-source-id: fb686ad42258526c8b4372148e82969fac4f42dd	2018-08-29 12:55:57 -07:00
Teng Li	a88463cd9a	Working async version of AllGather, test fix and compiler warnings, and CI (#10932 ) Summary: The previous NCCL all gather doesn't work as expected. This is a fully working async version. Tested on both C++ and Python Frontend. Multi-node: ``` tengli@learnfair042:~/new_pytorch/pytorch/torch/lib/build/c10d/test$ TMPFILE="/private/home/tengli/temp/tengli-test" RANK=0 WORLD_SIZE=2 ./ProcessGroupNCCLTest Multi-node world size: 2 rank: 0 Allreduce test successful Broadcast test successful Reduce test successful Allgather test successful tengli@learnfair117:~/new_pytorch/pytorch/torch/lib/build/c10d/test$ TMPFILE="/private/home/tengli/temp/tengli-test" RANK=1 WORLD_SIZE=2 ./ProcessGroupNCCLTest Multi-node world size: 2 rank: 1 Allreduce test successful Broadcast test successful Reduce test successful Allgather test successful ``` CI test: ``` test_set_get (__main__.FileStoreTest) ... ok test_set_get (__main__.PrefixFileStoreTest) ... ok test_set_get (__main__.PrefixTCPStoreTest) ... ok test_allreduce_ops (__main__.ProcessGroupGlooTest) ... ok test_broadcast_ops (__main__.ProcessGroupGlooTest) ... ok test_allgather_ops (__main__.ProcessGroupNCCLTest) ... ok test_allreduce_ops (__main__.ProcessGroupNCCLTest) ... ok test_broadcast_ops (__main__.ProcessGroupNCCLTest) ... ok test_reduce_ops (__main__.ProcessGroupNCCLTest) ... ok test_common_errors (__main__.RendezvousFileTest) ... ok test_nominal (__main__.RendezvousFileTest) ... ok test_common_errors (__main__.RendezvousTCPTest) ... ok test_nominal (__main__.RendezvousTCPTest) ... ok test_unknown_handler (__main__.RendezvousTest) ... ok test_set_get (__main__.TCPStoreTest) ... ok ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/10932 Differential Revision: D9542067 Pulled By: teng-li fbshipit-source-id: 25513eddcc3119fd736875d69dfb631b10f4ac86	2018-08-28 12:40:14 -07:00
Johannes M Dieterich	a4c59a9dab	MIOpen integration, more tests enabled, bug fixes (#10612 ) Summary: * first integration of MIOpen for batch norm and conv on ROCm * workaround a ROCm compiler bug exposed by elementwise_kernel through explicit capture of variables in the densest packing * workaround a ROCm compiler bug exposed by having `extern "C" __host__` as a definition and just `__host__` in the implementation through the hipify script * use fabs() in accordance with C++11 for double absolute, not ::abs() which is integer-only on ROCm * enable test_sparse set on CI, skip tests that don't work currently on ROCm * enable more tests in test_optim after the elementwise_bug got fixed * enable more tests in test_dataloader * improvements to hipification and ROCm build With this, resnet18 on CIFAR data trains without hang or crash in our tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10612 Reviewed By: bddppq Differential Revision: D9423872 Pulled By: ezyang fbshipit-source-id: 22c0c985217d65c593f35762b3eb16969ad96bdd	2018-08-23 15:24:47 -07:00
iotamudelta	75651d5b58	improve use of ROCm libraries, enable more tests, small fixes (#10406 ) Summary: * some small leftovers from the last PR review * enable more unit test sets for CI * replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND) * use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2 * use strided_batched gemm interface also from the batched internal interface * re-enable Dropout.cu as we now have philox w/ rocRAND Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406 Reviewed By: Jorghi12 Differential Revision: D9277093 Pulled By: ezyang fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2	2018-08-13 11:39:43 -07:00
iotamudelta	a38b572de3	enable unit tests and other changes (#10266 ) Summary: This PR for the ROCm target does the following: * enable some unit tests on ROCm * fix a missing static_cast that breaks BatchNorm call on ROCm * fix BatchNorm to work on ROCm w/ ROCm warp sizes etc * improve the pyhipify script by introducing kernel scope to some transpilations and other improvements * fix a linking issue on ROCm * for more unit test sets: mark currently broken tests broken (to be fixed) * enable THINLTO (phase one) to parallelize linking * address the first failing of the elementwise kernel by removing non-working ROCm specialization Pull Request resolved: https://github.com/pytorch/pytorch/pull/10266 Differential Revision: D9184178 Pulled By: ezyang fbshipit-source-id: 03bcd1fe4ca4dd3241f09634dbd42b6a4c350297	2018-08-06 14:54:01 -07:00
Pieter Noordhuis	695d40efc2	Create initial Python bindings for c10d (#8119 ) * Build and install c10d from tools/build_pytorch_libs.sh * Create initial Python bindings for c10d * clang-format * Switch link order to include more symbols * Add bindings and tests for ProcessGroupGloo * Add broadcast test * Separate build flag for c10d * Explicit PIC property * Skip c10d tests if not available * Remove c10d from Windows blacklist Let it skip by itself because it won't be available anyway. * Make lint happy * Comments * Move c10d module into torch.distributed * Close tempfile such that it is deleted	2018-06-08 12:59:51 -07:00
Tongzhou Wang	85ee94b7be	Add memory leak check in CUDA tests (#7270 ) * Add memory leak check in CUDA tests * Tracking multi-GPU too * fix run_test.py not running __name__ == '__main__' content; add test for make_cuda_memory_checked_test * add a comment * skip if cuda * 1. Change the wrapper to a method in common.py:TestCase 2. Refactor common constants/method that initialize CUDA context into common_cuda.py 3. Update some test files to use TEST_CUDA and TEST_MULTIGPU * Fix MaxUnpool3d forward memory leak * Fix MultiLabelMarginCriterion forward memory leak * Fix MultiMarginLoss backward memory leak * default doCUDAMemoryCheck to False * make the wrapper skip-able * use TEST_MULTIGPU * add align_corners=True/False tests for Upsample; fix TEST_CUDNN * finalize interface * VolumetricMaxUnpooling_updateOutput * fix test_nccl * rename THC caching allocator methods to be clearer * make the wrapped function a method * address comments; revert changes to aten/src/THC/THCCachingAllocator.cpp * fix renamed var	2018-05-31 15:09:54 -04:00
Francisco Massa	b240cc9b87	Add support for dotted names in CPP Extensions (#6986 ) * Add support for dotted names in CPP Extensions * Modify tests for cpp extensions Test that dotted names work * Py2 fixes * Make run_test cpp_extensions Win-compatible	2018-04-29 18:10:03 +02:00
Simeon Monov	dc94182db0	Check for --noprefix option for mpiexec in run_test.py (#6690 ) * Check for --noprefix option for mpiexec --noprefix option to mpiexec is not part of the MPI standard. It is needed in certain configurations when using OpenMPI but not supported with other MPI implementations such as MPICH and maybe others. This commit adds a check if the option is supported by the current mpiexec. Also this commit fixes Issue #4965 and MPI tests can be enabled in the CI. Fixes: #4965 * Update run_test.py	2018-04-17 23:34:33 -04:00
xhzhao	f2c9975378	Add DistributedDataParallelCPU (#5919 )	2018-04-17 15:36:47 +02:00
Simeon Monov	24b4931462	Improve run_test.py to support running individual test classes and methods (#6344 ) * Improve run_test.py to support running individual test classes and methods Added support in run_test.py for running individual test classes and methods. The -i/--include option can specify a list of test modules, classes or methods like this: python run_test.py -i autograd torch.TestTorch.test_abs \ torch.TestTorch.test_add utils.TestBottleneck -f, -l and -x behaviour stays the same as before * Fixed some code formatting * Multiple fixes according to the reviews in #6344	2018-04-16 14:33:50 -04:00
peterjc123	d45f3d0d5c	Skip cpp_extensions test when possible on Windows (#6423 )	2018-04-12 12:12:39 +02:00
Peter Goldsborough	6f10978e7b	Skip C++ extensions test when ninja is not available (#6480 )	2018-04-10 14:50:24 -07:00
Peter Goldsborough	c3f7e5ff55	Install signal handler for SIGCHLD in run_test.py (#6436 ) Handle exit signal in run_test.py	2018-04-10 11:31:23 -07:00
peterjc123	63af898d46	Fix extension test on Windows (#5548 ) * Change cpp_extensions.py to make it work on Windows * Fix linting * Show python paths * Debug * Debug 1 * set PYTHONPATH * Add ATen into library * expose essential libs and functions, and copy _C.lib * Specify dir in header * Update check_abi for MSVC * Activate cl environment to compile cpp extensions * change version string * Redirect stderr to stdout * Add monkey patch for windows * Remove unnecessary self * Fix various issues * Append necessary flags * add /MD flag to cuda * Install ninja * Use THP_API instead of THP_CLASS * Beautify the paths * Revert "Use THP_API instead of THP_CLASS" This reverts commit dd7e74c44db48e4c5f85bb8e3c698ff9de71ba2d. * Use THP_API instead of THP_CLASS(new)	2018-04-02 13:53:25 -04:00
Edward Z. Yang	2ad972c9eb	A complete revamp of our test scripts. (#5904 ) - All of the scripts are based off of the idea that they should be as simple as possible, and all the heavy lifting done in the construction of the Docker file. The scripts are really simple now. A bigger philosophical discussion can be found in .jenkins/README.md - build-asan.sh is split out of build.sh, as ASAN builds are a bit specialized and it's inappropriate to run many of the other builds as part of them. - We now build and run with mkl/mkl-include on the CPU only builds - We now report sccache and ccache stats at the end of all builds. - run_test.py flushes stdout/stderr before making a subprocess call, which should solve our interleaving problems. Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2018-03-22 16:31:50 -04:00
Peter Goldsborough	4613eef69e	Simplify run_test.py and dont use shell=True (#5767 ) * Simplify run_test.py and dont use shell=True * Fix non-shell output for check_output and always print to stderr * Use shlex.split instead of str.split * s/log/print_to_stderr * with_init -> with_init_file * Remove bufsize argument	2018-03-15 01:12:51 -04:00
Edward Z. Yang	3f3b686056	Refactor run_test.py to pass all options, not just verbose. (#5760 ) I need this because run_test is going to need to read other options than just verbose when I implement JUnit XML dumping. (JUnit XML dumping cannot be implemented solely by frobbing --python because the XML file to dump to must vary based on the test name.) Signed-off-by: Edward Z. Yang <ezyang@fb.com>	2018-03-14 07:44:58 -04:00
Edward Z. Yang	cadeb0cb17	Revert "ATen ReduceOps (#5481 )" (#5765 ) * Revert "ATen ReduceOps (#5481)" This reverts commit `310c3735b9`. * Revert "Check that new cpuinfo and tbb submodules exist (#5714)" This reverts commit `1a23c9901d`.	2018-03-13 23:50:16 -04:00
Peter Goldsborough	16fa12214d	raise RuntimeError on test failure (#5754 )	2018-03-13 18:53:43 -04:00
cpuhrsch	310c3735b9	ATen ReduceOps (#5481 ) This diff adds vectorization to ATen. It uses intel intrinsics to build a general vec256 class, that represents types of 256bit width. These can then be treated like regular variables. Using those it implements torch.sum() for the contiguous case. It uses Intel TBB for multithreading, which allows workstealing and chunks the reduction operations based on a experimentally chosen value (_THRESHOLD). It uses cpuinfo to pick the right code depending on the host's capabilities. The kernels are implemented under native/cpu. Each .cpp file is compiled with -avx, -avx2 and no additional flags. A macro is used to append AVX, AVX2 or NONE to the function name. The header then needs to define the functions three times, one for each capability. This could be improved by either changing the cmake file a bit or possibly generating source code using a Python script etc. For the non-contiguous case this defaults to the current implementation within TH. For CUDA is entirely defaults to the implementation within THC. There probably needs to be a bit of a debate around the design decisions here, the additional dependencies, parallelization strategy, clarity, etc. The numerical results also diverge from numpy with larger tensors, which is expected since we're summing, for example, 8 numbers and then adding the result to the running sum, instead of each number one by one. But there might be something to be said about accumulating into a double for floats or the degree of divergence, the behavior with respect to CUDA, etc. I wrote a [small Python script]( https://github.com/cpuhrsch/benchmark/blob/sumall/benchmarks/sum_bench.py) to compare the results with numpy numerically as well as on timing. I ran this script to create timings both on master and this branch. Here is the command for 1 core `OMP_NUM_THREAD=1 taskset -c 0 python sum_bench.py --enable_numpy 200` Here is the command for all cores `python sum_bench.py --enable_numpy 200` Here are the results of each: [Master, 1 core](https://paste.fedoraproject.org/paste/Nho9JzHpPVK9av8a6mByjQ) [This branch, 1 core](https://paste.fedoraproject.org/paste/6xLHkYvcVJx9z~5MoHxN4w) [Master, all cores](https://paste.fedoraproject.org/paste/5l3V1d5zGqvJcMXIUteMRw) [This branch, all cores](https://paste.fedoraproject.org/paste/J4RuDU-0Drz0aZwtphQwEA) To test the command is `python sum_bench.py --test 200` [This branch, test results](https://paste.fedoraproject.org/paste/kTEoUC~oWgXA6XWMAfNfNw) For this test we look at the average absolute value of the differences. This does not take into account the relative magnitude of the numbers. The numbers are sampled from a standard normal distribution. In terms of performance this diff should bring PyTorch on par with Numpy and usually exceed it by 1.5 to 2x.	2018-03-12 15:19:12 -04:00
Peter Goldsborough	6404904d8a	Fix run_test.py (#5693 )	2018-03-10 19:16:40 -05:00
Peter Goldsborough	53876c4606	Rewrite run_test.sh in Python (#5615 )	2018-03-09 22:02:02 +01:00

... 12 13 14 15 16

786 Commits