pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Aaron Gokaslan	12e95aa4ee	[BE]: Apply PERF401 autofixes from ruff (#140980 ) * Automatically applies ruff rule 401. Turns loops into equivalent list comprehensions which are faster and do not leak the scope of the loop variables. * list comprehensions not only often have better typing, but are 50+% faster than for loops on overhead. They also preserve length information etc and are better for the interpreter to optimize. * Manually went back and made mypy happy after the change. * Also fixed style lints in files covered by flake8 but not by pyfmt Pull Request resolved: https://github.com/pytorch/pytorch/pull/140980 Approved by: https://github.com/justinchuby, https://github.com/malfet	2024-11-20 17:52:07 +00:00
Xuehai Pan	b5c006acac	[BE][Easy] enable UFMT for `torch/nn/` (#128865 ) Part of #123062 - #123062 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128865 Approved by: https://github.com/ezyang	2024-07-25 02:48:42 +00:00
Xuehai Pan	973037be6a	[BE][Easy] apply autofix for ruff rules unnecessary-collection-call (C408): `list()` / `tuple()` / `dict()` (#130199 ) This PR changes the empty collection factory call to Python literals: - `list()` -> `[]` - `tuple()` -> `()` - `dict()` -> `{}` The Python literals are more performant and safer. For example, the bytecode for building an empty dictionary: ```bash $ python3 -m dis - <<EOS import collections d1 = {} d2 = dict() dict = collections.OrderedDict d3 = dict() EOS ``` ```text 0 0 RESUME 0 1 2 LOAD_CONST 0 (0) 4 LOAD_CONST 1 (None) 6 IMPORT_NAME 0 (collections) 8 STORE_NAME 0 (collections) 3 10 BUILD_MAP 0 12 STORE_NAME 1 (d1) 4 14 PUSH_NULL 16 LOAD_NAME 2 (dict) 18 CALL 0 26 STORE_NAME 3 (d2) 6 28 LOAD_NAME 0 (collections) 30 LOAD_ATTR 8 (OrderedDict) 50 STORE_NAME 2 (dict) 7 52 PUSH_NULL 54 LOAD_NAME 2 (dict) 56 CALL 0 64 STORE_NAME 5 (d3) 66 RETURN_CONST 1 (None) ``` The dict literal `{}` only has one bytecode `BUILD_MAP`, while the factory call `dict()` has three `PUSH_NULL + LOAD_NAME + CALL`. Also, the factory call is not safe if users override the `dict` name in `locals` or `globals` (see the example of replacing with `OrderedDict` above). Pull Request resolved: https://github.com/pytorch/pytorch/pull/130199 Approved by: https://github.com/malfet	2024-07-11 17:30:28 +00:00
Xuehai Pan	dff6342a0b	[BE][Easy] enable UFMT for `torch/nn/parallel` (#128596 ) Part of #123062 - #123062 Pull Request resolved: https://github.com/pytorch/pytorch/pull/128596 Approved by: https://github.com/mikaylagawarecki	2024-06-17 16:29:22 +00:00
markstur	5540d276ce	Fix docstring errors in container.py, _functions.py, transformer.py, comm.py, parallel_apply.py, data_parallel.py, scatter_gather.py (#113250 ) Fix docstring errors in container.py, _functions.py, transformer.py, comm.py, parallel_apply.py, data_parallel.py, scatter_gather.py Fixes #112603 Pull Request resolved: https://github.com/pytorch/pytorch/pull/113250 Approved by: https://github.com/mikaylagawarecki	2023-11-10 21:07:25 +00:00
feifan	bd07f8d2e0	DDP forward support custom stream accelerated copy. (#98723 ) At present, DDP forward uses `_get_stream` to get a stream,which is cudaStream. If the custom module already registered to torch, I can use `getattr` to get it and it's stream. Then, the custom stream is used to copy the tensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98723 Approved by: https://github.com/ezyang	2023-04-14 20:19:56 +00:00
AJ San Joaquin	578f0a04c7	fix torch.nn.parallel.scatter_gather.gather to handle NamedTuples and handle moving output to CPU (#51104 ) Summary: Fixes #{[50510](https://github.com/pytorch/pytorch/issues/50510)} Allows ```torch.nn.parallel.scatter_gather.gather``` to accept a list of NamedTuples as input and returns a NamedTuple whose elements are tensors. I added the author's fix using the ```is_namedtuple``` function. While testing this fix, I encountered a deprecation warning instructing me to use ```'cpu'``` instead of ```-1``` to move the outputs to the CPU. However, doing this causes an assertion error in the ```_get_device_index``` function. I solved this by handling the CPU case in the affected ```forward``` function. rohan-varma Pull Request resolved: https://github.com/pytorch/pytorch/pull/51104 Reviewed By: albanD Differential Revision: D26395578 Pulled By: rohan-varma fbshipit-source-id: 6e98c9ce1d9f1725973c18d24a6554c1bceae465	2021-02-11 15:50:28 -08:00
Guilherme Leobas	27f0dd36d9	add type annotations to torch.nn.parallel._functions (#49687 ) Summary: Closes gh-49686 Pull Request resolved: https://github.com/pytorch/pytorch/pull/49687 Reviewed By: ngimel Differential Revision: D25680210 Pulled By: zou3519 fbshipit-source-id: 221f7c9a4d3a6213eac6983030b0be51ee1c5b60	2020-12-22 16:56:16 -08:00
Alexander Grund	93719440b8	Replace map(lambda constructs (#46462 ) Summary: Follow-up of https://github.com/pytorch/pytorch/issues/46461 with a similar goal Makes them more readable and possibly faster. Care has to be taken because `map` applies the function immediately while `(x for x in xs)` is a generator expression which gets evaluated later. This is a benefit in some cases where it is not required to actually create the list of values in memory (e.g. when passing to `tuple` or `extend` or `join`) Pull Request resolved: https://github.com/pytorch/pytorch/pull/46462 Reviewed By: zou3519 Differential Revision: D24422343 Pulled By: ezyang fbshipit-source-id: 252e33499c92ac0b15238f2df32681dbbda2b237	2020-10-22 09:50:22 -07:00
Alexander Grund	5b0f400488	Replace list(map(...)) constructs by list comprehensions (#46461 ) Summary: As discussed in https://github.com/pytorch/pytorch/issues/46392 this makes the code more readable and possibly more performant. It also fixes a bug detected by this where the argument order of `map` was confused: `030a24906e (diff-5bb26bd3a23ee3bb540aeadcc0385df2a4e48de39f87ed9ea76b21990738fe98L1537-R1537)` Fixes https://github.com/pytorch/pytorch/issues/46392 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46461 Reviewed By: ailzhang Differential Revision: D24367015 Pulled By: ezyang fbshipit-source-id: d55a67933cc22346b00544c9671f09982ad920e7	2020-10-19 18:42:49 -07:00
chengjun	8d570bc708	Decouple DataParallel/DistributedDataParallel from CUDA (#38454 ) Summary: Decouple DataParallel/DistributedDataParallel from CUDA to support more device types. - Move torch/cuda/comm.py to torch/nn/parallel/comm.py with minor changes for common devices support. Torch.cuda.comm is kept as is for backward compatibility - Provide common APIs to arbitrary device types without changing existing CUDA APIs in torch.cuda space. - Replace the torch.cuda calls in DataParellel/DistributedDataParallel with the new APIs. Related RFC: [https://github.com/pytorch/pytorch/issues/36160](https://github.com/pytorch/pytorch/issues/36160) Pull Request resolved: https://github.com/pytorch/pytorch/pull/38454 Differential Revision: D22051557 Pulled By: mrshenli fbshipit-source-id: 7842dad0e5d3ca0f6fb760bda49182dcf6653af8	2020-07-07 12:48:16 -07:00
Wei Yang	54107ae8cf	convert output_device at data_parallel from torch.device to index (#10189 ) Summary: - fixes #9984 Pull Request resolved: https://github.com/pytorch/pytorch/pull/10189 Differential Revision: D9545390 Pulled By: weiyangfb fbshipit-source-id: 3a6a705437553ba319e9fd4b7f676ff73857a27e	2018-09-11 20:27:07 -07:00
Peter Goldsborough	b770156a7a	Functional DataParallel (#9234 ) Summary: This PR adds the functional version of `DataParallel` (i.e. `data_parallel`) to the C++ frontend. For this, I had to: 1. Add "differentiable" versions of scatter and gather, which perform their inverse operation in the backward pass, to C++. I've added them under `torch/csrc/autograd/functions/comm.{h,cpp}`. I had to move some utilities from `VariableType.cpp` into `torch/csrc/autograd/functions/utils.h`, and changed them a bit to fix the `const_cast`s for which there were `TODO`s, 2. Implement the `replicate`, `parallel_apply` and the combining `data_parallel` functions in C++. `replicate` is implemented based on our existing `clone()` interface, along with the ability to set the current device via `at::OptionsGuard` (so nice). `parallel_apply` is implemented using `at::parallel_for` (CC cpuhrsch) and [follows the code from PyTorch](https://github.com/pytorch/pytorch/blob/master/torch/nn/parallel/parallel_apply.py). Added lots of tests for these things. apaszke ezyang ebetica colesbury Pull Request resolved: https://github.com/pytorch/pytorch/pull/9234 Differential Revision: D8865182 Pulled By: goldsborough fbshipit-source-id: 4f1fecf2b3f3bc1540c071dfb2d23dd45de433e4	2018-07-19 16:12:04 -07:00
Tongzhou Wang	c6a923f486	Support modules that output scalar in Gather (and data parallel) (#7973 ) * Support modules that output scalar in Gather (and data parallel) * Improve warning msg	2018-06-01 16:20:39 -04:00
Sam Gross	509aed6ca3	More Variable/Tensor clean-ups (#5464 )	2018-02-28 16:46:47 -05:00
Sam Gross	895aebac08	Use Variable instead of Tensor in Function.forward (#4786 ) The Tensor and Variable classes are being merged. autograd.Function.forward is now called on Variables, but with "no-grad" mode (torch.no_grad()) enabled. One benefit is that we no longer have to explicitly track shared storages.	2018-02-06 17:24:27 -05:00
Tongzhou Wang	805639906a	Broacast output requires_grad if only corresponding input requires_grad (#5061 )	2018-02-05 23:38:35 -05:00
gchanan	22ec5f37ca	Support double backwards with parallel nn autograd functions. (#2508 )	2017-08-22 03:57:45 -04:00
Adam Paszke	20aa5b066f	Convert some of the functions to new format Also, fix a lot of issues that appeared after the previous commits.	2017-05-01 16:44:56 -04:00
Sam Gross	e50a1f19b3	Use streams in scatter to overlap copy with compute	2017-03-14 22:46:07 +01:00
Sam Gross	65b66264d4	Improve broadcast/reduce performance by coalescing tensors	2017-03-06 12:47:53 -08:00
Luke Yeager	e7c1e6a8e3	[pep8] Fix most lint automatically with autopep8 Here's the command I used to invoke autopep8 (in parallel!): git ls-files \| grep '\.py$' \| xargs -n1 -P`nproc` autopep8 -i Several rules are ignored in setup.cfg. The goal is to let autopep8 handle everything which it can handle safely, and to disable any rules which are tricky or controversial to address. We may want to come back and re-enable some of these rules later, but I'm trying to make this patch as safe as possible. Also configures flake8 to match pep8's behavior. Also configures TravisCI to check the whole project for lint.	2017-01-28 01:15:51 +01:00
Adam Paszke	8d60e39fdc	Rename torch.nn.functions to torch.nn._functions	2016-12-30 23:02:57 +01:00

23 Commits