Commit Graph

179 Commits

Author SHA1 Message Date
PyTorch MergeBot
a8f36dd646 Revert "add amp support for custom backend (#96188)"
This reverts commit cf12edee02.

Reverted https://github.com/pytorch/pytorch/pull/96188 on behalf of https://github.com/kit1980 due to Broke some linalg tests : https://github.com/pytorch/pytorch/actions/runs/4420037607/jobs/7750708339
2023-03-15 00:03:19 +00:00
shibo
cf12edee02 add amp support for custom backend (#96188)
Fixes #ISSUE_NUMBER
1. Add amp support for custom backends.
2. Optimize the file `backend_registration.py` and rename it to `custom_backend_registration.py`. Other functions for custom backends can then be registered there.
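A hedged sketch of what amp support looks like from the user side: an autocast context is entered for a given device type, and eligible ops run in the lower precision. "cpu" stands in here for a custom backend's device type, since no out-of-tree backend is available in this sketch.

```python
import torch

# Enter autocast for a device type; with this PR a custom backend's
# device type could be used here instead of "cpu" (an assumption).
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    a = torch.randn(4, 4)
    b = torch.randn(4, 4)
    c = torch.mm(a, b)  # matmul runs in the lower autocast precision
```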

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96188
Approved by: https://github.com/bdhirsh
2023-03-14 20:43:21 +00:00
soulitzer
d30db9a251 Replace non-reentrant checkpoint with a rewrite that can be nested and contain grad (#90105)
Changes:
- bc-breaking change: The main difference from the old non-reentrant implementation it replaces is that we clear recomputed tensors on backward immediately upon unpack, even if retain_graph=True. This has the following additional implications:
   - Accessing _saved_tensors multiple times will silently recompute forward multiple times.
   - Accessing `ctx.saved_tensors` twice in the same backward will now raise an error.
- To avoid dealing with the potential consequences, early stopping has been hidden behind a global flag that is by default False, and can be enabled via a context manager. We can remove this in a follow up. Some features of nesting as a result do not work by default.
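The non-reentrant checkpoint being rewritten here can be exercised as follows: activations inside the checkpointed function are dropped after forward and recomputed during backward. A minimal sketch:

```python
import torch
from torch.utils.checkpoint import checkpoint

def fn(x):
    # intermediate activations here are not kept; they are recomputed
    # when backward unpacks the saved tensors
    return torch.relu(x).pow(2).sum()

x = torch.randn(8, requires_grad=True)
out = checkpoint(fn, x, use_reentrant=False)  # non-reentrant impl
out.backward()
```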

Before land:
- import to check for more bc-breakingness
- implement any workarounds for the bc-breaking-ness, if we decide on any
- update docs to reflect new lifetime of recomputed variables
- update docs to mention the early stop feature

Follow ups:
- enable early-stopping by default
- update docs/tutorial to feature nested use cases

Related docs:
  - code comment: https://github.com/pytorch/pytorch/pull/90105/files#diff-9dcd955620b52ce128e18e3567be88edbb238810460d1288a86fabc20e483b30R448
  - design doc: https://docs.google.com/document/d/1UDLhTNv6_kvuDTRlsjfj9WdqtNaQNr8ahrvdBIB6914/edit#
  - retains_grad <> checkpoint https://docs.google.com/document/d/1maiGmuFUxysQL0AdYUU88kngAaXh_L0XpDcLDh_5Ors/edit

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90105
Approved by: https://github.com/albanD
2023-03-14 20:38:36 +00:00
Xuehai Pan
046e88a291 [BE] [3/3] Rewrite super() calls in test (#94592)
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.

- #94587
- #94588
- #94592

Also, methods with only a `super()` call are removed:

```diff
class MyModule(nn.Module):
-   def __init__(self):
-       super().__init__()
-
    def forward(self, ...):
        ...
```

Cases where the rewrite would change the semantics are kept unchanged. E.g.:

f152a79be9/caffe2/python/net_printer.py (L184-L190)

f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94592
Approved by: https://github.com/ezyang, https://github.com/seemethere
2023-02-12 22:20:53 +00:00
Aaron Gokaslan
8fce9a09cd [BE]: pyupgrade Python to 3.8 - imports and object inheritance only (#94308)
Apply parts of pyupgrade to torch (starting with the safest changes).
This PR only does two things: it removes the need to inherit from `object` and removes unused `__future__` imports.
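The two mechanical rewrites can be sketched as follows (class name is illustrative, not from the PR):

```python
# before: class Config(object): ...
# after — Python 3 classes inherit from object implicitly:
class Config:
    pass

# pyupgrade also deletes now-unused lines such as:
# from __future__ import print_function
```

Both rewrites are behavior-preserving on Python 3, which is why they are the safest place to start.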

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94308
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-07 21:10:56 +00:00
albanD
0b2dc3b3ac [Py-3.11] Skip dynamo related tests (#94187)
The quantization test fails to import Dynamo as expected.
The traceback tool looks a lot trickier; opened https://github.com/pytorch/pytorch/issues/94189 to investigate further.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94187
Approved by: https://github.com/malfet
2023-02-07 16:40:55 +00:00
Edward Z. Yang
8b00c54425 Add utility report_compile_source_on_error (#91069)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91069
Approved by: https://github.com/soumith, https://github.com/albanD
2023-01-11 22:54:46 +00:00
Edward Z. Yang
333540a458 Reland "Add torch.utils.device_mode" (#91796)
Original PR https://github.com/pytorch/pytorch/pull/91525

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91796
Approved by: https://github.com/albanD
2023-01-09 20:57:12 +00:00
PyTorch MergeBot
9b415240d4 Revert "Reland "Add torch.utils.device_mode" (#91796)"
This reverts commit 81b5eff3c3.

Reverted https://github.com/pytorch/pytorch/pull/91796 on behalf of https://github.com/huydhn due to This breaks trunk with the following failed test https://hud.pytorch.org/failure/test_jit_save%2CTestTracer
2023-01-09 04:45:47 +00:00
Edward Z. Yang
81b5eff3c3 Reland "Add torch.utils.device_mode" (#91796)
Original PR https://github.com/pytorch/pytorch/pull/91525

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91796
Approved by: https://github.com/albanD
2023-01-08 03:44:56 +00:00
PyTorch MergeBot
f571ae4fdb Revert "Make torch.device usable as a context manager (#91525)"
This reverts commit 619d52a5d2.

Reverted https://github.com/pytorch/pytorch/pull/91525 on behalf of https://github.com/mehtanirav due to Internal breakages
2023-01-05 21:34:50 +00:00
Edward Z. Yang
619d52a5d2 Make torch.device usable as a context manager (#91525)
Fixes https://github.com/pytorch/pytorch/issues/82296
Fixes https://github.com/pytorch/pytorch/issues/27878
Fixes https://github.com/pytorch/pytorch/issues/260
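With this change, a `torch.device` works as a context manager that sets the default device for factory calls made inside the block:

```python
import torch

# factory functions called inside the block allocate on the given device
with torch.device("cpu"):
    t = torch.zeros(2, 3)
```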

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91525
Approved by: https://github.com/albanD
2023-01-04 01:32:00 +00:00
Huy Do
0417da2288 Set a timeout value when testing multiprocess DataLoader (#91476)
Set a timeout value when testing the multiprocess DataLoader to prevent ASAN jobs from timing out after 4 hours.

We are seeing multiple timeout issues running ASAN tests on HUD; see https://hud.pytorch.org/hud/pytorch/pytorch/master/1?per_page=50&name_filter=asan for examples:

* Without mem leak check enabled https://github.com/pytorch/pytorch/actions/runs/3794216079/jobs/6455118197
* With mem leak check https://github.com/pytorch/pytorch/actions/runs/3792743994/jobs/6449356306

Looking a bit closer into the test, the hanging happens when the multiprocess DataLoader is used in `test_utils`.  Here is a snapshot of those processes when I logged into the hung runner:

```
UID        PID  PPID  C STIME TTY          TIME CMD
jenkins      1     0  0 Dec28 pts/0    00:00:00 bash
jenkins      8     0  0 Dec28 pts/1    00:00:00 sh -c pip install dist/torch-2.0.0a0+git97db9fd-cp37-cp37m-linux_x86_64.whl[opt-einsum] && .jenkins/pytorch/test.sh
jenkins     20     8  0 Dec28 pts/1    00:00:00 /bin/bash .jenkins/pytorch/test.sh
jenkins    764    20  0 Dec28 pts/1    00:00:07 python test/run_test.py --exclude-jit-executor --exclude-distributed-tests --shard 5 5 --verbose
jenkins    788   764  0 Dec28 pts/1    00:00:00 /opt/conda/bin/python -c from multiprocessing.semaphore_tracker import main;main(6)
jenkins   3743   764  0 Dec28 pts/1    00:00:05 /opt/conda/bin/python -c from multiprocessing.spawn import spawn_main; spawn_main(tracker_fd=7, pipe_handle=11) --multiprocessing-fork
jenkins   3766  3743  0 Dec28 pts/1    00:00:06 /opt/conda/bin/python -bb test_utils.py -v --import-slow-tests --import-disabled-tests
jenkins   3878  3766  0 Dec28 pts/1    00:00:06 /opt/conda/bin/python -bb test_utils.py -v --import-slow-tests --import-disabled-tests
jenkins   3879  3766  0 Dec28 pts/1    00:00:00 /opt/conda/bin/python -bb test_utils.py -v --import-slow-tests --import-disabled-tests
jenkins   3880  3766  0 Dec28 pts/1    00:00:00 /opt/conda/bin/python -bb test_utils.py -v --import-slow-tests --import-disabled-tests
jenkins   3881  3766  0 Dec28 pts/1    00:00:00 /opt/conda/bin/python -bb test_utils.py -v --import-slow-tests --import-disabled-tests
jenkins   3893     0  0 01:45 pts/2    00:00:00 /bin/bash
jenkins   3904  3893  0 01:46 pts/2    00:00:00 ps -ef
```

The specific hanging test was `test_random_seed`, which spawned 4 subprocesses to load data.  After I killed one of them, the test could continue and printed the following stacktrace:

```
    test_random_seed (__main__.TestDataLoaderUtils) ... [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  ERROR (9345.840s)
    test_random_seed (__main__.TestDataLoaderUtils) ...     test_random_seed errored - num_retries_left: 3
  Traceback (most recent call last):
    File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1134, in _try_get_data
      data = self._data_queue.get(timeout=timeout)
    File "/opt/conda/lib/python3.7/multiprocessing/queues.py", line 104, in get
      if not self._poll(timeout):
    File "/opt/conda/lib/python3.7/multiprocessing/connection.py", line 257, in poll
      return self._poll(timeout)
    File "/opt/conda/lib/python3.7/multiprocessing/connection.py", line 414, in _poll
      r = wait([self], timeout)
    File "/opt/conda/lib/python3.7/multiprocessing/connection.py", line 921, in wait
      ready = selector.select(timeout)
    File "/opt/conda/lib/python3.7/selectors.py", line 415, in select
      fd_event_list = self._selector.poll(timeout)
    File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
      _error_if_any_worker_fails()
  RuntimeError: DataLoader worker (pid 3878) is killed by signal: Terminated.
  The above exception was the direct cause of the following exception:
  Traceback (most recent call last):
    File "test_utils.py", line 469, in test_random_seed
      x2 = run()
    File "test_utils.py", line 464, in run
      return next(iter(dataloader))
    File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 635, in __next__
      data = self._next_data()
    File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1330, in _next_data
      idx, data = self._get_data()
    File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1296, in _get_data
      success, data = self._try_get_data()
    File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1147, in _try_get_data
      raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
  RuntimeError: DataLoader worker (pid(s) 3878) exited unexpectedly
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  [W ParallelNative.cpp:230] Warning: Cannot set number of intraop threads after parallel work has started or after set_num_threads call when using native parallel backend (function set_num_threads)
  ok (0.137s)
```

This doesn't fix the underlying issue, which I'll need to follow up on to see why the workers hang.  However, it should allow the test to terminate gracefully and report errors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91476
Approved by: https://github.com/kit1980
2022-12-29 17:50:37 +00:00
mikey dagitses
3a1bdfee67 skip environment collection test in fbcode (#88744)
Summary: This runs pip, which we don't have in the fbcode environment.

Test Plan: Rely on CI.

Differential Revision: D41156589

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88744
Approved by: https://github.com/zou3519
2022-11-09 18:20:04 +00:00
soulitzer
c18eead2df Update saved variable hooks to no longer trigger on wrapped numbers (#87316)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87316
Approved by: https://github.com/ezyang, https://github.com/albanD
2022-10-20 03:01:11 +00:00
Rohan Varma
7a411952fb CheckpointSequential support non-reentrant (#86331)
Closes https://github.com/pytorch/pytorch/issues/86328

Adds `use_reentrant` argument to `checkpoint_sequential`.
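A hedged sketch of the new argument: checkpoint a `Sequential` in segments while selecting the non-reentrant implementation (layer sizes are illustrative).

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
x = torch.randn(3, 4, requires_grad=True)
# split the Sequential into 2 checkpointed segments, non-reentrant impl
out = checkpoint_sequential(model, 2, x, use_reentrant=False)
out.sum().backward()
```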

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86331
Approved by: https://github.com/zhaojuanmao, https://github.com/albanD
2022-10-06 23:10:18 +00:00
Zain Rizvi
a1a95d402d Fix inheritance in TestDataLoaderUtil (#85018)
TestDataLoaderUtils needs to run its parent class's setUp method to actually disable flaky tests (see https://github.com/pytorch/pytorch/issues/70516#issuecomment-1247045072 for details)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85018
Approved by: https://github.com/clee2000, https://github.com/huydhn
2022-09-14 22:04:43 +00:00
soulitzer
b18962552e Fix and unskip cpp extension tests for ARM (#83115)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83115
Approved by: https://github.com/albanD
2022-08-11 20:01:53 +00:00
albanD
7dd795cbed Prevent ref cycle creation in inner hook (#82776)
Towards fixing https://github.com/pytorch/pytorch/issues/82482

This PR fixes two things:

## 1) memory leak
The `.detach()` call prevents a true memory leak in some cases where the user function uses multiple ops in a row that save their inputs. The following chain of objects keeps each other alive:
- the `storage` object
- a recomputed Tensor y
- y's grad_fn FooBackward (in c++)
- FooBackward's SavedVariables (in c++)
- SavedVariable Hook
- the `inner_pack` function
- captures `storage`

Since part of this cycle is in c++, the python gc is not able to break it.
Should THPCppFunction_traverse actually visit its SavedVariables, which in turn should visit their hooks? I think the answer is yes, but I haven't dived into which python object is traversing what; if there is non-unique ownership of the c++ object, it makes the traversal a lot trickier. @ezyang do you think we should dive into this more?

In this case, this can be easily solved anyways by storing `y.detach()` in the `storage` object as we don't care about the temporary backward graph that gets created during the second forward call.
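A minimal illustration of why `.detach()` breaks the cycle: the detached copy carries no `grad_fn`, so storing it does not keep the recomputed backward graph (and the SavedVariables hanging off it) alive.

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x * 2            # y.grad_fn references SavedVariables
stored = y.detach()  # same data, no reference back into the autograd graph
```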

## 2) Lifetime of the recomputed buffers
The new storage system is now such that the lifetime of the recomputed buffer is directly linked to the SavedVariable c++ object. Meaning that this buffer will get deleted IIF the SavedVariable is cleared.
This means that we now get the exact same behavior as the version without the saved variable hook where Tensors are saved directly on the SavedVariable object.

This is great as this solves all the cases where the non-checkpoint version used to work but the checkpoint version does not (even double access or retain_graph=True).

The one drawback of this approach, though, is that the buffers do NOT get cleared when the user passes in `retain_graph=True`! The next backward won't even re-run the forward, as it already has all the buffers available. Is this a problem that you think we would need to find a solution for @rohan-varma, or is it niche enough that we don't care for now?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82776
Approved by: https://github.com/ezyang, https://github.com/rohan-varma
2022-08-06 00:31:22 +00:00
albanD
2255911f8a Make M1 tests green (#82213)
This skips all the failing tests and adds a new master job to test on M1.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82213
Approved by: https://github.com/seemethere, https://github.com/soulitzer, https://github.com/malfet
2022-08-05 16:12:08 +00:00
PyTorch MergeBot
ec4be38ba9 Revert "To add hipify_torch as a submodule in pytorch/third_party (#74704)"
This reverts commit 93b0fec39d.

Reverted https://github.com/pytorch/pytorch/pull/74704 on behalf of https://github.com/malfet due to broke torchvision
2022-06-21 23:54:00 +00:00
Bhavya Medishetty
93b0fec39d To add hipify_torch as a submodule in pytorch/third_party (#74704)
`hipify_torch` as a submodule in `pytorch/third_party`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74704
Approved by: https://github.com/jeffdaily, https://github.com/malfet
2022-06-21 18:56:49 +00:00
Kiarash Jamali
bc3c7a6cbd Fix issue with _checkpoint_without_reentrant
Fixes  #76737
I also added a test case for this bug.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76890
Approved by: https://github.com/albanD
2022-05-05 17:37:31 +00:00
Nikita Shulga
8473173c36 Remove breakpad dependency
This functionality does not seem to be used,
and there are some requests to update the dependency.

Add `third_party` to torch_cpu include directories if compiling with
Caffe2 support, as `caffe2/quantization/server/conv_dnnlowp_op.cc` depends on `third_party/fbgemm/src/RefImplementations.h`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75394
Approved by: https://github.com/janeyx99, https://github.com/seemethere
2022-05-03 20:21:55 +00:00
PyTorch MergeBot
d79d9fa283 Revert "Remove breakpad dependency"
This reverts commit 9aa3c7fd83.

Reverted https://github.com/pytorch/pytorch/pull/75394 on behalf of https://github.com/malfet
2022-04-17 17:58:51 +00:00
Nikita Shulga
9aa3c7fd83 Remove breakpad dependency
This functionality does not seem to be used,
and there are some requests to update the dependency.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75394
Approved by: https://github.com/janeyx99, https://github.com/seemethere
2022-04-17 17:43:45 +00:00
Nicolas Hug
d0387ad285 Move torchhub tests into separate test_hub.py file
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74826

Approved by: https://github.com/vmoens
2022-03-30 10:06:14 +00:00
Nicolas Hug
7df0d9fda4 Call super().setUp() and super().tearDown() in torchhub tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74621

Approved by: https://github.com/vmoens, https://github.com/janeyx99, https://github.com/cpuhrsch
2022-03-25 14:36:31 +00:00
Jane Xu
a1e284d9c8 Remove high priority as an owner for tests (#74555)
Summary:
Following triage review discussion, it would be best for these tests not to be triaged as high priority by automation, but by the triagers in the oncall.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74555

Reviewed By: albanD

Differential Revision: D35099202

Pulled By: janeyx99

fbshipit-source-id: 657a0317141de3a598476a6f601ec26cc26231b1
(cherry picked from commit 057519cb2494d0f9a0b169f359ac87ba9e89f088)
2022-03-24 14:29:52 +00:00
Lood
670e4d9808 set_dir expanding "~"
Fixes #69761.

Small change to torch.hub.set_dir() (<10 LOC).

It seems that before the code was split into `set_dir()` and `_get_torch_home`, an [earlier version](5164622ba4/torch/hub.py (L111)) of hub.py had an os.path.expanduser check.

Currently, [_get_torch_home](https://github.com/pytorch/pytorch/blob/master/torch/hub.py#L104) retains the os.path.expanduser check, but `set_dir()` doesn't have one. This PR fixes that (I hope).
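The restored check amounts to expanding a leading "~" before using the path as the hub directory; a minimal illustration (the path here is just an example):

```python
import os.path

raw = "~/.cache/torch/hub"
expanded = os.path.expanduser(raw)  # "~" becomes the user's home dir
```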

(As I mentioned in the issue, I can't run the tests on my laptop yet because of storage space :/ But I did include a test.)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69763
Approved by: https://github.com/malfet, https://github.com/NicolasHug
2022-03-23 20:38:14 +00:00
Nicolas Hug
08590b4159 Cosmetic changes to torchhub tests (#74431)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74431

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D35011898

Pulled By: NicolasHug

fbshipit-source-id: 37a42f843b0a3c781fa59254552a9b3af8678176
(cherry picked from commit aa4f83e126cb72cd846266af7ea77c70e2a9dc81)
2022-03-22 08:55:09 +00:00
Nicolas Hug
e0ecdb5cba Properly catch warning in torchhub tests (#74430)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74430

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D35011900

Pulled By: NicolasHug

fbshipit-source-id: 36753167d6ee737ee437d1cd7303e5cc8b5c286c
(cherry picked from commit d0fdf4af795bdf74c145260c82f976a53f1aaff5)
2022-03-22 08:55:09 +00:00
Nicolas Hug
bcc77c470b Cosmetic changes to torchhub tests (#74431)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/74431

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D35011832

Pulled By: NicolasHug

fbshipit-source-id: f76f92cf92b236ac8a2e2947001d219d0a7d5f14
(cherry picked from commit 3e142f8da9479eab356b3f38ace321cc9fde9bfc)
2022-03-22 08:55:09 +00:00
Alban Desmaison
734281c3d6 Cleanup all module references in doc (#73983)
Summary:
Working towards https://docs.google.com/document/d/10yx2-4gs0gTMOimVS403MnoAWkqitS8TUHX73PN8EjE/edit?pli=1#

This PR:
- Ensure that all the submodules are listed in a rst file (that ensure they are considered by the coverage tool)
- Remove some long deprecated code that just error out on import
- Remove the allow list altogether to ensure nothing gets added back there

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73983

Reviewed By: anjali411

Differential Revision: D34787908

Pulled By: albanD

fbshipit-source-id: 163ce61e133b12b2f2e1cbe374f979e3d6858db7
(cherry picked from commit c9edfead7a01dc45bfc24eaf7220d2a84ab1f62e)
2022-03-10 22:26:29 +00:00
Nikita Shulga
bede18b061 Add support for C++ frontend wrapper on Linux (#69094)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69094

Partially addresses https://github.com/pytorch/pytorch/issues/68768

Test Plan: Imported from OSS

Reviewed By: seemethere

Differential Revision: D32730079

Pulled By: malfet

fbshipit-source-id: 854e4215ff66e087bdf354fed7a17e87f2649c87
2021-12-02 16:47:00 -08:00
Michael Suo
5fd93fb5f8 broaden retries on TestHub (#67779)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67779

Not all flaky failures from this test are URLErrors; I think we should
err on the side of being expansive with retries here.

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D32145434

Pulled By: suo

fbshipit-source-id: 3c3274b2080681fcafb3ea6132e420605f65c429
2021-11-03 13:48:58 -07:00
Jane Xu
c19cda5782 [skip ci] Add test owners for a special hi-pri class of tests (#67553)
Summary:
Action following https://github.com/pytorch/pytorch/issues/66232

This change does require some context: there were several suggestions regarding what to do about this group of tests: tests that are core and crucial to all of PyTorch and are too broad to be owned by one team.
1. Let's add a "module: core" and put people behind it! This idea sounds appealing unless you are one of the people backing the label. From talking to albanD among others, this idea of putting all these core tests on the shoulders of a few people or one team isn't super fair, and I have not yet found anyone willing to take on this job.
2. Taking advantage of the fact that we already have a triaging oncall that takes turns triaging issues, we can leave these tests essentially unlabeled and allow the oncall to triage these tests. Since these tests are crucial to PyTorch, we'll add the "high priority" label to mark them different from other unowned tests (see https://github.com/pytorch/pytorch/issues/67552).
3. I _could_ still create an unbacked label "module: core" and attribute these tests there, but I don't like the idea of creating a facade that the tests are "triaged" to a label when no one is actually taking a look.

Now we could potentially break these tests down into smaller files so that each piece _could_ be owned by a team, but 1. I don't know if this is currently feasible and 2. This approach does not prevent that from happening in the future.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67553

Reviewed By: albanD

Differential Revision: D32025004

Pulled By: janeyx99

fbshipit-source-id: 1fb1aa4c27e305695ab6e80ae3d02f90519939c0
2021-10-29 12:17:21 -07:00
Jane Xu
68555339d7 test_utils.py: Add another retry to test_download_url_to_file (#66159)
Summary:
Fixes one of the flakiness concerns mentioned https://github.com/pytorch/pytorch/issues/65439#issuecomment-934686485

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66159

Reviewed By: ngimel

Differential Revision: D31406485

Pulled By: janeyx99

fbshipit-source-id: cf7834cdab58360ecef1748075d52969de2e0778
2021-10-05 16:26:20 -07:00
Nicolas Hug
0a3cf8886a Torchhub: More robust assumption regarding main or master branch (#64364)
Summary:
Closes https://github.com/pytorch/pytorch/issues/63753

This PR changes the assumption regarding the default branch of a repo to the following:

> If `main` exists then use `main`, otherwise use `master`.

This will make torchhub more robust w.r.t. the ongoing changes where repos use `main` instead of `master` as the development / default branch.
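The stated rule can be sketched as follows (the helper name is hypothetical, not torchhub's actual code):

```python
def pick_default_branch(branches):
    # prefer "main" if the repo has it, otherwise fall back to "master"
    return "main" if "main" in branches else "master"
```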

cc nairbv NicolasHug

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64364

Reviewed By: saketh-are

Differential Revision: D30731551

Pulled By: NicolasHug

fbshipit-source-id: 7232a30e956dcccca21933a29de5eddd711aa99b
2021-09-20 10:36:13 -07:00
Mike Ruberry
6596173811 Revert D30731191: [pytorch][PR] Torchhub: rewrite commit hash check to avoid using unnecessary GitHub API credits
Test Plan: revert-hammer

Differential Revision:
D30731191 (f9bf144a0c)

Original commit changeset: d1ee7c2ef259

fbshipit-source-id: 5c7207f66c5354ce7b9ac2594e4f5b8307619b0c
2021-09-17 14:33:00 -07:00
Nicolas Hug
f9bf144a0c Torchhub: rewrite commit hash check to avoid using unnecessary GitHub API credits (#64362)
Summary:
This PR adds more detailed error messages to torchhub if the commit hash validation goes wrong, providing suggestions to the users on how to resolve the issue.

It also documents why such validation is important.

EDIT: it also avoids validating some stuff when we know "stuff" isn't a commit, since there's no risk in this case

CC malfet mthrok

cc nairbv NicolasHug

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64362

Reviewed By: gchanan, malfet

Differential Revision: D30731191

Pulled By: NicolasHug

fbshipit-source-id: d1ee7c2ef2591dd7a5291977af1635ada2552d1b
2021-09-17 10:30:39 -07:00
Nicolas Hug
9157a2889f Pass GITHUB_TOKEN to linux CI jobs and avoid skipping torchhub tests (#64807)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64760

This should hopefully put the torchhub tests back.

This also avoids skipping the torchhub tests: currently the tests are skipped if they fail, which pretty much defeats the purpose of having a test in the first place since we're never notified when they do fail.

cc ezyang seemethere malfet lg20987 pytorch/pytorch-dev-infra nairbv NicolasHug

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64807

Reviewed By: seemethere

Differential Revision: D30994585

Pulled By: NicolasHug

fbshipit-source-id: 561782c22462b5cfec99cca153eb59623db5660a
2021-09-17 03:30:56 -07:00
driazati
bd8608cd5c Use CMake for breakpad (#63186)
Summary:
We currently build breakpad from [this fork](https://github.com/driazati/breakpad) to include extra logic to restore signal handlers that were previously present. With some [new additions](https://github.com/google/breakpad/compare/main...driazati:main) this fork now includes a CMake based build, so we can add breakpad as a proper dependency rather than relying on including it in Docker images as a system library, which is error prone (we have a bunch of images) and hard to extend to MacOS / Windows. This also includes some changes to the crash handling code to support MacOS / Windows in a similar way to Linux.

```python
import torch

# On Windows this writes crashes to C:\Users\<user>\AppData\pytorch_crashes
# On MacOS/Linux this writes crashes to /tmp/pytorch_crashes
torch.utils._crash_handler.enable_minidumps()

# Easy way to cause a segfault and trigger the handler
torch.bincount(input=torch.tensor([9223372036854775807]))
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63186

Reviewed By: malfet, seemethere

Differential Revision: D30318404

Pulled By: driazati

fbshipit-source-id: 0d7daf3701cfaba5451cc529a0730272ab1eb1dc
2021-08-19 10:42:01 -07:00
Shen Li
1022443168 Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: revert-hammer

Differential Revision:
D30279364 (b004307252)

Original commit changeset: c1ed77dfe43a

fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e
2021-08-12 11:45:01 -07:00
Zsolt Dollenstein
b004307252 [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: manual inspection & sandcastle

Reviewed By: zertosh

Differential Revision: D30279364

fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a
2021-08-12 10:58:35 -07:00
driazati
45cc207a88 Fix breakpad build + add test canary (#60990)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60990

This makes the breakpad build more explicit in its messaging and hints to cmake where to look for the library (it wasn't able to find it without `PATHS` on CI even though that works locally). This also adds a smoke test that will fail if breakpad isn't present on a CI job where it is expected (e.g. binary builds).

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D29514316

Pulled By: driazati

fbshipit-source-id: 79514363334788f311ba5d4f25deed3452f0c3eb
2021-07-06 14:15:07 -07:00
johnlu
265f0e5321 Add device runtime API for the plug-in to register platform python module into torch (#59857)
Summary:
## Motivation
Allow an out-of-tree PyTorch plug-in, for a device type other than CUDA, to add its runtime interface to the `torch` module. The runtime interface of the device can then be referred to by the device type name in the `torch` module, i.e., `torch.cuda` or `torch.xpu`.

## Solution
- Add a registration interface for the plug-in to add its platform Python module into the `torch` module under the device type name. I.e., `torch.xpu` can be used to refer to the XPU runtime interface after the XPU runtime module is registered with `torch._register_device_module('xpu', xpu_module)` in Intel's XPU plug-in.

## Additional Context
More details about runtime has been discussed in https://github.com/pytorch/pytorch/issues/53707.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59857

Reviewed By: mrshenli

Differential Revision: D29309320

Pulled By: ezyang

fbshipit-source-id: b9802a5f937ddef9e0bdaf2f7692dfe463912fbe
2021-06-23 07:54:45 -07:00
Philip Meier
d5988c5eca remove unused type: ignore directives (#60006)
Summary:
During development it is common practice to put `type: ignore` comments on lines that are correct, but `mypy` doesn't recognize this. This often stems from the fact that the `mypy` version in use wasn't able to handle the pattern.

With every new release `mypy` gets better at handling complex code. In addition to fixing all the previously accepted but now failing patterns, we should also revisit all `type: ignore` comments to see if they are still needed. Fortunately, we don't need to do it manually: by adding `warn_unused_ignores = True` to the configuration, `mypy` will error out when it encounters a `type: ignore` that is no longer needed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60006

Reviewed By: jbschlosser, malfet

Differential Revision: D29133237

Pulled By: albanD

fbshipit-source-id: 41e82edc5cd5affa7ccedad044b59b94dad4425a
2021-06-18 07:23:31 -07:00
driazati
059a717c9e Fix breakpad build and add to more images (#59236)
Summary:
This PR
* adds the breakpad build to most of the remaining docker images (except the mobile + slim ones)
* pins to a [fork of breakpad](https://github.com/google/breakpad/compare/master...driazati:master?expand=1) to enable daisy chaining of signal handlers
* renames the API to be nicer

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59236

Reviewed By: malfet

Differential Revision: D28792511

Pulled By: driazati

fbshipit-source-id: 83723e74b7f0a00e1695210ac2620a0c91ab4bf2
2021-06-01 22:47:14 -07:00
Sam Estep
75024e228c Add lint for unqualified type: ignore (#56290)
Summary:
The other half of https://github.com/pytorch/pytorch/issues/56272.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56290

Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI runs (before this PR was finished) failed:

- https://github.com/pytorch/pytorch/runs/2384511062
- https://github.com/pytorch/pytorch/actions/runs/765036024

Reviewed By: seemethere

Differential Revision: D27867219

Pulled By: samestep

fbshipit-source-id: e648f07b6822867e70833e23ddafe7fb7eaca235
2021-04-21 08:07:23 -07:00