* Update the CI test environment to install onnx and onnx-script.
* Add symbolic functions for `bitwise_or`, `convert_element_type` and `masked_fill_` (a registration sketch follows this list).
* Update symbolic function for `slice` and `arange`.
* Update .pyi signature for `_jit_pass_onnx_graph_shape_type_inference`.
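For context, the sketch below shows how a symbolic function of this kind can be registered from outside torch.onnx. It is an illustrative assumption, not this PR's code (the PR adds the functions to the opset files directly), and the boolean-only mapping is a simplification:

```python
import torch.onnx
from torch.onnx.symbolic_helper import parse_args

# Hypothetical standalone registration; the PR implements this inside torch.onnx.
@parse_args("v", "v")
def bitwise_or(g, self, other):
    # ONNX has no integer BitwiseOr before opset 18; boolean inputs map to Or.
    return g.op("Or", self, other)

torch.onnx.register_custom_op_symbolic("aten::bitwise_or", bitwise_or, opset_version=9)
```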
Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>
Co-authored-by: Ti-Tai Wang <titaiwang@microsoft.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94564
Approved by: https://github.com/abock
Changes:
1. `typing_extensions` -> `typing-extensions` in dependencies. Use a dash rather than an underscore to fit the [PEP 503: Normalized Names](https://peps.python.org/pep-0503/#normalized-names) convention.
```python
import re
def normalize(name):
    # Collapse runs of '-', '_', '.' into a single dash and lowercase (PEP 503).
    return re.sub(r"[-_.]+", "-", name).lower()
```
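For example, the old dependency name normalizes to the new one:

```python
print(normalize("typing_extensions"))   # typing-extensions
print(normalize("typing-extensions"))   # typing-extensions (already normalized)
```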
2. Import `Literal`, `Protocol`, and `Final` from the standard library, available as of Python 3.8+.
3. Replace `Union[Literal[XXX], Literal[YYY]]` with `Literal[XXX, YYY]`.
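A minimal sketch of changes 2 and 3 combined (the `"cpu"`/`"cuda"` member values are illustrative, not from the PR):

```python
# Before: fallback import plus a union of single-member Literals
# from typing_extensions import Literal
# Device = Union[Literal["cpu"], Literal["cuda"]]

# After: standard-library import (Python 3.8+) and one multi-member Literal
from typing import Literal

Device = Literal["cpu", "cuda"]
```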
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94490
Approved by: https://github.com/ezyang, https://github.com/albanD
The inductor/test_torchinductor suite is not running as part of the CI. I have traced this down to a bug in the arguments supplied in test/run_test.py.
Currently test_inductor runs the test suites as:
`PYTORCH_TEST_WITH_INDUCTOR=0 python test/run_test.py --include inductor/test_torchinductor --include inductor/test_torchinductor_opinfo --verbose`
This only kicks off the test_torchinductor_opinfo suite.
Example from CI logs: https://github.com/pytorch/pytorch/actions/runs/3926246136/jobs/6711985831#step:10:45089
```
+ PYTORCH_TEST_WITH_INDUCTOR=0
+ python test/run_test.py --include inductor/test_torchinductor --include inductor/test_torchinductor_opinfo --verbose
Ignoring disabled issues: []
/var/lib/jenkins/workspace/test/run_test.py:1193: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
if torch.version.cuda is not None and LooseVersion(torch.version.cuda) >= "11.6":
Selected tests:
inductor/test_torchinductor_opinfo
Prioritized test from test file changes.
reordering tests for PR:
prioritized: []
the rest: ['inductor/test_torchinductor_opinfo']
```
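A plausible minimal reproduction of the pitfall (an assumption about how run_test.py parses its arguments, not its actual code): with argparse's default store action, a repeated flag overwrites the earlier value, so only the last `--include` survives.

```python
import argparse

parser = argparse.ArgumentParser()
# Default "store" action: each occurrence of --include replaces the previous one.
parser.add_argument("--include", nargs="+", default=[])

args = parser.parse_args(
    ["--include", "inductor/test_torchinductor",
     "--include", "inductor/test_torchinductor_opinfo"]
)
print(args.include)  # ['inductor/test_torchinductor_opinfo'] -- the first suite is lost
```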
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92833
Approved by: https://github.com/seemethere
Also skip `test_roi_align_dynamic_shapes` on CUDA, as introduced by https://github.com/pytorch/pytorch/pull/92667. With Torchvision properly installed, the test fails with the following error:
```
2023-01-26T04:46:58.1532060Z test_roi_align_dynamic_shapes_cuda (__main__.CudaTests) ... /var/lib/jenkins/workspace/test/inductor/test_torchinductor.py:266: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
2023-01-26T04:46:58.1532195Z buffer = torch.as_strided(x, (x.storage().size(),), (1,), 0).clone()
2023-01-26T04:46:58.1532383Z test_roi_align_dynamic_shapes_cuda errored - num_retries_left: 3
2023-01-26T04:46:58.1532479Z Traceback (most recent call last):
2023-01-26T04:46:58.1532725Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_dynamo/utils.py", line 1155, in run_node
2023-01-26T04:46:58.1532821Z return node.target(*args, **kwargs)
2023-01-26T04:46:58.1533056Z File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/_ops.py", line 499, in __call__
2023-01-26T04:46:58.1533160Z return self._op(*args, **kwargs or {})
2023-01-26T04:46:58.1533304Z RuntimeError: Cannot call sizes() on tensor with symbolic sizes/strides
```
https://github.com/pytorch/pytorch/issues/93054 reveals a blind spot in the CI where Torchvision was only installed in the first and second shards. The above failure should have surfaced as part of https://github.com/pytorch/pytorch/pull/92667, but the test was silently skipped because Torchvision was not installed in the third shard, where `test_roi_align` ran. The test is still skipped here, but in a more explicit way (see the sketch below).
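A minimal sketch of the kind of explicit skip described above, assuming a plain unittest-style test; the names and decorator placement are illustrative, not the PR's exact code:

```python
import importlib.util
import unittest

# Skip visibly when torchvision is missing instead of silently passing.
HAS_TORCHVISION = importlib.util.find_spec("torchvision") is not None

class CudaTests(unittest.TestCase):
    @unittest.skipIf(not HAS_TORCHVISION, "torchvision is not installed")
    def test_roi_align_dynamic_shapes_cuda(self):
        ...  # exercise torchvision.ops.roi_align under dynamic shapes
```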
Fixes https://github.com/pytorch/pytorch/issues/93054
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93108
Approved by: https://github.com/clee2000, https://github.com/jjsjann123, https://github.com/nkaretnikov
My first attempt to fix the `Library not loaded: @rpath/libzstd.1.dylib` issue on MacOS M1 in https://github.com/pytorch/pytorch/pull/91142 added some extra logging around the flaky error but didn't fix the issue, as I still see occurrences recently, for example
* e4d83d54a6
Looking at the log, I can see that:
* CMAKE_EXEC correctly points to `CMAKE_EXEC=/Users/ec2-user/runner/_work/_temp/conda_environment_3971491892/bin/cmake`
* The library is there under the executable rpath
```
ls -la /Users/ec2-user/runner/_work/_temp/conda_environment_3971491892/bin/../lib
...
2023-01-20T23:22:03.9761370Z -rwxr-xr-x 2 ec2-user staff 737776 Apr 22 2022 libzstd.1.5.2.dylib
2023-01-20T23:22:03.9761630Z lrwxr-xr-x 1 ec2-user staff 19 Jan 20 22:47 libzstd.1.dylib -> libzstd.1.5.2.dylib
...
```
Then calling cmake after that suddenly uses the wrong cmake binary from the miniconda package cache:
```
2023-01-20T23:22:04.0636880Z + cmake ..
2023-01-20T23:22:04.1924790Z dyld[85763]: Library not loaded: @rpath/libzstd.1.dylib
2023-01-20T23:22:04.1925540Z Referenced from: /Users/ec2-user/runner/_work/_temp/miniconda/pkgs/cmake-3.22.1-hae769c0_0/bin/cmake
```
This is weird, so my second attempt is more explicit and uses the correct cmake executable recorded in `CMAKE_EXEC` (sketched below). Maybe something manipulates the global PATH in between, making `/Users/ec2-user/runner/_work/_temp/miniconda/pkgs/cmake-3.22.1-hae769c0_0/bin/cmake` come first in the PATH.
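A conceptual sketch of that second attempt, assuming the build step could be driven from Python; the real change lives in the CI shell scripts:

```python
import os
import subprocess

# Use the absolute path recorded in CMAKE_EXEC instead of a bare `cmake`,
# sidestepping any later PATH manipulation that could resolve cmake to the
# stale copy in miniconda's package cache.
cmake_exec = os.environ.get("CMAKE_EXEC", "cmake")
subprocess.run([cmake_exec, ".."], check=True)
```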
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92737
Approved by: https://github.com/ZainRizvi
This PR is the first step towards refactoring the nvfuser build so that the codegen becomes a standalone library.
Contents inside this PR:
1. nvfuser code base has been moved to `./nvfuser`, from `./torch/csrc/jit/codegen/cuda/`, except for registration code for integration (interface.h/interface.cpp)
2. The build system is split so that nvfuser generates its own `.so` files. Currently there are:
- `libnvfuser_codegen.so`, which contains the integration, codegen and runtime system of nvfuser
- `nvfuser.so`, which is nvfuser's python API via pybind. Python frontend is now exposed via `nvfuser._C.XXX` instead of `torch._C._nvfuser`
3. nvfuser C++ tests are currently compiled into `nvfuser_tests`
4. cmake is refactored so that:
- nvfuser now has its own `CMakeLists.txt`, which is under `torch/csrc/jit/codegen/cuda/`.
- nvfuser backend code is not compiled inside `libtorch_cuda_xxx` any more
- nvfuser is added as a subdirectory under `./CMakeLists.txt` at the very end after torch is built.
- since nvfuser has a dependency on torch, the registration of nvfuser at runtime is done via dlopen (`at::DynamicLibrary`). This avoids a circular dependency in cmake, which would be a nightmare to handle. For details, look at `torch/csrc/jit/codegen/cuda/interface.cpp::LoadingNvfuserLibrary` (a conceptual sketch follows this list).
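The runtime-loading approach is conceptually similar to the Python sketch below; the actual mechanism is C++ (`at::DynamicLibrary`), and only the `libnvfuser_codegen.so` name comes from this PR:

```python
import ctypes

# Conceptual analog of the dlopen-based registration: the library is resolved
# and loaded at runtime, so torch needs no link-time dependency on nvfuser.
try:
    nvfuser_lib = ctypes.CDLL("libnvfuser_codegen.so", mode=ctypes.RTLD_GLOBAL)
except OSError:
    nvfuser_lib = None  # nvfuser not available; torch keeps working without it
```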
Future work that's scoped in following PR:
- Currently nvfuser codegen has a dependency on torch; we need to refactor that out so we can move nvfuser into a submodule and not rely on dlopen to load the library. @malfet
- Since we moved nvfuser into a cmake build, we effectively disabled the bazel build for nvfuser. This could impact internal workloads at Meta, so we need to put support back. cc'ing @vors
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89621
Approved by: https://github.com/davidberard98