Commit Graph

454 Commits

Author SHA1 Message Date
PyTorch MergeBot
7e654c8f88 Revert "WIP / TST: allow testing torch._numpy under Dynamo (#110401)"
This reverts commit 5ed4a423de.

Reverted https://github.com/pytorch/pytorch/pull/110401 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing dynamo job in trunk 5ed4a423de ([comment](https://github.com/pytorch/pytorch/pull/110401#issuecomment-1779811943))
2023-10-25 18:21:16 +00:00
Evgeni Burovski
5ed4a423de WIP / TST: allow testing torch._numpy under Dynamo (#110401)
Use conditional imports: when running under Dynamo, import the original NumPy, not torch._numpy. That is what we want to trace, not our own implementation.

With this, the test suite passes with and without `PYTORCH_TEST_WITH_DYNAMO=1` (modulo a couple of test modules which are not meant to be compiled, e.g. `test_nep50_examples`). There are two new decorators, `x{fail,pass}ifTorchDynamo`, the `xpass` in most cases indicates a graph break and a fallback to eager for things we do not implement.
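A minimal sketch of the conditional-import pattern described above; the flag derivation here is an assumption (the real test suite gets it from torch.testing._internal.common_utils):

```python
import os

# Assumed flag, mirroring PYTORCH_TEST_WITH_DYNAMO=1.
TEST_WITH_TORCHDYNAMO = os.getenv("PYTORCH_TEST_WITH_DYNAMO") == "1"

if TEST_WITH_TORCHDYNAMO:
    # Under Dynamo we want to trace the original NumPy, not our reimplementation.
    import numpy as np
else:
    import torch._numpy as np

x = np.arange(6).reshape(2, 3)
print(x.sum(axis=0))
```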

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110401
Approved by: https://github.com/lezcano
2023-10-25 16:02:16 +00:00
Prachi Gupta
53a9ac534c Added decorator skipRocmIfTorchInductor and skipped failing tests (#107760)
This PR adds a skip decorator which will disable tests in CI for the ROCm inductor workflow. This new workflow will be coming in via https://github.com/pytorch/pytorch/pull/110544
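A rough sketch of what such a decorator could look like; the real one lives in torch.testing._internal.common_utils, and the flag names here are assumptions:

```python
import unittest

# Assumed flags; the real values come from torch.testing._internal.common_utils.
TEST_WITH_ROCM = False
TEST_WITH_TORCHINDUCTOR = False

def skipRocmIfTorchInductor(msg="test doesn't currently work on the ROCm inductor workflow"):
    """Skip a test only when running on ROCm under inductor."""
    def decorator(fn):
        return unittest.skipIf(TEST_WITH_ROCM and TEST_WITH_TORCHINDUCTOR, msg)(fn)
    return decorator

class ExampleTests(unittest.TestCase):
    @skipRocmIfTorchInductor("flaky on ROCm inductor CI")
    def test_something(self):
        self.assertEqual(1 + 1, 2)
```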

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107760
Approved by: https://github.com/jataylo, https://github.com/pruthvistony, https://github.com/atalman
2023-10-12 16:00:35 +00:00
eellison
c5f06b9753 Re-enable test_copy_transpose_math_view, neg_view/dce fix (#110651)
- Neg view can just be lowered to neg() post-functionalization.
- We were treating all fallback kernels as not having side effects. We shouldn't DCE mutating fallback kernels - either mutations induced by the reinplacing pass or clone_ with unsupported arguments (complex).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110651
Approved by: https://github.com/Chillee, https://github.com/jansel, https://github.com/malfet, https://github.com/Skylion007
2023-10-10 16:34:01 +00:00
albanD
1824ea3c0f Add a test to make sure all modules in the codebase are importable (#110598)
As per title: running `import` on any of these files leads to a crash.
I'm very curious how the code in them is used!
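A minimal sketch of such an importability check; the walking logic and error handling are simplified assumptions, not the exact test added in the PR:

```python
import importlib
import pkgutil

import torch

def test_all_torch_modules_importable():
    failures = []
    walker = pkgutil.walk_packages(
        torch.__path__, prefix="torch.",
        onerror=lambda name: failures.append((name, "error while walking")),
    )
    for mod in walker:
        try:
            importlib.import_module(mod.name)
        except Exception as e:  # record, don't mask
            failures.append((mod.name, repr(e)))
    assert not failures, f"modules failed to import: {failures}"
```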
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110598
Approved by: https://github.com/janeyx99, https://github.com/malfet
2023-10-08 03:52:30 +00:00
albanD
cae537126f Set _diffThreshold on our TestCase (#110603)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110603
Approved by: https://github.com/albanD
2023-10-05 21:49:28 +00:00
Catherine Lee
d6e5898e8d Quieter logs in CI (#110033)
To reduce the amount of logs:
* For successes, only print the part that says which tests ran and don't print the rest; zip the log into an artifact. The line listing all the test names is really long, but if you view the source of the raw logs it will not wrap, so it will only be one line. The log classifier can also be configured to ignore this line. This gets rid of lines like `test_ops.py::TestCommonCPU::test_multiple_devices_round_cpu_int64 SKIPPED [0.0010s] (Only runs on cuda) [  9%]`
* For failures/reruns, print the logs and do not zip them.

Also:
* Change the log artifact name

Examples of various logs:
a074db0f7f failures
1b439e24c4 failures

Possibly controversial: should I include an option for always printing?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110033
Approved by: https://github.com/huydhn
2023-10-05 16:40:37 +00:00
Oguz Ulgen
f04b1a0d27 [AOTInductor] Implement autograd eager backend for native triton kernels (#110403)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110403
Approved by: https://github.com/zou3519, https://github.com/bdhirsh
2023-10-04 17:56:56 +00:00
Pruthvi Madugundu
9ce2e02fd6 Revert "[ROCm] Remove PYTORCH_MIOPEN_SUGGEST_NHWC flag (#90725)" (#110319)
This reverts commit 66bfcd32fd.

NHWC has a perf regression on MIOpen, so reverting until the performance issue is fixed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110319
Approved by: https://github.com/jeffdaily, https://github.com/jithunnair-amd, https://github.com/kit1980
2023-10-03 19:14:47 +00:00
Edward Z. Yang
f7c9ef88f5 Add masked_select abstract impl (#110103)
Fixes https://github.com/pytorch/pytorch/issues/109871

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110103
Approved by: https://github.com/bdhirsh
2023-09-27 04:07:58 +00:00
Aaron Gokaslan
6d725e7d66 [BE]: enable ruff rules PLR1722 and PLW3301 (#109461)
Enables two ruff rules derived from pylint:
* PLR1722 replaces any exit() calls with sys.exit(). exit() is only designed to be used in REPL contexts and may not always be available by default; the rule always uses the version in the sys module, which is better.
* PLW3301 replaces nested min/max calls with simplified versions (i.e. `min(a, min(b, c))` => `min(a, b, c)`). The new version is more idiomatic and more efficient. (See the example below.)
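For illustration, the kind of rewrites the two rules perform (hypothetical snippets, not taken from the PR):

```python
import sys

# PLR1722: use sys.exit() instead of the builtin exit(), which is only
# guaranteed to exist in interactive (REPL) sessions.
def fail_fast(msg: str) -> None:
    print(msg, file=sys.stderr)
    sys.exit(1)          # instead of: exit(1)

# PLW3301: flatten nested min()/max() calls.
a, b, c = 3, 1, 2
smallest = min(a, b, c)  # instead of: min(a, min(b, c))
largest = max(a, b, c)   # instead of: max(max(a, b), c)
```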

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109461
Approved by: https://github.com/ezyang
2023-09-18 02:07:21 +00:00
Kurt Mohler
3f88e3105f Reland: Remove remaining global set_default_dtype calls from tests (#108088)
Fixes #68972

Relands #107246

To avoid causing Meta-internal CI failures, this PR avoids always asserting that the default dtype is float in the `TestCase.setUp/tearDown` methods. Instead, the assert is only done if `TestCase._default_dtype_check_enabled == True`. `_default_dtype_check_enabled` is set to True in the `if __name__ == "__main__":` blocks of all the relevant test files that have required changes for this issue
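A rough sketch of the opt-in assertion described above, written from the PR description rather than the actual diff:

```python
import unittest
import torch

class TestCase(unittest.TestCase):
    # Individual test files flip this to True in their
    # `if __name__ == "__main__":` block.
    _default_dtype_check_enabled = False

    def setUp(self):
        if self._default_dtype_check_enabled:
            assert torch.get_default_dtype() == torch.float

    def tearDown(self):
        if self._default_dtype_check_enabled:
            assert torch.get_default_dtype() == torch.float
```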

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108088
Approved by: https://github.com/ezyang
2023-09-07 03:04:34 +00:00
Michael Gschwind
2a40fe2dbf [experimental] use EXCEPT_FOR env to suppress CPU tests from GPU RE (#108672)
Summary:
[experimental] use EXCEPT_FOR env to suppress CPU tests from GPU RE -- alternative implementation to D48997976 using the preexisting PYTORCH_TESTING_DEVICE_EXCEPT_FOR facility and building the remaining logic (for assert-positive listers like test_transformers) on top of that.

Goal: save ~100 GPU (10% of capacity), enabling us to fund more aggressive PyPer unit testing on GPU RE
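For context, a hedged sketch of how a device-exclusion env var like PYTORCH_TESTING_DEVICE_EXCEPT_FOR can be consumed; the real logic lives in torch.testing._internal.common_device_type and differs in detail:

```python
import os

def _excluded_device_types():
    # e.g. PYTORCH_TESTING_DEVICE_EXCEPT_FOR=cpu suppresses CPU test instantiation
    raw = os.getenv("PYTORCH_TESTING_DEVICE_EXCEPT_FOR", "")
    return {d.strip() for d in raw.split(",") if d.strip()}

def should_instantiate_for(device_type: str) -> bool:
    return device_type not in _excluded_device_types()

print(should_instantiate_for("cpu"))   # False when "cpu" is listed in the env var
print(should_instantiate_for("cuda"))
```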

Test Plan: sandcastle, github

Differential Revision: D48998582

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108672
Approved by: https://github.com/bertmaher
2023-09-06 23:33:18 +00:00
Animesh Jain
29f1097891 [dynamo] Reduce cache size limit to 8 (#108526)
As title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108526
Approved by: https://github.com/ezyang
2023-09-05 17:56:26 +00:00
PyTorch MergeBot
161ea463e6 Revert "Remove remaining global set_default_dtype calls from tests (#107246)"
This reverts commit aa8ea1d787.

Reverted https://github.com/pytorch/pytorch/pull/107246 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/107246#issuecomment-1693838522))
2023-08-25 19:34:55 +00:00
Kurt Mohler
aa8ea1d787 Remove remaining global set_default_dtype calls from tests (#107246)
Fixes #68972

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107246
Approved by: https://github.com/ezyang
2023-08-24 16:10:48 +00:00
Aaron Gokaslan
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I'm enabling it so that it stays that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
PyTorch MergeBot
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e4322.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was probably accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
Catherine Lee
4dc9df2f87 Slightly more flexible naming system for disable + slow tests (#104002)
Sometimes test suite names include file/module names because they were imported from another file (e.g. _nvfuser.test_dynamo.TestNvFuserDynamo). This can make the names autogenerated by the disable bot and by the disable-test button on HUD incorrect, which is annoying to track down and leads to issues that are open but don't actually do anything. My solution is to make the check between the issue name and the test more flexible: instead of checking the entire test suite name, we chop off the file/module prefix, only keep the last part (e.g. TestNvFuserDynamo), and check whether those are equal.

Also bundle the check against names in the slow-test JSON and the check against disable-test issue names into one function, for no reason other than less code.

I looked through logs to see which tests are skipped with this vs. the old logic, and the results looked the same.

The diff looks like a big change, but it's mostly a change in indentation.
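A minimal sketch of the more flexible matching described above (the function shape is an assumption):

```python
def suites_match(issue_suite: str, test_suite: str) -> bool:
    """Compare only the trailing class name, ignoring any file/module prefix.

    e.g. "TestNvFuserDynamo" matches "_nvfuser.test_dynamo.TestNvFuserDynamo".
    """
    return issue_suite.split(".")[-1] == test_suite.split(".")[-1]

assert suites_match("TestNvFuserDynamo", "_nvfuser.test_dynamo.TestNvFuserDynamo")
assert not suites_match("TestFoo", "pkg.test_mod.TestBar")
```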

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104002
Approved by: https://github.com/ZainRizvi, https://github.com/huydhn
2023-08-22 16:35:54 +00:00
Aaron Gokaslan
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I'm enabling it so that it stays that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
lcskrishna
bc662ffff9 [ROCm] Update ROCm skip decorators (#106138)
This PR adds a `msg` argument to skipIfRocm and skipCUDAIfRocm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106138
Approved by: https://github.com/jataylo, https://github.com/jeffdaily, https://github.com/pruthvistony, https://github.com/albanD
2023-08-18 22:02:06 +00:00
Catherine Lee
bc053070f8 Mark test_gradient_extreme_cases as slow for inductor (#107189)
test_gradient_extreme_cases_* takes ~5 minutes on the inductor sm86 shard, and possibly even longer on the inductor workflow since it's timing out right now (I'm not sure what the difference between the two is). Sometimes the automatic slow-test detection isn't catching it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107189
Approved by: https://github.com/ZainRizvi
2023-08-15 22:03:00 +00:00
summerdo
7db6eb7156 [test_nn] add custom device support for dropout tests, lazy_modules te… (#106609)
Add custom device support for dropout tests, lazy_modules tests, and multihead_attention tests.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106609
Approved by: https://github.com/mikaylagawarecki
2023-08-11 09:14:34 +00:00
Peter Bell
d4d090e2da [FakeTensor] Workaround FFT ops with incorrect meta strides (#106319)
Currently there are FFT operators which raise `UnsupportedOperatorException`
because their meta implementations sometimes give incorrect strides. This works
around the problem for static shapes by falling back to eager, though we still
don't support calls with dynamic shapes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106319
Approved by: https://github.com/ezyang
2023-08-07 20:59:30 +00:00
Edward Z. Yang
697893568d Improve error message when export encounters non-local input (#106403)
Previously, you would get an error like

```
Dynamo input and output is a strict subset of traced input/output
```

now you get

```
Cannot export model which references tensors that are neither
buffers/parameters/constants nor are direct inputs.  For each tensor, if you'd
like this tensor to be an explicit input, add it as a dummy argument
to the top-level model definition you are exporting; if you would
like its value to be embedded as an exported constant, wrap its access
in a function marked with @assume_constant_result.

G['bulbous_bouffant'], accessed at:
  File "test_export.py", line N, in f
    return bulbous_bouffant + y
```

This doesn't handle outputs, I'm going to hit that next.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106403
Approved by: https://github.com/tugsbayasgalan
2023-08-03 12:35:25 +00:00
Richard Zou
fd6e052a8a Some minor improvements to FakeTensor testing (#106311)
Summary:
- PyTorch testing chokes sometimes when it sees an exception where the first
  argument is not a string. fake_tensor.UnsupportedOperatorException's first
  arg is an OpOverload. This PR fixes PyTorch testing to not choke. I'm not
  really sure how to reproduce this in OSS.
- It turns out that if an operator does not have a meta kernel, the FakeTensor
  rule is really slow (30ms in OSS in debug mode, 3s on some internal config).
  The thing that is slow (aside from the previous diff) is waiting for the Dispatcher to
  report NotImplemented and then attempting to catch that. I'm not really sure
  why this is slow, but it's easy to work around, so I added a workaround.

Test Plan: - existing tests

Differential Revision: D47917554

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106311
Approved by: https://github.com/eellison
2023-08-03 01:44:15 +00:00
Zachary DeVito
8ee0b17990 Fix reference cycle in our test suite (#106328)
In certain cases we capture ErrorMeta in a list. The ErrorMeta objects hold
tracebacks which contain a frame with a local variable that refers to that list.
This change mutates the list on exit from the frame so that it doesn't refer
to the ErrorMeta objects, breaking the cycle.
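A toy sketch of the cycle and the fix, with illustrative names: the stored objects hold tracebacks whose frames reference the list they live in, so clearing the list on exit breaks the cycle.

```python
import sys

class ErrorMeta(Exception):
    """Stand-in for the real ErrorMeta; it keeps a traceback around."""
    def __init__(self, msg):
        super().__init__(msg)
        self.tb = sys.exc_info()[2]  # traceback -> frame -> f_locals -> errors list

def collect_errors():
    errors = []  # referenced by the frames captured in the tracebacks below
    try:
        for i in range(3):
            try:
                raise ValueError(i)
            except ValueError:
                errors.append(ErrorMeta(f"failure {i}"))
        return list(errors)  # hand back a copy
    finally:
        # Mutate the list on exit so the captured frames no longer reach the
        # ErrorMeta objects through it, breaking the reference cycle.
        errors.clear()
```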
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106328
Approved by: https://github.com/huydhn
2023-08-02 07:58:32 +00:00
Edward Z. Yang
76163a56c0 Refactor stack handling to always use TracingContext to populate real stack on exception (#106277)
The basic gist of the PR is simple, but it's accompanied with some careful modifications and unit tests to make sure I got it right. Check inline comments for more details.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106277
Approved by: https://github.com/albanD, https://github.com/voznesenskym
2023-08-02 00:09:16 +00:00
Xiao Wang
21fd2bc32e Allow setting TORCH_LINALG_PREFER_CUSOLVER=1 to prefer cusolver as linear algebra library globally (#106226)
setting TORCH_LINALG_PREFER_CUSOLVER=1

This allows users to prefer cuSOLVER as the linear algebra backend in their container use cases. The switch is not enabled by default, so it won't change any existing default behavior.
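For example, opting in might look like the sketch below; how early the variable must be set relative to CUDA initialization is my assumption, not something the commit message states:

```python
import os

# Opt in to the cuSOLVER preference before torch initializes CUDA.
os.environ["TORCH_LINALG_PREFER_CUSOLVER"] = "1"

import torch  # noqa: E402

if torch.cuda.is_available():
    a = torch.randn(128, 128, device="cuda")
    # Linear-algebra routines should now prefer cuSOLVER over the default choice.
    torch.linalg.cholesky(a @ a.mT + 128 * torch.eye(128, device="cuda"))
```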
Pull Request resolved: https://github.com/pytorch/pytorch/pull/106226
Approved by: https://github.com/lezcano
2023-07-30 09:38:46 +00:00
Michael Lazos
bd669d52d2 Print env var name instead of flag name for commandline repros (#106223)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106223
Approved by: https://github.com/seemethere, https://github.com/malfet
2023-07-28 23:22:27 +00:00
Justin Chu
4cc1745b13 [BE] f-stringify torch/ and scripts (#105538)
This PR is a follow up on the pyupgrade series to convert more strings to use f-strings using `flynt`.

- https://docs.python.org/3/reference/lexical_analysis.html#f-strings
- https://pypi.org/project/flynt/

Command used:

```
flynt torch/ -ll 120
flynt scripts/ -ll 120
flynt tools/ -ll 120
```

and excluded `collect_env.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105538
Approved by: https://github.com/ezyang, https://github.com/malfet
2023-07-21 19:35:24 +00:00
eqy
29f856e3e0 Kill process in wait_for_process if SIGINT fails to terminate it (#105625)
#98035 adds some additional logic to `wait_for_process` that includes catching a timeout exception and sending `SIGINT` to the process before waiting on it again with a timeout. However, if the additional wait times out again, then the wait call in the `finally` block (which does not have a timeout) has the potential to hang indefinitely.

This PR kills the process if a second timeout exception occurs after the `SIGINT` signal is sent.
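A hedged sketch of the escalation described above, using standard subprocess/signal APIs (the real helper in common_utils differs, and the SIGINT handling here is POSIX-specific):

```python
import signal
import subprocess

def wait_for_process(p: subprocess.Popen, timeout: float = 300.0) -> int:
    try:
        return p.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # First escalation: interrupt, like Ctrl-C.
        p.send_signal(signal.SIGINT)
        try:
            return p.wait(timeout=30)
        except subprocess.TimeoutExpired:
            # SIGINT didn't terminate it either, so kill outright instead of
            # letting a later unbounded wait hang forever.
            p.kill()
            raise
    finally:
        p.wait(timeout=30)  # bounded cleanup wait
```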

CC @clee2000 @ptrblck @xwang233 @kwen2501

Also hoping that this has the potential to reduce turnaround time for distributed timeouts like those seen in https://hud.pytorch.org/pr/pytorch/pytorch/105274#15148799113
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105625
Approved by: https://github.com/ezyang
2023-07-21 10:11:58 +00:00
Yukio Siraichi
0b6de0eb1c Improve validator module behavior if Z3 is not installed. (#105168)
Fixes: #105143

In summary, the changes are:

- Check if Z3 is installed when the module is loaded
- Naming consistently as "translation validation" (not "validator")
- Skipping tests if Z3 is not installed (see the sketch below)
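A minimal sketch of the availability check and test skipping, assuming the z3-solver package and plain unittest skips (the flag and decorator names are illustrative):

```python
import unittest

try:
    import z3  # provided by the z3-solver package
    HAS_Z3 = True
except ImportError:
    HAS_Z3 = False

def skipIfNoZ3(fn):
    return unittest.skipUnless(HAS_Z3, "translation validation requires Z3")(fn)

class TranslationValidationTests(unittest.TestCase):
    @skipIfNoZ3
    def test_simple_constraint(self):
        s = z3.Solver()
        x = z3.Int("x")
        s.add(x > 0, x < 2)
        self.assertEqual(s.check(), z3.sat)
```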

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105168
Approved by: https://github.com/ezyang
2023-07-19 13:11:22 +00:00
Justin Chu
be03a56955 [BE] Enable ruff's UP rules and autoformat testing/ (#105425)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105425
Approved by: https://github.com/malfet
2023-07-18 21:04:39 +00:00
Joel Schlosser
ece19bf018 Update run_test.py to use TEST_WITH_SLOW_GRADCHECK flag (#104819)
Finishes the job from #104537. See https://github.com/pytorch/pytorch/pull/104537#pullrequestreview-1520065008
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104819
Approved by: https://github.com/huydhn
2023-07-11 21:58:46 +00:00
Joel Schlosser
c2e286daf9 Testing: Print test reproduction command on failure (#104537)
MS2 of the Reproducible Testing BE initiative. For context, this is the ask:

```
Another thing that would be really great as we start to have more dependent
systems or types of tests (functorch, dynamo, crossref) would be to have a
minimally reproducible version of the test (something at the end of the HUD
comment like: "Run python test/test_file.py -k test_name" but also if you need
flags, like crossref it would be like "Run <flag to run crossref> python test/..." ). I'll
often go through the test infra to find the flags that I need to pass when
something only breaks crossref/dynamo tests.
```

Implementation details:
* Adds a new flag `PRINT_REPRO_ON_FAILURE` that is settable through the environment variable `PYTORCH_PRINT_REPRO_ON_FAILURE=1`
    * **Default is ON but I can be persuaded otherwise**
* When the flag is enabled, our base `TestCase` will wrap the test method in a context manager that catches any non-skip exceptions and appends a repro string to the exception message (a rough sketch follows after this list). The repro includes setting of necessary test flags through env vars. Example:

```
To execute this test, run the following from the base repo dir:
    PYTORCH_TEST_WITH_CROSSREF=1 python test/test_ops.py -k test_foo_add_cuda_float32

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
```
* To keep track of flag settings, this PR introduces a new `TestEnvironment` class that defines global flags by querying related environment variables. Flag and env var names are purposefully kept searchable via full names. Example usages:
```python
TestEnvironment.def_flag("TEST_WITH_TORCHINDUCTOR", env_var="PYTORCH_TEST_WITH_INDUCTOR")
# can track implication relationships to avoid adding unnecessary flags to the repro
TestEnvironment.def_flag(
    "TEST_WITH_TORCHDYNAMO",
    env_var="PYTORCH_TEST_WITH_DYNAMO",
    implied_by_fn=lambda: TEST_WITH_TORCHINDUCTOR or TEST_WITH_AOT_EAGER)
# can use include_in_repro=False to keep the flag from appearing in the repro command
TestEnvironment.def_flag(
    "DISABLE_RUNNING_SCRIPT_CHK", env_var="PYTORCH_DISABLE_RUNNING_SCRIPT_CHK", include_in_repro=False)
# the default default value is False, but this can be changed
TestEnvironment.def_flag(
    "PRINT_REPRO_ON_FAILURE", env_var="PYTORCH_PRINT_REPRO_ON_FAILURE", default=(not IS_FBCODE), include_in_repro=False)
```
* AFAICT it is only feasible to achieve this from within the test framework rather than at the CI level. This is because CI / `run_test.py` are unaware of individual test cases. Implementing it in our base `TestCase` class has the broadest area of effect, as it's not isolated to e.g. OpInfo tests.
* I couldn't find an easy way to test the logic via `test_testing.py`, as the logic for extracting the test filename doesn't work for generated test classes. I'm open to ideas on testing this, however.
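A rough sketch of the exception-wrapping idea from the second bullet above; the real `TestCase` logic derives the file name, test name, and flags differently:

```python
import contextlib
import os
import unittest

@contextlib.contextmanager
def append_repro_on_failure(test_file, test_name, extra_env):
    try:
        yield
    except unittest.SkipTest:
        raise  # skips are not failures; leave them untouched
    except Exception as e:
        if os.getenv("PYTORCH_PRINT_REPRO_ON_FAILURE", "1") == "0":
            raise
        env_prefix = " ".join(f"{k}={v}" for k, v in extra_env.items())
        repro = (
            "\n\nTo execute this test, run the following from the base repo dir:\n"
            f"    {env_prefix} python {test_file} -k {test_name}\n\n"
            "This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0"
        )
        # Append the repro to the exception message without changing its type.
        e.args = ((str(e.args[0]) + repro) if e.args else repro,) + e.args[1:]
        raise
```

Usage would be roughly `with append_repro_on_failure("test/test_ops.py", "test_foo_add_cuda_float32", {"PYTORCH_TEST_WITH_CROSSREF": "1"}): run_the_test()`, where all of the arguments shown are hypothetical.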
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104537
Approved by: https://github.com/ezyang, https://github.com/janeyx99, https://github.com/huydhn
2023-07-10 21:24:02 +00:00
Yukio Siraichi
40b8d10d5e Re-land: Turn translation validation on for tests and accuracy runs by default. (#104467)
Re-landing: #103611

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104467
Approved by: https://github.com/malfet
2023-07-05 19:01:50 +00:00
PyTorch MergeBot
a2a8b4d415 Revert "Turn translation validation on for tests and accuracy runs by default. (#103611)"
This reverts commit e311bed2a8.

Reverted https://github.com/pytorch/pytorch/pull/103611 on behalf of https://github.com/malfet due to Broke inductor tests ([comment](https://github.com/pytorch/pytorch/pull/103611#issuecomment-1614850276))
2023-06-30 15:54:18 +00:00
Yukio Siraichi
e311bed2a8 Turn translation validation on for tests and accuracy runs by default. (#103611)
This PR turns translation validation on by default for tests and accuracy benchmark
runs. It also installs Z3 on CI.

The main changes are:

- Add `--no-translation-validation` as an option in _test/run_tests.py_
    - Set `PYTORCH_TEST_WITH_TV` environment variable
- Add `TEST_WITH_TV` variable in _torch/testing/_internal/common_utils.py_
- Turn translation validation on for accuracy benchmarks in _benchmarks/dynamo/common.py_
- Add Z3 installation on CI scripts

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103611
Approved by: https://github.com/ezyang
2023-06-30 01:32:21 +00:00
Nikita Shulga
13ef0ec186 Add "slow" tests to list of disable conditions (#103856)
Companion PR to https://github.com/pytorch/test-infra/pull/4306

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103856
Approved by: https://github.com/huydhn
2023-06-19 21:22:35 +00:00
Edward Z. Yang
ddf4cd69ec Delete ifdyn and ifunspec combinators (#103596)
Replaced with expect tests for ease of updating.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103596
Approved by: https://github.com/voznesenskym
2023-06-15 00:14:17 +00:00
Elias Ellison
40d70ba7ed Remove a number of fixed skips (#103162)
Also adds `PYTORCH_TEST_WITH_AOT_EAGER` to distinguish errors coming from aot_autograd and not inductor (not tested in ci, but useful for local debugging)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103162
Approved by: https://github.com/desertfire
2023-06-08 17:37:59 +00:00
Xiao Wang
39f3514fa3 Add an env PYTORCH_TEST_SKIP_CUDAGRAPH to skip all cuda graph-related unit tests (#103032)
Skip all cuda graph-related unit tests by setting env var `PYTORCH_TEST_SKIP_CUDAGRAPH=1`

This PR refactors the `TEST_CUDA` python variable in test_cuda.py into common_utils.py. This PR also creates a new python variable `TEST_CUDA_GRAPH` in common_utils.py, which has an env var switch to turn off all cuda graph-related tests.
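A minimal sketch of how the new variable plus env switch can gate tests (simplified relative to the real common_utils definitions):

```python
import os
import unittest

import torch

TEST_CUDA = torch.cuda.is_available()
TEST_CUDA_GRAPH = TEST_CUDA and os.getenv("PYTORCH_TEST_SKIP_CUDAGRAPH", "0") != "1"

class CudaGraphTests(unittest.TestCase):
    @unittest.skipUnless(TEST_CUDA_GRAPH, "CUDA graph tests are disabled")
    def test_graph_object_construction(self):
        g = torch.cuda.CUDAGraph()
        self.assertIsInstance(g, torch.cuda.CUDAGraph)
```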

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103032
Approved by: https://github.com/malfet
2023-06-06 07:51:57 +00:00
Richard Zou
74f10b9ea5 Switch most Python RAII guard usages to context manager (#102642)
There are some I can't easily switch due to reasons like:
- Dynamo modelling the guard
- BC concerns (for torch.autograd.set_multithreading_enabled)
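To illustrate the pattern of the switch with a public API that supports both styles (the specific guards touched by the PR are mostly internal ones):

```python
import torch

# Before: RAII-style guard object; grad mode flips at construction and has to
# be restored explicitly, even if the body raises.
prev = torch.is_grad_enabled()
guard = torch.set_grad_enabled(False)
y = torch.randn(3) * 2
torch.set_grad_enabled(prev)

# After: context manager, which restores the previous state automatically.
with torch.set_grad_enabled(False):
    y = torch.randn(3) * 2
```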

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102642
Approved by: https://github.com/albanD
2023-06-01 16:28:37 +00:00
Andres Lugo-Reyes
eaffd98880 Enable hipSOLVER in ROCm builds (#97370)
Enables the hipSolver backend for ROCm builds
--------------------------------------------------------------------------

- Minimum ROCm version requirement - 5.3
- Introduces a new macro USE_LINALG_SOLVER that controls enablement of both cuSOLVER and hipSOLVER
- Adds the hipSOLVER API to the hipification process
- Combines hipSOLVER and hipSPARSE mappings into a single SPECIAL map that takes priority over normal mappings
- Torch APIs to be moved to the hipSOLVER backend (as opposed to MAGMA) include: torch.svd(), torch.geqrf(), torch.orgqr(), torch.ormqr()
- Will enable 100+ linalg unit tests for ROCm

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97370
Approved by: https://github.com/malfet
2023-05-31 16:53:23 +00:00
Huy Do
6e3e3dd477 Do not collect and skip non-disabled tests when rerunning disabled tests (#102107)
The console log blows up too much when running in rerun-disabled-tests mode (x50) e132f09e88.  Each log is around 1GB and the whole set of uncompressed logs is ~50GB.  After compression it will be around 1GB, still too big.  The increase comes mainly from the multiple SKIPPED messages for non-disabled tests, which is expected due to how SkipTest and pytest-flakyfinder currently work.

I updated `test/conftest.py` to completely ignore skipped tests when rerunning disabled tests, instead of collecting and then skipping 50 copies of each.  The benefit of doing this is much more than I originally expected:
  * Rerun-disabled-tests jobs now finish in less than half an hour, as they should
  * Fixes the OOM runner crash caused by too many collected tests
  * Fixes the verbosity issue, as now only disabled tests are run x50 times; there are only a few hundred of them at the moment
  * Fixes the timeout issue when rerunning disabled distributed and ASAN tests; they are just too slow when run at x50

### Testing

When rerunning disabled tests https://github.com/pytorch/pytorch/actions/runs/5084508614, only disabled tests on the platform are run, for example `test_ops_jit` on https://ossci-raw-job-status.s3.amazonaws.com/log/13770164954 only ran 100 tests (`test_variant_consistency_jit_linalg_lu_cuda_float32` + `test_variant_consistency_jit_linalg_lu_factor_cuda_complex64`) x50.

```
Executing ['/opt/conda/envs/py_3.10/bin/python', '-bb', 'test_ops_jit.py', '--shard-id=1', '--num-shards=2', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '--sc=test_ops_jit_1', '--flake-finder', '--flake-runs=50', '--import-slow-tests', '--import-disabled-tests', '--rerun-disabled-tests'] ... [2023-05-25 21:32:49.763856]

Expand the folded group to see the log file of test_ops_jit 2/2
##[group]PRINTING LOG FILE of test_ops_jit 2/2 (/var/lib/jenkins/workspace/test/test-reports/test_ops_jit_h2wr_t2c.log)
Test results will be stored in test-reports/python-pytest/test_ops_jit/test_ops_jit-51a83bd44549074e.xml
============================= test session starts ==============================
platform linux -- Python 3.10.11, pytest-7.3.1, pluggy-1.0.0 -- /opt/conda/envs/py_3.10/bin/python
cachedir: .pytest_cache
hypothesis profile 'pytorch_ci' -> database=None, max_examples=50, derandomize=True, suppress_health_check=[HealthCheck.too_slow]
rootdir: /var/lib/jenkins/workspace
configfile: pytest.ini
plugins: hypothesis-5.35.1, cpp-2.3.0, flakefinder-1.1.0, rerunfailures-11.1.2, shard-0.1.2, xdist-3.3.0, xdoctest-1.1.0
collecting ... collected 1084 items
Running 100 items in this shard: test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_cuda_float32 (x50), test/test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_cuda_complex64 (x50)
stepcurrent: Cannot find last run test, not skipping

test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_cuda_float32 PASSED [2.1876s] [  1%]
test_ops_jit.py::TestJitCUDA::test_variant_consistency_jit_linalg_lu_factor_cuda_complex64 PASSED [4.5615s] [  2%]
```

* [pull](https://github.com/pytorch/pytorch/actions/runs/5093566864)
* [trunk](https://github.com/pytorch/pytorch/actions/runs/5095364311)
* [periodic](https://github.com/pytorch/pytorch/actions/runs/5095378850)
* [slow](https://github.com/pytorch/pytorch/actions/runs/5095390285)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102107
Approved by: https://github.com/clee2000, https://github.com/malfet
2023-05-27 12:10:36 +00:00
Edward Z. Yang
e7a6818e97 Register top level logger for torch (#102090)
This enables use of artifact logging in modules that aren't under
the modules that were specified here.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102090
Approved by: https://github.com/Skylion007, https://github.com/mlazos
2023-05-23 21:24:21 +00:00
Catherine Lee
a26516b78b Add inductor as a test disable group (#101448)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101448
Approved by: https://github.com/huydhn, https://github.com/malfet
2023-05-16 21:48:49 +00:00
William Wen
0e811044bd [dynamo 3.11] enable other torch 3.11 dynamo-related tests (#99180)
Notes:
- No segfaults observed in any CI tests: dynamo unittests, inductor unittests, dynamo-wrapped pytorch tests. So we remove the warning that using dynamo 3.11 may result in segfaults.
- Fixed a weakreflist copying bug that caused a few dynamo-wrapped tests to hang.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99180
Approved by: https://github.com/malfet, https://github.com/TamirFriedman-RecoLabs
2023-05-15 22:06:28 +00:00
Edward Z. Yang
96487d0d1f Refactor after_dynamo to have a CLI interface too. (#101220)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101220
Approved by: https://github.com/anijain2305
2023-05-14 19:03:16 +00:00