pytorch/torch/testing/_internal
Joel Schlosser e7ec294c10 NJT OpInfo tests v2 (#138370)
This PR updates OpInfo-based tests for NJTs:
* Adds extensive coverage across non-contiguous NJTs (both non-contiguous transposed and non-contiguous with holes)
    * The `_sample_njts()` helper that `sample_input_func`s use now produces non-contig NJTs as well (see the sketch after this list)
* Utilizes a `SampleInput`-based xfail system for granular classification of bugs. For example, it's possible to indicate that a class of ops is expected to fail only on non-contiguous-with-holes NJT inputs.
    * I decided on adding `SampleInput`s and utilizing this system over using test parametrization for two reasons:
        * Test perf: adding `SampleInput`s is faster than generating entirely new tests
        * Avoiding the possibility of `sample_input_func`s not respecting the non-contig test parameter, which would result in these tests silently (and incorrectly) passing. Keeping the responsibility for `SampleInput` generation firmly within each `OpInfo`'s `sample_input_func` means weirdness like this isn't possible
* Improves `SampleInput` naming for a bunch of `sample_input_func`s. This makes it easier to xfail specific samples as needed. For example, binary / unary / other ops now use the new `_describe_njt()` helper to get a string repr that uniquely identifies the type of NJT being passed to the op
* Adds appropriate `XFailRule`s to get tests passing for forward / backward / forward compile / backward compile. In general, each xfail corresponds to some bug that needs to be fixed
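For reference, here is a minimal sketch of the three NJT flavors the updated tests exercise, built with public `torch.nested` APIs. This is illustrative only, not the actual `_sample_njts()` implementation, and the flavor descriptions in the comments are hypothetical stand-ins for `_describe_njt()` output.

```python
import torch

# contiguous NJT, shape (B=2, j1, 5)
njt = torch.nested.nested_tensor(
    [torch.randn(2, 5), torch.randn(3, 5)], layout=torch.jagged
)

# non-contiguous transposed NJT: ragged dim moved off dim 1 -> shape (2, 5, j1)
njt_transposed = njt.transpose(1, 2)

# non-contiguous NJT with holes: per-component lengths are shorter than the gaps
# between offsets, so some rows of `values` are skipped over ("holes")
values = torch.randn(10, 5)
offsets = torch.tensor([0, 4, 10])
lengths = torch.tensor([3, 5])
njt_with_holes = torch.nested.nested_tensor_from_jagged(
    values, offsets=offsets, lengths=lengths
)
```

The `XFailRule` structure that powers the per-sample xfail system: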

```python
from dataclasses import dataclass
from typing import Callable, Optional, Type

import torch
from torch.testing._internal.opinfo.core import OpInfo, SampleInput


# Represents a rule indicating how to xfail a particular test. It allows granularity
# at the device, dtype, op, and individual sample levels. This flexibility allows an
# entire bug to be represented by a single rule, even if this corresponds with
# multiple conceptual test cases across multiple ops.
@dataclass
class XFailRule:
    # expected error type
    error_type: Type[Exception] = Exception
    # expected error message (regex matched against the raised error's message)
    error_msg: str = ".*"
    # predicate indicating whether the rule applies; returns True if so
    match_fn: Optional[
        Callable[[torch.device, torch.dtype, OpInfo, SampleInput], bool]
    ] = None
    # optional name for identifying the rule
    name: str = ""

    def match(self, device, dtype, op, sample) -> bool:
        return self.match_fn is not None and self.match_fn(device, dtype, op, sample)
```

Example:
```python
    # Bug when broadcasting a binary op with non-contiguous with holes NJT + dense
    # tensor with 1 in ragged dim.
    XFailRule(
        error_type=RuntimeError,
        error_msg="cannot call binary pointwise function .* with inputs of shapes",
        match_fn=lambda device, dtype, op, sample: (
            isinstance(op, BinaryUfuncInfo)
            and "noncontig_holes" in sample.name
            and "broadcasting 1 over ragged" in sample.name
        ),
        name="binary_noncontig_holes_broadcasting_1_over_ragged",
    ),
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138370
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
ghstack dependencies: #140160
2024-11-11 19:35:24 +00:00
codegen
data
distributed [DTensor][Test] Fix gloo backend failure when eager_init is turned on (#139097) 2024-10-29 00:04:06 +00:00
generated
opinfo NJT OpInfo tests v2 (#138370) 2024-11-11 19:35:24 +00:00
optests Revert "Deprecate torch._utils.is_compiling() and torch._dynamo.external_utils.is_compiling() (#127690)" 2024-11-05 23:10:38 +00:00
test_module
__init__.py
autocast_test_lists.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
autograd_function_db.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
check_kernel_launches.py
common_cuda.py Make sure all SDPA tests are ran with tensor cores enabled (#135592) 2024-10-29 20:53:10 +00:00
common_device_type.py [ROCm] re-enable flex attention UTs (#139632) 2024-11-06 12:49:44 +00:00
common_dist_composable.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
common_distributed.py Ensure TORCH_TRACE is run for Dynamo/Distributed tests (#139786) 2024-11-07 01:58:05 +00:00
common_dtype.py
common_fsdp.py [CI] Add Compiled DDP / Compiled FSDP2 / compute-comm reordering tests to test_inductor_distributed (#138178) 2024-10-20 19:38:18 +00:00
common_jit.py
common_methods_invocations.py Recover non-standard bool test for msort (#139870) 2024-11-11 02:00:34 +00:00
common_mkldnn.py
common_modules.py [MPS][Perf] Dispatch to SDP-math-mps for non-contig Tensors (#139791) 2024-11-06 16:25:39 +00:00
common_nn.py
common_optimizers.py Add Support for Tracking Parameter Names (named_parameters) in Optimizer State Dict (#134107) 2024-10-14 19:24:44 +00:00
common_pruning.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
common_quantization.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
common_quantized.py [3/N] Don't skip ASAN on some tests (#139058) 2024-10-28 23:57:23 +00:00
common_subclass.py
common_utils.py NJT OpInfo tests v2 (#138370) 2024-11-11 19:35:24 +00:00
composite_compliance.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
custom_op_db.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
custom_tensor.py
dist_utils.py
dynamo_test_failures.py
fake_config_module.py Add type annotations to Configs (#139833) 2024-11-07 03:49:09 +00:00
hop_db.py [invoke_subgraph] User facing API to support arbitrary args and kwargs (#139162) 2024-11-08 03:31:19 +00:00
hypothesis_utils.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
inductor_utils.py [Inductor UT] Generalize newly introduced inductor UTs for intel GPU (Part 3) (#136947) 2024-10-12 13:21:20 +00:00
jit_metaprogramming_utils.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
jit_utils.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
logging_tensor.py
logging_utils.py
quantization_torch_package_models.py
static_module.py
subclasses.py [aotd] coerce_same_metadata_as_tangent with expected_type for e.g.AsyncCollectiveTensor (#139095) 2024-11-07 16:24:48 +00:00
torchbind_impls.py Remove unused Python variables in torch/[b-z]* (#136963) 2024-10-19 16:45:22 +00:00
triton_utils.py Add host-side Triton TMA support to Dynamo (#137677) 2024-10-16 02:18:48 +00:00
two_tensor.py Fix tensor subclass + dynamic shapes in torch.compile + aot autograd (#125941) 2024-10-28 21:58:59 +00:00