## The problem.
[A commit from three weeks ago](82d00acfee) appears to have broken five tests but was not caught by CI.
[A later commit](https://github.com/pytorch/pytorch/commit/e05ea2b1797) which added a decomposition of `transpose_copy` added another broken test, also seemingly not detected, making six total (listed below).
They came to my attention when I updated some pending decomposition pull requests which passed CI, and started getting failures like [this](https://hud.pytorch.org/pr/134319) for a test unrelated to any of these pull requests, `TestCommonCPU.test_out__refs_transpose_copy_cpu_float32`
Running `python test/test_ops.py -k _copy` on `viable/strict` found failures for six `_refs` ops: `copysign`, `expand_copy`, `index_copy`, `t_copy`, `transpose_copy`, `view_copy`
## The solution
The original commit did actually cause breakage by slightly changing user-visible behavior (in a special case involving scalar tensors being copied between different devices).
This pull request fixes that breakage in a reasonable way, but I don't understand why this error didn't appear in CI until I made later changes in the same area.
## To reproduce
To reproduce the six cases in your own client:
```
PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=5 python test/test_ops.py TestCommonCPU.test_out__refs_view_copy_cpu_float32
PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=2 python test/test_ops.py TestCommonCPU.test_out__refs_t_copy_cpu_float32
PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=0 python test/test_ops.py TestCommonCPU.test_out__refs_index_copy_cpu_float32
PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=7 python test/test_ops.py TestCommonCPU.test_out__refs_expand_copy_cpu_float32
PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=0 python test/test_ops.py TestCommonCPU.test_out__refs_copysign_cpu_float32
PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=4 python test/test_ops.py TestCommonCPU.test_out__refs_transpose_copy_cpu_float32
```
@amjames
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136653
Approved by: https://github.com/zou3519