[BE][Ez]: Fix minor potential perf regression from #123960 (#124013)

The `non_blocking` arg here is useless if the values are all eagerly consumed, so revert the change.
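The difference can be seen in pure Python, without any GPU: inside `sum(...)`, a generator expression launches each transfer only when the next item is pulled, while a list comprehension launches all of them up front, which is what lets `non_blocking=True` copies overlap. A minimal sketch, using a hypothetical `fake_to` stand-in that only records when each "copy" is launched:

```python
# Log of when each "transfer" is launched.
launch_log = []

def fake_to(x):
    # Hypothetical stand-in for t.to(device, non_blocking=True):
    # records the moment the copy is *launched*, then returns the value.
    launch_log.append(x)
    return x

values = [1, 2, 3]

# Generator expression: lazy. Nothing is launched until sum() pulls items,
# so the copies are issued one at a time and cannot overlap.
gen = (fake_to(v) for v in values)
assert launch_log == []          # no copies launched yet
total_gen = sum(gen)             # copies launched one per sum() step

launch_log.clear()

# List comprehension: eager. All copies are launched before sum() runs,
# so non-blocking transfers have a chance to overlap with each other.
eager = [fake_to(v) for v in values]
assert launch_log == [1, 2, 3]   # all launched up front
total_list = sum(eager)

assert total_gen == total_list == 6
```

Both forms produce the same sum; only the launch timing differs, which is why the list form is restored here despite flake8-comprehensions flagging it (hence the `# noqa: C419`).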

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124013
Approved by: https://github.com/ezyang
Author: Aaron Gokaslan
Date: 2024-04-15 16:51:40 +00:00
Committed by: PyTorch MergeBot
Parent: fea1b99d89
Commit: 9c4fc5fa34


@@ -426,8 +426,10 @@ class GradScaler:
         found_inf = cast(
             torch.Tensor,
             sum(
-                t.to(scaler.device, non_blocking=True)
-                for t in optimizer_state["found_inf_per_device"].values()
+                [  # noqa: C419
+                    t.to(scaler.device, non_blocking=True)
+                    for t in optimizer_state["found_inf_per_device"].values()
+                ]
             ),
         )
         optimizer.grad_scale = (  # type: ignore[attr-defined]