Thanks @drisspg and @albanD for finding the fix
**TEST PLAN**
```
import gc
import torch
import torch.nn as nn
from torch.utils.module_tracker import ModuleTracker

class MyModel(nn.Module):
    def forward(self, x):
        return x * x

print(f"torch=={torch.__version__}")

m = MyModel()
m.cuda()
m.to(torch.bfloat16)

mt = ModuleTracker()

for i in range(1000):
    if i % 100 == 0:
        gc.collect()
        print("memory_allocated:", torch.cuda.memory_allocated())
    x = torch.randn([128, 256], device="cuda", dtype=torch.bfloat16, requires_grad=True)
    with mt:
        m(x)
```
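With the fix in place, the `memory_allocated` values printed every 100 iterations are expected to stay roughly flat across the loop instead of growing (assuming the leak was the tracker keeping per-iteration tensors alive across `with mt:` blocks).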
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141960
Approved by: https://github.com/albanD
The current call passes `['/actual/path']` to `os.walk`, which ends up as a string pointing to no existing path and thus silently results in an empty traversal.
There is an unused function just above that handles this correctly, so I assume that is what was supposed to be called.
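To illustrate the failure mode (a generic sketch, not the actual call site in the repo): `os.walk` does not raise on a top directory that does not exist, it simply yields nothing, which is why the bad argument went unnoticed.
```
import os

# A non-existent top directory is silently ignored: os.walk catches the
# OSError from scandir and stops, yielding an empty traversal.
print(list(os.walk("/this/path/does/not/exist")))  # []

# The same happens when the argument is the stringified list:
# "['/actual/path']" is not a real directory, so nothing is walked.
print(list(os.walk(str(["/actual/path"]))))  # []
```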
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126103
Approved by: https://github.com/suo
This does a few things that were originally separate PRs, but I am on a new machine and don't have ghstack.
If it is too problematic to review, I can re-split; just let me know.
This does:
- Clean up context manager use in test_flop_counter
- Remove the need for the `mod` argument in `FlopCounterMode` and warn when it is still passed
- Re-implement a Module tracker from scratch using global forward Module hooks and `multi_grad_hook` (we cannot use global backward Module hooks because they don't handle nested Tensors and they are custom-Function based rather than multi_grad_hook based); see the sketch after this list
- Update `FlopCounterMode` to use the new ModuleTracker. The entire existing test suite passes as-is (the only changes there are the new tests and the refactoring mentioned above)
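As a rough illustration of the hook combination mentioned above (a hypothetical, simplified sketch, not the actual ModuleTracker implementation): global forward pre/post hooks track which modules are currently running forward, and a `multi_grad_hook` registered on the forward outputs fires when backward reaches the same module. The `active` set and the print messages are illustrative only.
```
import torch
import torch.nn as nn
from torch.autograd.graph import register_multi_grad_hook
from torch.nn.modules.module import (
    register_module_forward_hook,
    register_module_forward_pre_hook,
)

# Simplified tracker: which modules are currently running forward,
# plus a notification when backward reaches them again.
active = set()

def fw_pre_hook(module, args):
    # Fires for every nn.Module forward (global hook).
    active.add(type(module).__name__)

def fw_post_hook(module, args, output):
    name = type(module).__name__
    active.discard(name)
    # Hook the forward outputs: once their gradients are computed,
    # backward has reached this module.
    outs = output if isinstance(output, tuple) else (output,)
    tensors = [t for t in outs if isinstance(t, torch.Tensor) and t.requires_grad]
    if tensors:
        register_multi_grad_hook(tensors, lambda grads: print("backward reached", name))

h1 = register_module_forward_pre_hook(fw_pre_hook)
h2 = register_module_forward_hook(fw_post_hook)

m = nn.Linear(4, 4)
m(torch.randn(2, 4, requires_grad=True)).sum().backward()

h1.remove()
h2.remove()
```
And since `FlopCounterMode` now builds on this tracker, it no longer needs a module argument; a minimal usage sketch:
```
import torch
import torch.nn as nn
from torch.utils.flop_counter import FlopCounterMode

lin = nn.Linear(8, 8)
with FlopCounterMode(display=False) as flop_counter:  # no module argument needed
    lin(torch.randn(4, 8))
print(flop_counter.get_total_flops())
```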
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125352
Approved by: https://github.com/mikaylagawarecki