This PR is part of a series attempting to re-submit https://github.com/pytorch/pytorch/pull/134592 as smaller PRs.
In JIT tests:
- Add and use a common `raise_on_run_directly` method for the case where a user directly runs a test file that should not be run that way. It prints the file the user should have run instead.
- Raise a `RuntimeError` for tests that have been disabled (not run).
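For illustration, a minimal sketch of what such a helper could look like; the name `raise_on_run_directly` comes from this PR, but the signature and message below are assumptions:
```python
# Hypothetical sketch of the shared helper described above; the real
# implementation in the PyTorch test utilities may differ.
def raise_on_run_directly(file_to_run: str) -> None:
    # Tell the user which file actually drives these tests, then fail loudly.
    raise RuntimeError(
        "This test file is not meant to be run directly. "
        f"Run 'python {file_to_run}' instead."
    )

if __name__ == "__main__":
    raise_on_run_directly("test/test_jit.py")
```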
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154725
Approved by: https://github.com/clee2000
Fixes #118990
The root cause is that `out_features` of Linear does not match `num_features` of BatchNorm, resulting in a shape mismatch while computing `fused_w` and `fused_b`. This can happen in linear-bn folding because the linear layer operates over the last dim, `(*, H_in)`, while the bn layer operates over the channel dim of `(N, C_in, H, W)`.
To preserve the shapes of the original linear weight and bias in linear-bn folding, check that linear `out_features` matches bn `num_features`. If they don't match, bn `num_features` needs to be 1 so that it can broadcast.
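A minimal sketch of the shape condition described above (the helper name and structure are illustrative, not the actual folding pass):
```python
import torch.nn as nn

def can_fold_linear_bn(linear: nn.Linear, bn: nn.BatchNorm1d) -> bool:
    # Folding is only safe when the bn statistics line up with the linear
    # output features, or when bn has a single feature that can broadcast.
    return bn.num_features == linear.out_features or bn.num_features == 1
```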
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119264
Approved by: https://github.com/eellison
Fixes #68972
Relands #107246
To avoid causing Meta-internal CI failures, this PR avoids always asserting that the default dtype is float in the `TestCase.setUp/tearDown` methods. Instead, the assert is only done if `TestCase._default_dtype_check_enabled == True`. `_default_dtype_check_enabled` is set to `True` in the `if __name__ == "__main__":` blocks of all the relevant test files that required changes for this issue.
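The resulting pattern looks roughly like the following (a sketch based on the description above; the exact test contents are illustrative):
```python
import torch
from torch.testing._internal.common_utils import TestCase, run_tests

class TestSomething(TestCase):
    def test_example(self):
        self.assertEqual(torch.ones(2).dtype, torch.float32)

if __name__ == "__main__":
    # Opt this file into the "default dtype must be float" check that
    # TestCase.setUp/tearDown perform when the flag is enabled.
    TestCase._default_dtype_check_enabled = True
    run_tests()
```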
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108088
Approved by: https://github.com/ezyang
This PR allows freezing modules like the one below:
```python
# Ex. 1
@torch.jit.interface
class ModuleInterface(torch.nn.Module):
    def forward(self, inp: torch.Tensor) -> torch.Tensor:
        pass

class ImplementsInterface(torch.nn.Module):
    def __init__(self):
        super(ImplementsInterface, self).__init__()
        self.sum = torch.zeros((2, 2))

    def forward(self, inp: torch.Tensor) -> torch.Tensor:
        self.sum += inp.relu()  # this makes the interface-implementing module mutable
                                # and previously this would prevent freezing
        return self.sum

class WrapperModule(torch.nn.Module):
    impl: ModuleInterface

    def __init__(self):
        super().__init__()
        self.impl = ImplementsInterface()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.impl.forward(x)
```
Previously during freezing, we handled interfaces as shown below:
1. we inline interfaces in any preserved method graphs
2. during `cleanupFrozenModule`, we try to simplify the module data structure (this part is unrelated to freezing so far). During this step, if we found that an interface type was mutable, we'd error out, because of the possibility of a module that _swaps out the value of an interface-typed attribute at runtime_.
Below is an example of a module that swaps out the value of an interface-typed attribute at runtime:
```python
# Ex. 2
class MyBadModule(torch.nn.Module):
    impl: MyInterface
    option1: IfaceImpl1
    option2: IfaceImpl2
    ....

    def forward(self, x):
        if x > 0:
            self.impl = self.option1
        else:
            self.impl = self.option2
        ....
```
^ this type of situation cannot be supported by freezing (or at least would be difficult to do correctly) because it greatly complicates the details of handling types and simplifying the module data structure.
But we can still support the first example without _too_ much work:
1. inline the interface code as before
2. check to see if we have any setattrs on interface types; if so, error out
3. otherwise, replace the interface types with their concrete implementation types
4. continue simplifying the module data structure as if we never had any interfaces.
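With that in place, freezing the module from Ex. 1 is expected to work roughly as follows (a usage sketch building on the classes defined above; not taken verbatim from the PR's tests):
```python
# Freezing the WrapperModule from Ex. 1: the interface-typed attribute is
# never reassigned, so its type can be replaced with the concrete
# ImplementsInterface type and freezing can proceed.
scripted = torch.jit.script(WrapperModule())
scripted.eval()
frozen = torch.jit.freeze(scripted)
out = frozen(torch.ones(2, 2))
```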
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86039
Approved by: https://github.com/eellison
Not sure why, but this started throwing a lot of warnings while I was adding tests to test_freezing.py, so I'm removing the deprecated escape sequences to get rid of the warnings.
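For context, an illustrative example of the kind of warning involved (the specific strings in the touched files are not reproduced here):
```python
import re

# A pattern written as "\d+" in a plain string literal relies on an invalid
# escape sequence and emits a deprecation/syntax warning on recent Python
# versions. Raw strings avoid the warning:
pattern = r"\d+"
print(re.findall(pattern, "abc123"))  # ['123']
```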
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85987
Approved by: https://github.com/eellison
Test was marked as `skip` due to a memory leak. It turns out the memory leak is expected: it can be fixed by clearing the compilation unit (with `torch.jit._state._python_cu.drop_all_functions()` at the end of the test function) or by disabling the leak detector on this test.
Fixes#77618
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78566
Approved by: https://github.com/eellison
Original PR: #77295
Original commit message:
On GPU, conv errors if not all its inputs have the same dtype.
In the case of autocasting during freezing, what we see is:
1) inputs to conv are cast to half
2) inputs to batchnorm are not cast, so many of them are still float
3) we try to fold conv + batchnorm by finding a different weight and bias such that conv(input, new_weight, new_bias) is equivalent to the original conv -> batchnorm.
If the conv previously had no bias (its bias input is optional), then during freezing we temporarily create a zero-valued bias as a placeholder for conv_bias. We want to construct it to have the same dtype as the weight input to conv, to avoid errors on GPU.
Reland changes:
There's a memory leak from the CUDA caching allocator that is a side effect of this fix. The memory leak causes the test to fail, though for some reason it didn't fail in CI on the last PR. This skips the tests for now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77617
Approved by: https://github.com/eellison
On GPU, conv errors if not all its inputs have the same dtype.
In the case of autocasting during freezing, what we see is:
1) inputs to conv are cast to half
2) inputs to batchnorm are not cast, so many of them are still float
3) we try to fold conv + batchnorm by finding a different weight and bias such that conv(input, new_weight, new_bias) is equivalent to the original conv -> batchnorm.
If the conv previously had no bias (its bias input is optional), then during freezing we temporarily create a zero-valued bias as a placeholder for conv_bias. We want to construct it to have the same dtype as the weight input to conv, to avoid errors on GPU.
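A minimal sketch of the dtype-matching idea for the placeholder bias (illustrative only; the actual folding pass lives in the JIT's C++ freezing code):
```python
import torch

def make_placeholder_conv_bias(conv_weight: torch.Tensor) -> torch.Tensor:
    # When the conv has no bias, conv-bn folding needs one to fold the bn
    # shift into. Matching the weight's dtype and device avoids mixed-dtype
    # errors on GPU when autocast has already cast the weight to half.
    out_channels = conv_weight.shape[0]
    return torch.zeros(out_channels, dtype=conv_weight.dtype, device=conv_weight.device)
```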
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77295
Approved by: https://github.com/eellison
Relax the check condition of Conv -> Add/Sub/Mul/Div folding to accept cases where the other input tensor of the Add/Sub/Mul/Div is of floating type, or where promoting its type with the conv weight's type (`promoteTypes`) yields the conv weight's type.
Relaxing this condition mainly deals with a common situation in models:
the conv output is added to, subtracted by, multiplied by, or divided by an integer tensor or an integer scalar tensor.
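For example, a pattern like the following (a hypothetical module shown only to illustrate the situation described above) should now be foldable:
```python
import torch
import torch.nn as nn

class ConvScale(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3)

    def forward(self, x):
        # Multiplying the float conv output by an integer scalar promotes
        # back to the conv weight's dtype, so folding remains valid.
        return self.conv(x) * 2

frozen = torch.jit.freeze(torch.jit.script(ConvScale()).eval())
```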
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73278
Approved by: https://github.com/eellison
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68668
This updates `run_frozen_optimizations` so that it runs on additional methods other than `forward`.
ghstack-source-id: 143871758
Test Plan:
Added test in test_freezing.py
```
python3 test/test_jit.py -- test_conv_bn_folding_not_forward
```
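A sketch of the kind of module this enables optimizing (the module, method names, and shapes here are illustrative, not the exact test added in test_freezing.py):
```python
import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3)
        self.bn = nn.BatchNorm2d(8)

    def forward(self, x):
        return self.conv(x)

    @torch.jit.export
    def infer(self, x):
        # conv-bn folding should now also run on this preserved method,
        # not just on forward.
        return self.bn(self.conv(x))

frozen = torch.jit.freeze(torch.jit.script(M()).eval(), preserved_attrs=["infer"])
```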
Reviewed By: eellison
Differential Revision: D32567857
fbshipit-source-id: 75e56efad576404dc8d6897861d249573f5ccd7a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68316
Consider the following:
```
class Mod(nn.Module):
    def __init__(self, val):
        super().__init__()
        self.param = nn.Parameter(val)

    def forward(self, x):
        # this method will change during freezing
        return x + self.param

    @torch.jit.export
    def make_prediction(self, x):
        y = x + x
        return self.forward(y)

param = torch.rand([2, 2])
unscripted_mod = Mod(param)
mod = torch.jit.script(unscripted_mod)
mod.eval()
mod = torch.jit.freeze(mod, preserved_attrs=["make_prediction"])
```
During freezing the following will occur:
1. do some pre-freezing, including inlining; in particular, `forward` will be inlined into `make_prediction`. During inlining, `forward.optimized_graph()` is called, and the result is cached
2. freeze some methods. While freezing `forward`, the graph associated with the function gets updated, but the cached `optimized_graphs_` are not updated.
Previously, a call to `mod.forward(x)` would return an executor that would run on the old cached `optimized_graph()`. This would mean that the freezing optimizations would not apply, and potentially that the execution would fail because of parameters removed from the module.
This change clears the optimized_graphs_ cache after running freezing to prevent executing an old version of the graph.
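Continuing the example above, the fixed behavior can be exercised roughly like this (a sketch; the exact failure mode before the fix depended on which parameters freezing removed):
```python
x = torch.rand([2, 2])

# After this change, forward re-optimizes against the frozen graph instead of
# reusing the stale cached optimized_graph, so both preserved methods agree
# with the eager computation.
assert torch.allclose(mod.make_prediction(x), (x + x) + param)
assert torch.allclose(mod.forward(x), x + param)
```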
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D32410862
Pulled By: davidberard98
fbshipit-source-id: dd8bfe86ec2898b7c72813ab32c08f25c38e4cea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68367
- bmm_test.py was using syntax not allowed in Python 3.6.
- Some lint suppressions were not placed on the correct line.
With this file,
```
lintrunner --paths-cmd='git grep -Il .'
```
passes successfully.
Test Plan: Imported from OSS
Reviewed By: janeyx99, mrshenli
Differential Revision: D32436644
Pulled By: suo
fbshipit-source-id: ae9300c6593d8564fb326822de157d00f4aaa3c2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67437
Certain ops do nothing on the forward pass and can be discarded after training: `aten::detach` and `fb::scale_gradient` are examples of this.
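For illustration, a module exercising one of these ops might look like the following (a sketch, not the exact test; the internal `fb::scale_gradient` op is Meta-specific and not shown):
```python
import torch
import torch.nn as nn

class Detacher(nn.Module):
    def forward(self, x):
        # detach is a no-op for inference-only execution, so freezing can
        # drop it from the graph.
        return x.detach() + 1

frozen = torch.jit.freeze(torch.jit.script(Detacher()).eval())

# After freezing, the graph should no longer contain aten::detach nodes
# (assuming the optimizations from this change are applied).
has_detach = any(n.kind() == "aten::detach" for n in frozen.graph.nodes())
print("aten::detach still present:", has_detach)
```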
Test Plan: `buck test caffe2/test:jit -- test_freezing`
Reviewed By: hlu1
Differential Revision: D31980843
fbshipit-source-id: 0045b6babcfae786a2ce801b2f5997a078205bc0