Commit Graph

168 Commits

Author SHA1 Message Date
Alban Desmaison
3569c59600 Inverse logic of persistent set and prevent use in jit (#38131)
Summary:
jit.ScriptModule deletes all the actual attributes but still uses the nn.Module implementation.
Since I don't know how to add this new set() to the ScriptModule, it is simpler to just raise a nice error for now.
I also inverted the logic so that an empty set() (which is always the case in a ScriptModule) means that everything is persistent.

cc zdevito should we open an issue to add this to the ScriptModule?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38131
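As a rough sketch (attribute and helper names here are illustrative, not necessarily those in the PR), the inverted bookkeeping looks like this: the set records *non*-persistent buffer names, so an empty set means every buffer ends up in the state dict.
```python
# Illustrative sketch of the inverted logic: the set tracks buffers that are
# NOT persistent, so an empty set (the only state a ScriptModule can have)
# means everything is saved.
def _save_buffers(module, destination, prefix=""):
    non_persistent = getattr(module, "_non_persistent_buffers_set", set())
    for name, buf in module._buffers.items():
        if buf is not None and name not in non_persistent:
            destination[prefix + name] = buf
```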

Differential Revision: D21502183

Pulled By: albanD

fbshipit-source-id: 96f83098d9a2a9156e8af5bf5bd3526dd0fefc98
2020-05-11 09:59:24 -07:00
Sharvil Nanavati
594b33ea10 Add support for non-persistent buffers. (#37191)
Summary:
Issue: https://github.com/pytorch/pytorch/issues/18056
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37191
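A minimal usage sketch of the feature, assuming the `persistent` keyword argument on `register_buffer` introduced by this PR:
```python
import torch
import torch.nn as nn

class RunningStats(nn.Module):
    def __init__(self):
        super().__init__()
        # Persistent buffer (the default): included in state_dict.
        self.register_buffer("running_mean", torch.zeros(10))
        # Non-persistent buffer: moved by .to()/.cuda(), but never serialized.
        self.register_buffer("scratch", torch.zeros(10), persistent=False)

m = RunningStats()
print("running_mean" in m.state_dict())  # True
print("scratch" in m.state_dict())       # False
```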

Differential Revision: D21428373

Pulled By: albanD

fbshipit-source-id: a7d367bafb95137e1bc380178b82b08eff5d5a5a
2020-05-07 06:52:31 -07:00
Peter Bell
f8ec51bd86 Ensure DataParallel replicas can be saved (#37307)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/37182

The `zero_grad` wrapper from `_replicate_for_data_parallel` can't be pickled. So instead, I set an attribute `_is_replica = True` and check for this in `Module.zero_grad`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37307
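A hedged sketch of the described approach (the warning text and exact flag handling are illustrative):
```python
import warnings
import torch.nn as nn

class MyModule(nn.Module):
    def zero_grad(self):
        # Replicas are tagged with a plain, picklable attribute instead of a
        # wrapped (unpicklable) zero_grad method.
        if getattr(self, "_is_replica", False):
            warnings.warn(
                "zero_grad() was called on a DataParallel replica; the "
                "gradients of the original module are unaffected.")
            return
        super().zero_grad()
```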

Differential Revision: D21246119

Pulled By: mrshenli

fbshipit-source-id: 4755786d48a20bc247570ba672de9dd526914ce1
2020-04-25 20:57:24 -07:00
Alban Desmaison
3799d1d74a Fix many doc issues (#37099)
Summary:
Fix https://github.com/pytorch/pytorch/issues/35643 https://github.com/pytorch/pytorch/issues/37063 https://github.com/pytorch/pytorch/issues/36307 https://github.com/pytorch/pytorch/issues/35861 https://github.com/pytorch/pytorch/issues/35299 https://github.com/pytorch/pytorch/issues/23108 https://github.com/pytorch/pytorch/issues/4661

Just a bunch of small updates to the docs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37099

Differential Revision: D21185713

Pulled By: albanD

fbshipit-source-id: 4ac06d6709dc0da6109a6ad3daae75667ee5863e
2020-04-23 10:01:03 -07:00
Zachary DeVito
967cdc2baf Simplify replicate logic (#36174)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36174

Test Plan: Imported from OSS

Differential Revision: D20903301

Pulled By: zdevito

fbshipit-source-id: 714a32fe417b7d1615886936c41505d1ba538f47
2020-04-13 11:21:43 -07:00
mpariente
79054495d3 (Fixes #33934) Fix AttributeError for nn.Module's properties (#34324)
Summary:
As described in https://github.com/pytorch/pytorch/issues/33934, the AttributeError currently raised from `nn.Module` properties is wrong.

```python
from torch import nn

class MyModule(nn.Module):
    @property
    def something(self):
        hey = self.unknown_function()
        return hey

model = MyModule()
print(model.something)
```
This raises `AttributeError: 'MyModule' object has no attribute 'something'` when what we want is `AttributeError: MyModule instance has no attribute 'unknown_function'`.

This fixes the issue and makes properties much easier to debug!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34324
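For context, a small self-contained illustration (plain Python, no torch) of why the misleading message appears: when a property body raises AttributeError, Python falls back to `__getattr__`, which then reports the property name instead of the real culprit.
```python
class Demo:
    @property
    def something(self):
        raise AttributeError("unknown_function")  # the real cause

    def __getattr__(self, name):
        # Fallback hides the original error and blames `name` instead.
        raise AttributeError(f"'Demo' object has no attribute '{name}'")

try:
    Demo().something
except AttributeError as e:
    print(e)  # 'Demo' object has no attribute 'something'
```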

Differential Revision: D20645563

Pulled By: ezyang

fbshipit-source-id: 130f861851bdbef43803569a5ce9e24d2b942179
2020-03-26 07:43:21 -07:00
rohithkrn
66ee4f1c81 [ROCm] Enable Bfloat16 type for activation and batch-norm
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/32065

Differential Revision: D19728858

Pulled By: ezyang

fbshipit-source-id: 8f828c558bfe6c5f43f476ff8a0f967341f8f351
2020-02-11 21:04:20 -08:00
Peter Bell
efba630287 Issue a warning when zero_grad is used in DataParallel (#33064)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/31768, second attempt of https://github.com/pytorch/pytorch/issues/32870

DataParallel creates replicas of the original `nn.Module` with the parameters duplicated onto the destination devices. Calling `backward` will propagate gradients onto the original module parameters, but calling `zero_grad` on the replica module doesn't clear the gradients from the parent module. However, any replica calling backward was broken anyway since the replica's parameters are not leaf nodes in autograd. So, we should issue a warning.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33064

Differential Revision: D19790178

Pulled By: albanD

fbshipit-source-id: 886f36640acef4834a6fa57a26ce16b42ff0e9ad
2020-02-10 07:04:27 -08:00
Richard Zou
8195961f20 Revert D19730209: [pytorch][PR] Issue a warning when using zero_grad in DataParallel
Test Plan: revert-hammer

Differential Revision:
D19730209

Original commit changeset: cb9b2cb0c2e0

fbshipit-source-id: 5bf53ea3c37a7ed2411a2acc34e40d07eff144c9
2020-02-06 07:05:51 -08:00
Peter Bell
46c3c18bcc Issue a warning when using zero_grad in DataParallel (#32870)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/31768

`DataParallel` creates replicas of the original `nn.Module` with the parameters duplicated onto the destination devices. Calling `backward` will propagate gradients onto the original module parameters but calling `zero_grad` on the replica module doesn't clear the gradients from the parent module,

~~breaking any model that uses `backward`-`zero_grad` in its `forward`. I fix this by patching the replica module so that `zero_grad` clears grads on the parent as well.~~

However, any replica calling backward was broken anyway since the replica's parameters are not leaf nodes in autograd. So, we should raise a warning.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32870

Differential Revision: D19730209

Pulled By: ezyang

fbshipit-source-id: cb9b2cb0c2e0aca688ce0ff3e56b40fbd2aa3c66
2020-02-05 20:25:04 -08:00
Seyyed Hossein Hasanpour
c7bf4d22fe added exception args to the returned error message (#32693)
Summary:
addresses https://github.com/pytorch/pytorch/issues/32692
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32693

Differential Revision: D19606757

Pulled By: mrshenli

fbshipit-source-id: 79fc09f8bb6a33e1b73ce0bbc45387544c7adc1b
2020-01-29 08:26:27 -08:00
Alban Desmaison
81048c41ab remove simple .data from torch/nn
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31482

Test Plan: Imported from OSS

Differential Revision: D19303243

Pulled By: albanD

fbshipit-source-id: 5afdfeb4b8382c09b9ec65acd545148ed76d4285
2020-01-15 12:40:38 -08:00
Alban Desmaison
77c78b7d28 remove .data from torch/nn doc
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31481

Test Plan: Imported from OSS

Differential Revision: D19303242

Pulled By: albanD

fbshipit-source-id: 4f650df9e9e302a299175967bcc6e30a5099fa2a
2020-01-14 07:30:42 -08:00
Vitaly Fedyunin
66f2bba852 Adding function to convert Module to channels last
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28991
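A hedged usage sketch, assuming the conversion is exposed through `Module.to(memory_format=...)` as in current PyTorch:
```python
import torch
import torch.nn as nn

# Convert the module's 4-D parameters/buffers to channels-last (NHWC) strides.
model = nn.Conv2d(3, 8, kernel_size=3).to(memory_format=torch.channels_last)
print(model.weight.is_contiguous(memory_format=torch.channels_last))  # True

x = torch.randn(1, 3, 32, 32).to(memory_format=torch.channels_last)
y = model(x)  # convolution runs on channels-last input and weights
```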

Test Plan: Imported from OSS

Differential Revision: D18430810

Pulled By: VitalyFedyunin

fbshipit-source-id: 0693d4e31fc6f9831722c29fc83517f16ddfc028
2019-12-12 11:38:35 -08:00
James Reed
309b28ee3a Trace module calls
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29261

Test Plan: Imported from OSS

Differential Revision: D18343363

Pulled By: jamesr66a

fbshipit-source-id: 0c6394205e2c0ea8708028d20df83fe17b466ff4
2019-11-06 15:05:49 -08:00
zou3519
e5d6b75319 Bag of documentation fixes; fix more sphinx warnings (#27850)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27850

Many of these are real problems in the documentation (e.g., a link or
bullet point doesn't display correctly).

Test Plan: - built and viewed the documentation for each change locally.

Differential Revision: D17908123

Pulled By: zou3519

fbshipit-source-id: 65c92a352c89b90fb6b508c388b0874233a3817a
2019-10-15 07:31:14 -07:00
Michael Suo
ffa422a8b3 kill _parameter_list (#27399)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27399

This was devised in a time when we didn't have module attributes. They
are essentially just tensor lists, so represent them that way. This has
the additional benefit of making the RNN forward pass faster because we
effectively cache the flattened weights.

The only complicated part is that someone may come along and do:
```
my_rnn_mod.w_ih_l0 = torch.nn.Parameter(...)
```

This means we need to override setattr to keep the flattened weights
cache up to date.
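A rough sketch of that cache-invalidation idea (names are illustrative, not the actual RNN implementation):
```python
import torch.nn as nn

class CachedRNNBase(nn.Module):
    def __init__(self):
        super().__init__()
        self._flat_weights = None  # lazily rebuilt flattened-weights cache

    def __setattr__(self, name, value):
        super().__setattr__(name, value)
        # Reassigning any weight/bias parameter invalidates the cache.
        if name.startswith("weight_") or name.startswith("bias_"):
            object.__setattr__(self, "_flat_weights", None)
```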

Test Plan: Imported from OSS

Differential Revision: D17785658

Pulled By: suo

fbshipit-source-id: 7789cd1d0d4922bfd5eba1716976442fbf150766
2019-10-12 09:51:53 -07:00
davidriazati
0046092178 Reduce special casing around 'training' (#27109)
Summary:
Most of this was old cruft left over from special handling of `training` before we had a `bool` type. This makes all modules have a `training` attribute that is true by default and removes all other special handling.

Fixes #26884
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27109

Pulled By: driazati

Differential Revision: D17728129

fbshipit-source-id: 8ddc9fbb07a953dd05529538bfdd01ed88b5cb57
2019-10-07 13:52:59 -07:00
Gregory Chanan
23fde77d3d Remove Module._backend as it's not used anymore.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25342

Test Plan: Imported from OSS

Differential Revision: D17101571

Pulled By: gchanan

fbshipit-source-id: 2cda46fe197e26a1cacb5e912f535809973d306e
2019-08-29 15:43:49 -07:00
Michael Suo
e2f5bc5c08 Properly mangle nn.Module.__construct (#23779)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23779

Mangling is two underscores, not one :(. We want this method to be
private so that inheritors who define a `__construct` do not interfere
with Module initialization.
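A quick reminder of the Python name-mangling rule being relied on here:
```python
class Base:
    def __init__(self):
        self.__construct()  # mangled to self._Base__construct()

    def __construct(self):
        print("Base setup")

class Child(Base):
    def __construct(self):  # mangled to _Child__construct: no clash
        print("Child's unrelated method")

Child()  # prints "Base setup" -- the subclass does not interfere
```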

Test Plan: Imported from OSS

Differential Revision: D16645156

Pulled By: suo

fbshipit-source-id: b9060cb35bfaa0391ff200b63fb78b1ac15fee39
2019-08-05 17:58:34 -07:00
Michael Suo
cbf05305c0 don't try to set training after ScriptModule has been initialized. (#23680)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23680

Now when initializing a ScriptModule during the torch.jit.load()
process, there is already a cpp module backing it. That means
that setting training would overwrite whatever the initialized
ScriptModule had.

This PR splits apart the common "set up internal state" part of
Module.__init__ and calls it from ScriptModule.__init__ and
Module.__init__, leaving the "nn.Module-specific" part (setting
`self.training`) for the nn.Module __init__.
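A hedged sketch of that split (the helper name is illustrative):
```python
class Module:
    def __init__(self):
        self._construct()     # shared "set up internal state" part
        self.training = True  # nn.Module-specific part

    def _construct(self):
        self._parameters = {}
        self._buffers = {}
        self._modules = {}

class ScriptModule(Module):
    def __init__(self):
        # A C++ module already backs this object and owns `training`,
        # so only the shared setup runs here.
        self._construct()
```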

Test Plan: Imported from OSS

Differential Revision: D16606959

Pulled By: suo

fbshipit-source-id: f7ea6b36551ff4e4472b7685f65731d5cfab87fd
2019-08-04 15:04:55 -07:00
Jiakai Liu
3b1c3996e1 remove RTTI check for TensorImpl shadow copy (#22773)
Summary:
We introduced RTTI in a recent change: https://github.com/pytorch/pytorch/pull/21613

For the internal mobile build we don't enable '-frtti' yet. This diff replaces the
RTTI check with an alternative approach.

According to dzhulgakov we can compare two tensors' type_id directly in most cases -
this is stricter than comparing the TensorImpl subclass type, since the TensorImpl -> type_id
mapping is 1-to-n, but it's more appropriate for this use case.

The only two cases where we relax the direct type comparison (for legacy reasons) are:
1. CPUTensor <-> CUDATensor;
2. SparseCPUTensor <-> SparseCUDATensor;
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22773

Differential Revision: D16277696

Pulled By: ljk53

fbshipit-source-id: 043e264fbacc37b7a11af2046983c70ddb62a599
2019-07-15 23:21:57 -07:00
SsnL
478d480d37 Add Module.requires_grad_ (#22576)
Summary:
addresses https://github.com/pytorch/pytorch/issues/20241
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22576
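Typical usage of the new method:
```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
model[0].requires_grad_(False)  # freeze the first layer in one call

print(all(not p.requires_grad for p in model[0].parameters()))  # True
print(all(p.requires_grad for p in model[1].parameters()))      # True
```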

Differential Revision: D16149314

Pulled By: zou3519

fbshipit-source-id: 1cc4c1ec084df30e00e9ae73ce1a53494a034d5c
2019-07-08 12:13:07 -07:00
Jerry Zhang
577c04c490 add mutation support for forward_pre_hook and forward_hook (#22285)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22285

Previously, forward hooks were expected to return None. This PR adds support for overwriting the input and output in `forward_pre_hook` and `forward_hook`; this is used to insert quant/dequant function calls around forward functions.
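A small usage sketch of the new behaviour (hooks returning non-None values to replace the input/output):
```python
import torch
import torch.nn as nn

lin = nn.Linear(3, 3)

def pre_hook(module, inputs):
    # Returning a tuple replaces the forward() inputs.
    return (inputs[0] * 2,)

def post_hook(module, inputs, output):
    # Returning a value replaces the forward() output.
    return output + 1

lin.register_forward_pre_hook(pre_hook)
lin.register_forward_hook(post_hook)
out = lin(torch.randn(1, 3))  # input and output rewritten in flight
```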

Differential Revision: D16022491

fbshipit-source-id: 02340080745f22c8ea8a2f80c2c08e3a88e37253
2019-07-01 11:06:42 -07:00
Adam Paszke
41d0525de3 Improve repr for IncompatibleKeys (#22119)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/20128.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22119

Differential Revision: D15961965

Pulled By: ezyang

fbshipit-source-id: 9cc397726e6bea5580e79d291cfc1ee75337fa0c
2019-06-24 15:26:54 -07:00
Dmytro Dzhulgakov
82dd69326b Split nn.Module._save_to_state_dict to make it overridable (#21933)
Summary:
# Motivation

We allow to override JIT module serialization with `__getstate__/__setstate__` in order to cover cases where parameters are not serializable. Use cases include: MKLDNN integration: a388c78350/torch/utils/mkldnn.py (L18-L26)
and also fbgemm prepacked format integration for quantized tensors.

However, many Eager scripts use the `torch.save(module.state_dict())` form of serialization. There are several ways to make it work:

* make packed_weight itself pickleable (e.g. by binding `__getstate__/__setstate__` on C++ UDT level)
    * change: we’d need to allow module buffers to be of arbitrary, non-Tensor types
    * pro: no change to state_dict behavior
    * cons: might not be directly inspectable by user calling .state_dict(), especially if packed weights represent several tensors fused together
* make packed_weight being proper Tensor layout
    * pro: no change to state_dict or buffers behavior
    * cons: adding new tensor layouts is pretty costly today
    * cons: doesn’t work if multiple tensors are packed in one interleaved representation
* *[this approach]* allow Modules to override state_dict and return regular tensors
    * pro: most flexible and hackable
    * pro: maintains semantic meaning of statedict as all data necessary to represent module’s state
    * cons: complicates state_dict logic
    * cons: potential code duplication between `__getstate__/__setstate__`

Based on discussions with zdevito and gchanan we decided to pick the latter approach. Rationale: this behavior is fully opt-in and will impact only modules that need it. For those modules the requirement listed above won't be true. But we do preserve the requirement that all elements of state_dict are tensors. (https://fburl.com/qgybrug4 for internal discussion)

In the future we might also implement one of the approaches above but those are more involved.
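A hedged sketch of the opt-in override point (the packing helpers below are illustrative stand-ins):
```python
import torch
import torch.nn as nn

class PackedLinear(nn.Module):
    def __init__(self, weight):
        super().__init__()
        self.packed = self._pack(weight)  # not a plain tensor in general

    @staticmethod
    def _pack(w):
        return w.clone()    # stand-in for a real packing routine

    @staticmethod
    def _unpack(p):
        return p.clone()

    def _save_to_state_dict(self, destination, prefix, keep_vars):
        # Export a regular tensor instead of the packed representation.
        destination[prefix + "weight"] = self._unpack(self.packed)

    def _load_from_state_dict(self, state_dict, prefix, *args, **kwargs):
        self.packed = self._pack(state_dict[prefix + "weight"])

sd = PackedLinear(torch.randn(4, 4)).state_dict()
print(list(sd.keys()))  # ['weight']
```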
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21933

Differential Revision: D15937678

Pulled By: dzhulgakov

fbshipit-source-id: 3cb5d1a8304d04def7aabc0969d0a2e7be182367
2019-06-21 09:55:22 -07:00
Will Feng
6b972795e4 Add torch.__future__._overwrite_module_params_on_conversion global flag, and check it in nn.Module._apply() (#21613)
Summary:
https://github.com/pytorch/pytorch/pull/17072 breaks `model.to(xla_device)`, because moving `model` to XLA device involves changing its parameters' TensorImpl type, and the current implementation of `nn.Module.to()` doesn't support changing module parameters' TensorImpl type:
```python
# 6dc445e1a8/torch/nn/modules/module.py (L192-L208)
def _apply(self, fn):
    ...
    for param in self._parameters.values():
        if param is not None:
            # Tensors stored in modules are graph leaves, and we don't
            # want to create copy nodes, so we have to unpack the data.
            param.data = fn(param.data)  # NOTE: this doesn't allow changing `param.data`'s TensorImpl type
            if param._grad is not None:
                param._grad.data = fn(param._grad.data)  # NOTE: this doesn't allow changing `param._grad.data`'s TensorImpl type
   ...
```

yf225 TODO: fix the description here when we finish the implementation

To fix this problem, we introduce a new API `model.to_()` that always assign new tensors to the parameters (thus supporting changing the parameters to any TensorImpl type), and also bump the version counter of the original parameters correctly so that they are invalidated in any autograd graph they participate in.

We also add warning to the current `model.to()` API to inform users about the upcoming behavior change of `model.to()`: in future releases, it would create and return a new model instead of in-place updating the current model.

This unblocks adding XLA to our CI test suite, which also allows XLA to catch up with other changes in our codebase, notably the c10 dispatcher.
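For reference, a hedged usage sketch of the flag named in the title (the public setter below exists in current `torch.__future__`; the internal flag name at the time of this commit may differ):
```python
import torch
import torch.nn as nn
from torch import __future__ as torch_future

torch_future.set_overwrite_module_params_on_conversion(True)

model = nn.Linear(2, 2)
old_weight = model.weight
model.double()  # runs Module._apply() under the hood

# With the flag on, _apply() assigns brand-new Parameter objects instead of
# mutating param.data in place, so old references are left behind.
print(model.weight is old_weight)  # False
print(model.weight.dtype)          # torch.float64
```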

[xla ci]

cc. resistor ailzhang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21613

Differential Revision: D15895387

Pulled By: yf225

fbshipit-source-id: b79f230fb06019122a37fdf0711bf2130a016fe6
2019-06-19 10:30:02 -07:00
Will Feng
4b1df5c1f5 Use fn(param) instead of fn(param.data) in nn.Module._apply (#21865)
Summary:
When we pass `fn` to `nn.Module._apply()` and `fn` is an in-place operation, the correct behavior should also include bumping the parameters' and their gradients' version counters. This PR fixes the old incorrect behavior and makes sure the new behavior is right.

Note that this PR is BC-breaking in the following way:

Previously, passing an in-place operation to `nn.Module._apply()` does not bump the module's parameters' and their gradients' version counters. After this PR, the module's parameters' and their gradients' version counters will be correctly bumped by the in-place operation, which will invalidate them in any autograd graph they previously participate in.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21865

Differential Revision: D15881952

Pulled By: yf225

fbshipit-source-id: 62f9244a4283a110147e9f20145ff232a5579fbd
2019-06-18 20:45:40 -07:00
杨培文 (Yang Peiwen)
e447a733a1 Update module.py
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21570

Differential Revision: D15732665

Pulled By: ezyang

fbshipit-source-id: caa12a8619ad1396540f787b5c849d29cc5b03bd
2019-06-09 15:28:35 -07:00
Dmytro Dzhulgakov
c25e33789e Lightweight at-most-once logging for API usage (#20745)
Summary:
Resubmit #20698 which got messed up.

The idea is that when PyTorch is used in a custom build environment (e.g. Facebook), it's useful to track usage of various APIs centrally. This PR introduces a simple, very lightweight mechanism to do so - only the first invocation of a trigger point is logged. This is significantly more lightweight than #18235 and thus we can afford to put logging in e.g. TensorImpl.

Also adds an initial list of trigger points. Trigger points are added in such a way that no static initialization triggers them, i.e. just linking with libtorch.so will not cause any logging. Further suggestions of what to log are welcome.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20745

Differential Revision: D15429196

Pulled By: dzhulgakov

fbshipit-source-id: a5e41a709a65b7ebccc6b95f93854e583cf20aca
2019-05-23 23:17:59 -07:00
Sam Gross
c1fa449763 Break reference cycle in load_state_dict (#20397)
Summary:
load_state_dict includes a recursive inner function `load` that captures
Tensors through the closed-over variable `state_dict`. Because it's
recursive, it also captures itself, leading to a reference cycle.

This breaks the reference cycle so that any Tensors in state_dict can be
collected immediately instead of waiting until the next GC cycle.

Alternatively, we could have passed `state_dict` and `metadata` as
arguments to load to prevent capture of Tensors. (That would still
result in cyclic garbage, but not any cyclic garbage of Tensors).

See:
https://github.com/pytorch/pytorch/issues/20199#issuecomment-491089004
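A stripped-down sketch of the cycle and the fix (signatures simplified):
```python
def load_state_dict(module, state_dict):
    metadata = getattr(state_dict, "_metadata", {})

    def load(mod, prefix=""):
        # `load` closes over state_dict (Tensors) and over itself (recursion).
        mod._load_from_state_dict(
            state_dict, prefix, metadata.get(prefix[:-1], {}),
            True, [], [], [])
        for name, child in mod._modules.items():
            if child is not None:
                load(child, prefix + name + ".")

    load(module)
    del load  # break the load -> closure cell -> load reference cycle
```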
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20397

Differential Revision: D15414834

Pulled By: colesbury

fbshipit-source-id: 4c2275a08b2d8043deb3779db28be03bda15872d
2019-05-20 11:46:00 -07:00
Edward Z. Yang
9b1dbffba5
Re-sync with internal repository (#20702) 2019-05-20 09:22:57 -04:00
Dmytro Dzhulgakov
d3059b9c49 Lightweight logging for once-only API usage 2019-05-19 23:04:40 -07:00
Alexandros Metsai
9e3bdb3231 Update module.py documentation. (#19347)
Summary:
Added the ">>>" python interpreter sign(three greater than symbols), so that the edited lines will appear as code, not comments/output, in the documentation. Normally, the interpreter would display "..." when expecting a block, but I'm not sure how this would work on the pytorch docs website. It seems that in other code examples the ">>>" sign is used as well, therefore I used with too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19347

Differential Revision: D14986154

Pulled By: soumith

fbshipit-source-id: 8f4d07d71ff7777b46c459837f350eb0a1f17e84
2019-04-18 06:46:24 -07:00
Sepehr Sameni
b11a8c6aef return missing keys from load_state_dict (#18668)
Summary:
Return missing_keys and unexpected_keys from load_state_dict so the user can handle them when strict mode is off; also removed an unused variable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18668
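Usage sketch of the new return value:
```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
state = {"weight": torch.randn(2, 4), "extra": torch.zeros(1)}  # no "bias"

result = model.load_state_dict(state, strict=False)
print(result.missing_keys)     # ['bias']
print(result.unexpected_keys)  # ['extra']
```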

Differential Revision: D14782073

Pulled By: ezyang

fbshipit-source-id: ab3b855eb77bb7422594d971988067e86eef20f2
2019-04-04 18:11:56 -07:00
Alexandr Morev
abc171bd53 Fix typo in docstring
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18216

Differential Revision: D14539824

Pulled By: ezyang

fbshipit-source-id: 490b72951a75f3f8b949a2d692d660a3693ee98a
2019-03-20 11:16:36 -07:00
Kai Zhang
4ad17c9031 Misleading documentation for module._load_from_state_dict (#17618)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17618

Based on the code, we only add keys to `missing_keys` and `unexpected_keys` if `strict` is `True`. The documentation is confusing.

This diff also fixes one FLAKE8 warning.

Reviewed By: ailzhang

Differential Revision: D14280593

fbshipit-source-id: d368f5596bdf74ff62ee4d28d79120f5af91e0a3
2019-03-12 16:57:39 -07:00
ZhuBaohe
acf5ec07af Correct conv and pooling docstrings in nn module (#17052)
Summary:
This PR fixes conv and pooling docstrings in the nn module.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17052

Differential Revision: D14068566

Pulled By: ezyang

fbshipit-source-id: 3ec1de232ff6334b6a544dadefbb0ee6193d443a
2019-02-15 06:58:02 -08:00
Michael Suo
bd75fba4e8 fix tracing using a dictionary as input (#16616)
Summary:
Previously this would fail with the error message:
```
ValueError: Auto nesting doesn't know how to process an input object of type dict. Accepted types: Tensors, or lists/tuples of them
```
Turns out we're not using the line that causes this error (or a side effect of that line), so removing it fixes the issue. Also cleaned up some related dead code (cc apaszke to make sure the code isn't useful in some way)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16616

Differential Revision: D13908352

Pulled By: suo

fbshipit-source-id: 27094f1f4ea0af215b901f7ed3520e94fbc587b3
2019-02-01 14:44:56 -08:00
FrankHui
fe4ae9dfe4 add if in register_buffer like register_parameters (#16110)
Summary:
without this "if", code below will throw error " Linear' object has no attribute '_buffers' "
And with this if, error would be "cannot assign buffer before Module.\_\_init\_\_() call", which I think it's more accurate, just like register_parameter.
```
import math
import torch
from torch.nn.parameter import Parameter
from torch.nn import functional as F
from torch.nn import Module
class Linear(Module):
    def __init__(self, in_features, out_features, bias=True):

        self.in_features = in_features
        self.out_features = out_features
        self.register_buffer('test', torch.Tensor(out_features, in_features))
        self.weight = Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)

        super(Linear, self).__init__()

        self.reset_parameters()

    def reset_parameters(self):
        stdv = 1. / math.sqrt(self.weight.size(1))
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

    def forward(self, input):
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self):
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )

linear = Linear(3,4)
```
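The guard being proposed mirrors the one in `register_parameter`; roughly (a sketch, not the exact diff):
```python
def register_buffer(self, name, tensor):
    # Fail with a clear message if Module.__init__() has not run yet.
    if '_buffers' not in self.__dict__:
        raise AttributeError(
            "cannot assign buffer before Module.__init__() call")
    self._buffers[name] = tensor
```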
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16110

Differential Revision: D13715839

Pulled By: soumith

fbshipit-source-id: c300eff0a8655aade448354cf489a592f7db722a
2019-01-17 11:50:12 -08:00
Derek Kim
19717224c5 Miscellaneous broken RSTs fixed (#16033)
Summary:
https://pytorch.org/docs/master/tensors.html#torch.Tensor.bernoulli_
https://pytorch.org/docs/master/torch.html#torch.addmm
https://pytorch.org/docs/master/distributed_deprecated.html#torch.distributed.deprecated.reduce_multigpu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16033

Differential Revision: D13671202

Pulled By: soumith

fbshipit-source-id: 276e10e610affe205376573e7f0f9894695d218d
2019-01-15 09:50:12 -08:00
Peter Goldsborough
aec9fdf0a4 Fix _apply in nn.Module (#15305)
Summary:
Fixes an issue that arose from https://github.com/pytorch/pytorch/pull/13481 where `.shared_memory()` couldn't be called. Effectively undoes all changes to `nn.Module` from that PR and solves the relevant problem in a different way (the goal was to be able to call `._apply()` on the Python wrapper for a C++ module).

soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15305

Differential Revision: D13493937

Pulled By: goldsborough

fbshipit-source-id: 4cb8687f90fc8709a536c5e7eacd0dc8edf6f750
2018-12-17 16:22:21 -08:00
Peter Goldsborough
0bf1383f0a Python <-> C++ Frontend inter-op (#13481)
Summary:
This PR enables C++ frontend modules to be bound into Python and added as submodules of Python modules. For this, I added lots of pybind11 bindings for the `torch::nn::Module` class, and modified the `torch.nn.Module` class in Python to have a new Metaclass that makes `isinstance(m, torch.nn.Module)` return true when `m` is a C++ frontend module. The methods and fields of C++ modules are bound in such a way that they work seamlessly as submodules of Python modules for most operations (one exception I know of: calling `.to()` ends up calling `.apply()` on each submodule with a Python lambda, which cannot be used in C++ -- this may require small changes on Python side).

I've added quite a bunch of tests to verify the bindings and equality with Python. I think I should also try out adding a C++ module as part of some large PyTorch module, like a WLM or something, and see if everything works smoothly.

The next step for inter-op across our system is ScriptModule <-> C++ Frontend Module inter-op. I think this will then also allow using C++ frontend modules from TorchScript.

apaszke zdevito

CC dzhulgakov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13481

Differential Revision: D12981996

Pulled By: goldsborough

fbshipit-source-id: 147370d3596ebb0e94c82cec92993a148fee50a7
2018-12-13 08:04:02 -08:00
Ryan Moore
29d697aec4 typo in Module docstring
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14511

Differential Revision: D13246061

Pulled By: soumith

fbshipit-source-id: 6c13a2957c4c4324ab5d839d634689c61e25b0fe
2018-11-29 07:17:29 -08:00
albanD
f80d34a1c8 Update Tensor doc (#14339)
Summary:
Add to the Tensor doc info about `.device`, `.is_cuda`, `.requires_grad`, `.is_leaf` and `.grad`.
Update the `register_backward_hook` doc with a warning stating that it does not work in all cases.
Add support in the `_add_docstr` function to add docstring to attributes.

There is an explicit cast here but I am not sure how to handle it properly. The thing is that the doc field for getsetdescr is written as being a const char * (as all other doc fields in descriptor objects) in the CPython online documentation. But in the code, it is the only one that is not const.
I assumed here that it is a bug in the code because it does not follow the doc and the convention of the other descriptors, so I cast away the const.
EDIT: the online doc I was looking at is for 3.7, and in that version both the code and the doc are const. For older versions, both are non-const.
Please let me know if this should not be done, and, if it should be done, whether there is a cleaner way to do it!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14339

Differential Revision: D13243266

Pulled By: ezyang

fbshipit-source-id: 75b7838f7cd6c8dc72b0c61950e7a971baefaeeb
2018-11-28 15:28:17 -08:00
Tongzhou Wang
2cd912bcc2 Fix more spectral norm bugs (#13350)
Summary:
Problems with SN and DP after #12671 :
1. in eval mode, `weight_orig` is not getting correct gradient #12737 .

    Fix: keep `v` vector around as a buffer and always calculate `W = W_orig / (u @ W_orig @ v)` even in eval.

2. in training mode, the `weight` buffer of the parallelized module is never updated if someone touches `weight_orig` and/or `weight` and makes them stop sharing storage. So in `eval` the weight used is wrong.

    Fix: Make `weight` not a buffer anymore and always calculate it as above.

3. #12671 changed SN to update `u` in-place to make DP work correctly, but then it breaks backward through two forwards (e.g., the common GAN loss `D(real) - D(fake)`) because the vectors needed to backprop the 1st forward is changed in the 2nd forward.

    Fix: This PR clones `u` and `v` before using them.

To maintain BC, I added a hook interface for producing and loading state_dict. This is ugly and we should really have a better interface for spectral_norm, but for the purpose of fixing this issue I make this patch. Even with a better interface, a BC mechanism for loading legacy state_dicts would still be needed.
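In code, the recomputation in fixes 1 and 2 amounts to something like the following sketch (fix 3 is the `.clone()` calls):
```python
import torch

def spectral_normalize(weight_orig, u, v):
    # Clone so a second forward pass does not overwrite tensors that the
    # first forward's backward still needs (fix 3 above).
    u, v = u.clone(), v.clone()
    sigma = torch.dot(u, torch.mv(weight_orig, v))  # u @ W_orig @ v
    return weight_orig / sigma
```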

cc crcrpar
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13350

Differential Revision: D12931044

Pulled By: SsnL

fbshipit-source-id: 8be6f934eaa62414d76d2c644dedd7e1b7eb31ef
2018-11-06 19:16:13 -08:00
Evgeniy Zheltonozhskiy
c774cb8913 Rephrase unclear error message for shape mismatch (#12870)
Summary:
I spent a couple of minutes trying to understand which shape corresponds to the checkpoint and which one to the model.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12870

Differential Revision: D10466600

Pulled By: SsnL

fbshipit-source-id: 3b68530b1b756462a2acd59e3a033ff633567a6b
2018-10-22 08:57:16 -07:00
Tongzhou Wang
de460c7ad3 Improvements on conv/pool/fold/stft/ParamDict docs (#11106)
Summary:
Also fixes some incorrect formula rendering.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11106

Differential Revision: D9752433

Pulled By: SsnL

fbshipit-source-id: 535fc8498638e8b645757fc7535d8771992b7d21
2018-09-11 08:56:21 -07:00
gngdb
c5b021cc88 State dict loading arguments were in the wrong order (#11200)
Summary:
In the state-dict loading code, the error message referring to the shape of the loaded parameters and of the parameters in the initialised model had its format arguments in the wrong order. Swapped them round to fix this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11200

Differential Revision: D9631160

Pulled By: SsnL

fbshipit-source-id: 03d9446303bd417fef67027b10d7a27de06486be
2018-09-03 15:42:30 -07:00
Jerry Ma
afd7477eaa Add `buffers(), named_buffers()` methods. (#10554)
Summary:
This commit adds the ``buffers()`` and ``named_buffers()`` methods as
analogues of ``parameters()`` and ``named_parameters()``.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10554
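Usage sketch:
```python
import torch.nn as nn

bn = nn.BatchNorm1d(3)

for name, buf in bn.named_buffers():  # mirrors named_parameters()
    print(name, tuple(buf.shape))
# running_mean (3,)
# running_var (3,)
# num_batches_tracked ()
```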

Reviewed By: SsnL

Differential Revision: D9367762

Pulled By: jma127

fbshipit-source-id: f2042e46a7e833dce40cb41681dbd80d7885c74e
2018-08-16 16:26:48 -07:00