pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Andrey Malevich	b599910f3a	Use new metric interfaces in trainer workflows. Summary: This diff is migrating existing DPER workflows to use new metric abstractions in training. Reviewed By: xianjiec Differential Revision: D4656576 fbshipit-source-id: 1b3b16b390fc0757480e41df1c4214c11cd76e8a	2017-03-07 12:46:52 -08:00
Qichao Que	2f68632a32	Add SparseNN workflow for feed. Summary: Add SparseNN workflow for feed. I haven't fully thought about the change needed for ads, as I added a property called 'preproc_output_schema' for LayerModelHelper. Reviewed By: xianjiec Differential Revision: D4585796 fbshipit-source-id: 060d08f4beb928e7e7863f2e563f612c358951fb	2017-03-01 11:02:38 -08:00
Andrey Malevich	a3726759c6	Add a way to describe layers in a more AdHoc manner. Summary: This diff is trying to address one of the concerns that Xianjie have had - requirements create a layer for all operators and attach pass shapes and other info around. The basic idea of the diff: 1. Try to create a layer with a given name, but if it's not available try to fallback on operator with that name (that is expected to have no parameters). 2. For all operators that we're adding through this functional style of creation - try to use C2 Shape/Type inference logic to get output type. If we fail to get - it just returns untyped record and expect user to annotate it when it's really needed. Reviewed By: xianjiec Differential Revision: D4408771 fbshipit-source-id: aced7487571940d726424269970df0eb62670c39	2017-02-27 23:30:39 -08:00
Xianjie Chen	d0621a2449	NextScopedBlob with well-defined behavior and respect namescope Summary: Remove the use of `NextName` in layer model helper, so that the same function return `model_helper` that should construct identical `Net`, when under the same NameScope. The `NextScopedBlob` should only take effect when there is real name conflicting, otherwise it returns ScopedBlobReference. This is critical for parameter blobs. In long run, we need to be able to specify parameter blobs more explicitly. (kennyhorror is working on this). This solution works in short term for e.g., two tower sparse nn models. Reviewed By: kennyhorror Differential Revision: D4555423 fbshipit-source-id: 2c4b99a61392e5d51aa878f7346466a8f14be187	2017-02-16 17:16:36 -08:00
Xianjie Chen	fb7c9108d9	get parameter blobs of a model Summary: to verify that a model only used a subset of the parameters of another model (e.g., the model doing training). Differential Revision: D4557787 fbshipit-source-id: bd8ac96f5e78e05f6f56086db6e6ddcda36c1d37	2017-02-15 16:00:44 -08:00
Andrey Malevich	ec51f887bf	Create only one instance of SigridTransform in DPerExample. Summary: DPer example have been creating multiple copies of the transform config in net definition till this moment, that resulted in the fact that I've hit the limit of ProtoBuf (64MB) for a certain Task requests (especially visible because of the ValidationPipeline that I was adding). After this diff we're going to store SigridTransforms in one instance per machine for training (or 1 instance per reading). Difference in sizes of the plans for some simple SparseNN model ~30 MB (even including the fact that second model have validation plan as well). TODO: Do similar logic for NNPreProc as well (it's also pretty large). Reviewed By: dzhulgakov Differential Revision: D4441441 fbshipit-source-id: 4452dd86a4dc49b2c7f5b7642f443aed5720b047	2017-01-22 19:29:16 -08:00
Xianjie Chen	4b3bd06a7f	sparse nn converges better by dedupping sparse gradient by mean Summary: this normalizes the sparse gradient, so that the "effective learning rate" of each sparse parameter will NOT be affected by the number of examples in a batch that "use" this sparse parameter. experiment shows it help convergence (about 0.1% better train NE): https://fburl.com/1230747813683956. It's not conclusive yet, and we still need to do more experiments. But this diff adds it as an option, and does not change the default behavior, so we can get this in first. Differential Revision: D4367283 fbshipit-source-id: 49ea80dfa9ea776ff4160e220cf6c86593521607	2016-12-27 22:59:29 -08:00
Yangqing Jia	238ceab825	fbsync. TODO: check if build files need update.	2016-11-15 00:00:46 -08:00

8 Commits