pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Andrey Malevich	01de4e40d6	Fix a bug in nested parameter sharing logic. Summary: It appears that my initial implementation was not really working when one starts doing nesting. This diff is fixing this by replacing itertools with something that is really easy to reason about. Reviewed By: idning Differential Revision: D6933763 fbshipit-source-id: f7a1de996d878a41bac2b2acd9d87a7c4b416778	2018-02-08 13:32:53 -08:00
Giri Anantharaman	6aaa701c9c	Adding ThresholdedRelu Op support. Summary: Core operator and python operator changes for adding ThresholdedRelu Op support. Reviewed By: houseroad Differential Revision: D6900660 fbshipit-source-id: 9b17ede13ccb3264286389c7fc633ab9c1a7bbbf	2018-02-08 12:18:40 -08:00
Alexander Sidorov	e0e124e617	Fix RNN scoping situation Summary: There is a long lasting problem of scoping which was introduced in original python wrappers early in H1. Basically each RNNCell implemented has to manually scope outputs of each of the operators. If somebody forgets, then there could be weird bugs with layers etc. Approach is the following. User has to explicitly specify current scope when using apply_over_sequence function and others if the function is going to be called several times (like for stacking layers). This way we use Caffe2 native scoping approach instead of inventing one extra API people have to use (i.e. passing scope name as an argument to the RNNCell constructor). Closes https://github.com/caffe2/caffe2/pull/1681 Differential Revision: D6777536 Pulled By: salexspb fbshipit-source-id: 73d860b8d4857589e04bdea5a6fcd3080d68427c	2018-02-07 17:35:29 -08:00
James Reed	a68e224219	Fix ONNX While test for CUDA Summary: We should not be trying to instantiate this op on GPU at this point Reviewed By: pietern Differential Revision: D6915576 fbshipit-source-id: 6bdbc93ad12fc67e3001fce1b506fe2895d7b0ba	2018-02-06 14:35:34 -08:00
Qinqing Zheng	c028bcd466	Fix input of Reduce{Front/Back}{Sum/Mean}Gradient ops Summary: The previous refactor of these four Ops changed their input semantics, which makes backward impatible with old models. This diff fix this problem by checking the input and define follow-up behavior by case, so that the old models can be accommodated. Reviewed By: dzhulgakov Differential Revision: D6905840 fbshipit-source-id: fc37baec407fd5eae64fc9c2b61aba3c492a90f3	2018-02-05 23:33:07 -08:00
James Reed	f383600625	ONNX While Operator Summary: Special While loop operator that follows the semantics of While in ONNX: https://github.com/jamesr66a/onnx/blob/controlflow/docs/Operators.md#experimental-loop Stuff that's missing: - Lexical scoping enforced via child workspaces - Double-buffering on forward Further possible enhancements: - Full parallelism when there are no loop-carried dependencies - Diagonal execution - More optimized scan_outputs shaping via static shape inference provided in ONNX (coming sometime) - GPU support (probably just some tensor value management stuff) - Gradient support (likely low-pri right now) Closes https://github.com/caffe2/caffe2/pull/1848 Reviewed By: dzhulgakov Differential Revision: D6907524 Pulled By: jamesr66a fbshipit-source-id: 4938108733e168b8c027035091104712a18c992a	2018-02-05 21:05:52 -08:00
Anders Papitto	6a02cb2844	implement sequence length support for BasicRNN Summary: Closes https://github.com/caffe2/caffe2/pull/1843 Differential Revision: D6839575 Pulled By: anderspapitto fbshipit-source-id: efdf00f1c5cfb0d63f1992028a796c8277b76688	2018-02-05 21:05:51 -08:00
Aarti Basant	28f42cc8e7	separating set_params and init() for checkpoint managers. Summary: separating set_params and init() for checkpoint managers. Reviewed By: anshulverma Differential Revision: D6852255 fbshipit-source-id: 061f16ce0c49953ca8a5fe9546af5c9945a3be48	2018-02-05 18:03:21 -08:00
Evgeny Kharitonov	7c7e09fe2d	Adding the Percentile op & UT Reviewed By: MisterTea Differential Revision: D6879507 fbshipit-source-id: 7ca4165a42c073e384d3a6138ef033ca384afd49	2018-02-05 16:08:00 -08:00
Anders Papitto	d8748a9d53	GRU sequence lengths: allow unspecified sequence lengths Summary: modeled after the earlier change for LSTM Closes https://github.com/caffe2/caffe2/pull/1841 Differential Revision: D6837461 Pulled By: anderspapitto fbshipit-source-id: de4e787019fa30f813a4b29f14b7000ce9d22d8e	2018-02-05 13:20:05 -08:00
Orion Reblitz-Richardson	d3ea7e260b	Allow for all of the names we have in our model zoo. Summary: * We now allow subdirectories as well as numbers in the name. * Also fixed an error case. Closes https://github.com/caffe2/caffe2/pull/1875 Reviewed By: pjh5 Differential Revision: D6894401 Pulled By: orionr fbshipit-source-id: 6a9938bc7d2ba6b8f094ed7b8a02664120a10626	2018-02-05 08:52:55 -08:00
Lin Yang	3acce3e4a7	assert global_constant name as string Reviewed By: kennyhorror Differential Revision: D6895157 fbshipit-source-id: 9844ab6176d22c6d05a5a0f83b731f734ef9853d	2018-02-04 01:02:30 -08:00
Lin Yang	95626737d0	enforce global_constant name should be a string Reviewed By: kennyhorror Differential Revision: D6880114 fbshipit-source-id: 2c9bd27b01cedb469f19843163b04a613fda5904	2018-02-04 01:02:27 -08:00
Yan Shang	e816c777eb	Add regularization for sparse features Reviewed By: xianjiec Differential Revision: D5767997 fbshipit-source-id: b9b7c47d11417fbe67d861a2a6b4daa38adbe57b	2018-02-02 16:03:32 -08:00
Yan Shang	dabddd65f4	Add sparse normalization operator Reviewed By: xianjiec Differential Revision: D6735673 fbshipit-source-id: 870b38d5175cb2d2dcad43c0e9fa4746e4dd15dd	2018-02-02 15:05:59 -08:00
Lin Yang	e138203d8f	add sparse_to_dense_test Summary: hypothesis_test have been introduced in D4508879, add a plain test which is more straightforward. Reviewed By: kennyhorror Differential Revision: D6835334 fbshipit-source-id: d05a2cd199b2de56ac0cc0319f19fcd7978647d5	2018-02-01 08:14:37 -08:00
Xue Feng	f652f20f73	change ModOp to support output sign configurations Summary: enable ModOp to control the output sign to follow dividend or divisor. Reviewed By: xianjiec Differential Revision: D6852457 fbshipit-source-id: 62dbb66cacecb8e0a0f81f63f2b7b378efbd6ee2	2018-01-31 18:03:16 -08:00
Jerry Pan	eee42748d9	Caffe2: serialize init for parallel workers Summary: Caffe2: serialize init for parallel workers Reviewed By: kevinwilfong Differential Revision: D6862119 fbshipit-source-id: 805b2971eca4501977950420565bd9ea37dc0f6c	2018-01-31 17:50:10 -08:00
Qinqing Zheng	90a3363f29	Return an empty TaskGroup if node managers exist in MultiNodeCheckpointManager Summary: Current MultiNodeCheckpointManager return None in this case, yet in JobRunner we assume this function returns a valid task group, i.e. we call session.run(self.checkpoint_manager.init(...)) directly. This will fail the case we use LocalHostScheduler and reuse a MultiNodeCheckpointManager Reviewed By: azzolini Differential Revision: D6843450 fbshipit-source-id: a7ec942cfe692f19e8751b0078ae6a6108f29e54	2018-01-30 19:20:50 -08:00
Alexander Sidorov	98a4c3f9b2	Enable rnn_cell_test in jenkins Summary: Closes https://github.com/caffe2/caffe2/pull/1839 Differential Revision: D6847623 Pulled By: salexspb fbshipit-source-id: b8a32cb39a8063b8938c89556e5d42606735238d	2018-01-30 11:48:35 -08:00
Lu Fang	560e5c94bd	Change default value of LeakyRelu's alpha from 0 to 0.01 Summary: To match the semantic in ONNX, change the default value of alpha of LeakyRelu to 0.01 Reviewed By: dzhulgakov Differential Revision: D6840975 fbshipit-source-id: 08543f80fd86cbe96a0eee8d725ef137a5bf4ab8	2018-01-29 22:31:12 -08:00
Xiaomeng Yang	6b1f848df6	Adds gpu implementation for FCTransposed Summary: Adds gpu implementation for FCTransposed. Reviewed By: salexspb Differential Revision: D6572785 fbshipit-source-id: a7cd0f7364ace286942c46b91e0287307cbfea83	2018-01-29 19:03:24 -08:00
mdschatz	3c952426fb	Add operator attaching net observer Summary: Commonly, net observers attach operator observers at construction. This diff separates the logic into a base class to inherit from. Closes https://github.com/caffe2/caffe2/pull/1806 Reviewed By: salexspb Differential Revision: D6808623 Pulled By: mdschatz fbshipit-source-id: 75ef0eea913ef30943541c829c0a976965f42736	2018-01-29 14:34:34 -08:00
Xiaolong Wang	f8575f6d68	Breakdown Dispatcher Summary: dispatch by Ngram breakdown Differential Revision: D6794082 fbshipit-source-id: 7f6e8fa3a0abe0dc6d0d466c95e8c4fc865e3abb	2018-01-26 17:47:54 -08:00
Anders Papitto	33d2212751	LSTM sequence lengths: allow unspecified sequence lengths Summary: In this case, each sequence is treated as having a length equal to the first dimension of the input tensor. This matches the semantics of ONNX when the sequence length input is left out. Closes https://github.com/caffe2/caffe2/pull/1764 Reviewed By: dzhulgakov Differential Revision: D6751219 Pulled By: anderspapitto fbshipit-source-id: 89e0efd12339157627494e2b8c83e952bdd8a9f8	2018-01-26 16:32:56 -08:00
Lin Yang	252211b001	testPairwiseDotProduct Summary: as title. Reviewed By: kennyhorror Differential Revision: D6793829 fbshipit-source-id: f803e0400635ca37184f1dd5bb711bfe0e4bea21	2018-01-26 11:33:08 -08:00
Alexander Sidorov	a3b8c459d4	Revamp MNIST tutorial Summary: Main changes: 1. Move reader creation to Brew in order to be consistent and avoid a wild use of param_init_net 2. Use optimizers for training function, avoid manual optimizer construction 3. Add MLP mode (a default) 4. Fix a bunch of too verbose comments and add a bit of new explanations Closes https://github.com/caffe2/caffe2/pull/1760 Differential Revision: D6749059 Pulled By: salexspb fbshipit-source-id: 9dfbbb2d9772a74a0300c2e404a92e791f7cc593	2018-01-26 09:17:31 -08:00
Peter Goldsborough	0fd41a63a1	Integrate Fused8BitRowwise ops with DPER Summary: Updates `sparse_lookup.py` for the new fused 8-bit rowwise quantization. Mostly just changing the same files as the original diffs (D5753626 and D5761202). I know very little about this code here so please let me know if this is safe, also in terms of migration away from the non-fused storage. Reviewed By: kennyhorror Differential Revision: D6710784 fbshipit-source-id: 185f147af52a094a937ba631b0351225e660d205	2018-01-25 15:02:42 -08:00
Frank Jiang	304e607b70	Fix adam test Reviewed By: pietern Differential Revision: D6787780 fbshipit-source-id: a2d1428b0e028d6f3d8f7c312c90f3fa411cd0a2	2018-01-25 12:59:54 -08:00
Xiaolong Wang	b2cfc5ea53	add KeySplitOp Summary: as titled After converting categorical to Ngram keys, use this op to extract eids Differential Revision: D6794020 fbshipit-source-id: 4f9251a22d7a129da30b92845e312876e6510e7e	2018-01-25 10:50:53 -08:00
Xiaomeng Yang	d695027300	Adds cuda support for LC op Summary: Adds cuda support for LC Op Reviewed By: QueryConnectionException Differential Revision: D6803659 fbshipit-source-id: 538bbf6fd202c79154132fda0e90e175eb09d025	2018-01-25 10:19:48 -08:00
Huazhong Ning	90543ff13a	weighted sampling reader dequeue outputs table index Summary: Weighted sampling reader dequeue randomly chooses a hive reader to read a mini-batch. This diff allows dequeue to output the index of the randomly chosen table to a specific blob. Reviewed By: kennyhorror Differential Revision: D6621070 fbshipit-source-id: 754b981fc2bcfdb0146d2a0a5b677e7cfe74211b	2018-01-24 19:06:25 -08:00
Huan Gui	c261b9ce70	Fix NGram from categorical test Summary: Fix the flaky test for ngram from categorical test Reviewed By: dragonxlwang Differential Revision: D6801152 fbshipit-source-id: dcbae17b1d3737a41fb2f5c794c1146a02c542bb	2018-01-24 18:51:16 -08:00
Xiaomeng Yang	afafe8a466	Add LC Layer Summary: Add the 1st version of LC layer. Reviewed By: Yangqing Differential Revision: D6788647 fbshipit-source-id: ebee9215a1d6e1e567548a0fef771802851682a3	2018-01-24 16:51:17 -08:00
Aarti Basant	fc56e86c7d	Introduce init API for the optional Checkpoint Metadata Handler object Summary: Every call to the checkpoint_metadata_handler write() API requires us to pass all params like db_prefix, db_type etc. Introducing an init API in the checkpoint_metadata_handler so that such params can be saved and need not be passed in every API call Reviewed By: mraway, anshulverma Differential Revision: D6792651 fbshipit-source-id: 059fa4309e8fce1ee5ab009af3e0570573c24245	2018-01-24 15:19:55 -08:00
Lukasz Wesolowski	29a4c942fe	Add support for multi-device batch normalization through an option to data_parallel_model Summary: Stage 3 in stack of diffs for supporting multi-device batch normalization. Adds input parameter to data_parallel_model to enable multi-device batch normalization. Depends on D6699258. Reviewed By: pietern Differential Revision: D6700387 fbshipit-source-id: 24ed62915483fa4da9b1760eec0c1ab9a64b94f8	2018-01-24 13:24:06 -08:00
Lukasz Wesolowski	9414072159	Add operators to support batch normalization across multiple devices on the same node Summary: This is the first in a series of diffs to enable batch normalization across multiple devices on the same node with data parallel model. The diff contains the ops for computing the per-channel statistics required to obtain the mean and variance across multiple devices on the same node on the forward pass, and the gradient of the bias and scale during backpropagation. The actual modifications to SpatialBN and SpatialBNGradient to make use of these results will be in a separate diff. Reviewed By: rbgirshick Differential Revision: D6697336 fbshipit-source-id: 0de2750fe7e851795f238d9f625aeb4d74023dc2	2018-01-24 13:24:04 -08:00
Pieter Noordhuis	7a232aae49	Add random seed to NGramFromCategorical test Summary: TSIA Reviewed By: Yangqing, Maratyszcza, dzhulgakov Differential Revision: D6797213 fbshipit-source-id: e1132229cda09d1fbde63686aaec81b995989c03	2018-01-24 13:05:28 -08:00
Xiaolong Wang	29c7c682d8	add NGramFromCategorical Op Summary: as titled Differential Revision: D6783763 fbshipit-source-id: 78280cf15c2cdc3c308562d3f27a81b61ef8d662	2018-01-23 15:08:25 -08:00
Xue Feng	0e9b0cf779	add error msg in fc input_record Summary: as titled Reviewed By: xianjiec Differential Revision: D6787879 fbshipit-source-id: 4bbdd11455480b25fa18121fa4527a9f0a03addc	2018-01-23 14:48:15 -08:00
Anders Papitto	0aa1a6387e	Add a seed to the gru unit test Summary: as it calls np.random and sometimes fails unreproducibly Closes https://github.com/caffe2/caffe2/pull/1779 Reviewed By: pietern Differential Revision: D6779802 Pulled By: anderspapitto fbshipit-source-id: 2ad069f8a15f70a8110b1a6bdb06f81577c53ad4	2018-01-23 13:47:43 -08:00
Xianjie Chen	76a141f016	add error msg in get_key Summary: as title Differential Revision: D6782896 fbshipit-source-id: bd29f6d085e56f51deb4bf6ad81771787fd85a5a	2018-01-23 11:04:05 -08:00
Dániel Simig	2dd79eb53a	Visualize distribution of activation functions Summary: This is a first attempt at completing bootcamp task T24449916. This diff contains 3 major changes: 1) Change LayerModelHelper to allow for exposing the output and parameters of any layer to metrics 2) Added a runner that allows metrics to draw arbitrary plots to a matplotlib axes object 3) Implement a metric that aggregates distributions of values in a blob over the training, and try this out in a notebook Reviewed By: kennyhorror Differential Revision: D6671273 fbshipit-source-id: b8961837395e89c957edbf5c7c862bdb845ccf4b	2018-01-23 10:36:40 -08:00
Lin Yang	8e0177255e	Test for PositionWeighted Summary: add Test for SparseLookup with PositionWeighted. Reviewed By: kennyhorror Differential Revision: D6771612 fbshipit-source-id: b4b3bfd514f366f579b4192643330ae73843d4f9	2018-01-22 19:20:46 -08:00
Viswanath Sivakumar	231d6f7b09	Add SqueezeOp in MKLDNN Summary: SqueezeOp support to drop drop dims of size 1. MKLMemory now supports Reshape() if the buffer is in plain layout, in which case just the dims and layouts are modified similar to caffe2::Tensor. SqueezeOp takes care of converting the input to plain layout if needed via an intermediate buffer before calling Reshape(). Differential Revision: D6735656 fbshipit-source-id: 953309498370e1b8986e8c593bc6963f38036255	2018-01-22 18:39:42 -08:00
Wei Zhang	1d4e996b87	Separate parameter downloading tasks from training tasks and run them in a different group Summary: At the end of distributed training, trainer needs to download the parameters back from parameter servers for saving the model. Currently, this parameter downloading happens at the end of job's epoch task group, which creates several problems when checkpointing is enabled for distributed training: 1. When checkpointing is enabled, we run multiple training epochs. At the end of each epoch, the model download tasks will run to collect parameters, but we won't save the model until the true end of training, so there is a big waste of resource. 2. After trainer0 downloads the parameters, these parameters take a lot of memory, so trainer0 can easily run out of memory in the next epoch of training. Our solution is to insert a parameter download task group between the job's training epoch_group and the job's exit_group. Reviewed By: azzolini Differential Revision: D6765393 fbshipit-source-id: 5a4f556fc3c1cd7834a7c406a3c0de3fccd50c49	2018-01-22 14:04:12 -08:00
Pieter Noordhuis	d618c05174	Increase lower bound of values for values in div test Summary: This should translate to an 1% error margin. The gradient checker uses a .5% threshold. Closes https://github.com/caffe2/caffe2/pull/1766 Differential Revision: D6774077 Pulled By: pietern fbshipit-source-id: f97c7ffb2ef34fdd71d69320a7fdcf4a6a457715	2018-01-22 09:06:12 -08:00
Viswanath Sivakumar	b5d513b1f9	Add op in MKLDNN Summary: Just redirects to MKLSumOp. Doesn't support broadcast though since dnnSumCreate expects identical dims. Differential Revision: D6729788 fbshipit-source-id: 3e189465ad9d026bec4954648562ffe4e67fc393	2018-01-21 08:21:43 -08:00
James Cross	91066559a8	truthy check for empty string in NameScope() Summary: As in name. LATTE translation team moving some code from Python 2 to 3 uncovered a case where comparison between unicode and str types leads NameScope('') to prepend a separator to the beginning of blob names. This fixes it. Thank you so much to dzhulgakov for tracking down the cause of this so quickly! Reviewed By: dzhulgakov Differential Revision: D6766866 fbshipit-source-id: fbe46cff581f425ba10e8668400915ea40baab94	2018-01-19 21:34:09 -08:00
Ilia Cherniavskii	4ce4bc5c7f	Fix occasional test timeouts Summary: Make test less computationally expensive Reviewed By: Yangqing, dzhulgakov Differential Revision: D6766236 fbshipit-source-id: 59e51faa1331d804b11da9f7237ee9ce0cb27df8	2018-01-19 20:08:58 -08:00

1 2 3 4 5 ...

1554 Commits