Summary: Hard-to-debug problems arise when a gradient creator fails because the forward op itself is incorrect. Check the op schema before calling the creator, and clarify the error messages.
Reviewed By: Yangqing
Differential Revision: D5256016
fbshipit-source-id: 78550f7e2ce5b88e26b69fdae4be0eece52edfea
Summary:
The current version of schema.py has a Metadata class with three fields, but its default is set to
four Nones. This changes the default to three Nones so that the number of default values matches the number
of actual fields.
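For illustration only, a minimal sketch of the pattern being fixed (the field names are a guess, not necessarily the real schema.py definition):
from collections import namedtuple

Metadata = namedtuple(
    'Metadata', ['categorical_limit', 'expected_value', 'feature_specs'])
# one default per field: three fields, three Nones
Metadata.__new__.__defaults__ = (None, None, None)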
Reviewed By: kennyhorror
Differential Revision: D5250463
fbshipit-source-id: 42e5650d270f5f63662614d8445b4819ed370dec
Summary: Also fixed a small bug in the ModelHelper constructor.
Reviewed By: harouwu
Differential Revision: D5246799
fbshipit-source-id: 3719ca078f0e2b5e463fc93da9c8215f5583bd9a
Summary:
We need to support RNNs explicitly in ExtractPredictorNet, because they store sub-nets as strings in special arguments. Once NetDef-typed arguments land, we can generalize this a bit.
Added a test under rnn_cell_test to test that extracting an LSTM predictor net works correctly and sets the device option properly for the step net ops.
Reviewed By: yqwangustc
Differential Revision: D5236334
fbshipit-source-id: cd653427f8c440a14d94195a532d18276f94749a
Summary: A quite common problem is that it is hard to load blobs with pe.load_from_db onto a specific device. One must set the device options of the returned init_net and predict_init_net, which is quite magical. So I made load_from_db() set these device options automatically, based on the device scope or a device_option parameter. Added a unit test.
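A minimal usage sketch (db/blob names are placeholders; the device_option keyword follows the description above and the exact signature may differ):
from caffe2.proto import caffe2_pb2
from caffe2.python import core
from caffe2.python.predictor import predictor_exporter as pe

gpu = core.DeviceOption(caffe2_pb2.CUDA, 0)
# either rely on the active device scope ...
with core.DeviceScope(gpu):
    meta_net_def = pe.load_from_db('model.minidb', 'minidb')
# ... or pass the device explicitly
meta_net_def = pe.load_from_db('model.minidb', 'minidb', device_option=gpu)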
Reviewed By: asaadaldien
Differential Revision: D5249202
fbshipit-source-id: 7b9d91476cb8d1b0ec0d9772e50b9148b8b184fa
Summary:
salexspb This fixes a major perf issue (40% boost on alexnet end-to-end perf) in the multi-precision SGD optimizer - it was causing repeated cudaMalloc / cudaFree calls during training iterations due to the changing size of the `grad` blob as it moved from fp16 <-> fp32.
Closes https://github.com/caffe2/caffe2/pull/797
Differential Revision: D5246978
Pulled By: salexspb
fbshipit-source-id: ec3d7ef18445e19eaf5aac908d0a7bcd5957eb60
Summary: This was only needed in order to initialize stateful PythonOps. Now PythonOp has support for initialization at Op creation time, so this is not used anymore.
Reviewed By: dzhulgakov
Differential Revision: D5242908
fbshipit-source-id: dbaa249466dd0f37f25d204d387b1f99c6dd4fed
Summary: This will show a Python Caffe2 user where a failed operator was created. The motivation for keeping this information out of the protobuf is to avoid making it too verbose and to keep the protobufs of a net readable after a simple print() call.
Reviewed By: jamesr66a
Differential Revision: D5226047
fbshipit-source-id: 7edfe850e05a2ec209577142aa3368664a57a108
Summary:
This allows constructing a PythonOp by passing a pickled "builder function call" as an argument to the op.
The builder function is called at PythonOp construction time and returns a function that will be called when the op is run.
This lets us drop the dependency on 'tokens', which did not work properly for protobufs that get distributed to other processes. Now the PythonOp definition is self-contained: as long as the build dependencies are right, shipping the protobuf is enough to execute the net remotely.
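A hedged sketch of the builder-function pattern (names and tensor access are illustrative, not the exact PythonOp API):
import pickle

def build_scaler(scale):
    # runs once, at PythonOp construction time
    def op_fn(inputs, outputs):
        # runs on every execution of the op
        outputs[0].reshape(inputs[0].shape)
        outputs[0].data[...] = inputs[0].data * scale
    return op_fn

# the pickled builder call is stored as an op argument,
# so the NetDef is self-contained
pickled_builder = pickle.dumps((build_scaler, (2.0,), {}))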
Reviewed By: dzhulgakov
Differential Revision: D5080833
fbshipit-source-id: a5deaca5d3143024cdb121519689224e9dbec5ce
Summary:
Truncate the id list using the max length computed in compute meta, so that it has a fixed length, which is useful for the position-weighted pooling method.
Reviewed By: sunwael
Differential Revision: D5233739
fbshipit-source-id: f73deec1bb50144ba14c4f8cfa545e1ced5071ce
Summary: People recently found that this test is too strict because of proto string matching. I changed it to compare fields instead, so the test will not complain even if the protobuf format changes in the future.
Reviewed By: dzhulgakov
Differential Revision: D5229855
fbshipit-source-id: 54efcd7a0f9e5dbba1ddeb480801abcb859e07bd
Summary: Added an operator that converts key/value blobs into a blob containing a map pointer; unit test passes.
Differential Revision: D5224449
fbshipit-source-id: 2f60754ed3ba6ed16039c09019117ae3c3646ab2
Summary:
Diff D5224410 initializes the should_stop_blob explicitly. With that, we
have one more blob when executing the job. Adjust the check accordingly.
Reviewed By: azzolini
Differential Revision: D5228398
fbshipit-source-id: 439b186c30b0b1d0e41e513babbcccd85e7a1b4a
Summary:
We waste extra memory by creating two autosplit gradient blobs and then accumulating them into the main one. Sometimes, when Sum / Sub ops are involved, we can avoid wasting that extra memory entirely.
Ideally we would not waste any memory and would make ops add to the same blob rather than calculating separate results and then merging them. But that would require a substantial change to the framework and rewriting a lot of operators.
Reviewed By: dzhulgakov
Differential Revision: D5157667
fbshipit-source-id: 8293824d6cdd971d8853ae90aee68e4a6d1e132b
Summary:
It's very useful for simple cases like benchmarking nets, where we want to encode the input/output record in the net and don't want to go through the hurdles of storing the input/output record in MetaNetDef.
For those cases I propose remapping the input/output record before saving to 'input_record/{field_name}'. Then we can recover the input/output record just from the names of the blobs.
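Conceptually, the remapping could look like the sketch below (a Copy-based illustration of the idea, not necessarily the actual mechanism):
def remap_input_record(net, record):
    # prefix every field blob so the record can be recovered
    # later from the blob names alone
    for name, blob in zip(record.field_names(), record.field_blobs()):
        net.Copy(blob, 'input_record/{}'.format(name))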
Differential Revision: D5170473
fbshipit-source-id: ac5daa60051605ed93022aec1377a49f08f15663
Summary: This diff fixes an issue with running the same reader in the same workspace multiple times. To achieve correct behavior of the execution step, we have to explicitly initialize the should_stop_blob to False.
Reviewed By: kennyhorror
Differential Revision: D5224410
fbshipit-source-id: 4ad2740e187b62b0a1f5612ea3eef223dcc8a799
Summary: Added an operator that converts key/value blobs into a blob containing a map pointer; unit test passes.
Differential Revision: D5166513
fbshipit-source-id: 748527c423a163fe55f914c08fff3adfc74a540c
Summary:
The SparseToDense layer essentially calls the SparseToDenseMask op.
This makes it impossible to call the functional layer with the true SparseToDense op.
This diff renames the layer.
Please let me know if I missed anything or if you have a better name suggestion.
Differential Revision: D5169353
fbshipit-source-id: 724d3c6dba81448a6db054f044176ffc7f708bdb
Summary:
Static RNN allows unrolling an RNN into the Caffe2 graph using all the existing cell abstractions. In this diff I introduce several new tests that already caught a few bugs in our RecurrentNetworkOp gradient accumulation logic by comparing it to an unrolled version.
Another use case is perf: potentially we can run an unrolled net faster because DAGNet will have access to the whole graph. The same applies to memonger, but that work is not part of this diff.
Reviewed By: akyrola
Differential Revision: D5200943
fbshipit-source-id: 20f16fc1b2ca500d06ccc60c4cec6e81839149dc
Summary:
In some cases you have an optimized network and a normal one, and you would like to make sure they produce the same results. If the math under the hood is the same, you can do this with very high precision compared to a traditional numerical gradient check. One application is RNNs: there we can unroll the RNN into a Caffe2 graph and make sure the result is the same as in the optimized version that uses RecurrentNetworkOp.
Another possible application is graph transformations. We can verify that the transformed nets produce the same gradients (cc akyrola on memonger, bwasti on other transformation ideas).
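A rough sketch of the kind of check this enables (net and blob names are placeholders):
import numpy as np
from caffe2.python import workspace

def assert_nets_match(net_a, net_b, blobs_to_check, atol=1e-6):
    workspace.RunNetOnce(net_a)
    results_a = {b: workspace.FetchBlob(b) for b in blobs_to_check}
    workspace.RunNetOnce(net_b)
    for b in blobs_to_check:
        # identical math allows a much tighter tolerance than a
        # numerical gradient check would
        np.testing.assert_allclose(results_a[b], workspace.FetchBlob(b), atol=atol)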
Reviewed By: bwasti
Differential Revision: D5200855
fbshipit-source-id: 0196af187f0c2feb33de4778ea08d0d288fe1017
Summary:
When building a multi-layer static RNN, the last timestep of the first layer (and of every layer except the last one) doesn't get a gradient for the cell state, since normally the user only consumes results from the last layer and the cell state doesn't flow upward either.
ZeroGradient provides a general solution for injecting zero-gradient blobs. It is in some ways similar to the StopGradient operator, which is also special-cased.
Reviewed By: bwasti
Differential Revision: D5198375
fbshipit-source-id: a21d0cfb3676a77fac72e5897a200d0bd25fc6de
Summary:
`brew_test.py` is just plain broken. `core_test.py` doesn't work with pytest. `apmeter_test.py` and `top_k_test.py` don't work for CUDA builds.
Closes https://github.com/caffe2/caffe2/pull/765
Differential Revision: D5211817
Pulled By: Yangqing
fbshipit-source-id: 78ec5af35a3fa870978e4c9590210ade9e3bc5ac
Summary:
Neither dependency is required by the core Python modules.
OpenCV, in particular, is a pain to install (no pip package). Conditionally skipping this test will make TravisCI integration easier.
Closes https://github.com/caffe2/caffe2/pull/739
Differential Revision: D5211799
Pulled By: Yangqing
fbshipit-source-id: c6bdc8a17977f64f34e968fd9ab8c65161d2624d
Summary:
This diff fixes various issues with memonger, and works at least with rbgirshick's failure case, Resnet-50, and a new, harder unit test. I will still create a proper resnet50 test.
1) Introduce the concept of "tokens". These are passed down the dependency chains, and a blob can be used for recycling only if it owns all the tokens currently in possession. Tokens are added when branching and redeemed after all inputs are satisfied. A bit hard to explain.
2) There were various bugs due to bad code: the free_blobs data structure has a different type depending on whether we have blob sizes or not. I plan to rewrite this soon, but there were some bugs.
3) Added a harder unit test that failed before.
4) Added test for resnet50 + memonger
Reviewed By: asaadaldien
Differential Revision: D5193393
fbshipit-source-id: bc2a714877aa1201c32a5ba8ade862865e455711
Summary: I broke resnet50 when switching to use the optimizer, which uses an LR per parameter. This only happens after each epoch, and I did not test patiently enough. As a stop-gap, while asaadaldien works on a better solution, just fetch the LR of the conv1_w param.
Reviewed By: asaadaldien
Differential Revision: D5207552
fbshipit-source-id: f3474cd5eb0e291a59880e2834375491883fddfc
Summary:
This diff attacks the problem where we want to just annotate the device option on operators and let Caffe2 inject cross-device copy functions for us. This feature is useful for mixed-device training and for multi-device training with several nets, where previously we did the heavy lifting of adding the copy functions ourselves.
Ideally, this feature will be used like this:
# construct your nets first
core.InjectDeviceCopyAmongNets([train_init, train_net, ...])
My ideas are written in the comments. I will update them here later as well.
Reviewed By: dzhulgakov
Differential Revision: D5134103
fbshipit-source-id: 173f7da9d1773d1c50ccdc27f1b5cd3067b04af5
Summary: Catch the exception when fetching uninitialized blobs while collecting blob sizes in the workspace. Some output blobs (like the mask output of Dropout when is_test=1) may be nullptr, and FetchBlob will fail.
Differential Revision: D5198641
fbshipit-source-id: 45ee26c4cb1c25cc48904e9f7d7c007224c97418
Summary: Implements an APMeter operator (APMeterOp) to calculate average precision (AP) for multiclass classification given prediction scores and labels. The op takes a score tensor [nsamples x nclasses] and a label tensor [nsamples x nclasses], and outputs a float tensor of size nclasses with the AP for each class.
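A hedged usage sketch (assumes the operator is registered as "APMeter" and that integer labels are acceptable; dtypes may differ in the real op):
import numpy as np
from caffe2.python import core, workspace

nsamples, nclasses = 8, 3
workspace.FeedBlob('scores', np.random.rand(nsamples, nclasses).astype(np.float32))
workspace.FeedBlob('labels', np.random.randint(0, 2, (nsamples, nclasses)).astype(np.int32))
workspace.RunOperatorOnce(core.CreateOperator('APMeter', ['scores', 'labels'], ['ap']))
print(workspace.FetchBlob('ap'))  # one AP value per class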
Reviewed By: akyrola
Differential Revision: D5082565
fbshipit-source-id: ae7304bc8fc999c361245b9aec38eb9a5f5eef4b
Summary:
Add a helper function for the parametric op ElementwiseLinear.
The typical syntax is model.ElementwiseLinear(input, output, dimension).
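A minimal sketch of how the helper might be used (blob names and the dimension are placeholders; the exact dispatch may go through brew):
from caffe2.python import model_helper

model = model_helper.ModelHelper(name='example')
# y[i, d] = w[d] * x[i, d] + b[d], with learnable w and b of size `dimension`
model.ElementwiseLinear('fc_out', 'scaled_out', 256)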
Reviewed By: harouwu, akyrola
Differential Revision: D5114152
fbshipit-source-id: 8e8c691f824f518ae510a72ab0c12de1b018f3b5
Summary:
There is an edge case where internal gradient blobs of the backward step net should not be considered internally calculated if the only "internal" calculation is in-place.
In the case of the failing attention unit tests, the offending blob was attention_weighted_encoder_context_grad, which was incorrectly considered internal because it was the output (as well as the input) of a Reshape on the step net's edge. The caveat here is that the results may be unpredictable if a non-pass-through in-place operation is applied to a blob within a step net that is both consumed internally and used as a recurrent state/output. (This is an extreme edge case and difficult to explicitly enforce, but it's worth noting.)
Reviewed By: salexspb
Differential Revision: D5198328
fbshipit-source-id: 0cfa8f903fd767fc50e727f238ac3d8cdca03fe0
Summary:
The goal of this diff is:
1) Enable checkpointing to honor batches_per_epoch
2) Resume hive_readers mid-split
Reviewed By: azzolini
Differential Revision: D5004212
fbshipit-source-id: 2ff5df30ba946eefadd109d80056cde67398a080
Summary:
Input of the TopK op: X (dense).
Output of the TopK op: Value and Indices (a sparse representation).
Value has a gradient in some cases, so we backprop (copy) the gradient from the sparse d Value to the dense d X.
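In numpy terms the backward pass is roughly the following scatter (a sketch of the idea, not the actual kernel):
import numpy as np

def topk_backward(d_value, indices, dense_shape):
    # route each gradient entry back to the position its value came from
    d_x = np.zeros(dense_shape, dtype=d_value.dtype)
    rows = np.arange(dense_shape[0])[:, None]
    d_x[rows, indices] = d_value
    return d_x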
Differential Revision: D5133461
fbshipit-source-id: 7bad55b60e8a22dfe0e51357ce2099d7f752c133
Summary: Replace the hand-made SGD with build_sgd.
Reviewed By: salexspb
Differential Revision: D5186331
fbshipit-source-id: 3c7b4b370e29a1344b95819766463bae3812c9a6
Summary: Previous implementation relied on the order of fields for some reason.
Reviewed By: azzolini
Differential Revision: D5164478
fbshipit-source-id: 12717310860584e18ce4ca67d0bd5048354cdc0a
Summary: Infer input and output devices from the OperatorDef through OperatorSchema. This is inspired by shape inference. With this feature, we can easily analyze device information for all blobs in the net in a generic way, which is really helpful for automatic cross-device execution.
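Conceptually this enables checks like the sketch below (the helper name core.InferOpBlobDevices is my assumption for the Python entry point and may differ):
from caffe2.proto import caffe2_pb2
from caffe2.python import core

op = core.CreateOperator(
    'CopyCPUToGPU', ['x_cpu'], ['x_gpu'],
    device_option=core.DeviceOption(caffe2_pb2.CUDA, 0))
# ask the schema where each input/output blob is expected to live
input_devs, output_devs = core.InferOpBlobDevices(op)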
Reviewed By: akyrola, dzhulgakov
Differential Revision: D5161065
fbshipit-source-id: ee656123112171a4ca00f2fb3f6940f32ddf3135