pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Bram Wasti	3a99698734	include numpy's other 32bit int type Summary: forgot one :) Reviewed By: akyrola Differential Revision: D5534905 fbshipit-source-id: a0e58ca3922ec80f526f7586931ff3da8e9bcffc	2017-08-01 13:53:11 -07:00
Tao Wu	5d304a3b49	add gradient for SparseToDenseMask operator Summary: add gradient for SparseToDenseMask operator Reviewed By: kittipatv Differential Revision: D5320792 fbshipit-source-id: 8ee7f1c87e8270ad6077ed197ce9512524069b59	2017-08-01 13:05:03 -07:00
Alisson Gusatti Azzolini	1968e03486	net_printer.to_string() accepts NetDef Summary: Title. Reviewed By: kennyhorror Differential Revision: D5531925 fbshipit-source-id: 8f8961e6ab14d49720f74ec01c197ba9cc3e33ce	2017-08-01 10:17:29 -07:00
Szymon Piechowicz	3324db447f	Caffe2: allow nets that don't use all input in net.ClonePartial Summary: Caffe2: allow nets that don't use all input in net.ClonePartial Differential Revision: D5535564 fbshipit-source-id: 0ec8fb3ade4d7d6cd4a702c9c265d9c77f27a627	2017-08-01 10:05:46 -07:00
Aapo Kyrola	e38015756a	shape inference for Squeeze Summary: Add tensor inference function for squeeze, refactor a bit Reviewed By: asaadaldien Differential Revision: D5518880 fbshipit-source-id: 5b8cb9154f5f777d4be3612a96d7ed76a9068c0c	2017-07-31 16:04:24 -07:00
Xiaolong Wang	82adbde878	pass layer_parameter shape to ps builder if cannot inferred from initializer Summary: Feed team uses distributed training and wants to also use transfer learning. Currently, transfer learning implements by overwriting the layer parameter initializer. Therefore, PS builder can't infer correctly the parameter shape. To fix this, add a field 'shape' in `layer_parameter` and set the shape if we overwrite its initializer. We also enforce the check of parameter shape between the original initializer and the loaded blob. (this adds extra cost) Differential Revision: D5520541 fbshipit-source-id: 80547dbd328b3f6cbfcea0b2daaf4004703dfe81	2017-07-31 16:04:23 -07:00
James Cross	8c65b5ab34	multilayer seq2seq Summary: Several refinements to seq2seq example code, including support for multilayer LSTM. Reviewed By: jamesr66a Differential Revision: D5460372 fbshipit-source-id: d2eabf6aa9a5b5df7bbc341fd99c4e7d8322e717	2017-07-31 12:27:51 -07:00
Aapo Kyrola	8079abbaf1	fix traversal order Summary: Memonger did not properly track the number of times a blob output has to be produced before an operator can be visited. Actually I remember fixing this before, but well. This bug was manifested in Priya's model, so thanks prigoyal, and benz's model verifier nicely caught the wrong output. Reviewed By: asaadaldien Differential Revision: D5524912 fbshipit-source-id: 10f4d7056b84aba0274a918af508ea043e6026f9	2017-07-30 21:47:48 -07:00
Mingda Li	e3c45206ec	Add a method to run a train net multiple times in layer_test_util.py Summary: This method runs a train net multiple times therefore enables testing layers with iteration-dependent behavior. Differential Revision: D5493750 fbshipit-source-id: a7fb967a66f799aaf82acfadc4ecf66e0744da20	2017-07-28 19:56:05 -07:00
Aapo Kyrola	84b9d267dc	add warnings about slow data input Summary: One of my workflows was stuck before everstore/hive data input was experiencing networking issues (No route to host etc.). But it is hard to know this is happening because the errors were logged to stdout. Anyway, added a simple logging to warn if the data workers enqueue thread is not getting new data for over 10 secs. Reviewed By: panshen1 Differential Revision: D5522816 fbshipit-source-id: a036c4afdfbbafea130a4251c1ca02c138d19a83	2017-07-28 18:21:42 -07:00
Tao Wu	6530db49bc	improve pair_wise_loss operator to support multiple sessions Summary: The diff adds support for rank_loss operator to support computing loss for multiple sessions (batch). Reviewed By: kittipatv Differential Revision: D5515465 fbshipit-source-id: 55a01cd5ad21eaeae82875ad136c392fed0dbb26	2017-07-28 15:12:47 -07:00
Dmytro Dzhulgakov	f2090debb0	Optimized SparseLengthsSum Summary: Optimised SparseLengthsSum (fp32) for now 1) Specialized reducer 2) created fast routine with prefetches, loop unrolling, block specailization and register tiling 3) added more variety of block sizes to segment_ops_test.py Reviewed By: Yangqing Differential Revision: D5392472 fbshipit-source-id: 8ed9baf1b12ec05bd391cabb390024e6bc60a6f6	2017-07-28 10:10:25 -07:00
Bangsheng Tang	a41cbdec0e	float support for square root divide Summary: to support an operation needed by D5507205 Reviewed By: xianjiec Differential Revision: D5512522 fbshipit-source-id: a9b3a668c28eff71d1e106dbbb572184df4a7638	2017-07-27 17:40:40 -07:00
Viswanath Sivakumar	0676dfef2b	ExtractPredictorNet should strip gpu_id prefix from step_net Summary: The renames were only being applied to the main net, if step_net has an external input that is part of renames, running the model would fail with 'blob not found in workspace' error. Differential Revision: D5511953 fbshipit-source-id: ba262a094c3263978dfe173f2cab00301131b57f	2017-07-27 16:06:47 -07:00
Jacqueline Xu	13569c9aa0	Fixing semi-random layer model for multi-layer models Summary: Updated the semi-random layer model for multi-layer models using semi-random layers. Notable changes: - Input and outputs for the semi-random layer is now a Struct with "full" and "random" components - Flag was added to choose to initialize output schema in Arc Cosine or not (if output schema initialization will happen in Semi Random layer) Reviewed By: chocjy Differential Revision: D5496034 fbshipit-source-id: 5245e287a5b1cbffd5e8d2e3da31477c65b41e04	2017-07-27 15:25:19 -07:00
Aapo Kyrola	26645154bb	warn about using test/val model with init_params=True + fixed some cases Summary: It is common mistake to create test/validation model with init_params=True. When its param_init_net is run, it will overwrite training models' params, and with DPM, those won't be synchronized to all GPUs. I don't want to make this an assertion yet, since it might break people's trainers (it is ok to have init_params=True if you never run the param_init_net...). Reviewed By: asaadaldien Differential Revision: D5509963 fbshipit-source-id: 63b1a16ec0af96e3790e226850f6e0e64689143f	2017-07-27 13:20:27 -07:00
Aapo Kyrola	af1e45c1e1	support appending net and converting them Summary: As per rushabhmshah99 request: he wants to append a pre-trained model (without training that) to the model. So added data_parallel_model.ConvertNetForDevice() to enable that. The unit test shows example how to use this with AppendNet, and I also added a blurb to the function. Differential Revision: D5503335 fbshipit-source-id: b2a5db5c1739dc97f46dd0d7606ed555d99255b8	2017-07-27 11:07:48 -07:00
Bangsheng Tang	d8443b8ffa	BatchGatherOp Summary: 1. added BatchGatherOp and BatchGatherGradientOp 2. unit tests Reviewed By: xianjiec Differential Revision: D5443965 fbshipit-source-id: bdcbb7f9f91c55484372a4bdb1727ae6d49e2018	2017-07-27 10:17:42 -07:00
Aapo Kyrola	3363681304	enable CreateCommonWorld to bootstrap from existing common world Summary: Use romain-intel's ContextFactory to create common worlds from existing common worlds, thus bypassing KV store completely. Changed data_parallel_model to automatically find if there is already a CW we can work. CreateCommonWorldOp takes optional second parameter, which is existing CW. Reviewed By: andrewwdye Differential Revision: D5494956 fbshipit-source-id: 5f7a840bcd5fe4ea756fafeacc746bc2cf5078b0	2017-07-26 22:31:55 -07:00
Yangqing Jia	de92dbe4bb	MKL code move Summary: Nothing gets changed - this would allow us to more easily deal with build systems. Also now everything that is MKL related lives under mkl/. Reviewed By: dzhulgakov Differential Revision: D5505157 fbshipit-source-id: ddb2e6ac290a146a7cb495da23bb0e5b5594bd2a	2017-07-26 20:21:55 -07:00
Ahmed Taei	40b783b746	Fix flaky test due to numerical gradient approximation error. Summary: Use smaller step size for GradientChecks and pass seed to help reproducing the test from logged inputs. Reviewed By: Yangqing Differential Revision: D5505698 fbshipit-source-id: fc308efe72d535695ba628944aee1913ba16b2f1	2017-07-26 18:58:19 -07:00
Jacqueline Xu	9bec54bbf1	Modify arc cosine feature map and semi random layers to initialize parameters as global constants Summary: The original issue was that the initialized parameters for randomized layers (Arc Cosine and Semi-Random) were not fixed across distributed runs of the layers. Moreover, as the weights are initialized as (constant) parameters, when the layer is added to the preprocessing part, these weights won't be saved after training since they don't exist on the trainer. I fixed the issue here by building an option to add the randomized parameters to the model global constants so that the same parameter values can be accessed. Also, the parameters can be saved when the training is finished. In this diff, I've: - Updated randomized parameters to be added as a global constant across distributed runs of Arc Cosine Feature Map and Semi Random Feature layers - Updated unit tests - Ran an end-to-end test, enabling multiple readers to test the fixed issue Reviewed By: chocjy Differential Revision: D5483372 fbshipit-source-id: b4617f9ffc1c414d5a381dbded723a31a8be3ccd	2017-07-26 16:37:00 -07:00
Szymon Piechowicz	54b171eae5	Caffe2: don't swallow exception stacktrace Summary: Caffe2: don't swallow exception stacktrace {F69325406} Reviewed By: akyrola Differential Revision: D5503227 fbshipit-source-id: 4e11d921652a094e20c46af19ba880390be8e997	2017-07-26 15:48:05 -07:00
Wojciech Glogowski	8f8dccd2ed	distance_op_test from hypothesis_test refactored Summary: Moved distance_op_test from hypothesis_test to distance_op_test and refactored Reviewed By: akyrola, asaadaldien Differential Revision: D5495104 fbshipit-source-id: 4a90c75eabeb380ae9d150d6258e9b5b0fbfc5ca	2017-07-26 13:37:08 -07:00
Davin Wang	d89632b52c	Support (U)INT8, (U)INT16 in data type conversion Summary: Data type conversion between Numpy Array and Caffe2 Tensor currently only support 3 types: FLOAT, DOUBLE and INT32. Support 8bit and 16bit date types will help reduce the model size in some circumstance. I benefit from this to reduce size of a data set from 8GB to 1GB by using INT8. Closes https://github.com/caffe2/caffe2/pull/930 Reviewed By: Yangqing Differential Revision: D5440929 Pulled By: akyrola fbshipit-source-id: 3762da1d845e62a13ba384d1c144328b19dd663b	2017-07-26 11:23:53 -07:00
Dmytro Dzhulgakov	cf1ce29631	Fix GPU SparseAdaGrad with empty tensors Summary: CUDA doesn't like 0-sized grids :) Reviewed By: Yangqing Differential Revision: D5495805 fbshipit-source-id: 6819513024978ee6bb70a39b25d23ced06465750	2017-07-25 23:50:54 -07:00
Artem Volkhin	2f5c96a730	Fix Flatten operator for empty tensors Reviewed By: xianjiec Differential Revision: D5487475 fbshipit-source-id: f1321e15352b0bbe039312f544a9c2ed78da8732	2017-07-25 17:51:42 -07:00
Andrew Tulloch	133dc2603e	Support grouped convolutions in MKL Reviewed By: Yangqing Differential Revision: D5487692 fbshipit-source-id: 94fb66b3b104cf16dcad07743def4ea940515689	2017-07-25 14:19:02 -07:00
Andrew Tulloch	d86f32ae2e	Implement simple graph rewrite functionality. Reviewed By: Yangqing Differential Revision: D5487075 fbshipit-source-id: f7c7867c5cbae39cf197cf5e7ed8a64149f33208	2017-07-25 14:19:01 -07:00
Andrew Tulloch	9e6ea2987f	MKLReluOp supports in-place X/Y Reviewed By: Yangqing Differential Revision: D5487060 fbshipit-source-id: 35d2d450f46aefc3c9395be45af99e13d1c168ec	2017-07-25 14:19:00 -07:00
Andrew Tulloch	af43e2b251	MKLConvOp handles the no bias case Reviewed By: Yangqing Differential Revision: D5487050 fbshipit-source-id: 4791943d331a2d7283f0f9b939f3f03e32dbdbed	2017-07-25 14:18:58 -07:00
Andrew Tulloch	f028e74fb7	Implement a filler op test Reviewed By: Yangqing Differential Revision: D5487042 fbshipit-source-id: 0b03683fd3822769381c14790c0c2e46162d1aaf	2017-07-25 14:18:57 -07:00
Andrew Tulloch	bbf2b578dc	Implement MKL CopyTo/CopyFrom ops Reviewed By: Yangqing Differential Revision: D5482636 fbshipit-source-id: d044c495837aef985210f0b63d61f88f9acc3db7	2017-07-25 14:18:55 -07:00
Andrew Tulloch	71d04fd5cc	Implement SumOp for MKL Reviewed By: Yangqing Differential Revision: D5482622 fbshipit-source-id: e1e8f8aebce874efc31fab2c870cd274ca0d037c	2017-07-25 14:18:54 -07:00
Andrew Tulloch	ad7d7657a4	Add tests to targets Reviewed By: Yangqing Differential Revision: D5482614 fbshipit-source-id: 04727e19b7b83b6d0d41ad3227866957480bc1ee	2017-07-25 14:18:54 -07:00
Andrew Tulloch	007492e730	Fix MKL spatial pooling test Summary: tsia Reviewed By: Yangqing Differential Revision: D5482603 fbshipit-source-id: e95a8829c71125623066cfee3b76e774c7f3a46b	2017-07-25 14:18:53 -07:00
Andrew Tulloch	a7d8f489d9	Improe MKL SpatialBN test Summary: tsia Reviewed By: Yangqing Differential Revision: D5482596 fbshipit-source-id: 2817ceb57154dcefffec3251efc397cba8163097	2017-07-25 14:18:52 -07:00
Wojciech Glogowski	f656e002a7	CosineSimilarity GPU Reviewed By: asaadaldien, akyrola Differential Revision: D5476812 fbshipit-source-id: d931a7d8e4a4dfdf22ee18f8b9c755cc21b0e75b	2017-07-25 13:34:01 -07:00
Tao Wu	5449afa855	use model.create_param instead of using param_init_net directly Summary: When creating parameters for modelhelper, we should use create_param instead of using param_init_net and model.params directly. The diff rewrite some of these cases in rnn_cell.py in order to make model._parameter_info and model.params consistent. Reviewed By: kittipatv Differential Revision: D5477724 fbshipit-source-id: 28c4aaf8f98d9d89125af6a42ad328008f0079e1	2017-07-24 21:17:24 -07:00
Dmytro Dzhulgakov	8930c095c1	Add support for int32 indices in SparseLengthSum and friends Summary: Need it for some reference comparison for c2isl. Also there's an argument that it might be faster on GPU with int32. Doesn't seem to be the case now, but haven't tested with Jeff's changes yet. Reviewed By: kennyhorror Differential Revision: D5405482 fbshipit-source-id: dc1a983dce5f06f1111c5634ec475647c94848cc	2017-07-24 17:50:00 -07:00
James Cross	0eda7955bd	use internal cell for DropoutCell output prep methods Summary: In order to get dimensions right, correctly identify gradients, etc., DropoutCell should call the _prepare_output and _prepare_output_sequence methods of its internal cell for its own such methods. This bug was identified by NVIDIA intern Syed Tousif Ahmed. Reviewed By: akyrola Differential Revision: D5483082 fbshipit-source-id: f6df5b4a0502ed0771056638aab219fb5cc7d964	2017-07-24 14:53:11 -07:00
Yangqing Jia	0deee2194f	Add a quick SparseLengthsSum benchmark. Summary: TSIA - this makes it a bit easy to benchmark sparse lengths sum. Reviewed By: dzhulgakov Differential Revision: D5477844 fbshipit-source-id: 89e25c5e0dbf3538877ba1a9abc75a10abfa2757	2017-07-24 13:17:47 -07:00
Yangqing Jia	f6afa6adbd	Add proper cpuid support. Summary: This is needed for us to do more fine grained dispatch based on CPU arch, so I figured we should just add it. Can help Dima and Misha doing optimization I think? Reviewed By: dzhulgakov Differential Revision: D5477444 fbshipit-source-id: 48aaf8bd799e9755493cd51c793ceec080a8846c	2017-07-23 17:21:50 -07:00
James Cross	99e79a616b	attention with encoder_lengths Summary: For RNN attention, we should not include the invalid parts of the encoder output (based on encoder_lengths) in the computation. This diff accomplishes that by forcing logits for those positions to be negative infinity. Note that the this step can be bypassed by passing encoder_lengths=None, which is what we do for beam search, thus incurring no extra overhead for inference. Reviewed By: jamesr66a Differential Revision: D5402547 fbshipit-source-id: 1863d6050b5129e4df829c6357f0aa9ded0715dc	2017-07-23 10:06:01 -07:00
Yiming Wu	b51e0ec0c2	quick fix inplace blob bug Summary: fixing the case where the init net will initialize same blob twice. I made an exception by allowing inplace blob among ops if the blob keeps on the same device. This should fix this problem in a generalized way as most of our training is only on CPU now. Reviewed By: dzhulgakov Differential Revision: D5450564 fbshipit-source-id: 525c4c9a2e5216a70dbd1229da2d9f8a58b89e47	2017-07-23 02:18:16 -07:00
Yiming Wu	920c553ac0	saving/loading CPU/GPU nets Summary: Saving 2 nets at offline training and loading the correct net the user want. The keep_device=false will help us load gpu blobs to CPU memory. Reviewed By: dzhulgakov Differential Revision: D5396689 fbshipit-source-id: ff26bf3759856b07f3a1bbefac4a1e613a8a02e1	2017-07-23 02:18:15 -07:00
Yiming Wu	4a256dfc97	save/load/run nets and params with device info correctly Summary: ===Update log 7/10=== We are now restrained from problem of connection. Will post if this problem does not fix in 2hrs. ===Update 7/6=== Luke is experimenting on the convergence of this diff. Hopefully he could present results next week Right now this is not affecting our original CPU training pipeline because the loading op is still correct in CPU situation now. I will need final test to make sure. But that is now blocked by log device issue t19952135 I will do CPU/GPU nets saved in a separate diff. ====Update before 7.4==== It's actually working! Include local run screenshot {F67959016} dogscience Reviewed By: dzhulgakov Differential Revision: D5307058 fbshipit-source-id: cad5d9324c239419530f4b120392ec2ccbb72280	2017-07-23 02:18:15 -07:00
Ahmed Taei	804ebf7c41	Populate learning rate blob name into data_parallel_model and fix resnet50_trainer example. Reviewed By: akyrola Differential Revision: D5463772 fbshipit-source-id: 10b8963af778503a3de6edbabb869747bd1e986d	2017-07-21 16:24:10 -07:00
Alisson Gusatti Azzolini	8e80ef7e6d	s/CopyGPUToGPU/Copy Summary: CopyGPUToGPU does not exist. Copy seems to do the trick. Didn't go into details of how copy works, not sure if it ends up triggering UVA. Reviewed By: akyrola Differential Revision: D5471014 fbshipit-source-id: d8bc1aed9b19070c92f3ffc76f5617bdd0054563	2017-07-21 13:51:11 -07:00
Junjie Bai	efe2d01a3e	Fix some bugs in CPU version of BooleanMask and add GPU version Reviewed By: akyrola Differential Revision: D5397208 fbshipit-source-id: 0314cc181e315f3b6cda846292b2e2ea73bb015b	2017-07-21 11:38:49 -07:00

1 2 3 4 5 ...

1018 Commits