Summary: Added a CUDA implementation of the PiecewiseLinearTransformOp.
Differential Revision: D5378537
fbshipit-source-id: 38857f59f5cc52e16e1ecc97983a0b0b82a46c74
Summary:
# Added the gradients of the operation for both CPU and CUDA kernels.
# Unified variable names across all ops.
# Added reference implementation in numpy (the idea is sketched below).
# The gradient check needs a larger stepsize to succeed; is that normal?
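For context, a minimal numpy sketch of the piecewise linear transform idea (the array names and the clip-to-outer-bounds behavior are assumptions, not the exact op semantics):
```py
import numpy as np

def piecewise_linear_transform(x, bounds, slopes, intercepts):
    # bounds: sorted piece boundaries, len(bounds) == num_pieces + 1
    # slopes/intercepts: per-piece line parameters (assumed layout)
    x = np.clip(x, bounds[0], bounds[-1])
    # Index of the piece each element falls into.
    idx = np.clip(np.searchsorted(bounds, x, side='right') - 1,
                  0, len(slopes) - 1)
    return slopes[idx] * x + intercepts[idx]
```
Within a piece the analytic gradient w.r.t. x is just that piece's slope, which may be why a numerical gradient check is sensitive to the stepsize near piece boundaries.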
Reviewed By: akyrola
Differential Revision: D5313682
fbshipit-source-id: aceb92649e01c5caeba8774e678f9095502d396c
Summary: Replace params with sp; otherwise it will report an empty list.
Reviewed By: akyrola
Differential Revision: D5382716
fbshipit-source-id: 34d8e6ee00cbe1718702e3d1f23ea12f8d65063e
Summary:
- Integrated RFF into the preprocessing workflow for dense features
- Developed Flow interface to input RFF parameters
- Created unit test for using RFF with sparseNN
Reviewed By: chocjy
Differential Revision: D5367534
fbshipit-source-id: 07307259c501a614d9ee68a731f0cc8ecd17db68
Summary:
To be used with the predictor "online": a C++ version of memonger for simple nets. Very simple greedy algorithm. Works well at least on the ResNet-50 inference graph: only 3 shared blobs are used.
Next I will integrate this with predictor and run canary (separate diff).
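Not the C++ code itself, but the greedy idea is roughly the following sketch (op/blob structures and names are hypothetical, and in-place ops are ignored):
```py
def greedy_share(ops, external_outputs):
    # Last op index that reads each blob.
    last_use = {}
    for i, op in enumerate(ops):
        for blob in op.inputs:
            last_use[blob] = i
    pool, mapping, counter = [], {}, 0
    for i, op in enumerate(ops):
        for blob in op.outputs:
            if blob in external_outputs:
                continue  # never recycle blobs the caller needs
            if pool:
                mapping[blob] = pool.pop()
            else:
                mapping[blob] = "shared_%d" % counter
                counter += 1
        # Release inputs whose last reader was this op.
        for blob in op.inputs:
            if last_use[blob] == i and blob in mapping:
                pool.append(mapping[blob])
    return mapping  # original blob name -> shared blob name
```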
Reviewed By: asaadaldien
Differential Revision: D5375392
fbshipit-source-id: d36e419e39a32e568e105657c27fb00c85a2535d
Summary:
As the title says.
Closes https://github.com/caffe2/caffe2/pull/879
Differential Revision: D5372787
Pulled By: akyrola
fbshipit-source-id: 0ff469c0d227f1b2252c1a0c4f6f8bebaac5580f
Summary: Add synchronization barrier API with configurable timeout. Users can call Synchronize() to join variable-length execution before resuming multi-machine communication steps, e.g., resuming distributed training iterations after validation on a single machine.
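Conceptually it behaves like a barrier with a timeout; the following only illustrates those semantics with Python's threading.Barrier, not the actual API:
```py
import random
import threading
import time

barrier = threading.Barrier(parties=4)

def worker(rank):
    time.sleep(random.random())  # variable-length work, e.g. validation
    # All workers block here; raises threading.BrokenBarrierError on timeout.
    barrier.wait(timeout=30.0)
    print("rank %d resumes communication steps" % rank)

threads = [threading.Thread(target=worker, args=(r,)) for r in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```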
Reviewed By: akyrola
Differential Revision: D5348387
fbshipit-source-id: 5826da10e6a60c50394c36c7cf47624f10191d11
Summary: Memonger had a bug where it crashed if a blob was the input to multiple ops. This fixes that and adds a test.
Reviewed By: asaadaldien
Differential Revision: D5374860
fbshipit-source-id: 1d5044001eacdbe6db43f69727da9297558f5c5c
Summary: Huge improvement in my tests, and it does not really hurt either.
Reviewed By: wesolwsk
Differential Revision: D5374925
fbshipit-source-id: c96a4ed2ca653120a82233c0037cbfded8a2d2a1
Summary:
b33894e95d removed this line:
```py
unittest.skipIf(workspace.NumCudaDevices() < 2, "Need at least 2 GPUs.")
```
but forgot to add it back later.
```
_________________________________ DataParallelModelTest.test_equiv __________________________________
...
if p2p_access_pattern is not None and not p2p_access_pattern[
> devices[0], peer
]:
E IndexError: index 1 is out of bounds for axis 1 with size 1
...
WARNING:data_parallel_model:** Only 1 GPUs available, GPUs [0, 1] requested
```
/cc akyrola
Closes https://github.com/caffe2/caffe2/pull/888
Reviewed By: akyrola
Differential Revision: D5341310
Pulled By: harouwu
fbshipit-source-id: 8d7f06913c7b5a42009a4033dbb6a48a8e812822
Summary:
- Created the random fourier features layer
- Generated a unit test to test the random fourier features layer is built correctly
- Inspired by the paper [[ https://people.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf | Random Features for Large-Scale Kernel Machines]] (construction sketched below)
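For context, the standard construction from that paper, sketched in numpy (the RBF-kernel bandwidth parameterization here is an assumption):
```py
import numpy as np

def random_fourier_features(X, D, sigma, rng=np.random):
    # Approximates the RBF kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    # with z(x) = sqrt(2/D) * cos(x W + b), W ~ N(0, 1/sigma^2), b ~ U[0, 2pi],
    # so that z(x).dot(z(y)) ~= k(x, y).
    d = X.shape[1]
    W = rng.normal(0.0, 1.0 / sigma, size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X.dot(W) + b)
```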
Reviewed By: chocjy
Differential Revision: D5318105
fbshipit-source-id: c3885cb5ad1358853d4fc13c780fec3141609176
Summary:
Otherwise the op was always added to the main net instead of param_init_net when
desired (i.e., for the initial param sync).
Closes https://github.com/caffe2/caffe2/pull/894
Differential Revision: D5367451
Pulled By: akyrola
fbshipit-source-id: 3d82be6da687c736bd15f4852dbd272266eb4811
Summary: Allows overriding the input/output record as long as the field blobs are the same.
Reviewed By: yangyangyyy
Differential Revision: D5362132
fbshipit-source-id: 3ac2ac22802902b7eed5c226b00a7e1971ad264c
Summary:
It is a quite common question when users get some variant of "blob has version 2 but gradient expects version 1" in their backward pass. The error message is completely unhelpful.
To remedy this, I added proper debug information that tells the user how the version number of a blob was incremented over time, i.e., which ops caused the version to go up. This should help understand the issue.
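Schematically, the added bookkeeping is along these lines (a sketch; names are hypothetical):
```py
from collections import defaultdict

versions = defaultdict(int)   # current SSA version of each blob
history = defaultdict(list)   # how each blob's version was incremented

def on_op(op_type, outputs):
    for blob in outputs:
        versions[blob] += 1
        history[blob].append("version %d set by %s" % (versions[blob], op_type))

on_op("FC", ["y"])
on_op("Relu", ["y"])  # in-place op bumps y from version 1 to 2
print("\n".join(history["y"]))  # printed alongside the gradient error
```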
Reviewed By: dzhulgakov
Differential Revision: D5358227
fbshipit-source-id: bc09d048ac33200c35d56460e44e86c2f2888f3f
Summary: Added two operators that can be used to transfer data into the input format of RNN and back.
Reviewed By: kittipatv
Differential Revision: D5329886
fbshipit-source-id: 07eac29416427b08c49989d4eeed50a6f18493a1
Summary: This was broken in a previous diff, fixing it to use model device type.
Reviewed By: asaadaldien
Differential Revision: D5356005
fbshipit-source-id: a4fcc932bae772076b57625a5fcc0d38eb702cc9
Summary:
This works as a standalone Python script because args are
global. When used from Flow for monitoring purposes it doesn't
work. This diff fixes that.
Reviewed By: zem7
Differential Revision: D5349996
fbshipit-source-id: f73842901d975b783e09e9db0565eb81880bbea1
Summary:
A couple of fixes for the broken reporting of lstm_benchmark:
- last_time must be recorded after warm-up
- entry count was incorrectly removed
Reviewed By: salexspb
Differential Revision: D5349890
fbshipit-source-id: 5dd5bdf46594c520b61bc3b57b153f90a6a17903
Summary:
Eliminates failures caused by overloaded machines running only a few
examples before being timed out.
Reviewed By: tomdz
Differential Revision: D5349555
fbshipit-source-id: 89d1db063f58c72656b37157225a586c9e3f24bc
Summary: Let's try this again: verify graphs every time memonger is run. Will definitely keep an eye on the running time, though.
Reviewed By: akyrola
Differential Revision: D5308188
fbshipit-source-id: 512a76c759b670d31c49d1d492dd8ee1eaf3bafd
Summary:
This adds a CollectivesConcurrencyControl class to manage creating common contexts and cyclic controls to execute Gloo collectives,
and refactors AllReduce and _AddDistributedParameterSync to use it.
Reviewed By: akyrola
Differential Revision: D5335795
fbshipit-source-id: 5084e0a65cdb989cd949be3868b77a680561022d
Summary:
This makes it easy to remove the common fields of one struct from another.
For example,
```py
s1 = Struct(
    ('a', Scalar()),
    ('b', Scalar()),
)
s2 = Struct(('a', Scalar()))
s1 - s2 == Struct(('b', Scalar()))
```
More examples are provided in the code comments.
Differential Revision: D5299277
fbshipit-source-id: 7008586ffdc8e24e1eccc8757da70330c4d90370
Summary:
In some cases we don't want to compute the full FC during eval.
These layers allow us to compute the dot product between
X and W[idx, :], where idx is an input, e.g., the label.
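In numpy terms, the intended computation is roughly the following (shapes here are assumptions):
```py
import numpy as np

def sampled_dot_product(X, W, idx):
    # Full FC would be X.dot(W.T) of shape (N, num_rows); instead compute
    # only the dot product of each example with its selected row of W.
    return np.sum(X * W[idx, :], axis=1)  # shape (N,)

X = np.random.randn(4, 8)     # N=4 examples, d=8 features
W = np.random.randn(10, 8)    # 10 rows, e.g. one per label
idx = np.array([3, 0, 7, 3])  # selected row per example, e.g. the label
out = sampled_dot_product(X, W, idx)
```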
Reviewed By: kittipatv
Differential Revision: D5305364
fbshipit-source-id: 0b6a1b61cc8fcb26c8def8bcd037a4a35d223078
Summary:
Similar to sparse_nn all gpu, this is our first step towards an offline full-GPU experiment.
**Compare Run**
cat(128, 32)512-512 :
GPU 21138598 https://fburl.com/jpeod1pi
CPU 21138787 https://fburl.com/vma7225l
Reviewed By: dzhulgakov
Differential Revision: D5308789
fbshipit-source-id: 413819bf9c5fff125d6967ed48faa5c7b3d6fa85
Summary: Combine _AddDistributedParameterSync() and _SyncParams() into a single function to broadcast across distributed machines and all local GPUs simultaneously. This is similar to how calls to Allreduce have already been optimized using the functionality of Gloo. All the refactoring work is contained in data_parallel_model.py.
Reviewed By: akyrola, andrewwdye
Differential Revision: D5329277
fbshipit-source-id: 4407b88980cf396f2e0f994d796294fa79fd39ed
Summary:
This bug in the test was exposed by https://github.com/caffe2/caffe2/pull/861 (previously, the test was always using the cuDNN engine, regardless of the value of `engine`). This bug is now blocking https://github.com/caffe2/caffe2/pull/817.
```
____________________ TestConvolution.test_convolution_sync _____________________
...
if use_cudnn and requested_engine != 'CUDNN':
raise ValueError(
> 'When use_cudnn=True, the only engine you can specify is '
E ValueError: When use_cudnn=True, the only engine you can specify is "CUDNN"
```
https://travis-ci.org/caffe2/caffe2/jobs/247605579
Closes https://github.com/caffe2/caffe2/pull/881
Differential Revision: D5332619
Pulled By: akyrola
fbshipit-source-id: 63737768a155359ddbbef1da424fcbb94f86bd4e
Summary: This should make it so we no longer have super hacky DAG chains just to generate vectors of indices that could be specified at model creation time.
Reviewed By: akyrola
Differential Revision: D5316707
fbshipit-source-id: 97bb3868b69e0c5a7f465c95f2e16ae0485dcc56
Summary:
Fixes a memonger bug where it could recycle a blob that was released by the same op being processed.
Added a verification step to ensure in-place assignments are not changed.
Reviewed By: asaadaldien
Differential Revision: D5331495
fbshipit-source-id: 20b08f6de5b973e8c9868aa048c142cac1eb6c58
Summary: Implement the slice gradient for CPU. Will soon port this over to GPU so NMT can use it.
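The gradient is essentially a scatter of the output gradient into a zero tensor of the input's shape; roughly, in numpy (a sketch, not the op's exact signature):
```py
import numpy as np

def slice_grad(input_shape, starts, ends, grad_output):
    # d(slice)/d(input): zeros everywhere except the sliced region,
    # which receives grad_output unchanged.
    grad_input = np.zeros(input_shape, dtype=grad_output.dtype)
    grad_input[tuple(slice(s, e) for s, e in zip(starts, ends))] = grad_output
    return grad_input

g = slice_grad((4, 5), starts=(1, 0), ends=(3, 5), grad_output=np.ones((2, 5)))
```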
Reviewed By: akyrola
Differential Revision: D5309305
fbshipit-source-id: 8fb5f4e665f236ecce9227c5c0c302f5076b01ad
Summary:
Made them faster.
This should be equivalent to the algorithm akyrola suggested, just with a list (of parents) as an intermediate representation instead of a string.
Reviewed By: akyrola
Differential Revision: D5308133
fbshipit-source-id: c976a513d10e79c157ea803afb99b147e9ea3357
Summary: The data workers test times out randomly (very seldom), and it looks like the reason is that we call FeedBlob in a thread (the enqueue thread); the first time that is called, it will call workspace.CreateBlob(), which is not thread safe. Fix this by initializing the scratch blobs explicitly.
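The fix amounts to something like this (a sketch; the blob names are hypothetical):
```py
from caffe2.python import workspace

# Create the scratch blobs on the main thread before any enqueue thread
# starts, so FeedBlob never races through workspace.CreateBlob().
scratch_blob_names = ["fetcher_scratch_0", "fetcher_scratch_1"]  # hypothetical
for name in scratch_blob_names:
    workspace.CreateBlob(name)
```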
Reviewed By: panshen1
Differential Revision: D5292426
fbshipit-source-id: d7dad68f3ccc636c60bd82b2527f00f20da298b5