Summary: This is the continuation of T20872698 (Implement the gradient operator for element-wise Logit)
Reviewed By: asaadaldien
Differential Revision: D5969487
fbshipit-source-id: c9bb4222529f9fd9085aa9048b90eb70a63f41f4
Summary: Implemented the logit gradient with eps as an argument. Added a unit test for it and explored the optimal parameters for running the test.
Reviewed By: asaadaldien
Differential Revision: D5910655
fbshipit-source-id: 44898b784a57c7ad45519b202b1eaf95c1c4d460
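For reference, a minimal NumPy sketch of the intended semantics (the exact handling of the gradient at the clamped boundaries is an assumption, not taken from the diff):
```
import numpy as np

def logit_ref(x, eps=1e-6):
    # Clamp the input away from {0, 1} so the log stays finite.
    xc = np.clip(x, eps, 1.0 - eps)
    return np.log(xc / (1.0 - xc))

def logit_grad_ref(x, dy, eps=1e-6):
    # d/dx log(x / (1 - x)) = 1 / (x * (1 - x)); where the input was
    # clamped, the forward output is constant, so the gradient is zero.
    xc = np.clip(x, eps, 1.0 - eps)
    inside = (x >= eps) & (x <= 1.0 - eps)
    return np.where(inside, dy / (xc * (1.0 - xc)), 0.0)
```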
Summary: Make a CUDA version of SparseToDense, and register EnsureDense (which is trivial) on CUDA. We need to use atomics because indices can be duplicated. We can later add an option to indicate that the indices are unique, and use a faster path in that case.
Reviewed By: jhcross
Differential Revision: D5750893
fbshipit-source-id: 005d1675b127a571aac8474fca62d9633f0c7bff
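A NumPy sketch of why duplicated indices force accumulation (and hence atomic adds in the CUDA kernel); names here are illustrative:
```
import numpy as np

def sparse_to_dense_ref(indices, values, first_dim):
    # Rows with the same index must be summed, not overwritten;
    # this is what the atomic adds in the CUDA kernel guarantee.
    out = np.zeros((first_dim,) + values.shape[1:], dtype=values.dtype)
    np.add.at(out, indices, values)
    return out

# Index 1 appears twice, so its rows accumulate to [3.].
print(sparse_to_dense_ref(np.array([1, 1, 3]),
                          np.array([[1.], [2.], [4.]], dtype=np.float32), 5))
```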
Summary: Moved the distance op tests from hypothesis_test to distance_op_test and refactored them.
Reviewed By: akyrola, asaadaldien
Differential Revision: D5495104
fbshipit-source-id: 4a90c75eabeb380ae9d150d6258e9b5b0fbfc5ca
Summary: As title. This helps with (quite common) cases where the data input is stuck for one reason or another, and net execution never proceeds, hanging forever.
Reviewed By: andrewwdye
Differential Revision: D5409885
fbshipit-source-id: 840261fd5964408f788fc0f50ece0d74193694ac
Summary:
/cc akyrola is it possible this test has been broken ever since 5614816fce?
More generally, why do we still have `hypothesis_test.py` at all? In the case of this test, surely one of these files already does more than this one old test:
* `operator_test/cudnn_recurrent_test.py`
* `operator_test/recurrent_network_test.py`
* `operator_test/rnn_cell_test.py`
Closes https://github.com/caffe2/caffe2/pull/843
Differential Revision: D5292109
Pulled By: akyrola
fbshipit-source-id: 6df5df6353a9741d1ae1b796adaab98382857527
Summary:
Working towards https://github.com/caffe2/caffe2/pull/817.
`E InvalidArgument: Insufficient bytes of entropy to draw requested array. shape=(4, 2, 5, 1, 3, 5, 5, 1), dtype=float32. Can you reduce the size or dimensions of the array? What about using a smaller dtype? If slow test runs and minimisation are acceptable, you could increase settings().buffer_size from 8192 to at least 24576000.`
https://travis-ci.org/caffe2/caffe2/jobs/243867951
Closes https://github.com/caffe2/caffe2/pull/828
Differential Revision: D5276723
Pulled By: akyrola
fbshipit-source-id: f7d0e2dd8ef8b6a2354bd4ff7c7446c377c954b4
Summary: Upgrades this file to use brew instead of CNNModelHelper
Reviewed By: harouwu
Differential Revision: D5252089
fbshipit-source-id: 6df4350717c1d42bc4bcc63d255cd422f085ee05
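Roughly, the change swaps the old helper-class methods for brew helper functions; a minimal sketch (layer names and shapes are made up for illustration):
```
from caffe2.python import brew, model_helper

# Old style: model = cnn.CNNModelHelper(...); model.FC("data", "fc1", 64, 10)
# New style: a plain ModelHelper plus brew helper functions.
model = model_helper.ModelHelper(name="example")
fc1 = brew.fc(model, "data", "fc1", dim_in=64, dim_out=10)
relu1 = brew.relu(model, fc1, "relu1")
```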
Summary:
```
File "/data/caffe2/install/caffe2/python/hypothesis_test.py", line 1911, in test_batch_to_space
(w + 2 * pad) / block_size).astype(np.float32)
File "mtrand.pyx", line 1404, in mtrand.RandomState.randn (numpy/random/mtrand/mtrand.c:19843)
File "mtrand.pyx", line 1534, in mtrand.RandomState.standard_normal (numpy/random/mtrand/mtrand.c:20368)
File "mtrand.pyx", line 167, in mtrand.cont0_array (numpy/random/mtrand/mtrand.c:6127)
TypeError: 'float' object cannot be interpreted as an index
```
```
File "/data/caffe2/install/caffe2/python/operator_test/tile_op_test.py", line 101, in tile_ref
tiled_data = np.tile(X, tuple(dims))
File "/data/caffe2/venv/local/lib/python2.7/site-packages/numpy/lib/shape_base.py", line 881, in tile
return c.reshape(shape_out)
TypeError: only integer scalar arrays can be converted to a scalar index
```
I also tested to make sure this still works with 0.11.
Closes https://github.com/caffe2/caffe2/pull/787
Differential Revision: D5248087
Pulled By: salexspb
fbshipit-source-id: eff69482a8eabb8ace330003fa326c832b53865f
Summary: Use `CopyItems` so that it accepts any type of tensor. Also, move the cursor to the input blob so that it's checkpoint-friendly. The output is now also part of the input so that inference can work correctly.
Reviewed By: xianjiec
Differential Revision: D4920987
fbshipit-source-id: da532736225ec27f409ff763ff69a0629235151c
Summary:
Implement NormalizeOp for GPU using CUDA, and rewrite the gradient to be a function of the output
so it's more efficient, especially for the CUDA implementation.
Reviewed By: akyrola
Differential Revision: D4971300
fbshipit-source-id: e0ab66462000988aaf1f26010ea550533d107167
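The math behind expressing the gradient through the output, sketched in NumPy for a single vector (the real op normalizes along an axis):
```
import numpy as np

def normalize_grad_ref(x, dy):
    # Forward: y = x / ||x||. The backward pass can be written in
    # terms of y, which is cheap to reuse on the GPU:
    #   dx = (dy - y * <y, dy>) / ||x||
    norm = np.linalg.norm(x)
    y = x / norm
    return (dy - y * np.dot(y, dy)) / norm
```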
Summary: Both SquaredL2Distance and SquaredL2DistanceGradient had bad CUDA implementations. Use proper reductions and batched kernels.
Reviewed By: asaadaldien
Differential Revision: D4968527
fbshipit-source-id: f7cf82072d38bc127c757c5751863a9439aca8b5
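For reference, the batched semantics in NumPy (the 0.5 factor matches my reading of the op's docs; treat it as an assumption):
```
import numpy as np

def squared_l2_distance_ref(X, Y):
    # One distance per batch row, computed as a per-row reduction.
    D = X - Y
    return 0.5 * np.sum(D * D, axis=1)
```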
Summary:
Similar to SafeDequeueBlobsOp, but adds weight-based sampling for reading from multiple input BlobsQueues.
WeightedSampleDequeueBlobsOp takes a vector of weights (each weight is mapped to one input blob queue).
Based on these probabilities, we choose which BlobsQueue to fetch from.
WeightedSampleDequeueBlobsOp stops when any of the input BlobsQueues is empty.
Reviewed By: dzhulgakov
Differential Revision: D4905160
fbshipit-source-id: 5b1551e2250569f933a6c01ed04442843c5e0cb6
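The sampling logic in isolation (this sketches only the weight-to-queue choice, not the op's schema):
```
import numpy as np

def pick_queue(weights, rng):
    # Normalize the per-queue weights into a categorical
    # distribution and draw the queue index to dequeue from.
    p = np.asarray(weights, dtype=np.float64)
    p /= p.sum()
    return rng.choice(len(p), p=p)

print(pick_queue([1.0, 3.0], np.random.RandomState(0)))  # usually 1
```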
Summary:
add necessary ops for feature processing
* logit op
* replace nan
* batch one hot op
Reviewed By: kittipatv
Differential Revision: D4840869
fbshipit-source-id: 197123ea5608d54f0b5ac7899973a077a6a86775
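For two of the ops above, a minimal NumPy sketch of the expected behavior (the batch one-hot op's real schema, with per-feature value lists, is richer than this):
```
import numpy as np

def replace_nan_ref(x, value=0.0):
    # ReplaceNaN: substitute a fixed value wherever the input is NaN.
    return np.where(np.isnan(x), np.float32(value), x)

def one_hot_ref(indices, index_size):
    # One-hot encode integer indices into an (N, index_size) matrix.
    out = np.zeros((len(indices), index_size), dtype=np.float32)
    out[np.arange(len(indices)), indices] = 1.0
    return out
```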
Summary:
Quite a large diff to make cuDNN LSTM and our LSTM produce the same results, and to provide a Python API for the cuDNN LSTM.
* Added operators RecurrentParamGet and RecurrentParamSet to access weights and biases for the different gates, input/recurrent.
* Removed RecurrentInit as not needed
* recurrent.cudnn_LSTM() returns a special net and mapping that can be used to retrieve the parameters from the LSTM
* recurrent.cudnn_LSTM() can be passed blobs that have the parameters for the individual gate weights and biases
* recurrent.InitFromLSTMParams() can be used to initialize our own LSTM from cuDNN params. This way we can test whether cuDNN and our own implementation produce the same result.
recurrent_test.py tests for equivalence
Reviewed By: salexspb
Differential Revision: D4654988
fbshipit-source-id: 6c1547d873cadcf33e03b0e0110248f0a7ab8cb0
Summary:
Actually adds values together on duplicated indices. I didn't use UnorderedSegmentSum because it would need more modifications to figure out the first dimension, and I don't want to make that function more complex than it already is :)
We could theoretically have a version that does CopyItems and fails on duplicate indices as a fallback, but I haven't implemented it yet as it wouldn't be that useful for now.
Also fixes the hypothesis test: calling rand() inside the test body is not cool, as it makes Hypothesis run forever.
Differential Revision: D4814574
fbshipit-source-id: 1851ec5f5df8fc4bf4844585076b8af23a06b0b2
Summary:
Uses the cudnnTransformTensor function. It works by shuffling the strides according to the transpose axes. Significant speedup over the current GPU version.
+ moves the transpose test under utility_ops, because hypothesis_test is too big
Reviewed By: jamesr66a
Differential Revision: D4810993
fbshipit-source-id: 82577c4ced1389e70bd5992820ae4d8297a3817f
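The stride-shuffling idea, demonstrated in NumPy: permuting shape and strides the same way as the axes yields the transpose, and the kernel materializes a copy from such a strided description (a sketch of the idea, not the kernel):
```
import numpy as np

x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
axes = (2, 0, 1)

# A transpose is just a permutation of shape and strides; copying
# through the permuted strides produces the transposed tensor.
view = np.lib.stride_tricks.as_strided(
    x,
    shape=tuple(x.shape[a] for a in axes),
    strides=tuple(x.strides[a] for a in axes))
assert np.array_equal(view, np.transpose(x, axes))
```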
Summary:
All of these tests fail with some variant of `Cannot create operator of type 'X' on the device 'CUDA'` (see commit messages).
Closes https://github.com/caffe2/caffe2/pull/227
Differential Revision: D4797060
Pulled By: Yangqing
fbshipit-source-id: 5feaa8e949098bfc1254d4c7449a2744e552f925
Summary: PadImage has no kernel parameters, which resulted in the pads_ parameters not being set (0). I added a test case too.
Differential Revision: D4785230
fbshipit-source-id: fd475e7c41208e07fa7a363def9a45c6f82cddfe
Summary: Creates SparseMomentumSGDUpdate, a sparse version of MomentumSGDUpdate, to make that optimization method (via in-place updating operator) compatible with GradientSlices.
Differential Revision: D4784973
fbshipit-source-id: e6330f471a4d5f53589a6ac245e38f256ca7f354
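A NumPy sketch of the update rule applied only to the rows named by the GradientSlice (based on my reading of the dense MomentumSGDUpdate rule; variants such as Nesterov are out of scope here):
```
import numpy as np

def sparse_momentum_sgd_ref(param, moment, indices, grad, lr, momentum):
    # In-place momentum SGD, but only on the sliced rows.
    adjusted = momentum * moment[indices] + lr * grad
    moment[indices] = adjusted
    param[indices] -= adjusted
    return param, moment
```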
Summary: AccumulateHistogramOp, for computing the histogram of all values in input tensors
Differential Revision: D4654417
fbshipit-source-id: dea92346004c772af16e1eb41306287d81dc5a02
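What I understand the op to compute, sketched in NumPy (the real op may also keep overflow buckets for out-of-range values; that detail is left out here):
```
import numpy as np

class HistogramAccumulator(object):
    # Accumulates a histogram of every value seen across calls.
    def __init__(self, lower, upper, num_buckets):
        self.edges = np.linspace(lower, upper, num_buckets + 1)
        self.counts = np.zeros(num_buckets, dtype=np.int64)

    def update(self, values):
        hist, _ = np.histogram(values, bins=self.edges)
        self.counts += hist
        return self.counts
```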
Summary:
1. Allow the EnsureDense op to do either an in-place pass-through or a copy
2. In MTML, add an EnsureDense op before gather
3. Change the unit test values (adding another operator changes the random seed,
which also changes the model initialization)
Reviewed By: xianjiec
Differential Revision: D4625219
fbshipit-source-id: b3c748c3651d1dedd75420912a9698b7e46187c5
Summary:
Update the cuDNN RNN interface (mostly fixing the ordering of arguments). Set the seed so that the test passes consistently.
Closes https://github.com/caffe2/caffe2/pull/62
Reviewed By: Yangqing
Differential Revision: D4348966
fbshipit-source-id: f9b56be37739e5bffabec130e3407492b2aef656
Summary:
Add two arguments to the DotProductOp operator: `force_same_dim` (1 if we want
DotProductOp to only accept two tensors of equal dimension, 0 otherwise) and
`pad_value` (only used when force_same_dim = 0; pads the smaller tensor
to the same size as the other one).
Differential Revision: D4502619
fbshipit-source-id: 46f7da710c6f6365f76a7af6234c34c7f656be62
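A NumPy reference for the padded case (a sketch of the described semantics, not the op's implementation):
```
import numpy as np

def dot_product_ref(X, Y, force_same_dim=0, pad_value=0.0):
    # Row-wise dot product; with force_same_dim = 0, the narrower
    # input is padded with pad_value up to the wider one's width.
    if X.shape[1] != Y.shape[1]:
        assert not force_same_dim, "inputs must have equal dimension"
        width = max(X.shape[1], Y.shape[1])
        X = np.pad(X, ((0, 0), (0, width - X.shape[1])),
                   'constant', constant_values=pad_value)
        Y = np.pad(Y, ((0, 0), (0, width - Y.shape[1])),
                   'constant', constant_values=pad_value)
    return np.sum(X * Y, axis=1)
```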
Summary:
(Caffe2) Modified RecurrentNetworkGradient operator so that training is possible with any of the output blob(s) receiving gradient during the backward pass. This is realized through a new argument for the RecurrentNetwork op, outputs_with_grads, which takes a list of the indices of the output blobs which will receive gradient. The default case (only receiving gradient from the first output blob) remains the default.
New unit test covers the case where outputs_with_grads = [1, 2] using Python LSTM wrapper.
Reviewed By: urikz
Differential Revision: D4518516
fbshipit-source-id: 5c531582b20f3cf727d1aa91239b4d5a2b8a7c1f
Summary:
The existing op transforms the input in a general way: it needs M transform mappings to transform an NxM input tensor.
But for binary predictions X (an Nx2 tensor), we know that X[:, 0] = 1 - X[:, 1].
So we just need one mapping for X[:, 1]; after it is transformed, we can compute X[:, 0].
This diff handles that case.
Differential Revision: D4550441
fbshipit-source-id: 42d8c6e88d830c97628ee930b543740a32acf904
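The trick in NumPy, with a hypothetical mapping callable standing in for the learned transform:
```
import numpy as np

def transform_binary_ref(X, mapping):
    # X is Nx2 with X[:, 0] == 1 - X[:, 1]: transform only the
    # positive column and recover the negative one as its complement.
    pos = mapping(X[:, 1])
    return np.column_stack([1.0 - pos, pos])
```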
Summary:
1. The existing Gather op outputs gradients in sparse format. We add GatherDense that does the same thing
as Gather but outputs gradients in dense format. This relies on the SparseToDenseOp.
2. SparseToDenseOp converts sparse representation (indices, values) into a dense format (missing values are
filled with zeros). There is an existing SparseToDenseMaskOp, but it is mainly for converting sparse features
into dense format; modifying it for our purpose would be too complicated and messy, so it is better to create a new one.
Reviewed By: dzhulgakov
Differential Revision: D4508879
fbshipit-source-id: f4a50efa1c08586d94040f93195661c41cd414da
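For intuition, Gather's forward pass and its dense gradient in NumPy (the scatter-with-accumulation is exactly the SparseToDense step described above):
```
import numpy as np

def gather_ref(data, indices):
    return data[indices]

def gather_dense_grad_ref(indices, grad_out, first_dim):
    # Dense gradient: scatter the output-gradient rows back into a
    # zero tensor of the data's shape; duplicate indices accumulate.
    grad = np.zeros((first_dim,) + grad_out.shape[1:], dtype=grad_out.dtype)
    np.add.at(grad, indices, grad_out)
    return grad
```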
Summary:
I had forgotten to remove this one. The rest of the switch to indexing
instead of string names is coming after D4446813 lands, as scratches
aren't inputs or outputs and thus can't be indexed.
Reviewed By: urikz
Differential Revision: D4465748
fbshipit-source-id: 2ccbedfb35541ef4a2231d1480eef59025bd5290
Summary: Looks like we don't do a good job with initial recurrent input gradients yet. Here is a partial fix; the gradient check still doesn't pass, but the shape is correct now.
Reviewed By: salexspb
Differential Revision: D4475447
fbshipit-source-id: 280f1f59f19e487fd0dce0d440609c50ddce294a
Summary: This diff uses stack workspaces in RecurrentNetwork, which allows us to simplify the implementation and get rid of scratches.
Reviewed By: salexspb
Differential Revision: D4446813
fbshipit-source-id: 514eec7e4300bdf492a9cb192b40cf4f89acf656
Summary:
It's broken because it relies on add_sparse_bias, and it's not easy to
use add_sparse_bias after the switch to loader_param.
DPA would like to try it out :)
Differential Revision: D4447275
fbshipit-source-id: 631cb4995f35383070e44387dc86692ba64b91eb
Summary: Remove usage of recurrent_sizes, so recurrent states' sizes can depend on input (in case of attention matrix for beam decoder). I removed recurrent_sizes from forward and backward steps.
Reviewed By: salexspb
Differential Revision: D4427688
fbshipit-source-id: 580420a294d309c86ec5cb4e677058623b7228e1
Summary:
In this diff I stop passing parameters by name and also remove hardcoded output ids, which were there specifically for LSTM to work. It also allows us to avoid using recurrent_sizes in the backward pass (for the forward pass this was done in D4427688).
Using a similar technique, it should be simple enough to eliminate blob name passing entirely. Then we can fix scoping. These can be done in a follow-up diff.
Reviewed By: urikz
Differential Revision: D4444614
fbshipit-source-id: 3580a76365502b9f2f09e3d8b7e78084ca739f00
Summary:
A new operator is added for model calibration. Given a piecewise linear function and raw predictions as input, it generates the mapped predictions as output.
Details can be found in the operator doc.
Differential Revision: D4418640
fbshipit-source-id: f8ff3ea786b0fe233a4ddcb709e5dbf0861ca484
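A NumPy sketch of a piecewise linear mapping of the kind described (the parameter layout of bounds, slopes, and intercepts is my assumption, not the op's actual schema):
```
import numpy as np

def piecewise_linear_ref(raw, bounds, slopes, intercepts):
    # Within segment i (bounds[i] <= x < bounds[i+1]):
    #   y = slopes[i] * x + intercepts[i]
    # Inputs are clipped to the overall bounds first.
    bounds = np.asarray(bounds)
    slopes = np.asarray(slopes)
    intercepts = np.asarray(intercepts)
    x = np.clip(raw, bounds[0], bounds[-1])
    seg = np.clip(np.searchsorted(bounds, x, side='right') - 1,
                  0, len(slopes) - 1)
    return slopes[seg] * x + intercepts[seg]
```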