pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Dehua Cheng	35bee0c729	separate op for rowwise counter (#31612 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31612 Count the number of recent updates on rows. Exponential decay is applied to the counter with decay rate r, such that r^{counter_halflife} = 0.5. If counter_halflife is nonpositive, this operator is turned off. Test Plan: added unittest Reviewed By: chocjy Differential Revision: D19217921 fbshipit-source-id: 96d850123e339212cc0e0ef352ea8a1b1bf61dfa	2019-12-27 12:18:39 -08:00
Yanghan Wang	d08250c223	fix zero-batch handling in convtranspose (#24341 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24341 ConvTransposeOp doesn't crash for zero-batch, but it doesn't modify the output blob. This leads to buggy behaviour, especially when running the same network twice using different input, or during backprop in training. It seems `ConvTransposeUnpoolBase<Context>::GetOutputSize` works for zero-batch, so I remove the check for `input.numel() > 0` and reshape the output blob before returning. For CudnnConvTransposeGradientOp, it's a bit verbose to set `dfilter` and `dbias`, and it seems Cudnn can handle it, so simply remove the `X.numel() == 0` branch. Test Plan: buck test mode/dev-nosan caffe2/caffe2/python/operator_test:conv_transpose_test -- --run-disabled Reviewed By: BIT-silence Differential Revision: D16807606 fbshipit-source-id: 0d72c5bd8f2e03c34465e7b530cca548d9bdd5e1	2019-12-18 15:06:36 -08:00
Vitaly Fedyunin	c5d2758c35	Disable flaky TestMomentumSGD.test_fp16momentum_sgd (#31369 ) Summary: Related to https://github.com/pytorch/pytorch/issues/31368 Pull Request resolved: https://github.com/pytorch/pytorch/pull/31369 Differential Revision: D19147072 Pulled By: VitalyFedyunin fbshipit-source-id: 6fad13be7b35f992d84a20f23877cad05ff18616	2019-12-17 19:16:54 -08:00
Yanghan Wang	52b8a52e4d	move AliasWithNameOp to caffe2/operators Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31281 Reviewed By: houseroad Differential Revision: D19053453 fbshipit-source-id: 350bfd5c001db9c17916dcae7ade8f56db1e9841	2019-12-17 02:39:40 -08:00
Yuchen Hao	4a751dfc20	optimize MulGradient for common shapes (#19705 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19705 Optimizing for the case where consecutive dims that are not broadcast are followed by consecutive dims that are broadcast. For example, MulGradient(["dC", "A", "B"], ["dA", "dB"], broadcast=True, axis=0) where A.shape == dC.shape == [9508, 80] and B.shape == [80] . Test Plan: In SKL T6, Running mul_gradient_benchmark without this optimization Operator #0 (dA, MulGradient) 11.9119 ms/iter After this optimization, Operator #0 (dA, MulGradient) 0.672759 ms/iter Need to land D15291800 before to fix the unit test error Reviewed By: dmudiger Differential Revision: D15075415 fbshipit-source-id: 0f97be17cf8f1dacbafa34cd637fb8bc1c5e5387	2019-12-11 11:39:52 -08:00
Brian Wignall	e7fe64f6a6	Fix typos (#30606 ) Summary: Should be non-semantic. Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos. Pull Request resolved: https://github.com/pytorch/pytorch/pull/30606 Differential Revision: D18763028 Pulled By: mrshenli fbshipit-source-id: 896515a2156d062653408852e6c04b429fc5955c	2019-12-02 20:17:42 -08:00
Chuan Jiang	6c9b188262	Support in-place update in IndexHashOp (#30275 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30275 `IndexHash` did not support in-place update. Reviewed By: kennyhorror Differential Revision: D18612231 fbshipit-source-id: adeccdf1ceb6107454555ff9cdf66fd5e5773f2a	2019-11-22 14:49:28 -08:00
Huan Gui	be757957ba	Support softmax with D == 0 (#29167 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29167 As titled. This fix is crucial as multi_channel splitting would create history that has no items (i.e., D == 0), which leads to flow failure. Test Plan: Unittest flow test: before fix: f148783160 after fix: f149082299 buck test mode/dev-nosan caffe2/caffe2/python/operator_test:softmax_ops_test Reviewed By: xianjiec Differential Revision: D18296081 fbshipit-source-id: e0bb2dc2c4e5b465e213f31e5c5ced3a7e1fd574	2019-11-11 00:46:10 -08:00
Mike Ruberry	991c2ac383	Disables flaky test_rand_quantization (#29463 ) Summary: See https://github.com/pytorch/pytorch/issues/28550. Pull Request resolved: https://github.com/pytorch/pytorch/pull/29463 Differential Revision: D18405669 Pulled By: mruberry fbshipit-source-id: 2984c3896a9260a06fbf052afb06e0cb8d28b53d	2019-11-08 13:51:22 -08:00
Mike Ruberry	2f2a0d1607	Disables test_atomic_ops and testInputOrder (#29145 ) Summary: These tests have been flaky for some time, see: - https://github.com/pytorch/pytorch/issues/28179 - https://github.com/pytorch/pytorch/issues/9064 This PR disables them. The actual tests were added/updated 2+ years ago. It's unclear who, if anyone, would own them now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/29145 Differential Revision: D18327937 Pulled By: mruberry fbshipit-source-id: d02731d662aff3545b581272e5ae8db4e3097d87	2019-11-05 16:53:53 -08:00
Huan Gui	8a2dcff189	Add cuda version for operators BatchSparseToDense and BatchDenseToSparse (#29166 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29166 As titled Test Plan: unittest buck test mode/dev-nosan caffe2/caffe2/python/operator_test:batch_sparse_to_dense_op_test Reviewed By: xianjiec Differential Revision: D18197966 fbshipit-source-id: 7486300c509dd552ddb7484c2d83099f62878278	2019-11-05 13:06:23 -08:00
Xinyi Zhang	5821b9bf0f	Remove error logging of high empty range ratio Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28854 Reviewed By: xianjiec Differential Revision: D18206695 fbshipit-source-id: 4ce471f0236b2ceaf54ba1b1ce96e193feca720b	2019-10-30 12:55:25 -07:00
Huayu Li	793e2914e4	Support full id interactions (#28769 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28769 Support full id interaction. Test Plan: * unit-tests * buck test caffe2/caffe2/python/operator_test:pack_ops_test -- * buck test caffe2/caffe2/fb/dper/layer_models/tests:sparse_nn_attention_test -- test_sparse_nn_full_id * canary * apply SUM + full id with max_length as 20 on SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID: f147253340 (v1: f146340704) # of embeddings for this feature is 20: {F219139816} The corresponding ops: two lookups, which is as expected. ``` op { input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_0/Repeat_0/sparse_lookup/w" input: "feature_preproc/output_features:SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM:values" input: "feature_preproc/output_features:SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM:lengths" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_0/Repeat_0/sparse_lookup/output" name: "" type: "SparseLengthsSum" } op { input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/sparse_lookup/w" input: "feature_preproc/output_features:SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM:values" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/sparse_lookup/output" name: "" type: "Gather" } op { input: "feature_preproc/output_features:SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM:lengths" input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/sparse_lookup/output" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/PackSegments/embedding_packed" name: "" type: "PackSegments" arg { name: "max_length" i: 20 } arg { name: "pad_minf" i: 0 } } op { input: 
"nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/PackSegments/embedding_packed" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/Reshape/reshaped_record" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/Reshape/old_shape" name: "" type: "Reshape" arg { name: "shape" ints: -1 ints: 1280 } } op { input: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/Reshape/reshaped_record" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_0" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_1" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_2" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_3" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_4" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_5" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_6" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_7" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_8" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_9" output: 
"nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_10" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_11" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_12" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_13" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_14" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_15" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_16" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_17" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_18" output: "nested/dot/SPARSE_AD_MEDIA_XRAY_V11_TOPIC_ID_AUTO_FIRST_X_AUTO_UNIGRAM/Pool_Option_1/Repeat_0/full_id/split/output_19" name: "" type: "Split" arg { name: "axis" i: 1 } } ``` Reviewed By: chonglinsun Differential Revision: D18083520 fbshipit-source-id: f592fb7734dd4e3e712ba42dc0afcd0b32a4afa0	2019-10-29 14:56:18 -07:00
Xinyi Zhang	f5ea2ca34a	Reduce logging frequency for empty range tolerance Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28704 Reviewed By: xianjiec Differential Revision: D18138828 fbshipit-source-id: 4f3c376502cb6e30b931217702c4ca537c9eb644	2019-10-28 09:52:17 -07:00
Xinyi Zhang	2f16284231	change empty range tolerance logging Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28489 Differential Revision: D18067322 fbshipit-source-id: 2096d1cce820f4ebe28db0045a2ddacc022e07da	2019-10-23 09:39:39 -07:00
Xinyi Zhang	06bb74ce96	Tolerate small amount of embedding corruptions Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28371 Reviewed By: xianjiec Differential Revision: D18031155 fbshipit-source-id: a51d2a62a919f032dc04372b30cf9071aa2dd629	2019-10-21 16:23:25 -07:00
Jiang Wu	29f56eb920	Revert D17937850: Tolerate small amount of embedding corruptions Test Plan: revert-hammer Differential Revision: D17937850 Original commit changeset: e9c633768d98 fbshipit-source-id: 5c2c837c7867504392b19965d91a60cadd3b8101	2019-10-19 14:17:01 -07:00
Xinyi Zhang	ca6ba06f95	Tolerate small amount of embedding corruptions Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28299 Reviewed By: Wakeupbuddy Differential Revision: D17937850 fbshipit-source-id: e9c633768d9819fd734ddd59017c33688ebbdcca	2019-10-18 14:59:06 -07:00
Simran Suresh Motwani	d63d7ab997	Expose PiecewiseLinearTransform to PyTorch Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26903 Test Plan: Unit Test Reviewed By: bddppq Differential Revision: D17585637 fbshipit-source-id: fe669aaf3301d7efb5c28ec0097945d55a71773d	2019-09-27 12:49:04 -07:00
Jongsoo Park	8fb756d3b2	batch size 0 support in ChannelShuffle DNNLOWP op (#26858 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26858 Handle batch size = 0 in ChannelShuffle operator Test Plan: CI Reviewed By: jianyuh Differential Revision: D17591041 fbshipit-source-id: 63373aa752406c1f38401c3e93d8e1954ce7281e	2019-09-26 00:40:07 -07:00
Huan Gui	a8386d2a7d	fix composite learning rate (#26227 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26227 In the previous implementation of composite lr, the lr_scale for each sub policy was overwritten by the last lr_scale. Due to another bug in the unittest (where policy_lr_scale was the same for all sub policies), this bug was not detected by the unittest. Fix: add an additional field in CompositeLearningRateItem so that we store lr_scale values for all sub policies. With the unittest fixed, the error in the previous implementation: https://fburl.com/testinfra/ikdbnmey With the fix, https://fburl.com/testinfra/m694ehl1 Test Plan: unittest buck test caffe2/caffe2/python/operator_test:learning_rate_op_test -- test_composite_learning_rate_op Reviewed By: chocjy, alex1o1o7cloud Differential Revision: D17380363 fbshipit-source-id: 161e9cb71bb2ea7f0734a3361e270616057a08e4	2019-09-18 17:34:17 -07:00
Qi Zhou	076eaf4ccf	Exposing Fused8BitRowwiseQuantizedToFloat in PyTorch (#26080 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26080 Will be used in c2 ctr_mbl_feed model to PyTorch conversion Test Plan: Unit test Reviewed By: yinghai Differential Revision: D17337604 fbshipit-source-id: a90d9f5dc38301608d1562c6f2418e7f4616e753	2019-09-12 12:36:33 -07:00
Frank Jiang	3be1745b3c	Make SparseNormalize backwards compatible (#25660 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25660 As title Test Plan: buck test caffe2/caffe2/python/operator_test:sparse_normalize_test https://our.intern.facebook.com/intern/testinfra/testrun/5910974517813190 Reviewed By: boryiingsu Differential Revision: D17187839 fbshipit-source-id: 1e5a6eaac0e825db4ae969540a1f689444070579	2019-09-05 15:14:21 -07:00
Jongsoo Park	8199bb3dd3	add options to flush cache in SLS benchmarks (#25530 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25530 Add an option to flush cache for more consistent benchmarking. Test Plan: buck run mode/opt caffe2/caffe2/fb/python/benchmarks:sparse_lengths_sum_4bit_benchmark -- --flush-cache buck run mode/opt caffe2/caffe2/python/operator_test:sparse_lengths_sum_benchmark -- --flush-cache Reviewed By: hyuen Differential Revision: D17148087 fbshipit-source-id: 7eb782986676620254c1619a9a48c656cb1a6856	2019-09-03 05:09:03 -07:00
Jongsoo Park	f1059d4e6a	format sparse_lengths_sum_benchmark (#25529 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25529 To prepare D17148087 Test Plan: Just formatting Reviewed By: hyuen Differential Revision: D17148085 fbshipit-source-id: faff90ee7dfec543d47037d20ce00f251144bc06	2019-09-03 05:08:59 -07:00
Yanghan Wang	e34ef04301	register HeatmapMaxKeypoint with C10 (#25191 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/25191 registering as C10. Test Plan: buck test mode/dev-nosan caffe2/caffe2/python/operator_test:heatmap_max_keypoint_op_test Reviewed By: newstzpz Differential Revision: D17056321 fbshipit-source-id: 989b72d7e3c9f23684b10d5fc9b98177ad4ee47b	2019-08-27 20:13:57 -07:00
Frank Jiang	d7c6debc14	Remove gradient value as input from SparseNormalize op (#24357 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24357 SparseNormalize does not need to know the gradient value to the lookup table, only the indices of the embeddings that need to be updated. By removing this input, we allow SparseNormalize to be used alongside SparseAdagradFusion Differential Revision: D16809919 fbshipit-source-id: cc19692ba4dea8854663ae1ed8cf9365e90c99bc	2019-08-19 14:47:09 -07:00
Yanghan Wang	3b22bbeb5b	enable "keeps" from BoxWithNMSLimit and caffe2_fastrcnn_outputs_inference Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24451 Reviewed By: newstzpz Differential Revision: D16850259 fbshipit-source-id: 22f69d71a558d63c32a27d271a7557fc35a55176	2019-08-19 10:54:22 -07:00
Yanghan Wang	ad64789a1e	add aligned option to RoIAlign Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23706 Reviewed By: ppwwyyxx Differential Revision: D16615823 fbshipit-source-id: fd9152af8bc979cb04044413e66af349b032a99d	2019-08-07 21:22:33 -07:00
Shali Jiang	15d3f0242b	support Gather different indices for different examples in one batch (#23813 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23813 Pull Request resolved: https://github.com/pytorch/pytorch/pull/23285 for example: Inputs: data: [[[2 4 2 0], [0 1 2 0], [1 1 0 0]], [[3 4 1 3], [0 3 2 2], [4 1 0 4]]] idx: [[0 2], [0 1]] outputs: [[[2 4 2 0], [1 1 0 0]], [[3 4 1 3], [0 3 2 2]]] data and idx must have the same outer dimension call Gather or BatchGather with argument match_outer=True Reviewed By: huayuli00 Differential Revision: D16652485 fbshipit-source-id: 9e144e97a8d6fceaf3b5714df1534338068f4a10	2019-08-07 21:14:30 -07:00
Michael Suo	a3c165f9d2	Revert D16452539: support Gather different indices for different examples in one batch Differential Revision: D16452539 Original commit changeset: 7229489f4a9c fbshipit-source-id: 010c177e551cb81521d2af84ce951bf964cdab44	2019-08-05 10:22:01 -07:00
Shali Jiang	f87a4cc23f	support Gather different indices for different examples in one batch (#23285 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23285 for example: Inputs: data: [[[2 4 2 0], [0 1 2 0], [1 1 0 0]], [[3 4 1 3], [0 3 2 2], [4 1 0 4]]] idx: [[0 2], [0 1]] outputs: [[[2 4 2 0], [1 1 0 0]], [[3 4 1 3], [0 3 2 2]]] data and idx must have the same outer dimension call Gather or BatchGather with argument match_outer=True Reviewed By: huayuli00 Differential Revision: D16452539 fbshipit-source-id: 7229489f4a9c02ee9f3c6a8a24bcd02925d96e07	2019-08-04 21:17:49 -07:00
Jiexian Li	302adf1d20	add LambdaRank DCG Loss Option (#23679 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23679 Full Canary: https://fburl.com/fblearner/sa1pkpya Add LambdaRank DCG Loss Option * when use_idcg_normalization == true, regular LambdaRank with NDCG loss * when use_idcg_normalization == false, gradient and loss functions are not normalized by idcg. Differential Revision: D16605459 fbshipit-source-id: a16f071e69516974e48d27bef4ca179019ca4ae7	2019-08-02 11:47:46 -07:00
Jiexian Li	fc6aec9491	format only change (#23685 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23685 format only changes. Differential Revision: D16607482 fbshipit-source-id: 572afb59c6ff9f8a8842ba044fed6c87f8506843	2019-08-02 11:47:42 -07:00
Levent Ertoz	6f01d13728	Implement dropout with replacement for id list features. (#22880 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22880 Implement sparse dropout with replacement value. Reviewed By: xianjiec Differential Revision: D16267012 fbshipit-source-id: 8c4878230f61bb3ac333291e2c6aaf2fbdc5f9ce	2019-07-23 14:34:21 -07:00
Du Tran	d2ceab2766	update video input (#22471 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22471 update C2 video input with latest augmentation Reviewed By: HengCV Differential Revision: D16096127 fbshipit-source-id: bb07394e211cd52b50005d801b6d03250248ea9e	2019-07-05 00:56:33 -07:00
Alyssa Wang	d9e15bccb0	Perform weight re-init for embedding table in sparse_lookup.py (#22348 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22348 This is the last step of LRU hash eviction weight re-init. This diff checks whether there are evicted values in sparse_lookup and, if so, calls the op created in D15709866 to re-init the values for the indices in evicted_values. Also created a gradient op for the operator. The gradient op just passes the output gradient as the input gradient. Reviewed By: itomatik Differential Revision: D16044736 fbshipit-source-id: 9afb85209b0de1038c5153bcb7dfc5f52e0b2abb	2019-07-03 10:33:40 -07:00
Duke Vijitbenjaronk	d684112ec9	Output sequence probability with CTC beam search, optional multiple output sequences (#21927 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21927 Add `OUTPUT_PROB` output to CTCBeamSearchDecoderOp to return a probability for each sequence. Add argument to output top-k instead of top-1 decoded sequences. Reviewed By: SuperIRabbit Differential Revision: D15797371 fbshipit-source-id: 737ca5cc4f90a0bcc3660ac9f58519a175977b69	2019-07-02 17:29:13 -07:00
Alyssa Wang	34f950c800	Create C2 operator to replace values in embedding table (#22279 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22279 This new operator is used for embedding table weight re-init. Once we get the evicted indices, they are the rows that need resetting in the embedding table. Then we can create a 1d tensor with default values and apply this operator to copy the tensor to all evicted rows in the embedding table. Will add gradient op in next diff Reviewed By: itomatik Differential Revision: D15709866 fbshipit-source-id: 2297b70a7326591524d0be09c73a588da245cc08	2019-07-02 15:26:22 -07:00
Xiaomeng Yang	10e4137396	Optimize InstanceNormGradientOp (#22288 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22288 Optimize InstanceNormGradientOp Benchmarks: CPU with [N, C, H, W] = [128, 256, 56, 56], NCHW order: 616ms -> 128ms NHWC order: 1612ms -> 174ms GPU with [N, C, H, W] = [128, 256, 112, 112], NCHW order: 6450ms -> 37ms NHWC order: 1419ms -> 82ms Reviewed By: houseroad Differential Revision: D16023630 fbshipit-source-id: 5af9bf1103cde2fc2bcb5cd5a057d039732f052e	2019-07-01 15:10:17 -07:00
Xiaomeng Yang	29b53b0259	Fix bug in caffe2 transpose on GPU (#22233 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/22233 Fix bug in caffe2 transpose on GPU Reviewed By: hl475 Differential Revision: D15994973 fbshipit-source-id: 542dc8757b51a6322fffa55826c1d4e32927398d	2019-06-26 11:33:25 -07:00
Sungmann Cho	f59581218f	Fix spelling errors (#21665 ) Summary: alloctor -> allocator excutable -> executable excution -> execution foward -> forward initiaize -> initialize paralell -> parallel preprocesor -> preprocessor tranpose -> transpose Pull Request resolved: https://github.com/pytorch/pytorch/pull/21665 Differential Revision: D15806155 Pulled By: soumith fbshipit-source-id: d92b21ec8650a2b32f05faf9af0b7d2b073e992c	2019-06-13 15:21:55 -07:00
David Zhang	696b2c89b4	Adding gradient to Boolean Mask operator (#21423 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21423 - add gradient for boolean mask - add test for gradient checking Reviewed By: BIT-silence Differential Revision: D15640036 fbshipit-source-id: 79f40c6901e805bf1b8e9b01b57903e30b00f654	2019-06-06 20:48:47 -07:00
David Zhang	cb2ec07fa2	ReshapeOp supports empty tensor (#21230 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21230 tsia; we support empty tensor with this diff for reshape operator Reviewed By: jerryzh168 Differential Revision: D15583356 fbshipit-source-id: 6d44c04e95ca3546509bfb12102e29c878f9a7c7	2019-06-06 15:02:11 -07:00
Hong Xu	da4f3629c5	Add missing shebangs to Python files with executable permissions. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21305 Differential Revision: D15613078 Pulled By: ezyang fbshipit-source-id: 1fedf4368d65db406b617a51402ee8a20968aff7	2019-06-06 10:53:40 -07:00
Yanghan Wang	81e70ffa19	fix bug of not using get_score_cls_index in BoxWithNMSLimitOp (#20868 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20868 When `input_boxes_include_bg_cls` is false (which means `input_scores_fg_cls_starting_id` is 0), it doesn't map the class index of the score correctly when sorting and limiting the detections over all classes after nms. Reviewed By: newstzpz Differential Revision: D15472706 fbshipit-source-id: dc1e808b63ad09fb4bd95acf866771bb3fa92d69	2019-05-24 22:31:01 -07:00
Yanghan Wang	371bd043d6	register ResizeNearestOp to C10 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20928 Reviewed By: smessmer Differential Revision: D15499661 fbshipit-source-id: 5af24d5c9d7ff739b8355e19dfe66b496bc026a5	2019-05-24 14:39:11 -07:00
Kittipat Virochsiri	fd2aa93b37	Exposing LengthsSum/Mean/Max in pytorch (#20802 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20802 Need this for sequence model Reviewed By: dzhulgakov Differential Revision: D15448529 fbshipit-source-id: cd5abe3b689fc0e02feff10faf8cd61c99369f4f	2019-05-22 13:55:19 -07:00
Huan Gui	fbdafdffa1	Move bucketize_op to open source Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19952 Reviewed By: houseroad Differential Revision: D15145552 fbshipit-source-id: e0074c878a5c164324a9cc477783285dedffd188	2019-05-20 18:03:27 -07:00
Jongsoo Park	ea9c6e7581	eliminate FE_INVALID in unit test (#20502 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20502 Following D15307410 removing more floating point exceptions in unit tests Reviewed By: hx89 Differential Revision: D15340930 fbshipit-source-id: 269fc75e0800bc9d39126767a0f3ca15cd8b0cad	2019-05-16 21:55:28 -07:00
Yanghan Wang	373e6a78bf	make box plus one a legacy argument in detection ops Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20550 Reviewed By: newstzpz Differential Revision: D15348610 fbshipit-source-id: 12b1e119e9bc9191ba9f2aa6d695ef215780c349	2019-05-16 18:17:12 -07:00
Yanghan Wang	61012080c8	split and register CollectAndDistributeFpnRpnProposals with C10 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20509 Reviewed By: newstzpz Differential Revision: D15302181 fbshipit-source-id: 7d3b29b667cd900f2976101f35200e1ee20b0f64	2019-05-16 13:40:46 -07:00
Jongsoo Park	5f8e849d84	eliminate FE_INVALID in optimizer related operators and tests (#20501 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20501 Fixing unit tests related to optimizer related operators and tests Reviewed By: hx89 Differential Revision: D15307410 fbshipit-source-id: e5400c26e08f26191ee542fe6b02e0a69bc4e1ae	2019-05-16 08:23:46 -07:00
David Reiss	1891614aa5	Add GivenTensorInt16Fill (#20515 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20515 Needed by the upcoming quantized version of GenerateProposals Reviewed By: dzhulgakov Differential Revision: D14430952 fbshipit-source-id: ea852f04cc4b070f8fbe7a1e6535bba4d5b230fd	2019-05-15 19:45:15 -07:00
Cheng Cheng	fd18b89c98	shape inference for learning rate op (#20020 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20020 Add shape inference for LearningRate op. The output (lr) should have similar shape with input (iteration), but not the same type (float vs int). Reviewed By: un-disclosed Differential Revision: D15112300 fbshipit-source-id: 09969aefa15172a6f3c70cd9b2548e3020da5d7a	2019-05-14 23:34:32 -07:00
Bilge Acun	3ee97183b0	ScaleBlobs Operator (#19660 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19660 Implementation of aggregated Scale operator. The operator takes a list of tensors as input and scales all of them by the float argument value. The tensor sizes can be different, so bookkeeping of the sizes and pointers to the tensors is necessary for the GPU version of the kernel. Reviewed By: BIT-silence Differential Revision: D14984233 fbshipit-source-id: 37cc97159a4f2c38cd6fff4f5710ab7d3a773611	2019-05-08 17:57:33 -07:00
Jongsoo Park	42e9a619b3	add decay parameter in ref_adagrad (#15329 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15329 Add decay parameter to match with C++ Adagrad implementation. Reviewed By: chocjy Differential Revision: D13300991 fbshipit-source-id: db734df0202d8f5fd156f2742207d0b5a3aa7348	2019-05-07 18:58:58 -07:00
Xue Feng	23ba0561c3	Add Gate Policy GateLearningRateOp (#20044 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20044 We do not have a gating functor. This diff adds it. I'm leveraging the existing learning rate op because there are other policies I'll need to use together as a union. * Since there are other policies in LearningRateOp which will be used as a union, I chose to add it as a LearningRateOp. * constantwarmup cannot express a step function that is nonzero first and zero later * There are multiple uses for it, * e.g. as a gating blob generator that is useful for turning off. * e.g. as a learning rate switcher at a certain iteration. * For generalizability, no regulation or constraint is applied on the range of the values * see figure below for illustration {F157366621} Reviewed By: ccheng16 Differential Revision: D15178229 fbshipit-source-id: 1e66e9a4bc1bfb946a57f8aefc97d8170f6be731	2019-05-05 20:11:04 -07:00
Xiaomeng Yang	271f005eeb	Add elementwise_affine for LayerNormGradientOp (#19982 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19982 Add elementwise_affine for LayerNormGradientOp Reviewed By: houseroad Differential Revision: D15157493 fbshipit-source-id: 7465f2c1d4df4649b4903b93483c4861e9c7afa9	2019-05-03 15:33:46 -07:00
Yanghan Wang	a285cbcccf	support different class modes for bbox in box_with_nms_limit_op Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19820 Reviewed By: newstzpz Differential Revision: D15112955 fbshipit-source-id: a757622a32cff7159c39735607103138dbbafc24	2019-04-30 16:02:44 -07:00
Chandler Zuo	472be69a73	Avoid Output Uninitialized Blobs in Load with load_all=1 (#19133 ) Summary: When output blob names are specified while load_all=1, output blob names are ignored. However, this behavior is not documented. In this diff, we just disallow users from providing blob names when load_all=1. See discussion at https://fb.workplace.com/groups/1405155842844877/permalink/2714909788536136/ Pull Request resolved: https://github.com/pytorch/pytorch/pull/19133 Reviewed By: dzhulgakov Differential Revision: D14883698 Pulled By: chandlerzuo fbshipit-source-id: 6e4171e36c4ccc4f857e79da98b858a06b7d8ad6	2019-04-27 10:45:44 -07:00
Xiaomeng Yang	2ce39de3fc	Add elementwise_affine for layer_norm_op (#19713 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19713 Add elementwise_affine for layer_norm_op Reviewed By: houseroad Differential Revision: D15075454 fbshipit-source-id: e8a7d3da1c81e49fa55323f5e74a68bc4ef8d83f	2019-04-26 17:20:01 -07:00
Xiaomeng Yang	f5fe7aa0b2	Fix relu bug for empty tensor (#19451 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19451 Fix relu bug for empty tensor Reviewed By: xianjiec Differential Revision: D15009811 fbshipit-source-id: b75e567c3bec08d7d12b950d8f1380c50c138704	2019-04-19 15:21:07 -07:00
Yinghai Lu	f1f31b634d	Eliminate AdjustBatch ops (#19083 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19083 As we have discussed, there are too many AdjustBatch ops; they incur reallocation overhead and hurt performance. We will eliminate these ops by - inlining the input adjust batch op into Glow - inlining the output adjust batch op into OnnxifiOp and doing that only conditionally. This is the C2 part of the change and requires a change on the Glow side to work e2e. Reviewed By: rdzhabarov Differential Revision: D14860582 fbshipit-source-id: ac2588b894bac25735babb62b1924acc559face6	2019-04-17 10:00:25 -07:00
Huamin Li	c480798a1c	use C10_REGISTER for GELU op Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19090 Reviewed By: BIT-silence Differential Revision: D14864737 fbshipit-source-id: 8debd53171f7068726f0ab777a13ca46becbfbdf	2019-04-12 11:41:04 -07:00
Xiaomeng Yang	fd40c0eba0	Add gelu op (#18992 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18992 Add gelu op Reviewed By: houseroad Differential Revision: D14814811 fbshipit-source-id: 00f126b8b83763c57ebbf28fbd2de5a8fab6d491	2019-04-08 21:58:29 -07:00
Lu Fang	443a58e03d	Export C10 operator in PyTorch Model (#18210 ) Summary: Almost there, feel free to review. these c10 operators are exported to _caffe2 domain. TODO: - [x] let the onnx checker pass - [x] test tensor list as argument - [x] test caffe2 backend and converter - [x] check the c10 schema can be exported to onnx - [x] refactor the test case to share some code - [x] fix the problem in ONNX_ATEN_FALLBACK Pull Request resolved: https://github.com/pytorch/pytorch/pull/18210 Reviewed By: zrphercule Differential Revision: D14600916 Pulled By: houseroad fbshipit-source-id: 2592a75f21098fb6ceb38c5d00ee40e9e01cd144	2019-04-08 16:06:00 -07:00
Xiaomeng Yang	b145dcca04	Add support for group ConvTranspose (#18794 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18794 Add support for group ConvTranspose Reviewed By: houseroad Differential Revision: D14741327 fbshipit-source-id: 5d947ca044bf8495dd7f8f56122441ebbcc6c7e4	2019-04-04 11:52:06 -07:00
Duc Ngo	16f07d7dac	caffe2 - set up correct inheritance structure for remaining operator test classes (#18622 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18622 Set up correct inheritance structure for remaining operator test classes Reviewed By: ezyang Differential Revision: D14685941 fbshipit-source-id: a6b1b3be325935b7fec7515be13a4994b3016bf0	2019-04-01 15:53:22 -07:00
Yanghan Wang	f4e35d30ed	register BoxWithNMSLimit with C10 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17956 Reviewed By: houseroad Differential Revision: D14417300 fbshipit-source-id: eb5e2ba84513b3b7bfa509dc442424b13fe9148f	2019-03-29 13:41:40 -07:00
Ahmed Aly	9eb0f435d9	Inference LSTM integration test (#18559 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18559 Adding integration test for inference LSTM Reviewed By: houseroad Differential Revision: D14656698 fbshipit-source-id: 80fb2a72be30fcb695f4471b72bf9d6e3965bf81	2019-03-28 11:31:06 -07:00
Duc Ngo	6a1a019c0a	caffe2 - support flaky operator tests for caffe2 build (#18155 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18155 - Make a python decorator caffe2_flaky for caffe2 operator unit tests. - The environment variable CAFFE2_RUN_FLAKY_TESTS is now used to mark flaky test mode. During a test run, - If flaky test mode is on, only flaky tests are run - If flaky test mode is off, only non-flaky tests are run. Mark ctc_beam_search_decoder_op_test as flaky Reviewed By: ezyang, salexspb Differential Revision: D14468816 fbshipit-source-id: dceb4a48daeb5437ad9cc714bef3343e9761f3a4	2019-03-25 16:58:34 -07:00
Gerard Goossen	46990c20fa	Verify def before infer fensor (#18129 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18129 A lot of tensor inference functions assume the operator passes the schema, so call Verify to make sure this is actually the case. I created a diff before to add checking in Concat (https://github.com/pytorch/pytorch/pull/17110), but I encountered a lot more places where this is assumed (for example ElementwiseOpShapeInference) Reviewed By: mdschatz Differential Revision: D14503933 fbshipit-source-id: cf0097b8c3e4beb1cded6b61e092a6adee4b8fcb	2019-03-22 06:36:25 -07:00
Jongsoo Park	c7448aa13c	remove unused parameters in optimizer tests (#18084 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18084 data_strategy parameter was not used in some of unit tests for optimizers Reviewed By: hyuen Differential Revision: D14487830 fbshipit-source-id: d757cd06aa2965f4c0570a4a18ba090b98820ef4	2019-03-15 18:06:15 -07:00
Sebastian Messmer	7a3488e0fc	Expose c10 cuda ops to caffe2 (#18036 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18036 - Add macros to export c10 cuda operators to caffe2 frontend - Instead of having a separate caffe2 registry for the c10 operator wrappers, use the existing caffe2 registries Reviewed By: ezyang Differential Revision: D14467495 fbshipit-source-id: 7715ed2e38d2bbe16f1446ae82c17193a3fabcb9	2019-03-15 16:58:12 -07:00
Yanghan Wang	53fb9a462a	register RoIAlign with C10 Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17889 Reviewed By: smessmer Differential Revision: D14411630 fbshipit-source-id: c3b7941d725ae2c78e8d79f52a7983db92b75807	2019-03-14 11:55:29 -07:00
Jongsoo Park	8bd9465b79	make momentum non negative in adagrad test (#18009 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18009 momentum should be initialized with non-negative values Reviewed By: hyuen Differential Revision: D14450841 fbshipit-source-id: 5bbbd11645db9e6f2dc42b26a00ff3caf378c59f	2019-03-14 03:15:07 -07:00
Xiaomeng Yang	54b33503ec	Optimize channel_stats_op (#16243 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16243 Optimize channel_stats_op and add NHWC impl Reviewed By: takatosp1 Differential Revision: D13775515 fbshipit-source-id: decb889e646f5316d4afefdf9f9b6bc6343613cd	2019-03-12 12:08:00 -07:00
youkaichao	b87abdfc12	typo fix Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17653 Differential Revision: D14302003 Pulled By: ezyang fbshipit-source-id: 8ad90985a392b07127c7e315d4e74ce77962b573	2019-03-06 11:36:44 -08:00
Deepali Chourasia	e3516d0a95	omit group conv NHWC test for GPU (#17715 ) Summary: Observed the test `TestGroupConvolution.test_group_convolution` to fail with the following error: ``` Falsifying example: test_group_convolution(self=<caffe2.python.operator_test.group_conv_test.TestGroupConvolution testMethod=test_group_convolution>, stride=3, pad=0, kernel=5, size=8, group=4, input_channels_per_group=7, output_channels_per_group=8, batch_size=2, order='NHWC', engine='', use_bias=False, gc=, dc=[, device_type: 1]) You can reproduce this example by temporarily adding reproduce_failure('3.59.1', b'AAAA') as a decorator on your test case ``` This example generated by hypothesis has `group=2, order='NHWC' and dc=[, device_type: 1])`. I think this example should be skipped. I have mimicked the change corresponding to [PR#13554](https://github.com/pytorch/pytorch/pull/13554) to skip this example. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17715 Differential Revision: D14346642 Pulled By: ezyang fbshipit-source-id: b1f1fef09f625fdb43d31c7213854e61a96381ba	2019-03-06 11:32:35 -08:00
Sebastian Messmer	910519e45b	Expose cuda kernel for caffe2::GenerateProposals Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17066 Reviewed By: ezyang, wat3rBro Differential Revision: D14071130 fbshipit-source-id: 6fe26503f6069c36ec31d6c09b549b932d5db242	2019-03-04 14:59:08 -08:00
rohithkrn	8c72217817	Enable boolean_mask, adadelta, adagrad fp16 on ROCm (#17235 ) Summary: - Fix bugs, indentation for adadelta and adagrad tests to enable fp16 - Enable boolean_mask fp16 on ROCm Pull Request resolved: https://github.com/pytorch/pytorch/pull/17235 Differential Revision: D14240828 Pulled By: bddppq fbshipit-source-id: ab6e8f38aa7afb83b4b879f2f4cf2277c643198f	2019-02-27 10:07:36 -08:00
Peizhao Zhang	54e4c4d7de	Removed obsolete argument correct_transform_coords in bbox_transform op. (#16723 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16723 Removed obsolete argument correct_transform_coords in bbox_transform op. * It was only for backward compatibility. We should not have models using it now. Differential Revision: D13937430 fbshipit-source-id: 504bb066137ce408c12dc9dcc2e0a513bad9b7ee	2019-02-20 13:22:33 -08:00
Sebastian Messmer	9696fee635	Register CUDA kernels for caffe2 operators (#16691 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16691 Previous diffs already introduced a macro that registers caffe2 CPU kernels with c10. This now also registers the CUDA kernels with it. Reviewed By: bwasti Differential Revision: D13901619 fbshipit-source-id: c15e5b7081ff10e5219af460779b88d6e091a6a6	2019-02-12 17:24:01 -08:00
Sebastian Messmer	920c684367	Expose GenerateProposals to PyTorch Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16880 Reviewed By: bwasti Differential Revision: D13998092 fbshipit-source-id: 23ab886ba137377312557fa718f262f4c8149cc7	2019-02-11 14:15:47 -08:00
Sebastian Messmer	0c02d317ea	Expose BBoxTransform to pytorch Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16879 Reviewed By: bwasti Differential Revision: D13998093 fbshipit-source-id: ddfe4bff83e9a1a4cedf1e520e6d2977b21cb3af	2019-02-11 14:15:45 -08:00
peter.yeh@amd.com	c65b03b9f8	Enable arg_ops_test/unique_ops_test on AMD/rocm (#16853 ) Summary: Verified both tests are passing on rocm 2.1 env. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16853 Differential Revision: D13996279 Pulled By: bddppq fbshipit-source-id: c0df610d7d9ca8d80ed2d1339cdadef59105a71c	2019-02-07 16:51:15 -08:00
Sebastian Messmer	64339dbd51	Fix and re-enable test case (#16643 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16643 The test was disabled in D13908117 because it conflicted with another diff that was about to land. Now fixed the merge conflict and re-landing it. Reviewed By: ezyang Differential Revision: D13911775 fbshipit-source-id: b790f1c3a3f207916eea41ac93bc104d011f629b	2019-02-07 13:58:16 -08:00
Sebastian Messmer	6750e1e3e9	C10_REGISTER_CAFFE2_OPERATOR: Macro for registering c2 kernels (#16548 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16548 With this macro, a caffe2 operator can now directly be registered with c10. No need to write custom wrapper kernels anymore. Differential Revision: D13877076 fbshipit-source-id: e56846238c5bb4b1989b79855fd44d5ecf089c9c	2019-02-07 13:58:14 -08:00
rohithkrn	aa88c2c0b6	Unify gpu_support variable in python tests (#16748 ) Summary: Assign `has_gpu_support = has_cuda_support or has_hip_support` and make according changes in python tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/16748 Differential Revision: D13983132 Pulled By: bddppq fbshipit-source-id: ca496fd8c6ae3549b736bebd3ace7fa20a6dad7f	2019-02-07 00:29:51 -08:00
Yinghai Lu	e5e0bf4152	Add AdjustBatch Op (#16676 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16676 This op is used for changing batch size (first dimension) of the tensor. Reviewed By: bertmaher, ipiszy Differential Revision: D13929200 fbshipit-source-id: 4f2c3faec072d468be8301bf00c80d33adb3b5b3	2019-02-06 19:15:41 -08:00
Jongsoo Park	929cd23da1	no EIGEN engine for DeformConv (#16785 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16785 There's no EIGEN engine implemented for DeformConv but unit test was checking it. Reviewed By: BIT-silence Differential Revision: D13967306 fbshipit-source-id: e29c19f59f5700fc0501c59f45d60443b87ffedc	2019-02-06 11:59:31 -08:00
Jongsoo Park	8d4b2db529	format deform_conv_test.py (#16786 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16786 Format to prepare D13967306 Reviewed By: BIT-silence Differential Revision: D13967317 fbshipit-source-id: 2de895f8474b04c55ba067fbf788c553dc010c60	2019-02-06 11:59:29 -08:00
Edward Yang	a3f600e394	Revert D13854304: [redo][c10] LayerNorm Registration Example Differential Revision: D13854304 Original commit changeset: ec463ce22721 fbshipit-source-id: 4262b9a2ef486e1c7c0283ea021331ac97cc5f56	2019-02-06 08:26:23 -08:00
Edward Yang	fc0e88dd77	Revert D13855525: [c10] Expose RoIAlign to torch Differential Revision: D13855525 Original commit changeset: cfee7bb1544d fbshipit-source-id: 0b4124b78c4082b52e592a1275069c879a9aed39	2019-02-06 08:26:22 -08:00
Edward Yang	33a6a7a3ea	Revert D13856086: [c10] Expose GenerateProposals to torch Differential Revision: D13856086 Original commit changeset: a4873646a71a fbshipit-source-id: 79b634426404236ddbc407d3796a350ad3dae5ca	2019-02-06 08:26:20 -08:00
Edward Yang	018485130f	Revert D13864292: [c10] Expose BBoxTransform to pytorch Differential Revision: D13864292 Original commit changeset: 1f57664e7834 fbshipit-source-id: 37663b7e8213185ecaa5c219076fc7de64704549	2019-02-06 08:26:18 -08:00
Edward Yang	c0a7bf94ed	Revert D13865221: [c10] Expose BoxWithNMSLimit Differential Revision: D13865221 Original commit changeset: 8a3f1d420183 fbshipit-source-id: 0057be9619b660dcad8c01bae67b54400127577e	2019-02-06 08:26:17 -08:00
Edward Yang	cda43336d4	Revert D13866214: [c10] Expose HeatmapMaxKeypoints to torch Differential Revision: D13866214 Original commit changeset: 2ca79037fc07 fbshipit-source-id: d2c653f4f32cf0ea76875888f3523c0dc7db9960	2019-02-06 08:26:16 -08:00
Bram Wasti	a9713d07b0	Expose HeatmapMaxKeypoints to torch (#16528 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16528 .. Reviewed By: smessmer Differential Revision: D13866214 fbshipit-source-id: 2ca79037fc070bade5542345af5ce09f88beda44	2019-02-05 12:56:58 -08:00
Bram Wasti	3df7b321cc	Expose BoxWithNMSLimit (#16529 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16529 .. Reviewed By: smessmer Differential Revision: D13865221 fbshipit-source-id: 8a3f1d420183ed5ae51b3c9e4eb6e033078c7ae4	2019-02-05 12:56:56 -08:00
Bram Wasti	add39b85cc	Expose BBoxTransform to pytorch (#16530 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16530 .. Reviewed By: smessmer Differential Revision: D13864292 fbshipit-source-id: 1f57664e78347e72c0087aa3d825a6a9517c1945	2019-02-05 12:56:54 -08:00
Bram Wasti	f33a2b960e	Expose GenerateProposals to torch (#16477 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16477 expose generateproposals to torch Reviewed By: smessmer Differential Revision: D13856086 fbshipit-source-id: a4873646a71a6b6c01740d21729e827f4b36588f	2019-02-05 12:56:52 -08:00
Bram Wasti	f5d4636021	Expose RoIAlign to torch (#16476 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16476 enable calling roialign (caffe2) from torch frontend Reviewed By: smessmer Differential Revision: D13855525 fbshipit-source-id: cfee7bb1544dc58df4231604ba01d61ca905ae3f	2019-02-05 12:56:50 -08:00
Bram Wasti	240240bb10	LayerNorm Registration Example (#16478 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16478 This diff includes an example registration of a caffe2 op in torch. A previous attempt ran into a static initialization order bug. Reviewed By: smessmer Differential Revision: D13854304 fbshipit-source-id: ec463ce2272126d08a5163d1599361ee5b718bbc	2019-02-05 12:56:48 -08:00
Sebastian Messmer	f36f3cce9a	Simplify layer_norm_op_test Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16570 Reviewed By: ezyang Differential Revision: D13883913 fbshipit-source-id: 7437d3cbc00c0de92bb01562c620cb658aa9f0d3	2019-02-01 21:34:18 -08:00
Xiaomeng Yang	4ae9ab24b6	Update conv_base to support empty batch (#16603 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16603 Update conv_base to support empty batch Reviewed By: houseroad Differential Revision: D13894111 fbshipit-source-id: fc4370ff16ba6046f374e77bd845d28e6af05ea3	2019-01-31 23:46:18 -08:00
Dmytro Dzhulgakov	51752e09c6	Disable layernorm_c10 test for now (#16630 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16630 Two PRs landed concurrently: enforcing tensor constraints and refactoring c10. Since it's not prod code, disable the test; I'll let Sebastian fix it properly. Reviewed By: ezyang Differential Revision: D13908117 fbshipit-source-id: 381c5626078b794afa1fc7a95cb1ea529650424c	2019-01-31 15:47:13 -08:00
Sebastian Messmer	c43917b0a3	Add a test case calling caffe2 layer_norm from caffe2 executor but through the c10 dispatcher Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16283 Reviewed By: ezyang Differential Revision: D13792591 fbshipit-source-id: 9c190649e38e8706549102b2e136ceaf508eb37f	2019-01-30 13:16:47 -08:00
Sebastian Messmer	80f4374dde	Handle stack correctly (#16246 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16246 The op schema says it returns multiple values, so let's actually return multiple values instead of one tuple. For some reason, this did work when called from python (probably some auto-unpacking), but once called from JIT, it segfaulted. This diff fixes that. Reviewed By: dzhulgakov Differential Revision: D13780147 fbshipit-source-id: fe94f82f4c53b7454f77c4484fca4ac9dc444475	2019-01-28 11:46:03 -08:00
Juan Miguel Pino	41e9b092a9	Revert D13821061: [redo][c10] layernorm example Differential Revision: D13821061 Original commit changeset: 82f0dade0145 fbshipit-source-id: e5b0b1bab0c9e731ae04add35e9a6c91656dd178	2019-01-25 22:52:04 -08:00
Bram Wasti	27a1ba3ef2	layernorm example (#16374 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16374 this fixes the original attempt in OSS (adds to CMake and python build files) Reviewed By: smessmer Differential Revision: D13821061 fbshipit-source-id: 82f0dade0145fd04bdf8e3cb3954b5790e918162	2019-01-25 16:52:33 -08:00
Bram Wasti	958f846fb3	Back out "[c10] layernorm example" Summary: Original commit changeset: 87240ca7f48d Reviewed By: bddppq Differential Revision: D13816657 fbshipit-source-id: bafcf0779d811c7e4a134cfb323a89352fa8c180	2019-01-25 10:22:30 -08:00
Bram Wasti	265ed8ff45	layernorm example (#16350 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16350 Example usage of the new caffe2 integration Reviewed By: smessmer Differential Revision: D13408546 fbshipit-source-id: 87240ca7f48d653a70241d243aa0eb25efa67611	2019-01-24 22:28:22 -08:00
Jongsoo Park	6700eff03e	disable testing group conv with EIGEN engine (#16335 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16335 group conv is not implemented with EIGEN engine so this diff disables related tests Reviewed By: jamesr66a Differential Revision: D13807204 fbshipit-source-id: 41f6de43da40882f57e64474520e185733caefb7	2019-01-24 16:39:20 -08:00
Jongsoo Park	f0dd85d141	reduce parameter space of test_1x1_conv to avoid timeout (#16223 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16223 As title says Reviewed By: jamesr66a Differential Revision: D13758202 fbshipit-source-id: 3cdffb80a5dad53b29e65e8eb0ae128edba70dbb	2019-01-24 14:17:11 -08:00
bddppq	1a09a2a27f	Export PyTorch erf to ONNX Erf and add Caffe2 Erf operator Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16106 Differential Revision: D13709490 Pulled By: bddppq fbshipit-source-id: 1b5b32261f06543371f7bd7ac9b11957a5eb4ad0	2019-01-17 09:18:08 -08:00
Xiaomeng Yang	7536887cb7	Add count_include_pad for avg_pool on CuDNN (#16100 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16100 Add count_include_pad for avg_pool on CuDNN Reviewed By: houseroad Differential Revision: D13707959 fbshipit-source-id: 261f5d116066fef75cf9a5787dfbc5d12b5b9f9b	2019-01-17 02:10:12 -08:00
Xiaomeng Yang	7a5f782c2e	Fix max_pool_grad test (#16088 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16088 Fix max_pool_grad test Reviewed By: houseroad Differential Revision: D13700917 fbshipit-source-id: f4f942ee920bcd943c38a8f8a6aafd1d13c4515f	2019-01-16 15:32:27 -08:00
Xiaomeng Yang	13f38ab79d	Add count_include_pad to average_pool_gradient_op (#15997 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15997 Add count_include_pad to average_pool_gradient_op Reviewed By: houseroad Differential Revision: D13648339 fbshipit-source-id: 205cb2acb32dc24a85256b628298b1a11f0ffa2c	2019-01-15 16:56:40 -08:00
Sebastian Messmer	57b5e7572b	Test cases for calling caffe2 LayerNorm from PyTorch and JIT Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15895 Reviewed By: dzhulgakov Differential Revision: D13615336 fbshipit-source-id: de28fef8ce025d6d37a4c80c029ec97b7195cfd9	2019-01-15 12:03:57 -08:00
Sebastian Messmer	4ed9de8680	Remove code duplication (#15880 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15880 The layer_norm reference was implemented twice. Removing one of them. Reviewed By: dzhulgakov Differential Revision: D13611232 fbshipit-source-id: cee96c78d3255c3a4e34300693bf9260cf096615	2019-01-14 17:59:37 -08:00
Jesse Hellemn	8964a2e6e6	Split Caffe2 CI into cmake-only and python builds (#15917 ) Summary: bypass-lint - Change all Caffe2 builds to use setup.py instead of cmake - Add a -cmake- Caffe2 build configuration that uses cmake and only builds cpp - Move skipIfCI logic from onnx test scripts to the rest of CI logic - Removal of old PYTHONPATH/LD_LIBRARY_PATH/etc. env management Pull Request resolved: https://github.com/pytorch/pytorch/pull/15917 Reviewed By: orionr Differential Revision: D13637583 Pulled By: pjh5 fbshipit-source-id: c5c5639db0251ba12b6e4b51b2ac3b26a8953153	2019-01-14 15:20:44 -08:00
Sergei Nikolaev	a282378baf	Caffe 2: Reshape Op upgrade (#15380 ) Summary: This is follow up on #13945 where we had to turn off some TRT tests because some ops were not ready to accept ONNX opset 9+ models. This PR fixes Reshape. Pull Request resolved: https://github.com/pytorch/pytorch/pull/15380 Differential Revision: D13649825 Pulled By: houseroad fbshipit-source-id: b72e62803de5b63cc001c3fe4b3bf64dfa996e94	2019-01-13 22:49:40 -08:00
Andre Georg Holzner	961f829067	deduplicated code in elementwise_op_broadcast_test.py (#15865 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15865 factored out code used in tests for operators Add, Mul and Sub into two new methods: a first one to generate the test vectors, a second one to run the actual tests given a caffe2 and python operator. Reviewed By: houseroad Differential Revision: D13526955 fbshipit-source-id: 8970ba5a1305ca19a54a14b51816d4a19f19d678	2019-01-09 03:07:22 -08:00
David Carrillo Cisneros	2b22612289	Add NHWC support to Resize Operator (#15553 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15553 Add unit test and implementation of NHWC layout for Resize operator. Also, add pragma parallel loop to the old NCHW layout. Reviewed By: jspark1105 Differential Revision: D13540762 fbshipit-source-id: eebf252bf0d1efdff180a171d804181045f100a5	2019-01-08 16:44:17 -08:00
Jongsoo Park	1159302ab1	bug fix in 3d group conv (#15625 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15625 3D group conv (both NCHW and NHWC layout) was not correct. Added group=2 in test_1d_convolution and test_3d_convolution in conv_test Reviewed By: protonu Differential Revision: D13562099 fbshipit-source-id: 586e8a7574a2764f2a3b559db6c2415b3ab90453	2019-01-03 09:46:49 -08:00
Jongsoo Park	bee6c6761e	format conv_test.py to prepare D13562099 (#15632 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15632 Just formatting and a few lints. Reviewed By: yinghai Differential Revision: D13562403 fbshipit-source-id: c56f8ee61f68cdaccc0828a764ff729454f68259	2019-01-02 11:34:30 -08:00
Jongsoo Park	d53012b4fe	add NCHW2NHWC and NHWC2NCHW in utils.py (#15588 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15588 Use the NHWC2NCHW and NCHW2NHWC functions, which are easier to understand than code using transpose and generalize to non-2D convolutions. Reviewed By: csummersea Differential Revision: D13557674 fbshipit-source-id: c4fdb8850503ea58f6b17b188513ae2b29691ec0	2018-12-28 17:34:50 -08:00
Jongsoo Park	4e4ef0cffb	add rowwise adagrad lp test (#15082 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15082 We didn't have a unit test for low-precision rowwise adagrad Reviewed By: chocjy Differential Revision: D13300732 fbshipit-source-id: 46e7bdfc82c5a6855eeb6f653c0a96b0b3a20546	2018-12-22 10:25:39 -08:00
Jongsoo Park	e012b183dd	handle empty inputs to SparseLengthsMean correctly (#15389 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15389 SparseLengthsMean was generating uninitialized data for empty inputs (lengths == 0). We should return zeros. The unit tests were also not covering this special case which is fixed by this diff. Reviewed By: salexspb Differential Revision: D13515970 fbshipit-source-id: 3c35265638f64f13f0262cee930c94f8628005da	2018-12-21 22:20:14 -08:00
Jongsoo Park	f52f68bcf9	format specialized_segment_ops_test.py to prepare D13515970 (#15408 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15408 Applied formatting to specialized_segment_ops_test.py to prepare D13515970 Reviewed By: salexspb Differential Revision: D13520300 fbshipit-source-id: c3250b6abe8087c607f65ae60d1da61bd46c342b	2018-12-20 23:44:47 -08:00
Edward Yang	26b04523b1	Record Caffe2's current stream ID in c10_cuda. (#15174 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15174 Previously, Caffe2 maintained a separate per-thread per-device current logical CUDA stream ID. In this PR, we switch Caffe2 over to using c10::Stream to manage the current stream, and also manage the allocation of cudaStream_t objects. This results in a slight behavior change: previously, Caffe2 would have been willing to allocate an arbitrary number of CUDA streams, depending on how high the logical stream IDs went. The c10::Stream pool has a fixed number of streams, once you exceed it, it wraps around. Reviewed By: dzhulgakov Differential Revision: D13451550 fbshipit-source-id: da6cf33ee026932a2d873835f6e090f7b8a7d8dc	2018-12-20 21:54:05 -08:00
Bill Li	3681bf7cff	add dense vector to id_list operator (#15090 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15090 as title step 2 of the linked task Reviewed By: ellie-wen Differential Revision: D13425977 fbshipit-source-id: f3538ed68f42470ba39c5b779af764d4a5591a9d	2018-12-18 16:27:38 -08:00
rohithkrn	763b9954f3	FP16MomentumSGDUpdate Op fix and enable for ROCm (#15150 ) Summary: 1. Fix a bug in FP16MomentumSGDUpdate operator 2. Enable operator for ROCm Pull Request resolved: https://github.com/pytorch/pytorch/pull/15150 Differential Revision: D13473145 Pulled By: bddppq fbshipit-source-id: 4c5c5f30cb9bba658e3639dbe193fa08a304d306	2018-12-14 16:33:45 -08:00
Xianjie Chen	fabd23cb2d	support casting to string (#15110 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15110 support casting to string on CPU Reviewed By: intermilan Differential Revision: D13429381 fbshipit-source-id: b737a1ba1237b10f692d5c42b42a544b94ba9fd1	2018-12-12 21:33:58 -08:00
Jongsoo Park	cff509e2b1	share code between adagrad and rowwise adagrad tests (#14692 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14692 Remove some code duplication Reviewed By: chocjy Differential Revision: D13296731 fbshipit-source-id: 5924e037ca64fc4b89234be922bc5ca47fb8bd32	2018-12-10 22:10:39 -08:00
bddppq	45dfc6764e	Enable more caffe2 fp16 rocm tests (#15040 ) Summary: cc rohithkrn petrex Pull Request resolved: https://github.com/pytorch/pytorch/pull/15040 Reviewed By: houseroad Differential Revision: D13413068 Pulled By: bddppq fbshipit-source-id: b2967f16f8da0b9e80083138fb8632c14e9e9b63	2018-12-10 21:30:21 -08:00
rohithkrn	7e2b074219	Integrate rocBLAS fp16 api into Caffe2 (#14882 ) Summary: This PR integrates rocBLAS half and mixed precision APIs into Caffe2. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14882 Differential Revision: D13407840 Pulled By: bddppq fbshipit-source-id: 75cb0d74da066776fa66575f1d255e879d36121e	2018-12-10 17:54:06 -08:00
rohithkrn	11a9248d01	Enable fp16 for MIOPEN operators in Caffe2 (#14905 ) Summary: This PR enables fp16 MIOPEN operators in Caffe2. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14905 Differential Revision: D13383439 Pulled By: bddppq fbshipit-source-id: 840afa8d08bef2952ca0039dee2423f1542bb330	2018-12-07 17:26:44 -08:00
lcskrishna	12addc64a6	Fixed MIOpen RNN Segfault issue and enabled RNN test (#14810 ) Summary: This pull request contains changes for: 1. Added MIOpen RNN API miopenGetRNNLayerBiasSize and miopenGetRNNLayerParamSize. 2. Fixed usage of API miopenGetRNNLayerParam. 3. Modifying the RNN test to run using MIOpen engine. Differential Revision: D13355699 Pulled By: bddppq fbshipit-source-id: 6f750657f8049c5446eca893880b397804120b69	2018-12-05 23:54:31 -08:00
Huan Gui	ba287eebca	Fix clip gradient with empty input (#14709 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14709 As titled Reviewed By: Wakeupbuddy Differential Revision: D13305554 fbshipit-source-id: 380062d4b0e4f9dc0207a27766cac7b8d05384d5	2018-12-05 22:53:25 -08:00
Michael Antonov	773f4d8081	Implements Gather operator for arbitrary axis, sharing the code with BatchGather. (#13756 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13756 This implements a general Gather operator for arbitrary axis, sharing the code with BatchGather. - CPU gather & batch gather logic is now shared through caffe2::gather_helper, for any axis. - Shared CUDA kernel moved to gather_op.cuh, for any axis. - Gradients of axis > 0 delegate to BatchGatherGradientOp which now has an axis argument. - BatchGatherOp doc strings updated to have correct rank (q + (r -1)) and output. - Added tests for axis == 2. GatherOp supports index wrapping for axis == 0 by default, which was earlier for ONNX. This diff also extends it to work in the CUDA kernel. Added "wrap_indices" argument which specifies whether this wrapping should be done; set it to true if you'd like wrapping for any axis. TBD: Update gradients to support negative indices (separate diff). TBD: Once we have operator versioning, we'd like to update GatherOp to NOT support axis 0 wrapping by default, but rather do it only if wrap_indices is set. Reviewed By: dzhulgakov Differential Revision: D12983815 fbshipit-source-id: 8add9d67b47fe8c5ba7a335f581ca0530b205cd7	2018-12-04 11:54:28 -08:00
Yan Zhu	aeb38cfcea	cuda implementation for PackSegment to support presence mask (#14635 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14635 as title Reviewed By: enosair Differential Revision: D13254097 fbshipit-source-id: b9f40109e2889907c925f9a4df9da14f67f45f38	2018-11-30 16:54:10 -08:00
rohithkrn	0d663cec30	Unify cuda and hip device types in Caffe2 python front end (#14221 ) Summary: Goal of this PR is to unify cuda and hip device types in caffe2 python front end. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14221 Differential Revision: D13148564 Pulled By: bddppq fbshipit-source-id: ef9bd2c7d238200165f217097ac5727e686d887b	2018-11-29 14:00:16 -08:00
Ashish	f4e502a8c5	Added MIOpen conv transpose op (#13938 ) Summary: This pull request contains changes for: 1. Removing ConvTranspose related changes from caffe2/operators/hip/conv_op_miopen.cc 2. Adding the file caffe2/operators/hip/conv_transpose_op_miopen.cc 3. Modifying the tests to run convTranspose op using MIOpen engine Differential Revision: D13055099 Pulled By: bddppq fbshipit-source-id: ca284f8f9a073005b22013c375cc958257815865	2018-11-13 21:01:52 -08:00
Shuting Wang	23e19ebfa7	add non-exponential emphasis loss to Lambdarank Summary: Currently Lambdarank applies exponential emphasis on relevance, i.e., g=2^rel when calculating DCG; this diff adds an option that supports g=rel in the loss function. Reviewed By: itomatik Differential Revision: D9891514 fbshipit-source-id: 64730d467a665670edd37e6dc1c077987991d1a8	2018-11-13 14:54:04 -08:00
Yan Shang	c85463fc74	Allow Gather to handle empty data (#13781 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13781 allow Gather Op to handle empty data. Reviewed By: intermilan Differential Revision: D13001267 fbshipit-source-id: 633c8471b637c56be8f6574f9bf9430785073977	2018-11-10 10:00:47 -08:00
Ansha Yu	e3e6ca1102	operator serialized test coverage summary document (#13703 ) Summary: Add a markdown document summarizing the coverage of serialized operator tests. This currently only takes into account what has been covered by the tests with respect to the entire registry of c2 operators. Next, we will break down the coverage by which operators have unit tests associated with them, which have hypothesis tests, and which have tests more specifically calling assertReferenceChecks. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13703 Reviewed By: dzhulgakov Differential Revision: D12970810 Pulled By: ajyu fbshipit-source-id: 4f0cd057b1cf734371333e24d26cbab630a170e1	2018-11-09 15:04:08 -08:00
François Garillot	edd2e38023	Clean up a couple of items in the C2 test scaffolding (WIP) (#7847 ) Summary: - Py3 compatibility - utility functions refactoring Pull Request resolved: https://github.com/pytorch/pytorch/pull/7847 Reviewed By: pietern Differential Revision: D9355096 Pulled By: huitseeker fbshipit-source-id: 8e78faa937488c5299714f78075d7cadb1b2490c	2018-11-07 09:16:13 -08:00
Pradeep Dorairaj	76c1b5cd79	Fix overflow error in stats_put_ops Summary: I was hitting this error: caffe2/caffe2/operators/stats_put_ops.h:66:25: runtime error: 9.22337e+18 is outside the range of representable values of type 'long' So, the assignment from int64_t to float loses some precision and because of that we overflow. Reproduced this issue with this diff D12945013 Reviewed By: mlappelbaum, jdshi-fb Differential Revision: D12927086 fbshipit-source-id: 7eae7fe25ab49d5ac15279335bd5b1fa89d6e683	2018-11-06 15:41:51 -08:00
Junjie Bai	95ca66763d	Add math functions overloaded over different numeric types for cuda and hip (#13602 ) Summary: petrex ashishfarmer rohithkrn iotamudelta Pull Request resolved: https://github.com/pytorch/pytorch/pull/13602 Reviewed By: dzhulgakov Differential Revision: D12935797 Pulled By: bddppq fbshipit-source-id: a49ec66fb60bfd947c63dd2133d431884df62235	2018-11-06 01:40:31 -08:00
Jongsoo Park	54e8623d26	3D Conv in NHWC layout (#12733 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12733 Conv in NHWC layout only works for 2D images. This has been a pain point when implementing quantized 3D convolution because we need NHWC layout for best performance (note that NHWC layout in general gives better performance in CPU not just for quantized operators). For example, our quantized ops have a functionality to measure quantized error operator by operator, but this needs running a shadow fp32 operator, which is not easy when no 3D conv in NHWC layout is available (currently we're doing layout conversion on the fly for the shadow fp32 operator, which is error prone). Some Caffe2 frameworks like brew generate an error when we try to create a 3D conv op in NHWC layout. This was also a blocker for using aibench because aibench is using brew. i-am-not-moving-c2-to-c10 Reviewed By: houseroad Differential Revision: D10333829 fbshipit-source-id: 2d203ee1db833cd3f9d39353219e3894b46c4389	2018-11-04 21:50:09 -08:00
Jongsoo Park	8be0efaa8c	omit group conv NHWC test for HIP (#13554 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13554 D10233252 broke ROCM test. We don't have group conv in NHWC for hip yet and this diff omits related tests. Reviewed By: hyuen Differential Revision: D12917880 fbshipit-source-id: 9baf36a8cb061ee8cf393b2c438a2d1460ce5cd8	2018-11-03 21:18:23 -07:00
Jongsoo Park	2bc6a7a260	enable group conv test in NHWC layout in CPU (#12428 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12428 Group conv in NHWC layout was enabled in CPU after D7547497. In D7547497, unit test of group conv in NHWC layout in CPU was enabled in group_conv_test.py but not in conv_test.py . This diff also enables it in conv_test.py . Reviewed By: BIT-silence Differential Revision: D10233252 fbshipit-source-id: aeeaf3eedc60e1cf6321b5a1dbe6a561e3aacbde	2018-11-03 11:58:51 -07:00
Junjie Bai	da029ca042	Skip Conv1D tests for MIOPEN (#13512 ) Summary: miopen currently only supports 2d Pull Request resolved: https://github.com/pytorch/pytorch/pull/13512 Differential Revision: D12903307 Pulled By: bddppq fbshipit-source-id: a8b0f0580a1859f1e0c1518907406abf013c4c8c	2018-11-02 11:38:26 -07:00
Sergei Nikolaev	61a2d47ec6	Special handling for 1D convolutional kernels in cuDNN flavor of conv_op. (#12902 ) Summary: Essentially makes cuDNN treat those kernels as Nx1 ones. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12902 Reviewed By: BIT-silence Differential Revision: D10852862 Pulled By: soumith fbshipit-source-id: 7416cf6d131177340d21cbf1d42c1daa6c7cad8c	2018-11-02 07:08:23 -07:00
Will Feng	4c06f1f2bb	CircleCI: enable all flaky tests (#13356 ) Summary: A few Caffe2 tests are currently disabled in `py2-gcc4.8-ubuntu14.04` test job because they are known to be flaky. https://github.com/pytorch/pytorch/pull/13055 likely had fixed the flakiness, and this PR tests it. Fixes https://github.com/pytorch/pytorch/issues/12395. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13356 Differential Revision: D12858206 Pulled By: yf225 fbshipit-source-id: 491c9c4a5c48ac1b791fdc9d78acf66091e80457	2018-10-31 09:34:49 -07:00
Michael Antonov	f58e4fbc45	Remove redundant array-gen loop in gather_ops_test.py (#13338 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13338 Remove unnecessary [r for r in []] statements. Reviewed By: ezyang Differential Revision: D12848907 fbshipit-source-id: 256551b286ac6801585acf9bb0b2644ef0b7ed58	2018-10-30 16:20:22 -07:00
Dong Shi	3a81984bde	Make Stat put ops accept empty tensors safely (#13178 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13178 Add default value option to stats put ops Reviewed By: mlappelbaum Differential Revision: D10858564 fbshipit-source-id: cc9b3e621abf3fc21821b73f354bebdcd35e477e	2018-10-30 13:28:58 -07:00
Ilia Cherniavskii	1032cf9fe4	Support for zero-length sequences in RNN executor (#13244 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13244 Adding support for zero-length sequences into RNN executor Reviewed By: dzhulgakov Differential Revision: D10848803 fbshipit-source-id: f2994ee28c09fb30146243bb300ae7205024dd17	2018-10-29 10:32:42 -07:00
Tristan Rice	ab40eff5dd	caffe2: UpsampleBilinear CUDA implementation (#12843 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12843 This adds a cuda implementation for the UpsampleBilinearOp and UpsampleBilinearGradientOp. The CUDA code is based off of the corresponding ResizeNearest operators but with bilinear interpolation logic taken from the CPU implementation. Reviewed By: houseroad Differential Revision: D10453776 fbshipit-source-id: b29ac330b72465974ddb27c0587bca590773fdec	2018-10-25 11:10:04 -07:00
Junjie Bai	ccfaf46431	Make CUDNN an alias of MIOPEN for HIP ops (#12278 ) Summary: This is mostly for reusing all the cudnn test cases in our python operator_tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/12278 Differential Revision: D10842592 Pulled By: bddppq fbshipit-source-id: 4b3ed91fca64ff02060837b3270393bc2f9a9898	2018-10-24 17:07:31 -07:00
Edward Yang	df47bbe9c1	Fix test_glu_old HealthCheck with smarter generation strategy. (#12975 ) Summary: Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/12975 Differential Revision: D10513493 Pulled By: ezyang fbshipit-source-id: ac183aeb4ae7f0a5f91f1a369b595ae92c3e844d	2018-10-24 13:45:19 -07:00
Yangqing Jia	ff508c91a1	Remove numba dependency Summary: TSIA - we want to deprecate numba in fbcode when moving to new compiler tiers. Converted the old test to a non-numba regular python op test. Reviewed By: xw285cornell Differential Revision: D10519910 fbshipit-source-id: 0e9188a6d0fc159100f0db704b106fbfde3c5833	2018-10-23 17:03:47 -07:00
Tristan Rice	6190408e24	caffe2: UpsampleBilinear support for scales (#12736 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12736 This updates UpsampleBilinearOp and UpsampleBilinearGradientOp to support scales to bring it inline with ResizeNearestOp https://github.com/pytorch/pytorch/pull/12720. Reviewed By: houseroad Differential Revision: D10416228 fbshipit-source-id: f339b7e06979c9c566afb4cee64a2d939b352957	2018-10-19 08:55:55 -07:00
Dmytro Dzhulgakov	92890d4314	Delete ExtendTensor operator Summary: Added 2 years ago in D3665603, never used, kill it. Reviewed By: ezyang Differential Revision: D10421336 fbshipit-source-id: 1b027a9ef2b71d0dd2c572cd4338bc8e046320d8	2018-10-18 15:18:40 -07:00
Lu Fang	f1e7d384b6	Support scales as inputs in ResizeNearest (#12720 ) Summary: To address https://github.com/onnx/onnx/pull/1467 Pull Request resolved: https://github.com/pytorch/pytorch/pull/12720 Reviewed By: BIT-silence Differential Revision: D10414813 Pulled By: houseroad fbshipit-source-id: 8831381b0115c363065c8d23bd1a95b4d641b857	2018-10-17 23:08:53 -07:00
Hector Yuen	17ab3bd502	implement rowwise quantization for fp16 (#12382 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12382 implement fp16-> (uint8 + scale and bias in fp32) this is similar to fp32 rowwise quantization we could have done scale and bias in fp16 but not too motivated since we are not saving much and those datatypes have to be converted to fp32 to process since x86 doesn't support half float operations anyways Reviewed By: csummersea Differential Revision: D10220463 fbshipit-source-id: 6c382026de881f03798c2e5fc43abfc80f84ea1f	2018-10-12 13:57:55 -07:00
Dong Shi	da3dd9af12	No Op Optimizer (#12390 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12390 Introduce a no op optimizer for when we don't want updates to happen, but don't want to affect downstream processes. Reviewed By: mlappelbaum Differential Revision: D10209812 fbshipit-source-id: 2af4ebc0fb42e78ea851c3a9f4860f3d224037b6	2018-10-10 18:09:46 -07:00
Junjie Bai	f54ab540af	Rename cuda_gpu_id to device_id in DeviceOption (#12456 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12456 codemod with 'Yes to all' codemod -d . --extensions h,cc,cpp,cu,py,proto,pbtxt,pb.txt,config cuda_gpu_id device_id Overload TextFormat::ParseFromString to do string replace when parsing from protobuf format Reviewed By: Yangqing Differential Revision: D10240535 fbshipit-source-id: 5e6992bec961214be8dbe26f16f5794154a22b25	2018-10-09 15:54:04 -07:00
Will Feng	cdead5ace1	Enable CircleCI for Linux jobs (#12389 ) Summary: Changes in this PR: 1. Intermediate Docker image is shared from build stage to test stage through ECR, in order to fix the Caffe2 flaky CUDA tests. 2. There are ~7 Caffe2 operator tests that are only flaky in `caffe2_py2_gcc4_8_ubuntu14_04_test` on CPU. Disabling those tests on that config only, which is okay to do because we are still running those tests in other test jobs. After this PR is merged, CircleCI will be running on master automatically, and will be running on PRs if the author rebased their PR onto the newest master (which we will ask all the authors to do when we switch off Jenkins for Linux). Pull Request resolved: https://github.com/pytorch/pytorch/pull/12389 Differential Revision: D10224267 Pulled By: yf225 fbshipit-source-id: dd1a90a425c3d13b870d3d328cb301eee2e6e2cd	2018-10-08 17:09:37 -07:00
Dong Shi	5a0d2c7138	Add clamping functionality to stats_put_ops Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12391 Reviewed By: mlappelbaum Differential Revision: D10220000 fbshipit-source-id: 10fdbc8ebab931a5be31df964b5de5728048205d	2018-10-08 16:53:26 -07:00
Junjie Bai	ff608a9ff3	Back out "Revert D10123245: Back out "codemod cuda_gpu_id to device_id"" (#12232 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12232 Original commit changeset: fca91fea58b7 This adds proper modifications to the DeviceType <->DeviceOption conversion code added in D10033396 Reviewed By: jerryzh168 Differential Revision: D10132473 fbshipit-source-id: 801ef777e2950982cb47b48051b1471a0a91e64b	2018-10-01 21:54:52 -07:00
Junjie Bai	26df16eb21	Clear previous device option when keep_device is set in load op Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12240 Reviewed By: jerryzh168 Differential Revision: D10133933 fbshipit-source-id: 05935bd527177f936c1d08626888d43dedbf5ce4	2018-10-01 17:20:26 -07:00
Rick Ratmansky	3010dc4208	Revert D10123245: Back out "codemod cuda_gpu_id to device_id" Differential Revision: D10123245 Original commit changeset: d83da8e00a12 fbshipit-source-id: fca91fea58b7df208edc2e218a1d514f9821ec7b	2018-10-01 12:22:36 -07:00
Yang Liu	7d7d336c45	Back out "codemod cuda_gpu_id to device_id" Summary: Original commit changeset: f5614a5d2607 D9986213 is causing Multifeed Aggregator a [huge performance difference](https://our.intern.facebook.com/intern/ads/analyze_canary/412951953278781781/) and is blocking aggregator push since last Friday night: https://fburl.com/feedtools/b6izvwjz We need to land this revert ASAP to unblock aggregator push. Reviewed By: orionr Differential Revision: D10123245 fbshipit-source-id: d83da8e00a1250f5d09811a0a587c127e377aab2	2018-10-01 11:31:14 -07:00
Satish Nadathur	04c0971679	Special case BatchGather and BatchGatherGradient for block_size=1. (#11349 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11349 Special case BatchGather and BatchGatherGradient for block_size=1. This makes BatchGather 3-4X faster and BatchGatherGradient 10X for this case. Reviewed By: jspark1105, ilia-cher Differential Revision: D7218043 fbshipit-source-id: ea12042239a8adc92b9efcbd0b66e354fb43f4c7	2018-09-27 21:11:38 -07:00
Junjie Bai	3eb5940cf5	codemod cuda_gpu_id to device_id (#12022 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12022 codemod -d . --extensions h,cc,cpp,cu,py,proto,pbtxt,pb.txt,config cuda_gpu_id device_id codemod with 'Yes to all' Reviewed By: orionr Differential Revision: D9986213 fbshipit-source-id: f5614a5d26078817aee8caf79a494abfd6a95ff1	2018-09-27 20:24:53 -07:00
Dong Shi	d9c27f4d8d	T33898723: Simple put operators for caffe2 stats (#12057 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12057 Add simple put operators for various types of stats Reviewed By: mlappelbaum Differential Revision: D9925268 fbshipit-source-id: cec02b0027d2d0ef3d35741be4b02c429d492810	2018-09-26 12:39:37 -07:00
Will Feng	7122f8b3bb	Disable more flaky tests on CircleCI (#11399 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/11362. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11399 Differential Revision: D9736673 Pulled By: yf225 fbshipit-source-id: cad8c0e86a70a01b047e648975ca5b9926e4acb3	2018-09-25 10:25:30 -07:00
Ansha Yu	3b1a5a1b8a	Refactor tests part 2 (#11811 ) Summary: Followup to the [first refactor](https://github.com/pytorch/pytorch/pull/11350). Increase coverage of tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/11811 Reviewed By: houseroad Differential Revision: D9923074 Pulled By: ajyu fbshipit-source-id: 0f899bb9e9a75bf7ed939e06cc9b028daa7f6bd9	2018-09-19 10:09:28 -07:00
Ansha Yu	98aebed88e	Refactor tests part 1 (#11350 ) Summary: Followup to [the serialized test framework](https://github.com/pytorch/pytorch/pull/10594) Round 1 for refactoring tests, starting alphabetically. I added some functionality, so I wanted to send out some of these initial changes sooner. I'm skipping all tests that don't explicitly call assertReferenceChecks. Some tests directly call np.allclose, and others are simply TestCase (rather than HypothesisTestCase). 1. Start alphabetically producing serialized outputs for test functions, annotating those we want to include with `serialized_test_util.given`. So far I've only added one test per operator, but this already does seem to add quite a few tests. 2. Add functionality to allow us to generate outputs using pytest by adding pytest argument options. This allows us to skip adding a `__main__` function to quite a few tests. 3. Catch any exceptions generating the gradient operator and skip serializing/reading it, since certain operators don't have gradients. 4. Add functionality to better handle jagged array inputs, which numpy doesn't handle very well. We simply explicitly do the conversion to dtype=object. 5. Make only one file per test function, rather than 4, to reduce the number of files in the github repo. I also noticed that there is some hypothesis handling that makes `serialized_test_util.given` not compatible with adding more hypothesis decorators on top. For example, there are tests that do ``` settings(...) given(...) def test_my_stuff(...) ``` But there is a hypothesis handler that explicitly checks that `given` is called below `settings`, so we cannot refactor this to `serialized_test_util.given`. I've just avoided decorating these kinds of tests for now, I hope that's alright. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11350 Reviewed By: houseroad Differential Revision: D9693857 Pulled By: ajyu fbshipit-source-id: a9b4279afbe51c90cf2025c5ac6b2db2111f4af7	2018-09-18 10:42:10 -07:00
Chenguang Xi	cdefc27795	Support lr adaption for SparseAdam and RowWiseSparseAdam (#11162 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11162 as title, fix pr test failure Reviewed By: chocjy Differential Revision: D9619308 fbshipit-source-id: 0a2228841ed8fadb15f07e94d3575aa701b10146	2018-09-17 10:29:03 -07:00
Xiaomeng Yang	7f7cda99cd	Optimize order_swich_ops on GPU (#11404 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11404 Optimize order_swich_ops on GPU Reviewed By: houseroad Differential Revision: D9728642 fbshipit-source-id: 74ff62268856fb1613fa61eb214bed6ec6716632	2018-09-12 16:56:15 -07:00
Lukasz Wesolowski	4db21a1d8e	Optimize LengthsTileOp on GPU to run a kernel instead of a sequence of memcopies (#11413 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11413 LengthsTileOp was implemented using a sequence of device memcopies initiated on the CPU. This was very slow. I changed it to use a kernel. TUM benchmark QPS improved from 13k QPS to 20k QPS as a result. Reviewed By: manojkris, xianjiec Differential Revision: D9724988 fbshipit-source-id: 2f98c697730982734d7c6a26d0b6967310d49900	2018-09-11 13:25:35 -07:00
Mingda Li	f2f43ad2da	Add new LengthsSplit operator (#10974 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10974 Pull Request resolved: https://github.com/pytorch/pytorch/pull/10291 This new operator will do the following: Given a LENGTHS vector and n_splits, output a "split" LENGTHS vector where: 1. Each length in input vector is split into n_splits values (thus output vector should have LENGTHS.size(0) * n_splits elements) 2. The new lengths in output should be evenly split, and if the length is not divisible by n_splits, then order new values in descending order. (e.g. n_splits = 3, length = 5 -> 2 2 1) 3. If n_splits > some element in the array, its split elements will contain 0s. (e.g. n_splits = 3, length = 2 - > 1 1 0) Reviewed By: bddppq, chocjy Differential Revision: D9013119 fbshipit-source-id: 82bf3371ec08c41fc3379177f0007afc142e0d84	2018-09-10 15:40:28 -07:00
Xiaomeng Yang	ec5404a449	Add cuda version of SpatialBNOp also optimize SpatialBN on CPU (#10888 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10888 Add cuda version of SpatialBNOp also optimize SpatialBN on CPU Reviewed By: houseroad Differential Revision: D9512435 fbshipit-source-id: 6f828c88d56d30dc9a2f98a297a161c35cc511b1	2018-09-06 18:26:13 -07:00
Will Feng	c9e66351a7	Port all PyTorch and Caffe2 jobs to CircleCI (#11264 ) Summary: This PR adds all PyTorch and Caffe2 job configs to CircleCI. Steps for the CircleCI mini-trial: - [ ] Make sure this PR passes Jenkins CI and fbcode internal tests - [x] Approve this PR - [ ] Ask CircleCI to turn up the number of build machines - [ ] Land this PR so that the new `.circleci/config.yml` will take effect Several Caffe2 tests are flaky on CircleCI machines and hence skipped when running on CircleCI. A proper fix for them will be worked on after a successful mini-trial. Pull Request resolved: https://github.com/pytorch/pytorch/pull/11264 Differential Revision: D9656793 Pulled By: yf225 fbshipit-source-id: 7832e90018f3dff7651489c04a179d6742168fe1	2018-09-05 16:28:11 -07:00
Xiaomeng Yang	b3d559cdd1	Optimize WeightedSumOp for two inputs (#11049 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11049 Optimize WeightedSumOp for two inputs Reviewed By: houseroad Differential Revision: D9566692 fbshipit-source-id: 9aab1f02251d386b6f7d0699ae11eeb2ea2b5b4f	2018-09-01 11:54:55 -07:00
Edward Yang	3073051a18	Revert D9554375: Support lr adaption for SparseAdam and RowWiseSparseAdam Differential Revision: D9554375 Original commit changeset: b88768f470ef fbshipit-source-id: 2c103c616c8680684892c7d9085fd7bb8289d2f1	2018-08-31 07:54:31 -07:00
Chenguang Xi	0555768e0f	Support lr adaption for SparseAdam and RowWiseSparseAdam (#10993 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10993 as title Reviewed By: chocjy Differential Revision: D9554375 fbshipit-source-id: b88768f470ef7d023dd481c6a97b91594892f422	2018-08-31 00:55:39 -07:00
Ansha Yu	9fae8fcdff	framework for committed serialized tests (#10594 ) Summary: Generate serialized test inputs/outputs/backward graphs of tests inside `caffe2/python/operator_test` that call assertSerializedOperatorCheck(). Tests should be decorated with serialized_test.collect_tests.given_and_seeded to run hypothesis tests that are actually random and a single fixed seeded hypothesis tests. To use: 1. Refactor your test to be a SerializedTestCase 1a. Decorate it with given_and_seeded 1b. Call testWithArgs in main 2. Run your test with -g to generate the output. Check it in. 3. Subsequent runs of the test without generating the output will check against the checked in test case. Details: Run your test with `python caffe2/python/operator_test/[your_test].py -g` Outputs are in `caffe2/python/serialized_test/data`. The operator tests outputs are in a further subdirectory `operator_test`, to allow for other tests in the future (model zoo tests?) Currently, we've only refactored weighted_sum_test to use this, but in the next diff, we'll refactor as many as possible. The directory structure may also change as usually there are multiple tests in a single file, so we may create more structure to account for that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10594 Reviewed By: ezyang Differential Revision: D9370359 Pulled By: ajyu fbshipit-source-id: 2ce77389cd8bcc0255d3bccd61569833e545ede8	2018-08-30 22:41:46 -07:00
Tommy Yu	89834dfe64	Add GPU version of HardSigmoid Op to Caffe2 (#10955 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10955 Add GPU version of HardSigmoid Op to Caffe2. Updated test file to include GPU tests. Reviewed By: enosair Differential Revision: D9499353 fbshipit-source-id: fcb51902063d0c3e4b10354533a8a42cf827c545	2018-08-29 14:55:29 -07:00
Tommy Yu	92ff070b83	Add CPU version of hard sigmoid operator to caffe2 (#10837 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10837 Add CPU version of hard sigmoid operator to caffe2. The definition of this operator can be found here: https://github.com/onnx/onnx/blob/master/docs/Operators.md#HardSigmoid. Reviewed By: BIT-silence Differential Revision: D9489536 fbshipit-source-id: 67b3171ed96d5ebcc8d500d93e7827a4a9705a81	2018-08-28 14:55:49 -07:00
Yanghan Wang	f64f6eed3a	move HeatmapMaxKeypointOp unittest to oss Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10859 Reviewed By: newstzpz Differential Revision: D9498312 fbshipit-source-id: 08b8a596f774c9102286019f286ca0b74d1f5304	2018-08-27 12:56:46 -07:00
Edward Yang	deda05e59f	Revert D9395814: move HeatmapMaxKeypointOp unittest to oss Differential Revision: D9395814 Original commit changeset: 25073eb6b143 fbshipit-source-id: 56f2b7b57e3c6361e2d78e5ba7850ea3b89e98fb	2018-08-23 06:54:29 -07:00
Yanghan Wang	9a43fc5eaa	move HeatmapMaxKeypointOp unittest to oss Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10674 Reviewed By: newstzpz Differential Revision: D9395814 fbshipit-source-id: 25073eb6b143fc1e7cbf5f887545d2b7df15c9a9	2018-08-22 19:11:10 -07:00
Wei Wen	6c75fc0aa3	Integrating stochastic quantization to easgd to reduce communication + supporting quantization on both sides (split from D8849770) (#10644 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10644 Depends on D8493264 Reviewed By: chocjy, boryiingsu Differential Revision: D9347706 fbshipit-source-id: 6fdcc5b61098bf47ec9391b1f009b0e6a0615842	2018-08-22 17:10:03 -07:00
Huan Gui	7832e9d564	Add a bisect percentile operator (#10563 ) Summary: Add a bisect percentile operators with lower and upper bounds for interpolation Pull Request resolved: https://github.com/pytorch/pytorch/pull/10563 Reviewed By: chocjy Differential Revision: D7802182 Pulled By: olittle fbshipit-source-id: 89ebfa8b3463adc2c89235fa3dfffa187a9d5417	2018-08-20 13:14:05 -07:00
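
Note on commit f2f43ad2da (LengthsSplit): the three rules in that commit message can be illustrated with a small pure-Python sketch. This is not the Caffe2 operator itself. The function name `lengths_split` and its list-based interface are assumptions made for illustration; only the splitting rule (even split, larger pieces first, zeros when n_splits exceeds a length) comes from the commit message.

```python
def lengths_split(lengths, n_splits):
    """Illustrative sketch of the splitting rule described in the commit message.

    Hypothetical helper, not the Caffe2 API: takes a list of non-negative
    integer lengths and returns a flat list with len(lengths) * n_splits entries.
    """
    if n_splits <= 0:
        raise ValueError("n_splits must be positive")
    out = []
    for length in lengths:
        # base is the even share; the first `extra` pieces get one more,
        # which puts the larger values first (descending order).
        base, extra = divmod(length, n_splits)
        out.extend([base + 1] * extra + [base] * (n_splits - extra))
    return out


# The two examples from the commit message: 5 -> 2 2 1 and 2 -> 1 1 0.
print(lengths_split([5, 2], 3))
```

Each input length is preserved as the sum of its n_splits output pieces, so the total of the output equals the total of the input.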

987 Commits