Commit Graph

21 Commits

Author SHA1 Message Date
Lingyi Liu
2d884f2263 Optimize Scale function (#44913)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44913

Pull Request resolved: https://github.com/pytorch/pytorch/pull/18322

Optimize Scale function

i-am-not-moving-c2-to-c10

Test Plan: buck test mode/dbg caffe2/caffe2/python/operator_test:weighted_sum_test

Reviewed By: BIT-silence

Differential Revision: D14575780

fbshipit-source-id: db333a7964581dcaff6e432ff1d6b517ba1a075f
2020-09-18 14:31:33 -07:00
Kevin Matzen
6d8649dc53 [caffe2] fix Transpose2D calls in NHWC<->NCHW (#34625)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34625

These templated function calls do not specify the template arguments correctly. The first argument is the index type, not the array data type, so right now `T` is being used as the index type as well, which will break if we add a template specialization for uint8_t. If we omit both, the compiler correctly infers that the index type is `int` and the data type is `T`.

Reviewed By: BIT-silence

Differential Revision: D20358728

fbshipit-source-id: 8cbd8eeb14bce602c02eb6fce2cc141f0121fa24
2020-03-16 15:18:44 -07:00
Igor Sugak
23846d5a38 [caffe2] use Clang identification macro in various places (#33574)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33574

Sprinkle the Clang identification macro over the places that would otherwise cause build errors when Clang is used to drive the CUDA compilation.

Note: `__clang__` is defined both when Clang is used as the host compiler by NVCC and when Clang drives the compilation itself. `__CUDA__` is defined only in the latter case.

Test Plan:
```lang=bash
buck build mode/opt -c fbcode.cuda_use_clang=true //fblearner/flow/projects/dper:workflow
buck build mode/opt //fblearner/flow/projects/dper:workflow
```

Reviewed By: BIT-silence

Differential Revision: D20007440

fbshipit-source-id: 53caa70695b99461a3910d41dc71a9f6d0728a75
2020-02-20 15:16:11 -08:00
Gregory Chanan
2f03205c65 Support torch::tensor and at::tensor with bool and BFloat16 dtypes.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23337

Test Plan: Imported from OSS

Differential Revision: D16467979

Pulled By: gchanan

fbshipit-source-id: 2e6ad431c47a61c917d501390d14c55b788958ab
2019-08-09 12:36:35 -07:00
Xiaomeng Yang
29b53b0259 Fix bug in caffe2 transpose on GPU (#22233)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22233

Fix bug in caffe2 transpose on GPU

Reviewed By: hl475

Differential Revision: D15994973

fbshipit-source-id: 542dc8757b51a6322fffa55826c1d4e32927398d
2019-06-26 11:33:25 -07:00
Xiaomeng Yang
2ce39de3fc Add elementwise_affine for layer_norm_op (#19713)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19713

Add elementwise_affine for layer_norm_op

Reviewed By: houseroad

Differential Revision: D15075454

fbshipit-source-id: e8a7d3da1c81e49fa55323f5e74a68bc4ef8d83f
2019-04-26 17:20:01 -07:00
Xiaomeng Yang
fb9fc42a0c optimize BatchMatmulOp (#18612)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18612

optimize BatchMatmulOp

Reviewed By: houseroad

Differential Revision: D14681665

fbshipit-source-id: cf5ea4909ace58fd44fe6fa634531102ac84e851
2019-04-23 15:34:59 -07:00
Xiaomeng Yang
fd40c0eba0 Add gelu op (#18992)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18992

Add gelu op

Reviewed By: houseroad

Differential Revision: D14814811

fbshipit-source-id: 00f126b8b83763c57ebbf28fbd2de5a8fab6d491
2019-04-08 21:58:29 -07:00
Xiaomeng Yang
265fa0ce4d Move math::Axpy function to elementwise lib (#18316)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18316

Move math::Axpy function to elementwise lib

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D14574697

fbshipit-source-id: 7cfbb2da295c8966c5328bd6b577cce2638eea62
2019-03-26 12:19:19 -07:00
nihui
ed8c462dc7 Fix caffe2 build with BLAS=OpenBLAS (#18422)
Summary:
g++ complains about missing declarations for the cblas_sscal and cblas_dscal BLAS functions.
let's fix it :)

Fedora 29, gcc 8.3.1, OpenBLAS 0.3.5
built with cmake -DBLAS=OpenBLAS ..
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18422

Differential Revision: D14598977

Pulled By: soumith

fbshipit-source-id: bde77bfb359d2ff38226401caeed78c114ef7468
2019-03-25 11:59:10 -07:00
Xiaomeng Yang
e04c9195b7 Update math::Transpose to support tensor with size > 2G (#17670)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17670

Update math::Transpose to support tensor with size > 2G

i-am-not-moving-c2-to-c10

Differential Revision: D14313624

fbshipit-source-id: 0b4a85b913972e5a8981f0d40d0c539407b98f30
2019-03-20 18:22:21 -07:00
Xiaomeng Yang
0fd1dc45c0 Optimize LayerNormOp (#17604)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17604

Optimize LayerNormOp

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D14274175

fbshipit-source-id: a7aa263a1b0eb109682d2be99306e7b2cdcc0faf
2019-03-08 17:38:14 -08:00
Xiaomeng Yang
9709d5e787 Fix math::Set for large tensor (#17539)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17539

Fix math::Set for large tensor

i-am-not-moving-c2-to-c10

Reviewed By: dzhulgakov, houseroad

Differential Revision: D14240756

fbshipit-source-id: 0ade26790be41fb26d2cc193bfa3082c7bd4e69d
2019-02-27 12:34:58 -08:00
Xiaomeng Yang
2e67b34ea7 Separate gpu reduce functions (#17146)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17146

Separate gpu reduce functions

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D14097564

fbshipit-source-id: a27de340997111a794b1d083c1673d4263afb9fb
2019-02-20 14:49:01 -08:00
Xiaomeng Yang
3a34f443c5 Separate reduce functions from math (#16929)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16929

Separate CPU reduce functions from math

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D13999469

fbshipit-source-id: bd628b15a6e3c1f04cc62aefffb0110690e1c0d1
2019-02-13 17:50:47 -08:00
Xiaomeng Yang
2db847b3a7 Separate elementwise level2 math functions (#16753)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16753

Separate elementwise level2 math functions

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D13954928

fbshipit-source-id: 1ca7a5d3da96e32510f502e5e4e79168854bee67
2019-02-07 18:38:26 -08:00
Xiaomeng Yang
7d4a81cbb2 Use macro for reduce on 2d blocks (#16344)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16344

Use macro for reduce on 2d blocks

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D13808988

fbshipit-source-id: b68c0fb6079c1b6e203a072083aba7a95c202bc2
2019-02-01 23:49:07 -08:00
Xiaomeng Yang
598b713660 Separate level1 elementwise functions from math (#16397)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16397

Separate level1 elementwise functions from math

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D13830626

fbshipit-source-id: e6e672647076dab8b3b24be181f580a1486250c9
2019-01-30 00:04:12 -08:00
Xiaomeng Yang
0a2d14dd7c Optimize SpatialBNOp on GPU (#16395)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16395

Optimize SpatialBNOp on GPU

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D13829833

fbshipit-source-id: 04d2a63e8e9830c4c39a91cf87fcd7aa765dc55f
2019-01-28 09:36:45 -08:00
Xiaomeng Yang
866c4e3467 Separate Moments from math and optimize it (#16175)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16175

Separate Moments from math and optimize it

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D13742472

fbshipit-source-id: 90757d908d38c98ca69818855aaf68315e525992
2019-01-20 08:53:25 -08:00
Xiaomeng Yang
b436f94b53 Separate affine_channel from math and optimize it (#16135)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16135

Separate affine_channel from math and optimize it

i-am-not-moving-c2-to-c10

Reviewed By: houseroad

Differential Revision: D13727606

fbshipit-source-id: 8980af4afadaf964a18a9da581106fe30896a7e9
2019-01-18 22:40:16 -08:00