Summary:
Add a calibration module based on histogram binning:
Divide the prediction range (e.g., [0, 1]) into B bins. For each bin, keep two parameters: the number of positive examples and the total number of examples that fall into it. This effectively gives us a histogram of the model's predictions.
As a result, each bin carries an empirical estimate of the real CTR (num_pos / num_examples). When a pre-calibration prediction falls into a bin, we use that bin's estimate as the final calibrated prediction.
In this way, the predictions within each bin should be well calibrated as long as we have sufficient examples; that is, this module gives us a fine-grained calibrated model.
In theory, this calibration layer can fix any uncalibrated model or prediction given sufficient bins and examples. It therefore lets us apply any kind of training-weight allocation to our training data without worrying about calibration.
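A minimal sketch of the idea (a hypothetical C++ stand-in, not the actual dper3 module): accumulate per-bin positive/total counts, then map each raw prediction to its bin's empirical positive rate.

```cpp
// Histogram binning calibration sketch (assumed simplification of the module
// described above): B bins over [0, 1], per-bin positive and total counts,
// calibrated output = empirical positive rate of the bin the prediction falls in.
#include <cstddef>
#include <iostream>
#include <vector>

class HistogramBinningCalibration {
 public:
  explicit HistogramBinningCalibration(std::size_t num_bins)
      : num_positives_(num_bins, 0.0), num_examples_(num_bins, 0.0) {}

  // Accumulate one (prediction, label) pair into its bin.
  void Observe(double prediction, bool positive) {
    const std::size_t bin = BinIndex(prediction);
    num_examples_[bin] += 1.0;
    if (positive) {
      num_positives_[bin] += 1.0;
    }
  }

  // Calibrated prediction = empirical CTR of the prediction's bin;
  // fall back to the raw prediction when the bin has no data yet.
  double Calibrate(double prediction) const {
    const std::size_t bin = BinIndex(prediction);
    if (num_examples_[bin] == 0.0) {
      return prediction;
    }
    return num_positives_[bin] / num_examples_[bin];
  }

 private:
  std::size_t BinIndex(double prediction) const {
    // Predictions are assumed to lie in [0, 1]; clamp to be safe.
    if (prediction < 0.0) prediction = 0.0;
    if (prediction > 1.0) prediction = 1.0;
    std::size_t bin =
        static_cast<std::size_t>(prediction * num_examples_.size());
    return bin < num_examples_.size() ? bin : num_examples_.size() - 1;
  }

  std::vector<double> num_positives_;
  std::vector<double> num_examples_;
};

int main() {
  HistogramBinningCalibration cali(10);
  // An over-predicting model: raw score 0.8, true positive rate 0.5.
  for (int i = 0; i < 100; ++i) {
    cali.Observe(0.8, i % 2 == 0);
  }
  std::cout << cali.Calibrate(0.8) << std::endl;  // ~0.5 instead of 0.8
  return 0;
}
```

With enough examples per bin, the calibrated output tracks the observed CTR regardless of how the raw model was trained or weighted.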
Test Plan:
buck test dper3/dper3/modules/calibration/tests:calibration_test -- test_histogram_binning_calibration
buck test dper3/dper3_models/ads_ranking/tests:model_paradigm_e2e_tests -- test_sparse_nn_histogram_binning_calibration
All tests passed.
Example workflows:
f215431958
{F326445092}
f215445048
{F326445223}
Reviewed By: chenshouyuan
Differential Revision: D23356450
fbshipit-source-id: c691b66c51ef33908c17575ce12e5bee5fb325ff
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27001
This unconditional log line spams the logs enough that it's a drag on CPU and will eventually fill up the logs.
Test Plan: Rely on unit tests and automated testing for feedback.
Reviewed By: jspark1105
Differential Revision: D17638140
fbshipit-source-id: 4e8a44bda31327ba7e797f7579a9e3bf866eef7e
Summary:
These implicit fallthroughs lead to the following warning on g++ 7, because g++ cannot recognize the implicit `abort` call inside `LOG(FATAL)`. We suppress the warning by adding explicit `return`s (see the sketch after the warning output below).
/home/hong/wsrc/pytorch/caffe2/utils/math_cpu.cc: In function void caffe2::math::GemmEx(CBLAS_TRANSPOSE, CBLAS_TRANSPOSE, int, int, int, T, const T*, int, const T*, int, T, T*, int, Context*) [with T = float; Context = caffe2::CPUContext; Engine = caffe2::DefaultEngine]:
/home/hong/wsrc/pytorch/c10/util/logging_is_not_google_glog.h:98:10: warning: this statement may fall through [-Wimplicit-fallthrough=]
   ::c10::MessageLogger((char*)__FILE__, __LINE__, n).stream()
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/hong/wsrc/pytorch/caffe2/utils/math_cpu.cc:179:11: note: in expansion of macro LOG
     LOG(FATAL) << "Unexpected CBLAS_TRANSPOSE for trans_B";
     ^
/home/hong/wsrc/pytorch/caffe2/utils/math_cpu.cc:182:5: note: here
     case CblasTrans: {
     ^~~~
In file included from /home/hong/wsrc/pytorch/c10/util/Logging.h:28:0,
                 from /home/hong/wsrc/pytorch/caffe2/core/logging.h:2,
                 from /home/hong/wsrc/pytorch/caffe2/core/types.h:9,
                 from /home/hong/wsrc/pytorch/caffe2/utils/math.h:17,
                 from /home/hong/wsrc/pytorch/caffe2/utils/math_cpu.cc:14:
/home/hong/wsrc/pytorch/c10/util/logging_is_not_google_glog.h:98:10: warning: this statement may fall through [-Wimplicit-fallthrough=]
   ::c10::MessageLogger((char*)__FILE__, __LINE__, n).stream()
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/hong/wsrc/pytorch/caffe2/utils/math_cpu.cc:202:11: note: in expansion of macro LOG
     LOG(FATAL) << "Unexpected CBLAS_TRANSPOSE for trans_B";
     ^
/home/hong/wsrc/pytorch/caffe2/utils/math_cpu.cc:205:5: note: here
     default:
     ^~~~~~~
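For reference, a minimal sketch of the fix pattern, using a hypothetical stand-in for `LOG(FATAL)` (the real change is in caffe2/utils/math_cpu.cc): the logging call aborts at runtime, but g++ 7 cannot prove that, so each affected `case` gets an explicit `return`.

```cpp
// Sketch of the -Wimplicit-fallthrough fix pattern under the assumptions above.
#include <cstdlib>
#include <iostream>

enum TransposeSketch { kNoTrans, kTrans, kConjTrans };

// Stand-in for LOG(FATAL): aborts, but is deliberately not marked [[noreturn]],
// mirroring how g++ 7 sees the ::c10::MessageLogger stream call.
void LogFatalSketch(const char* msg) {
  std::cerr << msg << std::endl;
  std::abort();
}

int LeadingDim(TransposeSketch trans_B, int K, int N) {
  switch (trans_B) {
    case kNoTrans:
      return N;
    case kConjTrans:
      LogFatalSketch("Unexpected CBLAS_TRANSPOSE for trans_B");
      return 0;  // Explicit return added; without it the compiler warns that
                 // control may fall through into the next case.
    case kTrans:
      return K;
    default:
      LogFatalSketch("Unexpected CBLAS_TRANSPOSE for trans_B");
      return 0;  // Same pattern for the default branch.
  }
}

int main() {
  std::cout << LeadingDim(kTrans, 3, 4) << std::endl;  // prints 3
  return 0;
}
```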
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24053
Differential Revision: D16732530
Pulled By: ezyang
fbshipit-source-id: 90373879f25b52efca5bf151c7ed58d6ad19d925
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16929
Separate CPU reduce functions from math
i-am-not-moving-c2-to-c10
Reviewed By: houseroad
Differential Revision: D13999469
fbshipit-source-id: bd628b15a6e3c1f04cc62aefffb0110690e1c0d1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16175
Separate Moments from math and optimize it
i-am-not-moving-c2-to-c10
Reviewed By: houseroad
Differential Revision: D13742472
fbshipit-source-id: 90757d908d38c98ca69818855aaf68315e525992
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16135
Separate affine_channel from math and optimize it
i-am-not-moving-c2-to-c10
Reviewed By: houseroad
Differential Revision: D13727606
fbshipit-source-id: 8980af4afadaf964a18a9da581106fe30896a7e9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13949
This diff adds filler support for `SparseLengthsWeight*` ops. It does three things:
1. Add fillers for the `SparseLengthsWeight*` ops.
2. Add filling heuristics that account for the path `LengthsRangeFill` -> `Gather` -> `SparseLengthsWeightedSum`, where the length input is shared by `LengthsRangeFill` and `SparseLengthsWeightedSum`. We therefore need to carefully bound the values of that length input so that `Gather` does not index out of bounds on its weight input.
3. Fix and simplify the logic of `math::RandFixedSum`: we simply keep rejecting a generated value as long as it violates the invariants (see the sketch after this list).
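A minimal sketch of the rejection idea in item 3 (a hypothetical simplification, not the actual `math::RandFixedSum` signature): each value is re-drawn until the remaining sum is still reachable by the remaining slots within [min, max].

```cpp
// Rejection-sampling sketch of a fixed-sum generator under the stated assumptions.
// Precondition: n * min_v <= target_sum <= n * max_v.
#include <cstdint>
#include <iostream>
#include <random>
#include <vector>

std::vector<int64_t> RandFixedSumSketch(
    std::size_t n, int64_t min_v, int64_t max_v, int64_t target_sum,
    std::mt19937& gen) {
  std::uniform_int_distribution<int64_t> dist(min_v, max_v);
  std::vector<int64_t> out(n, 0);
  int64_t remaining_sum = target_sum;
  for (std::size_t i = 0; i < n; ++i) {
    const int64_t slots_left = static_cast<int64_t>(n - i - 1);
    int64_t v = dist(gen);
    // Keep rejecting the generated value while it makes the target unreachable
    // for the remaining slots.
    while (remaining_sum - v < slots_left * min_v ||
           remaining_sum - v > slots_left * max_v) {
      v = dist(gen);
    }
    out[i] = v;
    remaining_sum -= v;
  }
  return out;
}

int main() {
  std::mt19937 gen(0);
  auto v = RandFixedSumSketch(5, 1, 10, 23, gen);
  int64_t s = 0;
  for (auto x : v) { std::cout << x << " "; s += x; }
  std::cout << "| sum=" << s << std::endl;  // sum == 23
  return 0;
}
```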
Reviewed By: highker
Differential Revision: D13048216
fbshipit-source-id: bfe402e07e6421b28548047d18b298c148e0ec87
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12733
Conv in NHWC layout only works for 2D images. This has been a pain point when implementing quantized 3D convolution, because we need NHWC layout for the best performance (note that NHWC layout generally gives better performance on CPU, not just for quantized operators). For example, our quantized ops can measure quantization error operator by operator, but that requires running a shadow fp32 operator, which is not easy when no 3D conv in NHWC layout is available (currently we do the layout conversion on the fly for the shadow fp32 operator, which is error prone). Some Caffe2 frameworks like brew raise an error when we try to create a 3D conv op in NHWC layout, which was also a blocker for using aibench, since aibench uses brew.
i-am-not-moving-c2-to-c10
Reviewed By: houseroad
Differential Revision: D10333829
fbshipit-source-id: 2d203ee1db833cd3f9d39353219e3894b46c4389
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13480
Revert D12845220: the MKL functions use multiple threads, and the single-threaded run is slower than the Eigen version.
i-am-not-moving-c2-to-c10
Reviewed By: houseroad
Differential Revision: D12891751
fbshipit-source-id: 2a61727b269a304daeee2af6ff7fee7820cb5344
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11884
In CPU mode, the current convNd uses Im2ColNdNCHWImpl, a generic implementation that handles convolution for an arbitrary number of dimensions. In video modeling, we use convNd with 3-dimensional filters.
The problem with the current convNd is that Im2ColNdNCHWImpl is much slower than the Im2Col used by conv2d for filters with the same FLOPs. For example, a (1, 7, 7) 3D filter takes 5 times longer than a (7, 7) 2D filter at inference time.
This diff extends Im2Col to the 3D case (Im2Col3dNCHWImpl); this optimization makes 3D convolution 4-5 times faster at inference time on CPU for various video models (a sketch of the 3D im2col idea follows the benchmark reference below):
{F128300920}
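A minimal sketch of the 3D im2col idea (a hypothetical simplification of Im2Col3dNCHWImpl: single image, no padding or dilation, stride 1): each output column entry holds one element of a (kt, kh, kw) patch, so the 3D convolution reduces to a single GEMM instead of going through the generic N-dimensional path.

```cpp
// im2col sketch for a 3D (C, T, H, W) input under the stated assumptions.
#include <iostream>
#include <vector>

void Im2Col3dSketch(
    const std::vector<float>& data,  // C * T * H * W
    int C, int T, int H, int W,
    int kT, int kH, int kW,
    std::vector<float>& cols) {
  const int out_t = T - kT + 1;
  const int out_h = H - kH + 1;
  const int out_w = W - kW + 1;
  cols.assign(
      static_cast<std::size_t>(C) * kT * kH * kW * out_t * out_h * out_w, 0.0f);
  std::size_t col = 0;
  // One row per (c, kt, kh, kw), one column per output position (t, h, w).
  for (int c = 0; c < C; ++c) {
    for (int kt = 0; kt < kT; ++kt) {
      for (int kh = 0; kh < kH; ++kh) {
        for (int kw = 0; kw < kW; ++kw) {
          for (int t = 0; t < out_t; ++t) {
            for (int h = 0; h < out_h; ++h) {
              for (int w = 0; w < out_w; ++w) {
                const std::size_t src =
                    (((static_cast<std::size_t>(c) * T + t + kt) * H + h + kh) *
                     W) + w + kw;
                cols[col++] = data[src];
              }
            }
          }
        }
      }
    }
  }
}

int main() {
  const int C = 1, T = 3, H = 3, W = 3;
  std::vector<float> data(C * T * H * W);
  for (std::size_t i = 0; i < data.size(); ++i) data[i] = static_cast<float>(i);
  std::vector<float> cols;
  Im2Col3dSketch(data, C, T, H, W, 2, 2, 2, cols);
  std::cout << cols.size() << " column entries" << std::endl;  // 8 * 8 = 64
  return 0;
}
```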
i-am-not-moving-c2-to-c10
Reviewed By: BIT-silence
Differential Revision: D8245940
fbshipit-source-id: 75231d65c9dd56059dfe31701e26021fd1ff2a85
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12306
In a future diff, I'm going to introduce a non-placement constructor and destructor to TypeMeta.
To make that less ambiguous, this diff first renames the existing ones to PlacementXXX (a generic illustration of placement construction follows).
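A generic C++ illustration of what "placement" means in the new PlacementXXX names (not the TypeMeta code itself): construction and destruction happen in caller-provided storage, with no allocation or deallocation of that storage.

```cpp
// Placement construction/destruction sketch, independent of TypeMeta.
#include <iostream>
#include <memory>
#include <new>
#include <string>

int main() {
  // Caller-provided raw storage for one std::string.
  alignas(std::string) unsigned char storage[sizeof(std::string)];

  // "Placement construction": build the object in place; nothing is allocated
  // for the object itself.
  std::string* s = new (storage) std::string("placement");
  std::cout << *s << std::endl;

  // "Placement destruction": run the destructor in place; the storage itself
  // is not freed here.
  std::destroy_at(s);
  return 0;
}
```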
Reviewed By: dzhulgakov
Differential Revision: D10184117
fbshipit-source-id: 119120ebc718048bdc1d66e0cc4d6a7840e666a4
Summary:
CAFFE2_UNIQUE_LONG_TYPEMETA has been a tricky variable defined only from CMake. This is an experiment to remove it and see exactly which compilers need it set.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12311
Reviewed By: dzhulgakov
Differential Revision: D10187777
Pulled By: Yangqing
fbshipit-source-id: 03e4ede4eafc291e947e0449382bc557cb624b34
Summary:
TSIA. Right now we should use C10_EXPORT and C10_IMPORT for explicitly marking dllexport and dllimport, as a continuation of the C10 unification effort.
This is a codemod by mechanically doing the following change:
CAFFE2_{EXPORT,IMPORT} -> C10_{EXPORT,IMPORT}
AT_CORE_{EXPORT,IMPORT} -> C10_{EXPORT,IMPORT}
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12019
Reviewed By: ezyang, teng-li
Differential Revision: D10016276
Pulled By: Yangqing
fbshipit-source-id: a420d62c43d1110105fc88f9e9076e28a3203164
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10888
Add a CUDA version of SpatialBNOp and optimize SpatialBN on CPU.
Reviewed By: houseroad
Differential Revision: D9512435
fbshipit-source-id: 6f828c88d56d30dc9a2f98a297a161c35cc511b1
Summary:
This is an experimental build on top of what orionr and mingzhe09088 built.
Essentially, the idea is that we will need separate *_API versions for different shared libraries (a sketch of the export/import pattern they build on follows). If this theory is right, I'll try to clean up the design a bit and document it properly.
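A sketch of the export/import macro pattern such per-library *_API macros build on (simplified, hypothetical names, not the actual c10 definitions): symbols compiled into a shared library get dllexport/default visibility, and consumers of that library get dllimport, keyed on a per-library "building this library" define.

```cpp
// Per-library export/import macro sketch under the stated assumptions.
#include <iostream>

// For this single-file demo, pretend we are building the library itself.
#define BUILDING_SKETCHLIB 1

#if defined(_WIN32)
#define SKETCH_EXPORT __declspec(dllexport)
#define SKETCH_IMPORT __declspec(dllimport)
#else
#define SKETCH_EXPORT __attribute__((__visibility__("default")))
#define SKETCH_IMPORT
#endif

// Hypothetical library "sketchlib": export when building it, import otherwise.
#ifdef BUILDING_SKETCHLIB
#define SKETCHLIB_API SKETCH_EXPORT
#else
#define SKETCHLIB_API SKETCH_IMPORT
#endif

// A symbol meant to be visible across the shared-library boundary.
SKETCHLIB_API int sketch_answer();

int sketch_answer() { return 42; }

int main() {
  std::cout << sketch_answer() << std::endl;
  return 0;
}
```

Each shared library would define its own *_API macro on top of the shared export/import primitives, which is the "separate *_API versions" idea above.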
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11266
Reviewed By: orionr
Differential Revision: D9682942
Pulled By: Yangqing
fbshipit-source-id: c79653199e67a1500c9174f39f8b0357324763f3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11060
Add synthetic data generation to filler.h (the exact distribution will be replaced later on).
Reviewed By: highker
Differential Revision: D9417594
fbshipit-source-id: 5d66dfbcb254a5961c36b7d3a081332c7372dac7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10439
Update Im2Col-related code in preparation for group conv in NHWC order.
Reviewed By: houseroad
Differential Revision: D9285344
fbshipit-source-id: 1377b0243acb880d2ad9cf73084529a787dcb97d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9939
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13
Pull Request resolved: https://github.com/pytorch/translate/pull/166
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125
Closes https://github.com/pytorch/pytorch/pull/9125
Use inheritance for polymorphism, and remove the template parameter.
This changes the templating at call sites; the core implementations will change later.
Previously, the Caffe2 Tensor class was fixed at compile time to bind to a particular device/context. With this change, we make the device a runtime property (stored inside the tensor) while preserving the same semantics. For example, one still has to specify a device type in order to create a Tensor; there are no uninitialized tensors. More specifically, the changes are (a simplified sketch follows the list):
1. We added an extra argument *DeviceType* to most of the Tensor constructors, e.g. Tensor(DeviceType type).
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context) have changed: the second context is passed in so that we can call the templated Copy function. Previously it could be a different context than the source and target; now we enforce that the context, if provided, has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter Blob::GetMutableTensor that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. Tensor is no longer default-constructible (as we don't have unknown-device tensors), so some of the code handling STL containers had to change.
Note: Some changes are postponed just to keep this diff smaller. Please see `TODO`s.
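A simplified sketch of the API direction (hypothetical, heavily reduced types, not the real caffe2::Tensor): the device moves from a compile-time template parameter to a runtime property stored in the tensor, and there is no default constructor.

```cpp
// Device-as-runtime-property sketch under the stated assumptions.
#include <iostream>

enum class DeviceType { CPU, CUDA };

class Tensor {
 public:
  explicit Tensor(DeviceType type) : type_(type) {}  // device required at creation
  Tensor() = delete;                                  // no uninitialized tensors
  DeviceType GetDeviceType() const { return type_; }

 private:
  DeviceType type_;
};

int main() {
  // Before: Tensor<CPUContext> t;  (device fixed at compile time)
  // After:  the device is passed in and stored at runtime.
  Tensor t(DeviceType::CPU);
  std::cout << (t.GetDeviceType() == DeviceType::CPU ? "CPU" : "CUDA")
            << std::endl;
  return 0;
}
```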
Reviewed By: ezyang, houseroad
Differential Revision: D9024330
fbshipit-source-id: e0b8295d2dc6ebe2963383ded5af799ad17164ba
Summary:
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13
Pull Request resolved: https://github.com/pytorch/translate/pull/166
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125
Closes https://github.com/pytorch/pytorch/pull/9125
Use inheritance for polymorphism, and remove the template parameter.
This changes the templating at call sites; the core implementations will change later.
Previously, the Caffe2 Tensor class was fixed at compile time to bind to a particular device/context. With this change, we make the device a runtime property (stored inside the tensor) while preserving the same semantics. For example, one still has to specify a device type in order to create a Tensor; there are no uninitialized tensors. More specifically, the changes are:
1. We added an extra argument *DeviceType* to most of the Tensor constructors, e.g. Tensor(DeviceType type).
2. The semantics of the constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context) have changed: the second context is passed in so that we can call the templated Copy function. Previously it could be a different context than the source and target; now we enforce that the context, if provided, has the same device type as src.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter Blob::GetMutableTensor that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. Tensor is no longer default-constructible (as we don't have unknown-device tensors), so some of the code handling STL containers had to change.
Note: Some changes are postponed just to keep this diff smaller. Please see `TODO`s.
Reviewed By: xw285cornell
Differential Revision: D8121878
fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9520
Add a random data filler to the predictor benchmark to support production nets.
Reviewed By: salexspb
Differential Revision: D8712757
fbshipit-source-id: 2c732b2ba71ab210f9222adf94d08442ca71dc03
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9299
ONNX has ReduceL1 and ReduceL2 operators that facilitate this, so allow PyTorch to export them and allow Caffe2 to run them.
So far I have only implemented this on CPU (a sketch of the two reductions follows).
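A sketch of what the two reductions compute (plain C++, not the ONNX or Caffe2 kernels): ReduceL1 sums absolute values and ReduceL2 takes the square root of the sum of squares, shown here over a 1-D input.

```cpp
// L1/L2 reduction sketch under the stated assumptions (full reduction of a 1-D input).
#include <cmath>
#include <iostream>
#include <vector>

float ReduceL1(const std::vector<float>& x) {
  float sum = 0.0f;
  for (float v : x) sum += std::fabs(v);  // sum of absolute values
  return sum;
}

float ReduceL2(const std::vector<float>& x) {
  float sum_sq = 0.0f;
  for (float v : x) sum_sq += v * v;      // sum of squares
  return std::sqrt(sum_sq);
}

int main() {
  std::vector<float> x = {3.0f, -4.0f};
  std::cout << ReduceL1(x) << " " << ReduceL2(x) << std::endl;  // 7 5
  return 0;
}
```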
Reviewed By: pjh5
Differential Revision: D8757381
fbshipit-source-id: 68afc9e2f90042a70929b73ace05a499b5c670c7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9350
Re-apply #9270
Breaking this out of #8338
This takes care of the Eigen failure we saw on Mac CUDA builds when BUILD_CAFFE2 and BUILD_ATEN were removed. The fix is to isolate Eigen from the headers included by .cu files and processed by nvcc. This was worked on with smessmer.
Reviewed By: mingzhe09088
Differential Revision: D8794431
fbshipit-source-id: de656334af46c697802073f8e8d9a6aeb9ca65a7