pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Tristan Rice 0c9787c758 caffe2: use at::mt19937 instead of std::mt19937 (10x speedup) (#43987)	2020-10-16 16:08:35 -07:00

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43987

This replaces the caffe2 CPU random number generator (std::mt19937) with at::mt19937, which is the one currently used in pytorch. The ATen RNG is 10x faster than the std one and appears to be more robust given bugs in the std (https://fburl.com/diffusion/uhro7lqb).

For large embedding tables (10GB+) we see UniformFillOp taking upwards of 10 minutes as we're bottlenecked on the single-threaded RNG. Swapping to at::mt19937 cuts that time to 10% of the current.

Test Plan: Ran all relevant tests + CI. This doesn't introduce new features (+ is a core change), so existing tests + CI should be sufficient to catch regressions.

Reviewed By: dzhulgakov

Differential Revision: D23219710

fbshipit-source-id: bd16ed6415b2933e047bcb283a013d47fb395814
hip	Change hip filename extension to .hip (#14036)	2018-11-16 11:55:59 -08:00
math	Optimize Scale function (#44913)	2020-09-18 14:31:33 -07:00
threadpool	Re-apply PyTorch pthreadpool changes	2020-06-23 19:26:21 -07:00
bench_utils.cc	wipe cache with writes (#12279)	2018-10-03 17:12:23 -07:00
bench_utils.h	Lightweight at-most-once logging for API usage (#20745)	2019-05-23 23:17:59 -07:00
cast_test.cc	Update from facebook (#7855)	2018-05-29 11:38:02 -07:00
cast.h	Update from facebook (#7855)	2018-05-29 11:38:02 -07:00
cblas.h	Fix more MKL build issues	2017-08-25 14:01:01 -07:00
CMakeLists.txt	Fix BUILD_CAFFE2 if FBGEMM and NNPACK are not built (#45610)	2020-10-01 14:58:55 -07:00
conversions.h	[caffe2] use Clang identification macro in various places (#33574)	2020-02-20 15:16:11 -08:00
cpu_neon.h	[caffe2] Use both __ARM_NEON__ and __ARM_NEON macros (#6697)	2018-04-18 17:45:47 -04:00
cpuid_test.cc	Remove Apache headers from source.	2018-03-27 13:10:18 -07:00
cpuid.cc	Remove Apache headers from source.	2018-03-27 13:10:18 -07:00
cpuid.h	[caffe2] Use cpuinfo in perfkernels to simplify build dependency (#36371)	2020-04-10 13:26:34 -07:00
eigen_utils.h	Export PyTorch erf to ONNX Erf and add Caffe2 Erf operator	2019-01-17 09:18:08 -08:00
fatal_signal_asan_no_sig_test.cc	Windows shared build (#13550)	2018-11-16 12:16:28 -08:00
filler.h	Delete Tensor::swap(), replace with pointer swap (#12730)	2019-01-25 08:25:07 -08:00
fixed_divisor_test.cc	Enable ROCm multi-gpu with Gloo	2019-05-07 09:55:47 -07:00
fixed_divisor.h	[caffe2] use Clang identification macro in various places (#33574)	2020-02-20 15:16:11 -08:00
GpuBitonicSort.cuh	Manually applying cudnn5 pull request.	2018-01-02 15:31:33 -08:00
GpuDefs.cuh	CUDA RTX30 series support (#45489)	2020-09-29 18:19:23 -07:00
GpuScanUtils.cuh	RIP CUDA <9.2: circleci, aten, and caffe2 (#36846)	2020-05-18 13:41:05 -07:00
map_utils.h	Fix typos, via a Levenshtein-type corrector (#31523)	2020-01-17 16:03:19 -08:00
math_cpu.cc	caffe2: use at::mt19937 instead of std::mt19937 (10x speedup) (#43987)	2020-10-16 16:08:35 -07:00
math_gpu_test.cc	Update math::Transpose to support tensor with size > 2G (#17670)	2019-03-20 18:22:21 -07:00
math_gpu.cu	RIP CUDA <9.2: circleci, aten, and caffe2 (#36846)	2020-05-18 13:41:05 -07:00
math_test.cc	Update math::Transpose to support tensor with size > 2G (#17670)	2019-03-20 18:22:21 -07:00
math-detail.h	Remove Apache headers from source.	2018-03-27 13:10:18 -07:00
math.h	Move math::Axpy function to elementwise lib (#18316)	2019-03-26 12:19:19 -07:00
murmur_hash3.cc	Remove core and util warnings (#8239)	2018-06-07 09:10:33 -07:00
murmur_hash3.h
proto_convert.cc	New serialization format (#12384)	2018-10-16 16:36:58 -07:00
proto_convert.h	New serialization format (#12384)	2018-10-16 16:36:58 -07:00
proto_utils_test.cc	caffe2 - Util to cleanup external inputs and outputs from a NetDef (#18194)	2019-03-22 11:23:03 -07:00
proto_utils.cc	[Onnxifi] Don't throw exception when we cannot write out debug files (#45979)	2020-10-08 00:18:24 -07:00
proto_utils.h	[Onnxifi] Don't throw exception when we cannot write out debug files (#45979)	2020-10-08 00:18:24 -07:00
proto_wrap.cc	New Serialization Proto	2018-09-11 10:55:43 -07:00
proto_wrap.h	build changes to make cpu unified build working. (#10504)	2018-08-15 17:22:36 -07:00
signal_handler.cc	preprocessor cleanup (#33957)	2020-03-02 13:37:19 -08:00
signal_handler.h	Fix typos, via a Levenshtein-type corrector (#31523)	2020-01-17 16:03:19 -08:00
simple_queue_test.cc	Remove Apache headers from source.	2018-03-27 13:10:18 -07:00
simple_queue.h	Fix issues under caffe round 1	2019-01-23 19:04:59 -08:00
smart_tensor_printer_test.cc	Kill more weird constructors on Tensor	2018-11-04 16:54:49 -08:00
smart_tensor_printer.cc	preprocessor cleanup (#33957)	2020-03-02 13:37:19 -08:00
smart_tensor_printer.h	More changes for hidden visibility (#10692)	2018-08-21 13:39:57 -07:00
string_utils.cc	BlackBoxPredictor OSS part 5: glow transforms	2019-07-23 16:39:23 -07:00
string_utils.h	Fix out-of-boundary access in `caffe2::StartsWith` (#36672)	2020-04-15 20:40:59 -07:00
zmq_helper.h	Fix typos, via a Levenshtein-type corrector (#31523)	2020-01-17 16:03:19 -08:00