pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Yudong Guang	265b55d028	Revert D13205604: Move numa.{h, cc} to c10/util Differential Revision: D13205604 Original commit changeset: 54166492d318 fbshipit-source-id: 89b6833518c0b554668c88ae38d97fbc47e2de17	2018-12-07 10:01:25 -08:00
Jerry Zhang	1d111853ae	Move numa.{h, cc} to c10/util (#14393) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14393 att Reviewed By: ezyang Differential Revision: D13205604 fbshipit-source-id: 54166492d31827b0343ed070cc36a825dd86e2ed	2018-12-06 11:30:13 -08:00
Junjie Bai	e290a9d2fd	Back out "Migrate DeviceOption.numa_node_id to DeviceOption.device_id" Summary: Original commit changeset: 82583d0ad4b8 Reviewed By: enosair, ilia-cher Differential Revision: D10560741 fbshipit-source-id: e289a37d441bd2243b369810abf451292891d9ee	2018-10-24 17:11:25 -07:00
Junjie Bai	202893fe1a	Migrate DeviceOption.numa_node_id to DeviceOption.device_id Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/12717 Reviewed By: ilia-cher Differential Revision: D10408325 fbshipit-source-id: 82583d0ad4b8db094ee4c5c607b52500826328f7	2018-10-19 12:45:48 -07:00
Bram Wasti	aa56a1211d	Update from facebook (#6871) * Track checkpoint performance in scuba As title. * [C2/CUDA]: fix cross entropy sigmoid with logits when adding log_d_trick, I forgot to add it to the cuda impl; this diff fixes it. * Back out "[caffe2] Unregister MKL fallbacks for NCHW conversions" Original commit changeset: 8918dd40205a Will land after @jongsoo's diff https://phabricator.intern.facebook.com/D7596315 lands * [Easy][C2] Don't add blob to external outputs from output_record if it's already external output As desc. * On Mobile phones, call GlobalInit with no arguments in predictor in case we need to perform initialization FACEBOOK: The QPL logger needs the initialization code. In the past, the initialization code is put in the pipeline calling Caffe2. However, those places become obsolete quickly, as the product teams change places to call Caffe2 from time to time. We also need to track which teams use Caffe2 so that we can put the initialization code there. With this diff, the initialization code is put in the predictor constructor, only enabled for mobile phones. This way, we can always enable QPL logging. Once we do this, we can check how many times Caffe2 inference is called in production, and which models are more popular in production. This way, we can prioritize our effort supporting those models. Will clean up the old code calling the init in the product in a separate diff. * add padding op for sparse length tensor to pad length-based sparse tensor with padding_value * Add conv_op with cudaconvnet engine Add conv_op with cudaconvnet engine * [numa] Fix simple NUMA copy benchmark Move XavierFill into init_net and also compute BW * call roundf (device function) instead of round (host function) * [caffe2_benchmark][observer] Make caffe2_benchmark use its own observer 1. Add ClearGlobalNetObservers() 2. Make caffe2_benchmark use its own observer and observer_reporter * [detectron] Use roundf instead of round in the detectron module ops * allow K larger than number of elements in top k op one use case is to use this op together with PackSegments for sparse tensors, where the number of elements in each slice is not statistically defined. * add ChannelShuffle DNNLOWP op * fixup math_cpu.cc break	2018-04-23 15:01:56 -07:00
Dmytro Dzhulgakov	9e71de398b	[core] Graph-level NUMA awareness in Caffe2 Adding NUMA awareness through numa_node_id in DeviceOption. Blobs of operators with numa_node_id are allocated on corr. memory banks, using CPU pools with NUMA affinity set to run operators.	2018-03-06 00:33:11 -08:00

6 Commits