pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Geoffrey Goh	e23e4cc356	Back out "Revert D16469619: Add Virtual Memory and CPU percentage computation to AIBench" Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23821 Reviewed By: hl475 Differential Revision: D16654854 fbshipit-source-id: f057023e890cbcbd9145ef2ecb449df2fbba592b	2019-08-07 15:44:22 -07:00
Michael Suo	1b1bddaab3	Revert D16469619: Add Virtual Memory and CPU percentage computation to AIBench Differential Revision: D16469619 Original commit changeset: 670f3549c830 fbshipit-source-id: f55d4cda36f5e29df2df306d33a70158e5a7908b	2019-08-04 16:06:51 -07:00
Geoffrey Goh	445440a6a9	Add Virtual Memory and CPU percentage computation to AIBench (#23590) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23590 This diff adds CPU% and Virtual Memory computation by default to AIBench when doing mobile remote run Reviewed By: llyfacebook Differential Revision: D16469619 fbshipit-source-id: 670f3549c830a36bc456a57f2ea668f9f82dd15a	2019-08-04 09:29:44 -07:00
Yangqing Jia	a6f1ae7f20	set up c10 scaffolding. Move macros proper first. Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11939 Reviewed By: orionr, dzhulgakov Differential Revision: D10004629 Pulled By: Yangqing fbshipit-source-id: ba50a96820d35c7922d81c78c4cbe849c85c251c	2018-09-24 11:09:59 -07:00
Hao Lu	af107c4d16	Fix shape inference bug (#9199) Summary: Closes https://github.com/pytorch/pytorch/pull/9199 The input shapes are not logged correctly in production because `PerfNetObserver::Stop()` only gets called after the inference is done for the net, and in mobile models it's common practice to reuse the blobs as much as possible to save memory, so the shapes of the blobs keep changing during inference. By the time you query `InputTensorShapes()` in `PerfNetObserver::Stop()`, you only get the final shape of the blobs. To fix this bug, I moved the `InputTensorShapes()` query from `PerfNetObserver::Stop()` to `PerfOperatorObserver::Stop()`. The latter gets called at the end of operator->run() whereas `PerfNetObserver::Stop()` gets called at the end of net->run(). Also remove `PerfOperatorObserver::getAnalyticalCost()` since it's now done on the server side and no longer needed on mobile. Reviewed By: Maratyszcza Differential Revision: D8743346 fbshipit-source-id: 5d2d0132e3f5e084be7d0173863e695e62a6b4a0	2018-07-06 15:15:17 -07:00
Bram Wasti	82b981e4db	Update from facebook 1ee4edd286a3 (#8040) * Adding instance weight to batch distill loss as title * add bfloat 16-31 added bfloat 16-31 and their respective unit tests * [CUDA9] Upgrade - fbcode CUDA9 upgrade diff D5654023 has been out for a while thanks to Pieter. But with time growing it's becoming quite hard to rebase, because of the symlinks and auto-generated build/config files in tp2. Break D5654023 into two diffs, one touching tp2 config files, and another one touching fbcode TARGETS file (adding nvcc flag). These two should be a bit easier to rebase (for detailed procedure see "Test Plan"). This diff can only be committed if: 1. CUDA 9 rpm is rolled out fleet-wide (TBD) 2. NVidia driver 390.40 is rolled out fleet-wide (done) 3. Upgrade CUDA 9.1, cudnn 7.1, nccl 2.1 (done) 4. Make sure all dependents are built (done) 5. Test all C2 operators, PyTorch (see test plan) * Share intermediate int32 buffer across Conv ops Adding a known type * [C2 fix] infer function for ensure_cpu_output_op this is adding the missing device function for ensure_cpu_output_op * [int8] Add blob serializer/deserializer for Int8TensorCPU To export to logfiledb * [nomnigraph] Add try catch block to optimization passes in predictor This will catch failures that happen in the optimization pass. * Caffe2: avoid static initialization order fiasco for CAFFE_ENFORCE CAFFE_ENFORCE uses a stack trace fetcher, which is currently a global static variable. If CAFFE_ENFORCE is used at static initialization time, this is a SIOF. Recently CAFFE_ENFORCE was added into init functions registration, so we started to see this. A Meyers singleton is going to provide safety here. If the stack trace fetcher was not registered yet, it will just use a dummy one. * NUMA support in SparseNN CPU benchmark Adding support for NUMA in SparseNN CPU benchmark * [mobile-roofline] Add logging needed for roofline model This should be all that's needed * Let the operators use the same input if the operators are not chained; otherwise, we have to change the input data dims * fix null-pointer-use UBSAN errors in reshape_op.h * revert previous fix on input blob name as title * Adding flag to let MineHardNegative automatically extract single value from dict Model exporter requires the output of the model to be a struct. This makes it convenient to use those models directly in MineHardNegative by allowing automatic extraction of the single element of dict, which is a common use case. * Reverting change that broke internal tests back to OSS compatible state	2018-06-01 17:41:09 -04:00
xkszltl	89ba9dc44f	Import/export observer symbols for DLL, which fixes the linking error in Visual Studio. (#6834) * Import/export observer symbols for DLL, which fixes the linking error in Visual Studio. * Add support of all default cmake build types for release to cuda.	2018-05-31 10:22:21 -07:00
Martin Schatz	8baa563daf	Change observer copy() method to take id parameter This diff is added to support the ProfileObserver in order to differentiate operators in the stepnet properly. Since copy() is only used in the context of RNNs, the name has been changed to reflect that.	2018-03-27 18:10:39 -07:00
Yangqing Jia	dd1564b061	Caffe2 module update: move observers as well as binaries. (#2145) * Caffe2 module update: move observers as well as binaries. * Add threads linkage * Add Threads dependency to public interface	2018-03-06 14:45:21 -08:00

9 Commits