9 Commits

Author SHA1 Message Date
Geoffrey Goh
e23e4cc356 Back out "Revert D16469619: Add Virtual Memory and CPU percentage computation to AIBench"
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/23821

Reviewed By: hl475

Differential Revision: D16654854

fbshipit-source-id: f057023e890cbcbd9145ef2ecb449df2fbba592b
2019-08-07 15:44:22 -07:00
Michael Suo
1b1bddaab3 Revert D16469619: Add Virtual Memory and CPU percentage computation to AIBench
Differential Revision: D16469619

Original commit changeset: 670f3549c830

fbshipit-source-id: f55d4cda36f5e29df2df306d33a70158e5a7908b
2019-08-04 16:06:51 -07:00
Geoffrey Goh
445440a6a9 Add Virtual Memory and CPU percentage computation to AIBench (#23590)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23590

This diff adds CPU% and Virtual Memory computation by default to AIBench when doing a mobile remote run.

Reviewed By: llyfacebook

Differential Revision: D16469619

fbshipit-source-id: 670f3549c830a36bc456a57f2ea668f9f82dd15a
2019-08-04 09:29:44 -07:00
Yangqing Jia
a6f1ae7f20 set up c10 scaffolding. Move macros proper first.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11939

Reviewed By: orionr, dzhulgakov

Differential Revision: D10004629

Pulled By: Yangqing

fbshipit-source-id: ba50a96820d35c7922d81c78c4cbe849c85c251c
2018-09-24 11:09:59 -07:00
Hao Lu
af107c4d16 Fix shape inference bug (#9199)
Summary:
Closes https://github.com/pytorch/pytorch/pull/9199

The input shapes are not logged correctly in production because `PerfNetObserver::Stop()` is only called after inference is done for the whole net. In mobile models it is common practice to reuse blobs as much as possible to save memory, so the shapes of the blobs keep changing during inference. By the time you query `InputTensorShapes()` in `PerfNetObserver::Stop()`, you only get the final shapes of the blobs.

To fix this bug, I moved the `InputTensorShapes()` query from `PerfNetObserver::Stop()` to `PerfOperatorObserver::Stop()`. The latter gets called at the end of `operator->run()`, whereas `PerfNetObserver::Stop()` gets called at the end of `net->run()`.
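
The difference between the two query points can be illustrated with a minimal, self-contained sketch. It does not use Caffe2 code: `Shape`, `Op`, and `log_input_shapes` are hypothetical names invented for this illustration, and "running" an operator is reduced to overwriting the shape of its output blob.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for this sketch; these are not Caffe2 types.
using Shape = std::vector<int>;

struct Op {
  std::size_t input;   // index of the blob this operator reads
  std::size_t output;  // index of the (reused) blob this operator overwrites
  Shape output_shape;  // shape the output blob has after the operator runs
};

// Runs the ops in order over a small pool of reused blobs and returns the
// input shape logged for each op.
//   per_operator == true : query right after each operator runs (the fix)
//   per_operator == false: query once after the whole net has run (the bug)
std::vector<Shape> log_input_shapes(const std::vector<Op>& ops,
                                    std::vector<Shape> blobs,
                                    bool per_operator) {
  std::vector<Shape> logged;
  for (const Op& op : ops) {
    assert(op.input < blobs.size() && op.output < blobs.size());
    blobs[op.output] = op.output_shape;  // "run" the operator
    if (per_operator) {
      logged.push_back(blobs[op.input]);  // input blob not yet reused
    }
  }
  if (!per_operator) {
    for (const Op& op : ops) {
      logged.push_back(blobs[op.input]);  // only the final shapes remain
    }
  }
  return logged;
}
```

With two blobs that three operators reuse in ping-pong fashion, the per-operator query reports the shape each operator actually consumed, while the net-level query reports whatever shape the blob ended up with.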

Also remove `PerfOperatorObserver::getAnalyticalCost()`, since this is now done on the server side and is no longer needed on mobile.

Reviewed By: Maratyszcza

Differential Revision: D8743346

fbshipit-source-id: 5d2d0132e3f5e084be7d0173863e695e62a6b4a0
2018-07-06 15:15:17 -07:00
Bram Wasti
82b981e4db Update from facebook 1ee4edd286a3 (#8040)
* Adding instance weight to batch distill loss

as title

* add bfloat 16-31

added bfloat 16-31 and their respective unit tests

* [CUDA9] Upgrade - fbcode

CUDA9 upgrade diff D5654023 has been out for a while thanks to Pieter. But as time passes it is becoming quite hard to rebase because of the symlinks and auto-generated build/config files in tp2. Break D5654023 into two diffs, one touching the tp2 config files and another touching the fbcode TARGETS file (adding the nvcc flag). These two should be a bit easier to rebase (for the detailed procedure see "Test Plan").

This diff can only be committed if:
1. CUDA 9 rpm is rolled out fleet-wide (TBD)
2. NVIDIA driver 390.40 is rolled out fleet-wide (done)
3. Upgrade CUDA 9.1, cudnn 7.1, nccl 2.1 (done)
4. Make sure all dependents are built (done)
5. Test all C2 operators, PyTorch (see test plan)

* Share intermediate int32 buffer across Conv ops

Adding a known type

* [C2 fix] infer function for ensure_cpu_output_op

This adds the missing device function for ensure_cpu_output_op

* [int8] Add blob serializer/deserializer for Int8TensorCPU

To export to logfiledb

* [nomnigraph] Add try catch block to optimization passes in predictor

This will catch failures that happen in the optimization pass.
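
As a rough illustration of guarding optimization passes with a try/catch block, here is a minimal sketch. `Net`, `Pass`, and `optimize` are hypothetical names for this example, not the predictor's or nomnigraph's API, and falling back to the net from before the failing pass is an assumption about what catching the failure is meant to achieve.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a network definition; not a Caffe2 type.
struct Net {
  std::vector<std::string> ops;
};

// A pass takes a net and returns a transformed copy; it may throw.
using Pass = std::function<Net(const Net&)>;

// Applies each pass in turn. If a pass throws, its result is discarded and
// the net from before that pass is kept (assumed fallback behavior), so a
// faulty pass cannot prevent the model from being used.
Net optimize(const Net& net, const std::vector<Pass>& passes) {
  Net current = net;
  for (const Pass& pass : passes) {
    try {
      current = pass(current);
    } catch (const std::exception&) {
      // Leave `current` unchanged and continue with the next pass.
    }
  }
  return current;
}
```

A pass that throws therefore leaves the net exactly as the previous passes produced it.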

* Caffe2: avoid static initialization order fiasco for CAFFE_ENFORCE

CAFFE_ENFORCE uses a stack trace fetcher, which is currently a
global static variable. If CAFFE_ENFORCE is used at static
initialization time, this is a SIOF. Recently CAFFE_ENFORCE was added
into init function registration, so we started to see this.

Meyers singleton is going to provide safety here. If stacktrace
fetcher was not registered yet, it will just use a dummy one.
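
The Meyers singleton pattern mentioned here can be sketched as follows. This is a generic illustration and not the Caffe2 source: `StackTraceFetcher`, `stack_trace_fetcher`, `set_stack_trace_fetcher`, and `fetch_stack_trace` are hypothetical names, and the dummy fetcher returning an empty string is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

using StackTraceFetcher = std::function<std::string()>;

// Function-local static ("Meyers singleton"): the object is constructed the
// first time this function is called, so it is already valid when other
// static initializers use it, whatever the initialization order of
// translation units. It starts out as a dummy fetcher.
StackTraceFetcher& stack_trace_fetcher() {
  static StackTraceFetcher fetcher = [] { return std::string(); };
  return fetcher;
}

// Registers the real fetcher, replacing the dummy one.
void set_stack_trace_fetcher(StackTraceFetcher f) {
  stack_trace_fetcher() = std::move(f);
}

// What an enforce-style macro would call when building its error message.
std::string fetch_stack_trace() {
  return stack_trace_fetcher()();
}
```

Calling `fetch_stack_trace()` before registration yields the dummy's empty result instead of touching an uninitialized global.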

* NUMA support in SparseNN CPU benchmark

Adding support for NUMA in SparseNN CPU benchmark

* [mobile-roofline] Add logging needed for roofline model

This should be all that's needed

* Let the operators use the same input if the operators are not chained

or else, we have to change the input data dims

* fix null-pointer-use UBSAN errors in reshape_op.h

* revert previous fix on input blob name

as title

* Adding flag to let MineHardNegative automatically extract single value from dict

Model exporter requires the output of the model to be a struct. This makes it convenient to use those models directly in MineHardNegative by allowing automatic extraction of the single element of the dict, which is a common use case.

* Reverting change that broke internal tests back to OSS compatible state
2018-06-01 17:41:09 -04:00
xkszltl
89ba9dc44f Import/export observer symbols for DLL, which fixes the linking error in Visual Studio. (#6834)
* Import/export observer symbols for DLL, which fixes the linking error in Visual Studio.

* Add support of all default cmake build types for release to cuda.
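
For context on the import/export fix above, the usual pattern for DLL symbol visibility looks roughly like the sketch below. The macro names `OBSERVER_API` and `OBSERVER_BUILD_SHARED` and the class `ObserverRegistry` are hypothetical, chosen for this example only; the actual macros used by the project are not shown in this log.

```cpp
#include <cassert>

// Hypothetical macro names for this sketch; the real project uses its own.
#if defined(_WIN32)
#  if defined(OBSERVER_BUILD_SHARED)  // defined only while building the DLL
#    define OBSERVER_API __declspec(dllexport)
#  else                               // consumers of the DLL import symbols
#    define OBSERVER_API __declspec(dllimport)
#  endif
#elif defined(__GNUC__)
#  define OBSERVER_API __attribute__((visibility("default")))
#else
#  define OBSERVER_API
#endif

// Marking the class makes its symbols visible across the library boundary.
// Members are defined inline here only to keep the sketch in a single file.
class OBSERVER_API ObserverRegistry {
 public:
  void add() { ++count_; }
  int size() const { return count_; }

 private:
  int count_ = 0;
};
```

Without such an annotation, Visual Studio does not export the class from the DLL by default, which typically shows up as unresolved external symbols at link time.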
2018-05-31 10:22:21 -07:00
Martin Schatz
8baa563daf Change observer copy() method to take id parameter
This diff is added to support the ProfileObserver in order to differentiate operators in the stepnet properly. Since copy() is only used in the context of RNNs, the name has been changed to reflect that.
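
A minimal sketch of the idea, under stated assumptions: the method name `rnn_copy`, the parameter `rnn_order`, and the types below are hypothetical and stand in for whatever the diff actually introduced. The point is only that the copy carries an id, so copies attached to the same operator in different step nets can be told apart.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for an operator; not a Caffe2 type.
struct Operator {
  std::string type;
};

class ProfileObserver {
 public:
  // rnn_order < 0 means the observer is not attached inside an RNN step net.
  explicit ProfileObserver(const Operator* subject, int rnn_order = -1)
      : subject_(subject), rnn_order_(rnn_order) {}

  // Copies the observer onto an operator of a step net, tagging the copy
  // with the id passed in (assumed here to be the step's order).
  std::unique_ptr<ProfileObserver> rnn_copy(const Operator* subject,
                                            int rnn_order) const {
    return std::unique_ptr<ProfileObserver>(
        new ProfileObserver(subject, rnn_order));
  }

  // Label used when reporting; includes the id when one is set.
  std::string label() const {
    if (rnn_order_ < 0) {
      return subject_->type;
    }
    return subject_->type + "@step" + std::to_string(rnn_order_);
  }

 private:
  const Operator* subject_;
  int rnn_order_;
};
```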
2018-03-27 18:10:39 -07:00
Yangqing Jia
dd1564b061 Caffe2 module update: move observers as well as binaries. (#2145)
* Caffe2 module update: move observers as well as binaries.

* Add threads linkage

* Add Threads dependency to public interface
2018-03-06 14:45:21 -08:00