Commit Graph

5 Commits

Author SHA1 Message Date
Thomas Dudziak
60c78d6160 Fixes range/xrange for Python 3
Summary: As title
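
The commit message does not say how the range/xrange usage was fixed. One common approach, shown here only as a sketch, is to pick a lazy range type once based on the interpreter version. The names `lazy_range` and `squares` are hypothetical and do not come from the commit.

```python
import sys

# Python 2 has a lazy xrange and a list-building range;
# Python 3 only has range, which is already lazy.
if sys.version_info[0] >= 3:
    lazy_range = range
else:
    lazy_range = xrange  # noqa: F821 (only defined on Python 2)


def squares(n):
    """Hypothetical helper: iterate without building a large list on Python 2."""
    return [i * i for i in lazy_range(n)]
```

Code that only loops over small ranges can usually call `range` directly on both versions; the alias matters mainly when the range is large.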

Differential Revision: D5151894

fbshipit-source-id: 7badce5d3122e8f2526a7170fbdcf0d0b66e2638
2017-06-07 00:04:26 -07:00
Pieter Noordhuis
bbd7aee9ab Revert D4952993: [Caffe2] fix mkl_sparse and migrate sparsity experiments
Summary: This reverts commit 86c03676ab4e47f04d2d0dd438a4a1c849bbbff0

Differential Revision: D4952993

fbshipit-source-id: 5c213c48ac44ce6aefccacc6d80534648d3c516a
2017-05-17 14:46:56 -07:00
Yiming Wu
f359d70ae7 fix mkl_sparse and migrate sparsity experiments
Summary:
Migrate the experiments folder to the fb/sparse folder. Keep FunHashOp and SparseFunHashOp because they are now assumed to be default Ops in depr. What I did:

  # Migrate FunHashOp and SparseFunHashOp and their unit tests to core-caffe2; make sure the tests pass.
  # Migrate the other Ops in the experiments folder to the fb/sparse folder. Write new TARGETS files for them. Make sure the tests pass.
  # Make sure all related tests pass.
  # Fix the MKL definition as well. Make sure that FC_Sparse is not compiled when there is no MKL support.

Reviewed By: salexspb

Differential Revision: D4952993

fbshipit-source-id: 86c03676ab4e47f04d2d0dd438a4a1c849bbbff0
2017-05-16 18:33:51 -07:00
Aaron Markham
58f7f2b441 doxygen python block added
Summary: Closes https://github.com/caffe2/caffe2/pull/226

Differential Revision: D4793550

Pulled By: JoelMarcey

fbshipit-source-id: cc33e58186304fa8dcac2ee9115dcc271d785b1e
2017-03-29 06:46:16 -07:00
Aapo Kyrola
d38499f727 Optimize BlobIsDefined() + benchmark --> net construction 95 secs to 8.2 secs!
Summary:
I have noticed that constructing the Xray model takes quite a while. To measure this, I wrote a benchmark script that creates a resnet-50 model on 8 GPUs. This takes about 95 secs -- which is kind of annoying when you want to quickly debug stuff.

Profiling (using Python's cProfile), I was able to see that most of the time is spent in net.BlobIsDefined(), which does a linear search over external inputs and operator outputs. Thus it gets slower and slower with large nets. This can be fully optimized by keeping a separate lookup table of operator inputs and outputs (and external inputs and outputs). It is a bit annoying to keep this separate data structure, but I set up the unit tests to ensure things work correctly across Clones.
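
As a rough illustration of the profiling step, the sketch below uses the standard-library cProfile and pstats modules on a stand-in workload. `build_net` and `profile_build` are hypothetical names; the real benchmark script is not shown in this commit.

```python
import cProfile
import io
import pstats


def build_net(num_blobs):
    """Hypothetical workload: each membership check is a linear list scan."""
    blobs = []
    for i in range(num_blobs):
        name = "blob_%d" % i
        if name not in blobs:  # O(len(blobs)) per check
            blobs.append(name)
    return blobs


def profile_build(num_blobs=2000):
    """Profile build_net and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    build_net(num_blobs)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(5)
    return out.getvalue()
```

Calling `print(profile_build())` lists the functions that dominate the run, which is how a hot spot such as a repeated linear search would show up.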

After the optimization, the net construction drops from 95 secs to 8.2 secs!
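
The idea described above can be sketched as follows. This is a minimal illustration that assumes a simplified net holding only blob names; `TinyNet` and its methods are hypothetical and are not the Caffe2 `core.Net` API.

```python
class TinyNet:
    """Hypothetical stand-in for a net; tracks blob names only."""

    def __init__(self):
        self.external_inputs = []
        self.op_outputs = []   # one list of output names per operator
        self._defined = set()  # lookup table kept in sync with the lists above

    def add_external_input(self, name):
        self.external_inputs.append(name)
        self._defined.add(name)

    def add_op(self, outputs):
        self.op_outputs.append(list(outputs))
        self._defined.update(outputs)

    def blob_is_defined_linear(self, name):
        # Unoptimized version: scan every external input and operator output.
        if name in self.external_inputs:
            return True
        return any(name in outs for outs in self.op_outputs)

    def blob_is_defined(self, name):
        # Optimized version: average constant-time set membership.
        return name in self._defined

    def clone(self):
        # The lookup table must be copied along with the net contents.
        other = TinyNet()
        other.external_inputs = list(self.external_inputs)
        other.op_outputs = [list(outs) for outs in self.op_outputs]
        other._defined = set(self._defined)
        return other
```

The cost is that every mutation has to update the extra set, which is why the clone path needs its own test coverage.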

Reviewed By: azzolini

Differential Revision: D4288307

fbshipit-source-id: 0bb82c8bde9d86a2702b298f4aa706cba509346e
2016-12-15 12:01:30 -08:00