Commit Graph

5 Commits

Author SHA1 Message Date
Pieter Noordhuis
6729d81418 Specify which GPUs to use in resnet50 example
Summary:
TSIA

This change also fixes an undefined attribute error after running 20
iterations of the resnet50 example trainer.

Differential Revision: D4692794

fbshipit-source-id: b98efdfeb078c5ba89d2a86837f3c672e1eade5f
2017-03-12 22:33:15 -07:00
Jerry Pan
bde53f61af Caffe2: add scuba logging to benchmark
Summary: Caffe2: add scuba logging to benchmark

Differential Revision: D4667194

fbshipit-source-id: 8e9fca5517d7d40a6bc3e55cd00161e7482cd6f4
2017-03-09 16:32:47 -08:00
Aapo Kyrola
42279a610c use Pieter-MPI and fb.distributed
Summary:
Remove MPI and use fb.distributed rendezvous and Pieter's new Ops.

One can now pass a 'rendezvous' struct to data_parallel_model to initiate distributed SyncSGD. The provided rendezvous implementation uses the kv-store handler of fb.distributed to disseminate information about other hosts. We can easily add other rendezvous implementations, such as a file-based one, but that is a topic for another diff.
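As a rough illustration of the kv-store rendezvous idea described above, here is a minimal, self-contained sketch. It does not use Caffe2 or fb.distributed: `InMemoryKVStore`, `make_rendezvous`, `exchange_addresses`, and the dict keys (`kv_handler`, `shard_id`, `num_shards`) are all hypothetical names chosen for this example, and threads stand in for separate hosts.

```python
import threading


class InMemoryKVStore:
    """Hypothetical stand-in for a shared key-value store (not fb.distributed)."""

    def __init__(self):
        self._data = {}
        self._cond = threading.Condition()

    def set(self, key, value):
        # Publish a value and wake up any shard waiting for it.
        with self._cond:
            self._data[key] = value
            self._cond.notify_all()

    def wait_get(self, key, timeout=5.0):
        # Block until the key has been published, or raise on timeout.
        with self._cond:
            if not self._cond.wait_for(lambda: key in self._data, timeout):
                raise TimeoutError("key not published: %s" % key)
            return self._data[key]


def make_rendezvous(kv_store, shard_id, num_shards):
    # Assumed layout of the 'rendezvous' struct; the real one may differ.
    return {"kv_handler": kv_store, "shard_id": shard_id, "num_shards": num_shards}


def exchange_addresses(rendezvous, my_address):
    """Publish this shard's address, then collect every shard's address."""
    kv = rendezvous["kv_handler"]
    kv.set("addr/%d" % rendezvous["shard_id"], my_address)
    return [kv.wait_get("addr/%d" % i) for i in range(rendezvous["num_shards"])]


def run_demo(num_shards=3):
    # Each thread plays the role of one host taking part in the rendezvous.
    kv = InMemoryKVStore()
    results = [None] * num_shards

    def worker(shard_id):
        rdv = make_rendezvous(kv, shard_id, num_shards)
        results[shard_id] = exchange_addresses(rdv, "host%d:9000" % shard_id)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_shards)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for shard_id, peers in enumerate(run_demo()):
        print(shard_id, peers)
```

Once every shard has published its own address, each one ends up with the same ordered peer list, which is the information a collective such as an allreduce would need before SyncSGD can start. A file-based rendezvous could plausibly offer the same `set`/`wait_get` pair backed by a shared directory.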

Removing MPI also allowed simplifying the Xray startup scripts, which are included in this diff.

Once this is accepted, I will work on simple example code so others can use this as well. The Flow implementation will be a topic for next week.

Differential Revision: D4180012

fbshipit-source-id: 9e74f1fb43eaf7d4bb3e5ac6718d76bef2dfd731
2016-11-29 15:18:36 -08:00
Yangqing Jia
238ceab825 fbsync. TODO: check if build files need update. 2016-11-15 00:00:46 -08:00
Yangqing Jia
d1e9215184 fbsync 2016-10-07 13:08:53 -07:00