Commit Graph

154 Commits

Jerry Zhang
cd5adc7b5f Remove template parameter from Tensor (#13)
Summary:
Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13

Pull Request resolved: https://github.com/pytorch/translate/pull/166

Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125

Closes https://github.com/pytorch/pytorch/pull/9125

Use inheritance for polymorphism, and remove the template parameter
This change only updates the templating at call sites; the core implementations will change later

Previously, the Caffe2 Tensor class was fixed at compile time to bind to a particular device/context. With this change, we make it a runtime property (stored inside the tensor) while preserving the same semantics. For example, one still has to specify a device type in order to create a Tensor - there are no uninitialized tensors. More specifically, the changes are:

1. We added an extra *DeviceType* argument to most Tensor constructors, e.g. `Tensor(DeviceType type)`.
2. The semantics of the constructor `Tensor(const Tensor<SrcContext>& src, ContextForCopy* context)` have changed: the second context is passed in to enable calling the templated Copy function. Previously it could be in a different context than the source and target; now we enforce that, if provided, it has the same device type as `src`.
3. To preserve the 'get-or-construct' semantics of Blob, we added a specialized getter `Blob::GetMutableTensor` that verifies both that the Blob contains a Tensor and that it is of the correct type.
4. In particular, the Tensor type is no longer default-constructible (as we don't have unknown-device tensors), so some of the code handling STL containers needs to change.

Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s.

Reviewed By: xw285cornell

Differential Revision: D8121878

fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81
2018-07-26 10:25:23 -07:00
Kittipat Virochsiri
2b134c72e6 Add interface to provide blob types to shape&type inference (#9643)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9643

The current map interface assumes the float data type, which is not always correct.
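
A minimal Python sketch of how the extended interface might be used; the `blob_types` keyword and its exact form are assumptions based on the commit title, not stated in the original message:

```
from caffe2.python import core, workspace

net = core.Net("shape_inference_example")
net.Cast(["x"], ["y"], to=core.DataType.FLOAT)

# blob_dimensions seeds shapes for external inputs; blob_types (assumed name)
# lets inference start from a non-float input type such as INT64.
shapes, types = workspace.InferShapesAndTypes(
    [net],
    blob_dimensions={"x": [16, 4]},
    blob_types={"x": core.DataType.INT64},
)
print(shapes.get("y"), types.get("y"))
```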

Reviewed By: kennyhorror

Differential Revision: D8455784

fbshipit-source-id: b94a31267760f7f97c15aa4b03008affc347fd10
2018-07-24 11:58:05 -07:00
Yinghai Lu
45e5c17ecf ONNXIFI transform (#9569)
Summary:
Cut off the runnable subgraph and offload it to the ONNXIFI backend
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9569

Reviewed By: Maratyszcza

Differential Revision: D8930408

Pulled By: yinghai

fbshipit-source-id: 2b494f7f8dc10c00e58cf0fed5c4a9434be6155b
2018-07-20 15:09:59 -07:00
Kittipat Virochsiri
01581037dc Add workspace.RunPlanInBackground (#9637)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9637

Adds a method to run a plan in the background. The intended use is to run BlueWhale's data reading & preprocessing net in the background while the GPU is training.
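
A sketch of the intended usage in Python, assuming the method is exposed as `workspace.RunPlanInBackground` (the name comes from the commit title; the return value is not described here):

```
from caffe2.python import core, workspace

preprocess_plan = core.Plan("data_preprocessing")
# preprocess_plan.AddStep(...)  # steps that run the data reading/preprocessing net

# Kick off the plan on a background thread (assumed binding name).
workspace.RunPlanInBackground(preprocess_plan)

# Meanwhile, the training net keeps the GPU busy in the foreground, e.g.:
# workspace.RunNet("train_net")
```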

Reviewed By: MisterTea

Differential Revision: D8906439

fbshipit-source-id: b1c73ca7327e2d87a8f873924e05ab3d161a3f1e
2018-07-20 14:56:12 -07:00
Lin Li
0fe980c748 Memory usage measurement -- Caffe2 (#9017)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9017

Closes https://github.com/pytorch/pytorch/pull/9017

Added "get_blob_size_bytes" to "pybind_state.cc" in Caffe2 to expose the size of blob in bytes.

Reviewed By: kuttas

Differential Revision: D8685696

fbshipit-source-id: 9a9d38f207c8c59ef534217181e8ce1514617628
2018-07-17 16:40:23 -07:00
Gu, Jinghui
e8b8c3895e Enable Conv fusion optimizations in optimizeForIdeep (#9255)
Summary:
Enable Conv fusion for IDEEP in optimizeForIdeep, including Conv+ReLU, Conv+Sum, Conv+Sum+ReLU, and Conv+BN
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9255

Reviewed By: bddppq

Differential Revision: D8809030

Pulled By: yinghai

fbshipit-source-id: af30bad3b96cb965bd26a4dfa810370faec4bb88
2018-07-16 21:28:50 -07:00
Orion Reblitz-Richardson
9ec0a2aef4 fbshipit-source-id: ba600fcd2b5cefc7621357bdeb05e24cea02e5af 2018-06-27 04:50:56 -07:00
bddppq
f94ae3ba1d
Update from facebook (#7696)
* Fix handling of empty batches in SumReduceDimsOp

As titled

* Deferrable async_scheduling finishRun fix

Proper order of finishing run operations in deferrable_async_scheduling net

* Simplify exception handling in async_scheduling

Simplify exception handling; there is no need to busy wait, since the thread that processes the
last task can finish the run

* [C2]worker_coordinator_memorize_worker_ids

As titled. This is related to T28689868, where the number of blobs we want to create is equal to the number of worker ids

* Add unit test for nets with no type set

* Ignore total length argument in symbolic_pad_packed_sequence

1. There was a mistake in the code: total_length was added to the wrong symbolic function (pack_padded_sequence) instead of pad_packed_sequence.
2. There is no need to throw an exception if total_length is given, since it is only used to enable data_parallel training on multiple GPUs and has nothing to do with onnx export, so just ignore it. https://fburl.com/tk4gciqp

* Add support for MKLDNN to async_scheduling

Just add MKLDNN as a possible CPU option to async_scheduling's pool function

* [AuFL][ensemble] support branch output for prediction

This diff supports using predictions from different branches and thus enables model ensembling (not fully independent).

* Fix a bug in add_loss in layer_model_helper

As titled.

* Support lradaption for adam

1. lr adaption operator
2. apply to dense adam

* Perf tweaks for async_scheduling

Restore single pool option + remove unnecessary (no-ops) calls

* add quantization to SparseSimdAdagradOp

add a bunch of quantization signatures to SparseSimdAdagradOp, implementations to come next

* [sr] [codemod] Change all SR callsites to use new API

@allow-large-files

This diff refactors all callsites of SR to use the slightly changed API introduced in the diff below. Really what this means is that you need to include the correct header. Also if you were using `ClientFactory::newFactory` you need to not prefix it with `ClientFactory::`.

```
cd ~/fbsource/fbcode
find ./ -type f -exec sed -i -e 's:#include "servicerouter/client/cpp2/ClientFactory.h":#include "servicerouter/client/cpp2/ServiceRouter.h":' -e 's:#include <servicerouter/client/cpp2/ClientFactory.h>:#include <servicerouter/client/cpp2/ServiceRouter.h>:' -e 's/ClientFactory::newFactory(/newFactory(/g' {} \;
```

Also manually fixed spots that couldn't be done automatically (or broke because they depended on transitive includes).

* Back out "Fix handling of empty batches in SumReduceDimsOp"

Original commit changeset: 282da1730cc2. This commit is blocking the Github->fbcode sync, which really needs to get merged ASAP. D7881937, which this diff depends on, will be reverted in the sync D7990948, which causes this to break. The sync diff cannot be patched with this reversion because it must be landed against base revision 5c8c099, and D7881937 must not be included in the sync diff because it is breaking GPU tests that are not available in sandcastle: https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-cuda8.0-cudnn6-ubuntu16.04-test/3638/console for one example.

* Add the flow to support operator benchmark

1) generate model with the operator 2) upload to everstore 3) generate model spec into json file 4) start running the benchmark

* [tum][gpu] Connect DPM trainer with flow and unit tests

This diff:
- Fix some small bugs for Yiming's recent changes to parallelizer, so it suits real use cases.
- Add correct tags to the TUM code, so we can do data parallel transform
- pass extra info at instantiation time.
- add unit test for using DPM in TUM model

After this diff, we can do simple box, multi-gpu fully-sync trainer for TUM in Fblearner workflow, but may still need to do speed benchmarking.

* w/o normalized lradaption for adam dense only

The previous lr adaption includes a normalization step when performing the dot product operation. This is not exactly the same as what is proposed in the paper, so I added normalization as an option. Without it, the operator performs exactly what the paper proposed; with the option, we add the normalization step.

* [fb] Use SharedPromise in DeferrableAsyncSchedulingNet

This code simplifies DeferrableAsyncSchedulingNet by removing a condition variable, plus small fixes.

* [tum] implement cuda sparseLengthsMean and LengthsMean

as title

* Adding an optional parameter to allow use of protobufs in InferShapesAndTypes function.

Adding an optional parameter to allow use of protobufs in InferShapesAndTypes function.

* Move feature_to_index to FeatureSpec.feature_to_index

Move feature_to_index to FeatureSpec.feature_to_index to avoid overriding other fields

* [Caffe2] Rename bytes_moved to bytes_written

Just a rename in preparation for supporting bytes_read.

* [c2] fix ReduceFrontSumOp for empty case by setting 0

Otherwise, it may use the results from the last iteration when the batch is empty.

* [Caffe2] [Int8] Improve Intel CPU performance

* [Easy] Improve PrependDim op logging

as titled

* DBFileReader expand db_path using os.path.expanduser(..)

Since there are many possible use cases where `DBFileReader` reads from a path under the user's home directory, like `~/local/sample.db`, I want to save people the trouble of calling `os.path.expanduser(db_path)` themselves (see the sketch after this commit's bullet list).

* [Caffe2] Add bytes_read to cost structure

We're adding analytical read bytes to cost functions. This extends the structure accordingly for all operators with CostInference defined.
Additionally, some small bug fixes were performed:
1) Cost functions now extract the type information of operands instead of assuming float

* Fix sleef on aarch64 for hhvm

@bypass-lint

Rename flag

* Remove duplicated part in caffe2/ideep/operators/conv_op.cc

this was likely a sync error

* Rename test helper function test_adagrad_sparse_helper to adagrad_sparse_test_helper to avoid confusing pytest
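
A small sketch of the `DBFileReader` behavior described in the bullet above; the `db_type` value is illustrative:

```
from caffe2.python.db_file_reader import DBFileReader

# The reader now expands "~" itself, i.e. this is equivalent to passing
# os.path.expanduser("~/local/sample.db") as db_path.
reader = DBFileReader(db_path="~/local/sample.db", db_type="minidb")
```
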
2018-05-19 23:10:48 -07:00
Bram Wasti
b1fbf29b52
[caffe2][nomnigraph] Change the standard transform API to take in NNModule rather than NetDef (#7308) 2018-05-08 17:43:51 -07:00
Bram Wasti
3913e9ead3
[caffe2][nomnigraph] Batchnorm + Conv Fusion (#7057) 2018-05-08 15:40:34 -07:00
Yinghai Lu
e3935f7509
[Caffe2] Add conv+relu fusion for MKLDNN ops (IDEEP) (#7385)
* Add conv+relu fusion for MKLDNN ops (IDEEP)

* comments
2018-05-08 14:44:53 -07:00
bddppq
7b66c433bc
Use a CI specific onnx namespace to catch hardcoded ones in the code (#7369) 2018-05-08 13:40:55 -07:00
Bram Wasti
3642745ef9
[caffe2][nomnigraph] Add maxpool sink transform (#7207) 2018-05-07 14:52:10 -07:00
Yinghai Lu
e6ce1afe47
[Caffe2] Follow-up of onnx-trt API change (#7076)
* Follow-up of onnx-trt API change

* indent

* comments
2018-04-28 23:07:15 -07:00
Yinghai Lu
8b70f7d248
[Caffe2] Clean up ideep integration (#6881)
* Clean up ideep integration

* .

* Remove redundant code in convnet benchmark

* MKL ON

* Do not add -mavx2 everywhere

* .

* Comments

* rename

* .
2018-04-24 18:32:35 -07:00
James Reed
6e60edb799
[caffe2] Fix logic error in tensor filling ops in C++ ONNX backend (#6909) 2018-04-24 13:53:27 -07:00
Jinghui
26ddefbda1 [feature request] [Caffe2] Enable MKLDNN support for inference (#6699)
* Add operators based-on IDEEP interfaces

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Enable IDEEP as a caffe2 device

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Add test cases for IDEEP ops

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Add IDEEP as a caffe2 submodule

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Skip test cases if no IDEEP support

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Correct cmake options for IDEEP

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Add dependences on ideep libraries

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Fix issues in IDEEP conv ops, etc.

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Move ideep from caffe2/ideep to caffe2/contrib/ideep

Signed-off-by: Gu Jinghui <jinghui.gu@intel.com>

* Update IDEEP to fix cmake issue

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Fix cmake issue caused by USE_MKL option

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>

* Correct comments in MKL cmake file

Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>
2018-04-22 21:58:14 -07:00
Yinghai Lu
6252706feb
[Caffe2] Workspace centric API for TensorRT transformation (#6678)
* Workspace centric API for trt transformation

* Merge SSA rewrite code
2018-04-17 21:23:27 -07:00
Yinghai Lu
582d47e986
[Caffe2] Scoped dummy name generator (#6458)
* Scoped dummy name generator

* Fix

* Fix

* Use class variable

* Fix build

* comment
2018-04-16 11:58:02 -07:00
Bram Wasti
7bd398b3db
Add fuseNNPACKConvRelu (#6439) 2018-04-10 16:51:16 -07:00
Svetoslav Kolev
997acfd7fe [Caffe2] Some small changes to InferBlobShapesAndTypes definition and SameAsInput Schema (#6335)
* Change SameAsInput type deduction to work for ops with multiple outputs

* Change the InferBlobShapesAndTypes definition to take a vector of pointers instead of unique_ptr. The function doesn't own the objects, so there is no need to pass smart pointers; requiring them also prevents calling the function with an existing object, since the caller has to create a unique_ptr, i.e. copy an existing object just to create the pointer.

* Switch the order of std::move(unique_ptr) and unique_ptr.get()

* adding comma
2018-04-06 19:06:46 -07:00
Bram Wasti
ee64200c64 [nomnigraph] Expose transformations to python
Adding a python interface to the transformations
2018-03-30 21:00:44 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Yinghai Lu
b6e80a1ec4 Caffe2-onnx exporter (#2248)
* caffe2-onnx frontend

* Remove Python part of the conversion code

* nit

* convert more ops

* Address comments
2018-03-26 19:23:45 -07:00
Yinghai Lu
45da53f478 Remove Python onnx-caffe2 conversion code (#2362)
* WIP

* Remove Python onnx-caffe2 conversion code

* Fix build

* Comments

* Add comments

* Fix typo in comments
2018-03-22 11:59:03 -07:00
Yangqing Jia
2d03ae2f85 Move ParseProtobufFromLargeString to proto_utils (#2354)
* Move ParseProtobufFromLargeString to proto_utils

* ParseProtobuf -> ParseProto to be consistent in naming
2018-03-21 17:05:14 -07:00
Yinghai Lu
7e6693991d Onnx caffe2 backend (#2039)
* C++ version of ONNX->Caffe2 backend

* use namespace ONNX_NAMESPACE

* Fix Build

* Comments

* Change namespace from onnx_caffe2 to caffe2::onnx
2018-03-12 15:18:05 -07:00
Dmytro Dzhulgakov
9e71de398b [core] Graph-level NUMA awareness in Caffe2
Adding NUMA awareness through numa_node_id in DeviceOption. Blobs of operators with a numa_node_id are allocated on the corresponding memory banks, and operators are run using CPU pools with NUMA affinity set.
2018-03-06 00:33:11 -08:00
mdschatz
3c952426fb Add operator attaching net observer
Summary:
Commonly, net observers attach operator observers at construction. This diff separates the logic into a base class to inherit from.
Closes https://github.com/caffe2/caffe2/pull/1806

Reviewed By: salexspb

Differential Revision: D6808623

Pulled By: mdschatz

fbshipit-source-id: 75ef0eea913ef30943541c829c0a976965f42736
2018-01-29 14:34:34 -08:00
Ilia Cherniavskii
a7ac591d3b Support for DLPack in Python op
Summary: Adding support for DLPack tensors to Python op

Reviewed By: Yangqing

Differential Revision: D6577702

fbshipit-source-id: e14ef213fcdb2930ffe164667971a92aa8db503c
2017-12-21 17:02:16 -08:00
Peter Goldsborough
ce2a0aa4d8 Add slice and gather syntax
Summary:
Implemented syntactic sugar for the following constructs:

- `x.Gather(y)` can now be written as `x[y]`
- `x.Slice(start, end)` can now be written as `x[start:end]`

For slicing, `start` and/or `end` can be omitted iff `x` is one-dimensional (i.e. a vector). That is, `vector[start:]`, `vector[:end]` and `vector[:]` will work. Doesn't work for higher-dimensional tensors because to emit the start/end indices we need to know the rank of the tensor (since `Slice` requires one entry per dimension of the tensor).

Also added a `getProto()` function so that I could test that the generated code is as expected (i.e. that the syntactic sugar does not affect the structure of the output).

Reviewed By: zdevito

Differential Revision: D6605864

fbshipit-source-id: 786359713a13314c24be2fc07e01486c507404ef
2017-12-19 19:17:01 -08:00
Zachary DeVito
1c6595c8e8 Add function calls and externs
Summary:
Adds the ability for a script function to call another and adds the extern function to register an external Caffe2 Net that can be called by the script.
Closes https://github.com/caffe2/caffe2/pull/1591

Reviewed By: jamesr66a

Differential Revision: D6515877

Pulled By: zdevito

fbshipit-source-id: b893d9e4bacd7389b550ac8a37ad7974b95de749
2017-12-07 23:44:28 -08:00
Zachary DeVito
6811acbef9 Syntax for control flow in C2
Summary: Experimental code that allows you to write C2 NetDefs directly using python-like syntax. This includes the ability to write native control-flow (if, while) and have it turn into IfOp and WhileOp

Reviewed By: jamesr66a, dzhulgakov

Differential Revision: D6123298

fbshipit-source-id: 25fc078b5769be61ac7fb3aa9a7c95bd88dccc30
2017-11-29 16:47:45 -08:00
Andrew Dye
1ba3e14608 Throw Python exception from PythonOp instead of logging
Summary: Today, when PythonOp throws an exception, we log the error and fail the op. Later we assert that the op/net/plan succeeds and throw with a generic message, so the user must tail the logs to find the real error. Instead, align with exception handling from other ops - throw directly. This will include the full context of the exception in the error message.

Reviewed By: Yangqing, akyrola

Differential Revision: D6359684

fbshipit-source-id: 85133ba6562759607a3971449120647cbacce946
2017-11-20 09:03:17 -08:00
Qinqing Zheng
c77f0cb5e6 Attach observers to operators inside step net
Summary: Pass the list of observers to rnnExecutor_ and attach them to operators

Reviewed By: akyrola

Differential Revision: D6279655

fbshipit-source-id: 086dde1bf6edbfb36082d6b4de33ec41f0bbefab
2017-11-14 15:06:38 -08:00
Ilia Cherniavskii
1149b9bbb5 Polling async net executor
Summary:
Implementation of polling async net executor.
Notes:
- New net executor async_polling - schedules CPU and GPU ops asynchronously, uses single polling thread
- Events: update to Caffe2 events to support async CPU events, adding new methods:
  Query() - non-blocking check of event state: INITIALIZED -> RECORDED -> SUCCESS/FAILED
  ErrorMessage() - when an operation runs asynchronously and fails, calling this on the event returns the error message
- Tasks: using the existing DAGNet algorithm to compute CPU and GPU chains, with a separate task for each chain
- Polling: a single thread queries the state of events - for CPU tasks it atomically queries the task state, for GPU tasks it uses cudaEventQuery via Event
- Scheduling of CPU ops: using global thread pools
- Scheduling of GPU ops: using GPU thread pool per GPU device

Reviewed By: dzhulgakov

Differential Revision: D5985110

fbshipit-source-id: a9de7fcbb71d046a3aa1b573072b89a65dfeee8c
2017-11-03 07:27:44 -07:00
Bram Wasti
a0aa6d0e24 expose flop annotation to python
Summary: expose the flop annotation framework to python functions

Reviewed By: Maratyszcza, Yangqing

Differential Revision: D6135705

fbshipit-source-id: 2eed80b6cbda7b3ee3fe0e019a0f1fc4b0aa320b
2017-10-24 11:35:24 -07:00
Soumith Chintala
891f41c14b Upgrade to 2.2.1
Summary:
Update pybind11 from 1.8.1 to 2.2.1.
aarch64 platform updates pending.

Reviewed By: houseroad, kmatzen

Differential Revision: D6089712

fbshipit-source-id: 80ce09c381717f4317e2e698479ff604cf28c709
2017-10-22 13:26:56 -07:00
Junjie Bai
43b303bfc0 Expose Predictor::run_map to Python
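
A hedged Python sketch of the new binding; the `run_map` method name follows the commit title, and the dict-in/list-out shape of the call is an assumption:

```
from caffe2.python import workspace
import numpy as np

with open("init_net.pb", "rb") as f:
    init_net = f.read()
with open("predict_net.pb", "rb") as f:
    predict_net = f.read()

p = workspace.Predictor(init_net, predict_net)

# run_map takes a blob-name -> ndarray mapping instead of a positional list.
outputs = p.run_map({"data": np.random.rand(1, 3, 224, 224).astype(np.float32)})
```
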
Reviewed By: jerryzh168

Differential Revision: D6087316

fbshipit-source-id: d90e20429645391f17f0c56c8a8a60685097f801
2017-10-18 19:32:56 -07:00
Bram Wasti
7d16d320d5 expose observers to python, add multiple observers per observable
Summary: observer framework can now be used in python + a small writeup of how to use it.  this is D6035393 with a fix for ct-scan

Reviewed By: salexspb

Differential Revision: D6066380

fbshipit-source-id: 896c4c580d4387240b81ac2dbbc43db51d4bfeb9
2017-10-16 14:32:56 -07:00
Scott Yost
a7a81351f2 Revert D6035393: [caffe2] expose observers to python, add multiple observers per observable
Summary:
This reverts commit 4563cf0203095fa979bb2160621cd16dd22ff830

bypass-lint

Differential Revision: D6035393

fbshipit-source-id: 090fba774ce433904f7ef769dda75c2fbbf784a8
2017-10-14 21:47:34 -07:00
Bram Wasti
58fe66e337 expose observers to python, add multiple observers per observable
Summary: observer framework can now be used in python + a small writeup of how to use it

Reviewed By: sf-wind

Differential Revision: D6035393

fbshipit-source-id: 4563cf0203095fa979bb2160621cd16dd22ff830
2017-10-14 13:09:29 -07:00
Yangqing Jia
b1508e8e86 Revert D5905002: [caffe2] expose observers to python
Summary:
This reverts commit e40ec24a55e08fb73beea9b4f3b68e71fc66ffb1

bypass-lint

Differential Revision: D5905002

fbshipit-source-id: 4f1b79d9a318978f6b74565f633f34b9701a9d5c
2017-10-10 22:12:00 -07:00
Bram Wasti
63caca89db expose observers to python
Summary: observer framework can now be used in python + a small writeup of how to use it

Reviewed By: salexspb

Differential Revision: D5905002

fbshipit-source-id: e40ec24a55e08fb73beea9b4f3b68e71fc66ffb1
2017-10-10 16:10:41 -07:00
Junjie Bai
91bb6ce095 Allow explicitly specifying to use operators' default implementation
Reviewed By: dzhulgakov

Differential Revision: D5973635

fbshipit-source-id: 12dccc6332a8dd264ccc9f831a053a3be9b89c56
2017-10-04 12:17:36 -07:00
Dmytro Dzhulgakov
5527dd3b08 Expose CMake options in the binary
Summary:
Useful for figuring out with people which version they built with. We can just ask for the --caffe2_version gflag or get core.build_options from Python.

Also adds CMAKE_INSTALL_RPATH_USE_LINK_PATH - without it, it wasn't building on my Mac. How should it be tested?
Closes https://github.com/caffe2/caffe2/pull/1271

Reviewed By: bddppq

Differential Revision: D5940750

Pulled By: dzhulgakov

fbshipit-source-id: 45b4c94f67e79346a10a65b34f40fd258295dad1
2017-10-04 02:33:02 -07:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Jerry Zhang
23f4f78c22 Functional C2
Summary:
Support calling C2 operators as functions, e.g.
```
from caffe2.python.functional import Functional
Y = Functional.Relu(X)[0]
```
Supporting numpy arrays as input for now.

Reviewed By: bddppq

Differential Revision: D5791821

fbshipit-source-id: 7e936ad52b8b304c5e210248bd6649fd066cd909
2017-09-13 15:37:28 -07:00
Junjie Bai
90ca470d70 Standardize operator argument "is_test"
Summary:
Also add the ability to mark an argument as required.

Added a string constant `OpSchema::Arg_IsTest` for `is_test` arg.
If users define the `is_test` argument with `ArgIsTest(...)`, then it automatically becomes a required argument; meanwhile, users can still use `Arg("is_test", ...)` to define an optional `is_test` argument.

Reviewed By: akyrola

Differential Revision: D5812391

fbshipit-source-id: eaaba50d027813a8012389edc6c459de23c3c728
2017-09-13 14:35:27 -07:00
Junjie Bai
5748e7140f Strip Operator Schema in mobile build
Reviewed By: Yangqing

Differential Revision: D5677792

fbshipit-source-id: d29edb26a36b24a46821e13e2d77af0f21571fcd
2017-08-22 13:31:08 -07:00
Ben Zhang
cfbd116966 ApplyTransformIfFaster
Summary:
Implemented ApplyTransformIfFaster

Determines whether a transform makes the net faster, then returns whichever net is better.

Reviewed By: bwasti

Differential Revision: D5534535

fbshipit-source-id: 509943205b0c454bf30fb01343ac4e88d1441c39
2017-08-17 15:36:51 -07:00
Ahmed Taei
a0fe96d7cd Rewrite memonger DAG in C++.
Summary: This diff replaces the main part of the memonger-for-DAG algorithm, _compute_blob_recycling_for_dag, with a C++ implementation.

Reviewed By: akyrola

Differential Revision: D5544219

fbshipit-source-id: 9f868880c8d0eb997ad3dd39433f9d0b9216d303
2017-08-16 16:17:15 -07:00
Junjie Bai
1ce95090ca Add support for specifying engine preferences
Reviewed By: Yangqing

Differential Revision: D5460994

fbshipit-source-id: 08a8af699eebec37defc070389a8415b3e81ac16
2017-08-09 00:47:18 -07:00
Ben Zhang
6314c1fc15 Transforms in Python
Summary: Allow the use of apply_transform() in the python API
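
A sketch of what the Python entry point might look like; `workspace.ApplyTransform` and the "ConvToNNPack" transform key are assumptions used for illustration:

```
from caffe2.python import core, workspace

net = core.Net("transform_example")
net.Conv(["data", "w", "b"], ["conv_out"], kernel=3)
net.Relu(["conv_out"], ["conv_out"])

# Assumed wrapper around the C++ apply_transform(); returns a transformed NetDef.
transformed_proto = workspace.ApplyTransform("ConvToNNPack", net.Proto())
```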

Reviewed By: bwasti

Differential Revision: D5530483

fbshipit-source-id: 61a6d36fe125c89629fdeea040a717c453d84417
2017-08-01 16:51:38 -07:00
Yangqing Jia
de92dbe4bb MKL code move
Summary:
Nothing gets changed - this would allow us to more easily deal with build
systems. Also now everything that is MKL related lives under mkl/.

Reviewed By: dzhulgakov

Differential Revision: D5505157

fbshipit-source-id: ddb2e6ac290a146a7cb495da23bb0e5b5594bd2a
2017-07-26 20:21:55 -07:00
Yangqing Jia
f6afa6adbd Add proper cpuid support.
Summary:
This is needed for us to do more fine grained dispatch based on CPU arch, so
I figured we should just add it. Can help Dima and Misha doing optimization
I think?

Reviewed By: dzhulgakov

Differential Revision: D5477444

fbshipit-source-id: 48aaf8bd799e9755493cd51c793ceec080a8846c
2017-07-23 17:21:50 -07:00
Victor Gao
f7a92145d4 comment out unused parameter in pybind_state.cc
Summary:
This removes/comments out/silences one or more unused parameters in the files.
We are going to enable `-Wunused-parameter` in fbcode and this fixes a case that automated tooling can't handle.

This diff is automatically generated.
Reviewers are added heuristically.

Reviewed By: dzhulgakov

Differential Revision: D5437217

fbshipit-source-id: c2fc5ed30e7ee47b8c40248f89a9f4304ce7c098
2017-07-17 15:57:49 -07:00
Aapo Kyrola
ad62e82179 fast simple-net memonger for C++
Summary:
To be used with predictor "online": C++ version of memonger for simple nets. Very simple greedy algorithm. Works well at least on Resnet-50 inference graph: only 3 shared blobs are used.

Next I will integrate this with predictor and run canary (separate diff).

Reviewed By: asaadaldien

Differential Revision: D5375392

fbshipit-source-id: d36e419e39a32e568e105657c27fb00c85a2535d
2017-07-06 15:17:07 -07:00
Luke Yeager
fe9b0bfd27 Fix some typos
Summary: Closes https://github.com/caffe2/caffe2/pull/882

Differential Revision: D5341277

Pulled By: harouwu

fbshipit-source-id: bb5595c65c05ca7ea1a1d060d61d14fbfe008241
2017-06-28 13:50:48 -07:00
Alexander Sidorov
c8410859d9 Operator python stacktraces, attempt 2
Summary:
Last time, I used a uuid filled into OperatorDef, and operator_tracebacks was populated using traceback.extract_stack. There were several issues with this approach:

1. A random field in OperatorDef breaks workflows relying on memoization, i.e. when computation is skipped because the result was already computed before.
2. Adding one more field revealed that RNNs are not forward compatible with respect to new fields there. The prototxt format seems to not allow forward compatibility (thanks jamesr66a for the investigation!). For RNNs we need to switch to a more resilient approach. azzolini's proposed change to OperatorDef / NetDef would allow that by nesting NetDef directly inside OperatorDef without the need for extra serialization.
3. traceback.extract_stack is very slow when the executable is on a remote filesystem. It does one or more os.stat calls for each frame on the stack. In some cases this added up to 15 extra minutes of model construction time.

In this diff I use a different approach, which should fix all of the problems above.

1, 2: solved by not adding a new field at all. Instead I report the operator index with respect to the net it runs in. Thanks akyrola and dzhulgakov for the idea. The downside is that operator list manipulation breaks the logic, and separately created ops are not covered at all.
3: solved by operating on raw frames without using the traceback and inspect modules, which end up doing a lot of file system calls. See the function extract_stacktrace in core.py and the additional comments there.

Reviewed By: dzhulgakov

Differential Revision: D5286285

fbshipit-source-id: 626dd0f5f6b8b1d86bd6bf519078b122f43ddcaa
2017-06-25 19:32:58 -07:00
Thomas Dudziak
342de07231 Core unit test fixes for Python 3
Summary: As title

Differential Revision: D5291327

fbshipit-source-id: 7dd9279c53ba55d3422c31973ffcec5705787fdf
2017-06-23 13:22:16 -07:00
Alisson Gusatti Azzolini
7d482742fd Allow tasks/execution_steps to be cloned at runtime
Summary:
Advantages of cloning the tasks/execution_steps at runtime:
- Less complexity on the python side: no need to clone nets and add prefixes to blob names
- Faster start-up: we had cases of complex plans that took up to 30min to be created.
- Better isolation: each task cloned at runtime has its own child workspace, preventing false sharing of blobs.
- Opens up possibility for dynamic scheduling: Number of threads per task can be increased on the fly, at runtime.

Reviewed By: dzhulgakov

Differential Revision: D5100730

fbshipit-source-id: 71b83193b135da4e6eaf2536d8fc266528e1fdcc
2017-06-20 22:32:07 -07:00
Alexander Sidorov
83e6a0bec8 Revert uuid change to OperatorDef protobuf
Summary:
a few issues:

1. Randomization hurts memoization.
2. Even if we make it non-random, we can get key collisions when loading it back.
3. RNNs use prototxt for the step net, and apparently it's not forward compatible the way normal protobuf is.

I am thinking of a better, less invasive solution now.

Reviewed By: jamesr66a

Differential Revision: D5272118

fbshipit-source-id: ab577fad04fbfc632e1fceffa923377a0d3da1be
2017-06-19 16:47:31 -07:00
Alexander Sidorov
eebda50b79 Operator python traceback
Summary: This is going to show a Python Caffe2 user where a failed operator was created. The motivation for not putting this information right in the protobuf is to avoid making it too verbose and to keep the ability to read a net's protobufs after a simple print() call.

Reviewed By: jamesr66a

Differential Revision: D5226047

fbshipit-source-id: 7edfe850e05a2ec209577142aa3368664a57a108
2017-06-13 18:50:02 -07:00
Alisson Gusatti Azzolini
d3ec6e8f55 Run python op builder at op creation time
Summary:
This allows constructing a python op by passing a pickled "builder function call" as an argument to the op.
The builder function is called at PythonOp construction time and returns the function that will be called when the op is run.

This way we can drop the dependency on 'tokens', which didn't work properly for protobufs that get distributed to other processes. Now the PythonOp definition is self-contained: as long as the build dependencies are right, sharding the protobuf is enough to execute the net remotely.

Reviewed By: dzhulgakov

Differential Revision: D5080833

fbshipit-source-id: a5deaca5d3143024cdb121519689224e9dbec5ce
2017-06-13 16:29:22 -07:00
Thomas Dudziak
c7f5bf282b Revert py::bytes -> std::string
Summary: As title

Reviewed By: salexspb

Differential Revision: D5229338

fbshipit-source-id: 3bc9442c76061436db8f3217c1ba8edfd9581f8b
2017-06-12 14:11:37 -07:00
Yiming Wu
8cd208ad6f Infer input and output device from OperatorDef through OperatorSchema
Summary: Infer input and output device from OperatorDef through OperatorSchema. This is inspired by shape inference. With this feature, we can easily analyze device information for all blobs in the net in a generic way. It is really helpful for automatic cross-device execution.

Reviewed By: akyrola, dzhulgakov

Differential Revision: D5161065

fbshipit-source-id: ee656123112171a4ca00f2fb3f6940f32ddf3135
2017-06-05 23:47:33 -07:00
Ross Girshick
8e99824ce7 Allow subsets of gradient outputs / inputs in Python ops
Summary:
I'm using Python ops in a project and need corresponding Python gradient ops. For my use case, only a subset of the forward op outputs have gradients and only a subset of forward op inputs have gradients. However the current implementation of `GetPythonGradient` forces all grad inputs and outputs to exist. This diff allows one to specify that only a subset of grad inputs / outputs are used when constructing the Python op.

I'm not sure if this is up to caffe2 standards, so please push back on style and content as needed.

Reviewed By: dzhulgakov

Differential Revision: D4897004

fbshipit-source-id: 96fffe8634c51a49b6bce7339a46c6235f7d4bbd
2017-06-05 12:52:01 -07:00
Thomas Dudziak
3ccbf23132 String-related fixes for Python 3
Summary: This diff is one step towards enabling the Python 3 build by making it more diligent in its handling of strings.

Reviewed By: salexspb

Differential Revision: D4893083

fbshipit-source-id: 28b8adf3280e8d1f0a7dc9b0fee5ad53f2fada57
2017-05-26 16:04:32 -07:00
Dmytro Dzhulgakov
35eaf444c0 Quickly hack sparsenn_benchmarks to also do BenchmarkNet
Summary:
Makes benchmark a bit hacky, but it's a benchmark after all :)

Specifically ports functionality of proper BenchmarkNet run from the ads_benchmarks so that we can see training net perf.

Also adds --report_interval parameter to print stats more often when running in hogwild mode

kdub0 - hopefully if you have time you can integrate it properly with the Flow's workflow

harouwu -shouldn't conflict too much with your current diff

Reviewed By: rayleichen

Differential Revision: D5125183

fbshipit-source-id: 9c6f1663bc85e26d6609f0f2f23aa280731939db
2017-05-26 10:48:45 -07:00
Aapo Kyrola
658c337f41 Error status for Gloo ops, and handling in elastic dpm
Summary: Add a RandomFailureOp, and add handling of the status code to the elastic data parallel model

Reviewed By: andrewwdye

Differential Revision: D5065936

fbshipit-source-id: 24224f9ea414ee535c9e90cc28add5189354b0ef
2017-05-17 00:16:52 -07:00
Alisson Gusatti Azzolini
75bc9f5e77 Relax requirement on token uniqueness
Summary: Relax requirement on token uniqueness since a few use cases broke after the uniqueness requirement was added in a previous diff.

Reviewed By: kittipatv

Differential Revision: D5034132

fbshipit-source-id: 327eb065923e6ea152a360324316f81b7fb9564b
2017-05-09 19:36:00 -07:00
Alisson Gusatti Azzolini
bd8ed6641c Stabilize PythonOp token name
Summary: For distributed jobs, we were relying on the order the PythonOps were registered, which was very fragile.

Reviewed By: dzhulgakov

Differential Revision: D5016847

fbshipit-source-id: f5601467c5b0569d5e8a0efdd76abad0d703c5f5
2017-05-09 11:19:44 -07:00
Aapo Kyrola
5c52392229 opsify AccumulateInputGradients
Summary:
Part of a project to make all the gradient accumulation business into ops in RecurrentNetworkGradientOp; this adds the accumulateInputGradients ops.

Also added a way to mark operators as private so they don't appear in docs.

Reviewed By: salexspb

Differential Revision: D5006698

fbshipit-source-id: 226d7afb473290c8d0f936d2cc87640be3e06615
2017-05-05 09:13:39 -07:00
Yangqing Jia
cf317d1106 create_net: explicitly specify if one wants to overwrite the network.
Summary:
This is from discussion with dzhulgakov: as a step towards revisiting the
core.Net autonaming, we will first guard against accidental overwrites of
existing networks in the workspace.
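
As this surfaces in the Python workspace API, re-creating a net of the same name now has to be explicit; a minimal sketch (the `overwrite` keyword as exposed in workspace.CreateNet is assumed here):

```
from caffe2.python import core, workspace

net = core.Net("my_net")
workspace.CreateNet(net)

# A second CreateNet with the same name is only allowed when opted into.
workspace.CreateNet(net, overwrite=True)
```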

ajtulloch since we are doing Predictors in mobile, this should be safe right?

azzolini - I assume this would be safe, but would love to get your approval.

akyrola - would this hurt xray?

Reviewed By: dzhulgakov

Differential Revision: D4897725

fbshipit-source-id: aa41271927ad6671f07a53b9505283623f8c49e5
2017-04-17 21:46:53 -07:00
Dongsheng Fang
3c0dc06ac8 Add __builtin_cpu_supports function def in windows
Summary: Closes https://github.com/caffe2/caffe2/pull/253

Differential Revision: D4892628

Pulled By: Yangqing

fbshipit-source-id: 45d49121027454d9259c4a753438d8f0771cf042
2017-04-14 19:46:19 -07:00
Yangqing Jia
ca0c8e5b25 remove import_array() help and use import_array1
Summary:
TSIA. See

https://github.com/numpy/numpy/blob/master/numpy/core/code_generators/generate_numpy_api.py

Reviewed By: jamorton

Differential Revision: D4893002

fbshipit-source-id: 4b6bee1bdf8ae905e4c0952a3e8bbbacd4129a50
2017-04-14 19:46:19 -07:00
Fei Sun
e2323ad688 Add CAFFE_ENFORCE to protobuf parsing
Summary: Add CAFFE_ENFORCE to make sure the protobuf parsing is successful.

Reviewed By: salexspb

Differential Revision: D4843662

fbshipit-source-id: 20cab7180e6b0e5afb5e29ff3333591659e41f7a
2017-04-06 14:34:30 -07:00
Fei Sun
95657ea1e8 Protobuf is binary string. Use bytes instead.
Summary: Prepare for the Protobuf change.

Reviewed By: dzhulgakov

Differential Revision: D4784884

fbshipit-source-id: 86219eecefaf7637e70339437c9274c526ebd6fe
2017-03-28 19:03:23 -07:00
Alexander Sidorov
56f324d191 Added predictor bindings to python interface
Summary: from caffe2.python import workspace; p = workspace.Predictor(init_net, predict_net); outputs = p.run(inputs)

Reviewed By: Yangqing

Differential Revision: D4576793

fbshipit-source-id: b829bbcaf2e7c34dad85024177433207bd96a234
2017-03-15 11:17:54 -07:00
Kittipat Virochsiri
f0d78753ae Make ModelExporter.load_from_db() load to specific workspace
Summary: In the case of a distributed task, load_from_db() loads into the wrong workspace (when used inside a Python op). Pass the workspace to use explicitly so that it loads into the workspace the Python op is being run in.

Reviewed By: kennyhorror

Differential Revision: D4653692

fbshipit-source-id: 94585c012b05ee38b9ce5e8ef0efdd50aa41dd2b
2017-03-08 09:31:42 -08:00
Zachary Mirman
1c92e85dae Added editDistance helper to caffe2 operators
Summary: Added editDistance helper to caffe2 operators

Differential Revision: D4622152

fbshipit-source-id: 4d6246b8226c1283d5883edfaa27e8f7748fdc4c
2017-02-28 13:31:56 -08:00
Yangqing Jia
47b65b6d8d Add a create your own dataset tutorial
Summary:
bwasti - will follow up via email.
Closes https://github.com/caffe2/caffe2/pull/166

Differential Revision: D4596858

Pulled By: Yangqing

fbshipit-source-id: 6d088ccf1604e0dc9b94cbf0a75b51587e734d95
2017-02-22 03:31:47 -08:00
Yangqing Jia
8ca1b3baea import_array python3 compatibility
Summary: TSIA

Reviewed By: salexspb

Differential Revision: D4535571

fbshipit-source-id: 61ce724d4fc3c79fac551e8622a2d45cda67f80a
2017-02-09 10:08:13 -08:00
Andrew Dye
306fde233a Accept optional blob map for InferShapesAndTypes
Summary:
Shape inference allows Caffe2 to compute shapes of blobs without running a model. Update InferShapesAndTypes() to accept an optional blob:dimensions map so that external input blobs do not need to be part of the workspace.

InferShapesAndTypes() in workspace.py conditionally calls the ...from_workspace or ...from_map bindings. Note I favored a small amount of code duplication here for the sake of readability. InferShapesAndTypes() in operator.cc has been refactored into mirrored entry points, invoking a common helper.

Other minor changes to address linter warnings.

Reviewed By: dzhulgakov

Differential Revision: D4524873

fbshipit-source-id: 56f863b759c016d7f23523f06fda3aa5bba22357
2017-02-08 15:04:24 -08:00
Aapo Kyrola
6a03641cde Add num_iters to RunNet()
Summary:
Running RunNet() in python in a loop can be a performance issue if the python code is doing a lot of other processing, such as data input, because python's Global Interpreter Lock (GIL) will prevent RunNet() from being called. This can easily be fixed by making RunNet() run multiple iterations inside C++ land. (Another way to accomplish the same thing is to use Caffe2's "execution plans", but that requires more setup.)
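
A minimal sketch of running many iterations with a single call into C++; the keyword name `num_iter` is assumed from the commit title:

```
from caffe2.python import core, workspace
import numpy as np

workspace.FeedBlob("x", np.ones(10, dtype=np.float32))
net = core.Net("loop_example")
net.Scale(["x"], ["x"], scale=1.01)
workspace.CreateNet(net)

# Runs 1000 iterations inside C++ without returning to Python between runs.
workspace.RunNet(net.Proto().name, num_iter=1000)
```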

+ fixed timing reporting in my OC workflow
+ improved one error log in data_workers.py

Sorry for piggybacking those small changes, but landing diffs is currently slow...

Reviewed By: rpenggithub

Differential Revision: D4523575

fbshipit-source-id: 039a647576efad5dd9afda74df478ac22b43c103
2017-02-07 14:16:14 -08:00
Aapo Kyrola
dcefc74a0c Shape and Type Inference Part1
Summary:
This is a bit of a large diff, sorry about that. It includes basic shape and type inference functionality, based on YQ's Schema scaffolding. I added some helper functions to make it easier to write simple translations.

Bigger refactoring was needed for ConvPoolBase so that we could use the shape inference already there in the schema.

I annotated enough operators to be able to infer forward-pass of shapes for basic convnet, and added test for that. I intend to bootcamp some annotations and annotate enough to handle Resnets fully. Need to think about gradients, if they could be annotated in an easier way.

Only shapes are exposed to Python for now; types will follow later. Also, the inference is not yet called anywhere except the unit test.

Also I am not sure if everything is in the best location in the code, but shouldn't be hard to move stuff around.

Reviewed By: dzhulgakov

Differential Revision: D4436818

fbshipit-source-id: eebee5937ccc9ac09c245465302388a1fae6933c
2017-02-02 22:29:22 -08:00
Yangqing Jia
8553bd3f68 Ensure we are not using Eigen LGPL code, and build on raspbian.
Summary:
Turns out that building caffe2 on raspbian is a piece of cake - cmake is awesome.
Closes https://github.com/caffe2/caffe2/pull/112

Differential Revision: D4480985

Pulled By: Yangqing

fbshipit-source-id: 5dbe5e1e71d8680dea7a5ec8a9ce7fbe6aa5270a
2017-01-30 09:44:27 -08:00
Fei Sun
cc65cc64c8 Create function ParseProtobufFromLargeString to parse strings more than 64MB
Summary: Replace ParseFromString with ParseProtobufFromLargeString to get around the 64MB limit.

Reviewed By: Yangqing

Differential Revision: D4466226

fbshipit-source-id: b68a6efc76955db294ddb0d23bbaf03b69e4952a
2017-01-27 10:29:22 -08:00
Dmytro Dzhulgakov
864f561525 Make BlobDeserialization throw exceptions instead of returning bool
Summary: Makes it much nicer to spot errors, especially in iPython notebook.

Reviewed By: kennyhorror

Differential Revision: D4465726

fbshipit-source-id: c0adaf5168248a70987ff9d5dfce54a622ff2219
2017-01-26 09:44:19 -08:00
Ahmed Taei
9ad10959ee Enable large PlanDef protobuf message.
Summary:
Enable cases where PlanDef message is bigger than protobuf string decoding
limits.

Differential Revision: D4412736

fbshipit-source-id: 91ee02d7a8ab85b1c8169683a6c1dccd4c79be40
2017-01-13 09:29:29 -08:00
Bram Wasti
737000b166 Linter fix up to sync fbsource and github 2017-01-06 15:36:17 -08:00
Bram Wasti
3833dad5f6 manual sync of old never sync'd files 2017-01-06 15:28:45 -08:00
Yangqing Jia
5bfd6c4cd1 semicolon 2017-01-04 14:36:16 -08:00
Yangqing Jia
311ae2ba33 build file fix and avx2 on mac fix 2017-01-04 14:35:15 -08:00
bwasti
9ce23cbb71 Fix false positive for non-clang compilers. 2016-12-29 11:39:50 -08:00
Bram Wasti
b48f1ff810 OS X build 2016-12-29 12:25:53 -05:00
Dmytro Dzhulgakov
119b687994 Allow PythonOp to access the workspace
Summary:
DPER has very strange python ops that play with Workspace - they are somewhat similar to LoadOp/SaveOp, so I guess the semantics are fine.

Thus it makes sense to allow python operators to receive a workspace pointer, similarly to regular Operators.

I didn't figure out a better way to implement the optional argument than just checking the number of args the function receives on the python side.
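
A sketch of the resulting Python-op signature; the arity-based dispatch is from the summary above, while the tensor-manipulation details and op names are illustrative:

```
from caffe2.python import core, workspace
import numpy as np

def double_input(inputs, outputs, ws):
    # Per this change, the optional third argument receives the workspace;
    # PythonOp checks how many arguments the function takes and only passes
    # the workspace pointer when the function accepts it.
    data = inputs[0].data
    outputs[0].reshape(data.shape)
    outputs[0].data[...] = data * 2

net = core.Net("pyop_example")
net.Python(double_input)(["x"], ["y"])

workspace.FeedBlob("x", np.ones(3, dtype=np.float32))
workspace.RunNetOnce(net)
print(workspace.FetchBlob("y"))
```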

Reviewed By: ajtulloch

Differential Revision: D4242943

fbshipit-source-id: d97d4227815b741c8f884cfe254b06d2b56b5a41
2016-12-05 11:53:26 -08:00
Yangqing Jia
0e298ec399 Expose MKLMemory to the Python Feed and Fetch interface, and misc changes
Summary:
This is #2 of a series of changes. It did the following:

(1) a few refactorings of the MKL memory interface
(2) an initial MKLContext to deal with MKL-specific computations
(3) Provide MKLMemory access in Python via the blob feeder/fetcher registration.

Reviewed By: dzhulgakov

Differential Revision: D4210123

fbshipit-source-id: adea1f1ffbd0b9ffdd55092676468c16bec08992
2016-11-29 15:18:36 -08:00
Yangqing Jia
589398950f fbsync at f5a877 2016-11-18 15:41:06 -08:00