pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Jerry Zhang	cd5adc7b5f	Remove template parameter from Tensor (#13 ) Summary: Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13 Pull Request resolved: https://github.com/pytorch/translate/pull/166 Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125 Closes https://github.com/pytorch/pytorch/pull/9125 Use inheritance for polymorphism, and remove template parameter This is to change the templating in call sites, the core implementations will change later Before Caffe2 Tensor class was compile-time fixed to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor), but preserve the same semantics. For example, one has to specify device type in order to create a Tensor - there are no uninitialized tensors. More specifically the changes are: 1. We added an extra argument DeviceType to most of the constructors of the tensor, e.g. (Tensor(DeviceType type)), 2. Semantics of constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context); is changed, in this constructor, the second context is passed in to enable us to call the templated Copy function, it could be in a different context as source and target previously, now we'll enforce that the context should have same device type as src, if it is provided. 3. To preserve 'get-or-construct' semantics of Blob, we added specialized getter Blob::GetMutableTensor that verifies both that Blob contains a Tensor and that it's of a correct type 4. Specifically, Tensor type is not default-constructible any more (as we don't have unknown device tensors) and thus some of the code handling STL containers needs to change Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s. Reviewed By: xw285cornell Differential Revision: D8121878 fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81	2018-07-26 10:25:23 -07:00
Kittipat Virochsiri	2b134c72e6	Add interface to provide blob types to shape&type inference (#9643 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9643 Current map interface assumes float data type, which is not always correct. Reviewed By: kennyhorror Differential Revision: D8455784 fbshipit-source-id: b94a31267760f7f97c15aa4b03008affc347fd10	2018-07-24 11:58:05 -07:00
Yinghai Lu	45e5c17ecf	ONNXIFI transform (#9569 ) Summary: Cut-off runnable subgraph and off-load to ONNXIFI backend Pull Request resolved: https://github.com/pytorch/pytorch/pull/9569 Reviewed By: Maratyszcza Differential Revision: D8930408 Pulled By: yinghai fbshipit-source-id: 2b494f7f8dc10c00e58cf0fed5c4a9434be6155b	2018-07-20 15:09:59 -07:00
Kittipat Virochsiri	01581037dc	Add workspace.RunPlanInBackground (#9637 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9637 Adding a method to run plan in background. The intended use is to run BlueWhale's data reading & preprocessing net in background while the GPU is training. Reviewed By: MisterTea Differential Revision: D8906439 fbshipit-source-id: b1c73ca7327e2d87a8f873924e05ab3d161a3f1e	2018-07-20 14:56:12 -07:00
Lin Li	0fe980c748	Memory usage measurement -- Caffe2 (#9017 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9017 Closes https://github.com/pytorch/pytorch/pull/9017 Added "get_blob_size_bytes" to "pybind_state.cc" in Caffe2 to expose the size of blob in bytes. Reviewed By: kuttas Differential Revision: D8685696 fbshipit-source-id: 9a9d38f207c8c59ef534217181e8ce1514617628	2018-07-17 16:40:23 -07:00
Gu, Jinghui	e8b8c3895e	Enable Conv fusion optimizations in optimizeForIdeep (#9255 ) Summary: Enable fusion for IDEEP in optimizeForIdeep including Conv+ReLU, Conv+Sum, Conv+Sum+ReLU, Conv+BN Pull Request resolved: https://github.com/pytorch/pytorch/pull/9255 Reviewed By: bddppq Differential Revision: D8809030 Pulled By: yinghai fbshipit-source-id: af30bad3b96cb965bd26a4dfa810370faec4bb88	2018-07-16 21:28:50 -07:00
Orion Reblitz-Richardson	9ec0a2aef4	fbshipit-source-id: ba600fcd2b5cefc7621357bdeb05e24cea02e5af	2018-06-27 04:50:56 -07:00
bddppq	f94ae3ba1d	Update from facebook (#7696 ) * Fix handling of empty batches in SumReduceDimsOp As titled * Deferrable async_scheduling finishRun fix Proper order of finishing run operations in deferrable_async_scheduling net * Simplify exception handling in async_scheduling Simplify exception handling, no need to busy wait, thread that processes the last task can finish the run * [C2]worker_coordinator_memorize_worker_ids As titled. This is related to T28689868, where the number of blobs we want to create is equal to the number of worker ids * Add unit test for nets with no type set * Ignore total length argument in sympolic_pad_packed_sequence 1- There was a mistake in the code that total_length was added to the wrong symbolic function (pack_padded_sequence) instead of (pad_packed_sequence) 2- No need to throw an exception if total_length is given since it is only used to enable data_parallel training on multi-gpus and doesn't have anything to do with onnx export, so just ignore it. https://fburl.com/tk4gciqp * Add support for MKLDNN to async_scheduling Just add MKLDNN as a possible CPU option to async_scheduling's pool function * [AuFL][ensemble] support branch output for prediction This diff supports using predictions from different branches and thus enables model ensembling (not fully independent). * Fix a bug in add_loss in layer_model_helper As titled. * Support lradaption for adam 1.lr adaption operator 2.apply to dense adam * Perf tweaks for async_scheduling Restore single pool option + remove unnecessary (no-ops) calls * add quantization to SparseSimdAdagradOp add a bunch of quantization signatures to SparseSimdAdagradOp, implementations to come next * [sr] [codemod] Change all SR callsites to use new API @allow-large-files This diff refactors all callsites of SR to use the slightly changed API introduced in the diff below. Really what this means is that you need to include the correct header. 
Also if you were using `ClientFactory::newFactory` you need to not prefix it with `ClientFactory::`. ``` cd ~/fbsource/fbcode find ./ -type f -exec sed -i -e 's:#include "servicerouter/client/cpp2/ClientFactory.h":#include "servicerouter/client/cpp2/ServiceRouter.h":' -e 's:#include <servicerouter/client/cpp2/ClientFactory.h>:#include <servicerouter/client/cpp2/ServiceRouter.h>:' -e 's/ClientFactory::newFactory(/newFactory(/g' {} \; ``` Also manually fixed spots that couldn't be done automatically (or broke because they depended on transitive includes). * Back out "Fix handling of empty batches in SumReduceDimsOp" Original commit changeset: 282da1730cc2 This commit is blocking the Github->fbcode sync, which really needs to get merged ASAP. D7881937 which this diff depends on will be reverted in the sync D7990948 which causes this to break. The sync diff cannot be patched with this reversion because it must be landed against base revision 5c8c099 , and D7881937 must not be included in the sync diff because it is breaking GPU tests that are not available in sandcastle : https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-cuda8.0-cudnn6-ubuntu16.04-test/3638/console for one example. * Add the flow to support operator benchmark 1) generate model with the operator 2) upload to everstore 3) generate model spec into json file 4) start running the benchmark * [tum][gpu] Connect DPM trainer with flow and unit tests This diff: - Fix some small bugs for Yiming's recent changes to parallelizer, so it suits real use cases. - Add correct tags to the TUM code, so we can do data parallel transform - pass extra info when instantiation. - add unit test for using DPM in TUM model After this diff, we can do simple box, multi-gpu fully-sync trainer for TUM in Fblearner workflow, but may still need to do speed benchmarking. * w/o normalized lradaption for adam dense only The previous lr adaption includes a normalization step when performing the dot product operation. 
This is not exactly same as what is proposed in the paper. I add normalization as an option. Without it, the operator performs exactly what the paper proposed. With the option, we add the normalization step * [fb] Use SharedPromise in DeferrableAsyncSchedulingNet This code is to simplify DeferrableAsyncSchedulingNet by removing condition variable + small fixes * [tum] implement cuda sparseLengthsMean and LengthsMean as title * Adding an optional parameter to allow use of protobufs in InferShapesAndTypes function. Adding an optional parameter to allow use of protobufs in InferShapesAndTypes function. * Move feature_to_index to FeatureSpec.feature_to_index move feature_to_index to FeatureSpec.feature_to_index to avoid override other fields * [Caffe2] Rename bytes_moved to bytes_written Just a rename in preparation for supporting bytes_read. * [c2] fix ReduceFrontSumOp for empty case by setting 0 otherwise, it may use the results from last iteration when it's empty batch. * [Caffe2] [Int8] Improve Intel CPU performance * [Easy] Improve PrependDim op logging as titled * DBFileReader expand db_path using os.path.expanduser(..) Since there are a lot of possible use cases of `DBFileReader` to read from user home path, like `~/local/sample.db`, I want to save people's trouble of calling `os.path.expanduser(db_path)` themselves. * [Caffe2] Add bytes_read to cost structure We're adding analytical read bytes to cost functions. This extends the structure accordingly for all CostInference defined operators. Additionally, some small bug fixes were performed: 1) Cost functions now extract type information of operands instead of assuming float * Fix sleef on aarch64 for hhvm @bypass-lint Rename flag * Remove duplicated part in caffe2/ideep/operators/conv_op.cc should be sync error * Rename test helper function test_adagrad_sparse_helper to adagrad_sparse_test_helper to avoid confusing pytest	2018-05-19 23:10:48 -07:00
Bram Wasti	b1fbf29b52	[caffe2][nomnigraph] Change the standard transform API to take in NNModule rather than NetDef (#7308 )	2018-05-08 17:43:51 -07:00
Bram Wasti	3913e9ead3	[caffe2][nomnigraph] Batchnorm + Conv Fusion (#7057 )	2018-05-08 15:40:34 -07:00
Yinghai Lu	e3935f7509	[Caffe2] Add conv+relu fusion for MKLDNN ops (IDEEP) (#7385 ) * Add conv+relu fusion for MKLDNN ops (IDEEP) * comments	2018-05-08 14:44:53 -07:00
bddppq	7b66c433bc	Use a CI specific onnx namespace to catch hardcoded ones in the code (#7369 )	2018-05-08 13:40:55 -07:00
Bram Wasti	3642745ef9	[caffe2][nomnigraph] Add maxpool sink transform (#7207 )	2018-05-07 14:52:10 -07:00
Yinghai Lu	e6ce1afe47	[Caffe2] Follow-up of onnx-trt API change (#7076 ) * Follow-up of onnx-trt API change * indent * comments	2018-04-28 23:07:15 -07:00
Yinghai Lu	8b70f7d248	[Caffe2] Clean up ideep integration (#6881 ) * Clean up ideep integration * . * Remove redundant code in convnet benchmark * MKL ON * Do not add -mavx2 everywhere * . * Comments * rename * .	2018-04-24 18:32:35 -07:00
James Reed	6e60edb799	[caffe2] Fix logic error in tensor filling ops in C++ ONNX backend (#6909 )	2018-04-24 13:53:27 -07:00
Jinghui	26ddefbda1	[feature request] [Caffe2] Enable MKLDNN support for inference (#6699 ) * Add operators based-on IDEEP interfaces Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Enable IDEEP as a caffe2 device Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add test cases for IDEEP ops Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add IDEEP as a caffe2 submodule Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Skip test cases if no IDEEP support Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Correct cmake options for IDEEP Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Add dependences on ideep libraries Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fix issues in IDEEP conv ops and etc. Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Move ideep from caffe2/ideep to caffe2/contrib/ideep Signed-off-by: Gu Jinghui <jinghui.gu@intel.com> * Update IDEEP to fix cmake issue Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Fix cmake issue caused by USE_MKL option Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com> * Correct comments in MKL cmake file Signed-off-by: Gu, Jinghui <jinghui.gu@intel.com>	2018-04-22 21:58:14 -07:00
Yinghai Lu	6252706feb	[Caffe2] Workspace centric API for TensorRT transformation (#6678 ) * Workspace centric API for trt transformation * Merge SSA rewrite code	2018-04-17 21:23:27 -07:00
Yinghai Lu	582d47e986	[Caffe2] Scoped dummy name generator (#6458 ) * Scoped dummy name generator * Fix * Fix * Use class variable * Fix build * comment	2018-04-16 11:58:02 -07:00
Bram Wasti	7bd398b3db	Add fuseNNPACKConvRelu (#6439 )	2018-04-10 16:51:16 -07:00
Svetoslav Kolev	997acfd7fe	[Caffe2] Some small changes to InferBlobShapesAndTypes definition and SameAsInput Schema (#6335 ) * Change Same as input type deduction to work for ops with multiple outputs * change InferBlobShapesAndTypes definition to take vector of pointers instead of unique_ptr. The function doesn't own the objects, so no need to pass smart pointers and that prevents calling the function with existing object, since the caller has to create unique_ptr, i.e. copy an existing object just to create the pointer * switching order of std::move<unique_ptr> and unique_ptr.get * adding comma	2018-04-06 19:06:46 -07:00
Bram Wasti	ee64200c64	[nomnigraph] Expose transformations to python Adding a python interface to the transformations	2018-03-30 21:00:44 -07:00
Orion Reblitz-Richardson	1d5780d42c	Remove Apache headers from source. * LICENSE file contains details, so removing from individual source files.	2018-03-27 13:10:18 -07:00
Yinghai Lu	b6e80a1ec4	Caffe2-onnx exporter (#2248 ) * caffe2-onnx frontend * Remove Python part of the conversion code * nit * convert more ops * Address comments	2018-03-26 19:23:45 -07:00
Yinghai Lu	45da53f478	Remove Python onnx-caffe2 conversion code (#2362 ) * WIP * Remove Python onnx-caffe2 conversion code * Fix build * Comments * Add comments * Fix typo in comments	2018-03-22 11:59:03 -07:00
Yangqing Jia	2d03ae2f85	Move ParseProtobufFromLargeString to proto_utils (#2354 ) * Move ParseProtobufFromLargeString to proto_utils * ParseProtobuf -> ParseProto to be consistent in naming	2018-03-21 17:05:14 -07:00
Yinghai Lu	7e6693991d	Onnx caffe2 backend (#2039 ) * C++ version of ONNX->Caffe2 backend * use namespace ONNX_NAMESPACE * Fix Build * Comments * Change namespace from onnx_caffe2 to caffe2::onnx	2018-03-12 15:18:05 -07:00
Dmytro Dzhulgakov	9e71de398b	[core] Graph-level NUMA awareness in Caffe2 Adding NUMA awareness through numa_node_id in DeviceOption. Blobs of operators with numa_node_id are allocated on corresponding memory banks, using CPU pools with NUMA affinity set to run operators.	2018-03-06 00:33:11 -08:00
mdschatz	3c952426fb	Add operator attaching net observer Summary: Commonly, net observers attach operator observers at construction. This diff separates the logic into a base class to inherit from. Closes https://github.com/caffe2/caffe2/pull/1806 Reviewed By: salexspb Differential Revision: D6808623 Pulled By: mdschatz fbshipit-source-id: 75ef0eea913ef30943541c829c0a976965f42736	2018-01-29 14:34:34 -08:00
Ilia Cherniavskii	a7ac591d3b	Support for DLPack in Python op Summary: Adding support for DLPack tensors to Python op Reviewed By: Yangqing Differential Revision: D6577702 fbshipit-source-id: e14ef213fcdb2930ffe164667971a92aa8db503c	2017-12-21 17:02:16 -08:00
Peter Goldsborough	ce2a0aa4d8	Add slice and gather syntax Summary: Implemented syntactic sugar for the following constructs: - `x.Gather(y)` can now be written as `x[y]` - `x.Slice(start, end)` can now be written as `x[start:end]` For slicing, `start` and/or `end` can be omitted iff `x` is one-dimensional (i.e. a vector). That is, `vector[start:]`, `vector[:end]` and `vector[:]` will work. Doesn't work for higher-dimensional tensors because to emit the start/end indices we need to know the rank of the tensor (since `Slice` requires one entry per dimension of the tensor). Also added a `getProto()` function so that I could test that the generated code is as expected (i.e. that the syntactic sugar does not affect the structure of the output). Reviewed By: zdevito Differential Revision: D6605864 fbshipit-source-id: 786359713a13314c24be2fc07e01486c507404ef	2017-12-19 19:17:01 -08:00
Zachary DeVito	1c6595c8e8	Add function calls and externs Summary: Adds the ability for a script function to call another and adds the extern function to register an external Caffe2 Net that can be called by the script. Closes https://github.com/caffe2/caffe2/pull/1591 Reviewed By: jamesr66a Differential Revision: D6515877 Pulled By: zdevito fbshipit-source-id: b893d9e4bacd7389b550ac8a37ad7974b95de749	2017-12-07 23:44:28 -08:00
Zachary DeVito	6811acbef9	Syntax for control flow in C2 Summary: Experimental code that allows you to write C2 NetDefs directly using python-like syntax. This includes the ability to write native control-flow (if, while) and have it turn into IfOp and WhileOp Reviewed By: jamesr66a, dzhulgakov Differential Revision: D6123298 fbshipit-source-id: 25fc078b5769be61ac7fb3aa9a7c95bd88dccc30	2017-11-29 16:47:45 -08:00
Andrew Dye	1ba3e14608	Throw Python exception from PythonOp instead of logging Summary: Today when PythonOp throws an exception, we log the error and fail the op. Later we assert that the op/net/plan succeeds and throw with a generic message. The user must tail the logs to find the real error. Instead, align with exception handling from other ops - throw directly. This will include full context of the exception in the error message. Reviewed By: Yangqing, akyrola Differential Revision: D6359684 fbshipit-source-id: 85133ba6562759607a3971449120647cbacce946	2017-11-20 09:03:17 -08:00
Qinqing Zheng	c77f0cb5e6	Attach observers to operators inside step net Summary: Pass the list of observers to rnnExecutor_ and attach them to operators Reviewed By: akyrola Differential Revision: D6279655 fbshipit-source-id: 086dde1bf6edbfb36082d6b4de33ec41f0bbefab	2017-11-14 15:06:38 -08:00
Ilia Cherniavskii	1149b9bbb5	Polling async net executor Summary: Implementation of polling async net executor. Notes: - New net executor async_polling - schedules CPU and GPU ops asynchronously, uses single polling thread - Events: update to Caffe2 events to support async CPU events, adding new methods: Query() - non-blocking checking of event states: INITIALIZED -> RECORDED -> SUCCESS/FAILED ErrorMessage() - when operation runs asynchronously and fails calling this on event will give error message - Tasks: using existing DAGNet's algorithm to compute CPU and GPU chains, a separate task for each chain - Polling: using single thread to query state of events - for CPU tasks atomically queries task state, for GPU task - uses cudaEventQuery; using Event - Scheduling of CPU ops: using global thread pools - Scheduling of GPU ops: using GPU thread pool per GPU device Reviewed By: dzhulgakov Differential Revision: D5985110 fbshipit-source-id: a9de7fcbb71d046a3aa1b573072b89a65dfeee8c	2017-11-03 07:27:44 -07:00
Bram Wasti	a0aa6d0e24	expose flop annotation to python Summary: expose the flop annotation framework to python functions Reviewed By: Maratyszcza, Yangqing Differential Revision: D6135705 fbshipit-source-id: 2eed80b6cbda7b3ee3fe0e019a0f1fc4b0aa320b	2017-10-24 11:35:24 -07:00
Soumith Chintala	891f41c14b	Upgrade to 2.2.1 Summary: Update pybind from 1.8.1 to 2.2.1 aarch64 platform updates pending. Reviewed By: houseroad, kmatzen Differential Revision: D6089712 fbshipit-source-id: 80ce09c381717f4317e2e698479ff604cf28c709	2017-10-22 13:26:56 -07:00
Junjie Bai	43b303bfc0	Expose Predictor::run_map to Python Reviewed By: jerryzh168 Differential Revision: D6087316 fbshipit-source-id: d90e20429645391f17f0c56c8a8a60685097f801	2017-10-18 19:32:56 -07:00
Bram Wasti	7d16d320d5	expose observers to python, add multiple observers per observable Summary: observer framework can now be used in python + a small writeup of how to use it. this is D6035393 with a fix for ct-scan Reviewed By: salexspb Differential Revision: D6066380 fbshipit-source-id: 896c4c580d4387240b81ac2dbbc43db51d4bfeb9	2017-10-16 14:32:56 -07:00
Scott Yost	a7a81351f2	Revert D6035393: [caffe2] expose observers to python, add multiple observers per observable Summary: This reverts commit 4563cf0203095fa979bb2160621cd16dd22ff830 bypass-lint Differential Revision: D6035393 fbshipit-source-id: 090fba774ce433904f7ef769dda75c2fbbf784a8	2017-10-14 21:47:34 -07:00
Bram Wasti	58fe66e337	expose observers to python, add multiple observers per observable Summary: observer framework can now be used in python + a small writeup of how to use it Reviewed By: sf-wind Differential Revision: D6035393 fbshipit-source-id: 4563cf0203095fa979bb2160621cd16dd22ff830	2017-10-14 13:09:29 -07:00
Yangqing Jia	b1508e8e86	Revert D5905002: [caffe2] expose observers to python Summary: This reverts commit e40ec24a55e08fb73beea9b4f3b68e71fc66ffb1 bypass-lint Differential Revision: D5905002 fbshipit-source-id: 4f1b79d9a318978f6b74565f633f34b9701a9d5c	2017-10-10 22:12:00 -07:00
Bram Wasti	63caca89db	expose observers to python Summary: observer framework can now be used in python + a small writeup of how to use it Reviewed By: salexspb Differential Revision: D5905002 fbshipit-source-id: e40ec24a55e08fb73beea9b4f3b68e71fc66ffb1	2017-10-10 16:10:41 -07:00
Junjie Bai	91bb6ce095	Allow explicitly specifying to use operators' default implementation Reviewed By: dzhulgakov Differential Revision: D5973635 fbshipit-source-id: 12dccc6332a8dd264ccc9f831a053a3be9b89c56	2017-10-04 12:17:36 -07:00
Dmytro Dzhulgakov	5527dd3b08	Expose CMake options in the binary Summary: Useful for figuring out with people which version they built with. We can just ask for --caffe2_version gflag or get core.build_options from python. Also adds CMAKE_INSTALL_RPATH_USE_LINK_PATH - without it wasn't building on my Mac. How should it be tested? Closes https://github.com/caffe2/caffe2/pull/1271 Reviewed By: bddppq Differential Revision: D5940750 Pulled By: dzhulgakov fbshipit-source-id: 45b4c94f67e79346a10a65b34f40fd258295dad1	2017-10-04 02:33:02 -07:00
Yangqing Jia	8286ce1e3a	Re-license to Apache Summary: Closes https://github.com/caffe2/caffe2/pull/1260 Differential Revision: D5906739 Pulled By: Yangqing fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902	2017-09-28 16:22:00 -07:00
Jerry Zhang	23f4f78c22	Functional C2 Summary: Supporting calling C2 operators as functions, e.g. ``` from caffe2.python.functional import Functional Y = Functional.Relu(X)[0] ``` Supporting numpy arrays as input for now. Reviewed By: bddppq Differential Revision: D5791821 fbshipit-source-id: 7e936ad52b8b304c5e210248bd6649fd066cd909	2017-09-13 15:37:28 -07:00
Junjie Bai	90ca470d70	Standardize operator argument "is_test" Summary: Also add the ability to mark an argument as required. Added a string constant `OpSchema::Arg_IsTest` for `is_test` arg. If users define the `is_test` argument with `ArgIsTest(...)`, then it automatically becomes required argument, in the meanwhile user can still use `Arg("is_test", ...)` to define an optional `is_test` argument. Reviewed By: akyrola Differential Revision: D5812391 fbshipit-source-id: eaaba50d027813a8012389edc6c459de23c3c728	2017-09-13 14:35:27 -07:00
Junjie Bai	5748e7140f	Strip Operator Schema in mobile build Reviewed By: Yangqing Differential Revision: D5677792 fbshipit-source-id: d29edb26a36b24a46821e13e2d77af0f21571fcd	2017-08-22 13:31:08 -07:00
Ben Zhang	cfbd116966	ApplyTransformIfFaster Summary: Implemented ApplyTransformIfFaster Determine if a transform is faster, then return whichever net is better. Reviewed By: bwasti Differential Revision: D5534535 fbshipit-source-id: 509943205b0c454bf30fb01343ac4e88d1441c39	2017-08-17 15:36:51 -07:00
Ahmed Taei	a0fe96d7cd	Rewrite memonger DAG in C++. Summary: This diff replaces the main of the memonger for dag algorithm _compute_blob_recycling_for_dag with a c++ implementation. Reviewed By: akyrola Differential Revision: D5544219 fbshipit-source-id: 9f868880c8d0eb997ad3dd39433f9d0b9216d303	2017-08-16 16:17:15 -07:00
Junjie Bai	1ce95090ca	Add support for specifying engine preferences Reviewed By: Yangqing Differential Revision: D5460994 fbshipit-source-id: 08a8af699eebec37defc070389a8415b3e81ac16	2017-08-09 00:47:18 -07:00
Ben Zhang	6314c1fc15	Transforms in Python Summary: Allow the use of apply_transform() in the python API Reviewed By: bwasti Differential Revision: D5530483 fbshipit-source-id: 61a6d36fe125c89629fdeea040a717c453d84417	2017-08-01 16:51:38 -07:00
Yangqing Jia	de92dbe4bb	MKL code move Summary: Nothing gets changed - this would allow us to more easily deal with build systems. Also now everything that is MKL related lives under mkl/. Reviewed By: dzhulgakov Differential Revision: D5505157 fbshipit-source-id: ddb2e6ac290a146a7cb495da23bb0e5b5594bd2a	2017-07-26 20:21:55 -07:00
Yangqing Jia	f6afa6adbd	Add proper cpuid support. Summary: This is needed for us to do more fine grained dispatch based on CPU arch, so I figured we should just add it. Can help Dima and Misha doing optimization I think? Reviewed By: dzhulgakov Differential Revision: D5477444 fbshipit-source-id: 48aaf8bd799e9755493cd51c793ceec080a8846c	2017-07-23 17:21:50 -07:00
Victor Gao	f7a92145d4	comment out unused parameter in pybind_state.cc Summary: This removes/comments out/silences one or more unused parameters in the files. We are going to enable `-Wunused-parameter` in fbcode and this fixes a case that automated tooling can't handle. This diff is automatically generated. Reviewers are added heuristically. Reviewed By: dzhulgakov Differential Revision: D5437217 fbshipit-source-id: c2fc5ed30e7ee47b8c40248f89a9f4304ce7c098	2017-07-17 15:57:49 -07:00
Aapo Kyrola	ad62e82179	fast simple-net memonger for C++ Summary: To be used with predictor "online": C++ version of memonger for simple nets. Very simple greedy algorithm. Works well at least on Resnet-50 inference graph: only 3 shared blobs are used. Next I will integrate this with predictor and run canary (separate diff). Reviewed By: asaadaldien Differential Revision: D5375392 fbshipit-source-id: d36e419e39a32e568e105657c27fb00c85a2535d	2017-07-06 15:17:07 -07:00
Luke Yeager	fe9b0bfd27	Fix some typos Summary: Closes https://github.com/caffe2/caffe2/pull/882 Differential Revision: D5341277 Pulled By: harouwu fbshipit-source-id: bb5595c65c05ca7ea1a1d060d61d14fbfe008241	2017-06-28 13:50:48 -07:00
Alexander Sidorov	c8410859d9	Operator python stacktraces, attempt 2 Summary: Last time I used uuid filled into OperatorDef. And operator_tracebacks was populated using traceback.extract_stack. There were several issues with this approach: 1. A random field in OperatorDef breaks workflows relying on memoization, i.e. when computation is skipped based on already computed result before. 2. Adding one more field revealed RNNs being non forward compatible wrt to new fields in there. prototxt format seems to not allow forward compatibility (thanks jamesr66a for the investigation!). For RNNs we need to switch them to a more resilient approach. azzolini's proposed change to OperatorDef / NetDef would allow that by just nesting NetDef directly inside OperatorDef without need for extra serialization. 3. traceback.extract_stack is very slow when executable is on a remote filesystem. It does one or more os.stat for each frame on the stack. For some cases it ended up being up to 15 extra minutes on model construction. In this diff I use a different approach which should fix all those problems above. 1.2. are solved by not adding a new field at all. Instead I report operator idx wrt to a net it runs in. Thanks akyrola and dzhulgakov for the idea. Downside here is that operator list manipulation breaks the logic and separately created ops are not covered at all. 3. I solved this by operating on raw frames without using traceback and inspect modules which end up doing a lot of file system calls. See function extract_stacktace in core.py with additional comments. Reviewed By: dzhulgakov Differential Revision: D5286285 fbshipit-source-id: 626dd0f5f6b8b1d86bd6bf519078b122f43ddcaa	2017-06-25 19:32:58 -07:00
Thomas Dudziak	342de07231	Core unit test fixes for Python 3 Summary: As title Differential Revision: D5291327 fbshipit-source-id: 7dd9279c53ba55d3422c31973ffcec5705787fdf	2017-06-23 13:22:16 -07:00
Alisson Gusatti Azzolini	7d482742fd	Allow tasks/execution_steps to be cloned at runtime Summary: Advantages of cloning the tasks/execution_steps at runtime: - Less complexity on the python side: no need to clone nets and add prefixes to blob names - Faster start-up: we had cases of complex plans that took up to 30min to be created. - Better isolation: each task cloned at runtime has its own child workspace, preventing false sharing of blobs. - Opens up possibility for dynamic scheduling: Number of threads per task can be increased on the fly, at runtime. Reviewed By: dzhulgakov Differential Revision: D5100730 fbshipit-source-id: 71b83193b135da4e6eaf2536d8fc266528e1fdcc	2017-06-20 22:32:07 -07:00
Alexander Sidorov	83e6a0bec8	Revert uuid change to OperatorDef protobuf Summary: a few issues: 1. Randomization hurts memoization 1. Even if we make it non random, then we can get key collisions when loading it back. 2. RNNs use prototxt for step net and apparently it's not forward compatible like normal protobuf is I am thinking of a better less invasive solution now. Reviewed By: jamesr66a Differential Revision: D5272118 fbshipit-source-id: ab577fad04fbfc632e1fceffa923377a0d3da1be	2017-06-19 16:47:31 -07:00
Alexander Sidorov	eebda50b79	Operator python traceback Summary: This is going to show a python Caffe2 user where a failed operator was created. Motivation for having this information not right in protobuf is to avoid having it too verbose and keep ability to read protobufs of a net after a simple print() call. Reviewed By: jamesr66a Differential Revision: D5226047 fbshipit-source-id: 7edfe850e05a2ec209577142aa3368664a57a108	2017-06-13 18:50:02 -07:00
Alisson Gusatti Azzolini	d3ec6e8f55	Run python op builder at op creation time Summary: This allows to construct a python op by passing a pickled "builder function call" as an argument to the op. The builder function is called at PythonOp construction time and returns a function that will be called when the op is run. This way we allow to drop the dependency on 'tokens', which didn't work properly for protobufs that get distributed to other processes. Now, the PythonOp definition is self-contained: as long as the build dependencies are right, sharding the protobuf is enough to execute the net remotely. Reviewed By: dzhulgakov Differential Revision: D5080833 fbshipit-source-id: a5deaca5d3143024cdb121519689224e9dbec5ce	2017-06-13 16:29:22 -07:00
Thomas Dudziak	c7f5bf282b	Revert py::bytes -> std::string Summary: As title Reviewed By: salexspb Differential Revision: D5229338 fbshipit-source-id: 3bc9442c76061436db8f3217c1ba8edfd9581f8b	2017-06-12 14:11:37 -07:00
Yiming Wu	8cd208ad6f	Infer input and output device from OperatorDef through OperatorSchema Summary: Infer input and output device from OperatorDef through OperatorSchema. This is inspired by shape inference. With this feature, we can easily analyze device information for all blobs in the net in a generic way. It is really helpful for auto cross device execution. Reviewed By: akyrola, dzhulgakov Differential Revision: D5161065 fbshipit-source-id: ee656123112171a4ca00f2fb3f6940f32ddf3135	2017-06-05 23:47:33 -07:00
Ross Girshick	8e99824ce7	Allow subsets of gradient outputs / inputs in Python ops Summary: I'm using Python ops in a project and need corresponding Python gradient ops. For my use case, only a subset of the forward op outputs have gradients and only a subset of forward op inputs have gradients. However the current implementation of `GetPythonGradient` forces all grad inputs and outputs to exist. This diff allows one to specify that only a subset of grad inputs / outputs are used when constructing the Python op. I'm not sure if this is up to caffe2 standards, so please push back on style and content as needed. Reviewed By: dzhulgakov Differential Revision: D4897004 fbshipit-source-id: 96fffe8634c51a49b6bce7339a46c6235f7d4bbd	2017-06-05 12:52:01 -07:00
Thomas Dudziak	3ccbf23132	String-related fixes for Python 3 Summary: This diff is one step towards enabling python 3 build by making it be more diligent in its handling of strings. Reviewed By: salexspb Differential Revision: D4893083 fbshipit-source-id: 28b8adf3280e8d1f0a7dc9b0fee5ad53f2fada57	2017-05-26 16:04:32 -07:00
Dmytro Dzhulgakov	35eaf444c0	Quickly hack sparsenn_benchmarks to also do BenchmarkNet Summary: Makes benchmark a bit hacky, but it's a benchmark after all :) Specifically ports functionality of proper BenchmarkNet run from the ads_benchmarks so that we can see training net perf. Also adds --report_interval parameter to print stats more often when running in hogwild mode kdub0 - hopefully if you have time you can integrate it properly with the Flow's workflow harouwu -shouldn't conflict too much with your current diff Reviewed By: rayleichen Differential Revision: D5125183 fbshipit-source-id: 9c6f1663bc85e26d6609f0f2f23aa280731939db	2017-05-26 10:48:45 -07:00
Aapo Kyrola	658c337f41	Error status for Gloo ops, and handling in elastic dpm Summary: Add a RandomFailureOp and handling to elastic data parallel model of the status code Reviewed By: andrewwdye Differential Revision: D5065936 fbshipit-source-id: 24224f9ea414ee535c9e90cc28add5189354b0ef	2017-05-17 00:16:52 -07:00
Alisson Gusatti Azzolini	75bc9f5e77	Relax requirement on token uniqueness Summary: Relax requirement on token uniqueness since a few use cases broke after the uniqueness requirement was added in a previous diff. Reviewed By: kittipatv Differential Revision: D5034132 fbshipit-source-id: 327eb065923e6ea152a360324316f81b7fb9564b	2017-05-09 19:36:00 -07:00
Alisson Gusatti Azzolini	bd8ed6641c	Stabilize PythonOp token name Summary: For distributed jobs, we were relying on the order the PythonOps were registered, which was very fragile. Reviewed By: dzhulgakov Differential Revision: D5016847 fbshipit-source-id: f5601467c5b0569d5e8a0efdd76abad0d703c5f5	2017-05-09 11:19:44 -07:00
Aapo Kyrola	5c52392229	opsify AccumulateInputGradients Summary: Part of project to make all gradient accumulation business ops in RecurrentNetworkGradientOp, this makes the accumulateInputGradients ops. Also added way to mark operators private so they don't appear in docs. Reviewed By: salexspb Differential Revision: D5006698 fbshipit-source-id: 226d7afb473290c8d0f936d2cc87640be3e06615	2017-05-05 09:13:39 -07:00
Yangqing Jia	cf317d1106	create_net: explicitly specify if one wants to overwrite the network. Summary: This is from discussion with dzhulgakov : as a step towards revisiting the core.Net autonaming, we will first guard against accidental overwrites of existing networks in the workspace. ajtulloch since we are doing Predictors in mobile, this should be safe right? azzolini - I assume this would be safe, but would love to get your approval. akyrola - would this hurt xray? Reviewed By: dzhulgakov Differential Revision: D4897725 fbshipit-source-id: aa41271927ad6671f07a53b9505283623f8c49e5	2017-04-17 21:46:53 -07:00
Dongsheng Fang	3c0dc06ac8	Add __builtin_cpu_supports function def in windows Summary: Closes https://github.com/caffe2/caffe2/pull/253 Differential Revision: D4892628 Pulled By: Yangqing fbshipit-source-id: 45d49121027454d9259c4a753438d8f0771cf042	2017-04-14 19:46:19 -07:00
Yangqing Jia	ca0c8e5b25	remove import_array() help and use import_array1 Summary: TSIA. See https://github.com/numpy/numpy/blob/master/numpy/core/code_generators/generate_numpy_api.py Reviewed By: jamorton Differential Revision: D4893002 fbshipit-source-id: 4b6bee1bdf8ae905e4c0952a3e8bbbacd4129a50	2017-04-14 19:46:19 -07:00
Fei Sun	e2323ad688	Add CAFFE_ENFORCE to protobuf parsing Summary: Add CAFFE_ENFORCE to make sure the protobuf parsing is successful. Reviewed By: salexspb Differential Revision: D4843662 fbshipit-source-id: 20cab7180e6b0e5afb5e29ff3333591659e41f7a	2017-04-06 14:34:30 -07:00
Fei Sun	95657ea1e8	Protobuf is binary string. Use bytes instead. Summary: Prepare for the Protobuf change. Reviewed By: dzhulgakov Differential Revision: D4784884 fbshipit-source-id: 86219eecefaf7637e70339437c9274c526ebd6fe	2017-03-28 19:03:23 -07:00
Alexander Sidorov	56f324d191	Added predictor bindings to python interface Summary: from caffe2.python import workspace; p = workspace.Predictor(init_net, predict_net); outputs = p.run(inputs) Reviewed By: Yangqing Differential Revision: D4576793 fbshipit-source-id: b829bbcaf2e7c34dad85024177433207bd96a234	2017-03-15 11:17:54 -07:00
Kittipat Virochsiri	f0d78753ae	Make ModelExporter.load_from_db() load to specific workspace Summary: In case of distributed task, load_from_db() loads to wrong workspace (when used inside a Python op). Passing which workspace to use explicitly so that it loads to the one Python op is being run. Reviewed By: kennyhorror Differential Revision: D4653692 fbshipit-source-id: 94585c012b05ee38b9ce5e8ef0efdd50aa41dd2b	2017-03-08 09:31:42 -08:00
Zachary Mirman	1c92e85dae	Added editDistance helper to caffe2 operators Summary: Added editDistance helper to caffe2 operators Differential Revision: D4622152 fbshipit-source-id: 4d6246b8226c1283d5883edfaa27e8f7748fdc4c	2017-02-28 13:31:56 -08:00
Yangqing Jia	47b65b6d8d	Add a create your own dataset tutorial Summary: bwasti - will follow up via email. Closes https://github.com/caffe2/caffe2/pull/166 Differential Revision: D4596858 Pulled By: Yangqing fbshipit-source-id: 6d088ccf1604e0dc9b94cbf0a75b51587e734d95	2017-02-22 03:31:47 -08:00
Yangqing Jia	8ca1b3baea	import_array python3 compatibility Summary: TSIA Reviewed By: salexspb Differential Revision: D4535571 fbshipit-source-id: 61ce724d4fc3c79fac551e8622a2d45cda67f80a	2017-02-09 10:08:13 -08:00
Andrew Dye	306fde233a	Accept optional blob map for InferShapesAndTypes Summary: Shape inference allows Caffe2 to compute shapes of blobs without running a model. Update InferShapesAndTypes() to accept an optional blob:dimensions map so that external input blobs do not need to be part of the workspace. InferShapesAndTypes() in workspace.py conditionally calls the ...from_workspace or ...from_map bindings. Note I favored a small amount of code duplication here for the sake of readability. InferShapesAndTypes() in operator.cc has been refactored into mirrored entry points, invoking a common helper. Other minor changes to address linter warnings. Reviewed By: dzhulgakov Differential Revision: D4524873 fbshipit-source-id: 56f863b759c016d7f23523f06fda3aa5bba22357	2017-02-08 15:04:24 -08:00
Aapo Kyrola	6a03641cde	Add num_iters to RunNet() Summary: Running RunNet() in Python in a loop can be a performance issue if the Python code is doing a lot of other processing, such as data input, because Python's Global Interpreter Lock (GIL) will prevent RunNet() from being called. This can easily be fixed by making RunNet() run multiple iterations inside C++ land. (Another way to accomplish the same thing is to use Caffe2's "execution plans", but that requires more setup). + fixed timing reporting in my OC workflow + improved one error log in data_workers.py Sorry for piggybacking those small changes, but landing diffs is currently slow... Reviewed By: rpenggithub Differential Revision: D4523575 fbshipit-source-id: 039a647576efad5dd9afda74df478ac22b43c103	2017-02-07 14:16:14 -08:00
Aapo Kyrola	dcefc74a0c	Shape and Type Inference Part1 Summary: This is a bit large diff, sorry about it. It includes basic shape and type inference functionality, based on YQ's Schema scaffolding. I added some helper functions to make it easier to write simple translations. Bigger refactoring was needed for ConvPoolBase so that we could use the shape inference already there in the schema. I annotated enough operators to be able to infer forward-pass of shapes for basic convnet, and added test for that. I intend to bootcamp some annotations and annotate enough to handle Resnets fully. Need to think about gradients, if they could be annotated in an easier way. Only shapes are now exposed to Python, types will follow later. Also the inference is not called yet anywhere but unit test. Also I am not sure if everything is in the best location in the code, but shouldn't be hard to move stuff around. Reviewed By: dzhulgakov Differential Revision: D4436818 fbshipit-source-id: eebee5937ccc9ac09c245465302388a1fae6933c	2017-02-02 22:29:22 -08:00
Yangqing Jia	8553bd3f68	Ensure we are not using Eigen LGPL code, and build on raspbian. Summary: Turns out that building on raspbian is easy as a cake for caffe2 - cmake is awesome. Closes https://github.com/caffe2/caffe2/pull/112 Differential Revision: D4480985 Pulled By: Yangqing fbshipit-source-id: 5dbe5e1e71d8680dea7a5ec8a9ce7fbe6aa5270a	2017-01-30 09:44:27 -08:00
Fei Sun	cc65cc64c8	Create function ParseProtobufFromLargeString to parse strings more than 64MB Summary: Replace ParseFromString with ParseProtobufFromLargeString to get around the 64MB limit. Reviewed By: Yangqing Differential Revision: D4466226 fbshipit-source-id: b68a6efc76955db294ddb0d23bbaf03b69e4952a	2017-01-27 10:29:22 -08:00
Dmytro Dzhulgakov	864f561525	Make BlobDeserialization throw exceptions instead of returning bool Summary: Makes it much nicer to spot errors, especially in iPython notebook. Reviewed By: kennyhorror Differential Revision: D4465726 fbshipit-source-id: c0adaf5168248a70987ff9d5dfce54a622ff2219	2017-01-26 09:44:19 -08:00
Ahmed Taei	9ad10959ee	Enable large PlanDef protobuf message. Summary: Enable cases where PlanDef message is bigger than protobuf string decoding limits. Differential Revision: D4412736 fbshipit-source-id: 91ee02d7a8ab85b1c8169683a6c1dccd4c79be40	2017-01-13 09:29:29 -08:00
Bram Wasti	737000b166	Linter fix up to sync fbsource and github	2017-01-06 15:36:17 -08:00
Bram Wasti	3833dad5f6	manual sync of old never sync'd files	2017-01-06 15:28:45 -08:00
Yangqing Jia	5bfd6c4cd1	semicolon	2017-01-04 14:36:16 -08:00
Yangqing Jia	311ae2ba33	build file fix and avx2 on mac fix	2017-01-04 14:35:15 -08:00
bwasti	9ce23cbb71	Fix false positive for non-clang compilers.	2016-12-29 11:39:50 -08:00
Bram Wasti	b48f1ff810	OS X build	2016-12-29 12:25:53 -05:00
Dmytro Dzhulgakov	119b687994	Allow PythonOp to access the workspace Summary: DPER has very strange python ops that play with Workspace - they are somewhat similar to LoadOp/SaveOp, so I guess the semantics is fine. Thus it makes sense to allow python operators to receive workspace pointer similarly to regular Operators. I didn't figure out a better way to implement optional argument than just checking the number of args function receives on python side. Reviewed By: ajtulloch Differential Revision: D4242943 fbshipit-source-id: d97d4227815b741c8f884cfe254b06d2b56b5a41	2016-12-05 11:53:26 -08:00
Yangqing Jia	0e298ec399	Expose MKLMemory to the Python Feed and Fetch interface, and misc changes Summary: This is #2 of a series of changes. It did the following: (1) a few refactors of the MKL memory interface (2) an initial MKLContext to deal with MKL specific computations (3) Provide MKLMemory access in Python with the blob feeder/fetcher registration. Reviewed By: dzhulgakov Differential Revision: D4210123 fbshipit-source-id: adea1f1ffbd0b9ffdd55092676468c16bec08992	2016-11-29 15:18:36 -08:00
Yangqing Jia	589398950f	fbsync at f5a877	2016-11-18 15:41:06 -08:00

154 Commits