Commit Graph

56 Commits

Author SHA1 Message Date
Lu Fang
32494c226e OperatorDef <==> NodeProto Conversion (#11621)
Summary:
Operator-level proto conversion between the (new) torch proto and the (old) caffe2 proto.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11621

Reviewed By: BIT-silence

Differential Revision: D9892422

Pulled By: houseroad

fbshipit-source-id: 01a55ec0a09479876a27082d90fc970723f4d431
2018-09-19 08:41:33 -07:00
Shihao Xu
72a84127b1 Add Workspace methods ws.feed_blob(name, arr) ws.remove_blob(name) (#10929)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10929

Workspace class methods were missing on the Python side.

Adding them enables writing the new checkpoint framework with more control over the workspace and a cleaner implementation.

Added:

- ws.feed_blob(name, arr)

- ws.remove_blob(name)
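
A minimal sketch of the new methods, assuming the pybind Workspace class is exposed as `workspace.Workspace` and that its `blobs` mapping can be used to inspect contents:

```
from caffe2.python import workspace
import numpy as np

ws = workspace.Workspace()  # a standalone workspace, separate from the default one
ws.feed_blob("x", np.ones((2, 3), dtype=np.float32))  # create or overwrite a blob
print(ws.blobs.keys())  # the blob is now visible in this workspace
ws.remove_blob("x")     # and now it is gone
```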

Reviewed By: mraway

Differential Revision: D9486867

fbshipit-source-id: ea02d2e3a39d716a5a3da0482f57d4ac4c893763
2018-08-28 17:54:34 -07:00
Kittipat Virochsiri
2b134c72e6 Add interface to provide blob types to shape&type inference (#9643)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9643

The current map interface assumes the float data type, which is not always correct.
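
A hedged sketch of the new interface from the Python side; the `blob_types` argument name, mirroring the existing `blob_dimensions` map, is an assumption:

```
from caffe2.python import workspace, core
from caffe2.proto import caffe2_pb2

net = core.Net("infer_example")
net.Relu(["x"], ["y"])

shapes, types = workspace.InferShapesAndTypes(
    [net],
    blob_dimensions={"x": [16, 100]},
    # override the default float assumption for selected blobs
    blob_types={"x": caffe2_pb2.TensorProto.FLOAT16},
)
```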

Reviewed By: kennyhorror

Differential Revision: D8455784

fbshipit-source-id: b94a31267760f7f97c15aa4b03008affc347fd10
2018-07-24 11:58:05 -07:00
Kittipat Virochsiri
01581037dc Add workspace.RunPlanInBackground (#9637)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9637

Adding a method to run a plan in the background. The intended use is to run BlueWhale's data reading & preprocessing net in the background while the GPU is training.
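
A minimal sketch of the intended overlap; the completion check on the returned handle is an assumption:

```
from caffe2.python import core, workspace

reader_net = core.Net("reader")
reader_net.ConstantFill([], ["batch"], shape=[16], value=0.0)  # stand-in for real data reading

plan = core.Plan("reader_plan")
plan.AddStep(core.execution_step("read", [reader_net], num_iter=1000))

handle = workspace.RunPlanInBackground(plan)  # returns immediately
# ... run the GPU training net in the foreground here ...
while not handle.is_done():  # assumed handle API
    pass
```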

Reviewed By: MisterTea

Differential Revision: D8906439

fbshipit-source-id: b1c73ca7327e2d87a8f873924e05ab3d161a3f1e
2018-07-20 14:56:12 -07:00
Peter Yeh
54db14e390 HIP Operators Generator--> HipOpG (#9322)
Summary:
The goal of this PR is to add infrastructure to convert (hipify) CUDA ops into [HIP](https://github.com/ROCm-Developer-Tools/HIP) ops at **compile** time.

Note that HIP ops are portable C++ code and can run on both AMD and NVIDIA platforms.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9322

Differential Revision: D8884707

Pulled By: bddppq

fbshipit-source-id: dabc6319546002c308c10528238e6684f7aef0f8
2018-07-19 00:26:06 -07:00
Lin Li
0fe980c748 Memory usage measurement -- Caffe2 (#9017)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9017

Closes https://github.com/pytorch/pytorch/pull/9017

Added "get_blob_size_bytes" to "pybind_state.cc" in Caffe2 to expose the size of blob in bytes.

Reviewed By: kuttas

Differential Revision: D8685696

fbshipit-source-id: 9a9d38f207c8c59ef534217181e8ce1514617628
2018-07-17 16:40:23 -07:00
bddppq
f94ae3ba1d Update from facebook (#7696)
* Fix handling of empty batches in SumReduceDimsOp

As titled

* Deferrable async_scheduling finishRun fix

Proper order of finishing run operations in deferrable_async_scheduling net

* Simplify exception handling in async_scheduling

Simplify exception handling: there is no need to busy-wait; the thread that processes the
last task can finish the run.

* [C2]worker_coordinator_memorize_worker_ids

As titled. This is related to T28689868, where the number of blobs we want to create is equal to the number of worker ids

* Add unit test for nets with no type set

* Ignore total length argument in symbolic_pad_packed_sequence

1. There was a mistake in the code: total_length was added to the wrong symbolic function (pack_padded_sequence) instead of pad_packed_sequence.
2. There is no need to throw an exception if total_length is given, since it is only used to enable data_parallel training on multi-GPUs and has nothing to do with ONNX export, so just ignore it. https://fburl.com/tk4gciqp

* Add support for MKLDNN to async_scheduling

Just add MKLDNN as a possible CPU option to async_scheduling's pool function

* [AuFL][ensemble] support branch output for prediction

This diff supports using predictions from different branches and thus enables model ensembling (not fully independent).

* Fix a bug in add_loss in layer_model_helper

As titled.

* Support lradaption for adam

1.lr adaption operator
2.apply to dense adam

* Perf tweaks for async_scheduling

Restore single pool option + remove unnecessary (no-ops) calls

* add quantization to SparseSimdAdagradOp

add a bunch of quantization signatures to SparseSimdAdagradOp, implementations to come next

* [sr] [codemod] Change all SR callsites to use new API

@allow-large-files

This diff refactors all callsites of SR to use the slightly changed API introduced in the diff below. Really what this means is that you need to include the correct header. Also if you were using `ClientFactory::newFactory` you need to not prefix it with `ClientFactory::`.

```
cd ~/fbsource/fbcode
find ./ -type f -exec sed -i -e 's:#include "servicerouter/client/cpp2/ClientFactory.h":#include "servicerouter/client/cpp2/ServiceRouter.h":' -e 's:#include <servicerouter/client/cpp2/ClientFactory.h>:#include <servicerouter/client/cpp2/ServiceRouter.h>:' -e 's/ClientFactory::newFactory(/newFactory(/g' {} \;
```

Also manually fixed spots that couldn't be done automatically (or broke because they depended on transitive includes).

* Back out "Fix handling of empty batches in SumReduceDimsOp"

Original commit changeset: 282da1730cc2. This commit is blocking the
GitHub->fbcode sync, which really needs to get merged ASAP. D7881937, which this
diff depends on, will be reverted in the sync D7990948, which causes this to
break. The sync diff cannot be patched with this reversion because it must be
landed against base revision 5c8c099, and D7881937 must not be included in the
sync diff because it is breaking GPU tests that are not available in sandcastle:
https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-cuda8.0-cudnn6-ubuntu16.04-test/3638/console
for one example.

* Add the flow to support operator benchmark

1) generate a model with the operator
2) upload it to everstore
3) generate the model spec into a JSON file
4) start running the benchmark

* [tum][gpu] Connect DPM trainer with flow and unit tests

This diff:
- Fix some small bugs for Yiming's recent changes to parallelizer, so it suits real use cases.
- Add correct tags to the TUM code, so we can do data parallel transform
- pass extra info at instantiation.
- add unit test for using DPM in TUM model

After this diff, we can run a simple-box, multi-GPU, fully-synced trainer for TUM in an FBLearner workflow, though we may still need to do speed benchmarking.

* w/o normalized lradaption for adam dense only

The previous lr adaption included a normalization step when performing the dot-product operation. This is not exactly the same as what is proposed in the paper, so I added normalization as an option. Without it, the operator does exactly what the paper proposes; with the option, the normalization step is added.

* [fb] Use SharedPromise in DeferrableAsyncSchedulingNet

This simplifies DeferrableAsyncSchedulingNet by removing a condition
variable, plus small fixes.

* [tum] implement cuda sparseLengthsMean and LengthsMean

as title

* Adding an optional parameter to allow use of protobufs in InferShapesAndTypes function.

Adding an optional parameter to allow use of protobufs in InferShapesAndTypes function.

* Move feature_to_index to FeatureSpec.feature_to_index

Move feature_to_index to FeatureSpec.feature_to_index to avoid overriding other fields.

* [Caffe2] Rename bytes_moved to bytes_written

Just a rename in preparation for supporting bytes_read.

* [c2] fix ReduceFrontSumOp for empty case by setting 0

Otherwise it may reuse the results from the last iteration when the batch is empty.

* [Caffe2] [Int8] Improve Intel CPU performance

* [Easy] Improve PrependDim op logging

as titled

* DBFileReader expand db_path using os.path.expanduser(..)

Since there are many possible use cases of `DBFileReader` reading from a user home path, like `~/local/sample.db`, this saves people the trouble of calling `os.path.expanduser(db_path)` themselves.
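
For example, assuming the usual constructor arguments (the db file and type here are placeholders), a home-relative path now works directly:

```
from caffe2.python.db_file_reader import DBFileReader

# '~' is expanded internally via os.path.expanduser, so callers no longer
# need to normalize the path themselves
reader = DBFileReader(db_path="~/local/sample.db", db_type="minidb")
```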

* [Caffe2] Add bytes_read to cost structure

We're adding analytical read bytes to cost functions. This extends the structure accordingly for all operators with CostInference defined.
Additionally, a small bug fix: cost functions now extract the type information of operands instead of assuming float.

* Fix sleef on aarch64 for hhvm

@bypass-lint

Rename flag

* Remove duplicated part in caffe2/ideep/operators/conv_op.cc

This is likely a sync error.

* Rename test helper function test_adagrad_sparse_helper to adagrad_sparse_test_helper to avoid confusing pytest
2018-05-19 23:10:48 -07:00
kuttas
460e8cd376 change print to logger.warning in operator traceback code (#6216) 2018-04-03 08:01:25 -07:00
Yiming Wu
03c5198331 [C2 Int8][C2 Core]fetch int8 blob
Providing Python API to fetch Int8 tensors.

  data, scale, zero_point = workspace.FetchInt8Blob(blob_name)

now returns a tuple if the blob contains an Int8TensorCPU

     'data' = int8 data array
     'scale' = fake quantization scale
     'zero_point' = fake quantization offset

Although FetchBlob shares its back-end implementation with FetchInt8Blob, we raise an
error to prevent unexpected behavior of the same method.
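
A short sketch of reading a quantized blob back; the dequantization line is the standard int8 affine mapping, not something this change adds:

```
import numpy as np
from caffe2.python import workspace

# only valid if 'quantized_blob' holds an Int8TensorCPU
data, scale, zero_point = workspace.FetchInt8Blob("quantized_blob")

# recover approximate real values from the fake-quantized representation
real_values = scale * (data.astype(np.float32) - zero_point)
```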
2018-03-30 21:00:44 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Kutta Srinivasan
0a18608b43 hacks to test exception handling and python operator backtraces
Add exception handling & re-throwing to worker threads of DAGNetBase
2018-03-07 15:09:17 -08:00
Dmytro Dzhulgakov
9e71de398b [core] Graph-level NUMA awareness in Caffe2
Adding NUMA awareness through numa_node_id in DeviceOption. Blobs of operators
with numa_node_id are allocated on the corresponding memory banks, and operators
are run on CPU pools with the matching NUMA affinity.
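
A minimal sketch of opting a net's operators into a NUMA node via DeviceOption; the field name comes from this commit, the surrounding net setup is illustrative:

```
from caffe2.proto import caffe2_pb2
from caffe2.python import core

device_option = caffe2_pb2.DeviceOption()
device_option.device_type = caffe2_pb2.CPU
device_option.numa_node_id = 0  # allocate this op's blobs on NUMA node 0

net = core.Net("numa_example")
with core.DeviceScope(device_option):
    net.Relu(["x"], ["y"])
```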
2018-03-06 00:33:11 -08:00
Yangqing Jia
ced2c7e2b2 Remove Set/GetDefaultGPUID and move to use current gpu id instead.
Summary:
Reason for this change:

(1) Setting/Getting default gpu id doesn't seem to be used at all.
(2) It actually is confusing compared to the CUDA_VISIBLE_DEVICES options etc.
(3) When setting cuda_gpu_id=-1 in the CUDAContext arg, it used to use the
default gpu id, but we should probably use the current gpu instead, so that the
caller can control the device placement.

One use case is for TensorRT - if we have a custom callback layer, then it would
be easier for TRT or whatever caller to set the running device.

Reviewed By: dzhulgakov

Differential Revision: D6740357

fbshipit-source-id: 2ea710e434b10220d5a198e31c93847304636863
2018-01-19 18:03:21 -08:00
Lu Fang
f779f44c89 Add ONNX exporter for glcgan
Summary: Export PyTorch glcgan model to Caffe2 using ONNX

Reviewed By: dzhulgakov

Differential Revision: D6298765

fbshipit-source-id: 324e52249bb88c6e7bb3b682a4ec0662b6a0c1ea
2017-11-14 10:09:44 -08:00
Ilia Cherniavskii
1149b9bbb5 Polling async net executor
Summary:
Implementation of a polling async net executor.
Notes:
- New net executor async_polling - schedules CPU and GPU ops asynchronously, using a single polling thread
- Events: updates Caffe2 events to support async CPU events, adding new methods:
 Query() - non-blocking check of an event's state: INITIALIZED -> RECORDED -> SUCCESS/FAILED
 ErrorMessage() - when an operation runs asynchronously and fails, calling this on the event gives the error message
- Tasks: uses the existing DAGNet algorithm to compute CPU and GPU chains, with a separate task for each chain
- Polling: uses a single thread to query the state of events - for CPU tasks it atomically queries the task state, for GPU tasks it uses cudaEventQuery
- Scheduling of CPU ops: uses global thread pools
- Scheduling of GPU ops: uses a GPU thread pool per GPU device

Reviewed By: dzhulgakov

Differential Revision: D5985110

fbshipit-source-id: a9de7fcbb71d046a3aa1b573072b89a65dfeee8c
2017-11-03 07:27:44 -07:00
Bram Wasti
a0aa6d0e24 expose flop annotation to python
Summary: expose the flop annotation framework to python functions

Reviewed By: Maratyszcza, Yangqing

Differential Revision: D6135705

fbshipit-source-id: 2eed80b6cbda7b3ee3fe0e019a0f1fc4b0aa320b
2017-10-24 11:35:24 -07:00
Dmytro Dzhulgakov
2972a6ca02 Revert D6026557: [caffe2][PR] Fix "No handlers could be found for logger"
Summary:
This reverts commit 95c634872ac02be721257169e38c8fead04cd66b

bypass-lint

Differential Revision: D6026557

fbshipit-source-id: 663c28583ce3b01070ff5449115ed7e222f71776
2017-10-12 20:21:52 -07:00
Luke Yeager
75bece6ede Fix "No handlers could be found for logger"
Summary: Closes https://github.com/caffe2/caffe2/pull/1316

Differential Revision: D6026557

Pulled By: Yangqing

fbshipit-source-id: 95c634872ac02be721257169e38c8fead04cd66b
2017-10-10 22:32:13 -07:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Luke Yeager
ebeaecbfa3 workspace_gpu: Get{CUDAVersion,DeviceProperties}
Summary:
Expose some useful utilities to Python
Closes https://github.com/caffe2/caffe2/pull/1216
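
A sketch of the exposed utilities; the Python-side wrapper names are inferred from the commit title:

```
from caffe2.python import workspace

if workspace.has_gpu_support:
    print(workspace.GetCUDAVersion())        # e.g. 8000 for CUDA 8.0
    print(workspace.GetDeviceProperties(0))  # properties of GPU 0
```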

Differential Revision: D5843888

Pulled By: akyrola

fbshipit-source-id: fc731781aec3c7cc6a4b7132f1624423d015abff
2017-09-17 20:01:34 -07:00
Ben Zhang
cfbd116966 ApplyTransformIfFaster
Summary:
Implemented ApplyTransformIfFaster

Determine if a transform is faster, then return whichever net is better.
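
A hedged sketch, assuming the Python surface takes a registered transform name plus the nets needed for benchmarking; "ConvToNNPack" is an illustrative transform name:

```
from caffe2.python import core, workspace

init_net = core.Net("init")
train_net = core.Net("train")
train_net.Relu(["x"], ["y"])

# benchmark the original vs. transformed net, keep whichever is faster
faster_proto = workspace.ApplyTransformIfFaster(
    "ConvToNNPack", train_net.Proto(), init_net.Proto()
)
```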

Reviewed By: bwasti

Differential Revision: D5534535

fbshipit-source-id: 509943205b0c454bf30fb01343ac4e88d1441c39
2017-08-17 15:36:51 -07:00
Ben Zhang
6314c1fc15 Transforms in Python
Summary: Allow the use of apply_transform() in the python API
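
A minimal sketch of the Python entry point, assuming it takes a registered transform name and a NetDef and returns the transformed proto:

```
from caffe2.python import core, workspace

net = core.Net("to_transform")
net.Conv(["x", "w", "b"], ["y"], kernel=3)

# "ConvToNNPack" is an illustrative transform name, not guaranteed to be registered
transformed_proto = workspace.ApplyTransform("ConvToNNPack", net.Proto())
```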

Reviewed By: bwasti

Differential Revision: D5530483

fbshipit-source-id: 61a6d36fe125c89629fdeea040a717c453d84417
2017-08-01 16:51:38 -07:00
Szymon Piechowicz
54b171eae5 Caffe2: don't swallow exception stacktrace
Summary:
Caffe2: don't swallow exception stacktrace

{F69325406}

Reviewed By: akyrola

Differential Revision: D5503227

fbshipit-source-id: 4e11d921652a094e20c46af19ba880390be8e997
2017-07-26 15:48:05 -07:00
Thomas Dudziak
5355634dac Dict fixes/improvements and unittest targets for Python 3 in caffe2 core
Summary: As title

Reviewed By: salexspb

Differential Revision: D5316104

fbshipit-source-id: aee43819d817842e5ce6ba3d045a55b1a2491c30
2017-06-29 17:05:41 -07:00
Alexander Sidorov
c8410859d9 Operator python stacktraces, attempt 2
Summary:
Last time I used a uuid filled into OperatorDef, and operator_tracebacks was populated using traceback.extract_stack. There were several issues with that approach:

1. A random field in OperatorDef breaks workflows relying on memoization, i.e. when computation is skipped based on an already-computed result.
2. Adding one more field revealed that RNNs are not forward compatible wrt new fields. The prototxt format seems not to allow forward compatibility (thanks jamesr66a for the investigation!). For RNNs we need to switch to a more resilient approach. azzolini's proposed change to OperatorDef / NetDef would allow that by nesting NetDef directly inside OperatorDef, without the need for extra serialization.
3. traceback.extract_stack is very slow when the executable is on a remote filesystem. It does one or more os.stat calls for each frame on the stack. In some cases this added up to 15 extra minutes of model construction time.

In this diff I use a different approach, which should fix all of the problems above.

1 and 2 are solved by not adding a new field at all. Instead I record the operator index relative to the net it runs in. Thanks akyrola and dzhulgakov for the idea. The downside is that operator-list manipulation breaks the logic, and separately created ops are not covered at all.
3 is solved by operating on raw frames, without the traceback and inspect modules, which end up doing a lot of filesystem calls. See the function extract_stacktrace in core.py, which has additional comments.

Reviewed By: dzhulgakov

Differential Revision: D5286285

fbshipit-source-id: 626dd0f5f6b8b1d86bd6bf519078b122f43ddcaa
2017-06-25 19:32:58 -07:00
Alexander Sidorov
83e6a0bec8 Revert uuid change to OperatorDef protobuf
Summary:
A few issues:

1. Randomization hurts memoization.
2. Even if we make it non-random, we can get key collisions when loading it back.
3. RNNs use prototxt for the step net, and apparently it is not forward compatible the way normal protobuf is.

I am thinking of a better, less invasive solution now.

Reviewed By: jamesr66a

Differential Revision: D5272118

fbshipit-source-id: ab577fad04fbfc632e1fceffa923377a0d3da1be
2017-06-19 16:47:31 -07:00
Alexander Sidorov
eebda50b79 Operator python traceback
Summary: This shows a Python Caffe2 user where a failed operator was created. The motivation for keeping this information out of the protobuf is to avoid making it too verbose, and to preserve the ability to read a net's protobufs after a simple print() call.

Reviewed By: jamesr66a

Differential Revision: D5226047

fbshipit-source-id: 7edfe850e05a2ec209577142aa3368664a57a108
2017-06-13 18:50:02 -07:00
Yiming Wu
8871ef029b quick fix future issue with brew/core/schema/workspace/scope/utils.py
Summary:
Fixing the missing future package issue.

Recently we found that some of our users do not have the future module available, so we may need a try/except wrapper around every `past` import.

Reviewed By: Yangqing

Differential Revision: D5183547

fbshipit-source-id: 262fdf2940ee1be4454bf0b0abb9e6a0f1a0ee82
2017-06-05 12:01:48 -07:00
Thomas Dudziak
3ccbf23132 String-related fixes for Python 3
Summary: This diff is one step towards enabling python 3 build by making it be more diligent in its handling of strings.

Reviewed By: salexspb

Differential Revision: D4893083

fbshipit-source-id: 28b8adf3280e8d1f0a7dc9b0fee5ad53f2fada57
2017-05-26 16:04:32 -07:00
Aapo Kyrola
658c337f41 Error status for Gloo ops, and handling in elastic dpm
Summary: Add a RandomFailureOp and handling to elastic data parallel model of the status code

Reviewed By: andrewwdye

Differential Revision: D5065936

fbshipit-source-id: 24224f9ea414ee535c9e90cc28add5189354b0ef
2017-05-17 00:16:52 -07:00
Yangqing Jia
cf317d1106 create_net: explicitly specify if one wants to overwrite the network.
Summary:
This is from discussion with dzhulgakov : as a step towards revisiting the
core.Net autonaming, we will first guard against accidental overwrites of
existing networks in the workspace.

ajtulloch since we are doing Predictors in mobile, this should be safe right?

azzolini - I assume this would be safe, but would love to get your approval.

akyrola - would this hurt xray?
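
A minimal sketch of the resulting behavior, assuming the guard is surfaced as an `overwrite` flag on CreateNet:

```
import numpy as np
from caffe2.python import core, workspace

workspace.FeedBlob("x", np.random.rand(4).astype(np.float32))
net = core.Net("model")
net.Relu(["x"], ["y"])

workspace.CreateNet(net)
# re-creating a net with the same name now requires opting in explicitly
workspace.CreateNet(net, overwrite=True)
```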

Reviewed By: dzhulgakov

Differential Revision: D4897725

fbshipit-source-id: aa41271927ad6671f07a53b9505283623f8c49e5
2017-04-17 21:46:53 -07:00
Aaron Markham
58f7f2b441 doxygen python block added
Summary: Closes https://github.com/caffe2/caffe2/pull/226

Differential Revision: D4793550

Pulled By: JoelMarcey

fbshipit-source-id: cc33e58186304fa8dcac2ee9115dcc271d785b1e
2017-03-29 06:46:16 -07:00
Fei Sun
95657ea1e8 Protobuf is binary string. Use bytes instead.
Summary: Prepare for the Protobuf change.

Reviewed By: dzhulgakov

Differential Revision: D4784884

fbshipit-source-id: 86219eecefaf7637e70339437c9274c526ebd6fe
2017-03-28 19:03:23 -07:00
Alexander Sidorov
56f324d191 Added predictor bindings to python interface
Summary: from caffe2.python import workspace; p = workspace.Predictor(init_net, predict_net); outputs = p.run(inputs)
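
Expanded into a runnable sketch (the file names and input shape are placeholders):

```
import numpy as np
from caffe2.python import workspace

# serialized NetDef protobufs, e.g. exported by a training run
with open("init_net.pb", "rb") as f:
    init_net = f.read()
with open("predict_net.pb", "rb") as f:
    predict_net = f.read()

p = workspace.Predictor(init_net, predict_net)
outputs = p.run([np.random.rand(1, 3, 224, 224).astype(np.float32)])
```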

Reviewed By: Yangqing

Differential Revision: D4576793

fbshipit-source-id: b829bbcaf2e7c34dad85024177433207bd96a234
2017-03-15 11:17:54 -07:00
Yangqing Jia
bdd542d087 backup functions for non-cuda cases
Summary: This fixes the error introduced in the cudnn v6 diff.

Reviewed By: ajtulloch

Differential Revision: D4633113

fbshipit-source-id: 454cd4b3e52b8de01c1914e66d25310d7ecb13aa
2017-02-28 22:07:54 -08:00
Simon Layton
fbf47a8825 Cudnn v6
Summary:
Add cudnn v6 support, including testing support for dilated convolution.
Add a check to ensure that the versions of cuDNN used to compile Caffe2 and run it are compatible
Closes https://github.com/caffe2/caffe2/pull/85

Reviewed By: bwasti

Differential Revision: D4387690

Pulled By: Yangqing

fbshipit-source-id: 312960134398dd4afe6ee0c01cdc160046c904e8
2017-02-28 17:46:33 -08:00
Aaron Markham
851cb7059d changed StringfyProto to StringifyProto
Summary: Closes https://github.com/caffe2/caffe2/pull/155

Reviewed By: dzhulgakov

Differential Revision: D4621607

Pulled By: Yangqing

fbshipit-source-id: ec7f45132260fbb6d36ef61ffbf5bf6466f237eb
2017-02-27 23:05:04 -08:00
Bram Wasti
183e158642 Remove Model API (unused)
Summary: Removed Model API because no one {seems to,should} be using it

Reviewed By: Yangqing

Differential Revision: D4575126

fbshipit-source-id: 174d39e9aa46750f1fae8295f7e1e5452559af33
2017-02-21 17:19:05 -08:00
ezineo
e52676b272 Delete SerializeToString() call in class Model(), workspace.py
Summary:
In the tutorial, I found the call to Model() incorrect. After this change, it works.
Closes https://github.com/caffe2/caffe2/pull/148

Reviewed By: bwasti

Differential Revision: D4556894

Pulled By: Yangqing

fbshipit-source-id: 949a8d0496861f19869436908ffe1ef1a0f853b1
2017-02-13 23:18:03 -08:00
Andrew Dye
306fde233a Accept optional blob map for InferShapesAndTypes
Summary:
Shape inference allows Caffe2 to compute shapes of blobs without running a model. Update InferShapesAndTypes() to accept an optional blob:dimensions map so that external input blobs do not need to be part of the workspace.

InferShapesAndTypes() in workspace.py conditionally calls the ...from_workspace or ...from_map bindings. Note I favored a small amount of code duplication here for the sake of readability. InferShapesAndTypes() in operator.cc has been refactored into mirrored entry points, invoking a common helper.

Other minor changes to address linter warnings.
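
A minimal sketch of the map-based entry point described above (the operator choice is illustrative):

```
from caffe2.python import core, workspace

net = core.Net("shape_infer")
net.Relu(["x"], ["y"])

# 'x' does not need to exist in the workspace; its dimensions come from the map
shapes, types = workspace.InferShapesAndTypes(
    [net], blob_dimensions={"x": [16, 100]}
)
print(shapes["y"])  # expected: [16, 100]
```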

Reviewed By: dzhulgakov

Differential Revision: D4524873

fbshipit-source-id: 56f863b759c016d7f23523f06fda3aa5bba22357
2017-02-08 15:04:24 -08:00
Aapo Kyrola
6a03641cde Add num_iters to RunNet()
Summary:
Calling RunNet() from Python in a loop can be a performance issue if the Python code is doing a lot of other processing, such as data input, because Python's Global Interpreter Lock (GIL) will prevent RunNet() from being called. This is easily fixed by making RunNet() run multiple iterations inside C++. (Another way to accomplish the same thing is to use Caffe2's "execution plans", but that requires more setup.) A sketch of the multi-iteration call follows below.

+ fixed timing reporting in my OC workflow
+ improved one error log in data_workers.py

Sorry for piggybacking those small changes, but landing diffs is currently slow...
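
A minimal sketch of the multi-iteration call (the `num_iter` parameter name is an assumption):

```
import numpy as np
from caffe2.python import core, workspace

workspace.FeedBlob("x", np.random.rand(8).astype(np.float32))
train_net = core.Net("trainer")
train_net.Relu(["x"], ["y"])
workspace.CreateNet(train_net)

# stay inside C++ for 100 iterations instead of crossing the GIL on every call
workspace.RunNet(train_net.Proto().name, num_iter=100)
```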

Reviewed By: rpenggithub

Differential Revision: D4523575

fbshipit-source-id: 039a647576efad5dd9afda74df478ac22b43c103
2017-02-07 14:16:14 -08:00
Aapo Kyrola
dcefc74a0c Shape and Type Inference Part1
Summary:
This is a bit of a large diff, sorry about that. It includes basic shape and type inference functionality, based on YQ's Schema scaffolding. I added some helper functions to make it easier to write simple translations.

Bigger refactoring was needed for ConvPoolBase so that we could use the shape inference already present in the schema.

I annotated enough operators to be able to infer forward-pass shapes for a basic convnet, and added a test for that. I intend to bootcamp some annotations and annotate enough to handle ResNets fully. I still need to think about gradients and whether they could be annotated in an easier way.

Only shapes are exposed to Python for now; types will follow later. The inference is also not yet called anywhere except the unit test.

I am also not sure everything is in the best location in the code, but it shouldn't be hard to move things around.

Reviewed By: dzhulgakov

Differential Revision: D4436818

fbshipit-source-id: eebee5937ccc9ac09c245465302388a1fae6933c
2017-02-02 22:29:22 -08:00
Aapo Kyrola
fe38a0c2b1 remove logging.basicConfig() from workspace
Summary: As part of a PR from GitHub, "logging.basicConfig()" was added to workspace.py, causing havoc with existing logger configurations. It should not be there. Thanks rbgirshick for reporting.

Reviewed By: kdub0

Differential Revision: D4346077

fbshipit-source-id: 084ddcbfe6354bdaf5c97a42086c0bd36ec4629c
2016-12-19 11:59:26 -08:00
Pooya Davoodi
78edb8295e No exception for float64 in FeedBlob. Warning instead.
Summary:
The exception in FeedBlob caused many tests to fail.
Instead of an exception, we log a warning message and move on.
Feeding a float64 blob should not cause any issue.
Closes https://github.com/caffe2/caffe2/pull/57
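
A tiny sketch of the changed behavior:

```
import numpy as np
from caffe2.python import workspace

# previously raised an exception; now logs a warning and feeds the float64 data
workspace.FeedBlob("x", np.arange(4, dtype=np.float64))
```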

Reviewed By: bwasti

Differential Revision: D4343135

Pulled By: Yangqing

fbshipit-source-id: cd1144b94c9883fcbd8bdcd78f9f93a67debc0a6
2016-12-16 17:29:29 -08:00
Aapo Kyrola
3410939459 pass learning rate scaling factor to parameter update builder function
Summary:
When data_parallel_model was refactored, the division of the LR by the number of devices was dropped, so we ended up effectively multiplying gradients by the number of devices. We therefore need to scale the LR by 1/num_gpus.

Created a test to confirm that data_parallel_model produces exactly the same results on different numbers of GPUs, given the same total batch size.

Reviewed By: prigoyal

Differential Revision: D4248907

fbshipit-source-id: af21ede113e6ac25f12c556de298cb18974548be
2016-12-05 11:53:26 -08:00
Dmytro Dzhulgakov
a7df0e6724 Clone model net to avoid hard-coded inputs
Summary:
Previously DPER was quite broken - we couldn't change loaders on the fly because the serialized model had blob names hard-coded, e.g. "nn_loader/dense". In fact, the tests only worked by accident, because both the trainer and the evaluator used the same loader type.

This diff does the following:
1) when writing out model, remap input blobs to be 'inputs/<field_name>'
2) when loading eval model, remap them back to the current loader

This diff uses Net.input_schema() for convenience; in particular, the schema format is implicitly serialized in the input blob names. From our discussion with Andrey, this type of hardcoding is actually acceptable, since the schema of HiveReader on the Python side is inferred via the same string-parsing procedure.

It also modifies model saving a bit so that we don't pollute the global namespace with the shape_provider net.

Overall the code in mlp.py is pretty terrible, but I'd leave refactoring to xianjiec as part of the Layers migration.

Reviewed By: xianjiec

Differential Revision: D4218902

fbshipit-source-id: 6cd19f0343ec1be6ddaa3581512e61879957749e
2016-11-29 15:18:38 -08:00
Aapo Kyrola
c1c92479bd check that numpy arrays are float32 when CUDA is used
Summary:
A recurring developer issue is that people pass numpy arrays with FeedBlob but forget that a Python float is actually a double, and CUDA ops in caffe2 don't allow doubles.
Thus, I think we should reject incorrect types already at FeedBlob() when the device option is CUDA.

Added test.

Is this too strong?
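
A minimal sketch of the check, assuming a CUDA device option on the feed:

```
import numpy as np
from caffe2.proto import caffe2_pb2
from caffe2.python import core, workspace

gpu = core.DeviceOption(caffe2_pb2.CUDA, 0)
arr = np.random.rand(4, 4)  # numpy defaults to float64

workspace.FeedBlob("w", arr.astype(np.float32), device_option=gpu)  # accepted
# workspace.FeedBlob("w", arr, device_option=gpu) would now be rejected
```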

Reviewed By: ajtulloch

Differential Revision: D4208153

fbshipit-source-id: 364b057a2a37b5d4b95de4e59faebdab724bb0ed
2016-11-29 15:18:37 -08:00
Yangqing Jia
238ceab825 fbsync. TODO: check if build files need update. 2016-11-15 00:00:46 -08:00
Yangqing Jia
d1e9215184 fbsync 2016-10-07 13:08:53 -07:00
Yangqing Jia
b23e51d467 chunky sync 2016-09-06 15:55:19 -07:00