pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Aapo Kyrola	dcefc74a0c	Shape and Type Inference Part1 Summary: This is a bit large diff, sorry about it. It includes basic shape and type inference functionality, based on YQ's Schema scaffolding. I added some helper functions to make it easier to write simple translations. Bigger refactoring was needed for ConvPoolBase so that we could use the shape inference already there in the schema. I annotated enough operators to be able to infer forward-pass of shapes for basic convnet, and added test for that. I intend to bootcamp some annotations and annotate enough to handle Resnets fully. Need to think about gradients, if they could be annotated in an easier way. Only shapes are now exposed to Python, types will follow later. Also the inference is not called yet anywhere but unit test. Also I am not sure if everything is in the best location in the code, but shouldn't be hard to move stuff around. Reviewed By: dzhulgakov Differential Revision: D4436818 fbshipit-source-id: eebee5937ccc9ac09c245465302388a1fae6933c	2017-02-02 22:29:22 -08:00
Alexander Sidorov	2ce3cfefe1	Char-RNN Tutorial Summary: This learns Shakespeare and then generates samples one character at a time. We want this to be an example of using our LSTM and RNNs in general. Now it takes 4ms to run the training net on current parameters (with batch size = 1). I don't have data on how much each operator takes yet. But the overall Python loop doesn't seem to influence much - with 1000 fake iterations in run_net it took 4s for each iteration as expected. Future work: * fixing convergence for batching * profiling on operator level * trying it out with GPUs * benchmarking against existing char-rnn implementations * stacking lstms (one lstm is different from two, one needs to take care of scoping) Reviewed By: urikz Differential Revision: D4430612 fbshipit-source-id: b36644fed9844683f670717d57f8527c25ad285c	2017-02-02 15:44:32 -08:00
Alisson Gusatti Azzolini	d7e85bf38e	Fix ops.stop_if() from inside processors Summary: stop_if() was not being honored in ProcessingReader. Reviewed By: dzhulgakov Differential Revision: D4497784 fbshipit-source-id: 1c967c6252f832149800796e2c26aadf10b74850	2017-02-02 15:14:27 -08:00
Alisson Gusatti Azzolini	000c53a7b1	AtomicCounter to return previous value on Reset. Summary: This allows to save the previous value of the counter and send it upstream without losing counts. Reviewed By: kennyhorror Differential Revision: D4497854 fbshipit-source-id: 28a7ad0ff1020bde26f78b1f59614b094d1e1881	2017-02-02 14:59:30 -08:00
Alisson Gusatti Azzolini	d93b9eeae2	Fix NetBuilder's task_init Summary: The net was being added to the task body by mistake. Also, adds local_init and local_exit functionality. Reviewed By: dzhulgakov Differential Revision: D4497794 fbshipit-source-id: 4d9dfb48a277ccfa204f1e74886abba5d44c61f8	2017-02-02 14:59:30 -08:00
Zhao Tan	d8dff5853e	Add numSample field for preComputing Summary: For customers like Ads, Feeds, MarketPlace, their training data size is super large. It is unnecessary and costly to go over all the data to compute meta information. In this diff, numSample option is added in preCompute, so users have control over how many samples they want to use when computing meta information. Differential Revision: D4492399 fbshipit-source-id: 7199381d226ee6300a959fc5e116d39984d199fc	2017-02-02 13:59:30 -08:00
Bram Wasti	77fd7c2b6f	Make translator work as command line tool Summary: The initial implementation wasn't working quite right (no const fill of an empty external input) Reviewed By: viswanathgs Differential Revision: D4490569 fbshipit-source-id: 1b2a4f612efb3b2685edfe6c683571dd9d01aa4f	2017-02-01 13:14:26 -08:00
Sean Snyder	79c04d32dc	add an option to use a resnet network instead of alexnet Summary: add an option to use a resnet network instead of alexnet. Modified the resnet.create_resnet50 function slightly to allow specifying different kernel/stride parameters so we can adapt resnet to our image size. Differential Revision: D4472535 fbshipit-source-id: ed06acf52f6425a1e04d047548eb3c70388d74aa	2017-01-31 16:59:30 -08:00
Alexander Sidorov	b7fa6b2a8b	remove recurrent_inputs in favor of recurrent_input_ids Summary: I forgot to remove this one. The rest of indexing instead of string names is coming after D4446813 lands, as scratches aren't inputs or outputs and thus can't be indexed. Reviewed By: urikz Differential Revision: D4465748 fbshipit-source-id: 2ccbedfb35541ef4a2231d1480eef59025bd5290	2017-01-31 13:14:33 -08:00
Alexander Sidorov	d019ec793c	improve flaky test Summary: On some inputs TestWarden was failing Reviewed By: Yangqing Differential Revision: D4487293 fbshipit-source-id: 3da4b310a619c2b57f033b2dd7727f71403bfd68	2017-01-30 22:14:27 -08:00
Yury Zemlyanskiy	debd256177	Fix for gradient propagation for initial recurrent state for RecurrentNetwork Summary: looks like we don't do a good job with initial recurrent input gradients yet. Here is some fix, but gradient doesn't check yet. The shape is correct now though Reviewed By: salexspb Differential Revision: D4475447 fbshipit-source-id: 280f1f59f19e487fd0dce0d440609c50ddce294a	2017-01-30 18:59:32 -08:00
Alisson Gusatti Azzolini	0700e05e68	Disallow duplicate field names in Struct Summary: title. Differential Revision: D4482958 fbshipit-source-id: a732f6b5d862b440a4856251ad68ecd98f60e8d1	2017-01-30 14:44:28 -08:00
Alisson Gusatti Azzolini	1d3834eeb2	Nodes to support resource requirements and outputs Summary: See distributed.py for example of usage Reviewed By: xianjiec Differential Revision: D4467723 fbshipit-source-id: c74f71bebaa1751098379838d3da55945aac62bd	2017-01-30 11:29:25 -08:00
Yangqing Jia	8553bd3f68	Ensure we are not using Eigen LGPL code, and build on raspbian. Summary: Turns out that building on raspbian is easy as a cake for caffe2 - cmake is awesome. Closes https://github.com/caffe2/caffe2/pull/112 Differential Revision: D4480985 Pulled By: Yangqing fbshipit-source-id: 5dbe5e1e71d8680dea7a5ec8a9ce7fbe6aa5270a	2017-01-30 09:44:27 -08:00
Alisson Gusatti Azzolini	14a5b35805	Snapshot -> Checkpoint Summary: As per kennyhorror request. Reviewed By: kennyhorror Differential Revision: D4473177 fbshipit-source-id: 6cab6ccf247b09aab8f6f056c807bd3ed27ee6a5	2017-01-27 22:29:32 -08:00
Andrey Malevich	86fb25cefa	Rely on embedding size in split Summary: As desc. Differential Revision: D4471823 fbshipit-source-id: 2685c64c22556da1749b3e3e6b21a684a7231e7b	2017-01-27 19:44:31 -08:00
Viswanath Sivakumar	eba5299576	Port ROIPool to caffe2 trunk, add CPU implementation Summary: Xray is being converted to c2 and ROIPool (needed for detection models) is missing in c2 trunk. Ported rbgirshick's implementation from experimental with a few changes: Also added code for translation in caffe_translate.py Differential Revision: D4453331 fbshipit-source-id: 7a05a88edec1bd6e806e52dc1e6c55bc75c3149f	2017-01-27 12:59:20 -08:00
Yury Zemlyanskiy	22e1bdd6d1	Use stack workspaces in RecurrentNetwork Summary: This diff use stack workspaces in RecurrentNetwork, which allows to simplify the implementation and get rid of scratches. Reviewed By: salexspb Differential Revision: D4446813 fbshipit-source-id: 514eec7e4300bdf492a9cb192b40cf4f89acf656	2017-01-27 11:44:26 -08:00
Ou Jin	ed04a20289	distributed reader for evaluation Summary: Using multiple readers for model evaluation. Since it is built by new framework, only NativeLoader is supported. With 5 readers, the evaluation speed is 124k. The speed for single evaluator is 32k. There is still room for improvement since the evaluator machine is under-utilized. (Hive is the bottleneck. Adding more loading threads help to improve the speed to 240k. More readers can improve it further.) Reviewed By: azzolini Differential Revision: D4469393 fbshipit-source-id: b55af5f798faca4c150b2c0663fe5db0f154cb70	2017-01-27 10:44:24 -08:00
Vsevolod Oparin	319945df15	Test for FC operator + fix for docs Summary: Test for FC operator + fix for docs Differential Revision: D4473293 fbshipit-source-id: 6e6ebad007ee08b05184fda288ab74982c6b2219	2017-01-27 10:44:24 -08:00
Fei Sun	cc65cc64c8	Create function ParseProtobufFromLargeString to parse strings more than 64MB Summary: Replace ParseFromString with ParseProtobufFromLargeString to get around the limitation of the 64MB limit. Reviewed By: Yangqing Differential Revision: D4466226 fbshipit-source-id: b68a6efc76955db294ddb0d23bbaf03b69e4952a	2017-01-27 10:29:22 -08:00
Viswanath Sivakumar	ca1ff1ee9b	Add Flatten layer, bugfix in InnerProduct Summary: Uncovered these while converting xray detection model. Differential Revision: D4461051 fbshipit-source-id: 1654c0d7ed101c8c211a93aed6bb542db1e20e0a	2017-01-26 21:44:35 -08:00
Bram Wasti	9dd1d9428e	Made translator work as command line tool Summary: Might be useful to have a command line version of this. Thoughts? Reviewed By: Yangqing Differential Revision: D4456221 fbshipit-source-id: 42dd464c5734c0cfbd4c2b1cb348aef9b269b4c2	2017-01-26 20:29:35 -08:00
Dmytro Dzhulgakov	864f561525	Make BlobDeserialization throw exceptions instead of returning bool Summary: Makes it much nicer to spot errors, especially in iPython notebook. Reviewed By: kennyhorror Differential Revision: D4465726 fbshipit-source-id: c0adaf5168248a70987ff9d5dfce54a622ff2219	2017-01-26 09:44:19 -08:00
Alexander Sidorov	8bff8014b3	print out inputs in lstm test to catch when it is flaky Summary: We get flaky lstm tests on a numerical gradient check. I would like to improve accuracy of the latter. But first need an example. After landing this TestWarden would find a bad input for me. Reviewed By: urikz Differential Revision: D4467223 fbshipit-source-id: 68d4bf22af11190f39fa28332c6d99efbb192132	2017-01-25 20:59:21 -08:00
Minsuk (Brian) Kahng	de8cd46416	Caffe2 graph to json for visualization in flow Summary: - Writing a Caffe2 computation graph to json for visualization in Flow - Example use in the Text models workflow: it replaces the existing draw function which produces PNG file - Visualization: https://our.intern.facebook.com/intern/fblearner/c2graphvis/13215753/ - The visualization uses FBLearnerDAG. Plan to add many visualization-related features. Reviewed By: Mortimerp9 Differential Revision: D4415299 fbshipit-source-id: 2d641d60177566ed2837fb3750394420690f28de	2017-01-25 19:44:20 -08:00
Andrew Tulloch	0f870d4f40	Add error checking for too-small input in ConvPoolOpBase Summary: Fixes segfaults that occur in Eigen and im2col/sgemm backends. Reviewed By: Yangqing Differential Revision: D4451772 fbshipit-source-id: 3cf21e5afb2fe300db4228933a82063db5f7091f	2017-01-25 17:44:22 -08:00
Viswanath Sivakumar	9775ffc6ae	Fixes to topological sort, canonical blob naming, sharing final blob Summary: Three small changes: Reviewed By: ajtulloch Differential Revision: D4437131 fbshipit-source-id: c849e36e1c4d1dce947076349df863fafe62c66d	2017-01-25 15:14:26 -08:00
Viswanath Sivakumar	a4ba0cceb2	Run memonger to optimize net if needed Summary: This runs memory optimization on the net. Differential Revision: D4433788 fbshipit-source-id: 80c3f0568795c2d7a5beb3cdb89a92af91162fef	2017-01-25 15:14:26 -08:00
Priya Goyal	40ce50e0bd	Speed-up training, fast data-augmentation, sync data_parallel_model changes + other small fixes Summary: 1. Use opencv for data augmentation after benchmarking various image libraries in python 2. Use cuda no bias conv 3. Use cuda fastest conv (exhaustive search) 4. data_parallel_model had a few changes. Syncing them 5. propagate the errors in threads to make debugging easy Reviewed By: rbgirshick Differential Revision: D4341422 fbshipit-source-id: aa4471a2f49dd6d7ca13879999b3c7ceaf818c1e	2017-01-25 11:44:22 -08:00
Dmytro Dzhulgakov	aed53dd7cf	Pass cmd flags of GlobalInit down to workers in Flow Summary: It's a similar trick to dyndeps. The idea is that global state is better to be just replicated to gang workers as otherwise it causes a lot of confusion. In particular it's useful if one wants to enable detailed logging (--v) For other operators user still needs to call GlobalInit explicitly. We should consider doing it for all Flow operators, but I'll leave it for future considerations. Reviewed By: kennyhorror Differential Revision: D4460686 fbshipit-source-id: 5836737dd3195f9ad12589fd899a3ff63f173e05	2017-01-25 11:14:51 -08:00
Xianjie Chen	ddbf90afa3	improve dper dh Summary: it's broken because it relies on add sparse bias. it's not easy to add_sparse_bias after switch to loader_param. DPA would like to try it out :) Differential Revision: D4447275 fbshipit-source-id: 631cb4995f35383070e44387dc86692ba64b91eb	2017-01-25 02:59:22 -08:00
Yury Zemlyanskiy	0e3146e1e8	Remove recurrent_sizes from RecurrentNetwork Summary: Remove usage of recurrent_sizes, so recurrent states' sizes can depend on input (in case of attention matrix for beam decoder). I removed recurrent_sizes from forward and backward steps. Reviewed By: salexspb Differential Revision: D4427688 fbshipit-source-id: 580420a294d309c86ec5cb4e677058623b7228e1	2017-01-24 23:14:25 -08:00
Vsevolod Oparin	5e5486491d	Replace Gather + RowMul by SparseLengthsWeightedSum Summary: Improving performance using command SparseLengthsWeightedSum. Results for my run: Before: 8.98474 RowMul 6.89952 Gather 0.80991 LengthsSum 2.02056 SparseLengthsWeightedSum Total: 18.71 After: 1.075 Gather 6.54999 SparseLengthsWeightedSum Total: 7.62 Log of run: P56992396 With skip_backward. Command: CLASSPATH=/mnt/vol/gfsetlprocstore-oregon/users/cxj/hivereader-wrapper-1.0-SNAPSHOT-standalone.jar OMP_NUM_THREADS=1 MKL_NUM_THREADS=1 MKL_DYNAMIC=FALSE ./buck-out/gen/caffe2/caffe2/fb/dper/tools/speed_benchmark.par -loader_param /mnt/vol/gfsfblearner-altoona/flow/data/2017-01-22/d832bb7b-5598-422e-9fee-b3299a9c8c1f -negDownsampleRate 0.1 -hidden 'unary(dot{"num_dense": 6, "pooling_method": "PositionWeighted"}(128, 64)128-128, 1)' -model_type mlp_sparse -warmup_runs 10 -main_runs 1000 -run_individual -skip_backward 2>&1 | tee /tmp/log.txt Before: P56993234$7509 After: P56992503$7344 Command: ./fblearner/nn/ads/canary all https://our.intern.facebook.com/intern/fblearner/details/13320564/?notif_channel=cli Cloned "caffe2 ads sparse nn canary" run: https://our.intern.facebook.com/intern/fblearner/details/13322337/ Reviewed By: xianjiec Differential Revision: D4451073 fbshipit-source-id: 0a4e9693d7b8b0372b2efefa61154e987a493210	2017-01-24 20:44:21 -08:00
Alexander Sidorov	b1472a173a	don't hardcode outputs order to work only for lstm + don't pass blob names for parameters Summary: In this diff I stop passing parameters by name and also remove hardcoded output ids which were there specifically for LSTM to work. It also avoids using recurrent_sizes in the backward pass (for forward this is done in D4427688). Using a similar technique it should be simple enough to eliminate blob name passing altogether. Then we can fix scoping. These can be done in a next diff. Reviewed By: urikz Differential Revision: D4444614 fbshipit-source-id: 3580a76365502b9f2f09e3d8b7e78084ca739f00	2017-01-24 16:29:23 -08:00
Alexander Sidorov	f09da676d7	CNNModelHelper.LSTM test Summary: lets have a test for this so we don't break existing usecases while iterating over RecurrentOp's code Reviewed By: urikz Differential Revision: D4456404 fbshipit-source-id: 79f2b88c1eed16106adf5b793b4c74441c7146c6	2017-01-24 15:59:24 -08:00
Chao Zhang	96fc095ccb	Add piecewise linear transformation operator Summary: A new operator is added for model calibration. Given a piecewise linear function and raw prediction as input, generate the mapping as output. Details can be found in the operator doc. Differential Revision: D4418640 fbshipit-source-id: f8ff3ea786b0fe233a4ddcb709e5dbf0861ca484	2017-01-23 17:44:26 -08:00
Bram Wasti	b5424c9646	Enable top-k accuracy option in caffe_translator Summary: Caffe2 has a topk accuracy op now Differential Revision: D4450387 fbshipit-source-id: 2d516cc44fb4e814ca901e73746b0364a0584217	2017-01-23 14:29:24 -08:00
Simon Layton	7acdece3b2	Comment out NHWC Alexnet test for now Summary: Relies on NHWC implementation of group conv which doesn't exist right now Closes https://github.com/caffe2/caffe2/pull/103 Differential Revision: D4451635 Pulled By: Yangqing fbshipit-source-id: 31d99b37abf7563a26389f47affcc759ce6bc5e1	2017-01-23 13:59:29 -08:00
Yangqing Jia	e3ea3e8c12	MKL convolution operator Summary: Closes https://github.com/caffe2/caffe2/pull/102 Differential Revision: D4448886 Pulled By: Yangqing fbshipit-source-id: 914d11cd79107895a9755154df3526fcf71a31ea	2017-01-23 09:59:30 -08:00
Ross Girshick	e0c90de6e6	Speedup get_op_ids_in_path Summary: Perf bug report: https://www.facebook.com/groups/1405155842844877/permalink/1617904561570003/ Diagnosis: I've done some digging into this and here's what I've found: (1) In this use case, the call is disallowed_op_ids = get_op_ids_in_path(ssa, blob_versions, [], inputs) where inputs = ['res4_22_sum'] is the last blob produced by the res4 stage of a ResNet101 model. (2) get_op_ids_in_path has exponential running time in the number of blocks in the res4 stage of ResNet. This is based on empirical running times. This call should complete in 4.5 days on my devgpu. (3) I haven't familiarized myself enough with the IR and SSA code in core.py to understand the algorithmic fix yet, but surely there's a more efficient algorithm to compute the same thing. Reviewed By: Yangqing Differential Revision: D4446278 fbshipit-source-id: 8bd147f92d62b865dc355d5802a53e92d64b6e21	2017-01-23 09:44:26 -08:00
Alexander Sidorov	c4b640aeb2	@debug decorator to make it easier to use drop-in debugger Summary: Now it takes two lines to get drop-in debugger: import it and then decorate your function. Also got rid of enable / disable logic as it doesn't seem useful. We can also try to enable this by default for our tests when running locally as a next step. Reviewed By: bwasti Differential Revision: D4444299 fbshipit-source-id: 6e2006945d8ad640685b1017ca1bd63054728908	2017-01-23 09:44:26 -08:00
Andrey Malevich	ec51f887bf	Create only one instance of SigridTransform in DPerExample. Summary: DPer example has been creating multiple copies of the transform config in the net definition until now, which made me hit the ProtoBuf limit (64MB) for certain Task requests (especially visible because of the ValidationPipeline that I was adding). After this diff we're going to store SigridTransforms in one instance per machine for training (or 1 instance per reading). The difference in plan sizes for a simple SparseNN model is ~30 MB (even though the second model has a validation plan as well). TODO: Do similar logic for NNPreProc as well (it's also pretty large). Reviewed By: dzhulgakov Differential Revision: D4441441 fbshipit-source-id: 4452dd86a4dc49b2c7f5b7642f443aed5720b047	2017-01-22 19:29:16 -08:00
Aapo Kyrola	06398e9bfb	softmax-with-loss, handle gracefully cases when total weight is 0 Summary: Spatial Softmax allows specifying locations that are not counted for the loss. If none of the locations are counted, this resulted in NaNs, and headache. This diff fixes that by explicitly handling these cases. + assertion for label blob dimension(0) Created a new test as well. Differential Revision: D4442939 fbshipit-source-id: 8641bfad2a994e517ca3eda39345380a6ca1ba50	2017-01-20 15:29:21 -08:00
Aapo Kyrola	e18643f90b	More fixes Summary: When testing the code, a couple of issues arose: - we need to have a different name for the last layer than the preprocessed model, otherwise a shape assertion is created - preprocess_noaugmentation still needs to do a crop for images larger than 227x227, otherwise things fail. Reviewed By: viswanathgs Differential Revision: D4442700 fbshipit-source-id: 05f54e7f17c266280f5ba5bb57af1721fe30df12	2017-01-20 13:44:24 -08:00
Kevin Matzen	6a7dd236fa	instance norm Summary: Added gradient and GPU implementation to caffe2 InstanceNorm op Reviewed By: Yangqing Differential Revision: D4304808 fbshipit-source-id: 6feecaed589ea9f825260a49b39b4260da6e5426	2017-01-20 12:29:28 -08:00
Alexander Sidorov	3f66f66da9	DebugMode helper for Caffe2 Summary: It helps to develop scripts locally (when working outside of Flow). One doesn't have to rerun the script in order to catch exception in the debugger / add a print statement. (Flow does this kind of thing automatically) Usage example: ``` if __name__ == '__main__': workspace.GlobalInit(['caffe2', '--caffe2_log_level=2']) from caffe2.python.utils import DebugMode DebugMode.enable() DebugMode.run(main) ``` Reviewed By: Yangqing Differential Revision: D4424096 fbshipit-source-id: 73f418c80f581820e70139df7e166981e4d8c55f	2017-01-20 09:29:31 -08:00
Aapo Kyrola	afe822ebd7	Small tweaks Summary: Some tweaks, hopefully getting us to 0.98 MAP - no cropping for test dataset (as per patrick) - spatialBN momentum 0.1 (default is 0.9) Also added some additional logging and reduced frequency of running of test net and logging. Reviewed By: viswanathgs Differential Revision: D4439790 fbshipit-source-id: 700705b811a5fc8c7139a265de96db646605ca5a	2017-01-19 18:44:26 -08:00
Ahmed Taei	411059d649	Generate huffman tree Summary: In this diff : [1] Change the output from generating all paths from root to labels to TreeProto. TreeProto itself is required by inference and we can use hsm_util to get the paths from TreeProto. [2] Fix hsm_util index assignment. Differential Revision: D4416731 fbshipit-source-id: 657d8b9b4df6fa30c9f92d391cf7e07b5c5db1f8	2017-01-19 16:14:23 -08:00
Ahmed Taei	dd51336611	Fix label start index for HuffmanTreeHierarchyOp Summary: Change labels indices range to be in the range [0, num_classes[ Differential Revision: D4416685 fbshipit-source-id: b16ca8539fd538ad62bf1298dbad3f1553956241	2017-01-19 15:14:53 -08:00
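
Commit 000c53a7b1 in the list above makes AtomicCounter return its previous value on Reset, so that the count can be sent upstream without losing increments. A minimal Python sketch of that idea follows. It is not the Caffe2 implementation: the class below and its `increment` and `reset` methods are hypothetical names chosen for illustration, and a plain `threading.Lock` stands in for whatever synchronization the real operator uses.

```python
import threading


class AtomicCounter:
    """Hypothetical thread-safe counter (illustration only, not Caffe2 code)."""

    def __init__(self, initial=0):
        self._value = initial
        self._lock = threading.Lock()

    def increment(self, delta=1):
        with self._lock:
            self._value += delta

    def reset(self, new_value=0):
        # Read and overwrite under the same lock, so no increment can land
        # between "read the old value" and "write the new one".
        with self._lock:
            previous = self._value
            self._value = new_value
            return previous


def worker(counter, n):
    """Increment the shared counter n times."""
    for _ in range(n):
        counter.increment()


counter = AtomicCounter()
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Accumulate what each reset hands back; nothing counted so far is dropped.
reported = counter.reset()
counter.increment(5)
reported += counter.reset()
```

If reset discarded the old value, a caller would have to read and then reset in two steps, and any increment arriving between them would be lost; returning the previous value from the same critical section avoids that gap.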

199 Commits