Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42927
added fp16 fusion to net transforms
refactored the transforms as well as glow_transform to move them out of opt/custom so that the OSS builds pass
Test Plan: added net runner tests for this
Reviewed By: yinghai
Differential Revision: D23080881
fbshipit-source-id: ee6451811fedfd07c6560c178229854bca29301f
Summary:
add a fuse path for deq->swish->quant
update the swish fake op interface to take the corresponding arguments
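A minimal sketch of the kind of rewrite this adds, over a simplified list-of-dicts stand-in for a Caffe2 NetDef; the op names here (Int8Dequantize, Swish, SwishInt8) are illustrative, not the exact registered names.
```
# Hedged sketch, not the actual pass: collapse Dequantize -> Swish -> Quantize
# into one op that carries the output quantization params.
def fuse_deq_swish_quant(ops):
    fused, i = [], 0
    while i < len(ops):
        if (i + 2 < len(ops)
                and ops[i]["type"] == "Int8Dequantize"
                and ops[i + 1]["type"] == "Swish"
                and ops[i + 2]["type"] == "Int8Quantize"
                # intermediate outputs must feed directly into the next op
                and ops[i]["output"] == ops[i + 1]["input"]
                and ops[i + 1]["output"] == ops[i + 2]["input"]):
            fused.append({
                "type": "SwishInt8",                    # hypothetical fused op name
                "input": ops[i]["input"],
                "output": ops[i + 2]["output"],
                # absorb the quantization args of the trailing Int8Quantize
                "args": dict(ops[i + 2].get("args", {})),
            })
            i += 3
        else:
            fused.append(ops[i])
            i += 1
    return fused

net = [
    {"type": "Int8Dequantize", "input": "x_int8", "output": "x_fp"},
    {"type": "Swish", "input": "x_fp", "output": "y_fp"},
    {"type": "Int8Quantize", "input": "y_fp", "output": "y_int8",
     "args": {"Y_scale": 0.05, "Y_zero_point": 128}},
]
print(fuse_deq_swish_quant(net))  # -> one SwishInt8 op from x_int8 to y_int8
```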
Test Plan:
net_runner passes
unit tests need to be updated
Reviewed By: venkatacrc
Differential Revision: D22962064
fbshipit-source-id: cef79768db3c8af926fca58193d459d671321f80
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42591
We don't support lowering with 2-input Int8Quantize and 4-input Int8FC. Just do a conversion to absorb the quantization params into the op itself.
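Roughly, the conversion looks like the sketch below (plain dicts standing in for OperatorDef; the Y_scale/Y_zero_point argument names, and the assumption that the qparam blob's values are known at transform time, are mine for illustration):
```
# Hedged sketch: fold the quantization-params input blob into op arguments so
# the op has the canonical input count and can be lowered.
def absorb_qparams(op, qparam_values):
    qparam_input_count = {"Int8Quantize": 2, "Int8FC": 4}
    n = qparam_input_count.get(op["type"])
    if n is None or len(op["inputs"]) != n:
        return op  # nothing to convert
    scale, zero_point = qparam_values[op["inputs"][-1]]
    return {
        **op,
        "inputs": op["inputs"][:-1],  # drop the qparam blob
        "args": {**op.get("args", {}), "Y_scale": scale, "Y_zero_point": zero_point},
    }

op = {"type": "Int8Quantize", "inputs": ["x", "qparam"], "outputs": ["x_int8"]}
print(absorb_qparams(op, {"qparam": (0.02, 0)}))
```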
Test Plan:
```
buck test caffe2/caffe2/quantization/server:quantize_dnnlowp_op_test
```
Reviewed By: benjibc
Differential Revision: D22942673
fbshipit-source-id: a392ba2afdfa39c05c5adcb6c4dc5f814c95e449
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41464
If the input is int8 rowwise quantized, we currently cannot lower it to Glow, and previously we hit an error when running with in-batch broadcast. The main issue is that the Tile op doesn't support the uint8_t type, which is easily added here. However, that alone would leave Tile -> Fused8BitRowwiseQuantizedToFloat on the host side, which probably hurts memory bandwidth a lot. Even if we later add Fused8BitRowwiseQuantizedToFloat support to Glow, it is still not ideal because we would be doing redundant compute on identical columns. So the solution here is to swap the order of Tile and Fused8BitRowwiseQuantizedToFloat to make it Fused8BitRowwiseQuantizedToFloat -> Tile. This immediately resolves the error we saw. In the short term we can still run Tile on card, and in the longer term things run faster on card.
The optimization is a heuristic: if the net doesn't contain such a pattern, in-batch broadcast works as it did before.
(Note: this ignores all push blocking failures!)
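To make the intent concrete, here is a hedged sketch of the reordering over simplified op dicts (not the actual caffe2/opt/custom pass), assuming the desired final order is Fused8BitRowwiseQuantizedToFloat -> Tile:
```
# Dequantize the small batch-1 blob first, then Tile its float output, instead
# of tiling the uint8 blob and dequantizing N identical copies.
def reorder_tile_and_dequant(ops):
    out, i = [], 0
    while i < len(ops):
        if (i + 1 < len(ops)
                and ops[i]["type"] == "Tile"
                and ops[i + 1]["type"] == "Fused8BitRowwiseQuantizedToFloat"
                and ops[i]["output"] == ops[i + 1]["input"]):
            tile, deq = ops[i], ops[i + 1]
            fp_blob = tile["input"] + "_fp"
            out.append({**deq, "input": tile["input"], "output": fp_blob})
            out.append({**tile, "input": fp_blob, "output": deq["output"]})
            i += 2
        else:
            out.append(ops[i])
            i += 1
    return out

ops = [
    {"type": "Tile", "input": "X_q", "output": "X_q_tiled", "args": {"tiles": 32}},
    {"type": "Fused8BitRowwiseQuantizedToFloat", "input": "X_q_tiled", "output": "X_fp"},
]
print(reorder_tile_and_dequant(ops))
```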
Test Plan:
```
buck test caffe2/caffe2/opt/custom:in_batch_broadcast_test
```
Reviewed By: benjibc
Differential Revision: D22544162
fbshipit-source-id: b6dd36a5925a9c8103b80f034e7730a7a085a6ff
Summary: add logit and swish to this list
Test Plan: f203925461
Reviewed By: amylittleyang
Differential Revision: D22506814
fbshipit-source-id: b449e4ea16354cb76915adb01cf317cffb494733
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40318
rename the layernorm fakefp16 op to follow the right naming convention
add it to the map of replacement ops
this can be done even if the operator is not complete because we are blacklisting it anyway
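For illustration only, the replacement map amounts to something like the sketch below (the op names and map contents are assumptions, not the real table):
```
# Ops found in the map get swapped for their fake-fp16 emulation counterparts;
# anything whose net_pos is blacklisted is left untouched.
FAKE_FP16_REPLACEMENTS = {
    "FC": "Fp16FCAcc16NNPI",               # illustrative entry
    "LayerNorm": "LayerNormFakeFP16NNPI",  # the renamed op this diff adds
}

def replace_with_fake_fp16(ops, blacklist_pos=()):
    for op in ops:
        if op.get("args", {}).get("net_pos") in blacklist_pos:
            continue
        op["type"] = FAKE_FP16_REPLACEMENTS.get(op["type"], op["type"])
    return ops
```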
Test Plan: net_runner and inspected the log that replacement happened
Reviewed By: venkatacrc
Differential Revision: D22145900
fbshipit-source-id: f19794ec05234b877f7697ed8b05dd8f46606c47
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40081
Add the ability to time out an OnnxifiOp run so that, if the backend hangs, the op can error out quickly.
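The contract is easiest to show with a small Python sketch (the real change lives in the C++ OnnxifiOp; the helper below is purely illustrative):
```
# If the backend run does not finish within the deadline, fail fast instead of
# hanging the caller. The hung worker thread is abandoned, not killed.
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def run_with_timeout(run_backend, timeout_s):
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(run_backend)
    try:
        return future.result(timeout=timeout_s)
    except TimeoutError:
        raise RuntimeError("Onnxifi backend run timed out after %ss" % timeout_s)
    finally:
        pool.shutdown(wait=False)  # don't block on a hung worker
```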
Test Plan:
```
buck test glow/fb/test:test_onnxifinnpi -- test_timeout
```
Reviewed By: jackm321
Differential Revision: D22064533
fbshipit-source-id: 25487287c10ab217eb95692f09d48e13e19436ab
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40078
ATT. It's good to have net_pos on all the ops so that we can distinguish each op in the minimizer in net_runner.
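In spirit, the annotation is just the sketch below (simplified op dicts; the real helper operates on the NetDef):
```
# Give every op a unique net_pos argument so later passes (minimizer,
# blacklisting) can refer to ops by position rather than by order.
def annotate_net_pos(ops):
    for pos, op in enumerate(ops):
        op.setdefault("args", {}).setdefault("net_pos", pos)
    return ops
```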
Test Plan: unittest
Reviewed By: ipiszy, ChunliF
Differential Revision: D22062748
fbshipit-source-id: 5266abdb6dde63055fdffdba6e8d65bd0f221d7b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39112
Allow int8 packed weights in the int8 model to deserialize to the original format. Set the default deserialization behavior in eval workflows to the original format.
Test Plan: Tested with workflow: f192797187
Reviewed By: yinghai
Differential Revision: D21737940
fbshipit-source-id: 7afaf307b16cb4e85e61f019356f83fdab772c57
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38507
With `--merge_fp32_inputs_into_fp16` we added some ops to the net without net_pos, which makes the cardinality of the blacklist positions smaller than the number of ops in the net. Previously, the updateInternalState() function of the minimizer would just enter an infinite loop. This diff fixes it by changing the loop condition.
Reviewed By: tracelogfb
Differential Revision: D21578777
fbshipit-source-id: 0d5373fa0a417ded1c80a2dc03248c07b1e0a320
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37675
Original commit changeset: 2c2481e3d497
(Note: this ignores all push blocking failures!)
Test Plan: Back out D21262085 due to ASAN crash P130123493
Differential Revision: D21353550
fbshipit-source-id: c43c8764322f7e58aca0c1360b1d03966b1d9798
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37535
Fuse ClipRanges + GatherRanges + SigridHash -> ClipRangesGatherSigridHash
The dpa_product_ctr model's dper2-to-dper3 migration is blocked by 3.6% higher prospector CPU usage. The root cause was traced down to the sigrid transforms, where ClipRanges, GatherRanges, and SigridHash are called separately instead of fused, as is the case in dper2.
Further context:
https://fb.quip.com/GijaAZtX5mav
https://fb.quip.com/pIDdAjJP2uiG
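For intuition, the fusion amounts to the hedged sketch below (simplified op dicts; the exact input/output wiring and argument handling of the real ClipRangesGatherSigridHash op are assumptions):
```
# Collapse an adjacent ClipRanges -> GatherRanges -> SigridHash chain into one
# op that takes the chain's external inputs and carries all three ops' args.
def fuse_sigrid_chain(ops):
    fused, i = [], 0
    while i < len(ops):
        chain = ops[i:i + 3]
        if [o["type"] for o in chain] == ["ClipRanges", "GatherRanges", "SigridHash"]:
            produced = {b for o in chain for b in o["outputs"]}
            ext_inputs = [b for o in chain for b in o["inputs"] if b not in produced]
            fused.append({
                "type": "ClipRangesGatherSigridHash",
                "inputs": ext_inputs,
                "outputs": chain[-1]["outputs"],
                "args": {k: v for o in chain for k, v in o.get("args", {}).items()},
            })
            i += 3
        else:
            fused.append(ops[i])
            i += 1
    return fused
```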
Test Plan:
Local benchmarking with small model 181513584_0
(Dper3 full model is 178772812, dper2 refresh is 178770392)
Transform turned on: P129799373
Iters per second: 609.291
Transform turned off: P129799397
Iters per second: 519.088
We also want to confirm this performance on the full model in canary and in qrt.
`buck build mode/opt-clang mode/no-gpu caffe2/caffe2/fb/predictor:ptvsc2_predictor_bench`
`MKL_NUM_THREADS=1 OMP_NUM_THREADS=1 ./buck-out/opt/gen/caffe2/caffe2/fb/predictor/ptvsc2_predictor_bench --pred_net=/data/users/ansha/tmp/dpa/small_pred_net.pb --c2_model=/data/users/ansha/tmp/dpa/181513584_0.predictor --c2_inputs=/data/users/ansha/tmp/dpa/c2_inputs_small.pb --iters=3000 --warmup_iters=100 --num_threads=32 --c2_apply_nomnigraph_passes=1 --caffe2_predictor_enable_preproc_fusion=1`
Prospector canary:
https://our.intern.facebook.com/intern/ads/canary/426280288521552095/
Check that ClipRangesGatherSigridHash is used: https://fburl.com/scuba/caffe2_operator_stats_canary/e6qfdsat
Reviewed By: yinghai
Differential Revision: D21262085
fbshipit-source-id: 2c2481e3d4977abb8abe6e9ef0c9999382320ab2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35555
ATT. This lets us lower the SparseLengthsSum* part of SparseLengthsSum*Sparse. We update the tying policy between Gather and SparseLengthsWeightSum* so that we don't bother lowering a single Gather into the backend, which is inefficient to execute on card and creates bubbles between contiguous lowered graphs.
Test Plan:
```
buck test glow/fb/test:test_onnxifinnpi
```
Reviewed By: ipiszy
Differential Revision: D20688525
fbshipit-source-id: cb8e38239057ff13a8d385ed09d0d019421de78b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34836
Once the SigridHashOp argument is supplied, I realized the shape inference is still wrong because the argument is not carried into the debug_ssa. Thanks to yinghai for catching that the converter wasn't fixed; fixing it in this diff.
Test Plan:
Ran the binary and checked the exported op:
op {
  input: "sequential_250/parallel/normalization/dper_feature_normalization/sparse_features_processor/sparse_feature_transform/gather_ranges_GSF_IDLIST_COOCCUR_APP_ID_NEKO_ORGANIC_1D_7D_INSTALL_V1/gathered_values_0"
  output: "sequential_250/parallel/normalization/dper_feature_normalization/sparse_features_processor/sparse_feature_transform/sequential_1/hash_feature_ids/SigridHash:0_0"
  type: "SigridHash"
  arg {
    name: "salt"
    i: 0
  }
  arg {
    name: "maxValue"
    i: 100000
  }
  arg {
    name: "hashIntoInt32"
    i: 1
  }
  arg {
    name: "net_pos"
    i: 3
  }
}
It now has hashIntoInt32.
Reviewed By: yinghai
Differential Revision: D20457057
fbshipit-source-id: 023ade5e66df82037a8f2da3174383dda8aff230
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34976
Previously, we were dropping the original device option info when overriding the operator conversion function.
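The intent of the fix, as a tiny hedged sketch (simplified structures, not the real converter_nomnigraph code):
```
# When an op-specific conversion override rebuilds the OperatorDef, carry the
# original op's device_option over instead of silently dropping it.
def convert_op(op, override):
    new_op = override(op)
    if "device_option" in op and "device_option" not in new_op:
        new_op["device_option"] = op["device_option"]
    return new_op
```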
Test Plan:
```
buck test caffe2/caffe2/opt:converter_nomigraph_test
```
Reviewed By: ipiszy
Differential Revision: D20507277
fbshipit-source-id: 66b5eab07d18651eff27dab2a809cd04872ac224
Summary: make use of springhill's fma on SpatialBatchnorm
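For reference, the arithmetic being targeted is the per-channel multiply-add below (NumPy sketch; the springhill FMA mapping itself is hardware-specific and not shown here):
```
# Inference-time SpatialBN reduces to y = x * scale + bias per channel, i.e.
# one fused multiply-add per element.
import numpy as np

def spatial_bn_inference(x, gamma, beta, mean, var, eps=1e-5):
    # x: (N, C, H, W); gamma/beta/mean/var: (C,)
    scale = gamma / np.sqrt(var + eps)
    bias = beta - mean * scale
    return x * scale.reshape(1, -1, 1, 1) + bias.reshape(1, -1, 1, 1)
```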
Test Plan:
re-enabled the unit test, ran it a couple of times
pending: net runner
Reviewed By: amylittleyang
Differential Revision: D20227767
fbshipit-source-id: 7c601f185940249c0a32bdf95d74a20552cd2625
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34092
Disable the op in the transform map until we get bitwise matching to ice-ref.
Test Plan: CI
Reviewed By: hyuen
Differential Revision: D20177936
fbshipit-source-id: e316384184cb264852e63e5edce721a8614742d1
Summary: update this mapping with the int4 SLS ops so we can run net_runner
Test Plan: testing with net_runner
Reviewed By: jfix71
Differential Revision: D19879826
fbshipit-source-id: eac84b10e2365c21cb8a7cfbf3123e26a9945deb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32935
Mock away the content of the onnxified net with some low-cost ops so that we can still mimic the input/output transfer while doing minimal work on the card.
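Conceptually, the loop-test mock does something like the sketch below (simplified; the exact pairing of inputs to outputs in the real implementation is an assumption here):
```
# Replace the body of the onnxified subnet with cheap Copy ops so the
# input/output transfer still happens while almost no compute runs on card.
def mock_onnxified_net(input_blobs, output_blobs):
    ops = []
    for idx, out in enumerate(output_blobs):
        src = input_blobs[idx % len(input_blobs)]  # pair each output with some input
        ops.append({"type": "Copy", "inputs": [src], "outputs": [out]})
    return ops
```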
Test Plan:
```
buck run glow/fb/test:sparsenn_test -- --gtest_filter='SparseNNTest.vanillaC2' --onnxifi_debug_mode --onnxifi_loop_test_mode --nocaffe2_predictor_use_memonger
```
Differential Revision: D19631971
fbshipit-source-id: f970c55ccb410702f479255eeb750e01e3f8c2ae
Summary:
SpatialBNFakeLoweredFp16NNPI
this is the fake operator for SpatialBN that gets lowered into add/mul/div, etc.
Test Plan: test_spatialbn
Reviewed By: tracelogfb, amylittleyang
Differential Revision: D19658680
fbshipit-source-id: 2abddbcd9a2023ac75c494f20eaac2051b7139dc
Summary: ATT. Since the infra is there.
Test Plan: run it
Reviewed By: amylittleyang
Differential Revision: D19605250
fbshipit-source-id: c68be4d7963afa4fa5f8f60c90f1913605eae516
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32675
It's good to have one location to do the mapping.
Test Plan: Everything still runs.
Reviewed By: amylittleyang
Differential Revision: D19590354
fbshipit-source-id: d8c0d14e4bdf27da3e13bd4d161cd135d6e3822b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30802
Change shape_hints from map<string, TensorShape> to ShapeInfoMap to capture dimType info from the model file.
Reviewed By: ipiszy
Differential Revision: D18821486
fbshipit-source-id: c5d9ed72e158d3698aba38900aeda00f776745b4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30367
use the SLS emulations that match the hardware
Test Plan: replayer test
Differential Revision: D18667605
fbshipit-source-id: 89aee630184737b86ecfb09717437e5c7473e42c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28866
While working on the fix for int32 instead of int64, we also need to take care of ClipRangesGatherSigridHash, since this is the operator that actually gets used during inference.
Test Plan: Added unittest to cover for the new case
Reviewed By: ipiszy
Differential Revision: D17147237
fbshipit-source-id: 2b562b72a6ae8f7282e54d822467b8204fb1055e
Summary: To test the int8 ads models on CPU and accelerators with the ads replayer, we need to load the PREPACKING_INIT_NET_TYPE in the int8 model to initialize the int8 w_packed blobs.
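In caffe2 Python terms, "loading" the prepacking init net amounts to something like this hedged sketch (how the serialized net is obtained from the model file is omitted):
```
# Run the prepacking init net once so the packed int8 weight blobs (w_packed)
# exist in the workspace before the predict net runs.
from caffe2.proto import caffe2_pb2
from caffe2.python import workspace

def load_prepacked_weights(prepacking_init_net_bytes):
    init_net = caffe2_pb2.NetDef()
    init_net.ParseFromString(prepacking_init_net_bytes)
    workspace.RunNetOnce(init_net)
```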
Test Plan:
Ads replayer test.
P74811059
Reviewed By: zrphercule
Differential Revision: D16518888
fbshipit-source-id: cee212710ad37d9e491c970b25b2fe484373e5e4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24384
So that we can use them in other functions.
Reviewed By: yinghai
Differential Revision: D16824289
fbshipit-source-id: 3cb33cfa9a5c479a63db6438aef518209bdfb1f4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24262
Previously, for the onnxifi_blacklist_ops option, we figured out the net_pos based on the order of ops in the net. This logic is wrong if the net already has net_pos assigned, and we may end up blacklisting unintended ops. Fix this by always assigning net_pos before computing any blacklist.
Reviewed By: yinghai
Differential Revision: D16789166
fbshipit-source-id: 2d08a7737d417822f2209adb4dcb24dbb258ff90
Summary:
Overall context: open-source BlackBoxPredictor as the entry point for inference in Caffe2 (a thread-safe abstraction for Caffe2 inference). This should be used in ThroughputBenchmark for the purpose of framework comparison.
This specific diff:
There should be no harm in moving the transformation code to OSS. On the advantages side, we will be able to compare the production Caffe2 setup with PyTorch in the fairest way via ThroughputBenchmark. This approach avoids any complicated transformation registries; building those properly would be a significant engineering effort as well as a production risk. In the past we had SEVs related to transforms being turned off due to various refactors. Given that we don't plan any other significant investments into transformation logic beyond the existing ones (like TVM and Glow), and those also relate to open-source technologies, I came to the conclusion of moving the whole thing to OSS.
Reviewed By: bertmaher
Differential Revision: D16367134
fbshipit-source-id: fc6bacc1be3ff6336beb57cdad58168d3a2b8c28