Commit Graph

15 Commits

Author SHA1 Message Date
Priya Ramani
ac97e953b4 Add dynamic shape support to AOT driver & compiler (#72995)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72995

Add ability to specify input dimensions that need to be dynamic.
Example: if the dimension of size 115 in the input sizes "1,115;1" can be dynamic, specify dynamic_dims as "115".

Also recompiles and updates the CI models and some asm code, since the old ones no longer compile with the compiler changes in context.cpp.
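
A minimal sketch of the flag semantics described above, with all names illustrative rather than taken from the PyTorch source: a dimension is treated as dynamic when its size matches a value listed in dynamic_dims.

```
// Sketch only: mark input dims as dynamic by matching their size against
// the values given in dynamic_dims (e.g. sizes {1, 115} and dynamic 115).
#include <cstddef>
#include <cstdint>
#include <vector>

std::vector<bool> markDynamicDims(const std::vector<int64_t>& input_sizes,
                                  const std::vector<int64_t>& dynamic_sizes) {
  std::vector<bool> is_dynamic(input_sizes.size(), false);
  for (std::size_t i = 0; i < input_sizes.size(); ++i) {
    for (int64_t dyn : dynamic_sizes) {
      if (input_sizes[i] == dyn) {
        is_dynamic[i] = true;  // this dimension may vary at run time
      }
    }
  }
  return is_dynamic;
}
```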

Test Plan: Compiles and runs the BI Bytedoc model with and without dynamic inputs.

Reviewed By: ZolotukhinM

Differential Revision: D34233121

fbshipit-source-id: 35095e549ebd6d3bec98b9abb3f0764366a0ff6f
(cherry picked from commit 33166a9f9ac9194b5df0a35280b57708df255ebd)
2022-02-24 04:30:48 +00:00
Ivan Kobzarev
c32b74cecb [nnc][aot_compiler] Memory formats args to aot_compiler (#72873)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72873

Test Plan: Imported from OSS

Reviewed By: priyaramani

Differential Revision: D34250984

Pulled By: IvanKobzarev

fbshipit-source-id: e723ee64b024883eef78853e1b185b7040cafb09
(cherry picked from commit e9908df045)
2022-02-16 18:39:31 +00:00
Priya Ramani
444191de56 Use default value on empty llvm_code_path (#72758)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72758

Bug: The FLAGS_output_llvm option was recently introduced to specify the LLVM assembly code file. Without the previous default value, the LLVM code is no longer saved to a file when the asmfile input is not specified, which makes the compiled output unusable.

Fix: Fall back to a default value when the output_llvm/asmfile input is not specified.
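
A hedged sketch of the fallback, with illustrative names (the actual flag handling lives in the compiler driver): derive a default ".compiled.ll" path from the model path when the flag is empty, matching the default paths seen in the transcripts elsewhere in this log.

```
// Sketch only: fall back to "<model>.compiled.ll" when no output path is
// given, mirroring paths like bi.compiled.ll / mobilenetv3.compiled.ll.
#include <cstddef>
#include <string>

std::string resolveLlvmCodePath(const std::string& output_llvm_flag,
                                const std::string& model_path) {
  if (!output_llvm_flag.empty()) {
    return output_llvm_flag;  // user-specified path wins
  }
  const std::size_t ext = model_path.rfind(".pt");
  const std::string prefix =
      ext == std::string::npos ? model_path : model_path.substr(0, ext);
  return prefix + ".compiled.ll";
}
```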

Test Plan: Verified that the output is saved to the default .ll file path.

Reviewed By: IvanKobzarev

Differential Revision: D34189107

fbshipit-source-id: ee51e8c17de92d3045690ca871fb9569fc3164d6
(cherry picked from commit 46352d446b)
2022-02-12 00:35:24 +00:00
Mikhail Zolotukhin
a60e2ae037 [TensorExpr] Move AOT compilation logic from aot_compiler.cpp to NNC's to_backend (#70375)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70375

Differential Revision: D33303645

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin, priyaramani

Pulled By: ZolotukhinM

fbshipit-source-id: 01ab9fab9bb0d63f89b06a146d3c5fb6ed7fe52d
(cherry picked from commit aac8e0ed90)
2022-02-02 02:34:55 +00:00
Priya Ramani
8cc9ec2f6b Add option to get input dtype from user (#68751)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68751

Adds an option to get the input dtype from the user for AOT compilation.

Test Plan:
BI model compiles and runs fine
```
(pytorch)  ~/fbsource/fbcode/caffe2/fb/nnc
└─ $ buck run //caffe2/binaries:aot_model_compiler -- --model=bi.pt --model_name=pytorch_dev_bytedoc --model_version=v1 '--input_dims=1,115;1' --input_types='int64;int64'
Building... 8.3 sec (99%) 7673/7674 jobs, 0/7674 updated
WARNING: Logging before InitGoogleLogging() is written to STDERR
W1116 14:32:44.632536 1332111 TensorImpl.h:1418] Warning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (function operator())
E1116 14:32:44.673710 1332111 huge_pages_allocator.cc:287] Not using huge pages because not linked with jemalloc
The compiled llvm assembly code was saved to bi.compiled.ll
The compiled model was saved to bi.compiled.pt
```

> An error is thrown when the numbers of input_dims and input_types don't match:

```
(pytorch)  ~/fbsource/fbcode/caffe2/fb/nnc
└─ $ buck run //caffe2/binaries:aot_model_compiler -- --model=bi.pt --model_name=pytorch_dev_bytedoc --model_version=v1 '--input_dims=1,115;1' --input_types='int64;int64;int64'
.
.
terminate called after throwing an instance of 'c10::Error'
  what():  [enforce fail at aot_model_compiler.cc:208] split(';', FLAGS_input_dims).size() == split(';', FLAGS_input_types).size(). Number of input_dims and input_types should be the same
.
.
.
```
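
A standalone rendering of the check behind that error, as a sketch: the real code uses an enforce macro in aot_model_compiler.cc, and the split() helper below is a stand-in for its own.

```
// Sketch: both semicolon-separated flags must describe the same number of
// inputs; otherwise compilation is aborted with the message shown above.
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

static std::vector<std::string> split(char sep, const std::string& s) {
  std::vector<std::string> out;
  std::stringstream ss(s);
  std::string tok;
  while (std::getline(ss, tok, sep)) out.push_back(tok);
  return out;
}

void checkInputFlags(const std::string& input_dims,
                     const std::string& input_types) {
  if (split(';', input_dims).size() != split(';', input_types).size()) {
    throw std::runtime_error(
        "Number of input_dims and input_types should be the same");
  }
}
```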

Reviewed By: ljk53

Differential Revision: D32477001

fbshipit-source-id: 8977b0b59cf78b3a2fec0c8428f83a16ad8685c5
2021-11-29 21:39:49 -08:00
Ivan Kobzarev
7fbcf79684 [tensorexpr][nnc] Support quantization (#66676)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66676

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D31676329

Pulled By: IvanKobzarev

fbshipit-source-id: 288b41ff4ed603dfaacb465f296997f14bb23c22
2021-10-31 22:49:30 -07:00
Priya Ramani
fa70d72e95 Set kernel func name from AOT Compiler (#67229)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67229

Right now, the assembly code generated for a given method of the model is named wrapper or func by default. The function name is then replaced with a proper kernel_func_name after the target-specific assembly is generated.
This PR propagates the desired kernel_func_name from the aotCompiler API so that the generated function gets the needed name up front and doesn't need to be renamed later.

Note: Most of this change landed in https://github.com/pytorch/pytorch/pull/66337, which had to be reverted because it broke `test_profiler` in `test_jit_fuser_te` by replacing the name generated for the graph with the default kernel_func_name value. This PR fixes that as well.
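
An illustrative sketch of the idea, with hypothetical names standing in for the NNC internals: codegen receives the final kernel name up front, so the symbol never needs a post-hoc rename from a default like "func" or "wrapper".

```
// Sketch only: emit the kernel under its final name directly.
#include <iostream>
#include <string>

std::string emitKernel(const std::string& kernel_func_name) {
  // The generated function is born with the requested symbol name...
  return "define void @" + kernel_func_name + "(...) { ... }";
}

int main() {
  // ...so no rename pass over the generated assembly is needed.
  std::cout << emitKernel("nnc_pytorch_dev_mobilenetv3_v1_forward") << "\n";
  return 0;
}
```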

```
(pytorch)  ~/local/pytorch kname
└─ $ python3 test/test_jit_fuser_te.py
CUDA not available, skipping tests
monkeytype is not installed. Skipping tests for Profile-Directed Typing
........................................<string>:3: UserWarning: torch.cholesky is deprecated in favor of torch.linalg.cholesky and will be removed in a future PyTorch release.
L = torch.cholesky(A)
should be replaced with
L = torch.linalg.cholesky(A)
and
.
.
.
......................<string>:3: UserWarning: torch.symeig is deprecated in favor of torch.linalg.eigh and will be removed in a future PyTorch release.
The default behavior has changed from using the upper triangular portion of the matrix by default to using the lower triangular portion.
L, _ = torch.symeig(A, upper=upper)
should be replaced with
L = torch.linalg.eigvalsh(A, UPLO='U' if upper else 'L')
and
L, V = torch.symeig(A, eigenvectors=True)
should be replaced with
L, V = torch.linalg.eigh(A, UPLO='U' if upper else 'L') (Triggered internally at  ../aten/src/ATen/native/BatchLinearAlgebra.cpp:2492.)
......[W pybind_utils.cpp:35] Warning: Using sparse tensors in TorchScript is experimental. Many optimization pathways have not been thoroughly tested with sparse tensors. Please include the fact that the network is running sparse tensors in any bug reports submitted. (function operator())
/data/users/priyaramani/pytorch/torch/testing/_internal/common_utils.py:403: UserWarning: Using sparse tensors in TorchScript is experimental. Many optimization pathways have not been thoroughly tested with sparse tensors. Please include the fact that the network is running sparse tensors in any bug reports submitted. (Triggered internally at  ../torch/csrc/jit/python/pybind_utils.h:691.)
  return callable(*args, **kwargs)
.....................................................................[W Resize.cpp:23] Warning: An output with one or more elements was resized since it had shape [1], which does not match the required output shape [].This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (function resize_output_check)
[W Resize.cpp:23] Warning: An output with one or more elements was resized since it had shape [1, 5], which does not match the required output shape [5].This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (function resize_output_check)
........................................................................s.......s...s.s....s......s..sss............................
----------------------------------------------------------------------
Ran 503 tests in 37.536s

OK (skipped=10)
```

Test Plan: Imported from OSS

Reviewed By: navahgar, pbelevich

Differential Revision: D31945713

Pulled By: priyaramani

fbshipit-source-id: f2246946f0fd51afba5cb6186d9743051e3b096b
2021-10-27 13:10:49 -07:00
Zhengxu Chen
b55a2500d2 [jit] Remove graph() call from abstract Function interface. (#65967)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65967

Graph is an implementation detail. If a user wants access to the
underlying graph, they should explicitly dynamic cast instead.
ghstack-source-id: 141659819
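
A hedged sketch of the resulting caller-side pattern (GraphFunction is the concrete graph-backed subclass; the helper name here is illustrative):

```
// Sketch: downcast the abstract Function to GraphFunction explicitly
// instead of relying on a graph() method on the interface.
#include <memory>

#include <torch/csrc/jit/api/function_impl.h>

std::shared_ptr<torch::jit::Graph> tryGetGraph(torch::jit::Function& fn) {
  if (auto* gf = dynamic_cast<torch::jit::GraphFunction*>(&fn)) {
    return gf->graph();
  }
  return nullptr;  // not backed by a graph
}
```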

Test Plan: no behavior change.

Reviewed By: gmagogsfm

Differential Revision: D31326153

fbshipit-source-id: a0e984f57c6013494b92a7095bf5bb660035eb84
2021-10-27 11:54:26 -07:00
Priya Ramani
ecf7e96969 [Light] Remove ambiguity from compile_spec names, use actual output type (#67209)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67209

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67198

Fixes a couple of instances where parameters were named method_compile_spec when they were actually compile_specs, which can contain multiple method_compile_specs.
Also uses the output dtype from the buffer.

Test Plan:
Mobilenetv3 compiles and runs fine
```
(pytorch)  ~/fbsource/fbcode/caffe2/fb/nnc
└─ $ PYTORCH_JIT_LOG_LEVEL="aot_compiler" buck run //caffe2/binaries:aot_model_compiler -- --model mobilenetv3.pt --model_name=pytorch_dev_mobilenetv3 --model_version=v1 --input_dims="1,3,224,224"
Downloaded 4501/6195 artifacts, 433.89 Mbytes, 14.3% cache miss (for updated rules)
Building: finished in 06:34.6 min (100%) 20233/20233 jobs, 5467/20233 updated
  Total time: 06:35.0 min
BUILD SUCCEEDED
The compiled llvm assembly code was saved to mobilenetv3.compiled.ll
The compiled model was saved to mobilenetv3.compiled.pt

└─ $ ./compile_model.sh -m pytorch_dev_mobilenetv3 -p /data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/mobilenetv3.pt -v v1 -i "1,3,224,224"
+ VERSION=v1
+ getopts m:p:v:i:h opt
+ case $opt in
+ MODEL=pytorch_dev_mobilenetv3
.
.
Columns 961 to 970
1e-11 *
-4.2304 -3.9674  2.4473 -0.8664 -0.7513  1.2140  0.0010  3.8675  1.2714  2.2989

Columns 971 to 980
1e-11 *
-2.7203  1.6772 -0.7460 -0.6936  4.4421 -0.9865 -0.5186 -1.4441  1.3047 -1.6112

Columns 981 to 990
1e-11 *
 0.1275 -1.8815  2.5105 -0.4871 -2.2342  0.8520  0.8658  1.6180  3.8901 -0.2454

Columns 991 to 1000
1e-11 *
-1.4896  4.1337 -2.6640  0.8226  0.2441 -1.4830 -1.7430  1.8758  0.5481  0.5093
[ CPUFloatType{1,1000} ]
Starting benchmark.
Running warmup runs.
Main runs.
Main run finished. Milliseconds per iter: 276.255. Iters per second: 3.61984
Memory usage before main runs: 104366080 bytes
Memory usage after main runs: 343441408 bytes
Average memory increase per iter: 2.39075e+07 bytes
0 value means "not available" in above
```

Reviewed By: ljk53

Differential Revision: D31698338

fbshipit-source-id: da6c74c1321ec02e0652f3afe6f97bf789d3361b
2021-10-25 17:44:05 -07:00
Natalia Gimelshein
b6fa998892 Revert D31514095: Use kernel_func_name from aotCompiler
Test Plan: revert-hammer

Differential Revision:
D31514095 (7b55dc8340)

Original commit changeset: b70c8e2c7336

fbshipit-source-id: ad4d828f33506e612b51c276149fa0e12b0565d5
2021-10-23 17:17:53 -07:00
Priya Ramani
7b55dc8340 Use kernel_func_name from aotCompiler (#66337)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66337

Right now, the assembly code generated for a given method of the model is named wrapper or func by default. The function name is then replaced with a proper kernel_func_name after the target-specific assembly is generated.
This PR propagates the desired kernel_func_name from the aotCompiler API so that the generated function gets the needed name up front and doesn't need to be renamed later.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D31514095

Pulled By: priyaramani

fbshipit-source-id: b70c8e2c733600a435cd4e8b32092d37b7bf7de5
2021-10-23 02:20:45 -07:00
Priya Ramani
9e3a2babfa Make aotCompile support multiple input sizes (#66727)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66727

Make aotCompile support multiple input sizes
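
As a sketch of the flag parsing this implies (assumed semantics, names illustrative): ';' separates inputs and ',' separates dimensions, so "2,2,2;2,2,2" describes two 2x2x2 inputs.

```
// Sketch only: parse an --input_dims string such as "2,2,2;2,2,2" into
// one size vector per input.
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

std::vector<std::vector<int64_t>> parseInputDims(const std::string& flag) {
  std::vector<std::vector<int64_t>> all_sizes;
  std::stringstream per_input(flag);
  std::string one_input;
  while (std::getline(per_input, one_input, ';')) {
    std::vector<int64_t> sizes;
    std::stringstream per_dim(one_input);
    std::string dim;
    while (std::getline(per_dim, dim, ',')) {
      sizes.push_back(std::stoll(dim));
    }
    all_sizes.push_back(std::move(sizes));
  }
  return all_sizes;
}
```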

Test Plan:
Able to compile and run a model with multiple inputs
```
(pytorch)  ~/fbsource/fbcode/caffe2/fb/nnc
└─ $ PYTORCH_JIT_LOG_LEVEL=aot_compiler buck run //caffe2/binaries:aot_model_compiler -- --model aot_test_model.pt --model_name=aot_test_model --model_version=v1 --input_dims="2,2,2;2,2,2"
Building: finished in 3.2 sec (100%) 7461/7461 jobs, 0/7461 updated
  Total time: 3.4 sec
BUILD SUCCEEDED
[DUMP aot_compiler.cpp:097] graph before shape propagation
[DUMP aot_compiler.cpp:097] graph(%x.1 : Tensor,
[DUMP aot_compiler.cpp:097]       %y.1 : Tensor):
[DUMP aot_compiler.cpp:097]   %3 : int = prim::Constant[value=1]() # :0:0
[DUMP aot_compiler.cpp:097]   %4 : Tensor = aten::add(%x.1, %y.1, %3) # /data/users/priyaramani/fbsource/fbcode/caffe2/test/mobile/nnc/aot_test_model.py:10:15
[DUMP aot_compiler.cpp:097]   return (%4)
(1,.,.) =
  0.3357  0.6137
  0.8472  0.0858

(2,.,.) =
  0.8406  0.2959
  0.6012  0.7184
[ CPUFloatType{2,2,2} ]
(1,.,.) =
  0.7086  0.6398
  0.0579  0.1913

(2,.,.) =
  0.8598  0.3641
  0.5925  0.0200
[ CPUFloatType{2,2,2} ]
here
2
2
graph 0x6130001ee2d0
[DUMP aot_compiler.cpp:118] graph after shape propagation
[DUMP aot_compiler.cpp:118] graph(%x.1 : Float(2, 2, 2, strides=[4, 2, 1], requires_grad=0, device=cpu),
[DUMP aot_compiler.cpp:118]       %y.1 : Float(2, 2, 2, strides=[4, 2, 1], requires_grad=0, device=cpu)):
[DUMP aot_compiler.cpp:118]   %3 : int = prim::Constant[value=1]() # :0:0
[DUMP aot_compiler.cpp:118]   %4 : Tensor(2, 2, 2) = aten::add(%x.1, %y.1, %3) # /data/users/priyaramani/fbsource/fbcode/caffe2/test/mobile/nnc/aot_test_model.py:10:15
[DUMP aot_compiler.cpp:118]   return (%4)
The compiled llvm assembly code was saved to aot_test_model.compiled.ll
The compiled model was saved to aot_test_model.compiled.pt

└─ $ ./compile_model.sh -m aot_test_model -p /data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/aot_test_model.pt -v v1 -i "2,2,2;2,2,2"
+ VERSION=v1
+ getopts m:p:v:i:h opt
+ case $opt in
+ MODEL=aot_test_model
+ getopts m:p:v:i:h opt
+ case $opt in
+ MODEL_PATH=/data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/aot_test_model.pt
+ getopts m:p:v:i:h opt
+ case $opt in
+ VERSION=v1
+ getopts m:p:v:i:h opt
+ case $opt in
+ INPUT_DIMS='2,2,2;2,2,2'
+ getopts m:p:v:i:h opt
+ require_arg m aot_test_model
+ '[' -n aot_test_model ']'
+ require_arg p /data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/aot_test_model.pt
+ '[' -n /data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/aot_test_model.pt ']'
+ require_arg i '2,2,2;2,2,2'
+ '[' -n '2,2,2;2,2,2' ']'
+ '[' '!' -f /data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/aot_test_model.pt ']'
+++ dirname ./compile_model.sh
++ cd .
++ pwd -P
+ SRC_DIR=/data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc
+ FBCODE_DIR=/data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/../../..
+ FBSOURCE_DIR=/data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/../../../..
+ KERNEL_DIR=/data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/../../../../xplat/pytorch_models/build/aot_test_model/v1/nnc
++ echo /data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/aot_test_model.pt
++ sed 's/.pt.*//'
+ MODEL_PATH_PREFIX=/data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/aot_test_model
+ LLVM_CODE_PATH=/data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/aot_test_model.compiled.ll
+ ASSEMBLY_CODE_PATH=/data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/aot_test_model.compiled.s
+ COMPILED_MODEL_FILE_PATH=/data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/aot_test_model.compiled.pt
+ KERNEL_FUNC_NAME=nnc_aot_test_model_v1_forward
+ cd /data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/../../../..
+ buck run //xplat/caffe2/fb/lite_predictor:lite_predictor_nnc -- --model /data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/aot_test_model.compiled.pt --print_output true --input_dims '2,2,2;2,2,2' --input_type 'float;float' --input_memory_format 'contiguous_format;contiguous_format'
clang-9: warning: argument unused during compilation: '-pthread' [-Wunused-command-line-argument]

Downloaded 1/4 artifacts, 2.11 Kbytes, 50.0% cache miss (for updated rules)
Building: finished in 12.2 sec (100%) 4572/4572 jobs, 3/4572 updated
  Total time: 12.2 sec
BUILD SUCCEEDED
Run with 56 threads
Run with 56 threads
Loading model...
Model loaded: /data/users/priyaramani/fbsource/fbcode/caffe2/fb/nnc/aot_test_model.compiled.pt
Running forward ...
(1,.,.) =
 -0.7451 -0.7451
 -0.7451 -0.7451

(2,.,.) =
 -0.7451 -0.7451
 -0.7451 -0.7451
[ CPUFloatType{2,2,2} ]
Starting benchmark.
Running warmup runs.
Main runs.
Main run finished. Milliseconds per iter: 0.0887. Iters per second: 11274
Memory usage before main runs: 71262208 bytes
Memory usage after main runs: 71573504 bytes
Average memory increase per iter: 31129.6 bytes
0 value means "not available" in above
```

Reviewed By: ljk53

Differential Revision: D31631975

fbshipit-source-id: 7956787b3e121f9c14f4733398a64c2f7ae84373
2021-10-16 20:04:52 -07:00
Priya Ramani
962c6476da Refactor: move method to func compilation work to compileMethod, add option to specify method name (#66726)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66726

Moves the method-to-function compilation work into compileMethod.

Test Plan:
Mobilenetv3 compiles and runs successfully
```
(pytorch)  ~/fbsource/fbcode/caffe2/fb/nnc
└─ $ buck run //caffe2/binaries:aot_model_compiler -- --model mobilenetv3.pt --model_name=pytorch_dev_mobilenetv3 --model_version=v1 --input_dims="1,3,224,224"
Downloaded 0/4 artifacts, 0.00 bytes, 100.0% cache miss (for updated rules)
Building: finished in 13.2 sec (100%) 18719/18719 jobs, 2/18719 updated
  Total time: 13.5 sec
BUILD SUCCEEDED
The compiled llvm assembly code was saved to mobilenetv3.compiled.ll
The compiled model was saved to mobilenetv3.compiled.pt
```

Reviewed By: ljk53, IvanKobzarev

Differential Revision: D31624342

fbshipit-source-id: 233a6e94ea05ba8d6fc166d2414034c9e58cb076
2021-10-16 20:03:24 -07:00
Priya Ramani
63bb7c6dba Refactor AotCompile to return a pair (#65707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65707

Refactors aotCompile to return a pair of the compiled function and the LLVM assembly, instead of filling in an incoming string with the assembly code.
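
A minimal sketch of the refactored shape, with stand-in types (the real return type pairs the NNC compiled function with its assembly text):

```
// Sketch only: return the artifact and its LLVM assembly together,
// rather than writing into a caller-provided out-parameter.
#include <memory>
#include <string>
#include <utility>

struct CompiledFunction {};  // stand-in for the NNC compiled function

// Before (roughly): void aotCompile(..., std::string& llvm_asm_out);
// After (sketch):
std::pair<std::unique_ptr<CompiledFunction>, std::string> aotCompile() {
  auto fn = std::make_unique<CompiledFunction>();
  std::string llvm_asm = "; LLVM assembly for the compiled kernel";
  return {std::move(fn), std::move(llvm_asm)};
}
```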

Testing: Gives expected results when compiled and run
```
(pytorch)  ~/local/pytorch refactor_aot
└─ $ build/bin/aot_model_compiler --model mobilenetv3.pt --model_name=pytorch_dev_mobilenetv3 --model_version=v1 --input_dims="2,2,2"
The compiled model was saved to mobilenetv3.compiled.pt
```

Test Plan: Imported from OSS

Reviewed By: qihqi

Differential Revision: D31220452

Pulled By: priyaramani

fbshipit-source-id: f957c53ba83f876a2e7dbdd4b4571a760b3b6a9a
2021-09-27 18:56:04 -07:00
Priya Ramani
206646d6ed Add NNC AOT Compiler executable (#63994)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63994

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D30582149

Pulled By: priyaramani

fbshipit-source-id: 3bbf085428824c3cb308e006c18bb0a57f50fef6
2021-09-15 19:18:24 -07:00