pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00


Sheng Qin d25617255c Fix AOTI update_constant_buffer issue. (#149243)	2025-03-17 22:10:57 +00:00

Summary:
In D69553929 we changed the logic of constant & buffer update in AOTI. However, this is incompatible with the current Sigmoid runtime, since we have different logic for passing in buffers, resulting in errors like
```
I0310 17:29:24.456960 3679102 AOTIDelegateExecutor.cpp:89] AOTIDelegateExecutor processing weights
* Aborted at 1741652964 (Unix time, try 'date -d 1741652964') *
* Signal 11 (SIGSEGV) (0x30) received by PID 3679102 (pthread TID 0x7f9933e49000) (linux TID 3679102) (code: address not mapped to object), stack trace: *
    @ 00000000000040b9 folly::symbolizer::(anonymous namespace)::signalHandler(int, siginfo_t, void)
                       ./fbcode/folly/debugging/symbolizer/SignalHandler.cpp:453
    @ 0000000000006c45 folly::fibers::(anonymous namespace)::sigsegvSignalHandler(int, siginfo_t, void)
                       ./fbcode/folly/fibers/GuardPageAllocator.cpp:237
    @ 000000000004455f (unknown)
                       /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/libc_sigaction.c:8
                       -> /home/engshare/third-party2/glibc/2.34/src/glibc-2.34/signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c
    @ 00000000001e8164 torch::aot_inductor::AOTInductorModelContainer::update_constant_buffer(std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, AtenTensorOpaque, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, AtenTensorOpaque> > > const&, bool, bool)
```

Test Plan:
1) Generate lowered merge net
```
CUDA_VISIBLE_DEVICES=0 ../buck-out/v2/gen/fbcode/b5b13003c82cbdec/caffe2/torch/fb/model_transform/fx2trt/packaging/__generate_merge_net_file__/generate_merge_net_file.par --action=generate --input-file=/home/shengqin/models/aoti_sigmoid_test/cmf_interformer_with_custom_triton_kernels_691990503_0_input --output-file=/home/shengqin/models/aoti_sigmoid_test/cmf_interformer_with_custom_triton_kernels_691990503_0_output.aoti_sigmoid --lower-backend=aot_inductor --use_sigmoid=true --aot_inductor_config="{'max_autotune': True, 'comprehensive_padding': False}" --add_passes=use_matmul_lce_replace_normal_LCE,use_triton_dot_compress,use_matmul_fuse_lce_replace_first_LCE,use_contiguous_linear_reduction_replace_linear_reduction --disable_acc_tracer=false
```

2) Load net predictor
```
CUDA_VISIBLE_DEVICES=1 ../buck-out/v2/gen/fbcode/103717df3cc2b97a/caffe2/torch/fb/model_transform/fx2trt/packaging/__load_net_predictor__/load_net_predictor --loadMode=AccuracyAB --inputNetFile=/home/shengqin/models/aoti_sigmoid_test/cmf_interformer_with_custom_triton_kernels_691990503_0_output.aoti_ts --otherNetFile=/home/shengqin/models/aoti_sigmoid_test/cmf_interformer_with_custom_triton_kernels_691990503_0_output.aoti_sigmoid --moduleName=merge --benchmarkEnableProfiling=false --predictor_hardware_type=1 --disableStaticRuntime=true
```

Reviewed By: hl475
Differential Revision: D71236710
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149243
Approved by: https://github.com/hl475, https://github.com/jingsh
aoti_eager	Fix for AOTI + CUDAGraphs when calling from Python (#148601)	2025-03-08 02:44:14 +00:00
aoti_include	cpp_wrapper: reduce memory usage by removing unneeded temporaries (#147403)	2025-03-06 16:08:16 +00:00
aoti_package	BC fix for AOTIModelPackageLoader() constructor defaults (#149082)	2025-03-13 18:40:53 +00:00
aoti_runner	Revert "[AOTInductor] [BE] Add swap_constant_buffer into pybind for tests. (#149167)"	2025-03-14 15:16:21 +00:00
aoti_runtime	Fix AOTI update_constant_buffer issue. (#149243)	2025-03-17 22:10:57 +00:00
aoti_torch	op should NOT be static in aoti_torch_call_dispatcher (#149208)	2025-03-15 01:47:11 +00:00
cpp_wrapper	cpp_wrapper: reduce memory usage by removing unneeded temporaries (#147403)	2025-03-06 16:08:16 +00:00
array_ref_impl.h	cpp_wrapper: Move #includes to per-device header files (#145932)	2025-01-29 21:08:45 +00:00
inductor_ops.cpp	[Reland] [1/N] Fix clang-tidy warnings in inductor (#134544)	2024-08-28 04:05:06 +00:00
inductor_ops.h	[2/N] Fix clang-tidy warnings in inductor (#132040)	2024-07-29 18:41:24 +00:00
resize_storage_bytes.cpp	[Reland] [1/N] Fix clang-tidy warnings in inductor (#134544)	2024-08-28 04:05:06 +00:00
static_cuda_launcher.cpp	[Reland] First version of statically compiled launcher for triton compiled CUDA kernels (#149238)	2025-03-15 15:06:46 +00:00
static_cuda_launcher.h	[Reland] First version of statically compiled launcher for triton compiled CUDA kernels (#149238)	2025-03-15 15:06:46 +00:00