pytorch/torch/_C
IvanKobzarev a37afd23fa [custom_ops][perf] Move expensive pytree traversals of tensors to C++ (#148555)
(benchmark for 1 call)

Before:
```
└─ $ python ~/task_custom_ops_perf/test_custom_ops_perf_repro.py
DO_BENCH mutate: 77.72445678710938 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/mutate.json
DO_BENCH no_mutate: 64.61143493652344 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/no_mutate.json
DO_BENCH direct_mutate: 11.682510375976562 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/direct_mutate.json
DO_BENCH direct_no_mutate: 18.596649169921875 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/direct_no_mutate.json
```

After:
```
└─ $ python ~/task_custom_ops_perf/test_custom_ops_perf_repro.py
DO_BENCH mutate: 47.6837158203125 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/mutate.json
DO_BENCH no_mutate: 31.709671020507812 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/no_mutate.json
DO_BENCH direct_mutate: 10.967254638671875 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/direct_mutate.json
DO_BENCH direct_no_mutate: 10.728836059570312 us PROFILE:/home/ivankobzarev/task_custom_ops_perf/direct_no_mutate.json
```
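To make the change easier to read at a glance, the timings above can be turned into speedup ratios. The following is a minimal sketch, not part of the original commit: the variable names (`before_us`, `after_us`, `speedups`) are illustrative, and the numbers are copied verbatim from the two benchmark outputs.

```python
# Timings in microseconds, copied from the Before/After benchmark output above.
# The dictionary names are hypothetical; they do not appear in the repro script.
before_us = {
    "mutate": 77.72445678710938,
    "no_mutate": 64.61143493652344,
    "direct_mutate": 11.682510375976562,
    "direct_no_mutate": 18.596649169921875,
}
after_us = {
    "mutate": 47.6837158203125,
    "no_mutate": 31.709671020507812,
    "direct_mutate": 10.967254638671875,
    "direct_no_mutate": 10.728836059570312,
}

# Speedup = before / after; a value above 1.0 means the call got faster.
speedups = {name: before_us[name] / after_us[name] for name in before_us}

for name, ratio in speedups.items():
    saved = before_us[name] - after_us[name]
    print(f"{name}: {ratio:.2f}x faster, {saved:.1f} us saved per call")
```

By this arithmetic, `mutate` improves by roughly 1.6x and `no_mutate` by roughly 2.0x. Each figure comes from a single reported run, so the smaller differences in the `direct_*` cases may partly reflect measurement noise.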

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148555
Approved by: https://github.com/zou3519
2025-04-01 18:45:48 +00:00
| File | Last commit | Date |
|---|---|---|
| `_dynamo` | [ca] introduce RuntimeState to support c++ hooks via graph breaks (#149987) | 2025-03-27 05:05:34 +00:00 |
| `__init__.pyi.in` | [custom_ops][perf] Move expensive pytree traversals of tensors to C++ (#148555) | 2025-04-01 18:45:48 +00:00 |
| `_aoti.pyi` | [AOTI XPU] Support AOT Inductor for Intel GPU. (#140269) | 2024-12-10 05:05:08 +00:00 |
| `_autograd.pyi` | Add overload names to profiler trace (#143114) | 2025-03-05 01:00:29 +00:00 |
| `_cpu.pyi` | [CPUInductor] Fix SVE256 detection (#146207) | 2025-02-01 18:51:34 +00:00 |
| `_cudnn.pyi` | Improve typing in torch/types.py (#145237) | 2025-01-28 05:29:12 +00:00 |
| `_cusparselt.pyi` | [sparse] Add cuSPARSELt as a backend (#128534) | 2024-08-21 22:06:07 +00:00 |
| `_distributed_autograd.pyi` | remove allow-untyped-defs for torch/_C/_distributed_autograd.pyi (#143369) | 2024-12-17 18:09:28 +00:00 |
| `_distributed_c10d.pyi` | [Reland] Launch kernel on current stream & remove record_stream entirely (#150398) | 2025-04-01 16:46:07 +00:00 |
| `_distributed_rpc_testing.pyi` | Use Generic TypeAlias (PEP 585) and Union Type (PEP 604) in .pyi stub files (#129419) | 2024-06-29 09:23:39 +00:00 |
| `_distributed_rpc.pyi` | Use Generic TypeAlias (PEP 585) and Union Type (PEP 604) in .pyi stub files (#129419) | 2024-06-29 09:23:39 +00:00 |
| `_export.pyi` | [export] Implement cpp deserializer. (#136398) | 2024-11-14 16:34:59 +00:00 |
| `_functions.pyi` | PEP585 update - torch/_C torch/_decomp torch/_lazy torch/_library torch/_numpy torch/_prims torch/_refs torch/_strobelight (#145102) | 2025-01-18 20:47:12 +00:00 |
| `_functorch.pyi` | [BE] Upgrade to mypy 1.14 (#145966) | 2025-03-04 20:58:26 +00:00 |
| `_instruction_counter.pyi` | Add compile time instruction count metric (#133834) | 2024-08-27 23:29:02 +00:00 |
| `_itt.pyi` | Fix ITT unit-tests if PyTorch is compiled with USE_ITT=OFF (#86199) | 2022-10-04 21:57:05 +00:00 |
| `_lazy_ts_backend.pyi` | Use Generic TypeAlias (PEP 585) and Union Type (PEP 604) in .pyi stub files (#129419) | 2024-06-29 09:23:39 +00:00 |
| `_lazy.pyi` | remove allow-untyped-defs for torch/_C/_lazy.pyi (#143370) | 2024-12-17 17:18:10 +00:00 |
| `_monitor.pyi` | PEP585: More UP006 fixes (#146392) | 2025-02-20 06:18:13 +00:00 |
| `_nn.pyi.in` | Use Python 3.9 typing (#148157) | 2025-03-04 03:09:55 +00:00 |
| `_nvtx.pyi` | Inductor annotations (#130429) | 2024-12-10 08:53:39 +00:00 |
| `_onnx.pyi` | [1/N] [Caffe2] Remove caffe2_aten_fallback code (#128675) | 2024-06-17 21:25:59 +00:00 |
| `_profiler.pyi` | [Profiler] Add profiler activity for HPU devices (#148182) | 2025-03-05 01:37:48 +00:00 |
| `_VariableFunctions.pyi.in` | Use Python 3.9 typing (#148157) | 2025-03-04 03:09:55 +00:00 |
| `_verbose.pyi` | | |
| `build.bzl` | | |
| `return_types.pyi.in` | Use Python 3.9 typing (#148157) | 2025-03-04 03:09:55 +00:00 |