mirror of https://github.com/zebrajr/pytorch.git (synced 2025-12-06 12:20:52 +01:00)
make g2p ~30% faster on mobile by suppressing a log (#85907)
Summary: Using the tool from D39559248, I was able to make g2p faster on mobile by examining profiles on stella frames. It turned out that the PyTorch interpreter code does some logging that ends up being a significant bottleneck.
Differential Revision: D39901455
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85907
Approved by: https://github.com/dzdang
This commit is contained in:
parent bac26155e7
commit b645c237bc
@@ -236,7 +236,7 @@ at::Tensor PackedLinearWeightsQnnp::apply_dynamic_impl(
     at::Tensor input,
     bool reduce_range) {
   if (reduce_range) {
-    TORCH_WARN("Currently, qnnpack incorrectly ignores reduce_range when it is set to true; this may change in a future release.");
+    TORCH_WARN_ONCE("Currently, qnnpack incorrectly ignores reduce_range when it is set to true; this may change in a future release.");
   }

 using at::Tensor;
@@ -110,6 +110,10 @@ bool InterpreterState::run(Stack& stack) {
       // Check with iliacher if has been done.
       // Plus this is not safe as if you throw exception record function will be
       // left enabled. That is a TODO
+      // NOTE: this recordFunction logic takes up ~2-3% of cpu cycles in some
+      // workflows. do we need it and/or can we opt-out of
+      // isRecordFunctionEnabled with a macro? if we delete it, things appear to
+      // work just fine.
       bool prev_value = isRecordFunctionEnabled();
       if (!prev_value) {
         // enable only for the RecordFunction