Summary:
The current implementation of `str` passes wide types (`wchar_t`, `wchar_t*`, `std::wstring`) directly to `std::ostringstream`. This has the following behavior:
- C++17, `wchar_t` & `wchar_t*`: prints the integer value of the character or the pointer address, respectively. This is unexpected and almost certainly a (runtime) bug.
- C++17, `std::wstring`: compile-time error.
- C++20, all of the above: compile-time error.
To fix the bug and enable C++20 migration, this diff narrows these wide types (assuming UTF-16 encoding) before passing them to `std::ostringstream`. This fixes both the C++20 compile-time errors and the C++17 runtime bugs.
This bug surfaced while enabling C++20 Windows builds: Windows-specific caffe2 code uses `TORCH_CHECK` with wide strings, and `TORCH_CHECK` relies on `str` to generate error messages.
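Roughly, the narrowing can be sketched as below (a minimal illustration only; `narrow_utf16` is a hypothetical name, and the real diff's handling of invalid surrogates may differ):

```cpp
#include <cstdint>
#include <string>

// Hypothetical sketch: narrow a UTF-16 encoded std::wstring to UTF-8
// before handing it to std::ostringstream. On Windows, wchar_t is
// 16 bits, so each element is one UTF-16 code unit.
std::string narrow_utf16(const std::wstring& ws) {
  std::string out;
  for (std::size_t i = 0; i < ws.size(); ++i) {
    std::uint32_t cp = static_cast<std::uint16_t>(ws[i]);
    // Combine a surrogate pair into a single code point.
    if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < ws.size()) {
      std::uint32_t lo = static_cast<std::uint16_t>(ws[i + 1]);
      if (lo >= 0xDC00 && lo <= 0xDFFF) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        ++i;
      }
    }
    // Emit the code point as 1-4 UTF-8 bytes.
    if (cp < 0x80) {
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```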
Test Plan: CI & https://godbolt.org/z/ecTGd8Ma9
Differential Revision: D52792393
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117531
Approved by: https://github.com/malfet
`processErrorMsg` uses a table of string constants to be replaced in
the error message. However, the table is non-static, so it is
reconstructed from scratch on every call. I've made it `constexpr`
by using `std::array` instead of `std::vector` and `c10::string_view`
instead of `std::string`.
To support `c10::string_view`, I've also updated `c10::ReplaceAll` to
accept string_view arguments, and added a fast path that avoids
repeated string searches when no translation is needed.
Using `torch.floor_divide` to benchmark a Python warning, the
callgrind instruction count falls from 3,806,446 to 703,168, and
`%timeit` shows a 6.5 us improvement.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76976
Approved by: https://github.com/swolchok
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71356
Suppress remaining header based warnings in `caffe2/c10` when building with `clang`
Test Plan: CI pass
Reviewed By: r-barnes
Differential Revision: D33600097
fbshipit-source-id: e1c0d84a0bad768eb03e047d62b5379cf28b48e2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56830
Opt into formatting on GitHub and format everything. This is a trial run before turning on formatting for more and eventually all of the codebase.
Test Plan: CI
Reviewed By: zertosh
Differential Revision: D27979080
fbshipit-source-id: a80f0c48691c08ae8ca0af06377b87e6a2351151
Summary:
My last PR missed the CUDA and distributed folders; this fixes that.
This change is autogenerated by `python tool/clang_tidy.py -s`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57235
Reviewed By: janeyx99
Differential Revision: D28084444
Pulled By: malfet
fbshipit-source-id: bf222f69ee90c7872c3cb0931e8cdb84f0cb3cda
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52222
`c10::str()` is often used with variadic macros. Returning a C string when a C string is passed in can be more efficient: it lets `std::string` creation be deferred to an outlined function, or skipped entirely. Since `const char*` converts implicitly to `std::string`, callers that expected a `std::string` still get one.
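A minimal sketch of the pass-through idea (hypothetical, not the actual c10 implementation):

```cpp
#include <sstream>
#include <string>

// A lone C string is returned as-is, so no std::string is constructed
// unless the caller actually wants one via the implicit
// const char* -> std::string conversion.
inline const char* str(const char* s) {
  return s;
}

// General case: stream every argument into one std::string. Arguments
// are taken by value so that a string literal decays to const char* and
// the non-template overload above wins the single-C-string call.
template <typename... Args>
std::string str(Args... args) {
  std::ostringstream oss;
  (oss << ... << args); // C++17 binary fold over operator<<
  return oss.str();
}
```

Here `str("oops")` hands the literal straight back, while `std::string s = str("oops");` still compiles, matching the implicit-conversion point above.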
ghstack-source-id: 121877052
(Note: this ignores all push blocking failures!)
Test Plan: CI
Reviewed By: bhosmer
Differential Revision: D26419663
fbshipit-source-id: 400bef71e6a0004b5914f5f511ea0e04e0d7599b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52221
The previous code forced a `std::string` to be created even when the default message or a user-provided string-literal message was used. Now no `std::string` is forced, and we no longer need an outlined lambda in those cases either.
ghstack-source-id: 121877056
Test Plan:
Compare assembly for
```
#include <c10/util/Exception.h>
void f(bool b) {
  TORCH_CHECK(b, "message");
}
void g(bool b) {
  TORCH_CHECK(b);
}
void h(bool b) {
  TORCH_CHECK(b, "message", random());
}
```
before/after in fbcode optimized build.
Before: P174696735
After: P174696840
For `f()` and `g()`, we go from a call to an outlined lambda that did a bunch of `std::string` creation to a load of a string constant before calling `torchCheckFail`. This is a clear improvement.
For `h()`, results are mixed: we save a bunch of *extra* string goop in the outlined lambda and instead call `c10::detail::_str_wrapper` directly. This is good for overall size. However, we no longer outline the call to `random()`, which is less than ideal. I hope to recover the ability to fully outline the `random()` call in future diffs; this is just thorny enough that I don't want to cram even more into one diff.
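The string-literal fast path can be sketched like so (`checkFail`, `buildMessage`, and `MY_CHECK` are hypothetical names; `TORCH_CHECK`'s actual machinery is more involved):

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Overloading the failure handler on the message type lets a
// string-literal message travel as a plain const char*, while composite
// messages build a std::string only on the failure path.
[[noreturn]] inline void checkFail(const char* msg) {
  throw std::runtime_error(msg); // std::string built here, only on failure
}

[[noreturn]] inline void checkFail(const std::string& msg) {
  throw std::runtime_error(msg);
}

// Helper for composite messages, analogous in spirit to c10::str.
template <typename... Args>
std::string buildMessage(const Args&... args) {
  std::ostringstream oss;
  (oss << ... << args); // C++17 binary fold
  return oss.str();
}

// Simplified check macro: a message argument is required here.
#define MY_CHECK(cond, msg)       \
  do {                            \
    if (!(cond)) {                \
      checkFail(msg);             \
    }                             \
  } while (0)
```

Because the macro substitutes the message expression into the failure branch, `MY_CHECK(ok, buildMessage("v=", v))` evaluates `buildMessage` only when the check fails, and exactly once.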
Added automated test to make sure `TORCH_CHECK` and `TORCH_INTERNAL_ASSERT` only evaluate their arguments once.
Profiled AdIndexer mergenet benchmark in perf to check that `IValue::toTensor` is still getting inlined.
Reviewed By: bhosmer
Differential Revision: D26380783
fbshipit-source-id: 288860772423994ac739a8f33e2c09f718e8dd38
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52220
D21268320 (d068a456d3) made this thread_local, but I don't think it was necessary to do so.
ghstack-source-id: 121877050
Test Plan: CI
Reviewed By: dzhulgakov
Differential Revision: D26378724
fbshipit-source-id: 7f17b5cff42983ea8f5be1bd254de01bf8db9a0e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37382
After adding c10::DispatchKey::Profiler, the behavior of RecordFunction
observers is also controlled by the dispatch key; this PR moves that
logic out of the profiler and into RecordFunction.
Reviewed By: jamesr66a
Differential Revision: D21268320
fbshipit-source-id: 93207e3b55325d20dcc5b1e8f448ab86933321da
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37292
After adding c10::DispatchKey::Profiler, the behavior of RecordFunction
observers is also controlled by the dispatch key; this PR moves that
logic out of the profiler and into RecordFunction.
Reviewed By: jamesr66a
Differential Revision: D21245094
fbshipit-source-id: 595e41b18206d2ba4cf639cb320f630907868b3f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37023
Optimize the binary size of the assert macros through two ideas:
- Concatenate string literals with `__FILE__` and `__LINE__` at compile time into a single literal, instead of keeping them separate and combining them with `c10::str` at runtime.
- Optimize the binary size of `c10::str` for some scenarios, especially the empty-parameter-list call, which is common in the assert macros.
In server oss builds, this PR reduces binary size from 118.05 MB to 117.05 MB
ghstack-source-id: 102607237
Test Plan: Run the oss server build (python setup.py install) and check that the size of libtorch_cpu.so drops from 118.05 MB to 117.05 MB.
Differential Revision: D20719400
fbshipit-source-id: 5c61f4195b947f06aafb8f0c8e255de3366e1ff2
Summary:
This dramatically reduces the number of instantiations and eliminates
~900KB of code from my local build of libtorch_cpu.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31683
Differential Revision: D19258364
Pulled By: resistor
fbshipit-source-id: addb921a26289978ffd14c203325ca7e35a4515b
Summary:
Hi guys,
I'd like to build Caffe2 with more supported options on Windows with Microsoft Visual Studio.
This is the first pull request.
Running scripts/build_windows_shared.bat builds Caffe2 with both CMAKE_BUILD_TYPE=Debug and CMAKE_BUILD_TYPE=Release under Visual Studio 14 2015.
CUDA is 9.0, cuDNN is 7.0.5; glog, gflags and lmdb are supported on my system.
Python is 3.5, and Detectron works from the Python interface as well.
It was even possible to debug Detectron code and step into caffe2_gpu.dll with the PDBs built.
Unfortunately, the c10/experimental ops don't build with this Visual Studio generator, so I added a dedicated option, INCLUDE_EXPERIMENTAL_C10_OPS (default ON), to build_windows_shared.bat to deal with it.
After this pull request, the next step is to add Visual Studio 2017 support to the script.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13550
Reviewed By: ezyang
Differential Revision: D13042597
Pulled By: orionr
fbshipit-source-id: f313f909f599cd582a1d000eff766eef3a9fc4fc
Summary:
There is still some work to be done:
- Move logging and unify AT_WARN with LOG(ERROR).
- A few header files are still being plumbed through and need cleaning.
- caffe2::EnforceNotMet aliasing is not done yet.
- The macros need to be unified. See c10/util/Exception.h.
This is mainly a codemod and does not cause functional changes. If your job fails and you trace it back to this diff, it can usually be fixed by one of the following approaches:
(1) add //caffe2/c10:c10 to your dependencies (or transitive dependencies).
(2) change objects such as at::Error and at::Optional to the c10 namespace.
(3) change functions to the c10 namespace. In particular, caffe2::MakeString is now overridden by the unified c10::str function. Nothing else changes.
Please kindly consider not reverting this diff - it involved multiple rounds of rebasing, and the fix is usually simple. Contact jiayq@ or AI Platform Dev for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12354
Reviewed By: orionr
Differential Revision: D10238910
Pulled By: Yangqing
fbshipit-source-id: 7794d5bf2797ab0ca6ebaccaa2f7ebbd50ff8f32