mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Joel Schlosser e86476f736 Huber loss (#50553)	2021-03-02 17:30:45 -08:00

Summary: Fixes https://github.com/pytorch/pytorch/issues/48595.

## Background

This PR implements HuberLoss, which differs from SmoothL1Loss by a factor of beta. The current implementation does not share logic between the two. Feedback is welcome for the optimal way to minimize code duplication while remaining performant.

I've done some early [benchmarking](https://pytorch.org/tutorials/recipes/recipes/benchmark.html#collecting-instruction-counts-with-callgrind) with Huber calling into the Smooth L1 kernel and scaling afterwards; for the simple test case I used, instruction counts are as follows:

```
Huber loss calls dedicated Huber kernel: 2,795,300
Huber loss calls Smooth L1 kernel and scales afterwards: 4,523,612
```

With these numbers, instruction counts are ~62% higher when using the pre-existing Smooth L1 kernel.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50553

Test Plan:

```
python test/test_nn.py TestNN.test_HuberLoss
python test/test_nn.py TestNN.test_HuberLoss_delta
python test/test_nn.py TestNN.test_huber_loss_invalid_delta
python test/test_nn.py TestNNDeviceTypeCPU.test_smooth_l1_loss_vs_huber_loss_cpu
python test/test_nn.py TestNNDeviceTypeCUDA.test_smooth_l1_loss_vs_huber_loss_cuda
python test/test_nn.py TestNNDeviceTypeCPU.test_invalid_reduction_strings_cpu
python test/test_nn.py TestNNDeviceTypeCUDA.test_invalid_reduction_strings_cuda
python test/test_nn.py TestNN.test_loss_equal_input_target_shape
python test/test_nn.py TestNN.test_pointwise_loss_broadcast
python test/test_overrides.py
python test/test_jit.py TestJitGeneratedFunctional.test_nn_huber_loss
python test/test_type_hints.py
python test/test_cpp_api_parity.py
build/bin/test_api
```

## Documentation

[Screenshots: five images of the rendered documentation]

Reviewed By: H-Huang

Differential Revision: D26734071

Pulled By: jbschlosser

fbshipit-source-id: c98c1b5f32a16f7a2a4e04bdce678080eceed5d5
any.cpp
autograd.cpp	Fix autograd when `inputs` contains tensors without materialized grad_fn (#51940)	2021-02-11 09:22:15 -08:00
CMakeLists.txt	Implement C++ ModuleDict (#47707)	2020-11-19 08:07:51 -08:00
dataloader.cpp
dispatch.cpp
enum.cpp
expanding-array.cpp
fft.cpp	Remove deprecated spectral ops from torch namespace (#48594)	2020-12-05 04:12:32 -08:00
functional.cpp	Huber loss (#50553)	2021-03-02 17:30:45 -08:00
init_baseline.h
init_baseline.py
init.cpp
integration.cpp
jit.cpp
memory.cpp
misc.cpp	codegen: Resolve overload ambiguities created by defaulted arguments (#49348)	2021-01-04 11:59:16 -08:00
module.cpp
moduledict.cpp	Implement C++ ModuleDict (#47707)	2020-11-19 08:07:51 -08:00
modulelist.cpp
modules.cpp	Huber loss (#50553)	2021-03-02 17:30:45 -08:00
namespace.cpp
nn_utils.cpp	[WIP] Fix cpp grad accessor API (#40887)	2020-07-16 09:11:12 -07:00
operations.cpp	[Codemod][GleanFbcode] Remove dead includes in caffe2/test (#43953)	2020-09-01 21:48:28 -07:00
optim_baseline.h	Add `AdamW` to C++ frontend (#40009)	2020-06-18 15:28:12 -07:00
optim_baseline.py	Add `AdamW` to C++ frontend (#40009)	2020-06-18 15:28:12 -07:00
optim.cpp	[WIP] Fix cpp grad accessor API (#40887)	2020-07-16 09:11:12 -07:00
ordered_dict.cpp
parallel_benchmark.cpp
parallel.cpp
parameterdict.cpp	Python/C++ API Parity: Add impl and tests for ParameterDict (#40654)	2020-06-29 08:50:44 -07:00
parameterlist.cpp	Impl for ParameterList (#41259)	2020-07-12 20:50:31 -07:00
README.md
rnn.cpp	Adding support for CuDNN-based LSTM with projections (#47725)	2020-12-16 11:27:02 -08:00
sequential.cpp
serialize.cpp	Modernize for-loops (#50912)	2021-01-22 10:53:24 -08:00
static.cpp
support.cpp
support.h
tensor_cuda.cpp
tensor_flatten.cpp	fix unflatten_dense_tensor when there is empty tensor inside (#50321)	2021-01-23 12:14:34 -08:00
tensor_indexing.cpp	Making ops c10-full: list of optional tensors (#49138)	2021-01-04 05:04:02 -08:00
tensor_options_cuda.cpp
tensor_options.cpp	[PyTorch] Narrow Device to 2 bytes by narrowing DeviceType and DeviceIndex (#47023)	2020-11-18 19:39:40 -08:00
tensor.cpp	Change to.dtype_layout to c10-full (#41169)	2020-07-10 16:04:34 -07:00
torch_include.cpp
transformer.cpp	C++ APIs Transformer NN Module Top Layer (#44333)	2020-09-11 08:25:27 -07:00

README.md

C++ Frontend Tests

This folder contains the tests for PyTorch's C++ frontend. They use the GoogleTest framework.

CUDA Tests

To make a test run only on platforms with CUDA, suffix the test case name with _CUDA, e.g.

TEST(MyTestSuite, MyTestCase_CUDA) { }

To make it run only on platforms with at least two CUDA devices, suffix it with _MultiCUDA instead of _CUDA, e.g.

TEST(MyTestSuite, MyTestCase_MultiCUDA) { }

There is logic in main.cpp that detects the availability and number of CUDA devices and supplies the appropriate negative filters to GoogleTest.
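
As a rough illustration of how such negative filters can be assembled, the sketch below builds a GoogleTest filter string from a device count. It is not the code in main.cpp: the helper names `add_negative_pattern` and `filter_for_devices` are hypothetical, and the device count is passed in as a plain integer rather than queried from CUDA. It relies only on the GoogleTest filter syntax, in which negative patterns follow a `-` and are separated by `:`.

```cpp
#include <string>

// Append one negative pattern to a GoogleTest-style filter string.
// Filter syntax: positive patterns, then an optional '-' followed by
// ':'-separated negative patterns.
std::string add_negative_pattern(std::string filter, const std::string& pattern) {
  // The first negative pattern is introduced by '-'; later ones are joined by ':'.
  filter += (filter.find('-') == std::string::npos) ? '-' : ':';
  filter += pattern;
  return filter;
}

// Hypothetical helper: exclude test suffixes that the machine cannot run.
// `cuda_device_count` is an assumed input; a real test binary would query it.
std::string filter_for_devices(std::string filter, int cuda_device_count) {
  if (cuda_device_count < 1) {
    // No CUDA at all: skip both single-device and multi-device tests.
    filter = add_negative_pattern(filter, "*_CUDA");
    filter = add_negative_pattern(filter, "*_MultiCUDA");
  } else if (cuda_device_count < 2) {
    // One device: only the multi-device tests have to be skipped.
    filter = add_negative_pattern(filter, "*_MultiCUDA");
  }
  return filter;
}
```

In a real test binary, the resulting string would presumably be written to GoogleTest's filter flag before the tests are run, so that a filter supplied on the command line is extended rather than replaced.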

Integration Tests

Integration tests use the MNIST dataset. You must download it by running the following command from the PyTorch root folder:

$ python tools/download_mnist.py -d test/cpp/api/mnist

The test code refers to the required files by paths of the form test/cpp/api/mnist/..., which are relative to the PyTorch root folder, so you must run the integration tests from that folder.
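
The dependence on the working directory can be sketched with a small check that resolves the dataset directory against a base folder. This is only an illustration under assumptions: `mnist_dir_present` is a hypothetical helper that is not part of the test suite, and it requires C++17 for `std::filesystem`.

```cpp
#include <filesystem>
#include <system_error>

// Hypothetical helper: report whether the MNIST directory exists under `base`.
// With the default argument, the relative path is resolved against the current
// working directory, which is why the tests must be started from the root folder.
bool mnist_dir_present(
    const std::filesystem::path& base = std::filesystem::current_path()) {
  std::error_code ec;  // non-throwing overload: lookup errors are treated as "absent"
  return std::filesystem::is_directory(base / "test/cpp/api/mnist", ec);
}
```

Calling `mnist_dir_present()` from any folder other than the PyTorch root would return false even after the download has completed, which mirrors the failure the integration tests would hit.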