pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Svetlana Karslioglu	d425da8bf3	Replace master with main in links and docs/conf.py (#100176 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/100176 Approved by: https://github.com/albanD, https://github.com/malfet	2023-05-02 18:20:32 +00:00
Hirochika Matsumoto	f143c92739	[docs] Fix typo in get-started.rst (#100355 ) This PR changes `""nvprims_nvfuser"` which should be a typo to `"nvprims_nvfuser"`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100355 Approved by: https://github.com/Skylion007, https://github.com/kit1980	2023-05-02 00:29:53 +00:00
BowenBao	c94b6a6712	[ONNX] Introduce 'diagnostics' to 'dynamo_export' api (#99668 ) Summary * Introduce `DiagnosticContext` to `torch.onnx.dynamo_export`. * Remove `DiagnosticEngine` in preparations to update 'diagnostics' in `dynamo_export` to drop dependencies on global diagnostic context. No plans to update `torch.onnx.export` diagnostics. Next steps * Separate `torch.onnx.export` diagnostics and `torch.onnx.dynamo_export` diagnostics. * Drop dependencies on global diagnostic context. https://github.com/pytorch/pytorch/pull/100219 * Replace 'print's with 'logger.log'. Pull Request resolved: https://github.com/pytorch/pytorch/pull/99668 Approved by: https://github.com/justinchuby, https://github.com/abock	2023-05-01 19:58:49 +00:00
pbialecki	8fe91d16b0	Remove CUDA 11.6 note from complex docs (#100118) Removes note in the complex docs pointing to the CUDA 11.6 wheels introduced in https://github.com/pytorch/pytorch/pull/80363. Background: this warning was added via https://github.com/pytorch/pytorch/issues/79876 which pointed out a slow compilation time in 11.3. The 11.6 pip wheels were thus recommended but are no longer built, as our current support is 11.7, 11.8 (and 12.1 experimental in nightlies). The note is confusing users as it doesn't explain why 11.6 is needed. Reference: https://discuss.pytorch.org/t/complex-numbers-cuda-11-6-documentation-warning/178588/1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/100118 Approved by: https://github.com/msaroufim	2023-04-27 16:26:27 +00:00
milesial	45bf3f6216	Optimized EMA implementation (#94820) This PR proposes an optimized way to do Exponential Moving Average (EMA), which is faster than the current way using `swa_utils.AveragedModel` described in https://pytorch.org/docs/stable/optim.html#custom-averaging-strategies. This implementation is asynchronous, and is built as an optimizer wrapper so that the EMA weight update happens without any additional CPU/GPU sync, just after optimizer steps, and with limited code changes. Example usage:

```
model = Model().to(device)
opt = torch.optim.Adam(model.parameters())
opt = EMAOptimizer(opt, device, 0.9999)

for epoch in range(epochs):
    training_loop(model, opt)
    regular_eval_accuracy = evaluate(model)
    with opt.swap_ema_weights():
        ema_eval_accuracy = evaluate(model)
```

Here are some benchmarks (time per iteration) on various torchvision models:

| model | this PR iteration time | swa_utils.AveragedModel iteration time | iteration speedup |
|-----|-----|-----|-----|
| regnet_x_1_6gf | 62.73 | 67.998 | 1.08 |
| regnet_x_3_2gf | 101.75 | 109.422 | 1.08 |
| regnet_x_400mf | 25.13 | 32.005 | 1.27 |
| regnet_x_800mf | 33.01 | 37.466 | 1.13 |
| regnet_x_8gf | 128.13 | 134.868 | 1.05 |
| regnet_y_16gf | 252.91 | 261.292 | 1.03 |
| regnet_y_1_6gf | 72.14 | 84.22 | 1.17 |
| regnet_y_3_2gf | 99.99 | 109.296 | 1.09 |
| regnet_y_400mf | 29.53 | 36.506 | 1.24 |
| regnet_y_800mf | 37.82 | 43.634 | 1.15 |
| regnet_y_8gf | 196.63 | 203.317 | 1.03 |
| resnet101 | 128.80 | 137.434 | 1.07 |
| resnet152 | 182.85 | 196.498 | 1.07 |
| resnet18 | 29.06 | 29.975 | 1.03 |
| resnet34 | 50.73 | 53.443 | 1.05 |
| resnet50 | 76.88 | 80.602 | 1.05 |
| resnext101_32x8d | 277.29 | 280.759 | 1.01 |
| resnext101_64x4d | 269.56 | 281.052 | 1.04 |
| resnext50_32x4d | 100.73 | 101.102 | 1.00 |
| shufflenet_v2_x0_5 | 10.56 | 15.419 | 1.46 |
| shufflenet_v2_x1_0 | 13.11 | 18.525 | 1.41 |
| shufflenet_v2_x1_5 | 18.05 | 23.132 | 1.28 |
| shufflenet_v2_x2_0 | 25.04 | 30.008 | 1.20 |
| squeezenet1_1 | 14.26 | 14.325 | 1.00 |
| swin_b | 264.52 | 274.613 | 1.04 |
| swin_s | 180.66 | 188.914 | 1.05 |
| swin_t | 108.62 | 112.632 | 1.04 |
| swin_v2_s | 220.29 | 231.153 | 1.05 |
| swin_v2_t | 127.27 | 133.586 | 1.05 |
| vgg11 | 95.52 | 103.714 | 1.09 |
| vgg11_bn | 106.49 | 120.711 | 1.13 |
| vgg13 | 132.94 | 147.063 | 1.11 |
| vgg13_bn | 149.73 | 165.256 | 1.10 |
| vgg16 | 158.19 | 172.865 | 1.09 |
| vgg16_bn | 177.04 | 192.888 | 1.09 |
| vgg19 | 184.76 | 194.194 | 1.05 |
| vgg19_bn | 203.30 | 213.334 | 1.05 |
| vit_b_16 | 217.31 | 219.748 | 1.01 |
| vit_b_32 | 69.47 | 75.692 | 1.09 |
| vit_l_32 | 223.20 | 258.487 | 1.16 |
| wide_resnet101_2 | 267.38 | 279.836 | 1.05 |
| wide_resnet50_2 | 145.06 | 154.918 | 1.07 |

You can see that in all cases it is faster than using `AveragedModel`. In fact in many cases, adding EMA does not add any overhead since the computation is hidden behind the usual iteration flow. This is a similar implementation to the one currently in [NVIDIA NeMo](https://github.com/NVIDIA/NeMo). If the team is interested in merging this, let me know and I'll add some documentation similar to `swa_utils` and tests. Credits to @szmigacz for the implementation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94820 Approved by: https://github.com/janeyx99	2023-04-26 18:02:11 +00:00
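For readers unfamiliar with the update that the commit above accelerates, here is a minimal plain-Python sketch of an exponential moving average applied right after each optimizer step. It is not the `EMAOptimizer` from the PR: the `ema_update` helper, the list-of-floats "weights", and the fixed `+1.0` stand-in for an optimizer step are all hypothetical, and the sketch assumes the common rule `ema = decay * ema + (1 - decay) * weight`.

```python
# Illustrative only: an EMA over a plain list of floats.
# `ema_update` is a hypothetical helper, not part of PyTorch or the PR.

def ema_update(ema_weights, weights, decay):
    """Return new EMA weights: decay * ema + (1 - decay) * current weight."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


weights = [0.0, 0.0]
ema = list(weights)  # the EMA starts as a copy of the initial weights
decay = 0.5          # small value so the effect is visible; the PR example uses 0.9999

for step in range(3):
    # Stand-in for an optimizer step: shift every weight by a fixed amount.
    weights = [w + 1.0 for w in weights]
    # The EMA update runs immediately after the step, as the commit describes.
    ema = ema_update(ema, weights, decay)
```

The averaged weights lag behind the raw weights, which is why evaluation in the commit's usage example temporarily swaps them in with `swap_ema_weights()`.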
Chris Gottbrath	f0e28b1cb9	Adding the maintainers approved in 2023Q1 Core Maintainers meeting (#98520 ) Added Nikita to Core Maintainers Merged MKLDNN with CPU Performance Renamed CUDA to GPU Performance Added Jiong to Compiler and CPU Performance Added Xiaobing to CPU Performance Marking Vitaly and Jian Hui as Emeritus Pull Request resolved: https://github.com/pytorch/pytorch/pull/98520 Approved by: https://github.com/ezyang, https://github.com/soumith, https://github.com/dzhulgakov	2023-04-24 17:58:18 +00:00
Kurt Mohler	1e8cf6ad7f	Add documentation for `torch._logging.set_logs` (#99219 ) Part of #98871 Pull Request resolved: https://github.com/pytorch/pytorch/pull/99219 Approved by: https://github.com/mlazos, https://github.com/lezcano	2023-04-24 08:06:57 +00:00
BowenBao	51742a467d	[ONNX] Fix missing import numpy for docs example (#99663 ) Fixes https://github.com/pytorch/pytorch/issues/99408 Pull Request resolved: https://github.com/pytorch/pytorch/pull/99663 Approved by: https://github.com/justinchuby	2023-04-21 04:06:45 +00:00
Simon Seo	9f95032101	Fix broken links in contribution_guide.rst (#99295 ) mainly from `master` to `main` Pull Request resolved: https://github.com/pytorch/pytorch/pull/99295 Approved by: https://github.com/kit1980	2023-04-20 22:20:56 +00:00
Will Constable	e6aa8e0729	Test and document dynamo backward hooks support (#99382 ) No new support added, but backward hooks are working and now there is a test and some documentation about the limitations (hooks firing after whole graph). Pull Request resolved: https://github.com/pytorch/pytorch/pull/99382 Approved by: https://github.com/yanboliang	2023-04-18 03:03:29 +00:00
Will Constable	6eab5e88c8	Graph-break on allowed modules if they have hooks (#97184) Allowed modules are stuck into dynamo's fx graph as call_module nodes, without dynamo doing any tracing of the module. This means during AOT trace time, hooks will fire during tracing when the call_module is executed, but the hooks themselves will disappear after that and not be present in the compiled program. (Worse, if they performed any tensor operations, those would get traced, so you could end up with part of the hook's functionality.) To circumvent this, there are two options for 'allowed modules' with hooks: 1) don't treat them as 'allowed' and trace into them; 2) graph-break, so the module is no longer part of the dynamo trace at all. (1) will fail for users that opted into allowed modules because they know their module has problems being traced by dynamo. (2) causes graph breaks on common modules such as nn.Linear, just because they are marked as 'allowed'. It would help matters if we could differentiate between types of allowed modules: (A) allowed to avoid overheads, used for common ops like nn.Linear; (B) allowed to avoid dynamo graph breaks caused by unsupported code. Ideally, we'd use method (1) for group (A) and (2) for (B). For now, graph-break on all cases of allowed modules. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97184 Approved by: https://github.com/jansel	2023-04-15 01:46:15 +00:00
BowenBao	606ce5b653	[ONNX] Introduce Input/Output adapter; Switch to 'DynamoExporter' (#98421) Summary * Introduce input/output adapter. Due to design differences, the input/output format of a PyTorch model and of the exported ONNX model are often not the same. E.g., `None` inputs are allowed for a PyTorch model, but are not supported by ONNX. Nested constructs of tensors are allowed for a PyTorch model, but only flattened tensors are supported by ONNX, etc. The new input/output adapter is exported with the model, providing an interface to automatically convert and validate the inputs/outputs format. * As suggested by #98251, provide extension for unwrapping user defined python classes for `dynamo.export` based exporter. Unblock huggingface models. * Re-wire tests to run through `DynamoExporter` w/ `dynamo_export` api. Kept `DynamoOptimizeExporter` in the tests for now for coverage of this change. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98421 Approved by: https://github.com/justinchuby, https://github.com/titaiwangms, https://github.com/thiagocrepaldi	2023-04-15 01:13:00 +00:00
PyTorch MergeBot	dda7ce4bb3	Revert "[core][pruning][be] Rename sparsifier folder to pruner (#98758 )" This reverts commit `778fd1922a`. Reverted https://github.com/pytorch/pytorch/pull/98758 on behalf of https://github.com/jcaip due to https://www.internalfb.com/diff/D44905951 need to fix broken import in fbcode	2023-04-13 16:30:47 +00:00
Tugsbayasgalan Manlaibaatar	39fd7f945f	Add SymBool support in python to C++ translation (#98453) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98453 Approved by: https://github.com/ezyang	2023-04-12 03:21:57 +00:00
Mark Saroufim	bc8cb62bcb	torch.compile benchmark utility (#97699) I've had many exchanges that look like this https://github.com/rasbt/faster-pytorch-blog/pull/2 so this is an attempt to make this problem easier Pull Request resolved: https://github.com/pytorch/pytorch/pull/97699 Approved by: https://github.com/ezyang	2023-04-12 03:02:06 +00:00
soulitzer	367051e47e	[docs] Add missing functions to autograd.rst (#98854 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/98854 Approved by: https://github.com/albanD	2023-04-11 20:45:49 +00:00
Jesse Cai	778fd1922a	[core][pruning][be] Rename sparsifier folder to pruner (#98758 ) Summary: att Test Plan: ``` python test/test_ao_sparsity.py ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/98758 Approved by: https://github.com/jerryzh168	2023-04-11 17:26:29 +00:00
Edward Z. Yang	b8b840be3d	Convert logging f-strings to use % format, part five (#98765 ) This does some annoying but simple cases by hand. Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/98765 Approved by: https://github.com/wanchaol	2023-04-11 13:17:59 +00:00
Guspan Tanadi	ab385bd49e	docs: Linking ResNeXt PyTorch Hub Pipeline (#98689 ) Introducing ResNeXt model as link to PyTorch Hub see Skip connections section. Handle issue in #98690. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98689 Approved by: https://github.com/zou3519, https://github.com/kit1980	2023-04-11 02:20:26 +00:00
Will Constable	390c51bf87	Skip nnmodule hook guards by default (#98371 ) This PR makes basic nnmodule forward hooks work by default, without any overhead. But it leaves silent correctness issues if users modify/remove their hooks later, thus also emits a warning. - the usual case is to not use hooks, so avoid guard overhead here - registering any hook before compile will trigger a warning about hook support - registering a hook later (or removing one) requires user knowledge and opting in, currently this isn't warnable (but maybe we can observe compiled nnmodules to make it warnable). Why skip hook guards by default instead of not tracing __call__/hooks by default? - avoid having a mode flag that alters dynamo tracing behavior (harder to test both codepaths in CI with full coverage) - the most basic hook usecase (registering a hook before compile, and never removing it) will work by default with this PR, while it would require enablement and incur overhead in the 'not tracing __call__' proposal. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98371 Approved by: https://github.com/jansel	2023-04-07 15:10:51 +00:00
BJ Hargrave	555ab310dc	Add itemsize and nbytes properties to Tensor (#98322 ) Adds properties for itemsize and nbytes to Tensor matching the properties in NumPy. Fixes https://github.com/pytorch/pytorch/issues/12728 Pull Request resolved: https://github.com/pytorch/pytorch/pull/98322 Approved by: https://github.com/ezyang	2023-04-05 12:11:55 +00:00
Aaron Bockover	558e5a240e	Introduce torch.onnx.dynamo_export API (#97920) This is the first phase of the new ONNX exporter API for exporting from TorchDynamo and FX, and represents the beginning of a new era for exporting ONNX from PyTorch. The API here is a starting point upon which we will layer more capability and expressiveness in subsequent phases. This first phase introduces the following into `torch.onnx`:

```python
def dynamo_export(
    model: torch.nn.Module,
    /,
    *model_args,
    export_options: Optional[ExportOptions] = None,
    **model_kwargs,
) -> ExportOutput:
    ...

class ExportOptions:
    opset_version: Optional[int] = None
    dynamic_shapes: Optional[bool] = None
    logger: Optional[logging.Logger] = None

class ExportOutputSerializer(Protocol):
    def serialize(
        self,
        export_output: ExportOutput,
        destination: io.BufferedIOBase,
    ) -> None:
        ...

class ExportOutput:
    model_proto: onnx.ModelProto

    def save(
        self,
        destination: Union[str, io.BufferedIOBase],
        *,
        serializer: Optional[ExportOutputSerializer] = None,
    ) -> None:
        ...
```

In addition to the API in the first commit on this PR, we have a few experiments for exporting Dynamo and FX to ONNX that this PR rationalizes through the new Exporter API and adjusts tests to use the new API.

- A base `FXGraphModuleExporter` exporter from which all derive:
  - `DynamoExportExporter`: uses dynamo.export to acquire FX graph
  - `DynamoOptimizeExporter`: uses dynamo.optimize to acquire FX graph
  - `FXSymbolicTraceExporter`: uses FX symbolic tracing

The `dynamo_export` API currently uses `DynamoOptimizeExporter`.

### Next Steps (subsequent PRs):

* Combine `DynamoExportExporter` and `DynamoOptimizeExporter` into a single `DynamoExporter`.
* Make it easy to test `FXSymbolicTraceExporter` through the same API; eventually `FXSymbolicTraceExporter` goes away entirely when the Dynamo approach works for large models. We want to keep `FXSymbolicTraceExporter` around for now for experimenting and internal use.
* Parameterize (on `ExportOptions`) and consolidate Dynamo exporter tests.
  - This PR intentionally leaves the existing tests unchanged as much as possible except for the necessary plumbing.
* Subsequent API phases:
  - Diagnostics
  - Registry, dispatcher, and Custom Ops
  - Passes
  - Dynamic shapes

Fixes #94774 Pull Request resolved: https://github.com/pytorch/pytorch/pull/97920 Approved by: https://github.com/justinchuby, https://github.com/titaiwangms, https://github.com/thiagocrepaldi, https://github.com/shubhambhokare1	2023-04-04 18:13:29 +00:00
Richard Zou	6b9e22f3f6	Clarify the saving of intermediates in the "extending torch.func" docs (#98020 ) Fixes https://github.com/pytorch/pytorch/issues/97260 We got some feedback that the page reads like "in order to save an input for backward, you must return it as an output of the autograd.Function.forward". Doing so actually raises an error (on master and as of 2.1), but results in an ambiguous situation on 2.0.0. To avoid more users running into this, we clarify the documentation so it doesn't read like the above and clearly mentions that you can save things from the inputs or outputs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/98020 Approved by: https://github.com/soulitzer, https://github.com/kshitij12345	2023-03-31 13:57:37 +00:00
drisspg	a5b6f10c5d	Fix format bug in NT docs (#97998 ) Fixes a formatting bug in the NT docs Pull Request resolved: https://github.com/pytorch/pytorch/pull/97998 Approved by: https://github.com/jbschlosser	2023-03-31 01:00:25 +00:00
Driss Guessous	5a81508bb6	Add NestedTensor ops: logical_not, logical_not_, masked_fill (#97934 ) # Summary <!-- copilot:summary --> ### <samp>🤖 Generated by Copilot at 7954302</samp> This pull request adds support for `logical_not` and `masked_fill` operations on nested tensors, which are tensors that can have tensors as elements. It modifies the `native_functions.yaml` file to dispatch these operations to the nested tensor backend, implements the logic for these operations in `NestedTensorBinaryOps.cpp` and `NestedTensorUnaryOps.cpp`, adds documentation in `nested.rst`, and adds tests in `test_nestedtensor.py`. ## Description <!-- copilot:walkthrough --> ### <samp>🤖 Generated by Copilot at 7954302</samp> * Implement `logical_not` operation on nested tensors ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991R1164), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991R1172), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-f7c94671810b3ce652f9ad5458518cb7bbd67e8bf7e84e0a2fba641d878ba7c5R45-R56), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-c8b131d009badb3f92031b2aaa6e7f93a793f13caee278ea78e1c57d78c0399eR203), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-6eef496a8ec635930b6e52507358e069c80021f3535b8737d39e14ffc38950c0L854-R867)) - Add `NestedTensor_logical_not` and `NestedTensor_logical_not_` functions to `native_functions.yaml` for CPU and CUDA dispatch ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991R1164), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991R1172)) - Define 
`NestedTensor_logical_not` and `NestedTensor_logical_not_` functions in `NestedTensorUnaryOps.cpp` using `map_nt` and `get_buffer` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-f7c94671810b3ce652f9ad5458518cb7bbd67e8bf7e84e0a2fba641d878ba7c5R45-R56)) - Document `torch.logical_not` function for nested tensors in `nested.rst` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-c8b131d009badb3f92031b2aaa6e7f93a793f13caee278ea78e1c57d78c0399eR203)) - Add subtest for `logical_not` function in `test_activations` method in `TestNestedTensorDeviceType` class in `test_nestedtensor.py` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-6eef496a8ec635930b6e52507358e069c80021f3535b8737d39e14ffc38950c0L854-R867)) * Implement `masked_fill` operation on nested tensors ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991R7439), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-f847e41e3d373230df0b25574e993ec0e6b699bf16796b3df9ae9fb518048e25L210-R224), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-c8b131d009badb3f92031b2aaa6e7f93a793f13caee278ea78e1c57d78c0399eR197), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-6eef496a8ec635930b6e52507358e069c80021f3535b8737d39e14ffc38950c0R677-R688), [link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-6eef496a8ec635930b6e52507358e069c80021f3535b8737d39e14ffc38950c0R2515-R2528)) - Add `NestedTensor_masked_fill` function to `native_functions.yaml` for CPU and CUDA dispatch ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991R7439)) - Define `NestedTensor_masked_fill` function in `NestedTensorBinaryOps.cpp` using 
`NestedTensor_elementwise_Tensor` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-f847e41e3d373230df0b25574e993ec0e6b699bf16796b3df9ae9fb518048e25L210-R224)) - Document `torch.Tensor.masked_fill` function for nested tensors in `nested.rst` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-c8b131d009badb3f92031b2aaa6e7f93a793f13caee278ea78e1c57d78c0399eR197)) - Add test case for `masked_fill` function in `TestNestedTensorDeviceType` class in `test_nestedtensor.py` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-6eef496a8ec635930b6e52507358e069c80021f3535b8737d39e14ffc38950c0R677-R688)) - Add test case for backward pass of `masked_fill` function in `TestNestedTensorAutograd` class in `test_nestedtensor.py` ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-6eef496a8ec635930b6e52507358e069c80021f3535b8737d39e14ffc38950c0R2515-R2528)) * Improve error message for unsupported element-wise binary operations on nested dense tensors ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-f847e41e3d373230df0b25574e993ec0e6b699bf16796b3df9ae9fb518048e25L142-R150)) - Modify `NestedTensor_elementwise_Tensor` function in `NestedTensorBinaryOps.cpp` to include operation name in error message ([link](https://github.com/pytorch/pytorch/pull/97934/files?diff=unified&w=0#diff-f847e41e3d373230df0b25574e993ec0e6b699bf16796b3df9ae9fb518048e25L142-R150)) Pull Request resolved: https://github.com/pytorch/pytorch/pull/97934 Approved by: https://github.com/cpuhrsch	2023-03-30 08:14:39 +00:00
Driss Guessous	f603873c1b	add various NT ops needed for testing (#97837 ) # Summary Add some Simple unary and binary NT ops - Sub - sgn - abs Pull Request resolved: https://github.com/pytorch/pytorch/pull/97837 Approved by: https://github.com/cpuhrsch	2023-03-29 23:43:37 +00:00
vfdev	0f424f7f05	Fixed broken link to troubleshooting.html docs page (#97330 ) Seen first in error message: ``` [2023-03-22 10:30:39,786] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64) function: '<resume in paste_mask_in_image>' (/vision/torchvision/models/detection/roi_heads.py:407) reasons: w == 857 to diagnose recompilation issues, see https://pytorch.org/docs/master/dynamo/troubleshooting.html. [2023-03-22 10:30:40,036] torch._dynamo.convert_frame: [WARNING] torch._dynamo hit config.cache_size_limit (64) function: '<resume in paste_mask_in_image>' (/vision/torchvision/models/detection/roi_heads.py:406) reasons: ___stack0 == 207 to diagnose recompilation issues, see https://pytorch.org/docs/master/dynamo/troubleshooting.html. ``` Broken link: - https://pytorch.org/docs/master/dynamo/troubleshooting.html. Good link: - https://pytorch.org/docs/master/compile/troubleshooting.html Pull Request resolved: https://github.com/pytorch/pytorch/pull/97330 Approved by: https://github.com/zou3519	2023-03-22 16:40:21 +00:00
Mikayla Gawarecki	b04363ead4	[easy] Expose documentation for a few global nn.Module hooks (#97185 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/97185 Approved by: https://github.com/albanD	2023-03-21 20:09:29 +00:00
Kazuaki Ishizaki	50ed38a7eb	Fix typo under docs directory (#97202 ) This PR fixes typo in `.rst` files under docs directory. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97202 Approved by: https://github.com/kit1980	2023-03-21 01:24:10 +00:00
Driss Guessous	a269e5fa04	Add forward and backward support for silu to NestedTensors (#97181 ) # Summary Add forward and backward support for silu to NestedTensors - Add forward support to silu - Add forward support to silu_ - Add backward support to silu - Add to NT docs - Add tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/97181 Approved by: https://github.com/cpuhrsch, https://github.com/jbschlosser	2023-03-20 23:46:12 +00:00
Mark Saroufim	6110effa86	Rework torch.compile docs (#96706) Chatted with @stas00 on slack and here are some great improvements he suggested to the compile docs - [x] Rename `dynamo` folder to `compile` - [x] Link `compile` docstring on `torch.html` to main index page for compile - [x] Create a new index page that describes why people should care - [x] easy perf, memory reduction, 1 line - [x] Short benchmark table - [x] How to guide - [x] TOC that links to the more technical pages folks have written, make the existing docs we have a Technical overview - [x] Highlight the new APIs for `torch._inductor.list_options()` and `torch._inductor.list_mode_options()` - clarify these are inductor specific and add more prose around which ones are most interesting He also highlighted an interesting way to think about who is reading these docs - [x] End users, who just want things to run fast - [x] Library maintainers wrapping torch.compile, who would care, for example, about understanding when in their code they should compile a model and which backends are supported - [x] Debuggers, whose needs are somewhat addressed by the troubleshooting guide and FAQ, but those could be dramatically reworked to say what we expect to break And in a separate PR I'll work on the below with @SherlockNoMad - [ ] Authors of new backends that care about how to plug into dynamo or inductor layer so need to explain some more internals like - [ ] IR - [ ] Where to plugin, dynamo? inductor? triton? Pull Request resolved: https://github.com/pytorch/pytorch/pull/96706 Approved by: https://github.com/svekars	2023-03-15 04:41:13 +00:00
Bin Bao	f03db8d6cb	[reland2][inductor] Add an AOT compilation mode for Inductor CPP backend (#96520 ) Summary: This is a reland of https://github.com/pytorch/pytorch/pull/94822. Solved the long compilation issue for inductor cpp tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96520 Approved by: https://github.com/huydhn, https://github.com/malfet	2023-03-14 16:10:54 +00:00
eqy	6e3e22d58c	[CUDA][cuFFT] Minor fix for cuFFT plan cache docs (#96373 ) The attributes described in the docs require indexing in to the plan cache manager, as there is a separate plan cache per device. CC @ptrblck @ngimel Pull Request resolved: https://github.com/pytorch/pytorch/pull/96373 Approved by: https://github.com/ngimel	2023-03-14 00:28:14 +00:00
Driss Guessous	f330281fb2	Add torch.nn.LayerNorm() to documented list of supported nested tensor ops (#96434 ) Layer norm is supported and this updates the documentation to reflect that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96434 Approved by: https://github.com/cpuhrsch, https://github.com/jbschlosser	2023-03-13 23:16:09 +00:00
Joel Schlosser	30d56dd8c1	Support randn_like() for NT (#96528 ) To satisfy an internal ask. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96528 Approved by: https://github.com/mikaylagawarecki, https://github.com/cpuhrsch	2023-03-13 19:39:51 +00:00
Kiuk Chung	55a1bd3fc6	[PT-D] Update CODEOWNERS, merge_rules, and Persons-of-Interest for to… (#96321 ) Synchronize CODEOWNERS, merge_rules, and POI files to reflect kiukchung and d4l3k (Tristan Rice) as one of the maintainers for the distributed module. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96321 Approved by: https://github.com/d4l3k, https://github.com/albanD, https://github.com/malfet	2023-03-13 17:38:43 +00:00
Joel Schlosser	024ea1a21e	Support zeros_like() for NT (#96527 ) This is used for the fake tensor fallbacks. Pull Request resolved: https://github.com/pytorch/pytorch/pull/96527 Approved by: https://github.com/cpuhrsch	2023-03-13 15:15:08 +00:00
Rishub Tamirisa	f3b8638074	Adding nn.ZeroPad1d and nn.ZeroPad3d (#96295) Fixes #95796 ### Implementation Adds python implementation for `nn.ZeroPad1d` and `nn.ZeroPad3d` in `torch/nn/modules/padding.py`. Adds cpp implementation for `nn::ZeroPad1d` and `nn::ZeroPad3d` in the following 3 files, refactored with templates similarly to `nn::ConstantPad`'s implementation: - `torch/csrc/api/include/torch/nn/modules/padding.h` - `torch/csrc/api/include/torch/nn/options/padding.h` - `torch/csrc/api/src/nn/modules/padding.cpp` Also added relevant definitions in `torch/nn/modules/__init__.py`. ### Testing Adds the following tests: - cpp tests of similar length and structure as `ConstantPad` and the existing `ZeroPad2d` impl in `test/cpp/api/modules.cpp` - cpp API parity tests in `torch/testing/_internal/common_nn.py` - module init tests in `test/test_module_init.py` Also added relevant definitions in `test/cpp_api_parity/parity-tracker.md` Pull Request resolved: https://github.com/pytorch/pytorch/pull/96295 Approved by: https://github.com/soulitzer	2023-03-10 03:51:41 +00:00
Joel Schlosser	7324aef9a8	Add torch.empty_like() to documented list of supported nested tensor ops (#96211 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/96211 Approved by: https://github.com/drisspg	2023-03-07 23:33:34 +00:00
Iris	a7698a8260	[DCP] Add DCP FSDP sharded_state_dict checkpoint example to DCP .rst file (#95517 ) As title. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95517 Approved by: https://github.com/kumpera	2023-03-03 18:09:10 +00:00
Svetlana Karslioglu	004bcffc6a	Fix formatting (#95906 ) Fixing list formatting by adding a missing blank line: Before: ![Screenshot 2023-03-02 at 3 17 28 PM (2)](https://user-images.githubusercontent.com/5317992/222585127-9b6ed4dd-4719-4756-b2ac-1ba6e8f97b87.png) After: ![Screenshot 2023-03-02 at 3 16 48 PM (2)](https://user-images.githubusercontent.com/5317992/222585172-3ef35a48-641f-4b73-9f7b-f419a122196b.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/95906 Approved by: https://github.com/orionr	2023-03-03 16:18:12 +00:00
Michael Lazos	184fb9f11d	Small doc update for torch_compile_debug (#95809 ) Updates the troubleshooting documentation with the folder structure of the debug directory Pull Request resolved: https://github.com/pytorch/pytorch/pull/95809 Approved by: https://github.com/msaroufim	2023-03-02 00:25:28 +00:00
Mark Saroufim	f7b26bdd22	Remove mention of dynamo.optimize() in docs (#95802) This should be self-contained enough to merge, but other things that have been bugging me are: * Instructions on debugging IMA issues * Dynamic shape instructions * Explaining config options better. Will look at adding a config options doc. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95802 Approved by: https://github.com/svekars	2023-03-01 23:24:09 +00:00
ajithvallabai	e9c70b0b20	Fix typo and grammatical errors in community docs and dynamo docs (#95692 ) Fixes typo and grammatical errors in community docs and dynamo docs Pull Request resolved: https://github.com/pytorch/pytorch/pull/95692 Approved by: https://github.com/H-Huang	2023-03-01 18:10:46 +00:00
ajithvallabai	3944e7c3e8	Fix grammatical errors in contribution guide (#95454 ) Fixed following errors in contribution guide. "deep neural networks using a on tape-based autograd systems." to "deep neural networks using a tape-based autograd systems." "the best entrance point and are great places to start." to "the best entrance points and are great places to start." Pull Request resolved: https://github.com/pytorch/pytorch/pull/95454 Approved by: https://github.com/ezyang	2023-02-28 03:44:40 +00:00
Svetlana Karslioglu	d7146e7870	Update copyright (#95652 ) Updating the copyright to reflect on the website. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95652 Approved by: https://github.com/atalman	2023-02-27 23:15:55 +00:00
Jane Xu	b215af2db8	[optim] Add general documentation on our algorithm defaults (#95391 ) I added a section + table under Algorithms https://docs-preview.pytorch.org/95391/optim.html?highlight=optim#module-torch.optim <img width="725" alt="image" src="https://user-images.githubusercontent.com/31798555/221246256-99325a27-9016-407b-a9fe-404d61e41a82.png"> Pull Request resolved: https://github.com/pytorch/pytorch/pull/95391 Approved by: https://github.com/albanD	2023-02-24 21:35:30 +00:00
Mark Saroufim	9f707f164e	Add more GPU metric instrumentation (#91717) Fixes https://github.com/pytorch/serve/issues/1937 A fairly common query I see folks running while using pytorch is `nvidia-smi --format=csv,noheader,nounits --query-gpu=utilization.gpu,utilization.memory,memory.total,memory.used,temperature.gpu,power.draw,clocks.current.sm,clocks.current.memory -l 10` Existing metrics we have * For kernel utilization: `torch.cuda.utilization()` * For memory utilization: these live under `torch.cuda.memory`, e.g. the memory allocated with `torch.cuda.memory.memory_allocated()` * For total available memory: `torch.cuda.get_device_properties(0).total_memory` Which means the only metrics we're missing are * Temperature: now in `torch.cuda.temperature()` * Power draw: now in `torch.cuda.power()` * Clock speed: now in `torch.cuda.clock_speed()` With some important details on each * Clock speed settings: I picked the SM clock domain which is documented here https://docs.nvidia.com/deploy/nvml-api/group__nvmlDeviceEnumvs.html#group__nvmlDeviceEnumvs_1g805c0647be9996589fc5e3f6ff680c64 * Temperature: I use `pynvml.nvmlDeviceGetTemperature(handle, 0)` where 0 refers to the GPU die temperature Pull Request resolved: https://github.com/pytorch/pytorch/pull/91717 Approved by: https://github.com/ngimel	2023-02-24 00:38:03 +00:00
Atharva Kavitkar	627282fa6c	Corrected grammar in contribution guide (#93014 ) Corrected the grammar of a sentence in "Implementing Features or Fixing Bugs" section of the contribution guide. Before: Issues that are labeled first-new-issue, low, or medium priority provide the best entrance point are great places to start. After: Issues that are labeled first-new-issue, low, or medium priority provide the best entrance point _and_ are great places to start. Pull Request resolved: https://github.com/pytorch/pytorch/pull/93014 Approved by: https://github.com/albanD, https://github.com/kit1980	2023-02-24 00:22:14 +00:00
fduwjj	b209d8fa0d	[PT-D][Sequence Parallelism] Enable DTensor based Naive sequence parallelism (#94369 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/94369 Approved by: https://github.com/wanchaol	2023-02-16 21:21:00 +00:00


2077 Commits