Xu Han
ba81c3c290
[inductor] add cpp builder code. (take 2) ( #125849 )
...
A fully manual rebase of the code from PR https://github.com/pytorch/pytorch/pull/124045 .
The old PR appears to have broken due to too many commits and too many rebases. See: https://github.com/pytorch/pytorch/pull/124045#issuecomment-2103744588
-------
This PR is the first step of RFC https://github.com/pytorch/pytorch/issues/124245 .
Changes:
1. Add cpp builder code; the new cpp_builder supports Windows (a minimal usage sketch follows below).
2. Add a cross-OS CPU ISA checker backed by cpuinfo.
3. Switch the compiler ISA checker to the new cpp builder.
4. Make CppCodeCache use the new ISA checker.
5. Add a temporary `test_new_cpp_build_logical` UT to help with the transition to the new code.
<img width="1853" alt="Image" src="https://github.com/pytorch/pytorch/assets/8433590/ce6519ab-ba92-4204-b1d6-7d15d2ba2cbe ">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125849
Approved by: https://github.com/jgong5 , https://github.com/desertfire
2024-06-07 20:49:58 +00:00
Animesh Jain
662a78f957
[dynamo] Inline the getattr of fx graph and proxy graph ( #128172 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128172
Approved by: https://github.com/yanboliang
ghstack dependencies: #128001 , #126578 , #128158
2024-06-07 17:14:58 +00:00
Xuehai Pan
c97e3ebb96
Fix wrongly exposed variables in torch/__init__.py ( #127795 )
...
<img width="609" alt="image" src="https://github.com/pytorch/pytorch/assets/16078332/964c6707-1856-4c2c-8cd8-ce1d96d38d36 ">
This PR removes temporary variables in `torch/__init__.py`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127795
Approved by: https://github.com/albanD
2024-06-06 08:31:41 +00:00
PyTorch MergeBot
d1fad416a8
Revert "Add aten._unsafe_masked_index ( #116491 )"
...
This reverts commit f03f8bc901 .
Reverted https://github.com/pytorch/pytorch/pull/116491 on behalf of https://github.com/PaliC due to breaking onnx tests ([comment](https://github.com/pytorch/pytorch/pull/116491#issuecomment-2145557724 ))
2024-06-03 15:51:50 +00:00
Isuru Fernando
f03f8bc901
Add aten._unsafe_masked_index ( #116491 )
...
Adds an op for generating masked indexing operations that produce
masked loads in Triton code.
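A minimal sketch of the intended semantics in eager PyTorch (the helper name is hypothetical): lanes where the mask is false ignore their possibly out-of-bounds index and take a fill value instead, which is what lets Inductor emit a masked load in Triton.
~~~python
import torch

# Hypothetical eager reference; indices are clamped so masked-off
# out-of-bounds lanes stay safe outside of the compiled path.
def masked_index_reference(x, mask, indices, fill):
    safe = indices.clamp(0, x.numel() - 1)
    gathered = x.flatten()[safe]
    return torch.where(mask, gathered, torch.full_like(gathered, fill))

x = torch.arange(8.0)
idx = torch.tensor([0, 3, 100])           # 100 is out of bounds
mask = torch.tensor([True, True, False])  # the masked lane ignores the OOB index
print(masked_index_reference(x, mask, idx, fill=0.0))  # tensor([0., 3., 0.])
~~~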
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116491
Approved by: https://github.com/lezcano , https://github.com/peterbell10
2024-06-03 14:44:03 +00:00
lezcano
48538d3d14
Implement svd_lowrank and pca_lowrank for complex numbers ( #125580 )
...
We fix a number of bugs previously present in the complex
implementation.
We also heavily simplify the implementation, using, among
other things, the fact that we now have conjugate views.
There is a comment noting how slow some checks on this function are,
so I removed quite a few of the input combinations to make the OpInfo
lighter, while leaving a couple of relevant examples to avoid regressing
coverage.
Fixes https://github.com/pytorch/pytorch/issues/122188
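A minimal usage sketch on a complex input (shapes and rank are arbitrary):
~~~python
import torch

A = torch.randn(20, 12, dtype=torch.complex64)
U, S, V = torch.svd_lowrank(A, q=6)
# For complex inputs the reconstruction uses the conjugate transpose.
approx = U @ torch.diag(S).to(A.dtype) @ V.mH
print((A - approx).norm() / A.norm())  # relative error of the rank-6 approximation
~~~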
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125580
Approved by: https://github.com/pearu , https://github.com/peterbell10
2024-05-30 14:45:58 +00:00
Andrew M. James
80a8fc07b2
[dynamo] Handle np.iinfo/finfo/dtype as input ( #124482 )
...
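A hedged illustration of the titled behavior (the function and inputs are hypothetical): Dynamo can now accept `np.iinfo`/`np.finfo`/`np.dtype` objects as inputs without graph-breaking.
~~~python
import numpy as np
import torch

@torch.compile(fullgraph=True)  # fullgraph assumes no graph break on the finfo input
def clamp_to_dtype_max(x, info):
    return x.clamp(max=float(info.max))

print(clamp_to_dtype_max(torch.tensor([1.0, 1e30]), np.finfo(np.float16)))
~~~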
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124482
Approved by: https://github.com/lezcano
ghstack dependencies: #124481
2024-05-29 16:00:15 +00:00
Animesh Jain
1507d5205a
[dynamo][fsdp] Skip Dynamo tracing of __getattr__ if its top-level frame ( #127263 )
...
The generated bytecode for the first frame is below, with inlined comments about the LOAD_ATTR instructions that cause Dynamo to trigger again on `__getattr__`.
~~~
[__bytecode] MODIFIED BYTECODE fn /data/users/anijain/pytorch2/test/dynamo/test_activation_checkpointing.py line 1129
[__bytecode] 1129 0 COPY_FREE_VARS 1
[__bytecode] 2 RESUME 0
[__bytecode] 4 PUSH_NULL
[__bytecode] 6 LOAD_GLOBAL 10 (__compiled_fn_1)
[__bytecode] 18 LOAD_FAST 0 (x)
[__bytecode] 20 LOAD_DEREF 1 (mod)
[__bytecode] 22 LOAD_ATTR 6 (_checkpoint_wrapped_module)
[__bytecode] 32 LOAD_CONST 1 (0)
[__bytecode] 34 BINARY_SUBSCR
[__bytecode] 44 LOAD_ATTR 7 (weight)
[__bytecode] 54 LOAD_DEREF 1 (mod)
[__bytecode] 56 LOAD_ATTR 6 (_checkpoint_wrapped_module)
[__bytecode] 66 LOAD_CONST 1 (0)
[__bytecode] 68 BINARY_SUBSCR
[__bytecode] 78 LOAD_ATTR 8 (bias)
# When this optimized bytecode is executed, these two lines call the __getattr__
# of the ActivationWrapper module, so Dynamo gets invoked on __getattr__.
# If we had inlined __getattr__ during tracing, we would have seen the LOAD_ATTR
# on lower-level data structures like _modules, obviating the need for CPython
# to call the Python-overridden __getattr__. But today, UnspecializedNNModuleVariable
# calls Python getattr at tracing time (instead of inlining it), resulting in LOAD_ATTR
# on the module itself.
# To make Dynamo skip tracing of __getattr__ on the optimized bytecode,
# we can check whether it is the top-level frame and just skip it (a sketch of
# the pattern follows the bytecode).
[__bytecode] 88 LOAD_DEREF 1 (mod)
[__bytecode] 90 LOAD_ATTR 0 (a)
[__bytecode] 100 PRECALL 4
[__bytecode] 104 CALL 4
[__bytecode] 114 UNPACK_SEQUENCE 1
[__bytecode] 118 RETURN_VALUE
~~~
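A hedged illustration of the pattern above (the wrapper is hypothetical): a module whose `__getattr__` forwards to an inner wrapped module, the way activation-checkpoint wrappers do. Accessing `w.weight` from the optimized bytecode invokes `__getattr__`, which Dynamo now skips when it is the top-level frame.
~~~python
import torch

class Wrapper(torch.nn.Module):
    def __init__(self, inner):
        super().__init__()
        self._wrapped_module = inner

    def __getattr__(self, name):
        try:
            return super().__getattr__(name)  # checks _parameters/_buffers/_modules
        except AttributeError:
            return getattr(self._wrapped_module, name)  # forward to the inner module

w = Wrapper(torch.nn.Linear(4, 4))
print(w.weight.shape)  # resolved through Wrapper.__getattr__
~~~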
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127263
Approved by: https://github.com/yf225
2024-05-28 08:16:53 +00:00
Yu, Guangye
e7a42702f9
generalize custom_fwd&custom_bwd to be device-agnostic ( #126531 )
...
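A minimal sketch of the generalized decorators (the `device_type` argument follows the PR title; the autograd.Function itself is illustrative):
~~~python
import torch
from torch.amp import custom_fwd, custom_bwd

class Scale(torch.autograd.Function):
    @staticmethod
    @custom_fwd(device_type="cuda", cast_inputs=torch.float32)
    def forward(ctx, x):
        return x * 2

    @staticmethod
    @custom_bwd(device_type="cuda")
    def backward(ctx, grad_out):
        return grad_out * 2
~~~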
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126531
Approved by: https://github.com/jgong5 , https://github.com/gujinghui , https://github.com/albanD , https://github.com/EikanWang
ghstack dependencies: #126527
2024-05-25 06:48:16 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
72f0bdcc22
Remove torch._constrain_as_value ( #127103 )
...
Summary: This API doesn't do anything useful and should be subsumed by torch._check.
Test Plan: CI
Differential Revision: D57786740
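A minimal sketch of the replacement (the function below is illustrative): `torch._check` asserts a condition, optionally with a lazy message, and also informs the symbolic-shapes machinery under export/compile.
~~~python
import torch

def take(x, idx: int):
    torch._check(idx >= 0, lambda: f"idx must be non-negative, got {idx}")
    torch._check(idx < x.size(0))
    return x[idx]

print(take(torch.arange(5), 2))  # tensor(2)
~~~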
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127103
Approved by: https://github.com/angelayi
2024-05-24 22:49:46 +00:00
Oguz Ulgen
a6155d23d1
[easy] Delete dead code global ( #126903 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126903
Approved by: https://github.com/aorenste
ghstack dependencies: #126083
2024-05-23 08:29:29 +00:00
Oguz Ulgen
cc61d03ac9
Do not trace into triton/backends ( #126083 )
...
Fixes #125807
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126083
Approved by: https://github.com/yanboliang , https://github.com/jansel
2024-05-23 08:29:29 +00:00
Jack Taylor
d30cdc4321
[ROCm] amdsmi library integration ( #119182 )
...
Adds monitoring support for ROCm using amdsmi in place of pynvml.
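A minimal usage sketch (a GPU and the relevant management library assumed present): these monitoring helpers are backed by pynvml on CUDA and, with this change, by amdsmi on ROCm.
~~~python
import torch

if torch.cuda.is_available():
    print(torch.cuda.utilization())  # GPU utilization, percent
    print(torch.cuda.temperature())  # degrees Celsius
    print(torch.cuda.power_draw())   # average power draw, milliwatts
~~~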
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119182
Approved by: https://github.com/jeffdaily , https://github.com/malfet , https://github.com/xw285cornell
2024-05-21 01:59:26 +00:00
Jason Ansel
f9de510121
[dynamo] Graph break on set_num_threads ( #126623 )
...
Fixes #125364
Pull Request resolved: https://github.com/pytorch/pytorch/pull/126623
Approved by: https://github.com/yanboliang
2024-05-20 17:44:32 +00:00
PyTorch MergeBot
315389bfed
Revert "Remove deprecated _aminmax operator ( #125995 )"
...
This reverts commit 0116ffae7f .
Reverted https://github.com/pytorch/pytorch/pull/125995 on behalf of https://github.com/huydhn due to Sorry for reverting your change but we need to reland this after I get rid of all usage of _aminmax internally in Meta ([comment](https://github.com/pytorch/pytorch/pull/125995#issuecomment-2113769497 ))
2024-05-16 01:45:37 +00:00
cyy
0116ffae7f
Remove deprecated _aminmax operator ( #125995 )
...
It has been deprecated for a long time.
Co-authored-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125995
Approved by: https://github.com/ezyang
2024-05-12 17:50:17 +00:00
PyTorch MergeBot
0d4fdb0bb7
Revert "[ROCm] amdsmi library integration ( #119182 )"
...
This reverts commit 85447c41e3 .
Reverted https://github.com/pytorch/pytorch/pull/119182 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but the ROCm failed test is legit 85447c41e3 ([comment](https://github.com/pytorch/pytorch/pull/119182#issuecomment-2103433197 ))
2024-05-09 21:18:21 +00:00
Jack Taylor
85447c41e3
[ROCm] amdsmi library integration ( #119182 )
...
Adds monitoring support for ROCm using amdsmi in place of pynvml.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119182
Approved by: https://github.com/jeffdaily , https://github.com/malfet , https://github.com/xw285cornell
2024-05-09 18:21:38 +00:00
Michael Lazos
1b1b18a7a4
Add LRScheduler Composability E2E Tests ( #125653 )
...
Adds tests to verify that the LRSchedulers correctly update the compiled optimizers without recompiles.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125653
Approved by: https://github.com/yanboliang
ghstack dependencies: #123751 , #123752 , #123753 , #125383
2024-05-09 00:52:43 +00:00
PyTorch MergeBot
2e237fcd70
Revert "[inductor] add cpp builder code. ( #124045 )"
...
This reverts commit 469383755f .
Reverted https://github.com/pytorch/pytorch/pull/124045 on behalf of https://github.com/clee2000 due to broke inductor/test_codecache and inductor/test_max_autotune 469383755f https://github.com/pytorch/pytorch/actions/runs/8996772350/job/24724775182 ([comment](https://github.com/pytorch/pytorch/pull/124045#issuecomment-2100851419 ))
2024-05-08 15:33:20 +00:00
Xu Han
469383755f
[inductor] add cpp builder code. ( #124045 )
...
The previous full PR, https://github.com/pytorch/pytorch/pull/115248 , failed to merge because fb_code was hard to debug.
I also tried to submit it as two pieces, https://github.com/pytorch/pytorch/pull/118514 and https://github.com/pytorch/pytorch/pull/118515 , and they passed PreCI at the time.
Now I am splitting https://github.com/pytorch/pytorch/pull/115248 into smaller pieces; this is the first step of RFC https://github.com/pytorch/pytorch/issues/124245 .
Changes:
1. Add cpp builder code; the new cpp_builder supports Windows.
2. Add a cross-OS CPU ISA checker backed by cpuinfo.
3. Switch the compiler ISA checker to the new cpp builder.
4. Make CppCodeCache use the new ISA checker.
5. Add a temporary `test_new_cpp_build_logical` UT to help with the transition to the new code.
<img width="1853" alt="Image" src="https://github.com/pytorch/pytorch/assets/8433590/ce6519ab-ba92-4204-b1d6-7d15d2ba2cbe ">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124045
Approved by: https://github.com/jgong5 , https://github.com/jansel
2024-05-08 05:27:15 +00:00
PyTorch MergeBot
2f79a18324
Revert "[inductor] add cpp builder code. ( #124045 )"
...
This reverts commit 7864d287a1 .
Reverted https://github.com/pytorch/pytorch/pull/124045 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing trunk jobs 7864d287a1 including lint ([comment](https://github.com/pytorch/pytorch/pull/124045#issuecomment-2099306071 ))
2024-05-07 21:04:49 +00:00
Xu Han
7864d287a1
[inductor] add cpp builder code. ( #124045 )
...
The previous full PR, https://github.com/pytorch/pytorch/pull/115248 , failed to merge because fb_code was hard to debug.
I also tried to submit it as two pieces, https://github.com/pytorch/pytorch/pull/118514 and https://github.com/pytorch/pytorch/pull/118515 , and they passed PreCI at the time.
Now I am splitting https://github.com/pytorch/pytorch/pull/115248 into smaller pieces; this is the first step of RFC https://github.com/pytorch/pytorch/issues/124245 .
Changes:
1. Add cpp builder code; the new cpp_builder supports Windows.
2. Add a cross-OS CPU ISA checker backed by cpuinfo.
3. Switch the compiler ISA checker to the new cpp builder.
4. Make CppCodeCache use the new ISA checker.
5. Add a temporary `test_new_cpp_build_logical` UT to help with the transition to the new code.
<img width="1853" alt="Image" src="https://github.com/pytorch/pytorch/assets/8433590/ce6519ab-ba92-4204-b1d6-7d15d2ba2cbe ">
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124045
Approved by: https://github.com/jgong5 , https://github.com/jansel
2024-05-07 20:07:41 +00:00
Aaron Gokaslan
1dd42e42c4
[BE]: Try TCH autofixes on torch/ ( #125536 )
...
Tries TCH autofixes to see what breaks.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125536
Approved by: https://github.com/ezyang
2024-05-05 23:13:59 +00:00
Chien-Chin Huang
1eb7b8eb60
[PT2D] Ensure the trace rules are correct with distributed ( #125333 )
...
Summary:
1. Avoid using `torch._dynamo.disable`.
2. Clear the LRU cache of the trace rules. This won't do anything if the rules are not evaluated before PG initialization.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125333
Approved by: https://github.com/yanboliang
2024-05-02 16:28:38 +00:00
Brian Hirsh
5173cbe260
fix FakeTensor creation on noncontiguous subclasses ( #124399 )
...
Fixes https://github.com/pytorch/pytorch/issues/125287
Fixes https://github.com/pytorch/pytorch/issues/124090 , context on the issue
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124399
Approved by: https://github.com/soulitzer
ghstack dependencies: #124398
2024-05-01 21:56:01 +00:00
Yanbo Liang
ce503c1b40
Dynamo x autograd.Function supports setup_context ( #124802 )
...
Fixes part of #118397
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124802
Approved by: https://github.com/zou3519
2024-04-27 04:57:13 +00:00
Guilherme Leobas
763dc26e59
[Dynamo] Add dynamo support to torch.func.linearize ( #123118 )
...
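A hedged illustration (the function is illustrative): `torch.func.linearize` returns the primal output plus a jvp function, and per this PR Dynamo can now trace through it instead of graph-breaking.
~~~python
import torch
from torch.func import linearize

def f(x):
    return x.sin()

x = torch.randn(3)
out, jvp_fn = linearize(f, x)
print(jvp_fn(torch.ones(3)))  # equals cos(x) * tangent
~~~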
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123118
Approved by: https://github.com/zou3519
2024-04-23 21:31:49 +00:00
Peter Bell
7ecbbc40c3
[HOP][inductor] Add higher order associative scan operator ( #119430 )
...
Currently only supports single tensor scans, e.g. `cumsum`, `cumprod`, `logcumsumexp`
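A hedged sketch (import path and call signature are assumed for this PR's era; the op targets the compiled/Triton path, so a GPU is assumed): `cumsum` expressed as an associative scan.
~~~python
import torch
from torch._higher_order_ops.associative_scan import associative_scan

def combine(a, b):
    return a + b  # any associative pointwise combine function

if torch.cuda.is_available():
    x = torch.randn(8, device="cuda")
    out = associative_scan(combine, x, 0)
    print(torch.allclose(out, torch.cumsum(x, dim=0), atol=1e-5))
~~~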
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119430
Approved by: https://github.com/Chillee
2024-04-23 14:40:13 +00:00
Jeff Daily
6ede882c0b
preferred blas library; cublaslt gemm implementation ( #122106 )
...
Following the example of PyTorch supporting a preferred linalg library (cusolver or magma), this PR introduces a preferred BLAS library selector of either cublas or cublaslt for CUDA, and hipblas or hipblaslt for ROCm, via normal hipification of sources.
The default BLAS implementation remains cublas (or hipblas). cublaslt (or hipblaslt) can be enabled with the environment variable TORCH_BLAS_PREFER_CUBLASLT=1 (or its alias TORCH_BLAS_PREFER_HIPBLASLT=1), or by calling `torch.backends.cuda.preferred_blas_library(backend="cublaslt")` (alias: `backend="hipblaslt"`).
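A minimal usage sketch based on the description above (GPU assumed):
~~~python
import torch

torch.backends.cuda.preferred_blas_library("cublaslt")  # or "cublas"
if torch.cuda.is_available():
    a = torch.randn(128, 64, device="cuda")
    b = torch.randn(64, 32, device="cuda")
    c = a @ b  # GEMMs dispatch through the preferred backend where supported
~~~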
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122106
Approved by: https://github.com/lezcano
2024-04-22 15:38:22 +00:00
Aaron Gokaslan
29cc293725
[BE]: FURB142 - Remove set mutations. Use set update ( #124551 )
...
Uses bulk set mutation methods (`update`, `difference_update`, etc.) instead of manually reimplementing them.
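A minimal illustration of the pattern (hypothetical values):
~~~python
s = {1, 2}

# Before: element-wise mutation in a loop.
for x in (3, 4):
    s.add(x)

# After: bulk set mutation methods.
s.update((5, 6))
s.difference_update({1})
print(s)  # {2, 3, 4, 5, 6}
~~~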
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124551
Approved by: https://github.com/ezyang
2024-04-21 14:12:33 +00:00
Animesh Jain
febc4d8759
[dynamo][easy] forbid_in_graph check to use getattr_static ( #124445 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124445
Approved by: https://github.com/yanboliang , https://github.com/jansel
2024-04-20 14:11:05 +00:00
soulitzer
cf5ca58e7f
[NJT] Inline through torch.nested.nested_tensor_from_jagged instead of graph break ( #124343 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124343
Approved by: https://github.com/jbschlosser
2024-04-19 23:13:59 +00:00
PyTorch MergeBot
4a0900d04b
Revert "[NJT] Inline through torch.nested.nested_tensor_from_jagged instead of graph break ( #124343 )"
...
This reverts commit ef93402f61 .
Reverted https://github.com/pytorch/pytorch/pull/124343 on behalf of https://github.com/DanilBaibak due to Broken trunk ([comment](https://github.com/pytorch/pytorch/pull/124343#issuecomment-2064937192 ))
2024-04-18 18:55:48 +00:00
Jason Ansel
7a6edb0b66
Possible fix for einops warning ( #124084 )
...
See https://github.com/arogozhnikov/einops/issues/315
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124084
Approved by: https://github.com/peterbell10
2024-04-18 17:09:50 +00:00
soulitzer
ef93402f61
[NJT] Inline through torch.nested.nested_tensor_from_jagged instead of graph break ( #124343 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124343
Approved by: https://github.com/jbschlosser
2024-04-18 14:42:54 +00:00
Aleksandar Samardžić
f5331aade5
Simplify ATen sparse semi-structured operators based on CUTLASS ( #123473 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123473
Approved by: https://github.com/cpuhrsch
2024-04-14 06:57:41 +00:00
PyTorch MergeBot
97261be0a8
Revert "Simplify ATen sparse semi-structured operators based on CUTLASS ( #123473 )"
...
This reverts commit b2a0b8c446 .
Reverted https://github.com/pytorch/pytorch/pull/123473 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/123473#issuecomment-2053561077 ))
2024-04-13 07:47:32 +00:00
rzou
5d1f9bd2bc
Move the trace_rules.py docs up ( #123873 )
...
I always remember that these docs exist but can never actually find them in the
file because they are around line 3000. Moving them to the top of the file for
visibility.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123873
Approved by: https://github.com/yanboliang
2024-04-12 20:18:38 +00:00
Aleksandar Samardžić
b2a0b8c446
Simplify ATen sparse semi-structured operators based on CUTLASS ( #123473 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123473
Approved by: https://github.com/cpuhrsch
2024-04-11 11:56:27 +00:00
Guilherme Leobas
2a37793249
[Dynamo] Ensure that Higher Order Ops can be composed in dynamo ( #123357 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123357
Approved by: https://github.com/zou3519
ghstack dependencies: #122211
2024-04-09 18:50:17 +00:00
Guilherme Leobas
dbe0c474a9
Ensure all torch.func.* functions capture can be disabled ( #122212 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122212
Approved by: https://github.com/zou3519
ghstack dependencies: #122211
2024-04-05 03:29:11 +00:00
Joel Schlosser
721dcaff94
Revert usage of NJT views in SDPA ( #123215 )
...
For internal purposes, this PR reverts the use of real views in SDPA -> autograd.Function "views" (i.e. `ViewBufferFromNested` and `ViewNestedFromBuffer`). This is a temporary fix to get the FIRST model launched and working.
**Note: this breaks some other Dynamo tests related to SDPA that rely on real views, but the breakage there isn't expected to be likely in a real-world scenario.**
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123215
Approved by: https://github.com/YuqingJ
2024-04-04 18:45:47 +00:00
PyTorch MergeBot
63d17d3c90
Revert "Revert usage of NJT views in SDPA ( #123215 )"
...
This reverts commit 0fcddb5625 .
Reverted https://github.com/pytorch/pytorch/pull/123215 on behalf of https://github.com/huydhn due to Sorry for reverting your PR but I think it needs to be skipped on ROCm 0fcddb5625 ([comment](https://github.com/pytorch/pytorch/pull/123215#issuecomment-2036080570 ))
2024-04-04 02:57:09 +00:00
Joel Schlosser
0fcddb5625
Revert usage of NJT views in SDPA ( #123215 )
...
For internal purposes, this PR reverts the use of real views in SDPA -> autograd.Function "views" (i.e. `ViewBufferFromNested` and `ViewNestedFromBuffer`). This is a temporary fix to get the FIRST model launched and working.
**Note: this breaks some other Dynamo tests related to SDPA that rely on real views, but the breakage there isn't expected to be likely in a real-world scenario.**
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123215
Approved by: https://github.com/YuqingJ
2024-04-03 23:25:31 +00:00
rzou
44c0c0fc0f
Add torch.library.custom_op ( #122344 )
...
This is the entrypoint for defining an opaque/blackbox (i.e., PyTorch will
never peek into it) custom op. In this PR, you can specify backend impls
and the abstract impl for this op.
NB: most of this PR is docstrings, please don't be intimidated by the
line count.
There are a number of interesting features:
- we infer the schema from type hints. In a followup I add the ability
to manually specify a schema.
- name inference. The user needs to manually specify an op name for now.
In a followup we add the ability to automatically infer a name (this
is a little tricky).
- custom_op registrations can override each other. This makes them
more pleasant to work with in environments like colab.
- we require that the outputs of the custom_op do not alias any inputs
or each other. We enforce this via a runtime check, but can relax this
into an opcheck test if it really matters in the future.
Test Plan:
- new tests
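A hedged sketch of the entrypoint (decorator and method names follow the API as released; the op name and NumPy body are illustrative). The schema is inferred from the type hints, as described above.
~~~python
import numpy as np
import torch
from torch.library import custom_op

@custom_op("mylib::numpy_sin", mutates_args=())
def numpy_sin(x: torch.Tensor) -> torch.Tensor:
    # Opaque body: PyTorch never traces into this.
    return torch.from_numpy(np.sin(x.numpy()))

@numpy_sin.register_fake  # abstract impl for meta/fake tensors
def _(x):
    return torch.empty_like(x)

print(numpy_sin(torch.randn(3)))
~~~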
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122344
Approved by: https://github.com/ezyang , https://github.com/albanD
2024-04-03 18:36:17 +00:00
willfengg
f1c4d0fb2c
[dynamo] show inlining reasons from trace_rules ( #123014 )
...
Show specific inlining reasons with ``TORCH_LOGS="+dynamo" TORCHDYNAMO_VERBOSE=1``:
* before: ``INLINING <code...>, inlined according trace_rules.lookup``
* after: ``INLINING <code...> inlined according trace_rules.lookup MOD_INLINELIST``
This distinguishes inlining by default from inlining via MOD_INLINELIST (a specific rule).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123014
Approved by: https://github.com/jansel
ghstack dependencies: #123013
2024-04-02 03:04:22 +00:00
willfengg
d765e223ac
[dynamo][PT2D] avoid skipping dynamo_resume_* in torch/testing/_internal ( #123013 )
...
This PR ensures that ``dynamo_resume_`` functions survive ``trace_rules.py``. As a ground truth, modules defined outside the ``pytorch/torch`` folder can survive ``trace_rules.py``.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123013
Approved by: https://github.com/jansel
2024-04-01 21:12:48 +00:00
Yu, Guangye
eb7adc3ae0
Refactor gpu trace to be device-agnostic ( #121794 )
...
# Motivation
Refactor gpu trace to be device-agnostic. GPU trace is usually used in runtime components, including Device, Stream, Event, Guard, and Allocator, so it should be device-agnostic and shareable among device backends.
# Solution
Move `_cuda_trace.py` to `_gpu_trace.py`, so that each device backend owns its callbacks.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121794
Approved by: https://github.com/jgong5 , https://github.com/albanD , https://github.com/EikanWang , https://github.com/gujinghui
2024-03-30 13:04:38 +00:00
Mikayla Gawarecki
487b6d40ec
Add RMSNorm module ( #121364 )
...
Similar to `torchmultimodal/modules/layers/normalizations.py` (L51 at commit dbeed9724b).
**The implementation here is not optimized and we welcome pull requests to improve this**
- Use `normalized_shape` instead of singular integer `dim` to be aligned with the `nn.LayerNorm` implementation
- Remove the upcast to float and downcast (see `normalizations.py` L73 at the same commit)
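A minimal usage sketch (`normalized_shape` mirrors `nn.LayerNorm`, as noted above):
~~~python
import torch

rms = torch.nn.RMSNorm(normalized_shape=8)
x = torch.randn(2, 4, 8)
print(rms(x).shape)  # torch.Size([2, 4, 8])
~~~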
Differential Revision: [D55485840](https://our.internmc.facebook.com/intern/diff/D55485840 )
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121364
Approved by: https://github.com/albanD
2024-03-29 18:05:28 +00:00