Commit Graph

3520 Commits

Author SHA1 Message Date
Basil Wong
aafc4b6188 Do not depend on numpy during the import (#150816)
Summary:
Related issue: https://github.com/pytorch/pytorch/issues/149681

We can follow up with a different implementation that does not use numpy (potentially with Torch primitives).

Test Plan:
pending:

contbuild & OSS CI

Differential Revision: D72609835

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150816
Approved by: https://github.com/jerryzh168, https://github.com/cyyever, https://github.com/albanD
2025-04-08 18:12:53 +00:00
zeshengzong
c9c0f8eae3 Add plot for torch.nn.Threshold and torch.nn.GLU (#150171)
Fixes #150170

## Changes

- Add plot for `torch.nn.Threshold` and `torch.nn.GLU`
- Add example output so users can see the results more easily

## Test Result

![image](https://github.com/user-attachments/assets/f6c5bc46-f9b7-4db7-9797-e08d8423d1b3)

![image](https://github.com/user-attachments/assets/ad4e6c84-7b29-44f1-b7bd-9c81e4a92ef8)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150171
Approved by: https://github.com/albanD
2025-04-08 03:55:37 +00:00
Svetlana Karslioglu
277369ac16 Move formulas on separate line in loss.py (#150565)
Move formulas onto separate lines in loss.py for better readability.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150565
Approved by: https://github.com/mikaylagawarecki
2025-04-03 20:47:35 +00:00
drisspg
2e5d95a082 [FlexAttention] Remove dead code (#150575)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150575
Approved by: https://github.com/Chillee, https://github.com/BoyuanFeng
2025-04-03 01:46:19 +00:00
Ryan Guo
bb98749230 [dynamo] Always trace into tensor subclass __torch_function__ (#149792)
This patch effectively ignores traceable_tensor_subclasses, allowing
Dynamo to always try tracing into the `__torch_function__` of tensor
subclass. This helps us with 2 things:
1. allowing users to directly benefit from better compilation of tensor
   subclass, by just upgrading pytorch, without having to change legacy
   library code (see earlier patches in the stack for examples).
2. potentially exposing more issues in compiling tensor subclass, so we
   can get signals and improve them.

As a consequence, it exposed and fixed 2 subtle bugs:
1. In `build_torch_function_fn`, we could get
   `torch._C._disabled_torch_function_impl` because we have a
   `Parameter` subclass without `__torch_function__` override or if we
   have a tensor subclass with `__torch_dispatch__` override. We graph
   break on this for now, and plan to add support -- the logic for
   simulating `torch._C._disabled_torch_function_impl` is already in
   `SuperVariable`, we just need to reuse it.
2. Sometimes we create `SyntheticLocalSource` and need to remove all the
   guards installed on it, but we only removed the ones whose source
   _is_ the created synthetic source `s`, but forgot about chained
   source like `s.foo`, this showed up as
   `SYNTHETIC_LOCAL['tmp_0'].__torch_function__.__func__`.

Differential Revision: [D71906141](https://our.internmc.facebook.com/intern/diff/D71906141)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149792
Approved by: https://github.com/jansel, https://github.com/mlazos
ghstack dependencies: #149482, #149483, #149484
2025-04-02 20:57:00 +00:00
PyTorch MergeBot
e545567340 Revert "[dynamo] Always trace into tensor subclass __torch_function__ (#149792)"
This reverts commit 238109ad32.

Reverted https://github.com/pytorch/pytorch/pull/149792 on behalf of https://github.com/malfet due to Broke trunk, see b03c42109c/1 ([comment](https://github.com/pytorch/pytorch/pull/149482#issuecomment-2773650522))
2025-04-02 20:30:32 +00:00
Ryan Guo
238109ad32 [dynamo] Always trace into tensor subclass __torch_function__ (#149792)
This patch effectively ignores traceable_tensor_subclasses, allowing
Dynamo to always try tracing into the `__torch_function__` of tensor
subclass. This helps us with 2 things:
1. allowing users to directly benefit from better compilation of tensor
   subclass, by just upgrading pytorch, without having to change legacy
   library code (see earlier patches in the stack for examples).
2. potentially exposing more issues in compiling tensor subclass, so we
   can get signals and improve them.

As a consequence, it exposed and fixed 2 subtle bugs:
1. In `build_torch_function_fn`, we could get
   `torch._C._disabled_torch_function_impl` because we have a
   `Parameter` subclass without `__torch_function__` override or if we
   have a tensor subclass with `__torch_dispatch__` override. We graph
   break on this for now, and plan to add support -- the logic for
   simulating `torch._C._disabled_torch_function_impl` is already in
   `SuperVariable`, we just need to reuse it.
2. Sometimes we create `SyntheticLocalSource` and need to remove all the
   guards installed on it, but we only removed the ones whose source
   _is_ the created synthetic source `s`, but forgot about chained
   source like `s.foo`, this showed up as
   `SYNTHETIC_LOCAL['tmp_0'].__torch_function__.__func__`.

Differential Revision: [D71906141](https://our.internmc.facebook.com/intern/diff/D71906141)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149792
Approved by: https://github.com/jansel, https://github.com/mlazos
ghstack dependencies: #149482, #149483, #149484
2025-04-02 17:05:25 +00:00
dscamiss
59abb8c7a2 Fix documentation build errors caused by unsupported section titles (#150205)
Fixes #150134

Build with `make html` looks OK now:
```shell
reading sources... [100%] torch.compiler_get_started .. xpu
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [ 80%] generated/torch.nn.Softsign .. generated/torch.nn.modules.module.register_module_full_backward_
writing output... [ 86%] generated/torch.nn.modules.module.register_module_module_registration_hook .. generated/torch.r
writing output... [100%] generated/torch.xpu.get_rng_state .. xpu
generating indices... genindex done
highlighting module code... [100%] typing
writing additional pages... search done
copying images... [100%] _static/img/torch_cuda_memory/allocator_state_history.png
copying static files... done
copying extra files... done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded.

The HTML pages are in build/html.
```

New rendering looks like this:

![image](https://github.com/user-attachments/assets/af7e23a5-9dfd-4cb6-9333-a9e8cfe47ea0)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150205
Approved by: https://github.com/albanD
2025-03-31 04:27:44 +00:00
jj hunt
46c8f2e965 Update docstring to match code. (#148455)
Very tiny fix to the docstring. Passing `grid_size=None` results in an exception.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148455
Approved by: https://github.com/mikaylagawarecki
2025-03-31 04:16:11 +00:00
Horace He
3140565db6 Update type of create_block_mask to more accurately reflect things (#150244)
Fixes some mypy issues
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150244
Approved by: https://github.com/drisspg
2025-03-29 21:55:57 +00:00
zeshengzong
cb83850a24 Fix docs format error in torch.nn (#150156)
Fixes #150152

Fix format error in [torch.nn.CosineSimilarity](https://pytorch.org/docs/stable/generated/torch.nn.CosineSimilarity.html#torch.nn.CosineSimilarity), [torch.nn.KLDivLoss](https://pytorch.org/docs/stable/generated/torch.nn.KLDivLoss.html#torch.nn.KLDivLoss) and other pages.

## Test Result

### Before

#### torch.nn.CosineSimilarity

![Image](https://github.com/user-attachments/assets/1ad633d9-dfaf-43f0-a536-9035a24bf858)

#### torch.nn.KLDivLoss

![Image](https://github.com/user-attachments/assets/20a001b0-1f66-414e-b554-11934d65a4bf)

### After
#### torch.nn.CosineSimilarity
![image](https://github.com/user-attachments/assets/a2d9ea8d-5637-4604-a0e4-9231a4deee44)

#### torch.nn.KLDivLoss
![image](https://github.com/user-attachments/assets/d0e319f9-a3b3-47a7-b2f8-060d46d53bc7)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150156
Approved by: https://github.com/cyyever, https://github.com/malfet
2025-03-28 20:54:09 +00:00
Vincent Moens
3c85784980 Fix broken LazyLinear init (#149693)
Fixes #149691

I believe it does not negatively impact the fix in https://github.com/pytorch/pytorch/pull/147599 as the tests still pass, but @FFFrog should confirm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149693
Approved by: https://github.com/mikaylagawarecki, https://github.com/FFFrog, https://github.com/malfet
2025-03-25 23:49:49 +00:00
FFFrog
466d5295c1 Fixed abnormal behavior of LazyLinear when using LayzLinear and load_state together (#147599)
Update Points:
- Update the logic of ``initialize_parameters``
- Add new testcases

Related issue:
https://github.com/pytorch/pytorch/issues/147389
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147599
Approved by: https://github.com/mikaylagawarecki
2025-03-19 10:01:12 +00:00
zeshengzong
1cc5f6b623 Optimize MaxPool1d param ceil_mode description (#148869)
Fixes #148123

Add output shape formula based on `ceil_mode` value, according to

00199acdb8/aten/src/ATen/native/Pool.h (L61-L75)
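
The effect of `ceil_mode` on the output length can be sketched in plain Python. This is a minimal sketch written from the usual pooling output-size rule, not a copy of the referenced `Pool.h` lines; the helper name `pool_output_length` and its parameter names are illustrative and are not part of the PyTorch API.

```python
def pool_output_length(length, kernel_size, stride, padding=0, dilation=1, ceil_mode=False):
    """Output length of a 1-D pooling sweep (illustrative helper, not a PyTorch API)."""
    # Range of valid window start offsets on the padded input.
    span = length + 2 * padding - dilation * (kernel_size - 1) - 1
    if ceil_mode:
        # Round the division up instead of down.
        out = (span + stride - 1) // stride + 1
        # Drop the last window if it would start past the input, inside the right padding.
        if (out - 1) * stride >= length + padding:
            out -= 1
    else:
        # Default: round the division down.
        out = span // stride + 1
    return out


print(pool_output_length(5, kernel_size=2, stride=2))                  # floor mode
print(pool_output_length(5, kernel_size=2, stride=2, ceil_mode=True))  # ceil mode
```

With an input of length 5, kernel size 2, and stride 2, floor mode yields 2 windows and ceil mode yields 3, because the trailing partial window is kept.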

## Test Result

### Before

![image](https://github.com/user-attachments/assets/0a175178-a104-4348-a14b-516e866d533a)

### After

![image](https://github.com/user-attachments/assets/ce621d4b-1986-41fb-bd71-2b03c0aa996e)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148869
Approved by: https://github.com/mikaylagawarecki
2025-03-17 08:50:40 +00:00
zeshengzong
a7f8de2198 Add nn.Bilinear param validation (#149018)
Fixes #103425

## Changes

- Document that the size values `must be > 0`
- Add validation for `in1_features` param

Currently, only `in1_features` causes a runtime error; adding checks for `in2_features` and `out_features` as well might be BC-breaking.

```python
import torch
from torch import nn

class lenet(nn.Module):
    def __init__(self):
        super(lenet, self).__init__()
        self.conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=5, stride=1)

        # Raises an error; `in1_features=1, in2_features=0, out_features=0` raises no error
        self.linear = nn.Bilinear(in1_features=0, in2_features=0, out_features=0)

    def forward(self, x):
        # 1st block
        x = self.conv(x)
        x = self.linear(x)

        return x

if __name__ == '__main__':
    net = lenet()

```

## Test Result

```bash
pytest test/test_nn.py -k test_bilinear -vv
```

![image](https://github.com/user-attachments/assets/20617ba9-bac5-4db2-aecc-1831dbc8eb43)

![image](https://github.com/user-attachments/assets/401e4e1f-051a-4e1c-952b-48e85de64b0b)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149018
Approved by: https://github.com/mikaylagawarecki
2025-03-14 19:26:12 +00:00
Eddie Yan
0dcd482e54 [SDPA] Respect sdpa_kernel's priority_order setting in torch.compile (#147768)
[https://github.com/pytorch/pytorch/pull/140467](https://github.com/pytorch/pytorch/pull/140467) added the option to specify a priority order for SDPA, but the `torch.compile` path silently ignored this setting because I wasn't aware of the separate context manager handling in `torch.compile`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147768
Approved by: https://github.com/drisspg
2025-03-13 18:52:34 +00:00
Aaron Gokaslan
edd640a95a [BE][Ez]: Use itertools.chain.from_iterable when possible (#148190)
Often makes the code more readable, more efficient, and adds support for infinite iterables.
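
A small illustration of the difference, using only the standard library. The data and the `endless_pairs` generator are made up for this sketch and do not come from the PR.

```python
from itertools import chain, count, islice

nested = [[1, 2], [3], [4, 5]]

# chain(*nested) unpacks the outer iterable eagerly before chaining.
flat_unpacked = list(chain(*nested))

# chain.from_iterable(nested) walks the outer iterable lazily.
flat_lazy = list(chain.from_iterable(nested))


def endless_pairs():
    # An infinite outer iterable: chain(*endless_pairs()) would never return.
    for n in count():
        yield (n, n + 1)


# Only the lazy form can flatten an infinite stream of iterables.
first_six = list(islice(chain.from_iterable(endless_pairs()), 6))

print(flat_unpacked, flat_lazy, first_six)
```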

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148190
Approved by: https://github.com/jansel, https://github.com/malfet
2025-03-06 20:37:06 +00:00
zeshengzong
0b0d28accd Optimize param prepend class reference torch.nn.Module (#148304)
Fixes #147696

## Changes

Change the `prepend` description from `torch.nn.modules.Module` to `torch.nn.Module`

## Test Result

### Before

![image](https://github.com/user-attachments/assets/054f54b7-9487-4505-a926-3e17a84bd2f9)

### After

![image](https://github.com/user-attachments/assets/1d2a5708-62d1-428e-b136-bcaa35e5e6da)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148304
Approved by: https://github.com/Skylion007
2025-03-04 08:46:14 +00:00
cyy
ec2805ada8 Remove outdated CUDA version check (#148142)
Since Torch requires CUDA>=11, some checks can be removed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148142
Approved by: https://github.com/janeyx99, https://github.com/eqy
2025-03-04 03:33:44 +00:00
cyy
98bf2f1170 Use Python 3.9 typing (#148157)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148157
Approved by: https://github.com/janeyx99
2025-03-04 03:09:55 +00:00
PyTorch MergeBot
5b6ad682bc Revert "[TorchRec][PT2] disable contextlib in PT2 train pipeline (#147254)"
This reverts commit 85ea679834.

Reverted https://github.com/pytorch/pytorch/pull/147254 on behalf of https://github.com/jeanschmidt due to introduced reds on main ([comment](https://github.com/pytorch/pytorch/pull/147254#issuecomment-2677700862))
2025-02-24 08:20:16 +00:00
Huanyu He
85ea679834 [TorchRec][PT2] disable contextlib in PT2 train pipeline (#147254)

Summary:

# context
* more details in the [post](https://fb.workplace.com/groups/1075192433118967/permalink/1587079018596970/)
* disable contextlib with PT2

Test Plan:
* run command
```
TORCH_SHOW_CPP_STACKTRACES=1 TORCHDYNAMO_EXTENDED_DEBUG_CPP=1 TORCH_LOGS="+dynamo,+graph_code,output_code,dynamic,aot,guards,verbose_guards,recompiles,graph_breaks" TORCH_TRACE=/var/tmp/tt buck2 run fbcode//mode/opt fbcode//aps_models/ads/icvr:icvr_launcher_live -- mode=fmc/local_ig_fm_ultra_mini training.pipeline_type=pt2 data_loader.dataset.table_ds=[2024-12-02] 2>&1 | tee -a output.log
```
* old tlparse
https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmpYYAS3o/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=100
* new tlparse
https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmpUJhCGZ/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=100

Reviewed By: Microve

Differential Revision: D68480678
2025-02-22 18:57:55 +01:00
zeshengzong
a000c7e6d2 Add hint message for pack_padded_sequence (#146747)
Fixes #144207

Add truncate hint message in docs [torch.nn.utils.rnn.pack_padded_sequence](https://pytorch.org/docs/stable/generated/torch.nn.utils.rnn.pack_padded_sequence.html)

## Test Result

![image](https://github.com/user-attachments/assets/46258f36-f6c7-4f11-9213-8513e52a9001)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146747
Approved by: https://github.com/mikaylagawarecki
2025-02-20 06:27:07 +00:00
Aaron Orenstein
db4ce78d46 PEP585: More UP006 fixes (#146392)
This should be the final PR before we can enable RUFF UP006.
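
For context, UP006 flags annotations that use the legacy `typing` aliases where PEP 585 builtin generics work. A minimal before/after sketch follows; the function names are invented for illustration and do not appear in the PR.

```python
from typing import Dict, List  # legacy aliases of the kind UP006 flags


def count_words_legacy(words: List[str]) -> Dict[str, int]:
    # Pre-PEP 585 style: generic aliases imported from typing.
    counts: Dict[str, int] = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def count_words(words: list[str]) -> dict[str, int]:
    # PEP 585 style (Python 3.9+): builtin types are subscripted directly.
    counts: dict[str, int] = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


print(count_words(["a", "b", "a"]))
```

Both functions behave identically at runtime; only the annotations differ.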

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146392
Approved by: https://github.com/justinchuby, https://github.com/albanD, https://github.com/Skylion007
2025-02-20 06:18:13 +00:00
Simon Fan
ed83b0b70b [ddp] decouple python reducer from compilation mode (#147123)
The current implementation reads as: we only actually use the "python_reducer" config if the DDP forward is compiled. Otherwise, we silently fall back to the C++ reducer + no DDPOptimizer.
I'm changing this behavior to always use the python reducer if the config is specified.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147123
Approved by: https://github.com/fegin
2025-02-19 15:51:40 +00:00
lzhang2
b16ae97ad0 Generalize mixed precision in DDP (#146808)
**Motivation:**

1. Generalize mixed precision in DDP.
2. Enable `SyncBatchNorm` for XPU device.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146808
Approved by: https://github.com/guangyey, https://github.com/gujinghui, https://github.com/wconstab
2025-02-16 11:59:40 +00:00
zeshengzong
4a545eb85d Fix torch.nn.functional.one_hot param num_classes optional description (#146470)
The `torch.nn.functional.one_hot` [documentation](https://pytorch.org/docs/stable/generated/torch.nn.functional.one_hot.html) describes the `num_classes` param as not optional, but users can call the function without passing it.

![image](https://github.com/user-attachments/assets/4e6d4feb-691f-451f-95b5-4ac11bac7bc2)

```python
>>> import torch
>>> a = torch.arange(0, 5) % 3  # [0,1,2,0,1]
>>> torch.nn.functional.one_hot(a)
tensor([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
        [1, 0, 0],
        [0, 1, 0]])

```

`num_classes` has default value -1

93d98aca31/aten/src/ATen/native/native_functions.yaml (L6154-L6157)

## Test Result

![image](https://github.com/user-attachments/assets/2c7203b7-6226-4ebc-84c8-cbf912fc48e2)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146470
Approved by: https://github.com/albanD
2025-02-06 07:48:05 +00:00
Aaron Gokaslan
7f65a20884 [BE]: Enable ruff SLOT checks (#146276)
This enables a check that a class which inherits only from immutable classes like str, tuple, and NamedTuple also defines `__slots__`, so its instances don't allocate memory unnecessarily. This also ensures contributors think about how they define classes that subclass NamedTuple and str, of which we have many in our codebase.
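
A short sketch of what the check is about. The class names are hypothetical and are not taken from the codebase.

```python
from collections import namedtuple


class Tag(str):
    """No __slots__: every instance carries its own __dict__."""


class SlimTag(str):
    """Empty __slots__: no per-instance __dict__ is allocated."""

    __slots__ = ()


class Point(namedtuple("Point", ["x", "y"])):
    """A namedtuple subclass that keeps instances dict-free."""

    __slots__ = ()


tag, slim, point = Tag("a"), SlimTag("a"), Point(1, 2)

print(hasattr(tag, "__dict__"))    # instance dict present
print(hasattr(slim, "__dict__"))   # instance dict absent
print(hasattr(point, "__dict__"))  # instance dict absent
```

A side effect of the empty `__slots__` is that assigning a new attribute on `slim` or `point` raises `AttributeError`, which matches the immutable intent of these base types.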

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146276
Approved by: https://github.com/aorenste
2025-02-04 19:18:23 +00:00
Aaron Gokaslan
292af3cc89 [BE][Ez]: ISC001 Auto concatenate implicit one line strings (#146408)
Apply the ruff rule about implicit string concatenation; this autofixes strings that are all the same type and on the same line. These lines were likely broken up by autoformatters in the past. All fixes are automated using the autofixes in ISC001.
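
As an illustration of the pattern, adjacent string literals on one line are concatenated by the parser, so merging them into one literal changes nothing at runtime. The message text below is made up for this sketch.

```python
# Before: two adjacent literals on one line, typically left behind
# after an autoformatter joined a previously wrapped line.
before = "expected a tensor " "but got None"

# After the autofix: a single literal with the same value.
after = "expected a tensor but got None"

print(before == after)
```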

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146408
Approved by: https://github.com/justinchuby, https://github.com/janeyx99
2025-02-04 19:07:04 +00:00
Sahdev Zala
f97307f463 [Docs] Add clarification for target types in CrossEntropyLoss doc (#145444)
The CrossEntropyLoss function requires that targets given as class indices be provided as a long datatype and targets given as class probabilities be provided as a float datatype.

The CrossEntropyLoss function distinguishes the two scenarios (indices and probabilities) by comparing the shapes. When the input and target shapes are the same, the target is treated as class probabilities; otherwise it is used as class indices, as already covered in the doc. The related code is here:
https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/LossNLL.cpp#L624

I think the current documentation is great, but it seems it can confuse users about types, as reported in the issues, so this PR adds a bit more clarification.

Fixes #137188

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145444
Approved by: https://github.com/mikaylagawarecki
2025-02-01 18:55:58 +00:00
Alexander Kurakin
35f113e2a0 torch/nn/utils/rnn.py: docs: improvements (#138628)
Fix constants highlighting in generated documentation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138628
Approved by: https://github.com/mikaylagawarecki
2025-02-01 00:10:30 +00:00
chilli
2d5d022594 Fix a number of flexattention issues (cse, cudagraph, etc.) (#145059)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145059
Approved by: https://github.com/Skylion007, https://github.com/drisspg
2025-01-29 20:27:39 +00:00
Aaron Orenstein
7178b827d7 PEP585: Missed conversions (#145342)
Differential Revision: [D68785969](https://our.internmc.facebook.com/intern/diff/D68785969)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145342
Approved by: https://github.com/bobrenjc93
2025-01-29 05:24:36 +00:00
PyTorch MergeBot
09ae69a364 Revert "Fix type annotation of Linear.bias (#142326)"
This reverts commit 81e370fc6b.

Reverted https://github.com/pytorch/pytorch/pull/142326 on behalf of https://github.com/malfet due to This introduced a graph break and regressed inductor tests, see 73622fc5fa/1 ([comment](https://github.com/pytorch/pytorch/pull/142326#issuecomment-2614196349))
2025-01-26 03:41:00 +00:00
zeshengzong
5b988ac4fa [Easy] Replace paper description with link to make a concise description. (#145031)
The descriptions on the [Transformer](https://pytorch.org/docs/main/generated/torch.nn.Transformer.html), [TransformerEncoderLayer](https://pytorch.org/docs/main/generated/torch.nn.TransformerEncoderLayer.html), and [TransformerDecoderLayer](https://pytorch.org/docs/main/generated/torch.nn.TransformerDecoderLayer.html) pages contain author and paper details that seem redundant for users who want to know how to use these modules. Replace them with a link to the paper, so users can go to the paper if they want to learn more.

**Test Result**

**Before**
![image](https://github.com/user-attachments/assets/678402b1-e759-402c-b56b-e24f63dc8490)
![image](https://github.com/user-attachments/assets/ca191734-f2ce-493f-bf34-2d7046a9868f)
![image](https://github.com/user-attachments/assets/10f55083-6eb6-4b1c-9a77-579f0c4c56ed)

**After**
![image](https://github.com/user-attachments/assets/020f81ca-d89b-47d1-a7a9-cae1893df968)
![image](https://github.com/user-attachments/assets/5b9b34df-b892-4d71-8cdb-df18380b2744)
![image](https://github.com/user-attachments/assets/b3348da2-842a-4037-bad3-f23687503cf8)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145031
Approved by: https://github.com/mikaylagawarecki
2025-01-24 23:01:02 +00:00
Fabian Keller
81e370fc6b Fix type annotation of Linear.bias (#142326)
Currently the `bias` attribute of `torch.nn.Linear` (and `Bilinear`) is typed incorrectly, because it relies on the implicit `Module.__getattr__` which types it as `Tensor | Module`. This has two issues:

- It hides the fact that `bias` is optional, and can be `None`, which in turn can hide actual bugs on user side.
- It blurs the type due to having `Module` in the union, which can require unnecessary `isinstance(linear.bias, Tensor)` checks on user side.

This PR types the `bias` attribute explicitly to fix these issues.

CC @ezyang @Skylion007

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142326
Approved by: https://github.com/ezyang
2025-01-24 22:43:52 +00:00
drisspg
c6707734de Enable non power of 2 head_dim for FlexAttention (#133495)
# Summary
- Adds support for non-power of 2 headdim by launching blocks w/ head_dim rounded to the next valid power.
- Other option I considered was building up the final dot_products with smaller blocks (this would probably work but for sake of code complexity going with this option for now)
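
The rounding step can be sketched as follows, assuming "next valid power" means the next power of two at or above `head_dim`; the kernel itself may apply additional constraints. The helper name `next_power_of_two` is illustrative and not taken from the PR.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (illustrative helper)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # (n - 1).bit_length() is the number of bits needed to represent n - 1,
    # so shifting 1 by that amount gives the first power of two >= n.
    return 1 << (n - 1).bit_length()


for head_dim in (64, 80, 96, 128):
    print(head_dim, next_power_of_two(head_dim))
```

Under this assumption, head dimensions of 80 and 96 would both launch with a block dimension of 128, while 64 and 128 are left unchanged.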

### Corollary
We had a bug in our backwards kernel where we were using index_k instead of index_v. This should have shown up for the qk_head_dim != v_head_dim cases.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133495
Approved by: https://github.com/Chillee
2025-01-23 17:05:38 +00:00
Aaron Orenstein
0afd335174 PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175)
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145175
Approved by: https://github.com/bobrenjc93
2025-01-21 16:57:27 +00:00
PyTorch MergeBot
5fd881a5b6 Revert "PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175)"
This reverts commit 54a00af2c6.

Reverted https://github.com/pytorch/pytorch/pull/145175 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to break some trunk tests ([comment](https://github.com/pytorch/pytorch/pull/145175#issuecomment-2603418267))
2025-01-21 00:49:55 +00:00
Aaron Orenstein
54a00af2c6 PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175)
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145175
Approved by: https://github.com/bobrenjc93
2025-01-20 22:32:59 +00:00
Mario Vasilev
49bdc418be Add strict kwarg to nn.Module.set_submodule and fix bug for non dot delineated strings (#143455)
Before fixing set_submodule, it used to create leaf modules when the target was not a dot-delimited string. After the fix it will not create a new attribute if target is a non-dot-delimited string. If you want to create leaf nodes of `nn.Module` parent nodes, you can use `replace_or_create_new_leaf_module`.

Fixes https://github.com/pytorch/pytorch/issues/143441

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143455
Approved by: https://github.com/mikaylagawarecki
2025-01-16 05:06:33 +00:00
Boyuan Feng
069419569d [PagedAttention] Support different input position for each batch index (#144693)
In LLM inference, each request usually has a different prefill length, leading to a different input position for each batch index. This PR adds such support for paged attention.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144693
Approved by: https://github.com/drisspg
2025-01-15 18:03:52 +00:00
Aaron Orenstein
d782e46a36 [BE] typing for decorators - library (#138969)
Test Plan: unit tests

Differential Revision: D62302678

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138969
Approved by: https://github.com/zou3519
2025-01-15 17:08:55 +00:00
cyy
d87aad6877 [5/N] Apply Ruff fixes and pyupgrade to Python 3.9 (#144205)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144205
Approved by: https://github.com/albanD
2025-01-15 04:00:47 +00:00
Aaron Gokaslan
91dbd7b75c [BE]: Improve typing inference with TypeIs (#144682)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144682
Approved by: https://github.com/albanD

Co-authored-by: Aaron Orenstein <aorenste@meta.com>
2025-01-13 21:14:31 +00:00
bobrenjc93
f93d786f73 remove allow-untyped-defs from torch/nn/parameter.pyi (#144654)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144654
Approved by: https://github.com/Skylion007
2025-01-13 19:02:31 +00:00
Alexander Kurakin
68dad26b95 torch/nn/modules/linear.py: docs: improvements (#138484)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138484
Approved by: https://github.com/mikaylagawarecki
2025-01-10 20:03:43 +00:00
Aaron Gokaslan
307ca094c9 [BE]: Remove redundant contiguous copy in flex attention (#144467)
Removes a redundant potential copy; instead, use the memory_format kwarg to fuse both operations into a single copy.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144467
Approved by: https://github.com/awgu
2025-01-09 18:30:09 +00:00
Mikayla Gawarecki
b8f383107e Link to transformer tutorial in transformer docs (#144425)
<img width="1045" alt="Screenshot 2025-01-08 at 4 50 20 PM" src="https://github.com/user-attachments/assets/05adfecb-8a23-4c48-9a2c-50c5b3f886b0" />

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144425
Approved by: https://github.com/albanD
2025-01-09 17:42:09 +00:00
bobrenjc93
168c2cb3f3 remove allow-untyped-defs from torch/nn/utils/_deprecation_utils.py (#144231)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144231
Approved by: https://github.com/albanD
2025-01-07 02:22:22 +00:00