Commit Graph

1263 Commits

PyTorch MergeBot
79ba65c0f2 Revert "Raise warning for unpicklable local function (#80140)"
This reverts commit 4b75b7d3c1.

Reverted https://github.com/pytorch/pytorch/pull/80140 on behalf of https://github.com/ejguan due to It will break the CI for TorchData
2022-06-24 14:49:06 +00:00
erjia
4b75b7d3c1 Raise warning for unpicklable local function (#80140)
Fixes https://github.com/pytorch/data/issues/538

- Improve the validation function to raise a warning about an unpicklable function when either a lambda or a local function is provided to `DataPipe`.
- The inner function of a `functools.partial` object is extracted for validation as well.
- Mimic the behavior of the `pickle` module for local lambda functions: `pickle` raises an error about the local function rather than the `lambda`, so we likewise warn about the local function, not the lambda.
```py
>>> import pickle
>>> def fn():
...     lf = lambda x: x
...     pickle.dumps(lf)
>>> fn()
AttributeError: Can't pickle local object 'fn.<locals>.<lambda>'
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80140
Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
2022-06-24 13:50:51 +00:00
Robert
787ac4edf8 Add validation for mapper function in datapipes with input_col (#79344)
As linked in https://github.com/pytorch/data/issues/362
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79344
Approved by: https://github.com/ejguan, https://github.com/NivekT
2022-06-23 18:49:35 +00:00
PyTorch MergeBot
ec4be38ba9 Revert "To add hipify_torch as a submodule in pytorch/third_party (#74704)"
This reverts commit 93b0fec39d.

Reverted https://github.com/pytorch/pytorch/pull/74704 on behalf of https://github.com/malfet due to broke torchvision
2022-06-21 23:54:00 +00:00
Bhavya Medishetty
93b0fec39d To add hipify_torch as a submodule in pytorch/third_party (#74704)
`hipify_torch` as a submodule in `pytorch/third_party`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74704
Approved by: https://github.com/jeffdaily, https://github.com/malfet
2022-06-21 18:56:49 +00:00
erjia
ccccd0efec [DataLoader] Share seed via Distributed Store to get rid of CUDA dependency (#79829)
Fixes #79828

In a distributed environment, before this PR, the DataLoader would create a Tensor holding the shared seed on RANK 0 and send that Tensor to the other processes. However, when `NCCL` is used as the distributed backend, the Tensor must be moved to CUDA before being broadcast from RANK 0 to the other RANKs. This caused the linked issue: the DataLoader doesn't move the Tensor to CUDA before sharing it via `NCCL`.

After offline discussion with @mrshenli, we think the distributed Store is a better solution as the shared seed is just an integer value. Then, we can get rid of the dependency on NCCL and CUDA when sharing info between distributed processes for DataLoader.
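
A minimal sketch of the Store-based approach (assuming an initialized process group; the key name and helper below are illustrative, not the PR's actual code):

```py
# Minimal sketch: rank 0 publishes the seed as a plain string via the
# distributed Store; no tensor, no NCCL, no CUDA involved.
import torch
import torch.distributed as dist

def share_seed_via_store(store: dist.Store, rank: int) -> int:
    if rank == 0:
        seed = torch.empty((), dtype=torch.int64).random_().item()
        store.set("dl_shared_seed", str(seed))
    return int(store.get("dl_shared_seed"))  # get() blocks until the key is set
```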
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79829
Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
2022-06-20 19:18:35 +00:00
Kevin Tse
e8ed16f3c0 [DataPipe] Enable profiler record context in __next__ branch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79757

Approved by: https://github.com/ejguan
2022-06-17 16:52:07 +00:00
Kevin Tse
25ca006707 [DataPipe] Refactor _hook_iterator for readability
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79656

Approved by: https://github.com/ejguan
2022-06-17 16:52:07 +00:00
Edward Wang (EcoF)
0088172e38 [tensorboard] update assertion error for scalar() and fix docs (#76859)
Summary: title

Test Plan: unit test

Reviewed By: Reubend

Differential Revision: D35922397

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76859
Approved by: https://github.com/Reubend, https://github.com/ananthsub
2022-06-16 05:24:20 +00:00
Robert
3064982fb8 Support percentages in random_split (#78877)
Fixes #78510

This PR adds support for using fractions with `random_split`. This should be completely backwards-compatible as the fractional-style splitting is only applied when the sum across the input lengths is lower than 1.0
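
Usage sketch (following the behavior described above; the fraction values are illustrative):

```py
# Usage sketch: lengths that are fractions are treated as percentages of the
# dataset rather than absolute sizes.
from torch.utils.data import random_split

dataset = list(range(100))
train, val = random_split(dataset, [0.8, 0.2])
print(len(train), len(val))  # 80 20
```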
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78877
Approved by: https://github.com/ejguan
2022-06-16 02:00:25 +00:00
Kevin Tse
22c7b1ddb5 [DataPipe] Fix error message coming from singler iterator constraint
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79547

Approved by: https://github.com/ejguan
2022-06-14 21:38:36 +00:00
erjia
04f87f2ab9 [DataLoader] Fix the world_size when distributed sharding MapDataPipe (#79524)
Fixes #79449

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79524
Approved by: https://github.com/NivekT, https://github.com/VitalyFedyunin
2022-06-14 19:03:57 +00:00
Rohan Varma
44fe851feb [WIP] Fix non-reentrant hooks based checkpointing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78752

Approved by: https://github.com/albanD
2022-06-14 01:13:33 +00:00
PyTorch MergeBot
35eda5f959 [DataPipe] Correcting deprecation version
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79302

Approved by: https://github.com/ejguan
2022-06-10 19:31:29 +00:00
samdow
5e926aafab add utils for checking that all modes are in the same scope and finding the outermost mode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78847

Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-06-10 19:31:05 +00:00
samdow
3734fcc8f8 add ability to push a mode if the current mode is an ancestor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78822

Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-06-10 18:27:04 +00:00
ErjiaGuan
5158a6b41a Forward fix sharding bug for DL (#79124)
This PR solves a bug introduced by #79041

`torch.utils.data.graph_settings.apply_sharding` changes the datapipe in-place and returns `None`

It would resolve the Error in TorchData. See: https://github.com/pytorch/data/actions/runs/2461030312
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79124
Approved by: https://github.com/VitalyFedyunin
2022-06-08 16:16:58 +00:00
erjia
b3ed65343d Fix sharding strategy for distributed DL (#79041)
1. Change the sharding strategy from sharding by worker first and then by rank to sharding by rank first and then by worker.
2. Fetch the rank and world size in the main process, for the sake of `spawn`.

For the change 1:
Before this PR, when the dataset cannot be evenly divided by `worker_num * world_size`, more data would be retrieved by the workers on the first ranks (see the sketch after the counts below).
Using the following example:
- dataset size: 100
- world_size: 4
- num_worker: 2

The number of data retrieved by each rank before this PR
- Rank 0: 26
- Rank 1: 26
- Rank 2: 24
- Rank 3: 24

The number of data retrieved by each rank after this PR
- Rank 0: 25
- Rank 1: 25
- Rank 2: 25
- Rank 3: 25
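
A small self-contained check of these counts (illustrative, not the PR's code; the old shard ordering is inferred from the numbers above):

```py
# old: one flat modulus over world_size * num_workers shards, rank-major;
# new: shard by rank first, then split each rank's share among its workers.
n, world_size, num_workers = 100, 4, 2

def old_count(rank):
    return sum(len(range(rank * num_workers + w, n, world_size * num_workers))
               for w in range(num_workers))

def new_count(rank):
    return sum(len(range(rank, n, world_size)[w::num_workers])
               for w in range(num_workers))

print([old_count(r) for r in range(4)])  # [26, 26, 24, 24]
print([new_count(r) for r in range(4)])  # [25, 25, 25, 25]
```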

For the change 2:
Before this PR, `dist` functions are invoked inside the worker processes. That is fine when the worker processes are forked from the parent process: all environment variables are inherited and visible to these `dist` functions. However, when the worker processes are spawned, they cannot access these environment variables, and the dataset won't be sharded by rank.
After this PR, `_sharding_worker_init_fn` should be working for both `spawn` and `fork` case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79041
Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
2022-06-07 20:56:32 +00:00
Kevin Tse
42fac176eb [DataPipe] Add function for deprecation of functional DataPipe names
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78970

Approved by: https://github.com/ejguan
2022-06-07 00:14:47 +00:00
Kevin Tse
c44472c5b1 [DataPipe] Disable profiler for IterDataPipe by default
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78674

Approved by: https://github.com/VitalyFedyunin
2022-06-06 22:12:56 +00:00
Eli Uriegas
be25566d13 tools: Ensure compat for collect_env with python 3.5
Users were reporting errors when running collect_env with
older versions of Python. This adds a test to ensure that we maintain
compatibility of this script with older Python versions.

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78946

Approved by: https://github.com/janeyx99
2022-06-06 21:32:57 +00:00
Vitaly Fedyunin
6fe6902f97 [DataLoader] Apply sharding settings in dist when num_workers is 0
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78950

Approved by: https://github.com/ejguan, https://github.com/NivekT
2022-06-06 20:03:02 +00:00
Robert Xiu
9fca008809 [DataPipe] Adding functional API for FileLister (#78419)
Fixes #78263

Follow-up from pytorch/data#387. This adds a functional API `list_files()` to `FileListerDataPipe`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78419
Approved by: https://github.com/NivekT, https://github.com/ejguan
2022-06-06 17:26:19 +00:00
erjia
9b6cb83b0c Make ShufflerDataPipe deterministic for persistent DL and distributed DL (#78765)
Fixes https://github.com/pytorch/data/issues/426

This PR introduces two main changes:
- It ensures the `ShufflerDataPipe` would share the same seed across distributed processes.
- Users can reset `shuffle` for persistent workers per epoch.

Detail:
- `shared_seed` is shared across distributed and worker processes. It will seed a `shared_rng` to provide seeds to each `ShufflerDataPipe` in the pipeline
- `worker_loop` now accepts a new argument of `shared_seed` to accept this shared seed.
- The `shared_seed` is attached to `_ResumeIteration` for resetting seed per epoch for `persistent worker`
- I chose not to touch `base_seed`, simply to avoid BC issues

I used this [script](https://gist.github.com/ejguan/d88f75fa822cb696ab1bc5bc25844f47) to test the result with `world_size=4`. Please check the result in: https://gist.github.com/ejguan/6ee2d2de12ca57f9eb4b97ef5a0e300b

You can see there isn't any duplicated/missing element for each epoch. And, with the same seed, the order of data remains the same across epochs.
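
A minimal sketch of the seeding scheme from the details above (names follow the description; the per-shuffler seed derivation is illustrative):

```py
# One shared seed drives a shared RNG; the RNG hands a seed to each
# ShufflerDataPipe in the pipeline, in the same order on every process.
import torch

shared_seed = 12345  # identical on every rank/worker, e.g. shared beforehand
shared_rng = torch.Generator()
shared_rng.manual_seed(shared_seed)

def next_shuffler_seed() -> int:
    return int(torch.empty((), dtype=torch.int64).random_(generator=shared_rng))
```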
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78765
Approved by: https://github.com/VitalyFedyunin
2022-06-06 17:24:00 +00:00
PyTorch MergeBot
129d9dbb15 Revert "Make ShufflerDataPipe deterministic for persistent DL and distributed DL (#78765)"
This reverts commit b769a0e18b.

Reverted https://github.com/pytorch/pytorch/pull/78765 on behalf of https://github.com/janeyx99 due to broke lint on trunk
2022-06-06 14:24:51 +00:00
erjia
b769a0e18b Make ShufflerDataPipe deterministic for persistent DL and distributed DL (#78765)
Fixes https://github.com/pytorch/data/issues/426

This PR introduces two main changes:
- It ensures the `ShufflerDataPipe` would share the same seed across distributed processes.
- Users can reset `shuffle` for persistent workers per epoch.

Detail:
- `shared_seed` is shared across distributed and worker processes. It will seed a `shared_rng` to provide seeds to each `ShufflerDataPipe` in the pipeline
- `worker_loop` now accepts a new argument of `shared_seed` to accept this shared seed.
- The `shared_seed` is attached to `_ResumeIteration` for resetting seed per epoch for `persistent worker`
- I chose not to touch `base_seed`, simply to avoid BC issues

I used this [script](https://gist.github.com/ejguan/d88f75fa822cb696ab1bc5bc25844f47) to test the result with `world_size=4`. Please check the result in: https://gist.github.com/ejguan/6ee2d2de12ca57f9eb4b97ef5a0e300b

You can see there isn't any duplicated/missing element for each epoch. And, with the same seed, the order of data remains the same across epochs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78765
Approved by: https://github.com/VitalyFedyunin
2022-06-06 13:36:37 +00:00
samdow
184e0065b3 add better error message for class method
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78821

Approved by: https://github.com/ezyang
2022-06-06 13:31:32 +00:00
Nikita Shulga
40e2aadf47 Create __init__.py (#78629)
To make `torch.utils.jit` a proper package; otherwise it will not be added to the wheel.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78629
Approved by: https://github.com/seemethere, https://github.com/xuzhao9, https://github.com/davidberard98
2022-06-03 18:14:21 +00:00
Xiao Wang
ef0332e36d Allow relocatable device code linking in pytorch CUDA extensions (#78225)
Close https://github.com/pytorch/pytorch/issues/57543

Doc: check `Relocatable device code linking:` in https://docs-preview.pytorch.org/78225/cpp_extension.html#torch.utils.cpp_extension.CUDAExtension
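
A hedged build-script sketch: `-rdc=true` is a standard nvcc flag, while the `dlink=True` keyword below is an assumption taken from the linked docs preview, not verified against the final API:

```py
# Sketch: enabling relocatable device code when building a CUDA extension.
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name="my_ext",
    ext_modules=[
        CUDAExtension(
            name="my_ext",
            sources=["kernel_a.cu", "kernel_b.cu"],
            dlink=True,  # assumed keyword: requests a device-link step
            extra_compile_args={"cxx": [], "nvcc": ["-rdc=true"]},
        )
    ],
    cmdclass={"build_ext": BuildExtension},
)
```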
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78225
Approved by: https://github.com/ezyang, https://github.com/malfet
2022-06-02 21:35:56 +00:00
Oliver Sellwood
cc6a51c9f3 added shape checking to WeightedRandomSampler (#78585)
Fixes #78236

An erroneously shaped weights vector now results in the following output:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/datarwe/pytorch/torch/utils/data/sampler.py in <module>
      274 WeightedRandomSampler([1,2,3], 10)
----> 275 WeightedRandomSampler([[1,2,3], [4,5,6]], 10)

~/datarwe/pytorch/torch/utils/data/sampler.py in __init__(self, weights, num_samples, replacement, generator)
    192         weights = torch.as_tensor(weights, dtype=torch.double)
    193         if len(weights.shape) != 1:
--> 194             raise ValueError("weights should be a 1d sequence but given "
    195                              "weights have shape {}".format(tuple(weights.shape)))
    196

ValueError: weights should be a 1d sequence but given weights have shape (2, 3)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78585
Approved by: https://github.com/NivekT, https://github.com/ejguan
2022-06-02 21:12:14 +00:00
Vitaly Fedyunin
883f8ef62e [DataLoader] DataLoader now automatically apply sharding to DataPipes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78631

Approved by: https://github.com/ejguan, https://github.com/NivekT
2022-06-02 17:40:29 +00:00
Kevin Tse
575c420287 [DataPipe] Lazily generate exception message for performance
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78673

Approved by: https://github.com/ejguan
2022-06-01 23:19:31 +00:00
samdow
aa06d05297 enable with semantics
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78214

Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-06-01 21:14:45 +00:00
Sergii Dymchenko
e8bf3a9cd4 Remove Python 2-related code from dataloader (#78594)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78594
Approved by: https://github.com/seemethere
2022-06-01 05:25:23 +00:00
Kevin Tse
96deba836a [DataLoader] Fix unraised exception in eventloop
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78556

Approved by: https://github.com/ejguan
2022-05-31 21:45:58 +00:00
Elias Ellison
678213ead2 Fake Tensor Part 1
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77969

Approved by: https://github.com/ezyang
2022-05-31 16:20:35 +00:00
Kevin Tse
51ecc366e1 [DataLoader] Minor documentation improvement
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78404

Approved by: https://github.com/ejguan
2022-05-31 15:59:46 +00:00
Kevin Tse
b4a6730ce1 [DataPipe] Refactor 'mux' to have buffer as an instance variable
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77775

Approved by: https://github.com/ejguan
2022-05-19 19:55:27 +00:00
erjia
99f6e614e8 Seed Shuffler for MP DataLoader without explicit manual_seed. (#77855)
Follow up on https://github.com/pytorch/pytorch/pull/77741

This PR guarantees the `Shuffler` in first iteration with MP DataLoader has the same seed across worker processes when users don't specify the seed.
Check newly added tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77855
Approved by: https://github.com/NivekT
2022-05-19 17:28:26 +00:00
Kevin Tse
97fa1d317f [DataPipe] Preventing automatic reset call after state is restored
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77774

Approved by: https://github.com/ejguan
2022-05-19 16:59:08 +00:00
PyTorch MergeBot
fdd5f7214e Revert "[DataPipe] Preventing automatic reset call after state is restored"
This reverts commit ac1837ddd3.

Reverted https://github.com/pytorch/pytorch/pull/77774 on behalf of https://github.com/janeyx99
2022-05-19 14:26:42 +00:00
Kurt Mohler
aea6e2c396 Merge torch.cuda._UntypedStorage into torch._UntypedStorage (#75459)
Fixes #74933

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75459
Approved by: https://github.com/ezyang
2022-05-19 13:54:39 +00:00
Kevin Tse
ac1837ddd3 [DataPipe] Preventing automatic reset call after state is restored
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77774

Approved by: https://github.com/ejguan
2022-05-19 13:53:14 +00:00
erjia
365ce350cb Make ShufflerDataPipe deterministic for SP & MP DataLoader (#77741)
This is the first PR to make DataPipe deterministic.

Users should be able to use `torch.manual_seed(seed)` to control the shuffle order for the following cases:
- Directly over `DataPipe`
- For single-process DataLoader
- Multiprocessing DataLoader

Unfortunately, for distributed training, users have to run `apply_shuffle_seed` manually to make sure all distributed processes have the same shuffle order.
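
Usage sketch for the single-process case (import path and functional `shuffle` as of this era; illustrative):

```py
# Seeding the global RNG makes the shuffle order reproducible across runs.
import torch
from torch.utils.data.datapipes.iter import IterableWrapper

torch.manual_seed(0)
dp = IterableWrapper(range(10)).shuffle()
first_epoch = list(dp)

torch.manual_seed(0)
assert list(dp) == first_epoch  # same seed, same order
```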
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77741
Approved by: https://github.com/VitalyFedyunin, https://github.com/NivekT
2022-05-18 23:32:07 +00:00
Edward Z. Yang
4941e72e40 Revert "Revert "Implement sym_sizes to create proper IR for sym ints representing tensor sizes (#76836)""
This reverts commit c35bd8d423.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77719

Approved by: https://github.com/Chillee, https://github.com/malfet
2022-05-18 18:40:57 +00:00
Ning Li (Seattle)
4d1ead6dff [DataPipe] Update mux data pipe (#76384) (#77145)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76384

OSS issue discussion: https://github.com/pytorch/data/issues/346
This diff updates `mux` and `mux_longest` data pipe.
`mux`: Yields one element at a time from each of the input Iterable DataPipes (functional name: ``mux``). As in, one element from the 1st input DataPipe, then one element from the 2nd DataPipe in the next iteration, and so on. It ends when the shortest input DataPipe is exhausted.

`mux` example:

```
>>> from torchdata.datapipes.iter import IterableWrapper
>>> dp1, dp2, dp3 = IterableWrapper(range(3)), IterableWrapper(range(10, 15)), IterableWrapper(range(20, 25))
>>> list(dp1.mux(dp2, dp3))
[0, 10, 20, 1, 11, 21, 2, 12, 22]
```

Test Plan:
buck test mode/opt //caffe2/test:datapipe

https://www.internalfb.com/intern/testinfra/testrun/4785074706282345

Differential Revision: D36017945

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77145
Approved by: https://github.com/NivekT, https://github.com/ejguan
2022-05-18 16:23:07 +00:00
PyTorch MergeBot
48581d74ad Revert "Add dispatch mode testing for meta tensors and other stuff"
This reverts commit c1cdb1216b.

Reverted https://github.com/pytorch/pytorch/pull/77477 on behalf of https://github.com/malfet
2022-05-18 02:56:48 +00:00
Kevin Tse
ee080918df [DataPipe] Moving DataPipe buffers from __iter__ to instance (self)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76999

Approved by: https://github.com/VitalyFedyunin
2022-05-18 01:31:39 +00:00
Kevin Tse
bbaefdf6b5 [DataPipe] Enforcing single valid iterator for IterDataPipes multiple DataPipes as outputs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75995

Approved by: https://github.com/VitalyFedyunin
2022-05-18 01:31:39 +00:00
Kevin Tse
7c52f204e0 [DataPipe] Enforcing single valid iterator for IterDataPipes without multiple outputs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70479

Approved by: https://github.com/ejguan
2022-05-18 01:31:38 +00:00
Kevin Tse
e0451d8022 [DataPipe] refactor to separate _IterDataPipeMeta
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76776

Approved by: https://github.com/ejguan, https://github.com/VitalyFedyunin
2022-05-18 01:31:38 +00:00
Edward Z. Yang
c1cdb1216b Add dispatch mode testing for meta tensors and other stuff
We don't have any coverage for meta tensor correctness for backwards
because torch function mode only lets us interpose on Python torch API
calls, while backwards invocations happen from C++. To make this possible,
I add a torch_dispatch_meta test which runs the tests with
__torch_dispatch__.
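
A minimal sketch of why dispatch-level interposition sees backward ops (written against the later stable `TorchDispatchMode` API, not this PR's test harness):

```py
# A dispatch-level mode observes ATen ops invoked from C++ during backward,
# which a torch-function-level hook cannot see.
import torch
from torch.utils._python_dispatch import TorchDispatchMode

class LogOps(TorchDispatchMode):
    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        print("dispatch:", func)
        return func(*args, **(kwargs or {}))

with LogOps():
    x = torch.randn(3, requires_grad=True)
    (x * 2).sum().backward()  # backward ops are logged too
```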

While doing this, I needed to generate fresh expected failure / skip
lists for the new test suite, and I discovered that my original
scaffolding for this purpose was woefully insufficient.  So I rewrote
how the test framework worked, and at the same time rewrote the
__torch_function__ code to also use the new logic.  Here's what's
new:

- Expected failure / skip is now done on a per function call basis,
  rather than the entire test.  This means that separate OpInfo
  samples for a function don't affect each other.

- There are now only two lists: an expected-failure list (where the test
  consistently fails on all runs) and a skip list (where the test
  sometimes passes and sometimes fails).

- We explicitly notate the dtype that failed.  I considered detecting
  when something failed on all dtypes, but this was complicated and
  listing everything out seemed to be nice and simple.  To keep the
  dtypes short, I introduce a shorthand notation for dtypes.

- Conversion to meta tensors is factored into its own class
  MetaConverter

- To regenerate the expected failure / skip lists, just run with
  PYTORCH_COLLECT_EXPECT and filter on a specific test type
  (test_meta or test_dispatch_meta) for whichever you want to update.

Other misc fixes:

- Fix max_pool1d to work with BFloat16 in all circumstances, by making
  it dispatch and then fixing a minor compile error (constexpr doesn't
  work with BFloat16)

- Add resolve_name for turning random torch API functions into string
  names

- Add push classmethod to the Mode classes, so that you can more easily
  push a mode onto the mode stack

- Add some more skips for missing LAPACK

- Added an API to let you query if there's already a registration for
  a function, added a test to check that we register_meta for all
  decompositions (except detach, that decomp is wrong lol), and then
  update all the necessary sites to make the test pass.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77477

Approved by: https://github.com/zou3519
2022-05-18 00:18:34 +00:00
Vitaly Fedyunin
edffd595c2 [DataLoader] Adding ability to use dill to pass DataPipes in multiprocessing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77288

Approved by: https://github.com/ejguan, https://github.com/NivekT
2022-05-15 23:04:03 +00:00
David Berard
fa44e165ff Retry "[JIT] parse prim::Constant[value=annotate()] and prim::Constant[value={0}]"
Retry of https://github.com/pytorch/pytorch/pull/76875. It was reverted
due to torchvision failures, but it turned out that the failures were
caused by a different PR.

irparser previously didn't support these, which would cause failures in
log_extract.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77377

Approved by: https://github.com/datumbox
2022-05-13 15:12:07 +00:00
Ivan Yashchuk
09be44de7b Sparse BSR: Enable addmm, addmv, triangular_solve for BSR layout (#77255)
This PR enables `addmm`, `addmv`, `triangular_solve` functions for tensors with `torch.sparse_bsr` layout.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77255
Approved by: https://github.com/cpuhrsch
2022-05-12 08:31:44 +00:00
PyTorch MergeBot
2083b16f68 Revert "[JIT] parse prim::Constant[value=annotate()] and prim::Constant[value={0}]"
This reverts commit 31d3ce7000.

Reverted https://github.com/pytorch/pytorch/pull/76875 on behalf of https://github.com/janeyx99
2022-05-11 13:50:20 +00:00
Philip Meier
635aaa3d9d replace "grep" with Python processing in collect_env.py (#77148)
Fixes #77063.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77148
Approved by: https://github.com/ezyang
2022-05-10 19:35:15 +00:00
木澄
d22d749a0e faster batch sampler (#76951)
Fixes #76950

Improve the performance of iteration on `BatchSampler`, especially when `batch_size` is large.

Python 3.6.8:
```
  batch_size  drop_last     speedup
------------  -----------   -------
           4  True          -18.07%
           4  False         15.92%
           8  True          9.43%
           8  False         30.90%
          64  True          54.99%
          64  False         49.64%
         640  True          66.26%
         640  False         48.32%
        6400  True          69.06%
        6400  False         45.17%
```

Python 3.8.12:
```
  batch_size  drop_last    speedup
------------  -----------  --------
           4  True         -10.50%
           4  False        -0.78%
           8  True         24.40%
           8  False        10.20%
          64  True         90.96%
          64  False        26.09%
         640  True         112.88%
         640  False        20.09%
        6400  True         111.80%
        6400  False        18.37%

```

Check the issue page for more details of the tests.
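
A hedged sketch of this style of optimization (not necessarily the PR's exact code): slicing the sampler iterator in chunks avoids per-element Python-level bookkeeping when building each batch.

```py
from itertools import islice

def iter_batches(sampler, batch_size, drop_last=False):
    it = iter(sampler)
    while True:
        batch = list(islice(it, batch_size))  # pull a whole batch in one call
        if not batch or (drop_last and len(batch) < batch_size):
            return
        yield batch
```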
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76951
Approved by: https://github.com/ejguan
2022-05-10 18:19:54 +00:00
yanbing-j
cd33e412a2 Enable fp32/bf16 PRelu forward and backward in MkldnnCPU path (#60427)
Enable fp32/bf16 PRelu forward and backward in MkldnnCPU path.

Fixes https://github.com/pytorch/pytorch/issues/58896

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60427
Approved by: https://github.com/VitalyFedyunin, https://github.com/ngimel, https://github.com/malfet
2022-05-10 17:29:11 +00:00
Kevin Tse
a008d19ff7 [DataPipe] Revamp serialization logic of DataPipes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74984

Approved by: https://github.com/ejguan
2022-05-10 16:16:46 +00:00
David Berard
31d3ce7000 [JIT] parse prim::Constant[value=annotate()] and prim::Constant[value={0}]
irparser previously didn't support these, which would cause failures in
log_extract.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76875

Approved by: https://github.com/eellison, https://github.com/tugsbayasgalan
2022-05-06 19:40:09 +00:00
Kiarash Jamali
bc3c7a6cbd Fix issue with _checkpoint_without_reentrant
Fixes  #76737
I also added a test case for this bug.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76890
Approved by: https://github.com/albanD
2022-05-05 17:37:31 +00:00
Ivan Yashchuk
3df0140cbd Sparse CSR: Fix sampled_addmm for noncontiguous inputs and fix block sparse triangular solve
`torch.sparse.sampled_addmm` was incorrect for noncontiguous inputs on CUDA.
Unfortunately, the tests overlooked this because 1x5 and 5x1 shapes were
used, so noncontiguous inputs were never exercised properly.

The block sparse triangular solver on CUDA could return incorrect results if
there's a zero on the diagonal of the sparse matrix. Now it returns nan.
Tests also revealed that the unitriangular=True flag is not working
correctly on CPU in some cases. That part needs more investigation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76590

Approved by: https://github.com/cpuhrsch
2022-05-05 09:00:48 +00:00
samdow
6779366f27 add nested mode to python mode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75965

Approved by: https://github.com/albanD, https://github.com/ezyang, https://github.com/zou3519
2022-05-04 13:01:06 +00:00
Michael Suo
fb0f285638 [lint] upgrade mypy to latest version
Fixes https://github.com/pytorch/pytorch/issues/75927.

Had to fix some bugs and add some ignores.

To check if clean:
```
lintrunner --paths-cmd='git grep -Il .' --take MYPY,MYPYSTRICT
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76753

Approved by: https://github.com/malfet
2022-05-03 20:51:34 +00:00
PyTorch MergeBot
3d7428d9ac Revert "[lint] upgrade mypy to latest version"
This reverts commit 9bf18aab94.

Reverted https://github.com/pytorch/pytorch/pull/76753 on behalf of https://github.com/suo
2022-05-03 20:01:18 +00:00
Michael Suo
9bf18aab94 [lint] upgrade mypy to latest version
Fixes https://github.com/pytorch/pytorch/issues/75927.

Had to fix some bugs and add some ignores.

To check if clean:
```
lintrunner --paths-cmd='git grep -Il .' --take MYPY,MYPYSTRICT
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76753

Approved by: https://github.com/malfet
2022-05-03 19:43:28 +00:00
samdow
598e7e5f19 [Reland] Change 'python mode' to 'torch dispatch mode'
Changes the Python Mode name to Torch Dispatch Mode: now that there is a Torch Function Mode, Torch Dispatch Mode and Torch Function Mode are consistent with each other.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76562
Approved by: https://github.com/zou3519, https://github.com/albanD
2022-05-02 20:06:43 +00:00
PyTorch MergeBot
395a620a4f Revert "Change 'python mode' to 'torch dispatch mode'"
This reverts commit 7203a73986.

Reverted https://github.com/pytorch/pytorch/pull/76562 on behalf of https://github.com/janeyx99
2022-05-02 14:42:11 +00:00
Behrooz
76916ccf81 Fix lists in the docstring
Signed-off-by: Behrooz <3968947+drbeh@users.noreply.github.com>

Fixes #76593
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76594
Approved by: https://github.com/ejguan
2022-05-02 14:12:23 +00:00
samdow
7203a73986 Change 'python mode' to 'torch dispatch mode'
Changes the Python Mode name to Torch Dispatch Mode: now that there is a Torch Function Mode, Torch Dispatch Mode and Torch Function Mode are consistent with each other.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76562
Approved by: https://github.com/zou3519
2022-05-02 13:33:58 +00:00
rraminen
7422ccea8b Hipify fixes for a successful DeepSpeed build
These commits are required to build DeepSpeed on ROCm without the hipify errors.

a41829d9ed
663c718462

cc: @jeffdaily

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76141
Approved by: https://github.com/jeffdaily, https://github.com/pruthvistony, https://github.com/albanD
2022-04-28 13:19:59 +00:00
Edward Wang (EcoF)
7c0ccb8a9d black formatting for utils/tensorboard (#76396)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/76396

Reviewed By: Reubend

Differential Revision: D35945748

Pulled By: edward-io

fbshipit-source-id: ffee22e88aaf49eb98b3e2eb6624a2dadb8ef754
(cherry picked from commit 6b5656b7c081cd69135b54f7d13d02c1c361b696)
2022-04-28 00:21:58 +00:00
zengk95
ef63408853 Revert [DataPipe] Update mux data pipe
Reverts #76384

This is breaking test_demux_mux_datapipe (__main__.TestIterableDataPipeBasic). See logs: a997046017
It was also red on the PR: https://hud.pytorch.org/pytorch/pytorch/pull/76384
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76507
Approved by: https://github.com/kit1980
2022-04-28 00:06:30 +00:00
Ning Li (Seattle)
a997046017 [DataPipe] Update mux data pipe (#76384)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76384

OSS issue discussion: https://github.com/pytorch/data/issues/346
This diff updates `mux` and `mux_longest` data pipe.
`mux`: Yields one element at a time from each of the input Iterable DataPipes (functional name: ``mux``). As in, one element from the 1st input DataPipe, then one element from the 2nd DataPipe in the next iteration, and so on. It ends when the shortest input DataPipe is exhausted.

`mux` example:

```
>>> from torchdata.datapipes.iter import IterableWrapper
>>> dp1, dp2, dp3 = IterableWrapper(range(3)), IterableWrapper(range(10, 15)), IterableWrapper(range(20, 25))
>>> list(dp1.mux(dp2, dp3))
[0, 10, 20, 1, 11, 21, 2, 12, 22]
```

Test Plan:
buck test mode/dev //pytorch/data/test:tests -- --exact 'pytorch/data/test:tests - test_mux_longest_iterdatapipe (test_datapipe.TestDataPipe)'

https://www.internalfb.com/intern/testinfra/testrun/3096224791148107

Reviewed By: ejguan

Differential Revision: D35799965

fbshipit-source-id: 320e71a342ec27e6e9200624aad42f4b99f97c3a
(cherry picked from commit 741ed595275df6c05026ed6f0e78d7052328fb7d)
2022-04-27 22:10:42 +00:00
Scott Wolchok
e816e17655 [PyTorch] Add native fast path for transformer encoder inference (#76333)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76333

The current PyTorch multi-head attention and transformer
implementations are slow. This should speed them up for inference.
ghstack-source-id: 154737857

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: cpuhrsch

Differential Revision: D35239925

fbshipit-source-id: 5a7eb8ff79bc6afb4b7d45075ddb2a24a6e2df28
2022-04-26 12:58:03 -04:00
Kevin Tse
ccd7233fdd [DataPipe] clearing buffer for DataPipes during __del__
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76345

Approved by: https://github.com/ejguan
2022-04-26 14:24:28 +00:00
Jon Janzen
2387efd356 Revert "[PyTorch] Add native fast path for transformer encoder inference"
This reverts commit b369b89f23.

This has internal changes and should not have been landed via mergebot.

Ref: https://github.com/pytorch/pytorch/pull/75809#issuecomment-1108717166
2022-04-25 11:40:02 -04:00
erjia
0ff05b1e97 [DataPipe] Add functional API docstring and fix typo in test
Per title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76272
Approved by: https://github.com/ishaan-mehta, https://github.com/NivekT
2022-04-25 14:16:53 +00:00
Scott Wolchok
b369b89f23 [PyTorch] Add native fast path for transformer encoder inference
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75809

The current PyTorch multi-head attention and transformer
implementations are slow. This should speed them up for inference.

Differential Revision: [D35239925](https://our.internmc.facebook.com/intern/diff/D35239925/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35239925/)!

Approved by: https://github.com/ezyang
2022-04-25 06:11:36 +00:00
Kevin Tse
383f026791 [DataPipe] Enabling graph traversal for MapDataPipe
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74851

Approved by: https://github.com/ejguan
2022-04-22 18:06:16 +00:00
erjia
ec591087fb [DataPipe] Add input_col to filter and add deprecation warning for DataPipe arguments
Last patch to align DataPipe API with TorchArrow DataFrame

For deprecation warning of DataPipe argument:
```
The argument `drop_empty_batches` of `FilterIterDataPipe()` is deprecated since 1.12 and will be removed in 1.14.
See https://github.com/pytorch/data/issues/163 for details.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76060
Approved by: https://github.com/NivekT
2022-04-22 17:49:39 +00:00
erjia
b8cce8847f [DataPipe] Add functional API to StreamReader and FileOpener
Per title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76233
Approved by: https://github.com/NivekT
2022-04-22 17:49:26 +00:00
Kevin Tse
ed8e498c70 [DataPipe] Improving debug message when argument is a tuple/list of DataPipes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76134

Approved by: https://github.com/ejguan
2022-04-22 17:31:11 +00:00
Erjia Guan
0289ab2cec Fix data-related public API (#368)
Summary:
X-link: https://github.com/pytorch/data/pull/368

This PR aims to expose the right data-related API.

Two more changes in this PR convert public APIs to private ones:
`check_lambda_fn` -> `_check_lambda_fn`
`deprecation_warning` -> `_deprecation_warning`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76143

Reviewed By: albanD, NivekT

Differential Revision: D35798311

Pulled By: ejguan

fbshipit-source-id: b13fded5c88a533c706702fb2070c918c839dca4
(cherry picked from commit 0b534b829a2e90e1e533951c6d334fdeaa9358b9)
2022-04-21 17:27:05 -07:00
Jeeja
45bbc4c028 Update Dataloader with default parameter device (#65402)
Summary:
`pin_memory` has an optional device parameter to specify
which device to pin memory for. Without this change, the
DataLoader works only with the CUDA backend. To add
support for other backends that support pinned memory, the
DataLoader is updated with device as an optional parameter.

Fixes #{issue number}
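
Usage sketch (the `pin_memory_device` argument name matches the API that eventually shipped; treat it as illustrative here):

```py
# An explicit device string tells the DataLoader which backend to pin for.
import torch
from torch.utils.data import DataLoader, TensorDataset

ds = TensorDataset(torch.arange(8, dtype=torch.float32))
dl = DataLoader(ds, batch_size=4, pin_memory=True, pin_memory_device="cuda")
```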

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65402

Reviewed By: zou3519

Differential Revision: D32282204

Pulled By: VitalyFedyunin

fbshipit-source-id: e2e09876969af108d0db38af7c2d1b2f1cfa9858
(cherry picked from commit 3b76e151964fce442e27fe8fb5c37af930da4fa1)
2022-04-21 01:33:53 +00:00
Kevin Tse
116d0bec5d [DataPipe] Improving debug message when exceptions are raised within IterDataPipe
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75618

Approved by: https://github.com/ejguan
2022-04-20 14:45:34 +00:00
Min Si
9562aedb58 ROCm: add HIP_HOME/include,lib in cpp_extensions (#75548)
Summary:
hip/hip_runtime.h and libamdhip64.so may be required to compile
extensions such as torch_ucc. They are in $ROCM_HOME/hip by default,
and may not be symlinked to $ROCM_HOME/include and $ROCM_HOME/lib.
This commit defines $ROCM_HOME/hip as $HIP_HOME, and adds its include
and lib paths when building hipified extensions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75548

Test Plan:
## Verify OSS pytorch + TorchUCC on an AMD GPU machine (MI100)
- step 1. Install OSS pytorch
```
export ROCM_PATH=/opt/rocm-4.5.2
git clone https://github.com/pytorch/pytorch.git
cd pytorch
python3 tools/amd_build/build_amd.py

USE_NCCL=0 USE_RCCL=0 USE_KINETO=0 with-proxy python3 setup.py develop
USE_NCCL=0 USE_RCCL=0 USE_KINETO=0 with-proxy python3 setup.py install
```

- step2. Install torchUCC extension
```
# /opt/rocm-4.5.2/include/hip does not exist, need include /opt/rocm-4.5.2/hip/include at compile time
export ROCM_PATH=/opt/rocm-4.5.2
export RCCL_INSTALL_DIR=/opt/rccl-rocm-rel-4.4-rdc
git clone https://github.com/facebookresearch/torch_ucc.git
cd torch_ucc
UCX_HOME=$RCCL_INSTALL_DIR UCC_HOME=$RCCL_INSTALL_DIR WITH_CUDA=$ROCM_PATH python setup.py
```
Build log before fix (error "hip/hip_runtime.h: No such file or directory"): P493038915
Build log after fix: P493037572

Reviewed By: ezyang

Differential Revision: D35506098

Pulled By: minsii

fbshipit-source-id: 76cbb6d4eaa6549a00898c9d9ebaca47a55330e9
(cherry picked from commit d684c080edf1fbd293e3321151976812c1da8533)
2022-04-19 20:51:37 +00:00
Han Qi
b34b192d6b Reland "Make debug_pkl smaller by only emitting unique traces." (#73368)
Summary:
## Original commit message:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73368

The debug_pkl file inside PyTorch's .pt file consists of a list of SourceRanges. Each SourceRange points to a Source, which is a stack trace, a filename, and start/end numbers. Those are emitted in the debug_pkl file as strings.
Since many SourceRanges share the same source, the string for a trace can be deduped.
The newer format saves the set of unique traces in a tuple; each SourceRange then saves the offset of its trace w.r.t. its position in that tuple (i.e. manually applying dictionary compression).
This helps with smaller file size. On loading, if we copied each trace into Source as a string, the runtime memory would still blow up.
To mitigate this, we use SourceView directly instead of Source, which takes a reference to the string inside the Deserializer and turns it into a string_view. This is safe because the Deserializer is held by the Unpickler via shared_ptr, and the Unpickler is also held via shared_ptr by another Source object. That Source object stays alive during model construction.
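
A plain-Python sketch of the dedup scheme described above (illustrative, not the serializer's code):

```py
# Store each unique trace string once; each record keeps only the offset of
# its trace within the tuple of unique traces.
def dedup_traces(traces):
    unique, offsets, seen = [], [], {}
    for trace in traces:
        if trace not in seen:
            seen[trace] = len(unique)
            unique.append(trace)
        offsets.append(seen[trace])
    return tuple(unique), offsets
```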

Test Plan:
## Original Test plan
unit test

Took original file (312271638_930.predictor.disagg.local); loaded with `torch.jit.load` save again with `torch.jit.save`. Unzip both, look at contents:
```
[qihan@devvm5585.vll0 ~]$ du archive -h
4.0K    archive/xl_model_weights
3.7M    archive/extra
8.0K    archive/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K    archive/code/__torch__/caffe2/torch/fb/model_transform
8.0K    archive/code/__torch__/caffe2/torch/fb
8.0K    archive/code/__torch__/caffe2/torch
8.0K    archive/code/__torch__/caffe2
20M     archive/code/__torch__/torch/fx/graph_module
20M     archive/code/__torch__/torch/fx
8.0K    archive/code/__torch__/torch/classes
20M     archive/code/__torch__/torch
20M     archive/code/__torch__
20M     archive/code
2.7M    archive/constants
35M     archive
[qihan@devvm5585.vll0 ~]$ du resaved -h
4.0K    resaved/extra
8.0K    resaved/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K    resaved/code/__torch__/caffe2/torch/fb/model_transform
8.0K    resaved/code/__torch__/caffe2/torch/fb
8.0K    resaved/code/__torch__/caffe2/torch
8.0K    resaved/code/__torch__/caffe2
1.3M    resaved/code/__torch__/torch/fx/graph_module
1.3M    resaved/code/__torch__/torch/fx
8.0K    resaved/code/__torch__/torch/classes
1.4M    resaved/code/__torch__/torch
1.4M    resaved/code/__torch__
1.4M    resaved/code
2.7M    resaved/constants
13M     resaved
[qihan@devvm5585.vll0 ~]$
```
## Additional test:
`buck test mode/dev-tsan //caffe2/benchmarks/static_runtime:static_runtime_cpptest -- --exact 'caffe2/benchmarks/static_runtime:static_runtime_cpptest - StaticRuntime.to'` passes

 test jest.fbios.startup_cold_start.local.simulator f333356873 -

Differential Revision: D35196883

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74869
Approved by: https://github.com/gmagogsfm
2022-04-18 22:34:21 +00:00
Scott Wolchok
97c993ca7a [PyTorch] Add NestedTensor support functions for transformers
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75491

Here are the NestedTensor kernels we'll need for the improved transformer implementation.

Differential Revision: [D35409275](https://our.internmc.facebook.com/intern/diff/D35409275/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35409275/)!

Approved by: https://github.com/cpuhrsch
2022-04-14 16:30:23 +00:00
Sebastian Brodehl
b46f3a49b3 [tensorboard][writer] Add missing 'dataformats' argument to 'add_image' docs.
The [torch.utils.tensorboard.SummaryWriter.add_image](https://pytorch.org/docs/stable/_modules/torch/utils/tensorboard/writer.html#SummaryWriter.add_image) method is missing the `dataformats` argument in its docs.

This PR adds the missing argument to the docs (analogous to `add_images` docs).
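
Usage sketch of the argument being documented (values are illustrative):

```py
# dataformats tells add_image how to interpret the array axes,
# e.g. "HWC" for a height x width x channels image.
import numpy as np
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()
img_hwc = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
writer.add_image("example", img_hwc, global_step=0, dataformats="HWC")
writer.close()
```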

Pull Request resolved: https://github.com/pytorch/pytorch/pull/48834
Approved by: https://github.com/ezyang
2022-04-14 03:39:19 +00:00
provefar
7a243ddd19 Add import to importlib.abc
Fixes #70525

```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-334d309cf512> in <module>
----> 1 lltm_cpp = load(name="lltm_cpp", sources=["lltm.cpp"])

/usr/lib/python3.10/site-packages/torch/utils/cpp_extension.py in load(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1122                 verbose=True)
   1123     '''
-> 1124     return _jit_compile(
   1125         name,
   1126         [sources] if isinstance(sources, str) else sources,

/usr/lib/python3.10/site-packages/torch/utils/cpp_extension.py in _jit_compile(name, sources, extra_cflags, extra_cuda_cflags, extra_ldflags, extra_include_paths, build_directory, verbose, with_cuda, is_python_module, is_standalone, keep_intermediates)
   1360         return _get_exec_path(name, build_directory)
   1361
-> 1362     return _import_module_from_library(name, build_directory, is_python_module)
   1363
   1364

/usr/lib/python3.10/site-packages/torch/utils/cpp_extension.py in _import_module_from_library(module_name, path, is_python_module)
   1751         spec = importlib.util.spec_from_file_location(module_name, filepath)
   1752         module = importlib.util.module_from_spec(spec)
-> 1753         assert isinstance(spec.loader, importlib.abc.Loader)
   1754         spec.loader.exec_module(module)
   1755         return module

AttributeError: module 'importlib' has no attribute 'abc'
```
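
The fix named in the title, as a snippet: `importlib.abc` is a submodule, and it is not guaranteed to be bound as an attribute of `importlib` until it is imported explicitly.

```py
import importlib.abc   # makes importlib.abc.Loader resolvable
import importlib.util
```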
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75736
Approved by: https://github.com/ezyang
2022-04-14 03:32:30 +00:00
erjia
277c8fe646 [DataPipe] Make sure the profiler wrapper can delegate API for iterator
This PR tries to solve the problem of delegating the API from the profiler layer to the `Iterator` returned from `IterDataPipe`.

We need this for internal usages like `limit`, `resume`, etc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75275
Approved by: https://github.com/NivekT
2022-04-13 21:47:14 +00:00
Philip Meier
04db1b874f prevent overriding shuffle settings in DataLoader for datapipes
Fixes https://github.com/pytorch/data/issues/295

Follow-up to https://github.com/pytorch/pytorch/pull/75014#issuecomment-1091921305. We only need to update locations where we actually check `shuffle` for identity with a boolean value, i.e. `shuffle is False`. For bool-ish checks like `if shuffle:`, `None` behaves just like `False`.

`IterDataPipe`'s are currently not mentioned in the docstring. Since this change only applies to them, I didn't update it. LMK, if I should do that.
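
A tiny sketch of the identity-vs-truthiness distinction described above:

```py
# Only an explicit False matches the identity check; None stays "unset"
# while still being falsy in bool-ish checks.
shuffle = None
assert shuffle is not False  # `shuffle is False` does not fire for None
assert not shuffle           # but `if shuffle:` treats None like False
```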

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75505
Approved by: https://github.com/ejguan
2022-04-12 18:26:33 +00:00
Kevin Tse
26d22b7fcf [DataPipe] Change interface generation process to revert back to original working process
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75602

Approved by: https://github.com/ejguan
2022-04-11 19:10:48 +00:00
Philip Meier
ff8705b374 improve datapipe deprecation warnings (#74685)
Summary:
Fixes pytorch/data#322.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74685

Reviewed By: mrshenli

Differential Revision: D35143494

Pulled By: NivekT

fbshipit-source-id: b87c312a93ea6a8ce2090da961312cb7e2cbc2eb
(cherry picked from commit cb74b81a822cf5a551f16b454a07430dfabaaa36)
2022-04-08 16:12:56 +00:00
David Berard
9d05ce602e [JIT] Move log_extract.py helper functions to torch.utils
This will allow us to reuse the log_extract.py tools in torchbench

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75436

Approved by: https://github.com/eellison
2022-04-07 22:17:58 +00:00
Philip Meier
3c10987692 don't add extra shuffle in DataLoader2 if one is present
Without this, `DataLoader2` will just add a `Shuffler` to the end of the datapipe if `shuffle=True`:

```py
from torch.utils.data.dataloader_experimental import DataLoader2

from torchdata.datapipes.iter import IterableWrapper, IterDataPipe, Shuffler

class Sorter(IterDataPipe):
    def __init__(self, datapipe):
        self.datapipe = datapipe

    def __iter__(self):
        return iter(sorted(self.datapipe))

data = list(range(1000))
dp = IterableWrapper(data)
dp = Shuffler(dp).set_shuffle(False)
dp = Sorter(dp)

dl2 = DataLoader2(dp, shuffle=True, batch_size=None)

assert list(dl2) == data  # fails unless you hit a lucky random seed
```

This example is somewhat nonsensical, but it demonstrates that we cannot simply add a `Shuffler`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75014
Approved by: https://github.com/ejguan
2022-04-05 19:53:08 +00:00
erjia
841a7f5187 [DataPipe] apply dill serialization for _Demux and add cache to traverse
- Fix: `_Demux` cannot be pickled when dill is present (https://github.com/pytorch/pytorch/pull/74958#issuecomment-1084637227)
- Add a cache to the traverse function to prevent infinite recursion on circular references between DataPipes (Fixes https://github.com/pytorch/data/issues/237)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75034
Approved by: https://github.com/wenleix
2022-04-04 19:45:14 +00:00
Kevin Tse
4c5d532728 [DataPipe] only apply special serialization when dill is installed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74958

Approved by: https://github.com/ejguan
2022-03-30 20:38:05 +00:00
amin-nejad
cce831c805 Fix misleading DataLoader docstring
Fixes description of `prefetch_factor` argument to `DataLoader` as discussed in #58030
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74558
Approved by: https://github.com/NivekT
2022-03-28 17:54:48 +00:00
Nicolas Hug
5667c4ea21 Remove default parameter of ShufflerIterDataPipe (#74370)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74370

Closes https://github.com/pytorch/data/issues/298. This PR:

- removes the `default` parameter of `ShufflerIterDataPipe`
- renames `set_shuffle_setting()` into `set_shuffle()`
- let `set_shuffle()` return `self`.

Test Plan: Imported from OSS

Reviewed By: george-qi

Differential Revision: D35073666

Pulled By: NicolasHug

fbshipit-source-id: 9847b037e70f44f36eaf4471f2c12fa8ec2ed73c
(cherry picked from commit b07ab646f308532886e8daddd57e937a53edb153)
2022-03-28 12:47:24 +00:00
Eli Uriegas
c170d395de utils: Only check for xnnpack if torch installed (#74342)
Summary:
Fixes a bug where collect_env.py was not able to be run without having
torch installed

Signed-off-by: Eli Uriegas <eliuriegas@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74342

Reviewed By: malfet, janeyx99

Differential Revision: D34943464

Pulled By: seemethere

fbshipit-source-id: dbaa0004b88cb643a9c6426c9ea7c5be3d3c9ef5
(cherry picked from commit 4f39ebb823f88df0c3902db15deaffc6ba481cb3)
2022-03-17 15:31:26 +00:00
Kevin Tse
ff3688f07a [BE Hackathon][DataPipe] Automatically generate datapipe.pyi via CMake (#73991)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73991

Automatically generate `datapipe.pyi` via CMake and removing the generated .pyi file from Git. Users should have the .pyi file locally after building for the first time.

I will also be adding an internal equivalent diff for buck.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34868001

Pulled By: NivekT

fbshipit-source-id: 448c92da659d6b4c5f686407d3723933c266c74f
(cherry picked from commit 306dbc5f469e63bc141dac57ef310e6f0e16d9cd)
2022-03-15 14:46:34 +00:00
Kevin Tse
eec994fc16 [DataPipe] Separating DataPipes from Dataset into different files (#73396)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73396

Separating DataPipes from Dataset into different files. This makes the code more maintainable and simplifies some of the code generation.

I have also tried to move `datapipe.py` into `torch.utils.data.datapipes`, but that will lead to circular import and rewriting many import statements. Should I put more time and go down that path some more?

Fixes https://github.com/pytorch/data/issues/213

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34481962

Pulled By: NivekT

fbshipit-source-id: 42fb26fe7fc334636852cfd8719fc807bdaa7912
(cherry picked from commit 81e76a64e297cb5c58caa951c554e49526173936)
2022-03-15 14:46:34 +00:00
Alban Desmaison
734281c3d6 Cleanup all module references in doc (#73983)
Summary:
Working towards https://docs.google.com/document/d/10yx2-4gs0gTMOimVS403MnoAWkqitS8TUHX73PN8EjE/edit?pli=1#

This PR:
- Ensure that all the submodules are listed in an rst file (which ensures they are considered by the coverage tool)
- Remove some long-deprecated code that just errored out on import
- Remove the allow list altogether to ensure nothing gets added back there

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73983

Reviewed By: anjali411

Differential Revision: D34787908

Pulled By: albanD

fbshipit-source-id: 163ce61e133b12b2f2e1cbe374f979e3d6858db7
(cherry picked from commit c9edfead7a01dc45bfc24eaf7220d2a84ab1f62e)
2022-03-10 22:26:29 +00:00
Evren Tumer
7534525735 Reset worker cycle iterator for determinism across runs (#73675)
Summary:
Reset worker cycle iterator for determinism across runs

Fixes https://github.com/pytorch/pytorch/issues/73603

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73675

Reviewed By: bdhirsh

Differential Revision: D34688704

Pulled By: ejguan

fbshipit-source-id: 7bab11f0b9f59645d9b168fa11d92dc7c2c4d34e
(cherry picked from commit eb5fd559224988f9967528e154cf37c5031fe7c2)
2022-03-09 14:55:07 +00:00
Han Qi
0723639b60 Revert D34455360: Multisect successfully blamed D34455360 for test failures
Summary:
This diff is reverting D34455360 (61d6c43864).
D34455360 (61d6c43864) is making the tests below fail, and this revert diff is either the revert of the blame diff or the revert of the stack of diffs that need to be reverted to revert the blame diff.

Tests affected:
- https://www.internalfb.com/intern/test/562950004334605/

Multisect link:
https://www.internalfb.com/intern/testinfra/multisect/756170

Test Plan: NA

Reviewed By: zhxchen17

Differential Revision: D34596156

fbshipit-source-id: a465bca0094db3caf6130c80f1ed49eea981359b
(cherry picked from commit ef5e5578c64ce9827570757fb016aafa9c782c6a)
2022-03-08 23:18:54 +00:00
Edward Z. Yang
35cfa74f97 Add a default implementation of __torch_dispatch__
I was working on an explanation of how to call into the "super"
implementation of some given ATen operation inside of __torch_dispatch__
(https://github.com/albanD/subclass_zoo/blob/main/trivial_tensors.py)
and I kept thinking to myself "Why doesn't just calling super() on
__torch_dispatch__ work"?  Well, after this patch, it does!  The idea
is if you don't actually unwrap the input tensors, you can call
super().__torch_dispatch__ to get at the original behavior.
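
A sketch of the pattern this patch enables (illustrative; constructing usable instances of such a subclass needs more machinery, as in the linked trivial_tensors.py):

```py
import torch

class Passthrough(torch.Tensor):
    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        # Without unwrapping the inputs, defer to the default implementation
        # added by this patch to recover the original behavior.
        return super().__torch_dispatch__(func, types, args, kwargs)
```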

Internally, this is implemented by disabling PythonKey and then
redispatching.  This implementation of disabled_torch_dispatch is
not /quite/ right, and some reasons why are commented in the code.
There is then some extra work I have to do to make sure we recognize
disabled_torch_dispatch as the "default" implementation (so we don't
start slapping PythonKey on all tensors, including base Tensors),
which is modeled the same way as how disabled_torch_function is done.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73684

Approved by: albanD
2022-03-03 20:19:33 +00:00
Han Qi
61d6c43864 Make debug_pkl smaller by only emitting unique traces. (#73368)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73368

The debug_pkl file inside PyTorch's .pt file consists of a list of SourceRanges. Each SourceRange points to a Source, which is a stack trace, a filename, and start/end numbers. Those are emitted in the debug_pkl file as strings.
Since many SourceRanges share the same source, the string for a trace can be deduped.
The newer format saves the set of unique traces in a tuple; each SourceRange then saves the offset of its trace w.r.t. its position in that tuple (i.e. manually applying dictionary compression).
This helps with smaller file size. On loading, if we copied each trace into Source as a string, the runtime memory would still blow up.
To mitigate this, we use SourceView directly instead of Source, which takes a reference to the string inside the Deserializer and turns it into a string_view. This is safe because the Deserializer is held by the Unpickler via shared_ptr, and the Unpickler is also held via shared_ptr by another Source object. That Source object stays alive during model construction.

Test Plan:
unit test

Took original file (312271638_930.predictor.disagg.local); loaded with `torch.jit.load` save again with `torch.jit.save`. Unzip both, look at contents:
```
[qihan@devvm5585.vll0 ~]$ du archive -h
4.0K    archive/xl_model_weights
3.7M    archive/extra
8.0K    archive/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K    archive/code/__torch__/caffe2/torch/fb/model_transform
8.0K    archive/code/__torch__/caffe2/torch/fb
8.0K    archive/code/__torch__/caffe2/torch
8.0K    archive/code/__torch__/caffe2
20M     archive/code/__torch__/torch/fx/graph_module
20M     archive/code/__torch__/torch/fx
8.0K    archive/code/__torch__/torch/classes
20M     archive/code/__torch__/torch
20M     archive/code/__torch__
20M     archive/code
2.7M    archive/constants
35M     archive
[qihan@devvm5585.vll0 ~]$ du resaved -h
4.0K    resaved/extra
8.0K    resaved/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K    resaved/code/__torch__/caffe2/torch/fb/model_transform
8.0K    resaved/code/__torch__/caffe2/torch/fb
8.0K    resaved/code/__torch__/caffe2/torch
8.0K    resaved/code/__torch__/caffe2
1.3M    resaved/code/__torch__/torch/fx/graph_module
1.3M    resaved/code/__torch__/torch/fx
8.0K    resaved/code/__torch__/torch/classes
1.4M    resaved/code/__torch__/torch
1.4M    resaved/code/__torch__
1.4M    resaved/code
2.7M    resaved/constants
13M     resaved
[qihan@devvm5585.vll0 ~]$
```

Reviewed By: gmagogsfm

Differential Revision: D34455360

fbshipit-source-id: 8cc716f9bba7183746b1b4ecc33a2de34ac503b9
(cherry picked from commit f1a04730fc9ac8fdab6c8e4c44cb5529e42090e4)
2022-03-02 08:37:08 +00:00
Digant Desai
b2054d3025 Prepare for an update to the XNNPACK submodule (#72642)
Summary:
- Target Sha1: ae108ef49aa5623b896fc93d4298c49d1750d9ba
- Make USE_XNNPACK a dependent option on cmake minimum version 3.12
- Print USE_XNNPACK under the cmake options summary, and print its
  availability from collect_env.py
- Skip XNNPACK-based tests when XNNPACK is not available
    - Add a SkipIfNoXNNPACK wrapper to skip tests (see the sketch after this list)
- Update the cmake version for the xenial-py3.7-gcc5.4 image to 3.12.4
    - This is required for the backwards-compatibility test.
      The PyTorch op schema is XNNPACK-dependent; see
      aten/src/ATen/native/xnnpack/RegisterOpContextClass.cpp for an
      example. The nightly version is assumed to have USE_XNNPACK=ON,
      so with this change we ensure that the test build can also
      have XNNPACK.
- HACK: skipping test_xnnpack_integration tests on ROCm
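
A hypothetical sketch of the skip wrapper's shape (the real decorator lives in PyTorch's internal test utilities; only `torch.backends.xnnpack.enabled` is an existing API here):
```py
import unittest
import torch.backends.xnnpack

def skipIfNoXNNPACK(fn):
    # Skip the decorated test when the build has no XNNPACK support.
    return unittest.skipUnless(
        torch.backends.xnnpack.enabled, "XNNPACK is not available"
    )(fn)
```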

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72642

Reviewed By: kimishpatel

Differential Revision: D34456794

Pulled By: digantdesai

fbshipit-source-id: 85dbfe0211de7846d8a84321b14fdb061cd6c037
(cherry picked from commit 6cf48e7b64d6979962d701b5d493998262cc8bfa)
2022-02-25 00:39:15 +00:00
Kevin Tse
615ecac638 [DataPipe] Adding examples for MapDataPipes with small fixes for others (#73250)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73250

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34400782

Pulled By: NivekT

fbshipit-source-id: 4b8fc83a5a39d9661efbedc764cc68b1714db5c7
(cherry picked from commit 5f376972f0fa5ce10a40b8b083cc00de2ad237fe)
2022-02-24 20:38:15 +00:00
Alban Desmaison
3bd1507ff2 Revert D33994011: Make debug_pkl smaller by only emitting unique traces.
Test Plan: revert-hammer

Differential Revision:
D33994011 (3d37f5b052)

Original commit changeset: 8e6224c6e942

Original Phabricator Diff: D33994011 (3d37f5b052)

fbshipit-source-id: 885e739efa1081382e1fcf9c6cccba92c57e9f7a
(cherry picked from commit a6d98c85a736c2eb321a6f38005dd0f5dc43eb87)
2022-02-24 16:38:55 +00:00
Han Qi
3d37f5b052 Make debug_pkl smaller by only emitting unique traces. (#72596)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72596

The debug_pkl file inside of PyTorch's .pt file consists of a list of SourceRanges. Each SourceRange points to a Source, which is a stack trace, a filename, and start/end numbers. Those are emitted in the debug_pkl file as strings.

Since many SourceRanges share the same Source, the string for each trace can be deduped.

The newer format saves a set of unique traces in a tuple; each SourceRange then saves the offset of its trace w.r.t. its position in that tuple (i.e. manually applying dictionary compression).

The above helps with smaller file size. On loading, if we copied each trace into Source as a string, the runtime memory would still blow up.
To mitigate this, we use SourceView directly instead of Source, which takes a reference to the string inside the Deserializer and turns it into a string_view. This is safe because the Deserializer is held by the Unpickler via shared_ptr, and the Unpickler is in turn held via shared_ptr by another Source object. That Source object stays alive during model construction.

Test Plan:
unit test

Took the original file (312271638_930.predictor.disagg.local); loaded it with `torch.jit.load`, then saved it again with `torch.jit.save`. Unzipped both and looked at the contents:
```
[qihan@devvm5585.vll0 ~]$ du archive -h
4.0K    archive/xl_model_weights
3.7M    archive/extra
8.0K    archive/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K    archive/code/__torch__/caffe2/torch/fb/model_transform
8.0K    archive/code/__torch__/caffe2/torch/fb
8.0K    archive/code/__torch__/caffe2/torch
8.0K    archive/code/__torch__/caffe2
20M     archive/code/__torch__/torch/fx/graph_module
20M     archive/code/__torch__/torch/fx
8.0K    archive/code/__torch__/torch/classes
20M     archive/code/__torch__/torch
20M     archive/code/__torch__
20M     archive/code
2.7M    archive/constants
35M     archive
[qihan@devvm5585.vll0 ~]$ du resaved -h
4.0K    resaved/extra
8.0K    resaved/code/__torch__/caffe2/torch/fb/model_transform/splitting
8.0K    resaved/code/__torch__/caffe2/torch/fb/model_transform
8.0K    resaved/code/__torch__/caffe2/torch/fb
8.0K    resaved/code/__torch__/caffe2/torch
8.0K    resaved/code/__torch__/caffe2
1.3M    resaved/code/__torch__/torch/fx/graph_module
1.3M    resaved/code/__torch__/torch/fx
8.0K    resaved/code/__torch__/torch/classes
1.4M    resaved/code/__torch__/torch
1.4M    resaved/code/__torch__
1.4M    resaved/code
2.7M    resaved/constants
13M     resaved
[qihan@devvm5585.vll0 ~]$
```

Reviewed By: JasonHanwen

Differential Revision: D33994011

fbshipit-source-id: 8e6224c6e942e91c3403f686c8f0937d1002ed41
(cherry picked from commit a7014dd4029308c95007f362a57c31796d686647)
2022-02-24 09:31:16 +00:00
Edgar Andrés Margffoy Tuay
86deecd7be Check clang++/g++ version when compiling CUDA extensions (#63230)
Summary:
See https://github.com/pytorch/pytorch/issues/55267

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63230

Reviewed By: soulitzer

Differential Revision: D34159119

Pulled By: malfet

fbshipit-source-id: 6eef7582388bf6a42dcc1d82b6e4b1f40f418dd7
(cherry picked from commit 2056d0a0be7951602de22f8d3b4efc28dd71b6c2)
2022-02-24 08:32:32 +00:00
Kevin Tse
38944a3c96 [DataPipe] Enable serialization of ForkerIterDataPipe (#73118)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73118

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D34354497

Pulled By: NivekT

fbshipit-source-id: 8f88ade2a568422e64c8252886cfd99d83ffbbf5
(cherry picked from commit 8497f4c31adab27072f53634cbe8e99003c4f7b2)
2022-02-23 16:31:21 +00:00
Kevin Tse
cd4ecce1bb [DataPipe] Fix issue with DataPipe serialization with dill (#72896)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72896

Fixing the issue described here: https://github.com/pytorch/data/issues/214

There will be a follow-up PR in TorchData as well

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D34258669

Pulled By: NivekT

fbshipit-source-id: 6dd88250ed14ebe779915dc46139be7e012e9d1b
(cherry picked from commit 025b8ed98019e576bfef04c33a3f33ed1a426a66)
2022-02-23 16:31:20 +00:00
Andrey Talman
46f9e16afe Documenting cuda 11.5 windows issue (#73013)
Summary:
Adding documentation about compiling extension with CUDA 11.5 and Windows

Example of failure: https://github.com/pytorch/pytorch/runs/4408796098?check_suite_focus=true

 Note: Don't use torch/extension.h in CUDA 11.5 under Windows in your C++ code:
    Use the ATen interface instead of the torch interface in all CUDA 11.5 code under Windows. The latter has been failing with errors due to a bug in nvcc.
    Example use:
        >>> #include <ATen/ATen.h>
        >>> at::Tensor SigmoidAlphaBlendForwardCuda(....)
    Instead of:
        >>> #include <torch/extension.h>
        >>> torch::Tensor SigmoidAlphaBlendForwardCuda(...)
    Currently open issue for nvcc bug: https://github.com/pytorch/pytorch/issues/69460
    Complete Workaround code example: cb170ac024

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73013

Reviewed By: malfet, seemethere

Differential Revision: D34306134

Pulled By: atalman

fbshipit-source-id: 3c5b9d7a89c91bd1920dc63dbd356e45dc48a8bd
(cherry picked from commit 87098e7f17)
2022-02-19 02:34:59 +00:00
Kevin Tse
f5e201e4e9 [DataPipe] Adding usage examples for IterDataPipes (#73033)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73033

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34313793

Pulled By: NivekT

fbshipit-source-id: 51125be2f79d73d02658b2b1c2691f96be8d4769
(cherry picked from commit 3e3c2df7c6)
2022-02-18 15:12:34 +00:00
Chen Lai
cee84f4051 fix model dump for the lowered module (#72866)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72866

https://github.com/pytorch/pytorch/pull/71597 adds a wrapper `torch.jit.LoweredWrapper`, which breaks the model dump. This fixes model_dump in the notebook.
ghstack-source-id: 149311636

Test Plan:
CI and test with N509022

Before:

{F701413403}

After:

{F701412963}

Reviewed By: iseeyuan

Differential Revision: D34247216

fbshipit-source-id: 695b02b03675fae596bb450441b327e4cdcffe9c
(cherry picked from commit d46a82a4c1)
2022-02-17 07:09:44 +00:00
Kevin Tse
87975d895c [DataPipe] Improve .pyi generation (#72829)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72829

Make two functions more flexible and usable from a different repo.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34227912

Pulled By: NivekT

fbshipit-source-id: 873934ed33caf485de7f56e9c4a1d3f3fa1a92ef
(cherry picked from commit b990c5e4c7)
2022-02-16 16:09:20 +00:00
Elijah Rippeth
78e481d07d add optional encoding argument to fileopener (#72715)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/72713

TODO: add test

cc ejguan

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72715

Reviewed By: samdow

Differential Revision: D34212650

Pulled By: ejguan

fbshipit-source-id: 78db4cc04ec0db8fd25b3d1e6c77eb0616075960
(cherry picked from commit c1898031c0)
2022-02-14 20:00:30 +00:00
Yuxin Wu
1ed4653e89 Stop writing logs to root logger (#72649)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/72648

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72649

Reviewed By: soulitzer

Differential Revision: D34172113

Pulled By: mrshenli

fbshipit-source-id: 98cb4140b978a0d9fa53876e427ea3b8bbe884cf
(cherry picked from commit c14297cee6)
2022-02-11 21:30:53 +00:00
Steven Troxler
c5f904aeb3 Convert type comments to annotations in caffe2/torch/util (#72667)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72667

Created by running
```
python -m libcst.tool codemod --no-format --jobs=1 convert_type_comments.ConvertTypeComments ~/fbsource/fbcode/caffe2/torch/utils/ --no-quote-annotations
```
and then manually cleaning up unreadable function headers (which is needed due to lack of autoformatting).

Test Plan:
Wait for CI - usually type annotations are safe to land, but the jit
compiler sometimes can choke if there's a problem.

Reviewed By: grievejia

Differential Revision: D34148011

fbshipit-source-id: 8f7c7a3b5ef78e0dea6d10ce70072f39e6d1ecc3
(cherry picked from commit 25a929ef8d)
2022-02-11 20:50:20 +00:00
Kevin Tse
3e1eff9a0e [DataPipe] Add docstrings for IterDataPipe and MapDataPipe, along with small doc changes for consistency (#72618)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72618

The major changes are in torch/utils/data/dataset.py

Let me know if anything is unclear. I'm open to suggestion.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D34119492

Pulled By: NivekT

fbshipit-source-id: 358cb6d33d18501f9042431350f872ebaa9b4070
(cherry picked from commit 53b484f60a)
2022-02-10 16:25:36 +00:00
Kevin Tse
b0dd2c2ef5 [DataPipe] Adding a note about FileLister behavior (#72619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72619

Adding a note. See Vitaly's comment here: https://github.com/pytorch/pytorch/pull/70602#discussion_r803025044

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D34119517

Pulled By: NivekT

fbshipit-source-id: e033817d63498ae480e854701d7fdd4ebc9dfb56
(cherry picked from commit 6bb1c0087b)
2022-02-10 15:49:34 +00:00
Kevin Tse
8886ed2dd5 [DataPipe] Fixing MapDataPipe docstrings (#72476)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72476

To render the change in documentation, please pull down this PR and build the doc in `TorchData`.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34078078

Pulled By: NivekT

fbshipit-source-id: f478d5a901f8364ab6de1fd4bf8a8ab1036b2231
(cherry picked from commit e00a9967e9)
2022-02-08 22:52:27 +00:00
Kevin Tse
b4e5b4d92e [DataPipe] Fixing IterDataPipe docstrings (#72475)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72475

To render the change in documentation, please pull down this PR and build the doc in `TorchData`.

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34078064

Pulled By: NivekT

fbshipit-source-id: 0d3d02d5d05ecd774251cf8d04c40413660446f1
(cherry picked from commit 9f604689a4)
2022-02-08 22:52:27 +00:00
Brian Muse
8bf3179f6e #71946 Remove Python 3.6 references (#72211)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/71946

This commit removes some bits of code that were hard coded for Python 3.6 support from the `.circleci` and `torch` folders. It should only be merged if https://github.com/pytorch/pytorch/issues/66462 is complete.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72211

Reviewed By: dagitses, seemethere

Differential Revision: D33982604

Pulled By: musebc

fbshipit-source-id: 8f453bf9909df615addd59538adb369c65484044
(cherry picked from commit 944a9970fe)
2022-02-08 03:46:20 +00:00
Louis Feng
83b3b5fb00 [PyTorch] Support NVTX range_start and range_end (#70030)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70030

range_push and range_pop do not support multi-threading; they only work when the push and the pop of a range happen on the same thread.

For process-level ranges, we should use range_start and range_end instead. This is important because the PyTorch forward pass runs on one thread while autograd runs on a different thread.

See the NVIDIA implementation documentation:
cab2dec760/NSight/nvToolsExt.h (L397-L407)
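
A minimal usage sketch, assuming the `torch.cuda.nvtx.range_start`/`range_end` bindings this PR adds:
```py
import torch

x = torch.randn(8, 8, device="cuda", requires_grad=True)
handle = torch.cuda.nvtx.range_start("fwd+bwd")  # opened on the forward thread
loss = (x * x).sum()
loss.backward()                                  # autograd runs on another thread
torch.cuda.nvtx.range_end(handle)                # safe to close across threads
```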

Test Plan:
```
buck test caffe2/test:cuda

Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/8162774391483460
    ✓ ListingSuccess: caffe2/test:cuda - main (19.640)
Summary
  ListingSuccess: 1
If you need help understanding your runs, please follow the wiki: https://fburl.com/posting_in_tpx_users
Finished test run: https://www.internalfb.com/intern/testinfra/testrun/8162774391483460
```

Reviewed By: malfet

Differential Revision: D33155244

fbshipit-source-id: c7d5143f6da9b6ef0e0811e2fcae03a3e76f24de
(cherry picked from commit 22134e91b7)
2022-02-07 17:31:57 +00:00
Erjia Guan
6297aa114f [DataPipe] Extend FileLister to support load multiple directories (#72260)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72260

Test Plan: Imported from OSS

Reviewed By: dagitses, NivekT

Differential Revision: D33979744

Pulled By: ejguan

fbshipit-source-id: 5733d20382642fc2274afd838b33c98150d81e91
(cherry picked from commit f70537ae76)
2022-02-04 07:55:00 +00:00
Erjia Guan
4fc6ab5e81 [DataPipe] Fix OOM when traverse IterDataPipe due to pickling (#72209)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72209

This PR would fix https://fburl.com/jtfyksyr

Test Plan: Imported from OSS

Reviewed By: dagitses, NivekT

Differential Revision: D33955933

Pulled By: ejguan

fbshipit-source-id: 120203a3c2323a0c7081715bb6628d1768f8b1c4
(cherry picked from commit 469f3d0562)
2022-02-03 22:55:04 +00:00
Erjia Guan
7b014cc645 [DataPipe] Disable Typing for DataPipe before branch cut (#72123)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72123

There is a bug in the DataPipe typing system that would take more than 1 week to fix. I will follow up on it later this month. As the branch cut is today, this PR disables typing to make sure the release works.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D33920610

Pulled By: ejguan

fbshipit-source-id: febff849ab2272fd3b1c5127a20f27eb82992d9c
(cherry picked from commit ee103e62e7)
2022-02-02 05:00:41 +00:00
Erjia Guan
67a275c293 Fix persistent worker exits before pin_memory thread (#71579)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71579

Fixes #1551

As the comment in the code explains, register a function to terminate the persistent workers.
By holding a reference to these workers in `atexit`, we prevent the Python interpreter from killing these persistent worker processes before the `pin_memory_thread` exits.
And if a user explicitly kills the DataLoader iterator, the function registered in `atexit` becomes a no-op.
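
A self-contained sketch of the mechanism with a stand-in worker process (illustrative only, not DataLoader internals):
```py
import atexit
import multiprocessing as mp
import time

def _clean_up_worker(w):
    # No-op when the iterator has already shut the worker down explicitly.
    if w.is_alive():
        w.terminate()
        w.join()

if __name__ == "__main__":
    worker = mp.Process(target=time.sleep, args=(60,))
    worker.start()
    # Registering the hook keeps a reference to `worker`, so the interpreter
    # cannot reap it before the hook (and the pin_memory thread) has run.
    atexit.register(_clean_up_worker, worker)
```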

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D33896537

Pulled By: ejguan

fbshipit-source-id: 36b57eac7523d8aa180180c2b61fc693ea4638ae
(cherry picked from commit 05add2ae0f)
2022-02-01 23:57:17 +00:00
Santiago Castro
5024c1bc7b Make get_file_pathnames_from_root output order deterministic (#70435)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/70103

I added an argument so the behavior can be disabled. I called it `deterministic_order` rather than `sort` because the latter can be confusing: the output is sorted, but by directory levels.
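
A sketch of the idea on top of `os.walk` (illustrative helper, not the exact `get_file_pathnames_from_root` code):
```py
import os

def file_pathnames(root, deterministic_order=True):
    for dirpath, dirnames, filenames in os.walk(root):
        if deterministic_order:
            dirnames.sort()   # sorting in place also fixes traversal order
            filenames.sort()
        for name in filenames:
            yield os.path.join(dirpath, name)
```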

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70435

Reviewed By: albanD

Differential Revision: D33899755

Pulled By: ejguan

fbshipit-source-id: e8a08f03a49120333b2d27f332cd21a3240a02a9
(cherry picked from commit 4616e43ec3)
2022-02-01 18:12:23 +00:00
Jinze (Richard) Xue
1a30954f44 CUDA TopK Optimization: use multiple block per slice (#71081)
Summary:
# Overview
Currently the CUDA topk implementation uses only 1 block per slice, which limits performance for big slices. This PR addresses that issue.

There are 2 parts to the topk calculation: finding the kth value (`radixFindKthValues`) in each slice, then gathering the topk values (`gatherTopK`) based on the kth value. The `radixFindKthValues` kernel now supports multiple blocks. `gatherTopK` may also need a multi-block version (separate PR?).

kthvalue, quantile, median could also use the same code (separate PR).

# Benchmark

Benchmark result with input `x = torch.randn((D1, D2), dtype=torch.float32)` and `k = 2000` on RTX 3080: https://docs.google.com/spreadsheets/d/1BAGDkTCHK1lROtjYSjuu_nLuFkwfs77VpsVPymyO8Gk/edit?usp=sharing

Benchmark plot: left is the multi-block version; right is dispatched based on the heuristics derived from the above Google sheet.
https://user-images.githubusercontent.com/9999318/150860547-7e450ed2-df09-4292-a02a-cb0e1040eebe.png
https://user-images.githubusercontent.com/9999318/150860579-672b88ca-e500-4846-825c-65d31d126df4.png

The performance of the divide-and-conquer implementation at https://github.com/pytorch/pytorch/pull/39850 is not stable as the D1, D2 sizes increase; for more detail please check the above Google sheet.

https://user-images.githubusercontent.com/9999318/150860563-21d5a5a3-9d6a-4cef-9031-cac4d2d8edee.png

# cubin binary size
The cubin binary size for TensorTopK.cubin (topk) and Sorting.cubin (kthvalue, quantile, etc.) has been reduced by removing `#pragma unroll` at [SortingRadixSelect.cuh](https://github.com/pytorch/pytorch/pull/71081/files#diff-df06046dc4a2620f47160e1b16b8566def855c0f120a732e0d26bc1e1327bb90L321) and the `largest` template argument, without much performance regression.

The final binary size before and after the PR is
```
# master
-rw-rw-r-- 1 richard richard  18M Jan 24 20:07 TensorTopK.cu.1.sm_86.cubin
-rw-rw-r-- 1 richard richard  16M Jan 24 20:07 Sorting.cu.1.sm_86.cubin
# this PR
-rw-rw-r-- 1 richard richard 5.0M Jan 24 20:11 TensorTopK.cu.1.sm_86.cubin
-rw-rw-r-- 1 richard richard 2.5M Jan 24 20:11 Sorting.cu.1.sm_86.cubin
```

script to extract cubin
```
# build with REL_WITH_DEB_INFO=0
# at pytorch directory
cubin_path=build/caffe2/CMakeFiles/torch_cuda.dir/__/aten/src/ATen/native/cuda/cubin; mkdir -p $cubin_path; cd $cubin_path; find ../ -type f -name '*cu.o' -exec cuobjdump {} -xelf all \; ; ls -lh *.cubin -S | head -70
```

# benchmark script
```py
import torch
import time
import torch
import pandas as pd
import numpy as np
import torch.utils.benchmark as benchmark

torch.manual_seed(1)
dtype = torch.float
data = []

for d1 in [1, 20, 40, 60, 80, 100, 200, 400, 800, 1000, 2000, 4000, 6000, 8000, 10000, 100000, 500000]:
    if d1 <= 1000:
        D2 = [100, 200, 300, 400, 800, 1000, 2000, 3000, 4000, 5000, 8000, 10000, 20000, 30000, 40000, 80000, 100000, 200000, 300000, 400000, 500000]
    else:
        D2 = [100, 200, 300, 400, 800, 1000, 5000, 10000, 20000, 30000]
    for d2 in D2:
        k = 2000 if d2 >= 2000 else d2 // 2
        print(f"----------------- D1 = {d1}, D2 = {d2} -----------------")
        try:
            x = torch.randn((d1, d2), dtype=dtype, device="cuda")
            m = benchmark.Timer(
                stmt='x.topk(k=k, dim=1, sorted=False, largest=True)',
                globals={'x': x, 'k': k},
                num_threads=1,
            ).blocked_autorange(min_run_time=1)
            print(m)
            time_ms = m.median * 1000
        except RuntimeError: # OOM
            time_ms = -1
        data.append([d1, d2, k, time_ms])

df = pd.DataFrame(data=data, columns=['D1', 'D2', 'k', 'time(ms)'])
print(df)
df.to_csv('benchmark.csv')
```

plot script could be found at: https://github.com/yueyericardo/misc/tree/master/share/topk-script

cc zasdfgbnm ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71081

Reviewed By: albanD

Differential Revision: D33823002

Pulled By: ngimel

fbshipit-source-id: c0482664e9d74f7cafc559a07c6f0b564c9e3ed0
(cherry picked from commit be367b8d07)
2022-02-01 17:43:51 +00:00
Michael Carilli
cf3ef23713 Propagate full autocast state to CheckpointFunction's forward-inside-backward (#71169)
Summary:
Should fix https://github.com/pytorch/pytorch/issues/71124 (implements https://github.com/pytorch/pytorch/issues/71124#issuecomment-1009436056).

cc mcarilli ptrblck

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71169

Reviewed By: albanD

Differential Revision: D33793556

Pulled By: ngimel

fbshipit-source-id: 80a4b4f0657b922002e3446fb6b48f082fa98453
(cherry picked from commit cf9beee28b)
2022-01-27 00:31:53 +00:00
Vitaly Fedyunin
b36b11cbc1 Separating CaptureDataFrame out of DFIterDataPipe (#71776)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71776

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33771602

Pulled By: VitalyFedyunin

fbshipit-source-id: 59d85bc707a9568f1f0960fc184113a4f422d2df
(cherry picked from commit 93522768ef)
2022-01-26 03:25:02 +00:00
Ralf Gommers
9a2b43085d Improve docs for from_dlpack and to_dlpack (#70437)
Summary:
This moves the warning to the legacy function where it belongs, improves the phrasing, and adds examples.

There may be more to do to make `from_dlpack` more discoverable as a follow-up, because in multiple issues/PR we discovered people wanted new things (e.g., a memoryview-like object, or `__array_interface__` support) that `from_dlpack` already provides.
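
For instance, a small usage sketch of the non-legacy path, which needs no explicit capsule:
```py
import torch
from torch.utils.dlpack import from_dlpack

src = torch.arange(4)
dst = from_dlpack(src)  # src exposes __dlpack__, so no to_dlpack capsule needed
dst[0] = 99             # zero-copy: the write is visible through src
print(src)              # tensor([99,  1,  2,  3])
```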

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70437

Reviewed By: albanD

Differential Revision: D33760552

Pulled By: mruberry

fbshipit-source-id: e8a61fa99d42331cc4bf3adfe494cab13ca6d499
(cherry picked from commit 880ad96659)
2022-01-25 20:32:12 +00:00
Erjia Guan
bb157dd4eb Make methods of internal file_obj visible from StreamWrapper (#71653)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71653

Test Plan: Imported from OSS

Reviewed By: NivekT

Differential Revision: D33718749

Pulled By: ejguan

fbshipit-source-id: f3a8244f22ca37049b8678afa0e329b23c957a9d
(cherry picked from commit a4d12ca48e)
2022-01-25 15:34:24 +00:00
pyhuang97@gmail.com
16a9ffba4b Allow specifying num_samples to RandomSampler even when replacement=False (#71568)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/38032 #39214

Hi, I modified the RandomSampler to satisfy the requirement of https://github.com/pytorch/pytorch/issues/38032. I also added and deleted some test cases in test/test_dataloader.py to match the new requirement.
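
A usage sketch of the relaxed API:
```py
from torch.utils.data import RandomSampler

# Draw a fixed number of indices per epoch without replacement, even though
# num_samples exceeds the dataset length (permutations repeat as needed).
sampler = RandomSampler(range(100), replacement=False, num_samples=250)
print(len(list(sampler)))  # 250
```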

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71568

Reviewed By: mikaylagawarecki

Differential Revision: D33741776

Pulled By: ejguan

fbshipit-source-id: 2d25f5096b7b36ad9fb6455107182f387cf8ee43
(cherry picked from commit 9c7e1891c2)
2022-01-25 15:34:24 +00:00
Nikita Shulga
86aefdc082 Revert D33694867: Fix persistent worker exits before pin_memory thread
Test Plan: revert-hammer

Differential Revision:
D33694867 (e2191e7084)

Original commit changeset: 0847f4d424a0

Original Phabricator Diff: D33694867 (e2191e7084)

fbshipit-source-id: 5f28616700d8647cbe468a9e300724a7f0c6cc15
(cherry picked from commit 3d8125ba6d)
2022-01-22 00:09:28 +00:00
Erjia Guan
e2191e7084 Fix persistent worker exits before pin_memory thread (#71579)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71579

Fixes #1551

As the comment in the code explains, register a function to terminate the persistent workers. Using `atexit` makes sure that termination of the persistent workers always happens at the end (after the pin_memory_thread exits).
We need such a mechanism because, in some rare cases, the Python interpreter would clean up the worker processes before the DataLoader iterator.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D33694867

Pulled By: ejguan

fbshipit-source-id: 0847f4d424a0cd6b3c0be8235d505415970254e8
(cherry picked from commit 18ad4621af)
2022-01-21 20:31:16 +00:00
Kevin Tse
13ea2cb330 [DataPipe] Make GroupBy serializable with lambda function (#71497)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71497

Related to https://github.com/pytorch/data/issues/172

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33668749

Pulled By: NivekT

fbshipit-source-id: 6506614e9d4389dc645d8985c00fdb3402122d9b
(cherry picked from commit 458e76fcb1)
2022-01-21 16:04:45 +00:00
Kevin Tse
36b4c95e74 [DataPipe] adding serialization test for all core IterDataPipes (#71456)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71456

Related to https://github.com/pytorch/data/issues/172

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D33668748

Pulled By: NivekT

fbshipit-source-id: ea2085d5ed47533ca49258cc52471373c6ae1847
(cherry picked from commit d5f6fde1d0)
2022-01-21 16:04:45 +00:00
Nikita Shulga
dc5cda0cca Update min python version to 3.7 in setup.py and mypy configs (#71494)
Summary:
As Python-3.6 have reached EOL

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71494

Reviewed By: atalman

Differential Revision: D33667509

Pulled By: malfet

fbshipit-source-id: ab1f03085cfb9161df77ba5ce373b81f5e7ef3ae
(cherry picked from commit 60343166d9)
2022-01-20 00:03:57 +00:00
Nikita Shulga
8a9243996c Lazy load pandas when importing pytorch (#71316)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/71313

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71316

Reviewed By: wenleix

Differential Revision: D33595043

Pulled By: malfet

fbshipit-source-id: da8c7a7f132696645191d7b7055c4c21970d92c3
(cherry picked from commit 2d4847780a)
2022-01-19 17:02:50 +00:00
Erjia Guan
fd9e08df5d Make Demux serializable with lambda function (#71311)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71311

Test Plan: Imported from OSS

Reviewed By: NivekT

Differential Revision: D33584552

Pulled By: ejguan

fbshipit-source-id: 52324faf5547f9f77582ec170ec91ce3114cfc61
2022-01-18 06:47:54 -08:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
1ecfa1d61a Load zip file in deploy interpreter (#71072)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71072

This PR replaces the old logic of loading frozen torch through CPython by loading zipped torch modules directly onto the deploy interpreter. We use the ELF file to embed the zip file as a section of the interpreter executable and load it back from there. Then we insert the zip file directly into the sys.path of each initialized interpreter. Python's implicit ZipImporter can load modules from a zip file as long as it is on sys.path.

Test Plan: buck test //caffe2/torch/csrc/deploy:test_deploy

Reviewed By: shunting314

Differential Revision: D32442552

fbshipit-source-id: 627f0e91e40e72217f3ceac79002e1d8308735d5
2022-01-15 14:39:59 -08:00
Santiago Castro
d74bb42f7a Add a missing precondition to DistributedSampler docstring (#70104)
Summary:
The DistributedSampler sets different indices for different processes. By doing this, it assumes that the data is the same across the board and in the same order. This may seem trivial; however, there are times when users don't guarantee the order the items will have, because they rely on something such as the order in which the filesystem lists a directory (which is not guaranteed and may vary across computers), or the order in which a `set` is iterated.

I think it's better to make this clearer.
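
A small sketch of the precondition in practice (the `data/` path is hypothetical):
```py
import os
from torch.utils.data.distributed import DistributedSampler

# os.listdir order is filesystem-dependent, so sort before sampling to
# guarantee every rank indexes the same sequence.
files = sorted(os.listdir("data/"))
sampler = DistributedSampler(files, num_replicas=2, rank=0, shuffle=True)
```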

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70104

Reviewed By: bdhirsh

Differential Revision: D33569539

Pulled By: rohan-varma

fbshipit-source-id: 68ff028cb360cadaee8c441256c1b027a57c7089
2022-01-14 13:55:12 -08:00
Kevin Tse
1e3893ecbb [DataPipe] Removing deprecated DataPipes (#71161)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71161

Users should import these DataPipes from [TorchData](https://github.com/pytorch/data) if they would like to use them. We will be checking for any downstream library usage before landing this PR.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33532272

Pulled By: NivekT

fbshipit-source-id: 9dbfb21baf2d1183e0aa379049ad8304753e08a1
2022-01-13 07:37:48 -08:00
naturomics
5749be4678 Fix the shape inconsistency of out and elem tensor (#71065)
Summary:
See bug report  https://github.com/pytorch/pytorch/issues/71063

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71065

Reviewed By: anjali411

Differential Revision: D33549921

Pulled By: ejguan

fbshipit-source-id: bc43f5f9a88f7dcd8729d0e0f4b90d20f40b3064
2022-01-12 13:57:19 -08:00
Kevin Tse
6a40bb0fdf [DataPipe] Update deprecation warning (#71171)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71171

Editing two warnings to more accurately portray the deprecation plan for the DataPipes

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33535785

Pulled By: NivekT

fbshipit-source-id: b902aaa3637ade0886c86a57b58544ff7993fd91
2022-01-12 09:34:53 -08:00
Erjia Guan
ac0d131291 Deprecating routed decoder (#70990)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70990

Releasing the `decode` API to domains so they can implement custom `decode` DataPipes for now.

Test Plan: Imported from OSS

Reviewed By: NivekT

Differential Revision: D33477620

Pulled By: ejguan

fbshipit-source-id: d3c30ba55c327f4849d56f42d328a932a31777ed
2022-01-11 06:56:48 -08:00
Erjia Guan
0721fc6474 Decouple MapDataPipe from Dataset (#70991)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70991

Test Plan: Imported from OSS

Reviewed By: dagitses

Differential Revision: D33477680

Pulled By: ejguan

fbshipit-source-id: d3e89492e921a96791319f35052a229684ddf7cf
2022-01-07 14:28:41 -08:00
Shintaro Iwasaki
4fa70a2483 [pytorch] fix hipify_python (#70619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70619

This Diff improves `hipify_python`, which is needed for AMD GPUs.

Change 1:
```
if (c == "," or ind == len(kernel_string) - 1) and closure == 0:
```
This is needed to deal with the following case (ex: https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/test/cuda_vectorized_test.cu#L111)
```
kernel<<<val, func()>>>(...)
// In this case, kernel_string is "val, func()"
// so closure gets 0 when ind == len(kernel_string) - 1.
```
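
In plain Python terms, the splitting rule being fixed looks roughly like this (an illustrative sketch, not the actual hipify code):
```py
def split_kernel_args(kernel_string):
    args, start, closure = [], 0, 0
    for ind, c in enumerate(kernel_string):
        closure += c in "(<["
        closure -= c in ")>]"
        # Flush on a top-level comma, or when the end of the string is
        # reached with all brackets closed (the case this change handles).
        if (c == "," or ind == len(kernel_string) - 1) and closure == 0:
            end = ind if c == "," else ind + 1
            args.append(kernel_string[start:end].strip())
            start = ind + 1
    return args

print(split_kernel_args("val, func()"))  # ['val', 'func()']
```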

Change 2:
```
mask_comments()
```
This is needed to deal with a case where "<<<" is included in a comment or a string literal (ex: https://github.com/pytorch/pytorch/blob/master/torch/csrc/deploy/interpreter/builtin_registry.cpp#L71)
```
abc = "<<<XYZ>>>"
// Though this <<<XYZ>>> is irrelevant to CUDA kernels,
// the current script attempts to hipify this and fails.
```

Test Plan:
This patch fixes errors I encountered by running
```
python3 tools/amd_build/build_amd.py
```

I confirmed, with Linux `diff`, that this patch does not change HIP code that was generated successfully with the original script.

Reviewed By: hyuen

Differential Revision: D33407743

fbshipit-source-id: bec822e040a154be4cda1c294536792ca8d596ae
2022-01-06 13:27:43 -08:00
Kevin Tse
8dcfdf39e7 [DataPipe] Renaming FileLoader to FileOpener with deprecation warning for FileLoader (#70367)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70367

This PR renames the `FileLoaderIterDataPipe` to `FileOpenerIterDataPipe`. For the sake of not breaking many CI tests immediately, it still preserves `FileLoader` as an alias. This will allow downstream libraries/users to migrate their use cases before we fully remove all references to `FileLoader` from PyTorch.

Fixes https://github.com/pytorch/data/issues/103. More detailed discussion about this decision is also in the linked issue.

cc VitalyFedyunin ejguan NivekT pmeier Nayef211

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33301648

Pulled By: NivekT

fbshipit-source-id: 59278dcd44e372df0ba2001a4eecbf9792580d0b
2022-01-04 09:14:50 -08:00
Kevin Tse
75dbe88b05 [DataPipe] removing unbatch_level from .groupby (#70249)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70249

IMO, the `unbatch_level` argument is not needed here since users can simply call `.unbatch` before calling `.groupby` if needed. One small step closer to a unified API with other libraries.

Note that we may rename the functional name from `.groupby` to `.group` in the future. TBD.
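
A sketch of the recommended composition after this change:
```py
from torch.utils.data.datapipes.iter import IterableWrapper

# Unbatch first, then group, instead of passing an unbatch_level to .groupby.
dp = IterableWrapper([[0, 1], [2, 3]]).unbatch()
groups = dp.groupby(lambda x: x % 2)
print(list(groups))  # groups by parity, e.g. [[0, 2], [1, 3]]
```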

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33259104

Pulled By: NivekT

fbshipit-source-id: 490e3b6f5927f9ebe8772d5a5e4fbabe9665dfdf
2021-12-22 07:13:12 -08:00
Kevin Tse
74c834e0dc [DataPipe] adding a finally statement to ensure hook is reset (#70214)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70214

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33255306

Pulled By: NivekT

fbshipit-source-id: de2fe6bf08328e481c714aaad390db771073469e
2021-12-21 15:21:04 -08:00
Taylor Robie
978089c381 Prevent divide-by-zero errors in Timer (#70050)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/66503

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70050

Reviewed By: mruberry

Differential Revision: D33168868

Pulled By: robieta

fbshipit-source-id: 7d0ece9e888f6c69a9e0ced581c92d3259fb3540
2021-12-20 09:16:03 -08:00
Kevin Tse
ad0cd8a76e [DataPipe] Improve inline doc and testing for CollatorIterDataPipe (#70139)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70139

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D33199107

Pulled By: NivekT

fbshipit-source-id: f96d77490998ac9bc3da8d4ff1a9caa08e9e7f27
2021-12-20 08:05:21 -08:00
Kevin Tse
3d51c88032 [DataPipe] Unifying API - removing options to have fn_args and fn_kwargs from MapDataPipes (#69561)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69561

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D32952099

Pulled By: NivekT

fbshipit-source-id: 95b725774a9d04d655e2542760726908f33043f4
2021-12-16 18:11:00 -08:00
Kevin Tse
b89c283c80 [DataPipe] Unifying API - removing options to have fn_args and fn_kwargs from IterDataPipes (#69560)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69560

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D32952100

Pulled By: NivekT

fbshipit-source-id: e0cc31408c7cf3220fe274feed1c7202a1aaae70
2021-12-16 18:09:52 -08:00
Nikita Shulga
d71b8e1a8d More distutils.version.LooseVersion changes (#69947)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69947

Reviewed By: seemethere

Differential Revision: D33111996

Pulled By: malfet

fbshipit-source-id: e7d2cc4ed3e39452e809965e360b05f0b409ec0d
2021-12-15 08:07:36 -08:00
Kevin Tse
b67eaec853 [DateLoader] more clearly expose 'default_collate' and 'default_convert' to users (#69862)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69862

Fixes #69445

cc SsnL VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan, ngimel

Differential Revision: D33068792

Pulled By: NivekT

fbshipit-source-id: ef9791acdc23d014b8761fa7420062d454ce8969
2021-12-14 11:18:26 -08:00
Vitaly Fedyunin
d90012689f [DataPipe] Control shuffle settings from DataLoader2 (#65756)
Summary:
Makes `shuffle` DataPipe sensitive to DataLoader(2) `shuffle` kwarg.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65756

Reviewed By: albanD

Differential Revision: D31344867

Pulled By: VitalyFedyunin

fbshipit-source-id: e0084e0ac193ac784d6298328ca1222745681347
2021-12-14 07:35:26 -08:00
Erjia Guan
2b81ea4f9a [DataPipe] Export ShardingFilter (#69844)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69844

Test Plan: Imported from OSS

Reviewed By: NivekT

Differential Revision: D33062183

Pulled By: ejguan

fbshipit-source-id: 6b3f4ad376959c4d2e8c8b2751ae6657527dcd36
2021-12-13 19:30:56 -08:00
Jithun Nair
8dfdc3df82 [ROCm] Refactor how to specify AMD gpu targets using PYTORCH_ROCM_ARCH (#61706)
Summary:
Remove all hardcoded AMD gfx targets

PyTorch build and Magma build will use rocm_agent_enumerator as
backup if PYTORCH_ROCM_ARCH env var is not defined

PyTorch extensions will use same gfx targets as the PyTorch build,
unless PYTORCH_ROCM_ARCH env var is defined

torch.cuda.get_arch_list() now works for ROCm builds

PyTorch CI dockers will continue to be built for gfx900 and gfx906 for now.

The PYTORCH_ROCM_ARCH env var can be a space- or semicolon-separated list of gfx archs, e.g. "gfx900 gfx906" or "gfx900;gfx906"
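
From Python, the targets of the current build can be checked (a quick sketch):
```py
import torch

# After this change, get_arch_list() also works for ROCm builds.
if torch.version.hip is not None:
    print(torch.cuda.get_arch_list())  # e.g. ['gfx900', 'gfx906']
```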
cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61706

Reviewed By: seemethere

Differential Revision: D32735862

Pulled By: malfet

fbshipit-source-id: 3170e445e738e3ce373203e1e4ae99c84e645d7d
2021-12-13 15:41:40 -08:00
Kevin Tse
a5a7e30943 [DataPipe] Adding interface for MapDataPipes (#69648)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69648

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D32989066

Pulled By: NivekT

fbshipit-source-id: ef96bcd4ac4d7a576fdd2a3fb4ef52ae6a902e10
2021-12-10 12:06:08 -08:00
Kevin Tse
81a60b9813 [DataPipe] Adding output types to DataPipe interface file (#69647)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69647

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D32989067

Pulled By: NivekT

fbshipit-source-id: 2c2e71e9e514e0d584affaa0b71b7b0d07a2ddbf
2021-12-10 12:04:45 -08:00
Kevin Tse
39fb855d91 [DataLoader] Implementing communication processes for Map-style DataPipes (#68549)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68549

cc SsnL VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D32922676

Pulled By: NivekT

fbshipit-source-id: fd918a342214d617a489ac5acffff15b55e9b255
2021-12-08 07:27:01 -08:00
Rohan Varma
049debd97d [Reland][Autograd/Checkpoint] Checkpoint implementation without reentrant autograd (#69508)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69508

Original Phabricator Diff: D32704467 (e032dae329)

Reland, fix is to not test traditional checkpoint when input does not require grad as that is unsupported as documented.

Original PR body:

Resubmission of https://github.com/pytorch/pytorch/pull/62964 with the
suggestions and tests discussed in
https://github.com/pytorch/pytorch/issues/65537.

Adds a `use_reentrant=False` flag to the `checkpoint` function. When
`use_reentrant=False` is specified, a checkpointing implementation that uses
SavedVariableHooks instead of re-entrant autograd is used. This makes it more
composable with things such as `autograd.grad` as well as DDP (still need to
add thorough distributed testing).
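
A usage sketch of the new flag (not taken from the PR's tests):
```py
import torch
from torch.utils.checkpoint import checkpoint

x = torch.randn(4, requires_grad=True)
# The non-reentrant implementation composes with autograd.grad.
y = checkpoint(lambda t: (t * t).sum(), x, use_reentrant=False)
(grad,) = torch.autograd.grad(y, x)
print(grad)  # equals 2 * x
```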

As discussed in https://github.com/pytorch/pytorch/issues/65537, the tests that we need to add are:

- [x] Gradient hooks are called once
- [x] works when input does require grads but Tensor that require grads are captures (like first layer in a nn)
- [x] works for functions with arbitrary input/output objects
- [x] distributed tests (next PR)

Note that this is only for `torch.utils.checkpoint`, if this approach overall looks good, we will do something similar for `checkpoint_sequential`.
ghstack-source-id: 144948501

Test Plan: CI

Reviewed By: zhaojuanmao

Differential Revision: D32902634

fbshipit-source-id: 2ee87006e5045e5471ff80c36a07fbecc2bea3fe
2021-12-07 16:31:23 -08:00
Kevin Tse
bd8d4195a6 [DataPipe] Small change to generation script and update to DataPipe .pyi file (#69392)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69392

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D32849463

Pulled By: NivekT

fbshipit-source-id: b6d419fbe0e4cc9d718f21fb3fe886f721f618d3
2021-12-07 11:40:53 -08:00
Kevin Tse
fdfdafd1e6 [DataPipe] Removing usage of unbatch_level from .batch interface and DataFrame (#69393)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69393

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D32849461

Pulled By: NivekT

fbshipit-source-id: 16abbe289ad2092faaa029fd78f3d6924e7b2ff4
2021-12-07 11:40:50 -08:00
Kevin Tse
357160e68e [DataPipe] Unifying API - removing nesting_level argument from FilterIterDataPipe (#69391)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69391

As part of the efforts to unify the APIs across different data backends (e.g. TorchData, TorchArrow), we are making changes to different DataPipes' APIs. In this PR, we are removing the input argument `nesting_level` from `FilterIterDataPipe`.

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D32849462

Pulled By: NivekT

fbshipit-source-id: 91cf1dc03dd3d3cbd7a9c6ccbd791ade91355f30
2021-12-07 11:40:46 -08:00
Kevin Tse
4478b14e4c [DataPipe] Unifying API - removing nesting_level argument from MapperIterDataPipe (#69390)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69390

As part of the efforts to unify the APIs across different data backends (e.g. TorchData, TorchArrow), we are making changes to different DataPipes' APIs. In this PR, we are removing the input argument `nesting_level` from `MapperIterDataPipe`.

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D32849465

Pulled By: NivekT

fbshipit-source-id: 963ce70b84a7658331d126e5ed9fdb12273c8e1f
2021-12-07 11:39:08 -08:00
Michael Suo
59e98b66ac Revert D32704467: [Autograd/Checkpoint] Checkpoint implementation without reentrant autograd
Test Plan: revert-hammer

Differential Revision:
D32704467 (e032dae329)

Original commit changeset: 6eea1cce6b93

fbshipit-source-id: 1a788c1fd57cee46bba82e216e6162d078359cc2
2021-12-06 16:33:32 -08:00
Rohan Varma
e032dae329 [Autograd/Checkpoint] Checkpoint implementation without reentrant autograd (#69027)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69027

Resubmission of https://github.com/pytorch/pytorch/pull/62964 with the
suggestions and tests discussed in
https://github.com/pytorch/pytorch/issues/65537.

Adds a `use_reentrant=False` flag to the `checkpoint` function. When
`use_reentrant=False` is specified, a checkpointing implementation that uses
SavedVariableHooks instead of re-entrant autograd is used. This makes it more
composable with things such as `autograd.grad` as well as DDP (still need to
add thorough distributed testing).

As discussed in https://github.com/pytorch/pytorch/issues/65537, we have added
the following tests:

- [ ] Gradient hooks are called once
ghstack-source-id: 144644859

Test Plan: CI

Reviewed By: pbelevich

Differential Revision: D32704467

fbshipit-source-id: 6eea1cce6b935ef5a0f90b769e395120900e4412
2021-12-06 13:29:37 -08:00
Kevin Tse
6baaec30cd [DataPipe] Adding ShufflerMapDataPipe (#68606)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68606

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D32813290

Pulled By: NivekT

fbshipit-source-id: 8d1ebd5bc776563c23250f76a2efc1d395f1af9c
2021-12-03 11:36:33 -08:00
Nikita Shulga
bede18b061 Add support for C++ frontend wrapper on Linux (#69094)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69094

Partially addresses https://github.com/pytorch/pytorch/issues/68768

Test Plan: Imported from OSS

Reviewed By: seemethere

Differential Revision: D32730079

Pulled By: malfet

fbshipit-source-id: 854e4215ff66e087bdf354fed7a17e87f2649c87
2021-12-02 16:47:00 -08:00
Kevin Tse
0465f64bb8 [DataPipe] Adding BatcherMapDataPipe (#68197)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68197

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D32440963

Pulled By: NivekT

fbshipit-source-id: 277cbe8d735afe341a7c189be20e1d334ecf9d4a
2021-12-02 07:27:17 -08:00
Kevin Tse
d8a44270d6 [DataPipe] Simplify BatcherIterDataPipe by removing 'unbatch_level' argument and functionality (#68594)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68594

Based on my conversation with ejguan [here](https://github.com/pytorch/pytorch/pull/68197#pullrequestreview-809148827), we both believe that having the `unbatch_level` argument and functionality makes this DataPipe unnecessarily complicated, because users can simply call `.unbatch` before `.batch` if they would like to do so. That will likely be cleaner as well.

I also checked other libraries (for example, [TensorFlow](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#unbatch)), and I do not see them provide the ability to `unbatch` within the `batch` function either.

This PR simplifies the DataPipe by removing the argument.

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D32532594

Pulled By: NivekT

fbshipit-source-id: 7276ce76ba2a3f207c9dfa58803a48e320adefed
2021-12-01 22:00:31 -08:00
Mike Guo
23633bdb5c record the datapipe for each pieces of Dataset (#67613)
Summary:
Add record_function for each DataPipe.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67613

Reviewed By: H-Huang

Differential Revision: D32246672

Pulled By: ejguan

fbshipit-source-id: 02ef7e75748c5b84fdcbb103398532e1f2962fbf
2021-12-01 10:29:06 -08:00
Tim Poulsen
486ae5c733 Dataset & IterableDataset attribute errors prints attribute (#69021)
Summary:
The message is the same as the one from a standard AttributeError.
I thought it would be informative when the error is thrown.
Alternatively, in Python 3.10 one can set the keyword arguments 'name' and 'obj';
reference: https://github.com/python/cpython/blob/3.10/Doc/library/exceptions.rst#concrete-exceptions
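
A minimal sketch of the shape of the change (illustrative class; the message mirrors the standard AttributeError text):
```py
class Dataset:
    def __getattr__(self, attribute_name):
        # Name the missing attribute instead of raising a bare error.
        raise AttributeError(
            f"'{type(self).__name__}' object has no attribute '{attribute_name}'"
        )
```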

Fixes #{?}

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69021

Reviewed By: samdow

Differential Revision: D32730362

Pulled By: ejguan

fbshipit-source-id: 7132ba612fa6075aeffb9315ce651828e9a8e0bc
2021-12-01 10:16:31 -08:00
Nikita Shulga
c08e95dd9c Introduce IS_LINUX and IS_MACOS global vars (#69093)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69093

Test Plan: Imported from OSS

Reviewed By: samdow

Differential Revision: D32730080

Pulled By: malfet

fbshipit-source-id: aa3f218d09814b4edd96b01c7b57b85fd58c47fc
2021-12-01 09:47:38 -08:00
Nikita Shulga
f6f1b580f8 Fix mypy in cpp_extension.py (#69101)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69101

Test Plan: Imported from OSS

Reviewed By: atalman, janeyx99

Differential Revision: D32730081

Pulled By: malfet

fbshipit-source-id: 76ace65b51850b74b175a3c4688c05e107873e8d
2021-11-30 16:01:55 -08:00
Santiago Castro
f776f30780 Keep the sequence or mapping type in default_collate (#68779)
Summary:
`default_collate`, `default_convert`, and `pin_memory` convert sequences into lists. I believe they should keep the original type when possible (e.g., I have a class that inherits from `list`, which comes from a 3rd-party library that I can't change and which provides extra functionality).

Note it's easy to do when the type supports construction from an iterable, but that's not always the case (e.g., `range`).

Even though this can be accomplished with a custom `default_collate`/`default_convert`, 1) this is behavior they should support out of the box IMHO, and 2) `pin_memory` still does it.
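
A small sketch of the requested behavior (assuming the `torch.utils.data.default_collate` export, which #69862 elsewhere in this log exposes; `TaggedList` is a stand-in for the 3rd-party subclass):
```py
import torch
from torch.utils.data import default_collate

class TaggedList(list):
    pass

batch = [TaggedList([torch.tensor(0.), torch.tensor(1.)]),
         TaggedList([torch.tensor(2.), torch.tensor(3.)])]
out = default_collate(batch)
print(type(out))  # with this change: TaggedList, not plain list
```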

cc VitalyFedyunin ejguan NivekT

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68779

Reviewed By: wenleix

Differential Revision: D32651129

Pulled By: ejguan

fbshipit-source-id: 17c390934bacc0e4ead060469cf15dde815550b4
2021-11-29 13:14:20 -08:00
Ivan Yashchuk
61a4204d80 Sparse CSR CUDA: Add block torch.addmm when mat1 is sparse (#68707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68707

This PR adds a path for block CSR matrices for `torch.addmm`. cuSPARSE interface is restricted to 32-bit indices and square blocks.
My plan is to make everything work and tests passing using an unsafe constructor first, keeping it all private. Then discuss & implement constructors with block information separately unlocking the functions for wider use. Documentation will come with the update to constructors.

cc nikitaved pearu cpuhrsch IvanYashchuk ngimel

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D32650366

Pulled By: cpuhrsch

fbshipit-source-id: 430a9627901781ee3d2e2496097b71ec17727d98
2021-11-29 08:58:49 -08:00
Nikita Shulga
208e109dbf Revert D32633806: Sparse CSR CUDA: Add block torch.addmm when mat1 is sparse
Test Plan: revert-hammer

Differential Revision:
D32633806 (b28ddd72d3)

Original commit changeset: b98db0bd655c

fbshipit-source-id: 1c757628526bb1b88747257fc77d8b9cb996e502
2021-11-24 09:15:17 -08:00
Ivan Yashchuk
b28ddd72d3 Sparse CSR CUDA: Add block torch.addmm when mat1 is sparse (#68707)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68707

This PR adds a path for block CSR matrices for `torch.addmm`. cuSPARSE interface is restricted to 32-bit indices and square blocks.
My plan is to make everything work and tests passing using an unsafe constructor first, keeping it all private. Then discuss & implement constructors with block information separately unlocking the functions for wider use. Documentation will come with the update to constructors.

cc nikitaved pearu cpuhrsch IvanYashchuk ngimel

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D32633806

Pulled By: cpuhrsch

fbshipit-source-id: b98db0bd655cce651a5da457e78fca08619a5066
2021-11-23 22:55:46 -08:00
Erjia Guan
a66ff81837 [DataPipe] Optimize Grouper from N^2 to N (#68647)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68647

Fixes #68539

When all data from the source DataPipe is depleted, there is no need to yield the biggest group in the buffer.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D32562646

Pulled By: ejguan

fbshipit-source-id: ce91763656bc457e9c7d0af5861a5606c89965d5
2021-11-22 07:49:13 -08:00
Emilio Castillo
533e72e0a4 Fix DLPack CUDA stream convention (#67618)
Summary:
Apparently, for the array API, the CUDA default stream and the per-thread stream should be 1 and 2 instead of 0 and 1:

https://data-apis.org/array-api/latest/API_specification/array_object.html?dlpack-self-stream-none#dlpack-self-stream-none.

This caused a problem in the interop with CuPy https://github.com/cupy/cupy/pull/5970#discussion_r739912926.
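
In protocol terms, a minimal sketch of the consumer-side call with the array-API numbering:
```py
import torch

if torch.cuda.is_available():
    x = torch.ones(2, device="cuda")
    # 1 == legacy default stream, 2 == per-thread default stream (array API);
    # previously these were mapped from 0 and 1.
    capsule = x.__dlpack__(stream=1)
```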

cc rgommers leofang mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67618

Reviewed By: albanD

Differential Revision: D32521805

Pulled By: mruberry

fbshipit-source-id: 95777e4014e5edf1f88ba10adc03c6e34c13248d
2021-11-18 08:36:05 -08:00
Erjia Guan
4c87aa77d1 [DataPipe] Traverse DataPipe graph excluding primitive and callable (#67783)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67783

Add `getstate_hook` to exclude primitive objects and callables during serialization when `exclude_primitive` is enabled for `traverse`.
For graph traversal, we don't have to handle lambdas and other unpicklable objects.
This is used by `OnDiskCacheHolder` to trace the DataPipe graph.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D32146697

Pulled By: ejguan

fbshipit-source-id: 03b2ce981bb21066e807f57c167b77b2d0e0ce61
2021-11-15 06:46:31 -08:00
Kevin Tse
61a94495d9 [DataPipe] adding ZipperMapDataPipe (#68032)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68032

Part of #57031

cc VitalyFedyunin ejguan NivekT

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D32263058

Pulled By: NivekT

fbshipit-source-id: 13a30ee9d9779284a9fd9bb7222fc41253c6fe3b
2021-11-11 10:36:05 -08:00
Kevin Tse
803e88d418 [DataPipe] Fixing pickling issues with fork and demux (#67930)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67930

Fixes #67848

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D32222184

Pulled By: NivekT

fbshipit-source-id: 48871c45a855d92cd599e21f3b53827dd32c91ef
2021-11-09 07:54:02 -08:00
Kevin Tse
0b09d62cf3 [hackathon][DataPipe] adding .pyi file generation for torch.utils.data.datapipes (#67374)
Summary:
Stack from [ghstack](https://github.com/ezyang/ghstack):
* __->__ https://github.com/pytorch/pytorch/issues/67374

This is a work in progress.

Related TorchData issue: https://github.com/pytorch/data/issues/80

cc VitalyFedyunin ejguan NivekT

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67374

Reviewed By: H-Huang

Differential Revision: D32153211

Pulled By: NivekT

fbshipit-source-id: b4c61f191f20fd98ca44bb9e4f972c6d812994a0
2021-11-08 14:43:24 -08:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
9cacf2b718 Add custom zipper script to zip python modules for torch.deploy (#67006)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67006

Test Plan: nervouslaugh_

Reviewed By: shunting314

Differential Revision: D31822429

fbshipit-source-id: c2efeab1446fbeb70b98d4ee766fbc670cf091b0
2021-11-06 11:49:02 -07:00
Philip Meier
641ba36a4e fix annotation for Demultiplexer (#65998)
Summary:
cc SsnL VitalyFedyunin ejguan NivekT

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65998

Reviewed By: bdhirsh

Differential Revision: D32145926

Pulled By: ejguan

fbshipit-source-id: 60be3126fb9e73b8631b5040676264504e926707
2021-11-04 13:44:02 -07:00
Shashank Chaudhry
89c4e8c22b [NOOP][clangformat][codemod] Enable CLANGFORMAT for some folders in caffe2/* (#67746)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67746

Test Plan: Visual inspection. Sandcastle.

Reviewed By: zertosh

Differential Revision: D31986646

fbshipit-source-id: 91885c20c3cead3853c49abb9fe0a94a67f33cc8
2021-11-03 12:23:14 -07:00
Douglas Lehr
b8f07689f2 [ROCm] Enable frexp support for ROCm builds (#67226)
Summary:
The frexp function has been enabled in ROCm code.  Updating PyTorch
to enable this functionality.

cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67226

Reviewed By: jbschlosser

Differential Revision: D31984606

Pulled By: ngimel

fbshipit-source-id: b58eb7f226f6eb3e17d8b1e2517a4ea7297dc1d5
2021-10-28 12:42:09 -07:00