Commit Graph

254 Commits

Author SHA1 Message Date
bhushan
a6c4ea66dd Passing indices as a list to Subset instead of Tensor (#17649)
Summary:
Indices in Subset were stored as tensors earlier
passing as list in random_split to ensure integer indexing

fixes: #17466
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17649

Differential Revision: D14400250

Pulled By: soumith

fbshipit-source-id: cd20a959f33773c4babf8e861ea37ec61c2713a0
2019-03-10 09:23:53 -07:00
peterjc123
fe90ee9dc8 Add /MD to prevent linking errors on Windows
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17799

Differential Revision: D14385777

Pulled By: ezyang

fbshipit-source-id: 8c1d9f80c48399087f5fae4474690e6d80d740e6
2019-03-08 10:46:25 -08:00
peter
c78da0c6ed Enable using CMD when building cpp extensions on Windows
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17706

Differential Revision: D14346482

Pulled By: ezyang

fbshipit-source-id: 7c85e51c701f6c0947ad324ef19fafda40ae1cb9
2019-03-06 14:45:31 -08:00
Bryan He
01977c0a89 Change fake tqdm constructor to match real tqdm (#17636)
Summary:
Currently, the fake tqdm implementation requires an input (whereas real tqdm does not).

This caused a problem in torchvision (https://github.com/pytorch/vision/pull/770), and seems likely to cause minor irritations elsewhere.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17636

Differential Revision: D14296530

Pulled By: ezyang

fbshipit-source-id: bc077d898773c93dab34c985a7b30525a43e558a
2019-03-03 01:06:10 -08:00
Krishna Kalyan
d80f0a1f3a Add example to WeightedRandomSampler doc string (#17432)
Summary: Example for the weighted random sampler are missing [here](https://pytorch.org/docs/stable/data.html#torch.utils.data.WeightedRandomSampler)

Differential Revision: D14198642

Pulled By: soumith

fbshipit-source-id: af6d8445d31304011002dd4308faaf40b0c1b609
2019-02-23 20:29:06 -08:00
Olen ANDONI
be4ad3fe30 fix(typo): Change 'integeral' to 'integer'
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17396

Differential Revision: D14195023

Pulled By: soumith

fbshipit-source-id: 300ab68c24bfbf10768fefac44fad64784463c8f
2019-02-23 08:22:01 -08:00
jayleverett
016f212357 fix behavior of ConcatDataset w/ negative indices (#15756)
Summary:
Currently, when you pass a negative index to a `Dataset` created with `ConcatDataset`, it simply passes that index to the first dataset in the list. So if, for example, we took `concatenated_dataset[-1]`, this will give us the last entry of the *first* dataset, rather than the last entry of the *last* dataset, as we would expect.

This is a simple fix to support the expected behavior for negative indices.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15756

Reviewed By: ezyang

Differential Revision: D14081811

Pulled By: fmassa

fbshipit-source-id: a7783fd3fd9e1a8c00fd076c4978ca39ad5a8a2a
2019-02-14 13:02:54 -08:00
ptrblck
8abfd28f58 #16627 convert weights using torch.as_tensor to avoid warning (#17067)
Summary:
Minor change which fixes #16627
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17067

Differential Revision: D14078726

Pulled By: soumith

fbshipit-source-id: c04a5f1eff44e4a4b04b981f0ae8de6ff018515b
2019-02-13 20:54:29 -08:00
Pearu Peterson
7c1e4258a9 Workarounds to the lack of nvidia-smi and ldconfig programs in macosx (was PR 16968) (#16999)
Summary:
Fix issue #12174 for Mac OSX.

PS: This is a duplicate of PR #16968 that got messed up. Sorry for the confusion.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16999

Differential Revision: D14050669

Pulled By: zou3519

fbshipit-source-id: a4594c03ae8e0ca91a4836408b6c588720162c9f
2019-02-12 14:39:28 -08:00
Daniel
e5742494f6 Minor typo
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16980

Differential Revision: D14033686

Pulled By: gchanan

fbshipit-source-id: 9f7967defc6795640e14157d0b701b185061741f
2019-02-12 08:02:04 -08:00
Eskil Jörgensen
8042edcdb1 Make pin_memory and default_collate preserve namedtuples (#16440)
Summary:
Open issue: https://github.com/pytorch/pytorch/issues/3281
Corresponding PR (conflict): https://github.com/pytorch/pytorch/pull/4577

Another open issue: https://github.com/pytorch/pytorch/issues/14613
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16440

Differential Revision: D14020901

Pulled By: ezyang

fbshipit-source-id: 4abe817fc43c281a510715d311bad544511995d3
2019-02-11 08:47:33 -08:00
Michael Carilli
0742874643 Allow dataloader to accept a custom memory pinning function (#16743)
Summary:
Renewed attempt at https://github.com/pytorch/pytorch/pull/14171

From the original PR:
> Currently, the pin_memory_batch function in the dataloader will return a batch comprised of any unrecognized type without pinning the data, because it doesn't know how.
>
>This behavior was preventing us from overlapping data prefetching in Mask-RCNN, whose custom collate_fn returns a custom batch type.

The old PR allowed the user to implement batch pinning for custom batch and data types by passing a custom pin function to the dataloader.  slayton58 suggested a cleaner approach:  allow the user to define a `pin_memory` method on their custom types, and have `pin_memory_batch` [check for the presence of that method](https://github.com/pytorch/pytorch/pull/16743/files#diff-9f154cbd884fe654066b1621fad654f3R56) in the incoming batch as a fallback.  I've updated the test and docstrings accordingly.

The old PR was merged but then reverted due to weird cuda OOM errors on windows that may or may not have been related.  I have no idea why my changes would cause such errors (then or now) but it's something to keep an eye out for.

fmassa and yf225 who were my POCs on the old PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16743

Differential Revision: D13991745

Pulled By: ezyang

fbshipit-source-id: 74e71f62a03be453b4caa9f5524e9bc53467fa17
2019-02-10 19:37:53 -08:00
Pearu Peterson
7ce33c586d Robust determination of cudnn library and relevant conda packages. (#16859)
Summary:
This PR implements:
1. a fix to issue #12174 - determine the location of cudnn library using `ldconfig`
2. a fix to determine the installed conda packages (in recent versions of conda, the command `conda` is a Bash function that cannot be called within a python script, so using CONDA_EXE environment variable instead)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16859

Differential Revision: D14000399

Pulled By: soumith

fbshipit-source-id: 905658ecacb0ca0587a162fade436de9582d32ab
2019-02-07 20:34:46 -08:00
Rodrigo Berriel
d327965dac Fix pip list format in collect_env (#16798)
Summary:
Since pip 18.0 (2018-07-22), `legacy` is no longer a valid choice for `pip list --format` as can be seen in the [Release Notes](https://pip.pypa.io/en/stable/news/#id62). Therefore, the options now are: `columns`, `freeze` and `json`. With `legacy`, this is how it looked like:

```
[...]
Versions of relevant libraries:
[pip3] numpy (1.16.1)
[pip3] torch (1.0.1)
[pip3] torchvision (0.2.1)
[...]
```

Changing to `freeze`, this is how it looks like:

```
[...]
Versions of relevant libraries:
[pip3] numpy==1.16.1
[pip3] torch==1.0.1
[pip3] torchvision==0.2.1
[...]
```

Currently, this is what happens:

```
[...]
Versions of relevant libraries:
[pip] Could not collect
[...]
```
The `freeze` option is also available in old pip, so this change is backwards compatible. Also, if we would like to keep the old style, which I think it is not necessary, I could easily change that.

 ---

In case anyone wants to know how `columns` looks like (I prefer `freeze`):

```
[...]
Versions of relevant libraries:
[pip3] numpy               1.16.1
[pip3] torch               1.0.1
[pip3] torchvision         0.2.1
[...]
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16798

Differential Revision: D13971793

Pulled By: soumith

fbshipit-source-id: 3721d9079a2afa245e1185f725598901185ea4cd
2019-02-06 07:48:08 -08:00
Antoine Busque
a44826e659 Fix: avoid race condition on model zoo directory creation (#16578)
Summary:
The current implementation of the `torch.utils.model_zoo.load_url`
function is prone to a race condition when creating the directory in
which it saves the loaded models, since it checks whether the
directory exists and then creates it in two separate steps. The
directory can be created after the check was made but before we
attempt to create the directory, resulting in an unhandled exception.

Instead, try to create the directory directly, and do nothing if it
already exists.

Note: for Python versions ≥ 3.2, we could simply use the
`exist_ok=True` flag on `os.makedirs`, but this is unavailable in
Python 2.7.

Signed-off-by: Antoine Busque <antoine.busque@elementai.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16578

Differential Revision: D13886470

Pulled By: soumith

fbshipit-source-id: 88815c8a65eec96caea32d6e9a7f83802502fdb9
2019-01-30 18:35:45 -08:00
Lu Fang
b1b00f329e Fix the flake8 linter
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16549

Reviewed By: bddppq

Differential Revision: D13877435

Pulled By: houseroad

fbshipit-source-id: dbe575ba3f6dd30d27ac6aa5eec2eea025063540
2019-01-30 09:36:00 -08:00
Zachary DeVito
21193bf123 try to get rid of tmp_install (#16414)
Summary:
Rehash of previous attempts. This tries a different approach where we accept the install as specified in cmake (leaving bin/ include/ and lib/ alone), and then try to adjust the rest of the files to this more standard layout.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16414

Differential Revision: D13863635

Pulled By: zdevito

fbshipit-source-id: 23725f5c64d7509bf3ca8f472dcdcad074de9828
2019-01-29 17:29:40 -08:00
Soumith Chintala
bd19dd4b90 url download bugfix for URLs served without Content-Length header (#16153)
Summary:
Some HTTP servers dont return Content-Length, account for that

Fixes: https://github.com/pytorch/pytorch/issues/16152

Differential Revision: D13858882

Pulled By: soumith

fbshipit-source-id: e4293e9368ed4c87548d22adec1ce0c25ea4bd8f
2019-01-29 01:28:47 -08:00
SsnL
4aae89fa7b Make test_proper_exit more robust (#16249)
Summary:
1. Improve error message for better debugging info
2. Increase timeout
3. Also apply the windows worker failure detection mechanism on non-Windows platforms, for better robustness

Attempt to fix #14501

cc ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16249

Differential Revision: D13784702

Pulled By: ezyang

fbshipit-source-id: 09a7cff83ab9edce561ed69f9fb555ab35d1275f
2019-01-25 08:25:05 -08:00
kyryl
a7415787ac fix RandomSampler length (#15991)
Summary:
Hi!

This PR addresses #15537  issue.
Please review.

Thanks!

Differential Revision: D13649890

Pulled By: soumith

fbshipit-source-id: 166212ae383331345423236dfc4fa2ea907d265d
2019-01-13 23:09:51 -08:00
SsnL
9b5ec2a076 Fix TestDataLoader.test_proper_exit (#15665)
Summary:
Currently, in `test_proper_exit`,
1. we do not kill the correct input `pid` in the `kill_pid` function
fe15d6a2c2/test/test_dataloader.py (L325-L329)
2. the Windows command that detects process status doesn't actually work
fe15d6a2c2/test/test_dataloader.py (L641-L646)
3. `worker_error` and `worker_kill` cases (sometimes?) are not tested because the workers may exit naturally due to the pre-fetching mechanism and a too small `dataset size / batch size`.

In this PR, I, in separate commits:
1. Install `psutil` (a python package specifically built for process monitoring) on some CI builds. (Linux builds installation are done in https://github.com/pietern/pytorch-dockerfiles/pull/29 https://github.com/pietern/pytorch-dockerfiles/pull/30  https://github.com/pytorch/ossci-job-dsl/pull/36 and https://github.com/pytorch/pytorch/pull/15795).
2. Rewrite `test_proper_exit` with `psutil` so we

    1. do not rely on the hacky `is_process_alive` fe15d6a2c2/test/test_dataloader.py (L640-L653)
   2. increase the #task per worker so `worker_error` and `worker_kill` properly trigger
   3. test error message content to ensure that the loader exits with correct message corresponding to each exiting scenario.

3. Fix Windows data loader not having any mechanism to detect worker failures.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15665

Differential Revision: D13615527

Pulled By: soumith

fbshipit-source-id: cfb2f67837d2d87928a53f00b4d20f09754b7949
2019-01-10 08:47:27 -08:00
Jon Crall
c7ec7cdd46 Fixed syntax error in doctest (#15646)
Summary:
I fixed a very small extra parenthesis in a doctest.

I'm also going to use this issue as a place to propose the eventual inclusion of xdoctest (a pip installable library I wrote) in pytorch's test suite. I think there are a lot of problems with Python's built in doctest module, and I've built xdoctest to fix them. I would love for my project to get some exposure and its addition to PyTorch may benefit both projects. Please see the readme for more details on what xdoctest brings to the table over the builtin doctest module: https://github.com/Erotemic/xdoctest

I came across this small syntax error when working on ensuring xdoctest was compatible with pytorch. It isn't 100% there yet, but I'm working on it. My goal is to ensure that xdoctest is 100% compatible with all of torch's doctest out-of-the-box before writing up the PR. I'm also airing the idea out-loud before I commit too much time into this (or get my hopes up), so I'm attaching this little blurb to a no-brainer-merge PR to (1) demonstrate a little bit of value (because xdoctest flagged this syntax error) and (2) see how its received.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15646

Differential Revision: D13606111

Pulled By: soumith

fbshipit-source-id: d4492801a38ee0ae64ea0326a83239cee4d811a4
2019-01-09 01:29:11 -08:00
Christoph
2a45050fdc Concatenate directly into shared memory when constructing batches for numpy (#14534)
Summary:
Since #1323 tensors are shared with shared memory, but this feature is not active for numpy.
This PR fix this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14534

Differential Revision: D13561649

Pulled By: soumith

fbshipit-source-id: b6bc9e99fb91e8b675c2ef131fba9fa11c1647c0
2018-12-29 17:51:02 -08:00
SsnL
fb22f76eb6 default_collate should collate bool list to byte tensors (#14669)
Summary:
Based on #15331 . Review only the last commit.

Fixes https://github.com/pytorch/pytorch/issues/14507.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14669

Reviewed By: ezyang

Differential Revision: D13528725

Pulled By: soumith

fbshipit-source-id: f12f1ac1c4ff2a3ddd6877c0c096a5da3a1ffa3c
2018-12-28 12:26:46 -08:00
vishwakftw
d9cad71b36 Enable running collect_env.py without building PyTorch (#15468)
Summary: Closes #15346

Differential Revision: D13537873

Pulled By: ezyang

fbshipit-source-id: 7765ce4108dae9479d8900c0815cc2f174596a83
2018-12-21 11:37:43 -08:00
surgan12
3a6d473b49 collect_env fix (#15447)
Summary:
fixes #15214
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15447

Differential Revision: D13531523

Pulled By: ezyang

fbshipit-source-id: 8f24f5ae9f3e78f6c5c9ee702ba14faca7aa297a
2018-12-20 16:56:34 -08:00
SsnL
9217bde807 Refactor dataloader.py (#15331)
Summary:
Same as #14668, and was approved there.

ailzhang , please apply this patch to Horizon's `data_streamer.py`: https://gist.github.com/SsnL/020fdb3d6b7016d81b6ba1d04cc41459 Thank you!

Below is the original description at #14668:

As I am working on tasks in https://github.com/pytorch/pytorch/issues/13023, I realized how unreadable the code is because all functions to be run in multiprocessing must be at top global level. Adding more functionalities to `dataloader.py` will only make things worse.

So in this PR, I refactor `dataloader.py` and move much of it into `data._utils`. E.g., the `_worker_loop` and related methods are now in `data._utils.worker`, signal handling code in `data._utils.signal_handling`, collating code in `data._utils.collate`, etc. This split, IMHO, makes code much clearer. I will base my future changes to DataLoader on top of this.

No functionality is changed, except that  I added `torch._six.queue`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15331

Reviewed By: yf225

Differential Revision: D13503120

Pulled By: ailzhang

fbshipit-source-id: 94df16b4d80ad1102c437cde0d5a2e62cffe1f8e
2018-12-19 12:36:03 -08:00
Derek Kim
656b565a0f Trivial comment correction in dataloader (#15276)
Summary:
Trivial comment correction in dataloader
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15276

Differential Revision: D13477324

Pulled By: soumith

fbshipit-source-id: 2a74a014999655d129311d611f2a09411339cb13
2018-12-15 10:59:00 -08:00
Peter Goldsborough
0bf1383f0a Python <-> C++ Frontend inter-op (#13481)
Summary:
This PR enables C++ frontend modules to be bound into Python and added as submodules of Python modules. For this, I added lots of pybind11 bindings for the `torch::nn::Module` class, and modified the `torch.nn.Module` class in Python to have a new Metaclass that makes `isinstance(m, torch.nn.Module)` return true when `m` is a C++ frontend module. The methods and fields of C++ modules are bound in such a way that they work seamlessly as submodules of Python modules for most operations (one exception I know of: calling `.to()` ends up calling `.apply()` on each submodule with a Python lambda, which cannot be used in C++ -- this may require small changes on Python side).

I've added quite a bunch of tests to verify the bindings and equality with Python. I think I should also try out adding a C++ module as part of some large PyTorch module, like a WLM or something, and see if everything works smoothly.

The next step for inter-op across our system is ScriptModule <-> C++ Frontend Module inter-op. I think this will then also allow using C++ frontend modules from TorchScript.

apaszke zdevito

CC dzhulgakov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13481

Differential Revision: D12981996

Pulled By: goldsborough

fbshipit-source-id: 147370d3596ebb0e94c82cec92993a148fee50a7
2018-12-13 08:04:02 -08:00
Michael Carilli
5d3a347685 Stashing checkpointing RNG states based on devices of arg tensors (#14518)
Summary:
This PR intends to address apaszke's concerns in https://github.com/pytorch/pytorch/pull/14253#issuecomment-441740016.  Preserving the rng state is now controlled by a kwarg rather than a global state, hopefully in a python 2.7-compatible way.

Additionally, the checkpointing function stashes and restores the RNG states of
1. devices associated with all input tensor args to run_fn as well as
2. the current device.

I could easily change this to only save and restore the RNG states associated 1. alone.  This would simplify the logic to create a [deduplicated, ordered](https://github.com/pytorch/pytorch/compare/master...mcarilli:checkpointing_rng_touchup?expand=1#diff-58da227fc9b1d56752b7dfad90428fe0R37) list of devices considered active.

I'm wondering if the [get_device_states](https://github.com/pytorch/pytorch/compare/master...mcarilli:checkpointing_rng_touchup?expand=1#diff-58da227fc9b1d56752b7dfad90428fe0R32) and [set_device_states](https://github.com/pytorch/pytorch/compare/master...mcarilli:checkpointing_rng_touchup?expand=1#diff-58da227fc9b1d56752b7dfad90428fe0R47) functions are general enough to reside elsewhere (presumably torch/random.py).  I'm also wondering if the check on [torch.cuda._initialized](https://github.com/pytorch/pytorch/compare/master...mcarilli:checkpointing_rng_touchup?expand=1#diff-58da227fc9b1d56752b7dfad90428fe0R47) would be better placed within `get_device_states`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14518

Differential Revision: D13356210

Pulled By: ezyang

fbshipit-source-id: afa4cc21ce7862142d5cb1dec3750018df222039
2018-12-11 09:48:45 -08:00
Richard Zou
e6a420114f collect_env.py: get conda magma and mkl information (#14854)
Summary:
Fixes #12371
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14854

Differential Revision: D13363635

Pulled By: zou3519

fbshipit-source-id: f8b5d05038bf5ce451399dfeed558ae298178128
2018-12-06 14:58:14 -08:00
Ailing Zhang
38eb1beff5 Revert D13289919: [pytorch][PR] [DataLoader] Refactor dataloader.py
Differential Revision:
D13289919

Original commit changeset: d701bc7bb48f

fbshipit-source-id: c350c491fefa98a0a7c0cf22cb832e78aeb15c3d
2018-12-04 20:25:16 -08:00
Andy Chen
33ea7eafef Make checkpoint_sequential work with multiple arguments (#14278)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14278

In this commit, we make checkpoint_sequential work for models with multiple tensor inputs. Previously, it only processed the first tensor and ignored the rest.

We introduce a new test in test/test_utils.py that replicates the issue referenced in this [GitHub issue](https://github.com/pytorch/pytorch/issues/11093), and we make sure that the test passes by changing the behavior of checkpoint_sequential to process all input tensors.

Reviewed By: ezyang

Differential Revision: D13144672

fbshipit-source-id: 24f58233a65a0f5b80b89c8d8cbced6f814004f7
2018-12-04 18:47:43 -08:00
SsnL
16558a1e9d Refactor dataloader.py (#14668)
Summary:
As I am working on tasks in https://github.com/pytorch/pytorch/issues/13023, I realized how unreadable the code is because all functions to be run in multiprocessing must be at top global level. Adding more functionalities to `dataloader.py` will only make things worse.

So in this PR, I refactor `dataloader.py` and move much of it into `data._utils`. E.g., the `_worker_loop` and related methods are now in `data._utils.worker`, signal handling code in `data._utils.signal_handling`, collating code in `data._utils.collate`, etc. This split, IMHO, makes code much clearer. I will base my future changes to DataLoader on top of this.

No functionality is changed, except that  I added `torch._six.queue`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14668

Reviewed By: soumith

Differential Revision: D13289919

Pulled By: ailzhang

fbshipit-source-id: d701bc7bb48f5dd7b163b5be941a9d27eb277a4c
2018-12-04 09:53:41 -08:00
Peter Goldsborough
db15f2e13f Fix version.groups() (#14505)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/14502

fmassa soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14505

Differential Revision: D13242386

Pulled By: goldsborough

fbshipit-source-id: faebae8795e1efd9c0ebc2294fe9648193d16624
2018-11-28 20:27:33 -08:00
Peter Goldsborough
6f2307ba6a Allow building libraries with setuptools that dont have abi suffix (#14130)
Summary:
When using `setuptools` to build a Python extension, setuptools will automatically add an ABI suffix like `cpython-37m-x86_64-linux-gnu` to the shared library name when using Python 3. This is required for extensions meant to be imported as Python modules. When we use setuptools to build shared libraries not meant as Python modules, for example libraries that define and register TorchScript custom ops, having your library called `my_ops.cpython-37m-x86_64-linux-gnu.so` is a bit annoying compared to just `my_ops.so`, especially since you have to reference the library name when loading it with `torch.ops.load_library` in Python.

This PR fixes this by adding a `with_options` class method to the `torch.utils.cpp_extension.BuildExtension` which allows configuring the `BuildExtension`. In this case, the first option we add is `no_python_abi_suffix`, which we then use in `get_ext_filename` (override from `setuptools.build_ext`) to throw away the ABI suffix.

I've added a test `setup.py` in a `no_python_abi_suffix_test` folder.

Fixes https://github.com/pytorch/pytorch/issues/14188

t-vi fmassa soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14130

Differential Revision: D13216575

Pulled By: goldsborough

fbshipit-source-id: 67dc345c1278a1a4ee4ca907d848bc1fb4956cfa
2018-11-27 17:35:53 -08:00
Will Feng
5918de8e84 Revert D13166669: [pytorch][PR] Allow dataloader to accept a custom memory pinning function
Differential Revision:
D13166669

Original commit changeset: ca965f9841d4

fbshipit-source-id: 0836b4f50f73ba01c97491a719660f02e36f20ad
2018-11-26 14:55:04 -08:00
Peter Goldsborough
a13fd7ec28 Allow torch.utils.cpp_extension.load to load shared libraries that aren't Python modules (#13941)
Summary:
For custom TorchScript operators, `torch.ops.load_library` must be used and passed the path to the shared library containing the custom ops. Our C++ extensions stuff generally is meant to build a Python module and import it. This PR changes `torch.utils.cpp_extension.load` to have an option to just return the shared library path instead of importing it as a Python module, so you can then pass it to `torch.ops.load_library`. This means folks can re-use `torch.utils.cpp_extension.load` and `torch.utils.cpp_extension.load_inline` to even write their custom ops inline. I think t-vi  and fmassa will appreciate this.

soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13941

Differential Revision: D13110592

Pulled By: goldsborough

fbshipit-source-id: 37756307dbf80a81d2ed550e67c8743dca01dc20
2018-11-26 09:39:21 -08:00
Michael Carilli
7557a993ab Allow dataloader to accept a custom memory pinning function (#14171)
Summary:
Currently, the `pin_memory_batch` function in the dataloader will return a batch comprised of any unrecognized type without pinning the data, because it doesn't know how.

This behavior was preventing us from overlapping data prefetching in Mask-RCNN, whose custom `collate_fn` returns a custom batch type.

The present PR adds the ability for the user to pass a `pin_fn` alongside any custom `collate_fn` to handle such custom types.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14171

Differential Revision: D13166669

Pulled By: soumith

fbshipit-source-id: ca965f9841d4a259b3ca4413c8bd0d8743d433ab
2018-11-23 08:12:43 -08:00
Michael Carilli
c36156eded Option to preserve bitwise accuracy of gradient checkpointed vs non-checkpointed dropout (#14253)
Summary:
This issue was noticed, and fix proposed, by raulpuric.

Checkpointing is implemented by rerunning a forward-pass segment for each checkpointed segment during backward.  This can result in the RNG state advancing more than it would without checkpointing, which can cause checkpoints that include dropout invocations to lose end-to-end bitwise accuracy as compared to non-checkpointed passes.

The present PR contains optional logic to juggle the RNG states such that checkpointed passes containing dropout achieve bitwise accuracy with non-checkpointed equivalents.**  The user requests this behavior by supplying `preserve_rng_state=True` to `torch.utils.checkpoint` or `torch.utils.checkpoint_sequential`.

Currently, `preserve_rng_state=True` may incur a moderate performance hit because restoring MTGP states can be expensive.  However, restoring Philox states is dirt cheap, so syed-ahmed's [RNG refactor](https://github.com/pytorch/pytorch/pull/13070#discussion_r235179882), once merged, will make this option more or less free.

I'm a little wary of the [def checkpoint(function, *args, preserve_rng_state=False):](https://github.com/pytorch/pytorch/pull/14253/files#diff-58da227fc9b1d56752b7dfad90428fe0R75) argument-passing method (specifically, putting a kwarg after a variable argument list).  Python 3 seems happy with it.
Edit:  It appears Python 2.7 is NOT happy with a [kwarg after *args](https://travis-ci.org/pytorch/pytorch/builds/457706518?utm_source=github_status&utm_medium=notification).  `preserve_rng_state` also needs to be communicated in a way that doesn't break any existing usage.  I'm open to suggestions (a global flag perhaps)?

**Batchnorm may still be an issue, but that's a battle for another day.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14253

Differential Revision: D13166665

Pulled By: soumith

fbshipit-source-id: 240cddab57ceaccba038b0276151342344eeecd7
2018-11-23 08:09:43 -08:00
Peter Goldsborough
5b1b8682a3 Missing .decode() after check_output in cpp_extensions (#13935)
Summary:
soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13935

Differential Revision: D13090852

Pulled By: goldsborough

fbshipit-source-id: 47da269d074fd1e7220e90580692d6ee489ec78b
2018-11-16 12:16:29 -08:00
Anders Papitto
2983998bb3 add torch-python target (#12742)
Summary:
This is the next minimal step towards moving _C into cmake. For now,
leave _C in setup.py, but reduce it to an empty stub file. All of its
sources are now part of the new torch-python cmake target.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12742

Reviewed By: soumith

Differential Revision: D13089691

Pulled By: anderspapitto

fbshipit-source-id: 1c746fda33cfebb26e02a7f0781fefa8b0d86385
2018-11-16 11:43:48 -08:00
Peter Goldsborough
7978ba45ba Update path in CI script to access ninja (#13646)
Summary:
We weren't running C++ extensions tests in CI.
Also, let's error hard when `ninja` is not available instead of skipping C++ extensions tests.

Fixes https://github.com/pytorch/pytorch/issues/13622

ezyang soumith yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13646

Differential Revision: D12961468

Pulled By: goldsborough

fbshipit-source-id: 917c8a14063dc40e6ab79a0f7d345ae2d3566ba4
2018-11-07 14:31:29 -08:00
Peter Goldsborough
393ad6582d Use torch:: instead of at:: in all C++ APIs (#13523)
Summary:
In TorchScript and C++ extensions we currently advocate a mix of `torch::` and `at::` namespace usage. In the C++ frontend I had instead exported all symbols from `at::` and some from `c10::` into the `torch::` namespace. This is far, far easier for users to understand, and also avoid bugs around creating tensors vs. variables. The same should from now on be true for the TorchScript C++ API (for running and loading models) and all C++ extensions.

Note that since we're just talking about typedefs, this change does not break any existing code.

Once this lands I will update stuff in `pytorch/tutorials` too.

zdevito ezyang gchanan
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13523

Differential Revision: D12942787

Pulled By: goldsborough

fbshipit-source-id: 76058936bd8707b33d9e5bbc2d0705fc3d820763
2018-11-06 14:32:25 -08:00
Guoxia Wang
cc3cecdba0 Fix the bug when compile using nvcc compiler. (#13509)
Summary:
I found a bug about compiling the cuda file when I install maskrcnn-benchmark lib.

`python setup.py build develop` will throw the error:
```
  File "/usr/local/lib/python2.7/dist-packages/torch/utils/cpp_extension.py", line 214, in unix_wrap_compile
    original_compile(obj, src, ext, cc_args, cflags, pp_opts)
  File "/usr/lib/python2.7/distutils/unixccompiler.py", line 125, in _compile
    self.spawn(compiler_so + cc_args + [src, '-o', obj] +
TypeError: coercing to Unicode: need string or buffer, list found
```

For more information, please see [issue](https://github.com/facebookresearch/maskrcnn-benchmark/issues/99).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13509

Differential Revision: D12902675

Pulled By: soumith

fbshipit-source-id: b9149f5de21ae29f94670cb2bbc93fa368f4e0f7
2018-11-02 11:09:43 -07:00
Peter Goldsborough
7b47262936 Use names instead of indices in format (#13266)
Summary:
apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13266

Differential Revision: D12841054

Pulled By: goldsborough

fbshipit-source-id: 7ce9f942367f82484cdae6ece419ed5c0dc1de2c
2018-10-31 15:17:47 -07:00
Peter Goldsborough
1c8a823b3b More robust ABI compatibility check for C++ extensions (#13092)
Summary:
This PR makes the ABI compatibility check for C++ extensions more robust by resolving the real path of the compiler binary, such that e.g. `"c++"` is resolved to the path of g++. This more robust than assuming that `c++ --version` will contain the word "gcc".

CC jcjohnson

Closes #10114

soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13092

Differential Revision: D12810448

Pulled By: goldsborough

fbshipit-source-id: 6ac460e24496c0d8933b410401702363870b7568
2018-10-29 11:56:02 -07:00
Yangqing Jia
852d6e8b65 Fix python2 and python 3 compatibility found by lint. (#13140)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13140

This is an example about the benefit of proper facebook linter. The old code
was not python 2.x (actually, pre-python 3.3) compatible. Note that FileExistsError
is added in Python 3.3:

https://stackoverflow.com/questions/20790580/python-specifically-handle-file-exists-exception

Reviewed By: mingzhe09088

Differential Revision: D10858804

fbshipit-source-id: a4c995aef9f720cb8b0ce463f0a51db667fc42f2
2018-10-25 17:20:11 -07:00
Yangqing Jia
c47f680086 arc lint torch/utils (#13141)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13141

This is an example diff to show what lint rules are being applied.

Reviewed By: mingzhe09088

Differential Revision: D10858478

fbshipit-source-id: cbeb013f10f755b0095478adf79366e7cf7836ff
2018-10-25 14:59:03 -07:00
Soumith Chintala
cf235e0894 fix lint after new flake8 release added new style constraints (#13047)
Summary:
fix lint after new flake8 release added new style constraints
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13047

Differential Revision: D10527804

Pulled By: soumith

fbshipit-source-id: 6f4d02662570b6339f69117b61037c8394b0bbd8
2018-10-24 09:03:38 -07:00