Commit Graph

211 Commits

Author SHA1 Message Date
Vasiliy Kuznetsov
a8ca042ec0 rename torch.Assert to torch._assert (#47763)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47763

Changing the name due to the discussion in
https://github.com/pytorch/pytorch/pull/47399.

Test Plan:
```
python test/test_utils.py TestAssert.test_assert_true
python test/test_fx.py TestFX.test_symbolic_trace_assert
python test/test_fx_experimental.py
```

Imported from OSS

Reviewed By: ezyang

Differential Revision: D24891767

fbshipit-source-id: 01c7a5acd83bf9c962751552780930c242134dd2
2020-11-12 23:59:34 -08:00
Taylor Robie
dda95e6914 More Timer refinement (#46023)
Summary:
This PR just adds more polish to the benchmark utils:

1) `common.py`, `timer.py`, and `valgrind_wrapper/timer_interface.py` are now MyPy strict compliant. (except for three violations due to external deps.) Compare and Fuzzer will be covered in a future PR.
2) `CallgrindStats` now uses `TaskSpec` rather than accepting the individual fields which brings it closer to `Measurement`.
3) Some `__repr__` logic has been moved into `TaskSpec` (which `Measurement` and `CallgrindStats` use in their own `__repr__`s) for a more unified feel and less horrible f-string hacking, and the repr's have been given a cleanup pass.
4) `Tuple[FunctionCount, ...]` has been formalized as the `FunctionCounts` class, which has a much nicer `__repr__` than just the raw tuple, as well as some convenience methods (`__add__`, `__sub__`, `filter`, `transform`) for easier DIY stat exploration. (I find myself using the latter two a lot now.) My personal experience is that manipulating `FunctionCounts` is massively more pleasant than the raw tuples of `FunctionCount`. (Though it's still possible to get at the raw data if you want.)
5) Better support for multi-line `stmt` and `setup`.
6) Compare now also supports rowwise coloring, which is often the more natural layout for A/B testing.
7) Limited support for `globals` in `collect_callgrind`. This should make it easier to benchmark JIT models. (CC ZolotukhinM)
8) More unit tests, including extensive tests for the Callgrind stats manipulation APIs.
9) Mitigate issue with `MKL_THREADING_LAYER` when run in Jupyter. (https://github.com/pytorch/pytorch/issues/37377)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46023

Test Plan: changes should be covered by existing and new unit tests.

Reviewed By: navahgar, malfet

Differential Revision: D24313911

Pulled By: robieta

fbshipit-source-id: 835d4b5cde336fb7ff0adef3c0fd614d64df0f77
2020-10-15 16:32:53 -07:00
Weiyi Zheng
22f4a58a45 [pytorch] activation checkpointing: enable mixing tensor without requires_grad (#45934)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45934

https://pytorch.org/docs/stable/checkpoint.html pytorch checkpoint requires all input to the function being checkpointed to requires_grad, but this assumption is not necessarily try. consider the following two examples

```
output = MultiheadedMaskedAtten(input, mask)

output = LSTM(input, seq_length)
```
both length and mask are tensors that won't requires grad, currently if you try to checkpoint torch.autograd.backward will complain

```
  File "/mnt/xarfuse/uid-124297/7d159c34-seed-nspid4026531836-ns-4026531840/torch/autograd/function.py
", line 87, in apply
    return self._forward_cls.backward(self, *args)
  File "/mnt/xarfuse/uid-124297/7d159c34-seed-nspid4026531836-ns-4026531840/torch/utils/checkpoint.py"
, line 99, in backward
    torch.autograd.backward(outputs, args)
  File "/mnt/xarfuse/uid-124297/7d159c34-seed-nspid4026531836-ns-4026531840/torch/autograd/__init__.py
", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: element 1 of tensors does not require grad and does not have a grad_fn
```

this diff allows skipping the non-grad-requiring tensor when running autograd.backward.

added documentation for this feature as well.

Test Plan: added unit test to make sure partial tensor grads can be used in checkpoint().

Differential Revision: D24094764

fbshipit-source-id: 6557e8e74132d5a392526adc7b57b6998609ed12
2020-10-14 21:28:02 -07:00
Taylor Robie
2b13d9413e Re-land: Add callgrind collection to Timer #44717 (#45586)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45586

Test Plan: The unit test has been softened to be less platform sensitive.

Reviewed By: mruberry

Differential Revision: D24025415

Pulled By: robieta

fbshipit-source-id: ee986933b984e736cf1525e1297de6b21ac1f0cf
2020-09-30 17:43:06 -07:00
Mike Ruberry
51d0ae9207 Revert D24010742: [pytorch][PR] Add callgrind collection to Timer
Test Plan: revert-hammer

Differential Revision:
D24010742 (9b27e0926b)

Original commit changeset: df6bc765f8ef

fbshipit-source-id: 4c1edd57ea932896f7052716427059c924222501
2020-09-30 10:15:46 -07:00
Taylor Robie
9b27e0926b Add callgrind collection to Timer (#44717)
Summary:
This PR allows Timer to collect deterministic instruction counts for (some) snippets. Because of the intrusive nature of Valgrind (effectively replacing the CPU with an emulated one) we have to perform our measurements in a separate process. This PR writes a `.py` file containing the Timer's `setup` and `stmt`, and executes it within a `valgrind` subprocess along with a plethora of checks and error handling. There is still a bit of jitter around the edges due to the Python glue that I'm using, but the PyTorch signal is quite good and thus this provides a low friction way of getting signal. I considered using JIT as an alternative, but:

A) Python specific overheads (e.g. parsing) are important
B) JIT might do rewrites which would complicate measurement.

Consider the following bit of code, related to https://github.com/pytorch/pytorch/issues/44484:
```
from torch.utils._benchmark import Timer
counts = Timer(
    "x.backward()",
    setup="x = torch.ones((1,)) + torch.ones((1,), requires_grad=True)"
).collect_callgrind()

for c, fn in counts[:20]:
    print(f"{c:>12}  {fn}")
```

```
      812800  ???:_dl_update_slotinfo
      355600  ???:update_get_addr
      308300  work/Python/ceval.c:_PyEval_EvalFrameDefault'2
      304800  ???:__tls_get_addr
      196059  ???:_int_free
      152400  ???:__tls_get_addr_slow
      138400  build/../c10/core/ScalarType.h:c10::typeMetaToScalarType(caffe2::TypeMeta)
      126526  work/Objects/dictobject.c:_PyDict_LoadGlobal
      114268  ???:malloc
      101400  work/Objects/unicodeobject.c:PyUnicode_FromFormatV
       85900  work/Python/ceval.c:_PyEval_EvalFrameDefault
       79946  work/Objects/typeobject.c:_PyType_Lookup
       72000  build/../c10/core/Device.h:c10::Device::validate()
       70000  /usr/include/c++/8/bits/stl_vector.h:std::vector<at::Tensor, std::allocator<at::Tensor> >::~vector()
       66400  work/Objects/object.c:_PyObject_GenericGetAttrWithDict
       63000  ???:pthread_mutex_lock
       61200  work/Objects/dictobject.c:PyDict_GetItem
       59800  ???:free
       58400  work/Objects/tupleobject.c:tupledealloc
       56707  work/Objects/dictobject.c:lookdict_unicode_nodummy
```

Moreover, if we backport this PR to 1.6 (just copy the `_benchmarks` folder) and load those counts as `counts_1_6`, then we can easily diff them:
```
print(f"Head instructions: {sum(c for c, _ in counts)}")
print(f"1.6 instructions:  {sum(c for c, _ in counts_1_6)}")
count_dict = {fn: c for c, fn in counts}
for c, fn in counts_1_6:
    _ = count_dict.setdefault(fn, 0)
    count_dict[fn] -= c
count_diffs = sorted([(c, fn) for fn, c in count_dict.items()], reverse=True)
for c, fn in count_diffs[:15] + [["", "..."]] + count_diffs[-15:]:
    print(f"{c:>8}  {fn}")
```

```
Head instructions: 7609547
1.6 instructions:  6059648
  169600  ???:_dl_update_slotinfo
  101400  work/Objects/unicodeobject.c:PyUnicode_FromFormatV
   74200  ???:update_get_addr
   63600  ???:__tls_get_addr
   46800  work/Python/ceval.c:_PyEval_EvalFrameDefault
   33512  work/Objects/dictobject.c:_PyDict_LoadGlobal
   31800  ???:__tls_get_addr_slow
   31700  build/../aten/src/ATen/record_function.cpp:at::RecordFunction::RecordFunction(at::RecordScope)
   28300  build/../torch/csrc/utils/python_arg_parser.cpp:torch::FunctionSignature::parse(_object*, _object*, _object*, _object**, bool)
   27800  work/Objects/object.c:_PyObject_GenericGetAttrWithDict
   27401  work/Objects/dictobject.c:lookdict_unicode_nodummy
   24115  work/Objects/typeobject.c:_PyType_Lookup
   24080  ???:_int_free
   21700  work/Objects/dictobject.c:PyDict_GetItemWithError
   20700  work/Objects/dictobject.c:PyDict_GetItem
          ...
   -3200  build/../c10/util/SmallVector.h:at::TensorIterator::binary_op(at::Tensor&, at::Tensor const&, at::Tensor const&, bool)
   -3400  build/../aten/src/ATen/native/TensorIterator.cpp:at::TensorIterator::resize_outputs(at::TensorIteratorConfig const&)
   -3500  /usr/include/c++/8/x86_64-redhat-linux/bits/gthr-default.h:std::unique_lock<std::mutex>::unlock()
   -3700  build/../torch/csrc/utils/python_arg_parser.cpp:torch::PythonArgParser::raw_parse(_object*, _object*, _object**)
   -4207  work/Objects/obmalloc.c:PyMem_Calloc
   -4500  /usr/include/c++/8/bits/stl_vector.h:std::vector<at::Tensor, std::allocator<at::Tensor> >::~vector()
   -4800  build/../torch/csrc/autograd/generated/VariableType_2.cpp:torch::autograd::VariableType::add__Tensor(at::Tensor&, at::Tensor const&, c10::Scalar)
   -5000  build/../c10/core/impl/LocalDispatchKeySet.cpp:c10::impl::ExcludeDispatchKeyGuard::ExcludeDispatchKeyGuard(c10::DispatchKey)
   -5300  work/Objects/listobject.c:PyList_New
   -5400  build/../torch/csrc/utils/python_arg_parser.cpp:torch::FunctionParameter::check(_object*, std::vector<pybind11::handle, std::allocator<pybind11::handle> >&)
   -5600  /usr/include/c++/8/bits/std_mutex.h:std::unique_lock<std::mutex>::unlock()
   -6231  work/Objects/obmalloc.c:PyMem_Free
   -6300  work/Objects/listobject.c:list_repeat
  -11200  work/Objects/listobject.c:list_dealloc
  -28900  build/../torch/csrc/utils/python_arg_parser.cpp:torch::FunctionSignature::parse(_object*, _object*, _object**, bool)
```

Remaining TODOs:
  * Include a timer in the generated script for cuda sync.
  * Add valgrind to CircleCI machines and add a unit test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44717

Reviewed By: soumith

Differential Revision: D24010742

Pulled By: robieta

fbshipit-source-id: df6bc765f8efce7193893edba186cd62b4b23623
2020-09-30 05:52:54 -07:00
Taylor Robie
ccad73ab41 Fix D23995953 import.
Summary: https://github.com/pytorch/pytorch/pull/45511 could not be properly imported

Test Plan: See https://github.com/pytorch/pytorch/pull/45511

Reviewed By: zhangguanheng66

Differential Revision: D23995953

fbshipit-source-id: a6224a67d54617ddf34c2392e65f2142c4e78ea4
2020-09-29 19:30:23 -07:00
Taylor Robie
c6b7eeb654 Gh/taylorrobie/timer cleanup (#45361)
Summary:
This PR cleans up some of the rough edges around `Timer` and `Compare`
* Moves `Measurement` to be dataclass based
* Adds a bunch of type annotations. MyPy is now happy.
* Allows missing entries in `Compare`. This is one of the biggest usability issues with `Compare` right now, both from an API perspective and because the current failure mode is really unpleasant.
* Greatly expands the testing of `Compare`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45361

Test Plan: Changes to Timer are covered under existing tests, changes to `Compare` are covered by the expanded `test_compare` method.

Reviewed By: bwasti

Differential Revision: D23966816

Pulled By: robieta

fbshipit-source-id: 826969f73b42f72fa35f4de3c64d0988b61474cd
2020-09-28 14:56:43 -07:00
Vasiliy Kuznetsov
eee7dad376 Add torch.do_assert, which is symbolically traceable (#45188)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45188

This is a symbolically traceable alternative to Python's `assert`.
It should be useful to allow people who want to use FX to also
be able to assert things.

A bunch of TODO(before) land are inline - would love thoughts
on where is the best place for this code to live, and what this
function should be called (since `assert` is reserved).

Test Plan:
```
python test/test_fx.py TestFX.test_symbolic_trace_assert
```

Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D23861567

fbshipit-source-id: d9d6b9556140faccc0290eba1fabea401d7850de
2020-09-25 13:46:28 -07:00
Taylor Robie
8507ea22b2 replace timer test with a mocked variant (#45173)
Summary:
I noticed that the recently introduced adaptive_autorange tests occasionally timeout CI, and I've been meaning to improve the Timer tests for a while. This PR allows unit tests to swap the measurement portion of `Timer` with a deterministic mock so we can thoroughly test behavior without having to worry about flaky CI measurements. It also means that the tests can be much more detailed and still finish very quickly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45173

Test Plan: You're lookin' at it.

Reviewed By: ezyang

Differential Revision: D23873548

Pulled By: robieta

fbshipit-source-id: 26113e5cea0cbf46909b9bf5e90c878c29e87e88
2020-09-24 09:42:37 -07:00
ahassan@azavea.com
1cab27d485 Add a torch.hub.load_local() function that can load models from any local directory with a hubconf.py (#44204)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/43622

- Moves the model loading part of `torch.hub.load()` into a new `torch.hub.load_local()` function that takes in a path to a local directory that contains a `hubconf.py` instead of a repo name.
- Refactors `torch.hub.load()` so that it now calls `torch.hub.load_local()` after downloading and extracting the repo.
- Updates `torch.hub` docs to include the new function + minor fixes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44204

Reviewed By: malfet

Differential Revision: D23817429

Pulled By: ailzhang

fbshipit-source-id: 788fd83c87a94f487b558715b2809d346ead02b2
2020-09-21 14:17:21 -07:00
Xiang Gao
20ac736200 Remove py2 compatible future imports (#44735)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44735

Reviewed By: mruberry

Differential Revision: D23731306

Pulled By: ezyang

fbshipit-source-id: 0ba009a99e475ddbe22981be8ac636f8a1c8b02f
2020-09-16 12:55:57 -07:00
Victor Bittorf
68a5c361ae Adding Adapative Autorange to benchmark utils. (#44607)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/44219

Rebasing https://github.com/pytorch/pytorch/pull/44288 and fixing the git history.

This allows users to bencmark code without having to specify how long to run the benchmark. It runs the benchmark until the variance (IQR / Median) is low enough that we can be confident in the measurement.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44607

Test Plan: There are unit tests, and we manually tested using Examples posted in git.

Reviewed By: robieta

Differential Revision: D23671208

Pulled By: bitfort

fbshipit-source-id: d63184290b88b26fb81c2452e1ae701c7d513d12
2020-09-13 20:55:40 -07:00
Ailing Zhang
51bab0877d Fix torch.hub for new zipfile format. (#42333)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/42239

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42333

Reviewed By: VitalyFedyunin

Differential Revision: D23215210

Pulled By: ailzhang

fbshipit-source-id: 161ead8b457c11655dd2cab5eecfd0edf7ae5c2b
2020-08-20 14:54:02 -07:00
alkad
14e75fbdb9 Remove py2 specific code from test_utils.py (#42105)
Summary:
As https://github.com/pytorch/pytorch/issues/23795 mentioned drop Python 2 support. albanD
Fixes https://github.com/pytorch/pytorch/issues/31796

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42105

Reviewed By: ngimel

Differential Revision: D22765768

Pulled By: mrshenli

fbshipit-source-id: bae114a21cd5598004c7f92d313938ad826b4a24
2020-07-28 08:25:40 -07:00
Taylor Robie
fab1795577 move benchmark utils into torch namespace (#41506)
Summary:
Move the timing utils to `torch.utils._benchmark`. I couldn't figure out how to get setuptools to pick it up and put it under `torch` unless it is in the `torch` directory. (And I think it has to be for `setup.py develop` anyway.)

I also modified the record function benchmark since `Timer` and `Compare` should always be available now.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41506

Reviewed By: ngimel

Differential Revision: D22601460

Pulled By: robieta

fbshipit-source-id: 9cea7ff1dcb0bb6922c15b99dd64833d9631c37b
2020-07-23 09:48:39 -07:00
MohamedAliRashad
f3f9415f81 Add file_name argument to load_state_dict_from_url (#39749)
Summary:
Add the feature proposed here https://github.com/pytorch/pytorch/issues/39196
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39749

Differential Revision: D21962736

Pulled By: ailzhang

fbshipit-source-id: b60fb0d83fd0728354a46e2762cc3598b14b1fdb
2020-06-12 10:31:22 -07:00
Nikita Shulga
c0d3d2f60f Retry/skip test on URLError rather than on HTTPError (#39477)
Summary:
`HTTPError` are raised when server is overloaded, while `URLError` is
raised when network is not available
And since `HTTPError` is an extension of `URLError`, `URLError` should catch both exceptions
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39477

Differential Revision: D21873560

Pulled By: malfet

fbshipit-source-id: 11806671b768705465f562087521ad4887fd20f7
2020-06-03 17:29:40 -07:00
Jithun Nair
41363b299a test_bottleneck_cuda works on ROCm 3.3 (#38249)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38249

Differential Revision: D21665097

Pulled By: ailzhang

fbshipit-source-id: cb2deab2fe8305db6fbe9ac4bfce4bb01cd9ff29
2020-05-28 17:48:29 -07:00
Nikita Shulga
0e8c65f756 Add timeout to TestBottleneck (#39191)
Summary:
Invoke `Popen.communicate` with `timeout` argument and kill the process in `TimeoutExpired` handler
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39191

Differential Revision: D21773510

Pulled By: malfet

fbshipit-source-id: 52b94315f8aa4d6c330dd5c9a8936100e49aef2d
2020-05-28 16:08:16 -07:00
Mike Ruberry
13120bf677 Updates assertEqual to require atol and rtol, removes positional atol (#38872)
Summary:
This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replace with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument.

In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872

Differential Revision: D21740237

Pulled By: mruberry

fbshipit-source-id: acbc027aa1d7877a49664d94db9a5fff91a07042
2020-05-27 06:31:07 -07:00
Martin Valgur
de8c888232 Fix torch.hub.hub_dir inconsistencies (#38969)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/38401

* `torch.hub.load_state_dict_from_url()` now also downloads to `$TORCH_HOME/hub/checkpoints` instead of `$TORCH_HOME/checkpoints` like `torch.hub.load()` and others.
* Make `hub_dir` private, add and use `get_dir()` instead.

Also updated docs. Did not see a need for additional unit tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38969

Differential Revision: D21725880

Pulled By: ailzhang

fbshipit-source-id: 58cc6b32ddbda91e58c1c1433cc3916223556ea1
2020-05-26 21:06:52 -07:00
Rohan Varma
63e545e0fe Revert D21717199: [pytorch][PR] Updates assertEqual to require atol and rtol, removes positional atol
Test Plan: revert-hammer

Differential Revision:
D21717199

Original commit changeset: 9feb856f94ee

fbshipit-source-id: bfde9c39a5ce99f0ca6183a7dde703c65b7c8259
2020-05-26 18:23:59 -07:00
Mike Ruberry
6ddca30b2d Updates assertEqual to require atol and rtol, removes positional atol (#38872)
Summary:
This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replace with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument.

In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872

Differential Revision: D21717199

Pulled By: mruberry

fbshipit-source-id: 9feb856f94eee911b44f6c7140a1d07c1b026d3a
2020-05-26 08:30:23 -07:00
Nikita Shulga
53aa7d8bc5 Add option to skip tests after retries (#38079)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38079

Differential Revision: D21470238

Pulled By: malfet

fbshipit-source-id: b2e63be34090c6f61acad8b6530658a835c68870
2020-05-07 21:56:29 -07:00
David Reiss
e75fb4356b Remove (most) Python 2 support from Python code (#35615)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615

Python 2 has reached end-of-life and is no longer supported by PyTorch.
Now we can clean up a lot of cruft that we put in place to support it.
These changes were all done manually, and I skipped anything that seemed
like it would take more than a few seconds, so I think it makes sense to
review it manually as well (though using side-by-side view and ignoring
whitespace change might be helpful).

Test Plan: CI

Differential Revision: D20842886

Pulled By: dreiss

fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed
2020-04-22 09:23:14 -07:00
Brian Vaughan
54ed6fd3ee Use both absolute and relative tolerance in testing (#34258)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34258

This PR allows both atol and rtol to be specified, uses defaults based on the prior analysis (spreadsheet attached to https://github.com/pytorch/pytorch/pull/32538), but retains the absolute tolerance behavior in cases where precision was previously specified explicitly.

Test Plan: Imported from OSS

Differential Revision: D21110255

Pulled By: nairbv

fbshipit-source-id: 57b3a004c7d5ac1be80ee765f03668b1b13f4a7e
2020-04-19 06:16:49 -07:00
Ailing Zhang
471ddacd8b Add retry decorator and use it for Hub tests. (#34829)
Summary:
fix https://github.com/pytorch/pytorch/issues/34751
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34829

Differential Revision: D20476231

Pulled By: ailzhang

fbshipit-source-id: eb38ee655e28250352b15e8e37b3b39310a7c378
2020-03-16 20:19:45 -07:00
Edgar Andrés Margffoy Tuay
1b746b95fb Consider hub_dir alongside TORCH_HOME env variable for storing hub models (#32844)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/31944
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32844

Differential Revision: D19747566

Pulled By: ailzhang

fbshipit-source-id: caca41a3a057d7d280d4783515aba2cc48c82012
2020-02-05 15:35:53 -08:00
Brian W. Hart
ea968f5cc3 fix possible pandas import error during tensorboard tests (#29650)
Summary:
TensorBoard tests using SummaryWriter() may fail with a pandas import
complaint if TensorFlow packages are installed in the same python
environment as PyTorch:

Traceback (most recent call last):
  File "test_tensorboard.py", line 212, in test_writer
    with self.createSummaryWriter() as writer:
  File "test_tensorboard.py", line 64, in createSummaryWriter
    return SummaryWriter(temp_dir)
...
  File "[...]/site-packages/pandas/core/arrays/categorical.py", line 52, in <module>
    import pandas.core.algorithms as algorithms
AttributeError: module 'pandas' has no attribute 'core'

The exact failure may depend on the pandas version. We've also seen:

  File "[...]/site-packages/pandas/core/arrays/categorical.py", line 9, in <module>
    import pandas.compat as compat
AttributeError: module 'pandas' has no attribute 'compat'

The module import chain leading to the failure is tensorboard imports
tensorflow imports tensorflow_estimator imports pandas. pandas includes
a submodule named 'bottleneck', whose name collides with the PyTorch
'test/bottleneck/' subdirectory.

So IF tensorboard, tensorflow, tensorflow_estimator, and pandas are
installed in the python environment AND IF testing is run from within
PyTorch's 'test/' directory (or maybe just with 'test/' in PYTHONPATH,
etc.), then TensorBoard tests using SummaryWriter() will fail.

Rename the 'bottleneck/' directory slightly to avoid the name collision.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29650

Differential Revision: D19698638

Pulled By: ezyang

fbshipit-source-id: cb59342ed407cb37aefc833d67f768a8809129ac
2020-02-04 14:27:46 -08:00
Pritam Damania
f050b16dd9 Move pytorch distributed tests to separate folder for contbuild. (#30445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445

Create distributed and rpc directories under caffe/test for better management
of unit tests.

Differential Revision: D18702786

fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606
2020-01-22 21:16:59 -08:00
Brian Wignall
f326045b37 Fix typos, via a Levenshtein-type corrector (#31523)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking.

Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523

Differential Revision: D19216749

Pulled By: mrshenli

fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea
2020-01-17 16:03:19 -08:00
Pritam Damania
fde94e7556 Provide async mode for local autograd engine. (#31230)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31230

A major issue with distributed autograd currently is that we block an
RPC thread when we call Engine::execute_with_graph_task.

To resolve this issue, I've made modifications to the local autograd engine
such that `execute_with_graph_task` returns a Future instead. The `execute()`
methods for Engine::execute() and DistEngine::execute() still wait() on this
Future which ensures there is no change in behavior yet.

In follow up PRs we can modify the distributed autograd engine to take
advantage of this Future.

Closes #26359
ghstack-source-id: 96298057

Test Plan: waitforbuildbot

Differential Revision: D18999709

fbshipit-source-id: 388f54467fd2415a0acb7df17bd063aedc105229
2020-01-05 00:29:28 -08:00
Heungsub Hans Lee
fa251cfd97 Fully deprecate variadic inputs of checkpoint_sequential (#25985)
Summary:
To support variadic inputs of `checkpoint_sequential` was deprecated at https://github.com/pytorch/pytorch/issues/21006. This case should be warned with `DeprecationWarning` for PyTorch 1.2, but it should be simply failed with `TypeError` since PyTorch 1.3. This patch removes the `DeprecationWarning` for PyTorch 1.2.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25985

Differential Revision: D18809875

Pulled By: albanD

fbshipit-source-id: e84dd8629c04979c4b2dc63e8ada94292e8cedd0
2019-12-05 09:23:28 -08:00
Negin Raoof
ebc216a076 Opset 11 updates (#28225)
Summary:
This PR contains:
1- pad updates for opset11 symbolic
2- Updated avg_pool for opset11
3- TopK updates for opset 11
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28225

Reviewed By: hl475

Differential Revision: D18282928

Pulled By: houseroad

fbshipit-source-id: aff2cabca9a155a9b475e35fed69a678544d6669
2019-11-04 12:16:12 -08:00
Ailing Zhang
d2eceee54b Fix hub when branch name contains slash. (#27960)
Summary:
fixes https://github.com/pytorch/pytorch/issues/27844
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27960

Differential Revision: D17964360

Pulled By: ailzhang

fbshipit-source-id: f5054fc251d2ebbf09ea4ea9fa4d1ce87db5fc52
2019-10-18 10:18:12 -07:00
Your Name
4bd8ae13c6 Move hipify to torch/utils to bundle them into torch package (#27425)
Summary:
Similar to https://github.com/pytorch/pytorch/pull/27418 but try to put it under "torch" namespace
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27425

Differential Revision: D17779490

Pulled By: bddppq

fbshipit-source-id: 688338d143509b37dfc110df17af3331db48a42b
2019-10-07 17:25:45 -07:00
Ailing Zhang
0f1fbc0eb2 Hub improvements (#26723)
Summary:
Resubmit of https://github.com/pytorch/pytorch/pull/25980.
Our old serialization was in tar (like `resnet18-5c106cde.pth` was in this format) so let's only support automatically unzip if checkpoints are zipfiles.
We can still manage to get it work with tarfile, but let's delay it when there's an ask.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26723

Differential Revision: D17551795

Pulled By: ailzhang

fbshipit-source-id: 00b4e7621f1e753ca9aa07b1fe356278c6693a1e
2019-09-25 08:21:50 -07:00
Ailing
9f1da984ef Enable hub tests on MacOS (#26697)
Summary:
fix https://github.com/pytorch/pytorch/issues/26032.
This was broken by a bad openssl release in conda. Should be fixed now. Testing...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26697

Differential Revision: D17542095

Pulled By: ailzhang

fbshipit-source-id: ba99f9b36ef2a7c793842cf91bd46fb2634ac1aa
2019-09-24 10:11:00 -07:00
Karl Ostmo
839e636fa1 Revert D17495679: [pytorch][PR] A few hub improvements
Test Plan: revert-hammer

Differential Revision:
D17495679

Original commit changeset: 695df3e803ad

fbshipit-source-id: 6c85bc980991971b08714f05155dd23147eed233
2019-09-23 23:38:19 -07:00
Ailing Zhang
1eaaf8b68b A few hub improvements (#25980)
Summary:
This PR does a few small improvements to hub:
- add support `verbose` option in `torch.load`. Note that this mutes hitting cache message but keeps the message of first download as suggested. fixes https://github.com/pytorch/pytorch/issues/24791
- add support loading state dict from tar file or zip file in `torch.hub.load_state_dict_from_url`.
- add `torch.hub.download_url_to_file` as public API, and add BC bit for `_download_url_to_file`.
- makes hash check in filename optional through `check_hash`, many users don't have control over the naming, relaxing this constraint could potentially avoid duplicating download code on user end.
- move pytorch CI off `pytorch/vision` and use `ailzhang/torchhub_example` as a dedicated test repo. fixes https://github.com/pytorch/pytorch/issues/25865
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25980

Differential Revision: D17495679

Pulled By: ailzhang

fbshipit-source-id: 695df3e803ad5f9ca33cfbcf62f1a4f8cde0dbbe
2019-09-23 17:24:19 -07:00
Pieter Noordhuis
d1d336168d Skip TestHub on macOS (#26033)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26033

[test macos]

Test Plan: Imported from OSS

Differential Revision: D17323698

Pulled By: pietern

fbshipit-source-id: 1b5805d2b0f693d05a807299df4941a6bb528801
2019-09-11 13:56:03 -07:00
SsnL
7d637de771 Reduce excessive CI printing in TestHub (#22043)
Summary:
https://github.com/pytorch/pytorch/pull/21132 reverted https://github.com/pytorch/pytorch/pull/19606.

Now these tests again print like 40% lines of CI outputs (e.g., https://circleci.com/gh/pytorch/pytorch/2041825?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link)

This PR now uses the functionality introduced in https://github.com/pytorch/vision/issues/862.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22043

Differential Revision: D15947268

Pulled By: ailzhang

fbshipit-source-id: f84f4d6b86203dbe8687e04ae3ed8c99df0bdff8
2019-06-21 20:08:44 -07:00
Ailing Zhang
be9ce6318e remove import torchvision when testing torch.hub (#21132)
Summary:
This should pass once https://github.com/pytorch/vision/pull/971 is merged.
To remove torchvision as baseline, we just compare to sum of all param.sum() in pretrained resnet18 model, which means we need to manually update the number only when that pretrained weights are changed, which is generally rare.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21132

Differential Revision: D15563078

Pulled By: ailzhang

fbshipit-source-id: f28c6874149a1e6bd9894402f6847fd18f38b2b7
2019-05-31 07:38:30 -07:00
Hans Lee
ffdce79078 Deprecate variadic inputs of checkpoint_sequential (#21006)
Summary:
I've reported inconsistency between `checkpoint_sequential` and `nn.Sequential` at https://github.com/pytorch/pytorch/issues/19260. Both should provide the same input signature but they don't. I think the consistency is important and I agree with apaszke that `nn.Sequential`'s semantics should be kept instead of `checkpoint_sequential`.

I hope `checkpoint_sequential` raises `TypeError` on variadic arguments since PyTorch 1.2.0. But for now, it's okay just to warn as `DeprecationWarning`. I've talked about this approach with soumith.

Please review this pull request. Any comment will be my pleasure.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21006

Differential Revision: D15530801

Pulled By: soumith

fbshipit-source-id: 0ceb2cc6a17dcc547d0d00ebaf9df8603be53183
2019-05-28 21:33:45 -07:00
Tongzhou Wang
3b4d4ef503 Remove unnecessary printing from tests
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19606

Differential Revision: D15046583

Pulled By: ezyang

fbshipit-source-id: ea9bb691d23855e7eddbabe68bf112a726641ba4
2019-04-23 09:24:08 -07:00
Ailing Zhang
2787f1d8ed hub minor fixes (#19247)
Summary:
A few improvements while doing bert model
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19247

Differential Revision: D14989345

Pulled By: ailzhang

fbshipit-source-id: f4846813f62b6d497fbe74e8552c9714bd8dc3c7
2019-04-17 21:04:33 -07:00
Ailing Zhang
e54cb03a51 add/move a few apis in torch.hub (#18758)
Summary:
* `torch.hub.list('pytorch/vision')` - show all available hub models in `pytorch/vision`
* `torch.hub.show('pytorch/vision', 'resnet18')` - show docstring & example for `resnet18` in `pytorch/vision`
* Moved `torch.utils.model_zoo.load_url` to `torch.hub.load_state_dict_from_url` and deprecate `torch.utils.model_zoo`
* We have too many env to control where the cache dir is, it's not very necessary. I actually want to unify `TORCH_HUB_DIR`, `TORCH_HOME` and `TORCH_MODEL_ZOO`, but haven't done it. (more suggestions are welcome!)
* Simplify `pytorch/vision` example in doc, it was used to show how how hub entrypoint can be written so had some confusing unnecessary args.

An example of hub usage is shown below
```

In [1]: import torch

In [2]: torch.hub.list('pytorch/vision', force_reload=True)
Downloading: "https://github.com/pytorch/vision/archive/master.zip" to /private/home/ailzhang/.torch/hub/master.zip
Out[2]: ['resnet18', 'resnet50']

In [3]: torch.hub.show('pytorch/vision', 'resnet18')
Using cache found in /private/home/ailzhang/.torch/hub/vision_master

    Resnet18 model
    pretrained (bool): a recommended kwargs for all entrypoints
    args & kwargs are arguments for the function

In [4]: model = torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
Using cache found in /private/home/ailzhang/.torch/hub/vision_master
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18758

Differential Revision: D14883651

Pulled By: ailzhang

fbshipit-source-id: 6db6ab708a74121782a9154c44b0e190b23e8309
2019-04-10 23:10:39 -07:00
Edward Yang
173f224570 Turn on F401: Unused import warning. (#18598)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**

This was requested by someone at Facebook; this lint is turned
on for Facebook by default.  "Sure, why not."

I had to noqa a number of imports in __init__.  Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it.  Left for future work.

Be careful!  flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments.  flake8-3 will
report an import unused; flake8-2 will not.  For now, I just
noqa'd all these sites.

All the changes were done by hand.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14687478

fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
2019-03-30 09:01:17 -07:00
Choongwoo Han
40074d647c Allow None for checkpoint (#17969)
Summary:
Currently, we cannot run a checkpointed function with None argument.

```python
out = torch.utils.checkpoint.checkpoint(run_fn, input_var, None)
```

```
  File "/home/tunz/anaconda3/envs/torchdev/lib/python3.7/site-packages/torch/utils/checkpoint.py", line 14, in detach_variable
    x = inp.detach()
AttributeError: 'NoneType' object has no attribute 'detach'
```

This PR makes checkpoint function to safely handle None argument.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17969

Differential Revision: D14475148

Pulled By: ezyang

fbshipit-source-id: 9afe9e9aac511a6df1e1620e9ac341536890d451
2019-03-15 07:38:41 -07:00
Edward Yang
9089182ce4 Fix lint in test_utils.py (#17944)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17944
ghimport-source-id: 5b45086428b5a36e737882c78f285141121fd1bc

Stack:
* **#17944 Fix lint in test_utils.py**

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14430132

fbshipit-source-id: b00de7b4c685645ad5a4dc8c5fe6ce7e1893a3eb
2019-03-13 09:02:35 -07:00
Johannes M Dieterich
23e1c55cc0 enable unit tests working on ROCm 2.1 (#16871)
Summary:
This is the first round of enabling unit tests that work on ROCm 2.1 in my tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16871

Differential Revision: D13997662

Pulled By: bddppq

fbshipit-source-id: d909a3f7dd5fc8f85f126bf0613751c8e4ef949f
2019-02-09 00:30:50 -08:00
Andy Chen
33ea7eafef Make checkpoint_sequential work with multiple arguments (#14278)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14278

In this commit, we make checkpoint_sequential work for models with multiple tensor inputs. Previously, it only processed the first tensor and ignored the rest.

We introduce a new test in test/test_utils.py that replicates the issue referenced in this [GitHub issue](https://github.com/pytorch/pytorch/issues/11093), and we make sure that the test passes by changing the behavior of checkpoint_sequential to process all input tensors.

Reviewed By: ezyang

Differential Revision: D13144672

fbshipit-source-id: 24f58233a65a0f5b80b89c8d8cbced6f814004f7
2018-12-04 18:47:43 -08:00
Michael Carilli
c36156eded Option to preserve bitwise accuracy of gradient checkpointed vs non-checkpointed dropout (#14253)
Summary:
This issue was noticed, and fix proposed, by raulpuric.

Checkpointing is implemented by rerunning a forward-pass segment for each checkpointed segment during backward.  This can result in the RNG state advancing more than it would without checkpointing, which can cause checkpoints that include dropout invocations to lose end-to-end bitwise accuracy as compared to non-checkpointed passes.

The present PR contains optional logic to juggle the RNG states such that checkpointed passes containing dropout achieve bitwise accuracy with non-checkpointed equivalents.**  The user requests this behavior by supplying `preserve_rng_state=True` to `torch.utils.checkpoint` or `torch.utils.checkpoint_sequential`.

Currently, `preserve_rng_state=True` may incur a moderate performance hit because restoring MTGP states can be expensive.  However, restoring Philox states is dirt cheap, so syed-ahmed's [RNG refactor](https://github.com/pytorch/pytorch/pull/13070#discussion_r235179882), once merged, will make this option more or less free.

I'm a little wary of the [def checkpoint(function, *args, preserve_rng_state=False):](https://github.com/pytorch/pytorch/pull/14253/files#diff-58da227fc9b1d56752b7dfad90428fe0R75) argument-passing method (specifically, putting a kwarg after a variable argument list).  Python 3 seems happy with it.
Edit:  It appears Python 2.7 is NOT happy with a [kwarg after *args](https://travis-ci.org/pytorch/pytorch/builds/457706518?utm_source=github_status&utm_medium=notification).  `preserve_rng_state` also needs to be communicated in a way that doesn't break any existing usage.  I'm open to suggestions (a global flag perhaps)?

**Batchnorm may still be an issue, but that's a battle for another day.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14253

Differential Revision: D13166665

Pulled By: soumith

fbshipit-source-id: 240cddab57ceaccba038b0276151342344eeecd7
2018-11-23 08:09:43 -08:00
Ailing Zhang
4a3baec961 Hub Implementation (#12228)
Summary:
[Edit: after applied colesbury 's suggestions]
* Hub module enable users to share code + pretrained weights through github repos.
Example usage:
```
hub_model = hub.load(
     'ailzhang/vision:hub', # repo_owner/repo_name:branch
     'wrapper1', # entrypoint
      1234, # args for callable [not applicable to resnet18]
      pretrained=True) # kwargs for callable
```
* Protocol on repo owner side: example https://github.com/ailzhang/vision/tree/hub
     * The "published" models should be at least in a branch/tag. It can't be a random commit.
     * Repo owner should have the following field defined in `hubconf.py`
        * function/entrypoint with function signature `def wrapper1(pretrained=False, *args, **kwargs):`
        * `pretrained` allows users to load pretrained weights from repo owner.
        * `args` and `kwargs` are passed to the callable `resnet18`, repo owner should clearly specify their help message in the docstring

```
def wrapper1(pretrained=False, *args, **kwargs):
    """
    pretrained (bool): a recommended kwargs for all entrypoints
    args & kwargs are arguments for the function
    """
    from torchvision.models.resnet import resnet18
    model = resnet18(*args, **kwargs)
    checkpoint = 'https://download.pytorch.org/models/resnet18-5c106cde.pth'
    if pretrained:
        model.load_state_dict(model_zoo.load_url(checkpoint, progress=False))
    return model
```
* Hub_dir
    * `hub_dir` specifies where the intermediate files/folders will be saved. By default this is `~/.torch/hub`.
    * Users can change it by either setting the environment variable `TORCH_HUB_DIR` or calling `hub.set_dir(PATH_TO_HUB_DIR)`.
    * By default, we don't cleanup files after loading so that users can use cache next time.

* Cache logic :
    * We used the cache by default if it exists in `hub_dir`.
    * Users can force a fresh reload by calling `hub.load(..., force_reload=True)`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12228

Differential Revision: D10511470

Pulled By: ailzhang

fbshipit-source-id: 12ac27f01d33653f06b2483655546492f82cce38
2018-10-29 18:43:14 -07:00
Zachary DeVito
dae7616078 Shard all of tests based on how many tests exist. (#13160)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13160

Reduces pytorch_core build from 2 hours to 30 minutes

Reviewed By: soumith, dzhulgakov

Differential Revision: D10524261

fbshipit-source-id: 97270ac73404b5ea4c264cd0e9d8d4b1be79b0e9
2018-10-26 18:20:34 -07:00
James Sun
f4944f0f8a Rename test/common.py to test/common_utils.py (#12794)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12794

common.py is used in base_module for almost all tests in test/. The
name of this file is so common that can easily conflict with other dependencies
if they happen to have another common.py in the base module. Rename the file to
avoid conflict.

Reviewed By: orionr

Differential Revision: D10438204

fbshipit-source-id: 6a996c14980722330be0a9fd3a54c20af4b3d380
2018-10-17 23:04:29 -07:00
vishwakftw
dcd9d73d47 Expunge torch.utils.trainer.* (#12487)
Differential Revision: D10273602

Pulled By: SsnL

fbshipit-source-id: 630c1f8ee0e366f7092d4f93dbe1efa96fc860e0
2018-10-09 14:56:00 -07:00
Roy Li
15d28e400f remove support for c extensions (#12122)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12122

We are deprecating support for c extensions. Please use cpp extension in the future.

Reviewed By: Yangqing

Differential Revision: D10060541

fbshipit-source-id: 4f7149e06a254bd7af463fd7aa9740f65369963a
2018-10-01 13:55:28 -07:00
Christian Puhrsch
d8f6be686d Remove torch/legacy (#11823)
Summary:
Largely unused and hinders current development
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11823

Differential Revision: D9925094

Pulled By: cpuhrsch

fbshipit-source-id: c797f62180e2128f9a567b0c57c8347957470ea5
2018-09-20 14:00:54 -07:00
Jesse Hellemn
5c0d9a2493 Soumith's last few patches to v0.4.1
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10646

Reviewed By: ml7

Differential Revision: D9400556

Pulled By: pjh5

fbshipit-source-id: 1c9d54d5306f93d103fa1b172fa189fb68e32490
2018-08-20 18:28:27 -07:00
iotamudelta
75651d5b58 improve use of ROCm libraries, enable more tests, small fixes (#10406)
Summary:
* some small leftovers from the last PR review
* enable more unit test sets for CI
* replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND)
* use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2
* use strided_batched gemm interface also from the batched internal interface
* re-enable Dropout.cu as we now have philox w/ rocRAND
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406

Reviewed By: Jorghi12

Differential Revision: D9277093

Pulled By: ezyang

fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2
2018-08-13 11:39:43 -07:00
iotamudelta
cfa05706ef ROCm contributions week 29 (#9653)
Summary:
In this changeset:
* improvements to `hipify-python.py`
* marking unit tests broken for ROCm
* reducing the number of jobs for the built to avoid out of memory issues
* switch to Thrust/cub-hip master for the CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9653

Differential Revision: D9117791

Pulled By: ezyang

fbshipit-source-id: a6c3c7b81f2bda9825974bf9bf89a97767244352
2018-08-02 09:09:00 -07:00
Edward Yang
9413fabb0b Nuke TestCollectEnv (#9459)
Summary:
The tests were too flaky, and the procedure for legitimately
updating versions of software too onerous, to warrant continually
testing these.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9459

Reviewed By: zou3519

Differential Revision: D8852357

Pulled By: ezyang

fbshipit-source-id: 24e99cd00b4252cdeec2a1d9af92456b4a54912a
2018-07-16 13:10:28 -07:00
Richard Zou
7310229426 Fix TestCollectEnv flakiness (#8983)
Summary:
The problem was a bad regex; the version hash match used to match 6
wildcards. This PR changes it to match \w+, which is sufficient for the
test because the version hash is always followed by either whitespace or
a right-paren.

Fixes #8981
Closes https://github.com/pytorch/pytorch/pull/8983

Differential Revision: D8677771

Pulled By: zou3519

fbshipit-source-id: dfdde98669bcd682335145cba98c82530a815afa
2018-06-28 11:45:37 -07:00
Tongzhou Wang
055f527242
[build] Use conda cmake in two CI builds (#8864)
* use conda cmake in pytorch-linux-xenial-cuda8-cudnn6-py2 and pytorch-linux-xenial-cuda9-cudnn6-py3

* update test_expect

* add exit 1

* check cmake 3.5

* bump expect driver version

* add back space
2018-06-26 17:22:04 -04:00
Tongzhou Wang
2b926aafb0 [build] disable test_expect for pinning cmake to 3.5* in dockerfiles repo (#8850)
* pin pytorch-linux-xenial* to use cmake 3.5*

* disable test_expect
2018-06-25 14:21:42 -04:00
anderspapitto
48e90e3339 Build system changes (#8627)
* All changes needed to get rid of process_github.sh

* allow thnn_h_path
2018-06-20 17:45:26 -04:00
Wei Yang
ae55865a3b Migrated hardshrink() to ATen and deprecated nn.Hardshrink() (#8117)
* 1. added hardshrink() to ATen (CPU + GPU); 2. removed nn.Hardshrink(); 3. reusing previous tests for nn.Hardshrink() and included CUDA tests at test_nn; 4. default parameter lambda=0.5 is not working yet

* optimized memory read/write

* 1. pass in lambd as scalar for CPU/CUDA_apply*; 2. removed tests for hardshrink at test_legacy_nn

* fixes test_utils

* 1. replace zeros_like with empty_like; 2. use scalar_cast in cuda

* 1. printing lambd value; 2. default lambd=0.5 is still failing

* getting around Scalar bug buy removing default value of lambd from native_functions.yaml, and declare it at nn/functional.py

* cleaned up debug printf
2018-06-14 16:42:20 -04:00
Will Feng
c84b97b979 [READY TO MERGE] Enable tests that use DataLoader with multiple workers on Windows (#6745)
* Don't import TEST_CUDA for test_dataloader on Windows

* test_partial_workers is stuck on Windows
2018-06-06 22:50:39 -04:00
Will Feng
b41050ff66 Re-enable build env check (#7969)
* Re-enable build env check

* Fix linux test error

* Try to fix macOS test error
2018-06-01 06:57:47 -04:00
tvn
146b951ec5 Fix seeding random module in DataLoader (#7886)
* fix seeding random module

* make base seed int

* follow 0.4 idiom

* add a test for random seeding
2018-05-29 15:55:04 -04:00
Will Feng
8d91a602cc
Temporarily disable build env check (#7768) 2018-05-22 12:51:00 -07:00
avmgithub
f02ae65727 skip test_utils.TestFFI.test_cpu for ppc64le due to incompatible exception handling (#7422) 2018-05-09 11:45:30 -04:00
Richard Zou
b3be71f046
[easy] Stop hardcoding "python" executable in bottleneck tests (#7105)
Right now, the bottleneck test_utils.py tests assume that a user's
python executable is 'python'. This may not be the case especially if
the user has multiple versions of python installed. This PR changes it
so that test_utils.py uses `sys.executable` as the python executable.
2018-04-30 22:01:36 -04:00
Edward Z. Yang
4caea64d72
Make all of TH and THC C++. (#6913)
Changelist:

- Move *.c to *.cpp
- Change includes of ".c" to ".cpp"
- A bunch of cmake configuration modifying CMAKE_C_FLAGS changed
to CMAKE_CXX_FLAGS or add_compile_options, because if you do CMAKE_C_FLAGS it only applies when you compile C code
- Explicitly cast void* to T* in a number of places
- Delete extern "C" { ... } blocks; instead, properly apply TH_API to everything that should have it (TH_API handles extern "C")
- Stop using stdatomic.h, instead, use <atomic>. This resulted in a bunch of placement-new/delete to be "totally properly correct"
- Refactor of THLongStorageView to not have static constructor methods (since it no longer has a copy/move constructor)
- Documentation about how the TH C interface (and extern C business) works
- Note that THD master_worker mode is dead
- C++ headers in TH libraries are given .hpp suffix, to make it less likely that you'll confuse them with the C-compatible headers (now suffixed .h)
- New function THCStream_stream and THCStream_device to project out fields of THCStream instead of accessing fields directly
- New function THStorage_(retainIfLive), which is equivalent to a retain but only if the refcount is greater than zero.
- In general, I tried to avoid using hpp headers outside of ATen/TH. However, there were a few places where I gave up and depended on the headers for my own sanity. See Note [TH abstraction violation] for all the sites where this occurred. All other sites were refactored to use functions
- Some extra Werror fixes (char* versus const char*)
2018-04-28 07:45:02 -04:00
cpuhrsch
ae35e0e924 Support non-contiguous tensors for unary ops (#6119) 2018-04-27 21:31:34 +02:00
Richard Zou
4040164097 Relax collect_env.py tests (#6859)
This PR makes it so that the collect_env.py tests ignore the most minor
number of most version strings. It also bumps the version up to 0.5.0a
to fix the CI.
2018-04-23 10:28:41 -04:00
Richard Zou
7a3c38ab59 Add environment collection script (#6635)
* Add environment collection script

Fixes #6111. This should make it easier for users to report bugs by giving
them a script to collect system environment information.

Changes include:
- Refactor out the environment collecting code from utils.bottleneck
- Add script (collect_env.py)
- Cleaned up the issues template so that it suggests using the script
  and is more readable.

Testing: added expect tests to go with 4 CI configurations. Whenever one
of these configurations gets updated, the test will fail until the test
also gets updated.

* Expect tests

* Update issue template

* Fix random space

* Minor improvement to issue template; fix expect test

* Skip expect test if BUILD_ENVIRONMENT not found; test fix; split off smoke/expect test
2018-04-22 15:18:14 -04:00
Tongzhou Wang
1c01eabd3c
Codemod to update our codebase to 0.4 standard (#6641)
* Codemod to update our codebase to 0.4 standard

* Update some of the test scri[ts

* remove Variable in test_clip_grad_value

* fix _symbolic_override_wrapper_maker
2018-04-17 22:06:54 -04:00
Priya Goyal
e3196e0ea8
[Re-checkpointing] Autograd container for trading compute for memory (#6467)
* Autograd container for trading compute for memory

* add a unit test for checkpoint

* address comments

* address review comments

* adding some docs for the checkpoint api

* more comments

* more comments

* repro bug

* Fix a subtle bug/apply some review comments

* Update checkpoint.py

* Run everything in grad mode

* fix flake and chunk=1

* use imperative backward as per discussion

* remove Variable and also add models and test for models

* Add a simple thread local variable to check for autograd grad mode

* remove models and models test after debugging

* address review comments

* address more comments

* address more comments
2018-04-10 15:26:24 -04:00
Richard Zou
1b3a5a4e7d bottleneck supports better user-provided arguments (#6425)
Fixes #6312.

Changed bottleneck's arg parser to user argparse.REMAINDER. This lets
the user specify args as `python -m torch.utils.bottleneck script.py
[args]` (previously, a -- was needed after `bottleneck` and before
`script.py`).
2018-04-09 13:57:26 -04:00
Will Feng
15a981e75a Disable TestBottleneck test_cuda on Windows (#5977) 2018-03-24 08:00:32 -04:00
Richard Zou
feb2785c5c Implement torch.util.bottleneck (#5216)
* Implement torch.util.bottleneck

This is a tool that is intended to be used as initial exploratory
debugging of bottlenecks in user scripts. Run it with

    python -m torch.utils.bottleneck /path/to/source/script.py

* Refactor and address comments

* Fix tests

* Allow passing of args to the profiled script

* Replace Variable
2018-03-23 17:27:35 -04:00
Will Feng
5ce46be17c Disable test_multi_keep on Windows (#5314) 2018-02-20 15:00:53 -05:00
Will Feng
9193dfd185 Disable test_multi_drop on Windows (#5290) 2018-02-17 20:49:12 -08:00
Dmytro Dzhulgakov
709fcfda8a Now actually fix padding (the tests are added in onnx-pytorch) (#3893)
* Now actually fix padding (the tests are added in onnx-pytorch)

* fix test
2017-11-27 23:39:48 -05:00
peterjc123
aa911939a3 Improve Windows Compatibility (for csrc/scripts) (#2941) 2017-11-08 19:51:35 +01:00
Lu Fang
36895e2dd2 update the comments, move the expect check logic into the helper function 2017-10-16 16:57:16 -04:00
Lu Fang
a1deb2d47f Move the exception logic to the helper function 2017-10-16 16:57:16 -04:00
Lu Fang
cad9438bb9 Add unit tests for onnx helper functions 2017-10-16 16:57:16 -04:00
Sam Gross
10e23943b3 Fix missing _forward_pre_hooks in serialized modules (#2057) 2017-07-11 18:23:35 -04:00
Xingdong Zuo
9f2a5d804d Add a flag to fix when dataset size is not divisible by batch size. (#1133) 2017-04-06 00:18:43 -04:00
Sam Gross
c4d1318662 Fix map_location in torch.load (#1006) 2017-03-15 16:54:19 -04:00
Luke Yeager
61bd5a0643 [Lint] Address F811 2017-02-27 19:33:00 -05:00
Adam Lerer
e71cf20192 improved serialization (no tar copy) (#713) 2017-02-22 22:24:20 +01:00
Luke Yeager
e7c1e6a8e3 [pep8] Fix most lint automatically with autopep8
Here's the command I used to invoke autopep8 (in parallel!):

    git ls-files | grep '\.py$' | xargs -n1 -P`nproc` autopep8 -i

Several rules are ignored in setup.cfg. The goal is to let autopep8
handle everything which it can handle safely, and to disable any rules
which are tricky or controversial to address. We may want to come back
and re-enable some of these rules later, but I'm trying to make this
patch as safe as possible.

Also configures flake8 to match pep8's behavior.

Also configures TravisCI to check the whole project for lint.
2017-01-28 01:15:51 +01:00
Adam Paszke
a1fa995044 Fixes and improvements (#593)
* Fix error in ELU backward

* Add --seed flag for testst st

* Add test for BatchNorm eval

* Fix autograd.backward docs

* Support cc flags in cuDNN search

* Fix IndexSelect backward formula
2017-01-25 22:21:49 -05:00
Adam Paszke
95f0fa8a92 Change .grad attribute of Variables to be a Variable 2017-01-16 12:59:47 -05:00
Soumith Chintala
f45d75ed22 make the CUDA-aware tests backoff if CUDA no available 2016-12-24 15:36:00 -05:00
Adam Paszke
bcfa2d6c79 Add .t7 file reader 2016-11-25 00:41:55 +01:00
Soumith Chintala
6b821ece22 fixing trainer tests (#213) 2016-11-08 21:50:17 -05:00
Sam Gross
6db721b5dd Make DataLoader preserve the ordering of the dataset (#135) 2016-10-21 23:54:16 -04:00
Adam Paszke
3cbe66ba8c Change requires_grad default to False 2016-10-05 08:46:34 -07:00
soumith
33371c5164 ffi tests skip on cuda 2016-10-03 12:15:28 -07:00
Adam Lerer
a1f5fe6a8f Add multiprocess data loader + improvements to torch.utils.data 2016-09-30 16:23:43 -04:00
Adam Paszke
c92c82aa1a Really fix utils tests... 2016-09-29 12:52:12 -07:00
Adam Paszke
9c6ced1c0a Disable ffi tests if cffi is not available 2016-09-29 12:16:19 -07:00
Adam Paszke
941cf4e63d Add ffi utils for user C extensions 2016-09-29 09:35:56 -07:00
Adam Paszke
8fdec15a55 Codemod to remove camel case method naming 2016-09-20 08:40:28 -07:00
Adam Paszke
1703f2abed Add utils tests 2016-09-08 19:07:00 -07:00