Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31161
Previously, it wasn't necessary to specify `DT_NEEDED` in C++ extensions on Linux (i.e., pass `-l` flags) because all of the symbols would have already been loaded with `RTLD_GLOBAL`, so there wouldn't be any undefined symbols. But when we switch to loading `_C` with `RTLD_LOCAL`, it becomes necessary for all C++ extensions to know what libraries to link with. The resulting code is clearer and more uniform, so it's a win all around.
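For illustration only (not the exact change in this PR), here is a minimal sketch of a C++ extension setup.py that names the torch libraries it links against; the extension name, source file, and `libraries` list are assumptions, and in practice `CppExtension` fills most of this in:
```python
# Minimal sketch, assuming a hypothetical extension built from my_ext.cpp:
# with RTLD_LOCAL, the extension must record DT_NEEDED entries for the torch
# libraries instead of relying on symbols preloaded with RTLD_GLOBAL.
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CppExtension

setup(
    name='my_ext',
    ext_modules=[
        CppExtension(
            name='my_ext',
            sources=['my_ext.cpp'],
            # Explicit -l flags (illustrative list).
            libraries=['c10', 'torch', 'torch_python'],
        ),
    ],
    cmdclass={'build_ext': BuildExtension},
)
```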
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D19262578
Pulled By: ezyang
fbshipit-source-id: a893cc96f2e9aad1c064a6de4f7ccf79257dec3f
Summary:
When calling the add_images() method on the tensorboard SummaryWriter with a uint8 NCHW tensor, the tensor is incorrectly scaled, resulting in overflow behavior. This leads to incorrect images being displayed in tensorboard.
Issue: https://github.com/pytorch/pytorch/issues/31459
Local Testing (ran this code with and without the PR changes and printed scale_factor):
```
import torch
import torchvision
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()
x = torch.tensor([[[[1, 2, 3], [4, 5, 6]]]], dtype=torch.uint8)
writer.add_images("images", x)
```
Before- scale_factor: 255, After- scale_factor: 1
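For reference, a hedged sketch of the scaling rule the test exercises (illustrative only, not the exact code in `torch/utils/tensorboard/summary.py`):
```python
import numpy as np
import torch

# Illustrative scale-factor rule: uint8 images are already in [0, 255], so
# only floating-point images in [0, 1] should be multiplied by 255.
def calc_scale_factor(tensor):
    converted = tensor.numpy() if isinstance(tensor, torch.Tensor) else tensor
    return 1 if converted.dtype == np.uint8 else 255
```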
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31778
Differential Revision: D19289189
Pulled By: anjali411
fbshipit-source-id: 350a1650337244deae4fd8f8b7fb0e354ae6986b
Summary:
The original `check-and-act` style can raise `FileExistsError` when multiple processes are jit-compiling the extension on the same node.
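A minimal sketch of the act-then-handle pattern for the directory-creation step (the helper name is illustrative, not the code in the patch):
```python
import os

def ensure_build_directory(path):
    # Creating the directory unconditionally tolerates another process
    # winning the race, unlike `if not os.path.exists(path): os.mkdir(path)`,
    # which can raise FileExistsError between the check and the mkdir.
    os.makedirs(path, exist_ok=True)
```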
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30956
Differential Revision: D19262570
Pulled By: ezyang
fbshipit-source-id: bb18c72e42648770b47f9378ac7c3929c3c03efc
Summary:
For a long format string, using a named placeholder makes the code easier to read.
When building a dict, a literal is more readable and faster than the dict constructor.
I always appreciate your efforts in creating the world's best frameworks.
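An illustrative comparison of the two dict-construction styles mentioned above (the keys are made up):
```python
# The literal avoids the global name lookup and call overhead of dict()
# and reads more directly.
opts_ctor = dict(device='cuda', non_blocking=True)    # constructor call
opts_lit = {'device': 'cuda', 'non_blocking': True}   # literal
```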
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31352
Differential Revision: D19191967
Pulled By: ngimel
fbshipit-source-id: 21f063b163b67de8cf9761a4db5991f74318e991
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31151
Same as title. I am not sure why this was not added in the first place.
Test Plan: wait for build to succeed.
Reviewed By: bddppq, xw285cornell
Differential Revision: D18880216
fbshipit-source-id: 8b17d4fbd5dd08c28c52df8b1da77b69d56d65dc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31047
Changelist:
- remove BUILD_NAMEDTENSOR from .cu files
- remove BUILD_NAMEDTENSOR special handling in function_wrapper.py
- remove BUILD_NAMEDTENSOR from cpp_extension.py. This code actually
did nothing because we always compile with BUILD_NAMEDTENSOR.
Test Plan: - run tests
Differential Revision: D18908442
Pulled By: zou3519
fbshipit-source-id: b239e24de58580adaf3cef573350773a38b1e4f0
Summary:
Copy-paste comment from code for reasoning:
```
# NOTE [ IterableDataset and __len__ ]
#
# For `IterableDataset`, `__len__` could be inaccurate when one naively
# does multi-processing data loading, since the samples will be duplicated.
# However, no real use case should be actually using that behavior, so
# it should count as a user error. We should generally trust user
# code to do the proper thing (e.g., configure each replica differently
# in `__iter__`), and give us the correct `__len__` if they choose to
# implement it (this will still throw if the dataset does not implement
# a `__len__`).
#
# To provide a further warning, we track if `__len__` was called on the
# `DataLoader`, save the returned value in `self._len_called`, and warn
# if the iterator ends up yielding more than this number of samples.
```
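As a hedged illustration of "configure each replica differently in `__iter__`" (not part of the patch), a dataset can shard its work per worker so `__len__` stays accurate under multi-process loading:
```python
import math

from torch.utils.data import IterableDataset, get_worker_info

class RangeDataset(IterableDataset):
    def __init__(self, start, end):
        self.start, self.end = start, end

    def __len__(self):
        return self.end - self.start

    def __iter__(self):
        info = get_worker_info()
        if info is None:   # single-process loading: yield everything
            lo, hi = self.start, self.end
        else:              # worker process: yield only this worker's shard
            per_worker = int(math.ceil((self.end - self.start) / float(info.num_workers)))
            lo = self.start + info.id * per_worker
            hi = min(lo + per_worker, self.end)
        return iter(range(lo, hi))
```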
Fixes https://github.com/pytorch/pytorch/issues/30184
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23587
Differential Revision: D18852625
Pulled By: ailzhang
fbshipit-source-id: aea8d4d70c7f21aaa69b35908a6f43026493d826
Summary:
Support for variadic inputs to `checkpoint_sequential` was deprecated at https://github.com/pytorch/pytorch/issues/21006. That usage was meant to emit a `DeprecationWarning` for PyTorch 1.2 and then simply fail with a `TypeError` from PyTorch 1.3 on. This patch removes the PyTorch 1.2 `DeprecationWarning`.
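For illustration (not from the patch), the supported call shape takes a single input tensor rather than variadic inputs:
```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 16))
x = torch.randn(4, 16, requires_grad=True)

# Two checkpointed segments over the sequential model; passing multiple
# positional inputs here is the usage that now fails with TypeError.
out = checkpoint_sequential(model, 2, x)
out.sum().backward()
```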
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25985
Differential Revision: D18809875
Pulled By: albanD
fbshipit-source-id: e84dd8629c04979c4b2dc63e8ada94292e8cedd0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30315
The new structure is that libtorch_cpu contains the bulk of our
code, and libtorch depends on libtorch_cpu and libtorch_cuda.
This is a reland of https://github.com/pytorch/pytorch/pull/29731 but
I've extracted all of the prep work into separate PRs which can be
landed before this one.
Some things of note:
* torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library)
* The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774
* In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This led to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO"
* A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly
* I had to torch_cpu/torch_cuda caffe2_interface_library so that they get whole-archived linked into torch when you statically link. And I had to do this in an *exported* fashion because torch needs to depend on torch_cpu_library. In the end I exported everything and removed the redefinition in the Caffe2Config.cmake. However, I am not too sure why the old code did it in this way in the first place; however, it doesn't seem to have broken anything to switch it this way.
* There's some uses of `__HIP_PLATFORM_HCC__` still in `torch_cpu` code, so I had to apply it to that library too (UGH). This manifests as a failure when trying to run the CUDA fuser. This doesn't really matter substantively right now because we still in-place HIPify, but it would be good to fix eventually. This was a bit difficult to debug because of an unrelated HIP bug, see https://github.com/ROCm-Developer-Tools/HIP/issues/1706
Fixes #27215 (as our libraries are smaller), and executes on
part of the plan in #29235.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D18790941
Pulled By: ezyang
fbshipit-source-id: 01296f6089d3de5e8365251b490c51e694f2d6c7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30430
When a module isn't a TracedModule, attempt to get name information with `original_name` property on module and default to 'Module' when no such property exists.
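A minimal sketch of the described fallback (the helper name is hypothetical, not the code in the patch):
```python
def module_display_name(module):
    # Scripted modules expose `original_name`; anything without it gets the
    # generic 'Module' label so graph construction never fails on the name.
    return getattr(module, 'original_name', 'Module')
```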
Test Plan:
### Change child module to scripted module:
```
model = torchvision.models.alexnet()
model.classifier = torch.jit.script(model.classifier)
```
### Add graph
```
w = SummaryWriter()
w.add_graph(model, torch.rand((2, 3, 224, 224)))
w.close()
```
### No errors
However, the graph is disconnected in parts and hard to understand.
{F223327878}
Reviewed By: sanekmelnikov
Differential Revision: D18690836
fbshipit-source-id: 42295d06b7c1d48d5401776dca1e0d12cd64b49d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30286
add_hparams() in torch.utils.tensorboard.writer produced the following error:
```
python3.7/site-packages/torch/utils/tensorboard/writer.py", line 294, in add_hparams
    with SummaryWriter(log_dir=os.path.join(self.file_writer.get_logdir(), str(time.time()))) as w_hp:
AttributeError: 'NoneType' object has no attribute 'get_logdir'
```
Other methods such as add_scalar() and add_histogram() use self._get_file_writer() instead of self.file_writer directly.
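A hedged sketch of the described fix (illustrative helper, not the exact diff in `writer.py`):
```python
import os
import time

from torch.utils.tensorboard import SummaryWriter

def hparams_logdir(writer: SummaryWriter) -> str:
    # _get_file_writer() lazily creates the file writer, so it is safe even
    # when writer.file_writer has not been populated yet (the failing case).
    return os.path.join(writer._get_file_writer().get_logdir(), str(time.time()))
```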
Test Plan:
```
writer = SummaryWriter()
writer.add_hparams({"a": 0, "b": 0}, {"hparam/test_accuracy": 0.5})
writer.flush()
writer.close()
```
Reviewed By: J0Nreynolds, sanekmelnikov
Differential Revision: D18650610
fbshipit-source-id: 1039dd2067d37913a8a131c8b372491a63154899
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30244
This makes several small changes to the tensorboard graph parsing methods to address the recent changes to the PyTorch JIT trace/graph.
- Inline graph to get information for all nodes
- Assign and propagate scope names to GetAttr nodes
- Prune all useless GetAttr nodes (any with a ClassType output type - tensors and primitives are kept)
- Create output nodes so output tensor shape can be examined
Reviewed By: sanekmelnikov
Differential Revision: D18556323
fbshipit-source-id: b73a809bacfa554c3fe9c4ae3563525f57539874
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29731
The new structure is that libtorch_cpu contains the bulk of our
code, and libtorch depends on libtorch_cpu and libtorch_cuda.
Some subtleties about the patch:
- There were a few functions that crossed CPU-CUDA boundary without API macros. I just added them, easy enough. An inverse situation was aten/src/THC/THCTensorRandom.cu where we weren't supposed to put API macros directly in a cpp file.
- DispatchStub wasn't getting all of its symbols related to static members on DispatchStub exported properly. I tried a few fixes but in the end I just moved everyone off using DispatchStub to dispatch CUDA/HIP (so they just use normal dispatch for those cases.) Additionally, there were some mistakes where people incorrectly were failing to actually import the declaration of the dispatch stub, so added includes for those cases.
- torch/csrc/cuda/nccl.cpp was added to the wrong list of SRCS, now fixed (this didn't matter before because previously they were all in the same library)
- The dummy file for libtorch was brought back from the dead; it was previously deleted in #20774
- In an initial version of the patch, I forgot to make torch_cuda explicitly depend on torch_cpu. This led to some very odd errors, most notably "bin/blob_test: hidden symbol `_ZNK6google8protobuf5Arena17OnArenaAllocationEPKSt9type_infom' in lib/libprotobuf.a(arena.cc.o) is referenced by DSO"
- A number of places in Android/iOS builds have to add torch_cuda explicitly as a library, as they do not have transitive dependency calculation working correctly. This situation also happens with custom C++ extensions.
- There's a ROCm compiler bug where extern "C" on functions is not respected. There's a little workaround to handle this.
- Because I was too lazy to check if HIPify was converting TORCH_CUDA_API into TORCH_HIP_API, I just made it so HIP build also triggers the TORCH_CUDA_API macro. Eventually, we should translate and keep the nature of TORCH_CUDA_API constant in all cases.
Fixes #27215 (as our libraries are smaller), and executes on
part of the plan in #29235.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D18632773
Pulled By: ezyang
fbshipit-source-id: ea717c81e0d7554ede1dc404108603455a81da82
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27716
This uses the gfile filesystem abstraction that allows for writing to any filesystem that satisfies the interface (including S3).
Test Plan: Tested with local files and using internal S3 equivalent.
Reviewed By: natalialunova
Differential Revision: D17530694
fbshipit-source-id: c1f88c035fc03d91186b39092e42489f1c03d2cd
Summary:
Changelog:
- DataLoader argument `sampler` is now of type `Optional[Sampler[int]]` instead of `Sampler[int]`
- DataLoader argument `batch_sampler` is now of type `Optional[Sampler[Sequence[int]]]` instead of `Sampler[Sequence[int]]`
Fixes https://github.com/pytorch/pytorch/issues/27737
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27821
Differential Revision: D17906623
Pulled By: ezyang
fbshipit-source-id: 088cacbb7e9f7988995f40b71adc3e719815f5ad
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26666
Changes:
- Introduce a `ConcreteModuleType` concept. This acts both as the key into the type
cache, and as the source of truth for `ModuleValue::attr` queries. It needs
to do both jobs because that's how we ensure correctness (if the types are
different, it's because `ModuleValue::attr` would return different things).
- Now `recursive_script` will first construct a `ConcreteModuleType` and search for a
pre-existing type before starting compilation.
- All previous paths to creating a `ScriptModule` (including inheriting from
`ScriptModule`) are now rewritten to go through `create_script_module`, so
that we have only a single place where construction happens.
Behavioral changes:
- Big change to `torch.jit.ScriptModule` inheritance: all attributes are now
recursively scripted if possible, matching recursive scripting semantics.
This makes it hard to keep something from being scripted (for example, a
Python submodule). Possibly we'll need an `ignore()` type thing for
attributes. In particular, this adds `self.training` to *every* ScriptModule, since
it's present on every `nn.Module`.
- I believe this change to be transparent to existing users of the inheritance API, since if you had an attribute that is unscriptable that you never used, there is no error. In some cases, we will create new attributes (even if they are unused), which will increase serialized model size from before.
Test Plan: Imported from OSS
Differential Revision: D17551196
Pulled By: suo
fbshipit-source-id: b476d1c9feb3ddfd63406d90989aaf9dfe890591
Summary:
According to https://github.com/pytorch/pytorch/issues/27285, it seems we do not intend to use the shebang as an indication of the Python version, so
we enable the EXE001 flake8 check.
For violations, we either remove shebang from non-executable Python scripts or grant them executable permission.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27560
Differential Revision: D17831782
Pulled By: ezyang
fbshipit-source-id: 6282fd3617b25676a6d959af0d318faf05c09b26
Summary:
Back in April, malmaud added type annotations for `dataloader.py`. However, at about the same time, SsnL in https://github.com/pytorch/pytorch/issues/19228 replaced `_DataLoaderIter` with `_BaseDataLoaderIter` and two subclasses, `_SingleProcessDataLoaderIter`, and `_MultiProcessingDataLoaderIter`. However - probably because these changes happened in parallel at roughly the same time, the type stubs and several other references in the codebase were never updated to match this refactoring.
I've gone ahead and done the updates to reflect the refactoring in https://github.com/pytorch/pytorch/issues/19228, which fixes the specific type stub/implementation mismatch pointed out in https://github.com/pytorch/pytorch/issues/26673, although not the broader problem that pytorch doesn't have a test to make sure that the `.pyi` type stub files match the real API defined in `.py` files.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27105
Differential Revision: D17813641
Pulled By: ezyang
fbshipit-source-id: ed7ac025c8d6ad3f298dd073347ec83bb4b6600c
Summary:
Most of this was old cruft left over from special handling of `training` before we had a `bool` type. This makes all modules have a `training` attribute that is true by default and removes all other special handling.
Fixes #26884
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27109
Pulled By: driazati
Differential Revision: D17728129
fbshipit-source-id: 8ddc9fbb07a953dd05529538bfdd01ed88b5cb57
Summary:
The mesh plugin is now supported by default TensorBoard install, so removing this comment.
cc sanekmelnikov lanpa natalialunova
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27146
Differential Revision: D17717610
Pulled By: orionr
fbshipit-source-id: 516efad5b800f7261b1dc6728e798c830d88b6ca
Summary:
This PR adds the torch.backends.mkldnn.enabled flag described in https://github.com/pytorch/pytorch/issues/25186, which can be used to disable MKL-DNN at runtime, in the same way as torch.backends.cudnn.enabled.
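An illustrative use of the flag (assumes a build with MKL-DNN support):
```python
import torch

torch.backends.mkldnn.enabled = False   # disable MKL-DNN kernels at runtime
x = torch.randn(64, 64)
y = x.mm(x)                             # runs on the non-MKL-DNN path
torch.backends.mkldnn.enabled = True    # restore the default
```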
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25459
Differential Revision: D17258926
Pulled By: ezyang
fbshipit-source-id: e179ad364cc608fdaa7d0f37e2e762ceb5eda598
Summary:
This fixes the empty graph problem present since PyTorch 1.2.
To prevent such regressions, we have to make the test stricter.
There are 3 levels of verification:
lv 1. make sure that the graph is saved to some event file. <--currently here
lv 2. make sure the file can be read by tensorboard.
lv 3. make sure the graph in tensorboard is human-friendly.
I think (3) must involve a human.
(2) is possible, but it would be redundant if we go to level 3 directly.
cc orionr
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25599
Reviewed By: sanekmelnikov
Differential Revision: D17229276
Pulled By: orionr
fbshipit-source-id: b39f2f1805ee0b3a456b2c69d97e6e3622f5220e
Summary:
This is a follow-up to gh-23408. Arches < 3.5 are no longer supported (both the numeric values and the named 'Fermi' and 'Kepler+Tegra' arches).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24442
Differential Revision: D16889283
Pulled By: ezyang
fbshipit-source-id: 3c0c35d51b7ac7642d1be7ab4b0f260ac93b60c9
Summary:
The old behavior was to always use `sm_30`. The new behavior is:
- For building via a setup.py, check if `'arch'` is in `extra_compile_args`. If so, don't change anything.
- If `TORCH_CUDA_ARCH_LIST` is set, respect that (can be 1 or more arches)
- Otherwise, query the device capability of the local GPU and use that (see the illustrative sketch below).
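A hedged sketch of that selection order (illustrative helper, not the actual `cpp_extension` code; `+PTX` handling is simplified):
```python
import os
import torch

def choose_cuda_arch_flags(extra_compile_args):
    # 1. If the user already passed an arch flag, leave things alone.
    if any('arch' in flag for flag in extra_compile_args.get('nvcc', [])):
        return []
    # 2. Respect TORCH_CUDA_ARCH_LIST when it is set (one or more arches).
    arch_list = os.environ.get('TORCH_CUDA_ARCH_LIST')
    if arch_list:
        arches = arch_list.replace(';', ' ').split()
    # 3. Otherwise, query the capability of the local device.
    else:
        major, minor = torch.cuda.get_device_capability()
        arches = ['{}.{}'.format(major, minor)]
    flags = []
    for arch in arches:
        num = arch.split('+')[0].replace('.', '')
        flags.append('-gencode=arch=compute_{0},code=sm_{0}'.format(num))
    return flags
```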
To test this, for example on a machine with `torch` installed for py37:
```
$ git clone https://github.com/pytorch/extension-cpp.git
$ cd extension-cpp/cuda
$ python setup.py install
$ cuobjdump --list-elf build/lib.linux-x86_64-3.7/lltm_cuda.cpython-37m-x86_64-linux-gnu.so
ELF file 1: lltm.1.sm_61.cubin
```
Existing tests in `test_cpp_extension.py` for `load_inline` and for compiling via `setup.py` in test/cpp_extensions/ cover this.
Closes gh-18657
EDIT: some more tests:
```
from torch.utils.cpp_extension import load
lltm = load(name='lltm', sources=['lltm_cuda.cpp', 'lltm_cuda_kernel.cu'])
```
```
# with TORCH_CUDA_ARCH_LIST undefined or an empty string
$ cuobjdump --list-elf /tmp/torch_extensions/lltm/lltm.so
ELF file 1: lltm.1.sm_61.cubin
# with TORCH_CUDA_ARCH_LIST = "3.5 5.2 6.0 6.1 7.0+PTX"
$ cuobjdump --list-elf build/lib.linux-x86_64-3.7/lltm_cuda.cpython-37m-x86_64-linux-gnu.so
ELF file 1: lltm_cuda.cpython-37m-x86_64-linux-gnu.1.sm_35.cubin
ELF file 2: lltm_cuda.cpython-37m-x86_64-linux-gnu.2.sm_52.cubin
ELF file 3: lltm_cuda.cpython-37m-x86_64-linux-gnu.3.sm_60.cubin
ELF file 4: lltm_cuda.cpython-37m-x86_64-linux-gnu.4.sm_61.cubin
ELF file 5: lltm_cuda.cpython-37m-x86_64-linux-gnu.5.sm_70.cubin
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23408
Differential Revision: D16784110
Pulled By: soumith
fbshipit-source-id: 69ba09e235e4f906b959fd20322c69303240ee7e