pytorch

OSSForks/pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-08 07:39:33 +01:00

Commit Graph

Author

SHA1

Message

Date

Shen Li

39d412194f

Fix ProcessGroupGloo allgather for tensors with shared storage (#21490)

Summary:
Fix https://github.com/pytorch/pytorch/issues/20421

`ProcessGroupGloo` only requires input/output tensors to be contiguous. A contiguous tensor might not start at the beginning of its underlying storage, e.g., `chunk(..., dim=0)[1]`. The current implementation passes the `tensor.storage().data()` pointer to the gloo buffer. This leads to wrong results if the tensor has a non-zero storage offset.

The proposed solution is to use `tensor.data_ptr()` instead. Let's see if this breaks any tests.
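The difference between the two pointers can be illustrated with a toy model. This is only a sketch in plain Python, not the `ProcessGroupGloo` code: the `TensorView` class, its two read methods, and the `chunk` helper are hypothetical stand-ins for a contiguous tensor that shares storage with its siblings.

```python
from array import array


class TensorView:
    """Hypothetical stand-in for a contiguous tensor viewing part of a shared storage."""

    def __init__(self, storage, storage_offset, numel):
        self.storage = storage                # shared underlying buffer
        self.storage_offset = storage_offset  # index of this view's first element
        self.numel = numel                    # number of elements in this view

    def read_from_storage_start(self):
        # Models handing out the storage base pointer: the offset is ignored.
        return list(self.storage[:self.numel])

    def read_from_data_start(self):
        # Models handing out the tensor's own data pointer: the offset is honored.
        start = self.storage_offset
        return list(self.storage[start:start + self.numel])


def chunk(storage, chunks):
    """Split a storage into equally sized contiguous views (toy version of chunk along dim 0)."""
    size = len(storage) // chunks
    return [TensorView(storage, i * size, size) for i in range(chunks)]


storage = array("d", range(8))   # elements 0.0 .. 7.0
first, second = chunk(storage, 2)

buggy = second.read_from_storage_start()  # reads the first chunk's data
fixed = second.read_from_data_start()     # reads the second chunk's data
print(buggy, fixed)
```

In this model the second chunk is contiguous but has a storage offset of 4, so reading from the start of the storage returns the first chunk's elements instead of its own.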

cc qijianan777
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21490

Differential Revision: D15768907

Pulled By: mrshenli

fbshipit-source-id: 9d7d1e9baf0461b31187c7d21a4a53b1fbb07397

2019-06-12 11:59:17 -07:00

Shen Li

25d1496d58

Fix Process Group for tensors shared across processes (#21449)

Summary:
Ops on a Process Group (pg) instance will hit an error when input/output tensors are created in a different process, because pg calls `recordStream` on `CUDACachingAllocator`, which only knows about tensors created within the same process.

The proposed solution is to add a `suppressError` arg (suggestions for better names?) to `recordStream`. See comments in code for arguments.
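As a rough illustration of the idea, the sketch below models an allocator that tracks only the blocks it allocated itself. It is not the `CUDACachingAllocator` implementation: `ToyCachingAllocator`, its methods, the integer "pointers", the string stream names, and the boolean return value are all assumptions made for the example.

```python
class ToyCachingAllocator:
    """Hypothetical model of an allocator that only knows blocks it allocated itself."""

    def __init__(self):
        self._streams_by_ptr = {}  # pointer -> set of streams that used the block

    def allocate(self, ptr):
        # Register a block as created in this process.
        self._streams_by_ptr[ptr] = set()

    def record_stream(self, ptr, stream, suppress_error=False):
        streams = self._streams_by_ptr.get(ptr)
        if streams is None:
            # The block was not allocated here, e.g. it came from another process.
            if suppress_error:
                return False
            raise KeyError(f"unknown block at {ptr:#x}")
        streams.add(stream)
        return True


allocator = ToyCachingAllocator()
allocator.allocate(0x1000)  # local block

local_ok = allocator.record_stream(0x1000, "stream0")
foreign_ok = allocator.record_stream(0x2000, "stream0", suppress_error=True)

try:
    allocator.record_stream(0x2000, "stream0")  # default: unknown block is an error
    raised = False
except KeyError:
    raised = True
```

With the flag set, recording a stream on an unknown block becomes a no-op instead of an error, which is the behavior a process group would need for tensors it did not allocate.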

CC pichuang1984
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21449

Differential Revision: D15689736

Pulled By: mrshenli

fbshipit-source-id: e7fc81b167868f8666536067eaa7ae2c8584d88e

2019-06-11 11:50:25 -07:00

2 Commits