pytorch/caffe2/queue
Danny Huang cbe1eac1f4 [caffe2] adds Cancel to SafeDequeueBlobsOp and SafeEnqueueBlobsOp (#45177)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45177

## Motivation
* Make C2 ops cancellable so that a net can exit safely.
* Some C2 operators currently block and therefore cannot be cancelled. If an
error occurs, we need to be able to safely stop all net execution so we can
throw the exception to the caller.

## Summary
* When an error occurs in a net, or the net is cancelled, the `Cancel` method
is called on every running op.
* This diff adds a `Cancel` method to `SafeEnqueueBlobsOp` and
`SafeDequeueBlobsOp` that calls `queue->close()` to force all blocking ops
to return.
* Adds a unit test that verifies the error propagation.
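
The close-to-unblock idea above can be illustrated with a small, self-contained sketch. This is a toy model in Python, not the Caffe2 implementation: `ClosableQueue`, `DequeueOp`, and `demo` are hypothetical names invented for illustration, and the real change lives in the C++ ops in `queue_ops.h`. It assumes only that closing a queue wakes every thread blocked on it, so a cancelled op can return instead of hanging.

```python
import threading
from collections import deque


class ClosableQueue:
    """Toy bounded queue (hypothetical; a stand-in for the idea only)."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self._closed = False
        self._cond = threading.Condition()

    def close(self):
        # Mark the queue closed and wake every blocked producer and consumer.
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def enqueue(self, item):
        # Blocks while the queue is full; returns False once it is closed.
        with self._cond:
            while len(self._items) >= self._capacity and not self._closed:
                self._cond.wait()
            if self._closed:
                return False
            self._items.append(item)
            self._cond.notify_all()
            return True

    def dequeue(self):
        # Blocks while the queue is empty; returns (False, None) once the
        # queue is closed and has no items left.
        with self._cond:
            while not self._items and not self._closed:
                self._cond.wait()
            if not self._items:
                return False, None
            item = self._items.popleft()
            self._cond.notify_all()
            return True, item


class DequeueOp:
    """Hypothetical op wrapper: cancel() closes the queue so run() returns."""

    def __init__(self, queue):
        self._queue = queue

    def run(self):
        ok, item = self._queue.dequeue()
        return item if ok else None

    def cancel(self):
        self._queue.close()


def demo():
    queue = ClosableQueue(capacity=1)
    op = DequeueOp(queue)
    results = []
    worker = threading.Thread(target=lambda: results.append(op.run()))
    worker.start()          # the queue is empty, so run() would block
    op.cancel()             # closing the queue releases the worker
    worker.join(timeout=5)
    return worker.is_alive(), results
```

In this sketch, calling `demo()` starts a dequeue on an empty queue and then cancels it; the worker thread finishes rather than hanging, which is the behavior the unit test in this diff is described as checking for the real ops.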

Test Plan:
## Unit test added to verify that queue ops propagate errors

```
buck test caffe2/caffe2/python:hypothesis_test -- test_safe_dequeue_blob__raises_exception_when_hang --stress-runs 1000
```

```
Summary
  Pass: 1000
  ListingSuccess: 1
```

Reviewed By: d4l3k

Differential Revision: D23846967

fbshipit-source-id: c7ddd63259e033ed0bed9df8e1b315f87bf59394
2020-09-24 14:22:46 -07:00
blobs_queue_db.cc Replace c10::guts::stuff with std::stuff (#30915) 2019-12-16 13:57:19 -08:00
blobs_queue_db.h Remove template parameter from Tensor (#9939) 2018-07-27 10:56:39 -07:00
blobs_queue.cc fix -Wsign-compare warnings for some files inside c2 (#18123) 2019-03-19 10:39:20 -07:00
blobs_queue.h build changes to make cpu unified build working. (#10504) 2018-08-15 17:22:36 -07:00
CMakeLists.txt Change hip filename extension to .hip (#14036) 2018-11-16 11:55:59 -08:00
queue_ops_gpu.cc Eanble python tests on ROCM (#9616) 2018-07-24 11:37:58 -07:00
queue_ops.cc Remove Apache headers from source. 2018-03-27 13:10:18 -07:00
queue_ops.h [caffe2] adds Cancel to SafeDequeueBlobsOp and SafeEnqueueBlobsOp (#45177) 2020-09-24 14:22:46 -07:00
rebatching_queue_ops.cc Remove Apache headers from source. 2018-03-27 13:10:18 -07:00
rebatching_queue_ops.h Remove template parameter from Tensor (#9939) 2018-07-27 10:56:39 -07:00
rebatching_queue.cc Revert "Tensor construction codemod(raw_mutable_data) (#16373)" (#18680) 2019-04-01 14:39:13 -07:00
rebatching_queue.h Remove Apache headers from source. 2018-03-27 13:10:18 -07:00