pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Danny Huang	cbe1eac1f4	[caffe2] adds Cancel to SafeDequeueBlobsOp and SafeEnqueueBlobsOp (#45177 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45177 ## Motivation * To be able to make C2 ops cancellable so we can safely exit. * Some C2 operators are now blocking thus being non-cancellable. If an error occurs we need to be able to safely stop all net execution so we can throw the exception to the caller. ## Summary * When an error occurs in a net or it got cancelled, running ops will have the `Cancel` method called. This diff adds `Cancel` method to the `SafeEnqueueBlobsOp` and `SafeDequeueBlobsOp` to have the call queue->close() to force all the blocking ops to return. * Adds unit test that verified the error propagation. Test Plan: ## Unit test added to verify that queue ops propagate errors ``` buck test caffe2/caffe2/python:hypothesis_test -- test_safe_dequeue_blob__raises_exception_when_hang --stress-runs 1000 ``` ``` Summary Pass: 1000 ListingSuccess: 1 ``` Reviewed By: d4l3k Differential Revision: D23846967 fbshipit-source-id: c7ddd63259e033ed0bed9df8e1b315f87bf59394	2020-09-24 14:22:46 -07:00
Mike Ruberry	b6f4bb0a70	Revert D23236088: [pytorch][PR] [caffe2] adds Cancel to SafeDequeueBlobsOp and SafeEnqueueBlobsOp Test Plan: revert-hammer Differential Revision: D23236088 (`0ccc38b773`) Original commit changeset: daa90d9ee324 fbshipit-source-id: 933c7deab177250075683a9bea143ac37f16a598	2020-09-16 23:32:50 -07:00
Danny Huang	0ccc38b773	[caffe2] adds Cancel to SafeDequeueBlobsOp and SafeEnqueueBlobsOp (#44495 ) Summary: ## Motivation * To be able to make C2 ops cancellable so we can safely exit. * Some C2 operators are now blocking thus being non-cancellable. If an error occurs we need to be able to safely stop all net execution so we can throw the exception to the caller. * When an error occurs in a net or it got cancelled, running ops will have the `Cancel` method called. * This diff adds `Cancel` method to the `SafeEnqueueBlobsOp` and `SafeDequeueBlobsOp` to have the call queue->close() to force all the blocking ops to return. * Adds unit test that verified the error propagation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44495 Test Plan: ## Unit Test added to verify that queue ops propagate errors ``` buck test caffe2/caffe2/python:hypothesis_test ``` Reviewed By: dzhulgakov Differential Revision: D23236088 Pulled By: dahsh fbshipit-source-id: daa90d9ee32483fb51195e269a52cf5987bb0a5a	2020-09-16 18:17:34 -07:00
Jerry Zhang	83f32eebd9	Tensor construction codemod - 2/3 (#14836 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14836 Codemod generated with clangr shard mode, 25 files per diff, motivation: https://github.com/pytorch/pytorch/pull/12407 Reviewed By: bddppq Differential Revision: D13335176 fbshipit-source-id: 8d89510670e2cf70559d2f75e68f7181feb0b6d9	2018-12-10 19:30:56 -08:00
Dmytro Dzhulgakov	da9e49e586	Remove Context dependency from Tensor class (#14269) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/14269 Removes reference to Context proper and instead adds a bool argument for async copy (the same as `copy_`) For CopyFrom - I haven't tweaked all callsites yet. Instead I rely on a terrible hack that pointer to context is implicitly converted to bool when passed, haha :) It's not good code and I propose to fix it in a follow-up diff (maybe using clangr tooling). Reviewed By: ezyang Differential Revision: D13117981 fbshipit-source-id: 7cb1dc2ba6a4c50ac26614f45ab8318ea96e3138	2018-11-28 15:45:38 -08:00
ArutyunovG	8e91da4cb3	Windows shared build (#13550) Summary: Hi guys, I'd like to build Caffe2 with more supported options in Windows with Microsoft Visual Studio. This is the first pull request. Running scripts/build_windows_shared.bat is able to build Caffe2 with both CMAKE_BUILD_TYPE=Debug and CMAKE_BUILD_TYPE=Release with Visual Studio 14 2015. CUDA is 9.0, cudnn is 7.0.5, glog, gflags and lmdb are supported on my system. Python is 3.5, Detectron works from python interface as well. It was even possible to debug detectron code and step into caffe2_gpu.dll with pdbs built. Disappointingly, c10/experimental ops don't build with this Visual Studio generator, so I added a special option INCLUDE_EXPERIMENTAL_C10_OPS (default ON) to deal with it in build_windows_shared.bat. After this pull request the next step is to add Visual Studio 2017 support in the script. Pull Request resolved: https://github.com/pytorch/pytorch/pull/13550 Reviewed By: ezyang Differential Revision: D13042597 Pulled By: orionr fbshipit-source-id: f313f909f599cd582a1d000eff766eef3a9fc4fc	2018-11-16 12:16:28 -08:00
Jerry Zhang	508f676c50	Rename ndim() -> dim() - 5/6 Summary: Codemod generated with clangr shard mode, 50 files per diff, clangr code(ndim()->dim()): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp Reviewed By: salexspb Differential Revision: D12935787 fbshipit-source-id: 303d71d3eb050789af2ab9575e5dcc48f6037086	2018-11-06 16:38:35 -08:00
Jerry Zhang	13b9fd3e05	Renaming meta() to dtype() - 2/2 (#13334 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13334 Codemod generated with clangr shard mode, 50 files per diff, clangr code(meta->dtype): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp i-am-not-moving-c2-to-c10 Reviewed By: ezyang Differential Revision: D12845197 fbshipit-source-id: f87eb575d3c31593ca76b70780cc4fca888e706b	2018-10-30 18:24:30 -07:00
Jerry Zhang	91e87c0395	Renaming size() to numel() - 2/2 Summary: Codemod generated with clangr shard mode, 50 files per diff, clangr code(size->numel): diffusion/FBS/browse/master/fbcode/caffe2/caffe2/fb/codemods/TensorMethodRename.cpp i-am-not-moving-c2-to-c10 Reviewed By: ezyang Differential Revision: D12833748 fbshipit-source-id: 98dc2d3abc23c177c2c9e457b81499952d4b690c	2018-10-29 18:59:29 -07:00
Jerry Zhang	b790fcaf39	Renaming dims() to sizes() (caffe2/caffe2) - 4/4 Summary: Codemod generated with clangr shard mode, 25 files per diff, for renaming dims() to sizes() Reviewed By: ezyang Differential Revision: D10842900 fbshipit-source-id: 8d58ed4d403fb0308a8fa286659f8e830b040bec	2018-10-24 16:32:51 -07:00
Jerry Zhang	aebf3b47ae	Remove template parameter from Tensor (#9939) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9939 Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13 Pull Request resolved: https://github.com/pytorch/translate/pull/166 Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125 Closes https://github.com/pytorch/pytorch/pull/9125 Use inheritance for polymorphism, and remove template parameter This is to change the templating in call sites, the core implementations will change later Before Caffe2 Tensor class was compile-time fixed to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor), but preserve the same semantics. For example, one has to specify device type in order to create a Tensor - there are no uninitialized tensors. More specifically the changes are: 1. We added an extra argument DeviceType to most of the constructors of the tensor, e.g. (Tensor(DeviceType type)), 2. Semantics of constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context); is changed, in this constructor, the second context is passed in to enable us to call the templated Copy function, it could be in a different context as source and target previously, now we'll enforce that the context should have same device type as src, if it is provided. 3. To preserve 'get-or-construct' semantics of Blob, we added specialized getter Blob::GetMutableTensor that verifies both that Blob contains a Tensor and that it's of a correct type 4. Specifically, Tensor type is not default-constructible any more (as we don't have unknown device tensors) and thus some of the code handling STL containers needs to change Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s. Reviewed By: ezyang, houseroad Differential Revision: D9024330 fbshipit-source-id: e0b8295d2dc6ebe2963383ded5af799ad17164ba	2018-07-27 10:56:39 -07:00
Jerry Zhang	969b62f276	Revert D8121878: Remove template parameter from Tensor Differential Revision: D8121878 Original commit changeset: 4a5e9a677ba4 fbshipit-source-id: d8e2c0bb145b52fbcca323b22d1d3346f0b3249e	2018-07-26 14:02:04 -07:00
Jerry Zhang	cd5adc7b5f	Remove template parameter from Tensor (#13 ) Summary: Pull Request resolved: https://github.com/facebookresearch/weakly-supervised-action-detection/pull/13 Pull Request resolved: https://github.com/pytorch/translate/pull/166 Pull Request resolved: https://github.com/pytorch/pytorch/pull/9125 Closes https://github.com/pytorch/pytorch/pull/9125 Use inheritance for polymorphism, and remove template parameter This is to change the templating in call sites, the core implementations will change later Before Caffe2 Tensor class was compile-time fixed to bind to a particular device/context. With this change, we're making it a runtime property (stored inside the tensor), but preserve the same semantics. For example, one has to specify device type in order to create a Tensor - there are no uninitialized tensors. More specifically the changes are: 1. We added an extra argument DeviceType to most of the constructors of the tensor, e.g. (Tensor(DeviceType type)), 2. Semantics of constructor Tensor(const Tensor<SrcContext>& src, ContextForCopy* context); is changed, in this constructor, the second context is passed in to enable us to call the templated Copy function, it could be in a different context as source and target previously, now we'll enforce that the context should have same device type as src, if it is provided. 3. To preserve 'get-or-construct' semantics of Blob, we added specialized getter Blob::GetMutableTensor that verifies both that Blob contains a Tensor and that it's of a correct type 4. Specifically, Tensor type is not default-constructible any more (as we don't have unknown device tensors) and thus some of the code handling STL containers needs to change Note: Some changes are postponed just to keep this diff a bit smaller. Please see `TODO`s. Reviewed By: xw285cornell Differential Revision: D8121878 fbshipit-source-id: 4a5e9a677ba4ac82095df959851a054c81eccf81	2018-07-26 10:25:23 -07:00
Orion Reblitz-Richardson	1d5780d42c	Remove Apache headers from source. * LICENSE file contains details, so removing from individual source files.	2018-03-27 13:10:18 -07:00
Yan Shang	e7d4bbc9dd	Add CaffeEnforce in SafeDequeueOp Summary: Previously in SafeDequeueOp, the in.dims()[0] would fail if in.ndim()=0. However, the error message is not informative. I added a Caffe_Enforce, which would print out the input and output blob name. This is very helpful for future debugging as well. Differential Revision: D6821421 fbshipit-source-id: b07e5829a2c580aaaac88b0d9ff8d05f6da11713	2018-01-26 13:50:32 -08:00
Huazhong Ning	90543ff13a	weighted sampling reader dequeue outputs table index Summary: Weighted sampling reader dequeue randomly chooses a hive reader to read a mini-batch. This diff allows dequeue to output the index of the randomly chosen table to a specific blob. Reviewed By: kennyhorror Differential Revision: D6621070 fbshipit-source-id: 754b981fc2bcfdb0146d2a0a5b677e7cfe74211b	2018-01-24 19:06:25 -08:00
Yangqing Jia	efa7c895f6	Misc Windows lint Summary: Closes https://github.com/caffe2/caffe2/pull/1656 Differential Revision: D6633052 Pulled By: Yangqing fbshipit-source-id: 5eeb3912fc769cfd06d252f3ed1d8d5f2a207cfc	2017-12-23 20:07:27 -08:00
Pieter Noordhuis	2d07360938	Fix compilation on GCC 7 Summary: Thanks to BrettRyland for the initial fix in #805. Closes https://github.com/caffe2/caffe2/pull/1602 Reviewed By: Yangqing, asaadaldien Differential Revision: D6534431 Pulled By: pietern fbshipit-source-id: 1a3ecb77743e7cee76b61c516332137c07331067	2017-12-11 13:32:30 -08:00
Yangqing Jia	59b2654544	reapply header change after xplat move Summary: This is a reapplication of the earlier PR due to xplat move. Original author is Christoph Conrads <christoph.conrads@fluent.ai> christoph-conrads . Reviewed By: houseroad Differential Revision: D6379736 fbshipit-source-id: b7482ecf3b9487a528c15e92976e915791210002	2017-11-22 13:04:37 -08:00
Yangqing Jia	8286ce1e3a	Re-license to Apache Summary: Closes https://github.com/caffe2/caffe2/pull/1260 Differential Revision: D5906739 Pulled By: Yangqing fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902	2017-09-28 16:22:00 -07:00
Alisson Gusatti Azzolini	b4b89e1bd5	Ability to dequeue and concat multiple records in a single QueueDequeue op Summary: This allows reading data in small batches and concatenating the batches later on. Reviewed By: kennyhorror Differential Revision: D5739129 fbshipit-source-id: 66a8087e5f9d10d654e367c6111ac90cbf54224e	2017-08-31 10:48:59 -07:00
Junjie Bai	4e019dbb6f	Rename def() to debug_def() Summary: Also eliminated non-debug uses of debug_def Reviewed By: akyrola Differential Revision: D5441534 fbshipit-source-id: 9dab5fb74e25b4da504fa893ec1f3478e282d3f3	2017-07-17 23:50:01 -07:00
Aapo Kyrola	f44991b398	add timeout argument to DequeueBlobs; use 10 min timeout for data workers Summary: As title. This helps with (quite common) cases where data input is stuck for one reason or another, and the net execution never proceeds and is stuck forever. Reviewed By: andrewwdye Differential Revision: D5409885 fbshipit-source-id: 840261fd5964408f788fc0f50ece0d74193694ac	2017-07-13 18:52:03 -07:00
Lei Chen	8b5782ed5c	Weighted sampling dequeue operator Summary: Similar to SafeDequeueBlobsOp, but adds weight-based sampling for reading from multiple input BlobsQueue. WeightedSampleDequeueBlobsOp will take a vector of weights (each weight is mapped to one input blob queue). Based on probability, we will choose which BlobQueue to fetch from. WeightedSampleDequeueBlobsOp shall stop when any input BlobQueue is empty. Reviewed By: dzhulgakov Differential Revision: D4905160 fbshipit-source-id: 5b1551e2250569f933a6c01ed04442843c5e0cb6	2017-04-19 12:02:06 -07:00
Alisson Gusatti Azzolini	b711c7d039	More perf stats for BlobsQueue Summary: Allow drilling down on data throughput overall and per field. Reviewed By: dzhulgakov Differential Revision: D4622168 fbshipit-source-id: 1462bb2fac05824fda0c02f4f5f0b8713893e650	2017-03-24 14:03:28 -07:00
Ross Girshick	2397b6a6f2	Add CUDA support for Safe{Enqueue,Dequeue}BlobsOps Summary: Add support for "safe" versions of enqueue and dequeue. I'm not sure if using `math::Set<bool, Context>` is the best context independent approach for setting the status. Differential Revision: D4398633 fbshipit-source-id: 7c88c8e11acfe36fd3d94f17dbf68ce558eb6df1	2017-02-01 09:44:37 -08:00
Andrey Malevich	2390dfefdb	Kill few more CHECKs. Summary: One more small batch of CHECKs that were left in the C2 codebase. Most of the leftovers should be in tests/GPU-only code. Reviewed By: Yangqing Differential Revision: D4243782 fbshipit-source-id: a4a03c116ea8ba16facd2efc135746d5921f19d5	2016-12-05 11:53:25 -08:00
Yangqing Jia	589398950f	fbsync at f5a877	2016-11-18 15:41:06 -08:00
Yangqing Jia	238ceab825	fbsync. TODO: check if build files need update.	2016-11-15 00:00:46 -08:00
Yangqing Jia	b23e51d467	chunky sync	2016-09-06 15:55:19 -07:00
Yangqing Jia	c15e45c9bb	chunky sync again	2016-08-01 20:58:46 -07:00
Yangqing Jia	09bed67e4f	add untracked files	2016-07-21 11:26:41 -07:00

32 Commits