Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33957
Lots of small preprocessor warning cleanups for Windows.
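An illustrative example of the kind of fix involved (the macro name below is made up, not taken from this PR): at high warning levels MSVC emits C4668 when an undefined macro is evaluated in an `#if`, and guarding the test with `defined()` silences it.
```
// Hypothetical example; FOO_ENABLED is not a real macro from this PR.
// Before:  #if FOO_ENABLED          // C4668 if FOO_ENABLED is undefined
// After:
#if defined(FOO_ENABLED) && FOO_ENABLED
constexpr bool foo_enabled = true;
#else
constexpr bool foo_enabled = false;
#endif
```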
Test Plan: CI green
Reviewed By: malfet, albanD
Differential Revision: D20153582
fbshipit-source-id: 18fd61c466fd1f55ededdae4448b3009a9cedc04
Summary:
Mainly renaming Caffe2's pthread_create, the only conflicting symbol that NNPACK refers to internally, to pthread_create_c2.
Removed 2 other conflicting symbols that are not used internally at all.
Pointing XNNPACK to the original repo instead of the fork.
Copy-pasted the new interface and implementation to caffe2/utils/threadpool, so that internal builds compile against this.
When the threadpool is unified, this will be removed.
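A minimal sketch of the rename (the signature below is assumed for illustration, not copied from the diff):
```
#include <cstddef>

extern "C" {
// Before: this symbol collided with the identically named one referenced by
// NNPACK when both copies of pthreadpool were linked into one binary:
//   struct pthreadpool* pthread_create(std::size_t threads_count);
// After: the internal copy exports a non-conflicting name.
struct pthreadpool* pthread_create_c2(std::size_t threads_count);
}
```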
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33869
Differential Revision: D20140580
Pulled By: kimishpatel
fbshipit-source-id: de70df0af9c7d6bc065e85ede0e1c4dd6a9e6be3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33523
When using `ThreadPool::setNumThreads` to set the number of threads, it should not exceed the number of big cores. Otherwise, the performance could degrade significantly.
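A sketch of the intended clamping, assuming a hypothetical `getBigCoreCount()` helper in place of the real topology query:
```
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for the real big-core query
// (e.g. 2 on an iPhone X-class SoC).
std::size_t getBigCoreCount() { return 2; }

std::size_t clampNumThreads(std::size_t requested) {
  // Never exceed the big-core count: without work stealing, pool work
  // scheduled onto LITTLE cores makes the big cores wait on stragglers.
  return std::min(requested, getBigCoreCount());
}
```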
Test Plan:
```
cd ~/fbsource/xplat
buck test caffe2:caffe2_testAndroid
```
Reviewed By: dreiss
Differential Revision: D19779267
fbshipit-source-id: 4e980e8a0ccc2f37e1c8ed16e2f4651d72924dbd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30915
Since we now have C++14, we don't need these c10::guts helpers anymore
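For example (illustrative only; `make_unique` is one such backported helper, not an exhaustive list):
```
#include <memory>

int main() {
  // C++14 provides this in the standard library, making the c10::guts
  // backport redundant.
  // Before: auto p = c10::guts::make_unique<int>(42);
  auto p = std::make_unique<int>(42);
  return *p == 42 ? 0 : 1;
}
```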
ghstack-source-id: 95777609
Test Plan: waitforsandcastle
Differential Revision: D18869639
fbshipit-source-id: 97716f932297c64c6e814410ac47b444c33d4e2e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29885
### Summary
Currently, we have a deadlock issue on iOS when running Resnet50. The problem happens when a task being run in the ThreadPool calls `getNumThread()`, which tries to acquire the same mutex, and thus causes the deadlock. The fix is to simply remove the guard for `_numThreads`, as it's not likely to change after initialization.
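A minimal repro of the pattern (illustrative, not the real class): a non-recursive mutex acquired twice on the same thread.
```
#include <mutex>

class Pool {
 public:
  int getNumThreads() {
    std::lock_guard<std::mutex> guard(mutex_);  // second acquisition...
    return numThreads_;
  }
  void runTask() {
    std::lock_guard<std::mutex> guard(mutex_);  // ...first acquisition
    getNumThreads();  // std::mutex is not reentrant: this never returns
  }

 private:
  std::mutex mutex_;
  int numThreads_ = 4;
};

int main() {
  Pool pool;
  pool.runTask();  // hangs (technically UB for std::mutex)
}
```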
### Test Plan
1. Generate a Resnet50 model using trace_model.py
2. Run `ios/TestApp/bootstrap.sh` to do the benchmark
cc shoumikhin AshkanAliabadi
Test Plan: Imported from OSS
Differential Revision: D18533505
Pulled By: xta0
fbshipit-source-id: 2a069d20b59833ec8b02ff05515c3739a85a15de
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12714
This is a short change to enable the c10 namespace in caffe2. We did not enable
it before due to gflags global variable confusion, but that should be mostly
cleaned up now. Right now, the plan on record is that namespace caffe2 and
namespace aten will be full supersets of namespace c10.
Most of the diff is codemod; the only two non-codemod changes are in caffe2/core/common.h, where
```
using namespace c10;
```
is added, and in Flags.h, where instead of creating aliasing variables in the c10 namespace, we put them directly in the global namespace to match gflags (with the same behavior if gflags is not built in).
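A simplified sketch of that pattern (not the real macro from Flags.h): the flag variable is defined in the global namespace, exactly where gflags would put it, so call sites read `FLAGS_foo` identically whether or not gflags is built in.
```
#include <string>

// Sketch only; the real macro also has registration and export logic.
#define DEFINE_string_SKETCH(name, default_value) \
  std::string FLAGS_##name = default_value;  // global namespace, like gflags

DEFINE_string_SKETCH(example_flag, "hello")

int main() { return FLAGS_example_flag == "hello" ? 0 : 1; }
```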
Reviewed By: dzhulgakov
Differential Revision: D10390486
fbshipit-source-id: 5e2df730e28e29a052f513bddc558d9f78a23b9b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10599
Not spawning threads with spin-lock synchronization is bad because they will switch to a `condvar` wait, which increases wake-up latency the next time they are needed.
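An illustrative spin-then-sleep wait (a sketch of the general technique, not the actual threadpool code): the worker spins briefly hoping for work, and only after the spin budget is exhausted falls back to the slower-to-wake condition variable.
```
#include <atomic>
#include <condition_variable>
#include <mutex>

void wait_for_work(std::atomic<bool>& has_work, std::mutex& m,
                   std::condition_variable& cv) {
  // Fast path: spin for a bounded budget; wake-up latency is near zero.
  for (int i = 0; i < 10000; ++i) {
    if (has_work.load(std::memory_order_acquire)) return;
  }
  // Slow path: block on the condvar; waking from here costs a syscall
  // and a scheduler round-trip, hence the higher latency noted above.
  std::unique_lock<std::mutex> lock(m);
  cv.wait(lock, [&] { return has_work.load(std::memory_order_acquire); });
}
```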
Reviewed By: ajtulloch
Differential Revision: D9366664
fbshipit-source-id: 3b9e4a502aeefaf0ddc4795303a855d98980b02e
The thread pool called cpuinfo_get_processors_count() without initializing cpuinfo. Only by luck did this not make Caffe2 single-threaded: the threadpool is initialized after NNPACK, and NNPACK initializes cpuinfo itself.
This commit also updates cpuinfo to a version that aborts with a fatal error if it's used uninitialized.
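A sketch of the required call order (cpuinfo_initialize() and cpuinfo_get_processors_count() are the actual cpuinfo entry points; the rest is illustrative):
```
#include <cpuinfo.h>
#include <cstdio>

int main() {
  // Query functions are only valid after initialization succeeds; newer
  // cpuinfo aborts with a fatal error if this step is skipped.
  if (!cpuinfo_initialize()) {
    std::fprintf(stderr, "cpuinfo initialization failed\n");
    return 1;
  }
  std::printf("processors: %u\n", cpuinfo_get_processors_count());
  return 0;
}
```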
* [GanH]: two_task_discriminator
as titled, and adding label smoothing
* [Dper2] Simplified UI options needed for blob magnitude visualization
* [GanH]: fix tags
as titled
* Added type and shape inference for GatherRange operator
This helps with type / shape inference when using this operator in layers.
Also just a nice-to-have in general.
* Demonstrate Caffe2 exception handling with StoreHandlerTimeoutError in Python
We'd like to catch and recover from certain Caffe2 net exceptions. Use this diff to demonstrate a pattern of registering a pybind exception mapping and catching it in Python using caffe2::StoreHandlerTimeoutException.
* Bind Gloo IoException to IoError in Python
Allow peer failure handling and recovery using an exception based mechanism. This diff registers gloo::IoException with pybind.
* [GanH]: add label smoothing to softmax with loss
as titled
* [C2] Enable LARS in Adagrad and hook it to DPER
* [DPER] Don't pass LayerModelHelper in create_trainer_nodes
Since we're planning to get rid of it eventually, and I want access to a
NetDef-only interface ASAP, I'm looking to remove all references to LMH
where we don't really need them.
* fix bugs in LambdaRankNdcgOp
The loss and gradient in LambdaRankNdcgOp are incorrect: the loss should be the negative log of the probabilities instead of the log.
* Restrict thread pool on iOS to only big cores
Historically, iPhones exposed only one type of cores, and Caffe2 thread pool used all of them.
However, iPhone 8/iPhone X expose 2 big + 4 LITTLE cores. As our thread pool doesn't support work stealing or other forms of load balancing, fast cores end up waiting for the slow ones, and it may be better to restrict execution to only the 2 fast cores, like we do on Android.
* Remove SparseLength Sum/WeightedSum/Mean operators with fp16 engine
* Make clang happy and get fewer warnings
* [Personalization] Support add_output_schema() in layer_model_helper
Problem:
Currently the output_schema of sparse_nn can only be set once. https://fburl.com/efth5zer.
Solution:
For flexibility, we want to add fields to output_schema incrementally.
Plan:
Wrap the change of `model._output_schema` into a new function, `add_output_schema()`, for adding additional output_schema fields.
Callsite:
`add_output_schema()` should be called instead at https://fburl.com/efth5zer
Reference:
The newly added `add_output_schema()` will be similar to `add_loss()` in https://fburl.com/t2ii8njh
Summary:
Choose the number of threads for the thread pool as the number of fast cores.
Didn't do any benchmarks, so it's mostly an FYI diff.
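One way this could look with cpuinfo (a sketch, not the actual diff): treat cores running at the maximum reported frequency as the fast ones.
```
#include <cpuinfo.h>
#include <algorithm>
#include <cstdint>

uint32_t count_fast_cores() {
  if (!cpuinfo_initialize()) {
    return 1;  // topology unknown: fall back to a single thread
  }
  // Find the highest core frequency, then count the cores that match it.
  // Note: frequency may be reported as 0 on some platforms, in which case
  // every core counts as "fast".
  uint64_t max_freq = 0;
  for (uint32_t i = 0; i < cpuinfo_get_cores_count(); i++) {
    max_freq = std::max<uint64_t>(max_freq, cpuinfo_get_core(i)->frequency);
  }
  uint32_t fast = 0;
  for (uint32_t i = 0; i < cpuinfo_get_cores_count(); i++) {
    if (cpuinfo_get_core(i)->frequency == max_freq) {
      fast++;
    }
  }
  return fast;
}
```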
Reviewed By: ajtulloch
Differential Revision: D5579797
fbshipit-source-id: 5ada001116c731780f38a62e9c0b500bd64a4bfe
Summary: Rather chunky sync of changes made exclusively to mobile codebases back to fbcode.
Reviewed By: ajtulloch
Differential Revision: D5314405
fbshipit-source-id: c4d0a7244468f953eb63288306bc9bc78eb9e1be
Summary:
I noticed this when experimenting with the compute-bound convolutions
for the ULP HWGQ binary conv/gemm.
It's an ugly heuristic that Maratyszcza and co. are improving this half, but I think
this will be a net win for C2, especially if segmentation/Mask R-CNN are
critical.
Differential Revision: D5375976
fbshipit-source-id: 863f76d434f133bf5a00e7ced1cfadfcf92e3c84
Summary: Might be useful for the EXC_RESOURCE / CPU issues.
Reviewed By: salexspb
Differential Revision: D4565494
fbshipit-source-id: 74ac9edeba6334a46ee6799a93ca96eb68216439