Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32791
When a registered operator has varargs (ends with ... in its schema),
the interpreter now appends the number of arguments to the top of
the stack before invoking the operator. This allows the removal of more
uses of Node* in the interpreter.
This PR also then cleans up the constructors for Operator to make
it more likely someone chooses the correct one. After making these ops:
```
USES NODE: prim::TupleUnpack(...) -> (...)
USES NODE: prim::TupleSlice(...) -> (...)
USES NODE: prim::TupleConstruct(...) -> (...)
USES NODE: prim::ListUnpack(...) -> (...)
USES NODE: prim::ListConstruct(...) -> (...)
USES NODE: prim::DictConstruct(...) -> (...)
USES NODE: prim::Constant() -> (...)
USES NODE: prim::isinstance(...) -> (...)
USES NODE: prim::CreateObject(...) -> (...)
USES NODE: prim::fork(...) -> (...)
USES NODE: aten::warn(str message, *, int stacklevel=2) -> () # need stack level information, so ideally in interpreter so it can look at the stack
```
Into interpreter primitives, we can remove all but two constructors for operators:
one that is (schema_string, operation), and one that is (symbol, op_creator) for
the remaining weird primitives.
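
The varargs convention described above can be sketched in a few lines of Python. This is only an illustration of the calling convention, not the interpreter's code: the `demo::ListConstruct` schema, the `OPERATORS` registry and `call_operator` are hypothetical names made up for the example.

```python
from typing import Any, Callable, Dict, List

Stack = List[Any]


def list_construct(stack: Stack) -> None:
    """Hypothetical vararg op: pops the argument count, then that many inputs."""
    num_inputs = stack.pop()  # the caller pushed this count last
    if num_inputs:
        args = stack[-num_inputs:]
        del stack[-num_inputs:]
    else:
        args = []
    stack.append(args)


# Hypothetical registry keyed by schema string; "..." marks a vararg schema.
OPERATORS: Dict[str, Callable[[Stack], None]] = {
    "demo::ListConstruct(...) -> (...)": list_construct,
}


def call_operator(schema: str, stack: Stack, num_inputs: int) -> None:
    """Invoke an operator, appending the argument count first for vararg schemas."""
    arguments = schema.split("->")[0].rstrip()
    if arguments.endswith("...)"):
        stack.append(num_inputs)
    OPERATORS[schema](stack)


stack: Stack = ["unrelated", 1, 2, 3]
call_operator("demo::ListConstruct(...) -> (...)", stack, 3)
print(stack)
```

Because the count travels on the stack, the operator body needs nothing beyond the stack itself, which is the property that lets the Node* dependency go away.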
Test Plan: Imported from OSS
Differential Revision: D19673158
Pulled By: zdevito
fbshipit-source-id: 95442a001538a6f53c1db4a210f8557ef118de66
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32682
This moves code around so that operator.h/cpp no longer requires a full
definition of Node* nor does it include alias analysis or the pretty printer.
This should make it possible to include in the mobile build.
Functionality for checking whether an operator matches a Node, and for
looking up the operator for a Node, has moved to the Node object.
Test Plan: Imported from OSS
Differential Revision: D19615386
Pulled By: zdevito
fbshipit-source-id: e38bdf29971183597ef940d061c06ba56e71d9c5
Summary:
This handles a corner case where a user schedules a second bailout after the first one and the first one doesn't fire.
Alternatively, we could go back to the implementation that uses a hash set to remember the indices of bailouts that need to fire.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32672
Differential Revision: D19596872
Pulled By: Krovatkin
fbshipit-source-id: 41dcc374cd2501ac20a9892fb31a9c56d6640258
Summary:
This API seems to be quite useful to make sure all bailouts in a graph are triggered. I used it for testing torchvision models and I was wondering if this might be something we might actually want to have? zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32518
Differential Revision: D19553147
Pulled By: Krovatkin
fbshipit-source-id: 7542c99051588b622091aec6d041c70731ca5d26
Summary:
This is another implementation of the maximum bailout depth.
The first version was implemented in https://github.com/pytorch/pytorch/pull/31521
This one has the advantages that
* the bailout depth only exists in `CodeImpl` which seems to be an appropriate place to keep it in.
* threading many objects is reduced to threading through CodeImpl and getPlanFor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32073
Differential Revision: D19443432
Pulled By: Krovatkin
fbshipit-source-id: 898384bb2308a1532a50a33d9e05cfca504711e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31071
Previously the profiler would think Tensors would require grad, even
when the no_grad flag is enabled during execution. This makes the profiling
and guards respect the no_grad flag, which eliminates extra differentiable
graphs that appear in the backward graph (where no_grad is typically enabled).
Test Plan: Imported from OSS
Differential Revision: D18915468
Pulled By: zdevito
fbshipit-source-id: 1ae816a16ab78ae5352825cc6b4a68ed7681a089
Summary:
This uses newly added InlinedCallStack to print the original call stack
even if the real call stack is shallower because of inlining.
This change also makes torchscript stacktraces look like python ones.
Example:
```
import torch

@torch.jit.script
def baz(c, b):
    return c + b

@torch.jit.script
def foo(c, b):
    return baz(c, b)

@torch.jit.script
def bar(c, b):
    return foo(c, b)

# Shapes (10,) and (9,) cannot broadcast, so the add in baz raises.
bar(torch.rand(10), torch.rand(9))
```
Output before:
```
Traceback (most recent call last):
File "fail.py", line 25, in <module>
bar(torch.rand(10), torch.rand(9))
RuntimeError: The size of tensor a (10) must match the size of tensor b (9) at non-singleton dimension 0
The above operation failed in interpreter, with the following stack trace:
at fail.py:15:11
torch.jit.script
def baz(c, b):
return c + b
~~~~~ <--- HERE
```
Output after:
```
Traceback (most recent call last):
File "fail.py", line 41, in <module>
bar(torch.rand(10), torch.rand(9))
RuntimeError: The size of tensor a (10) must match the size of tensor b (9) at non-singleton dimension 0
The above operation failed in interpreter.
Traceback (most recent call last):
File "fail.py", line 33
torch.jit.script
def bar(c, b):
return foo(c, b)
~~~ <--- HERE
File "fail.py", line 29, in foo
torch.jit.script
def foo(c, b):
return baz(c, b)
~~~ <--- HERE
File "fail.py", line 25, in baz
torch.jit.script
def baz(c, b):
return c + b
~~~~~ <--- HERE
```
Output of non-scripted python code:
```
Traceback (most recent call last):
  File "fail.py", line 36, in <module>
    bar(torch.rand(10), torch.rand(9))
  File "fail.py", line 21, in bar
    return foo(c, b)
  File "fail.py", line 18, in foo
    return baz(c, b)
  File "fail.py", line 15, in baz
    return c + b
RuntimeError: The size of tensor a (10) must match the size of tensor b (9) at non-singleton dimension 0
```
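
The idea of carrying the original call stack on each instruction can be sketched as follows. This is a simplified Python model, not the InlinedCallStack implementation: the `Frame` and `Instruction` classes and `format_stack` are hypothetical, and the source snippets and `<--- HERE` markers are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    filename: str
    line: int
    function: Optional[str]  # None for the outermost frame


@dataclass
class Instruction:
    name: str
    # Frames recorded at inlining time, outermost call first.
    inlined_call_stack: List[Frame]


def format_stack(inst: Instruction) -> str:
    """Render the recorded frames in a Python-like traceback layout."""
    lines = ["Traceback (most recent call last):"]
    for frame in inst.inlined_call_stack:
        location = f'  File "{frame.filename}", line {frame.line}'
        if frame.function is not None:
            location += f", in {frame.function}"
        lines.append(location)
    return "\n".join(lines)


# After inlining, the add lives directly in bar's graph, but it still
# remembers that it came from bar -> foo -> baz.
add = Instruction(
    "aten::add",
    [Frame("fail.py", 33, None), Frame("fail.py", 29, "foo"), Frame("fail.py", 25, "baz")],
)
report = format_stack(add)
print(report)
```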
Differential Revision: D18532812
Test Plan: Imported from OSS
Pulled By: ZolotukhinM
fbshipit-source-id: e7e5ba5e4a8f1c7086406271d0f1685d9db8541a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27104
* The use case here is to replace prim::ListConstruct, which requires Node, but Node is not available in mobile lite interpreter.
* The new instruction is (OPN, X, N), where X is the index into the vararg operator-name and operator tables and N is the number of inputs. In the ListConstruct example, the operator name can be "aten::listconstruct" and the overload name is the output type ("int", "float", "bool", "tensor" or "generic").
* A vararg operator table is built with void(int input_size, Stack& stack) functions.
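
A minimal Python sketch of the OPN dispatch described in the bullets above, assuming a list-backed stack. Only the `(OPN, X, N)` shape and the `(input_size, stack)` function signature come from the description; the table names, `run` and the instruction encoding as tuples are hypothetical.

```python
from typing import Any, Callable, List, Tuple

Stack = List[Any]
VarargOp = Callable[[int, Stack], None]


def list_construct_int(input_size: int, stack: Stack) -> None:
    """Pop input_size values and push them back as a single list."""
    items = [stack.pop() for _ in range(input_size)][::-1]
    stack.append(items)


# Parallel tables indexed by X: (operator name, overload name) and the function.
vararg_op_names: List[Tuple[str, str]] = [("aten::listconstruct", "int")]
vararg_ops: List[VarargOp] = [list_construct_int]


def run(instructions: List[Tuple[str, int, int]], stack: Stack) -> Stack:
    """Execute (opcode, X, N) triples; only OPN is modelled here."""
    for opcode, x, n in instructions:
        if opcode == "OPN":
            vararg_ops[x](n, stack)
        else:
            raise ValueError(f"unsupported opcode in this sketch: {opcode}")
    return stack


result = run([("OPN", 0, 3)], [10, 20, 30])
print(result)
```

Since N is an operand of the instruction, the operator never has to consult a Node to learn how many inputs it received.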
## Unit test
LiteInterpreterConv covers OPN instruction and conv operator.
Test Plan: Imported from OSS
Differential Revision: D17762853
fbshipit-source-id: 475aa0c6678e3760cec805862a78510913a89c83
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25187
The bytecode export flow: dump the bytecode format for the lightweight interpreter.
* The bytecode is generated without input spec optimization. It would be more generic (input independent) with no obvious performance degradation (to be tested).
* Main API: torch::jit::script::Module::save(filename, extra_files, bool *bytecode_format* = false).
* Both bytecode and module object are exported in pickle format.
* The module object (in data.pkl) is the same as the original JIT model.
* The serializer depends on pickle only (no protobuf or JSON).
* The major functionality is forked in ScriptModuleSerializer2::serialize().
* The test loader is test_bc_export.cpp.
* Simple APIs are added in Code and its implementation to get necessary information (instructions, operators and constants).
* Since there's no dependency on graph/node, GetAttr is promoted from an operator to a first-class instruction (https://github.com/pytorch/pytorch/pull/25151).
* Some definitions (instructions, writeArchive, etc) that are shared by full JIT and bytecode are pulled out of the local namespace (https://github.com/pytorch/pytorch/pull/25148).
The output layout looks like:
* folders of methods.
    * In each method folder (for example, forward/):
        * bytecode.pkl: instructions and operators
        * constants{.pkl,/}: constant list in constants.pkl. If the constants include tensors, the binary tensor files go in the constants/ folder.
* data{.pkl,/}: the module object, with binary tensor files in data/ folder. The same as in torchscript.
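
The layout above can be mimicked with plain pickle. This is a rough sketch, not the ScriptModuleSerializer2 code: the in-memory `files` mapping stands in for whatever container the real exporter writes, placing data.pkl at the top level is an assumption, and the opcode names and module payload are made up.

```python
import pickle
from typing import Any, Dict, List, Tuple

Instruction = Tuple[str, int, int]  # hypothetical (opcode, X, N) encoding


def export_method(
    files: Dict[str, bytes],
    method_name: str,
    instructions: List[Instruction],
    operators: List[str],
    constants: List[Any],
) -> None:
    """Write one method folder: bytecode.pkl plus constants.pkl."""
    files[f"{method_name}/bytecode.pkl"] = pickle.dumps(
        {"instructions": instructions, "operators": operators}
    )
    files[f"{method_name}/constants.pkl"] = pickle.dumps(constants)


files: Dict[str, bytes] = {}
export_method(files, "forward", [("OP", 0, 0), ("RET", 0, 0)], ["aten::add"], [1.0])
# Assumed top-level module object, standing in for the real data.pkl contents.
files["data.pkl"] = pickle.dumps({"training": False})

restored = pickle.loads(files["forward/bytecode.pkl"])
print(sorted(files))
print(restored["operators"])
```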
Test Plan: Imported from OSS
Differential Revision: D17076411
fbshipit-source-id: 46eb298e7320d1e585b0101effc0fcfd09219046
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25439
This introduces a type() method on IValue that returns the tagged type
of the IValue. The intention is that this value is always present/accurate,
making it possible for clients to recover the Type from an IValue.
Currently our APIs here are incomplete: they can sometimes recover a type but not always.
This PR adds the function, and cleans up remaining cases where Lists/Dicts are not
tagged. However, this information does not survive serialization unchanged.
A second PR will use the type information in the ClassType being serialized
to fix up the serialized ivalues to have the correct types again.
After this patch it will be safe to remove our incomplete APIs for recovering types.
Test Plan: Imported from OSS
Differential Revision: D17125595
Pulled By: zdevito
fbshipit-source-id: 71c8c1a0e44762647e8f15f45d8ed73af8e6cb92
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25151
The prim::GetAttr operator depends on node. However, in lite interpreter there will be no node dependency. Promote the operator to a first-class instruction.
Test Plan: Imported from OSS
Differential Revision: D17076412
fbshipit-source-id: 8de20978445bb598634c5462e66e4459dcd567be
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25148
Instructions will be used in lite interpreter as well. Pull it out of interpreter.cpp, so that the lite interpreter doesn't have to compile with interpreter.cpp.
Test Plan: Imported from OSS
Differential Revision: D17076413
fbshipit-source-id: 99b3d8d27a96823a4a4dde6b2337ee44635e34cb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25258
This is the first commit in a series to add interfaces to JIT.
Interfaces allow the specification through a blank python class of an
abstract interface that can be used in type annotations for Script functions.
If a TorchScript class implements all the methods in the interface with
the appropriate types, then it is implicitly considered to implement
that interface.
Follow-ups required:
* implementation of serialization
* implementation in the parser frontend
* better error reporting for explaining why a class does not meet an
interface specification.
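
For readers unfamiliar with implicit (structural) interface conformance, plain Python's `typing.Protocol` behaves in a similar spirit. The sketch below is an analogy only and does not use the TorchScript interface API; `OneTwo`, `Impl` and `Partial` are hypothetical names.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class OneTwo(Protocol):
    """A blank class that only declares the required methods."""

    def one(self, x: int) -> int: ...

    def two(self, x: int) -> int: ...


class Impl:
    # No inheritance from OneTwo: conformance is implicit.
    def one(self, x: int) -> int:
        return x + 1

    def two(self, x: int) -> int:
        return x * 2


class Partial:
    # Missing two(), so it does not satisfy the interface.
    def one(self, x: int) -> int:
        return x + 1


def use(obj: OneTwo, x: int) -> int:
    return obj.two(obj.one(x))


# Note: the runtime isinstance check only looks at method names; matching the
# annotated types is left to a static type checker.
print(isinstance(Impl(), OneTwo), isinstance(Partial(), OneTwo))
print(use(Impl(), 3))
```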
Test Plan: Imported from OSS
Differential Revision: D17079963
Pulled By: zdevito
fbshipit-source-id: a9986eeba2d4fdedd0064ce7d459c0251480a5a0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24284
This PR finishes the unification of all Tensor types into a single object.
ProfiledTensorType is renamed to TensorType and the old TensorType is
deleted.
Notes:
* Fixes bug in merge for VaryingShape by changing its representation to an
optional list of optional ints.
* Removes ProfiledTensorType::create(type) invocations that can now
simply be expect calls on tensor type.
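
The "optional list of optional ints" representation can be illustrated with a small Python model. The merge rule shown here (keep dimensions that agree, forget those that differ, forget everything when ranks differ) is an assumption about the intended semantics, not a copy of the C++ code.

```python
from typing import List, Optional

# None means "rank unknown"; a None entry means "this dimension is unknown".
VaryingShape = Optional[List[Optional[int]]]


def merge(a: VaryingShape, b: VaryingShape) -> VaryingShape:
    """Combine two observed shapes, keeping only what both agree on."""
    if a is None or b is None or len(a) != len(b):
        return None
    return [x if x == y else None for x, y in zip(a, b)]


print(merge([2, 3], [2, 4]))
print(merge([2, 3], [2, 3, 4]))
```

With a non-optional list there would be no way to distinguish "rank unknown" from "rank zero", which is one reason the outer optional is useful.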
Test Plan: Imported from OSS
Differential Revision: D16794034
Pulled By: zdevito
fbshipit-source-id: 10362398d0bb166d0d385d74801e95d9b87d9dfc
Summary:
Resolves https://github.com/pytorch/lockdown/issues/18
This implements NamedTuple by taking advantage of the existing `names` field in `TupleType`.
TODO: This currently doesn't retain the NamedTuple-ness through serialization. Discussed with suo offline, we can probably make a way to define an anonymous NamedTuple in script (e.g. `NamedTuple('Foo', [('a', int), ('b', float), ('c', List[float])])`) and serialize that.
TODO: implement support for calling the constructor with kwargs
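
For reference, this is how the functional NamedTuple form quoted above behaves in plain (non-scripted) Python. The snippet only uses the standard `typing` module and shows that a NamedTuple is an ordinary tuple plus a list of field names, which is the property the `names` field relies on.

```python
from typing import List, NamedTuple

# Functional syntax, as in the example from the TODO above.
Foo = NamedTuple('Foo', [('a', int), ('b', float), ('c', List[float])])

foo = Foo(1, 2.0, [3.0])

# Field names are stored alongside the positional tuple data.
print(Foo._fields)
print(foo.b, foo[1])
```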
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21428
Differential Revision: D15741564
Pulled By: jamesr66a
fbshipit-source-id: c077cbcea1880675ca6deb340a9ec78f824a136c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21177
- Integrate c10::ListPtr into IValue and the c10 dispatcher.
- Streamline conversion to/from IValue. Before, we had IValue::to<> and kernel_functor.h had its own ivalue_to_arg_type and return_type_to_ivalue. They are now unified. This also means that nested types like Dicts of Lists of Optional of Dict of ... now work as expected.
Differential Revision: D15476433
fbshipit-source-id: bde9df80df20091aa8e6ae17ba7e90abd149b954
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18833
ghimport-source-id: 6f2be25fcc5e6be3ffe20582e604bd2c1fbab66b
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18833 [STACK] Cache device on TensorImpl; clean up TensorImpl constructors.**
* #18832 [STACK] Disallow changing the device of a tensor via set_.
* #18831 [STACK] Stop swapping in Storages of the wrong device for Tensors.
1) We cache device on TensorImpl. This means we can access the device without a virtual function and allows us to more easily extend TensorImpls (because they don't need to figure out how to store the Device for themselves).
2) Clean up TensorImpl APIs. We had a constructor that took a TensorTypeId and an allocator and would allocate a Storage based on the recognized types of TensorTypeIds. Instead, we just have two different constructors: one for types with a storage, one without.
Reviewed By: dzhulgakov
Differential Revision: D14766230
fbshipit-source-id: 745b8db84dcd6cb58f1a8675ad3ff8d033bc50df
Summary:
This defines a generic counters API that users can utilize to provide monitoring functionality in e.g. a production service. We expose both counters for runtime internals as well as a TorchScript API to create user-defined counters. Synopsis of the API:
- `torch/csrc/jit/script/logging.h` specifies the externally-facing API in C++
- `torch/jit/_logging.py` specifies the Python API
We use an interface, `LoggerBase`, to define the interactions between users and a logging backend. Implementing a subclass of `LoggerBase` allows the user to handle these events in a custom way, such as logging into a DB or calling into an infra-specific counters API.
From the frontend perspective, we can create log events in two ways:
1. We provide an `add_stat_value(name, val)` function. This calls into the Logger backend with a key/value pair. For example, we might call `add_stat_value('foo', 1)` to bump an event counter.
2. We provide a `time_point()` function to record a timestamp in nanoseconds. This can be used in conjunction with `add_stat_value` to record runtime wall clock durations.
Examples of frontend usage can be found in `test_jit.py TestLogging`.
We provide a trivial `LockingLogger` implementation as an example and for testing purposes. It is likely not ready for production usage. It demonstrates that a backend implementing the API can do things like specify aggregation types and report these aggregate stats via the `get_counters()` API.
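
The shape of the API can be sketched with a pure-Python stand-in. This is not the code in `torch/jit/_logging.py`: the class and function names mirror the description above, but the bodies, the `set_aggregation_type` method and the "sum"/"avg" aggregation names are assumptions made for illustration.

```python
import threading
import time
from collections import defaultdict
from typing import Dict, List


class LoggerBase:
    """Stand-in for the backend interface: receives key/value events."""

    def add_stat_value(self, name: str, val: int) -> None:
        raise NotImplementedError


class LockingLogger(LoggerBase):
    """Toy backend that stores values under a lock and aggregates on read."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._values: Dict[str, List[int]] = defaultdict(list)
        self._aggregation: Dict[str, str] = {}

    def set_aggregation_type(self, name: str, kind: str) -> None:
        # Hypothetical knob: "sum" (default) or "avg".
        with self._lock:
            self._aggregation[name] = kind

    def add_stat_value(self, name: str, val: int) -> None:
        with self._lock:
            self._values[name].append(val)

    def get_counters(self) -> Dict[str, float]:
        with self._lock:
            counters: Dict[str, float] = {}
            for name, values in self._values.items():
                total = sum(values)
                if self._aggregation.get(name, "sum") == "avg":
                    counters[name] = total / len(values)
                else:
                    counters[name] = total
            return counters


def time_point() -> int:
    """Timestamp in nanoseconds, for measuring durations."""
    return time.perf_counter_ns()


logger = LockingLogger()
logger.set_aggregation_type("latency_ns", "avg")
logger.add_stat_value("foo", 1)  # bump an event counter
logger.add_stat_value("foo", 1)
start = time_point()
end = time_point()
logger.add_stat_value("latency_ns", end - start)
counters = logger.get_counters()
print(counters["foo"])
```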
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18235
Differential Revision: D14545060
Pulled By: jamesr66a
fbshipit-source-id: 04099543a1898cfdd411511e46e03d5dce9b4881
Summary:
1. Move ATen threadpool & open registration mechanism to C10
2. Move the `global_work_queue` to use this open registration mechanism, to allow users to substitute in their own
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17788
Reviewed By: zdevito
Differential Revision: D14379707
Pulled By: jamesr66a
fbshipit-source-id: 949662d0024875abf09907d97db927f160c54d45
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16751
This was made more complicated by the fact that ivalue::IntList
is a thing. So I had to fix all of the sites where we were referring
to IValue post facto.
The following codemods were run, in this order:
```
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntList IntArrayRef
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntArrayRef::create IntList::create
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in ivalue::IntArrayRef ivalue::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in Tag::IntArrayRef Tag::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in isIntArrayRef isIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in toIntArrayRef toIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'Shared<IntArrayRef>' 'Shared<IntList>'
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'intrusive_ptr<IntArrayRef>' 'intrusive_ptr<IntList>'
```
Some manual fixups were done afterwards; they can be reviewed separately
at https://github.com/pytorch/pytorch/pull/16752
Reviewed By: dzhulgakov
Differential Revision: D13954363
fbshipit-source-id: b5c40aacba042402155a2f5a229fa6db7992ac64
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15855
This is preparation work for moving IValue to c10.
Reviewed By: ezyang
Differential Revision: D13605259
fbshipit-source-id: cc545f582ab8607bb02aaf71273cb2710200b295