Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22084
For DictPtr/ListPtr, default construction was disallowed because it was ambiguous whether it was supposed to create an empty list or a nullptr.
But since we renamed them to Dict/List, we can now allow default construction without ambiguity.
Differential Revision: D15948098
fbshipit-source-id: 942a9235b51608d1870ee4a2f2f0a5d0d45ec6e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21177
- Integrate c10::ListPtr into IValue and the c10 dispatcher.
- Streamline conversion to/from IValue. Before, we had IValue::to<> and kernel_functor.h had its own ivalue_to_arg_type and return_type_to_ivalue. They are now unified. Also, this means that nested types like Dicts of Lists of Optionals of Dicts of ... now work as expected.
Differential Revision: D15476433
fbshipit-source-id: bde9df80df20091aa8e6ae17ba7e90abd149b954
Summary:
This PR eliminates unneeded grad_sum_to_size calls and in particular speeds up the LSTM backward by allowing better fusion.
It consists of two parts:
- In AutoDiff, record broadcasting sizes only if the broadcast output size is different from the input size, otherwise record None.
- The specialization of Optional arguments (#18407) then allows us to eliminate `_grad_sum_to_size(t, None)` in the peephole optimization step.
Thus, in the LSTM case, no SumToSize ops remain in the crucial fusion group. The trick here is that we can specialize on the runtime information from the forward.
I'm testing that different broadcasting situations lead to different graphs.
I didn't move all symbolic_script _grad_sum_to_size to the new logic, but it might be better to do this incrementally, anyway.
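A minimal sketch (not from the PR; names and shapes are made up) of the broadcasting-vs-no-broadcasting situations being tested:
```
import torch

@torch.jit.script
def scaled_add(x, y):
    return x * 2 + y

a = torch.randn(4, 3, requires_grad=True)
b_same = torch.randn(4, 3, requires_grad=True)  # sizes match: no sum-to-size needed
b_bcast = torch.randn(3, requires_grad=True)    # y broadcasts: its grad must be summed back to (3,)

scaled_add(a, b_same).sum().backward()
scaled_add(a, b_bcast).sum().backward()
print(b_same.grad.shape, b_bcast.grad.shape)  # torch.Size([4, 3]) torch.Size([3])
```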
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18697
Differential Revision: D15482076
Pulled By: wanchaol
fbshipit-source-id: 7f89367e35b8729910077c95c02bccefc8678afb
Summary:
We can't introduce aliasing to a graph output, since it may be mutated afterwards.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19576
Differential Revision: D15057734
Pulled By: eellison
fbshipit-source-id: 33594c05d985a0c58edebd6252e1ee2c0efb6f0e
Summary:
Move these 2 ops back to autodiff to unblock xla CI.
I will leave them for my next PR to clean up symbolic_variable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18969
Differential Revision: D14816811
Pulled By: ailzhang
fbshipit-source-id: dd8a7e133dcad29560d3d1d25691883960117299
Summary:
As discussed with gchanan, we should deduplicate symbolic_variable and symbolic_script to prepare for the future merge with derivatives.yaml.
This PR moves most easy formulas to symbolic_script.
TODO: run benchmarks to make sure no perf regression
cc: apaszke zdevito wanchaol
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17986
Differential Revision: D14766412
Pulled By: ailzhang
fbshipit-source-id: d95a3f876e256c0f505779a71587c985571d3b8f
Summary:
This PR did two things:
1. Enable scalar->float specialization in symbolic script, so an AD formula whose schema contains a scalar should write `float` instead.
2. Add addcmul and lerp to AD and the fuser (a small scripted example follows).
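A tiny scripted example (illustrative only, not from the PR) touching both newly supported ops; a, b, c, w are assumed to be same-shaped float tensors:
```
import torch

@torch.jit.script
def blend(a, b, c, w):
    # addcmul and lerp now have AD formulas and can be fused
    return torch.lerp(torch.addcmul(a, b, c, value=0.5), b, w)
```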
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18081
Differential Revision: D14490493
Pulled By: wanchaol
fbshipit-source-id: b3b86d960d5f051b30733bc908b19786111cdaa4
Summary:
This PR removes a few sizes of `self` that were passed from the forward pass to the backward pass when `self` is already required in the backward pass. This could be the reason for the potential slowdown in #16689. I will attach a few perf numbers I got (still a bit volatile among runs, though) in the comments.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17187
Differential Revision: D14179512
Pulled By: ailzhang
fbshipit-source-id: 5f3b1f6f26a3fef6dec15623b940380cc13656fa
Summary:
Trying to land again: make prim::None into a case of prim::Constant. The previous landing was reverted because it broke an important ONNX export test.
https://github.com/pytorch/pytorch/pull/16160
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17186
Differential Revision: D14115304
Pulled By: eellison
fbshipit-source-id: 161435fc30460b4e116cdd62c7b2e5b94581dcb7
Summary:
Re-enables rand_like fusion if no tensor is broadcast in the fusion group. This is a sufficient but not necessary condition for fused rand_like to produce correct results, and it has an unpleasant side effect of falling back to the non-fused path if rand_like was optimistically included in the fusion group but there is a broadcast in the fusion group not necessarily related to rand_like. E.g. before this PR, if the network had (biasAdd -> relu -> dropout), the fuser could fuse biasAdd and relu; now it will try fusing the whole thing (if dropout is expressed via rand_like) and fall back every time.
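A hedged sketch (not from the PR) of the biasAdd -> relu -> dropout pattern described above, with dropout expressed via rand_like; if `bias` broadcasts against `x`, this is exactly the fall-back case:
```
import torch

@torch.jit.script
def bias_relu_dropout(x, bias, p: float):
    y = torch.relu(x + bias)                     # biasAdd -> relu
    mask = (torch.rand_like(y) > p).to(y.dtype)  # dropout expressed via rand_like
    return y * mask / (1.0 - p)
```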
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16087
Differential Revision: D13720232
Pulled By: zou3519
fbshipit-source-id: 1e19203bec4a59257bfc7078b054a19f00fab4ad
Summary:
This change simplifies analysis done on constants since prim::None does not need to be handled separately now. To check if a constant node is None, use node->isNone().
Next step will be to remove prim::Undefined.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16160
Differential Revision: D14109636
Pulled By: eellison
fbshipit-source-id: d26fd383976163a2ddd4c24984bd672a541cc876
Summary:
- Moved a few functions from the `autograd` namespace to the `aten` namespace so they are visible from the JIT nativeResolver.
- Added a hack to look up keyword-only arguments. Will add proper support for kw-only args later.
- Simulate function overload in aten using `_<number>` as a function name suffix.
- Even when `forward` returns multiple outputs, as in `kthvalue`, there's at most one that requires grad, which is what we currently support.
- Removed the `TensorList` related ops here since partial `TensorList` support is prone to bugs. Our symbolic diff for `cat` was never tested with autodiff, and it seems broken. We need to find a proper way to support these ops (either by properly supporting `TensorList` or something like `prim::ConstantChunk`) and leave them for a next PR.
Ops supported in this PR:
```
erf
expand_as
index
kthvalue
mean
permute
pow
rsub
select
sqrt
squeeze
t
to
topk
transpose
view
var
embedding
logsumexp
// grad is None
_dim_arange
contiguous
nonzero
ones_like
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16689
Differential Revision: D14020806
Pulled By: ailzhang
fbshipit-source-id: a5e2c144a7be5a0d39d7ac5f93cb402ec12503a5
Summary:
I went through my build log and did what I thought were reasonable fixes to all the C++ compilation warnings that came up
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16411
Differential Revision: D13901006
Pulled By: jamesr66a
fbshipit-source-id: 02df4e3e5a5c8dd9e69ac9f065cd3f2a80645033
Summary:
Here is a fresh attempt at getting some fusion back in autodiff-generated graphs in the presence of SumToSize.
- The sum to size operator is now `aten::_grad_sum_to_size` to allow symbolic script differentiation (and that in turn would need to use this in place of sum_to_size to signal that it strictly operates on gradients). This is also used in the autodiff code, replacing `prim::SumToSize`.
- `_grad_sum_to_size` is now fusable; `cat`s - which are fused afterwards thanks to Adam's simplification of the code - are only fused if there is no `_grad_sum_to_size` in the fusion group.
- I push the `_grad_sum_to_size` out of the fusion group when compiling and record the desired summations in the KernelSpec. The reasoning is the following:
  - As the autodiff is a repeated application of the chain rule, we always have the pattern `grad_in = mm(A, grad_out)`, with A often diagonal for cases interesting to the fuser, whence it is `grad_in = a * grad_out` (a pointwise multiplication). We know that only `grad_out` may have AutodiffGradSumToSize applied, so we can commute AutodiffGradSumToSize with the `mul` (and `div` and `neg` are of similar origin); a small numeric check of this commuting rule follows this list.
  - For `type_as` the gradient might be giving the type, so we just skip SumToSize.
  - `add` (which was inserted as `prim::AutogradAdd`) adds gradients when the forward used the same value in several places. This is non-broadcasting, so we know that the two arguments have the same sizes as the inputs - which is good, as we don't have to do bookkeeping for the two parts.
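A small numeric check (illustrative only) of the commuting rule above for the scalar-multiplier case; `Tensor.sum_to_size` stands in here for the internal `_grad_sum_to_size`:
```
import torch

grad_out = torch.randn(4, 3)  # gradient that was broadcast in the forward
a = 2.5                       # pointwise multiplier produced by the chain rule

lhs = (a * grad_out).sum_to_size(3)  # sum the product back to the input size (3,)
rhs = a * grad_out.sum_to_size(3)    # ...or sum first and multiply afterwards
print(torch.allclose(lhs, rhs))      # True: sum-to-size commutes with the mul
```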
Details:
- During fusion, the Tensor arguments are always kept as the first parameters of the fusion group to accommodate indexing assumptions in the fuser.
- The rewriting of the fusion group to record the necessary output transformation and eliminate `_grad_sum_to_size` from the fusion group is now in the fuser compile step.
- In the execution step, the arguments are split into Tensor / Non-Tensor and the non-tensor args are mostly forgotten about except for doing `sum_to_size` at the end. This would want to be improved if/when we fuse nonconstant scalar arguments.
- In a number of places in the fuser, the non-Tensor arguments to the fusion group needed to be ignored.
Thank you, apaszke for the insightful discussion. All bad ideas and errors are my own.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14957
Differential Revision: D13888173
Pulled By: zou3519
fbshipit-source-id: 071992c876e8b845f2b3e6329ae03a835d39a0ea
Summary:
The PR clang-formats everything in `torch/csrc/jit/` and adds it to the pre-commit hook.
Here is a list of non-mechanical changes:
- I went over each file and fixed up whenever I could tell that clang-format was clobbering comment formatting.
- Made the macros in register_prim_ops a little more clang-format friendly by omitting trailing commas
- Refactored autodiff.cpp to use a helper class with explicit state rather than a bunch of capturing lambdas
- Small improvements to the precommit hook clang-format
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15524
Differential Revision: D13547989
Pulled By: suo
fbshipit-source-id: 3ff1541bb06433ccfe6de6e33f29227a2b5bb493
Summary:
This PR enables autodiff to use the forward/backward graph compiled from python code, instead of using symbolic gradients (modifying the original graph directly).
We put the map in a separate .h file for now to wait for the native_functions.yaml and derivatives.yaml merge. This should ideally go into native_functions.yaml eventually.
This PR should be enough to unblock us for now, we can start writing gradients for aten functions in python.
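A rough sketch of the symbolic_script style this enables; the formula below is a generic illustration, not code from the PR:
```
def mul(self, other):
    def backward(grad_output):
        grad_self = grad_output * other
        grad_other = grad_output * self
        return grad_self, grad_other
    # forward result plus a backward closure over the saved values
    return self * other, backward
```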
Differential Revision: D13494635
Pulled By: ailzhang
fbshipit-source-id: f8d51a15243ac46afd09d930c573ccdfcd9fdaaf
Summary:
This makes DCE more granular by tracking live values/aliases through the graph (rather than just nodes). So we can be more aggressive in DCE around control flow blocks. For example, in:
```
%a0 = aten::foo()
%b = aten::foo()
%a2, %b2 = prim::If(%cond) {
  block0() {
    %a1 = aten::foo(%a0)
    %b1 = aten::foo(%b)
  } -> (%a1, %b1)
}
return (%a2)
```
we will now dce all the `%b` stuff.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14910
Differential Revision: D13476445
Pulled By: suo
fbshipit-source-id: 2bf5db19711c07dde946697a4f4b270bd8baf791
Summary:
Anywhere we used #include "foo.h", we now say #include <foo.h>
Paths are adjusted to be rooted out of aten/src, torch/lib, or
the root level directory.
I modified CMakeLists.txt by hand to remove TH and THC from
the include paths.
I used the following script to do the canonicalization:
```
import subprocess
import re
import os.path
files = subprocess.check_output(['git', 'ls-files']).decode('utf-8').rstrip().split('\n')
for fn in files:
    if not any(fn.endswith(suff) for suff in ['.cu', '.cpp', '.in', '.h', '.hpp', '.cu', '.cuh', '.cc']):
        continue
    if not any(fn.startswith(pref) for pref in ["aten/", "torch/"]):
        continue
    with open(fn, 'r') as f:
        c = f.read()
    def fmt(p):
        return "#include <{}>".format(p)
    def repl(m):
        p = m.group(1)
        if p in ["dlfcn.h", "unistd.h", "nvrtc.h", "cuda.h", "cuda_runtime.h", "cstdint", "cudnn.h", "Python.h", "cusparse.h", "cuda_runtime_api.h", "cuda_fp16.h", "cublas_v2.h", "stdint.h", "curand_kernel.h"]:
            return fmt(p)
        if any(p.startswith(pref) for pref in ["torch/csrc", "c10/", "ATen/", "caffe2/", "TH/", "THC/", "Eigen/", "gtest/", "zdl/", "gloo/", "onnx/", "miopen/"]):
            return fmt(p)
        for root in ["aten/src", "torch/lib", ""]:
            for bad_root in [os.path.dirname(fn), "aten/src/TH", "aten/src/THC", "torch/csrc"]:
                new_p = os.path.relpath(os.path.join(bad_root, p), root)
                if not new_p.startswith("../") and (os.path.exists(os.path.join(root, new_p)) or os.path.exists(os.path.join(root, new_p + ".in"))):
                    return fmt(new_p)
        print("ERROR: ", fn, p)
        return m.group(0)
    new_c = re.sub(r'#include "([^"]+)"', repl, c)
    if new_c != c:
        print(fn)
        with open(fn, 'w') as f:
            f.write(new_c)
```
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14849
Reviewed By: dzhulgakov
Differential Revision: D13363445
Pulled By: ezyang
fbshipit-source-id: 52361f878a672785f9306c9e9ab2513128092b68
Summary:
Not ready yet, I need some comments / help with this. It's good enough for https://github.com/pytorch/xla's immediate goals (forward + backward trace fusion), but there are at least two issues with it:
1. If we don't allow the weight to be set, `test/test_jit.py` fails to cover the change.
2. If we allow the weight to be set, running `test/test_jit.py TestJitGenerated.test_nn_nll_loss` fails with:
```
======================================================================
ERROR: test_nn_nll_loss (__main__.TestJitGenerated)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test/test_jit.py", line 10001, in do_test
    fn, f_args_variable, kwargs_variable, no_grad=no_grad)
  File "test/test_jit.py", line 9360, in check_against_reference
    outputs_test = self.runAndSaveRNG(func, recording_inputs, kwargs)
  File "test/test_jit.py", line 425, in runAndSaveRNG
    results = func(*inputs, **kwargs)
  File "test/test_jit.py", line 9298, in script_fn
    self.assertExportImport(CU.the_method.graph, tensors)
  File "test/test_jit.py", line 415, in assertExportImport
    self.assertExportImportModule(m, inputs)
  File "test/test_jit.py", line 419, in assertExportImportModule
    self.assertEqual(self.runAndSaveRNG(m.forward, inputs),
  File "test/test_jit.py", line 425, in runAndSaveRNG
    results = func(*inputs, **kwargs)
RuntimeError:
arguments for call are not valid:
for operator aten::nll_loss_backward(Tensor grad_output, Tensor self, Tensor target, Tensor? weight, int reduction, int ignore_index, Tensor total_weight, *, Tensor out) -> Tensor:
expected a value of type Tensor for argument 'total_weight' but found bool
<internally-created-node>
~ <--- HERE
for operator aten::nll_loss_backward(Tensor grad_output, Tensor self, Tensor target, Tensor? weight, int reduction, int ignore_index, Tensor total_weight) -> Tensor:
expected a value of type Tensor for argument 'total_weight' but found bool
<internally-created-node>
~ <--- HERE
for call at:
<internally-created-node>
~ <--- HERE
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14305
Differential Revision: D13356265
Pulled By: ezyang
fbshipit-source-id: 504d783b2d87f923e698a6a4efc0fd9935a94a41
Summary:
**Review only the last commit.**
This commit adds a few optimizations to AD that let us dramatically
reduce the number of sizes we capture from the forward pass.
We now:
- collapse chains of SumToSize
- avoid capturing sizes of tensors that are captured anyway
- more aggressively DCE the reverse code
- run CSE on the primal code to deduplicate `aten::size` calls
cc zou3519 zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14758
Differential Revision: D13324440
Pulled By: zou3519
fbshipit-source-id: 45ccbc13605adcef2b461840c6089d3200000c72
Summary:
Previously symbolic AD formulas assumed that no broadcasting happened,
and would return gradients of incorrect shapes (possibly leading to
silent errors later).
Fixes a few bugs (known and unknown):
- #11736
- ArgumentSpec didn't compute the input types correctly [(it didn't advance the offset for non-tensor args)](https://github.com/pytorch/pytorch/pull/14485/files#diff-4fd3157a056596aefb8cdf41022a208bR153)
- Symbolic AD could suffer from use after free (dangling pointers in grad map), because [`EliminateDeadCode` could have removed nodes](https://github.com/pytorch/pytorch/pull/14485/files#diff-25d33ad1ed6855684dec79d927ca6142L781) that referenced gradients of certain values.
- Undefined behavior in `aten::size`
During my tests I've also found a few new problems, and I have opened issues for them:
- FusionGroup seems to think that cat nodes broadcast their inputs (#14483)
- `prim::ConstantChunk` derivative formula doesn't handle undefined inputs (#14484)
This patch unfortunately deoptimizes some of our code (Fusion doesn't happen past chunk nodes, and outputs more tensors only because we have to get their size). I know how to fix those issues, but wanted to fix this terrible bug quickly.
cc zou3519 zdevito ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14485
Reviewed By: eellison
Differential Revision: D13312888
Pulled By: suo
fbshipit-source-id: ad46bfb4d0a306ad9451002f8270f7a790f72d58
Summary:
Previously symbolic AD formulas assumed that no broadcasting happened,
and would return gradients of incorrect shapes (possibly leading to
silent errors later).
Fixes a few bugs (known and unknown):
- #11736
- ArgumentSpec didn't compute the input types correctly [(it didn't advance the offset for non-tensor args)](https://github.com/pytorch/pytorch/pull/14485/files#diff-4fd3157a056596aefb8cdf41022a208bR153)
- Symbolic AD could suffer from use after free (dangling pointers in grad map), because [`EliminateDeadCode` could have removed nodes](https://github.com/pytorch/pytorch/pull/14485/files#diff-25d33ad1ed6855684dec79d927ca6142L781) that referenced gradients of certain values.
- Undefined behavior in `aten::size`
During my tests I've also found a few new problems, and I have opened issues for them:
- FusionGroup seems to think that cat nodes broadcast their inputs (#14483)
- `prim::ConstantChunk` derivative formula doesn't handle undefined inputs (#14484)
This patch unfortunately deoptimizes some of our code (Fusion doesn't happen past chunk nodes, and outputs more tensors only because we have to get their size). I know how to fix those issues, but wanted to fix this terrible bug quickly.
cc zou3519 zdevito ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14485
Differential Revision: D13280899
Pulled By: soumith
fbshipit-source-id: 80cc5ec9331be80e1bb9ddfe85b81c2b997e0b0c
Summary:
This adds scalar type support to the fuser, both internally (instead of auto / assuming float) and for the inputs/outputs.
We can now fuse things with inputs / outputs of arbitrary scalar type; in particular, comparisons and `where` work well. This fixes #13384 by returning a tensor of the right type (and adds a test where byte and double tensors are returned).
The type inference is done by re-calling PropagateTensorShapeOnNode in the compilation; I would venture that it isn't prohibitively expensive compared to the actual compilation. (Propagation was fixed for `where` to return the second argument's type and amended to handle FusedConcat.)
I'm not sure how to add a check for the code generated by the fuser, but I am not sure we absolutely need to (we'd see if it is invalid / produces wrong results).
Thanks in particular to apaszke, fmassa, mruberry for advice and encouragement! All the errors are my own.
I have discussed order of PRs briefly with mruberry, if this goes in before he submits the PR, he graciously agreed to rebasing his, but I'd happily rebase, too.
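As a hedged illustration (not from the PR) of the comparison/`where` case, a scripted function whose outputs are not all float tensors:
```
import torch

@torch.jit.script
def clamp_negatives(x):
    mask = x > 0  # comparison: produces a non-float tensor
    return torch.where(mask, x, torch.zeros_like(x)), mask
```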
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14336
Differential Revision: D13202620
Pulled By: soumith
fbshipit-source-id: 855159e261fa15f21aca3053bfc05fb3f720a8ef
Summary:
First draft of an alias analysis pass. It's a big PR unfortunately; a rough table of contents/suggested order of review:
1. `AliasAnalysis` pass, which traverses the graph and builds an `AliasDb`. The basic strategy is to assign alias information to every value of mutable type (list/tuple/tensor), and use the alias annotations of each node's schema to assign alias info to the outputs based on the alias info of the inputs. Nodes that aren't explicitly schematized have hand-written analysis rules (a small example of the aliasing behavior this has to model follows the lists).
2. Integration of aliasing information into `moveBefore/AfterTopologicallyValid()`. Basically, we pass in an alias DB when we ask for moveBefore/After. Similar to how we can boil down dependency analysis to "what nodes use this node", we can boil down mutability analysis to "what nodes write to an alias set input/output'd by this node".
3. Integration of alias analysis to optimization passes that need it. Right now, it is `GraphFuser`, `CreateAutodiffSubgraphs`, constant prop, and CSE. Not sure if any others need it.
- Testing; still figuring out the best way to do this.
- Eventually we want to integrate the alias db into the graph, but we shouldn't do that until we can guarantee that the information can stay up to date with mutations.
- Do the same thing `python_printer` did for operators and force people to register alias analyzers if they can't schematize their op.
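A small example (illustrative only) of the kind of aliasing the pass has to model - a write through a view mutates the original value:
```
import torch

@torch.jit.script
def bump(x):
    y = x.view(-1)  # y aliases x
    y.add_(1)       # writing through y also writes x
    return x
```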
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14018
Differential Revision: D13144906
Pulled By: suo
fbshipit-source-id: 1bc964f9121a504c237cef6dfeea6b233694de6a
Summary:
* Add hooks to get a callback whenever a valid graph is produced in the compiler or through tracing. These hooks can be used to pretty_print and then reparse every graph our tests produce to check that the serialization function works correctly. Currently this is guarded by an environment variable since there are a few remaining failures.
* Fix printing bugs: print True and False rather than 1 and 0, and print 0. for floating-point zero
* Change behavior of NoneType. It is no longer a subtype of Optional but instead implicitly converts to it, returning a prim::Node with an Optional[T] type for some specific T. This allows functions like `_unwrap_optional` to correctly match against a None while still deriving the right type (a small scripted example follows this list).
* Fix a bug where empty blocks did not correctly emit "pass" in printer.
* Fix a bug where prim::Undefined sometimes cannot be printed as None because it is being used in a schema-less op. This should be fixable once Optional[T] always uses the same None object.
* Other minor printing bugs
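An illustrative (not from the PR) script showing the None-vs-Optional matching described in the NoneType bullet:
```
import torch
from typing import Optional

@torch.jit.script
def or_default(x: Optional[float]) -> float:
    if x is None:  # matching against None refines the Optional
        return 1.0
    return x
```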
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13959
Reviewed By: jamesr66a
Differential Revision: D13073519
Pulled By: zdevito
fbshipit-source-id: 4167a6b614f2e87b4d21823275a26be5ba4fc3dd
Summary:
Enables almost all `modernize-*` checks in clang-tidy. This warns against things such as:
- Use of `const std::string&` instead of new-style `std::string` + move,
- Using old-style loops instead of range-for loops,
- Use of raw `new`
- Use of `push_back` instead of `emplace_back`
- Use of `virtual` together with `override` (`override` is sufficient)
ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13196
Differential Revision: D12891837
Pulled By: goldsborough
fbshipit-source-id: 4d0f782a09eb391ee718d3d66f74c095ee121c09
Summary:
This PR adds optional type to ATen native, autograd, the JIT schema and the Python arg parser; closes #9513. It allows us to use optional default values (including None) for function signatures and implementations like clamp, etc., and also lets us remove the python_default_init hack.
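For example (illustrative, not from the PR), `clamp` can take None for either bound:
```
import torch

x = torch.randn(5)
print(x.clamp(min=0.0))            # max left as None
print(x.clamp(min=None, max=1.0))  # min passed explicitly as None
```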
Follow up:
remove python_default_init completely.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12582
Differential Revision: D10417423
Pulled By: wanchaol
fbshipit-source-id: 1c80f0727bb528188b47c595629e2996be269b89
Summary:
This PR:
1. Makes clang-tidy diff against `master` instead of `HEAD~1` in CI, which makes much more sense
2. Enables all checks in the `bugprone-*` category (see https://clang.llvm.org/extra/clang-tidy/checks/list.html) except one about parentheses in macros, because it doesn't always apply too well for us.
Fixed some nice code smells.
ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12378
Differential Revision: D10247972
Pulled By: goldsborough
fbshipit-source-id: 97dc9e262effa6874d2854584bf41a86684eb8bd
Summary:
This PR adds a bool type to `IValue` and puts it into place.
* changes conds for `prim::If` and `prim::Loop` to use `bool` type (a small scripted example follows the test list)
* changes operators that take `bool`s to match their native ops
* fixes ambiguous `aten` ops `aten::std` and `aten::var`
* fixes tests in `test_jit.py TestJitGenerated`
```
'test_std_dim',
'test_std_dim_1d',
'test_std_dim_1d_neg0',
'test_std_dim_neg0',
'test_var_dim',
'test_var_dim_1d',
'test_var_dim_1d_neg0',
'test_var_dim_neg0'
```
* adds `prim::BoolToTensor` and `prim::TensorToBool`
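A small scripted example (illustrative only, not from the PR) where the `if` condition is a `bool` obtained from a tensor comparison:
```
import torch

@torch.jit.script
def count_positive(x):
    # x: 1-D tensor
    n = 0
    for i in range(x.size(0)):
        if bool(x[i] > 0):  # tensor comparison converted to a bool condition
            n += 1
    return n
```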
apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11834
Differential Revision: D9928570
Pulled By: driazati
fbshipit-source-id: 373c53df2f1a8ffa9e33d9a517002fbeef25f3eb
Summary:
This functionality replaces the Scalar-Tensor builtin operators
with builtin functions.
Builtin functions are used in place of operators where one operator
can be defined using a composition of another. This simplifies later
optimization passes by allowing us to have fewer operators.
In the future, builtin functions can be used for other purposes.
For example, we can define derivative functions as code rather than
building graphs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12141
Reviewed By: ezyang
Differential Revision: D10088065
Pulled By: zdevito
fbshipit-source-id: a2acb06346e649c4c8a2fe423b420871161c21cf
Summary:
As reported in Issue #10726, the jit compiler, when running on ppc64le, may produce isomorphic output but fail a diff test against the expected output file. The expected output file is created from a test that was run on x86_64. This change ensures that if the ppc64le test output is different, the output is instead compared to an expected output file created when the test is run on a ppc64le system.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11579
Differential Revision: D10080890
Pulled By: soumith
fbshipit-source-id: 7249bf6b5dfa7c853368a3688a982bc9ed642bc9