Commit Graph

66 Commits

Nikita Shulga
1906eaf22f [BE] Get rid of future (#92596)
PyTorch has been Python 3-only for ages, so it's a shame to still rely on `future.utils`, even in the deprecated Caffe2 codebase.

For the reference:
https://peps.python.org/pep-0469/#migrating-directly-to-python-3
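The PEP 469 migration referenced above boils down to a mechanical rewrite; a minimal sketch (the old spellings are shown only in comments):

```python
# PEP 469 migration sketch: replace future/six dict helpers with the plain
# Python 3 dict methods.
d = {"a": 1, "b": 2}

# Old (Python 2 / future.utils):  viewitems(d), viewkeys(d), viewvalues(d)
# New (Python 3):                 d.items(),    d.keys(),    d.values()
items = sorted(d.items())
keys = sorted(d.keys())

print(items)  # [('a', 1), ('b', 2)]
```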

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92596
Approved by: https://github.com/kit1980, https://github.com/orionr
2023-01-19 08:46:50 +00:00
Ram Rachum
351d73b97f Fix exception causes all over the codebase (#90271)
This is the continuation to #90134 and hopefully the final PR in this series.
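The fix pattern in this series is exception chaining (PEP 3134); a minimal sketch with an illustrative function, not code from the actual PR:

```python
# Re-raise with an explicit cause so the original traceback is preserved.
def parse_port(value):
    try:
        return int(value)
    except ValueError as err:
        # Before: raise RuntimeError("bad port: %s" % value)  # loses context
        raise RuntimeError("bad port: %s" % value) from err

try:
    parse_port("http")
except RuntimeError as exc:
    # The original ValueError is kept as the cause of the new exception.
    assert isinstance(exc.__cause__, ValueError)
```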

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90271
Approved by: https://github.com/kit1980
2022-12-07 04:29:00 +00:00
Adam Simpkins
81b9aa743b [pytorch] Update caffe2/python to eliminate Pyre errors (#52083)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52083

This makes minor fixes in `caffe2/python` to address all errors currently
reported by Pyre.

I update the code to fix errors when doing so looked simple and safe,
and added `pyre-fixme` comments in other places.
ghstack-source-id: 121109695

Test Plan: Confirmed that Pyre no longer reports errors under `caffe2/python`

Differential Revision: D26272279

fbshipit-source-id: b1eb19d323b613f23280ce9c71e800e874ca1162
2021-02-11 11:04:59 -08:00
Xiaodong Wang
d386d3323f [dper] suppress excessive msg (#48404)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48404

On bento this prints a lot of messages like the following (see N408483 if you're an internal user):
```
W1123 120952.322 schema.py:811] Scalar should be considered immutable. Only call Scalar.set() on newly created Scalar with unsafe=True. This will become an error soon.
```
And it ignores the log level I set globally. Removing this line unless it is super important.

Test Plan: build a local dper package and verify

Differential Revision: D25163808

fbshipit-source-id: 338d01c82b4e67269328bbeafc088987c4cbac75
2020-11-30 14:55:52 -08:00
Gary Zheng
f1babb00f0 [caffe2] Fix ListWithEvicted _pprint_impl wrongly printing _evicted_values (#47881)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47881

ListWithEvicted's _pprint_impl was accidentally printing _items before this change.

Reviewed By: dzhulgakov

Differential Revision: D24928521

fbshipit-source-id: 0d7940719b4a27defbaae3b99af104d7fe7b5144
2020-11-13 09:23:10 -08:00
Gary Zheng
8b3f1d1288 [caffe2] Add __slots__ to all classes in schema.py (#47541)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47541

The profiler has guided us to `schema.py`. Since these `Field`s are used everywhere and in huge quantities, we can easily make some system-wide optimizations by adding `__slots__`.

From StackOverflow, benefits include:

* faster attribute access.
* space savings in memory.

Read more: https://stackoverflow.com/a/28059785/
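The effect can be sketched in a few lines (hypothetical `Field`-like classes, not the actual schema.py code): with `__slots__`, instances store attributes in fixed slots instead of a per-instance `__dict__`, which is what saves memory at scale.

```python
class FieldWithDict:
    def __init__(self, name):
        self.name = name

class FieldWithSlots:
    __slots__ = ("name",)  # fixed attribute layout, no per-instance __dict__
    def __init__(self, name):
        self.name = name

f = FieldWithSlots("x")
assert not hasattr(f, "__dict__")             # no dict allocated per instance
assert hasattr(FieldWithDict("x"), "__dict__")  # regular class pays for one
```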

Reviewed By: dzhulgakov

Differential Revision: D24771078

fbshipit-source-id: 13f6064d367440069767131a433c820eabfe931b
2020-11-09 16:16:28 -08:00
Gary Zheng
4c52a56c40 [caffe2] Properly call super init in schema.py (#47542)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47542

The previous pattern of `Field.__init__(self, [])` is simply wrong. Switching to the Python 2-compatible form: `super(ObjectName, self).__init__(...)`.
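The corrected pattern, sketched with hypothetical stand-ins for the schema classes: calling through `super` keeps initialization cooperative for subclasses instead of hardcoding the parent class and its arguments.

```python
class Field(object):
    def __init__(self, children):
        self.children = list(children)

class Struct(Field):
    def __init__(self, *fields):
        # Before: Field.__init__(self, [])  -- hardcodes the class and args
        super(Struct, self).__init__(fields)

s = Struct("a", "b")
assert s.children == ["a", "b"]
```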

Reviewed By: dzhulgakov

Differential Revision: D24771077

fbshipit-source-id: d6798c72090c0264b6c583602cae441a1b14587c
2020-11-09 15:02:22 -08:00
Bugra Akyildiz
27c7158166 Remove __future__ imports for legacy Python 2 support (#45033)
Summary:
The `2to3` tool has a fixer named `future` that specifically removes these imports; the `caffe2` directory has the most redundant ones:

```2to3 -f future -w caffe2```
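What the `future` fixer deletes, sketched on a toy module: on Python 3 these imports are no-ops, so the line is simply removed.

```python
# Before (Python 2 compatibility):
#   from __future__ import absolute_import, division, print_function, unicode_literals
#   print("hello")
#
# After (Python 3 defaults apply, so the import line is gone):
print("hello")      # print is already a function
assert 3 / 2 == 1.5  # / is already true division
```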

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033

Reviewed By: seemethere

Differential Revision: D23808648

Pulled By: bugra

fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38
2020-09-23 17:57:02 -07:00
Brian Wignall
f326045b37 Fix typos, via a Levenshtein-type corrector (#31523)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos, with https://github.com/bwignall/typochecker to help automate the checking.

Uses an updated version of the tool used in https://github.com/pytorch/pytorch/pull/30606 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31523

Differential Revision: D19216749

Pulled By: mrshenli

fbshipit-source-id: 7fd489cb9a77cd7e4950c1046f925d57524960ea
2020-01-17 16:03:19 -08:00
Brian Wignall
e7fe64f6a6 Fix typos (#30606)
Summary:
Should be non-semantic.

Uses https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines to find likely typos.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30606

Differential Revision: D18763028

Pulled By: mrshenli

fbshipit-source-id: 896515a2156d062653408852e6c04b429fc5955c
2019-12-02 20:17:42 -08:00
Alyssa Wang
bb07f2d063 Pass LRU hash output evicted_values to SparseLookup (#21389)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21389

As titled. To do weight re-initialization on evicted rows in the embedding table, we need to pass the evicted hashed values to SparseLookup, the layer responsible for constructing the embedding table and doing pooling.

To pass evicted values, we adjust the output record of lru_sparse_hash to include them, and add an optional input to all processors that take a sparse segment. For SparseLookup to receive the evicted values, its input record is adjusted as well: it can now have type IdList, IdScoreList, or a struct of feature + evicted values.

Reviewed By: itomatik

Differential Revision: D15590307

fbshipit-source-id: e493881909830d5ca5806a743a2a713198c100c2
2019-07-02 11:27:37 -07:00
Gerard Goossen
148e90ba2a Give clear error message when attempting to merge struct which can't be merged.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/19804

Differential Revision: D15098833

fbshipit-source-id: 2950e247c74e125e033cd9cfbf5631eee5298ea0
2019-05-10 07:01:01 -07:00
Chandler Zuo
6737190b5c Make the exception raised from "numpy.dtype(numpy.void, (INT,))" less cryptic (#16809)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16809

https://fb.facebook.com/groups/582508038765902/permalink/736710343345670/?comment_id=824042307945806&reply_comment_id=824318864584817

numpy.dtype(numpy.void, (<INT>, )) raises the cryptic message "invalid itemsize in generic type tuple", which is hard to debug.

This diff adds a message asking the user to investigate the error-causing blob.
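The idea can be sketched as a wrapper that names the failing blob in the error; `dtype_for_blob` and `blob_name` are illustrative names, not the actual caffe2 API.

```python
import numpy as np

def dtype_for_blob(dtype_spec, blob_name):
    # Wrap dtype construction so the failing blob is named in the error
    # instead of numpy's terse message.
    try:
        return np.dtype(dtype_spec)
    except Exception as err:
        raise TypeError(
            "Could not build a numpy dtype for blob '%s' from spec %r; "
            "please investigate that blob." % (blob_name, dtype_spec)
        ) from err

assert dtype_for_blob(np.float32, "dense").itemsize == 4
```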

Reviewed By: kennyhorror

Differential Revision: D13973359

fbshipit-source-id: 43a0c492ffafbabdfd7f7541c08a258e5ac0280f
2019-02-08 16:46:50 -08:00
Lei Zhang
9c321a8779 Add util function from core type to dtype (#10716)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10716

title

Reviewed By: idning

Differential Revision: D9417357

fbshipit-source-id: 0f71805b1d64a46791d6ee4d8620763f878ffdb6
2018-08-21 10:55:19 -07:00
Sebastian Meßmer
49f8581745 Update from facebook (#7855)
* [mpscnn] MPSCNNChannelShuffle

att

* [Easy] Adding tags as an argument to the functional layer

Without it "tags" would be added as an argument to the operator.

The change here is based on the assumption that there is no operator that takes "tags" as an argument.

* Fix locally_connected_op schema check.

* [C2] Add TypeAndShape inference for few more operators

As desc

* [c2] Shape inference should support 0 as dimension

Tensors can have 0 in their dimension.

* Make MockHiveReader loop over and support max_examples

Replace DatasetReader with RandomDatasetReader.

So that Mock Hive Reader can simulate a large data input using a small sample file as source.

* Utility function to wipe cache between benchmark runs

Caffe2 benchmark does not wipe out cache between runs, and this potentially creates an unrealistically optimistic picture of performance. This diff adds utility function to wipe out the cache.

* Allow caffe2 GlobalInit to be invoked multiple times

Allow caffe2 GlobalInit to be invoked multiple times. Will re-parse gflags and update logging levels on successive invocations, but will not re-run init functions or perform other one-time initialization.

* Add Caffe2 GlobalInitIsCalledGuard to base net and operator classes

Warn if caffe2's GlobalInit function has not been invoked before creating an operator or net object. This is based on discussion here: https://fb.quip.com/kqGIAbmK7vNG

* Rethrow current exception on failure

Rethrow current exception instead of copy constructing a new one on op failure.

* Make `clone()` return subclass of List/Struct

`clone()` does not work correctly when we subclass those classes.

* Wipe the cache before the net run

the util function is copied from D7409424
will rebase once D7409424 is landed.

* [Caffe2] [Mobile] Support utils/cast.h::GetCastDataType with LITE_PROTO builds

* Correct includes

async_polling include -> async_base include

* Prepare execution flags for executor migration

Making async_scheduling aware of underlying net type to prepare for executor
migration

* Add operator level observers into async executor

Adding operator level observers into RunAsync operators' calls

* Cleanup TEST_Benchmark

Remove duplicate code and provide default implementation in NetBase

* [C2] Fix type and shape inference for binary comparison ops

As desc.

* Add GlobalInit to predictor to ensure initialization is always done before prediction

FACEBOOK:

Redo D7651453 the correct way.

Now use a static variable for the arguments passed to GLog

* Remove spammy log message

This method is currently used in various places inside Caffe itself.

* Disable events for operators inside a chain

We don't need to use events in operators within a chain because the chain is
always scheduled on a single stream, keeping only first and last event for
scheduling purposes

* Ensure correct finish run order

In rare cases we might call finishRun and trigger the net's destruction while
another worker still holds a shared_ptr to the thread pool; that can cause
thread-pool destruction from within a worker thread when no other nets are
using the pool. This diff fixes the order of calling finishRun and also changes
pool() to return a raw pointer, keeping the pool's ownership within the net.

* Reduce unnecessary polling

Make sure we don't waste CPU by polling operators on which we can set
efficient callbacks.

* Squash commit of syncing 9506eeb from github to fbcode

Patch xplat buck fix

add virtual destructor to OptimizationPass

build fixes for sync

* Fix net tracing

Fix net tracing from async_scheduling

* Fix logging
2018-05-29 11:38:02 -07:00
bddppq
f94ae3ba1d Update from facebook (#7696)
* Fix handling of empty batches in SumReduceDimsOp

As titled

* Deferrable async_scheduling finishRun fix

Proper order of finishing run operations in deferrable_async_scheduling net

* Simplify exception handling in async_scheduling

Simplify exception handling; there is no need to busy-wait, since the thread
that processes the last task can finish the run.

* [C2]worker_coordinator_memorize_worker_ids

As titled. This is related to T28689868, where the number of blobs we want to create is equal to the number of worker ids

* Add unit test for nets with no type set

* Ignore total_length argument in symbolic pad_packed_sequence

1- There was a mistake in the code: total_length was added to the wrong symbolic function (pack_padded_sequence) instead of (pad_packed_sequence).
2- No need to throw an exception if total_length is given, since it is only used to enable data_parallel training on multi-GPUs and has nothing to do with ONNX export, so just ignore it. https://fburl.com/tk4gciqp

* Add support for MKLDNN to async_scheduling

Just add MKLDNN as a possible CPU option to async_scheduling's pool function

* [AuFL][ensemble] support branch output for prediction

This diff supports using predictions from different branches and thus enables model ensembling (not fully independent).

* Fix a bug in add_loss in layer_model_helper

As titled.

* Support lradaption for adam

1.lr adaption operator
2.apply to dense adam

* Perf tweaks for async_scheduling

Restore single pool option + remove unnecessary (no-ops) calls

* add quantization to SparseSimdAdagradOp

add a bunch of quantization signatures to SparseSimdAdagradOp, implementations to come next

* [sr] [codemod] Change all SR callsites to use new API

@allow-large-files

This diff refactors all callsites of SR to use the slightly changed API introduced in the diff below. Really what this means is that you need to include the correct header. Also if you were using `ClientFactory::newFactory` you need to not prefix it with `ClientFactory::`.

```
cd ~/fbsource/fbcode
find ./ -type f -exec sed -i -e 's:#include "servicerouter/client/cpp2/ClientFactory.h":#include "servicerouter/client/cpp2/ServiceRouter.h":' -e 's:#include <servicerouter/client/cpp2/ClientFactory.h>:#include <servicerouter/client/cpp2/ServiceRouter.h>:' -e 's/ClientFactory::newFactory(/newFactory(/g' {} \;
```

Also manually fixed spots that couldn't be done automatically (or broke because they depended on transitive includes).

* Back out "Fix handling of empty batches in SumReduceDimsOp"

Original commit changeset: 282da1730cc2 This commit is blocking the
Github->fbcode sync, which really needs to get merged ASAP. D7881937 which this
diff depends on will be reverted in the sync D7990948 which causes this to
break. The sync diff cannot be patched with this reversion because it must be
landed against base revision 5c8c099 , and D7881937 must not be included in the
sync diff because it is breaking GPU tests that are not available in sandcastle
: https://ci.pytorch.org/jenkins/job/caffe2-builds/job/py2-cuda8.0-cudnn6-ubuntu16.04-test/3638/console
for one example.

* Add the flow to support operator benchmark

1) generate model with the operator 2) upload to everstore 3) generate model spec into json file 4) start running the benchmark

* [tum][gpu] Connect DPM trainer with flow and unit tests

This diff:
- Fix some small bugs for Yiming's recent changes to parallelizer, so it suits real use cases.
- Add correct tags to the TUM code, so we can do data parallel transform
- pass extra info when instantiation.
- add unit test for using DPM in TUM model

After this diff, we can do simple box, multi-gpu fully-sync trainer for TUM in Fblearner workflow, but may still need to do speed benchmarking.

* w/o normalized lradaption for adam dense only

The previous lr adaption included a normalization step when performing the dot-product operation. This is not exactly what is proposed in the paper, so I made normalization an option. Without it, the operator does exactly what the paper proposed; with the option, we add the normalization step.

* [fb] Use SharedPromise in DeferrableAsyncSchedulingNet

This code is to simplify DeferrableAsyncSchedulingNet by removing condition
variable + small fixes

* [tum] implement cuda sparseLengthsMean and LengthsMean

as title

* Adding an optional parameter to allow use of protobufs in InferShapesAndTypes function.

* Move feature_to_index to FeatureSpec.feature_to_index

move feature_to_index to FeatureSpec.feature_to_index to avoid override other fields

* [Caffe2] Rename bytes_moved to bytes_written

Just a rename in preparation for supporting bytes_read.

* [c2] fix ReduceFrontSumOp for empty case by setting 0

otherwise, it may use the results from last iteration when it's empty batch.

* [Caffe2] [Int8] Improve Intel CPU performance

* [Easy] Improve PrependDim op logging

as titled

* DBFileReader expand db_path using os.path.expanduser(..)

Since there are many possible use cases of `DBFileReader` reading from a user home path, like `~/local/sample.db`, I want to save people the trouble of calling `os.path.expanduser(db_path)` themselves.

* [Caffe2] Add bytes_read to cost structure

We're adding analytical read bytes to cost functions.  This extends the structure accordingly for all CostInference defined operators.
Additionally, some small bug fixes were performed:
1) Cost functions now extract type information of operands instead of assuming float

* Fix sleef on aarch64 for hhvm

@bypass-lint

Rename flag

* Remove duplicated part in caffe2/ideep/operators/conv_op.cc

should be sync error

* Rename test helper function test_adagrad_sparse_helper to adagrad_sparse_test_helper to avoid confusing pytest
2018-05-19 23:10:48 -07:00
François Garillot
b6adecdeee correct schema.Scalar's shape for a shape argument of 1 (#6493)
The schema.Scalar class makes pretty strict assumptions (via its docstring)
about the spec of the shape of its underlying object. Because of idiosyncrasies
of numpy indexing and the use of np.dtype, those assumptions are broken on an
edge case (dtype = (scalar_type, 1)). This corrects the behavior of this
edge case to conform to the spec.
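The numpy idiosyncrasy and a normalization in the spirit of the fix, sketched with a hypothetical helper (not the actual schema.Scalar code): a bare subarray length of 1 collapses, while a 1-tuple preserves the declared dimension.

```python
import numpy as np

def scalar_dtype(base, shape):
    # Hypothetical helper: numpy collapses np.dtype((base, 1)) to the base
    # type, so normalize a bare 1 to the tuple form to keep the dimension.
    if shape == 1:
        shape = (1,)
    return np.dtype((base, shape))

assert scalar_dtype(np.int32, (2,)).shape == (2,)
assert scalar_dtype(np.int32, 1).shape == (1,)  # dimension preserved
```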
2018-05-07 18:58:11 -07:00
Orion Reblitz-Richardson
1d5780d42c Remove Apache headers from source.
* LICENSE file contains details, so removing from individual source files.
2018-03-27 13:10:18 -07:00
Yan Zhu
36c49c9f4a change schema's __repr__() flat output to pprint style indented output
Summary: as title. This is similar to the Python pprint utility for nested JSON-like data structures. It can be useful for checking schemas during debugging.
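The shape of such an indented, recursive repr can be sketched with toy stand-ins for the schema classes (illustrative only, not the actual schema.py implementation):

```python
class Scalar:
    def _pprint_impl(self, indent):
        return " " * indent + "Scalar()\n"

class Struct:
    def __init__(self, *fields):  # fields: (name, field) pairs
        self.fields = fields
    def _pprint_impl(self, indent):
        # Recurse into children, indenting each nesting level.
        out = " " * indent + "Struct(\n"
        for name, field in self.fields:
            out += " " * (indent + 2) + name + ":\n"
            out += field._pprint_impl(indent + 4)
        return out + " " * indent + ")\n"
    def __repr__(self):
        return self._pprint_impl(0)

s = Struct(("a", Scalar()), ("b", Struct(("c", Scalar()))))
print(repr(s))
```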

Reviewed By: kittipatv

Differential Revision: D6710767

fbshipit-source-id: e450aa5477fa1ad4f93c4573f8108a2f49956da8
2018-02-16 16:26:11 -08:00
Lin Yang
27b9b7b15a Make TypeInference work for HalfToFloat & FloatToHalf.
Summary: add missing type mapping.

Reviewed By: kennyhorror

Differential Revision: D6940574

fbshipit-source-id: b70cea4ce2e519cb3e72d0482a38f50dbb968b4a
2018-02-08 15:33:43 -08:00
Xiaolong Wang
168271f1b8 add struct get method
Summary: as titled, to improve the schema usage

Differential Revision: D6565050

fbshipit-source-id: a551fb4f3089410e9cd468ee58e756de6a8ed66e
2017-12-19 12:35:56 -08:00
Dmytro Dzhulgakov
2972a6ca02 Revert D6026557: [caffe2][PR] Fix "No handlers could be found for logger"
Summary:
This reverts commit 95c634872ac02be721257169e38c8fead04cd66b

bypass-lint

Differential Revision: D6026557

fbshipit-source-id: 663c28583ce3b01070ff5449115ed7e222f71776
2017-10-12 20:21:52 -07:00
Luke Yeager
75bece6ede Fix "No handlers could be found for logger"
Summary: Closes https://github.com/caffe2/caffe2/pull/1316

Differential Revision: D6026557

Pulled By: Yangqing

fbshipit-source-id: 95c634872ac02be721257169e38c8fead04cd66b
2017-10-10 22:32:13 -07:00
Bo Xie
4acf56cf80 Typo
Summary: Typo in the docstring

Reviewed By: azzolini

Differential Revision: D5943729

fbshipit-source-id: f4c7adfb8d8855ba66ee988868650acbf0f6ccdb
2017-09-29 16:31:11 -07:00
Yangqing Jia
8286ce1e3a Re-license to Apache
Summary: Closes https://github.com/caffe2/caffe2/pull/1260

Differential Revision: D5906739

Pulled By: Yangqing

fbshipit-source-id: e482ba9ba60b5337d9165f28f7ec68d4518a0902
2017-09-28 16:22:00 -07:00
Kittipat Virochsiri
5aac6a2e06 Make LastNWindowCollector thread-safe
Summary: Make LastNWindowCollector optionally thread-safe. The main benefit is that the mutex can then be used to lock the buffer later, avoiding the need to copy the data.
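The design can be sketched in Python (a hypothetical stand-in for the C++ LastNWindowCollector): the same lock that guards collection can later be taken to read the buffer in place, avoiding a copy.

```python
import threading
from collections import deque

class LastNWindow:
    def __init__(self, n, thread_safe=False):
        self._buf = deque(maxlen=n)  # keeps only the last n items
        self._lock = threading.Lock() if thread_safe else None

    def collect(self, item):
        if self._lock:
            with self._lock:
                self._buf.append(item)
        else:
            self._buf.append(item)

    def snapshot(self):
        # Readers can reuse the collector's lock instead of copying eagerly.
        if self._lock:
            with self._lock:
                return list(self._buf)
        return list(self._buf)

w = LastNWindow(3, thread_safe=True)
for i in range(5):
    w.collect(i)
assert w.snapshot() == [2, 3, 4]
```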

Reviewed By: chocjy

Differential Revision: D5858335

fbshipit-source-id: 209b4374544661936af597f741726510355f7d8e
2017-09-22 09:48:30 -07:00
Kittipat Virochsiri
d368b59177 logging the blob that has type error
Summary: Currently, it's not easy to track down which tensor is missing type and shape info. Print it out for easier debugging.

Reviewed By: volkhin, xianjiec

Differential Revision: D5695223

fbshipit-source-id: 7f0be0be777a35bb5a71b3799b29b91f0763c159
2017-08-23 21:21:27 -07:00
Yan Shang
c662480ea6 Return empty Struct when get_field has empty input
Summary:
Currently, `from_column_list` throws an error if the input col_names=[]. To solve
this, we fix the get_field function so that it creates an empty Struct when an
empty col_names is given.

Reviewed By: kittipatv

Differential Revision: D5543865

fbshipit-source-id: f6dfa25326e355f8ec24e5542761851a276beeb9
2017-08-01 19:49:47 -07:00
Bangsheng Tang
5f63f5697a IndexHash
Summary:
1. IndexHashOp
2. Helper class SparseFeatureHash
3. FeatureSpec changes to add desired_hash_size

Reviewed By: kennyhorror

Differential Revision: D5361370

fbshipit-source-id: bf02e3ca12b3654f1d291f77c8af9248b6c4ac55
2017-07-07 23:06:11 -07:00
Thomas Dudziak
5355634dac Dict fixes/improvements and unittest targets for Python 3 in caffe2 core
Summary: As title

Reviewed By: salexspb

Differential Revision: D5316104

fbshipit-source-id: aee43819d817842e5ce6ba3d045a55b1a2491c30
2017-06-29 17:05:41 -07:00
Brian Lan
e2bd3cfc8b Add __sub__ function for schema.Struct
Summary:
This is for the ease of removing the common fields of a struct from another.
For example,
  s1 = Struct(
      ('a', Scalar()),
      ('b', Scalar()),
  )
  s2 = Struct(('a', Scalar()))
  s1 - s2 == Struct(('b', Scalar()))

More examples are provided in the code comments.
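The subtraction semantics above can be sketched with a dict-backed stand-in (the real schema.Struct keeps ordered (name, field) pairs; this only illustrates removing common fields):

```python
class Struct:
    def __init__(self, *fields):
        self.fields = dict(fields)
    def __sub__(self, other):
        # Keep only the fields not present in `other`.
        return Struct(*[(k, v) for k, v in self.fields.items()
                        if k not in other.fields])
    def __eq__(self, other):
        return self.fields == other.fields

s1 = Struct(("a", 1), ("b", 2))
s2 = Struct(("a", 1))
assert (s1 - s2) == Struct(("b", 2))
```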

Differential Revision: D5299277

fbshipit-source-id: 7008586ffdc8e24e1eccc8757da70330c4d90370
2017-06-28 11:24:01 -07:00
Yan Shang
cf4ac83a91 Make List.__getitem__() works with output of List.field_names()
Summary:
As described in T19378176 by kittipatv, in this diff, we fix the issue of __getitem__() of schema.List.

For example, given Map(int32, float) (Map is a special List), field_names() will return "lengths", "values:keys", & "values:values". "values:keys" and "values:values" are not accessible via __getitem__(). __getitem__() bypasses the values prefix and directly access the fields in the map. Other APIs (e.g., _SchemaNode & dataset_ops) expect "values:keys" and "values:values" as it simplifies traversal logic. Therefore, we should keep field_names() as is and fix __getitem__().

Reviewed By: kittipatv

Differential Revision: D5251657

fbshipit-source-id: 1acfb8d6e53e286eb866cf5ddab01d2dce97e1d2
2017-06-21 14:06:05 -07:00
Mehdi Drissi
6500d7f307 Fixing a small bug in schema where the number of default arguments doesn't match the number of fields
Summary:
The current version of schema.py has a Metadata class with three fields. The default for it is set to
four Nones. This is just changing that to three Nones so that the number of default values matches the number
of actual fields.
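The fix amounts to keeping the defaults tuple the same length as the field list; a minimal sketch (the field names follow caffe2's Metadata and are used illustratively):

```python
from collections import namedtuple

Metadata = namedtuple(
    "Metadata", ["categorical_limit", "expected_value", "feature_specs"]
)
# Three fields, so exactly three default values -- not four.
Metadata.__new__.__defaults__ = (None,) * len(Metadata._fields)

m = Metadata()
assert m == Metadata(None, None, None)
```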

Reviewed By: kennyhorror

Differential Revision: D5250463

fbshipit-source-id: 42e5650d270f5f63662614d8445b4819ed370dec
2017-06-15 10:31:56 -07:00
haracejacob
2ec294a8bb Fix a few typos and grammar issues in comments
Summary:
Fix a few typos and grammar issues in comments, found with language-check, a Python library.
The spell-checker source code is here: https://github.com/17-1-SKKU-OSS/011A/blob/master/spell_checker/spell_checker.py
Here is the text file indicating what should be fixed: https://github.com/17-1-SKKU-OSS/011A/tree/master/spell_checker/fix/caffe2
Closes https://github.com/caffe2/caffe2/pull/719
Closes https://github.com/caffe2/caffe2/pull/719

Differential Revision: D5165118

Pulled By: aaronmarkham

fbshipit-source-id: 7fb8ef7a99d03cd5fd2f9ebdb01b9865e90fc37b
2017-06-14 18:22:39 -07:00
Dmytro Dzhulgakov
80fe2e5caf Fix from_column_list
Summary: Previous implementation relied on the order of fields for some reason.

Reviewed By: azzolini

Differential Revision: D5164478

fbshipit-source-id: 12717310860584e18ce4ca67d0bd5048354cdc0a
2017-06-06 01:17:02 -07:00
Yiming Wu
8871ef029b quick fix future issue with brew/core/schema/workspace/scope/utils.py
Summary:
Fixing a missing `future` package issue.

Recently we found that some of our users do not have `future` module support, so we may need a try/except wrapper around every `past` import.
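The guard described above can be sketched as a graceful fallback when the `future`/`past` compatibility packages are not installed:

```python
try:
    from past.builtins import basestring  # Python 2 compatibility shim
except ImportError:
    basestring = str  # on Python 3 without `future`, plain str is fine

assert isinstance("schema", basestring)
```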

Reviewed By: Yangqing

Differential Revision: D5183547

fbshipit-source-id: 262fdf2940ee1be4454bf0b0abb9e6a0f1a0ee82
2017-06-05 12:01:48 -07:00
Kun Han
fc4d118e6b Caffe2 MemNN Production Model Saving
Summary:
Split the Caffe2 memory-based model into two parts
- Dimension reduction MLP
- DNN with concatenation of memory and obj feature

Currently only implement simple mean

Differential Revision: D4866825

fbshipit-source-id: d2f6813402513ec9af30dbe29a50593e2d3cdb3b
2017-06-01 14:31:53 -07:00
Thomas Dudziak
3ccbf23132 String-related fixes for Python 3
Summary: This diff is one step towards enabling the Python 3 build by making the code more diligent in its handling of strings.

Reviewed By: salexspb

Differential Revision: D4893083

fbshipit-source-id: 28b8adf3280e8d1f0a7dc9b0fee5ad53f2fada57
2017-05-26 16:04:32 -07:00
Kittipat Virochsiri
3ca0de25da Prevent false overwriting of a field
Summary: The code snippet in the added unit test is invalid, but it may or may not cause an exception. Disallow the syntax so people don't accidentally use it.

Reviewed By: dzhulgakov

Differential Revision: D4985030

fbshipit-source-id: ffa2b26f7b29128b196aba1b1001a97c87e381cf
2017-05-02 23:18:49 -07:00
Kittipat Virochsiri
e8e36945cf make debug message more explicit & verbose
Summary: I ran into this earlier and the debug messages were not helpful enough.

Reviewed By: kennyhorror

Differential Revision: D4985754

fbshipit-source-id: b3d12b5e2cfa1b54fca9126768c84c902664ef28
2017-05-02 12:39:14 -07:00
Kittipat Virochsiri
38d3bfa5d4 Warn on setting blob on Scalar
Summary: Calling `set()` or `set_value()` on a Scalar is dangerous, as something might be holding a reference to it. This is especially true with `LayerModel`, where instantiation is delayed. The code may still run, but it will produce unexpected results, i.e., values may be written to the wrong blob.

Reviewed By: kennyhorror

Differential Revision: D4955366

fbshipit-source-id: f5e8694a9a411ee319ca9f39a0fed632d180b8a5
2017-05-01 20:18:30 -07:00
Kittipat Virochsiri
aaafcfc529 Improving usability of schema
Summary:
This diff contains the following changes:

- implementing __repr__ on Field types; this makes it a little easier to see what's broken in the unit tests
- preserving the shape of ndarray inputs to schema; previously, empty and scalar arrays lost their shape, while others kept it
- type-checking ndarray inputs; this ensures basic integrity of the schema

Reviewed By: xianjiec

Differential Revision: D4913030

fbshipit-source-id: bd0f6b8722d95bfe800edf98ba05029c5b99d2af
2017-04-25 10:32:08 -07:00
Kittipat Virochsiri
fd9185ab21 fix getting empty struct
Summary: `not field` calls `__len__()`, causing the field to appear to be missing even when it's not
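The bug can be sketched with a toy container-like struct: `not field` invokes `__len__`, so an *empty* struct looks missing even though it exists; the fix is to test against None.

```python
class Struct:
    def __init__(self, *fields):
        self.fields = fields
    def __len__(self):
        return len(self.fields)

empty = Struct()
assert not empty          # truthiness check: empty looks "missing"
assert empty is not None  # identity check: it clearly exists

def get_field_fixed(field):
    if field is None:     # only None means the field is actually missing
        raise KeyError("no such field")
    return field

assert get_field_fixed(empty) is empty
```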

Differential Revision: D4910587

fbshipit-source-id: bc2b2fadab96571ae43c4af97b30e50c084437af
2017-04-19 22:36:05 -07:00
Xianjie Chen
4c70612320 small change to schema
Summary:
as desc.

small fix in the feature_proc layer for the case when we only have one preproc type

Reviewed By: chocjy

Differential Revision: D4908933

fbshipit-source-id: 1338048fc395f85c3724721a9996ad1ee51f0f20
2017-04-19 01:17:22 -07:00
Ou Jin
cd4160c894 distributed training for dper2
Summary:
Add distributed training to dper2 and keep the dper1 working.

* Created a ModelDelegator to wrap ModelHelper and LayerModelHelper to mitigate the difference.
* To get the average length for a sparse feature, I extracted some information in feature_processor. There should be a better way to do it once we have the new compute_meta.
* Metrics right now only run on the first trainer.
* The model is saved correctly for evaluation. But I'm still not sure how to handle the weights for adagrad.

Reviewed By: kennyhorror

Differential Revision: D4767745

fbshipit-source-id: 0559d264827a7fd9327071e8367d1e84a936bea9
2017-03-30 19:04:50 -07:00
Aaron Markham
58f7f2b441 doxygen python block added
Summary: Closes https://github.com/caffe2/caffe2/pull/226

Differential Revision: D4793550

Pulled By: JoelMarcey

fbshipit-source-id: cc33e58186304fa8dcac2ee9115dcc271d785b1e
2017-03-29 06:46:16 -07:00
Kevin Waugh
eea0ea7712 Struct nested field name lookup supports List
Summary:
D4690225 added support for nested field name lookup in nested
`schema.Struct`s.  It would throw a KeyError if trying to access a nested
`List`s field.  Writing the lookup recursively avoids the need to enumerate
all complex field types in the lookup.

Differential Revision: D4719755

fbshipit-source-id: 37c87a32d730f0f45f72fb20894da3e32f820999
2017-03-24 18:17:19 -07:00
Huazhong Ning
ad4ae4528f migrate mtml to dper2
Summary:
1. migrate the basic mtml model to dper 2
2. test dper 2 mtml model
3. test all optimizers

Reviewed By: kittipatv

Differential Revision: D4680215

fbshipit-source-id: 7aac5c59bdac22fcad8ed869b98e9e62dca1d337
2017-03-16 17:48:05 -07:00
Huazhong Ning
bb58074332 support get/add a field by nested name
Summary:
We have more and more nested Struct schemas, so there is an increasing need to get/add a field by nested name, e.g., for the following nested Struct schema:

st = Struct(
  ('a', Scalar()),
  ('b', Struct(
     ('c', Scalar()),
  )),
)

We may want to get the field "b:c" and/or insert a new field "b:x". The immediate need is for dper2 metrics.

This diff is to achieve this.
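Colon-delimited nested lookup can be sketched on a toy Struct (a hypothetical stand-in for schema.Struct's "b:c" support, not the actual implementation):

```python
class Scalar:
    pass

class Struct:
    def __init__(self, *fields):
        self._fields = dict(fields)
    def __getitem__(self, name):
        # Split off the first path component and recurse on the rest.
        first, _, rest = name.partition(":")
        field = self._fields[first]
        return field[rest] if rest else field

c = Scalar()
st = Struct(("a", Scalar()), ("b", Struct(("c", c))))
assert st["b:c"] is c
```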

Reviewed By: kittipatv

Differential Revision: D4690225

fbshipit-source-id: 71d4a74b36bd1228a2fefd901db2f200602152b7
2017-03-15 02:00:57 -07:00
Wenyi Huang
0308910c58 Enable use of Print for LayerModelHelper
Summary: When debugging using LayerModelHelper, adding Print to the model triggers this assert.

Reviewed By: xianjiec

Differential Revision: D4687859

fbshipit-source-id: 6932e38f8dd17ba0b80da18a20943ecdb2e8af0a
2017-03-10 15:26:16 -08:00