New "SavedModel: Practical Uses" and "SavedModel: Architecture" documents.
PiperOrigin-RevId: 163669809
This commit is contained in: parent f19bb3bebf / commit f9c758719e
@@ -1,297 +0,0 @@
# Exporting a Trained Model for Serving

Once you have trained an `Estimator` model, you may want to create a service
from that model that takes requests and returns a result. You can run such a
service locally on your machine or deploy it scalably in the cloud.

To prepare a trained Estimator for serving, you must export it in the standard
[`SavedModel`](https://www.tensorflow.org/code/tensorflow/python/saved_model/README.md)
format, which wraps the TensorFlow graph, the trained variable values, any
required assets, and metadata together in a hermetic package.

In this tutorial, we will discuss how to:

* Add graph nodes that accept and prepare inference requests
* Specify the output nodes and the corresponding [APIs](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto)
  that can be served (Classify, Regress, or Predict)
* Export your model to the `SavedModel` format
* Deploy the model in Google Cloud ML Engine and request predictions
* Serve the model from a local server and request predictions


## The exported graph and its signatures

The export procedure assembles a new TensorFlow graph from two main components:
1) a Serving Input Receiver that defines the format of the inputs to be
accepted, and 2) the trained model itself.

An exported `SavedModel` contains that combined graph packaged together with one
or more *signatures*. Like a function signature in any programming language, a
graph signature specifies the required inputs (arguments) and the
expected outputs (return values) of performing the computation. In the typical
case, a single signature is present, corresponding to the predictions that the
model has learned to make.

The *input* portion of the signature is determined by the Serving Input
Receiver. To specify the inputs that your deployed model will accept, you must
provide a `serving_input_receiver_fn()` to `estimator.export_savedmodel()` (see
below).

The *output* portion of the signature is determined by the model. For instance,
canned Estimators know the nature of the outputs they produce (e.g. whether the
output is a classification or a regression, and the type and shape of those
outputs). Custom Estimators must provide this information via `export_outputs`
(see [below](#specifying_the_outputs_of_a_custom_model)).

> Note: A *multi-headed model* provides multiple signatures, each corresponding
> to a different "head", i.e. a set of predictions that can be made from the
> same inputs by executing a subgraph of the complete trained graph. The
> *output* portions of these signatures are determined by the model.




## Preparing serving inputs

During training, an @{$input_fn$`input_fn()`} ingests data and prepares it for
use by the model. At serving time, similarly, a `serving_input_receiver_fn()`
accepts inference requests and prepares them for the model. The purpose of this
function is to add placeholders to the graph which the serving system will feed
with inference requests, as well as to add any additional ops needed to convert
data from the input format into the feature `Tensor`s expected by the model.
The function returns a @{tf.estimator.export.ServingInputReceiver} object, which
packages the placeholders and the resulting feature `Tensor`s together.

A typical pattern is that inference requests arrive in the form of serialized
`tf.Example`s, so the `serving_input_receiver_fn()` creates a single string
placeholder to receive them. The `serving_input_receiver_fn()` is then also
responsible for parsing the `tf.Example`s by adding a @{tf.parse_example} op to
the graph.

When writing such a `serving_input_receiver_fn()`, you must pass a parsing
specification to @{tf.parse_example} to tell the parser what feature names to
expect and how to map them to `Tensor`s. A parsing specification takes the
form of a dict from feature names to @{tf.FixedLenFeature}, @{tf.VarLenFeature},
and @{tf.SparseFeature}. (Note this parsing specification should not include
any label or weight columns, since those will not be available at serving
time—in contrast to a parsing specification used in the `input_fn()` at
training time.)

In combination, then:

```py
feature_spec = {'foo': tf.FixedLenFeature(...),
                'bar': tf.VarLenFeature(...)}

def serving_input_receiver_fn():
  """An input receiver that expects a serialized tf.Example."""
  serialized_tf_example = tf.placeholder(dtype=tf.string,
                                         shape=[default_batch_size],
                                         name='input_example_tensor')
  receiver_tensors = {'examples': serialized_tf_example}
  features = tf.parse_example(serialized_tf_example, feature_spec)
  return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)
```

The @{tf.estimator.export.build_parsing_serving_input_receiver_fn} utility
function provides that input receiver for the common case.
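For instance, a minimal sketch of using that utility with the `feature_spec` defined above:

```py
# Equivalent to the hand-written receiver above, for the serialized-tf.Example case.
serving_input_receiver_fn = (
    tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec))
```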

> Note: when training a model to be served using Google Cloud ML Engine (see
> below), the parsing step is not needed, because the model will receive raw
> feature data. This is also true when using the Predict API with a local
> server.

Even if you require no parsing or other input processing—i.e., if the
serving system will feed feature `Tensor`s directly—you must still provide
a `serving_input_receiver_fn()` that creates placeholders for the feature
`Tensor`s and passes them through. The
@{tf.estimator.export.build_raw_serving_input_receiver_fn} utility provides for
this.
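For instance, a minimal sketch of such a passthrough receiver (the feature name, dtype, and shape here are hypothetical; adjust them to your model):

```py
# The serving system will feed this placeholder with raw feature data directly.
features = {'x': tf.placeholder(dtype=tf.float32, shape=[None, 1], name='x')}
serving_input_receiver_fn = (
    tf.estimator.export.build_raw_serving_input_receiver_fn(features))
```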

If these utilities do not meet your needs, you are free to write your own
`serving_input_receiver_fn()`. One case where this may be needed is if your
training `input_fn()` incorporates some preprocessing logic that must be
recapitulated at serving time. To reduce the risk of training-serving skew, we
recommend encapsulating such processing in a function which is then called
from both `input_fn()` and `serving_input_receiver_fn()`.
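As a rough sketch of that pattern (the `preprocess()` helper and the feature name are hypothetical):

```py
def preprocess(features):
  # Shared transformation, applied identically at training and serving time.
  features['x'] = features['x'] / 255.0
  return features

def input_fn():
  features, labels = ...  # read training data
  return preprocess(features), labels

def serving_input_receiver_fn():
  receiver_tensors = {'x': tf.placeholder(tf.float32, shape=[None, 1], name='x')}
  return tf.estimator.export.ServingInputReceiver(
      preprocess(dict(receiver_tensors)), receiver_tensors)
```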


## Performing the export

To export your trained Estimator, call
@{tf.estimator.Estimator.export_savedmodel} with the export base path, together
with the `serving_input_receiver_fn`.

```py
estimator.export_savedmodel(export_dir_base, serving_input_receiver_fn)
```

This method builds a new graph by first calling the
`serving_input_receiver_fn()` to obtain feature `Tensor`s, and then calling
this `Estimator`'s `model_fn()` to generate the model graph based on those
features. It starts a fresh `Session`, and, by default, restores the most recent
checkpoint into it. (A different checkpoint may be passed, if needed.)
Finally it creates a timestamped export directory below the given
`export_dir_base` (i.e., `export_dir_base/<timestamp>`), and writes a
`SavedModel` into it containing a single `MetaGraphDef` saved from this
Session.
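For instance, a sketch of exporting from a specific checkpoint rather than the latest one (assuming the `checkpoint_path` keyword argument; the path shown is hypothetical):

```py
estimator.export_savedmodel(export_dir_base,
                            serving_input_receiver_fn,
                            checkpoint_path='/tmp/model_dir/model.ckpt-5000')
```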

> Note: there is currently no built-in mechanism to garbage-collect old exports,
> so successive exports will accumulate under `export_dir_base` unless deleted
> by some external means.

## Specifying the outputs of a custom model

When writing a custom `model_fn`, you must populate the `export_outputs` element
of the @{tf.estimator.EstimatorSpec} return value. This is a dict of
`{name: output}` describing the output signatures to be exported and used during
serving.

In the usual case of making a single prediction, this dict contains
one element, and the `name` is immaterial. In a multi-headed model, each head
is represented by an entry in this dict. In this case the `name` is a string
of your choice that can be used to request a specific head at serving time.

Each `output` value must be an `ExportOutput` object such as
@{tf.estimator.export.ClassificationOutput},
@{tf.estimator.export.RegressionOutput}, or
@{tf.estimator.export.PredictOutput}.

These output types map straightforwardly to the
[TensorFlow Serving APIs](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto),
and so determine which request types will be honored.

> Note: In the multi-headed case, a `SignatureDef` will be generated for each
> element of the `export_outputs` dict returned from the model_fn, named using
> the same keys. These signatures differ only in their outputs, as provided by
> the corresponding `ExportOutput` entry. The inputs are always those provided
> by the `serving_input_receiver_fn`.
> An inference request may specify the head by name. One head must be named
> using [`signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY`](https://www.tensorflow.org/code/tensorflow/python/saved_model/signature_constants.py)
> indicating which signature will be served when an inference request does not
> specify one.
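As a rough sketch of what this might look like inside a custom `model_fn` (the head name and the `class_scores`/`regression_value` tensors are hypothetical):

```py
def model_fn(features, labels, mode):
  ...  # build the graph, producing `class_scores` and `regression_value`
  export_outputs = {
      # Served when a request does not name a head.
      tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
          tf.estimator.export.ClassificationOutput(scores=class_scores),
      'regress_head':
          tf.estimator.export.RegressionOutput(value=regression_value),
  }
  return tf.estimator.EstimatorSpec(
      mode=mode, predictions=..., export_outputs=export_outputs)
```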


## Serving the exported model on Google Cloud ML Engine

[Google Cloud ML Engine](https://cloud.google.com/ml-engine/) provides a fully
managed, scalable environment for serving your trained SavedModels to make
online or batch predictions.

Please see [Deploying Models](https://cloud.google.com/ml-engine/docs/how-tos/deploying-models)
to learn how to deploy your SavedModel on Cloud ML Engine.

> Note: Cloud ML Engine accepts inference requests in JSON, CSV, or TFRecords
> formats, depending on the circumstance. Parsing these formats is not the
> responsibility of the graph. Cloud ML Engine does the parsing for you, and
> feeds raw feature data directly into the graph. Thus, when targeting Cloud ML
> Engine, you should use a `serving_input_receiver_fn()` of the passthrough form
> that simply creates placeholders for each feature.


## Requesting predictions from Google Cloud ML Engine

To learn how to request predictions from a model deployed in Cloud ML Engine,
please see:

* [Prediction Basics](https://cloud.google.com/ml-engine/docs/concepts/prediction-overview)
* [Getting Online Predictions](https://cloud.google.com/ml-engine/docs/how-tos/online-predict)
* [Getting Batch Predictions](https://cloud.google.com/ml-engine/docs/how-tos/batch-predict)


## Serving the exported model locally

For local deployment, you can serve your model using
@{$deploy/tfserve$TensorFlow Serving}, an open-source project that loads a
`SavedModel` and exposes it as a [gRPC](http://www.grpc.io/) service.

First, [install TensorFlow Serving](https://tensorflow.github.io/serving/setup#prerequisites).

Then build and run the local model server, substituting `$export_dir_base` with
the path to the `SavedModel` you exported above:

```sh
bazel build //tensorflow_serving/model_servers:tensorflow_model_server
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_base_path=$export_dir_base
```

Now you have a server listening for inference requests via gRPC on port 9000!


## Requesting predictions from a local server

The server responds to gRPC requests according to the [PredictionService](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto#L15)
gRPC API service definition. (The nested protocol buffers are defined in
various [neighboring files](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis)).

From the API service definition, the gRPC framework generates client libraries
in various languages providing remote access to the API. In a project using the
Bazel build tool, these libraries are built automatically and provided via
dependencies like these (using Python for example):

```build
deps = [
    "//tensorflow_serving/apis:classification_proto_py_pb2",
    "//tensorflow_serving/apis:regression_proto_py_pb2",
    "//tensorflow_serving/apis:predict_proto_py_pb2",
    "//tensorflow_serving/apis:prediction_service_proto_py_pb2"
]
```

Python client code can then import the libraries thus:

```py
from tensorflow_serving.apis import classification_pb2
from tensorflow_serving.apis import regression_pb2
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2
```

> Note: `prediction_service_pb2` defines the service as a whole and so
> is always required. However a typical client will need only one of
> `classification_pb2`, `regression_pb2`, and `predict_pb2`, depending on the
> type of requests being made.

Sending a gRPC request is then accomplished by assembling a protocol buffer
containing the request data and passing it to the service stub. Note how the
request protocol buffer is created empty and then populated via the
[generated protocol buffer API](https://developers.google.com/protocol-buffers/docs/reference/python-generated).

```py
from grpc.beta import implementations

channel = implementations.insecure_channel(host, int(port))
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

request = classification_pb2.ClassificationRequest()
example = request.input.example_list.examples.add()
example.features.feature['x'].float_list.value.extend(image[0].astype(float))

result = stub.Classify(request, 10.0)  # 10 secs timeout
```

The returned result in this example is a `ClassificationResponse` protocol
buffer.
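For instance, a sketch of reading the top class back out of the response (assuming the proto layout of `classification.proto`, with one result per example sent):

```py
# `result` is a ClassificationResponse protocol buffer.
top = result.result.classifications[0].classes[0]
print('label: %s, score: %f' % (top.label, top.score))
```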

This is a skeletal example; please see the @{$deploy$TensorFlow Serving}
documentation and [examples](https://github.com/tensorflow/serving/tree/master/tensorflow_serving/example)
for more details.

> Note: `ClassificationRequest` and `RegressionRequest` contain a
> `tensorflow.serving.Input` protocol buffer, which in turn contains a list of
> `tensorflow.Example` protocol buffers. `PredictRequest`, by contrast,
> contains a mapping from feature names to values encoded via `TensorProto`.
> Correspondingly: When using the `Classify` and `Regress` APIs, TensorFlow
> Serving feeds serialized `tf.Example`s to the graph, so your
> `serving_input_receiver_fn()` should include a `tf.parse_example()` Op.
> When using the generic `Predict` API, however, TensorFlow Serving feeds raw
> feature data to the graph, so a passthrough `serving_input_receiver_fn()`
> should be used.


<!-- TODO(soergel): give examples of making requests against this server, using
the different Tensorflow Serving APIs, selecting the signature by key, etc. -->

<!-- TODO(soergel): document ExportStrategy here once Experiment moves
from contrib to core. -->
@@ -26,8 +26,6 @@ To learn about the high-level API, read the following guides:
  which takes you into a somewhat more sophisticated use of this API.
* @{$get_started/monitors$Logging and Monitoring Basics with tf.contrib.learn},
  which explains how to audit the progress of model training.
* @{$get_started/export$Exporting a Trained Model for Serving}, which shows
  how to save a trained model in a form that is ready to deploy.

TensorBoard is a utility to visualize different aspects of machine learning.
The following guides explain how to use TensorBoard:

@@ -44,8 +44,31 @@ programs:

* **Portability.** The dataflow graph is a language-independent representation
  of the code in your model. You can build a dataflow graph in Python, store it
  in a @{$saved_model$SavedModel}, and restore it in a C++ program for
  low-latency inference.


## What is a @{tf.Graph}?

A @{tf.Graph} contains two relevant kinds of information:

* **Graph structure.** The nodes and edges of the graph, indicating how
  individual operations are composed together, but not prescribing how they
  should be used. The graph structure is like assembly code: inspecting it can
  convey some useful information, but it does not contain all of the useful
  context that source code conveys.

* **Graph collections.** TensorFlow provides a general mechanism for storing
  collections of metadata in a @{tf.Graph}. The @{tf.add_to_collection} function
  enables you to associate a list of objects with a key (where @{tf.GraphKeys}
  defines some of the standard keys), and @{tf.get_collection} enables you to
  look up all objects associated with a key. Many parts of the TensorFlow
  library use this facility: for example, when you create a @{tf.Variable}, it
  is added by default to collections representing "global variables" and
  "trainable variables". When you later come to create a @{tf.train.Saver} or
  @{tf.train.Optimizer}, the variables in these collections are used as the
  default arguments.

## Building a @{tf.Graph}

@@ -419,89 +442,6 @@ with tf.Session() as sess:
    print(metadata.step_stats)
```

## `GraphDef` and `MetaGraphDef`

TensorFlow uses a dataflow graph as a portable representation for your
application. A @{tf.Graph} contains two relevant kinds of information:

* **Graph structure.** The nodes and edges of the graph, indicating how
  individual operations are composed together, but not prescribing how they
  should be used. The graph structure is like assembly code: inspecting it can
  convey some useful information, but it does not contain all of the useful
  context that source code conveys.

* **Graph collections.** TensorFlow provides a general mechanism for storing
  collections of metadata in a @{tf.Graph}. The @{tf.add_to_collection} function
  enables you to associate a list of objects with a key (where @{tf.GraphKeys}
  defines some of the standard keys), and @{tf.get_collection} enables you to
  look up all objects associated with a key. Many parts of the TensorFlow
  library use this facility: for example, when you create a @{tf.Variable}, it
  is added by default to collections representing "global variables" and
  "trainable variables". When you later come to create a @{tf.train.Saver} or
  @{tf.train.Optimizer}, the variables in these collections are used as the
  default arguments.

A @{tf.Graph} can be saved in two forms:

* @{tf.GraphDef}: This is a low-level representation of the graph structure,
  containing a description of all of its operations (as @{tf.NodeDef} protocol
  buffers) and the edges between them. The @{tf.GraphDef} representation is
  primarily used with low-level APIs, such as the `tensorflow::Session` C++
  API, and it typically requires additional context (such as the names of
  particular operations) to make use of it. The @{tf.Graph.as_graph_def} method
  converts a @{tf.Graph} to a @{tf.GraphDef}.

* `tf.train.MetaGraphDef`: This is a higher-level representation of a dataflow
  graph, which includes a @{tf.GraphDef}, and information that helps to
  understand the graph (such as the contents of the graph collections). The
  @{tf.train.export_meta_graph} function converts a @{tf.Graph} to a
  `tf.train.MetaGraphDef`. The @{tf.train.Saver.save} method also writes a
  `tf.train.MetaGraphDef` that can be used in conjunction with the saved
  checkpoint to restore the state of a training process at the point it was
  saved.

In most cases, we encourage you to use `tf.train.MetaGraphDef` instead of
@{tf.GraphDef}. There are cases where a @{tf.GraphDef} can be useful---for
example, when performing low-level graph modifications using functions like
@{tf.import_graph_def} or the
[Graph Transform](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/graph_transforms/README.md) tool---but
`tf.train.MetaGraphDef` is a better building block for high-level applications.
For example the [SavedModel library](TODO) uses `tf.train.MetaGraphDef` to
package up a @{tf.Graph} and a set of trained model parameters so that it can be
used for serving.

If you have a `tf.train.MetaGraphDef`, the @{tf.train.import_meta_graph}
function will load it into the default graph. Calling this function has two
main features:

1. It will restore the contents of the graph collections from the original
   graph. APIs such as @{tf.global_variables} and the default arguments to
   APIs like @{tf.train.Optimizer.minimize} will work the same way as they
   did in the original graph.

2. The function returns a @{tf.train.Saver}, which can be used to restore the
   state (trained parameters, etc.) associated with the graph from a checkpoint.
   The @{tf.train.latest_checkpoint} function can help to find the latest
   checkpoint from a particular checkpoint directory.

If you have a @{tf.GraphDef}, the @{tf.import_graph_def} function enables you
to load the graph into an existing Python @{tf.Graph} object. To make use of the
imported graph, you must know the names of operations or tensors in the
@{tf.GraphDef}. The @{tf.import_graph_def} function has two main features to
help you use the imported graph:

1. You can **rebind** tensors in the imported graph to @{tf.Tensor} objects in
   the default graph by passing the optional `input_map` argument. For example,
   `input_map` enables you to import a graph fragment defined in a
   @{tf.GraphDef}, and statically connect tensors in the graph you are
   building to @{tf.placeholder} tensors in that fragment.

2. You can **return** @{tf.Tensor} or @{tf.Operation} objects from the imported
   graph by passing their names in the `return_elements` list.
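A compressed sketch of both features (the file path and the tensor names in the fragment are hypothetical):

```python
graph_def = tf.GraphDef()
with tf.gfile.GFile("/tmp/fragment.pb", "rb") as f:
  graph_def.ParseFromString(f.read())

x = tf.constant([[1.0, 2.0]])
# Rebind the fragment's placeholder "input:0" to `x`, and return its "output:0".
y, = tf.import_graph_def(graph_def,
                         input_map={"input:0": x},
                         return_elements=["output:0"])
```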

In addition, you can use @{tf.device} and @{tf.name_scope} to control the
device placement and name of the imported nodes.

## Visualizing your graph

@@ -28,18 +28,6 @@ following guide:
  which walks you through the use of `tfdbg` within an application. It covers
  using `tfdbg` with both the low-level TensorFlow API and the Estimator API.

A `MetaGraph` consists of both a computational graph and its associated
metadata. A `MetaGraph` contains the information required to continue
training, perform evaluation, or run inference on a previously
trained graph. The following guide details `MetaGraph` objects:

* @{$meta_graph$Exporting and Importing a MetaGraph}.

`SavedModel` is the universal serialization format for TensorFlow models.
TensorFlow provides SavedModel CLI (command-line interface) as a tool to
inspect and execute a MetaGraph in a SavedModel. The detailed usage and
examples are documented in the following guide:

* @{$saved_model_cli$SavedModel CLI (Command-Line Interface)}.

To learn about the TensorFlow versioning scheme consult:

* @{$version_compat$The TensorFlow Version Compatibility Guide}, which explains


tensorflow/docs_src/programmers_guide/saved_model.md (884 lines, new file)

@@ -0,0 +1,884 @@
# Saving and Restoring

This document explains how to save and restore
@{$variables$variables} and models.

## Saving and restoring variables

A TensorFlow variable provides the best way to represent shared, persistent
state manipulated by your program. (See @{$variables$Variables} for details.)
This section explains how to save and restore variables.
Note that Estimators automatically save and restore variables
(in the `model_dir`).

The `tf.train.Saver` class provides methods for saving and restoring models.
The `tf.train.Saver` constructor adds `save` and `restore` ops to the graph
for all, or a specified list, of the variables in the graph. The `Saver`
object provides methods to run these ops, specifying paths for the checkpoint
files to write to or read from.

The saver will restore all variables already defined in your model. If you're
loading a model without knowing how to build its graph (for example, if you're
writing a generic program to load models), then read the
[Overview of saving and restoring models](#models) section
later in this document.

TensorFlow saves variables in binary **checkpoint files** that,
roughly speaking, map variable names to tensor values.


### Saving variables

Create a `Saver` with `tf.train.Saver()` to manage all variables in the
model. For example, the following snippet demonstrates how to call the
`tf.train.Saver.save` method to save variables to a checkpoint file:

```python
# Create some variables.
v1 = tf.get_variable("v1", shape=[3], initializer = tf.zeros_initializer)
v2 = tf.get_variable("v2", shape=[5], initializer = tf.zeros_initializer)

inc_v1 = v1.assign(v1+1)
dec_v2 = v2.assign(v2-1)

# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, and save the
# variables to disk.
with tf.Session() as sess:
  sess.run(init_op)
  # Do some work with the model.
  inc_v1.op.run()
  dec_v2.op.run()
  # Save the variables to disk.
  save_path = saver.save(sess, "/tmp/model.ckpt")
  print("Model saved in file: %s" % save_path)
```


### Restoring variables

The `tf.train.Saver` object not only saves variables to checkpoint files, it
also restores variables. Note that when you restore variables from a file you
do not have to initialize them beforehand. For example, the following snippet
demonstrates how to call the `tf.train.Saver.restore` method to restore
variables from a checkpoint file:

```python
tf.reset_default_graph()

# Create some variables.
v1 = tf.get_variable("v1", shape=[3])
v2 = tf.get_variable("v2", shape=[5])

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  saver.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Check the values of the variables
  print("v1 : %s" % v1.eval())
  print("v2 : %s" % v2.eval())
```


### Choosing which variables to save and restore

If you do not pass any arguments to `tf.train.Saver()`, the saver handles all
variables in the graph. Each variable is saved under the name that was passed
when the variable was created.

It is sometimes useful to explicitly specify names for variables in the
checkpoint files. For example, you may have trained a model with a variable
named `"weights"` whose value you want to restore into a variable named
`"params"`.

It is also sometimes useful to only save or restore a subset of the variables
used by a model. For example, you may have trained a neural net with five
layers, and you now want to train a new model with six layers that reuses the
existing weights of the five trained layers. You can use the saver to restore
the weights of just the first five layers.

You can easily specify the names and variables to save or load by passing to the
`tf.train.Saver()` constructor either of the following:

* A list of variables (which will be stored under their own names).
* A Python dictionary in which keys are the names to use and the values are the
  variables to manage.

Continuing from the save/restore examples shown earlier:

```python
tf.reset_default_graph()
# Create some variables.
v1 = tf.get_variable("v1", [3], initializer = tf.zeros_initializer)
v2 = tf.get_variable("v2", [5], initializer = tf.zeros_initializer)

# Add ops to save and restore only `v2` using the name "v2"
saver = tf.train.Saver({"v2": v2})

# Use the saver object normally after that.
with tf.Session() as sess:
  # Initialize v1 since the saver will not.
  v1.initializer.run()
  saver.restore(sess, "/tmp/model.ckpt")

  print("v1 : %s" % v1.eval())
  print("v2 : %s" % v2.eval())
```

Notes:

* You can create as many `Saver` objects as you want if you need to save and
  restore different subsets of the model variables. The same variable can be
  listed in multiple saver objects; its value is only changed when the
  `Saver.restore()` method is run.

* If you only restore a subset of the model variables at the start of a
  session, you have to run an initialize op for the other variables. See
  @{tf.variables_initializer} for more information.

* To inspect the variables in a checkpoint, you can use the
  [`inspect_checkpoint`](https://www.tensorflow.org/code/tensorflow/python/tools/inspect_checkpoint.py)
  library, particularly the `print_tensors_in_checkpoint_file` function. (A
  short example follows this list.)

* By default, `Saver` uses the value of the @{tf.Variable.name} property
  for each variable. However, when you create a `Saver` object, you may
  optionally choose names for the variables in the checkpoint files.
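For example, a quick sketch of inspecting the checkpoint written by the earlier saving example (assuming the module path used in TensorFlow's `python/tools`):

```python
from tensorflow.python.tools import inspect_checkpoint as chkp

# Print all tensors stored in the checkpoint.
chkp.print_tensors_in_checkpoint_file("/tmp/model.ckpt", tensor_name='',
                                      all_tensors=True)
# tensor_name:  v1
# [ 1.  1.  1.]
# tensor_name:  v2
# [-1. -1. -1. -1. -1.]
```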


<a name="models"></a>
## Overview of saving and restoring models

When you want to save and load variables, the graph, and the
graph's metadata--basically, when you want to save or restore
your model--we recommend using SavedModel.
**SavedModel** is a language-neutral, recoverable, hermetic
serialization format. SavedModel enables higher-level systems
and tools to produce, consume, and transform TensorFlow models.
TensorFlow provides several mechanisms for interacting with
SavedModel, including tf.saved_model APIs, Estimator APIs and a CLI.


## APIs to build and load a SavedModel

This section focuses on the APIs for building and loading a SavedModel,
particularly when using lower-level TensorFlow APIs.


### Building a SavedModel

We provide a Python implementation of the SavedModel
@{tf.saved_model.builder$builder}.
The `SavedModelBuilder` class provides functionality to
save multiple `MetaGraphDef`s. A **MetaGraph** is a dataflow graph, plus
its associated variables, assets, and signatures. A **`MetaGraphDef`**
is the protocol buffer representation of a MetaGraph. A **signature** is
the set of inputs to and outputs from a graph.

If assets need to be saved and written or copied to disk, they can be provided
when the first `MetaGraphDef` is added. If multiple `MetaGraphDef`s are
associated with an asset of the same name, only the first version is retained.

Each `MetaGraphDef` added to the SavedModel must be annotated with
user-specified tags. The tags provide a means to identify the specific
`MetaGraphDef` to load and restore, along with the shared set of variables
and assets. These tags
typically annotate a `MetaGraphDef` with its functionality (for example,
serving or training), and optionally with hardware-specific aspects (for
example, GPU).

For example, the following code suggests a typical way to use
`SavedModelBuilder` to build a SavedModel:

```python
export_dir = ...
...
builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
with tf.Session(graph=tf.Graph()) as sess:
  ...
  builder.add_meta_graph_and_variables(sess,
                                       [tag_constants.TRAINING],
                                       signature_def_map=foo_signatures,
                                       assets_collection=foo_assets)
...
# Add a second MetaGraphDef for inference.
with tf.Session(graph=tf.Graph()) as sess:
  ...
  builder.add_meta_graph([tag_constants.SERVING])
...
builder.save()
```
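For instance, a sketch of how a map like `foo_signatures` might be assembled with the `tf.saved_model` signature helpers, assuming an input tensor `x` and an output tensor `y` (both hypothetical):

```python
tensor_info_x = tf.saved_model.utils.build_tensor_info(x)
tensor_info_y = tf.saved_model.utils.build_tensor_info(y)

prediction_signature = tf.saved_model.signature_def_utils.build_signature_def(
    inputs={'x': tensor_info_x},
    outputs={'y': tensor_info_y},
    method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)

foo_signatures = {
    tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
        prediction_signature,
}
```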


### Loading a SavedModel in Python

The Python version of the SavedModel
@{tf.saved_model.loader$loader}
provides load and restore capability for a SavedModel. The `load` operation
requires the following information:

* The session in which to restore the graph definition and variables.
* The tags used to identify the MetaGraphDef to load.
* The location (directory) of the SavedModel.

Upon a load, the subset of variables, assets, and signatures supplied as part of
the specific MetaGraphDef will be restored into the supplied session.

```python
export_dir = ...
...
with tf.Session(graph=tf.Graph()) as sess:
  tf.saved_model.loader.load(sess, [tag_constants.TRAINING], export_dir)
  ...
```
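The `load` call returns the selected `MetaGraphDef`, so a sketch of running inference against a loaded serving graph might look like the following (assuming a default signature whose input key is `x` and output key is `y`):

```python
with tf.Session(graph=tf.Graph()) as sess:
  meta_graph_def = tf.saved_model.loader.load(
      sess, [tf.saved_model.tag_constants.SERVING], export_dir)
  signature = meta_graph_def.signature_def[
      tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY]
  x_tensor = sess.graph.get_tensor_by_name(signature.inputs['x'].name)
  y_tensor = sess.graph.get_tensor_by_name(signature.outputs['y'].name)
  print(sess.run(y_tensor, feed_dict={x_tensor: [[1.0]]}))
```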
|
||||||
|
|
||||||
|
|
||||||
|
### Loading a Savedmodel in C++
|
||||||
|
|
||||||
|
The C++ version of the SavedModel
|
||||||
|
[loader](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/loader.h)
|
||||||
|
provides an API to load a SavedModel from a path, while allowing
|
||||||
|
`SessionOptions` and `RunOptions`.
|
||||||
|
You have to specify the tags associated with the graph to be loaded.
|
||||||
|
The loaded version of SavedModel is referred to as `SavedModelBundle`
|
||||||
|
and contains the MetaGraphDef and the session within which it is loaded.
|
||||||
|
|
||||||
|
```c++
|
||||||
|
const string export_dir = ...
|
||||||
|
SavedModelBundle bundle;
|
||||||
|
...
|
||||||
|
LoadSavedModel(session_options, run_options, export_dir, {kSavedModelTagTrain},
|
||||||
|
&bundle);
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
### Standard constants
|
||||||
|
|
||||||
|
SavedModel offers the flexibility to build and load TensorFlow graphs for a
|
||||||
|
variety of use-cases. For the most common use-cases, SavedModel's APIs
|
||||||
|
provide a set of constants in Python and C++ that are easy to
|
||||||
|
reuse and share across tools consistently.
|
||||||
|
|
||||||
|
#### Standard MetaGraphDef tags
|
||||||
|
|
||||||
|
You may use sets of tags to uniquely identify a `MetaGraphDef` saved in a
|
||||||
|
SavedModel. A subset of commonly used tags is specified in:
|
||||||
|
|
||||||
|
* [Python](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/saved_model/tag_constants.py)
|
||||||
|
* [C++](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/tag_constants.h)
|
||||||
|
|
||||||
|
|
||||||
|
#### Standard SignatureDef constants
|
||||||
|
|
||||||
|
A [**SignatureDef**](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/protobuf/meta_graph.proto)
|
||||||
|
is a protocol buffer that defines the signature of a computation
|
||||||
|
supported by a graph.
|
||||||
|
Commonly used input keys, output keys, and method names are
|
||||||
|
defined in:
|
||||||
|
|
||||||
|
* [Python](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/saved_model/signature_constants.py)
|
||||||
|
* [C++](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/signature_constants.h)
|
||||||
|
|
||||||
|
## Using SavedModel with Estimators
|
||||||
|
|
||||||
|
After training an `Estimator` model, you may want to create a service
|
||||||
|
from that model that takes requests and returns a result. You can run such a
|
||||||
|
service locally on your machine or deploy it scalably in the cloud.
|
||||||
|
|
||||||
|
To prepare a trained Estimator for serving, you must export it in the standard
|
||||||
|
SavedModel format. This section explains how to:
|
||||||
|
|
||||||
|
* Specify the output nodes and the corresponding
|
||||||
|
[APIs](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto)
|
||||||
|
that can be served (Classify, Regress, or Predict).
|
||||||
|
* Export your model to the SavedModel format.
|
||||||
|
* Serve the model from a local server and request predictions.
|
||||||
|
|
||||||
|
|
||||||
|
### Preparing serving inputs
|
||||||
|
|
||||||
|
During training, an @{$input_fn$`input_fn()`} ingests data and prepares it for
|
||||||
|
use by the model. At serving time, similarly, a `serving_input_receiver_fn()`
|
||||||
|
accepts inference requests and prepares them for the model. This function
|
||||||
|
has the following purposes:
|
||||||
|
|
||||||
|
* To add placeholders to the graph that the serving system will feed
|
||||||
|
with inference requests.
|
||||||
|
* To add any additional ops needed to convert data from the input format
|
||||||
|
into the feature `Tensor`s expected by the model.
|
||||||
|
|
||||||
|
The function returns a @{tf.estimator.export.ServingInputReceiver} object,
|
||||||
|
which packages the placeholders and the resulting feature `Tensor`s together.
|
||||||
|
|
||||||
|
A typical pattern is that inference requests arrive in the form of serialized
|
||||||
|
`tf.Example`s, so the `serving_input_receiver_fn()` creates a single string
|
||||||
|
placeholder to receive them. The `serving_input_receiver_fn()` is then also
|
||||||
|
responsible for parsing the `tf.Example`s by adding a @{tf.parse_example} op to
|
||||||
|
the graph.
|
||||||
|
|
||||||
|
When writing such a `serving_input_receiver_fn()`, you must pass a parsing
|
||||||
|
specification to @{tf.parse_example} to tell the parser what feature names to
|
||||||
|
expect and how to map them to `Tensor`s. A parsing specification takes the
|
||||||
|
form of a dict from feature names to @{tf.FixedLenFeature}, @{tf.VarLenFeature},
|
||||||
|
and @{tf.SparseFeature}. Note this parsing specification should not include
|
||||||
|
any label or weight columns, since those will not be available at serving
|
||||||
|
time—in contrast to a parsing specification used in the `input_fn()` at
|
||||||
|
training time.
|
||||||
|
|
||||||
|
In combination, then:
|
||||||
|
|
||||||
|
```py
|
||||||
|
feature_spec = {'foo': tf.FixedLenFeature(...),
|
||||||
|
'bar': tf.VarLenFeature(...)}
|
||||||
|
|
||||||
|
def serving_input_receiver_fn():
|
||||||
|
"""An input receiver that expects a serialized tf.Example."""
|
||||||
|
serialized_tf_example = tf.placeholder(dtype=tf.string,
|
||||||
|
shape=[default_batch_size],
|
||||||
|
name='input_example_tensor')
|
||||||
|
receiver_tensors = {'examples': serialized_tf_example}
|
||||||
|
features = tf.parse_example(serialized_tf_example, feature_spec)
|
||||||
|
return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)
|
||||||
|
```
|
||||||
|
|
||||||
|
The @{tf.estimator.export.build_parsing_serving_input_receiver_fn} utility
|
||||||
|
function provides that input receiver for the common case.
|
||||||
|
|
||||||
|
> Note: when training a model to be served using the Predict API with a local
|
||||||
|
> server, the parsing step is not needed because the model will receive raw
|
||||||
|
> feature data.
|
||||||
|
|
||||||
|
Even if you require no parsing or other input processing—that is, if the
|
||||||
|
serving system will feed feature `Tensor`s directly—you must still provide
|
||||||
|
a `serving_input_receiver_fn()` that creates placeholders for the feature
|
||||||
|
`Tensor`s and passes them through. The
|
||||||
|
@{tf.estimator.export.build_raw_serving_input_receiver_fn} utility provides for
|
||||||
|
this.
|
||||||
|
|
||||||
|
If these utilities do not meet your needs, you are free to write your own
|
||||||
|
`serving_input_receiver_fn()`. One case where this may be needed is if your
|
||||||
|
training `input_fn()` incorporates some preprocessing logic that must be
|
||||||
|
recapitulated at serving time. To reduce the risk of training-serving skew, we
|
||||||
|
recommend encapsulating such processing in a function which is then called
|
||||||
|
from both `input_fn()` and `serving_input_receiver_fn()`.
|
||||||
|
|
||||||
|
Note that the `serving_input_receiver_fn()` also determines the *input*
|
||||||
|
portion of the signature. That is, when writing a
|
||||||
|
`serving_input_receiver_fn()`, you must tell the parser what signatures
|
||||||
|
to expect and how to map them to your model's expected inputs.
|
||||||
|
By contrast, the *output* portion of the signature is determined by the model.
|
||||||
|
|
||||||
|
|
||||||
|
### Performing the export
|
||||||
|
|
||||||
|
To export your trained Estimator, call
|
||||||
|
@{tf.estimator.Estimator.export_savedmodel} with the export base path and
|
||||||
|
the `serving_input_receiver_fn`.
|
||||||
|
|
||||||
|
```py
|
||||||
|
estimator.export_savedmodel(export_dir_base, serving_input_receiver_fn)
|
||||||
|
```
|
||||||
|
|
||||||
|
This method builds a new graph by first calling the
|
||||||
|
`serving_input_receiver_fn()` to obtain feature `Tensor`s, and then calling
|
||||||
|
this `Estimator`'s `model_fn()` to generate the model graph based on those
|
||||||
|
features. It starts a fresh `Session`, and, by default, restores the most recent
|
||||||
|
checkpoint into it. (A different checkpoint may be passed, if needed.)
|
||||||
|
Finally it creates a time-stamped export directory below the given
|
||||||
|
`export_dir_base` (i.e., `export_dir_base/<timestamp>`), and writes a
|
||||||
|
SavedModel into it containing a single `MetaGraphDef` saved from this
|
||||||
|
Session.
|
||||||
|
|
||||||
|
> Note: It is your responsibility to garbage-collect old exports.
|
||||||
|
> Otherwise, successive exports will accumulate under `export_dir_base`.
|
||||||
|
|
||||||
|
### Specifying the outputs of a custom model
|
||||||
|
|
||||||
|
When writing a custom `model_fn`, you must populate the `export_outputs` element
|
||||||
|
of the @{tf.estimator.EstimatorSpec} return value. This is a dict of
|
||||||
|
`{name: output}` describing the output signatures to be exported and used during
|
||||||
|
serving.
|
||||||
|
|
||||||
|
In the usual case of making a single prediction, this dict contains
|
||||||
|
one element, and the `name` is immaterial. In a multi-headed model, each head
|
||||||
|
is represented by an entry in this dict. In this case the `name` is a string
|
||||||
|
of your choice that can be used to request a specific head at serving time.
|
||||||
|
|
||||||
|
Each `output` value must be an `ExportOutput` object such as
|
||||||
|
@{tf.estimator.export.ClassificationOutput},
|
||||||
|
@{tf.estimator.export.RegressionOutput}, or
|
||||||
|
@{tf.estimator.export.PredictOutput}.
|
||||||
|
|
||||||
|
These output types map straightforwardly to the
|
||||||
|
[TensorFlow Serving APIs](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto),
|
||||||
|
and so determine which request types will be honored.
|
||||||
|
|
||||||
|
Note: In the multi-headed case, a `SignatureDef` will be generated for each
|
||||||
|
element of the `export_outputs` dict returned from the model_fn, named using
|
||||||
|
the same keys. These `SignatureDef`s differ only in their outputs, as
|
||||||
|
provided by the corresponding `ExportOutput` entry. The inputs are always
|
||||||
|
those provided by the `serving_input_receiver_fn`.
|
||||||
|
An inference request may specify the head by name. One head must be named
|
||||||
|
using [`signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY`](https://www.tensorflow.org/code/tensorflow/python/saved_model/signature_constants.py)
|
||||||
|
indicating which `SignatureDef` will be served when an inference request
|
||||||
|
does not specify one.
|
||||||
|
|
||||||
|
|
||||||
|
### Serving the exported model locally
|
||||||
|
|
||||||
|
For local deployment, you can serve your model using
|
||||||
|
@{$deploy/tfserve$Tensorflow Serving}, an open-source project that loads a
|
||||||
|
SavedModel and exposes it as a [gRPC](http://www.grpc.io/) service.
|
||||||
|
|
||||||
|
First, [install TensorFlow Serving](https://tensorflow.github.io/serving/setup#prerequisites).
|
||||||
|
|
||||||
|
Then build and run the local model server, substituting `$export_dir_base` with
|
||||||
|
the path to the SavedModel you exported above:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
bazel build //tensorflow_serving/model_servers:tensorflow_model_server
|
||||||
|
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_base_path=$export_dir_base
|
||||||
|
```
|
||||||
|
|
||||||
|
Now you have a server listening for inference requests via gRPC on port 9000!
|
||||||
|
|
||||||
|
|
||||||
|
### Requesting predictions from a local server
|
||||||
|
|
||||||
|
The server responds to gRPC requests according to the
|
||||||
|
[PredictionService](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto#L15)
|
||||||
|
gRPC API service definition. (The nested protocol buffers are defined in
|
||||||
|
various [neighboring files](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis)).
|
||||||
|
|
||||||
|
From the API service definition, the gRPC framework generates client libraries
|
||||||
|
in various languages providing remote access to the API. In a project using the
|
||||||
|
Bazel build tool, these libraries are built automatically and provided via
|
||||||
|
dependencies like these (using Python for example):
|
||||||
|
|
||||||
|
```build
|
||||||
|
deps = [
|
||||||
|
"//tensorflow_serving/apis:classification_proto_py_pb2",
|
||||||
|
"//tensorflow_serving/apis:regression_proto_py_pb2",
|
||||||
|
"//tensorflow_serving/apis:predict_proto_py_pb2",
|
||||||
|
"//tensorflow_serving/apis:prediction_service_proto_py_pb2"
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
Python client code can then import the libraries thus:
|
||||||
|
|
||||||
|
```py
|
||||||
|
from tensorflow_serving.apis import classification_pb2
|
||||||
|
from tensorflow_serving.apis import regression_pb2
|
||||||
|
from tensorflow_serving.apis import predict_pb2
|
||||||
|
from tensorflow_serving.apis import prediction_service_pb2
|
||||||
|
```
|
||||||
|
|
||||||
|
> Note: `prediction_service_pb2` defines the service as a whole and so
|
||||||
|
> is always required. However a typical client will need only one of
|
||||||
|
> `classification_pb2`, `regression_pb2`, and `predict_pb2`, depending on the
|
||||||
|
> type of requests being made.
|
||||||
|
|
||||||
|
Sending a gRPC request is then accomplished by assembling a protocol buffer
|
||||||
|
containing the request data and passing it to the service stub. Note how the
|
||||||
|
request protocol buffer is created empty and then populated via the
|
||||||
|
[generated protocol buffer API](https://developers.google.com/protocol-buffers/docs/reference/python-generated).
|
||||||
|
|
||||||
|
```py
|
||||||
|
from grpc.beta import implementations
|
||||||
|
|
||||||
|
channel = implementations.insecure_channel(host, int(port))
|
||||||
|
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)
|
||||||
|
|
||||||
|
request = classification_pb2.ClassificationRequest()
|
||||||
|
example = request.input.example_list.examples.add()
|
||||||
|
example.features.feature['x'].float_list.value.extend(image[0].astype(float))
|
||||||
|
|
||||||
|
result = stub.Classify(request, 10.0) # 10 secs timeout
|
||||||
|
```
|
||||||
|
|
||||||
|
The returned result in this example is a `ClassificationResponse` protocol
|
||||||
|
buffer.
|
||||||
|
|
||||||
|
This is a skeletal example; please see the @{$deploy$Tensorflow Serving}
|
||||||
|
documentation and [examples](https://github.com/tensorflow/serving/tree/master/tensorflow_serving/example)
|
||||||
|
for more details.
|
||||||
|
|
||||||
|
> Note: `ClassificationRequest` and `RegressionRequest` contain a
|
||||||
|
> `tensorflow.serving.Input` protocol buffer, which in turn contains a list of
|
||||||
|
> `tensorflow.Example` protocol buffers. `PredictRequest`, by contrast,
|
||||||
|
> contains a mapping from feature names to values encoded via `TensorProto`.
|
||||||
|
> Correspondingly: When using the `Classify` and `Regress` APIs, TensorFlow
|
||||||
|
> Serving feeds serialized `tf.Example`s to the graph, so your
|
||||||
|
> `serving_input_receiver_fn()` should include a `tf.parse_example()` Op.
|
||||||
|
> When using the generic `Predict` API, however, TensorFlow Serving feeds raw
|
||||||
|
> feature data to the graph, so a pass through `serving_input_receiver_fn()`
|
||||||
|
> should be used.
|
||||||
|
|
||||||
|
|
||||||
|
<!-- TODO(soergel): give examples of making requests against this server, using
|
||||||
|
the different Tensorflow Serving APIs, selecting the signature by key, etc. -->
|
||||||
|
|
||||||
|
<!-- TODO(soergel): document ExportStrategy here once Experiment moves
|
||||||
|
from contrib to core. -->
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
## CLI to inspect and execute SavedModel

You can use the SavedModel Command Line Interface (CLI) to inspect and
execute a SavedModel.
For example, you can use the CLI to inspect the model's `SignatureDef`s.
The CLI enables you to quickly confirm that the input
@{$tensors$Tensor dtype and shape} match the model. Moreover, if you
want to test your model, you can use the CLI to do a sanity check by
passing in sample inputs in various formats (for example, Python
expressions) and then fetching the output.


### Installing the SavedModel CLI

Broadly speaking, you can install TensorFlow in either of the following
two ways:

* By installing a pre-built TensorFlow binary.
* By building TensorFlow from source code.

If you installed TensorFlow through a pre-built TensorFlow binary,
then the SavedModel CLI is already installed on your system
at pathname `bin/saved_model_cli`.

If you built TensorFlow from source code, you must run the following
additional command to build `saved_model_cli`:

```
$ bazel build tensorflow/python/tools:saved_model_cli
```

### Overview of commands

The SavedModel CLI supports the following two commands on a
`MetaGraphDef` in a SavedModel:

* `show`, which shows the computations available on a `MetaGraphDef` in a SavedModel.
* `run`, which runs a computation on a `MetaGraphDef`.


### `show` command

A SavedModel contains one or more `MetaGraphDef`s, identified by their tag-sets.
To serve a model, you
might wonder what kind of `SignatureDef`s are in each model, and what their
inputs and outputs are. The `show` command lets you examine the contents of the
SavedModel in hierarchical order. Here's the syntax:

```
usage: saved_model_cli show [-h] --dir DIR [--all]
                            [--tag_set TAG_SET] [--signature_def SIGNATURE_DEF_KEY]
```

For example, the following command shows all available
MetaGraphDef tag-sets in the SavedModel:

```
$ saved_model_cli show --dir /tmp/saved_model_dir
The given SavedModel contains the following tag-sets:
serve
serve, gpu
```

The following command shows all available `SignatureDef` keys in
a `MetaGraphDef`:

```
$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve
The given SavedModel `MetaGraphDef` contains `SignatureDefs` with the
following keys:
SignatureDef key: "classify_x2_to_y3"
SignatureDef key: "classify_x_to_y"
SignatureDef key: "regress_x2_to_y3"
SignatureDef key: "regress_x_to_y"
SignatureDef key: "regress_x_to_y2"
SignatureDef key: "serving_default"
```

If a `MetaGraphDef` has *multiple* tags in the tag-set, you must specify
all tags, each tag separated by a comma. For example:

```none
$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve,gpu
```

To show the `TensorInfo` for all inputs and outputs of a specific `SignatureDef`, pass
the `SignatureDef` key to the `--signature_def` option. This is very useful when you
want to know the tensor key, dtype, and shape of the input tensors for
executing the computation graph later. For example:

```
$ saved_model_cli show --dir \
/tmp/saved_model_dir --tag_set serve --signature_def serving_default
The given SavedModel SignatureDef contains the following input(s):
inputs['x'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: x:0
The given SavedModel SignatureDef contains the following output(s):
outputs['y'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: y:0
Method name is: tensorflow/serving/predict
```

To show all available information in the SavedModel, use the `--all` option.
For example:

```none
$ saved_model_cli show --dir /tmp/saved_model_dir --all
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['classify_x2_to_y3']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: x2:0
The given SavedModel SignatureDef contains the following output(s):
outputs['scores'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: y3:0
Method name is: tensorflow/serving/classify

...

signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['x'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: x:0
The given SavedModel SignatureDef contains the following output(s):
outputs['y'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: y:0
Method name is: tensorflow/serving/predict
```


### `run` command

Invoke the `run` command to run a graph computation, passing
inputs and then displaying (and optionally saving) the outputs.
Here's the syntax:

```
usage: saved_model_cli run [-h] --dir DIR --tag_set TAG_SET --signature_def
                           SIGNATURE_DEF_KEY [--inputs INPUTS]
                           [--input_exprs INPUT_EXPRS] [--outdir OUTDIR]
                           [--overwrite] [--tf_debug]
```

The `run` command provides the following two ways to pass inputs to the model:

* The `--inputs` option enables you to pass numpy ndarrays in files.
* The `--input_exprs` option enables you to pass Python expressions.


#### `--inputs`

To pass input data in files, specify the `--inputs` option, which takes the
following general format:

```bsh
--inputs <INPUTS>
```

where *INPUTS* is either of the following formats:

* `<input_key>=<filename>`
* `<input_key>=<filename>[<variable_name>]`

You may pass multiple *INPUTS*. If you do pass multiple inputs, use a semicolon
to separate each of the *INPUTS*.

`saved_model_cli` uses `numpy.load` to load the *filename*.
The *filename* may be in any of the following formats:

* `.npy`
* `.npz`
* pickle format

A `.npy` file always contains a numpy ndarray. Therefore, when loading from
a `.npy` file, the content will be directly assigned to the specified input
tensor. If you specify a *variable_name* with that `.npy` file, the
*variable_name* will be ignored and a warning will be issued.

When loading from a `.npz` (zip) file, you may optionally specify a
*variable_name* to identify the variable within the zip file to load for
the input tensor key. If you don't specify a *variable_name*, the SavedModel
CLI will check that only one file is included in the zip file and load it
for the specified input tensor key.

When loading from a pickle file, if no *variable_name* is specified in the
square brackets, whatever is inside the pickle file will be passed to the
specified input tensor key. Otherwise, the SavedModel CLI will assume a
dictionary is stored in the pickle file and the value corresponding to
the *variable_name* will be used.
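
For reference, here is a minimal sketch of creating input files in each of
these three formats with NumPy and `pickle` (the paths and the variable name
`x` are illustrative):

```python
import pickle
import numpy as np

data = np.array([[1.0], [2.0], [3.0]])

# .npy: holds a single ndarray; any [variable_name] suffix would be ignored.
np.save('/tmp/my_data.npy', data)

# .npz: holds named arrays; reference one as /tmp/my_data.npz[x].
np.savez('/tmp/my_data.npz', x=data)

# pickle: either a bare value, or a dict keyed by variable_name.
with open('/tmp/my_data.pkl', 'wb') as f:
  pickle.dump({'x': data}, f)
```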

#### `--input_exprs`

To pass inputs through Python expressions, specify the `--input_exprs` option.
This can be useful when you don't have data
files lying around, but still want to sanity check the model with some simple
inputs that match the dtype and shape of the model's `SignatureDef`s.
For example:

```bsh
input_key=[[1], [2], [3]]
```

In addition to plain Python expressions, you may also pass expressions that use
NumPy functions. For example:

```bsh
input_key=np.ones((32, 32, 3))
```

(Note that the `numpy` module is already available to you as `np`.)

#### Save Output

By default, the SavedModel CLI writes output to stdout. If a directory is
passed to the `--outdir` option, the outputs will be saved as `.npy` files named after
output tensor keys under the given directory.

Use `--overwrite` to overwrite existing output files.
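
The saved files can then be read back with NumPy. As a small sketch (assuming
`--outdir /tmp/out` and an output tensor key named `y`, as in the examples
below):

```python
import numpy as np

# Each output tensor key gets its own .npy file under the --outdir directory.
y = np.load('/tmp/out/y.npy')
print(y)
```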

#### TensorFlow Debugger (tfdbg) Integration

If the `--tf_debug` option is set, the SavedModel CLI will use the
TensorFlow Debugger (tfdbg) to watch the intermediate Tensors and runtime
graphs or subgraphs while running the SavedModel.


#### Full examples of `run`

Given:

* Your model simply adds `x1` and `x2` to get output `y`.
* All tensors in the model have shape `(-1, 1)`.
* You have two `npy` files:
    * `/tmp/my_data1.npy`, which contains a numpy ndarray `[[1], [2], [3]]`.
    * `/tmp/my_data2.npy`, which contains another numpy
      ndarray `[[0.5], [0.5], [0.5]]`.

To run these two `npy` files through the model to get output `y`, issue
the following command:

```
$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
--signature_def x1_x2_to_y --inputs x1=/tmp/my_data1.npy;x2=/tmp/my_data2.npy \
--outdir /tmp/out
Result for output key y:
[[ 1.5]
 [ 2.5]
 [ 3.5]]
```

Let's change the preceding example slightly. This time, instead of two
`.npy` files, you now have an `.npz` file and a pickle file. Furthermore,
you want to overwrite any existing output file. Here's the command:

```
$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
--signature_def x1_x2_to_y \
--inputs x1=/tmp/my_data1.npz[x];x2=/tmp/my_data2.pkl --outdir /tmp/out \
--overwrite
Result for output key y:
[[ 1.5]
 [ 2.5]
 [ 3.5]]
```

You may specify a Python expression instead of an input file. For example,
the following command replaces input `x2` with a Python expression:

```
$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
--signature_def x1_x2_to_y --inputs x1=/tmp/my_data1.npz[x] \
--input_exprs 'x2=np.ones((3,1))'
Result for output key y:
[[ 2]
 [ 3]
 [ 4]]
```

To run the model with the TensorFlow Debugger enabled, issue the
following command:

```
$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
--signature_def serving_default --inputs x=/tmp/data.npz[x] --tf_debug
```

<a name="structure"></a>
## Structure of a SavedModel directory

When you save a model in SavedModel format, TensorFlow creates
a SavedModel directory consisting of the following subdirectories
and files:

```bsh
assets/
assets.extra/
variables/
    variables.data-?????-of-?????
    variables.index
saved_model.pb|saved_model.pbtxt
```

where:

* `assets` is a subfolder containing auxiliary (external) files,
  such as vocabularies. Assets are copied to the SavedModel location
  and can be read when loading a specific `MetaGraphDef`.
* `assets.extra` is a subfolder where higher-level libraries and users can
  add their own assets that co-exist with the model, but are not loaded by
  the graph. This subfolder is not managed by the SavedModel libraries.
* `variables` is a subfolder that includes output from
  `tf.train.Saver`.
* `saved_model.pb` or `saved_model.pbtxt` is the SavedModel protocol buffer.
  It includes the graph definitions as `MetaGraphDef` protocol buffers.

A single SavedModel can represent multiple graphs. In this case, all the
graphs in the SavedModel share a *single* set of checkpoints (variables)
and assets. For example, the following diagram shows one SavedModel
containing three `MetaGraphDef`s, all three of which share the same set
of checkpoints and assets:

![SavedModel represents checkpoints, assets, and one or more MetaGraphDefs](../images/SavedModel.svg)

Each graph is associated with a specific set of tags, which enables
identification during a load or restore operation.
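
For instance, a minimal sketch of loading the `MetaGraphDef` tagged `serve`
from a SavedModel directory and running its default signature might look like
this (the directory path is illustrative, and the tensor names `x:0` and `y:0`
are taken from the `show` output above):

```python
import tensorflow as tf

export_dir = '/tmp/saved_model_dir'
with tf.Session(graph=tf.Graph()) as sess:
  # Load the graph and variables for the MetaGraphDef tagged 'serve'.
  tf.saved_model.loader.load(
      sess, [tf.saved_model.tag_constants.SERVING], export_dir)
  # Feed the signature's input tensor and fetch its output tensor by name.
  y = sess.run('y:0', feed_dict={'x:0': [[1.0]]})
  print(y)
```
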
@ -1,251 +0,0 @@

# SavedModel CLI (Command-Line Interface)

[`SavedModel`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/saved_model/README.md)
is a universal serialization format for TensorFlow. It provides a
language-neutral format to save machine-learned models and enables higher-level
systems and tools to produce, consume and transform TensorFlow models.

We provide the SavedModel CLI (command-line interface) as a tool to inspect and
execute a [`MetaGraph`](https://www.tensorflow.org/programmers_guide/meta_graph)
in a SavedModel. For example, you can inspect which
[`SignatureDefs`](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/g3doc/signature_defs.md),
including their input and output tensors, are in the model without writing any
code. This can be useful in situations such as when you want to quickly check
that your input dtype and shape match the model. Moreover, if you want to test
out the model, it also allows you to do a sanity check by passing in sample
inputs in various formats (for example, Python expressions) and fetching the
outputs simply through the command line.

## Get SavedModel CLI

If TensorFlow is installed on your system through pip, the `saved_model_cli`
binary can be invoked directly from the command line.

To build the binary from source, run the following command:

```
$ bazel build tensorflow/python/tools:saved_model_cli
```

## Commands

The SavedModel CLI allows users to both show and run computations on a
[`MetaGraphDef`](https://www.tensorflow.org/code/tensorflow/core/protobuf/meta_graph.proto)
in a SavedModel. These are done through the `show` and `run` commands. We will
explain the usage of both commands with detailed examples. The SavedModel CLI will
also display this information with the `-h` option.

### `show` command

A SavedModel contains one or more MetaGraphs, identified by their tag-sets. Each
MetaGraph contains both a TensorFlow GraphDef as well as associated metadata
necessary for running computation in a graph. In order to serve a model, you
might wonder what kind of SignatureDefs are in each model, and what their
inputs and outputs are. The `show` command lets you examine the contents of the
SavedModel in a hierarchical order.

```
usage: saved_model_cli show [-h] --dir DIR [--all]
                            [--tag_set TAG_SET] [--signature_def SIGNATURE_DEF_KEY]
```

#### Examples

To show all available MetaGraphDef tag-sets in the SavedModel:

```
$ saved_model_cli show --dir /tmp/saved_model_dir
The given SavedModel contains the following tag-sets:
serve
serve, gpu
```

To show all available SignatureDef keys in a MetaGraphDef:

```
$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve
The given SavedModel MetaGraphDef contains SignatureDefs with the following keys:
SignatureDef key: "classify_x2_to_y3"
SignatureDef key: "classify_x_to_y"
SignatureDef key: "regress_x2_to_y3"
SignatureDef key: "regress_x_to_y"
SignatureDef key: "regress_x_to_y2"
SignatureDef key: "serving_default"
```

For a MetaGraphDef with multiple tags in the tag-set, all tags must be passed
in, separated by ',':

```
$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve,gpu
```

To show the TensorInfo for all inputs and outputs of a specific SignatureDef, pass
the SignatureDef key to the `--signature_def` option. This is very useful when you
want to know the tensor key, dtype and shape of the input tensors for
executing the computation graph later.

```
$ saved_model_cli show --dir \
/tmp/saved_model_dir --tag_set serve --signature_def serving_default
The given SavedModel SignatureDef contains the following input(s):
inputs['x'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: x:0
The given SavedModel SignatureDef contains the following output(s):
outputs['y'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: y:0
Method name is: tensorflow/serving/predict
```

To show all available information in the SavedModel, use the `--all` option:

```
$ saved_model_cli show --dir /tmp/saved_model_dir --all
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['classify_x2_to_y3']:
The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: x2:0
The given SavedModel SignatureDef contains the following output(s):
outputs['scores'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: y3:0
Method name is: tensorflow/serving/classify

...

signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['x'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: x:0
The given SavedModel SignatureDef contains the following output(s):
outputs['y'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: y:0
Method name is: tensorflow/serving/predict
```

### `run` command

The SavedModel CLI also allows you to run a graph computation by passing in inputs
and then displaying (and optionally saving) the outputs.

```
usage: saved_model_cli run [-h] --dir DIR --tag_set TAG_SET --signature_def
                           SIGNATURE_DEF_KEY [--inputs INPUTS]
                           [--input_exprs INPUT_EXPRS] [--outdir OUTDIR]
                           [--overwrite] [--tf_debug]
```

Tensor keys are used to specify which input the values are being passed for.
There are two ways to pass inputs to the model. With the `--inputs` option, you can
pass in numpy ndarrays from files. With the `--input_exprs` option, you can use
Python expressions as inputs.

#### Input By File

To pass in inputs by files, use the `--inputs` option in the format of
`<input_key>=<filename>`, or `<input_key>=<filename>[<variable_name>]`. Each
input is separated by a semicolon. The file specified by `filename` will be loaded
using `numpy.load`. Inputs can be loaded only from `.npy`, `.npz` or pickle
files. The `variable_name` key is optional depending on the input file type, as
described in more detail below.

When loading from a `.npy` file, which always contains a numpy ndarray, the
content will be directly assigned to the specified input tensor. If a
`variable_name` is specified, it will be ignored and a warning will be issued.

When loading from a `.npz` zip file, you can specify which variable within the
zip file to load for the input tensor key with `variable_name`. If nothing is
specified, the SavedModel CLI will check that only one file is included in the zip
file and load it for the specified input tensor key.

When loading from a pickle file, if no `variable_name` is specified in the
square brackets, whatever is inside the pickle file will be passed to the
specified input tensor key. Otherwise, the SavedModel CLI will assume a dictionary is
stored in the pickle file and the value corresponding to the `variable_name` will
be used.

#### Input By Python Expression

To pass in inputs by Python expressions, use the `--input_exprs` option. The `numpy`
module is available as `np`. For example, `input_key=np.ones((32, 32, 3))` or
`input_key=[[1], [2], [3]]`. This can be useful when you don't have data
files lying around, but still want to sanity check the model with some simple
inputs that match the dtype and shape of the model signature.

#### Save Output

By default, the SavedModel CLI will print outputs to the console. If a directory is
passed to the `--outdir` option, the outputs will be saved as `.npy` files named after
output tensor keys under the given directory. Use `--overwrite` to overwrite
existing output files.

#### TensorFlow Debugger (tfdbg) Integration

If the `--tf_debug` option is set, the SavedModel CLI will use the TensorFlow Debugger
(tfdbg) to watch the intermediate Tensors and runtime GraphDefs while running
the SavedModel.

#### Examples

Suppose we have a simple model that adds `x1` and `x2` to get output `y`, where all
tensors are of shape `(-1, 1)`, and we have two `.npy` files. File
`/tmp/my_data1.npy` contains a numpy ndarray `[[1], [2], [3]]`, and file
`/tmp/my_data2.npy` contains another numpy ndarray `[[0.5], [0.5], [0.5]]`. Now
let's run these two `.npy` files through the model to get `y`:

```
$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
--signature_def x1_x2_to_y --inputs x1=/tmp/my_data1.npy;x2=/tmp/my_data2.npy \
--outdir /tmp/out
Result for output key y:
[[ 1.5]
 [ 2.5]
 [ 3.5]]
```

Similarly, we can run input tensors from a `.npz` file and a pickle file, as well as
overwrite the previous output file:

```
$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
--signature_def x1_x2_to_y \
--inputs x1=/tmp/my_data1.npz[x];x2=/tmp/my_data2.pkl --outdir /tmp/out \
--overwrite
Result for output key y:
[[ 1.5]
 [ 2.5]
 [ 3.5]]
```

You can also use a Python expression instead of an input file. Here we replace input
`x2` with a Python expression:

```
$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
--signature_def x1_x2_to_y --inputs x1=/tmp/my_data1.npz[x] \
--input_exprs 'x2=np.ones((3,1))'
Result for output key y:
[[ 2]
 [ 3]
 [ 4]]
```

To run the model with the TensorFlow Debugger enabled:

```
$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
--signature_def serving_default --inputs x=/tmp/data.npz[x] --tf_debug
```

@ -52,6 +52,8 @@ other_variable = tf.get_variable("other_variable", dtype=tf.int32,
Note that when the initializer is a `tf.Tensor` you should not specify the
variable's shape, as the shape of the initializer tensor will be used.

<a name="collections"></a>
### Variable collections

Because disconnected parts of a TensorFlow program might want to create

@ -223,152 +225,6 @@ with tf.control_dependencies([assignment]):
# assign_add operation.
```

## Saving and Restoring

The easiest way to save and restore a model is to use a `tf.train.Saver` object.
The constructor adds `save` and `restore` ops to the graph for all, or a
specified list, of the variables in the graph. The `Saver` object provides
methods to run these ops, specifying paths for the checkpoint files to write to
or read from.

To restore a model checkpoint without a graph, you must first import the graph
from the `MetaGraph` file (typical extension is `.meta`). Do this by calling
@{tf.train.import_meta_graph}, which in turn returns a `Saver` from which one
can then perform a `restore`.
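
As a minimal sketch (assuming a checkpoint was previously written to
`/tmp/model.ckpt`, as in the examples below):

```python
import tensorflow as tf

# Rebuild the graph from the MetaGraph file; this returns a Saver for it.
saver = tf.train.import_meta_graph("/tmp/model.ckpt.meta")

with tf.Session() as sess:
  # Restore the variable values saved in the checkpoint.
  saver.restore(sess, "/tmp/model.ckpt")
```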

### Checkpoint Files

TensorFlow saves variables in binary files that, roughly speaking, map variable
names to tensor values.

When you create a `Saver` object, you can optionally choose names for the
variables in the checkpoint files. By default, `Saver` uses the value of the
@{tf.Variable.name} property for each variable.

To inspect the variables in a checkpoint, you can use the
[`inspect_checkpoint`](https://www.tensorflow.org/code/tensorflow/python/tools/inspect_checkpoint.py) library,
particularly the `print_tensors_in_checkpoint_file` function.
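
For example, a small sketch of dumping every tensor stored in the checkpoint
written below (the argument names match the TF 1.x version of this utility):

```python
from tensorflow.python.tools import inspect_checkpoint as chkp

# Print all tensors (names and values) stored in the checkpoint file.
chkp.print_tensors_in_checkpoint_file(
    "/tmp/model.ckpt", tensor_name="", all_tensors=True)
```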

### Saving Variables

Create a `Saver` with `tf.train.Saver()` to manage all variables in the
model. For example, the following snippet demonstrates how to call the
`tf.train.Saver.save` method to save variables to a checkpoint file:

```python
# Create some variables.
v1 = tf.get_variable("v1", shape=[3], initializer = tf.zeros_initializer)
v2 = tf.get_variable("v2", shape=[5], initializer = tf.zeros_initializer)

inc_v1 = v1.assign(v1+1)
dec_v2 = v2.assign(v2-1)

# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, and save the
# variables to disk.
with tf.Session() as sess:
  sess.run(init_op)
  # Do some work with the model.
  inc_v1.op.run()
  dec_v2.op.run()
  # Save the variables to disk.
  save_path = saver.save(sess, "/tmp/model.ckpt")
  print("Model saved in file: %s" % save_path)
```

### Restoring Variables

The `tf.train.Saver` object not only saves variables to checkpoint files, it
also restores variables. Note that when you restore variables from a file you
do not have to initialize them beforehand. For example, the following snippet
demonstrates how to call the `tf.train.Saver.restore` method to restore
variables from a checkpoint file:

```python
tf.reset_default_graph()

# Create some variables.
v1 = tf.get_variable("v1", shape=[3])
v2 = tf.get_variable("v2", shape=[5])

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  saver.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Check the values of the variables
  print("v1 : %s" % v1.eval())
  print("v2 : %s" % v2.eval())
```

### Choosing which Variables to Save and Restore

If you do not pass any argument to `tf.train.Saver()`, the saver handles all
variables in the graph. Each variable is saved under the name that was passed
when the variable was created.

It is sometimes useful to explicitly specify names for variables in the
checkpoint files. For example, you may have trained a model with a variable
named `"weights"` whose value you want to restore into a variable named
`"params"`.

It is also sometimes useful to only save or restore a subset of the variables
used by a model. For example, you may have trained a neural net with five
layers, and you now want to train a new model with six layers that reuses the
existing weights of the five trained layers. You can use the saver to restore
the weights of just the first five layers.

You can easily specify the names and variables to save or load by passing to the
`tf.train.Saver()` constructor either a list of variables (which will be stored
under their own names), or a Python dictionary in which keys are the names to
use and values are the variables to manage.

Continuing from the save/restore examples above:

```python
tf.reset_default_graph()
# Create some variables.
v1 = tf.get_variable("v1", [3], initializer = tf.zeros_initializer)
v2 = tf.get_variable("v2", [5], initializer = tf.zeros_initializer)

# Add ops to save and restore only `v2` using the name "v2"
saver = tf.train.Saver({"v2": v2})

# Use the saver object normally after that.
with tf.Session() as sess:
  # Initialize v1 since the saver will not.
  v1.initializer.run()
  saver.restore(sess, "/tmp/model.ckpt")

  print("v1 : %s" % v1.eval())
  print("v2 : %s" % v2.eval())
```

Notes:

* You can create as many `Saver` objects as you want if you need to save and
  restore different subsets of the model variables. The same variable can be
  listed in multiple saver objects; its value is only changed when the
  `Saver.restore()` method is run.

* If you only restore a subset of the model variables at the start of a
  session, you have to run an initialize op for the other variables. See
  @{tf.variables_initializer} for more information.

## Sharing variables