mirror of
https://github.com/zebrajr/faceswap.git
* Core Updates
- Remove lib.utils.keras_backend_quiet and replace with get_backend() where relevant
- Document lib.gpu_stats and lib.sys_info
- Remove call to GPUStats.is_plaidml from convert and replace with get_backend()
- lib.gui.menu - typo fix
* Update Dependencies
Bump Tensorflow Version Check
* Port extraction to tf2
* Add custom import finder for loading Keras or tf.keras depending on backend
* Add `tensorflow` to KerasFinder search path
* Basic TF2 training running
* model.initializers - docstring fix
* Fix and pass tests for tf2
* Replace Keras backend tests with faceswap backend tests
* Initial optimizers update
* Monkey patch tf.keras optimizer
* Remove custom Adam Optimizers and Memory Saving Gradients
* Remove multi-gpu option. Add Distribution to cli
* plugins.train.model._base: Add Mirror, Central and Default distribution strategies (see the sketch after this list)
* Update tensorboard kwargs for tf2
* Penalized Loss - Fix for TF2 and AMD
* Fix syntax for tf2.1
* requirements typo fix
* Explicit None for clipnorm if using a distribution strategy
* Fix penalized loss for distribution strategies
* Update Dlight
* typo fix
* Pin to TF2.2
* setup.py - Install tensorflow from pip if not available in Conda
* Add reduction options and set default for mirrored distribution strategy
* Explicitly use default strategy rather than nullcontext
* lib.model.backup_restore documentation
* Remove mirrored strategy reduction method and default based on OS
* Initial restructure - training
* Remove PingPong
Start model.base refactor
* Model saving and resuming enabled
* More tidying up of model.base
* Enable backup and snapshotting
* Re-enable state file
Remove loss names from state file
Fix print loss function
Set snapshot iterations correctly
* Revert original model to Keras Model structure rather than custom layer
Output full model and sub model summary
Change NNBlocks to callables rather than custom keras layers
* Apply custom Conv2D layer
* Finalize NNBlock restructure
Update Dfaker blocks
* Fix reloading model under a different distribution strategy
* Pass command line arguments through to trainer
* Remove training_opts from model and reference params directly
* Tidy up model __init__
* Re-enable tensorboard logging
Suppress "Model Not Compiled" warning
* Fix timelapse
* lib.model.nnblocks - Bugfix residual block
Port dfaker
bugfix original
* dfl-h128 ported
* DFL SAE ported
* IAE Ported
* dlight ported
* port lightweight
* realface ported
* unbalanced ported
* villain ported
* lib.cli.args - Update Batchsize + move allow_growth to config
* Remove output shape definition
Get image sizes per side rather than globally
* Strip mask input from encoder
* Fix learn mask and output learned mask to preview
* Trigger Allow Growth prior to setting strategy
* Fix GUI Graphing
* GUI - Display batchsize correctly + fix training graphs
* Fix penalized loss
* Enable mixed precision training
* Update analysis displayed batch to match input
* Penalized Loss - Multi-GPU Fix
* Fix all losses for TF2
* Fix Reflect Padding
* Allow different input size for each side of the model
* Fix conv-aware initialization on reload
* Switch allow_growth order
* Move mixed_precision to cli
* Remove distribution strategies
* Compile penalized loss sub-function into LossContainer
* Bump default save interval to 250
Generate preview on first iteration but don't save
Fix iterations to start at 1 instead of 0
Remove training deprecation warnings
Bump some scripts.train loglevels
* Add ability to refresh preview on demand on pop-up window
* Enable refresh of training preview from GUI
* Fix Convert
Debug logging in Initializers
* Fix Preview Tool
* Update Legacy TF1 weights to TF2
Catch stats error on loading stats with missing logs
* lib.gui.popup_configure - Make more responsive + document
* Multiple Outputs supported in trainer
Original Model - Mask output bugfix
* Make universal inference model for convert
Remove scaling from penalized mask loss (now handled at input to y_true)
* Fix inference model to work properly with all models
* Fix multi-scale output for convert
* Fix clipnorm issue with distribution strategies
Edit error message on OOM
* Update plaidml losses
* Add missing file
* Disable gmsd loss for plaidml
* PlaidML - Basic training working
* clipnorm rewriting for mixed-precision
* Inference model creation bugfixes
* Remove debug code
* Bugfix: Default clipnorm to 1.0
* Remove all mask inputs from training code
* Remove mask inputs from convert
* GUI - Analysis Tab - Docstrings
* Fix rate in totals row
* lib.gui - Only update display pages if they have focus
* Save the model on first iteration
* plaidml - Fix SSIM loss with penalized loss
* tools.alignments - Remove manual and fix jobs
* GUI - Remove case formatting on help text
* gui MultiSelect custom widget - Set default values on init
* vgg_face2 - Move to plugins.extract.recognition and use plugins._base base class
cli - Add global GPU Exclude Option
tools.sort - Use global GPU Exclude option for backend
lib.model.session - Exclude all GPUs when running in CPU mode
lib.cli.launcher - Set backend to CPU mode when all GPUs excluded
* Cascade excluded devices to GPU Stats
* Explicit GPU selection for Train and Convert
* Reduce Tensorflow Min GPU Multiprocessor Count to 4
* remove compat.v1 code from extract
* Force TF to skip mixed precision compatibility check if GPUs have been filtered
* Add notes to config for non-working AMD losses
* Raise error if forcing extract to CPU mode
* Fix loading of legacy dfl-sae weights + dfl-sae typo fix
* Remove unused requirements
Update sphinx requirements
Fix broken rst file locations
* docs: lib.gui.display
* clipnorm amd condition check
* documentation - gui.display_analysis
* Documentation - gui.popup_configure
* Documentation - lib.logger
* Documentation - lib.model.initializers
* Documentation - lib.model.layers
* Documentation - lib.model.losses
* Documentation - lib.model.nn_blocks
* Documentation - lib.model.normalization
* Documentation - lib.model.session
* Documentation - lib.plaidml_stats
* Documentation: lib.training_data
* Documentation: lib.utils
* Documentation: plugins.train.model._base
* GUI Stats: prevent stats from using GPU
* Documentation - Original Model
* Documentation: plugins.model.trainer._base
* linting
* unit tests: initializers + losses
* unit tests: nn_blocks
* bugfix - Exclude gpu devices in train, not include
* Enable Exclude-Gpus in Extract
* Enable exclude gpus in tools
* Disallow multiple plugin types in a single model folder
* Automatically add exclude_gpus argument for cpu backends
* Cpu backend fixes
* Relax optimizer test threshold
* Default Train settings - Set mask to Extended
* Update Extractor cli help text
Update to Python 3.8
* Fix FAN to run on CPU
* lib.plaidml_tools - typo fix
* Linux installer - check for curl
* linux installer - typo fix
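
Several of the bullets above (allow growth, distribution strategies, mixed precision, and the clipnorm caveat) reduce to a small amount of TF 2.2-era Keras setup. The following is a minimal, hypothetical sketch of that call order only; it is not faceswap's training code, and the toy model, shapes, and loss are invented for illustration.

    import tensorflow as tf
    from tensorflow.keras.mixed_precision import experimental as mixed_precision

    # Allow-growth is triggered before the distribution strategy is created (see above)
    for gpu in tf.config.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)

    # Enable mixed precision training (TF 2.2-era experimental API)
    mixed_precision.set_policy(mixed_precision.Policy("mixed_float16"))

    strategy = tf.distribute.MirroredStrategy()
    with strategy.scope():
        inputs = tf.keras.Input(shape=(64, 64, 3))
        net = tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
        net = tf.keras.layers.GlobalAveragePooling2D()(net)
        # Keep the final layer in float32 for numerical stability under mixed precision
        outputs = tf.keras.layers.Dense(10, dtype="float32")(net)
        model = tf.keras.Model(inputs, outputs)
        # clipnorm is deliberately left unset: per the notes above it must be None
        # when a distribution strategy is active on this TF version
        model.compile(optimizer=tf.keras.optimizers.Adam(), loss="mse")
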
288 lines
11 KiB
Python
#!/usr/bin python3
""" PlaidML tools.

Statistics and setup for PlaidML on AMD devices.

This module must be kept separate from Keras, and be called prior to any Keras import, as the
plaidML Keras backend is set from this module.
"""

import json
import logging
import os
import sys

import plaidml

_INIT = False
_LOGGER = None
_EXCLUDE_DEVICES = []

class PlaidMLStats():
    """ Handles the initialization of PlaidML and the returning of GPU information for connected
    cards from the PlaidML library.

    This class is initialized early in Faceswap's launch process from :func:`setup_plaidml`, with
    statistics made available from :class:`~lib.gpu_stats.GPUStats`.

    Parameters
    ----------
    log_level: str, optional
        The requested Faceswap log level. Also dictates the level that PlaidML logging is set at.
        Default: ``"INFO"``
    log: bool, optional
        Whether this class should output to the logger. If statistics are being accessed during a
        crash, then the logger may not be available, so this gives the option to turn logging off
        in those kinds of situations. Default: ``True``
    """
    def __init__(self, log_level="INFO", log=True):
        if not _INIT and log:
            # Logger held internally, as we don't want to log when obtaining system stats on crash
            global _LOGGER  # pylint:disable=global-statement
            _LOGGER = logging.getLogger(__name__)
            _LOGGER.debug("Initializing: %s: (log_level: %s, log: %s)",
                          self.__class__.__name__, log_level, log)
        self._initialize(log_level)
        self._ctx = plaidml.Context()
        self._supported_devices = self._get_supported_devices()
        self._devices = self._get_all_devices()

        self._device_details = [json.loads(device.details.decode())
                                for device in self._devices if device.details]
        if self._devices and not self.active_devices:
            self._load_active_devices()
        if _LOGGER:
            _LOGGER.debug("Initialized: %s", self.__class__.__name__)

    # PROPERTIES
    @property
    def devices(self):
        """list: The :class:`plaidml._DeviceConfig` objects for GPUs that PlaidML has
        discovered. """
        return self._devices

    @property
    def active_devices(self):
        """ list: List of device indices for active GPU devices. """
        return [idx for idx, d_id in enumerate(self._ids)
                if d_id in plaidml.settings.device_ids and idx not in _EXCLUDE_DEVICES]

    @property
    def device_count(self):
        """ int: The total number of GPU devices discovered. """
        return len(self._devices)

    @property
    def drivers(self):
        """ list: The driver versions for each GPU device that PlaidML has discovered. """
        return [device.get("driverVersion", "No Driver Found") for device in self._device_details]

    @property
    def vram(self):
        """ list: The VRAM of each GPU device that PlaidML has discovered. """
        return [int(device.get("globalMemSize", 0)) / (1024 * 1024)
                for device in self._device_details]

    @property
    def names(self):
        """ list: The name of each GPU device that PlaidML has discovered. """
        return ["{} - {} ({})".format(
            device.get("vendor", "unknown"),
            device.get("name", "unknown"),
            "supported" if idx in self._supported_indices else "experimental")
                for idx, device in enumerate(self._device_details)]

    @property
    def _ids(self):
        """ list: The device identification for each GPU device that PlaidML has discovered. """
        return [device.id.decode() for device in self._devices]

    @property
    def _experimental_indices(self):
        """ list: The indices corresponding to :attr:`_ids` of GPU devices marked as
        "experimental". """
        retval = [idx for idx, device in enumerate(self.devices)
                  if device not in self._supported_devices]
        if _LOGGER:
            _LOGGER.debug(retval)
        return retval

    @property
    def _supported_indices(self):
        """ list: The indices corresponding to :attr:`_ids` of GPU devices marked as
        "supported". """
        retval = [idx for idx, device in enumerate(self._devices)
                  if device in self._supported_devices]
        if _LOGGER:
            _LOGGER.debug(retval)
        return retval

    # INITIALIZATION
    def _initialize(self, log_level):
        """ Initialize PlaidML.

        Set PlaidML to use Faceswap's logger, and set the logging level.

        Parameters
        ----------
        log_level: str, optional
            The requested Faceswap log level. Also dictates the level that PlaidML logging is set
            at.
        """
        global _INIT  # pylint:disable=global-statement
        if _INIT:
            if _LOGGER:
                _LOGGER.debug("PlaidML already initialized")
            return
        if _LOGGER:
            _LOGGER.debug("Initializing PlaidML")
        self._set_plaidml_logger()
        self._set_verbosity(log_level)
        _INIT = True
        if _LOGGER:
            _LOGGER.debug("Initialized PlaidML")

    @classmethod
    def _set_plaidml_logger(cls):
        """ Set PlaidML's default logger to the Faceswap logger and prevent propagation. """
        if _LOGGER:
            _LOGGER.debug("Setting PlaidML Default Logger")
        plaidml.DEFAULT_LOG_HANDLER = logging.getLogger("plaidml_root")
        plaidml.DEFAULT_LOG_HANDLER.propagate = 0
        if _LOGGER:
            _LOGGER.debug("Set PlaidML Default Logger")

    @classmethod
    def _set_verbosity(cls, log_level):
        """ Set the PlaidML logging verbosity.

        Parameters
        ----------
        log_level: str
            The requested Faceswap log level. Also dictates the level that PlaidML logging is set
            at.
        """
        if _LOGGER:
            _LOGGER.debug("Setting PlaidML Loglevel: %s", log_level)
        if isinstance(log_level, int):
            numeric_level = log_level
        else:
            numeric_level = getattr(logging, log_level.upper(), None)
        if numeric_level < 10:
            # DEBUG Logging
            plaidml._internal_set_vlog(1)  # pylint:disable=protected-access
        elif numeric_level < 20:
            # INFO Logging
            plaidml._internal_set_vlog(0)  # pylint:disable=protected-access
        else:
            # WARNING Logging
            plaidml.quiet()

    def _get_supported_devices(self):
        """ Obtain GPU devices from PlaidML that are marked as "supported".

        Returns
        -------
        list
            The :class:`plaidml._DeviceConfig` objects for GPUs that PlaidML has discovered.
        """
        experimental_setting = plaidml.settings.experimental
        plaidml.settings.experimental = False
        devices = plaidml.devices(self._ctx, limit=100, return_all=True)[0]
        plaidml.settings.experimental = experimental_setting

        supported = [device for device in devices
                     if device.details
                     and json.loads(device.details.decode()).get("type", "cpu").lower() == "gpu"]
        if _LOGGER:
            _LOGGER.debug(supported)
        return supported

    def _get_all_devices(self):
        """ Obtain all available (experimental and supported) GPU devices from PlaidML.

        Returns
        -------
        list
            The :class:`plaidml._DeviceConfig` objects for GPUs that PlaidML has discovered.
        """
        experimental_setting = plaidml.settings.experimental
        plaidml.settings.experimental = True
        devices, _ = plaidml.devices(self._ctx, limit=100, return_all=True)
        plaidml.settings.experimental = experimental_setting

        experi = [device for device in devices
                  if device.details
                  and json.loads(device.details.decode()).get("type", "cpu").lower() == "gpu"]
        if _LOGGER:
            _LOGGER.debug("Experimental Devices: %s", experi)
        all_devices = experi + self._supported_devices
        if _LOGGER:
            _LOGGER.debug(all_devices)
        return all_devices

    def _load_active_devices(self):
        """ If the PlaidML user configuration settings exist, then set the default GPU from the
        settings file, otherwise set the GPU to be the one with the most VRAM. """
        if not os.path.exists(plaidml.settings.user_settings):  # pylint:disable=no-member
            if _LOGGER:
                _LOGGER.debug("Setting largest PlaidML device")
            self._set_largest_gpu()
        else:
            if _LOGGER:
                _LOGGER.debug("Setting PlaidML devices from user_settings")

    def _set_largest_gpu(self):
        """ Set the default GPU to be a supported device with the most available VRAM. If no
        supported device is available, then set the GPU to be an experimental device with the
        most VRAM available. """
        category = "supported" if self._supported_devices else "experimental"
        if _LOGGER:
            _LOGGER.debug("Obtaining largest %s device", category)
        indices = getattr(self, "_{}_indices".format(category))
        if not indices:
            if _LOGGER:
                _LOGGER.error("Failed to automatically detect your GPU.")
                _LOGGER.error("Please run `plaidml-setup` to set up your GPU.")
            sys.exit(1)
        max_vram = max([self.vram[idx] for idx in indices])
        if _LOGGER:
            _LOGGER.debug("Max VRAM: %s", max_vram)
        gpu_idx = min([idx for idx, vram in enumerate(self.vram)
                       if vram == max_vram and idx in indices])
        if _LOGGER:
            _LOGGER.debug("GPU IDX: %s", gpu_idx)

        selected_gpu = self._ids[gpu_idx]
        if _LOGGER:
            _LOGGER.info("Setting GPU to largest available %s device. If you want to override "
                         "this selection, run `plaidml-setup` from the command line.", category)

        plaidml.settings.experimental = category == "experimental"
        plaidml.settings.device_ids = [selected_gpu]

def setup_plaidml(log_level, exclude_devices):
    """ Set up PlaidML for AMD cards.

    Sets the Keras backend to PlaidML, loads the PlaidML backend and makes GPU device information
    from PlaidML available to :class:`~lib.gpu_stats.GPUStats`.

    Parameters
    ----------
    log_level: str
        Faceswap's log level. Used for setting the log level inside PlaidML
    exclude_devices: list
        A list of integers of device IDs that should not be used by Faceswap
    """
    logger = logging.getLogger(__name__)  # pylint:disable=invalid-name
    logger.info("Setting up for PlaidML")
    logger.verbose("Setting Keras Backend to PlaidML")
    # Add explicitly excluded devices to the list. The contents have already been checked in
    # GPUStats
    if exclude_devices:
        _EXCLUDE_DEVICES.extend(int(idx) for idx in exclude_devices)
    os.environ["KERAS_BACKEND"] = "plaidml.keras.backend"
    plaid = PlaidMLStats(log_level)
    logger.info("Using GPU(s): %s", [plaid.names[i] for i in plaid.active_devices])
    logger.info("Successfully set up for PlaidML")
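
As a usage note: the module docstring above requires this module to run before any Keras import, because it sets the KERAS_BACKEND environment variable. The following is a minimal launch-order sketch only, assuming faceswap's package layout (lib.plaidml_tools) and that faceswap's logging setup, which adds the custom verbose level used by setup_plaidml, has already run.

    from lib.plaidml_tools import setup_plaidml

    # Must be called before anything imports Keras, as it points KERAS_BACKEND
    # at plaidml.keras.backend. No devices are excluded in this sketch.
    setup_plaidml(log_level="INFO", exclude_devices=[])

    import keras  # now loads with the PlaidML backend configured above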