CedarBackup3 package¶
Implements local and remote backups to CD or DVD media.
Cedar Backup is a software package designed to manage system backups for a pool of local and remote machines. Cedar Backup understands how to back up filesystem data as well as MySQL and PostgreSQL databases and Subversion repositories. It can also be easily extended to support other kinds of data sources.
Cedar Backup is focused around weekly backups to a single CD or DVD disc, with the expectation that the disc will be changed or overwritten at the beginning of each week. If your hardware is new enough, Cedar Backup can write multisession discs, allowing you to add incremental data to a disc on a daily basis.
Besides offering command-line utilities to manage the backup process, Cedar Backup provides a well-organized library of backup-related functionality, written in the Python programming language.
Author: Kenneth J. Pronovici <pronovic@ieee.org>
Subpackages¶
- CedarBackup3.actions package
- Submodules
- CedarBackup3.actions.collect module
- CedarBackup3.actions.constants module
- CedarBackup3.actions.initialize module
- CedarBackup3.actions.purge module
- CedarBackup3.actions.rebuild module
- CedarBackup3.actions.stage module
- CedarBackup3.actions.store module
- CedarBackup3.actions.util module
- CedarBackup3.actions.validate module
- CedarBackup3.extend package
- Submodules
- CedarBackup3.extend.amazons3 module
- CedarBackup3.extend.capacity module
- CedarBackup3.extend.encrypt module
- CedarBackup3.extend.mbox module
- CedarBackup3.extend.mysql module
- CedarBackup3.extend.postgresql module
- CedarBackup3.extend.split module
- CedarBackup3.extend.subversion module
- CedarBackup3.extend.sysinfo module
- CedarBackup3.tools package
- CedarBackup3.writers package
Submodules¶
CedarBackup3.action module¶
Provides interface backwards compatibility.
In Cedar Backup 2.10.0, a refactoring effort took place to reorganize the code for the standard actions. The code formerly in action.py was split into various other files in the CedarBackup3.actions package. This mostly-empty file remains to preserve the Cedar Backup library interface.
Author: Kenneth J. Pronovici <pronovic@ieee.org>
CedarBackup3.cli module¶
Provides command-line interface implementation for the cback3 script.
Summary¶
The functionality in this module encapsulates the command-line interface for the cback3 script. The cback3 script itself is very short, basically just an invocation of one function implemented here. That, in turn, makes it simpler to validate the command-line interface (for instance, it's easier to run pychecker against a module, and unit tests are easier, too).
The objects and functions implemented in this module are probably not useful to any code external to Cedar Backup. Anyone else implementing their own command-line interface would have to reimplement (or at least enhance) all of this anyway.
Backwards Compatibility¶
The command line interface has changed between Cedar Backup 1.x and Cedar Backup 2.x. Some new switches have been added, and the actions have become simple arguments rather than switches (which is a much more standard command line format). Old 1.x command lines are generally no longer valid.
Module Attributes¶
- CedarBackup3.cli.DEFAULT_CONFIG: The default configuration file
- CedarBackup3.cli.DEFAULT_LOGFILE: The default log file path
- CedarBackup3.cli.DEFAULT_OWNERSHIP: Default ownership for the logfile
- CedarBackup3.cli.DEFAULT_MODE: Default file permissions mode on the logfile
- CedarBackup3.cli.VALID_ACTIONS: List of valid actions
- CedarBackup3.cli.COMBINE_ACTIONS: List of actions which can be combined with other actions
- CedarBackup3.cli.NONCOMBINE_ACTIONS: List of actions which cannot be combined with other actions
Author: Kenneth J. Pronovici <pronovic@ieee.org>
-
class
CedarBackup3.cli.
Options
(argumentList=None, argumentString=None, validate=True)[source]¶ Bases:
object
Class representing command-line options for the cback3 script.
The Options class is a Python object representation of the command-line options of the cback3 script. The object representation is two-way: a command line string or a list of command line arguments can be used to create an Options object, and then changes to the object can be propagated back to a list of command-line arguments or to a command-line string. An Options object can even be created from scratch programmatically (if you have a need for that).
There are two main levels of validation in the Options class. The first is field-level validation. Field-level validation comes into play when a given field in an object is assigned to or updated. We use Python’s property functionality to enforce specific validations on field values, and in some places we even use customized list classes to enforce validations on list members. You should expect to catch a ValueError exception when making assignments to fields if you are programmatically filling an object.
The second level of validation is post-completion validation. Certain validations don’t make sense until an object representation of options is fully “complete”. We don’t want these validations to apply all of the time, because it would make building up a valid object from scratch a real pain. For instance, we might have to do things in the right order to keep from throwing exceptions, etc.
All of these post-completion validations are encapsulated in the Options.validate method. This method can be called at any time by a client, and will always be called immediately after creating an Options object from a command line and before exporting an Options object back to a command line. This way, we get acceptable ease-of-use but we also don’t accept or emit invalid command lines.
Note: Lists within this class are “unordered” for equality comparisons.
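For illustration, a minimal sketch of that round trip, using only the properties and methods documented below (the option values themselves are hypothetical):

```python
from CedarBackup3.cli import Options

# Parse something equivalent to sys.argv[1:] into an Options object
options = Options(argumentList=["--verbose", "collect", "store"])

# Field-level validation applies as soon as a property is assigned
options.full = True

# Propagate the object back out to a normalized command line
print(options.buildArgumentList())    # list form, long option names
print(options.buildArgumentString())  # quoted string form
```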
-
__init__
(argumentList=None, argumentString=None, validate=True)[source]¶ Initializes an options object.
If you initialize the object without passing either argumentList or argumentString, the object will be empty and will be invalid until it is filled in properly.
No reference to the original arguments is saved off by this class. Once the data has been parsed (successfully or not) this original information is discarded.
The argument list is assumed to be a list of arguments, not including the name of the command, something like sys.argv[1:]. If you pass sys.argv instead, things are not going to work.
The argument string will be parsed into an argument list by the util.splitCommandLine function (see the documentation for that function for some important notes about its limitations). There is an assumption that the resulting list will be equivalent to sys.argv[1:], just like argumentList.
Unless the validate argument is False, the Options.validate method will be called (with its default arguments) after successfully parsing any passed-in command line. This validation ensures that appropriate actions, etc. have been specified. Keep in mind that even if validate is False, it might not be possible to parse the passed-in command line, so an exception might still be raised.
Note: The command line format is specified by the _usage function. Call _usage to see a usage statement for the cback3 script.
Note: It is strongly suggested that the validate option always be set to True (the default) unless there is a specific need to read in invalid command line arguments.
Parameters:
- argumentList (List of arguments, i.e. sys.argv[1:]) – Command line for a program
- argumentString (String, i.e. "cback3 --verbose stage store") – Command line for a program
- validate (Boolean true/false) – Validate the command line after parsing it
Raises:
- getopt.GetoptError – If the command-line arguments could not be parsed
- ValueError – If the command-line arguments are invalid
-
actions
¶ Command-line actions list.
-
buildArgumentList
(validate=True)[source]¶ Extracts options into a list of command line arguments.
The original order of the various arguments (if, indeed, the object was initialized with a command-line) is not preserved in this generated argument list. Besides that, the argument list is normalized to use the long option names (i.e. --version rather than -V). The resulting list will be suitable for passing back to the constructor in the argumentList parameter. Unlike buildArgumentString, string arguments are not quoted here, because there is no need for it.
Unless the validate parameter is False, the Options.validate method will be called (with its default arguments) against the options before extracting the command line. If the options are not valid, then an argument list will not be extracted.
Note: It is strongly suggested that the validate option always be set to True (the default) unless there is a specific need to extract an invalid command line.
Parameters: validate (Boolean true/false) – Validate the options before extracting the command line
Returns: List representation of command-line arguments
Raises: ValueError – If options within the object are invalid
-
buildArgumentString
(validate=True)[source]¶ Extracts options into a string of command-line arguments.
The original order of the various arguments (if, indeed, the object was initialized with a command-line) is not preserved in this generated argument string. Besides that, the argument string is normalized to use the long option names (i.e. --version rather than -V) and to quote all string arguments with double quotes ("). The resulting string will be suitable for passing back to the constructor in the argumentString parameter.
Unless the validate parameter is False, the Options.validate method will be called (with its default arguments) against the options before extracting the command line. If the options are not valid, then an argument string will not be extracted.
Note: It is strongly suggested that the validate option always be set to True (the default) unless there is a specific need to extract an invalid command line.
Parameters: validate (Boolean true/false) – Validate the options before extracting the command line
Returns: String representation of command-line arguments
Raises: ValueError – If options within the object are invalid
-
config
¶ Command-line configuration file (
-c,--config
) parameter.
-
debug
¶ Command-line debug (
-d,--debug
) flag.
-
diagnostics
¶ Command-line diagnostics (
-D,--diagnostics
) flag.
-
full
¶ Command-line full-backup (
-f,--full
) flag.
-
help
¶ Command-line help (
-h,--help
) flag.
-
logfile
¶ Command-line logfile (
-l,--logfile
) parameter.
-
managed
¶ Command-line managed (
-M,--managed
) flag.
-
managedOnly
¶ Command-line managed-only (
-N,--managed-only
) flag.
-
mode
¶ Command-line mode (
-m,--mode
) parameter.
-
output
¶ Command-line output (
-O,--output
) flag.
-
owner
¶ Command-line owner (
-o,--owner
) parameter, as tuple(user,group)
.
-
quiet
¶ Command-line quiet (
-q,--quiet
) flag.
-
stacktrace
¶ Command-line stacktrace (
-s,--stack
) flag.
-
validate
()[source]¶ Validates command-line options represented by the object.
Unless --help or --version is supplied, at least one action must be specified. Other validations (such as allowed values for particular options) will be taken care of at assignment time by the properties functionality.
Note: The command line format is specified by the _usage function. Call _usage to see a usage statement for the cback3 script.
Raises: ValueError – If one of the validations fails
-
verbose
¶ Command-line verbose (
-b,--verbose
) flag.
-
version
¶ Command-line version (
-V,--version
) flag.
-
-
CedarBackup3.cli.
cli
()[source]¶ Implements the command-line interface for the
cback3
script.
Essentially, this is the “main routine” for the cback3 script. It does all of the argument processing for the script, and then sets about executing the indicated actions.
As a general rule, only the actions indicated on the command line will be executed. We will accept any of the built-in actions and any of the configured extended actions (which makes action list verification a two-step process).
The 'all' action has a special meaning: it means that the built-in set of actions (collect, stage, store, purge) will all be executed, in that order. Extended actions will be ignored as part of the 'all' action.
Raised exceptions always result in an immediate return. Otherwise, we generally return when all specified actions have been completed. Actions are ignored if the help, version or validate flags are set.
A different error code is returned for each type of failure:
- 1: The Python interpreter version is < 3.4
- 2: Error processing command-line arguments
- 3: Error configuring logging
- 4: Error parsing indicated configuration file
- 5: Backup was interrupted with a CTRL-C or similar
- 6: Error executing specified backup actions
Note: This function contains a good amount of logging at the INFO level, because this is the right place to document high-level flow of control (i.e. what the command-line options were, what config file was being used, etc.)
Note: We assume that anything that must be seen on the screen is logged at the ERROR level. Errors that occur before logging can be configured are written to sys.stderr.
Returns: Error code as described above
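As a sketch of how thin the cback3 script can sit on top of this function (the wrapper file itself is hypothetical; only the documented cli() behavior is assumed):

```python
#!/usr/bin/env python3
# Hypothetical minimal cback3 entry point: argument handling and action
# execution are delegated to CedarBackup3.cli.cli(), which returns the
# error codes described above.
import sys

from CedarBackup3.cli import cli

if __name__ == "__main__":
    sys.exit(cli())
```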
-
CedarBackup3.cli.
setupLogging
(options)[source]¶ Set up logging based on command-line options.
There are two kinds of logging: flow logging and output logging. Output logging contains information about system commands executed by Cedar Backup, for instance the calls to mkisofs or mount, etc. Flow logging contains error and informational messages used to understand program flow. Flow log messages and output log messages are written to two different logger targets (CedarBackup3.log and CedarBackup3.output). Flow log messages are written at the ERROR, INFO and DEBUG log levels, while output log messages are generally only written at the INFO log level.
By default, output logging is disabled. When the options.output or options.debug flags are set, output logging will be written to the configured logfile. Output logging is never written to the screen.
By default, flow logging is enabled at the ERROR level to the screen and at the INFO level to the configured logfile. If the options.quiet flag is set, flow logging is enabled at the INFO level to the configured logfile only (i.e. no output will be sent to the screen). If the options.verbose flag is set, flow logging is enabled at the INFO level to both the screen and the configured logfile. If the options.debug flag is set, flow logging is enabled at the DEBUG level to both the screen and the configured logfile.
Parameters: options (Options object) – Command-line options
Returns: Path to logfile on disk
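A rough sketch of how a caller might wire options into logging, using only the entry points documented here (the argument values are hypothetical, and writing the default logfile may require appropriate permissions):

```python
from CedarBackup3.cli import Options, setupLogging

# Build options as the CLI would, then configure flow and output logging
options = Options(argumentList=["--verbose", "--output", "collect"])
logfilePath = setupLogging(options)
print("Logging to %s" % logfilePath)
```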
-
CedarBackup3.cli.
setupPathResolver
(config)[source]¶ Set up the path resolver singleton based on configuration.
Cedar Backup’s path resolver is implemented in terms of a singleton, the
PathResolverSingleton
class. This function takes options configuration, converts it into the dictionary form needed by the singleton, and then initializes the singleton. After that, any function that needs to resolve the path of a command can use the singleton.Parameters: config ( Config
object) – Configuration
CedarBackup3.config module¶
Provides configuration-related objects.
Summary¶
Cedar Backup stores all of its configuration in an XML document typically called cback3.conf. The standard location for this document is in /etc, but users can specify a different location if they want to.
The Config class is a Python object representation of a Cedar Backup XML configuration file. The representation is two-way: XML data can be used to create a Config object, and then changes to the object can be propagated back to disk. A Config object can even be used to create a configuration file from scratch programmatically.
The Config class is intended to be the only Python-language interface to Cedar Backup configuration on disk. Cedar Backup will use the class as its internal representation of configuration, and applications external to Cedar Backup itself (such as a hypothetical third-party configuration tool written in Python or a third-party extension module) should also use the class when they need to read and write configuration files.
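A minimal sketch of that two-way usage, assuming a configuration file already exists at the standard location (the copy path is just an example):

```python
from CedarBackup3.config import Config

# Parse and validate the on-disk XML configuration
config = Config(xmlPath="/etc/cback3.conf")

# Sections that are absent from the document are simply None
if config.options is not None:
    print(config.options.backupUser)

# Propagate any changes back out as XML (here, to a copy)
config.extractXml(xmlPath="/tmp/cback3-copy.conf")
```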
Backwards Compatibility¶
The configuration file format has changed between Cedar Backup 1.x and Cedar Backup 2.x. Any Cedar Backup 1.x configuration file is also a valid Cedar Backup 2.x configuration file. However, it doesn’t work to go the other direction, as the 2.x configuration files contain additional configuration that is not accepted by older versions of the software.
XML Configuration Structure¶
A Config object can either be created “empty”, or can be created based on XML input (either in the form of a string or read in from a file on disk). Generally speaking, the XML input must result in a Config object which passes the validations laid out below in the Validation section.
An XML configuration file is composed of eight sections:
- reference: specifies reference information about the file (author, revision, etc)
- extensions: specifies mappings to Cedar Backup extensions (external code)
- options: specifies global configuration options
- peers: specifies the set of peers in a master’s backup pool
- collect: specifies configuration related to the collect action
- stage: specifies configuration related to the stage action
- store: specifies configuration related to the store action
- purge: specifies configuration related to the purge action
Each section is represented by a class in this module, and then the overall Config class is a composition of the various other classes.
Any configuration section that is missing in the XML document (or has not been filled into an “empty” document) will just be set to None in the object representation. The same goes for individual fields within each configuration section. Keep in mind that the document might not be completely valid if some sections or fields aren’t filled in - but that won’t matter until validation takes place (see the Validation section below).
Unicode vs. String Data¶
By default, all string data that comes out of XML documents in Python is unicode data (i.e. u"whatever"). This is fine for many things, but when it comes to filesystem paths, it can cause us some problems. We really want strings to be encoded in the filesystem encoding rather than being unicode. So, most elements in configuration which represent filesystem paths are converted to plain strings using util.encodePath. The main exception is the various absoluteExcludePath and relativeExcludePath lists. These are not converted, because they are generally only used for filtering, not for filesystem operations.
Validation¶
There are two main levels of validation in the Config class and its children. The first is field-level validation. Field-level validation comes into play when a given field in an object is assigned to or updated. We use Python’s property functionality to enforce specific validations on field values, and in some places we even use customized list classes to enforce validations on list members. You should expect to catch a ValueError exception when making assignments to configuration class fields.
The second level of validation is post-completion validation. Certain validations don’t make sense until a document is fully “complete”. We don’t want these validations to apply all of the time, because it would make building up a document from scratch a real pain. For instance, we might have to do things in the right order to keep from throwing exceptions, etc.
All of these post-completion validations are encapsulated in the Config.validate method. This method can be called at any time by a client, and will always be called immediately after creating a Config object from XML data and before exporting a Config object to XML. This way, we get decent ease-of-use but we also don’t accept or emit invalid configuration files.
The Config.validate implementation actually takes two passes to completely validate a configuration document. The first pass at validation is to ensure that the proper sections are filled into the document. There are default requirements, but the caller has the opportunity to override these defaults.
The second pass at validation ensures that any filled-in section contains valid data. Any section which is not set to None is validated according to the rules for that section (see below).
Reference Validations
No validations.
Extensions Validations
The list of actions may be either None or an empty list [] if desired. Each extended action must include a name, a module and a function. Then, an extended action must include either an index or dependency information. Which one is required depends on which order mode is configured.
Options Validations
All fields must be filled in except the rsh command. The rcp and rsh commands are used as default values for all remote peers. Remote peers can also rely on the backup user as the default remote user name if they choose.
Peers Validations
Local peers must be completely filled in, including both name and collect directory. Remote peers must also fill in the name and collect directory, but can leave the remote user and rcp command unset. In this case, the remote user is assumed to match the backup user from the options section and rcp command is taken directly from the options section.
Collect Validations
The target directory must be filled in. The collect mode, archive mode and ignore file are all optional. The list of absolute paths to exclude and patterns to exclude may be either None or an empty list [] if desired.
Each collect directory entry must contain an absolute path to collect, and then must either be able to take collect mode, archive mode and ignore file configuration from the parent CollectConfig object, or must set each value on its own. The list of absolute paths to exclude, relative paths to exclude and patterns to exclude may be either None or an empty list [] if desired. Any list of absolute paths to exclude or patterns to exclude will be combined with the same list in the CollectConfig object to make the complete list for a given directory.
Stage Validations
The target directory must be filled in. There must be at least one peer (remote or local) between the two lists of peers. A list with no entries can be either None or an empty list [] if desired.
If a set of peers is provided, this configuration completely overrides configuration in the peers configuration section, and the same validations apply.
Store Validations
The device type and drive speed are optional, and all other values are required (missing booleans will be set to defaults, which is OK).
The image writer functionality in the writer module is supposed to be able to handle a device speed of None. Any caller which needs a “real” (non-None) value for the device type can use DEFAULT_DEVICE_TYPE, which is guaranteed to be sensible.
Purge Validations
The list of purge directories may be either None or an empty list [] if desired. All purge directories must contain a path and a retain days value.
Module Attributes¶
- CedarBackup3.config.DEFAULT_DEVICE_TYPE: The default device type
- CedarBackup3.config.DEFAULT_MEDIA_TYPE: The default media type
- CedarBackup3.config.VALID_DEVICE_TYPES: List of valid device types
- CedarBackup3.config.VALID_MEDIA_TYPES: List of valid media types
- CedarBackup3.config.VALID_COLLECT_MODES: List of valid collect modes
- CedarBackup3.config.VALID_COMPRESS_MODES: List of valid compress modes
- CedarBackup3.config.VALID_ARCHIVE_MODES: List of valid archive modes
- CedarBackup3.config.VALID_ORDER_MODES: List of valid extension order modes
Author: Kenneth J. Pronovici <pronovic@ieee.org>
-
class
CedarBackup3.config.
ActionDependencies
(beforeList=None, afterList=None)[source]¶ Bases:
object
Class representing dependencies associated with an extended action.
Execution ordering for extended actions is done in one of two ways: either by using index values (lower index gets run first) or by having the extended action specify dependencies in terms of other named actions. This class encapsulates the dependency information for an extended action.
The following restrictions exist on data in this class:
- Any action name must be a non-empty string matching
ACTION_NAME_REGEX
-
__init__
(beforeList=None, afterList=None)[source]¶ Constructor for the
ActionDependencies
class.Parameters: - beforeList – List of named actions that this action must be run before
- afterList – List of named actions that this action must be run after
Raises: ValueError
– If one of the values is invalid
-
afterList
¶ List of named actions that this action must be run after.
-
beforeList
¶ List of named actions that this action must be run before.
- Any action name must be a non-empty string matching
-
class
CedarBackup3.config.
ActionHook
(action=None, command=None)[source]¶ Bases:
object
Class representing a hook associated with an action.
A hook associated with an action is a shell command to be executed either before or after a named action is executed.
The following restrictions exist on data in this class:
- The action name must be a non-empty string matching
ACTION_NAME_REGEX
- The shell command must be a non-empty string.
The internal before and after instance variables are always set to False in this parent class.
-
__init__
(action=None, command=None)[source]¶ Constructor for the
ActionHook
class.Parameters: - action – Action this hook is associated with
- command – Shell command to execute
Raises: ValueError
– If one of the values is invalid
-
action
¶ Action this hook is associated with.
-
after
¶ Indicates whether command should be executed after action.
-
before
¶ Indicates whether command should be executed before action.
-
command
¶ Shell command to execute.
- The action name must be a non-empty string matching
-
class
CedarBackup3.config.
BlankBehavior
(blankMode=None, blankFactor=None)[source]¶ Bases:
object
Class representing optimized store-action media blanking behavior.
The following restrictions exist on data in this class:
- The blanking mode must be a one of the values in
VALID_BLANK_MODES
- The blanking factor must be a positive floating point number
-
__init__
(blankMode=None, blankFactor=None)[source]¶ Constructor for the
BlankBehavior
class.Parameters: - blankMode – Blanking mode
- blankFactor – Blanking factor
Raises: ValueError
– If one of the values is invalid
-
blankFactor
¶ Blanking factor
-
blankMode
¶ Blanking mode
- The blanking mode must be a one of the values in
-
class
CedarBackup3.config.
ByteQuantity
(quantity=None, units=None)[source]¶ Bases:
object
Class representing a byte quantity.
A byte quantity has both a quantity and a byte-related unit. Units are maintained using the constants from util.py. If no units are provided, UNIT_BYTES is assumed.
The quantity is maintained internally as a string so that issues of precision can be avoided. It really isn’t possible to store a floating point number here while being able to losslessly translate back and forth between XML and object representations. (Perhaps the Python 2.4 Decimal class would have been an option, but I originally wanted to stay compatible with Python 2.3.)
Even though the quantity is maintained as a string, the string must represent a valid positive floating point number. Technically, any floating point string format supported by Python is allowable. However, it does not make sense to have a negative quantity of bytes in this context.
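A small sketch of how the class is used; UNIT_GBYTES is assumed to be one of the byte-unit constants defined in util.py alongside UNIT_BYTES:

```python
from CedarBackup3.config import ByteQuantity
from CedarBackup3.util import UNIT_BYTES, UNIT_GBYTES  # assumed unit constants

# The quantity is passed and stored as a string to avoid precision loss
capacity = ByteQuantity(quantity="1.5", units=UNIT_GBYTES)
print(capacity.quantity)  # "1.5", the original string
print(capacity.bytes)     # equivalent value as a floating point number of bytes

defaulted = ByteQuantity(quantity="4096")  # units default to UNIT_BYTES
```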
-
__init__
(quantity=None, units=None)[source]¶ Constructor for the
ByteQuantity
class.Parameters: - quantity – Quantity of bytes, something interpretable as a float
- units – Unit of bytes, one of VALID_BYTE_UNITS
Raises: ValueError
– If one of the values is invalid
-
bytes
¶ Byte quantity, as a floating point number.
-
quantity
¶ Byte quantity, as a string
-
units
¶ Units for byte quantity, for instance UNIT_BYTES
-
-
class
CedarBackup3.config.
CollectConfig
(targetDir=None, collectMode=None, archiveMode=None, ignoreFile=None, absoluteExcludePaths=None, excludePatterns=None, collectFiles=None, collectDirs=None)[source]¶ Bases:
object
Class representing a Cedar Backup collect configuration.
The following restrictions exist on data in this class:
- The target directory must be an absolute path.
- The collect mode must be one of the values in
VALID_COLLECT_MODES
. - The archive mode must be one of the values in
VALID_ARCHIVE_MODES
. - The ignore file must be a non-empty string.
- Each of the paths in
absoluteExcludePaths
must be an absolute path - The collect file list must be a list of
CollectFile
objects. - The collect directory list must be a list of
CollectDir
objects.
For the
absoluteExcludePaths
list, validation is accomplished through theutil.AbsolutePathList
list implementation that overrides common list methods and transparently does the absolute path validation for us.For the
collectFiles
andcollectDirs
list, validation is accomplished through theutil.ObjectTypeList
list implementation that overrides common list methods and transparently ensures that each element has an appropriate type.Note: Lists within this class are “unordered” for equality comparisons.
-
__init__
(targetDir=None, collectMode=None, archiveMode=None, ignoreFile=None, absoluteExcludePaths=None, excludePatterns=None, collectFiles=None, collectDirs=None)[source]¶ Constructor for the
CollectConfig
class.Parameters: - targetDir – Directory to collect files into
- collectMode – Default collect mode
- archiveMode – Default archive mode for collect files
- ignoreFile – Default ignore file name
- absoluteExcludePaths – List of absolute paths to exclude
- excludePatterns – List of regular expression patterns to exclude
- collectFiles – List of collect files
- collectDirs – List of collect directories
Raises: ValueError
– If one of the values is invalid
-
absoluteExcludePaths
¶ List of absolute paths to exclude.
-
archiveMode
¶ Default archive mode for collect files.
-
collectDirs
¶ List of collect directories.
-
collectFiles
¶ List of collect files.
-
collectMode
¶ Default collect mode.
-
excludePatterns
¶ List of regular expression patterns to exclude.
-
ignoreFile
¶ Default ignore file name.
-
targetDir
¶ Directory to collect files into.
-
class
CedarBackup3.config.
CollectDir
(absolutePath=None, collectMode=None, archiveMode=None, ignoreFile=None, absoluteExcludePaths=None, relativeExcludePaths=None, excludePatterns=None, linkDepth=None, dereference=False, recursionLevel=None)[source]¶ Bases:
object
Class representing a Cedar Backup collect directory.
The following restrictions exist on data in this class:
- Absolute paths must be absolute
- The collect mode must be one of the values in
VALID_COLLECT_MODES
. - The archive mode must be one of the values in
VALID_ARCHIVE_MODES
. - The ignore file must be a non-empty string.
For the
absoluteExcludePaths
list, validation is accomplished through theutil.AbsolutePathList
list implementation that overrides common list methods and transparently does the absolute path validation for us.Note: Lists within this class are “unordered” for equality comparisons.
-
__init__
(absolutePath=None, collectMode=None, archiveMode=None, ignoreFile=None, absoluteExcludePaths=None, relativeExcludePaths=None, excludePatterns=None, linkDepth=None, dereference=False, recursionLevel=None)[source]¶ Constructor for the
CollectDir
class.Parameters: - absolutePath – Absolute path of the directory to collect
- collectMode – Overridden collect mode for this directory
- archiveMode – Overridden archive mode for this directory
- ignoreFile – Overridden ignore file name for this directory
- linkDepth – Maximum depth at which soft links should be followed
- dereference – Whether to dereference links that are followed
- absoluteExcludePaths – List of absolute paths to exclude
- relativeExcludePaths – List of relative paths to exclude
- excludePatterns – List of regular expression patterns to exclude
Raises: ValueError
– If one of the values is invalid
-
absoluteExcludePaths
¶ List of absolute paths to exclude.
-
absolutePath
¶ Absolute path of the directory to collect.
-
archiveMode
¶ Overridden archive mode for this directory.
-
collectMode
¶ Overridden collect mode for this directory.
-
dereference
¶ Whether to dereference links that are followed.
-
excludePatterns
¶ List of regular expression patterns to exclude.
-
ignoreFile
¶ Overridden ignore file name for this directory.
-
linkDepth
¶ Maximum depth at which soft links should be followed.
-
recursionLevel
¶ Recursion level to use for recursive directory collection
-
relativeExcludePaths
¶ List of relative paths to exclude.
-
class
CedarBackup3.config.
CollectFile
(absolutePath=None, collectMode=None, archiveMode=None)[source]¶ Bases:
object
Class representing a Cedar Backup collect file.
The following restrictions exist on data in this class:
- Absolute paths must be absolute
- The collect mode must be one of the values in
VALID_COLLECT_MODES
. - The archive mode must be one of the values in
VALID_ARCHIVE_MODES
.
-
__init__
(absolutePath=None, collectMode=None, archiveMode=None)[source]¶ Constructor for the
CollectFile
class.Parameters: - absolutePath – Absolute path of the file to collect
- collectMode – Overridden collect mode for this file
- archiveMode – Overridden archive mode for this file
Raises: ValueError
– If one of the values is invalid
-
absolutePath
¶ Absolute path of the file to collect.
-
archiveMode
¶ Overridden archive mode for this file.
-
collectMode
¶ Overridden collect mode for this file.
-
class
CedarBackup3.config.
CommandOverride
(command=None, absolutePath=None)[source]¶ Bases:
object
Class representing a piece of Cedar Backup command override configuration.
The following restrictions exist on data in this class:
- The absolute path must be absolute
Note: Lists within this class are “unordered” for equality comparisons.
-
__init__
(command=None, absolutePath=None)[source]¶ Constructor for the
CommandOverride
class.Parameters: - command – Name of command to be overridden
- absolutePath – Absolute path of the overridden command
Raises: ValueError
– If one of the values is invalid
-
absolutePath
¶ Absolute path of the overridden command.
-
command
¶ Name of command to be overridden.
-
class
CedarBackup3.config.
Config
(xmlData=None, xmlPath=None, validate=True)[source]¶ Bases:
object
Class representing a Cedar Backup XML configuration document.
The
Config
class is a Python object representation of a Cedar Backup XML configuration file. It is intended to be the only Python-language interface to Cedar Backup configuration on disk for both Cedar Backup itself and for external applications.The object representation is two-way: XML data can be used to create a
Config
object, and then changes to the object can be propagated back to disk. A Config
object can even be used to create a configuration file from scratch programmatically.This class and the classes it is composed from often use Python’s
property
construct to validate input and limit access to values. Some validations can only be done once a document is considered “complete” (see module notes for more details).Assignments to the various instance variables must match the expected type, i.e.
reference
must be aReferenceConfig
. The internal check uses the built-inisinstance
function, so it should be OK to use subclasses if you want to.If an instance variable is not set, its value will be
None
. When an object is initialized without using an XML document, all of the values will beNone
. Even when an object is initialized using XML, some of the values might beNone
because not every section is required.Note: Lists within this class are “unordered” for equality comparisons.
-
__init__
(xmlData=None, xmlPath=None, validate=True)[source]¶ Initializes a configuration object.
If you initialize the object without passing either
xmlData
orxmlPath
, then configuration will be empty and will be invalid until it is filled in properly.No reference to the original XML data or original path is saved off by this class. Once the data has been parsed (successfully or not) this original information is discarded.
Unless the
validate
argument isFalse
, theConfig.validate
method will be called (with its default arguments) against configuration after successfully parsing any passed-in XML. Keep in mind that even ifvalidate
isFalse
, it might not be possible to parse the passed-in XML document if lower-level validations fail.Note: It is strongly suggested that the
validate
option always be set toTrue
(the default) unless there is a specific need to read in invalid configuration from disk.Parameters: - xmlData (String data) – XML data representing configuration
- xmlPath (Absolute path to a file on disk) – Path to an XML file on disk
- validate (Boolean true/false) – Validate the document after parsing it
Raises: ValueError
– If bothxmlData
andxmlPath
are passed-inValueError
– If the XML data inxmlData
orxmlPath
cannot be parsedValueError
– If the parsed configuration document is not valid
-
collect
¶ Collect configuration in terms of a
CollectConfig
object.
-
extensions
¶ Extensions configuration in terms of a
ExtensionsConfig
object.
-
extractXml
(xmlPath=None, validate=True)[source]¶ Extracts configuration into an XML document.
If xmlPath is not provided, then the XML document will be returned as a string. If xmlPath is provided, then the XML document will be written to the file and None will be returned.
Unless the validate parameter is False, the Config.validate method will be called (with its default arguments) against the configuration before extracting the XML. If configuration is not valid, then an XML document will not be extracted.
Note: It is strongly suggested that the validate option always be set to True (the default) unless there is a specific need to write an invalid configuration file to disk.
Parameters:
- xmlPath (Absolute path to a file) – Path to an XML file to create on disk
- validate (Boolean true/false) – Validate the document before extracting it
Returns: XML string data or None as described above
Raises:
- ValueError – If configuration within the object is not valid
- IOError – If there is an error writing to the file
- OSError – If there is an error writing to the file
-
options
¶ Options configuration in terms of a
OptionsConfig
object.
-
peers
¶ Peers configuration in terms of a
PeersConfig
object.
-
purge
¶ Purge configuration in terms of a
PurgeConfig
object.
-
reference
¶ Reference configuration in terms of a
ReferenceConfig
object.
-
stage
¶ Stage configuration in terms of a
StageConfig
object.
-
store
¶ Store configuration in terms of a
StoreConfig
object.
-
validate
(requireOneAction=True, requireReference=False, requireExtensions=False, requireOptions=True, requireCollect=False, requireStage=False, requireStore=False, requirePurge=False, requirePeers=False)[source]¶ Validates configuration represented by the object.
This method encapsulates all of the validations that should apply to a fully “complete” document but are not already taken care of by earlier validations. It also provides some extra convenience functionality which might be useful to some people. The process of validation is laid out in the Validation section in the class notes (above).
Parameters: - requireOneAction – Require at least one of the collect, stage, store or purge sections
- requireReference – Require the reference section
- requireExtensions – Require the extensions section
- requireOptions – Require the options section
- requirePeers – Require the peers section
- requireCollect – Require the collect section
- requireStage – Require the stage section
- requireStore – Require the store section
- requirePurge – Require the purge section
Raises: ValueError
– If one of the validations fails
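For example, a caller that only cares about the store section might relax the defaults like this (the configuration path is just an example):

```python
from CedarBackup3.config import Config

# Parse without immediate validation, then apply customized requirements
config = Config(xmlPath="/etc/cback3.conf", validate=False)
config.validate(requireOneAction=False, requireOptions=True, requireStore=True)
```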
-
-
class
CedarBackup3.config.
ExtendedAction
(name=None, module=None, function=None, index=None, dependencies=None)[source]¶ Bases:
object
Class representing an extended action.
Essentially, an extended action needs to allow the following to happen:
exec("from %s import %s" % (module, function)) exec("%s(action, configPath")" % function)
The following restrictions exist on data in this class:
- The action name must be a non-empty string consisting of lower-case letters and digits.
- The module must be a non-empty string and a valid Python identifier.
- The function must be a non-empty string and a valid Python identifier.
- If set, the index must be a positive integer.
- If set, the dependencies attribute must be an
ActionDependencies
object.
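As a sketch, a hypothetical third-party "database" extension could be described either by dependencies or by index (the module and function names here are invented for illustration):

```python
from CedarBackup3.config import ActionDependencies, ExtendedAction

# Ordering by dependencies: run after collect, but before stage
deps = ActionDependencies(beforeList=["stage"], afterList=["collect"])
action = ExtendedAction(name="database", module="thirdparty.dbbackup",
                        function="backupDatabase", dependencies=deps)

# Ordering by index is the alternative, used with the "index" order mode
indexed = ExtendedAction(name="database", module="thirdparty.dbbackup",
                         function="backupDatabase", index=101)
```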
-
__init__
(name=None, module=None, function=None, index=None, dependencies=None)[source]¶ Constructor for the
ExtendedAction
class.Parameters: - name – Name of the extended action
- module – Name of the module containing the extended action function
- function – Name of the extended action function
- index – Index of action, used for execution ordering
- dependencies – Dependencies for action, used for execution ordering
Raises: ValueError
– If one of the values is invalid
-
dependencies
¶ Dependencies for action, used for execution ordering.
-
function
¶ Name of the extended action function.
-
index
¶ Index of action, used for execution ordering.
-
module
¶ Name of the module containing the extended action function.
-
name
¶ Name of the extended action.
-
class
CedarBackup3.config.
ExtensionsConfig
(actions=None, orderMode=None)[source]¶ Bases:
object
Class representing Cedar Backup extensions configuration.
Extensions configuration is used to specify “extended actions” implemented by code external to Cedar Backup. For instance, a hypothetical third party might write extension code to collect database repository data. If they write a properly-formatted extension function, they can use the extension configuration to map a command-line Cedar Backup action (i.e. “database”) to their function.
The following restrictions exist on data in this class:
- If set, the order mode must be one of the values in
VALID_ORDER_MODES
- The actions list must be a list of
ExtendedAction
objects.
-
__init__
(actions=None, orderMode=None)[source]¶ Constructor for the
ExtensionsConfig
class.
Parameters: - actions – List of extended actions
- orderMode – Order mode for extensions, to control execution ordering
-
actions
¶ List of extended actions.
-
orderMode
¶ Order mode for extensions, to control execution ordering.
- If set, the order mode must be one of the values in
-
class
CedarBackup3.config.
LocalPeer
(name=None, collectDir=None, ignoreFailureMode=None)[source]¶ Bases:
object
Class representing a Cedar Backup peer.
The following restrictions exist on data in this class:
- The peer name must be a non-empty string.
- The collect directory must be an absolute path.
- The ignore failure mode must be one of the values in
VALID_FAILURE_MODES
.
-
__init__
(name=None, collectDir=None, ignoreFailureMode=None)[source]¶ Constructor for the
LocalPeer
class.Parameters: - name – Name of the peer, typically a valid hostname
- collectDir – Collect directory to stage files from on peer
- ignoreFailureMode – Ignore failure mode for peer
Raises: ValueError
– If one of the values is invalid
-
collectDir
¶ Collect directory to stage files from on peer.
-
ignoreFailureMode
¶ Ignore failure mode for peer.
-
name
¶ Name of the peer, typically a valid hostname.
-
class
CedarBackup3.config.
OptionsConfig
(startingDay=None, workingDir=None, backupUser=None, backupGroup=None, rcpCommand=None, overrides=None, hooks=None, rshCommand=None, cbackCommand=None, managedActions=None)[source]¶ Bases:
object
Class representing a Cedar Backup global options configuration.
The options section is used to store global configuration options and defaults that can be applied to other sections.
The following restrictions exist on data in this class:
- The working directory must be an absolute path.
- The starting day must be a day of the week in English, i.e.
"monday"
,"tuesday"
, etc. - All of the other values must be non-empty strings if they are set to something other than
None
. - The overrides list must be a list of
CommandOverride
objects. - The hooks list must be a list of
ActionHook
objects. - The cback command must be a non-empty string.
- Any managed action name must be a non-empty string matching
ACTION_NAME_REGEX
-
__init__
(startingDay=None, workingDir=None, backupUser=None, backupGroup=None, rcpCommand=None, overrides=None, hooks=None, rshCommand=None, cbackCommand=None, managedActions=None)[source]¶ Constructor for the
OptionsConfig
class.Parameters: - startingDay – Day that starts the week
- workingDir – Working (temporary) directory to use for backups
- backupUser – Effective user that backups should run as
- backupGroup – Effective group that backups should run as
- rcpCommand – Default rcp-compatible copy command for staging
- rshCommand – Default rsh-compatible command to use for remote shells
- cbackCommand – Default cback-compatible command to use on managed remote peers
- overrides – List of configured command path overrides, if any
- hooks – List of configured pre- and post-action hooks
- managedActions – Default set of actions that are managed on remote peers
Raises: ValueError
– If one of the values is invalid
-
addOverride
(command, absolutePath)[source]¶ If no override currently exists for the command, add one. :param command: Name of command to be overridden :param absolutePath: Absolute path of the overridden command
-
backupGroup
¶ Effective group that backups should run as.
-
backupUser
¶ Effective user that backups should run as.
-
cbackCommand
¶ Default cback-compatible command to use on managed remote peers.
-
hooks
¶ List of configured pre- and post-action hooks.
-
managedActions
¶ Default set of actions that are managed on remote peers.
-
overrides
¶ List of configured command path overrides, if any.
-
rcpCommand
¶ Default rcp-compatible copy command for staging.
-
replaceOverride
(command, absolutePath)[source]¶ If an override currently exists for the command, replace it; otherwise add it. :param command: Name of command to be overridden :param absolutePath: Absolute path of the overridden command
-
rshCommand
¶ Default rsh-compatible command to use for remote shells.
-
startingDay
¶ Day that starts the week.
-
workingDir
¶ Working (temporary) directory to use for backups.
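A short sketch of building options configuration programmatically and adding a command override; all of the values below are illustrative examples, not package defaults:

```python
from CedarBackup3.config import OptionsConfig

options = OptionsConfig(startingDay="monday", workingDir="/var/tmp/cback3",
                        backupUser="backup", backupGroup="backup",
                        rcpCommand="/usr/bin/scp -B")

# Only adds the override if one is not already configured for "cdrecord"
options.addOverride("cdrecord", "/usr/bin/cdrecord")
```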
-
class
CedarBackup3.config.
PeersConfig
(localPeers=None, remotePeers=None)[source]¶ Bases:
object
Class representing Cedar Backup global peer configuration.
This section contains a list of local and remote peers in a master’s backup pool. The section is optional. If a master does not define this section, then all peers are unmanaged, and the stage configuration section must explicitly list any peer that is to be staged. If this section is configured, then peers may be managed or unmanaged, and the stage section peer configuration (if any) completely overrides this configuration.
The following restrictions exist on data in this class:
- The list of local peers must contain only
LocalPeer
objects - The list of remote peers must contain only
RemotePeer
objects
Note: Lists within this class are “unordered” for equality comparisons.
-
__init__
(localPeers=None, remotePeers=None)[source]¶ Constructor for the
PeersConfig
class.Parameters: - localPeers – List of local peers
- remotePeers – List of remote peers
Raises: ValueError
– If one of the values is invalid
-
hasPeers
()[source]¶ Indicates whether any peers are filled into this object. :returns: Boolean true if any local or remote peers are filled in, false otherwise
-
localPeers
¶ List of local peers.
-
remotePeers
¶ List of remote peers.
- The list of local peers must contain only
-
class
CedarBackup3.config.
PostActionHook
(action=None, command=None)[source]¶ Bases:
CedarBackup3.config.ActionHook
Class representing a post-action hook associated with an action.
A hook associated with an action is a shell command to be executed either before or after a named action is executed. In this case, a post-action hook is executed after the named action.
The following restrictions exist on data in this class:
- The action name must be a non-empty string consisting of lower-case letters and digits.
- The shell command must be a non-empty string.
The internal after instance variable is always set to True in this class.
-
class
CedarBackup3.config.
PreActionHook
(action=None, command=None)[source]¶ Bases:
CedarBackup3.config.ActionHook
Class representing a pre-action hook associated with an action.
A hook associated with an action is a shell command to be executed either before or after a named action is executed. In this case, a pre-action hook is executed before the named action.
The following restrictions exist on data in this class:
- The action name must be a non-empty string consisting of lower-case letters and digits.
- The shell command must be a non-empty string.
The internal
before
instance variable is always set to True in this class.
-
class
CedarBackup3.config.
PurgeConfig
(purgeDirs=None)[source]¶ Bases:
object
Class representing a Cedar Backup purge configuration.
The following restrictions exist on data in this class:
- The purge directory list must be a list of
PurgeDir
objects.
For the
purgeDirs
list, validation is accomplished through theutil.ObjectTypeList
list implementation that overrides common list methods and transparently ensures that each element is aPurgeDir
.Note: Lists within this class are “unordered” for equality comparisons.
-
__init__
(purgeDirs=None)[source]¶ Constructor for the
Purge
class. :param purgeDirs: List of purge directoriesRaises: ValueError
– If one of the values is invalid
-
purgeDirs
¶ List of directories to purge.
- The purge directory list must be a list of
-
class
CedarBackup3.config.
PurgeDir
(absolutePath=None, retainDays=None)[source]¶ Bases:
object
Class representing a Cedar Backup purge directory.
The following restrictions exist on data in this class:
- The absolute path must be an absolute path
- The retain days value must be an integer >= 0.
-
__init__
(absolutePath=None, retainDays=None)[source]¶ Constructor for the
PurgeDir
class.Parameters: - absolutePath – Absolute path of the directory to be purged
- retainDays – Number of days content within directory should be retained
Raises: ValueError
– If one of the values is invalid
-
absolutePath
¶ Absolute path of directory to purge.
-
retainDays
¶ Number of days content within directory should be retained.
-
class
CedarBackup3.config.
ReferenceConfig
(author=None, revision=None, description=None, generator=None)[source]¶ Bases:
object
Class representing a Cedar Backup reference configuration.
The reference information is just used for saving off metadata about configuration and exists mostly for backwards-compatibility with Cedar Backup 1.x.
-
__init__
(author=None, revision=None, description=None, generator=None)[source]¶ Constructor for the
ReferenceConfig
class.Parameters: - author – Author of the configuration file
- revision – Revision of the configuration file
- description – Description of the configuration file
- generator – Tool that generated the configuration file
Author of the configuration file.
-
description
¶ Description of the configuration file.
-
generator
¶ Tool that generated the configuration file.
-
revision
¶ Revision of the configuration file.
-
-
class
CedarBackup3.config.
RemotePeer
(name=None, collectDir=None, remoteUser=None, rcpCommand=None, rshCommand=None, cbackCommand=None, managed=False, managedActions=None, ignoreFailureMode=None)[source]¶ Bases:
object
Class representing a Cedar Backup peer.
The following restrictions exist on data in this class:
- The peer name must be a non-empty string.
- The collect directory must be an absolute path.
- The remote user must be a non-empty string.
- The rcp command must be a non-empty string.
- The rsh command must be a non-empty string.
- The cback command must be a non-empty string.
- Any managed action name must be a non-empty string matching
ACTION_NAME_REGEX
- The ignore failure mode must be one of the values in
VALID_FAILURE_MODES
.
-
__init__
(name=None, collectDir=None, remoteUser=None, rcpCommand=None, rshCommand=None, cbackCommand=None, managed=False, managedActions=None, ignoreFailureMode=None)[source]¶ Constructor for the
RemotePeer
class.Parameters: - name – Name of the peer, must be a valid hostname
- collectDir – Collect directory to stage files from on peer
- remoteUser – Name of backup user on remote peer
- rcpCommand – Overridden rcp-compatible copy command for peer
- rshCommand – Overridden rsh-compatible remote shell command for peer
- cbackCommand – Overridden cback-compatible command to use on remote peer
- managed – Indicates whether this is a managed peer
- managedActions – Overridden set of actions that are managed on the peer
- ignoreFailureMode – Ignore failure mode for peer
Raises: ValueError
– If one of the values is invalid
-
cbackCommand
¶ Overridden cback-compatible command to use on remote peer.
-
collectDir
¶ Collect directory to stage files from on peer.
-
ignoreFailureMode
¶ Ignore failure mode for peer.
-
managed
¶ Indicates whether this is a managed peer.
-
managedActions
¶ Overridden set of actions that are managed on the peer.
-
name
¶ Name of the peer, must be a valid hostname.
-
rcpCommand
¶ Overridden rcp-compatible copy command for peer.
-
remoteUser
¶ Name of backup user on remote peer.
-
rshCommand
¶ Overridden rsh-compatible remote shell command for peer.
-
class
CedarBackup3.config.
StageConfig
(targetDir=None, localPeers=None, remotePeers=None)[source]¶ Bases:
object
Class representing a Cedar Backup stage configuration.
The following restrictions exist on data in this class:
- The target directory must be an absolute path
- The list of local peers must contain only
LocalPeer
objects - The list of remote peers must contain only
RemotePeer
objects
Note: Lists within this class are “unordered” for equality comparisons.
-
__init__
(targetDir=None, localPeers=None, remotePeers=None)[source]¶ Constructor for the
StageConfig
class.Parameters: - targetDir – Directory to stage files into, by peer name
- localPeers – List of local peers
- remotePeers – List of remote peers
Raises: ValueError
– If one of the values is invalid
-
hasPeers
()[source]¶ Indicates whether any peers are filled into this object. :returns: Boolean true if any local or remote peers are filled in, false otherwise
-
localPeers
¶ List of local peers.
-
remotePeers
¶ List of remote peers.
-
targetDir
¶ Directory to stage files into, by peer name.
-
class
CedarBackup3.config.
StoreConfig
(sourceDir=None, mediaType=None, deviceType=None, devicePath=None, deviceScsiId=None, driveSpeed=None, checkData=False, warnMidnite=False, noEject=False, checkMedia=False, blankBehavior=None, refreshMediaDelay=None, ejectDelay=None)[source]¶ Bases:
object
Class representing a Cedar Backup store configuration.
The following restrictions exist on data in this class:
- The source directory must be an absolute path.
- The media type must be one of the values in
VALID_MEDIA_TYPES
. - The device type must be one of the values in
VALID_DEVICE_TYPES
. - The device path must be an absolute path.
- The SCSI id, if provided, must be in the form specified by
validateScsiId
. - The drive speed must be an integer >= 1
- The blanking behavior must be a
BlankBehavior
object - The refresh media delay must be an integer >= 0
- The eject delay must be an integer >= 0
Note that although the blanking factor must be a positive floating point number, it is stored as a string. This is done so that we can losslessly go back and forth between XML and object representations of configuration.
-
__init__
(sourceDir=None, mediaType=None, deviceType=None, devicePath=None, deviceScsiId=None, driveSpeed=None, checkData=False, warnMidnite=False, noEject=False, checkMedia=False, blankBehavior=None, refreshMediaDelay=None, ejectDelay=None)[source]¶ Constructor for the
StoreConfig
class.Parameters: - sourceDir – Directory whose contents should be written to media
- mediaType – Type of the media (see notes above)
- deviceType – Type of the device (optional, see notes above)
- devicePath – Filesystem device name for writer device, i.e.
/dev/cdrw
- deviceScsiId – SCSI id for writer device, i.e.
[<method>:]scsibus,target,lun
- driveSpeed – Speed of the drive, i.e.
2
for 2x drive, etc - checkData – Whether resulting image should be validated
- checkMedia – Whether media should be checked before being written to
- warnMidnite – Whether to generate warnings for crossing midnite
- noEject – Indicates that the writer device should not be ejected
- blankBehavior – Controls optimized blanking behavior
- refreshMediaDelay – Delay, in seconds, to add after refreshing media
- ejectDelay – Delay, in seconds, to add after ejecting media before closing the tray
Raises: ValueError
– If one of the values is invalid
-
blankBehavior
¶ Controls optimized blanking behavior.
-
checkData
¶ Whether resulting image should be validated.
-
checkMedia
¶ Whether media should be checked before being written to.
-
devicePath
¶ Filesystem device name for writer device.
-
deviceScsiId
¶ SCSI id for writer device (optional, see notes above).
-
deviceType
¶ Type of the device (optional, see notes above).
-
driveSpeed
¶ Speed of the drive.
-
ejectDelay
¶ Delay, in seconds, to add after ejecting media before closing the tray
-
mediaType
¶ Type of the media (see notes above).
-
noEject
¶ Indicates that the writer device should not be ejected.
-
refreshMediaDelay
¶ Delay, in seconds, to add after refreshing media.
-
sourceDir
¶ Directory whose contents should be written to media.
-
warnMidnite
¶ Whether to generate warnings for crossing midnite.
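A minimal sketch of building a store configuration for a DVD writer follows. The mediaType and deviceType strings are assumptions; the authoritative values live in VALID_MEDIA_TYPES and VALID_DEVICE_TYPES, and all paths are hypothetical.

from CedarBackup3.config import StoreConfig

store = StoreConfig(sourceDir="/opt/backup/staging",   # must be an absolute path
                    mediaType="dvd+rw",                # assumed entry in VALID_MEDIA_TYPES
                    deviceType="dvdwriter",            # assumed entry in VALID_DEVICE_TYPES
                    devicePath="/dev/dvd",             # must be an absolute path
                    driveSpeed=2,                      # integer >= 1
                    checkData=True,
                    checkMedia=True)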
-
CedarBackup3.config.
addByteQuantityNode
(xmlDom, parentNode, nodeName, byteQuantity)[source]¶ Adds a text node as the next child of a parent, to contain a byte size.
If the
byteQuantity
is None, then the node will be created, but will be empty (i.e. will contain no text node child).The size in bytes will be normalized. If it is larger than 1.0 GB, it will be shown in GB (“1.0 GB”). If it is larger than 1.0 MB, it will be shown in MB (“1.0 MB”). Otherwise, it will be shown in bytes (“423413”).
Parameters: - xmlDom – DOM tree as from
impl.createDocument()
- parentNode – Parent node to create child for
- nodeName – Name of the new container node
- byteQuantity – ByteQuantity object to put into the XML document
Returns: Reference to the newly-created node
-
CedarBackup3.config.
readByteQuantity
(parent, name)[source]¶ Read a byte size value from an XML document.
A byte size value is an interpreted string value. If the string value ends with “MB” or “GB”, then the string before that is interpreted as megabytes or gigabytes. Otherwise, it is interpreted as bytes.
Parameters: - parent – Parent node to search beneath
- name – Name of node to search for
Returns: ByteQuantity parsed from XML document
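A hedged sketch of reading such a value with xml.dom.minidom follows; the element name and the exact text format (including whether a space precedes the unit) are assumptions, but the parsing rule is the one described above.

from xml.dom.minidom import parseString
from CedarBackup3.config import readByteQuantity

dom = parseString("<store><capacity>2.5 GB</capacity></store>")   # hypothetical document
quantity = readByteQuantity(dom.documentElement, "capacity")       # search beneath <store>
print(quantity)   # ByteQuantity equivalent to 2.5 gigabytes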
CedarBackup3.customize module¶
Implements customized behavior.
Some behaviors need to vary when packaged for certain platforms. For instance, while Cedar Backup generally uses cdrecord and mkisofs, Debian ships compatible utilities called wodim and genisoimage. I want there to be one single place where Cedar Backup is patched for Debian, rather than having to maintain a variety of patches in different places.
author: | Kenneth J. Pronovici <pronovic@ieee.org> |
---|
-
CedarBackup3.customize.
customizeOverrides
(config, platform='standard')[source]¶ Modify command overrides based on the configured platform.
On some platforms, we want to add command overrides to configuration. Each override will only be added if the configuration does not already contain an override with the same name. That way, the user still has a way to choose their own version of the command if they want.
Parameters: - config – Configuration to modify
- platform – Platform that is in use
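A small sketch of how a platform-specific build might apply this; the platform string "debian" and the Config(xmlPath=...) constructor are assumptions, since neither is documented in this module.

from CedarBackup3.config import Config
from CedarBackup3.customize import customizeOverrides

config = Config(xmlPath="/etc/cback3.conf")      # hypothetical configuration file
customizeOverrides(config, platform="debian")    # adds wodim/genisoimage overrides unless already configured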
CedarBackup3.filesystem module¶
Provides filesystem-related objects. :author: Kenneth J. Pronovici <pronovic@ieee.org>
-
class
CedarBackup3.filesystem.
BackupFileList
[source]¶ Bases:
CedarBackup3.filesystem.FilesystemList
List of files to be backed up.
A BackupFileList is a
FilesystemList
containing a list of files to be backed up. It only contains files, not directories (soft links are treated like files). On top of the generic functionality provided byFilesystemList
, this class adds functionality to keep a hash (checksum) for each file in the list, and it also provides a method to calculate the total size of the files in the list and a way to export the list into tar form.-
addDir
(path)[source]¶ Adds a directory to the list.
Note that this class does not allow directories to be added by themselves (a backup list contains only files). However, since links to directories are technically files, we allow them to be added.
This method is implemented in terms of the superclass method, with one additional validation: the superclass method is only called if the passed-in path is both a directory and a link. All of the superclass’s existing validations and restrictions apply.
Parameters: path (String representing a path on disk) – Directory path to be added to the list
Returns: Number of items added to the list
Raises: ValueError
– If path is not a directory or does not existValueError
– If the path could not be encoded properly
-
generateDigestMap
(stripPrefix=None)[source]¶ Generates a mapping from file to file digest.
Currently, the digest is an SHA hash, which should be pretty secure. In the future, this might be a different kind of hash, but we guarantee that the type of the hash will not change unless the library major version number is bumped.
Entries which do not exist on disk are ignored.
Soft links are ignored. We would end up generating a digest for the file that the soft link points at, which doesn’t make any sense.
If
stripPrefix
is passed in, then that prefix will be stripped from each key when the map is generated. This can be useful in generating two “relative” digest maps to be compared to one another.Parameters: stripPrefix (String with any contents) – Common prefix to be stripped from paths Returns: Dictionary mapping file to digest value @see:
removeUnchanged
-
generateFitted
(capacity, algorithm='worst_fit')[source]¶ Generates a list of items that fit in the indicated capacity.
Sometimes, callers would like to include every item in a list, but are unable to because not all of the items fit in the space available. This method returns a copy of the list, containing only the items that fit in a given capacity. A copy is returned so that we don’t lose any information if for some reason the fitted list is unsatisfactory.
The fitting is done using the functions in the knapsack module. By default, the worst fit algorithm is used, but you can also choose from first fit, best fit and alternate fit.
Parameters: - capacity (Integer, in bytes) – Maximum capacity among the files in the new list
- algorithm (One of "first_fit", "best_fit", "worst_fit", "alternate_fit") – Knapsack (fit) algorithm to use
Returns: Copy of list with total size no larger than indicated capacity
Raises: ValueError
– If the algorithm is invalid
-
generateSizeMap
()[source]¶ Generates a mapping from file to file size in bytes. The mapping does include soft links, which are listed with size zero. Entries which do not exist on disk are ignored. :returns: Dictionary mapping file to file size
-
generateSpan
(capacity, algorithm='worst_fit')[source]¶ Splits the list of items into sub-lists that fit in a given capacity.
Sometimes, callers need to split a backup file list into a set of smaller lists. For instance, you could use this to “span” the files across a set of discs.
The fitting is done using the functions in the knapsack module. By default, the worst fit algorithm is used, but you can also choose from first fit, best fit and alternate fit.
Note: If any of your items are larger than the capacity, then it won’t be possible to find a solution. In this case, a ValueError will be raised.
Parameters: - capacity (Integer, in bytes) – Maximum capacity among the files in the new list
- algorithm (One of "first_fit", "best_fit", "worst_fit", "alternate_fit") – Knapsack (fit) algorithm to use
Returns: List of
SpanItem
objectsRaises: ValueError
– If the algorithm is invalidValueError
– If it’s not possible to fit some items
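For example, a collect directory might be split across several 650 MB discs roughly like this (the path and capacity are hypothetical):

from CedarBackup3.filesystem import BackupFileList

flist = BackupFileList()
flist.addDirContents("/opt/backup/collect")    # hypothetical directory

capacity = 650 * 1024 * 1024                   # one CD worth of bytes, roughly
for span in flist.generateSpan(capacity, algorithm="worst_fit"):
    # each SpanItem carries the file list for one disc plus its size and utilization
    print(len(span.fileList), span.size, span.utilization)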
-
generateTarfile
(path, mode='tar', ignore=False, flat=False)[source]¶ Creates a tar file containing the files in the list.
By default, this method will create uncompressed tar files. If you pass in mode
'targz'
, then it will create gzipped tar files, and if you pass in mode'tarbz2'
, then it will create bzipped tar files.The tar file will be created as a GNU tar archive, which enables extended file name lengths, etc. Since GNU tar is so prevalent, I’ve decided that the extra functionality outweighs the disadvantage of not being “standard”.
If you pass in
flat=True
, then a “flat” archive will be created, and all of the files will be added to the root of the archive. So, the file/tmp/something/whatever.txt
would be added as justwhatever.txt
.By default, the whole method call fails if there are problems adding any of the files to the archive, resulting in an exception. Under these circumstances, callers are advised that they might want to call
removeInvalid
and then attempt to build the tar file a second time, since the most common cause of failures is a missing file (a file that existed when the list was built, but is gone again by the time the tar file is built).If you want to, you can pass in
ignore=True
, and the method will ignore errors encountered when adding individual files to the archive (but not errors opening and closing the archive itself).We’ll always attempt to remove the tarfile from disk if an exception is thrown.
Note: No validation is done as to whether the entries in the list are files, since only files or soft links should be in an object like this. However, to be safe, everything is explicitly added to the tar archive non-recursively so it’s safe to include soft links to directories.
Note: The Python
tarfile
module, which is used internally here, is supposed to deal properly with long filenames and links. In my testing, I have found that it appears to be able to add really long filenames to archives, but doesn’t do a good job reading them back out, even out of an archive it created. Fortunately, all Cedar Backup does is add files to archives.Parameters: - path (String representing a path on disk) – Path of tar file to create on disk
- mode (One of either
'tar'
,'targz'
or'tarbz2'
) – Tar creation mode - ignore (Boolean) – Indicates whether to ignore certain errors
- flat (Boolean) – Creates “flat” archive by putting all items in root
Raises: ValueError
– If mode is not validValueError
– If list is emptyValueError
– If the path could not be encoded properlyTarError
– If there is a problem creating the tar file
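A short sketch of writing the list out as a gzipped archive (paths are hypothetical):

from CedarBackup3.filesystem import BackupFileList

flist = BackupFileList()
flist.addDirContents("/home/user/data")        # hypothetical directory
flist.removeInvalid()                          # drop entries that no longer exist on disk
flist.generateTarfile("/tmp/data.tar.gz", mode="targz", ignore=True)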
-
removeUnchanged
(digestMap, captureDigest=False)[source]¶ Removes unchanged entries from the list.
This method relies on a digest map as returned from
generateDigestMap
. For each entry indigestMap
, if the entry also exists in the current list and the entry in the current list has the same digest value as in the map, the entry in the current list will be removed.This method offers a convenient way for callers to filter unneeded entries from a list. The idea is that a caller will capture a digest map from
generateDigestMap
at some point in time (perhaps the beginning of the week), and will save off that map usingpickle
or some other method. Then, the caller could use this method sometime in the future to filter out any unchanged files based on the saved-off map.If
captureDigest
is passed-in asTrue
, then digest information will be captured for the entire list before the removal step occurs using the same rules as ingenerateDigestMap
. The check will involve a lookup into the complete digest map.If
captureDigest
is passed in asFalse
, we will only generate a digest value for files we actually need to check, and we’ll ignore any entry in the list which isn’t a file that currently exists on disk.The return value varies depending on
captureDigest
, as well. To preserve backwards compatibility, ifcaptureDigest
isFalse
, then we’ll just return a single value representing the number of entries removed. Otherwise, we’ll return a tuple of C{(entries removed, digest map)}. The returned digest map will be in exactly the form returned bygenerateDigestMap
.Note: For performance reasons, this method actually ends up rebuilding the list from scratch. First, we build a temporary dictionary containing all of the items from the original list. Then, we remove items as needed from the dictionary (which is faster than the equivalent operation on a list). Finally, we replace the contents of the current list based on the keys left in the dictionary. This should be transparent to the caller.
Parameters: - digestMap (Map as returned from
generateDigestMap
) – Dictionary mapping file name to digest value - captureDigest (Boolean) – Indicates that digest information should be captured
Returns: Results as discussed above (format varies based on arguments)
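Together with generateDigestMap, this supports a simple incremental scheme: capture and save a digest map at the start of the week, then filter unchanged files on later days. A minimal sketch, with hypothetical paths:

import pickle
from CedarBackup3.filesystem import BackupFileList

# Weekly run: remember the digest of everything that was backed up.
weekly = BackupFileList()
weekly.addDirContents("/home/user/data")
with open("/var/lib/backup/digests.pickle", "wb") as f:
    pickle.dump(weekly.generateDigestMap(), f)

# Daily run: drop anything whose digest has not changed since then.
daily = BackupFileList()
daily.addDirContents("/home/user/data")
with open("/var/lib/backup/digests.pickle", "rb") as f:
    removed = daily.removeUnchanged(pickle.load(f))   # single count, since captureDigest=False
print("%d unchanged files filtered out" % removed)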
-
-
class
CedarBackup3.filesystem.
FilesystemList
[source]¶ Bases:
list
Represents a list of filesystem items.
This is a generic class that represents a list of filesystem items. Callers can add individual files or directories to the list, or can recursively add the contents of a directory. The class also allows for up-front exclusions in several forms (all files, all directories, all items matching a pattern, all items whose basename matches a pattern, or all directories containing a specific “ignore file”). Symbolic links are typically backed up non-recursively, i.e. the link to a directory is backed up, but not the contents of that link (we don’t want to deal with recursive loops, etc.).
The custom methods such as
addFile
will only add items if they exist on the filesystem and do not match any exclusions that are already in place. However, since a FilesystemList is a subclass of Python’s standard list class, callers can also add items to the list in the usual way, using methods likeappend()
orinsert()
. No validations apply to items added to the list in this way; however, many list-manipulation methods deal “gracefully” with items that don’t exist in the filesystem, often by ignoring them.Once a list has been created, callers can remove individual items from the list using standard methods like
pop()
orremove()
or they can use custom methods to remove specific types of entries or entries which match a particular pattern.Note: Regular expression patterns that apply to paths are assumed to be bounded at front and back by the beginning and end of the string, i.e. they are treated as if they begin with
^
and end with$
. This is true whether we are matching a complete path or a basename.-
addDir
(path)[source]¶ Adds a directory to the list.
The path must exist and must be a directory or a link to an existing directory. It will be added to the list subject to any exclusions that are in place. The
ignoreFile
does not apply to this method, only toaddDirContents
.Parameters: path (String representing a path on disk) – Directory path to be added to the list
Returns: Number of items added to the list
Raises: ValueError
– If path is not a directory or does not existValueError
– If the path could not be encoded properly
-
addDirContents
(path, recursive=True, addSelf=True, linkDepth=0, dereference=False)[source]¶ Adds the contents of a directory to the list.
The path must exist and must be a directory or a link to a directory. The contents of the directory (as well as the directory path itself) will be recursively added to the list, subject to any exclusions that are in place. If you only want the directory and its immediate contents to be added, then pass in
recursive=False
.Note: If a directory’s absolute path matches an exclude pattern or path, or if the directory contains the configured ignore file, then the directory and all of its contents will be recursively excluded from the list.
Note: If the passed-in directory happens to be a soft link, it will be recursed. However, the linkDepth parameter controls whether any soft links within the directory will be recursed. The link depth is the maximum depth of the tree at which soft links should be followed. So, a depth of 0 does not follow any soft links, a depth of 1 follows only links within the passed-in directory, a depth of 2 follows the links at the next level down, etc.
Note: Any invalid soft links (i.e. soft links that point to non-existent items) will be silently ignored.
Note: The
excludeDirs
flag only controls whether any given directory path itself is added to the list once it has been discovered. It does not modify any behavior related to directory recursion.Note: If you call this method on a link to a directory that link will never be dereferenced (it may, however, be followed).
Parameters: - path (String representing a path on disk) – Directory path whose contents should be added to the list
- recursive (Boolean value) – Indicates whether directory contents should be added recursively
- addSelf (Boolean value) – Indicates whether the directory itself should be added to the list
- linkDepth (Integer value) – Maximum depth of the tree at which soft links should be followed, zero means not to follow
- dereference (Boolean value) – Indicates whether soft links, if followed, should be dereferenced
Returns: Number of items recursively added to the list
Raises: ValueError
– If path is not a directory or does not existValueError
– If the path could not be encoded properly
-
addFile
(path)[source]¶ Adds a file to the list.
The path must exist and must be a file or a link to an existing file. It will be added to the list subject to any exclusions that are in place.
Parameters: path (String representing a path on disk) – File path to be added to the list
Returns: Number of items added to the list
Raises: ValueError
– If path is not a file or does not existValueError
– If the path could not be encoded properly
-
excludeBasenamePatterns
¶ List of regular expression patterns (matching basename) to be excluded.
-
excludeDirs
¶ Boolean indicating whether directories should be excluded.
-
excludeFiles
¶ Boolean indicating whether files should be excluded.
-
excludeLinks
¶ Boolean indicating whether soft links should be excluded.
-
excludePaths
¶ List of absolute paths to be excluded.
-
excludePatterns
¶ List of regular expression patterns (matching complete path) to be excluded.
-
ignoreFile
¶ Name of file which will cause directory contents to be ignored.
-
removeDirs
(pattern=None)[source]¶ Removes directory entries from the list.
If
pattern
is not passed in or isNone
, then all directory entries will be removed from the list. Otherwise, only those directory entries matching the pattern will be removed. Any entry which does not exist on disk will be ignored (useremoveInvalid
to purge those entries).This method might be fairly slow for large lists, since it must check the type of each item in the list. If you know ahead of time that you want to exclude all directories, then you will be better off setting
excludeDirs
toTrue
before adding items to the list (note that this will not prevent you from recursively adding the contents of directories).Parameters: pattern – Regular expression pattern representing entries to remove Returns: Number of entries removed Raises: ValueError
– If the passed-in pattern is not a valid regular expression
-
removeFiles
(pattern=None)[source]¶ Removes file entries from the list.
If
pattern
is not passed in or isNone
, then all file entries will be removed from the list. Otherwise, only those file entries matching the pattern will be removed. Any entry which does not exist on disk will be ignored (useremoveInvalid
to purge those entries).This method might be fairly slow for large lists, since it must check the type of each item in the list. If you know ahead of time that you want to exclude all files, then you will be better off setting
excludeFiles
toTrue
before adding items to the list.Parameters: pattern – Regular expression pattern representing entries to remove Returns: Number of entries removed Raises: ValueError
– If the passed-in pattern is not a valid regular expression
-
removeInvalid
()[source]¶ Removes from the list all entries that do not exist on disk.
This method removes from the list all entries which do not currently exist on disk in some form. No attention is paid to whether the entries are files or directories.
Returns: Number of entries removed
-
removeLinks
(pattern=None)[source]¶ Removes soft link entries from the list.
If
pattern
is not passed in or isNone
, then all soft link entries will be removed from the list. Otherwise, only those soft link entries matching the pattern will be removed. Any entry which does not exist on disk will be ignored (useremoveInvalid
to purge those entries).This method might be fairly slow for large lists, since it must check the type of each item in the list. If you know ahead of time that you want to exclude all soft links, then you will be better off setting
excludeLinks
toTrue
before adding items to the list.Parameters: pattern – Regular expression pattern representing entries to remove Returns: Number of entries removed Raises: ValueError
– If the passed-in pattern is not a valid regular expression
-
removeMatch
(pattern)[source]¶ Removes from the list all entries matching a pattern.
This method removes from the list all entries which match the passed in
pattern
. Since there is no need to check the type of each entry, it is faster to call this method than to call theremoveFiles
,removeDirs
orremoveLinks
methods individually. If you know which patterns you will want to remove ahead of time, you may be better off settingexcludePatterns
orexcludeBasenamePatterns
before adding items to the list.Note: Unlike when using the exclude lists, the pattern here is not bounded at the front and the back of the string. You can use any pattern you want.
Parameters: pattern – Regular expression pattern representing entries to remove Returns: Number of entries removed Raises: ValueError
– If the passed-in pattern is not a valid regular expression
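Putting these pieces together, typical usage configures exclusions first and prunes afterwards. The paths and patterns below are hypothetical, and assigning plain Python lists to the exclude properties is an assumption about their setters.

from CedarBackup3.filesystem import FilesystemList

flist = FilesystemList()
flist.ignoreFile = ".cbignore"                    # skip directories containing this file
flist.excludePatterns = [r"/home/user/tmp/.*"]    # bounded by ^ and $ as noted above
flist.excludeBasenamePatterns = [r".*\.o"]
flist.addDirContents("/home/user")                # recursive add, subject to the exclusions
flist.removeLinks()                               # drop any soft links that were added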
-
-
class
CedarBackup3.filesystem.
PurgeItemList
[source]¶ Bases:
CedarBackup3.filesystem.FilesystemList
List of files and directories to be purged.
A PurgeItemList is a
FilesystemList
containing a list of files and directories to be purged. On top of the generic functionality provided byFilesystemList
, this class adds functionality to remove items that are too young to be purged, and to actually remove each item in the list from the filesystem.The other main difference is that when you add a directory’s contents to a purge item list, the directory itself is not added to the list. This way, if someone asks to purge within
/opt/backup/collect
, that directory doesn’t get removed once all of the files within it are gone.-
addDirContents
(path, recursive=True, addSelf=True, linkDepth=0, dereference=False)[source]¶ Adds the contents of a directory to the list.
The path must exist and must be a directory or a link to a directory. The contents of the directory (but not the directory path itself) will be recursively added to the list, subject to any exclusions that are in place. If you only want the immediate contents of the directory to be added, then pass in
recursive=False
.Note: If a directory’s absolute path matches an exclude pattern or path, or if the directory contains the configured ignore file, then the directory and all of its contents will be recursively excluded from the list.
Note: If the passed-in directory happens to be a soft link, it will be recursed. However, the linkDepth parameter controls whether any soft links within the directory will be recursed. The link depth is the maximum depth of the tree at which soft links should be followed. So, a depth of 0 does not follow any soft links, a depth of 1 follows only links within the passed-in directory, a depth of 2 follows the links at the next level down, etc.
Note: Any invalid soft links (i.e. soft links that point to non-existent items) will be silently ignored.
Note: The
excludeDirs
flag only controls whether any given soft link path itself is added to the list once it has been discovered. It does not modify any behavior related to directory recursion.Note: The
excludeDirs
flag only controls whether any given directory path itself is added to the list once it has been discovered. It does not modify any behavior related to directory recursion.Note: If you call this method on a link to a directory that link will never be dereferenced (it may, however, be followed).
Parameters: - path (String representing a path on disk) – Directory path whose contents should be added to the list
- recursive (Boolean value) – Indicates whether directory contents should be added recursively
- addSelf – Ignored in this subclass
- linkDepth (Integer value, where zero means not to follow any soft links) – Depth of soft links that should be followed
- dereference (Boolean value) – Indicates whether soft links, if followed, should be dereferenced
Returns: Number of items recursively added to the list
Raises: ValueError
– If path is not a directory or does not existValueError
– If the path could not be encoded properly
-
purgeItems
()[source]¶ Purges all items in the list.
Every item in the list will be purged. Directories in the list will not be purged recursively, and hence will only be removed if they are empty. Errors will be ignored.
To facilitate easy removal of directories that will end up being empty, the delete process happens in two passes: files first (including soft links), then directories.
Returns: Tuple containing count of (files, dirs) removed
-
removeYoungFiles
(daysOld)[source]¶ Removes from the list files younger than a certain age (in days).
Any file whose “age” in days is less than (
<
) the value of thedaysOld
parameter will be removed from the list so that it will not be purged later whenpurgeItems
is called. Directories and soft links will be ignored.The “age” of a file is the amount of time since the file was last used, per the most recent of the file’s
st_atime
andst_mtime
values.Note: Some people find the “sense” of this method confusing or “backwards”. Keep in mind that this method is used to remove items from the list, not from the filesystem! It removes from the list those items that you would not want to purge because they are too young. As an example, passing in
daysOld
of zero (0) would remove from the list no files, which would result in purging all of the files later. I would be happy to make a synonym of this method with an easier-to-understand “sense”, if someone can suggest one.Parameters: daysOld (Integer value >= 0) – Minimum age of files that are to be kept in the list Returns: Number of entries removed
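A short sketch of the purge workflow described above (the staging path and retention period are hypothetical):

from CedarBackup3.filesystem import PurgeItemList

plist = PurgeItemList()
plist.addDirContents("/opt/backup/staging")   # the directory itself is not added
plist.removeYoungFiles(7)                     # keep anything used within the last week
files, dirs = plist.purgeItems()              # files first, then any now-empty directories
print("purged %d files and %d directories" % (files, dirs))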
-
-
class
CedarBackup3.filesystem.
SpanItem
(fileList, size, capacity, utilization)[source]¶ Bases:
object
Item returned by
BackupFileList.generateSpan
.
-
CedarBackup3.filesystem.
compareContents
(path1, path2, verbose=False)[source]¶ Compares the contents of two directories to see if they are equivalent.
The two directories are recursively compared. First, we check whether they contain exactly the same set of files. Then, we check to see that every given file has exactly the same contents in both directories.
This is all relatively simple to implement through the magic of
BackupFileList.generateDigestMap
, which knows how to strip a path prefix off the front of each entry in the mapping it generates. This makes our comparison as simple as creating a list for each path, then generating a digest map for each path and comparing the two.If no exception is thrown, the two directories are considered identical.
If the
verbose
flag isTrue
, then an alternate (but slower) method is used so that any thrown exception can indicate exactly which file caused the comparison to fail. The thrownValueError
exception distinguishes between the directories containing different files, and containing the same files with differing content.Note: Symlinks are not followed for the purposes of this comparison.
Parameters: - path1 (String representing a path on disk) – First path to compare
- path2 (String representing a path on disk) – Second path to compare
- verbose (Boolean) – Indicates whether a verbose response should be given
Raises: ValueError
– If a directory doesn’t exist or can’t be readValueError
– If the two directories are not equivalentIOError
– If there is an unusual problem reading the directories
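Because a mismatch is reported by exception, callers usually wrap the comparison (directories below are hypothetical):

from CedarBackup3.filesystem import compareContents

try:
    compareContents("/opt/backup/collect", "/opt/backup/staging/host1", verbose=True)
    print("directories are equivalent")
except ValueError as e:
    print("directories differ: %s" % e)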
-
CedarBackup3.filesystem.
compareDigestMaps
(digest1, digest2, verbose=False)[source]¶ Compares two digest maps and throws an exception if they differ.
Parameters: - digest1 (Digest as returned from BackupFileList.generateDigestMap()) – First digest to compare
- digest2 (Digest as returned from BackupFileList.generateDigestMap()) – Second digest to compare
- verbose (Boolean) – Indicates whether a verbose response should be given
Raises: ValueError
– If the two directories are not equivalent
-
CedarBackup3.filesystem.
normalizeDir
(path)[source]¶ Normalizes a directory name.
For our purposes, a directory name is normalized by removing the trailing path separator, if any. This is important because we want directories to appear within lists in a consistent way, although from the user’s perspective passing in
/path/to/dir/
and/path/to/dir
are equivalent.Parameters: path (String representing a path on disk) – Path to be normalized Returns: Normalized path, which should be equivalent to the original
CedarBackup3.image module¶
Provides interface backwards compatibility.
In Cedar Backup 2.10.0, a refactoring effort took place while adding code to support DVD hardware. All of the writer functionality was moved to the writers/ package. This mostly-empty file remains to preserve the Cedar Backup library interface.
author: | Kenneth J. Pronovici <pronovic@ieee.org> |
---|
CedarBackup3.knapsack module¶
Provides the implementation for various knapsack algorithms.
Knapsack algorithms are “fit” algorithms, used to take a set of “things” and decide on the optimal way to fit them into some container. The focus of this code is to fit files onto a disc, although the interface (in terms of item, item size and capacity size, with no units) is generic enough that it can be applied to items other than files.
All of the algorithms implemented below assume that “optimal” means “use up as much of the disc’s capacity as possible”, but each produces slightly different results. For instance, the best fit and first fit algorithms tend to include fewer files than the worst fit and alternate fit algorithms, even if they use the disc space more efficiently.
Usually, for a given set of circumstances, it will be obvious to a human which algorithm is the right one to use, based on trade-offs between number of files included and ideal space utilization. It’s a little more difficult to do this programmatically. For Cedar Backup’s purposes (i.e. trying to fit a small number of collect-directory tarfiles onto a disc), worst-fit is probably the best choice if the goal is to include as many of the collect directories as possible.
author: | Kenneth J. Pronovici <pronovic@ieee.org> |
---|
-
CedarBackup3.knapsack.
alternateFit
(items, capacity)[source]¶ Implements the alternate-fit knapsack algorithm.
This algorithm (which I’m calling “alternate-fit” as in “alternate from one to the other”) tries to balance small and large items to achieve better end-of-disk performance. Instead of just working one direction through a list, it alternately works from the start and end of a sorted list (sorted from smallest to largest), throwing away any item which causes capacity to be exceeded. The algorithm tends to be slower than the best-fit and first-fit algorithms, and slightly faster than the worst-fit algorithm, probably because of the number of items it considers on average before completing. It often achieves slightly better capacity utilization than the worst-fit algorithm, while including slightly fewer items.
The “size” values in the items and capacity arguments must be comparable, but they are unitless from the perspective of this function. Zero-sized items and capacity are considered degenerate cases. If capacity is zero, no items fit, period, even if the items list contains zero-sized items.
The items dictionary is indexed by the item key, and each value tuple also repeats that key. This seems kind of strange at first glance, but it works this way to facilitate easy sorting of the list on key if needed.
The function assumes that the list of items may be used destructively, if needed. This avoids the overhead of having the function make a copy of the list, if this is not required. Callers should pass
items.copy()
if they do not want their version of the list modified.The function returns a list of chosen items and the unitless amount of capacity used by the items.
Parameters: - items (dictionary, keyed on item, of
item, size
tuples, item as string and size as integer) – Items to operate on - capacity (integer) – Capacity of container to fit to
Returns: Tuple
(items, used)
as described above
-
CedarBackup3.knapsack.
bestFit
(items, capacity)[source]¶ Implements the best-fit knapsack algorithm.
The best-fit algorithm proceeds through a sorted list of items (sorted from largest to smallest) until running out of items or meeting capacity exactly. If capacity is exceeded, the item that caused capacity to be exceeded is thrown away and the next one is tried. The algorithm effectively includes the minimum number of items possible in its search for optimal capacity utilization. For large lists of mixed-size items, it’s not unusual to see the algorithm achieve 100% capacity utilization by including fewer than 1% of the items. Probably because it often has to look at fewer of the items before completing, it tends to be a little faster than the worst-fit or alternate-fit algorithms.
The “size” values in the items and capacity arguments must be comparable, but they are unitless from the perspective of this function. Zero-sized items and capacity are considered degenerate cases. If capacity is zero, no items fit, period, even if the items list contains zero-sized items.
The items dictionary is indexed by the item key, and each value tuple also repeats that key. This seems kind of strange at first glance, but it works this way to facilitate easy sorting of the list on key if needed.
The function assumes that the list of items may be used destructively, if needed. This avoids the overhead of having the function make a copy of the list, if this is not required. Callers should pass
items.copy()
if they do not want their version of the list modified.The function returns a list of chosen items and the unitless amount of capacity used by the items.
Parameters: - items (dictionary, keyed on item, of
item, size
tuples, item as string and size as integer) – Items to operate on - capacity (integer) – Capacity of container to fit to
Returns: Tuple
(items, used)
as described above
-
CedarBackup3.knapsack.
firstFit
(items, capacity)[source]¶ Implements the first-fit knapsack algorithm.
The first-fit algorithm proceeds through an unsorted list of items until running out of items or meeting capacity exactly. If capacity is exceeded, the item that caused capacity to be exceeded is thrown away and the next one is tried. This algorithm generally performs more poorly than the other algorithms both in terms of capacity utilization and item utilization, but can be as much as an order of magnitude faster on large lists of items because it doesn’t require any sorting.
The “size” values in the items and capacity arguments must be comparable, but they are unitless from the perspective of this function. Zero-sized items and capacity are considered degenerate cases. If capacity is zero, no items fit, period, even if the items list contains zero-sized items.
The items dictionary is indexed by the item key, and each value tuple also repeats that key. This seems kind of strange at first glance, but it works this way to facilitate easy sorting of the list on key if needed.
The function assumes that the list of items may be used destructively, if needed. This avoids the overhead of having the function make a copy of the list, if this is not required. Callers should pass
items.copy()
if they do not want their version of the list modified.The function returns a list of chosen items and the unitless amount of capacity used by the items.
Parameters: - items (dictionary, keyed on item, of
item, size
tuples, item as string and size as integer) – Items to operate on - capacity (integer) – Capacity of container to fit to
Returns: Tuple
(items, used)
as described above
-
CedarBackup3.knapsack.
worstFit
(items, capacity)[source]¶ Implements the worst-fit knapsack algorithm.
The worst-fit algorithm proceeds through a sorted list of items (sorted from smallest to largest) until running out of items or meeting capacity exactly. If capacity is exceeded, the item that caused capacity to be exceeded is thrown away and the next one is tried. The algorithm effectively includes the maximum number of items possible in its search for optimal capacity utilization. It tends to be somewhat slower than either the best-fit or alternate-fit algorithm, probably because on average it has to look at more items before completing.
The “size” values in the items and capacity arguments must be comparable, but they are unitless from the perspective of this function. Zero-sized items and capacity are considered degenerate cases. If capacity is zero, no items fit, period, even if the items list contains zero-sized items.
The items dictionary is indexed by the item key, and each value tuple also repeats that key. This seems kind of strange at first glance, but it works this way to facilitate easy sorting of the list on key if needed.
The function assumes that the list of items may be used destructively, if needed. This avoids the overhead of having the function make a copy of the list, if this is not required. Callers should pass
items.copy()
if they do not want their version of the list modified.The function returns a list of chosen items and the unitless amount of capacity used by the items.
Parameters: - items (dictionary, keyed on item, of
item, size
tuples, item as string and size as integer) – Items to operate on - capacity (integer) – Capacity of container to fit to
Returns: Tuple
(items, used)
as described above
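As a concrete illustration of the shared interface, here is a small, self-contained run of worstFit; the item names and sizes are arbitrary and, as noted above, unitless.

from CedarBackup3.knapsack import worstFit

# Dictionary keyed on item, where each value is an (item, size) tuple.
items = {
    "a.tar.gz": ("a.tar.gz", 100),
    "b.tar.gz": ("b.tar.gz", 400),
    "c.tar.gz": ("c.tar.gz", 250),
}
chosen, used = worstFit(items, 500)
print(chosen, used)   # the chosen items and the total capacity they consume

The other three functions take the same arguments and return the same (items, used) tuple, so swapping algorithms is just a matter of calling a different function.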
CedarBackup3.peer module¶
Provides backup peer-related objects and utility functions.
Module Attributes¶
-
CedarBackup3.peer.
DEF_COLLECT_INDICATOR
¶ Name of the default collect indicator file
-
CedarBackup3.peer.
DEF_STAGE_INDICATOR
¶ Name of the default stage indicator file
author: | Kenneth J. Pronovici <pronovic@ieee.org> |
---|
-
class
CedarBackup3.peer.
LocalPeer
(name, collectDir, ignoreFailureMode=None)[source]¶ Bases:
object
Backup peer representing a local peer in a backup pool.
This is a class representing a local (non-network) peer in a backup pool. Local peers are backed up by simple filesystem copy operations. A local peer has associated with it a name (typically, but not necessarily, a hostname) and a collect directory.
The public methods other than the constructor are part of a “backup peer” interface shared with the
RemotePeer
class.-
__init__
(name, collectDir, ignoreFailureMode=None)[source]¶ Initializes a local backup peer.
Note that the collect directory must be an absolute path, but does not have to exist when the object is instantiated. We do a lazy validation on this value since we could (potentially) be creating peer objects before an ongoing backup completed.
Parameters: - name – Name of the backup peer
- collectDir – Path to the peer’s collect directory
- ignoreFailureMode – Ignore failure mode for this peer, one of
VALID_FAILURE_MODES
Raises: ValueError
– If the name is emptyValueError
– If collect directory is not an absolute path
-
checkCollectIndicator
(collectIndicator=None)[source]¶ Checks the collect indicator in the peer’s staging directory.
When a peer has completed collecting its backup files, it will write an empty indicator file into its collect directory. This method checks to see whether that indicator has been written. We’re “stupid” here - if the collect directory doesn’t exist, you’ll naturally get back
False
.If you need to, you can override the name of the collect indicator file by passing in a different name.
Parameters: collectIndicator – Name of the collect indicator file to check Returns: Boolean true/false depending on whether the indicator exists Raises: ValueError
– If a path cannot be encoded properly
-
collectDir
¶ Path to the peer’s collect directory (an absolute local path).
-
ignoreFailureMode
¶ Ignore failure mode for peer.
-
name
¶ Name of the peer.
-
stagePeer
(targetDir, ownership=None, permissions=None)[source]¶ Stages data from the peer into the indicated local target directory.
The collect and target directories must both already exist before this method is called. If passed in, ownership and permissions will be applied to the files that are copied.
Note: The caller is responsible for checking that the indicator exists, if they care. This function only stages the files within the directory.
Note: If you have user/group as strings, call the
util.getUidGid
function to get the associated uid/gid as an ownership tuple.Parameters: - targetDir – Target directory to write data into
- ownership – Owner and group that files should have, tuple of numeric
(uid, gid)
- permissions – Unix permissions mode that the staged files should have, in octal like
0640
Returns: Number of files copied from the source directory to the target directory
Raises: ValueError
– If collect directory is not a directory or does not existValueError
– If target directory is not a directory, does not exist or is not absoluteValueError
– If a path cannot be encoded properlyIOError
– If there were no files to stage (i.e. the directory was empty)IOError
– If there is an IO error copying a fileOSError
– If there is an OS error copying or changing permissions on a file
-
writeStageIndicator
(stageIndicator=None, ownership=None, permissions=None)[source]¶ Writes the stage indicator in the peer’s staging directory.
When the master has completed collecting its backup files, it will write an empty indicator file into the peer’s collect directory. The presence of this file implies that the staging process is complete.
If you need to, you can override the name of the stage indicator file by passing in a different name.
Note: If you have user/group as strings, call the
util.getUidGid
function to get the associated uid/gid as an ownership tuple.Parameters: - stageIndicator – Name of the indicator file to write
- ownership – Owner and group that files should have, tuple of numeric
(uid, gid)
- permissions – Unix permissions mode that the staged files should have, in octal like
0640
Raises: ValueError
– If collect directory is not a directory or does not existValueError
– If a path cannot be encoded properlyIOError
– If there is an IO error creating the fileOSError
– If there is an OS error creating or changing permissions on the file
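A rough sketch of staging a single local peer with the methods above; the paths are hypothetical, and the target directory is assumed to exist already.

from CedarBackup3.peer import LocalPeer

peer = LocalPeer("worker", "/opt/backup/collect")
if peer.checkCollectIndicator():                         # has the peer finished collecting?
    copied = peer.stagePeer("/opt/backup/staging/worker")
    peer.writeStageIndicator()                           # mark the peer as staged
    print("staged %d files" % copied)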
-
-
class
CedarBackup3.peer.
RemotePeer
(name=None, collectDir=None, workingDir=None, remoteUser=None, rcpCommand=None, localUser=None, rshCommand=None, cbackCommand=None, ignoreFailureMode=None)[source]¶ Bases:
object
Backup peer representing a remote peer in a backup pool.
This is a class representing a remote (networked) peer in a backup pool. Remote peers are backed up using an rcp-compatible copy command. A remote peer has associated with it a name (which must be a valid hostname), a collect directory, a working directory and a copy method (an rcp-compatible command).
You can also set an optional local user value. This username will be used as the local user for any remote copies that are required. It can only be used if the root user is executing the backup. The root user will
su
to the local user and execute the remote copies as that user.The copy method is associated with the peer and not with the actual request to copy, because we can envision that each remote host might have a different connect method.
The public methods other than the constructor are part of a “backup peer” interface shared with the
LocalPeer
class.-
__init__
(name=None, collectDir=None, workingDir=None, remoteUser=None, rcpCommand=None, localUser=None, rshCommand=None, cbackCommand=None, ignoreFailureMode=None)[source]¶ Initializes a remote backup peer.
Note: If provided, each command will eventually be parsed into a list of strings suitable for passing to
util.executeCommand
in order to avoid security holes related to shell interpolation. This parsing will be done by theutil.splitCommandLine
function. See the documentation for that function for some important notes about its limitations.Parameters: - name – Name of the backup peer, a valid DNS name
- collectDir – Path to the peer’s collect directory, absolute path
- workingDir – Working directory that can be used to create temporary files, etc, an absolute path
- remoteUser – Name of the Cedar Backup user on the remote peer
- localUser – Name of the Cedar Backup user on the current host
- rcpCommand – An rcp-compatible copy command to use for copying files from the peer
- rshCommand – An rsh-compatible command to use for remote shells to the peer
- cbackCommand – A cback-compatible command to use for executing managed actions
- ignoreFailureMode – Ignore failure mode for this peer, one of
VALID_FAILURE_MODES
Raises: ValueError
– If collect directory is not an absolute path
-
cbackCommand
¶ A cback-compatible command to use for executing managed actions.
-
checkCollectIndicator
(collectIndicator=None)[source]¶ Checks the collect indicator in the peer’s staging directory.
When a peer has completed collecting its backup files, it will write an empty indicator file into its collect directory. This method checks to see whether that indicator has been written. If the remote copy command fails, we return
False
as if the file weren’t there.If you need to, you can override the name of the collect indicator file by passing in a different name.
Note: Apparently, we can’t count on all rcp-compatible implementations to return sensible errors for some error conditions. As an example, the
scp
command in Debian ‘woody’ returns a zero (normal) status even when it can’t find a host or if the login or path is invalid. Because of this, the implementation of this method is rather convoluted.Parameters: collectIndicator – Name of the collect indicator file to check Returns: Boolean true/false depending on whether the indicator exists Raises: ValueError
– If a path cannot be encoded properly
-
collectDir
¶ Path to the peer’s collect directory (an absolute local path).
-
executeManagedAction
(action, fullBackup)[source]¶ Executes a managed action on this peer.
Parameters: - action – Name of the action to execute
- fullBackup – Whether a full backup should be executed
Raises: IOError
– If there is an error executing the action on the remote peer
-
executeRemoteCommand
(command)[source]¶ Executes a command on the peer via remote shell.
Parameters: command – Command to execute Raises: IOError
– If there is an error executing the command on the remote peer
-
ignoreFailureMode
¶ Ignore failure mode for peer.
-
localUser
¶ Name of the Cedar Backup user on the current host.
-
name
¶ Name of the peer (a valid DNS hostname).
-
rcpCommand
¶ An rcp-compatible copy command to use for copying files.
-
remoteUser
¶ Name of the Cedar Backup user on the remote peer.
-
rshCommand
¶ An rsh-compatible command to use for remote shells to the peer.
-
stagePeer
(targetDir, ownership=None, permissions=None)[source]¶ Stages data from the peer into the indicated local target directory.
The target directory must already exist before this method is called. If passed in, ownership and permissions will be applied to the files that are copied.
Note: The returned count of copied files might be inaccurate if some of the copied files already existed in the staging directory prior to the copy taking place. We don’t clear the staging directory first, because some extension might also be using it.
Note: If you have user/group as strings, call the
util.getUidGid
function to get the associated uid/gid as an ownership tuple.Note: Unlike the local peer version of this method, an I/O error might or might not be raised if the directory is empty. Since we’re using a remote copy method, we just don’t have the fine-grained control over our exceptions that’s available when we can look directly at the filesystem, and we can’t control whether the remote copy method thinks an empty directory is an error.
Parameters: - targetDir – Target directory to write data into
- ownership – Owner and group that files should have, tuple of numeric
(uid, gid)
- permissions – Unix permissions mode that the staged files should have, in octal like
0640
Returns: Number of files copied from the source directory to the target directory
Raises: ValueError
– If target directory is not a directory, does not exist or is not absoluteValueError
– If a path cannot be encoded properlyIOError
– If there were no files to stage (i.e. the directory was empty)IOError
– If there is an IO error copying a fileOSError
– If there is an OS error copying or changing permissions on a file
-
workingDir
¶ Path to the peer’s working directory (an absolute local path).
-
writeStageIndicator
(stageIndicator=None)[source]¶ Writes the stage indicator in the peer’s staging directory.
When the master has completed collecting its backup files, it will write an empty indicator file into the peer’s collect directory. The presence of this file implies that the staging process is complete.
If you need to, you can override the name of the stage indicator file by passing in a different name.
Note: If you have user/group as strings, call the
util.getUidGid
function to get the associated uid/gid as an ownership tuple.Parameters: stageIndicator – Name of the indicator file to write
Raises: ValueError
– If a path cannot be encoded properlyIOError
– If there is an IO error creating the fileOSError
– If there is an OS error creating or changing permissions on the file
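Remote peers follow the same staging pattern, with the copy and shell commands supplied up front. The host name, directories and command strings below are hypothetical; as noted above, each command is split with util.splitCommandLine rather than handed to a shell.

from CedarBackup3.peer import RemotePeer

peer = RemotePeer(name="machine2", collectDir="/opt/backup/collect",
                  workingDir="/tmp", remoteUser="backup",
                  rcpCommand="/usr/bin/scp -B", rshCommand="/usr/bin/ssh")
if peer.checkCollectIndicator():
    peer.stagePeer("/opt/backup/staging/machine2")
    peer.writeStageIndicator()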
-
CedarBackup3.release module¶
Provides location to maintain version information.
Module Attributes¶
-
CedarBackup3.release.
AUTHOR
¶ Author of software
-
CedarBackup3.release.
EMAIL
¶ Email address of author
-
CedarBackup3.release.
COPYRIGHT
¶ Copyright date
-
CedarBackup3.release.
VERSION
¶ Software version
-
CedarBackup3.release.
DATE
¶ Software release date
-
CedarBackup3.release.
URL
¶ URL of Cedar Backup webpage
author: | Kenneth J. Pronovici <pronovic@ieee.org> |
---|
CedarBackup3.testutil module¶
Provides unit-testing utilities.
These utilities are kept here, separate from util.py, because they provide common functionality that I do not want exported “publicly” once Cedar Backup is installed on a system. They are only used for unit testing, and are only useful within the source tree.
Many of these functions are in here because they are “good enough” for unit
test work but are not robust enough to be real public functions. Others (like
removedir
) do what they are supposed to, but I don’t want responsibility for
making them available to others.
author: | Kenneth J. Pronovici <pronovic@ieee.org> |
---|
-
CedarBackup3.testutil.
availableLocales
()[source]¶ Returns a list of available locales on the system :returns: List of string locale names
-
CedarBackup3.testutil.
buildPath
(components)[source]¶ Builds a complete path from a list of components. For instance, constructs
"/a/b/c"
from["/a", "b", "c",]
. :param components: List of componentsReturns: String path constructed from components Raises: ValueError
– If a path cannot be encoded properly
-
CedarBackup3.testutil.
captureOutput
(c)[source]¶ Captures the output (stdout, stderr) of a function or a method.
Some of our functions don’t do anything other than just print output. We need a way to test these functions (at least nominally) but we don’t want any of the output spoiling the test suite output.
This function just creates a dummy file descriptor that can be used as a target by the callable function, rather than
stdout
orstderr
.Note: This method assumes that
callable
doesn’t take any arguments besides keyword argumentfd
to specify the file descriptor.Parameters: c – Callable function or method Returns: Output of function, as one big string
-
CedarBackup3.testutil.
changeFileAge
(filename, subtract=None)[source]¶ Changes a file age using the
os.utime
function.Note: Some platforms don’t seem to be able to set an age precisely. As a result, whereas we might have intended to set an age of 86400 seconds, we actually get an age of 86399.375 seconds. When util.calculateFileAge() looks at the file, it calculates an age of 0.999992766204 days, which then gets truncated down to zero whole days. The tests get very confused. To work around this, I always subtract off one additional second as a fudge factor. That way, the file age will be at least as old as requested later on.
Parameters: - filename – File to operate on
- subtract – Number of seconds to subtract from the current time
Raises: ValueError
– If a path cannot be encoded properly
-
CedarBackup3.testutil.
commandAvailable
(command)[source]¶ Indicates whether a command is available on $PATH somewhere. This should work on both Windows and UNIX platforms. :param command: Command to search for
Returns: Boolean true/false depending on whether command is available
-
CedarBackup3.testutil.
extractTar
(tmpdir, filepath)[source]¶ Extracts the indicated tar file to the indicated tmpdir. :param tmpdir: Temp directory to extract to :param filepath: Path to tarfile to extract
Raises: ValueError
– If a path cannot be encoded properly
-
CedarBackup3.testutil.
failUnlessAssignRaises
(testCase, exception, obj, prop, value)[source]¶ Equivalent of
failUnlessRaises
, but used for property assignments instead.It’s nice to be able to use
failUnlessRaises
to check that a method call raises the exception that you expect. Unfortunately, this method can’t be used to check Python propery assignments, even though these property assignments are actually implemented underneath as methods.This function (which can be easily called by unit test classes) provides an easy way to wrap the assignment checks. It’s not pretty, or as intuitive as the original check it’s modeled on, but it does work.
Let’s assume you make this method call:
testCase.failUnlessAssignRaises(ValueError, collectDir, "absolutePath", absolutePath)
If you do this, a test case failure will be raised unless the assignment:
collectDir.absolutePath = absolutePath
fails with a
ValueError
exception. The failure message differentiates between the case where no exception was raised and the case where the wrong exception was raised.Note: Internally, the
missed
andinstead
variables are used rather than directly callingtestCase.fail
upon noticing a problem because the act of “failure” itself generates an exception that would be caught by the generalexcept
clause.Parameters: - testCase – PyUnit test case object (i.e. self)
- exception – Exception that is expected to be raised
- obj – Object whose property is to be assigned to
- prop – Name of the property, as a string
- value – Value that is to be assigned to the property
@see:
unittest.TestCase.failUnlessRaises
-
CedarBackup3.testutil.
findResources
(resources, dataDirs)[source]¶ Returns a dictionary of locations for various resources. :param resources: List of required resources :param dataDirs: List of data directories to search within for resources
Returns: Dictionary mapping resource name to resource path Raises: Exception
– If some resource cannot be found
-
CedarBackup3.testutil.
getLogin
()[source]¶ Returns the name of the currently-logged in user. This might fail under some circumstances - but if it does, our tests would fail anyway.
-
CedarBackup3.testutil.
getMaskAsMode
()[source]¶ Returns the user’s current umask inverted to a mode. A mode is mostly a bitwise inversion of a mask, i.e. mask 002 is mode 775. :returns: Umask converted to a mode, as an integer
-
CedarBackup3.testutil.
platformDebian
()[source]¶ Returns boolean indicating whether this is the Debian platform.
-
CedarBackup3.testutil.
platformMacOsX
()[source]¶ Returns boolean indicating whether this is the Mac OS X platform.
-
CedarBackup3.testutil.
randomFilename
(length, prefix=None, suffix=None)[source]¶ Generates a random filename with the given length. :param length: Length of filename
:returns: Random filename
-
CedarBackup3.testutil.
removedir
(tree)[source]¶ Recursively removes an entire directory. This is basically taken from an example on python.com. :param tree: Directory tree to remove
Raises: ValueError
– If a path cannot be encoded properly
-
CedarBackup3.testutil.
runningAsRoot
()[source]¶ Returns boolean indicating whether the effective user id is root.
-
CedarBackup3.testutil.
setupDebugLogger
()[source]¶ Sets up a screen logger for debugging purposes.
Normally, the CLI functionality configures the logger so that things get written to the right place. However, for debugging it’s sometimes nice to just get everything – debug information and output – dumped to the screen. This function takes care of that.
-
CedarBackup3.testutil.
setupOverrides
()[source]¶ Set up any platform-specific overrides that might be required.
When packages are built, this is done manually (hardcoded) in customize.py and the overrides are set up in cli.cli(). This way, no runtime checks need to be done. This is safe, because the package maintainer knows exactly which platform (Debian or not) the package is being built for.
Unit tests are different, because they might be run anywhere. So, we attempt to make a guess about the platform using platformDebian(), and use that to set up the custom overrides so that platform-specific unit tests continue to work.
CedarBackup3.util module¶
Provides general-purpose utilities.
Module Attributes¶
-
CedarBackup3.util.
ISO_SECTOR_SIZE
¶ Size of an ISO image sector, in bytes
-
CedarBackup3.util.
BYTES_PER_SECTOR
¶ Number of bytes (B) per ISO sector
-
CedarBackup3.util.
BYTES_PER_KBYTE
¶ Number of bytes (B) per kilobyte (kB)
-
CedarBackup3.util.
BYTES_PER_MBYTE
¶ Number of bytes (B) per megabyte (MB)
-
CedarBackup3.util.
BYTES_PER_GBYTE
¶ Number of bytes (B) per gigabyte (GB)
-
CedarBackup3.util.
KBYTES_PER_MBYTE
¶ Number of kilobytes (kB) per megabyte (MB)
-
CedarBackup3.util.
MBYTES_PER_GBYTE
¶ Number of megabytes (MB) per gigabyte (GB)
-
CedarBackup3.util.
SECONDS_PER_MINUTE
¶ Number of seconds per minute
-
CedarBackup3.util.
MINUTES_PER_HOUR
¶ Number of minutes per hour
-
CedarBackup3.util.
HOURS_PER_DAY
¶ Number of hours per day
-
CedarBackup3.util.
SECONDS_PER_DAY
¶ Number of seconds per day
-
CedarBackup3.util.
UNIT_BYTES
¶ Constant representing the byte (B) unit for conversion
-
CedarBackup3.util.
UNIT_KBYTES
¶ Constant representing the kilobyte (kB) unit for conversion
-
CedarBackup3.util.
UNIT_MBYTES
¶ Constant representing the megabyte (MB) unit for conversion
-
CedarBackup3.util.
UNIT_GBYTES
¶ Constant representing the gigabyte (GB) unit for conversion
-
CedarBackup3.util.
UNIT_SECTORS
¶ Constant representing the ISO sector unit for conversion
author: | Kenneth J. Pronovici <pronovic@ieee.org> |
---|
-
class
CedarBackup3.util.
AbsolutePathList
[source]¶ Bases:
CedarBackup3.util.UnorderedList
Class representing a list of absolute paths.
This is an unordered list.
We override the
append
,insert
andextend
methods to ensure that any item added to the list is an absolute path.Each item added to the list is encoded using
encodePath
. If we don’t do this, we have problems trying certain operations between strings and unicode objects, particularly for “odd” filenames that can’t be encoded in standard ASCII.-
append
(item)[source]¶ Overrides the standard
append
method. :raises:ValueError
– If item is not an absolute path
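As a brief sketch of the behavior described above (the paths are illustrative):
from CedarBackup3.util import AbsolutePathList
paths = AbsolutePathList()
paths.append("/etc/cback3.conf")        # accepted: absolute path, encoded via encodePath
paths.extend(["/var/log/cback3.log"])   # accepted
paths.append("relative/path")           # raises ValueError: not an absolute path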
-
-
class
CedarBackup3.util.
Diagnostics
[source]¶ Bases:
object
Class holding runtime diagnostic information.
Diagnostic information is information that is useful to get from users for debugging purposes. I’m consolidating it all here into one object.
-
encoding
¶ Filesystem encoding that is in effect.
-
getValues
()[source]¶ Get a map containing all of the diagnostic values. :returns: Map from diagnostic name to diagnostic value
-
interpreter
¶ Python interpreter version.
-
locale
¶ Locale that is in effect.
-
logDiagnostics
(method, prefix='')[source]¶ Pretty-print diagnostic information using a logger method. :param method: Logger method to use for logging (i.e. logger.info) :param prefix: Prefix string (if any) to place onto printed lines
-
platform
¶ Platform identifying information.
-
printDiagnostics
(fd=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>, prefix='')[source]¶ Pretty-print diagnostic information to a file descriptor. :param fd: File descriptor used to print information :param prefix: Prefix string (if any) to place onto printed lines
Note: The
fd
is used rather thanprint
to facilitate unit testing.
-
timestamp
¶ Current timestamp.
-
version
¶ Cedar Backup version.
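As a short usage sketch (the logger name is illustrative), diagnostics can be dumped to a stream with printDiagnostics or routed through a logger method with logDiagnostics:
import logging
from CedarBackup3.util import Diagnostics
diagnostics = Diagnostics()
diagnostics.printDiagnostics(prefix="   ")      # pretty-print to stdout (the default fd)
logger = logging.getLogger("CedarBackup3.log")  # illustrative logger name
diagnostics.logDiagnostics(logger.info)         # emit each line via logger.info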
-
-
class
CedarBackup3.util.
DirectedGraph
(name)[source]¶ Bases:
object
Represents a directed graph.
A graph G=(V,E) consists of a set of vertices V together with a set E of vertex pairs or edges. In a directed graph, each edge also has an associated direction (from vertex v1 to vertex v2). A
DirectedGraph
object provides a way to construct a directed graph and execute a depth-first search. This data structure was designed based on the graphing chapter in The Algorithm Design Manual (http://www2.toki.or.id/book/AlgDesignManual/), by Steven S. Skiena.
This class is intended to be used by Cedar Backup for dependency ordering. Because of this, it’s not quite general-purpose. Unlike a “general” graph, every vertex in this graph has at least one edge pointing to it, from a special “start” vertex. This is so no vertices get “lost” either because they have no dependencies or because nothing depends on them.
-
__init__
(name)[source]¶ Directed graph constructor.
Parameters: name (String value) – Name of this graph
-
createEdge
(start, finish)[source]¶ Adds an edge with an associated direction, from
start
vertex tofinish
vertex. :param start: Name of start vertex :param finish: Name of finish vertexRaises: ValueError
– If one of the named vertices is unknown
-
createVertex
(name)[source]¶ Creates a named vertex. :param name: vertex name
Raises: ValueError
– If the vertex name isNone
or empty
-
name
¶ Name of the graph.
-
topologicalSort
()[source]¶ Implements a topological sort of the graph.
This method also enforces that the graph is a directed acyclic graph, which is a requirement of a topological sort.
A directed acyclic graph (or “DAG”) is a directed graph with no directed cycles. A topological sort of a DAG is an ordering on the vertices such that all edges go from left to right. Only an acyclic graph can have a topological sort, but any DAG has at least one topological sort.
Since a topological sort only makes sense for an acyclic graph, this method throws an exception if a cycle is found.
The sort is implemented as a depth-first search, which can only produce a consistent ordering if the graph is acyclic. If the graph contains any cycles, it is not possible to determine a consistent ordering for the vertices.
Note: If a particular vertex has no edges, then its position in the final list depends on the order in which the vertices were created in the graph. If you’re using this method to determine a dependency order, this makes sense: a vertex with no dependencies can go anywhere (and will).
Returns: Ordering on the vertices so that all edges go from left to right Raises: ValueError
– If a cycle is found in the graph
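As a minimal dependency-ordering sketch (the vertex names are illustrative, not the real Cedar Backup action graph):
from CedarBackup3.util import DirectedGraph
graph = DirectedGraph("actions")
for name in ["collect", "stage", "store"]:
    graph.createVertex(name)
graph.createEdge("collect", "stage")   # stage depends on collect
graph.createEdge("stage", "store")     # store depends on stage
ordering = graph.topologicalSort()     # raises ValueError if a cycle is found
# Per the contract above, every edge goes from left to right in the result,
# so "collect" sorts before "stage", which sorts before "store".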
-
-
class
CedarBackup3.util.
ObjectTypeList
(objectType, objectName)[source]¶ Bases:
CedarBackup3.util.UnorderedList
Class representing a list containing only objects with a certain type.
This is an unordered list.
We override the
append
,insert
andextend
methods to ensure that any item added to the list matches the type that is requested. The comparison uses the built-inisinstance
, which should allow subclasses of the requested type to be added to the list as well. The
objectName
value will be used in exceptions, i.e. “Item must be a CollectDir object.” if objectName
is"CollectDir"
.-
__init__
(objectType, objectName)[source]¶ Initializes a typed list for a particular type. :param objectType: Type that the list elements must match :param objectName: Short string containing the “name” of the type
-
append
(item)[source]¶ Overrides the standard
append
method. :raises:ValueError
– If item does not match requested type
-
-
class
CedarBackup3.util.
PathResolverSingleton
[source]¶ Bases:
object
Singleton used for resolving executable paths.
Various functions throughout Cedar Backup (including extensions) need a way to resolve the path of executables that they use. For instance, the image functionality needs to find the
mkisofs
executable, and the Subversion extension needs to find thesvnlook
executable. Cedar Backup’s original behavior was to assume that the simple name ("svnlook"
or whatever) was available on the caller’s$PATH
, and to fail otherwise. However, this turns out to be less than ideal, since for instance the root user might not always have executables likesvnlook
in its path.One solution is to specify a path (either via an absolute path or some sort of path insertion or path appending mechanism) that would apply to the
executeCommand()
function. This is not difficult to implement, but it seems like kind of a “big hammer” solution. Besides that, it might also represent a security flaw (for instance, I prefer not to mess with root’s $PATH
on the application level if I don’t have to).The alternative is to set up some sort of configuration for the path to certain executables, i.e. “find
svnlook
in/usr/local/bin/svnlook
” or whatever. This PathResolverSingleton aims to provide a good solution to the mapping problem. Callers of all sorts (extensions or not) can get an instance of the singleton. Then, they call thelookup
method to try and resolve the executable they are looking for. Through thelookup
method, the caller can also specify a default to use if a mapping is not found. This way, with no real effort on the part of the caller, behavior can neatly degrade to something equivalent to the current behavior if there is no special mapping or if the singleton was never initialized in the first place.Even better, extensions automagically get access to the same resolver functionality, and they don’t even need to understand how the mapping happens. All extension authors need to do is document what executables their code requires, and the standard resolver configuration section will meet their needs.
The class should be initialized once through the constructor somewhere in the main routine. Then, the main routine should call the
fill
method to fill in the resolver’s internal structures. Everyone else who needs to resolve a path will get an instance of the class usinggetInstance
and will then just call thelookup
method.-
_instance
¶ Holds a reference to the singleton
-
_mapping
¶ Internal mapping from resource name to path
-
fill
(mapping)[source]¶ Fills in the singleton’s internal mapping from name to resource. :param mapping: Mapping from resource name to path :type mapping: Dictionary mapping name to path, both as strings
-
getInstance
= <CedarBackup3.util.PathResolverSingleton._Helper object>¶
-
lookup
(name, default=None)[source]¶ Looks up name and returns the resolved path associated with the name. :param name: Name of the path resource to resolve :param default: Default to return if resource cannot be resolved
Returns: Resolved path associated with name, or default if name can’t be resolved
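One plausible reading of the fill/getInstance/lookup workflow described above, as a sketch (the mapping is illustrative):
from CedarBackup3.util import PathResolverSingleton
# Done once, early in the main routine.
singleton = PathResolverSingleton.getInstance()
singleton.fill({"svnlook": "/usr/local/bin/svnlook"})
# Later, any caller (including extensions) can resolve an executable,
# falling back to the bare name if no mapping exists.
path = PathResolverSingleton.getInstance().lookup("svnlook", default="svnlook")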
-
-
class
CedarBackup3.util.
Pipe
(cmd, bufsize=-1, ignoreStderr=False)[source]¶ Bases:
subprocess.Popen
Specialized pipe class for use by
executeCommand
.The
executeCommand
function needs a specialized way of interacting with a pipe. First,executeCommand
only reads from the pipe, and never writes to it. Second,executeCommand
needs a way to discard all output written tostderr
, as a means of simulating the shell2>/dev/null
construct.
-
class
CedarBackup3.util.
RegexList
[source]¶ Bases:
CedarBackup3.util.UnorderedList
Class representing a list of valid regular expression strings.
This is an unordered list.
We override the
append
,insert
andextend
methods to ensure that any item added to the list is a valid regular expression.-
append
(item)[source]¶ Overrides the standard
append
method. :raises:ValueError
– If item is not a valid regular expression
-
-
class
CedarBackup3.util.
RegexMatchList
(valuesRegex, emptyAllowed=True, prefix=None)[source]¶ Bases:
CedarBackup3.util.UnorderedList
Class representing a list containing only strings that match a regular expression.
If
emptyAllowed
is passed in asFalse
, then empty strings are explicitly disallowed, even if they happen to match the regular expression. (None
values are always disallowed, since string operations are not permitted onNone
.)This is an unordered list.
We override the
append
,insert
andextend
methods to ensure that any item added to the list matches the indicated regular expression.Note: If you try to put values that are not strings into the list, you will likely get either TypeError or AttributeError exceptions as a result.
-
__init__
(valuesRegex, emptyAllowed=True, prefix=None)[source]¶ Initializes a list restricted to containing certain values. :param valuesRegex: Regular expression that must be matched, as a string :param emptyAllowed: Indicates whether empty or None values are allowed :param prefix: Prefix to use in error messages (None results in prefix “Item”)
-
append
(item)[source]¶ Overrides the standard
append
method.Raises: ValueError
– If item is NoneValueError
– If item is empty and empty values are not allowedValueError
– If item does not match the configured regular expression
-
-
class
CedarBackup3.util.
RestrictedContentList
(valuesList, valuesDescr, prefix=None)[source]¶ Bases:
CedarBackup3.util.UnorderedList
Class representing a list containing only object with certain values.
This is an unordered list.
We override the
append
,insert
andextend
methods to ensure that any item added to the list is among the valid values. We use a standard comparison, so pretty much anything can be in the list of valid values.The
valuesDescr
value will be used in exceptions, i.e. “Item must be one of the values in VALID_ACTIONS” if valuesDescr
is"VALID_ACTIONS"
.Note: This class doesn’t make any attempt to trap for nonsensical arguments. All of the values in the values list should be of the same type (i.e. strings). Then, all list operations also need to be of that type (i.e. you should always insert or append just strings). If you mix types – for instance lists and strings – you will likely see AttributeError exceptions or other problems.
-
__init__
(valuesList, valuesDescr, prefix=None)[source]¶ Initializes a list restricted to containing certain values. :param valuesList: List of valid values :param valuesDescr: Short string describing list of values :param prefix: Prefix to use in error messages (None results in prefix “Item”)
-
append
(item)[source]¶ Overrides the standard
append
method. :raises:ValueError
– If item is not in the values list
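An illustrative sketch (the values list and its description are hypothetical):
from CedarBackup3.util import RestrictedContentList
VALID_ACTIONS = ["collect", "stage", "store", "purge"]
actions = RestrictedContentList(VALID_ACTIONS, "VALID_ACTIONS")
actions.append("collect")   # accepted: value is in the list
actions.append("bogus")     # raises ValueError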
-
-
class
CedarBackup3.util.
UnorderedList
[source]¶ Bases:
list
Class representing an “unordered list”.
An “unordered list” is a list in which only the contents matter, not the order in which the contents appear in the list.
For instance, we might be keeping track of set of paths in a list, because it’s convenient to have them in that form. However, for comparison purposes, we would only care that the lists contain exactly the same contents, regardless of order.
I have come up with two reasonable ways of doing this, plus a couple more that would work but would be a pain to implement. My first method is to copy and sort each list, comparing the sorted versions. This will only work if two lists with exactly the same members are guaranteed to sort in exactly the same order. The second way would be to create two Sets and then compare the sets. However, this would lose information about any duplicates in either list. I’ve decided to go with option #1 for now. I’ll modify this code if I run into problems in the future.
We override the original
__eq__
,__ne__
,__ge__
,__gt__
,__le__
and__lt__
list methods to change the definition of the various comparison operators. In all cases, the comparison is changed to return the result of the original operation but instead comparing sorted lists. This is going to be quite a bit slower than a normal list, so you probably only want to use it on small lists.-
static
mixedsort
(value)[source]¶ Sort a list, making sure we don’t blow up if the list happens to include mixed values. @see: http://stackoverflow.com/questions/26575183/how-can-i-get-2-x-like-sorting-behaviour-in-python-3-x
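A quick sketch of the comparison behavior:
from CedarBackup3.util import UnorderedList
list1 = UnorderedList(["/etc", "/home", "/var"])
list2 = UnorderedList(["/var", "/etc", "/home"])
print(list1 == list2)      # True: same contents, order ignored
print(list1 == ["/etc"])   # False: contents differ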
-
-
CedarBackup3.util.
buildNormalizedPath
(path)[source]¶ Returns a “normalized” path based on a path name.
A normalized path is a representation of a path that is also a valid file name. To make a valid file name out of a complete path, we have to convert or remove some characters that are significant to the filesystem – in particular, the path separator and any leading
'.'
character (which would cause the file to be hidden in a file listing).Note that this is a one-way transformation – you can’t safely derive the original path from the normalized path.
To normalize a path, we begin by looking at the first character. If the first character is
'/'
or'\'
, it gets removed. If the first character is'.'
, it gets converted to'_'
. Then, we look through the rest of the path and convert all remaining'/'
or'\'
characters'-'
, and all remaining whitespace characters to'_'
.As a special case, a path consisting only of a single
'/'
or'\'
character will be converted to'-'
.Parameters: path – Path to normalize Returns: Normalized path as described above Raises: ValueError
– If the path is None
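A few illustrative inputs and the results expected from the rules above:
from CedarBackup3.util import buildNormalizedPath
buildNormalizedPath("/home/backup/My Documents")   # "home-backup-My_Documents"
buildNormalizedPath(".hidden/file")                # "_hidden-file"
buildNormalizedPath("/")                           # "-"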
-
CedarBackup3.util.
calculateFileAge
(path)[source]¶ Calculates the age (in days) of a file.
The “age” of a file is the amount of time since the file was last used, per the most recent of the file’s
st_atime
andst_mtime
values.Technically, we only intend this function to work with files, but it will probably work with anything on the filesystem.
Parameters: path – Path to a file on disk Returns: Age of the file in days (possibly fractional) Raises: OSError
– If the file doesn’t exist
-
CedarBackup3.util.
changeOwnership
(path, user, group)[source]¶ Changes ownership of path to match the user and group.
This is a no-op if user/group functionality is not available on the platform, or if the either passed-in user or group is
None
. Further, we won’t even try to do it unless running as root, since it’s unlikely to work.Parameters: - path – Path whose ownership to change
- user – User which owns file
- group – Group which owns file
-
CedarBackup3.util.
checkUnique
(prefix, values)[source]¶ Checks that all values are unique.
The values list is checked for duplicate values. If there are duplicates, an exception is thrown. All duplicate values are listed in the exception.
Parameters: - prefix – Prefix to use in the thrown exception
- values – List of values to check
Raises: ValueError
– If there are duplicates in the list
-
CedarBackup3.util.
convertSize
(size, fromUnit, toUnit)[source]¶ Converts a size in one unit to a size in another unit.
This is just a convenience function so that the functionality can be implemented in just one place. Internally, we convert values to bytes and then to the final unit.
The available units are:
UNIT_BYTES
- BytesUNIT_KBYTES
- Kilobytes, where 1 kB = 1024 BUNIT_MBYTES
- Megabytes, where 1 MB = 1024 kBUNIT_GBYTES
- Gigabytes, where 1 GB = 1024 MBUNIT_SECTORS
- Sectors, where 1 sector = 2048 B
Parameters: - size (Integer or float value in units of
fromUnit
) – Size to convert - fromUnit (One of the units listed above) – Unit to convert from
- toUnit (One of the units listed above) – Unit to convert to
Returns: Number converted to new unit, as a float
Raises: ValueError
– If one of the units is invalid
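For example, a short sketch of typical conversions:
from CedarBackup3.util import convertSize, UNIT_BYTES, UNIT_MBYTES, UNIT_SECTORS
megabytes = convertSize(72372224, UNIT_BYTES, UNIT_MBYTES)   # roughly 69.02
sectors = convertSize(650, UNIT_MBYTES, UNIT_SECTORS)        # a 650 MB image expressed in 2048-byte sectors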
-
CedarBackup3.util.
dereferenceLink
(path, absolute=True)[source]¶ Dereferences a soft link, optionally normalizing it to an absolute path. :param path: Path of link to dereference :param absolute: Whether to normalize the result to an absolute path
Returns: Dereferenced path, or original path if original is not a link
-
CedarBackup3.util.
deriveDayOfWeek
(dayName)[source]¶ Converts English day name to numeric day of week as from
time.localtime
.For instance, the day
monday
would be converted to the number0
.Parameters: dayName (string, i.e. "monday"
,"tuesday"
, etc) – Day of week to convertReturns: Integer, where Monday is 0 and Sunday is 6; or -1 if no conversion is possible
-
CedarBackup3.util.
deviceMounted
(devicePath)[source]¶ Indicates whether a specific filesystem device is currently mounted.
We determine whether the device is mounted by looking through the system’s
mtab
file. This file shows every currently-mounted filesystem, ordered by device. We only do the check if themtab
file exists and is readable. Otherwise, we assume that the device is not mounted.Note: This only works on platforms that have a concept of an mtab file to show mounted volumes, like UNIXes. It won’t work on Windows.
Parameters: devicePath – Path of device to be checked Returns: True if device is mounted, false otherwise
-
CedarBackup3.util.
displayBytes
(bytes, digits=2)[source]¶ Format a byte quantity so it can be sensibly displayed.
It’s rather difficult to look at a number like “72372224 bytes” and get any meaningful information out of it. It would be more useful to see something like “69.02 MB”. That’s what this function does. Any time you want to display a byte value, i.e.:
print("Size: %s bytes" % bytes)
Call this function instead:
print("Size: %s" % displayBytes(bytes))
What comes out will be sensibly formatted. The indicated number of digits will be listed after the decimal point, rounded based on whatever rules are used by Python’s standard
%f
string format specifier. (Values less than 1 kB will be listed in bytes and will not have a decimal point, since the concept of a fractional byte is nonsensical.)Parameters: - bytes (Integer number of bytes) – Byte quantity
- digits (Integer value, typically 2-5) – Number of digits to display after the decimal point
Returns: String, formatted for sensible display
-
CedarBackup3.util.
encodePath
(path)[source]¶ Safely encodes a filesystem path as a Unicode string, converting bytes to fileystem encoding if necessary. :param path: Path to encode
Returns: Path, as a string, encoded appropriately Raises: ValueError
– If the path cannot be encoded properly@see: http://lucumr.pocoo.org/2013/7/2/the-updated-guide-to-unicode/
-
CedarBackup3.util.
executeCommand
(command, args, returnOutput=False, ignoreStderr=False, doNotLog=False, outputFile=None)[source]¶ Executes a shell command, hopefully in a safe way.
This function exists to replace direct calls to
os.popen
in the Cedar Backup code. It’s not safe to call a function such asos.popen()
with untrusted arguments, since that can cause problems if the string contains non-safe variables or other constructs (imagine that the argument is$WHATEVER
, but$WHATEVER
contains something like "; rm -fR ~/; echo" in the current environment). Instead, it’s safer to pass a list of arguments in the style supported by
popen2
orpopen4
. This function actually uses a specializedPipe
class implemented using eithersubprocess.Popen
orpopen2.Popen4
. Under the normal case, this function will return a tuple of (status, None) where the status is the wait-encoded return status of the call per the
popen2.Popen4
documentation. IfreturnOutput
is passed in asTrue
, the function will return a tuple of(status, output)
whereoutput
is a list of strings, one entry per line in the output from the command. Output is always logged to theoutputLogger.info()
target, regardless of whether it’s returned.By default,
stdout
andstderr
will be intermingled in the output. However, if you pass inignoreStderr=True
, then onlystdout
will be included in the output.The
doNotLog
parameter exists so that callers can force the function to not log command output to the debug log. Normally, you would want to log. However, if you’re using this function to write huge output files (i.e. database backups written tostdout
) then you might want to avoid putting all that information into the debug log.The
outputFile
parameter exists to make it easier for a caller to push output into a file, i.e. as a substitute for redirection to a file. If this value is passed in, each time a line of output is generated, it will be written to the file usingoutputFile.write()
. At the end, the file descriptor will be flushed usingoutputFile.flush()
. The caller maintains responsibility for closing the file object appropriately.Note: I know that it’s a bit confusing that the command and the arguments are both lists. I could have just required the caller to pass in one big list. However, I think it makes some sense to keep the command (the constant part of what we’re executing, i.e.
"scp -B"
) separate from its arguments, even if they both end up looking kind of similar.Note: You cannot redirect output via shell constructs (i.e.
>file
,2>/dev/null
, etc.) using this function. The redirection string would be passed to the command just like any other argument. However, you can implement the equivalent to redirection usingignoreStderr
andoutputFile
, as discussed above.Note: The operating system environment is partially sanitized before the command is invoked. See
sanitizeEnvironment
for details.Parameters: - command (List of individual arguments that make up the command) – Shell command to execute
- args (List of additional arguments to the command) – List of arguments to the command
- returnOutput (Boolean
True
orFalse
) – Indicates whether to return the output of the command - ignoreStderr (Boolean True or False) – Whether stderr should be discarded
- doNotLog (Boolean
True
orFalse
) – Indicates that output should not be logged - outputFile (File as from
open
orfile
, binary write) – File that all output should be written to
Returns: Tuple of
(result, output)
as described above
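A minimal sketch of a typical call (the command and arguments are illustrative):
from CedarBackup3.util import executeCommand
command = ["ls", "-l"]
args = ["/etc"]
(status, output) = executeCommand(command, args, returnOutput=True)
if status != 0:
    raise IOError("Command returned status %d" % status)
for line in output:
    print(line.rstrip())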
-
CedarBackup3.util.
getFunctionReference
(module, function)[source]¶ Gets a reference to a named function.
This does some hokey-pokey to get back a reference to a dynamically named function. For instance, say you wanted to get a reference to the
os.path.isdir
function. You could use:myfunc = getFunctionReference("os.path", "isdir")
Although we won’t bomb out directly, behavior is pretty much undefined if you pass in
None
or""
for eithermodule
orfunction
.The only validation we enforce is that whatever we get back must be callable.
I derived this code based on the internals of the Python unittest implementation. I don’t claim to completely understand how it works.
Parameters: - module (Something like "os.path" or "CedarBackup3.util") – Name of module associated with function
- function (Something like "isdir" or "getUidGid") – Name of function
Returns: Reference to function associated with name
Raises: ImportError
– If the function cannot be foundValueError
– If the resulting reference is not callable
@copyright: Some of this code, prior to customization, was originally part of the Python 2.3 codebase. Python code is copyright (c) 2001, 2002 Python Software Foundation; All Rights Reserved.
-
CedarBackup3.util.
getUidGid
(user, group)[source]¶ Get the uid/gid associated with a user/group pair
This is a no-op if user/group functionality is not available on the platform.
Parameters: - user (User name as a string) – User name
- group (Group name as a string) – Group name
Returns: Tuple
(uid, gid)
matching passed-in user and groupRaises: ValueError
– If the ownership user/group values are invalid
-
CedarBackup3.util.
isRunningAsRoot
()[source]¶ Indicates whether the program is running as the root user.
-
CedarBackup3.util.
isStartOfWeek
(startingDay)[source]¶ Indicates whether “today” is the backup starting day per configuration.
If the current day’s English name matches the indicated starting day, then today is a starting day.
Parameters: startingDay (string, i.e. "monday"
,"tuesday"
, etc) – Configured starting dayReturns: Boolean indicating whether today is the starting day
-
CedarBackup3.util.
mount
(devicePath, mountPoint, fsType)[source]¶ Mounts the indicated device at the indicated mount point.
For instance, to mount a CD, you might use device path
/dev/cdrw
, mount point/media/cdrw
and filesystem typeiso9660
. You can safely use any filesystem type that is supported bymount
on your platform. If the type isNone
, we’ll attempt to letmount
auto-detect it. This may or may not work on all systems.Note: This only works on platforms that have a concept of “mounting” a filesystem through a command-line
"mount"
command, like UNIXes. It won’t work on Windows.Parameters: - devicePath – Path of device to be mounted
- mountPoint – Path that device should be mounted at
- fsType – Type of the filesystem assumed to be available via the device
Raises: IOError
– If the device cannot be mounted
-
CedarBackup3.util.
nullDevice
()[source]¶ Attempts to portably return the null device on this system.
The null device is something like
/dev/null
on a UNIX system. The name varies on other platforms.
-
CedarBackup3.util.
parseCommaSeparatedString
(commaString)[source]¶ Parses a list of values out of a comma-separated string.
The items in the list are split by comma, and then have whitespace stripped. As a special case, if
commaString
isNone
, thenNone
will be returned.Parameters: commaString – List of values in comma-separated string format Returns: Values from commaString split into a list, or None
-
CedarBackup3.util.
removeKeys
(d, keys)[source]¶ Removes all of the keys from the dictionary. The dictionary is altered in-place. Each key must exist in the dictionary. :param d: Dictionary to operate on :param keys: List of keys to remove
Raises: KeyError
– If one of the keys does not exist
-
CedarBackup3.util.
resolveCommand
(command)[source]¶ Resolves the real path to a command through the path resolver mechanism.
Both extensions and standard Cedar Backup functionality need a way to resolve the “real” location of various executables. Normally, they assume that these executables are on the system path, but some callers need to specify an alternate location.
Ideally, we want to handle this configuration in a central location. The Cedar Backup path resolver mechanism (a singleton called
PathResolverSingleton
) provides the central location to store the mappings. This function wraps access to the singleton, and is what all functions (extensions or standard functionality) should call if they need to find a command.The passed-in command must actually be a list, in the standard form used by all existing Cedar Backup code (something like
["svnlook", ]
). The lookup will actually be done on the first element in the list, and the returned command will always be in list form as well.If the passed-in command can’t be resolved or no mapping exists, then the command itself will be returned unchanged. This way, we neatly fall back on default behavior if we have no sensible alternative.
Parameters: command (List form of command, i.e. ["svnlook", ]
) – Command to resolveReturns: Path to command or just command itself if no mapping exists
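A short sketch of the intended usage (the repository path is illustrative):
from CedarBackup3.util import executeCommand, resolveCommand
command = resolveCommand(["svnlook", ])   # rewritten if a mapping exists, returned unchanged otherwise
(status, output) = executeCommand(command, ["youngest", "/srv/svn/repo"], returnOutput=True)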
-
CedarBackup3.util.
sanitizeEnvironment
()[source]¶ Sanitizes the operating system environment.
The operating system environment is contained in
os.environ
. This method sanitizes the contents of that dictionary.Currently, all it does is reset the locale (removing
$LC_*
) and set the default language ($LANG
) toDEFAULT_LANGUAGE
. This way, we can count on consistent localization regardless of what the end-user has configured. This is important for code that needs to parse program output.The
os.environ
dictionary is modified in-place. If $LANG
is already set to the proper value, it is not re-set, so we can avoid the memory leaks that are documented to occur on BSD-based systems.Returns: Copy of the sanitized environment
-
CedarBackup3.util.
sortDict
(d)[source]¶ Returns the keys of the dictionary sorted by value. :param d: Dictionary to operate on
Returns: List of dictionary keys sorted in order by dictionary value
-
CedarBackup3.util.
splitCommandLine
(commandLine)[source]¶ Splits a command line string into a list of arguments.
Unfortunately, there is no “standard” way to parse a command line string, and it’s actually not an easy problem to solve portably (essentially, we have to emulate the shell argument-processing logic). This code only respects double quotes (
"
) for grouping arguments, not single quotes ('
). Make sure you take this into account when building your command line.Incidentally, I found this particular parsing method while digging around in Google Groups, and I tweaked it for my own use.
Parameters: commandLine (String, i.e. "cback3 --verbose stage store") – Command line string Returns: List of arguments, suitable for passing to popen2
Raises: ValueError
– If the command line is None
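For instance:
from CedarBackup3.util import splitCommandLine
splitCommandLine("cback3 --verbose stage store")
# yields something like ['cback3', '--verbose', 'stage', 'store']; double quotes
# (but not single quotes) can be used to group arguments that contain whitespace.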
-
CedarBackup3.util.
unmount
(mountPoint, removeAfter=False, attempts=1, waitSeconds=0)[source]¶ Unmounts whatever device is mounted at the indicated mount point.
Sometimes, it might not be possible to unmount the mount point immediately, if there are still files open there. Use the
attempts
andwaitSeconds
arguments to indicate how many unmount attempts to make and how many seconds to wait between attempts. If you pass in zero attempts, no attempts will be made (duh).If the indicated mount point is not really a mount point per
os.path.ismount()
, then it will be ignored. This seems to be a safer check then looking through/etc/mtab
, sinceismount()
is already in the Python standard library and is documented as working on all POSIX systems.If
removeAfter
isTrue
, then the mount point will be removed usingos.rmdir()
after the unmount action succeeds. If for some reason the mount point is not a directory, then it will not be removed.Note: This only works on platforms that have a concept of “mounting” a filesystem through a command-line
"mount"
command, like UNIXes. It won’t work on Windows.Parameters: - mountPoint – Mount point to be unmounted
- removeAfter – Remove the mount point after unmounting it
- attempts – Number of times to attempt the unmount
- waitSeconds – Number of seconds to wait between repeated attempts
Raises: IOError
– If the mount point is still mounted after attempts are exhausted
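As a sketch of a typical mount/unmount cycle (the device, mount point and filesystem type are illustrative):
from CedarBackup3.util import mount, unmount
mount("/dev/cdrw", "/media/cdrw", "iso9660")
try:
    pass  # ... read or verify files under /media/cdrw ...
finally:
    unmount("/media/cdrw", removeAfter=False, attempts=3, waitSeconds=5)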
CedarBackup3.writer module¶
Provides interface backwards compatibility.
In Cedar Backup 2.10.0, a refactoring effort took place while adding code to support DVD hardware. All of the writer functionality was moved to the writers/ package. This mostly-empty file remains to preserve the Cedar Backup library interface.
author: | Kenneth J. Pronovici <pronovic@ieee.org> |
---|
CedarBackup3.xmlutil module¶
Provides general XML-related functionality.
What I’m trying to do here is abstract much of the functionality that directly accesses the DOM tree. This is not so much to “protect” the other code from the DOM, but to standardize the way it’s used. It will also help extension authors write code that easily looks more like the rest of Cedar Backup.
Module Attributes¶
-
CedarBackup3.xmlutil.
TRUE_BOOLEAN_VALUES
¶ List of boolean values in XML representing
True
-
CedarBackup3.xmlutil.
FALSE_BOOLEAN_VALUES
¶ List of boolean values in XML representing
False
-
CedarBackup3.xmlutil.
VALID_BOOLEAN_VALUES
¶ List of valid boolean values in XML
author: | Kenneth J. Pronovici <pronovic@ieee.org> |
---|
-
class
CedarBackup3.xmlutil.
Serializer
(stream=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>, encoding='UTF-8', indent=3)[source]¶ Bases:
object
XML serializer class.
This is a customized serializer that I hacked together based on what I found in the PyXML distribution. Basically, around release 2.7.0, the only reason I still had around a dependency on PyXML was for the PrettyPrint functionality, and that seemed pointless. So, I stripped the PrettyPrint code out of PyXML and hacked bits of it off until it did just what I needed and no more.
This code started out being called PrintVisitor, but I decided it makes more sense just calling it a serializer. I’ve made nearly all of the methods private, and I’ve added a new high-level serialize() method rather than having clients call
visit()
.Anyway, as a consequence of my hacking with it, this can’t quite be called a complete XML serializer any more. I ripped out support for HTML and XHTML, and there is also no longer any support for namespaces (which I took out because this dragged along a lot of extra code, and Cedar Backup doesn’t use namespaces). However, everything else should pretty much work as expected.
@copyright: This code, prior to customization, was part of the PyXML codebase, and before that was part of the 4DOM suite developed by Fourthought, Inc. In its original form, it was Copyright (c) 2000 Fourthought Inc, USA; All Rights Reserved.
-
CedarBackup3.xmlutil.
addBooleanNode
(xmlDom, parentNode, nodeName, nodeValue)[source]¶ Adds a text node as the next child of a parent, to contain a boolean.
If the
nodeValue
is None, then the node will be created, but will be empty (i.e. will contain no text node child).Boolean
True
, or anything else interpreted asTrue
by Python, will be converted to a string “Y”. Anything else will be converted to a string “N”. The result is added to the document viaaddStringNode
.Parameters: - xmlDom – DOM tree as from
impl.createDocument()
- parentNode – Parent node to create child for
- nodeName – Name of the new container node
- nodeValue – The value to put into the node
Returns: Reference to the newly-created node
- xmlDom – DOM tree as from
-
CedarBackup3.xmlutil.
addContainerNode
(xmlDom, parentNode, nodeName)[source]¶ Adds a container node as the next child of a parent node.
Parameters: - xmlDom – DOM tree as from
impl.createDocument()
- parentNode – Parent node to create child for
- nodeName – Name of the new container node
Returns: Reference to the newly-created node
- xmlDom – DOM tree as from
-
CedarBackup3.xmlutil.
addIntegerNode
(xmlDom, parentNode, nodeName, nodeValue)[source]¶ Adds a text node as the next child of a parent, to contain an integer.
If the
nodeValue
is None, then the node will be created, but will be empty (i.e. will contain no text node child).The integer will be converted to a string using “%d”. The result will be added to the document via
addStringNode
.Parameters: - xmlDom – DOM tree as from
impl.createDocument()
- parentNode – Parent node to create child for
- nodeName – Name of the new container node
- nodeValue – The value to put into the node
Returns: Reference to the newly-created node
- xmlDom – DOM tree as from
-
CedarBackup3.xmlutil.
addLongNode
(xmlDom, parentNode, nodeName, nodeValue)[source]¶ Adds a text node as the next child of a parent, to contain a long integer.
If the
nodeValue
is None, then the node will be created, but will be empty (i.e. will contain no text node child).The integer will be converted to a string using “%d”. The result will be added to the document via
addStringNode
.Parameters: - xmlDom – DOM tree as from
impl.createDocument()
- parentNode – Parent node to create child for
- nodeName – Name of the new container node
- nodeValue – The value to put into the node
Returns: Reference to the newly-created node
- xmlDom – DOM tree as from
-
CedarBackup3.xmlutil.
addStringNode
(xmlDom, parentNode, nodeName, nodeValue)[source]¶ Adds a text node as the next child of a parent, to contain a string.
If the
nodeValue
is None, then the node will be created, but will be empty (i.e. will contain no text node child).Parameters: - xmlDom – DOM tree as from
impl.createDocument()
- parentNode – Parent node to create child for
- nodeName – Name of the new container node
- nodeValue – The value to put into the node
Returns: Reference to the newly-created node
- xmlDom – DOM tree as from
-
CedarBackup3.xmlutil.
createInputDom
(xmlData, name='cb_config')[source]¶ Creates a DOM tree based on reading an XML string. :param xmlData: XML string to parse :param name: Assumed base name of the document (root node name)
Returns: Tuple (xmlDom, parentNode) for the parsed document Raises: ValueError
– If the document can’t be parsed
-
CedarBackup3.xmlutil.
createOutputDom
(name='cb_config')[source]¶ Creates a DOM tree used for writing an XML document. :param name: Base name of the document (root node name)
Returns: Tuple (xmlDom, parentNode) for the new document
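As a small write-side sketch using the helpers documented above (the element names are illustrative, not real Cedar Backup configuration):
from CedarBackup3.xmlutil import createOutputDom, addContainerNode, addStringNode, addBooleanNode
(xmlDom, parentNode) = createOutputDom("cb_config")
options = addContainerNode(xmlDom, parentNode, "options")
addStringNode(xmlDom, options, "working_dir", "/tmp/backup")
addBooleanNode(xmlDom, options, "verbose", True)   # stored as "Y" per addBooleanNode above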
-
CedarBackup3.xmlutil.
isElement
(node)[source]¶ Returns True or False depending on whether the XML node is an element node.
-
CedarBackup3.xmlutil.
readBoolean
(parent, name)[source]¶ Returns boolean contents of the first child with a given name immediately beneath the parent.
By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.
The string value of the node must be one of the values in
VALID_BOOLEAN_VALUES
.Parameters: - parent – Parent node to search beneath
- name – Name of node to search for
Returns: Boolean contents of node or
None
if no matching nodes are foundRaises: ValueError
– If the string at the location can’t be converted to a boolean
-
CedarBackup3.xmlutil.
readChildren
(parent, name)[source]¶ Returns a list of nodes with a given name immediately beneath the parent.
By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.
Underneath, we use the Python
getElementsByTagName
method, which is pretty cool, but which (surprisingly?) returns a list of all children with a given name below the parent, at any level. We just prune that list to include only children whoseparentNode
matches the passed-in parent.Parameters: - parent – Parent node to search beneath
- name – Name of nodes to search for
Returns: List of child nodes with correct parent, or an empty list if
no matching nodes are found.
-
CedarBackup3.xmlutil.
readFirstChild
(parent, name)[source]¶ Returns the first child with a given name immediately beneath the parent.
By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.
Parameters: - parent – Parent node to search beneath
- name – Name of node to search for
Returns: First properly-named child of parent, or
None
if no matching nodes are found
-
CedarBackup3.xmlutil.
readFloat
(parent, name)[source]¶ Returns float contents of the first child with a given name immediately beneath the parent.
By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.
Parameters: - parent – Parent node to search beneath
- name – Name of node to search for
Returns: Float contents of node or
None
if no matching nodes are foundRaises: ValueError
– If the string at the location can’t be converted to a float value.
-
CedarBackup3.xmlutil.
readInteger
(parent, name)[source]¶ Returns integer contents of the first child with a given name immediately beneath the parent.
By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.
Parameters: - parent – Parent node to search beneath
- name – Name of node to search for
Returns: Integer contents of node or
None
if no matching nodes are foundRaises: ValueError
– If the string at the location can’t be converted to an integer
-
CedarBackup3.xmlutil.
readLong
(parent, name)[source]¶ Returns long integer contents of the first child with a given name immediately beneath the parent.
By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.
Parameters: - parent – Parent node to search beneath
- name – Name of node to search for
Returns: Long integer contents of node or
None
if no matching nodes are foundRaises: ValueError
– If the string at the location can’t be converted to an integer
-
CedarBackup3.xmlutil.
readString
(parent, name)[source]¶ Returns string contents of the first child with a given name immediately beneath the parent.
By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node. We assume that string contents of a given node belong to the first
TEXT_NODE
child of that node.Parameters: - parent – Parent node to search beneath
- name – Name of node to search for
Returns: String contents of node or
None
if no matching nodes are found
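A matching read-side sketch using the functions documented above (the XML content is illustrative, and it assumes the parentNode returned by createInputDom is the document root element):
from CedarBackup3.xmlutil import createInputDom, readFirstChild, readString, readBoolean
xmlData = "<cb_config><options><working_dir>/tmp/backup</working_dir><verbose>Y</verbose></options></cb_config>"
(xmlDom, parentNode) = createInputDom(xmlData)
options = readFirstChild(parentNode, "options")
workingDir = readString(options, "working_dir")   # "/tmp/backup"
verbose = readBoolean(options, "verbose")         # True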
-
CedarBackup3.xmlutil.
readStringList
(parent, name)[source]¶ Returns a list of the string contents associated with nodes with a given name immediately beneath the parent.
By “immediately beneath” the parent, we mean from among nodes that are direct children of the passed-in parent node.
First, we find all of the nodes using
readChildren
, and then we retrieve the “string contents” of each of those nodes. The returned list has one entry per matching node. We assume that string contents of a given node belong to the firstTEXT_NODE
child of that node. Nodes which have noTEXT_NODE
children are not represented in the returned list.Parameters: - parent – Parent node to search beneath
- name – Name of node to search for
Returns: List of strings as described above, or
None
if no matching nodes are found