pysys.mappers

Mappers filter or transform lines of input, for use with methods such as pysys.basetest.BaseTest.copy and pysys.basetest.BaseTest.assertGrep.

This package contains several pre-defined mappers:

IncludeMatches(regex[, repl])

Mapper that returns only text matching the specified regular expression.

IncludeLinesMatching(regex)

Mapper that filters lines by returning only lines matching the specified regular expression.

IncludeLinesBetween([startAt, stopAfter, …])

Mapper that filters out all lines except those within a range of expressions.

ExcludeLinesMatching(regex)

Mapper that filters lines by excluding/ignoring lines matching the specified regular expression.

RegexReplace(regex, replacement)

Mapper that substitutes all character sequences matching the specified regular expression with something different.

JoinLines([startAt, continueWhile, …])

Mapper that joins/concatenates consecutive related lines together into a single line.

JoinLines.PythonTraceback()

Mapper that joins the lines of a typical Python traceback (starting Traceback (most recent call last):) into a single line, for easier grepping and self-contained test outcome failure reasons.

JoinLines.JavaStackTrace([combiner, …])

Mapper that joins the lines of a typical Java(R) stack trace (from stderr or a log file) into a single line, for easier grepping and self-contained test outcome failure reasons.

JoinLines.AntBuildFailure()

Mapper that joins the lines of Ant’s stderr BUILD FAILED output so that the joined line includes the failure message(s), for easier grepping and self-contained test outcome failure reasons.

SortLines([key])

Mapper that sorts all lines.

TruncateLongLines([maxLineLength])

Mapper that truncates any excessively long lines, to avoid regular expression matching taking an unreasonable amount of time.

applyMappers(iterator, mappers)

A generator function that applies zero or more mappers to each line from an iterator and yields each fully mapped line.

In addition to the above, you can create custom mappers, which are usually callables (functions, lambdas, or classes with a __call__() method) that return the transformed copy of each incoming line.

For advanced cases you can provide a generator function that accepts a line iterator as input and yields the mapped lines; this allows for stateful transformation and avoids the limitation of having a 1:1 (or 1:0) relationship between input and output lines.
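As a minimal sketch of the generator form, the following hypothetical mapper (the names collapseRepeats and _withCount are illustrative and not part of PySys) collapses runs of identical consecutive lines into a single line with a repeat count. It keeps state between lines and yields fewer lines than it receives, which a per-line callable cannot do.

```python
def _withCount(line, count):
    # Append the repeat count while keeping the trailing "\n" (if present).
    if count == 1:
        return line
    newline = '\n' if line.endswith('\n') else ''
    return '%s (x%d)%s' % (line.rstrip('\n'), count, newline)

def collapseRepeats(lines):
    """Hypothetical generator mapper: collapse runs of identical lines."""
    previous, count = None, 0
    for line in lines:
        if line == previous:
            count += 1
            continue
        if previous is not None:
            yield _withCount(previous, count)
        previous, count = line, 1
    if previous is not None:
        yield _withCount(previous, count)

mapped = list(collapseRepeats(
    ['retrying\n', 'retrying\n', 'retrying\n', 'connected\n', 'done']))
# mapped == ['retrying (x3)\n', 'connected\n', 'done']
```

A generator function like this would be placed in the mappers list in the same way as the pre-defined mappers.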

All lines passed to/from mappers end with a \n character (on all platforms), except for the last line of the file which will only have the \n if the file ends with a blank line. Mappers must always preserve the final \n of each line (if present).
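The newline rule is easy to get wrong in a custom mapper, so here is a small sketch of two per-line callables that respect it. Both function names are hypothetical and chosen only for illustration. The first transforms the line and puts back the final \n only if it was present; the second filters by returning None, which drops the line.

```python
def maskPasswords(line):
    """Hypothetical mapper: hide whatever follows 'password='."""
    newline = '\n' if line.endswith('\n') else ''
    body = line[:len(line) - len(newline)]
    key = 'password='
    index = body.find(key)
    if index >= 0:
        body = body[:index + len(key)] + '***'
    return body + newline  # the final "\n" is preserved if it was present

def dropDebugLines(line):
    """Hypothetical filter: returning None drops the line."""
    return None if line.startswith('DEBUG') else line
```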

New in version 1.6.0.

RegexReplace

class pysys.mappers.RegexReplace(regex, replacement)[source]

Bases: object

Mapper that substitutes all character sequences matching the specified regular expression with something different.

For example:

self.copy('myfile.txt', 'myfile-processed.txt', mappers=[RegexReplace(RegexReplace.DATETIME_REGEX, '<timestamp>')])

This mapper returns all lines whether or not any substitutions occur. To return only the parts of lines that match a regular expression, use IncludeMatches instead.

Parameters
  • regex (str|compiled_regex) – The regular expression to search for.

  • replacement (str) – The string to replace it with. This can contain backslash references to groups in the regex such as \1 for the first (...) group (see re.sub() in the Python documentation for more information).

>>> RegexReplace(RegexReplace.DATETIME_REGEX, '<timestamp>')('Test string x=2020-07-15T19:22:34+00:00.')
'Test string x=<timestamp>.'
>>> RegexReplace(RegexReplace.DATETIME_REGEX, '<timestamp>')('Test string x=5/7/2020 19:22:34.1234.')
'Test string x=<timestamp>.'
>>> RegexReplace(RegexReplace.DATETIME_REGEX, '<timestamp>')('Test string x=20200715T192234Z.\n')
'Test string x=<timestamp>.\n'
>>> RegexReplace(RegexReplace.NUMBER_REGEX, '<number>')('Test string x=123.')
'Test string x=<number>.'
>>> RegexReplace(RegexReplace.NUMBER_REGEX, '<number>')('Test string x=-12.45e+10.')
'Test string x=<number>.'
DATETIME_REGEX = '(([0-9]{1,4}[/-][0-9]{1,2}[/-][0-9]{2,4}[ T]?)?[0-9]{1,2}:[0-9]{2}:[0-9]{2}([.][0-9]+|Z|[+-][0-9][0-9](:[0-9][0-9])?)?|[0-9]{8}T[0-9]{6}(Z|[+-][0-9][0-9]:)?)'

A regular expression that can be used to match timestamps in ISO 8601 format and other common alternatives such as: “2020-07-15T19:22:34+00:00”, “5/7/2020 19:22:34.1234”, “20200715T192234Z”

NUMBER_REGEX = '[+-]?[0-9]+([.][0-9]+)?([eE][-+]?[0-9]+)?'

A regular expression that can be used to match integer or floating point numbers. This could be used in a mapper to replace all numbers with “<number>” to remove ids that would make diff-ing files more difficult, if you only care about validating the non-numeric text.
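To illustrate the diff-ing use case with the standard library only, the sketch below copies the NUMBER_REGEX value shown above and applies it with re.sub. The helper name maskNumbers and the sample log lines are made up for the example. Two lines that differ only in their numeric fields become identical once masked.

```python
import re

# Copied from the NUMBER_REGEX value documented above.
NUMBER_REGEX = '[+-]?[0-9]+([.][0-9]+)?([eE][-+]?[0-9]+)?'

def maskNumbers(line):
    """Hypothetical helper: replace every number with a fixed token."""
    return re.sub(NUMBER_REGEX, '<number>', line)

first = maskNumbers('Request id=1041 took 12.5 ms\n')
second = maskNumbers('Request id=2077 took 3e-2 ms\n')
# Both are now 'Request id=<number> took <number> ms\n'
```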

IncludeLinesBetween

class pysys.mappers.IncludeLinesBetween(startAt=None, stopAfter=None, startAfter=None, stopBefore=None)[source]

Bases: object

Mapper that filters out all lines except those within a range of expressions.

This is useful when a log file contains lots of data you don’t care about, in addition to some multi-line sequences that you want to extract (with pysys.basetest.BaseTest.copy) ready for pysys.basetest.BaseTest.assertDiff.

As this mapper is stateful, do not use a single instance of it in multiple tests (or multiple threads).

The following parameters can be either a callable/lambda that accepts an input line and returns a boolean, or a regular expression string to search for in the specified line.

Parameters
  • startAt (str|callable[str]->bool) – If it matches then the current line and subsequent lines are included (not filtered out). If not specified, lines from the start of the file onwards are matched.

  • startAfter (str|callable[str]->bool) – If it matches then the subsequent lines are included (not filtered out). If not specified, lines from the start of the file onwards are matched.

  • stopAfter (str|callable[str]->bool) – If it matches then lines after the current one are filtered out (unless/until a line matching startAt is found). Includes the stop line.

  • stopBefore (str|callable[str]->bool) – If it matches then this line and lines after it are filtered out (unless/until a line matching startAt is found). Excludes the stop line.

>>> def _mapperUnitTest(mapper, input): return '|'.join(x for x in (applyMappers([line+'\n' for line in input.replace('<tab>', chr(9)).split('|')], [mapper])))
>>> _mapperUnitTest( IncludeLinesBetween('start.*', 'stopafter.*'), 'a|start line|b|c|stopafter line|d|start line2|e').replace('\n','')
'start line|b|c|stopafter line|start line2|e'
>>> _mapperUnitTest( IncludeLinesBetween(startAt='start.*'), 'a|start line|b|c').replace('\n','')
'start line|b|c'
>>> _mapperUnitTest( IncludeLinesBetween(startAfter='start.*'), 'a|start line|b|c').replace('\n','')
'b|c'
>>> _mapperUnitTest( IncludeLinesBetween(startAt=lambda l: l.startswith('start')), 'a|start line|b|c').replace('\n','')
'start line|b|c'
>>> _mapperUnitTest( IncludeLinesBetween(stopAfter='stopafter.*'), 'a|stopafter|b|c').replace('\n','')
'a|stopafter'
>>> _mapperUnitTest( IncludeLinesBetween(stopBefore='stopbefore.*'), 'a|b|stopbefore|c')
'a\n|b\n'

Changed in version 2.0: Added startAfter
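The range semantics can be sketched in plain Python. This is a simplified stand-in written for illustration, not the PySys implementation: it handles only startAt and stopAfter, and the names includeBetween and toPredicate are hypothetical. It also shows one way a parameter that is either a regular expression string or a callable can be reduced to a single predicate.

```python
import re

def toPredicate(condition):
    # Accept None, a callable, or a regular expression string to search for.
    if condition is None or callable(condition):
        return condition
    pattern = re.compile(condition)
    return lambda line: pattern.search(line) is not None

def includeBetween(lines, startAt=None, stopAfter=None):
    """Simplified stand-in: yield only the lines inside the range(s)."""
    startAt, stopAfter = toPredicate(startAt), toPredicate(stopAfter)
    including = startAt is None  # with no startAt, include from the first line
    for line in lines:
        if not including and startAt is not None and startAt(line):
            including = True
        if including:
            yield line
            if stopAfter is not None and stopAfter(line):
                including = False  # the stop line itself was included

source = [s + '\n' for s in
          'a|start line|b|c|stopafter line|d|start line2|e'.split('|')]
kept = [line.rstrip('\n')
        for line in includeBetween(source, 'start.*', 'stopafter.*')]
# kept == ['start line', 'b', 'c', 'stopafter line', 'start line2', 'e']
```

Because the including flag lives across lines, a fresh generator (or, for the real mapper, a fresh instance) is needed for each file.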

JoinLines

class pysys.mappers.JoinLines(startAt=None, continueWhile=None, stopAfter=None, stopBefore=None, combiner=None)[source]

Bases: object

Mapper that joins/concatenates consecutive related lines together into a single line. Useful for combining error or stack trace lines together for easier grepping and for more meaningful test failure reasons.

There are static factory methods on this class to create pre-configured instances for common languages e.g. JoinLines.JavaStackTrace, JoinLines.PythonTraceback, or you can create your own. See pysys.basetest.BaseTest.assertGrep for an example.

As this mapper is stateful, do not use a single instance of it in multiple tests (or multiple threads).

The following parameters can be either a callable/lambda that accepts an input line and returns a boolean, or a regular expression string to search for in the specified line. Note that a lambda with a simple string operation such as startswith(...) is usually a lot more efficient than a regular expression.

Typically you would use startAt and just one of continueWhile/stopAfter/stopBefore.

Parameters
  • startAt (str|callable[str]->bool) – If it matches then the current line and subsequent lines are joined into one. Can be a regular expression or a function with argument line.

  • continueWhile (str|callable[str,list[str]]->bool) – After joining has started, then all consecutive lines matching this will be included in the current join, and it will be stopped as soon as a non-matching line is found. Can be a regular expression or a function with arguments (line, buffer) where buffer is the list of previous lines accumulated from the current startAt match.

  • stopAfter (str|callable[str,list[str]]->bool) – After joining has started, if this matches then this is the last line to be included in the current join. Includes the stop line. Can be a regular expression or a function with arguments (line, buffer) where buffer is the list of previous lines accumulated from the current startAt match.

  • stopBefore (str|callable[str,list[str]]->bool) – After joining has started, if this matches, then the preceding line is the last line to be included in the current join. Excludes the stop line. Can be a regular expression or a function with arguments (line, buffer) where buffer is the list of previous lines accumulated from the current startAt match.

  • combiner (callable[list[str]]->str) – A function that combines the joined lines from a given sequence into a single line. The default implementation is defaultCombiner.

>>> def _mapperUnitTest(mapper, input): return ''.join(x for x in (applyMappers([line+'\n' for line in input.replace('<tab>', chr(9)).split('|')], [mapper]))).replace('\n','|')
>>> _mapperUnitTest( JoinLines(startAt='startat.*', stopAfter='stopafter.*'), 'a| startat START|  stack1|  stack2 | stopafter STOP | d|startat2| e | f ')
'a|startat START / stack1 / stack2 / stopafter STOP| d|startat2 / e / f|'
>>> _mapperUnitTest( JoinLines(startAt='startat.*', continueWhile='stack.*'), 'startat START|  stack1|  stack2 | stopbefore NEXT LINE|d|startat2|stopbefore2')
'startat START / stack1 / stack2| stopbefore NEXT LINE|d|startat2|stopbefore2|'
>>> _mapperUnitTest( JoinLines(startAt='startat.*', stopAfter='stopafter.*'), 'a| startat START|  stack1|  |  stack2 | stopafter STOP |d|startat2| stopafter e | f ')
'a|startat START / stack1 / stack2 / stopafter STOP|d|startat2 / stopafter e| f |'
>>> _mapperUnitTest( JoinLines(startAt='startat.*', stopBefore='stopbefore.*'), 'startat START|  stack1|  stack2 | stopbefore NEXT LINE|d|startat2|stopbefore2')
'startat START / stack1 / stack2| stopbefore NEXT LINE|d|startat2|stopbefore2|'

New in version 2.0.

static defaultCombiner(lines)[source]

The default “combiner” function used by JoinLines, which joins the lines with the delimiter " / " after stripping leading/trailing whitespace and blank lines.

If you want different behaviour, create your own function with this signature and pass it in as the combiner= argument.

Parameters

lines (list[str]) – The lines to be joined.

Returns

A single string representing all of these lines.
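A custom combiner only needs to accept a list of lines and return one string. The sketch below uses a hypothetical function, arrowCombiner, that joins non-blank lines with " -> " instead of " / ". Ending the result with "\n" is an assumption made here so that the combined line follows the general rule that mappers keep each line's final newline.

```python
def arrowCombiner(lines):
    """Hypothetical combiner: join the non-blank lines with ' -> '."""
    parts = [line.strip() for line in lines if line.strip()]
    # Assumption: the combined line should end with "\n", like other mapped lines.
    return ' -> '.join(parts) + '\n'

combined = arrowCombiner(['Error: failed\n', '  at foo()\n', '\n', '  at bar()\n'])
# combined == 'Error: failed -> at foo() -> at bar()\n'
```

Such a function would then be supplied through the combiner= argument of JoinLines.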

static PythonTraceback()[source]

Mapper that joins the lines of a typical Python traceback (starting Traceback (most recent call last):) into a single line, for easier grepping and self-contained test outcome failure reasons.

The combiner is configured to put the actual exception class and message (which is the most important information) at the start of the joined line rather than at the end (after the traceback).

>>> def _mapperUnitTest(mapper, input): return '|'.join(x for x in (applyMappers([line for line in input.replace('|','\n|').replace('<tab>',chr(9)).split('|')], [mapper]))).replace('\n','')
>>> _mapperUnitTest( JoinLines.PythonTraceback(), 'a|Traceback (most recent call last):|  File "~/foo.py", line 1195, in __call__|    def __call__(self): myfunction()|  File "~/bar.py", line 11, in myfunction |    raise KeyError ("foo bar")|KeyError: "foo bar"|Normal operation is resumed')
'a|KeyError: "foo bar" / Traceback (most recent call last): / File "~/foo.py", line 1195, in __call__ / def __call__(self): myfunction() / File "~/bar.py", line 11, in myfunction / raise KeyError ("foo bar")|Normal operation is resumed'
>>> _mapperUnitTest( JoinLines.PythonTraceback(), 'a|Traceback (most recent call last):|  File "~/foo.py", line 1195, in __call__|    def __call__(self): myfunction()|  File "~/bar.py", line 11, in myfunction |    raise KeyError ("foo bar")|AssertionError|Normal operation is resumed')
'a|AssertionError / Traceback (most recent call last): / File "~/foo.py", line 1195, in __call__ / def __call__(self): myfunction() / File "~/bar.py", line 11, in myfunction / raise KeyError ("foo bar")|Normal operation is resumed'
>>> _mapperUnitTest( JoinLines.PythonTraceback(), 'a|Traceback (most recent call last):|  File "~/foo.py", line 1195, in __call__|    def __call__(self): myfunction()|  File "~/bar.py", line 11, in myfunction |    raise KeyError ("foo bar")||KeyError: "foo bar"|Normal operation is resumed')
'a|KeyError: "foo bar" / Traceback (most recent call last): / File "~/foo.py", line 1195, in __call__ / def __call__(self): myfunction() / File "~/bar.py", line 11, in myfunction / raise KeyError ("foo bar")|Normal operation is resumed'
>>> _mapperUnitTest( JoinLines.PythonTraceback(), 'a|Traceback (most recent call last):|  File "~/foo.py", line 1195, in __call__|    def __call__(self): myfunction()|  File "~/bar.py", line 11, in myfunction |<tab>raise KeyError ("foo bar")||KeyError: "foo bar"|OtherError: baz|Normal operation is resumed')
'a|KeyError: "foo bar" / Traceback (most recent call last): / File "~/foo.py", line 1195, in __call__ / def __call__(self): myfunction() / File "~/bar.py", line 11, in myfunction / raise KeyError ("foo bar")|OtherError: baz|Normal operation is resumed'
>>> _mapperUnitTest( JoinLines.PythonTraceback(), 'a|Traceback (most recent call last):|  File "~/foo.py", line 1195, in __call__|    def __call__(self): myfunction()|  File "~/bar.py", line 11, in myfunction |    raise KeyError ("foo bar")|Normal operation is resumed')
'a|Traceback (most recent call last): / File "~/foo.py", line 1195, in __call__ / def __call__(self): myfunction() / File "~/bar.py", line 11, in myfunction / raise KeyError ("foo bar")|Normal operation is resumed'
static JavaStackTrace(combiner=None, errorLogLineRegex='(ERROR|FATAL) ')[source]

Mapper that joins the lines of a typical Java(R) stack trace (from stderr or a log file) into a single line, for easier grepping and self-contained test outcome failure reasons.

Parameters
  • combiner (callable[list[str]]->str) – See JoinLines.

  • errorLogLineRegex (str) – A regular expression used to match log lines which could (optionally) be followed by a stack trace.

>>> def _mapperUnitTest(mapper, input): return '|'.join(x for x in (applyMappers([line for line in input.replace('<tab>', chr(9)).split('|')], [mapper]))).replace('\n','')
>>> _mapperUnitTest( JoinLines.JavaStackTrace(), 'java.lang.AssertionError: Invalid state|<tab>at org.junit.Assert.fail(Assert.java:100)|Caused by: java.lang.RuntimeError: Oh dear |<tab>at org.myorg.TestMyClass2|Normal operation has resumed ')
'java.lang.AssertionError: Invalid state / at org.junit.Assert.fail(Assert.java:100) / Caused by: java.lang.RuntimeError: Oh dear / at org.myorg.TestMyClass2|Normal operation has resumed '
>>> _mapperUnitTest( JoinLines.JavaStackTrace(), '2021-05-25 ERROR [Thread1] The operation failed|java.lang.AssertionError: Invalid state|<tab>at org.junit.Assert.fail(Assert.java:100)|Caused by: java.lang.RuntimeError: Oh dear |<tab>at org.myorg.TestMyClass2|2021-05-25 ERROR [Thread1] Another error|2021-05-25 INFO [Thread1] normal operation')
'2021-05-25 ERROR [Thread1] The operation failed / java.lang.AssertionError: Invalid state / at org.junit.Assert.fail(Assert.java:100) / Caused by: java.lang.RuntimeError: Oh dear / at org.myorg.TestMyClass2|2021-05-25 ERROR [Thread1] Another error|2021-05-25 INFO [Thread1] normal operation'
>>> _mapperUnitTest( JoinLines.JavaStackTrace(), 'Exception in thread "main" java.lang.RuntimeException: Main exception|<tab>at scratch.ExceptionTest.main(ExceptionTest.java:16)')
'Exception in thread "main" java.lang.RuntimeException: Main exception / at scratch.ExceptionTest.main(ExceptionTest.java:16)'
static AntBuildFailure()[source]

Mapper that joins the lines of Ant’s stderr BUILD FAILED output so that the joined line includes the failure message(s), for easier grepping and self-contained test outcome failure reasons.

As this mapper is stateful, do not use a single instance of it in multiple tests (or multiple threads).

>>> def _mapperUnitTest(mapper, input): return '|'.join(x for x in (applyMappers([line for line in input.split('|')], [mapper]))).replace('\n','')
>>> _mapperUnitTest( JoinLines.AntBuildFailure(), 'BUILD FAILED|~/build.xml:13: Unknown attribute [dodgyattribute]||Total time: 0 seconds')
'BUILD FAILED / ~/build.xml:13: Unknown attribute [dodgyattribute]||Total time: 0 seconds'

IncludeMatches

class pysys.mappers.IncludeMatches(regex, repl=None)[source]

Bases: object

Mapper that returns only text matching the specified regular expression.

Parameters
  • regex (str|compiled_regex) – The regular expression to match (this is a match not a search, so use .* at the beginning if you want to allow extra characters at the start of the line). Multiple expressions can be combined (efficiently) using (expr1|expr2) syntax.

  • repl (str) – By default this mapper returns the entire match, but instead set this to a replacement string such as \1 \2 to return only the specified (...) groups from the match (see the Python MatchObject expand method).

>>> IncludeMatches('F..')('Foo bar\n')
'Foo\n'
>>> IncludeMatches('F..')('Foo bar')
'Foo'
>>> IncludeMatches('.*(oo) *(.*)', repl='\\1-\\2')('Foo bar\n')
'oo-bar\n'

New in version 2.2.
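The repl behaviour is based on the expand method of Python match objects, which can be tried directly with the standard re module. This sketch uses only the standard library and reuses the expression from the example above; the variable names are illustrative.

```python
import re

match = re.match(r'.*(oo) *(.*)', 'Foo bar')
wholeMatch = match.group(0)        # the entire match: 'Foo bar'
expanded = match.expand(r'\1-\2')  # only the two (...) groups: 'oo-bar'
```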

IncludeLinesMatching

class pysys.mappers.IncludeLinesMatching(regex)[source]

Bases: object

Mapper that filters lines by returning only lines matching the specified regular expression.

To return only the matching parts of the lines, use IncludeMatches instead.

Parameters

regex (str|compiled_regex) – The regular expression to match (this is a match not a search, so use .* at the beginning if you want to allow extra characters at the start of the line). Multiple expressions can be combined (efficiently) using (expr1|expr2) syntax.

>>> IncludeLinesMatching('Foo.*')('Foo bar\n')
'Foo bar\n'
>>> IncludeLinesMatching('bar.*')('Foo bar\n') is None
True
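The difference between a match and a search is the same as between re.match and re.search in the standard library, as this short sketch shows (plain re calls only, not PySys code). A match is anchored at the start of the line, so an expression for text in the middle of a line needs a leading .* to succeed.

```python
import re

line = 'Foo bar\n'
anchored = re.match('bar.*', line)      # None: the line does not start with 'bar'
unanchored = re.search('bar.*', line)   # finds 'bar' anywhere in the line
prefixed = re.match('.*bar.*', line)    # leading .* lets the match succeed
either = re.match('(Baz|Foo).*', line)  # alternatives combined in one expression
```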

ExcludeLinesMatching

class pysys.mappers.ExcludeLinesMatching(regex)[source]

Bases: object

Mapper that filters lines by excluding/ignoring lines matching the specified regular expression.

Parameters

regex (str|compiled_regex) – The regular expression to match (use .* at the beginning to allow extra characters at the start of the line). Multiple expressions can be combined using (expr1|expr2) syntax.

>>> ExcludeLinesMatching('Foo.*')('Foo bar') is None
True
>>> ExcludeLinesMatching('bar.*')('Foo bar')
'Foo bar'

TruncateLongLines

class pysys.mappers.TruncateLongLines(maxLineLength=5000)[source]

Bases: object

Mapper that truncates any excessively long lines, to avoid regular expression matching taking an unreasonable amount of time.

Occasionally log files can contain lines with a large dump of debugging data, which typically don’t need to be checked when grepping the logs but unfortunately can cause debilitating slowness in Python since some regular expressions take an extremely long time to evaluate against long strings. This mapper can be added to avoid this problem by truncating log lines to a reasonable length.

Parameters

maxLineLength (int) – The maximum number of characters per line.

>>> TruncateLongLines() ('Short line\n')
'Short line\n'
>>> TruncateLongLines(13) ('Long line AAAAAAAAAAA\n')
'Long line AAA <truncated by PySys>\n'
>>> len(TruncateLongLines() ('a'*20000) ) < 10000+30
True

New in version 2.2.

SortLines

pysys.mappers.SortLines(key=None)[source]

Mapper that sorts all lines.

Note that unlike most mappers this will read the entire input into memory to perform the sort, so only use this when you know the file size isn’t enormous.

As this mapper is stateful, do not use a single instance of it in multiple tests (or multiple threads).

New in version 2.0.

Parameters

key (callable[str]->str) – A callable that returns the sort key to use for each line, in case you want something other than the default lexicographic sorting.

>>> def _mapperUnitTest(mapper, input): return '|'.join(x for x in (applyMappers([line+'' for line in input.split('|')], [mapper])))
>>> _mapperUnitTest( SortLines(), 'a|z|A|B|aa|c').replace('\n', '')
'A|B|a|aa|c|z'
>>> _mapperUnitTest( SortLines( key=lambda s: int(s) ), '100|1|10|22|2').replace('\n', '')
'1|2|10|22|100'
>>> _mapperUnitTest( SortLines(), 'a\n|c\n|b')
'a\n|b\n|c\n'
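The effect of a sort key can be previewed with the built-in sorted function, which takes a key callable of the same shape. This is a standard-library sketch rather than PySys code, and the sample data is invented. Note that int() tolerates the trailing \n on each line.

```python
lines = ['100\n', '1\n', '10\n', '22\n', '2\n']

lexicographic = sorted(lines)               # default string ordering
numeric = sorted(lines, key=lambda s: int(s))

words = ['a\n', 'z\n', 'A\n', 'B\n']
caseInsensitive = sorted(words, key=str.lower)
```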

applyMappers

pysys.mappers.applyMappers(iterator, mappers)[source]

A generator function that applies zero or more mappers to each line from an iterator and yields each fully mapped line.

If a mapper function returns None for a line that line is dropped.

Parameters
  • iterator (Iterable[str]) – An iterable such as a file object that yields lines to be mapped. Trailing newline characters are preserved, but not passed to the mappers.

  • mappers (List[callable[str]->str or callable[iterator]->Generator[str,None,None] ]) –

    A list of filter functions that will be used to pre-process each line from the file (returning None if the line is to be filtered out). For advanced cases where stateful mappings are needed, instead of a function to filter individual lines, you can provide a generator function which accepts an iterable of all input lines from each file and yields output lines (including potentially some additional lines).

    Mappers must always preserve the final \n of each line (if present).

    Do not share mapper instances across multiple tests or threads as this can cause race conditions.

    As a convenience to make conditionalization easier, any None items in the mappers list are simply ignored.

Return type

Iterable[str]

New in version 2.0.
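The per-line behaviour described above can be sketched as follows. This is a simplified stand-in for illustration only, not the PySys implementation: it ignores generator-function mappers entirely, and every name in it is hypothetical. It shows a line being dropped when a mapper returns None, and None entries in the mappers list being skipped, which makes conditional mappers easy to express.

```python
def applyLineMappers(lines, mappers):
    """Simplified stand-in that supports per-line mappers only."""
    active = [m for m in (mappers or []) if m is not None]  # None items are ignored
    for line in lines:
        for mapper in active:
            line = mapper(line)
            if line is None:  # a mapper returning None drops the line
                break
        else:
            yield line

def dropDebug(line):
    return None if line.startswith('DEBUG') else line

def shortenLevel(line):
    return line.replace('ERROR', 'E')

maskIds = False  # hypothetical option controlling an optional mapper
mappers = [dropDebug, (lambda l: l.replace('id=', 'id#')) if maskIds else None, shortenLevel]

result = list(applyLineMappers(['DEBUG x\n', 'ERROR id=1\n', 'INFO z'], mappers))
# result == ['E id=1\n', 'INFO z']
```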