pysys.mappers
Mappers filter or transform lines of input, for use with methods such as pysys.basetest.BaseTest.copy and pysys.basetest.BaseTest.assertGrep.
This package contains several pre-defined mappers:
| IncludeMatches | Mapper that returns only text matching the specified regular expression. |
| IncludeLinesMatching | Mapper that filters lines by returning only lines matching the specified regular expression. |
| IncludeLinesBetween | Mapper that filters out all lines except those within a range of expressions. |
| ExcludeLinesMatching | Mapper that filters lines by excluding/ignoring lines matching the specified regular expression. |
| RegexReplace | Mapper that substitutes all character sequences matching the specified regular expression with something different. |
| JoinLines | Mapper that joins/concatenates consecutive related lines together into a single line. |
| JoinLines.PythonTraceback | Mapper that joins the lines of a typical Python traceback (starting Traceback (most recent call last):) into a single line, for easier grepping and self-contained test outcome failure reasons. |
| JoinLines.JavaStackTrace | Mapper that joins the lines of a typical Java(R) stack trace (from stderr or a log file) into a single line, for easier grepping and self-contained test outcome failure reasons. |
| JoinLines.AntBuildFailure | Mapper that joins the lines of an ant’s stderr BUILD FAILED output to actually include the failure message(s), for easier grepping and self-contained test outcome failure reasons. |
| SortLines | Mapper that sorts all lines. |
| TruncateLongLines | Mapper that truncates any excessively long lines, to avoid regular expression matching taking an unreasonable amount of time. |
| applyMappers | A generator function that applies zero or more mappers to each line from an iterator and yields each fully mapped line. |
In addition to the above, you can create custom mappers, which are usually callables (functions, lambdas, or classes with a __call__() method) that return the transformed copy of each incoming line.
For advanced cases you can provide a generator function that accepts a line iterator as input and yields the mapped lines; this allows for stateful transformation and avoids the limitation of having a 1:1 (or 1:0) relationship between input and output lines.
All lines passed to/from mappers end with a \n character (on all platforms), except for the last line of the file which will only have the \n if the file ends with a blank line.
Mappers must always preserve the final \n of each line (if present).
New in version 1.6.0.
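As a rough illustration of the two custom mapper styles described above, the sketch below defines two per-line mappers and one generator-style mapper. The names mask_passwords, drop_debug_lines and number_lines are hypothetical, and the mappers are called directly here rather than through PySys.

```python
import re

def mask_passwords(line):
    """Per-line mapper: returns a transformed copy of the line (1:1)."""
    # \S+ never matches the trailing \n, so the line ending is preserved.
    return re.sub(r'password=\S+', 'password=<hidden>', line)

def drop_debug_lines(line):
    """Per-line mapper: returning None drops the line (1:0)."""
    return None if line.startswith('DEBUG') else line

def number_lines(lines):
    """Generator-style mapper: accepts a line iterator and yields lines.

    Because it sees the whole stream it can keep state (here, a counter).
    """
    for count, line in enumerate(lines, start=1):
        yield '%d: %s' % (count, line)

sample = ['INFO login password=secret\n', 'DEBUG noise\n', 'INFO done\n']

# Apply the per-line mappers by hand, skipping lines mapped to None.
kept = [mask_passwords(l) for l in sample if drop_debug_lines(l) is not None]
numbered = list(number_lines(kept))
```

In a test, callables like these would typically be supplied through the mappers= argument of a method such as pysys.basetest.BaseTest.copy.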
RegexReplace
class pysys.mappers.RegexReplace(regex, replacement)
Bases: object
Mapper that substitutes all character sequences matching the specified regular expression with something different.
For example:
self.copy('myfile.txt', 'myfile-processed.txt', mappers=[RegexReplace(RegexReplace.DATETIME_REGEX, '<timestamp>')])
This mapper returns all lines whether or not any substitutions occur. To return only the parts of lines that match a regular expression, use IncludeMatches instead.
- Parameters
regex (str|compiled_regex) – The regular expression to search for.
replacement (str) – The string to replace it with. This can contain backslash references to groups in the regex such as \1 for the first (...) group (see re.sub() in the Python documentation for more information).
>>> RegexReplace(RegexReplace.DATETIME_REGEX, '<timestamp>')('Test string x=2020-07-15T19:22:34+00:00.')
'Test string x=<timestamp>.'
>>> RegexReplace(RegexReplace.DATETIME_REGEX, '<timestamp>')('Test string x=5/7/2020 19:22:34.1234.')
'Test string x=<timestamp>.'
>>> RegexReplace(RegexReplace.DATETIME_REGEX, '<timestamp>')('Test string x=20200715T192234Z.\n')
'Test string x=<timestamp>.\n'
>>> RegexReplace(RegexReplace.NUMBER_REGEX, '<number>')('Test string x=123.')
'Test string x=<number>.'
>>> RegexReplace(RegexReplace.NUMBER_REGEX, '<number>')('Test string x=-12.45e+10.')
'Test string x=<number>.'
DATETIME_REGEX = '(([0-9]{1,4}[/-][0-9]{1,2}[/-][0-9]{2,4}[ T]?)?[0-9]{1,2}:[0-9]{2}:[0-9]{2}([.][0-9]+|Z|[+-][0-9][0-9](:[0-9][0-9])?)?|[0-9]{8}T[0-9]{6}(Z|[+-][0-9][0-9]:)?)'
A regular expression that can be used to match timestamps in ISO 8601 format and other common alternatives such as: “2020-07-15T19:22:34+00:00”, “5/7/2020 19:22:34.1234”, “20200715T192234Z”.
NUMBER_REGEX = '[+-]?[0-9]+([.][0-9]+)?([eE][-+]?[0-9]+)?'
A regular expression that can be used to match integer or floating point numbers. This could be used in a mapper to replace all numbers with “<number>” to remove ids that would make diff-ing files more difficult, if you only care about validating the non-numeric text.
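To see what this kind of substitution does without running PySys, the following sketch applies the documented NUMBER_REGEX pattern, and a pattern with group back-references, using Python's re.sub() directly. The helper replace_in_line is a hypothetical stand-in and is not the RegexReplace implementation.

```python
import re

# Pattern copied from the NUMBER_REGEX constant documented above.
NUMBER_REGEX = '[+-]?[0-9]+([.][0-9]+)?([eE][-+]?[0-9]+)?'

def replace_in_line(regex, replacement, line):
    """Hypothetical stand-in for a substitution mapper, built on re.sub()."""
    # Every non-overlapping match is replaced; unmatched text is kept as-is.
    return re.sub(regex, replacement, line)

# Both numbers are replaced, and the trailing \n is untouched.
masked = replace_in_line(NUMBER_REGEX, '<number>', 'id=42 took 1.5e3 ms\n')

# \1 and \2 in the replacement refer to the first and second (...) groups.
swapped = replace_in_line(r'(\w+)=(\w+)', r'\2=\1', 'key=value\n')
```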
IncludeLinesBetween
class pysys.mappers.IncludeLinesBetween(startAt=None, stopAfter=None, startAfter=None, stopBefore=None)
Bases: object
Mapper that filters out all lines except those within a range of expressions.
This is useful when a log file contains lots of data you don’t care about, in addition to some multi-line sequences that you want to extract (with pysys.basetest.BaseTest.copy) ready for pysys.basetest.BaseTest.assertDiff.
As this mapper is stateful, do not use a single instance of it in multiple tests (or multiple threads).
The following parameters can be either a callable/lambda that accepts an input line and returns a boolean, or a regular expression string to search for in the specified line.
- Parameters
startAt (str|callable[str]->bool) – If it matches then the current line and subsequent lines are included (not filtered out). If not specified, lines from the start of the file onwards are matched.
startAfter (str|callable[str]->bool) – If it matches then the subsequent lines are included (not filtered out). If not specified, lines from the start of the file onwards are matched.
stopAfter (str|callable[str]->bool) – If it matches then lines after the current one are filtered out (unless/until a line matching startAt is found). Includes the stop line.
stopBefore (str|callable[str]->bool) – If it matches then this line and lines after it are filtered out (unless/until a line matching startAt is found). Excludes the stop line.
>>> def _mapperUnitTest(mapper, input): return '|'.join(x for x in (applyMappers([line+'\n' for line in input.replace('<tab>', chr(9)).split('|')], [mapper])))
>>> _mapperUnitTest( IncludeLinesBetween('start.*', 'stopafter.*'), 'a|start line|b|c|stopafter line|d|start line2|e').replace('\n','')
'start line|b|c|stopafter line|start line2|e'
>>> _mapperUnitTest( IncludeLinesBetween(startAt='start.*'), 'a|start line|b|c').replace('\n','')
'start line|b|c'
>>> _mapperUnitTest( IncludeLinesBetween(startAfter='start.*'), 'a|start line|b|c').replace('\n','')
'b|c'
>>> _mapperUnitTest( IncludeLinesBetween(startAt=lambda l: l.startswith('start')), 'a|start line|b|c').replace('\n','')
'start line|b|c'
>>> _mapperUnitTest( IncludeLinesBetween(stopAfter='stopafter.*'), 'a|stopafter|b|c').replace('\n','')
'a|stopafter'
>>> _mapperUnitTest( IncludeLinesBetween(stopBefore='stopbefore.*'), 'a|b|stopbefore|c')
'a\n|b\n'
Changed in version 2.0: Added startAfter
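The stateful filtering described above can be pictured as a small generator that toggles an "including" flag. This is a simplified sketch covering only regular-expression startAt and stopAfter arguments. The function include_between and its parameter names are hypothetical, not the PySys implementation.

```python
import re

def include_between(lines, start_at=None, stop_after=None):
    """Yield only the lines between start_at and stop_after (both inclusive)."""
    # With no start expression, lines are included from the beginning.
    including = start_at is None
    for line in lines:
        if not including and start_at is not None and re.search(start_at, line):
            including = True
        if including:
            yield line
            # The stop line itself is included, then filtering resumes.
            if stop_after is not None and re.search(stop_after, line):
                including = False

log = ['a\n', 'start line\n', 'b\n', 'stopafter line\n', 'd\n', 'start line2\n', 'e\n']
extracted = list(include_between(log, 'start.*', 'stopafter.*'))
```

Because the flag lives inside one generator call, each file processed this way starts from a clean state, which is the property that sharing a single stateful instance across tests or threads would put at risk.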
JoinLines
class pysys.mappers.JoinLines(startAt=None, continueWhile=None, stopAfter=None, stopBefore=None, combiner=None)
Bases: object
Mapper that joins/concatenates consecutive related lines together into a single line. Useful for combining error or stack trace lines together for easier grepping and for more meaningful test failure reasons.
There are static factory methods on this class to create pre-configured instances for common languages e.g. JoinLines.JavaStackTrace, JoinLines.PythonTraceback, or you can create your own. See pysys.basetest.BaseTest.assertGrep for an example.
As this mapper is stateful, do not use a single instance of it in multiple tests (or multiple threads).
The following parameters can be either a callable/lambda that accepts an input line and returns a boolean, or a regular expression string to search for in the specified line. Note that a lambda with a simple string operation such as startswith(...) is usually a lot more efficient than a regular expression.
Typically you would use startAt and just one of continueWhile/stopAfter/stopBefore.
- Parameters
startAt (str|callable[str]->bool) – If it matches then the current line and subsequent lines are joined into one. Can be a regular expression or a function with argument line.
continueWhile (str|callable[str,list[str]]->bool) – After joining has started, all consecutive lines matching this will be included in the current join, and it will be stopped as soon as a non-matching line is found. Can be a regular expression or a function with arguments (line, buffer) where buffer is the list of previous lines accumulated from the current startAt match.
stopAfter (str|callable[str,list[str]]->bool) – After joining has started, if this matches then this is the last line to be included in the current join. Includes the stop line. Can be a regular expression or a function with arguments (line, buffer) where buffer is the list of previous lines accumulated from the current startAt match.
stopBefore (str|callable[str,list[str]]->bool) – After joining has started, if this matches, then the preceding line is the last line to be included in the current join. Excludes the stop line. Can be a regular expression or a function with arguments (line, buffer) where buffer is the list of previous lines accumulated from the current startAt match.
combiner (callable[list[str]]->str) – A function that combines the joined lines from a given sequence into a single line. The default implementation is defaultCombiner.
>>> def _mapperUnitTest(mapper, input): return ''.join(x for x in (applyMappers([line+'\n' for line in input.replace('<tab>', chr(9)).split('|')], [mapper]))).replace('\n','|')
>>> _mapperUnitTest( JoinLines(startAt='startat.*', stopAfter='stopafter.*'), 'a| startat START| stack1| stack2 | stopafter STOP | d|startat2| e | f ')
'a|startat START / stack1 / stack2 / stopafter STOP| d|startat2 / e / f|'
>>> _mapperUnitTest( JoinLines(startAt='startat.*', continueWhile='stack.*'), 'startat START| stack1| stack2 | stopbefore NEXT LINE|d|startat2|stopbefore2')
'startat START / stack1 / stack2| stopbefore NEXT LINE|d|startat2|stopbefore2|'
>>> _mapperUnitTest( JoinLines(startAt='startat.*', stopAfter='stopafter.*'), 'a| startat START| stack1| | stack2 | stopafter STOP |d|startat2| stopafter e | f ')
'a|startat START / stack1 / stack2 / stopafter STOP|d|startat2 / stopafter e| f |'
>>> _mapperUnitTest( JoinLines(startAt='startat.*', stopBefore='stopbefore.*'), 'startat START| stack1| stack2 | stopbefore NEXT LINE|d|startat2|stopbefore2')
'startat START / stack1 / stack2| stopbefore NEXT LINE|d|startat2|stopbefore2|'
New in version 2.0.
static defaultCombiner(lines)
The default “combiner” function used by JoinLines, which joins the lines with the delimiter " / " after stripping leading/trailing whitespace and blank lines.
If you want different behaviour, create your own function with this signature and pass it in as the combiner= argument.
- Parameters
lines (list[str]) – The lines to be joined.
- Returns
A single string representing all of these lines.
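A combiner is just a function from a list of lines to one string. The sketch below approximates the documented default behaviour and shows one possible alternative. Both function names are hypothetical. The text above does not say whether the final \n is restored by the combiner or by JoinLines, so these sketches leave it out.

```python
def default_like_combiner(lines):
    """Approximates the documented default: strip, drop blanks, join with ' / '."""
    stripped = [line.strip() for line in lines]
    return ' / '.join(line for line in stripped if line)

def first_line_only_combiner(lines):
    """Alternative: keep only the first non-blank line and note how many were dropped."""
    stripped = [line.strip() for line in lines if line.strip()]
    if len(stripped) <= 1:
        return ''.join(stripped)
    return '%s (+%d more lines)' % (stripped[0], len(stripped) - 1)

buffer = [' startat START\n', ' stack1\n', ' \n', ' stack2 \n']
joined = default_like_combiner(buffer)
summary = first_line_only_combiner(buffer)
```

A function with this shape would be passed as the combiner= argument when constructing the mapper.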
static PythonTraceback()
Mapper that joins the lines of a typical Python traceback (starting Traceback (most recent call last):) into a single line, for easier grepping and self-contained test outcome failure reasons.
The combiner is configured to put the actual exception class and message (which is the most important information) at the start of the joined line rather than at the end (after the traceback).
>>> def _mapperUnitTest(mapper, input): return '|'.join(x for x in (applyMappers([line for line in input.replace('|','\n|').replace('<tab>',chr(9)).split('|')], [mapper]))).replace('\n','')
>>> _mapperUnitTest( JoinLines.PythonTraceback(), 'a|Traceback (most recent call last):| File "~/foo.py", line 1195, in __call__| def __call__(self): myfunction()| File "~/bar.py", line 11, in myfunction | raise KeyError ("foo bar")|KeyError: "foo bar"|Normal operation is resumed')
'a|KeyError: "foo bar" / Traceback (most recent call last): / File "~/foo.py", line 1195, in __call__ / def __call__(self): myfunction() / File "~/bar.py", line 11, in myfunction / raise KeyError ("foo bar")|Normal operation is resumed'
>>> _mapperUnitTest( JoinLines.PythonTraceback(), 'a|Traceback (most recent call last):| File "~/foo.py", line 1195, in __call__| def __call__(self): myfunction()| File "~/bar.py", line 11, in myfunction | raise KeyError ("foo bar")|AssertionError|Normal operation is resumed')
'a|AssertionError / Traceback (most recent call last): / File "~/foo.py", line 1195, in __call__ / def __call__(self): myfunction() / File "~/bar.py", line 11, in myfunction / raise KeyError ("foo bar")|Normal operation is resumed'
>>> _mapperUnitTest( JoinLines.PythonTraceback(), 'a|Traceback (most recent call last):| File "~/foo.py", line 1195, in __call__| def __call__(self): myfunction()| File "~/bar.py", line 11, in myfunction | raise KeyError ("foo bar")||KeyError: "foo bar"|Normal operation is resumed')
'a|KeyError: "foo bar" / Traceback (most recent call last): / File "~/foo.py", line 1195, in __call__ / def __call__(self): myfunction() / File "~/bar.py", line 11, in myfunction / raise KeyError ("foo bar")|Normal operation is resumed'
>>> _mapperUnitTest( JoinLines.PythonTraceback(), 'a|Traceback (most recent call last):| File "~/foo.py", line 1195, in __call__| def __call__(self): myfunction()| File "~/bar.py", line 11, in myfunction |<tab>raise KeyError ("foo bar")||KeyError: "foo bar"|OtherError: baz|Normal operation is resumed')
'a|KeyError: "foo bar" / Traceback (most recent call last): / File "~/foo.py", line 1195, in __call__ / def __call__(self): myfunction() / File "~/bar.py", line 11, in myfunction / raise KeyError ("foo bar")|OtherError: baz|Normal operation is resumed'
>>> _mapperUnitTest( JoinLines.PythonTraceback(), 'a|Traceback (most recent call last):| File "~/foo.py", line 1195, in __call__| def __call__(self): myfunction()| File "~/bar.py", line 11, in myfunction | raise KeyError ("foo bar")|Normal operation is resumed')
'a|Traceback (most recent call last): / File "~/foo.py", line 1195, in __call__ / def __call__(self): myfunction() / File "~/bar.py", line 11, in myfunction / raise KeyError ("foo bar")|Normal operation is resumed'
static JavaStackTrace(combiner=None, errorLogLineRegex='(ERROR|FATAL) ')
Mapper that joins the lines of a typical Java(R) stack trace (from stderr or a log file) into a single line, for easier grepping and self-contained test outcome failure reasons.
- Parameters
combiner (callable[list[str]]->str) – See JoinLines.
errorLogLineRegex (str) – A regular expression used to match log lines which could (optionally) be followed by a stack trace.
>>> def _mapperUnitTest(mapper, input): return '|'.join(x for x in (applyMappers([line for line in input.replace('<tab>', chr(9)).split('|')], [mapper]))).replace('\n','')
>>> _mapperUnitTest( JoinLines.JavaStackTrace(), 'java.lang.AssertionError: Invalid state|<tab>at org.junit.Assert.fail(Assert.java:100)|Caused by: java.lang.RuntimeError: Oh dear |<tab>at org.myorg.TestMyClass2|Normal operation has resumed ')
'java.lang.AssertionError: Invalid state / at org.junit.Assert.fail(Assert.java:100) / Caused by: java.lang.RuntimeError: Oh dear / at org.myorg.TestMyClass2|Normal operation has resumed '
>>> _mapperUnitTest( JoinLines.JavaStackTrace(), '2021-05-25 ERROR [Thread1] The operation failed|java.lang.AssertionError: Invalid state|<tab>at org.junit.Assert.fail(Assert.java:100)|Caused by: java.lang.RuntimeError: Oh dear |<tab>at org.myorg.TestMyClass2|2021-05-25 ERROR [Thread1] Another error|2021-05-25 INFO [Thread1] normal operation')
'2021-05-25 ERROR [Thread1] The operation failed / java.lang.AssertionError: Invalid state / at org.junit.Assert.fail(Assert.java:100) / Caused by: java.lang.RuntimeError: Oh dear / at org.myorg.TestMyClass2|2021-05-25 ERROR [Thread1] Another error|2021-05-25 INFO [Thread1] normal operation'
>>> _mapperUnitTest( JoinLines.JavaStackTrace(), 'Exception in thread "main" java.lang.RuntimeException: Main exception|<tab>at scratch.ExceptionTest.main(ExceptionTest.java:16)')
'Exception in thread "main" java.lang.RuntimeException: Main exception / at scratch.ExceptionTest.main(ExceptionTest.java:16)'
static AntBuildFailure()
Mapper that joins the lines of an ant’s stderr BUILD FAILED output to actually include the failure message(s), for easier grepping and self-contained test outcome failure reasons.
As this mapper is stateful, do not use a single instance of it in multiple tests (or multiple threads).
>>> def _mapperUnitTest(mapper, input): return '|'.join(x for x in (applyMappers([line for line in input.split('|')], [mapper]))).replace('\n','')
>>> _mapperUnitTest( JoinLines.AntBuildFailure(), 'BUILD FAILED|~/build.xml:13: Unknown attribute [dodgyattribute]||Total time: 0 seconds')
'BUILD FAILED / ~/build.xml:13: Unknown attribute [dodgyattribute]||Total time: 0 seconds'
IncludeMatches
class pysys.mappers.IncludeMatches(regex, repl=None)
Bases: object
Mapper that returns only text matching the specified regular expression.
- Parameters
regex (str|compiled_regex) – The regular expression to match (this is a match not a search, so use .* at the beginning if you want to allow extra characters at the start of the line). Multiple expressions can be combined (efficiently) using (expr1|expr2) syntax.
repl (str) – By default this mapper returns the entire match, but instead set this to a replacement string such as \1 \2 to return only the specified (...) groups from the match (see the Python MatchObject expand method).
>>> IncludeMatches('F..')('Foo bar\n')
'Foo\n'
>>> IncludeMatches('F..')('Foo bar')
'Foo'
>>> IncludeMatches('.*(oo) *(.*)', repl='\\1-\\2')('Foo bar\n')
'oo-bar\n'
New in version 2.2.
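The "match not a search" distinction and the repl template both come straight from Python's re module. The sketch below uses only the standard library, with illustrative variable names, to show why a leading .* is needed and what expand() does with a template.

```python
import re

line = 'Foo bar\n'

# re.match() only tries to match at the start of the string...
anchored = re.match('bar', line)
# ...so a leading .* is needed to allow other characters before the text of interest.
prefixed = re.match('.*bar', line)

# expand() fills a template such as \1-\2 with the captured groups.
groups = re.match('.*(oo) *(.*)', line).expand(r'\1-\2')
```

Here anchored is None because the line does not begin with "bar", while prefixed matches the whole of "Foo bar".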
IncludeLinesMatching
class pysys.mappers.IncludeLinesMatching(regex)
Bases: object
Mapper that filters lines by returning only lines matching the specified regular expression.
To return only the matching parts of the lines, use IncludeMatches instead.
- Parameters
regex (str|compiled_regex) – The regular expression to match (this is a match not a search, so use .* at the beginning if you want to allow extra characters at the start of the line). Multiple expressions can be combined (efficiently) using (expr1|expr2) syntax.
>>> IncludeLinesMatching('Foo.*')('Foo bar\n')
'Foo bar\n'
>>> IncludeLinesMatching('bar.*')('Foo bar\n') is None
True
ExcludeLinesMatching
class pysys.mappers.ExcludeLinesMatching(regex)
Bases: object
Mapper that filters lines by excluding/ignoring lines matching the specified regular expression.
- Parameters
regex (str|compiled_regex) – The regular expression to match (use .* at the beginning to allow extra characters at the start of the line). Multiple expressions can be combined using (expr1|expr2) syntax.
>>> ExcludeLinesMatching('Foo.*')('Foo bar') is None
True
>>> ExcludeLinesMatching('bar.*')('Foo bar')
'Foo bar'
TruncateLongLines
class pysys.mappers.TruncateLongLines(maxLineLength=5000)
Bases: object
Mapper that truncates any excessively long lines, to avoid regular expression matching taking an unreasonable amount of time.
Occasionally log files can contain lines with a large dump of debugging data, which typically don’t need to be checked when grepping the logs but unfortunately can cause debilitating slowness in Python since some regular expressions take an extremely long time to evaluate against long strings. This mapper can be added to avoid this problem by truncating log lines to a reasonable length.
- Parameters
maxLineLength (int) – The maximum number of characters per line.
>>> TruncateLongLines() ('Short line\n')
'Short line\n'
>>> TruncateLongLines(13) ('Long line AAAAAAAAAAA\n')
'Long line AAA <truncated by PySys>\n'
>>> len(TruncateLongLines() ('a'*20000) ) < 10000+30
True
New in version 2.2.
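A truncating mapper has to shorten the text while still honouring the rule that a final \n is preserved. The following sketch shows one way to do that. The function truncate_line is hypothetical, its marker text is taken from the example output above, and whether PySys counts the newline towards the limit is an assumption here.

```python
def truncate_line(line, max_length=5000, marker=' <truncated by PySys>'):
    """Cut the line body to max_length characters, keeping any final newline."""
    newline = '\n' if line.endswith('\n') else ''
    # Assumption: the limit applies to the text before the newline.
    body = line[:-1] if newline else line
    if len(body) <= max_length:
        return line
    return body[:max_length] + marker + newline

short = truncate_line('Short line\n')
cut = truncate_line('Long line AAAAAAAAAAA\n', 13)
```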
SortLines
pysys.mappers.SortLines(key=None)
Mapper that sorts all lines.
Note that unlike most mappers this will read the entire input into memory to perform the sort, so only use this when you know the file size isn’t enormous.
As this mapper is stateful, do not use a single instance of it in multiple tests (or multiple threads).
New in version 2.0.
- Parameters
key (callable[str]->str) – A callable that returns the sort key to use for each line, in case you want something other than the default lexicographic sorting.
>>> def _mapperUnitTest(mapper, input): return '|'.join(x for x in (applyMappers([line+'' for line in input.split('|')], [mapper])))
>>> _mapperUnitTest( SortLines(), 'a|z|A|B|aa|c').replace('\n', '')
'A|B|a|aa|c|z'
>>> _mapperUnitTest( SortLines( key=lambda s: int(s) ), '100|1|10|22|2').replace('\n', '')
'1|2|10|22|100'
>>> _mapperUnitTest( SortLines(), 'a\n|c\n|b')
'a\n|b\n|c\n'
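Sorting cannot be done one line at a time, which is why it suits the generator style of mapper: the whole input is buffered, sorted, then yielded. The sketch below illustrates that idea. The function sort_lines is hypothetical, and adding a missing final \n is a guess based on the last example above.

```python
def sort_lines(lines, key=None):
    """Generator-style mapper sketch: buffer every line, then yield them sorted."""
    # Give an unterminated last line a newline so it can sort into the middle.
    buffered = [line if line.endswith('\n') else line + '\n' for line in lines]
    yield from sorted(buffered, key=key)

alphabetical = list(sort_lines(['a\n', 'c\n', 'b']))
# int() ignores surrounding whitespace, so the trailing \n does not matter here.
numeric = list(sort_lines(['100\n', '1\n', '10\n'], key=lambda s: int(s)))
```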
applyMappers
pysys.mappers.applyMappers(iterator, mappers)
A generator function that applies zero or more mappers to each line from an iterator and yields each fully mapped line.
If a mapper function returns None for a line that line is dropped.
- Parameters
iterator (Iterable[str]) – An iterable such as a file object that yields lines to be mapped. Trailing newline characters are preserved, but not passed to the mappers.
mappers (List[callable[str]->str or callable[iterator]->Generator[str,None,None] ]) – A list of filter functions that will be used to pre-process each line from the file (returning None if the line is to be filtered out). For advanced cases where stateful mappings are needed, instead of a function to filter individual lines, you can provide a generator function which accepts an iterable of all input lines from each file and yields output lines (including potentially some additional lines).
Mappers must always preserve the final \n of each line (if present).
Do not share mapper instances across multiple tests or threads as this can cause race conditions.
As a convenience to make conditionalization easier, any None items in the mappers list are simply ignored.
- Return type
Iterable[str]
New in version 2.0.
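The two None conventions described above (a mapper returning None drops the line, and a None entry in the list is skipped) can be sketched in a few lines. This simplified apply_line_mappers is hypothetical, handles only per-line function mappers, and is not the PySys implementation.

```python
def apply_line_mappers(lines, mappers):
    """Apply each per-line mapper in order, dropping lines mapped to None."""
    # None entries in the list are ignored, which keeps conditional lists simple.
    active = [mapper for mapper in mappers if mapper is not None]
    for line in lines:
        for mapper in active:
            line = mapper(line)
            if line is None:
                break  # the line was filtered out, so skip the remaining mappers
        if line is not None:
            yield line

def drop_comments(line):
    return None if line.startswith('#') else line

def shout(line):
    return line.upper()

def build_mappers(uppercase):
    # The second entry is None unless upper-casing was requested.
    return [drop_comments, shout if uppercase else None]

plain = list(apply_line_mappers(['# header\n', 'data\n'], build_mappers(False)))
loud = list(apply_line_mappers(['# header\n', 'data\n'], build_mappers(True)))
```

Writing the list as [mapperA, mapperB if condition else None] avoids building it up with separate if statements.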