corrupy.pickleast — Special pickle construction¶
The pickleast module provides tools for constructing special pickles capable of
executing code and performing other operations when they are deserialized by Python’s
pickle module.
Technical Background¶
The deserialization machinery of the pickle format is powerful enough to construct arbitrary graphs of dynamically created Python objects. Because object construction is driven by instructions in the data stream itself, the same machinery can do a lot more than just rebuild objects.
The pickle format can be conceptually seen as a programming language that targets a very
simple stack machine. It has instructions for loading types, constructing them, and memoizing
them. Crucially, it has an instruction intended for object creation that simply calls a value
on the stack with other values as arguments. This means we can not only construct objects, but
also call functions. And since we can load functions like getitem() from the operator module,
this instruction can be used to call object methods, do slice indexing, perform math, etc. The
main limitation is that there is no control flow in the execution. The pickle bytecode is
executed linearly, and as such it is impossible to encode looping constructs in native pickle
bytecode.
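As a minimal sketch of the call instruction described above, the following hand-written protocol 0 pickle uses only the standard pickle module. It loads operator.getitem, builds an argument tuple and applies the REDUCE opcode. The byte string was assembled by hand for illustration and is not output of pickleast.

```python
import pickle

# Hand-assembled protocol 0 pickle; each fragment is one or more instructions.
payload = (
    b"coperator\ngetitem\n"  # GLOBAL: push operator.getitem
    b"("                      # MARK: start of the argument tuple
    b"(I10\nI20\nI30\nl"      # MARK, three INTs, LIST: push [10, 20, 30]
    b"I1\n"                   # INT: push the index 1
    b"t"                      # TUPLE: collapse to ([10, 20, 30], 1)
    b"R"                      # REDUCE: call getitem(*args), push the result
    b"."                      # STOP: return the top of the stack
)

result = pickle.loads(payload)
print(result)  # 20
```

No object is constructed here at all: the only effect of unpickling is the function call.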
This can be worked around by calling builtins like eval() and storing the Python
code as a string, but one can imagine that these functions would be blacklisted during
unpickling. Therefore, this module implements as much Python functionality as possible in pure pickle bytecode.
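To illustrate the kind of blacklisting meant here, the sketch below overrides find_class() on the standard pickle.Unpickler. The deny-list contents and the names DenyListUnpickler and restricted_loads are assumptions made up for this example. A payload that references eval() is rejected, while one built only from operator.getitem passes through.

```python
import io
import pickle

# Hypothetical deny-list, for illustration only.
BLOCKED = {("builtins", "eval"), ("builtins", "exec")}


class DenyListUnpickler(pickle.Unpickler):
    """Unpickler that refuses to load a few well-known dangerous globals."""

    def find_class(self, module, name):
        if (module, name) in BLOCKED:
            raise pickle.UnpicklingError(f"{module}.{name} is blocked")
        return super().find_class(module, name)


def restricted_loads(data):
    return DenyListUnpickler(io.BytesIO(data)).load()


# Hand-written protocol 0 pickles: eval('1+1') and getitem([10, 20, 30], 1).
eval_payload = b"cbuiltins\neval\n(S'1+1'\ntR."
getitem_payload = b"coperator\ngetitem\n((I10\nI20\nI30\nlI1\ntR."

try:
    restricted_loads(eval_payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

allowed_result = restricted_loads(getitem_payload)
```

A deny-list like this stops the string-based shortcut but not calls assembled from ordinary functions, which is the gap pure pickle bytecode exploits.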
Interface¶
To embed special behaviour in a pickle bytestream, this module provides a set of types based on
the PickleBase type which can be placed anywhere in a normal Python data structure. This
data structure can then be serialized using the special AstPickler implementation, which
will embed the special instructions into the bytestream.
- corrupy.pickleast.dumps(obj, protocol=2)¶
Create a pickle from an object with special behaviour for
PickleBase nodes, writing the result to a bytes object.
- corrupy.pickleast.dump(obj, file=None, protocol=2)¶
Like dumps(), but writes the pickle to a file-like object.
- class corrupy.pickleast.AstPickler(file, protocol=None, *, fix_imports=True, buffer_callback=None)¶
A pickle.Pickler subclass with special behaviour for PickleBase instances.
This takes a binary file for writing a pickle data stream.
The optional protocol argument tells the pickler to use the given protocol; supported protocols are 0, 1, 2, 3, 4 and 5. The default protocol is 4. It was introduced in Python 3.4, and is incompatible with previous versions.
Specifying a negative protocol version selects the highest protocol version supported. The higher the protocol used, the more recent the version of Python needed to read the pickle produced.
The file argument must have a write() method that accepts a single bytes argument. It can thus be a file object opened for binary writing, an io.BytesIO instance, or any other custom object that meets this interface.
If fix_imports is True and protocol is less than 3, pickle will try to map the new Python 3 names to the old module names used in Python 2, so that the pickle data stream is readable with Python 2.
If buffer_callback is None (the default), buffer views are serialized into file as part of the pickle stream.
If buffer_callback is not None, then it can be called any number of times with a buffer view. If the callback returns a false value (such as None), the given buffer is out-of-band; otherwise the buffer is serialized in-band, i.e. inside the pickle stream.
It is an error if buffer_callback is not None and protocol is None or smaller than 5.
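Since AstPickler is documented as a pickle.Pickler subclass with the same constructor arguments, the file and protocol parameters can be sketched with the standard class. The example assumes nothing about AstPickler beyond that shared signature and does not import corrupy.

```python
import io
import pickle

# Any object with a write(bytes) method is accepted as the file argument.
buffer = io.BytesIO()

# A negative protocol selects the highest protocol version supported.
pickler = pickle.Pickler(buffer, protocol=-1)
pickler.dump({"answer": 42})

data = buffer.getvalue()
restored = pickle.loads(data)
```

For protocol 2 and higher the stream starts with the PROTO opcode (0x80) followed by the protocol number, so data[1] shows which protocol was selected.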
AST types¶
- class corrupy.pickleast.PickleBase¶
This is the abstract base class that all pickleast AST types derive from. When AstPickler
encounters an instance of this class during serialization, its functionality will be serialized
into the bytestream.
- corrupy.pickleast.__call__(*args, **kwargs)¶
Shorthand method for creating a Call AST node. PickleBase()(*args) is identical to Call(PickleBase(), *args).
Basic operations¶
These AST nodes all correspond to individual pickle bytecodes:
- class corrupy.pickleast.Wrap(obj)¶
A simple wrapper class which transforms obj into a PickleBase so the magic methods of PickleBase can be used.
- class corrupy.pickleast.Call(callable, *args)¶
This operation represents calling an object on the pickle VM stack.
It takes a callable and a set of positional arguments (possibly none); the callable will be called with these arguments at unpickling time.
- class corrupy.pickleast.SetAttributes(obj, **kwargs)¶
This operation represents calling __setattr__ or __dict__.update on the object.
It will call __dict__.update with the given keyword arguments and return the object.
- class corrupy.pickleast.Imports(module, name, cache=True)¶
This class will return the object name in module module at unpickling time.
- class corrupy.pickleast.Import(obj, cache=True)¶
This wrapper class will return obj at unpickling time.
Requirements: obj is a top level object in a module.
- class corrupy.pickleast.Sequence(*objects, **kwargs)¶
This class represents a series of objects, where only the return value of the last object in the sequence will be returned at unpickling time. If the reversed keyword argument is True, the first object will be returned instead of the last.
- class corrupy.pickleast.SetItem(object, key, value)¶
This class provides the equivalent of object[key] = value. This returns object.
- class corrupy.pickleast.Assign(varname, value)¶
This class stores value in varname. This is implemented by storing the value in the memo. This returns value.
- class corrupy.pickleast.Load(varname)¶
This class loads the value from varname. This is implemented by getting the value from the memo. This returns value.
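The opcodes behind item assignment, attribute updates and the memo can be observed with the standard pickle module alone. The three hand-written protocol 0 pickles below are illustrations of the SETITEM, BUILD and PUT/GET instructions. They are not output of pickleast, and which opcodes the nodes above emit exactly is an assumption.

```python
import pickle

# SETITEM: MARK, DICT -> {}; STRING 'k'; INT 1; SETITEM -> {'k': 1}
setitem_payload = b"(dS'k'\nI1\ns."
mapping = pickle.loads(setitem_payload)

# BUILD: the leading "c" is the GLOBAL opcode, the module is "types".
# GLOBAL types.SimpleNamespace; empty TUPLE; REDUCE -> SimpleNamespace();
# MARK, 'a', 1, DICT -> {'a': 1}; BUILD updates the instance __dict__.
build_payload = b"ctypes\nSimpleNamespace\n(tR(S'a'\nI1\ndb."
namespace = pickle.loads(build_payload)

# PUT / GET: build [1, 2], store it in memo slot 0, then load it again.
# The final TUPLE holds the same list object twice.
memo_payload = b"((I1\nI2\nlp0\ng0\nt."
pair = pickle.loads(memo_payload)

print(mapping, namespace.a, pair[0] is pair[1])
```

The memo acts as the only variable storage of the pickle machine, which is why it is a natural place to keep named values.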
Useful functions¶
These AST nodes are all Import instances of the relevant builtin function:
- corrupy.pickleast.List¶
- corrupy.pickleast.Dict¶
- corrupy.pickleast.Set¶
- corrupy.pickleast.Tuple¶
- corrupy.pickleast.Frozenset¶
- corrupy.pickleast.Str¶
- corrupy.pickleast.Int¶
- corrupy.pickleast.Bool¶
- corrupy.pickleast.Any¶
- corrupy.pickleast.All¶
- corrupy.pickleast.Map¶
- corrupy.pickleast.Zip¶
- corrupy.pickleast.HasAttr¶
- corrupy.pickleast.GetAttr¶
- corrupy.pickleast.SetAttr¶
- corrupy.pickleast.DelAttr¶
- corrupy.pickleast.IsInstance¶
- corrupy.pickleast.IsSubclass¶
- corrupy.pickleast.Iter¶
- corrupy.pickleast.Next¶
- corrupy.pickleast.Range¶
- corrupy.pickleast.Globals¶
- corrupy.pickleast.Locals¶
- corrupy.pickleast.Compile¶
Operation analogues¶
These AST nodes represent basic Python operations that aren’t built into the pickle machinery and are therefore constructed using builtin functions:
- corrupy.pickleast.CallMethod(obj, attr, *args)¶
A convenience function for calling methods.
- corrupy.pickleast.GetItem(obj, attr)¶
The equivalent of obj[attr].
- corrupy.pickleast.DelItem(obj, attr)¶
The equivalent of del obj[attr].
- corrupy.pickleast.Ternary(conditional, true_value, false_value)¶
A simple ternary statement. Due to the limitations of pickling both branches will be executed but it is possible to have a conditional final result.
- corrupy.pickleast.AssignGlobal(varname, value, module=None)¶
Assigns value to varname in the global namespace (to interact with exec and eval blocks). This is implemented as globals()[varname] = value.
This returns the global namespace.
- corrupy.pickleast.LoadGlobal(varname, module=None)¶
Loads varname from the global namespace.
This is implemented as globals()[varname].
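Because pickle bytecode can only call functions, each of the operations above has to be phrased as a call. The plain Python sketch below shows one possible phrasing using getattr() and the operator module. The helper names call_method, ternary and assign_global are hypothetical, and pickleast may build these nodes differently.

```python
import operator


def call_method(obj, attr, *args):
    # obj.attr(*args) as two calls: look up the bound method, then call it.
    return getattr(obj, attr)(*args)


def ternary(conditional, true_value, false_value):
    # Both values are already evaluated by the time we get here;
    # bool() yields 0 or 1, which selects one of them by index.
    return operator.getitem((false_value, true_value), bool(conditional))


def assign_global(namespace, varname, value):
    # namespace[varname] = value expressed as a function call.
    operator.setitem(namespace, varname, value)
    return namespace


data = {"a": 1, "b": 2}
parts = call_method("a,b", "split", ",")     # "a,b".split(",")
first = operator.getitem(data, "a")          # data["a"]
operator.delitem(data, "b")                  # del data["b"]
choice = ternary(first > 0, "yes", "no")
scope = assign_global({}, "x", 5)
```

The ternary helper makes the documented limitation visible: there is no way to skip evaluating the branch that is not selected.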
Code execution¶
These AST nodes allow arbitrary python code to be executed during the unpickling process:
- corrupy.pickleast.Eval(code, globals=<corrupy.pickleast.Call object>, locals=None)¶
This node evaluates code in the global (pickle module) namespace and returns the result.
- corrupy.pickleast.Exec(string, globals=<corrupy.pickleast.Call object>, locals=None, filename='<pickle>')¶
This node executes string in the global namespace (this will usually be the pickle module namespace).
It returns None.
- corrupy.pickleast.ExecTranspile(string, foreign=())¶
This node takes as input a string of Python code, and transpiles this to pickle code using
TransPickler. See the documentation of TransPickler for details.
- corrupy.pickleast.ExecAst(string, globals=<corrupy.pickleast.Call object>, locals=None, filename='<pickle>')¶
Takes a string of Python code and compiles it into an object that, after being serialized with the AstPickler, will execute the Python code when unpickled.
The mechanism used for this is compiling the code to an AST, serializing this AST and then calling eval(compile()) on the AST.
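The mechanism described for ExecAst can be approximated with the standard library: parse the source to an AST, serialize the AST, and compile and evaluate it after deserialization. This is a sketch of the idea only. It uses plain pickle.dumps() rather than AstPickler, and the source string is made up for the example.

```python
import ast
import pickle

source = "result = sum(range(5))"

# Step 1: compile the source text to an AST.
tree = ast.parse(source, filename="<pickle>", mode="exec")

# Step 2: AST nodes are picklable, so the tree survives a round trip.
restored_tree = pickle.loads(pickle.dumps(tree))

# Step 3: compile the AST to a code object and evaluate it.
code = compile(restored_tree, "<pickle>", "exec")
namespace = {}
eval(code, namespace)  # eval() accepts code objects as well as strings

print(namespace["result"])  # 10
```

Shipping an AST instead of a source string means the payload contains no Python source text, only constructor calls for AST node types.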
Shell execution¶
The quickest proof for why you should not unpickle untrusted data.
- corrupy.pickleast.System(string)¶
This will execute string as a shell command.
Module manipulation¶
- corrupy.pickleast.DeclareModule(name, retval=True)¶
Declares a module. This creates an empty module and inserts it in the sys.modules namespace. If retval is True the module will be returned, otherwise sys.modules will be returned.
- corrupy.pickleast.DefineModule(name, code, executor=<function Exec>)¶
This ‘defines’ a module by executing a block of code in the namespace of said module. Returns None.
- corrupy.pickleast.GetModule(name)¶
This imports module name. Note that, if you ever need something contained in a module, it is more efficient to just use the native Import or Imports.
- corrupy.pickleast.Module(name, code, retval=True, executor=<function Exec>)¶
This node creates a module at importing time. It simply takes the name of the module and the code in the module as a string. This is done by first declaring the module, and then defining it. If circular references between modules are problematic, the declaring and defining have to be ordered manually.
It returns the module if retval is set to True, otherwise it returns sys.modules.
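The declare-then-define split can be sketched in plain Python as follows. The functions declare_module and define_module and the module name corrupy_demo_mod are hypothetical stand-ins written for this example. They mirror the documented behaviour but are not the pickleast implementation.

```python
import sys
import types


def declare_module(name):
    # Create an empty module and register it so that "import name" finds it.
    module = types.ModuleType(name)
    sys.modules[name] = module
    return module


def define_module(name, code):
    # Execute the code with the module's namespace as its globals.
    exec(code, sys.modules[name].__dict__)


declare_module("corrupy_demo_mod")
define_module("corrupy_demo_mod", "def double(x):\n    return 2 * x\n")

import corrupy_demo_mod  # resolved from sys.modules, no file is needed

value = corrupy_demo_mod.double(21)
```

Declaring all modules before defining any of them lets the code of each module import the others, which is how circular references can be ordered manually.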
Utilities¶
- corrupy.pickleast.pprint(ast, file=None)¶
Pretty print a Pickle AST to a file or stdout.
This is shorthand for AstPrinter(file).dump(ast).
- class corrupy.pickleast.AstPrinter(out_file=None, indentation=' ')¶
The internal implementation of pprint().
- class corrupy.pickleast.TransPickler(foreign)¶
A somewhat experimental way of directly transpiling a Python code AST to a pickle AST. This is a subclass of ast.NodeVisitor.
Not all Python constructs are supported (no loops, conditionals, etc). Semantics of some operations may differ. External data can be passed in through the foreign argument, which can be accessed in the Python code by referring to variable names _0, _1, etc, where the number represents the index of this value in the foreign list.
- class corrupy.pickleast.PyAstCompiler¶
This is a more efficient way of embedding Python ASTs in pickles.
This ast.NodeTransformer takes a Python AST and returns an object hierarchy that, when pickled using the AstPickler, compresses into a more optimized format because it calls the ast constructors directly.
Use it by calling PyAstCompiler.visit(ast_node).
- corrupy.pickleast.optimize(origpickle, protocol=2)¶
Optimizes a pickle by stripping extraneous memoizing instructions and embedding a zlib compressed pickle inside the pickle.
- corrupy.pickleast.optimize_puts(p)¶
Optimizes a pickle bytecode given in p by assigning the low 256 BINPUT opcodes to the most used GET opcodes.
Should only be used for pickle protocol 1 - 3, as it does not handle the MEMOIZE opcode.
Returns the modified pickle bytecode.
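The two steps described for optimize() have close standard library analogues, sketched below: pickletools.optimize() drops unused PUT opcodes, and a small wrapper pickle calls zlib.decompress() and pickle.loads() on an embedded compressed payload. The DeferredCall helper is a hypothetical name introduced for this example, protocol 4 is chosen for the wrapper so that the bytes payload is stored natively, and none of this is taken from the pickleast source.

```python
import pickle
import pickletools
import zlib


class DeferredCall:
    """Hypothetical helper: unpickles to the result of func(*args)."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def __reduce__(self):
        # The pickler emits "load func, build args, REDUCE" for this object.
        return (self.func, self.args)


records = [{"id": i, "name": "item"} for i in range(500)]
original = pickle.dumps(records, protocol=2)

# Step 1: strip memoizing instructions that are never read back.
stripped = pickletools.optimize(original)

# Step 2: embed the compressed pickle in a wrapper that, when loaded,
# evaluates pickle.loads(zlib.decompress(<compressed bytes>)).
wrapper = pickle.dumps(
    DeferredCall(pickle.loads, DeferredCall(zlib.decompress, zlib.compress(stripped))),
    protocol=4,
)

restored = pickle.loads(wrapper)
```

The wrapper unpickles to the original data even though the reader only calls pickle.loads() once; the decompression happens as a side effect of deserialization.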