CFiddle Core API

The core CFiddle API allows you to parameterize the compilation and execution of code and then analyze the code and resulting measurements.

Parameterizing Compilation and Execution

Flexible, uniform parameterization is one of the central features of CFiddle and the source of much of its power.

Two functions, cfiddle.arg_map() and cfiddle.arg_product(), form the core of CFiddle’s parameterization facilities. They let CFiddle explore the impact of compile- and run-time parameters by making it very easy to construct complex sets of parameter/argument values.

arg_map() is ideal for generating all possible combinations of values for arguments or for specifying a list of specific configurations. arg_product() is useful when you need a combination of these two behaviors. See the examples below.

cfiddle.arg_map(**parameters)

Take a set of named lists of values and generate their named cross product: a list of dicts containing all combinations of the values.

For example:

>>> from cfiddle import *
>>> from pprint import pprint
>>> pprint(arg_map(foo=[1,2], bar=[3,4], baz=5))
[{'bar': 3, 'baz': 5, 'foo': 1},
 {'bar': 4, 'baz': 5, 'foo': 1},
 {'bar': 3, 'baz': 5, 'foo': 2},
 {'bar': 4, 'baz': 5, 'foo': 2}]

You can also specify an explicit list of argument values by adding together the results of arg_map():

>>> from cfiddle import *
>>> from pprint import pprint
>>> pprint(arg_map(foo=1, bar=3, baz=5) +
...        arg_map(foo=4, bar=5, baz=6))
[{'bar': 3, 'baz': 5, 'foo': 1}, {'bar': 5, 'baz': 6, 'foo': 4}]
Parameters:

**kwargs – key-value pairs. Scalar values are treated as lists of length 1.

Returns:

See example above.

Return type:

list of dict
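
The cross-product behavior described above can be reproduced in a few lines of plain Python. This is a minimal sketch, not CFiddle's implementation: arg_map_sketch is a hypothetical name, and the only rule it assumes is the documented one that scalars are treated as lists of length 1.

```python
from itertools import product

def arg_map_sketch(**parameters):
    """Hypothetical stand-in for arg_map(); not CFiddle's own code."""
    names = list(parameters)
    # Treat scalars as one-element lists, as the documentation describes.
    value_lists = [v if isinstance(v, list) else [v] for v in parameters.values()]
    # One dict per combination; the first keyword varies slowest.
    return [dict(zip(names, combo)) for combo in product(*value_lists)]

# Full grid over foo and bar, with baz held constant.
grid = arg_map_sketch(foo=[1, 2], bar=[3, 4], baz=5)

# Two hand-picked configurations, concatenated with +.
explicit = arg_map_sketch(foo=1, bar=3, baz=5) + arg_map_sketch(foo=4, bar=5, baz=6)
```

Because the result is an ordinary list of dict, the usual list operations (concatenation, slicing, filtering with a comprehension) can be used to trim or extend a parameter set.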

cfiddle.arg_product(*args)

Generate and merge the cross product of a set of dicts. The arguments must be lists of dict.

In the common use case, the arguments are the result of calls to arg_map().

For example:

>>> from cfiddle import *
>>> from pprint import pprint
>>> pprint(arg_map(a=[1,2]))
[{'a': 1}, {'a': 2}]
>>> pprint(arg_map(b=[3,4]))
[{'b': 3}, {'b': 4}]
>>> pprint(arg_product(arg_map(a=[1,2]), arg_map(b=[3,4])))
[{'a': 1, 'b': 3}, {'a': 1, 'b': 4}, {'a': 2, 'b': 3}, {'a': 2, 'b': 4}]

You can use arg_product() and arg_map() to compose complex combinations of parameters.

For instance, let’s imagine that we have a C++ function,

matexp(int m, int tile_size, int thread_count)

that measures the performance of raising an m x m matrix to the p-th power using a given memory tile_size and thread_count.

We’d like to choose a few representative values of m and p and run them for all combinations of thread_count and tile_size.

We can generate the set of function arguments like so:

>>> from cfiddle import *
>>> from pprint import pprint
>>> p = arg_product(arg_map(size=600, power=2) +
...                 arg_map(size=320, power=40) +
...                 arg_map(size=120, power=240),
...                 arg_map(thread_count=[1,2],
...                         tile_size=[4,8,16]))
>>> pprint(p)  
[{'power': 2, 'size': 600, 'thread_count': 1, 'tile_size': 4},
 {'power': 2, 'size': 600, 'thread_count': 1, 'tile_size': 8},
 {'power': 2, 'size': 600, 'thread_count': 1, 'tile_size': 16},
 {'power': 2, 'size': 600, 'thread_count': 2, 'tile_size': 4},
 {'power': 2, 'size': 600, 'thread_count': 2, 'tile_size': 8},
 {'power': 2, 'size': 600, 'thread_count': 2, 'tile_size': 16},
 {'power': 40, 'size': 320, 'thread_count': 1, 'tile_size': 4},
 {'power': 40, 'size': 320, 'thread_count': 1, 'tile_size': 8},
 {'power': 40, 'size': 320, 'thread_count': 1, 'tile_size': 16},
 {'power': 40, 'size': 320, 'thread_count': 2, 'tile_size': 4},
 {'power': 40, 'size': 320, 'thread_count': 2, 'tile_size': 8},
 {'power': 40, 'size': 320, 'thread_count': 2, 'tile_size': 16},
 {'power': 240, 'size': 120, 'thread_count': 1, 'tile_size': 4},
 {'power': 240, 'size': 120, 'thread_count': 1, 'tile_size': 8},
 {'power': 240, 'size': 120, 'thread_count': 1, 'tile_size': 16},
 {'power': 240, 'size': 120, 'thread_count': 2, 'tile_size': 4},
 {'power': 240, 'size': 120, 'thread_count': 2, 'tile_size': 8},
 {'power': 240, 'size': 120, 'thread_count': 2, 'tile_size': 16}]
Parameters:

*args – A list of dict.

Returns:

list of dict
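
The merge step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not CFiddle's implementation: arg_product_sketch is a hypothetical name, and the behavior when two input dicts share a key (the later list wins here) is an assumption the documentation above does not specify.

```python
from itertools import product

def arg_product_sketch(*args):
    """Hypothetical stand-in for arg_product(); not CFiddle's own code."""
    merged = []
    for combo in product(*args):      # one dict drawn from each input list
        row = {}
        for d in combo:
            row.update(d)             # assumed: later lists win on key clashes
        merged.append(row)
    return merged

# Three hand-picked (size, power) configurations ...
configs = [{"size": 600, "power": 2},
           {"size": 320, "power": 40},
           {"size": 120, "power": 240}]
# ... crossed with a full sweep over thread_count and tile_size.
sweep = [{"thread_count": t, "tile_size": s} for t in (1, 2) for s in (4, 8, 16)]

p = arg_product_sketch(configs, sweep)
print(len(p))  # 3 configurations x 6 sweep points
```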

Creating Code

CFiddle can compile existing source files, or you can create an anonymous source file with code(), which returns the path to a file containing your code.

cfiddle.code(source, file_name=None, language=None, raw=False)

Generate an anonymous (by default) source file and return the path to it.

Write source to an anonymous file and return the file’s name. This function is meant to be used as the first argument of build().

You can choose the location of the file by specifying file_name. If the contents of the file have changed since code() last wrote it, code() will raise SourceCodeModified to prevent overwriting your edits.

Use language to specify the language you are writing in.

For some languages, code() adds some boilerplate to make compilation work. You can prevent this with raw=True.

Parameters:
  • source – The source code. Raw strings work best (e.g., r""" // my code """).

  • file_name – Where to put the source code. This file will be overwritten.

  • language – Suffix to use for the filename. Defaults to cpp.

  • raw – Don’t add language-specific boilerplate. (Default: False)

Returns:

The file name.

Return type:

str
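
The idea of an "anonymous" source file can be illustrated with the standard library alone. The helper below is a hypothetical sketch, not how code() is implemented: it writes the text to a uniquely named temporary file and returns the path. It does not add boilerplate or detect modified files.

```python
import os
import tempfile

def write_anonymous_source(source, language="cpp"):
    """Hypothetical helper: write `source` to a fresh file and return its path."""
    # mkstemp picks a unique name; the suffix mirrors the `language` parameter.
    fd, path = tempfile.mkstemp(suffix="." + language, text=True)
    with os.fdopen(fd, "w") as f:
        f.write(source)
    return path

# A raw string keeps backslashes in the C++ source intact.
path = write_anonymous_source(r"""
extern "C" int answer() { return 42; }
""")
```

Returning a path rather than a file object is what lets the result be passed straight to a function that expects a file name.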

Compiling Code

CFiddle compiles code with cfiddle.build(). It takes source code and a set of build parameters and generates a list of cfiddle.source.InstrumentedExecutable objects that represent the compiled code and allow you to inspect the code and the results of its compilation (e.g., the assembly).

cfiddle.build(source, build_parameters=None, **kwargs)

Compile one or more source files in one or more ways.

source can be a single file name or a list of file names. build compiles each file into an Executable. A call to cfiddle.code() is often passed as source.

build_parameters can set parameters for the build process (e.g., optimization levels, the target architecture, or the compiler to use). It can be a dict or list of dict that provide values for build parameters. If build_parameters is None, defaults will be used.

Typically, the build_parameters value is generated with cfiddle.util.arg_map().

build compiles each source file using each set of build parameters and returns a list of the resulting InstrumentedExecutable objects.

The InstrumentedExecutable objects can be studied themselves or passed to run().

Parameters:
  • source – One source file, or a list of source files, to compile.

  • build_parameters – One dict, or a list of dict, listing build parameters. Defaults to None.

  • **kwargs – Further options to the Builder object that performs compilation.

Returns:

One executable for each combination of source and build_parameters.

Return type:

list of Executable
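
The "each source file with each set of build parameters" rule can be pictured with a short sketch. plan_builds is a hypothetical helper that only enumerates the pairs; it compiles nothing. The file names and the OPTIMIZE key are illustrative, and the ordering of the pairs is an assumption.

```python
from itertools import product

def plan_builds(source, build_parameters=None):
    """Hypothetical helper: list the (source, parameters) pairs to compile."""
    sources = source if isinstance(source, list) else [source]
    if build_parameters is None:
        params = [{}]                      # stand-in for "use the defaults"
    elif isinstance(build_parameters, dict):
        params = [build_parameters]        # a single dict is one configuration
    else:
        params = list(build_parameters)
    # Assumed ordering: sources vary slowest.
    return list(product(sources, params))

plan = plan_builds(["a.cpp", "b.cpp"],
                   [{"OPTIMIZE": "-O0"}, {"OPTIMIZE": "-O3"}])
print(len(plan))  # 2 sources x 2 parameter sets
```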

Inspecting Compiled Code

cfiddle.source.InstrumentedExecutable provides several ways to inspect your compiled code.

class cfiddle.source.InstrumentedExecutable(*argc, **kwargs)

A compiled source file.

Builder objects create these when they compile code. They can be passed to cfiddle.run() for execution.

The compiled code is a dynamic library (i.e., a .so file). The path to the library is in lib.

DWARFInfo()

Context manager for the raw DWARFInfo object for the compiled code.

Returns:

DWARFInfo object created by pyelftools.

Return type:

DWARFInfo

ELFFile()

Context manager for the raw ELFFile object for the compiled code.

Returns:

ELFFile object created by pyelftools.

Return type:

ELFFile

asm(show=None, demangle=True, filter=None, **kwargs)

Return the compiled assembly for a function.

The output is from the assembly output of the compiler (e.g., the result of g++ -S), not the compiled object code.

This function uses regular expression-based heuristics to find the function rather than actually parsing the code, so it can produce unexpected output.

Parameters:
  • show – What to show. Either a function name or a 2-tuple: either (start_regex,end_regex) or (start_line_number,end_line_number). Defaults to None which shows the whole file.

  • demangle – Pass the assembly through c++filt first, so C++ symbols are more readable. Defaults to True.

Returns:

The assembly.

Return type:

str

cfg(function, output, **kwargs)

Return an image of the control flow graph for a function.

This extracts the CFG from the compiled object file using the Radare2 toolkit. Radare2 will sometimes fail to create a coherent CFG.

Parameters:
  • function – function to show.

  • output – Filename in which to put the resulting PNG file, or None, in which case an anonymous file will be created.

Returns:

The name of the file containing the image.

Return type:

str

debug_info(show=None, **kwargs)

Print a summary of the debugging info for the compiled code.

This is the data that debuggers use to make debugging a program comprehensible. It includes variable and function names, types, file names, line numbers, etc.

Currently only DWARF4 is supported, which is the standard on Linux systems.

In order for debugging information to be present, the code must be compiled with -g.

Parameters:

show – What to show – a function name. Defaults to None which will display all the debugging info.

Returns:

String rendering the DWARF data for the file or function. This can be very long.

Return type:

str

get_build_parameters()

Return a map containing the build parameters that cfiddle.build() used when building this executable.

preprocessed(show=None, language=None, filter=None, **kwargs)

Return the preprocessed source code for a function.

This function uses regular expression-based heuristics to find the function rather than actually parsing the code, so it can produce unexpected output.

The heuristics assume that the function prototype is on a single line and that the function ends with a } on a line by itself.

Parameters:
  • show – What to show. Either a function name or a 2-tuple: either (start_regex,end_regex) or (start_line_number,end_line_number). Defaults to None which shows the whole file.

  • language – What language to assume.

Returns:

The preprocessed source code.

Return type:

str

source(show=None, language=None, filter=None, **kwargs)

Return the source code for a function.

This function uses regular expression-based heuristics to find the function rather than actually parsing the code, so it can produce unexpected output.

The heuristics assume that the function prototype is on a single line and that the function ends with a } on a line by itself.

Parameters:
  • show – What to show. Either a function name or a 2-tuple: either (start_regex,end_regex) or (start_line_number,end_line_number). Defaults to None which shows the whole file.

  • language – What language to assume. Defaults to c++.

Returns:

The source code.

Return type:

str
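
The three forms of the show argument, and the single-line-prototype heuristic, can be illustrated with a small sketch. The extract function below is hypothetical and is not CFiddle's code. It assumes line numbers are 1-based and inclusive, and that a function ends at the first line consisting of a lone }.

```python
import re

def extract(text, show=None):
    """Hypothetical illustration of the `show` argument; not CFiddle's code."""
    lines = text.split("\n")
    if show is None:
        return text                          # whole file
    if isinstance(show, str):
        # Function name: prototype on one line, body closed by a lone "}".
        start, end = r"\b" + re.escape(show) + r"\s*\(", r"^\}\s*$"
    else:
        start, end = show                    # 2-tuple of regexes or line numbers
    if isinstance(start, int):
        return "\n".join(lines[start - 1:end])   # assumed 1-based, inclusive
    first = next(i for i, line in enumerate(lines) if re.search(start, line))
    last = next(i for i in range(first, len(lines)) if re.search(end, lines[i]))
    return "\n".join(lines[first:last + 1])

SRC = """int bar(int x) {
  return x;
}

int foo(int a) {
  if (a) {
    a++;
  }
  return a;
}
"""
print(extract(SRC, "foo"))
```

The indented } that closes the if block is skipped because only a } at the start of a line counts as the end of the function. This is also why code formatted differently can confuse a heuristic of this kind.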

stack_frame(show, **kwargs)

Print the stack frame layout for a function.

This returns a description of where each variable and argument resides on the stack or in registers.

For instance:

>>> from cfiddle import *
>>> sample = code(r'''
... extern "C"
... int foo(int a) {
...    register int sum = 0;
...    for(int i = 0; i < 10; i++) {
...       sum += i;
...    }
...    return sum;
... }
... ''')
>>> stack_frame = build(sample)[0].stack_frame("foo")
>>> print(stack_frame) 
function foo
    a: (DW_OP_fbreg: -44)
    sum: (DW_OP_reg3 (rbx))
    i: (DW_OP_fbreg: -28)

The format is potentially complicated (the DWARF format is Turing complete!), but most entries are easy to understand.

The example above shows that a is stored at -44 bytes relative to the frame base register and sum is in a register.

This is a work in progress. See the DWARF4 spec and the source code for pyelftools, which is reasonably well documented.

Pull requests welcome :-).

Parameters:

show – Function to extract the frame layout from.

Returns:

A description of the layout

Return type:

str
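
Since the result is a plain string, simple entries can be picked apart with a regular expression. The sketch below parses the example output shown above. parse_frame is a hypothetical helper, and real output may contain location expressions that this pattern does not handle.

```python
import re

# Text copied from the example output above.
FRAME = """function foo
    a: (DW_OP_fbreg: -44)
    sum: (DW_OP_reg3 (rbx))
    i: (DW_OP_fbreg: -28)
"""

# name: (DW_OP_xxx[: offset] ...
ENTRY = re.compile(r"^\s+(\w+): \((DW_OP_\w+)(?:: (-?\d+))?")

def parse_frame(text):
    """Map each variable to (DWARF op, frame-base offset or None)."""
    layout = {}
    for line in text.splitlines():
        m = ENTRY.match(line)
        if m:
            name, op, offset = m.groups()
            layout[name] = (op, int(offset) if offset is not None else None)
    return layout

layout = parse_frame(FRAME)
# Variables addressed relative to the frame base, i.e. living on the stack.
on_stack = sorted(n for n, (op, _) in layout.items() if op == "DW_OP_fbreg")
```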

Executing Code

cfiddle.run() can invoke functions in a cfiddle.Executable and collect data about their execution. CFiddle provides easy access to performance counters.

cfiddle.run(executable, function, arguments=None, perf_counters=None, run_options=None, **kwargs)

Run one or more functions with one or more sets of arguments and collect one or more measurements.

CFiddle parameterizes execution in five ways, corresponding to run()’s five arguments:

  1. An InstrumentedExecutable to run as returned by cfiddle.build().

  2. The function to call.

  3. The arguments to pass to the function.

  4. The performance counters to track.

  5. And the other aspects of the execution environment (e.g., environment variables and clock speed).

run() can take multiple values for each of these and will run all combinations.
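
The "all combinations" rule means the number of invocations is the product of the lengths of the five inputs. The sketch below only counts and enumerates combinations using placeholder values; it runs nothing, and the executable labels and counter name are hypothetical.

```python
from itertools import product

def as_list(value):
    """Wrap a single value so that every parameter is a list."""
    return value if isinstance(value, list) else [value]

# Placeholder values; a real call would pass executables returned by build().
executables = ["exe_O0", "exe_O3"]
functions = "matexp"                       # a single name is also allowed
arguments = [{"size": 120, "power": 240}, {"size": 600, "power": 2}]
perf_counters = [["CYCLES"]]               # hypothetical counter name
run_options = [{}]

invocations = list(product(as_list(executables), as_list(functions),
                           arguments, perf_counters, run_options))
print(len(invocations))  # 2 * 1 * 2 * 1 * 1 = 4
```

Keeping an eye on this product is worthwhile, since adding one more value to any of the five inputs multiplies the total running time.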

function can be a str corresponding to a function that is present in each executable with the same signature. To run multiple functions, pass a list of strings.

The contents of arguments must match the signature of the function invoked. While you can pass a dict (for a single invocation) or list of dict (to invoke the function multiple times), it’s easiest to just always invoke cfiddle.arg_map() to generate this argument.

perf_counters should be a list of performance counter names or a list of such lists. If it is a list of lists, it will result in multiple invocations using different sets of counters.
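
The two accepted shapes can be normalized into one, as in the sketch below. counter_sets is a hypothetical helper and the counter names are placeholders. Treating an empty or missing value as a single run with no counters is an assumption.

```python
def counter_sets(perf_counters):
    """Hypothetical helper: return a list of counter-name lists."""
    if not perf_counters:
        return [[]]                        # assumed: one run with no counters
    if all(isinstance(c, str) for c in perf_counters):
        return [list(perf_counters)]       # a flat list is a single set
    return [list(s) for s in perf_counters]

# One invocation that measures both counters together.
together = counter_sets(["CYCLES", "INSTRUCTIONS"])
# Two invocations, each measuring one counter.
separate = counter_sets([["CYCLES"], ["INSTRUCTIONS"]])
```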

You can set default values for perf_counters by setting the perf_counters_default configuration value. Passing a value to run() completely overrides the default.

By default, run_options is interpreted by cfiddle.Runner.RunOptionInterpreter. The default implementation copies the contents of run_options to environment variables before execution.

You can set default values for run_options by setting the run_options_default configuration value to the result of a call to cfiddle.arg_map(). The values you pass to run() will override the defaults. If the defaults include multiple sets of values, run() will run all combinations of the defaults and the values supplied.
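
One way to picture the interaction between defaults and passed values is the sketch below. merge_run_options is a hypothetical helper, not CFiddle's code, and the option names are placeholders. It assumes that a passed value replaces a default with the same key.

```python
def merge_run_options(defaults, passed):
    """Hypothetical helper: combine every default set with every passed set."""
    # Later keys win in a dict merge, so passed values override defaults.
    return [{**d, **p} for d in defaults for p in passed]

defaults = [{"OMP_NUM_THREADS": "1"}, {"OMP_NUM_THREADS": "2"}]

# Disjoint keys: 2 default sets x 2 passed sets give 4 runs.
merged = merge_run_options(defaults, [{"TRACE": "on"}, {"TRACE": "off"}])

# Same key: the passed value replaces the default in every combination.
overridden = merge_run_options(defaults, [{"OMP_NUM_THREADS": "8"}])
```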

run() returns a cfiddle.InvocationResultsList, which is a subclass of list that can format results in useful ways (e.g., as a Pandas DataFrame or CSV file).

The elements of the list are cfiddle.InvocationResult objects, each of which contains the build parameters for the executable used and all the parameters listed above.

Parameters:
  • executable – An Executable or list of Executable objects.

  • function – A str or list of str naming functions to call.

  • arguments – A dict of arguments for the function, or a list of such dicts. Defaults to [{}].

  • perf_counters – A list of performance counters to collect. Defaults to None.

  • run_options – Parameters controlling how the function is run. Defaults to None.

Returns:

A list of InvocationResult objects.

Return type:

InvocationResultsList

Analyzing Results

The results end up in a special list type (cfiddle.InvocationResultsList) that can summarize the results in several useful formats.

class cfiddle.Data.InvocationResultsList(iterable=(), /)

Collect and summarize execution results.

A list of results from multiple executions (e.g., returned by run()).

It includes:

  1. Build parameters.

  2. The function name.

  3. The function arguments.

  4. The run options.

  5. Measurements and outputs of the functions

You can export these data in multiple formats using the methods below.

as_csv(csv_file)

Write results to a CSV file.

Parameters:

csv_file – filename to write results to.

Returns:

None

as_df()

Return results as a Pandas DataFrame.

Values that appear numeric are converted to numbers.

Returns:

A copy of the data as a DataFrame.

Return type:

DataFrame

as_dicts()

Return results as a list of dict.

Returns:

A copy of the data as a JSON-like Python object.

Return type:

list of dict

as_json()

Return results as a JSON string.

Returns:

A JSON representation of the data.

Return type:

str
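
To show what these exports amount to, the sketch below converts a list of dict of the kind as_dicts() is described as returning into CSV and JSON text using only the standard library. The rows and column names are made up for illustration, and the helpers are hypothetical rather than part of CFiddle.

```python
import csv
import io
import json

# Made-up rows: one dict per invocation, one key per column.
rows = [
    {"function": "matexp", "size": 120, "power": 240, "seconds": 0.5},
    {"function": "matexp", "size": 600, "power": 2, "seconds": 1.25},
]

def to_csv_text(rows):
    """Render rows as CSV, taking the columns from the first row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

def to_json_text(rows):
    """Render rows as a JSON array of objects."""
    return json.dumps(rows)

csv_text = to_csv_text(rows)
json_text = to_json_text(rows)
```

CSV flattens every value to text, which is why a step that converts numeric-looking values back to numbers is useful when loading results into a data frame.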