OFEP1: Pass variables through the microscope codebase as dictionaries

OFEP 1
Author Julian Stirling
Created 28-Nov-2020
Status Approved, implemented
Approved date 2-Dec-2020
Requires implementation Yes
Implemented date 13-Apr-2021
Updated dates (post-approval) N/A

1. Introduction

This OpenFlexure Enhancement Proposal relates to the customisability, ease of understanding, and code safety of the OpenFlexure codebase.

This is an OFEP because it involves major changes to how the codebase is structured. While this comes with some level of risk, it opens up new possibilities that the old programming structure did not allow. It gives a much clearer picture of which parameters are set and which are derived, and it provides a simple method for overriding a parameter or producing a customised microscope or translation stage without modifying the OpenFlexure Microscope codebase.

While this OFEP touches on issues such as code style, code safety, and namespace pollution, it does not provide a complete solution to them. Style conventions will have a dedicated OFEP at a later date. This OFEP discusses the use of dictionaries, passed as arguments, for parameter management. While some examples of how these dictionaries can be created and modified are provided, the code style for the STL-generating .scad files will be decided at a later date.

2. Current state

Currently almost all geometric properties are defined in a single file: microscope_parameters.scad. This file is included into all other .scad files with the include command. While this does give a central location for all modifiable and derived parameters, it has some disadvantages, described in the subsections below. Many of the solutions proposed in this OFEP rely on features added in OpenSCAD 2019.

2.1. Confusion relating to the subtleties of use/include

This issue depends on the subtleties of OpenSCAD. First of all, OpenSCAD variables are set for each scope at compile time, not at run time. For example, the code snippet:

a = 1;
b = a;
a = 3;
echo(b);

will echo 3 and not 1. This means that overwriting a variable can affect code that appears earlier in the same scope.

OpenSCAD has two ways to access code from an external file: include and use. The include command effectively writes the code from the included file into the current file. The use command, on the other hand, provides access only to the modules and functions defined in that file.

However, consider a file (file.scad) in the repository that includes microscope_parameters.scad and uses another_file.scad, where another_file.scad also includes microscope_parameters.scad. In file.scad the programmer overwrites (intentionally or otherwise) one of the global variables from microscope_parameters.scad. Now when file.scad accesses parameters derived from this value, it will receive a different value from the one another_file.scad receives when it accesses the same parameters.

This can lead to confusing results. If the same parameter is overridden in the build system using -D, a different result will be built. This is particularly confusing when trying to develop for a custom camera or optics configuration.

Of course, the solution to this problem is to only overwrite parameters directly in microscope_parameters.scad, or to override them with -D. While both have some drawbacks, such as remembering to return default parameters to their original state, or being forced to always compile tests from the command line, the biggest problem is that new developers do not understand what is happening. For example, it took me over two years of working on the project to understand this, which explains why almost all of my attempts to modify the codebase failed.

2.2. Danger of -D

Currently our build system injects parameters into the codebase with the command line flag -D. This flag overwrites a given variable in all included and used files. Thus, unless care is taken to ensure that no two files use the same parameter name in the global scope, redefining a parameter with -D can produce confusing results.

2.3. Loss of customisability in practice

While the codebase is in theory totally customisable via the parameters file, in practice this means that any user wishing to build a modified design must modify the codebase itself and effectively fork the project.

3. Proposed refactor

The proposed refactor is inspired by the way that object-oriented languages keep and pass a specific set of attributes for each instance of a class. While implementing objects in OpenSCAD is neither possible nor desirable, we can get some of the benefits by starting all key functions with a dictionary of parameters. (Implementation details of the dictionaries themselves are covered in Section 4.)

With this structure, in the same way that all Python class methods begin with self, microscope-related modules and functions in OpenSCAD can begin with params to give a similar workflow:

function foo(params, other_var) = let(
    var1 = key_lookup("var1", params)
) var1 + other_var;

While this lacks the syntactic sugar of Python, it achieves a similar goal. The parameters dictionary would contain only parameters that are set as constants, and a default parameter set would be available via a function:

params = default_parameters();

All derived parameters in microscope_parameters.scad can then be converted into functions, removing the need to include the file. Note that the dictionary will only include fixed parameters, not the logic for creating derived parameters. Derived parameters will continue to be set by functions in the codebase, and these functions will accept the parameter dictionary as an argument.
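As a minimal sketch of a derived parameter expressed as a function of the dictionary, consider the following. The parameter names (stage_width, wall_t), the derived function outer_width, and the one-line key_lookup stand-in are all assumptions made for illustration; they are not taken from the microscope codebase.

```openscad
// Hypothetical stand-ins so that this sketch runs on its own.
// search([key], dict)[0] is the index of the first pair whose key matches.
function key_lookup(key, dict) = dict[search([key], dict)[0]][1];
function default_parameters() = [["stage_width", 30], ["wall_t", 2]];

// A derived parameter: computed from the fixed parameters on demand,
// rather than stored in the dictionary or in a global variable.
function outer_width(params) =
    key_lookup("stage_width", params) + 2 * key_lookup("wall_t", params);

params = default_parameters();
echo(outer_width(params)); // 30 + 2 * 2 = 34
```

Because outer_width takes params as an argument, two different parameter sets can be evaluated in the same file without either affecting the other.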

Passing dictionaries as arguments makes the dataflow explicit. This comes with certain advantages detailed in the following subsections. We just need to be explicit that the microscope is only tested with the default parameters.

3.1. Data safety

By defining all key parameters in a dictionary and moving all derived parameters to functions, we can stop declaring anything in the global scope of any file except the STL-generating files, which can have explicit switches for the build script to toggle. Moving away from globals provides a dataflow which is much more controlled and "auditable", without the risk of global variables being overwritten.

Using the new assert feature in OpenSCAD, it is possible for functions to verify that parameters are within certain ranges, and to abort execution if they are not. This is possible even without the dictionary dataflow.
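A range check of this kind might look like the sketch below. The function name checked_wall_thickness and the limits 0.5 and 5 are hypothetical and chosen only to illustrate the pattern; it assumes an OpenSCAD 2019 or later release where assert and is_num are available.

```openscad
// Returns t unchanged if it is a number within the (hypothetical) allowed
// range; otherwise the failing assert aborts evaluation with a message.
function checked_wall_thickness(t) =
    assert(is_num(t), "wall thickness must be a number")
    assert(t >= 0.5 && t <= 5, str("wall thickness out of range: ", t))
    t;

echo(checked_wall_thickness(2)); // passes both checks
// checked_wall_thickness(12) would stop the render with the range message.
```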

3.2. Ease of future development

If all modifiable parameters are explicitly passed into functions, then it becomes much simpler to test how a specific module responds to different parameters while developing it. It is possible to test the modules side by side rather than sequentially. For example, the following would be possible:

params = default_parameters();
rms_config = rms_f50d13_config();
rms_inf_config = rms_infinity_f50d13_config();
optics_module_rms(params, rms_config);
translate([0, 40, 0]){
    optics_module_rms(params, rms_inf_config);
}

3.3. Ability of derived projects to build custom hardware without modifying the codebase

While the proposed dictionary dataflow gives the advantages above, one key benefit is that it unlocks the power of the OpenFlexure codebase for developing other custom hardware. Previously only one set of global parameters could ever be used, and these had to be set either from the command line or by modifying the codebase itself. With the proposed changes, however, the microscope can be used as a library by external files to produce complex hardware, reusing OpenFlexure modules multiple times with different parameters.

A simple example of this is provided below, using the codebase to create a custom dual translation stage with stages of different sizes, complete with walls and a base:

This example was produced on a development branch (scad_dictionary) of the microscope codebase. This OFEP has been partially implemented in this branch, both as an example and to test what is feasible.
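The general pattern of reusing one module with two parameter sets can be sketched in a self-contained way as follows. This is not the dual translation stage example referred to above: toy_stage, the parameter names, and the simplified dictionary helpers are hypothetical stand-ins, and in a real derived project the modules and functions would come from the microscope codebase via use rather than being defined locally.

```openscad
// Simplified stand-ins for the dictionary helpers (assumptions).
function key_lookup(key, dict) = dict[search([key], dict)[0]][1];
function replace_value(key, value, dict) =
    [for (pair = dict) pair[0] == key ? [key, value] : pair];
function default_parameters() = [["stage_size", 20], ["stage_height", 5]];

// Hypothetical module: a plain block standing in for a real stage.
module toy_stage(params) {
    size = key_lookup("stage_size", params);
    cube([size, size, key_lookup("stage_height", params)]);
}

// Two parameter sets coexist; the defaults are never modified.
small = default_parameters();
large = replace_value("stage_size", 35, small);

toy_stage(small);
translate([30, 0, 0]) toy_stage(large);
```

The customisation lives entirely in the external file, so nothing in the library needs to be edited or forked.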

4. Specific details of the dictionary structure

The SCAD "dictionary" is implemented as a list of lists, where each internal list is exactly 2 elements long. It enforces that the first element in the sub-lists (the key) is a string. It also enforces that the keys are unique. In the scad_dictionary development branch the library that implements the dictionaries is openscad/libs/libdict.scad. This includes:

  • valid_dict(dict), a validator that returns whether the dictionary is in the correct format.
  • key_lookup(key, dict), which returns the value for a given key.
  • replace_value(key, value, dict), which returns a new dictionary with a single key updated to a new value.
  • replace_multiple_values(rep_dict, dict), which returns a new dictionary with multiple keys updated to new values.

These functions are implemented using the list comprehension and let functionality of OpenSCAD 2019. They are well commented and use a number of other functions to break up the logic. Many of these helper functions begin with an underscore to imply that they are private, for use only by functions in the dictionary library file.

The input variable types are tested with assert statements to ensure that corrupt dictionaries are not created or used.
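To make the structure concrete, here is a minimal sketch of how three of the listed functions could be written. It is not the contents of libdict.scad: the function bodies, the _valid_entry helper, and the example keys (leg_r, sample_z) are assumptions, and only the function names and signatures are taken from the list above.

```openscad
// Private helper (hypothetical): true for a [string, value] pair.
function _valid_entry(e) = is_list(e) && len(e) == 2 && is_string(e[0]);

// A dictionary is valid if every entry is a [string, value] pair and every
// key occurs exactly once. search(keys, dict, 0) returns, for each key,
// the list of all indices at which that key appears.
function valid_dict(dict) =
    !is_list(dict) ? false :
    len(dict) == 0 ? true :
    len([for (e = dict) if (!_valid_entry(e)) e]) > 0 ? false :
    len([for (hits = search([for (e = dict) e[0]], dict, 0))
            if (len(hits) != 1) hits]) == 0;

// Value for a key; aborts if the key is missing or the input is malformed.
function key_lookup(key, dict) =
    assert(is_string(key), "key must be a string")
    assert(valid_dict(dict), "not a valid dictionary")
    let(hit = search([key], dict)[0])
    assert(is_num(hit), str("key not found: ", key))
    dict[hit][1];

// New dictionary with one existing key set to a new value.
function replace_value(key, value, dict) =
    assert(valid_dict(dict), "not a valid dictionary")
    assert(is_num(search([key], dict)[0]), str("key not found: ", key))
    [for (e = dict) e[0] == key ? [key, value] : e];

params = [["leg_r", 30], ["sample_z", 75]];
custom = replace_value("sample_z", 65, params);
echo(key_lookup("sample_z", params)); // 75, the original is unchanged
echo(key_lookup("sample_z", custom)); // 65
```

Wrapping the key in a list, as in search([key], dict), makes search compare the whole string against each entry's first element instead of matching it character by character.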

4.1. Speed implications

A key consideration when developing libdict.scad was the speed implications for the codebase. If the dictionaries slow down the codebase significantly then they are not useful. Unfortunately it is not possible (or at least not practical) to create a dictionary using a true hash table for speed. However, rather than using loops written in OpenSCAD, most functionality is implemented with OpenSCAD's built-in search function, which runs as compiled code.

A test script, dict_speed_test.py, has been included in the repository. This creates three OpenSCAD files with dictionaries of length 10, 100, and 1000 respectively. 20 key lookups are then performed on each dictionary. The total run time of each OpenSCAD program is measured, and the total time and the time per lookup are reported. The overhead of opening and running OpenSCAD is not compensated for, so these are worst-case figures:

Program ran without error for dictionary length 10
For length 10:  20 lookups takes: 0.0287s
That is 0.0014s per lookup
Program ran without error for dictionary length 100
For length 100:  20 lookups takes: 0.0455s
That is 0.0023s per lookup
Program ran without error for dictionary length 1000
For length 1000:  20 lookups takes: 0.5409s
That is 0.0270s per lookup

For the microscope we can safely assume that fewer than 100 parameters will be needed, and that fewer than 1000 lookups will be needed. Taking the measured 0.0023 s per lookup for a dictionary of length 100, 1000 lookups give an upper bound on the performance penalty of 2.3 seconds.

4.2. Stability

Perhaps the largest worry for the proposed changes is the possibility of introducing a bug which causes the codebase to behave unexpectedly. For this reason a series of unit tests has been created to test the dictionary codebase for bugs. Due to the lack of a unit testing framework inside OpenSCAD, the tests were written using the Python framework unittest. This allows the CI to check that bugs are not being introduced into the codebase via libdict.scad.