Image Analyst MKII provides complex image processing tasks in a biologist-friendly manner.

Fluorescence microscopy image analysis
automation - time series - physiology

Pipeline Optimizer

The Pipeline Optimizer Dialog finds optimal values for one or more numeric parameters of a pipeline so that a user-defined goal is achieved. Such a goal can be finding the pipeline settings that produce the largest difference between positive and negative control recordings. The goal is defined by Excel formula computation within a template file. The output of the pipeline is collected into the template file, and the template must calculate a single numeric value (such as a p-value or 1 - z-factor) from this output. This numeric value is minimized using the Differential Evolution algorithm.

  • The input image recording must be loaded into a Multi-Dimensional Open dialog, and must have multiple positions. A statistically reasonable number of positions must be positive and negative controls.
  • The pipeline is executed for all positive and negative control conditions, and this is repeated iteratively while the software tests different parameter settings. For each iteration the Excel template is freshly reloaded and filled with analysis results, and the statistical value computed by the formula saved in the template is retrieved.
  • It is the user's task to lay out the Excel template so that a proper statistic is calculated from the positive and negative controls. The pipeline execution does not differentiate between positive and negative controls; all requested positions are evaluated in order.
  • The Pipeline Optimizer Dialog lets you set which parameters to optimize and in what range. The optimization is performed on discrete values, so besides the range, the number of steps examined between the minimum and maximum values needs to be provided for each optimized parameter.
  • At the end of the optimization, the optimum parameters can be saved as the default parameters for the optimized pipeline.
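
The iteration described above can be sketched in Python. This is only an illustration under stated assumptions, not the implementation used by Image Analyst MKII: the classic DE/rand/1/bin variant is assumed, the population size, cycle count, mutation factor and crossover rate are arbitrary, and `run_pipeline_and_read_cell` is a hypothetical stand-in for "run the pipeline, fill the template, read the optimized cell". A failing evaluation is given a fixed penalty value.

```python
import random

def grid(minimum, maximum, steps):
    """Discrete candidate values: `steps` evenly spaced points, both ends inclusive."""
    if steps < 2:
        return [minimum]
    return [minimum + i * (maximum - minimum) / (steps - 1) for i in range(steps)]

def optimize(objective, grids, population=10, cycles=60, f=0.7, cr=0.9,
             penalty=1e9, seed=0):
    """Minimize objective(values) over grid indices with DE/rand/1/bin (assumed variant)."""
    rng = random.Random(seed)
    dims = len(grids)

    def score(indices):
        values = [g[i] for g, i in zip(grids, indices)]
        try:
            result = float(objective(values))
        except Exception:
            return penalty                      # failed or non-numeric run
        return result if result == result else penalty  # NaN also gets the penalty

    # Each individual is a list of grid indices, one per optimized parameter.
    pop = [[rng.randrange(len(g)) for g in grids] for _ in range(population)]
    costs = [score(p) for p in pop]
    for _ in range(cycles):
        for k in range(population):
            a, b, c = rng.sample([j for j in range(population) if j != k], 3)
            forced = rng.randrange(dims)        # at least one dimension always mutates
            trial = list(pop[k])
            for d in range(dims):
                if d == forced or rng.random() < cr:
                    mutated = round(pop[a][d] + f * (pop[b][d] - pop[c][d]))
                    trial[d] = min(max(mutated, 0), len(grids[d]) - 1)  # stay on the grid
            cost = score(trial)
            if cost <= costs[k]:                # keep the trial only if it is not worse
                pop[k], costs[k] = trial, cost
    best = min(range(population), key=costs.__getitem__)
    return [g[i] for g, i in zip(grids, pop[best])], costs[best]

# Hypothetical stand-in for one iteration; the two parameters are made up.
def run_pipeline_and_read_cell(values):
    threshold, radius = values
    return (threshold - 120.0) ** 2 + (radius - 3.0) ** 2

grids = [grid(0, 200, 21), grid(1, 5, 9)]   # (minimum, maximum, number of steps)
best_values, best_cost = optimize(run_pipeline_and_read_cell, grids)
print(best_values, best_cost)
```

Working on grid indices rather than raw values keeps every tested setting on one of the discrete steps, which is why the number of steps directly controls the size of the search space.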

How to find pipeline parameters that maximize the difference between positive and negative controls?
  • Open a recording that has a statistically reasonable number of positions as positive and negative controls. Opening the file must result in a Multi-Dimensional Open dialog (More help). If the data set is in multiple files, use multi-selection and checkmark 'Merge as Positions' in the Multi-Dimensional Open dialog. See here for more on how to do this.
  • Activate the Pipeline to be optimized. Only pipelines with numerical parameters and with an output to the Excel Data Window can be optimized.
  • Open the Excel Data Window (Tools/Excel Data Window).
  • Process the data set.
    • If the data set consists only of the controls, use Run Pipeline .. On All Stage Positions.
    • If only part of the data set consists of controls, use Run Pipeline on Partial Plate. Provide a list of wells (positions) to be processed. Note down this list, as you will need to enter it in the Pipeline Optimizer Dialog.
  • When processing is finished, in the Excel Data Window:
    • Create a new worksheet and name it 'Calculations'.
    • In the 'Calculations' worksheet, use formulas that reference data in the 'IA Output' worksheet to build all statistical calculations required to generate a single numerical value that approaches zero when the goal is reached. This final result is the value that will be optimized. Note down its cell reference. Examples:
      • To optimize the z-factor, calculate the averages and standard deviations of the positive and negative control values in the 'Calculations' worksheet. Then, in a cell, enter the formula for 1 - z-factor: =3*(sum of standard deviations)/ABS(difference of averages). Note down the reference of the cell containing this formula.
      • To optimize a p-value, create two columns of values in the 'Calculations' worksheet that reference the appropriate data in the 'IA Output' worksheet. Then, in a cell, use the TTEST function to compare the two columns. Note down the reference of the cell containing this formula. In the example below this is B9.

        Excel Example 1
    • Clear the 'IA Output' worksheet and save the workbook using the File/Save Excel Data as. This will be the template file.
  • Open the Pipeline Optimizer Dialog (Tools/Pipeline Optimizer).
    • Configuration of the Optimizer (go to the Optimizer Settings tab and select Settings for Pipeline Optimizer):
      • Set the 'Input image source' and 'Input positions list' according to how the data was analyzed above in the Multi-Dimensional Open dialog. For example, choose 'All positions' and leave the list line empty, or choose 'List of positions' and enter a comma-separated list of positions, with ranges given by dashes.
      • In the 'Template Excel workbook' row locate the file you saved above.
      • In the 'Worksheet of optimized cell' row provide the name of the Worksheet where the calculations can be found. This is 'Calculations' in the example above.
      • In the 'Optimized cell' row provide the cell reference to the calculated value, such as A1.
      • 'Penalty for erroneous run': Depending on the parameter settings, pipelines may fail or provide non-numeric output. The value of 'Penalty for erroneous run' will be used if no calculated numerical value is available or the pipeline fails.
      • 'Pipeline processing timeout (sec)': It is also possible to provide a timeout in seconds if a pipeline hangs at a particular setting. The default 0 value disables this feature.
    • Settings for Differential Evolution Core: these values need no adjustment, but allow the optimizer to be fine tuned.
    • Pipeline Parameters tab:
      • Optimize: checkmark parameters for optimization. Parameters marked with N/A are not numerical and are not available for optimization.
      • For each checkmarked parameter, set a Minimum and Maximum value. The optimum will be sought within this inclusive range. The optimization is performed on discrete values, so the number of steps examined between the minimum and maximum values needs to be provided for each optimized parameter. Be conservative with the number of steps to shorten the run time; typical values are 10-100.
    • Save the Pipeline Optimizer settings before the run ('Save configuration' button of the dialog).
    • Press the 'Optimize' button to start operation. Note: for more complicated pipelines and larger numbers of optimized parameters, expect longer computation, e.g. overnight.
      • After ~60 cycles (2x Population size in Settings/Differential Evolution Core) a partially optimized, local optimum is available in the results pull-down, and this can be saved as the default parameters.
      • During optimization the 'Test' column shows the currently tested parameter values.
      • The 'Optimum' column shows the best parameter set found so far.
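
The '1 - z-factor' value from the 'Calculations' example above can be cross-checked outside the spreadsheet. The following is a minimal Python sketch, assuming the sample standard deviation (as Excel's STDEV computes it) and using made-up control values; the function name is hypothetical and not part of Image Analyst MKII.

```python
from statistics import mean, stdev

def one_minus_z_factor(positive, negative):
    """3 * (SD_pos + SD_neg) / |mean_pos - mean_neg|; approaches 0 as separation improves."""
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; the value is undefined")
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return 3 * (stdev(positive) + stdev(negative)) / separation

# Made-up example data, one value per control position.
positive_controls = [98.0, 102.0, 100.0, 101.0, 99.0]
negative_controls = [10.0, 12.0, 11.0, 9.0, 8.0]

value = one_minus_z_factor(positive_controls, negative_controls)
print(round(value, 4))  # 0.1054, i.e. a z-factor of about 0.89
```

Because the optimizer minimizes the cell value, using 1 - z-factor rather than the z-factor itself makes better separation of the controls correspond to a smaller number.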

How to adjust a pipeline so its operation reflects manual results (e.g. object counts)?
  • To optimize a pipeline (e.g. cell counting) so that it yields values similar to a manual evaluation of the recording, follow the above section with the following differences:
    • No positive and negative controls are used. Technically one image/position is sufficient here, but using more will make the results more robust. Use the 'Clear and run pipeline ... on Stage Position' button of the Multi-Dimensional Open dialog to evaluate positions one by one, so that manual counting can also be performed. Hint: use the crosshair ROI tool as a ticker for the manual count.
    • In the 'Calculations' worksheet create the following calculation using formulas that reference data in the 'IA Output' worksheet:
      • Calculate the sum of squared differences between the pipeline-calculated counts and the manual counts. Note down the reference of the cell containing this sum. In the example below this is C7.

        Excel Example 2
  • Proceed with the steps described in the previous section.
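
The sum of squared differences used as the optimization target in this section can be illustrated with a short Python sketch. The counts are made up and the function name is hypothetical. In Microsoft Excel the same quantity is returned by the SUMXMY2 function; whether the Excel Data Window supports that function is not confirmed here.

```python
def sum_of_squared_differences(pipeline_counts, manual_counts):
    """Sum of (pipeline - manual)^2 over positions; 0 means perfect agreement."""
    if len(pipeline_counts) != len(manual_counts):
        raise ValueError("each position needs both a pipeline count and a manual count")
    return sum((p - m) ** 2 for p, m in zip(pipeline_counts, manual_counts))

# Made-up example: object counts for four positions.
pipeline_counts = [41, 57, 33, 60]
manual_counts = [40, 55, 35, 60]

objective = sum_of_squared_differences(pipeline_counts, manual_counts)
print(objective)  # 1 + 4 + 4 + 0 = 9
```

Squaring the differences prevents over-counting at one position from cancelling under-counting at another, so the value reaches zero only when every position matches the manual count.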