Intensity scaling#

The intensity scaling, integration and merging for CBD data are performed in an iterative manner to achieve convergence to the optimal structure factor estimates and diffraction power of the crystal in one process. The intensity profile of a particular Bragg reflection captured on the detector can be broken down to a product of three terms:

\[I_{hkl}(\mathbf{x}) = |q_{hkl}|^2 \chi(\mathbf{u}(\mathbf{x})) f^2_{hkl}(\mathbf{x}),\]

where \(\chi(\mathbf{k}_{in})\) is the projection map of the sample, \(q_{hkl}\) is the structure factor, \(f_{hkl}\) is the standard profile, \(\mathbf{u}(\mathbf{x}) = \mathbf{k}_{out}(\mathbf{x}) - \mathbf{g}_{hkl}\) is the geometrical mapping from the detector plane to the aperture function.

To accurately handle the intensity measurements in the overlapping streak regions, we assume that the diffracted signal from different diffraction orders is summed up incoherently. Thus, the modelled diffraction pattern is given by:

\[\hat{I}_n(\mathbf{x}_i) = I_{bgd}(\mathbf{x}_i) + \sum_{hkl} |q_{hkl}|^2 \chi_n(\mathbf{u}(\mathbf{x}_i)) f^2_{hkl}(\mathbf{x}_i),\]

where and \(I_{bgd}(\mathbf{x})\) is the white-field and \(\chi_n(\mathbf{k}_{in})\) is the projection mapping of the crystal at the n-th frame.

IntensityScaler recovers the estimates of projection maps \(\chi\) and \(q_{hkl}\) iteratively with train by minimising one of two criteria (see fitness):

  1. The former one is the Poisson negative log-likelihood:

    \[\varepsilon^{NLL} = \sum_{ni} \varepsilon_n^{NLL}(\mathbf{x}_i) = \sum_{ni} \log \mathrm{P}(I_n(\mathbf{x}_i), \hat{I}_n(\mathbf{x}_i)),\]

    where the likelihood \(\mathrm{P}\) follows the Poisson distribution \(\log \mathrm{P}(I, \lambda) = I \log \lambda - I\).

  2. The latter one is the least-squares criterion:

    \[\varepsilon^{LS} = \sum_{ni} \varepsilon_n^{LS}(\mathbf{x}_i) = \sum_{ni} f\left( \frac{I_n(\mathbf{x}_i) - \hat{I}_n(\mathbf{x}_i)}{\sigma_I} \right),\]

    where \(f(x)\) is either l2, l1, or Huber loss function, and \(\sigma_I\) is the standard deviation of measured photon counts for a given diffraction streak.

Note

IntensityScaler incorporates intensity merging into iterative update with merge_hkl method that takes a symmetry type of the crystal lattice. Given the supplied symmetry, the structure factors for the symmetric reflections are assumed to be identical so that the optimisation in train is carried out for the whole subset of symmetric reflections in one fell swoop.

At each iteration, the update for the structure factors is performed by minimising one of the criteria above, where \(\chi_n\) is held constant. At the next step, the projection mapping are calculated based on the new estimates of crystal factors as follows:

\[\hat{\chi}_n(\mathbf{k}_{in}) = \frac{\sum_i \hat{I}^{hkl}_n(\mathbf{x}_i) f^2_{hkl}(\mathbf{x}_i) K((\mathbf{k}_{in} - \mathbf{u}(\mathbf{x}_i)) / h)} {\sum_i |q_{hkl}|^2 f^4_{hkl}(\mathbf{x}_i) K((\mathbf{k}_{in} - \mathbf{u}(\mathbf{x}_i)) / h)},\]

where \(K(\mathbf{x})\) is the kernel function, and \(h\) is the kernel bandwidth, and \(\hat{I}^{hkl}_n\) is the estimate of the intensity measurements pertaining to a certain Bragg reflection:

\[\hat{I}^{hkl}_n(\mathbf{x}_i) = \frac{(I_n(\mathbf{x}_i) - I_{bgd}(\mathbf{x}_i)) I_{hkl}(\mathbf{x}_i)}{\sum_{hkl} I_{hkl}(\mathbf{x}_i)}.\]

Note

\(\hat{I}^{hkl}_n(\mathbf{x}_i)\) differs from the measured intensities only in the regions where the diffraction streaks overlap.

IntensityScaler#

class cbclib.IntensityScaler(parent, frames, fidxs, xmap, xtal, num_threads, bgd=None, I_raw=None, idxs=None, iidxs=None, ij=None, hkl=None, hkl_idxs=None, prof=None, xtal_bi=None)[source]#

Iterative CBC intensity scaling algorithm. Provides an interface to iteratively update crystal projection maps (xtal) and crystal structure factors (sfac) by fitting the modelled intensity profile of Bragg reflections to the experimentally measured intensities.

Parameters
  • bgd (Optional[ndarray]) – Background levels.

  • fidxs (ndarray) – Array of CBC table first row indices pertaining to different CBC patterns.

  • frames (ndarray) – Array of unique frame indices stored in the table.

  • hkl (Optional[ndarray]) – Array of unique Miller indices stored inside the CBC table.

  • hkl_idxs (Optional[ndarray]) – Array of Bragg reflection indices.

  • I_raw (Optional[ndarray]) – Experimentally measured intensities.

  • idxs (Optional[ndarray]) – Array of streak indices.

  • num_threads (int) – Number of threads used in the calculations.

  • parent (ReferenceType[CBCTable]) – Reference to the parent CBC table.

  • prof (Optional[ndarray]) – Standard reflection profiles.

  • xmap (ndarray) – A set of aperture function coordinates, that correspond to the detector coordinates of experimentally measured intensities.

  • xtal (Map3D) – Crystal projection maps.

  • xtal_bi (Optional[ndarray]) – Array of interpolated crystal diffraction power values at xmap coordinates.

contents()#

Return a list of the attributes stored in the container that are initialised.

Return type

List[str]

Returns

List of the attributes stored in the container.

export_sfac(x, path, symmetry=None)[source]#

Export structure factors to a text hkl file.

Parameters
  • x (ndarray) – Current estimate of crystal structure factors and intercept values.

  • path (str) – Path to the output file.

  • symmetry (Optional[str]) –

    Symmetry of the crystal. Can be one of the following:

    • 4mm : (h, k, l) = (-h, -k, -l) = (k, h, l).

    • mmm : (h, k, l) = (-h, -k, -l).

    • None : No symmetry.

fitness(x, fit_intercept=True, kind='poisson', loss='l2')[source]#

Return the fitness criterion and the jacobian matrix. The structure factors and projection maps are used to model the intensity profiles of Bragg reflections. Either negative log likelihood or least squares error is calculated to estimate how well the modelled intensity profiles fit to the experimental intensities.

Parameters
  • x (ndarray) – Current estimate of crystal structure factors and intercept values.

  • fit_intercept (bool) – Use intercept if True.

  • kind (str) – Choose between the negative log likelihood Poisson criterion (‘poisson’) and the least squares criterion (‘least_squares’)

  • loss (str) –

    Loss function to use in the least squares criterion. The following keyword arguments are allowed:

    • l1: L1 loss (absolute) function.

    • l2 : L2 loss (squared) function.

    • Huber : Huber loss function.

Notes

The modelled diffraction pattern \(\hat{I}_n(\mathbf{x}_i)\) is given by:

\[\hat{I}_n(\mathbf{x}_i) = I_{bgd}(\mathbf{x}_i) + \sum_{hkl} |q_{hkl}|^2 \chi_n(\mathbf{u}(\mathbf{x}_i)) f^2_{hkl}(\mathbf{x}_i),\]

where \(q_{hkl}\) are the structure factors and \(\chi(\mathbf{u}(\mathbf{x}))\) are the projection maps of the sample, and \(f_{hkl}(\mathbf{x})\) are the standard reflection profiles.

The Poisson negative log-likelihood criterion is given by:

\[\varepsilon^{NLL} = \sum_{ni} \varepsilon_n^{NLL}(\mathbf{x}_i) = \sum_{ni} \log \mathrm{P}(I_n(\mathbf{x}_i), \hat{I}_n(\mathbf{x}_i)),\]

where the likelihood \(\mathrm{P}\) follows the Poisson distribution \(\log \mathrm{P}(I, \lambda) = I \log \lambda - I\).

The least-squares criterion is given by:

\[\varepsilon^{LS} = \sum_{ni} \varepsilon_n^{LS}(\mathbf{x}_i) = \sum_{ni} f\left( \frac{I_n(\mathbf{x}_i) - \hat{I}_n(\mathbf{x}_i)}{\sigma_I} \right),\]

where \(f(x)\) is either l2, l1, or Huber loss function, and \(\sigma_I\) is the standard deviation of measured photon counts for a given diffraction streak.

Raises

ValueError – If the loss argument is invalid.

Return type

Tuple[float, ndarray]

Returns

The criterion and the jacobian matrix.

gain(x, kind='poisson', loss='l2')[source]#

Return the fitness gain for every diffraction order.

Parameters
  • x (ndarray) – Current estimate of crystal structure factors and intercept values.

  • kind (str) – Choose between the negative log likelihood Poisson criterion (‘poisson’) and the least squares criterion (‘least_squares’)

  • loss (str) –

    Loss function to use in the least squares criterion. The following keyword arguments are allowed:

    • l1: L1 loss (absolute) function.

    • l2 : L2 loss (squared) function.

    • Huber : Huber loss function.

Return type

ndarray

Returns

Array of fitness gain values for each diffraction order.

get(attr, value=None)#

Retrieve a dataset, return value if the attribute is not found.

Parameters
  • attr (str) – Data attribute.

  • value (Optional[Any]) – Data which is returned if the attribute is not found.

Return type

Any

Returns

Attribute’s data stored in the container, value if attr is not found.

init_estimate()[source]#

Return initial estimate of crystal structure factors and intercept values.

Return type

ndarray

Returns

Array of initial crystal structure factors and intercepts.

items()#

Return (key, value) pairs of the datasets stored in the container.

Return type

ItemsView

Returns

(key, value) pairs of the datasets stored in the container.

keys()#

Return a list of the attributes available in the container.

Return type

List[str]

Returns

List of the attributes available in the container.

merge_hkl(symmetry=None)[source]#

Merge symmetrical reflection during the iterative update. Given the supplied symmetry, the structure factors for the symmetric reflection are assumed to be identical during the scaling procedure.

Parameters

symmetry (Optional[str]) –

Crystallographic point group symmetry. The following keyword arguments are allowed:

  • mmm: Centrosymmetric othorhombic point group.

  • 4mm : Tetragonal point group.

  • None : No symmetry.

Return type

IntensityScaler

Returns

A new IntensityScaler object with the updated symmetry.

merge_sfac(x, symmetry=None)[source]#

Return a merged list of structure factors.

Parameters
  • x (ndarray) – Current estimate of crystal structure factors and intercept values.

  • symmetry (Optional[str]) –

    Symmetry of the crystal. Can be one of the following:

    • 4mm : (h, k, l) = (-h, -k, -l) = (k, h, l).

    • mmm : (h, k, l) = (-h, -k, -l).

    • None : No symmetry.

Returns

  • hkl : Merged list of Miller indices.

  • sfac : Merged list of structure factors.

  • serr : List of structure factor uncertainties.

  • cnts : Number of measurements of the given reflection.

Return type

A dictionary of the following attributes

model(x)[source]#

Unmerge the structure factors and intercepts.

Parameters

x (ndarray) – Current estimate of crystal structure factors and intercept values.

Return type

Tuple[ndarray, ndarray]

Returns

An array of unmerged structure factors and intercepts.

replace(**kwargs)#

Return a new container object with a set of attributes replaced.

Parameters

kwargs (Any) – A set of attributes and the values to to replace.

Return type

~D

Returns

A new container object with updated attributes.

train(bandwidth, n_iter=10, f_tol=0.0, fit_intercept=True, kind='poisson', loss='l2', max_iter=10, x0=None, verbose=True)[source]#

Perform an iterative update of sample projection maps (xtal) and crystal structure factors (sfac) until the error between the predicted intensity Bragg profiles and experimental patterns converges to a minimum. The method provides two minimisation criteria: Poisson likelihood and least squares error.

Parameters
  • bandwidth (float) – Kernel bandwidth used in the diffraction efficiency map (xtal) update.

  • n_iter (int) – Maximum number of iterations.

  • f_tol (float) – Tolerance for termination by the change of the average error. The iteration stops when \((f^k - f^{k + 1}) / max(|f^k|, |f^{k + 1}|) <= f_{tol}\).

  • fit_intercept (bool) – Update intercepts if True.

  • kind (str) – Choose between the negative log likelihood Poisson criterion (‘poisson’) and the least squares criterion (‘least_squares’)

  • loss (str) –

    Loss function to use in the least squares criterion. The following keyword arguments are allowed:

    • l1: L1 loss (absolute) function.

    • l2 : L2 loss (squared) function.

    • Huber : Huber loss function.

  • max_iter (int) – Maximum number of iterations.

  • x0 (Optional[ndarray]) – Preliminary estimate of crystal structure factors and intercepts.

  • verbose (bool) – Set verbosity of the computation process.

Return type

Tuple[IntensityScaler, ndarray]

Returns

An updated intensity scaler object and a new estimate of crystal structure factors and intercepts.

unmerge_hkl()[source]#

Unmerge symmetrical reflection during the iterative update. Each reflection will be updated separately during the scaling procedure.

Return type

IntensityScaler

Returns

A new IntensityScaler object with no merging of reflections.

update_table(x)[source]#

Update the parent cbclib.CBCTable table.

Parameters

x (ndarray) – Current estimate of crystal structure factors and intercept values.

update_xtal(x, bandwidth)[source]#

Generate a new crystal map using the kernel regression.

Parameters
  • x (ndarray) – Current estimate of crystal structure factors and intercept values.

  • bandwidth (float) – kernel bandwidth in pixels.

Return type

Tuple[IntensityScaler, ndarray]

Returns

New IntensityScaler container with the updated crystal efficiency map.

values()#

Return the attributes’ data stored in the container.

Return type

ValuesView

Returns

List of data stored in the container.