mocca.dad_data package
Subpackages
Submodules
mocca.dad_data.models module
Created on Tue Aug 3 13:16:51 2021
@author: haascp
- class mocca.dad_data.models.CompoundData(hplc_system_tag: str, experiment: dataclasses.InitVar['mocca.user_interaction.user_objects.HplcInput'], wl_high_pass: dataclasses.InitVar[float] = None, wl_low_pass: dataclasses.InitVar[float] = None)[source]
Bases:
DadData
Data container for HPLC-DAD data with peaks originating from compounds.
- experiment: dataclasses.InitVar['mocca.user_interaction.user_objects.HplcInput']
- class mocca.dad_data.models.DadData(hplc_system_tag: str, experiment: dataclasses.InitVar['mocca.user_interaction.user_objects.HplcInput'], wl_high_pass: dataclasses.InitVar[float] = None, wl_low_pass: dataclasses.InitVar[float] = None)[source]
Bases:
object
Base class for HPLC-DAD data.
- experiment: dataclasses.InitVar['mocca.user_interaction.user_objects.HplcInput']
- class mocca.dad_data.models.GradientData(hplc_system_tag: str, experiment: dataclasses.InitVar['mocca.user_interaction.user_objects.HplcInput'], wl_high_pass: dataclasses.InitVar[float] = None, wl_low_pass: dataclasses.InitVar[float] = None)[source]
Bases:
DadData
Data container for gradient HPLC-DAD data.
- class mocca.dad_data.models.ParafacData(impure_peak: dataclasses.InitVar['mocca.peak.models.CorrectedPeak'], parafac_comp_tensor: dataclasses.InitVar[tuple], boundaries: dataclasses.InitVar[tuple], shift: dataclasses.InitVar[int], y_offset: dataclasses.InitVar[float])[source]
Bases:
object
Data container for synthetic data generated from PARAFAC models.
- impure_peak: dataclasses.InitVar['mocca.peak.models.CorrectedPeak']
mocca.dad_data.process_funcs module
- mocca.dad_data.process_funcs.get_peak_locs(summed_data)[source]
Finds all peaks of data.
- Parameters:
summed_data (numpy.ndarray) – A 1D array representing the absorbances over time. Works best on data in which values below the threshold have already been zeroed (see function filter_absorbance_by_threshold).
- Returns:
peaks – List of all peaks as BasePeak objects
- Return type:
list
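The idea of locating peaks in an already thresholded 1D signal can be sketched as follows. This is only an illustration under stated assumptions, not mocca's implementation: the helper name find_nonzero_runs is hypothetical, peaks are assumed to be contiguous runs of non-zero values, and plain (maximum, left, right) index tuples stand in for BasePeak objects.

```python
import numpy as np


def find_nonzero_runs(summed_data):
    """Return (maximum, left, right) index tuples, one per run of non-zero values.

    Hypothetical helper for illustration; it is not part of mocca.
    """
    signal = np.asarray(summed_data, dtype=float)
    nonzero = signal > 0
    # Pad with False so that runs touching either end of the signal are detected.
    padded = np.concatenate(([False], nonzero, [False]))
    # Indices where the mask flips; they alternate between run start and
    # exclusive run end.
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    runs = []
    for start, stop in zip(edges[0::2], edges[1::2]):
        maximum = int(start) + int(np.argmax(signal[start:stop]))
        runs.append((maximum, int(start), int(stop) - 1))
    return runs
```

Each tuple holds inclusive left and right indices, so a run of length one has left equal to right.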
- mocca.dad_data.process_funcs.merge_peaks(summed_data, peaks)[source]
Merges overlapping peaks in the data.
- Parameters:
summed_data (numpy.ndarray) – A 1D array representing the absorbances over time. Works best on data in which values below the threshold have already been zeroed (see function filter_absorbance_by_threshold).
peaks (list) – List of all peaks as BasePeak objects
- Returns:
new_peaks – List of all peaks, each described by its maximum, left, and right positions. Peaks that overlap are merged together into one BasePeak.
- Return type:
list
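Merging overlapping peaks is essentially interval merging. The sketch below assumes peaks are given as (maximum, left, right) index tuples rather than BasePeak objects, and that the merged peak keeps the apex with the larger summed absorbance. The name merge_overlapping and both of these choices are assumptions made for illustration only.

```python
def merge_overlapping(summed_data, peaks):
    """Merge (maximum, left, right) index tuples whose index ranges overlap.

    Hypothetical helper for illustration; it is not part of mocca.
    """
    merged = []
    # Process peaks from left to right so that only the last merged peak
    # can overlap with the current one.
    for maximum, left, right in sorted(peaks, key=lambda pk: pk[1]):
        if merged and left <= merged[-1][2]:
            prev_max, prev_left, prev_right = merged[-1]
            # Keep the apex with the larger summed absorbance.
            if summed_data[maximum] > summed_data[prev_max]:
                best = maximum
            else:
                best = prev_max
            merged[-1] = (best, prev_left, max(prev_right, right))
        else:
            merged.append((maximum, left, right))
    return merged
```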
- mocca.dad_data.process_funcs.pick_peaks(compound_data, experiment, absorbance_threshold, peaks_high_pass, peaks_low_pass)[source]
Finds all peaks of data and returns them as a chromatogram.
- Parameters:
data (numpy.ndarray) – Actual experimental data with shape [# of wavelengths] x [timepoints]. Generated from dataframe with absorbance_to_array function
absorbance_threshold (float) – The threshold below which peaks are ignored. In other words, at least one (wavelength, timepoint) must have an absorbance greater than absorbance_threshold for a peak to be counted.
peaks_high_pass (float) – Time high-pass filter. Only peaks with a retention time greater than this value are used for data analysis.
peaks_low_pass (float) – Time low-pass filter. Only peaks with a retention time lower than this value are used for data analysis.
expand_peaks (boolean) – If True, then peaks will be expanded to their peak boundaries. If this is set to False, then only timepoints with cumulative absorbance greater than absorbance_threshold will be counted as part of the peak.
- Returns:
peaks – List of all peaks, as a list of tuples (maximum, left, right)
- Return type:
list
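The effect of peaks_high_pass and peaks_low_pass can be illustrated with a small retention time filter. This is a sketch under assumptions: the helper name filter_by_retention_time, the separate time vector, the (maximum, left, right) index tuples, and the convention that None disables a bound are all hypothetical and not taken from mocca.

```python
def filter_by_retention_time(peaks, time, peaks_high_pass=None, peaks_low_pass=None):
    """Keep peaks whose apex retention time lies strictly inside the given window.

    Hypothetical helper for illustration; it is not part of mocca.
    """
    kept = []
    for maximum, left, right in peaks:
        retention_time = time[maximum]
        # High pass: drop peaks that elute at or before the lower bound.
        if peaks_high_pass is not None and retention_time <= peaks_high_pass:
            continue
        # Low pass: drop peaks that elute at or after the upper bound.
        if peaks_low_pass is not None and retention_time >= peaks_low_pass:
            continue
        kept.append((maximum, left, right))
    return kept
```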
mocca.dad_data.process_gradientdata module
Created on Wed Aug 4 15:44:47 2021
@author: haascp
- mocca.dad_data.process_gradientdata.bsl_als(absorbance_array)[source]
Applies the ALS baseline algorithm row-wise (for every wavelength) to an absorbance array.
- Parameters:
absorbance_array (numpy 2D-array) – Absorbance values obtained by an HPLC run (time, wavelength dimension).
- Returns:
baseline_array – Baseline absorbance values
- Return type:
numpy 2D-array
- mocca.dad_data.process_gradientdata.bsl_als_alg(y, lam=100000.0, p=0.01, niter=3)[source]
Baseline correction algorithm. Optimized Python implementation of “Asymmetric Least Squares Smoothing” by P. Eilers and H. Boelens (2005); see the answer by Rustam Guliev at https://stackoverflow.com/questions/29156532/python-baseline-correction-library.
- Parameters:
y (list) – List of absorbance values for which the baseline should be determined.
lam (numeric, optional) – Smoothness parameter. Typically 10^2 ≤ λ ≤ 10^9, but exceptions may occur. In any case one should vary λ on a grid that is approximately linear for log λ. Often visual inspection is sufficient to find good values. The default is 1e5.
p (numeric, optional) – Asymmetry parameter. 0.001 ≤ p ≤ 0.1 (for a signal with positive peaks), but exceptions may occur. Often visual inspection is sufficient to get good parameter values. The default is 0.01.
niter (integer, optional) – Number of iterations. In the original documentation the number of iterations is fixed to 10 to emphasize the basic simplicity of the algorithm. In practical applications one should check whether the weights show any change; if not, convergence has been attained. The default is 3.
- Returns:
z – Simulated baseline of the given absorbance data.
- Return type:
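For orientation, asymmetric least squares smoothing can be sketched with NumPy alone. The sketch below is a dense, unoptimized variant written for short signals and is not the sparse implementation used by bsl_als_alg. The function name als_baseline is hypothetical, and the weight rule (p where the signal lies above the baseline, 1 - p elsewhere) follows the general description of the method.

```python
import numpy as np


def als_baseline(y, lam=1e5, p=0.01, niter=3):
    """Estimate a baseline by asymmetric least squares (dense sketch).

    Hypothetical helper for illustration; it is not part of mocca.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator with shape (n - 2, n).
    D = np.diff(np.eye(n), n=2, axis=0)
    # Roughness penalty: lam times the sum of squared second differences.
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(niter):
        # Solve the weighted, penalized least-squares problem for the baseline.
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the baseline (peaks) get the small weight p.
        w = np.where(y > z, p, 1.0 - p)
    return z
```

For a 2D absorbance array, one option would be to apply such a function to each wavelength trace along the time axis with np.apply_along_axis. Dense matrices scale poorly with signal length, which is one reason to prefer a sparse solver for real chromatograms.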
mocca.dad_data.utils module
Created on Fri Dec 10 13:31:37 2021
@author: haascp
- mocca.dad_data.utils.absorbance_to_array(df)[source]
Generates a 2D array of the absorbance values.
- mocca.dad_data.utils.apply_filter(dataframe, wl_high_pass, wl_low_pass, bandwidth=2, reference_wl=True)[source]
Filters absorbance data of tidy 3D DAD dataframes to remove noise and background systematic error.
- mocca.dad_data.utils.df_to_array(df)[source]
Takes a tidy dataframe of HPLC-DAD data and returns a numpy array of absorbance values as well as a vector for the time domain and a vector for the wavelength domain.
- mocca.dad_data.utils.get_reference_signal(dataframe, bandwidth=5)[source]
Returns the averaged signal over the last number of wavelengths as given by the bandwidth.
- mocca.dad_data.utils.sum_absorbance_by_time(data)[source]
Sums the absorbances for each time point over all wavelengths.
- Parameters:
data (numpy.ndarray) – Actual experimental data with shape [# of wavelengths] x [timepoints]. Generated from dataframe with absorbance_to_array function
- Returns:
A 1D array containing the absorbance summed over all wavelengths at each time point
- Return type:
numpy.ndarray
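With data shaped [# of wavelengths] x [timepoints], summing over wavelengths corresponds to a reduction along axis 0. A minimal NumPy sketch follows; the function name sum_over_wavelengths is hypothetical and stands in for the documented function.

```python
import numpy as np


def sum_over_wavelengths(data):
    """Sum absorbances over all wavelengths for every time point.

    Hypothetical helper for illustration; it is not part of mocca.
    """
    data = np.asarray(data, dtype=float)
    # Axis 0 is the wavelength axis, so the result has one value per time point.
    return data.sum(axis=0)


# Two wavelengths, three time points.
example = np.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0]])
summed = sum_over_wavelengths(example)
```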