UltraScan Version

Manual


SOMO HPLC-SAXS Module:

Last updated: May 2014

NOTICE: this module is being developed by E. Brookes, J. Pérez, P. Vachette, and M. Rocco.
Portions of this help file are taken from the Supplementary Materials of Brookes et al., "Fibrinogen species as resolved by HPLC-SAXS data processing within the UltraScan SOlution MOdeler (US SOMO) enhanced SAS module", J. Appl. Cryst., 46:1823-1833 (2013).

SOMO HPLC-SAXS Figure 1

This new US-SOMO module was conceived for the analysis of HPLC-SAXS data. In the image above, the main panel of the HPLC-SAXS module is shown. The buttons with black labels are currently active; those with red labels become active when allowed by the processing/visualization stage. The graphics panel shows a collection of HPLC-SAXS log10[I(q)] vs. q SAXS data frames (points with 0 or negative values are automatically omitted from the visualization only) for a BSA separation on two 7.8 × 300 mm ID columns packed with hydroxylated polymethacrylate particles (TSK G4000PWXL, 10 μm size, 500 Å pore size, and G3000PWXL, 6 μm size, 200 Å pore size, Tosoh Bioscience, Tokyo, Japan) connected in series, protected by a 6 × 40 mm guard column filled with G3000PW resin (Tosoh). Note the permanent upturn at very small q-values, most likely due to biological material aggregated in this case by the intense X-ray beam on the capillary cell walls. While this kind of problem should be (and has been) preferentially dealt with at the experimental level, we use this outdated dataset to demonstrate the potential for correcting data still presenting such an issue.

The left side of the window is divided into three sections, labeled "Data files", "Produced Data", and "Messages". By clicking on these labels, the corresponding panel below each label will disappear, allowing for an expansion of the remaining panel(s). If every panel is made to disappear, the main graph will expand to cover the full size of the HPLC-SAXS window. By clicking again on the labels, the corresponding panels will be restored.

On the top left panel (Data files) there are four buttons:

The Add files button is used to load data into the module. An operating directory can be pre-selected by clicking on the path shown above it, and navigating in the file system (selecting the Lock checkbox will fix that directory). The file format for SAXS data recognized by the US-SOMO HPLC-SAXS module consists of .dat files with two or three TAB- or space-separated columns containing the q, I(q), and optionally their associated standard deviation (SD) values, respectively. Each frame number (or time value) must be present somewhere in the filename with a common prefix and suffix. For example, data1saxs.dat, data2saxs.dat, data3saxs.dat will be recognized as frames 1, 2, 3, where "data" and "saxs" can be replaced by any common sequence of characters. Consequently, 1.dat, 2.dat, 3.dat would be acceptable, but abc1.dat, qrs2.dat, xyz3.dat would not, because the prefix characters are not common. Furthermore, the loader will also arrange the data files sequentially, in increasing frame number (or time value) order. The currently recognized format for concentration data is similar to the SAXS data format with the addition of the string "Frame data" in any place on the first line. The two or three columns of data are the frame number, concentration-related data, and optionally an associated SD value. I(q) vs. q and concentration data frames are automatically recognized and the labels on the x- and y-axes are then properly set.

Similar will select files with similar names and allow manual pattern matching entry if no new similar files are selected.

Concentration will show every file listed together with their associated concentration (mg/ml), if appropriate and properly set (see below). Concentrations can also be entered and modified manually. They can be used to normalize the I(q) vs. q data (see below). Loaded files can be displayed on the graphics panel by individually clicking on them (shift-click will select a contiguous series, ctrl-click allows multiple irregularly spaced selections). Produced data will also show up in this panel with associated putative filenames.

Remove files will discard previously selected files (see below); if the files were produced by the module, and were not previously saved, a warning window will pop up, allowing the user to proceed or to stop removing the selected items.
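The file naming rule described for the Add files button (a frame number enclosed by a common prefix and a common suffix) can be illustrated with a short Python sketch. This is not the module's loader: the function name order_frames is hypothetical, and the sketch assumes integer frame numbers and at least two files.

```python
import os

def order_frames(filenames):
    """Return (frame_number, filename) pairs sorted by frame number.

    Hypothetical helper: assumes each name is prefix + integer + suffix,
    with the same prefix and suffix for every file.
    """
    if len(filenames) < 2:
        raise ValueError("need at least two files to find a common pattern")
    digits = "0123456789"
    # Digits shared by all names (e.g. the "1" in data10, data11) belong to
    # the frame number, so they are stripped from the common prefix/suffix.
    prefix = os.path.commonprefix(filenames).rstrip(digits)
    reversed_names = [name[::-1] for name in filenames]
    suffix = os.path.commonprefix(reversed_names)[::-1].lstrip(digits)
    frames = []
    for name in filenames:
        middle = name[len(prefix):len(name) - len(suffix)]
        if not middle.isdigit():
            raise ValueError("no common prefix/suffix around a frame number: " + name)
        frames.append((int(middle), name))
    return sorted(frames)
```

With this sketch, data3saxs.dat, data1saxs.dat, data2saxs.dat are ordered as frames 1, 2, 3, while abc1.dat, qrs2.dat, xyz3.dat raise an error because the prefixes differ.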

Several buttons are available in the panel below the loaded files window:

SOMO HPLC-SAXS files commands panel

Select all will select all files.

Invert will allow toggling the selection between selected files and everything else not currently selected.

Select will open up a panel in which several selection options can be utilized (see here).

View, active when up to ten datasets are selected, will show them in text format.

M will open a pop-up window with the commands for viewing, in the main graphics window of the US-SOMO HPLC-SAXS module, a series of selected data files in a movie-like manner, and for optionally saving each frame as an image for real movie-making operations (see here).

The X and Y buttons toggle between linear and log10 scaling of the data on the x- and y-axes (if zero or negative values are present, they will be temporarily removed as they cannot be shown on the display in the log10 mode).

Rescale adjusts the X-Y axes on the graphics window to maximize the display of selected datasets (no effect on the data themselves).

Normalize will divide the I(q) data by the stored/entered concentrations.

Average will produce a weighted average with propagated SDs of selected data.

To SOMO/SAS will transfer selected datasets back into the US-SOMO SAS panel.

Each time the Width button is pressed, it increments the data line size of the plots, until it goes back to the initial value.

Color shifts the colors used in the graphics window for a better contrast with the background.

Smooth performs a regularization of selected data using a moving window, whose dimension is defined in a pop-up menu (shown below), using a Gaussian smoothing kernel of 2n + 1 points.

SOMO HPLC-SAXS smoothing window panel
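As an illustration of smoothing with a Gaussian kernel of 2n + 1 points, here is a minimal Python sketch. The function name, the default kernel width (sigma = n/2) and the handling of the curve edges (truncated, renormalized window) are assumptions made for this example and are not taken from the module.

```python
import math

def gaussian_smooth(values, n, sigma=None):
    """Smooth a list of values with a (2n + 1)-point Gaussian kernel."""
    if n < 1:
        return list(values)
    if sigma is None:
        sigma = n / 2.0  # assumed kernel width, not taken from the manual
    offsets = range(-n, n + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    smoothed = []
    for i in range(len(values)):
        num = 0.0
        den = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if 0 <= j < len(values):  # truncate the window at the edges
                num += w * values[j]
                den += w
        smoothed.append(num / den)
    return smoothed
```

Because the weights are renormalized at each point, a constant curve is left unchanged, while an isolated spike is spread over its neighbours.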

Repeak is used to effectively scale data on the Y-axis to a pre-set target, selectable in a pop-up window among the data subjected to this operation (this affects the data, a new file is generated with "rp" and the scaling factor added at the end of the filename). See more below on this subject.

SVD operates only on I(q) vs. q datasets. It opens a pop-up window where a singular value decomposition analysis (e.g., Williamson et al., Biophys J. 94, 4906-4923, 2008) can be performed on the selected data (see here). Important: the data must all be on the same grid; if not, a warning message will appear in the bottom left Messages window: "SVD: curves must be on the same grid, try 'Crop Common' first" (see below for the use of the Crop Common button).

Make I(t) is one of the crucial operations in the HPLC-SAXS module. It generates a series of "chromatograms" (I(t) vs. t, where t can be real elution time or frame number) for each q-value present in the original data files (see below).

Test I(t) will perform a test to ascertain whether a particular set of I(t) vs. t "chromatograms" contains useful data, on the basis of a comparison between the signal and its associated SD. This test can be performed automatically each time an I(q) vs. q dataset is converted into an I(t) vs. t dataset by selecting the corresponding checkbox and the SD factor in the Options menu accessible from the button provided at the bottom of this window (see here).

Make I(q) is the other crucial operation in the HPLC-SAXS module. It re-generates I(q) vs. q files for each frame after data treatment in frame- (or time-) space.

Set concentration file will select an already uploaded file containing the UV or refractive index profile vs. time or frame number as the source of the concentration-dependent signal.

Concentration detector allows selecting the type of detector and entering its calibration constant in a pop-up window (see here).
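This Help section does not specify the weighting scheme behind the Average button. As a hedged illustration of what a weighted average with propagated SDs can look like, the Python sketch below assumes inverse-variance weights (w = 1/SD²) and curves on a common grid; the function name is hypothetical.

```python
import math

def weighted_average(curves):
    """Average curves point by point with inverse-variance weights.

    curves: list of (intensities, sds) pairs, all on the same grid and
    with strictly positive SDs (assumption of this sketch).
    Returns (average_intensities, propagated_sds).
    """
    n_points = len(curves[0][0])
    avg = []
    avg_sd = []
    for i in range(n_points):
        weights = [1.0 / sds[i] ** 2 for _, sds in curves]
        total = sum(weights)
        weighted_sum = sum(w * inten[i] for w, (inten, _) in zip(weights, curves))
        avg.append(weighted_sum / total)
        # SD of an inverse-variance weighted mean
        avg_sd.append(math.sqrt(1.0 / total))
    return avg, avg_sd
```

With this choice, points with smaller SDs contribute more to the average, and the propagated SD decreases as more curves are combined.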


SOMO HPLC-SAXS files commands panel

Since a typical HPLC-SAXS experiment produces a series of I(q) vs. q data collected at some time interval ("frames"), they can be inserted in a 2D matrix where each line corresponds to a frame number (or time value) and the columns contain the intensities I(q) and their associated SDs at the various scattering angles q. It is then a simple operation to generate another matrix where the lines correspond to the q-values and each column contains the intensities I(t) (and their associated SDs) corresponding to each frame number (or time value). A new data set consisting of I(t) vs. t "chromatograms" for each q-value can then be generated.
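The matrix transposition described above can be sketched in a few lines of Python. The data layout (dictionaries of tuples) and the function names make_it and make_iq are chosen only for this illustration; all frames are assumed to share the same q grid.

```python
def make_it(frames):
    """Turn I(q) frames into I(t) chromatograms.

    frames: dict frame_number -> list of (q, intensity, sd).
    Returns dict q -> list of (frame_number, intensity, sd), in frame order.
    """
    chromatograms = {}
    for t in sorted(frames):
        for q, intensity, sd in frames[t]:
            chromatograms.setdefault(q, []).append((t, intensity, sd))
    return chromatograms

def make_iq(chromatograms):
    """Inverse operation: rebuild I(q) frames from I(t) chromatograms."""
    frames = {}
    for q in sorted(chromatograms):
        for t, intensity, sd in chromatograms[q]:
            frames.setdefault(t, []).append((q, intensity, sd))
    return frames
```

Applying make_iq to the output of make_it returns the original frames, which mirrors the Make I(t) / Make I(q) round trip when no processing is done in between.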
In the image above, the original I(q) vs. q data shown in the first image of this Help section have been transformed to I(t) vs. t data by pressing the Make I(t) button after selecting all files. The q values are now part of the resulting filenames. Since the option On Make I(t), discard I(t) with no signal above st. dev. multiplied by "4" was selected in the Options menu (see here), the following Warning message appeared:

SOMO HPLC-SAXS discard I(t) files warning
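One plausible reading of the "no signal above st. dev. multiplied by 4" criterion is that a chromatogram is kept only if at least one point exceeds its own SD times the chosen factor. The Python sketch below implements this reading; it is an assumption about the test, not a transcription of the module's code, and the function name is hypothetical.

```python
def has_signal(intensities, sds, sd_factor=4.0):
    """Return True if any point lies above sd_factor times its SD."""
    return any(i > sd_factor * s for i, s in zip(intensities, sds))
```

A chromatogram for which has_signal returns False would be a candidate for being discarded.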

In addition, a test is automatically performed to identify regions within a sliding window (of 25 frames in this case) where the sum of the intensity is less than the negative of the sum of the corresponding SD values over the window. Regions with negative values will cause problems with the integral baseline subtraction procedure (see more below). This test did not identify any such region, so no additional warning messages were produced. If the sliding window had been set at "5" (a very restrictive criterion), this message would have appeared:

SOMO HPLC-SAXS negative I(t) integral warning
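The sliding-window test described above (sum of the intensities lower than the negative of the sum of the SDs over the window) can be written compactly. The following Python sketch uses a hypothetical function name and returns the starting index of every offending window.

```python
def negative_regions(intensities, sds, window=25):
    """Return start indices of windows where sum(I) < -sum(SD)."""
    starts = []
    for start in range(len(intensities) - window + 1):
        sum_i = sum(intensities[start:start + window])
        sum_sd = sum(sds[start:start + window])
        if sum_i < -sum_sd:
            starts.append(start)
    return starts
```

A shorter window flags shorter negative stretches, which is why a window of 5 frames is a more restrictive setting than one of 25 frames.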

Some cropping operations (see below) were also performed to remove very noisy low-q datasets and to truncate the right-side of the datasets to frame #200, followed by display rescaling. All operations are recorded in the bottom left panel.

The file names of produced data are shown in the Produced Data panel to the centre-left, and can be selected and saved to files using the appropriate buttons below it.

Select all will select all files in this panel.

Invert will allow toggling the selection between selected files and everything else not currently selected.

Similar will search for similar file names after selecting a single file in this panel.

Remove will discard the selected files.

Two types of files can be produced, csv-style (Save CSV) or regular 3-columns .dat files (Save).

Show will add the selected file(s) among those produced to the ones already displayed in the graphics window.

Show only will show only the selected file(s) among those produced in the graphics window.

In the Messages area, the operations performed are tracked, and computed parameters are shown. The display can be copied or cleared from the File pull-down menu.

The last line of the left-side panels contains the Help and Options buttons. On pressing the latter, a pop-up panel will be shown:

SOMO HPLC-SAXS options

See here for a description of this module.


Below the US-SOMO HPLC-SAXS module graphics panel there are a series of buttons for performing several operations on the files displayed, some of which will become available only when multiple files are selected, or a region of the graph is zoomed, while others will become available only when single files are selected:

SOMO HPLC-SAXS commands

When a part of the graph is selected using the mouse/left button, the buttons in the bottom line become all available (only Crop Zeros, Crop Common, and Undo are available when files are just displayed after selection).


SOMO HPLC-SAXS integral baseline setting 1

After visual inspection of the I(t) vs. t chromatograms, baselines can be defined, if needed, for instance if the data after the peak(s) do not return to the initial values. A single chromatogram is first selected; a compromise between high intensity and low noise works best.
The Baseline button is then pressed. As shown in the image above, this superimposes three vertical lines on the right side of the selected chromatogram, replaces the last line of buttons under the graphics window with three colored fields (magenta-red-magenta), and draws two horizontal dashed lines (orange and green). The two magenta lines define the beginning and end, respectively, of the chromatogram region over which the data are averaged to set the end of a baseline, while by default the baseline is set to be at zero at the beginning of the data on the left side. The horizontal dashed orange line indicates the trend of a linear regression done on this region. The vertical red line instead defines the end point of the data to be subjected to the baseline correction. The positions of the three vertical lines are indicated in the three background color-coded fields.

SOMO HPLC-SAXS integral baseline setting 2

By clicking on each field, the corresponding line can be moved across the chromatogram by left-clicking and moving the mouse over the gray-shades bar-wheel which has appeared just below the graph in this panel. The green dashed line will give the trend of a linear baseline between the chosen starting and end points. The orange line should be set to possibly match a flat baseline region at the end of the chromatogram, as shown in the example above. Be careful to identify and avoid potential late eluting peaks.

Pressing Keep once a reasonable end region has been defined will keep its parameters for the further operations involved in the integral baseline removal tool.

Cancel will remove the settings and revert to no baseline.

SOMO HPLC-SAXS integral baseline setting 3

After pressing Keep, the Baseline apply button becomes available. Pressing it will apply the integral procedure to the selected chromatogram(s). Each resulting baseline-subtracted chromatogram will have a "-bi" added after the q value and an "-s" at the end of the filename to indicate that an integral baseline subtraction was applied (if a linear baseline option is used, the first label will be "-bl"). The numerical value of the overall change in baseline and the alpha value (for an explanation of alpha, see here) are also added to the filename of the produced files, as shown in the Data files panel. It is best to first check the results on the initial chromatogram used to set the baseline parameters, and to select the Produce separate baseline curves checkbox in the Options menu (see here). In this way, the convergence of the integral baseline removal procedure can be tested.
In the example shown above, the original curve at q = 0.00680487 Å⁻¹ (green) is compared with the baseline subtracted curve (pink), and superimposed are the five baseline curves produced by the iterative integral baseline subtraction procedure (from light green to purple; see legend for all the colors). As can be seen, the convergence in this case is reached by the 2nd iteration. It is then best to check the integral baseline subtraction results on a few files at higher q values than the first selected, to verify that consistent results are obtained. Subsequently, the Produce separate baseline curves checkbox in the Options menu (see here) should be deselected, and a set of I(t) curves starting from the lowest q value used to test the procedure up to an intermediate q value should be selected, and the Baseline apply button should be pressed again. This will produce a set of baseline subtracted curves still missing the lower q, more noisy I(t) datasets, and the higher q, less intense I(t) datasets, as shown in the image below, where the q range is from 0.00403252 to 0.0400725 Å⁻¹:

SOMO HPLC-SAXS integral baseline applied

Note how around frame 160 of the I(t) vs. t chromatograms the signal now oscillates around 0. The procedure can be repeated on the remaining I(t) curves at the higher q values:

SOMO HPLC-SAXS integral baseline applied higher q

The integral baseline subtraction procedure always performs a test to verify that the integral is not negative, which would lead to an addition of signal rather than a subtraction. This usually originates at high q values and it's due mainly to slightly excessive buffer subtraction. In this case, the integral baseline is not subtracted, and the following message appears:

SOMO HPLC-SAXS integral baseline negative warning

The I(t) datasets at very low q values can now be processed. If necessary, the settings can be adjusted for each chromatogram before repeating the Baseline apply procedure. The image below illustrates the results obtained after individual adjustment of the right-side baseline start and end regions for the five I(t) datasets at the lowest q values:

SOMO HPLC-SAXS integral baseline applied lowest q

A final, baseline-adjusted set is then obtained by visualizing all the curves produced and removing the noisier ones:

SOMO HPLC-SAXS integral baseline final set
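To give a feeling for how an iterative integral baseline behaves, the Python sketch below implements a generic integral (Shirley-type) baseline: the baseline starts at zero, grows in proportion to the cumulative integral of the baseline-subtracted signal, and reaches the average of the chosen end region. This is only an illustration under stated assumptions and not the module's algorithm: the alpha parameter is not modelled, the correction end point is taken to coincide with the end of the averaging region, and all names are hypothetical.

```python
import math

def integral_baseline(intensities, end_start, end_stop, iterations=5):
    """Return (corrected, baseline) for a generic integral baseline.

    end_start, end_stop: index range of the end region whose average
    sets the final baseline level (the baseline is zero at the start).
    If the integral is not positive, the data are returned unchanged.
    """
    n = len(intensities)
    delta = sum(intensities[end_start:end_stop]) / (end_stop - end_start)
    baseline = [0.0] * n
    for _ in range(iterations):
        signal = [i - b for i, b in zip(intensities, baseline)]
        total = sum(signal[:end_stop])
        if total <= 0:
            # a negative integral would add signal instead of removing it
            return list(intensities), [0.0] * n
        running = 0.0
        new_baseline = []
        for k, s in enumerate(signal):
            if k < end_stop:
                running += s
            new_baseline.append(delta * running / total)
        baseline = new_baseline
    corrected = [i - b for i, b in zip(intensities, baseline)]
    return corrected, baseline

# Synthetic example: a Gaussian peak plus a drift proportional to its integral
peak = [10.0 * math.exp(-0.5 * ((t - 40) / 5.0) ** 2) for t in range(100)]
area = sum(peak)
drift = [sum(peak[:k + 1]) / area for k in range(100)]
data = [p + d for p, d in zip(peak, drift)]
corrected, baseline = integral_baseline(data, 80, 100)
```

In this synthetic example the corrected curve oscillates around zero in the end region, and successive iterations change the baseline less and less, which is the kind of convergence that the Produce separate baseline curves option allows checking.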


The quality of the baseline-subtracted data can be assessed by selecting with the mouse a region under a peak and then pressing Test I(q):

SOMO HPLC-SAXS test I(q) selection

This will bring up again the gray-shades bar-wheel and change the two lowermost rows of buttons below the graphics window. At the bottom, a Time range for I(q): label will appear, followed by two fields with red background indicating the region subjected to the Test I(q) procedure. The limits can be changed by either clicking on each red-colored field and then using the gray-shades bar-wheel at the top, or, if a region was pre-selected with the mouse, by clicking on the Vis. range button.
Two operations can be then performed. Pressing the Scale button on the row above will change the layout in this way (note that, by pressing Y on the left-side commands panel, the left-axis has also been changed to log10):

SOMO HPLC-SAXS test I(q) scale layout

The two lowermost rows now display the tools for scaling the back-generated temporary I(q) vs. q curves on top of each other. The two red-background fields now indicate the actual q range for scaling, which can be adjusted by clicking on each field and using the gray-shades bar-wheel at the top; two vertical red lines will mark the corresponding positions on the graph. The Reset q range button will re-expand the q range.
The last row contains the scaling settings/commands:

In the image below, the scaling has been performed on the indicated q range. The Messages panel reports the statistics of the scaling process as applied to each curve scaled on the one with the higher intensity:

SOMO HPLC-SAXS test I(q) scaled set

A blow-up of the scaled regions shows how the back-generated curves appear to scale well on each other:

SOMO HPLC-SAXS test I(q) scaled set blow-up

Pressing Cancel will completely exit from the Test I(q) mode.

Pressing Test I(q) will instead exit from the scaling mode only.

Pressing then the Guinier button will call the other test function available in the Test I(q) mode:

SOMO HPLC-SAXS test I(q) Guinier test initial

The lowermost row now carries the tools necessary to perform a Guinier analysis on the back-generated temporary I(q) vs. q curves:

At the bottom of this window region, the average values for all the curves are reported, and they include:
The regression data for each individual curve are reported in the Messages window.

The q-range should be adjusted so that the average qmax*Rg is below the limit required by the theory (between 1 and 1.3), and the residuals of each linear regression can be seen by pressing Residuals:

SOMO HPLC-SAXS test I(q) Guinier test adjusted

Note how the average Rg recovered, 28.1 Å, compares well with that calculated from the BSA crystal structure using Crysol, 27.7 Å (Brookes et al., J. Appl. Cryst. 46:1823-1833, 2013).
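For readers unfamiliar with the Guinier analysis, the approximation ln I(q) = ln I(0) - (Rg² q²)/3 turns the determination of Rg into a linear regression of ln I(q) against q². The Python sketch below performs an unweighted regression on synthetic data; the function name is hypothetical, and the module itself may weight the points differently.

```python
import math

def guinier_fit(q, intensity):
    """Unweighted Guinier regression; returns (Rg, I0, qmax*Rg).

    Requires strictly positive intensities (their logarithm is taken).
    """
    x = [qi * qi for qi in q]
    y = [math.log(i) for i in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative slope: no Rg can be derived")
    rg = math.sqrt(-3.0 * slope)
    return rg, math.exp(intercept), max(q) * rg

# Synthetic, noise-free Guinier curve with Rg = 28.1 Angstrom and I(0) = 100
q_values = [0.005 + 0.001 * k for k in range(36)]
i_values = [100.0 * math.exp(-((28.1 * qi) ** 2) / 3.0) for qi in q_values]
rg, i0, qmax_rg = guinier_fit(q_values, i_values)
```

The third returned value, qmax*Rg, is the quantity to compare against the 1 to 1.3 limit when choosing the q-range.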

At this point, a plot of the Rg values across the chromatogram together with a typical I(t) profile (continuous green curve) can be shown by pressing Rg plot:

SOMO HPLC-SAXS test I(q) Guinier test Rg plot

A new row will appear below the graphics window, with these fields:

The Rg plot utility can also be employed to verify the distribution of Rg values across the whole chromatogram:

SOMO HPLC-SAXS test I(q) Guinier test Rg plot global

In the example shown above, one can appreciate that the distribution of the Rg values is almost linearly decreasing from the beginning of the chromatogram up to the top of the peak preceding the main peak, fairly constant across the right-half of that peak, and just slightly decreasing across the main peak. This indicates that while the resolution between the main peak and the peak preceding it is fairly good, this is not the case for the other peaks. A Gaussian analysis and decomposition, as also suggested by the SVD analysis (see here), could then be useful to determine the actual Rg values in the unresolved regions.

In the global Rg plot analysis, note that the qmax*Rg ≤ 1-1.3 rule cannot be met across the whole chromatogram. The Rg values displayed should thus be considered as only indicative. In the next release, we plan to introduce an automatic qmax*Rg limiter in the linear regression analysis (similar to the one already present in the Guinier analysis module of the main US-SOMO SAS module) to make the computed Rg values more reliable.

Pressing the Test I(q) button will bring back the main options of this utility.

Pressing the Cancel button will exit the Test I(q) utility.

If Gaussian analysis is not required, a series of I(q) vs. q frames can be re-created at this stage from the baseline-corrected data by pressing the Make I(q) button.



Before proceeding to Gaussian analysis (whose theory can be seen here), an SVD analysis could be useful. In SVD analysis, the number of significant singular values in the decomposition should be equal to the number of components in the data, and thus to the minimum number of Gaussians required to accurately reconstruct the data (see here).

SOMO HPLC-SAXS Gaussian initialization

The baseline-subtracted data can be subjected to Gaussian analysis by first selecting a single chromatogram, and then pressing the Gaussians button. By default, the US-SOMO HPLC-SAXS module will consider symmetrical Gaussians, but distorted Gaussian functions are also available and can be selected from the Options menu (see here). The choice must be made before starting the following procedure. An example of data processing with non-symmetrical Gaussians is presented here.

On pressing Gaussians, two new rows will appear at the bottom of the graphics window. If a previously-generated set of Gaussians was present or loaded from file, the Gaussians will show up under the peak(s) together with vertical lines indicating their centers.

Clear will remove them, and allow a new analysis to be started.

Each time the New button is pressed, a new Gaussian will be added (green colour), with pre-set center, width and amplitude shown in the three rightmost fields (additional fields will be present if distorted Gaussian functions are used; see here). By clicking on each field, and then using the gray-shades bar-wheel, each Gaussian can be adjusted to initialize the process (usually, only the centers need to be positioned under the peaks). If the Match checkbox is selected, the height of each Gaussian will be automatically adjusted so as to match the height of the experimental I(t) vs. t curve at the current position of the Gaussian.

Del will remove only the current Gaussian.

Clicking on the "<" and ">" buttons will toggle among the Gaussians present, whose identifying number is shown in the field between them. The active Gaussian is identified by a magenta vertical line positioned at its center, while blue lines are used for the others. The limits for the analysis of the chromatogram are shown with two vertical red bars, whose position is shown in the two red-background fields in the bottom row, before the To produced data button.

The SD checkbox controls whether the std. dev. values associated with the data will be used in the fitting procedure (default: not selected). It is recommended to select this checkbox only after a first round of fitting with the various algorithms provided has been performed, as the SD can only be used with the LM algorithm at the time of writing this Help section (May 2014).

Once the initialization is completed, pressing the Fit button will bring up a window controlling the fit procedure, shown below:

SOMO HPLC-SAXS Gaussian Fit panel

See here for a description of the Fit module.

In the first cycle of iteration, it is best to keep the original centers fixed:

SOMO HPLC-SAXS Gaussian first round center fixed

In the example shown, a not well-defined aggregates peak is present at the beginning, and an extended initial baseline is not present. If the first Gaussian is left free to adjust, it will expand too much to compensate for the missing initial baseline. Therefore, in such situations it is best to keep fixed the position of the first Gaussian:

SOMO HPLC-SAXS Gaussian first round center fixed

A final round of fitting can then be performed using the SD and allowing a 5% variation on the Gaussian centers at each iteration, until a satisfactory fit of the main peak(s) is obtained:

SOMO HPLC-SAXS Gaussian initial fit

If some datasets have zero or NaN values for one or more SD values, a pop-up menu will appear listing all the files presenting this problem, and with how many occurrences. The user can then select between three options: drop the datasets containing these non-positive SDs; drop just the frame (or time) point missing the positive SD(s); or not use SD weighting.

The global improvement of the fit can also be judged by the rmsd (SD checkbox not selected) or χ² (SD checkbox selected) value, which is updated next to the Fit button. The residuals of the fit can be visualized by pressing the Residuals button, which will split the graphics window in two, and show a plot of the fit residuals below the main plot. The residuals plot can be removed by pressing Residuals again (see more below). In the example shown above, the residuals are weighted by the std. dev. associated with the experimental points (SD checkbox selected; a By percent residuals option is also available).
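The two figures of merit mentioned above can be illustrated with their textbook definitions: the rmsd is the square root of the mean squared residual, and χ² is the sum of the squared residuals divided by the squared SDs. The Python sketch below uses these definitions; whether the module reports a plain or a reduced χ² is not stated in this Help section, so the plain sum used here is an assumption, as are the function names.

```python
import math

def rmsd(data, fit):
    """Root-mean-square deviation between data and fit (no SD weighting)."""
    return math.sqrt(sum((d - f) ** 2 for d, f in zip(data, fit)) / len(data))

def chi_squared(data, fit, sds):
    """Sum of squared SD-weighted residuals (not divided by the degrees of freedom)."""
    return sum(((d - f) / s) ** 2 for d, f, s in zip(data, fit, sds))
```

In the SD-weighted case, points with large uncertainties contribute less to the total, which is why the two values are not directly comparable with each other.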

Once a satisfactory fit is reached, pressing Keep will accept the current Gaussians for further work. But to save the parameters of the current Gaussians in a file, the Save button has to be pressed before Keep.

Cancel will cancel the operations and remove all the Gaussians.

Once an initial set of fitted Gaussians has been produced, it should be globally fitted to all chromatograms. However, performing this operation directly on all chromatograms can be very computationally intensive. For this reason, it is best to perform it on a subset of all chromatograms, and then propagate the global fit results to all remaining chromatograms. Importantly, in the global fitting procedure the centers and widths of each particular Gaussian are optimized so as to be the same across all chromatograms, and only the amplitudes are then fitted.

To select a subset of data, the Select button is pressed, which will open the pop-up selection panel (see image below and here).

SOMO HPLC-SAXS Global Gaussian Nth selection

It is advisable to perform the global fitting avoiding the very first few low-q, very noisy, and the last high-q, very low signal I(t) vs. t chromatograms. In the example we are illustrating, we start from chromatogram # 6 (q = 0.007813 Å⁻¹) and select every 4th chromatogram up to # 452 (q = 0.119698 Å⁻¹). The I(t) vs. t chromatogram on which the initial set of Gaussians was optimized is also included (Select Additionally button). Pressing Select in main window will close the pop-up window and the selected files will be shown in the main HPLC-SAXS module graphics window:

SOMO HPLC-SAXS Global Gaussian on Nth

The Global Gaussians button is now available. Pressing it will simply find the amplitudes best fitting all the selected chromatograms based on the centres and widths found on the initial chromatogram. This operation has to be performed before the global fit.
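When the centres and widths are held fixed, finding the best amplitudes for one chromatogram is a linear least-squares problem. The Python sketch below illustrates this for symmetrical Gaussians by solving the normal equations; it is unweighted, takes the width to be the Gaussian standard deviation, and uses hypothetical function names, all of which are assumptions of this example and not a description of the module's solver.

```python
import math

def unit_gaussian(t, center, width):
    """Gaussian of unit amplitude; width is taken as the standard deviation."""
    return math.exp(-0.5 * ((t - center) / width) ** 2)

def fit_amplitudes(times, intensities, centers, widths):
    """Least-squares amplitudes for Gaussians with fixed centers and widths."""
    basis = [[unit_gaussian(t, c, w) for t in times] for c, w in zip(centers, widths)]
    m = len(basis)
    # Augmented normal-equation matrix [A^T A | A^T y]
    aug = [[sum(a * b for a, b in zip(basis[i], basis[j])) for j in range(m)]
           + [sum(a * y for a, y in zip(basis[i], intensities))]
           for i in range(m)]
    # Gaussian elimination with partial pivoting
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, m):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution
    amplitudes = [0.0] * m
    for row in range(m - 1, -1, -1):
        rest = sum(aug[row][k] * amplitudes[k] for k in range(row + 1, m))
        amplitudes[row] = (aug[row][m] - rest) / aug[row][row]
    return amplitudes

# Synthetic chromatogram built from two Gaussians with known amplitudes
times = list(range(100))
centers, widths = [40.0, 60.0], [5.0, 8.0]
signal = [3.0 * unit_gaussian(t, 40.0, 5.0) + 7.0 * unit_gaussian(t, 60.0, 8.0) for t in times]
amps = fit_amplitudes(times, signal, centers, widths)
```

Since only the amplitudes are free, the same calculation can be repeated independently for every chromatogram, which is much cheaper than re-optimizing centres and widths.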

If datasets having points with 0 or NaN std. dev. values are found, a pop-up panel will appear:

SOMO HPLC-SAXS Drop 0 SD pop-up

When just a few problematic points are found for each file, the Drop points with 0 SDs option can be safely used. The global Gaussians operation will then be completed:

SOMO HPLC-SAXS Global Gaussian on Nth final

In the image above, the Global Gaussians results on the Nth selected files are shown, together with the grouped fit residuals. The common centers and widths, not optimized but just based on the initial chromatogram fit, are displayed in the graph as vertical and horizontal bars, respectively.

Save will save the resulting Gaussians to the current selected directory, with extension -gauss.dat for symmetrical Gaussians of single files and -mgauss.dat for Gaussians of multiple files. For distorted Gaussians, the extensions will be -mgmg.dat, -memg.dat, and -memggmg.dat for the GMG, EMG and EMG+GMG Gaussians, respectively.

Global fit, which becomes available instead of the Fit button once a series of chromatograms is selected and after at least an initial set of Gaussians is generated/loaded, can now be used to optimize all the centres and widths of each Gaussian along all the chromatograms to common values for each family of Gaussians. The operation is controlled by the same pop-up Fit panel as for the single chromatogram case (see here), but only the LM method is currently (May 2014) available. In this example, it is best to first perform a global fit round keeping the Gaussians 1 and 2 fixed, and then perform a second round leaving all parameters free.

SOMO HPLC-SAXS Gaussian Global fit

In the image above, the results of the Global fit are shown together with the grouped fit residuals. Note how the global residuals appear to be evenly distributed up to the boundaries of the main peak, with a noticeably worse fit at the end of the main peak. This could indicate either a slightly non-Gaussian shape of the peak, or the presence of a small amount of some trailing material in this region.

It is best to first Save and then Keep the results, and then select all the available I(t) vs. t chromatograms (use Select all if only I(t) vs. t data are present in the Data files section).

Global Gaussians can now be applied to all the selected chromatograms.

SOMO HPLC-SAXS Global Gaussian Global on all

The image above shows the global Gaussians results after applying the global fit parameters found on a subset of data to all chromatograms. Note that the residuals x-axis was rescaled (using the graph controls accessed by right-clicking on the graphics window plots) to align it with the selected fit region delimited by the two red vertical lines.

Save and Keep can then be sequentially pressed to store and accept the global Gaussian results.

Pressing the 3D button will generate a 3D plot of the data, allowing easier detection of potential fitting issues. First a pop-up window will appear:

SOMO HPLC-SAXS 3D plot Gaussians selector

More information about this small module can be found here.

SOMO HPLC-SAXS 3D plot of Global Gaussians

This interactive plot can show any selected set of the Global Gaussians over any collection of curves. The interface is fully interactive for rotations, scaling and zooming along with multiple display and save controls. Its utilization is helpful for visualizing the quality of the global fit.

The goodness of the fit for each I(t) vs. t chromatogram can be checked by selecting a single chromatogram and then pressing Gaussians, as shown below:

SOMO HPLC-SAXS Global Gaussians results on a single chromatogram

Cancel will revert to no Gaussian visualization mode.

After Gaussian decomposition, the Test I(q) procedure can be repeated. First, all the available I(t) vs. t chromatograms for which Gaussians have been produced are selected, and then the Test I(q) button is pressed. The third row of commands under the graphics window will now show additional options:

SOMO HPLC-SAXS Test I(q) after Global Gaussians decomposition

The round checkboxes labeled none, 1, 2, 3, and 4 allow selecting which Gaussian will be used to produce the corresponding decomposed I(q) vs. q data, as a pointwise % of the original I(t) vs. t data based on the relative contribution of all Gaussians at that particular point in t space. If the square as pure Gaussian checkbox is selected, the actual Gaussian value will instead be used (effectively smoothing the data).
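The two options described above can be illustrated with a minimal sketch. It is not the module's actual code: the function names, the sample numbers, and the reading of "width" as the Gaussian standard deviation are all assumptions made for illustration.

```python
import math

def gaussian(t, amplitude, center, width):
    """Value at t of a Gaussian; width is assumed to be the standard deviation."""
    return amplitude * math.exp(-0.5 * ((t - center) / width) ** 2)

def decompose(t, intensity, gaussians, index, as_pure_gaussian=False):
    """Part of one I(t) data point attributed to gaussians[index].

    gaussians is a list of (amplitude, center, width) tuples.
    """
    values = [gaussian(t, *g) for g in gaussians]
    if as_pure_gaussian:
        # Use the fitted Gaussian value itself (this smooths the data).
        return values[index]
    total = sum(values)
    if total == 0.0:
        return 0.0  # no Gaussian contributes at this point
    # Pointwise fraction of the measured intensity.
    return intensity * values[index] / total

# Hypothetical fit results and a hypothetical data point between two peaks.
gaussians = [(1.0, 50.0, 3.0), (4.0, 60.0, 3.0)]
t, measured = 55.0, 1.3
parts = [decompose(t, measured, gaussians, i) for i in range(len(gaussians))]
pure = decompose(60.0, measured, gaussians, 1, as_pure_gaussian=True)
```

With the pointwise % option the parts always add up to the measured value, while with the pure Gaussian option the measured value is ignored.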

In the first example shown below, the Rg plot for the region of the main peak using the 4th Gaussian can be seen:

SOMO HPLC-SAXS Test I(q) after Global Gaussians decomposition G4 main peak

The contribution of the 3rd Gaussian under the main peak can now be evaluated (note that the qmax*Rg limit has been adjusted for this dataset):

SOMO HPLC-SAXS Test I(q) after Global Gaussians decomposition G3 main peak

As can be seen, there is a slight contribution (see the ~500-fold reduction in intensity in the Guinier plot) from the peak preceding the main peak, evidenced by the fairly constant but considerably higher Rg values (≈ 41.5 Å), likely identifying this material as a BSA dimer.
If we perform this analysis selecting the top region of the dimer peak, we find similar Rg values, ≈ 41.6 Å, evenly distributed across the peak:

SOMO HPLC-SAXS Test I(q) after Global Gaussians decomposition G3 dimer peak

The contribution of the 2nd Gaussian under the dimer peak is shown below:

SOMO HPLC-SAXS Test I(q) after Global Gaussians decomposition G2 dimer peak

Again, considering the very low intensity, the Rg values are reasonably flat across the dimer peak, and they are higher (≈ 56 Å), as expected for the likely trimer contribution under the dimer peak.
Finally, the contribution of the 2nd Gaussian under the trimer peak is evaluated:

SOMO HPLC-SAXS Test I(q) after Global Gaussians decomposition G2 trimer peak

confirming the Rg values (≈ 56 Å) and a reasonably flat distribution.
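The Rg values quoted above come from Guinier plots. As a minimal sketch of that kind of analysis, assuming the standard Guinier approximation ln I(q) = ln I(0) - (Rg²/3) q², the code below fits a straight line to ln I vs. q² and shortens the fit range until qmax*Rg falls below a chosen limit. The function name, the default limit of 1.3, and the synthetic data are assumptions; the module's own fitting procedure may differ.

```python
import math

def guinier_rg(q, intensity, qmax_rg_limit=1.3):
    """Estimate (Rg, I(0)) from a linear fit of ln I(q) vs. q^2.

    q must be sorted in increasing order. Points are dropped from the
    high-q end until qmax * Rg <= qmax_rg_limit. Returns None on failure.
    """
    pts = [(qi * qi, math.log(ii)) for qi, ii in zip(q, intensity) if ii > 0]
    while len(pts) >= 3:
        n = len(pts)
        mx = sum(x for x, _ in pts) / n
        my = sum(y for _, y in pts) / n
        sxx = sum((x - mx) ** 2 for x, _ in pts)
        sxy = sum((x - mx) * (y - my) for x, y in pts)
        slope = sxy / sxx
        if slope >= 0:
            return None  # not a Guinier-like decay
        rg = math.sqrt(-3.0 * slope)
        qmax = math.sqrt(pts[-1][0])
        if qmax * rg <= qmax_rg_limit:
            return rg, math.exp(my - slope * mx)
        pts.pop()  # shorten the fit range and try again
    return None

# Synthetic, noise-free data for a hypothetical particle with Rg = 41.5 Å.
q = [0.005 * k for k in range(1, 11)]
intensity = [2.0 * math.exp(-(qi * 41.5) ** 2 / 3.0) for qi in q]
result = guinier_rg(q, intensity)
```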


Each individual Gaussian is defined by three numbers: the amplitude, width and center. As such, they are not "curves" in the sense of the loaded files, which are collections of data points. Therefore, the Gaussians cannot be visualized with the facilities of the program outside of Gaussian or Global Gaussian modes.
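To make the distinction concrete, here is a minimal sketch of how Gaussians stored as three parameters could be sampled on a time/frame grid to obtain collections of data points, together with their sum. The function names and values are hypothetical, and "width" is assumed to be the standard deviation.

```python
import math

def gaussian_curve(times, amplitude, center, width):
    """Sample one Gaussian (width assumed to be sigma) on the given grid."""
    return [amplitude * math.exp(-0.5 * ((t - center) / width) ** 2)
            for t in times]

def sampled_curves(times, gaussians):
    """Return one sampled curve per Gaussian, plus their pointwise sum."""
    curves = [gaussian_curve(times, *g) for g in gaussians]
    total = [sum(column) for column in zip(*curves)]
    return curves, total

times = list(range(40, 81))                        # hypothetical frame numbers
gaussians = [(1.0, 50.0, 3.0), (4.0, 60.0, 3.0)]   # (amplitude, center, width)
curves, total = sampled_curves(times, gaussians)
```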

To allow the visualization of the Gaussians, the To produced data button is provided which produces curves of individual Gaussians and their sum. This is available in either Gaussian or Global Gaussian modes. The resulting curves are collections of data points that can be visualized outside of the Gaussian modes.

The Global Fit method requires a simultaneous fit of all the selected curves. This is internally represented by joining all the selected curves along the time/frame dimension to produce one long curve. Of course, each curve is generally on the same time/frame axis range, so to maintain increasing time/frame numbers, curves subsequent to the first one are placed into the joined curve with an offset in time/frame.

To visualize the joined curve and the Global Gaussian fit to the joined curve the Make result curves button is provided. This will create the joined curve along with the joined Global Gaussian fit as a pair of curves that can be visualized outside of the Global Gaussian mode, as in the example shown below:

SOMO HPLC-SAXS Joined result curves and original data
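The joining of curves along the time/frame dimension can be sketched as follows. This is only an illustration: the function name is hypothetical, and the choice of starting each subsequent curve one time/frame unit after the end of the previous one is an assumption, not necessarily the offset used by the program.

```python
def join_curves(curves):
    """Join (times, values) curves into one long curve with increasing time.

    Every curve after the first is shifted so that it starts one unit after
    the end of the previous one (an assumed choice of offset).
    """
    joined_t, joined_v, offsets = [], [], []
    offset = 0.0
    for times, values in curves:
        if joined_t:
            offset = joined_t[-1] + 1.0 - times[0]
        offsets.append(offset)
        joined_t.extend(t + offset for t in times)
        joined_v.extend(values)
    return joined_t, joined_v, offsets

# Two hypothetical chromatograms sharing the same frame range.
curves = [([1, 2, 3], [10.0, 20.0, 30.0]),
          ([1, 2, 3], [11.0, 21.0, 31.0])]
joined_t, joined_v, offsets = join_curves(curves)
```

Keeping the list of offsets makes it possible to map any point of the joined curve back to its original curve and time/frame.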



If available, a concentration chromatogram deriving from UV or refractive index monitors can now be processed. After uploading a suitable file with the Add button:

SOMO HPLC-SAXS Concentration Chromatogram

the first operation is to rescale it to one of the high intensity but relatively low-noise I(t) vs. t chromatograms. This is done by selecting the two files:

SOMO HPLC-SAXS Concentration Chromatogram and target file

and pressing Repeak in the left-side command panel, which will bring up a small window asking to identify the target chromatogram (in case multiple were selected).

SOMO HPLC-SAXS Target Chromatogram selection

Usually concentration detector data have no associated SDs. In this case, another pop-up panel will appear presenting three options:

SOMO HPLC-SAXS SD options in repeak

Selecting the second option will allow a consistent Gaussian analysis of the concentration detector data.

SOMO HPLC-SAXS Conc. Chromatogram Repeaked

The result of a re-peak operation is shown above, and the scaling factor is added to the concentration dataset filename.
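As a rough sketch of what such a rescaling amounts to, the code below scales a concentration chromatogram so that its maximum matches that of the target chromatogram and returns the scaling factor. That the re-peak operation matches the maxima in exactly this way is an assumption, and the function name and data are hypothetical.

```python
def repeak(conc, target):
    """Scale conc so that its maximum equals the maximum of target.

    Returns the rescaled values and the scaling factor used.
    """
    scale = max(target) / max(conc)
    return [c * scale for c in conc], scale

conc = [0.0, 0.5, 2.0, 0.5]        # hypothetical UV trace
target = [1.0, 10.0, 40.0, 9.0]    # hypothetical I(t) vs. t chromatogram
scaled, scale = repeak(conc, target)
```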

After re-peaking, the concentration chromatogram usually must be time-shifted to align its peaks to the I(t) vs. t chromatograms using the Timeshift on button.

Again, two files must be selected: one is the concentration data, the other one of the I(t) vs. t chromatograms (the file used for re-peaking is normally used for this operation). On pressing the Timeshift on button, a pop-up window will ask to select which chromatogram has to be time-shifted:

SOMO HPLC-SAXS Conc. Chromatogram Timeshift select

The alignment is then performed manually by left-clicking and moving the mouse over the grey-shades wheel bar below the graphics window until the two chromatograms are best aligned.

SOMO HPLC-SAXS Conc. Chromatogram Timeshift select

The value of the timeshift is reported in the field next to the Timeshift on button.
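In the program the alignment is done by eye. Purely to illustrate what the timeshift value represents, the sketch below uses a different, automated technique: a brute-force search for the integer frame shift that minimizes the mean squared difference between the two chromatograms. The function name and the synthetic peaks are hypothetical, and both traces are assumed to be already re-peaked and on the same frame spacing.

```python
import math

def best_shift(conc, target, max_shift):
    """Integer shift (in frames) of conc that best matches target.

    A positive shift moves conc towards later frames. The score is the
    mean squared difference over the overlapping frames.
    """
    best, best_score = 0, float("inf")
    for shift in range(-max_shift, max_shift + 1):
        squared, count = 0.0, 0
        for i, c in enumerate(conc):
            j = i + shift
            if 0 <= j < len(target):
                squared += (c - target[j]) ** 2
                count += 1
        if count and squared / count < best_score:
            best, best_score = shift, squared / count
    return best

# Synthetic peaks: the concentration peak elutes 3 frames before the SAXS one.
target = [math.exp(-0.5 * ((i - 12) / 2) ** 2) for i in range(30)]
conc = [math.exp(-0.5 * ((i - 9) / 2) ** 2) for i in range(30)]
shift = best_shift(conc, target, max_shift=10)
```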

Cancel will stop the operation.

Keep will keep the time-shifted data. The produced data will have the timeshift value added to its filename on saving.

Selecting only the time-shifted concentration chromatogram data, and pressing Set concentration file will then associate the time-shifted data to the I(t) vs. t data under analysis.

The re-peaked, time-shifted concentration chromatogram can now be fitted with Gaussians, using for initialization the set derived from the I(t) vs. t chromatograms (note: it is mandatory that the same number of Gaussians be used for both the concentration and I(t) vs. t chromatograms).

This is done by first selecting only the concentration chromatogram and then pressing Gaussian. A warning message will appear:

SOMO HPLC-SAXS Conc. Chromatogram initial Gaussians

asking to rescale or not to rescale the Gaussian amplitudes. After selecting the first option, the Gaussians will be rescaled to the highest peak, and will be shown superimposed to the concentration chromatogram:

SOMO HPLC-SAXS Conc. Chromatogram initial Gaussians

Pressing Fit will then bring up the Fit window (see here), and an initial round is done keeping both the positions and the widths fixed. If necessary, a refinement can then be done by keeping the smallest, front-eluting peak(s) fixed and allowing the widths and positions of the remaining peaks to shift only by a limited percentage (suggested: 2-3%) of the initial values determined from the SAXS data. This should compensate for slight band broadening and misalignment between the concentration and SAXS detectors.
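The limits implied by such a constrained refinement can be sketched as follows. The function name, the dictionary layout, and the interpretation of the percentage as a symmetric window around each initial value are assumptions made for illustration; they are not taken from the Fit window itself.

```python
def refinement_bounds(gaussians, percent, fixed=()):
    """(low, high) limits for the center and width of each Gaussian.

    gaussians: list of (amplitude, center, width) from the SAXS fit.
    percent:   allowed variation, as a % of each initial value.
    fixed:     indices of peaks whose center and width must not move.
    """
    fraction = percent / 100.0
    bounds = []
    for i, (_, center, width) in enumerate(gaussians):
        f = 0.0 if i in fixed else fraction
        bounds.append({
            "center": (center * (1 - f), center * (1 + f)),
            "width": (width * (1 - f), width * (1 + f)),
        })
    return bounds

# Hypothetical SAXS-derived Gaussians; the small front peak is kept fixed.
gaussians = [(0.2, 50.0, 3.0), (4.0, 60.0, 3.0)]
bounds = refinement_bounds(gaussians, percent=2.0, fixed={0})
```

Limits of this kind could then be passed to any bounded least-squares routine, with the amplitudes left free.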

SOMO HPLC-SAXS Conc. Chromatogram final Gaussians

To deal with serious band broadening issues, a band broadening correction routine will be implemented in a future release. However, the agreement with the two major peaks in the figure above shows that, at least in the case of this dataset, this is not a major issue.

The Save and Keep buttons must then be pressed to store and associate the resulting Gaussians to the concentration chromatogram. On re-generating the I(q) vs. q frames (see below), each concentration Gaussian peak will be mapped onto the corresponding I(t) vs. t peaks.

The Make I(q) button becomes available every time that more than one I(t) chromatogram is selected. If Gaussian fitting was performed, pressing it will produce a series of I(q) vs. q curves for each Gaussian peak for each frame of the chromatogram on which the global operations have been carried out. An option panel in a pop-up window will allow several choices:

SOMO HPLC-SAXS Make I(q)

A description of this module can be found here.

Once the I(q) vs. q files have been generated, it is possible to view the resulting Gaussian contributions and their sum, as shown below:

SOMO HPLC-SAXS resulting I(q)s

Here the I(q)'s for frame #61, originally presenting an overlap between the trimer and dimer peaks, are shown, together with the reconstructed sum with baseline back-addition (see the legend for details). The drop of intensity at q values > 0.07 Å⁻¹ for Gaussian peak #1 (green) is due to its contribution vanishing in the high q range. Note how there is a contribution from Gaussians #1, #2, and #3 in this frame, with the latter two contributing almost equally. Note also how the reconstructed curve with baseline back-addition (pink) almost perfectly superimposes with the original frame data (white). The truncation of the Gaussian peaks data at low q is due to the exclusion of these very noisy data from the analysis.

A zoom into the low q region for frames #75-86 of Gaussian peak #3 is shown in the next two images, before and after concentration normalization (Normalize button):

SOMO HPLC-SAXS resulting I(q)s

SOMO HPLC-SAXS resulting I(q)s

In the latter, an average curve (obtained by pressing the Average button) is also superimposed. Such re-generated I(q) vs. q data can be directly exported to the main US-SOMO SAS module for further operations by pressing the To SOMO/SAS button.
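Concentration normalization followed by averaging can be sketched as below. This is a simplified illustration: the function names and numbers are hypothetical, and a plain unweighted pointwise mean is assumed, whereas the program may weight the frames by their SDs.

```python
def normalize(frame, concentration):
    """Divide every I(q) value of a frame by that frame's concentration."""
    return [i / concentration for i in frame]

def average(frames):
    """Unweighted pointwise mean of frames sharing the same q grid."""
    return [sum(column) / len(column) for column in zip(*frames)]

# Two hypothetical frames of the same species at different concentrations.
frames = [[2.0, 1.0], [4.0, 2.0]]
concentrations = [1.0, 2.0]
normalized = [normalize(f, c) for f, c in zip(frames, concentrations)]
mean_curve = average(normalized)
```

After normalization, frames belonging to the same species should superimpose, which is what makes averaging them meaningful.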


www contact: Emre Brookes

This document is part of the UltraScan Software Documentation distribution.
Copyright © notice.

The latest version of this document can always be found at:

http://somo.uthscsa.edu

Last modified on May 20, 2014.