How to prepare glycan array data for use in validating specificity predictions


These instructions are also available as a video tutorial.

Criteria for inclusion of a dataset
The data set for a particular glycan-binding protein (GBP) concentration was considered unsuitable for analysis if its maximum signal was below 2,000 relative fluorescence units (RFU). Individual data points at a given concentration were also disregarded if their signal variance was too high, defined as a percentage coefficient of variation (%CV) greater than 50%.
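The two inclusion rules above can be sketched in a few lines of Python. This is only an illustration, not part of the published workflow: the `Spot` record and both function names are hypothetical, and the text does not say whether the 2,000 RFU check is applied before or after high-%CV points are dropped, so the two checks are kept as separate functions.

```python
from dataclasses import dataclass

MIN_MAX_RFU = 2000.0    # from the text: maximum signal below 2,000 RFU is unsuitable
MAX_PERCENT_CV = 50.0   # from the text: points with %CV greater than 50% are disregarded


@dataclass
class Spot:
    """One glycan at one GBP concentration (hypothetical record layout)."""
    glycan_id: int
    rfu: float
    percent_cv: float


def dataset_is_suitable(spots):
    """True if the maximum signal at this concentration reaches MIN_MAX_RFU."""
    return max((s.rfu for s in spots), default=0.0) >= MIN_MAX_RFU


def reliable_spots(spots):
    """Keep only data points whose %CV does not exceed MAX_PERCENT_CV."""
    return [s for s in spots if s.percent_cv <= MAX_PERCENT_CV]
```

For example, a concentration whose strongest glycan reads 1,999 RFU would be rejected as a whole, while a single glycan with a %CV of 75 would be dropped from an otherwise acceptable concentration.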

Criteria for selecting binders
Dominant binders were defined as those that gave rise to signals greater than 10% of the maximum signal at each analysed GBP concentration. Weak binders were glycans whose signal was not greater than 10% of the maximum at the lower protein concentrations, but which either showed signals above 10% at the top two GBP concentrations or showed greater than 50% of maximal binding at only the top concentration. When calculating the overall agreement between theoretical and experimental specificities, experimental non-binders were defined as glycans that contain at least one minimal binding determinant (MBD) but do not display detectable binding to the GBP on the array. While these criteria are inherently arbitrary, they attempt to define reasonable boundaries.
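As a rough sketch, the dominant and weak rules can be written as a small Python function that takes one glycan's %Max values ordered from the lowest to the highest GBP concentration. The function name and the "unclassified" label are hypothetical. The weak rule is read leniently here (the glycan is not dominant, and either both of the top two concentrations exceed 10% or the top concentration alone exceeds 50%); a stricter reading is possible. Non-binders are not assigned, because that requires knowing which glycans contain an MBD and what counts as detectable binding, neither of which is given numerically in the text.

```python
DOMINANT_CUTOFF = 10.0   # from the text: greater than 10% of the maximum signal
TOP_ONLY_CUTOFF = 50.0   # from the text: greater than 50% at only the top concentration


def classify_binder(percent_max):
    """Classify one glycan from its %Max values.

    percent_max: list of %Max values, ordered from the lowest to the
    highest analysed GBP concentration.
    """
    if not percent_max:
        raise ValueError("need at least one concentration")

    # Dominant: above the 10% cutoff at every analysed concentration.
    if all(p > DOMINANT_CUTOFF for p in percent_max):
        return "dominant"

    # Weak (lenient reading, an assumption): above 10% at the top two
    # concentrations, or above 50% at the top concentration.
    top_two_bind = all(p > DOMINANT_CUTOFF for p in percent_max[-2:])
    top_only_binds = percent_max[-1] > TOP_ONLY_CUTOFF
    if top_two_bind or top_only_binds:
        return "weak"

    # Everything else needs MBD information to decide (not modelled here).
    return "unclassified"
```

With three concentrations, a glycan at 4%, 12% and 25% of maximum would come out as weak under this reading, and so would one at 2%, 6% and 55%.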

The Consortium for Functional Glycomics (CFG) glycan array screening results are provided as Excel (xls) files, one file per screened protein concentration. The following instructions apply to Excel 2010 on Windows 7, but are likely adaptable to most spreadsheet software.

  1. Copy the ID and glycan structure columns from one of these files into a new spreadsheet.
  2. Starting from the lowest qualifying concentration (see Criteria for inclusion of a dataset), copy and paste the RFU, StdDev, and %CV columns from each protein concentration into the new spreadsheet. Note that you must take the columns from the part of the results that is sorted by ID number (usually the leftmost data), and not the part sorted by RFU (usually the rightmost data).
  3. Calculate the maximum RFU value at each concentration using the MAX function in Excel.
  4. Insert a new column and name it %Max. Calculate each RFU value as a percentage of the maximum at that concentration. Note: enter the formula for the first RFU value, e.g. =D3/X$3 where the maximum RFU value is in X3 (format the column as a percentage, or multiply by 100), and then double-click the fill handle (the small square in the bottom-right corner of the selected cell) to fill the rest of the column. The $ in the formula fixes the row, so X3 won't change to X4 when you autofill.
  5. Sort by the lowest protein concentration's %Max, largest to smallest. For glycans with values less than 10%, sort again based on a higher concentration's %Max.
  6. Conditional formatting is useful. I color green any %Max value that is greater than 10%.
  7. Classify each glycan as described under Criteria for selecting binders.
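For readers who prefer a script to a spreadsheet, steps 3 to 5 can be approximated in plain Python. This is a minimal sketch under stated assumptions, not a replacement for the procedure above: the function names and the input layout (a dict mapping glycan ID to its RFU values, ordered from the lowest to the highest concentration) are hypothetical, and the sort is one reading of step 5, in which glycans are grouped by the first concentration where they reach 10% and then ordered by %Max within each group.

```python
def percent_max_columns(rfu_by_glycan):
    """Return {glycan_id: [%Max per concentration]}.

    rfu_by_glycan: {glycan_id: [RFU per concentration]}, with the
    concentrations in the same order (lowest first) for every glycan.
    """
    n_conc = len(next(iter(rfu_by_glycan.values())))
    # Step 3: column maximum at each concentration (Excel's MAX).
    # Assumes each maximum is positive, which holds for any concentration
    # that passed the 2,000 RFU inclusion check.
    col_max = [max(row[i] for row in rfu_by_glycan.values()) for i in range(n_conc)]
    # Step 4: each RFU value as a percentage of its column maximum.
    return {
        gid: [100.0 * row[i] / col_max[i] for i in range(n_conc)]
        for gid, row in rfu_by_glycan.items()
    }


def sort_for_review(pct_by_glycan, cutoff=10.0):
    """Step 5 (one interpretation): order glycan IDs for inspection."""
    def key(item):
        _, pcts = item
        for i, p in enumerate(pcts):
            if p >= cutoff:
                # First concentration reaching the cutoff, largest %Max first.
                return (i, -p)
        # Never reaches the cutoff: place last, ordered by top concentration.
        return (len(pcts), -pcts[-1])

    return [gid for gid, _ in sorted(pct_by_glycan.items(), key=key)]
```

The values returned by `percent_max_columns` correspond to the %Max columns built in step 4, so they can be checked against the spreadsheet before being used for classification.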