All-ion fragmentation approaches, such as MS^E, offer an alternative way to collect fragment ion data across the entire acquisition mass range. Here, peptide ions are cycled between low- and high-energy data acquisition to collect a permanent record of the mass spectra for all precursors and all product ions. Since the product ion spectra are highly complex, deconvolution requires precise liquid chromatography (LC) separation to match elution peak shape for precursor and product ions
1
. In complex peptide samples, ion interference complicates selective analysis of individual features, reducing the depth of proteome coverage. Without additional off-line fractionation, LC-MS simply does not have sufficient peak capacity to individually resolve all features in a complex proteomics sample. Other DIA methods, such as SWATH, provide a hybrid approach that combines the reproducibility of all-ion fragmentation with sequential acquisition of smaller mass windows (e.g. 25 m/z)
2
-
3
. Nonetheless, these methods still require fast scan speeds to achieve sufficient sampling for multiple measurements across peaks for precise quantitation.
One way to increase peak capacity is to introduce in-line ion mobility separation. This orthogonal separation dimension prior to fragmentation helps deconvolve complex mass spectra, increasing the number of individually resolved peptides and precursor ions
4
. In traveling wave ion mobility spectrometry (TWIMS), ions are confined by an RF field, and DC voltages are sequentially applied to stacked-ring ion guides. This creates a potential wave that pushes ions along through neutral gas. Ions with larger collisional cross sections fall behind the traveling wave more frequently, and arrive later at the detector. This arrival time, or drift time, is related to the size, shape, and charge of each ion. Importantly, TWIMS separation can resolve isobaric features with distinct cross-sectional areas
5
-
6
. In addition, since larger peptides have later arrival times and also require more energy for fragmentation, adjusting collision energy as a function of drift time improves fragmentation efficiency and significantly increases the number of peptide and protein identifications when employed with TWIMS
7
.
Previous reports have sought to quantify the benefits of IM-MS for proteomics samples in terms of peak capacity, often relying on simplified samples and extrapolation in order to predict outcomes for cellular extracts
8
-
13
. Such results are highly dependent on IM resolution, defined as the centroid IM drift time divided by the width of the arrival time distribution at half height. For TWIMS, an IM resolution of 45 has been reported for peptides on current generation equipment, with larger resolution values possible in cross-sectional space
5
,
14
. Despite significant commercial interest in ion mobility separation, there is little information on how TWIMS performs in highly complex proteomics samples, and whether one fixed set of TWIMS parameters is optimal across the entire diversity of peptides from a whole cell tryptic digest.
Here we benchmarked TWIMS separation using a commercial HeLa cell tryptic digest. In a survey of reported TWIMS settings, most reports use fixed wave height and velocity settings (40 V, 600 m/s), although there is no clear consensus (
Table S1
). Taking into account the degree of orthogonality
10
,
15
between the liquid chromatography (LC), IM, and m/z dimensions, we calculated a three-dimensional separation space to provide a quantifiable peak capacity. Importantly, the majority of ions fell within a limited area of the TWIMS separation space. Optimization of TWIMS settings identified ramped wave velocity settings that increased the drift space occupancy, yielding a 40% increase in peptide annotations and a 50% increase in the theoretical peak capacity. Under these conditions, the velocity of the DC pulses within the TWIMS separator is ramped in the time domain so that high mobility species experience primarily higher wave velocities and lower mobility peptides experience progressively lower velocities, each of which can be optimized for the mobility range of the peptides separated. Overall, this analysis establishes a quantitative benchmark for TWIMS separation and provides new customizable settings to optimally resolve peptide features in complex proteomics samples.
IM separation dramatically improves the number of peptide and protein identifications in all-ion fragmentation (MS^E) shotgun proteomics. Unfortunately, IM-resolved DIA data files acquired on the Waters Synapt platform are stored in a proprietary file format, and existing open source tools for data conversion perform poorly with ion mobility-mass spectrometry data. While individual features can be manually interrogated, it is not technically feasible to manually extract comprehensive information for each of the hundreds of thousands of features across an IM-MS proteomics dataset.
To address this issue, we designed a user-friendly interface (TWIMExtract) around a minimal raw data extraction tool provided by Waters (see
Supporting Information
). TWIMExtract is a Java-based graphical user interface (GUI) for extracting data from Waters' proprietary .raw format. It uses an executable to query the proprietary format and return information requested by the user. The Java GUI allows users to quickly process large numbers of data files and extract whole or targeted datasets in an automated fashion. The user selects the data file(s) to query, which are displayed in a table with accompanying information, and range file(s) specifying the regions of the three-dimensional data to extract. Any of the three dimensions of data (retention time (RT), m/z, and ion mobility arrival time (DT)) can be extracted individually. This enables users interested in obtaining retention time chromatograms, drift time spectra, or mass spectra of any or all features in their raw dataset to rapidly and automatically extract those features for further analysis. TWIMExtract provides raw data without further processing and is intended to act as a flexible component of larger data analysis pipelines. Importantly, TWIMExtract can extract hundreds of thousands of user-defined slices of LC-IM-MS data in a few hours. Users can specify any combination of ranges of chromatographic retention time, IM drift time, and m/z and process any number of raw data files with any number of slices. Key parameters can be exported with the data, including collision energies and IM settings, and multiple extractions from the same raw file can be automatically stitched together into a single .csv output file. TWIMExtract also supports batching of analyses to enable large-scale automated data extraction.
We applied TWIMExtract to evaluate the arrival time distributions of defined peptide ions across ± 0.005 m/z and ± 0.2 minutes from their respective centroid values. We then implemented a custom peak-processing algorithm in Python (see
Supporting Information
) to extract the centroid drift time, peak width, and resolution of the dominant peak within the arrival time distribution. Due to the complexity of the HeLa digest, the m/z and retention time windows occasionally extracted multiple peaks (precursor events) in the ion mobility arrival time distribution. In the rare event (< 0.1%) when an ion clearly fell outside the m/z – drift time trendline of the assigned charge state, the event was flagged as an incorrect assignment and excluded from further analysis.
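
The peak-processing step described above can be illustrated with a minimal sketch. This is not the authors' algorithm (which is in their Supporting Information); it assumes the arrival time distribution is available as two arrays, takes the most intense point as the apex of the dominant peak, estimates the full width at half maximum (FWHM) by linear interpolation, and reports resolution as centroid divided by FWHM, following the definition used in the text. The function name `dominant_peak_stats` and the synthetic two-peak trace are hypothetical.

```python
import numpy as np

def dominant_peak_stats(drift_ms, intensity):
    """Return (centroid, fwhm, resolution) for the most intense peak.

    Illustrative only: assumes a single, reasonably smooth dominant peak.
    """
    drift_ms = np.asarray(drift_ms, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    apex = int(np.argmax(intensity))
    if intensity[apex] <= 0:
        raise ValueError("no signal in arrival time distribution")
    half = intensity[apex] / 2.0

    # Walk outward from the apex while the signal stays at or above half height.
    left = apex
    while left > 0 and intensity[left - 1] >= half:
        left -= 1
    right = apex
    while right < len(intensity) - 1 and intensity[right + 1] >= half:
        right += 1

    def half_crossing(inside, outside):
        # Linearly interpolate the half-height crossing between the last point
        # at or above half height and its neighbour below it.
        if inside == outside:  # peak touches the edge of the trace
            return drift_ms[inside]
        y_in, y_out = intensity[inside], intensity[outside]
        frac = (y_in - half) / (y_in - y_out)
        return drift_ms[inside] + frac * (drift_ms[outside] - drift_ms[inside])

    t_lo = half_crossing(left, max(left - 1, 0))
    t_hi = half_crossing(right, min(right + 1, len(intensity) - 1))
    fwhm = t_hi - t_lo
    if fwhm <= 0:
        raise ValueError("peak too narrow to measure a width")

    # Intensity-weighted centroid over the points at or above half height.
    window = slice(left, right + 1)
    centroid = np.sum(drift_ms[window] * intensity[window]) / np.sum(intensity[window])
    return float(centroid), float(fwhm), float(centroid / fwhm)

# Synthetic trace: a dominant peak at 5.0 ms plus a smaller interfering peak.
t = np.linspace(0.0, 13.4, 2681)
trace = np.exp(-0.5 * ((t - 5.0) / 0.1) ** 2) + 0.3 * np.exp(-0.5 * ((t - 8.0) / 0.1) ** 2)
centroid, fwhm, resolution = dominant_peak_stats(t, trace)
print(centroid, fwhm, resolution)
```

Measuring only the most intense peak mirrors the text's focus on the dominant peak when the extraction window captures more than one precursor event; a production pipeline would likely also need smoothing and baseline handling.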
We started by comparing the ion mobility drift times with mass to charge (m/z) and LC retention time of a commercial HeLa cell proteome tryptic digest, plotting each separation in two dimensions. Drift time information was extracted for all annotated peptides and compared to the accurate mass retention time (AMRT) features, which deconvolve the isotopic distributions into a single measurement. When ion mobility drift times are compared to m/z, tryptic peptides are primarily in the 2+ charge state, and share a common trendline (
). Importantly, both AMRTs and annotated peptides show nearly identical distributions in the drift and mass dimensions, although in both instances the majority of features occupy a highly dense region of separation space. LC separation is significantly more orthogonal to IM separation than m/z separation is, and distributes events more evenly across the analysis window for both annotated peptides and AMRT events (
). Indeed, ∼90% of the annotated peptides or AMRTs occupy less than 25% of the overall drift space (
). Under these ion mobility instrument settings, tryptic peptides do not access the majority of TWIMS separation space, highlighting poor separation efficiency for peptides with experimentally similar collisional cross sections.
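
One way to quantify drift space occupancy of this kind is to find the narrowest contiguous drift time window that contains a chosen fraction of the features and express it as a fraction of the total drift range. The sketch below is an illustration under that assumption, not the authors' procedure; the function name `occupied_fraction`, the 13.4 ms total span used in the example, and the synthetic drift times are hypothetical.

```python
import numpy as np

def occupied_fraction(drift_times, total_span, coverage=0.90):
    """Fraction of the drift range taken up by the narrowest contiguous
    window that holds at least `coverage` of the features."""
    t = np.sort(np.asarray(drift_times, dtype=float))
    n = len(t)
    if n == 0:
        raise ValueError("no features supplied")
    # Number of features the window must contain (small tolerance guards
    # against floating-point noise in coverage * n).
    k = max(1, int(np.ceil(coverage * n - 1e-9)))
    # Width of every window spanning k consecutive sorted features.
    widths = t[k - 1:] - t[: n - k + 1]
    return float(widths.min() / total_span)

# Synthetic example: 90 features packed into 1 ms, 10 spread over the rest.
dense = np.linspace(2.0, 3.0, 90)
sparse = np.linspace(5.0, 13.0, 10)
fraction = occupied_fraction(np.concatenate([dense, sparse]), total_span=13.4)
print(f"90% of features occupy {fraction:.1%} of the drift space")
```

A value well below 1 indicates that most features crowd into a small part of the available separation space, which is the situation described for the fixed-velocity settings.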
Distribution of annotated peptides and AMRTs in three dimensions
(a) Two-dimensional plot of m/z versus drift time separates ions by charge state. 1+ peptides have longer drift times and are entirely separated from the 2+ and higher trendlines. (b) Two-dimensional plot of liquid chromatography retention time versus drift time reveals greater orthogonality than mass-to-charge separation. Representative data from 3 independent replicates.
Tryptic peptides occupy a limited area of TWIMS separation space
(a) Distribution of AMRTs and peptides in drift time. Shaded gray areas define 25% of the total TWIMS separation space, corresponding to approximately 90% of the total AMRT and peptide events. (b) Ion mobility resolution of >10^6 AMRTs, displayed by charge state. (c) TWIMS doubles the peak capacity of LC-MS analysis. Data are representative of 3 independent replicates.
Next, we evaluated TWIMS resolving power for both AMRTs and annotated peptides in the HeLa cell tryptic digest. After optimization of wave height and wave velocity, synthetic peptide standards can reportedly achieve IM resolution between 25 and 45
5
,
14
. In a complex tryptic digest, our analysis returned significantly lower values, with a mean resolution of 18.6 and a large standard deviation of 5.5 (
). Peptides with different charge states had significantly different apparent TWIMS resolutions. For example, 1+ peptides exhibited higher mean resolution (21.1 ± 0.4) than 2+ (18.6 ± 0.3) and higher charge state (18.4 ± 0.2) ions (p < 0.001, N = 3). When the same peptide was compared across different charge states, the species with higher charge generally exhibited lower mobility resolution (
Table S2
,
Figure S1
), likely due in part to additional unresolved conformers
16
.
Using the extracted TWIMS resolution data, we next sought to estimate the total 3-dimensional peak capacity of the LC-IM-MS analysis. Peak capacity of a single dimension of separation represents the total separation length divided by the average peak width. If LC, IMS and MS were fully orthogonal to one another, the total peak capacity of the LC-IMS-MS system would be the product of each of the peak capacities from each dimension
17
. Because LC, IMS and MS separate peptides on the basis of related physicochemical properties, we accounted for their correlation in our estimates of system peak capacity (
Figure S2
). First, we calculated linear fits for m/z and drift time separately for 1+, 2+ and combined 3-7+ charge states. From these linear fits, a mean m/z value was calculated for each drift time ‘bin’, and a conservative separation length was defined as the average deviation from the mean m/z, demarcating the actual area of two-dimensional IMS-MS separation space occupied. Average peak widths across both the m/z and drift time dimensions were fit into each charge state area, which in sum yielded a 2-dimensional peak capacity of 5.67 ± 0.27 × 10^4 peaks across the m/z and ion mobility dimensions, nearly twice the 2.93 ± 0.01 × 10^4 peaks defined by mass analysis alone (
). Next, the degree of correlation between the LC, IM and MS separations was defined. Both m/z and drift time are almost completely uncorrelated with retention time. To provide the most conservative estimate of three-dimensional peak capacity, we estimated the contribution of the LC dimension from the slightly stronger retention time-m/z correlation. This added separation ‘length’ was then divided by the average chromatographic peak width (12 s), resulting in 53-fold more peak capacity than the two-dimensional IM-MS separation. Multiplying the 2-dimensional peak capacity by 53 produced a total three-dimensional peak capacity estimate of 2.95 ± 0.11 × 10^6 using the standard TWIMS settings, nearly twice the 1.53 ± 0.01 × 10^6 afforded by LC-MS alone (
).
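
The arithmetic behind these estimates can be checked with a short sketch. It uses only values quoted in the text (the two-dimensional capacities, the 53-fold LC gain, and the 12 s chromatographic peak width); the 636 s effective LC separation length is back-calculated here from 53 × 12 s and is not a value reported by the authors. The function name `peak_capacity` is hypothetical.

```python
def peak_capacity(separation_length, mean_peak_width):
    """Single-dimension peak capacity: separation length / average peak width."""
    return separation_length / mean_peak_width

# Values quoted in the text.
IM_MS_CAPACITY = 5.67e4   # two-dimensional IM-MS peak capacity
MS_CAPACITY = 2.93e4      # peak capacity from mass analysis alone
LC_PEAK_WIDTH_S = 12.0    # average chromatographic peak width (s)

# Assumption: effective (correlation-corrected) LC length implied by the
# reported 53-fold gain; back-calculated, not reported in the text.
EFFECTIVE_LC_LENGTH_S = 636.0

lc_gain = peak_capacity(EFFECTIVE_LC_LENGTH_S, LC_PEAK_WIDTH_S)  # 53-fold
lc_im_ms = IM_MS_CAPACITY * lc_gain   # three-dimensional estimate
lc_ms = MS_CAPACITY * lc_gain         # LC-MS only

print(f"LC gain: {lc_gain:.0f}-fold")
print(f"LC-IM-MS: {lc_im_ms:.3g}, LC-MS: {lc_ms:.3g}, ratio: {lc_im_ms / lc_ms:.2f}")
```

The products land within about 2% of the reported 2.95 × 10^6 and 1.53 × 10^6, a difference that presumably reflects rounding of the 53-fold factor. Multiplying capacities in this way is valid only after the correlation between dimensions has been accounted for, as the text describes.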
From our analysis, the starting TWIMS settings (wave height 40 V, wave velocity 600 m/s) do not make efficient use of the IM separation space. Therefore, we sought to optimize TWIMS separation using the HeLa cell protein digest standard. Our initial survey tested 20 different combinations of traveling wave velocities, which overall achieved largely the same number of protein annotations across different constant collision energies (
Figure S3
).
Interestingly, gradually decreasing the traveling wave velocity from high to low values during the 13.4 ms separation increased the number of annotated proteins by approximately 10%. Therefore, we selected the 1000 – 600 m/s wave velocity ramp (with a 40 V wave height) for further analysis, optimizing the drift time-specific collision energy profile to increase fragmentation efficiency.
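
To make the ramp concrete, the sketch below computes the wave velocity at a given time within the 13.4 ms separation. It assumes a linear ramp from 1000 to 600 m/s; the text says only that the velocity decreases gradually, so the linear profile and the function name `wave_velocity` are assumptions for illustration.

```python
def wave_velocity(t_ms, v_start=1000.0, v_end=600.0, cycle_ms=13.4):
    """Wave velocity (m/s) at time t_ms, assuming a linear ramp over one
    IM separation cycle (the ramp shape is an assumption)."""
    if not 0.0 <= t_ms <= cycle_ms:
        raise ValueError("time must fall within the separation cycle")
    return v_start + (v_end - v_start) * (t_ms / cycle_ms)

# Early-arriving (high mobility) ions spend their transit at higher wave
# velocities than late-arriving (low mobility) ions.
for t in (0.0, 3.35, 6.7, 10.05, 13.4):
    print(f"{t:5.2f} ms -> {wave_velocity(t):6.1f} m/s")
```

Under this assumed profile, an ion exiting after a quarter of the cycle has experienced velocities only between 1000 and 900 m/s, whereas an ion exiting near the end has experienced the full range down to 600 m/s.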
Following this optimization, the 1000 – 600 m/s wave velocity ramp reproducibly increased the number of annotated peptides from 1.25 ± 0.05 × 10^4 to 2.00 ± 0.14 × 10^4, an increase of approximately 60% (
). This translated to an increase from 1289 ± 21 proteins to 1820 ± 64 proteins, and increased occupancy throughout the ion mobility separation dimension with no loss in peak integrity (
,
Figure S4
). In fact, we observe an overall increase in resolution from 18.8 ± 0.6 to 21.9 ± 0.2 (
), as the velocity ramp significantly shifts the median drift time toward larger values while maintaining peak width. Furthermore, the resulting peak capacity is 4.25 ± 0.2 × 10^6, or 44% higher than with the default traveling wave velocity settings (
). Importantly, the variable wave velocity settings (1000 – 600 m/s) not only improved ion dispersal, but also improved transmission efficiency (
Figure S5
), especially for ions with longer drift and retention times (
Figure S6
and
S7
). In addition, the 1000 – 600 m/s wave velocity settings increased the mean m/z of identified peptides from 649 ± 2 to 692 ± 5, which is likely related to more precise drift-dependent collision energy assignment resulting from more efficient ion dispersal. Clearly, even modest improvements in TWIMS separation can greatly enhance the resultant LC-IM-MS peak capacity, affording better precursor-fragment alignment, peak annotation, and ultimately a larger number of high-confidence protein identifications.
Overall, we present a detailed analysis of the resolving power and peak capacity of LC-TWIMS-MS for cellular mixtures of tryptic peptides. Based on this analysis, TWIMS is only marginally orthogonal to LC-MS analysis, yet as previously reported, TWIMS separation greatly increases the number of protein identifications from all-ion fragmentation methods. Additionally, our optimized, wave-velocity ramped TWIMS conditions increased the overall peak capacity 2.8-fold over LC-MS analysis alone, which greatly enhances peptide and protein identifications. Improvements in IM separation technologies promise to further increase peptide peak capacities, which will continue to reduce the ion interferences that are currently commonplace in most complex DIA proteomics analyses.
Variable-velocity TWIMS separation improves peptide analysis
(a) Peptide and protein identifications are increased with variable-velocity TWIMS. (b) Optimized wave velocity settings increase drift space occupancy. (c) Enhanced TWIMS resolution from variable-velocity separation. (d) LC-IMS-MS peak capacity is increased with optimized TWIMS settings. Data are representative of 3 independent replicates.