Sample preparation using dried blood devices enables quantification of 3900 proteins from whole blood and biomarker identification in lung cancer
Natasha Lucas; Cameron Hill; Elisabeth Karsten; Dana Pascovici; Rosalee McMahon; Ben Herbert
Proteomic biomarker studies of blood are primarily performed on plasma or serum, using specific blood collection and sample preparation steps to address the dynamic range issues arising from high-abundance proteins. In some clinical studies the whole blood cell pellet remaining after plasma removal is stored frozen, although the lysis of cells exacerbates the sample preparation challenges for proteomics.
Using volumetric absorptive microsampling (VAMS) devices, we have developed a series of methods to fractionate samples using serial washing and in-tip digests. This approach successfully addresses the dynamic range problem of dried whole blood and enhances detection of the blood proteome. In a previous study we demonstrated successful detection and robust quantification from single-shot shotgun LC-MS analysis of dried whole blood.
In this study whole blood was collected into VAMS tips to test how various wash buffer combinations (salts, organic solvents, acids, detergents) produce different protein profiles. These methods were then implemented in a clinical study with the aim of identifying protein biomarkers of disease. We obtained plasma and frozen whole blood cell pellets from patients with clinical stage II-IV non-small cell lung cancer (NSCLC) and from age- and sex-matched healthy controls. For this cohort, 30 µL of thawed whole blood cell pellet was loaded onto each VAMS device. All samples were dried, washed, and trypsin digested in situ. Peptides were separated using a one-hour gradient and detected using an HF-X Orbitrap mass spectrometer in DIA mode.
Our rapid and reproducible methods enable the production of high-quality data from small aliquots of complex samples that are typically seen as requiring significant fractionation prior to proteomic analysis. Different wash buffers produce distinct protein profiles and can therefore be used to target specific protein and/or peptide populations.
Across the 34 patient samples we quantified 3913.8 ± 170.9 proteins per sample. Initially, 508 differentially expressed proteins were identified. Ingenuity pathway analysis revealed that these proteins were involved in a variety of functional pathways covering adhesion and migration, as well as a strong network of known cancer-associated cytokines and enzymes.
To reduce complexity, proteins were ranked by area under the curve (AUC) and, separately, using a boosted regression importance filter. The first filter produced a short list of markers that were both differentially expressed in the model and had an AUC > 0.9. The second filter calculated the importance rank of each protein using gradient boosting methods; the procedure was repeated 100 times, and the number of times each protein was selected in the top 10 by importance rank was recorded and used to rank the markers. Using these methods, a set of 14 markers was identified with the AUC filter and 13 markers with the importance rank filter that discriminate between NSCLC and healthy controls, with 3 markers common to both analyses.
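The two filters described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary-based data layout, and the direction-agnostic treatment of AUC are assumptions made for illustration. The gradient boosting model itself is not reproduced; it is represented by a hypothetical `importance_fn` callable that returns one importance score per protein, and how randomness enters each of the 100 repeats is left to that callable because the abstract does not specify it.

```python
import random
from collections import Counter


def auc(cases, controls):
    """Rank-based AUC: probability that a randomly chosen case value
    exceeds a randomly chosen control value (ties count as 0.5)."""
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def auc_filter(abundance, labels, de_proteins, threshold=0.9):
    """Keep proteins that are in the differentially expressed set AND
    have AUC above the threshold.

    abundance   : dict mapping protein -> list of values, one per sample
    labels      : list of 1 (case) / 0 (control), same order as the values
    de_proteins : set of proteins already called differentially expressed
    Assumption: AUC is made direction-agnostic, so markers that are lower
    in cases can also pass the filter.
    """
    selected = {}
    for protein, values in abundance.items():
        if protein not in de_proteins:
            continue
        cases = [v for v, y in zip(values, labels) if y == 1]
        controls = [v for v, y in zip(values, labels) if y == 0]
        a = auc(cases, controls)
        a = max(a, 1.0 - a)
        if a > threshold:
            selected[protein] = a
    return selected


def top_k_selection_counts(abundance, labels, importance_fn,
                           k=10, repeats=100, seed=0):
    """Repeat the importance ranking and count how often each protein
    lands in the top k.

    importance_fn(abundance, labels, rng) is a hypothetical stand-in for
    fitting a gradient boosting model; it must return a dict mapping
    protein -> importance score.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(repeats):
        scores = importance_fn(abundance, labels, rng)
        ranked = sorted(scores, key=scores.get, reverse=True)
        counts.update(ranked[:k])
    return counts
```

Under these assumptions, the AUC short list is the set of keys returned by `auc_filter`, the importance short list is taken from the most frequently counted proteins in `top_k_selection_counts`, and the markers common to both analyses are the intersection of the two sets.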
Using a series of novel models, we demonstrate the amount of data that can be obtained from stored, biobanked samples and the value to be gained from analysing blood fractions in addition to plasma.