Jungyo Suh and Dohyun Han contributed equally to this work.
We aimed to identify, verify, and validate a multiplex urinary biomarker-based prediction model for the diagnosis and surveillance of urothelial carcinoma of the bladder, using high-throughput proteomics methods.
Label-free quantification by mass spectrometry was performed with data-dependent acquisition in 12 individuals in the discovery phase and data-independent acquisition in 24 individuals in the verification phase, using both urinary exosomes and urinary proteins. Based on five scoring systems derived from the proteomics data and statistical methods, we selected eight proteins. Enzyme-linked immunosorbent assay on urine from 120 patients with bladder mass lesions was used for validation. Using multivariable logistic regression, we selected the final candidate models for predicting bladder cancer.
Comparing the discovery and verification cohorts, 38% (50/132) of exosomal differentially expressed proteins (DEPs) and 44% (109/248) of urinary DEPs were consistent at statistical significance. Twenty of the 50 exosomal proteins and 27 of the 109 urinary proteins were upregulated in cancer patients. From the eight selected proteins, we developed two diagnostic models for bladder cancer. The areas under the receiver operating characteristic curve (AUROCs) of the two models were 0.845 and 0.842, both of which exceeded the AUROC of urine cytology.
The results showed that the two diagnostic models developed here were more accurate than urine cytology. We successfully developed and validated a multiplex urinary protein-based prediction model, which will have wide applications for the rapid diagnosis of urothelial carcinoma of the bladder. External validation of this biomarker panel in a large population is required.
Bladder cancer is the second most common malignancy of the urinary tract [
Current standard methods for diagnosis and surveillance of bladder tumors are based on cystoscopy [
Numerous biomarkers have been developed in recent decades; however, none of them could replace standard clinical practice [
In this study, we designed systematic and precise strategies for biomarker discovery in patients with a bladder mass, covering source selection (urinary proteins and exosomes), mass screening, candidate biomarker selection, and clinical validation.
For biomarker discovery and verification, we used prospectively collected urine samples from patients who underwent transurethral resection of bladder tumor (TURB) from October 2016 to August 2017. Urine samples of three patients were used to initially set up the protocol for urinary protein and exosome analysis. Healthy controls were selected from kidney donors undergoing donor nephrectomy during the same period, with urine collected before surgery.
We used the urine specimens of 12 subjects (six with bladder cancer and six controls) for the discovery phase and 24 subjects (18 with bladder cancer and six controls) for the verification phase. For the validation phase, performed via enzyme-linked immunosorbent assay (ELISA), we used urine samples from 120 patients from the prospective, biospecimen-linked cohort of the Seoul National University Prospectively Enrolled Registry for Urothelial Cancer-TURB (SUPER-UC-TURB) [
We extracted urinary proteins and exosomes from the participants' collected urine. Detailed methods of urinary protein and exosome extraction and preparation are described in
All MS raw files were processed in MaxQuant (ver. 1.5.3.1) [
To generate spectral libraries, 12 DDA measurements were performed with urine samples. The DDA spectra were searched using MaxQuant against the UniProt Human Database (December 2014, 88,657 entries) and the iRT standard peptide sequence. A spectral library was generated using the spectral library generation feature of Spectronaut 10. The DIA data from individual samples were analyzed with Spectronaut 10 (Biognosys, Schlieren, Switzerland). First, we converted the DIA raw files into HTRMS format using the HTRMS converter tool provided with Spectronaut. The FDR was estimated with the mProphet [
A schematic workflow for proteomics analysis in the discovery, verification, and validation phases of this study is shown in
For exosome analysis, we selected known exosome proteins from the ExoCarta database (
Candidate biomarkers were selected on the basis of the summed score from five scoring systems, which reflected significant differences between cancerous and benign urine data. The first and second scores were based on fold changes in the LC-MS/MS–based DIA study (test and repeated test) between cancer and benign patient urine. The third score was based on the area under the receiver operating characteristic curve (AUROC) of each protein for the diagnosis of bladder cancer: an AUROC > 0.95 was assigned ten points, > 0.9 eight points, > 0.85 six points, > 0.8 four points, and > 0.75 two points. The fourth and fifth scores were assigned using the multivariable logistic regression model in the repeated DIA study. Using the five scoring systems, we selected the top eight candidate proteins for the ELISA study (
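The AUROC-based third score and the summation over the five scores can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the handling of an AUROC of 0.75 or below (zero points) is an assumption, because the text does not state it.

```python
def auroc_points(auroc: float) -> int:
    """Map a single-protein AUROC to points using the thresholds in the text."""
    # Thresholds are strict inequalities, checked from highest to lowest.
    thresholds = ((0.95, 10), (0.90, 8), (0.85, 6), (0.80, 4), (0.75, 2))
    for threshold, points in thresholds:
        if auroc > threshold:
            return points
    # Assumption: an AUROC of 0.75 or below earns no points (not stated in the text).
    return 0


def total_score(component_scores) -> float:
    """Sum the five component scores of one protein."""
    if len(component_scores) != 5:
        raise ValueError("expected exactly five component scores")
    return sum(component_scores)
```

Proteins would then be ranked by `total_score`, with the top eight carried forward to ELISA.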
Levels of the following eight proteins were analyzed in patient urine samples using commercial ELISA, following the manufacturers' instructions: alpha-2 macroglobulin (A2M; Magnetic Luminex Assays, catalog No. LXSAHM, R&D Systems Inc., Minneapolis, MN), cofilin-1 (CFL1; catalog No. MBS2886911, MyBioSource Inc., San Diego, CA), apolipoprotein A-I (APOA1; R-PLEX platform, catalog No. F21PR-8, Meso Scale Diagnostics LLC, Rockville, MD), inter-alpha-trypsin inhibitor heavy chain H2 (ITIH2; catalog No. MBS100133, MyBioSource Inc.), afamin (AFM; catalog No. DY8065-05, R&D Systems Inc., Minneapolis, MN), fibrinogen beta chain (FGB; ProcartaPlex Multiplex Immunoassay, catalog No. MAN0016941, Thermo Fisher Scientific Inc.), cell division cycle 5-like protein (CDC5L; catalog No. MBS7227993, MyBioSource Inc.), and CD5 antigen-like protein (CD5L; catalog No. ELH-CD5L, RayBiotech, Peachtree Corners, GA). Calibration curves were prepared using purified standards before each protein was assessed. We normalized the data if the protein expression was highly skewed (skewness greater than +2 or less than −2), using natural log (ln) transformation [
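The skewness rule above can be illustrated with a short sketch. It assumes the moment-based (population) skewness coefficient and strictly positive concentrations; the text does not specify which skewness estimator was used, and the function names are hypothetical.

```python
import math


def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (assumed estimator)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # constant data has no skew
    return m3 / m2 ** 1.5


def normalize_if_skewed(values, limit=2.0):
    """Apply ln only when skewness is above +limit or below -limit."""
    if abs(skewness(values)) > limit:
        # Assumes all values are > 0; math.log raises ValueError otherwise.
        return [math.log(v) for v in values]
    return list(values)
```

A symmetric sample is returned unchanged, whereas a sample dominated by one very large value is log-transformed.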
Continuous variables were described as the median±standard deviation (interquartile range), and categorical variables were described as the frequency (%). The statistical significance of the difference between two AUROCs was assessed using DeLong's non-parametric approach [
In the discovery and verification phases, the patients in the cancer group were older than those in the control group. Abnormal cytological findings (atypical and malignant cells) were significantly more frequent (p < 0.01) in the cancer group than in the benign group in the verification phase. In the validation phase, age was not statistically different between the benign and cancerous samples. The percentage of cancerous samples was higher in the male population and was positively correlated with abnormal cytology findings. All enrolled patient characteristics are shown in
To identify urinary and exosome biomarker candidates, we performed a label-free quantitative proteomic analysis based on data-dependent acquisition (DDA) in a discovery cohort of bladder cancer urine samples (n=12) (
To verify the DEPs in the discovery cohort, we adopted a data-independent acquisition (DIA) strategy, both because it can achieve high data completeness [
Comparing the discovery and verification cohorts, 44% (109/248) of urinary DEPs and 38% (50/132) of exosomal DEPs were consistent at a significance level of 5% FDR (
Furthermore, quantitative alterations in protein levels between control and urothelial carcinoma samples were highly consistent between the discovery and verification cohorts. The control/urothelial carcinoma fold changes of proteins were highly correlated between the two cohorts, with Pearson's correlation coefficients of r=0.945 for exosome and r=0.92 for urine (
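For readers who want to reproduce this kind of cohort comparison, Pearson's correlation coefficient between paired fold changes can be computed as in the sketch below. The function name and inputs are illustrative assumptions: each list would hold one fold change per protein, in the same protein order for the discovery and verification cohorts. Whether the fold changes were log-transformed before correlation is not stated in the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of paired values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two pairs")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```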
Owing to limitations in the isolation of some specimens in the discovery and verification phases, we used urinary proteins for biomarker candidate selection. Of the 27 upregulated urinary proteins and 28 upregulated exosome proteins, nine were abundant in both sample types in cancer patients. We selected upregulated urinary proteins in cancer patients as potential biomarkers; however, four proteins were removed because no ELISA antibody was available. In the discovery set, FGB was upregulated sevenfold, whereas A2M, CD5L, fibrinogen gamma chain (FGG), complement factor H (CFH), and Rho GDP dissociation inhibitor beta (ARHGDIB) were upregulated fivefold. In the verification set, A2M was upregulated sixfold, FGB and FGG were upregulated fivefold, and APOA1, complement C3 (C3), CFH, and apolipoprotein C-III were upregulated fourfold. For AUROC, A2M scored ten points, whereas AFM, FGB, FGG, C3, CFH, protein S isoform 1, apolipoprotein M, heparin cofactor 2 (SERPIND1), and plasminogen scored eight points. By multivariable regression modeling, A2M, CFL1, APOA1, CDC5L, and CD5L were selected for the first model, and A2M, CFL1, ITIH2, and AFM were selected for the second model (
Before analysis, we normalized the expression levels of AFM, CD5L, APOA1, ITIH2, and FGB via natural logarithmic transformation (
The AUROC of the eight biomarkers ranged from 0.629 to 0.759, and the AUROC of urine cytology was 0.718 (
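As a reminder of what a single-marker AUROC measures, the sketch below computes it as the probability that a randomly chosen cancer sample has a higher marker value than a randomly chosen benign sample, counting ties as one half (the Mann-Whitney formulation). This is an illustrative calculation and not the statistical software used in the study; the function name and inputs are hypothetical.

```python
def auroc(scores_cancer, scores_benign):
    """AUROC as the fraction of cancer/benign pairs ranked correctly."""
    if not scores_cancer or not scores_benign:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for c in scores_cancer:
        for b in scores_benign:
            if c > b:
                wins += 1.0   # cancer sample ranked above benign sample
            elif c == b:
                wins += 0.5   # ties count as half
    return wins / (len(scores_cancer) * len(scores_benign))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.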
The optimal cutoff values of models 1 and 2 were 0.735 and 0.870, respectively. The sensitivity, specificity, positive predictive value, and negative predictive value of model 1 were 0.880, 0.813, 0.485, and 0.949, respectively, and those of model 2 were 0.850, 0.747, 0.425, and 0.958, respectively. With a combination of urine cytology and the predicted value of model 1, the AUROC for cancer prediction was 0.851 (95% CI, 0.770 to 0.912), and that of the combination of urine cytology and model 2 was 0.827 (95% CI, 0.743 to 0.893) (
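The four performance measures reported above follow from applying a cutoff to each model's predicted probability. The sketch below shows the arithmetic under stated assumptions: a prediction is called positive when the probability is at or above the cutoff (the text does not say whether the comparison is inclusive), labels are coded 1 for cancer and 0 for benign, and the function name and example data are hypothetical rather than taken from the study.

```python
def diagnostic_metrics(probabilities, labels, cutoff):
    """Sensitivity, specificity, PPV, and NPV at a given probability cutoff."""
    tp = fp = tn = fn = 0
    for p, y in zip(probabilities, labels):
        predicted_positive = p >= cutoff  # assumption: inclusive cutoff
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }
```

For example, `diagnostic_metrics([0.9, 0.8, 0.6, 0.4, 0.2], [1, 1, 0, 1, 0], 0.735)` classifies only the first two samples as positive.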
In this study, we aimed to develop multiplex urinary biomarkers for bladder cancer diagnosis. Using a combined approach of urinary and exosome proteins to identify candidate biomarkers by LC-MS/MS, we selected several candidate proteins in the discovery and verification phases. After narrowing down the candidate proteins using statistical methods, we finalized eight of the most promising biomarkers. Using the ELISA results, we developed two diagnostic models, which were more accurate than urine cytology. With the combination of urine cytology and the developed model with cutoff value application, the AUROC of model 1 with urine cytology was 0.851 and that of model 2 with urine cytology was 0.827.
In recent decades, numerous urinary biomarkers have been discovered and some of them have been approved for clinical use; however, their diagnostic performance is still limited in patients with haematuria [
Urine is an easily accessible body fluid, and its abundant protein content makes it a good candidate for biomarker discovery. However, the problem with using urinary proteins as biomarkers is that we do not know which proteins are derived from cancer cells. Urinary exosomes have the advantage of containing intracellular molecules of cancer cells [
In the present study, we selected CDC5L, ITIH2, AFM, CFL1, APOA1, A2M, FGB, and CD5L as the final urine biomarkers. Most of them showed differential expression between bladder cancer and benign populations in previous studies; however, the clinical application of these proteins is still under investigation. APOA1 and FGB are relatively well-known biomarkers for bladder cancer detection and prognosis [
This study has several limitations. Only urine from a selected number of participants could be used for high-throughput LC-MS/MS owing to technical limitations. We only performed analytical validation of the biomarkers. Despite these limitations, a strength of this study is that the multiplex prediction model was developed in the clinically indistinguishable setting of TURB. We plan to conduct a large prospective validation study for bladder cancer diagnosis and surveillance.
In conclusion, we successfully developed a multiplex urinary biomarker-based model using next-generation proteomics in patients with a bladder mass. The multiple urinary biomarker-based panels outperformed the predictive ability of urine cytology alone. With the combination of urine cytology and the developed model, diagnostic performance further increased. A large-scale prospective validation study is required in the future.
Supplementary materials are available at the Cancer Research and Treatment website (
This study was approved by the Seoul National University Hospital Institutional Review Board (No: 1801-015-912). Informed consent was obtained from all the subjects or their guardians. All experiments were performed in accordance with the relevant guidelines and regulations.
Conceived and designed the analysis: Jeong CW.
Collected the data: Suh J, Han D, Ku JH, Kim HH, Kwak C, Jeong CW.
Contributed data or analysis tools: Han D, Ku JH, Kim HH, Kwak C.
Performed the analysis: Suh J, Han D.
Wrote the paper: Suh J, Han D.
Conflict of interest relevant to this article was not reported.
The biospecimens for the validation phase of this study were provided by the Seoul National University Hospital Human Biobank, a member of the Korea Biobank Network, which is supported by the Ministry of Health and Welfare. All samples derived from the National Biobank of Korea were obtained with informed consent under institutional review board-approved protocols.
This study was supported by grants from the National R&D Program for Cancer Control (HA17C0039) through the Korea Health Industry Development Institute, funded by the Ministry of Health & Welfare, Republic of Korea. This research was supported by the Bio & Medical Technology Development Program of the National Research Foundation funded by the Ministry of Science & ICT (2016M3A9E2915717). None of the sponsors had any access to the data or any influence on or access to the analysis plan, the results, or the manuscript.
The overall workflow of urinary protein biomarker development. ELISA, enzyme-linked immunosorbent assay; LC-DIA/MS, liquid-chromatography data independent acquisition mass spectrometry; LC-MS/MS, liquid chromatography-tandem mass spectrometry.
Results of label-free quantification in the discovery stage. (A) Proteomic workflow of label-free quantification. (B) Numbers of identified and quantified proteins in urine and exosome. (C) Volcano plots. (D) Principal component analysis plots. FASP, filter-aided sample preparation; LC-MS/MS, liquid chromatography-tandem mass spectrometry.
Results of label-free quantification in the discovery stage. (A) Proteomic workflow of label-free quantification. (B) Flowchart of verification process using data-independent acquisition approach. (C) Correlation of protein control/urothelial carcinoma fold changes between the discovery and verification cohorts. DEP, differentially expressed protein; FASP, filter-aided sample preparation; LC-DIA/MS, liquid-chromatography data independent acquisition mass spectrometry.
Receiver operating characteristic curves for the diagnosis of bladder cancer by each candidate protein (A) and by the developed multiplex biomarker models (B). Model 1 for selected proteins and model 2 for all protein-based model. A2M, alpha-2 macroglobulin; AFM, afamin; APOA1, apolipoprotein A-I; AUROC, area under the receiver operating characteristic curve; CD5L, CD5 antigen-like protein; CDC5L, cell division cycle 5-like protein; CFL1, cofilin-1; CI, confidence interval; FGA, fibrinogen alpha chain; ITIH2, inter-alpha-trypsin inhibitor heavy chain H2.
Clinical characteristics of subjects in each phase
| | Discovery: Control | Discovery: Cancer | Discovery: p-value | Verification: Control | Verification: Cancer | Verification: p-value | Validation: Benign | Validation: Cancer | Validation: p-value |
|---|---|---|---|---|---|---|---|---|---|
| | 6 | 6 | | 6 | 18 | | 25 | 95 | |
| Age | 54.5 (45.5–56.0) | 74.5 (66.0–80.0) | < 0.01 | 57.0 (49.5–59.3) | 73.0 (68.0–80.8) | 0.01 | 66.0 (55.0–79.0) | 72.0 (64.0–77.0) | 0.21 |
| | 3 (50.0) | 2 (33.3) | 1.00 | 3 (50.0) | 2 (11.1) | 0.08 | 11 (44.0) | 17 (17.9) | < 0.01 |
| | 24.8 (22.5–26.7) | 22.7 (22.6–24.6) | 0.55 | 22.1 (21.1–22.3) | 25.0 (21.0–25.7) | 0.50 | 22.7 (21.2–25.4) | 24.5 (22.0–26.7) | 0.11 |
| | 0 | 1 (16.7) | 1.00 | 0 | 1 (5.6) | 1.00 | 7 (28.0) | 18 (18.9) | 0.32 |
| | 1 (16.7) | 3 (50.0) | 0.55 | 0 | 7 (38.9) | 0.13 | 9 (36.0) | 44 (46.3) | 0.56 |
| RBC | 1 (16.7) | 2 (33.3) | 1.00 | 0 | 9 (50.0) | 1.00 | 21 (84.0) | 72 (75.8) | 0.38 |
| WBC | 1 (16.7) | 1 (16.7) | 0.55 | 1 (16.7) | 5 (27.8) | 1.00 | 6 (24.0) | 30 (31.6) | 0.46 |
| Cytology | | | 0.14 | | | 0.01 | | | < 0.01 |
| Benign cellular change | 6 (100) | 3 (50.0) | N/A | 6 (100) | 5 (27.8) | N/A | 20 (80.0) | 38 (40.0) | N/A |
| Atypical cell | 0 | 2 (33.3) | | 0 | 7 (38.9) | | 5 (20.0) | 29 (30.5) | |
| Malignant cell | 0 | 1 (16.7) | | 0 | 6 (33.3) | | 0 | 28 (29.5) | |
| | | | N/A | | | N/A | | | N/A |
| Benign | N/A | 0 | | N/A | 0 | | 25 (100) | 0 | |
| Tis | N/A | 0 | | N/A | 4 (22.2) | | N/A | 20 (26.3) | |
| Ta | N/A | 3 (50.0) | | N/A | 6 (33.3) | | N/A | 35 (36.8) | |
| T1 | N/A | 3 (50.0) | | N/A | 3 (16.7) | | N/A | 20 (21.1) | |
| ≥T2 | N/A | 0 | | N/A | 5 (27.8) | | N/A | 20 (21.1) | |
| | N/A | 1 (16.7) | N/A | N/A | 5 (27.8) | N/A | N/A | 24 (25.3) | N/A |
| | | | N/A | | | N/A | | | N/A |
| Low grade | N/A | 3 (50.0) | | N/A | 0 | | N/A | 20 (21.1) | |
| High grade | N/A | 3 (50.0) | | N/A | 18 (100) | | N/A | 75 (78.9) | |
Values are presented as median (range) or number (%). Control, kidney donor; N/A, not available; RBC, red blood cell; WBC, white blood cell.
Label-free quantification,
Data-independent acquisition.
Student t test,
Fisher exact test,
Pearson’s chi-square test.
Differences in urinary protein expression between cancer and benign patients in the ELISA study of patients who underwent transurethral resection of bladder tumor
| | Benign | Cancer | p-value |
|---|---|---|---|
| | 25 | 95 | |
A2M | 76,897.4 (4,579.0 to 35,307.0) | 159,609.5 (1,429.5 to 224,686.5) | 0.045 |
CFL1 | 16,766.87 (2,242.5 to 14,108.0) | 33,026.01 (3,692.0 to 65,795.5) | 0.017 |
APOA1 | 6,779,594.0 (175,659.0 to 1,158,217.0) | 19,927,277.3 (703,284.5 to 18,851,324.0) | 0.061 |
ITIH2 | 8.07 (3.42 to 3.68) | 7.29 (0.33 to 3.45) | 0.875 |
AFM | 12,894.0 (716.0 to 13,301.0) | 48,942.7 (1,997.5 to 46,017.0) | 0.002 |
FGB | 256,907.1 (6,421.0 to 160,181.0) | 673,964.6 (26,786.75 to 731,388.5) | 0.039 |
CDC5L | 2.45 (1.98 to 2.74) | 2.85 (2.08 to 3.45) | 0.034 |
CD5L | 272.0 (88.70 to 189.72) | 508.5 (130.90 to 511.93) | 0.038 |
ln_ITIH2 | 1.28 (1.22 to 1.30) | 0.11 (−1.11 to 1.24) | < 0.001 |
ln_AFM | 7.85 (6.57 to 9.50) | 9.26 (7.60 to 10.74) | 0.003 |
ln_CD5L | 5.06 (4.49 to 5.25) | 5.57 (4.87 to 6.24) | 0.023 |
ln_APOA1 | 13.22 (12.08 to 13.96) | 15.05 (13.46 to 16.75) | 0.001 |
ln_FGB | 10.64 (8.77 to 10.20) | 11.6 (10.20 to 13.49) | 0.049 |
A2M, alpha-2 macroglobulin; AFM, afamin; APOA1, apolipoprotein A-I; CD5L, CD5 antigen-like protein; CDC5L, cell division cycle 5-like protein; CFL1, cofilin-1; ELISA, enzyme-linked immunosorbent assay; FGB, fibrinogen beta chain; ITIH2, inter-alpha-trypsin inhibitor heavy chain H2.