Mod Pathol. Author manuscript; available in PMC 2016 Jul 1.
Published in final edited form as:
PMCID: PMC4830087
NIHMSID: NIHMS772366
PMID: 25932963

Reproducibility of Residual Cancer Burden For Prognostic Assessment of Breast Cancer After Neoadjuvant Chemotherapy

Abstract

The residual cancer burden index was developed as a method to quantify residual disease ranging from pathological complete response to extensive residual disease. The aim of this study was to evaluate the inter-pathologist reproducibility in the residual cancer burden index score and category, and in their long-term prognostic utility.

Pathology slides and pathology reports from 100 cases selected at random from patients treated in a randomized neoadjuvant trial were reviewed independently by five pathologists at M.D Anderson Cancer Center without prior coaching. Size of tumor bed, average percent overall tumor cellularity, average percent of the in situ cancer within the tumor bed, size of largest axillary metastasis and number of involved nodes were assessed separately by each pathologist and residual cancer burden categories were assigned to each case following calculation of the numerical residual cancer burden index score. Inter-pathologist agreement in the assessment of the continuous residual cancer burden score and its components and agreement in the residual cancer burden category assignments were evaluated and analyzed.

The overall concordance correlation coefficient for the agreement in residual cancer burden score among all five pathologists was 0.931 (95% Confidence Interval 0.908 – 0.949). Overall accuracy of the residual cancer burden score determination was 0.989. The kappa coefficient for overall agreement in the residual cancer burden category assignments was 0.583 (95% Confidence Interval 0.539 – 0.626), indicating good overall inter-pathologist agreement. The metastatic component of the residual cancer burden index showed stronger concordance between pathologists (overall concordance correlation coefficient = 0.980; 95% Confidence Interval 0.954 – 0.992) than the primary component (overall concordance correlation coefficient = 0.795; 95% Confidence Interval 0.716 – 0.853). At a median follow-up of 12 years, residual cancer burden determined by each of the pathologists had the same prognostic accuracy for distant recurrence-free survival (overall concordance correlation coefficient = 0.995; 95% Confidence Interval 0.989 – 0.998). Residual cancer burden assessment is highly reproducible, with reproducible long-term prognostic significance when evaluated by different pathologists. This supports the feasibility of incorporating evaluation of residual cancer burden within neoadjuvant trials and within standardized pathology reporting guidelines.

Introduction

Neoadjuvant chemotherapy is often used in patients with locally advanced breast cancer to downstage the tumor and to evaluate in vivo chemosensitivity.1,2 Pathological complete response is defined as the absence of invasive cancer in the breast and in the nodes after completion of neoadjuvant chemotherapy. A recent meta-analysis of 12 randomized trials by the Collaborative Trials in Neoadjuvant Breast Cancer confirmed pathologic complete response as a surrogate endpoint for event-free and overall survival. In particular, pathologic complete response was associated with a 52% reduction in the probability of an event and a 64% reduction in the probability of death.3 Thus, pathologic complete response has been used as the primary endpoint in a number of trials evaluating efficacy of different drugs. Breast cancer of certain subtypes may have excellent chemosensitivity but may also show a spectrum of post-neoadjuvant chemotherapy residual disease ranging from minimal (near pathologic complete response) to extensive residual disease. At present, a variety of non-standardized procedures are used for the evaluation of pathological response after neoadjuvant treatment, and this can impair the quality and reliability of pathology assessment across different institutions. Evaluation of the Neo-tAnGo study showed that only 45% of the pathology reports from patients with residual disease indicated the chemotherapy effect and less than 10% quantified response at all.4 Residual disease can be subtle and/or scattered, and in these cases pathology reports tend to collect descriptive rather than quantitative information in the absence of standardized guidelines to measure and report the extent of residual cancer. In addition, the reproducibility and prognostic significance of reported residual disease assessment across different pathologists is difficult to study and has not been formally tested.

At M.D Anderson Cancer Center we developed residual cancer burden (RCB) as a method to quantify residual disease after neoadjuvant chemotherapy for breast cancer.5 RCB can be calculated through a web-based calculator either as a numerical score (index) or as a category.6 The RCB index is based on histopathological variables such as number of involved nodes, size of the largest nodal metastasis and size and percent cellularity of the primary tumor bed.

The RCB categories have been shown to correlate with long-term survival outcomes across breast cancer subtypes and a number of clinical study groups such as I-SPY (1,2), GEICAM, ACOSOG (Z11103), CALGB (40601, 40603), NSABP (B-40, B-41) and ABCSG (34) have incorporated RCB as the primary or secondary endpoint of chemotherapy response in prospective neoadjuvant trials.7 Concerns have been raised that parameters used for RCB calculation are not part of a standardized pathology report and may be somewhat subjective for evaluation among different observers, especially when reporting the extent of residual tumor cellularity.8

The aim of this study was to evaluate the inter-pathologist reproducibility of RCB index score and category, and of the long-term prognostic utility, when assessed by five different pathologists in a blinded “round-robin” analysis of retrospective reports and slides from patients with residual in situ, invasive, and/or nodal disease after six months of neoadjuvant taxane-anthracycline based chemotherapy.

Materials and Methods

We selected one hundred random cases with residual in situ, invasive, or metastatic carcinoma in the axillary nodes from patients who were treated in a published randomized neoadjuvant trial (protocol MDACC DM 98-240) with a regimen including paclitaxel followed by fluorouracil, doxorubicin, and cyclophosphamide.9 These cases included 60 hormone-receptor positive, 23 HER2-positive and 17 triple-negative tumors. The gross pathologic reviews, tissue sampling, description of gross findings and tissue sections had been performed in the past using legacy clinical methods (i.e. without standardization and before RCB had been conceived). The pathology slides and original pathology reports were reviewed independently by the five pathologists, including two fellows, one visitor, and two faculty members at M.D Anderson Cancer Center. The original pathology reports at M.D Anderson Cancer Center routinely included two-dimensional measurements of the macroscopic tumor dimensions, number of involved nodes and the diameter of the largest metastasis. In cases of multicentric disease, the largest tumor bed was measured. Pathologists were free to infer results from reports or their interpretation of the slides from these retrospective materials, as they saw fit.

Pathologist A's RCB results were derived from the original development cohort that was published in 2007.5 Pathologists B, C, D, and E were blinded to other results or outcomes and were assessing RCB for the first time in their career when they participated in this study. They did not receive individual coaching nor were they trained in RCB evaluation at the microscope. Pathologists B, C, and D assessed the 100 cases soon after publication of the original RCB paper (in 2007), and pathologist E assessed the same cases one year later (five cases were missing from pathologist E). They were provided with the published materials and the corresponding website for appended instructions and protocol for pathologists and RCB calculation.

Microscopic and macroscopic pathological components, namely size of tumor bed (mm), average percent overall tumor cellularity (invasive and in situ), average percent of the cancer within the tumor bed that is in situ, size of largest axillary metastasis (mm) and number of involved nodes were assessed separately by each pathologist and RCB categories were assigned to each case following calculation of the numerical RCB index score (http://www3.mdanderson.org/app/medcalc/index.cfm?pagename=jsconvert3).6
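As a rough illustration of how the five inputs listed above combine into one numerical score, the following Python sketch uses the index formula as we understand it from the original RCB publication (reference 5). The formula, its constants (1.4, 0.17, 0.75, 4) and the function and parameter names are assumptions made for this example and are not stated in the present text; the official calculator cited above remains the authoritative implementation.

```python
import math


def rcb_index(d1_mm, d2_mm, pct_cancer, pct_in_situ, n_pos_nodes, d_met_mm):
    """Sketch of the RCB index (formula ASSUMED from reference 5, not this paper).

    d1_mm, d2_mm  : two dimensions of the primary tumor bed (mm)
    pct_cancer    : average percent overall tumor cellularity (invasive + in situ)
    pct_in_situ   : percent of the cancer within the tumor bed that is in situ
    n_pos_nodes   : number of involved axillary nodes
    d_met_mm      : diameter of the largest nodal metastasis (mm)
    """
    d_prim = math.sqrt(d1_mm * d2_mm)                      # combined bed size
    f_inv = (1 - pct_in_situ / 100) * (pct_cancer / 100)   # invasive fraction
    primary = 1.4 * (f_inv * d_prim) ** 0.17               # primary component
    metastatic = (4 * (1 - 0.75 ** n_pos_nodes) * d_met_mm) ** 0.17  # nodal component
    return primary + metastatic


# Hypothetical case: 20 x 10 mm bed, 10% cellularity, no in situ, node-negative.
example_score = rcb_index(20, 10, 10, 0, 0, 0)
```

Because both components are raised to a small power, large changes in a single measurement move the score only modestly, which is consistent with the "buffering" behavior of a multivariate index.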

Inter-pathologist agreement in the assessment of the continuous RCB score and its components was evaluated based on the overall concordance correlation coefficient and agreement in the RCB category assignments was assessed based on the kappa coefficient.

The agreement between the continuous scores obtained by the five pathologists was evaluated based on the overall concordance correlation coefficient for multiple observers.10 When disagreement was present, it was assessed in terms of a systematic shift (inaccuracy) component and a random error (imprecision) component. Confidence intervals for the overall concordance correlation coefficient were obtained based on U-statistics. Each tumor was also assigned into one of four pre-defined RCB categories according to the RCB score. The agreement in the RCB category assignments by the five pathologists was evaluated based on the simple (unweighted) kappa statistic for multiclass observations. Confidence intervals were obtained based on the asymptotic variance of the statistic. Computations were performed using R 3.1.11
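The concordance measures described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' R code: it assumes Lin's pairwise concordance correlation coefficient with its usual split into an accuracy term (bias correction factor) and a precision term (Pearson correlation), and an overall coefficient for multiple observers formed by pooling the pairwise numerators and denominators. Population (1/n) variances are used, and the function names are hypothetical. Confidence intervals based on U-statistics are not covered.

```python
from itertools import combinations
from statistics import fmean


def _moments(x, y):
    """Means, population variances and covariance of two paired score lists."""
    mx, my = fmean(x), fmean(y)
    n = len(x)
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, vx, vy, cxy


def pairwise_ccc(x, y):
    """Return (ccc, accuracy, precision) for two observers; ccc = accuracy * precision."""
    mx, my, vx, vy, cxy = _moments(x, y)
    total = vx + vy + (mx - my) ** 2
    precision = cxy / (vx * vy) ** 0.5        # Pearson correlation (random error)
    accuracy = 2 * (vx * vy) ** 0.5 / total   # penalizes location and scale shift
    return 2 * cxy / total, accuracy, precision


def overall_ccc(scores):
    """Overall CCC for several observers; scores is one list of scores per observer."""
    num = den = 0.0
    for x, y in combinations(scores, 2):
        mx, my, vx, vy, cxy = _moments(x, y)
        num += 2 * cxy
        den += vx + vy + (mx - my) ** 2
    return num / den


# Hypothetical scores: the second observer is shifted by one unit on every case.
obs_1 = [1.0, 2.0, 3.0, 4.0]
obs_2 = [2.0, 3.0, 4.0, 5.0]
ccc, accuracy, precision = pairwise_ccc(obs_1, obs_2)
```

In this toy example the precision is 1 (the two observers are perfectly correlated) while the accuracy is below 1, so all of the disagreement is attributed to a systematic shift rather than random error.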

Distant recurrence-free survival was defined as the interval from diagnosis until distant disease recurrence or death from any cause. Overall survival was defined as the interval from diagnosis until death from any cause. Survival analyses were computed using the R package survival.12 The Kaplan-Meier estimator and the log-rank test were used to assess the effect of RCB classes on survival outcome. Significance of the effect of the continuous RCB score on survival outcome was evaluated by Cox regression analysis adjusted for hormone-receptor status.
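For readers unfamiliar with the Kaplan-Meier estimator named above, the following Python sketch shows the product-limit calculation on made-up data. It is an illustration only: the study itself used the R package survival, the function name and the toy follow-up times are hypothetical, and confidence intervals, the log-rank test and Cox regression are not implemented here.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. distant recurrence or death) occurred, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = n_leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            surv *= 1 - n_events / at_risk
            curve.append((t, surv))
        at_risk -= n_leaving  # events and censored patients both leave the risk set
    return curve


# Hypothetical follow-up in years; the third patient is censored at 2 years.
curve = kaplan_meier([1, 2, 2, 3], [1, 1, 0, 1])
```

Censored patients contribute to the number at risk up to their last follow-up but never reduce the survival estimate themselves, which is what allows cases with incomplete follow-up to be used.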

Results

Agreement in Continuous RCB Score

Five cases were excluded from the analysis due to missing data from any one of the five observers. Therefore, a total of ninety-five cases were evaluated for consistency or agreement between the five pathologic measurements of residual cancer by assessing agreement between the corresponding RCB score on the continuous scale. In this analysis the five pathologists are treated symmetrically, i.e. none of them is considered as providing a reference score. Agreement in this setting implies that the observations by any of the pathologists can be used interchangeably. Table 1 shows the estimated pairwise concordance correlation coefficients for each pair of pathologists. There was generally good agreement between pathologists with the pairwise concordance correlation coefficients ranging from 0.91 to 0.95. To evaluate the source of disagreement, the concordance correlation coefficient is typically expressed as the product of two terms, accuracy and precision, which can be estimated separately. The estimated accuracy coefficients for all pairwise comparisons were very close to 1 indicating that the marginal distributions of RCB score between two pathologists are equal, i.e. both means and variances are equal. The source of disagreement seems to be due to reduced precision, which is measured by the Pearson correlation coefficient between pairs of observations. The overall concordance correlation coefficient for the agreement among all five pathologists was 0.931 (95% Confidence Interval = 0.908-0.949). Overall accuracy of the RCB score determination was 0.989, suggesting negligible shift (location or scale) in the distribution of RCB score among pathologists. However, the overall precision was 0.941, which indicates appreciable within-sample variation or random error in the evaluations of the pathologists. 
The overall concordance, accuracy, and precision were 0.923 (95% Confidence Interval 0.892 - 0.946), 0.982, and 0.941 in the hormone receptor positive; and 0.942 (95% Confidence Interval 0.899 - 0.967), 0.993, and 0.949 in the hormone receptor negative subset.

Table 1

Estimated concordance correlation coefficients for continuous RCB scores from 95 breast cancer specimens
| Comparison | Pairwise Concordance Correlation Coefficient | 95% Confidence Interval | Accuracy | Precision |
|---|---|---|---|---|
| Path A vs Path B | 0.952 | 0.926 to 0.969 | 0.992 | 0.959 |
| Path A vs Path C | 0.948 | 0.929 to 0.963 | 0.998 | 0.950 |
| Path A vs Path D | 0.938 | 0.914 to 0.956 | 0.980 | 0.958 |
| Path A vs Path E | 0.908 | 0.849 to 0.945 | 0.970 | 0.937 |
| Path B vs Path C | 0.950 | 0.923 to 0.968 | 0.998 | 0.953 |
| Path B vs Path D | 0.950 | 0.922 to 0.968 | 0.997 | 0.953 |
| Path B vs Path E | 0.921 | 0.870 to 0.952 | 0.990 | 0.930 |
| Path C vs Path D | 0.915 | 0.885 to 0.937 | 0.990 | 0.925 |
| Path C vs Path E | 0.906 | 0.854 to 0.940 | 0.983 | 0.922 |
| Path D vs Path E | 0.923 | 0.872 to 0.954 | 0.995 | 0.927 |
| Overall | 0.931 | 0.908 to 0.949 | 0.989 | 0.941 |
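As a quick arithmetic check of the statement that the concordance correlation coefficient is the product of accuracy and precision, the short Python sketch below multiplies the two columns of Table 1 and compares the result with the reported coefficient. The values are copied from Table 1; the variable names are ours, and small gaps are expected because the published figures are rounded to three decimals.

```python
# (reported CCC, accuracy, precision) for each row of Table 1
table1 = {
    "A vs B": (0.952, 0.992, 0.959),
    "A vs C": (0.948, 0.998, 0.950),
    "A vs D": (0.938, 0.980, 0.958),
    "A vs E": (0.908, 0.970, 0.937),
    "B vs C": (0.950, 0.998, 0.953),
    "B vs D": (0.950, 0.997, 0.953),
    "B vs E": (0.921, 0.990, 0.930),
    "C vs D": (0.915, 0.990, 0.925),
    "C vs E": (0.906, 0.983, 0.922),
    "D vs E": (0.923, 0.995, 0.927),
    "Overall": (0.931, 0.989, 0.941),
}

# Absolute gap between the reported CCC and accuracy * precision, per row.
gaps = {row: abs(ccc - acc * prec) for row, (ccc, acc, prec) in table1.items()}
max_gap = max(gaps.values())

for row, gap in gaps.items():
    print(f"{row}: |CCC - accuracy*precision| = {gap:.4f}")
```

Every row agrees with the product to within rounding, which supports reading the accuracy and precision columns as the two factors of each reported coefficient.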

Agreement Among RCB Category Assignments

Instead of using the RCB score of a tumor directly for prognosis, each tumor is typically assigned to one of four RCB categories ranging from no residual cancer (RCB-0 or pathologic complete response), minimal (RCB-I), moderate (RCB-II) or extensive residual cancer (RCB-III) based on published cutoff points on the RCB score.5 Agreement between the RCB categories was evaluated using the simple kappa statistic for multinomial observations. Table 2 shows the number of calls in each of the four RCB categories made by the five pathologists. The marginal distributions by the first two pathologists look similar, but those by pathologists C, D and E appear more deviant at the tails. The kappa statistic for inter-rater agreement with respect to classification into a single RCB category is also shown in Table 2. Values of the kappa statistic around 0.6 indicate moderate to substantial agreement. Classification of extensive residual disease (RCB-III) appeared to be very consistent among the five pathologists, with a kappa of 0.666, whereas classification of minimal residual disease (RCB-I) appears to be the least concordant with a kappa of 0.533. The overall kappa was 0.583 (95% CI = 0.539-0.626), indicating good overall agreement.

Table 2

Pathologists' observed marginal distributions for RCB categories obtained from 95 breast cancer specimens
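| Observer | Pathologic complete response | RCB-I | RCB-II | RCB-III |
|---|---|---|---|---|
| A | 4 | 18 | 61 | 12 |
| B | 3 | 18 | 55 | 19 |
| C | 1 | 25 | 52 | 17 |
| D | 6 | 11 | 53 | 25 |
| E | 4 | 11 | 55 | 25 |
| Kappa* | 0.682 | 0.533 | 0.542 | 0.666 |
| 95% Confidence Interval | 0.618 to 0.745 | 0.469 to 0.596 | 0.478 to 0.606 | 0.602 to 0.730 |

*Overall kappa: 0.583 (95% Confidence Interval 0.539 to 0.626)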
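The marginal counts in Table 2 are not sufficient to recompute kappa, because the statistic depends on how the five calls agree within each individual case. The Python sketch below therefore uses made-up ratings to show one common way of computing an unweighted kappa for several raters. It assumes Fleiss' formulation, which may differ in detail from the statistic the authors computed in R; the function name and the toy data are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Unweighted multi-rater kappa (Fleiss' formulation, assumed for illustration).

    ratings: one list of category labels per case, one label per rater;
             every case must be rated by the same number of raters.
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    category_totals = Counter()
    agreement_sum = 0.0
    for case in ratings:
        counts = Counter(case)
        category_totals.update(counts)
        # Proportion of rater pairs that agree on this case.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_observed = agreement_sum / n_cases
    # Chance agreement from the pooled category proportions.
    p_expected = sum((c / (n_cases * n_raters)) ** 2 for c in category_totals.values())
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical calls by five raters on four cases; only the third case is split.
toy_ratings = [
    ["RCB-II"] * 5,
    ["RCB-III"] * 5,
    ["RCB-I", "RCB-I", "RCB-I", "RCB-II", "RCB-II"],
    ["pCR"] * 5,
]
toy_kappa = fleiss_kappa(toy_ratings)
```

A single case split between two adjacent categories is enough to pull kappa well below 1 in this toy set, even though the underlying disagreement is small.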

Agreement in Pathologic Components of the RCB Score

The overall concordance analysis showed excellent accuracy in the determination of the RCB score by the five pathologists, but indicated reduced precision due to a sizable within-sample variability or random error component. In order to understand the source of this variability, we evaluated the concordance separately for the primary and metastatic components of the RCB score. The primary RCB component was the main source of within-sample variability of the RCB score (overall concordance correlation coefficient = 0.795; 95% Confidence Interval = 0.716-0.853), whereas the metastatic component showed very high concordance between pathologists (overall concordance correlation coefficient = 0.980; 95% Confidence Interval = 0.954-0.992). Further evaluation of the pathologic measurements contributing to the primary RCB component revealed that both estimation of the primary tumor bed size (concordance correlation coefficient = 0.704; 95% Confidence Interval = 0.550-0.812) and estimation of the fraction of invasive cancer (concordance correlation coefficient = 0.699; 95% Confidence Interval = 0.621-0.763) contributed to the reduced precision and accuracy. The precision of the two estimates was similar (0.781 for invasive carcinoma vs. 0.742 for tumor bed size), but estimation of the fraction of invasive cancer was less accurate (0.894), i.e. more biased, than estimation of the tumor bed size (accuracy = 0.949).

Agreement in Prognostic Risk Assessment

RCB scores and RCB classes determined by each of the pathologists were prognostic for distant recurrence-free survival (DRFS) at a median follow-up of 12.16 years. Figure 1A and Table 3 summarize the results of the Cox regression analysis for the continuous RCB data with adjustment for hormone-receptor status. The results demonstrate reproducible estimation of risk by the different pathologists reporting continuous RCB score. Figure 2 shows excellent concordance of the predicted survival proportions derived from these models, with an overall concordance correlation coefficient of 0.995 (95% Confidence Interval = 0.989-0.998). Concerning the categorical RCB classification, Figure 1B and Table 3 show the results from a Cox regression using the categorical RCB data adjusted for hormone-receptor status. Figure 3 shows Kaplan-Meier plots for RCB classes defined by each of the pathologists, suggesting generally good agreement for the survival estimates. Each of the categorizations is prognostic for distant recurrence-free survival; there is some variation in the estimated 5- and 10-year survival, but the differences are not significant (Table 4). The one outlier appears to be Pathologist D, whose RCB-I cases had worse prognosis than expected. However, this result likely arose from the classification of two cases as RCB-I (of 11 total) that had an early relapse event and were not classified as RCB-I by the other pathologists. This is also reflected in the estimates of 5-year and 10-year distant recurrence-free survival for Pathologist D (Table 4). The other observed difference in prognosis relates to the RCB-III category. In general, the estimates for 5-year and 10-year distant recurrence-free survival were higher for RCB-III (Table 4) for the pathologists who more frequently classified tumors as RCB-III (Table 2). For example, pathologist A identified 13% of cases as RCB-III corresponding to 5-year distant recurrence-free survival of 42%, whereas pathologists D and E identified 26% RCB-III (5-year DRFS of 60%). The analysis of overall survival shows similar results and is presented in the supplementary material.

Figure 1A) Hazard rates with 95% confidence intervals for prediction of distant recurrence-free survival for pathologists A to E. The hazard rates correspond to a 1-unit increase of the continuous RCB score.

Figure 1B) Hazard ratios for the categorical RCB classes. The risk for pathologic complete response (pCR)/RCB-I (grey) and RCB-II (black) is reported relative to that for RCB-III. The analysis was adjusted for hormone-receptor status.

Figure 2) Concordance of predicted distant recurrence-free survival from the continuous RCB score. The predicted survival proportions per year of observation are derived from Cox models with adjustment for hormone-receptor status and plotted pairwise for the different observers.

Figure 3) Kaplan-Meier plots for distant recurrence-free survival for RCB classes determined by observers A-E.

Table 3

Hazard rates for continuous RCB score and hazard ratios with 95% confidence intervals for prediction of distant recurrence-free survival for observers A to E. The estimates are adjusted for hormone-receptor status
Continuous RCB index: distant recurrence-free survival

| Observer | Hazard Rate | Lower 95% Conf Limit | Upper 95% Conf Limit | P-value (Likelihood Ratio Test) |
|---|---|---|---|---|
| A | 3.183 | 2.030 | 4.990 | < 0.001 |
| B | 2.591 | 1.755 | 3.824 | < 0.001 |
| C | 2.721 | 1.834 | 4.035 | < 0.001 |
| D | 2.731 | 1.836 | 4.062 | < 0.001 |
| E | 3.102 | 1.977 | 4.868 | < 0.001 |

Categorical RCB: distant recurrence-free survival (reference class RCB-III)

| Observer | Pathologic complete response/RCB-I vs. RCB-III: Hazard Ratio | 95% CI | RCB-II vs. RCB-III: Hazard Ratio | 95% CI | P-value (LR Test) |
|---|---|---|---|---|---|
| A | 0.021 | 0.003 to 0.166 | 0.268 | 0.121 to 0.599 | < 0.001 |
| B | 0.052 | 0.011 to 0.242 | 0.288 | 0.135 to 0.611 | < 0.001 |
| C | 0.068 | 0.018 to 0.251 | 0.269 | 0.124 to 0.585 | < 0.001 |
| D | 0.141 | 0.037 to 0.532 | 0.371 | 0.173 to 0.796 | < 0.001 |
| E | 0.031 | 0.004 to 0.253 | 0.292 | 0.138 to 0.619 | < 0.001 |

Table 4

Kaplan-Meier estimates of distant recurrence-free survival rates at 5 and 10 years (95% CI) for RCB classes obtained from observers A through E
| Time point | Observer | Pathologic complete response | RCB-I | RCB-II | RCB-III |
|---|---|---|---|---|---|
| 5-year distant recurrence-free survival | A | 100 | 100 | 79 (69 to 90) | 42 (21 to 81) |
| 5-year distant recurrence-free survival | B | 100 | 94 (84 to 100) | 84 (74 to 94) | 47 (30 to 76) |
| 5-year distant recurrence-free survival | C | 100 | 96 (89 to 100) | 81 (71 to 92) | 47 (28 to 78) |
| 5-year distant recurrence-free survival | D | 100 | 82 (62 to 100) | 85 (75 to 95) | 60 (44 to 83) |
| 5-year distant recurrence-free survival | E | 100 | 100 | 82 (72 to 93) | 60 (44 to 83) |
| 10-year distant recurrence-free survival | A | 100 | 94 (84 to 100) | 70 (59 to 82) | 17 (5 to 60) |
| 10-year distant recurrence-free survival | B | 100 | 89 (75 to 100) | 74 (63 to 87) | 32 (16 to 61) |
| 10-year distant recurrence-free survival | C | 100 | 88 (75 to 100) | 72 (61 to 86) | 27 (12 to 61) |
| 10-year distant recurrence-free survival | D | 100 | 73 (51 to 100) | 75 (64 to 88) | 47 (31 to 72) |
| 10-year distant recurrence-free survival | E | 100 | 91 (75 to 100) | 74 (63 to 87) | 43 (27 to 68) |

Discussion

Overall, there was strong concordance across the five pathologists for assessment of the continuous RCB score (concordance correlation coefficient = 93.1%), which reflects the combination of excellent accuracy (98.9%) and good precision (94.1%). However, inter-pathologist agreement for assignment of RCB score to a category (pathologic complete response, RCB-I, RCB-II, or RCB-III) was only good, with an overall kappa value of 0.583 (95% Confidence Interval 0.539 to 0.626). More variability was seen between pathologists in the assessment of percent tumor cellularity (concordance correlation coefficient = 69.9%) and tumor bed size (concordance correlation coefficient = 70.4%). Importantly, the prognostic assessment of future risk using RCB was highly reproducible, whether by continuous score or category (Figures 1 and 2).

One can be encouraged that the inter-pathologist concordance of RCB index scores represents the “buffering” effect of a multivariate index that is not dependent on any single measurement of a single parameter. Similarly, the RCB index scores from the 5 different pathologists had significant prognostic value for distant recurrence-free survival (hazard rates 2.6 to 3.2) and OS (hazard rates 2.1 to 2.5, see supplementary information) adjusted for hormone-receptor status. We observed that the two trainees performed at least as well as their more experienced colleagues, suggesting that learning from educational materials, attention to detail, and experience are important.

It is also not surprising that imperfect precision (94.1%) would correspond to reduced inter-pathologist agreement on the category of RCB (58.3%) because the four RCB categories do not really represent independently different pathological outcomes, but are defined by two different arbitrary thresholds applied to a distribution of RCB scores. Minor variation in RCB score therefore leads to a higher rate of disagreement among RCB classes, even if the prognostic relevance of the disagreements is actually trivial. For example, the prognostic risk for a high RCB-I is similar to that of a low RCB-II, and even a low rate of imprecision between pathologists would be likely to assign such cases to either category. Consequently, the Kaplan-Meier plots for the four RCB groups were generally quite similar across the five pathologists. However, we did observe that the pathologists who classified pathologic complete response or minimal residual disease less frequently tended to classify RCB-III more frequently (Table 2), but their corresponding survival estimates for this most resistant category were more favorable (Table 4). This illustrates how any bias toward over-estimation of cellularity and/or tumor bed size would diminish the prognostic meaning of the RCB-III category. It is important to estimate the average cellularity across the tumor bed area (described in the protocol) rather than the maximum cellularity or the average of the more cellular areas in the tumor bed.
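The boundary effect described above can be illustrated with a small Python sketch. The cutoff values used here (1.36 and 3.28) are our recollection of the published RCB thresholds from reference 5 and are not stated in the present text, so they are exposed as parameters and should be treated as assumptions; the two sets of scores are invented to sit close to the boundaries.

```python
def rcb_class(score, cut_low=1.36, cut_high=3.28):
    """Map a continuous RCB score to a category (cutoffs ASSUMED from reference 5)."""
    if score == 0:
        return "pCR"       # no residual invasive disease
    if score <= cut_low:
        return "RCB-I"
    if score <= cut_high:
        return "RCB-II"
    return "RCB-III"


# Hypothetical scores for three cases from two readers; no pair differs by more than 0.12.
reader_1 = [1.30, 2.10, 3.25]
reader_2 = [1.42, 2.05, 3.35]

classes_1 = [rcb_class(s) for s in reader_1]
classes_2 = [rcb_class(s) for s in reader_2]
max_score_gap = max(abs(a - b) for a, b in zip(reader_1, reader_2))
n_same_class = sum(a == b for a, b in zip(classes_1, classes_2))
```

In this invented example the two readers agree closely on every score, yet only one of the three cases receives the same category, because two cases straddle a threshold.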

We strictly designed this study to not allow coaching or training of pathologists in the performance of RCB assessment, but only provide the published materials from the original manuscript and the accompanying website (http://www3.mdanderson.org/app/medcalc/index.cfm?pagename=jsconvert3). The purpose was to simulate adoption of this method by others from publicly available education materials and to learn about both the analytical and the prognostic reproducibility of RCB assessment by different pathologists.5,10 One can sometimes recognize a case as minimal, moderate or extensive residual disease based on first impressions after reviewing the slides. However, the very high concordance between pathologists when measuring RCB index, and even higher concordance of the prognostic information derived from those measurements, provide reliable and more specific information to the treating physician and surgeon that justifies the utilization of this method.

However, the study design has several important limitations. Firstly, the perfect comparative study can never be achieved because it is impossible to have different pathologists receive, examine, sample, and interpret the extent of residual disease from identical surgical resection specimens. That would test the entire standard operating procedure for this method of pathologic assessment. This study tested the interpretation of slides and reports from archival samples that could never be seen or fully understood by the study pathologists, and were devoid of any relevant clinical or radiologic information. Moreover, the cases were selected to have long follow up (in order to study prognostic relevance) and so they pre-dated current procedures for the gross assessment of a post-neoadjuvant resection specimen. Thus, there were no radiographs, photographs or diagrams of the specimens, no precise maps of how the slides related to the specimen or to each other, no standardized procedures for sampling at that time (other than the intent to determine pathologic complete response from residual disease), the primary tumor beds were not routinely marked with radiologic clips at that time, and sentinel node biopsies were not performed in patients with clinically node-positive disease at that time. Indeed, we might anticipate even stronger prognostic utility when using current standard operating procedures.

The results from this retrospective study of archival materials are informative for the interpretation of clinical trials where RCB is proposed as a primary or secondary endpoint. One can appreciate several scenarios with different implications for the expected quality of results. Some trials have prospectively included a standard operating procedure to standardize pathology assessment for RCB, pathologic complete response, and AJCC/UICC staging and proactively trained at least one dedicated pathologist at each site to incorporate the protocol and prospectively interpret and report RCB (e.g. I-SPY2, ABCSG, Kristine). This approach even allows for internal auditing by central review of part or all of the subjects. Others include a relevant standard operating procedure for pathology, but intend to retrospectively collect slides and reports for a central review, or request prospective assessment of RCB without training of pathologists at each site or real-time confirmation of performance. Yet others do not include any standardized procedures specific to the assessment of RCB but intend to obtain RCB assessments by retrospective review of available materials. Our study represents the results that might be expected from the last scenario. The rate of inter-pathologist disagreement in their interpretations of the tumor bed size and cellularity is likely to be over-stated compared to a higher quality approach (first scenario). These would reasonably be expected to improve with standardized procedures for macroscopic assessment and mapping of tissue sections to the gross specimen that are recommended in the published standard operating procedure for RCB assessment in a prospective setting.5,13 Consequently, one might expect that inter-pathologist agreement in RCB assessments would also be higher with standardization of more consistent methods to map the residual tumor bed for extent of disease and definition of the area in which to estimate percent cancer cellularity (e.g. protocol for pathologists, www.mdanderson.org/breastcancer_RCB). What happens prospectively in the grossing room (and is recorded by images, maps and description) profoundly affects the quality of interpretation and reporting of the extent and burden of residual disease.

Assessment of residual nodal disease was highly reproducible (overall concordance correlation coefficient = 0.980), despite fibrotic chemotherapy effects that might contribute to variable interpretation. Indeed, it is well established in the pathology community that thorough examination of the axillary specimen is essential to avoid underestimation of nodal residual disease, and so nodal assessment is already well standardized.14 Nevertheless, the excellent concordance would be bolstered by a majority of cases with zero nodal disease and the limitation that no study could ever compare the sampling of the same axillary contents by multiple pathologists.

Despite the limitations imposed on this study of archival materials by the outdated methods for evaluation of resection specimens, the concordance of RCB measurements and agreement of RCB categories still compare very favorably to those described for common diagnostic procedures such as receptor testing, Ki 67 immunohistochemistry for assessment of proliferation index, or inter-observer measurements of tumor diameter using imaging methods. For example, concordance studies for estrogen receptor protein testing by immunohistochemistry have shown overall concordance rates of 87%, 90% and 97% between primary institution and central testing.15, 16, 17 Similarly, in an international reproducibility study, assessment of proliferation using Ki 67 revealed an intra-laboratory reproducibility of 94% and an inter-laboratory reproducibility of 59% - 71% (central and local staining, respectively).18 In another study, inter-observer agreement among pathologists was 89% for evaluation of Ki 67 in predicting response to neoadjuvant chemotherapy.19 Also, inter-observer agreement among radiologists conventionally measuring the largest tumor diameter after neoadjuvant chemotherapy using MRI showed a concordance correlation coefficient of 93%.20

In conclusion, there was strong reproducibility of measurements and similar long-term prognostic meaning of RCB when evaluated by five pathologists. Minor imprecision in scoring has a larger effect on the assignment to RCB categories, but those differences have little prognostic relevance. This study demonstrates that it would be feasible and reasonable to retrospectively evaluate RCB from the slides and reports from subjects in a clinical trial. However, we predict that incorporating a standardized protocol, with formal training of site pathologists to prospectively evaluate RCB, would produce even stronger results (because assessment of tumor bed area and cellularity would be improved), such that inter-pathologist reproducibility would be excellent and the prognostic meaning of the RCB results would be even better than we report here.
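The contrast between high agreement in the continuous score and lower agreement in categories can be illustrated with a minimal Python sketch. It is not the analysis code used in this study, which was performed in R with the overall concordance correlation coefficient for multiple observers.10,11 The sketch computes Lin's pairwise concordance correlation coefficient for two hypothetical raters, and it assumes the published RCB class cutoffs of 1.36 and 3.28, which are not stated in this section. The scores and the function names `lins_ccc` and `rcb_category` are invented for illustration only.

```python
def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for two raters.

    Uses population (divide-by-n) moments:
    2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Assumed category cutoffs (not given in this section).
CUTOFF_I_II = 1.36
CUTOFF_II_III = 3.28


def rcb_category(score):
    """Map a continuous RCB score to a category under the assumed cutoffs."""
    if score == 0:
        return "pCR"
    if score <= CUTOFF_I_II:
        return "RCB-I"
    if score <= CUTOFF_II_III:
        return "RCB-II"
    return "RCB-III"


# Hypothetical scores from two pathologists for the same six cases.
pathologist_a = [0.0, 1.30, 1.40, 2.10, 3.20, 3.90]
pathologist_b = [0.0, 1.40, 1.30, 2.30, 3.35, 3.70]

ccc = lins_ccc(pathologist_a, pathologist_b)
same_category = [
    rcb_category(a) == rcb_category(b)
    for a, b in zip(pathologist_a, pathologist_b)
]
category_agreement = sum(same_category) / len(same_category)

print(f"CCC = {ccc:.3f}, category agreement = {category_agreement:.2f}")
```

In this invented example the two sets of scores differ by at most 0.2 units, so the concordance coefficient stays close to 1, yet half of the cases change category because their scores sit near a cutoff. Raw percent agreement is used here for brevity; the study itself reports a kappa coefficient, which also corrects for chance agreement.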

Supplementary Material

Suppl Fig S1

Suppl Fig S2

Suppl Fig S3

Suppl Table S1

Suppl Table S2

Suppl Table S3

Suppl Table S4

Acknowledgments

We acknowledge Dr. G.R. Qureshi for his valuable contribution to this study by acquisition of data.

Footnotes

Disclosure: Dr. Symmans has filed Residual Cancer Burden (RCB) as intellectual property (Nuvera Biosciences), patenting the RCB equation. (The RCB calculator is freely available on the worldwide web.) Dr. Symmans reports current stock in Nuvera Biosciences and past stock in Amgen.

All remaining authors have declared no conflicts of interest.

References

1. Fisher B, Bryant J, Wolmark N, et al. Effect of preoperative chemotherapy on the outcome of women with operable breast cancer. J Clin Oncol. 1998;16:2672–2685.
2. Kuerer HM, Neuman LA, Smith TL, et al. Clinical course of breast cancer patients with complete pathologic primary tumor and axillary lymph node response to doxorubicin-based neoadjuvant chemotherapy. J Clin Oncol. 1999;17:460–469.
3. Cortazar P, Zhang L, Untch M, et al. Pathological complete response and long-term clinical benefit in breast cancer: The CTneoBC pooled analysis. The Lancet. 2014;384:164–172.
4. Provenzano E, Vallier AL, Champ R, et al. A central review of histopathology reports after breast cancer neoadjuvant chemotherapy in the neo-tango trial. Br J Cancer. 2013;108:866–872.
5. Symmans WF, Peintinger F, Hatzis C, et al. Measurement of residual breast cancer burden to predict survival after neoadjuvant chemotherapy. J Clin Oncol. 2007;25:4414–4422.
7. Symmans WF, Wei C, Gould R, et al. Long-term prognostic value of residual cancer burden (RCB) classification following neoadjuvant chemotherapy. Cancer Research. 2013;73(24 Supplement):S6–02.
8. Abrial C, Thivat E, Tacca O, et al. Measurement of residual disease after neoadjuvant chemotherapy. J Clin Oncol. 2008;26:3094.
9. Green MC, Buzdar AU, Smith T, et al. Weekly paclitaxel improves pathologic complete remission in operable breast cancer when compared with paclitaxel once every 3 weeks. J Clin Oncol. 2005;23:5983–5992.
10. Barnhart HX, Haber M, Song J. Overall concordance correlation coefficient for evaluating agreement among multiple observers. Biometrics. 2002;58:1020–1027.
11. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2014. URL: http://www.R-project.org/
12. Therneau T. A Package for Survival Analysis in S. R package version 2.37-7. 2014. URL: http://CRAN.R-project.org/package=survival/
14. Boughey JC, Donohue JH, Jakub JW, et al. Number of lymph nodes identified at axillary dissection: effect of neoadjuvant chemotherapy and other factors. Cancer. 2010;116:3322–3329.
15. Gelber RD, Gelber S. International Breast Cancer Study Group (IBCSG) and the Breast International Group (BIG): Facilitating consensus by examining patterns of treatment effects. Breast. 2009;18(suppl 3):S2–S8.
16. Badve SS, Baehner FL, Gray RP, et al. Estrogen- and progesterone-receptor status in ECOG 2197: Comparison of immunohistochemistry by local and central laboratories and quantitative reverse transcription polymerase chain reaction by central laboratory. J Clin Oncol. 2008;26:2473–2481.
17. Viale G, Regan MM, Maiorano E, et al. Prognostic and predictive value of centrally reviewed expression of estrogen and progesterone receptors in a randomized trial comparing letrozole and tamoxifen adjuvant therapy for postmenopausal early breast cancer: BIG 1-98. J Clin Oncol. 2007;25:3846–3852.
18. Polley MY, Leung SC, McShane LM, et al. An international Ki67 reproducibility study. J Natl Cancer Inst. 2013;105:1897–1906. doi:10.1093/jnci/djt306.
19. Manucha V, Zhang X, Thomas RM. The satisfactory reproducibility of the Ki-67 index in breast carcinoma, and it's correlation with the recurrence score. Clin Cancer Investig J. 2014;3:310–314. doi:10.4103/2278-0513.134483.
20. Takeda K, Kanao S, Okada T, et al. Assessment of CAD-generated tumor volumes measured using MRI in breast cancers before and after neoadjuvant chemotherapy. Eur J Radiol. 2012;81:2627–2631. doi:10.1016/j.ejrad.2011.12.013.

Funding 


Funders who supported this work.

NCI NIH HHS (1)