A non-lab nomogram of survival prediction in home hospice care patients with gastrointestinal cancer

Background Patients with gastrointestinal cancer comprise a large group of those receiving home hospice care in China; however, little is known about predicting their survival time. This study aimed to develop a gastrointestinal cancer-specific non-lab nomogram for predicting survival time in home-based hospice care.
Methods We retrospectively studied patients with gastrointestinal cancer from a home-based hospice between 2008 and 2018. General baseline characteristics, disease-related characteristics, and related assessment scale scores were collected from the case records. The data were randomly split into a training set (75%) for developing a predictive nomogram and a testing set (25%) for validation. A non-lab nomogram predicting the 30-day and 60-day survival probability was created using least absolute shrinkage and selection operator (LASSO) Cox regression. We evaluated the performance of the predictive model using the area under the receiver operating characteristic curve (AUC) and calibration curves.
Results A total of 1618 patients were included and divided into two sets: 1214 patients (110 censored) as the training dataset and 404 patients (33 censored) as the testing dataset. The median survival time for all included patients was 35 days (IQR, 17–66). Of the 28 initial variables, the 5 most significant prognostic variables were selected to construct the nomogram: Karnofsky Performance Status (KPS), abdominal distention, edema, quality of life (QOL), and duration of pain. In the training dataset, the AUCs at 30 days and 60 days were 0.723 (95% CI, 0.694–0.753) and 0.733 (95% CI, 0.702–0.763), respectively. Similarly, in the testing dataset the AUC was 0.724 (95% CI, 0.673–0.774) at 30 days and 0.725 (95% CI, 0.672–0.778) at 60 days. Further, the calibration curves revealed good agreement between the nomogram predictions and actual observations in both the training and testing datasets.
Conclusion This non-lab nomogram may be a useful clinical tool. It needs prospective multicenter validation as well as testing with Chinese clinicians in charge of hospice patients with gastrointestinal cancer to assess acceptability and usability.


Background
Cancer is a leading cause of mortality around the globe. According to the latest estimates of the global cancer burden, China accounts for 23.7% of cancer incidence and 30.2% of cancer mortality worldwide [1]. Moreover, new cases of and deaths from gastrointestinal cancers such as esophageal cancer, gastric cancer, and liver cancer in China make up about half of those observed globally [2]. Most patients with gastrointestinal cancer are at an advanced stage when diagnosed [3]. When disease becomes treatment-refractory and functional decline begins, patients can benefit from hospice care for symptom management and to reduce suffering at the end of life. When someone is choosing hospice, predicting survival time is more important than predicting treatment response, as it provides an opportunity for patients and families to achieve closure [4,5]. The dying trajectory of patients with cancer is among the most predictable pieces of prognostic information [6]. However, previous studies have consistently reported that clinicians are inaccurate in estimating survival time and rely mainly on intuition or their own clinical judgment [7,8]. Systematic reviews have shown that clinicians often overestimate actual life expectancy [9,10].
To improve the accuracy of clinicians' predictions, numerous prediction tools have been designed specifically for advanced-stage patients. These tools include the Palliative Prognostic (PaP) score [11], Delirium-PaP (D-PaP) [12], Palliative Performance Scale (PPS) [13,14], Palliative Prognostic Index (PPI) [15], modified Glasgow Prognostic Score (mGPS) [16,17], and Prognosis in Palliative Care Study (PiPS) [18], among others. However, there is no consensus regarding the most appropriate tool for clinical use [19]. Therefore, many studies have further determined prognostic factors in terminal cancer patients and constructed specific survival prediction models. For example, Feliu et al. produced a highly accurate nomogram that uses basic clinical and analytical information to predict the probability of survival at 15, 30, and 60 days in terminally ill cancer patients [20]. Schonwetter et al. performed statistical analysis on data from more than 300 terminal lung cancer patients in a nonprofit community hospice to develop a lung cancer-specific prognostic tool to predict 50% and 90% mortality in the days after admission to a hospice [21]. As far as we know, there is no gastrointestinal cancer-specific prognostic model for home hospice patients in China.
Compared with developed countries, hospice care in China started late and has developed more slowly. In China, a major developing country, only a few investigators and institutions participate in hospice care-related research, especially research on survival prognosis. As a result, China has missed opportunities to share and exchange experiences with the world in the field of hospice care [22]. Wang YM et al. performed a follow-up study on 674 patients with advanced cancer in a hospice and identified factors that significantly affect the survival rate [23]. Zhou LJ et al. retrospectively analyzed data from 1019 advanced cancer patients who died within six months in a palliative home care service and produced a simple Chinese Prognostic Scale (ChPS) for predicting the survival rate of patients with advanced cancer [24]. In summary, studies of survival and its predictors among gastrointestinal cancer patients receiving home hospice care in China are scarce. Moreover, least absolute shrinkage and selection operator (LASSO) Cox regression, which has the advantage of building predictive models that are more accurate, robust, and generalizable [25], has not been used in these patients. Thus, the aim of our study was to use LASSO Cox regression to build a model to accurately predict survival time in home hospice care patients with gastrointestinal cancer. In addition, we constructed a nomogram to represent our predictive model in a graphical format, making the model more accessible to clinicians and patients alike.

Research objects
Patients with gastrointestinal cancer who were cared for by the Hospice Unit of Shantou University Medical College-affiliated First Hospital between January 2008 and December 2018 and who survived less than six months were included in this retrospective study. The Hospice Unit of Shantou University Medical College-affiliated First Hospital was established in 1998 as the first hospice unit founded by the Li Ka Shing Foundation to provide free home-based holistic care for patients with terminal cancer in mainland China [26]. Patients with any missing data were excluded from our study. The current study is a retrospective statistical analysis of clinical data from deceased patients that does not disclose the patients' identities; signed consent was not obtained, in accordance with the guidelines of the Chinese Ministry of Health.

Data collection
The following data were collected from the case records: (1) general baseline characteristics, including age, sex, area of residence (rural or urban), education, survival time, awareness of the disease (full understanding/partial understanding/complete ignorance), hypertension history, diabetes history, smoking history, and drinking history; (2) disease-related characteristics, including cancer diagnosis, metastasis, previous cancer treatment (surgery/chemotherapy/radiotherapy), duration of pain before admission, major related symptoms (constipation/anorexia/nausea/vomiting/abdominal distention/weight loss/insomnia/edema/tachypnea), previous analgesic treatment (none/NSAIDs/weak opioids/strong opioids), and effect of previous analgesic treatment (none/bad/average/satisfied); (3) related assessment scale scores, including Karnofsky Performance Scale (KPS) score, quality-of-life (QOL) score, and numeric rating scale (NRS) score. These data were recorded on admission by the clinical team, consisting of four physicians and two nurses, on a series of structured data collection sheets. Survival time was calculated as the number of days from admission to an event (death or suspension of service). Symptoms were recorded as "present" or "absent" on admission. The degree of pain was assessed with the numeric rating scale (NRS) [27]: 0 for painless, 1-3 for mild pain, 4-6 for moderate pain, and 7-10 for severe pain. The patient's performance status was evaluated according to the Karnofsky Performance Scale (KPS) [28,29], an 11-point rating scale that ranges from normal functioning (100) to dead (0) and has been translated into Chinese.
The QOL scale (Chinese version) consists of 12 items: appetite, energy, attitude toward treatment, sleep, family relationships, fatigue, work relationships, pain, perception of cancer, activities of daily life, side effects of treatment, and facial expression. It was developed by Sun Yan in the 1990s by adapting international scales to the context of Chinese culture [24]. Each item is scored from 1 to 5, giving a maximum total score of 60. For instance, appetite is scored from hardly eating (1) to eating normally (5).
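The scale definitions above can be expressed as a short sketch. The function names are hypothetical and chosen only for illustration; the NRS bands and the 12-item, 1-to-5 QOL scoring are taken from the text, while the input validation is an added assumption.

```python
def nrs_category(score: int) -> str:
    """Map a 0-10 numeric rating scale (NRS) pain score to the bands in the text."""
    if not 0 <= score <= 10:
        raise ValueError("NRS score must be between 0 and 10")
    if score == 0:
        return "painless"
    if score <= 3:
        return "mild"
    if score <= 6:
        return "moderate"
    return "severe"


def qol_total(item_scores: list) -> int:
    """Sum the 12 QOL items (each scored 1-5); the total lies between 12 and 60."""
    if len(item_scores) != 12:
        raise ValueError("the QOL scale has 12 items")
    if any(not 1 <= s <= 5 for s in item_scores):
        raise ValueError("each QOL item is scored from 1 to 5")
    return sum(item_scores)
```

For example, a patient scoring 5 on every item reaches the maximum total of 60, and an NRS score of 5 falls in the moderate band.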

Statistical analysis
The data were split into two sets using stratified random sampling: 75% as a training set for developing a predictive model and 25% as a testing set for validating it. The differences between the testing and training sets were evaluated using the Mann-Whitney U test for continuous variables and the chi-square test for categorical variables. Categorical variables were reported as percentages, and continuous variables were reported as medians and interquartile ranges (IQR). Before performing statistical analyses, we converted KPS scores, QOL scores, NRS scores, and age into categorical variables using the X-tile software (version 3.6.1, http://medicine.yale.edu). X-tile plots provide an intuitive method to assess the association between variables and survival. The X-tile program automatically selects the optimum cut point according to the highest χ² value (minimum p value) defined by Kaplan-Meier survival analysis and the log-rank test [30]. As a result, we categorized KPS scores as 30 or less, 40, and 50 or more; QOL scores as 30 or less and 31 or more; NRS scores as 3 or less, 4 to 6, and 7 or more; and age as less than 60 years and 60 years or more.
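The categorization and the 75/25 split described above might be sketched as follows. This is an illustration only, not the authors' procedure (they used X-tile and R): the cut points come from the text, but the function names, the record field `event`, and the choice of event status as the stratification variable are assumptions, since the text does not say which variable the sampling was stratified on.

```python
import random


def categorize_kps(kps: int) -> str:
    # Cut points reported in the text: 30 or less, 40, 50 or more
    if kps <= 30:
        return "<=30"
    if kps < 50:
        return "40"
    return ">=50"


def categorize_qol(qol: int) -> str:
    return "<=30" if qol <= 30 else ">=31"


def categorize_nrs(nrs: int) -> str:
    if nrs <= 3:
        return "<=3"
    if nrs <= 6:
        return "4-6"
    return ">=7"


def categorize_age(age: int) -> str:
    return "<60" if age < 60 else ">=60"


def stratified_split(records, key, train_fraction=0.75, seed=0):
    """Split records into train/test so each stratum keeps the same proportion.

    `key` maps a record to its stratum (ASSUMPTION: event status is used here).
    """
    rng = random.Random(seed)
    strata = {}
    for rec in records:
        strata.setdefault(key(rec), []).append(rec)
    train, test = [], []
    for group in strata.values():
        shuffled = group[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test


# Synthetic records for illustration only; these are not study data.
records = [{"id": i, "event": i % 10 != 0} for i in range(1000)]
train, test = stratified_split(records, key=lambda r: r["event"])
```

Stratifying keeps the proportion of censored patients similar in both sets, which is one plausible reason to prefer it over a simple random split.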
We used 10-fold cross-validated Cox proportional hazards regression with LASSO penalization to select the most significant prognostic variables from all initial variables. By performing both variable selection and penalization, the LASSO can build accurate models with less risk of underfitting or overfitting, which can lead to superior performance over traditional multivariable regression [31]. Consequently, the LASSO has been extended and broadly applied to the Cox proportional hazards regression model for survival analysis [32]. Further, the most significant predictors were used to construct the nomogram to predict the 30-day and 60-day survival probability by using multivariate Cox regression. In other words, we used the Cox regression model for the multivariable survival analysis and the Cox regression coefficients to generate the nomogram [33]. For multivariate analysis of survival probability, the Cox regression was performed with the forward stepwise procedure. Then, the performance of the nomogram was evaluated using the area under the receiver operating characteristic curve (AUC) with a 95% confidence interval and calibration curves (500 bootstrap resamples) in both the training and testing datasets. The AUC is treated much like the C-statistic and allows predictive performance to be evaluated dynamically and more intuitively [34,35]. The calibration curve is useful for assessing whether predicted outcomes approximate actual outcomes. R software version 3.6.2 (https://www.r-project.org/) was used for the statistical analysis, and P < 0.05 was considered statistically significant. The overall survival analysis was performed by the Kaplan-Meier method using the "survival" and "survminer" packages. LASSO Cox regression analysis and nomogram construction were performed with the "glmnet" and "rms" packages. Receiver operating characteristic curve and calibration curve analyses were conducted using the "timeROC" and "rms" packages. A table of baseline patient characteristics was generated using the "tableone" package.
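To illustrate how Cox regression coefficients translate into nomogram points and a predicted survival probability, the following minimal sketch uses the standard Cox relation S(t | x) = S0(t)^exp(lp), where lp is the linear predictor. It is not the authors' implementation (which used the R "rms" package). All coefficient values and the baseline survival below are hypothetical placeholders, not fitted values from this study, and the function names are invented for illustration.

```python
import math


def nomogram_points(effects):
    """Rescale log-hazard contributions to a 0-100 points axis.

    `effects` maps variable -> {level: log-hazard contribution}. The variable
    with the widest contribution range spans the full 0-100 points.
    """
    max_range = max(max(lv.values()) - min(lv.values()) for lv in effects.values())
    return {
        var: {
            level: 100.0 * (value - min(lv.values())) / max_range
            for level, value in lv.items()
        }
        for var, lv in effects.items()
    }


def survival_probability(baseline_survival: float, linear_predictor: float) -> float:
    """Cox model: S(t | x) = S0(t) ** exp(linear predictor)."""
    return baseline_survival ** math.exp(linear_predictor)


# HYPOTHETICAL log-hazard contributions, for illustration only.
effects = {
    "KPS": {"<=30": 0.8, "40": 0.4, ">=50": 0.0},
    "edema": {"absent": 0.0, "present": 0.3},
}
points = nomogram_points(effects)

# HYPOTHETICAL patient and HYPOTHETICAL 30-day baseline survival of 0.6.
lp = effects["KPS"]["<=30"] + effects["edema"]["present"]
p30 = survival_probability(0.6, lp)
```

In this orientation, more points correspond to a higher hazard and therefore a lower predicted survival probability.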

Results
Characteristics of the dataset
A total of 181 patients with any missing data were excluded from our analysis. After exclusion, 1618 patients with gastrointestinal cancer were included in our study and randomly divided into two sets: 1214 patients (110 censored) as the training dataset and 404 patients (33 censored) as the testing dataset. The overall survival function with a risk table is shown in Fig. 1. The median survival time for all included patients was 35 days (interquartile range [IQR], 17-66). Among all cases, 70.3% were men and 57.8% were older than 60 years. Detailed information between the training and testing dataset were summarized in

Nomogram construction and validation
The optimal tuning parameter λ for LASSO regression with 10-fold cross-validation was 0.093, with log(λ) = −2.375, following the one-standard-error-of-the-minimum criterion (Fig. 2a). At the optimal value of log(λ), five variables (KPS, abdominal distention, edema, QOL, and duration of pain) with nonzero coefficients were selected in the LASSO analysis (Fig. 2b). The five retained variables were then used to construct the nomogram by using multivariate Cox regression. As shown in Table 2, KPS, abdominal distention, edema, QOL, and duration of pain formed a panel of significant predictors of overall survival (OS) in patients with gastrointestinal cancer. In the nomogram (Fig. 3), each prognostic variable corresponds to a specific number of points, read by drawing a straight line upward to the points axis. After the points for all variables are summed, the total points on the bottom scales give the corresponding 30-day and 60-day survival probabilities. We examined the performance of our predictive nomogram by employing both discrimination and calibration assessments. The receiver operating characteristic (ROC) curve analysis showed useful discrimination in both the training and testing datasets. As shown in Fig. 4, the AUC value was 0.723 (95% confidence interval, 0.694-0.753) at 30 days and 0.733 (95% confidence interval, 0.702-0.763) at 60 days in the training dataset. Similarly, the AUC value was 0.724 (0.673-0.774) at 30 days and 0.725 (0.672-0.778) at 60 days in the testing dataset (Fig. 5). To assess the calibration of the prognostic nomogram, we compared the predicted 30-day and 60-day survival probabilities with the actual 30-day and 60-day survival probabilities. As shown in Fig. 6 and Fig. 7, the calibration curves revealed good agreement between the predicted and observed probabilities.

Discussion
[23,24]. However, several studies have shown that prognostic information varies by cancer type [21,36,37].
In China, a developing country, gastrointestinal cancer has a high incidence and a poor prognosis because of environmental pollution and the lack of early diagnosis and treatment [1]. To our knowledge, this is the first study to analyze gastrointestinal cancer-specific prognostic factors that influence patients' survival and to build a model to predict survival time in home hospice care. Compared with previous studies, the application of LASSO Cox regression with cross-validation enabled us to develop a more parsimonious predictive model with superior performance. For example, Zhou LJ et al. identified 10 prognostic variables to develop a simple Chinese Prognostic Scale (ChPS) with 65.4% accuracy in the testing set by using traditional multivariable regression [24]. In our study, we used the LASSO analysis to identify 5 predictors from all 28 initial variables, and the evaluation of our predictive model showed useful discrimination and good calibration in both the training and testing datasets. This is consistent with the view that the LASSO performs better than traditional multivariable regression because it performs both variable selection and penalization [38]. Penalized regression avoids model overfitting by adding a penalty term to the objective function to control the complexity of the model [31]. In clinical scenarios, a more selective model would be preferred because it could save time and resources by avoiding the collection of less useful data. In addition, our predictive nomogram has the advantage of not requiring laboratory measures, which are difficult to obtain in hospice patients.
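For readers unfamiliar with the penalty term mentioned above, one common way to write the LASSO-penalized Cox objective is given below. The notation is an assumption introduced here for illustration: δ_i is the event indicator, R(t_i) is the set of patients still at risk at time t_i, x_i is the covariate vector, and tied event times are ignored. Scaling constants in front of the log partial likelihood differ between software implementations.

```latex
\hat{\beta} = \arg\min_{\beta}
  \left\{ -\frac{1}{n}\,\ell(\beta) + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\},
\qquad
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \left[ x_i^{\top}\beta - \log \sum_{k \in R(t_i)} \exp\!\left( x_k^{\top}\beta \right) \right]
```

Larger values of λ shrink more coefficients exactly to zero, which is how the penalty performs variable selection as well as shrinkage.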
Many studies have suggested that performance status along with some clinical symptoms could improve the prediction of survival for terminal cancer patients [24,39,40]. This parallels our findings. Our prognostic nomogram includes five predictors: KPS, abdominal distention, edema, QOL, and duration of pain. Among these, KPS, the recognized tool for evaluating performance status, has been found to be reasonably reliable in survival prediction for patients with advanced cancer even when scores were as low as 50 [40,41]. Poor performance status is associated with short survival. As shown in our nomogram, the lower a patient's KPS score, the more points the patient receives, indicating worse 30-day and 60-day overall survival. Similarly, a patient with a QOL score of 30 or lower has a worse probability of survival than one with a score higher than 30. For the predictor "duration of pain" in our nomogram, a shorter duration gives more points and a worse probability of survival. This is likely because patients who experience acute pain usually have severe disease progression, which leads to shorter survival. Furthermore, we should note that the symptoms included in our nomogram (edema, abdominal distention, and duration of pain) were not exactly the same as those included in previous studies [42]. This may be due to the different characteristics of the samples included in the studies. Glare reported apparent differences in prognostic factors between patients with a predicted survival of less than 3 months and those with a predicted survival of 3-12 months [43]. In addition, KPS and QOL scores were generally low in our study, possibly because the performance status and QOL of patients with advanced cancer in home hospice care are generally poor.
This study is limited because it was performed using a retrospective database from one hospice center in China. First, the AUC is acceptable but not outstanding. This may be partly due to the retrospective nature of the study, our inability to capture all useful predictors, and the limited precision of each predictor; for example, symptoms were recorded only as "present" or "absent" in the case records. Second, the optimal method of validating our predictive model is to use a separate dataset from another center. A potential solution is to perform a prospective multicenter study, though this is time-consuming and potentially unfeasible. Moreover, patients with any missing data were excluded from our analysis, which may affect the robustness of the model to some extent. Last but not least, we included only patients who survived less than six months, which may introduce selection bias. However, the clinical reality also needs to be considered. Very few patients survived more than six months in the hospice unit, and they were mostly in the early stages of cancer. Those patients chose hospice treatment mainly because of financial difficulties.

Conclusion
To our knowledge, this is the first application of LASSO Cox regression with cross-validation to produce a gastrointestinal cancer-specific nomogram to predict the probability of survival at 30 days and 60 days in home hospice care patients in China. Our nomogram may be a useful non-lab clinical tool that needs prospective multicenter validation as well as testing with Chinese clinicians in charge of hospice patients with gastrointestinal cancers to assess acceptability and usability.