Study design and population
The study population consisted of kidneys from adult donors, living or deceased, transplanted or discarded, enrolled between January 1, 2000 and December 31, 2021, that underwent biopsy prior to kidney transplantation as part of standard of care. For the derivation cohort, the study involved 14 sites in seven countries (France, Belgium, Croatia, Spain, United States, Canada, Australia) and 15 centers, including the largest organ procurement organization (OPO) in the United States (OneLegacy). The external validation cohort included two institutions in two countries: Columbia University Medical Center in the United States and Sun Yat-sen University in China. A total of 15,121 kidney biopsies were evaluated overall. Biopsies judged inadequate according to the requirements of the Banff international classification were excluded (n = 1,089, 7.2%) [21]. The final analysis included 14,032 kidney allograft biopsies, of which 1,372 (9.8%) came from discarded kidneys. Of these, 12,402 were included in the derivation cohort and 1,630 in the external validation cohort.
Inclusion and Ethics Statement
All data were anonymized, and clinical and biological data were collected from each center and entered into the Paris Transplant Group database (French Data Protection Agency (CNIL) registration number 363505). Data were accessed from the database on January 1, 2021. Data from China were accessed on November 19, 2021, and OneLegacy OPO data were accessed on June 8, 2022. The study protocol (NCT04759209) was approved by the Paris Transplant Group Institutional Review Board (IRB). Written informed consent was obtained from all living donors at the time of transplantation. The Institutional Review Board of the Paris Transplant Institute approved this study and waived informed consent for deceased donors (registration number 2018-1017-Virtual-Biopsy). Initial collection and export of data were approved by the Ministry of Science and Technology of Sun Yat-sen University in China. All data from the Paris Transplant Group centers (Necker Hospital, St. Louis Hospital, Toulouse Hospital) were prospectively entered at the time of transplantation. A structured protocol was used to ensure harmonization between study centers, and annual audits were conducted to ensure data accuracy. Datasets from the other centers in Europe, North America, Australia and Asia were compiled as part of standard clinical procedures, entered into each center’s database according to regional and national regulatory standards, and submitted anonymously to the Paris Transplant Group.
Renal biopsy histological evaluation and protocol
After the organ was removed from the donor, a day-zero biopsy was performed by a surgeon using a 16-gauge needle device or a straight blade according to standard techniques. Tissues were immediately fixed in aqueous formaldehyde (formalin) or alcohol-formalin-acetic acid solution and then embedded in paraffin, or were immediately frozen. Biopsy sections (4 μm) were stained with periodic acid-Schiff, Masson’s trichrome, and hematoxylin and eosin. Using the lesion scoring system of the international Banff classification [21], expert renal pathologists graded the biopsy lesions on the following criteria: number of glomeruli, arteriosclerosis, arteriolar hyalinosis, interstitial fibrosis and tubular atrophy, and percentage of sclerotic glomeruli. A detailed table summarizing biopsy practices and procedures for participating centers is provided in Supplementary Table 14.
Outcomes of interest
The outcomes of interest were the biopsy findings according to the international Banff classification of allograft pathology, which uses a validated semi-quantitative ordinal grading scheme for all renal compartments, including: (i) arteriosclerosis, defined by intimal thickening in the most severely affected artery (Banff “cv” score); (ii) arteriolar hyalinosis, defined by periodic acid-Schiff (PAS)-positive arteriolar hyaline thickening (Banff “ah” score); and (iii) interstitial fibrosis and tubular atrophy (Banff “IFTA” score), calculated from the degree of cortical fibrosis (Banff “ci” score) and the degree of cortical tubular atrophy (Banff “ct” score) [21]. These semi-quantitative lesion grading scores are not linear. Finally, glomerulosclerosis was treated as a continuous outcome, defined as the percentage of the total number of glomeruli affected by global sclerosis (“glomerulosclerosis score”) [5]. See Supplementary Method 3 and Supplementary Table 15 for details of the Banff grading scheme.
Candidate predictors of histological lesions in renal biopsies
Eleven candidate donor predictors of day-zero histological kidney lesions, all universally collected at the time of donation, were examined: donor age, gender, donor type (living or deceased), cerebrovascular cause of death, donation after circulatory death (DCD), history of hypertension, history of diabetes, hepatitis C virus (HCV) status, body mass index (BMI), lowest serum creatinine at donation, and proteinuria status. See Supplementary Method 4 for details on these predictors.
Statistical analysis
The TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) statement [38], adapted for machine learning, was used to report the development and validation of the virtual biopsy system (Supplementary Method 5). Figure 4 summarizes the process of machine learning model generation and validation.
Descriptive analysis of baseline characteristics
Continuous variables were described using the mean and standard deviation or the median and interquartile range. Means and proportions were compared between groups using Student’s t-test, analysis of variance (ANOVA) (or the Mann-Whitney and Kruskal-Wallis tests, as appropriate), or the chi-square test (or Fisher’s exact test, as appropriate). All tests were two-tailed.
Algorithm preprocessing
An up-sampling technique was applied during model training to reduce class imbalance in the lesion scores and to maximize prediction performance for severe/high grades (underrepresented) rather than mild/low grades (overrepresented). This was done by randomly resampling severe/high-grade kidneys. Three continuous donor parameters (age, BMI, creatinine) were standardized to have a mean of 0 and a standard deviation of 1. These preprocessing steps were performed using the caret R package [39].
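The two preprocessing steps can be illustrated with a minimal sketch. It is written in plain Python rather than with the caret R package used in the study, and it is not the authors' implementation. The function names, the toy donor records, and the choice to standardize before up-sampling are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict
from statistics import mean, stdev


def upsample(rows, label_key, seed=0):
    """Resample rarer classes with replacement until each matches the largest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw the shortfall at random from the same class (zero draws for the largest class).
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


def standardize(rows, keys):
    """Return copies of rows with each listed column centered to mean 0 and scaled to SD 1."""
    stats = {}
    for key in keys:
        values = [row[key] for row in rows]
        stats[key] = (mean(values), stdev(values))
    return [
        {**row, **{k: (row[k] - stats[k][0]) / stats[k][1] for k in keys}}
        for row in rows
    ]


# Hypothetical donors: Banff cv grade 0 is common, grade 3 is rare.
donors = [
    {"age": 34, "bmi": 22.1, "creatinine": 0.7, "cv": 0},
    {"age": 41, "bmi": 24.8, "creatinine": 0.8, "cv": 0},
    {"age": 29, "bmi": 21.5, "creatinine": 0.6, "cv": 0},
    {"age": 58, "bmi": 27.3, "creatinine": 0.9, "cv": 1},
    {"age": 63, "bmi": 30.2, "creatinine": 1.1, "cv": 1},
    {"age": 74, "bmi": 28.9, "creatinine": 1.4, "cv": 3},
]

scaled = standardize(donors, ["age", "bmi", "creatinine"])
balanced = upsample(scaled, "cv")
print(Counter(row["cv"] for row in balanced))
```

In a real pipeline the scaling parameters and the resampling would be derived from the training folds only, so that no information leaks into the held-out data.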
Development of virtual biopsy system
To develop the virtual biopsy system, we calculated the probability of each day-zero histological lesion score from six machine learning models: random forest (RF) [40], model-averaged neural network (avNNet) [41], gradient boosting machine (GBM) [42], extreme gradient boosting tree (XGBoost) [43], linear discriminant analysis (LDA) [41], and multinomial logistic regression (MNOM) [44]. To avoid overfitting and sampling bias, hyperparameters were tuned using 10-fold cross-validation repeated three times [45]. We then aggregated the classification models by averaging the probabilities provided by each model. This produced an ensemble model, or meta-classifier, that aims to reduce bias and overfitting in light of the “no free lunch” theorem [46,47,48]. MNOM and LDA were not used to predict glomerulosclerosis (a regression task), as they are designed only to predict categorical variables (classification). For the regression task, we fitted a linear model combining the individual regression models to create an ensemble model, or meta-regression.
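The probability-averaging step can be sketched as follows. This is an illustrative example in plain Python, not the authors' R code. The function names and the probabilities attributed to the three toy models are hypothetical, and equal weighting of the models is assumed.

```python
def average_probabilities(model_probs):
    """Average class probabilities across models.

    model_probs holds one entry per model; each entry holds one probability
    vector per biopsy, with one probability per Banff grade (0 to 3).
    """
    n_models = len(model_probs)
    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [sum(m[i][c] for m in model_probs) / n_models for c in range(n_classes)]
        for i in range(n_samples)
    ]


def predict_grade(probs):
    """Return the index of the most probable grade for each biopsy."""
    return [max(range(len(p)), key=p.__getitem__) for p in probs]


# Hypothetical outputs of three base models for two biopsies.
rf_probs = [[0.7, 0.2, 0.1, 0.0], [0.1, 0.2, 0.3, 0.4]]
gbm_probs = [[0.5, 0.3, 0.1, 0.1], [0.2, 0.2, 0.4, 0.2]]
lda_probs = [[0.6, 0.1, 0.2, 0.1], [0.0, 0.2, 0.2, 0.6]]

ensemble_probs = average_probabilities([rf_probs, gbm_probs, lda_probs])
ensemble_grades = predict_grade(ensemble_probs)
print(ensemble_probs)
print(ensemble_grades)
```

Because every base model outputs a vector that sums to 1, the averaged vector also sums to 1 and can be read directly as the ensemble's probability for each grade.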
Prediction performance of the virtual biopsy system
Model performance was evaluated through internal and external validation. For internal validation, performance was evaluated on the 30 resamples from a 10-fold cross-validation repeated three times on the derivation cohort. For external validation, performance was assessed in the external cohort. To evaluate the performance of the machine learning models for glomerulosclerosis (continuous), we used the mean absolute error (MAE) and root mean squared error (RMSE) as metrics [49]. For the ordinal day-zero histological lesion scores (cv, ah, and IFTA), we used the multiclass area under the curve (multi-AUC) of Hand and Till [50]. Additional complementary metrics for cv, ah, and IFTA were also reported for both internal and external validation: sensitivity, specificity, balanced accuracy (the average of sensitivity and specificity), precision, and area under the receiver operating characteristic curve (AUROC). To compute these complementary metrics, the categorical Banff scores were dichotomized: “none” (Banff score 0) and “mild” (Banff score 1) formed the negative class, and “moderate” (Banff score 2) and “severe” (Banff score 3) formed the positive class. The cutoff for the dichotomized Banff lesions was calculated using Youden’s J statistic in the internal validation [51]. Supplementary Method 6 contains the rationale for the cutoffs used to measure performance. The 95% confidence intervals (CIs) were obtained from 1,000 bootstrap samples, and samples from the external validation cohort were used for the point estimates of each metric.
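The two less familiar metrics in this paragraph can be sketched in plain Python. The first function follows the published Hand and Till definition, in which the multiclass AUC is the mean over all class pairs of the two one-sided pairwise AUCs. The second picks the cutoff that maximizes Youden’s J (sensitivity + specificity - 1). The function names and toy data are hypothetical, and using the summed probability of grades 2 and 3 as the score for the dichotomized outcome is an assumption, since the text does not state how that score was formed.

```python
from itertools import combinations


def pairwise_auc(pos_scores, neg_scores):
    """Probability that a positive case outscores a negative case (ties count 1/2)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))


def hand_till_auc(grades, probs):
    """Hand and Till multiclass AUC; grades are integer indices into each probability vector."""
    classes = sorted(set(grades))
    total = 0.0
    for i, j in combinations(classes, 2):
        in_i = [p for g, p in zip(grades, probs) if g == i]
        in_j = [p for g, p in zip(grades, probs) if g == j]
        # A(i|j): separate class i from class j using the predicted probability of i.
        a_ij = pairwise_auc([p[i] for p in in_i], [p[i] for p in in_j])
        # A(j|i): separate class j from class i using the predicted probability of j.
        a_ji = pairwise_auc([p[j] for p in in_j], [p[j] for p in in_i])
        total += (a_ij + a_ji) / 2.0
    n = len(classes)
    return 2.0 * total / (n * (n - 1))


def youden_cutoff(is_positive, scores):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1."""
    n_pos = sum(1 for y in is_positive if y)
    n_neg = len(is_positive) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for y, s in zip(is_positive, scores) if y and s >= cut)
        tn = sum(1 for y, s in zip(is_positive, scores) if not y and s < cut)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical Banff grades and ensemble probabilities for six biopsies.
grades = [0, 0, 1, 2, 3, 3]
probs = [
    [0.6, 0.2, 0.1, 0.1], [0.3, 0.4, 0.2, 0.1], [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.3, 0.4, 0.2], [0.1, 0.1, 0.3, 0.5], [0.2, 0.2, 0.4, 0.2],
]
print(hand_till_auc(grades, probs))

# Dichotomize: grades 2 and 3 are positive; score is P(grade 2) + P(grade 3).
positives = [g >= 2 for g in grades]
scores = [p[2] + p[3] for p in probs]
print(youden_cutoff(positives, scores))
```

A model that separates every pair of grades perfectly reaches a multi-AUC of 1, while one that assigns identical probabilities to every biopsy scores 0.5.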
Model calibration was checked using a confusion matrix. Additionally, we averaged the feature importance for RF, GBM, XGBoost, LDA (classification model only), avNNet, and MNOM (classification model only) to evaluate the donor parameters that govern model performance.
Imputation of missing data
For biopsies with at least one missing value among the predictors of interest, a random forest imputation algorithm was applied using the missForest R package [52]. The maximum number of imputation iterations was set to 10. Details of the imputation process and results are presented in Supplementary Method 7.
Kidney Donor Profile Index (KDPI)
We conducted a sensitivity analysis to investigate whether the KDPI could predict day-zero biopsy lesions. We developed a model using only KDPI scores. Biopsies from living donors and biopsies from donors with missing ethnicity, height, or weight data were excluded from the imputed dataset. KDPI calculations followed Organ Procurement and Transplantation Network (OPTN) guidelines based on databases as of April 7, 2023. An ensemble of RF, XGBoost, LDA, avNNet, and MNOM models was employed. LDA and MNOM were excluded from the prediction of glomerulosclerosis. GBM was excluded due to difficulty in deriving a univariate model.
Assessing consistency in biopsy evaluation
To assess inter-pathologist consistency in evaluating biopsy findings, 10% of the biopsies were randomly selected and re-evaluated by four specialist renal pathologists at two transplant centers (Necker Hospital and Mayo Clinic). The pathologists were blinded to the previous biopsy findings. Fleiss’ kappa was used to measure agreement and was weighted to account for the magnitude of disagreement in the reassessment.
Software and packages
Descriptive and machine learning analyses were performed using R (version 3.5.1, R Foundation for Statistical Computing) and RStudio (version 2022.7.2.576). The packages used for data and machine learning analysis were randomForest (version 4.6-14), gbm (version 2.1.5), xgboost (version 1.4.1.1), plyr (version 1.8.4), MASS (version 7.3-51.4), nnet (version 7.3-12), caret (version 6.0-84), caretEnsemble (version 2.0.1), tidyverse (version 1.3.0), ggsci (version 2.9), rsample (version 0.1.1), tidymodels (version 0.0.2), patchwork (version 1.0.0), dplyr (version 1.0.7), ggplot2 (version 3.3.1), yardstick (version 0.0.8), readr (version 1.3.1), cvms (version 1.3.3), pROC (version 1.18.0), rlist (version 0.4.6.2), autoxgboost (version 0.0.0.9000), shiny (version 1.6.0), shinythemes (version 1.1.2), kableExtra (version 1.3.4), and compareGroups (version 4.0.0).
Reporting summary
For more information on the study design, see the Nature Portfolio Reporting Summary linked in this article.