Article| Volume 45, ISSUE 5, P1039-1045, November 2022

Supporting first FSH dosage for ovarian stimulation with machine learning

  • Nuria Correa
    Affiliations
    Clínica Eugin-Eugin Group, Carrer de Balmes 236, Barcelona 08006, Spain

    Instituto de Investigación en Inteligencia Artificial, Consejo Superior de Investigaciones Científicas (IIIA-CSIC), Campus de la UAB, Carrer de Can Planas, Zona 2, Cerdanyola de Valles Barcelona 08193, Spain

    Universitat Autònoma de Barcelona (UAB), Plaça Cívica, Bellaterra Barcelona 08193, Spain
  • Jesus Cerquides
    Affiliations
    Instituto de Investigación en Inteligencia Artificial, Consejo Superior de Investigaciones Científicas (IIIA-CSIC), Campus de la UAB, Carrer de Can Planas, Zona 2, Cerdanyola de Valles Barcelona 08193, Spain
  • Josep Lluis Arcos
    Affiliations
    Instituto de Investigación en Inteligencia Artificial, Consejo Superior de Investigaciones Científicas (IIIA-CSIC), Campus de la UAB, Carrer de Can Planas, Zona 2, Cerdanyola de Valles Barcelona 08193, Spain
  • Rita Vassena
    Correspondence
    Corresponding author.
    Affiliations
    Clínica Eugin-Eugin Group, Carrer de Balmes 236, Barcelona 08006, Spain

      HIGHLIGHTS

      • We developed a ML model to recommend first FSH dosage for all types of patients.
      • The model performance surpassed the clinicians’ in both development and validation.
      • The model can serve as quality check, second opinion or learning tool for trainees.

      Abstract

      Research question

      Is it possible to identify accurately the optimal first dose of FSH in ovarian stimulation by means of a machine learning model?

      Design

      Observational study (2011–2021) including first IVF cycles with own oocytes. A total of 2713 patients from five private reproductive centres were included in the development phase (2011–2019) and 774 in the validation phase (2020–2021). Predictor variables included age, BMI, AMH, AFC and previous live births. Performance was measured with a proposed score based on the number of MII oocytes retrieved and dose received, recommended, or both.

      Results

      The included cycles were from women aged 37.7 ± 4.4 years (18–45 years), with a BMI of 23.5 ± 4.2 kg/m2, AMH of 2.4 ± 2.3 ng/ml, AFC of 11.3 ± 7.6, and an average number of MII obtained 6.9 ± 5.4. The model reached a mean performance score of 0.87 (95% CI 0.86 to 0.88) in the development phase, significantly better than for doses prescribed by clinicians for the same patients (0.83, 95% CI 0.82 to 0.84; P = 2.44 e-10). Mean performance score of the model recommendations was 0.89 (95% CI 0.88 to 0.90) in the validation phase, also significantly better than clinicians (0.84, 95% CI 0.82 to 0.86; P = 3.81 e-05). The model was shown to surpass the performance of standard practice.

      Conclusion

      This machine learning model could be used as a training and learning tool for new clinicians, and as quality control for experienced clinicians.

      Introduction

      Although significant strides have been made in the last 40 years, the mean pregnancy rate after an IVF cycle still hovers around 30%, with a 20% chance of delivery (
      • De Geyter C.
      • Calhaz-Jorge C.
      • Kupka M.S.
      • Wyns C.
      • Mocanu E.
      • Motrenko T.
      • Baranowski R.
      ART in Europe, 2014: Results generated from European registries by ESHRE.
      ). An important requisite for the success of an IVF cycle is the availability of a certain number of mature oocytes (metaphase II [MII]), usually obtained after ovarian stimulation.
      Ovarian stimulation therefore represents a key step for IVF success, as failing to ensure an optimal number of MII oocytes will likely hinder the procedure. As the number of MII oocytes retrieved increases, so does the chance of producing some embryos with high pregnancy potential (
      • Drakopoulos P.
      • Blockeel C.
      • Stoop D.
      • Camus M.
      • De Vos M.
      • Tournaye H.
      • Polyzos N.P.
      Conventional ovarian stimulation and single embryo transfer for IVF/ICSI. How many oocytes do we need to maximize cumulative live birth rates after utilization of all fresh and frozen embryos?.
      ;
      • Esteves S.C.
      • Carvalho J.F.
      • Bento F.C.
      • Santos J.
      A novel predictive model to estimate the number of mature oocytes required for obtaining at least one euploid blastocyst for transfer in couples undergoing in vitro fertilization/intracytoplasmic sperm injection: The ART calculator.
      ), but stimulating a patient too much leads to an increased risk of ovarian hyperstimulation syndrome (OHSS). A compromise must therefore be reached: retrieving a number of oocytes within a range considered optimal, one that does not increase the chance of OHSS but maintains good pregnancy potential. The optimal range considered in this study is 10 to 15 oocytes (
      • Sunkara S.K.
      • Rittenberg V.
      • Raine-Fenning N.
      • Bhattacharya S.
      • Zamora J.
      • Coomarasamy A.
      Association between the number of eggs and live birth in IVF treatment: An analysis of 400 135 treatment cycles.
      ;
      • Steward R.G.
      • Lan L.
      • Shah A.A.
      • Yeh J.S.
      • Price T.M.
      • Goldfarb J.M.
      • Muasher S.J.
      Oocyte number as a predictor for ovarian hyperstimulation syndrome and live birth: An analysis of 256,381 in vitro fertilization cycles.
      ). Anything outside these values is considered too many or too few. Whenever a patient falls outside the defined range, the risk of an unsuccessful or cancelled cycle increases, as does the occurrence of OHSS. This implies the need to freeze all the embryos generated, which increases costs and causes delays in treatment. Acceptance of an increased risk of OHSS, when properly managed with gonadotrophin releasing hormone agonist trigger, in exchange for a higher number of MII oocytes, is controversial.
      • Sunkara S.K.
      • Rittenberg V.
      • Raine-Fenning N.
      • Bhattacharya S.
      • Zamora J.
      • Coomarasamy A.
      Association between the number of eggs and live birth in IVF treatment: An analysis of 400 135 treatment cycles.
      and
      • Steward R.G.
      • Lan L.
      • Shah A.A.
      • Yeh J.S.
      • Price T.M.
      • Goldfarb J.M.
      • Muasher S.J.
      Oocyte number as a predictor for ovarian hyperstimulation syndrome and live birth: An analysis of 256,381 in vitro fertilization cycles.
      found that live birth rates (LBR) in fresh cycles with more than 15 oocytes plateaued or even declined; other investigators (
      • Ji J.
      • Liu Y.
      • Tong X.H.
      • Luo L.
      • Ma J.
      • Chen Z.
      The optimum number of oocytes in IVF treatment: An analysis of 2455 cycles in China.
      ) showed an increased cumulative LBR when the frozen embryo transfers were taken into account. This could specifically benefit patients with advanced maternal age but not patients with polycystic ovary syndrome (
      • Chen Y.
      • Wang Q.
      • Zhang Y.
      • Han X.
      • Li D.
      • Zhang C.
      Cumulative live birth and surplus embryo incidence after frozen-thaw cycles in PCOS: how many oocytes do we need?.
      ). For the present study, uniform criteria for all patients were needed, so a conservative view was considered adequate. A response below 15 oocytes was then set as ideal.
      Essential to all ovarian stimulation protocols is the starting dose of exogenous FSH. This dose should be sufficient to recruit all FSH responsive follicles but should not be any higher to avoid unsafe effects, i.e. OHSS or decreased oocyte quality. After about 8 days of stimulation, changing the FSH dose does not allow for a significant further recruitment of follicles (
      • Fleming R.
      • Deshpande N.
      • Traynor I.
      • Yates R.W.S.
      Dynamics of FSH-induced follicular growth in subfertile women: Relationship with age, insulin resistance, oocyte yield and anti-Mullerian hormone.
      ). In other words, if the starting dose of exogenous FSH is inadequate, little can be done to fix its effects on MII yield.
      The choice of the FSH starting dose is mostly based on the patient's characteristics, i.e. age, body mass index (BMI) or ovarian reserve and clinical characteristics, i.e. past gravidity and parity. Sometimes, ovarian stimulation leads to unexpected and widely different results even among apparently similar patients, resulting in either too many or too few oocytes collected. Furthermore, where the MII oocytes retrieved are in the expected number range, they may still be of insufficient quality to achieve success, as only 30–40% of microinjected oocytes develop to blastocyst (
      • Maggiulli R.
      • Cimadomo D.
      • Fabozzi G.
      • Papini L.
      • Dovere L.
      • Ubaldi F.M.
      • Rienzi L.
      The effect of ICSI-related procedural timings and operators on the outcome.
      ;
      • Vaiarelli A.
      • Cimadomo D.
      • Conforti A.
      • Schimberni M.
      • Giuliani M.
      • D'Alessandro P.
      • Ubaldi F.M.
      Luteal phase after conventional stimulation in the same ovarian cycle might improve the management of poor responder patients fulfilling the Bologna criteria: a case series.
      ), and around 11% to an euploid blastocyst (
      • Chamayou S.
      • Sicali M.
      • Alecci C.
      • Ragolia C.
      • Liprino A.
      • Nibali D.
      • Guglielmino A.
      The accumulation of vitrified oocytes is a strategy to increase the number of euploid available blastocysts for transfer after preimplantation genetic testing.
      ).
      Clinicians use their knowledge and experience to prescribe a starting FSH dose to reach the appropriate range of follicular stimulation. So far, some machine learning models have been developed to encapsulate that medical experience reflected in historical data to try to automate that decision. Two separate nomograms based on patient age, anti-Müllerian hormone (AMH) or antral follicle count (AFC) and basal FSH levels have been developed for this task (
      • La Marca A.
      • Papaleo E.
      • Grisendi V.
      • Argento C.
      • Giulini S.
      • Volpe A.
      Development of a nomogram based on markers of ovarian reserve for the individualisation of the follicle-stimulating hormone starting dose in in vitro fertilisation cycles.
      ;
      • Ebid A.H.I.M.
      • Motaleb S.M.A.
      • Mostafa M.I.
      • Soliman M.M.A.
      Novel nomogram-based integrated gonadotropin therapy individualization in in vitro fertilization/intracytoplasmic sperm injection: A modeling approach.
      ). One of them was tested prospectively (
      • Allegra A.
      • Marino A.
      • Volpes A.
      • Coffaro F.
      • Scaglione P.
      • Gullo S.
      • La Marca A.
      A randomized controlled trial investigating the use of a predictive nomogram for the selection of the FSH starting dose in IVF/ICSI cycles.
      ), reporting, among those using the nomogram, an increased number of patients with MII oocytes retrieved in the optimal range and a decreased number of patients with a low response. These two nomograms did not include patients older than 40 years or those with irregular cycles, including patients with polycystic ovary syndrome. In an RCT for another model developed specifically for individualized dosage of FSH delta (
      • Nyboe Andersen A.
      • Nelson S.M.
      • Fauser B.C.J.M.
      • García-Velasco J.A.
      • Klein B.M.
      • Arce J.C.
      Individualized versus conventional ovarian stimulation for in vitro fertilization: a multicenter, randomized, controlled, assessor-blinded, phase 3 noninferiority trial.
      ), no differences in pregnancy rate were observed.
      Additionally, the CONSORT model, based on multivariate regression (
      • Howles C.M.
      • Saunders H.
      • Alam V.
      • Engrand P.
      Predictive factors and a corresponding treatment algorithm for controlled ovarian stimulation in patients treated with recombinant human follicle stimulating hormone (follitropin alfa) during assisted reproduction technology (ART) procedures. An analysis.
      ) predicted overall lower starting doses compared with those prescribed by clinicians in normo-ovulatory patients (
      • Naether O.G.J.
      • Tandler-Schneider A.
      • Bilger W.
      Individualized recombinant human follicle-stimulating hormone dosing using the CONSORT calculator in assisted reproductive technology: A large, multicenter, observational study of routine clinical practice.
      ;
      • Pouly J.L.
      • Olivennes F.
      • Massin N.
      • Celle M.
      • Caizergues N.
      • Contard F.
      Usability and utility of the CONSORT calculator for FSH starting doses: A prospective observational study.
      ). CONSORT was also tested by RCT (
      • Olivennes F.
      • Trew G.
      • Borini A.
      • Broekmans F.
      • Arriagada P.
      • Warne D.W.
      • Howles C.M.
      Randomized, controlled, open-label, non-inferiority study of the CONSORT algorithm for individualized dosing of follitropin alfa.
      ), showing that the model was able to reduce the risk of OHSS in patients while maintaining comparable pregnancy rates compared with the clinician-chosen dose, despite a reduction in the number of retrieved oocytes.
      The aim of the present study was to develop and validate a machine learning model designed to identify the optimal starting dose for all variants of FSH except delta (as it is not quantified in IU/ml), so as to collect a number of MII oocytes as close as possible to 12 (a middle point in the optimal range considered in this study), for all types of patients.

      Materials and methods

      Patient population and ethical approval

      Data from a total of 2713 first IVF cycles, from 2011–2019, registered in five private centres from two countries pertaining to the same company, were used to develop the model. All five centres operated under similar quality-control protocols, but choice of stimulation and modifications to standard protocols were left to each clinician. Natural cycles and cycles in which gonadotrophin doses were not expressed in IU/ml were excluded. The inclusion of first cycles aimed to prevent bias caused by unrecorded clinician knowledge (such as FSH dosage and results of previous cycles). An additional 774 cycles between 2020 and 2021 were used for prospective validation of the model. Three categories of data were collected as variables. First, the input data, composed of age, BMI, proven fertility (Y/N) and reserve markers AMH and AFC; second, the intervention, namely the first dose of FSH prescribed by the clinician; and third, result data expressed as number of metaphase II oocytes (MII) collected after stimulation (Table 1). Throughout the study, only cases with complete data on all the variables were included. Cycles from both the development and validation databases corresponded to women aged 37.7 ± 4.4 years (18–45 years), with a BMI of 23.5 ± 4.2 kg/m2, AMH of 2.4 ± 2.3 ng/ml, AFC of 11.3 ± 7.6, and with an average number of MII obtained 6.9 ± 5.4.
      Table 1. Patient characteristics in the two databases used in the study

      Characteristics               Development database (n = 2713)   Validation database (n = 774)   P-value
      Age, years                    37.7 ± 4.6                        38.3 ± 4.4                      0.007
      AMH, ng/ml                    2.4 ± 2.3                         2.2 ± 2.2                       0.003
      AFC, n                        11.1 ± 7.3                        11.3 ± 8.5                      0.7
      BMI, kg/m2                    23.6 ± 4.2                        23.2 ± 3.9                      0.007
      Number of MII retrieved       6.9 ± 5.0                         6.8 ± 6.5                       0.005
      Proven female fertility, %    13                                10.1                            0.067

      Values are expressed as mean and SD or as %. Variables were compared using Mann-Whitney U test. For proportions, a two-sample z-test was conducted. P < 0.05 was considered statistically significant. AFC, antral follicle count; AMH, anti-Müllerian hormone; BMI, body mass index; MII, metaphase II.
      Permission to conduct this study was obtained from the Ethical Committee for Research of Eugin on 20 October 2020 (approval code: ALGO2).

      Predictive model construction

      In agreement with recently published research (
      • Sunkara S.K.
      • Rittenberg V.
      • Raine-Fenning N.
      • Bhattacharya S.
      • Zamora J.
      • Coomarasamy A.
      Association between the number of eggs and live birth in IVF treatment: An analysis of 400 135 treatment cycles.
      ;
      • Steward R.G.
      • Lan L.
      • Shah A.A.
      • Yeh J.S.
      • Price T.M.
      • Goldfarb J.M.
      • Muasher S.J.
      Oocyte number as a predictor for ovarian hyperstimulation syndrome and live birth: An analysis of 256,381 in vitro fertilization cycles.
      ;
      • Polyzos N.P.
      • Sunkara S.K.
      Sub-optimal responders following controlled ovarian stimulation: An overlooked group?.
      ), the aim of the present study was to predict the initial dose of FSH to achieve a number of MII as close as possible to 12. The range 10–15 was considered desirable, the range four to nine suboptimal and MII lower than four or above 15 not desirable. Given patient characteristics and limitations in the maximum dose of FSH administered, not every patient is considered able to reach the desired goal. The number of MII was selected because, as an end point, it is close in time and association with the intervention while maintaining clinical relevance (a recognized association exists between number of MII and chances of pregnancy and live birth). Live birth rate (LBR) and clinical pregnancy rate (CPR) were considered initially in the building of the model but were too distant in time from the intervention for any model to be able to predict accurately the effect of a specific treatment using, as in the present study, only the information at the start of treatment and from female participants.
      A predictive model was constructed to predict the patient's capability of reacting to the first dose of FSH. This capability can be described by the slope of a simplified linear dose-response function. For any patient, during a natural cycle (0 IU/ml of exogenous FSH) the outcome in number of MII collected would stay mainly between 0 and 1. Given that results of a specific dose of FSH were entered into the database, the value of individual slopes was easily computable. To avoid negative slope values, it was assumed that all patients would achieve 0 MII if given 0 exogenous FSH.
      The slope of a linear function is defined as follows:
      $$m = \frac{y_2 - y_1}{x_2 - x_1}$$


      As the first data point (x1, y1) is set at the origin (0, 0), the slope value for every patient is computed by dividing the outcome or y2 (MII) by x2 (the first dose of FSH).
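As an illustration only, the per-patient slope computation described above can be sketched in Python. The function name `dose_response_slope` and the example values are hypothetical and are not taken from the study's code or data.

```python
def dose_response_slope(mii_retrieved, first_fsh_dose):
    """Slope of the simplified linear dose-response line through the origin.

    With the first point fixed at (0, 0), as in the text, the slope reduces
    to the MII oocytes retrieved divided by the first FSH dose.
    """
    if first_fsh_dose <= 0:
        raise ValueError("first_fsh_dose must be positive")
    return mii_retrieved / first_fsh_dose


# Hypothetical patient: 6 MII oocytes retrieved after a first FSH dose of 150.
example_slope = dose_response_slope(6, 150)
print(example_slope)  # 0.04
```

A patient with a larger slope yields more MII oocytes per unit of FSH, which is why the slope can stand in for the patient's capability of reacting to the first dose.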
      A linear regression algorithm was trained to predict the slope for every case (defined by its values at the start of the stimulation in age, BMI, AFC, AMH and proven fertility). Training was conducted on a random 80% of the development database. The remaining 20% was reserved for testing purposes. The training process was cross-validated five times with five randomly selected training datasets, with their corresponding five test sets.
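One possible reading of the splitting scheme is five independent random 80/20 partitions of the development database. The sketch below shows only that partitioning step, using the Python standard library. The exact routine used in the study is not stated, so the function name `repeated_holdout_splits`, the seed and the rounding of the training-set size are assumptions, and the regression fit itself is omitted.

```python
import random


def repeated_holdout_splits(n_cases, n_repeats=5, train_fraction=0.8, seed=0):
    """Yield (train_indices, test_indices) for repeated random hold-out splits.

    Assumption: each repeat is an independent random 80/20 partition of the
    same database, which is one way to read the procedure in the text.
    """
    rng = random.Random(seed)
    n_train = round(n_cases * train_fraction)
    for _ in range(n_repeats):
        indices = list(range(n_cases))
        rng.shuffle(indices)
        yield sorted(indices[:n_train]), sorted(indices[n_train:])


# 2713 is the size of the development database reported in the text.
splits = list(repeated_holdout_splits(n_cases=2713))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))  # 2170 543 on every repeat
```

A model would be fitted on each training partition and scored on the matching test partition, giving five sets of test results.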

      Dose recommendation by the model

      For dose-recommending purposes, the predicted slope for each test patient was used to compute the necessary FSH to obtain an outcome of 12 MII using the following linear function:
      y = m * x
      where y is the number of MII, m is the value of the slope and x is the FSH quantity.
      As prescribing more than 300 IU/ml of FSH has been reported to give little to no advantages (
      • Harrison R.F.
      • Jacob S.
      • Spillane H.
      • Mallon E.
      • Hennelly B.
      A prospective randomized clinical trial of differing starter doses of recombinant follicle-stimulating hormone (follitropin-β) for first time in vitro fertilization and intracytoplasmic sperm injection treatment cycles.
      ;
      • Bastu E.
      • Celik C.
      • Keskin G.
      • Buyru F.
      Evaluation of embryo transfer time (day 2 vs day 3) after imposed single embryo transfer legislation: When to transfer?.
      ), recommended doses were capped at 300 IU/ml.
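The recommendation step amounts to solving y = m * x for x at the target MII count and then applying the cap. A minimal sketch follows, assuming a strictly positive predicted slope; the function name `recommend_fsh_dose`, its parameters and the example slopes are hypothetical.

```python
def recommend_fsh_dose(predicted_slope, target_mii=12, max_dose=300):
    """Invert y = m * x for x at the target MII count, then cap the dose."""
    if predicted_slope <= 0:
        # Assumption: the sketch refuses non-positive slopes rather than
        # guessing how the study handled them.
        raise ValueError("predicted_slope must be positive")
    return min(target_mii / predicted_slope, float(max_dose))


# Hypothetical predicted slopes (MII oocytes per unit of FSH).
print(recommend_fsh_dose(0.0625))   # 192.0
print(recommend_fsh_dose(0.03125))  # 300.0 (the uncapped value would be 384.0)
```

Keeping `target_mii` as a parameter reflects that the same linear function can be solved for a target other than 12.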

      Development of a performance score

      A score function was designed to compare recommendations made by the model with the prescriptions made by the clinicians. Given any FSH prescription with its resulting MII outcome, the score function assigns a score for a hypothetical recommended dose from –1 (the recommended dose was too low) to 1 (too high), 0 being the best possible value (the dose recommended as appropriate). Doses of FSH were categorized in four ordinal ranks (100 to 150, 151 to 200, 201 to 250 and 251 to 300) to create the score function.
      The score function also allows clinical prescriptions to be assessed by setting the recommended dose equal to the clinician prescribed dose. In doing so, the function evaluates how close the MII outcome is from the optimal range (10 to 15), and if there is any room for improving the dose (Supplementary Information).
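The score function itself is given in the Supplementary Information and is not reproduced here. The sketch below covers only its two categorical inputs as stated in the text: the four ordinal dose ranks and the MII outcome ranges. The function names and the treatment of non-integer doses between rank boundaries are assumptions.

```python
def dose_rank(dose):
    """Map an FSH dose to one of the four ordinal ranks (1 to 4).

    Ranks follow the text: 100-150, 151-200, 201-250 and 251-300.
    Assumption: a non-integer dose such as 150.5 falls into the higher rank.
    """
    if not 100 <= dose <= 300:
        raise ValueError("dose outside the 100-300 range covered by the ranks")
    if dose <= 150:
        return 1
    if dose <= 200:
        return 2
    if dose <= 250:
        return 3
    return 4


def mii_category(n_mii):
    """Classify an MII count using the ranges stated in the text."""
    if 10 <= n_mii <= 15:
        return "desirable"
    if 4 <= n_mii <= 9:
        return "suboptimal"
    return "not desirable"


print(dose_rank(225), mii_category(7))  # 3 suboptimal
```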

      Evaluating the performance of the model

      The performance of model-based recommendations was evaluated using the proposed score in the 20% of the development database reserved for testing and in the prospective validation database. In both cases, two scores were computed for each patient: one for the dose prescribed by the clinician and another for the model-recommended dose. Absolute values of both scores were compared across all cases to identify which group (clinical or model recommended) had more scores closer to 0, regardless of whether the dose was too high or too low. The Wilcoxon signed-rank test was used for this purpose, as the distribution of the scores was not normal. For easier interpretation of the results, mean score values were expressed as 1 − |score|, where a value close to 1 is best. Scikit-learn 0.24 in Python 3.7.6 was used for all computations.
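The 1 − |score| transformation and the resulting mean values can be sketched as follows. The paired scores are invented for illustration and are not study data; the function name `performance` is hypothetical.

```python
from statistics import mean


def performance(score):
    """Convert a raw score in [-1, 1] into 1 - |score|, where 1 is best."""
    if not -1 <= score <= 1:
        raise ValueError("score must lie in [-1, 1]")
    return 1 - abs(score)


# Hypothetical paired raw scores for the same four patients.
clinician_scores = [-0.5, 0.0, 0.25, -0.25]
model_scores = [-0.25, 0.0, 0.25, 0.0]

clinician_mean = mean(performance(s) for s in clinician_scores)
model_mean = mean(performance(s) for s in model_scores)
print(clinician_mean, model_mean)  # 0.75 0.875
```

The paired comparison of the absolute scores could then be run with a Wilcoxon signed-rank routine such as `scipy.stats.wilcoxon`; the text does not state which implementation was used for the test.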

      Results

      Predictive and recommendation performance

      During the development phase of the model, the mean performance score for clinical doses was 0.83 (95% CI 0.82 to 0.84), and for model recommendations was 0.87 (95% CI 0.86 to 0.88; P = 2.44 e-10).
      During validation, the mean score for prescriptions was calculated to be 0.84 (95% CI 0.82 to 0.86), and for the model's recommendations 0.89 (95% CI 0.88 to 0.90; P = 3.81 e-05).

      Score and dosage analysis

      To further understand the performance of the model and of the clinical prescriptions, the model's and the clinicians' score distributions were compared graphically in the test set of the development database and in the validation database (Figure 1).
      Figure 1. Performance during development and validation. Clinical and model scores during development in panel A, and during validation in panel B. The development data include the test results of the five cross validations.
      The model's score approached 0 (the best possible dose) more often than the clinicians' score; when it did not, the model tended to suggest a dose higher than the one favoured by clinicians. In 57.4% of cases in the test set and in 68.8% in the validation database, the dose rank was not modified in relation to the clinician-prescribed dose.
      How the dosage was changed from clinician prescription to model recommendation was further analysed in relation to the real outcome in number of MII (Figure 2).
      Figure 2. Dose ranks prescribed per range of metaphase II (MII) retrieved. Panel A: by the clinicians during development; panel B: by the model during development; panel C: by the clinicians during validation; and panel D: by the model during validation. The development data include the test results of the five cross validations.
      The model tends to increase the dose for patients with low and sub-optimal oocyte retrievals, but also increases dosage for some of the hyper-responders.

      Discussion

      Several FSH dosing models have been proposed and have provided promising results. Yet, some of them have not been tested by RCT; those that have (
      • Olivennes F.
      • Trew G.
      • Borini A.
      • Broekmans F.
      • Arriagada P.
      • Warne D.W.
      • Howles C.M.
      Randomized, controlled, open-label, non-inferiority study of the CONSORT algorithm for individualized dosing of follitropin alfa.
      ;
      • Allegra A.
      • Marino A.
      • Volpes A.
      • Coffaro F.
      • Scaglione P.
      • Gullo S.
      • La Marca A.
      A randomized controlled trial investigating the use of a predictive nomogram for the selection of the FSH starting dose in IVF/ICSI cycles.
      ;
      • Nyboe Andersen A.
      • Nelson S.M.
      • Fauser B.C.J.M.
      • García-Velasco J.A.
      • Klein B.M.
      • Arce J.C.
      Individualized versus conventional ovarian stimulation for in vitro fertilization: a multicenter, randomized, controlled, assessor-blinded, phase 3 noninferiority trial.
      ), however, have not been developed for use on all types of patients. The inclusion of only normo-ovulatory patients (
      • Howles C.M.
      • Saunders H.
      • Alam V.
      • Engrand P.
      Predictive factors and a corresponding treatment algorithm for controlled ovarian stimulation in patients treated with recombinant human follicle stimulating hormone (follitropin alfa) during assisted reproduction technology (ART) procedures. An analysis.
      ), or patients younger than 40 years with regular cycles (
      • La Marca A.
      • Papaleo E.
      • Grisendi V.
      • Argento C.
      • Giulini S.
      • Volpe A.
      Development of a nomogram based on markers of ovarian reserve for the individualisation of the follicle-stimulating hormone starting dose in in vitro fertilisation cycles.
      ;
      • Nyboe Andersen A.
      • Nelson S.M.
      • Fauser B.C.J.M.
      • García-Velasco J.A.
      • Klein B.M.
      • Arce J.C.
      Individualized versus conventional ovarian stimulation for in vitro fertilization: a multicenter, randomized, controlled, assessor-blinded, phase 3 noninferiority trial.
      ) restricts this new personalization of the first FSH dose to a small subset of patients. In this subset, however, this personalized dose finding is not as critical as for the excluded patients. As the model presented in this study includes every type of patient, the results are enhanced for all of them.
      In addition, age, AFC, AMH, BMI and the presence of previous successful pregnancies, used as variables in the core model, have been shown to be good predictors of the dose-response function slope. This value has already been used as ovarian sensitivity (oocytes recovered per unit of starting FSH) in the development of a nomogram tested by RCT (
      • La Marca A.
      • Papaleo E.
      • Grisendi V.
      • Argento C.
      • Giulini S.
      • Volpe A.
      Development of a nomogram based on markers of ovarian reserve for the individualisation of the follicle-stimulating hormone starting dose in in vitro fertilisation cycles.
      ). Its use as the objective variable of the core model mitigates the confounding effect present in any non-randomized treatment database, an effect that could lead a direct model (with oocyte number as the objective variable) to determine, for example, that higher doses, which are often prescribed for low-responders, lead to smaller oocyte yields. Because, in the present study, the treatment is tailored to the patient by the clinician, it is especially important to account for this effect. Removing it also allows the recommender model to be extended to all types of patients without confusing the model; on the contrary, the core model learns that extreme patients (low-responders and hyper-responders) have extreme ovarian potential values.
      Additionally, the model constructed around a linear approximation of the dose-response function enables the final user to select the number of MII desired to be retrieved, and then obtain the corresponding FSH dose recommendation. This opens the use of this model to all kinds of situations, not just those in which 12 MII are the desired result, as in the present study.
      As a separate contribution to an inclusive recommender model, we have developed a way to test in silico whether the model would improve results compared with historical data, as a step preceding an RCT. To this end, the performance score was designed to encode and automate faithfully an expert clinical assessment of treatment-recommendation-outcome combinations. In other words, it enables us to assert if a recommended dose could fare better than the one already prescribed, given the real result in retrieved MII. In this way, it is possible to estimate reliably whether the model has the possibility to improve current clinical practice. With this information, the investment in a well-designed RCT can be made more confidently. Additionally, results of the in-silico performance of the model are more informative than the sole prediction scores of the core recommender model.
      The scores of the present model were consistently better than those of clinical practice, both in the development and validation databases. This is of interest as the model holds its value even though the population of the validation database is significantly older. Therefore, it means that the core model has learnt the important aspects of the relationship between the patient's characteristics and her ovarian potential or slope in the dose-response function. It is worth noting that the most significant predicted improvement was for the patients whose oocyte yield was low or sub-optimal, in which doses are increased on average. Upon implementation, the system's recommendations may improve the average results and most probably avoid some cycle cancellations owing to lack of embryos for transfer.
      Detailed analysis of the behaviour of the model revealed its tendency, when incorrect, to overdose some patients. This contrasts with clinical practice, in which the tendency is to underdose when the prescription is inadequate. These instances of overestimation by the model correspond mainly to hyper-responder patient profiles, which are under-represented in our databases. As such, the algorithm could not learn appropriately owing to the lack of a sufficient sample size. Importantly, although the model does tend to overdose these patients, it still recommends the same or lower doses than the clinician in most of these cases, i.e. the clinician also tends to overdose. Nonetheless, we cannot dismiss the possibility that this could lead to a small increase in the risk of OHSS. This contrasts with previously published results in which RCT-tested models reduced the risk of OHSS (
      • Olivennes F.
      • Trew G.
      • Borini A.
      • Broekmans F.
      • Arriagada P.
      • Warne D.W.
      • Howles C.M.
      Randomized, controlled, open-label, non-inferiority study of the CONSORT algorithm for individualized dosing of follitropin alfa.
      ;
      • Allegra A.
      • Marino A.
      • Volpes A.
      • Coffaro F.
      • Scaglione P.
      • Gullo S.
      • La Marca A.
      A randomized controlled trial investigating the use of a predictive nomogram for the selection of the FSH starting dose in IVF/ICSI cycles.
      ;
      • Nyboe Andersen A.
      • Nelson S.M.
      • Fauser B.C.J.M.
      • García-Velasco J.A.
      • Klein B.M.
      • Arce J.C.
      Individualized versus conventional ovarian stimulation for in vitro fertilization: a multicenter, randomized, controlled, assessor-blinded, phase 3 noninferiority trial.
      ). Secondary results of these studies, however, failed to show an increase in either retrieved oocytes or pregnancy results, with one reporting a reduction in oocyte yield (
      • Olivennes F.
      • Trew G.
      • Borini A.
      • Broekmans F.
      • Arriagada P.
      • Warne D.W.
      • Howles C.M.
      Randomized, controlled, open-label, non-inferiority study of the CONSORT algorithm for individualized dosing of follitropin alfa.
      ). Although the risk of OHSS must be taken seriously, it is also true that it can be managed within a cycle with proper prevention, such as gonadotrophin releasing hormone agonist trigger. All things considered, perhaps a manageable risk for a small portion of patients could be a fair trade-off to avoid a lack of embryos suitable for transfer for other patients.
      Further analysis of the instances in which the model made a suboptimal suggestion led to another conclusion. Instances in which the model had negative error scores frequently coincided with negative error scores for the clinician's prescription. Analysis of these cases in more detail produced a profile of patients with good markers and an unexplained low retrieval of oocytes. This could be related to undiagnosed genetic polymorphisms in the FSHR or LHB genes (Lledo et al., 2014), which neither the clinicians nor the model could detect.
      Despite the possible limitations of the system, it is encouraging that the preliminary results show that, in most cases, the model's recommendation achieves a performance score similar to or better than that of the dose prescribed by the clinician.
      In conclusion, clinicians prescribe the first FSH dose for each patient based on the patient's characteristics and reserve markers, and on their own experience with similar cases. Although most of the time they prescribe the dose necessary for an optimal result, the outcome can sometimes vary unexpectedly and fall into suboptimal or extreme ranges. Our model could avoid most of these deviations by analysing the patient's profile and making suggestions for the medical professional to assess.
      Once it has been tested and its performance confirmed in an RCT, the machine learning model that we have developed could be used as a training and learning tool for new clinicians and as quality control for experienced ones; it could also provide a second opinion, as the information could be useful in peer-to-peer case discussions.

      Acknowledgements

      We would like to thank Dr Maria Jesús López for kindly lending us her time and her expertise on ovarian stimulation protocols.

      Funding

      This work was supported by Doctorat Industrial funded by Generalitat de Catalunya [DI-2019-24], by project CI-SUSTAIN funded by the Spanish Ministry of Science and Innovation [PID2019-104156GB-I00], by EUROVA Innovative Training Network (MSCA-ITN-2019-860960), and by intramural funding by Clínica Eugin-Eugin Group.

      References

        • Allegra A.
        • Marino A.
        • Volpes A.
        • Coffaro F.
        • Scaglione P.
        • Gullo S.
        • La Marca A.
        A randomized controlled trial investigating the use of a predictive nomogram for the selection of the FSH starting dose in IVF/ICSI cycles.
        Reproductive BioMedicine Online. 2017; 34: 429-438 https://doi.org/10.1016/j.rbmo.2017.01.012
        • Bastu E.
        • Celik C.
        • Keskin G.
        • Buyru F.
        Evaluation of embryo transfer time (day 2 vs day 3) after imposed single embryo transfer legislation: When to transfer?
        Journal of Obstetrics and Gynaecology. 2013; 33: 387-390 https://doi.org/10.3109/01443615.2012.761186
        • Chamayou S.
        • Sicali M.
        • Alecci C.
        • Ragolia C.
        • Liprino A.
        • Nibali D.
        • Guglielmino A.
        The accumulation of vitrified oocytes is a strategy to increase the number of euploid available blastocysts for transfer after preimplantation genetic testing.
        Journal of Assisted Reproduction and Genetics. 2017; 34: 479-486 https://doi.org/10.1007/s10815-016-0868-0
        • Chen Y.
        • Wang Q.
        • Zhang Y.
        • Han X.
        • Li D.
        • Zhang C.
        Cumulative live birth and surplus embryo incidence after frozen-thaw cycles in PCOS: how many oocytes do we need?
        Journal of Assisted Reproduction and Genetics. 2017; 34: 1153-1159 https://doi.org/10.1007/s10815-017-0959-6
        • De Geyter C.
        • Calhaz-Jorge C.
        • Kupka M.S.
        • Wyns C.
        • Mocanu E.
        • Motrenko T.
        • Baranowski R.
        ART in Europe, 2014: Results generated from European registries by ESHRE.
        Human Reproduction. 2018; 33: 1586-1601 https://doi.org/10.1093/humrep/dey242
        • Drakopoulos P.
        • Blockeel C.
        • Stoop D.
        • Camus M.
        • De Vos M.
        • Tournaye H.
        • Polyzos N.P.
        Conventional ovarian stimulation and single embryo transfer for IVF/ICSI. How many oocytes do we need to maximize cumulative live birth rates after utilization of all fresh and frozen embryos?
        Human Reproduction. 2016; 31: 370-376 https://doi.org/10.1093/humrep/dev316
        • Ebid A.H.I.M.
        • Motaleb S.M.A.
        • Mostafa M.I.
        • Soliman M.M.A.
        Novel nomogram-based integrated gonadotropin therapy individualization in in vitro fertilization/intracytoplasmic sperm injection: A modeling approach.
        Clinical and Experimental Reproductive Medicine. 2021; 48: 163-173 https://doi.org/10.5653/cerm.2020.03909
        • Esteves S.C.
        • Carvalho J.F.
        • Bento F.C.
        • Santos J.
        A novel predictive model to estimate the number of mature oocytes required for obtaining at least one euploid blastocyst for transfer in couples undergoing in vitro fertilization/intracytoplasmic sperm injection: The ART calculator.
        Frontiers in Endocrinology. 2019; 10: 1-14 https://doi.org/10.3389/fendo.2019.00099
        • Fleming R.
        • Deshpande N.
        • Traynor I.
        • Yates R.W.S.
        Dynamics of FSH-induced follicular growth in subfertile women: Relationship with age, insulin resistance, oocyte yield and anti-Mullerian hormone.
        Human Reproduction. 2006; 21: 1436-1441 https://doi.org/10.1093/humrep/dei499
        • Harrison R.F.
        • Jacob S.
        • Spillane H.
        • Mallon E.
        • Hennelly B.
        A prospective randomized clinical trial of differing starter doses of recombinant follicle-stimulating hormone (follitropin-β) for first time in vitro fertilization and intracytoplasmic sperm injection treatment cycles.
        Fertility and Sterility. 2001; 75: 23-31 https://doi.org/10.1016/S0015-0282(00)01643-5
        • Howles C.M.
        • Saunders H.
        • Alam V.
        • Engrand P.
        Predictive factors and a corresponding treatment algorithm for controlled ovarian stimulation in patients treated with recombinant human follicle stimulating hormone (follitropin alfa) during assisted reproduction technology (ART) procedures. An analysis.
        Current Medical Research and Opinion. 2006; 22: 907-918 https://doi.org/10.1185/030079906X104678
        • Ji J.
        • Liu Y.
        • Tong X.H.
        • Luo L.
        • Ma J.
        • Chen Z.
        The optimum number of oocytes in IVF treatment: An analysis of 2455 cycles in China.
        Human Reproduction. 2013; 28: 2728-2734 https://doi.org/10.1093/humrep/det303
        • La Marca A.
        • Papaleo E.
        • Grisendi V.
        • Argento C.
        • Giulini S.
        • Volpe A.
        Development of a nomogram based on markers of ovarian reserve for the individualisation of the follicle-stimulating hormone starting dose in in vitro fertilisation cycles.
        BJOG: An International Journal of Obstetrics and Gynaecology. 2012; 119: 1171-1179 https://doi.org/10.1111/j.1471-0528.2012.03412.x
        • Lledo B.
        • Ortiz J.A.
        • Llacer J.
        • Bernabeu R.
        Pharmacogenetics of ovarian response.
        Pharmacogenomics. 2014; 15: 885-893 https://doi.org/10.2217/pgs.14.49
        • Maggiulli R.
        • Cimadomo D.
        • Fabozzi G.
        • Papini L.
        • Dovere L.
        • Ubaldi F.M.
        • Rienzi L.
        The effect of ICSI-related procedural timings and operators on the outcome.
        Human Reproduction. 2020; 35: 32-43 https://doi.org/10.1093/humrep/dez234
        • Naether O.G.J.
        • Tandler-Schneider A.
        • Bilger W.
        Individualized recombinant human follicle-stimulating hormone dosing using the CONSORT calculator in assisted reproductive technology: A large, multicenter, observational study of routine clinical practice.
        Drug, Healthcare and Patient Safety. 2015; 7: 69-76 https://doi.org/10.2147/DHPS.S77320
        • Nyboe Andersen A.
        • Nelson S.M.
        • Fauser B.C.J.M.
        • García-Velasco J.A.
        • Klein B.M.
        • Arce J.C.
        Individualized versus conventional ovarian stimulation for in vitro fertilization: a multicenter, randomized, controlled, assessor-blinded, phase 3 noninferiority trial.
        Fertility and Sterility. 2017; 107: 387-396.e4 https://doi.org/10.1016/j.fertnstert.2016.10.033
        • Olivennes F.
        • Trew G.
        • Borini A.
        • Broekmans F.
        • Arriagada P.
        • Warne D.W.
        • Howles C.M.
        Randomized, controlled, open-label, non-inferiority study of the CONSORT algorithm for individualized dosing of follitropin alfa.
        Reproductive BioMedicine Online. 2015; 30: 248-257 https://doi.org/10.1016/j.rbmo.2014.11.013
        • Polyzos N.P.
        • Sunkara S.K.
        Sub-optimal responders following controlled ovarian stimulation: An overlooked group?
        Human Reproduction. 2015; 30: 2005-2008 https://doi.org/10.1093/humrep/dev149
        • Pouly J.L.
        • Olivennes F.
        • Massin N.
        • Celle M.
        • Caizergues N.
        • Contard F.
        Usability and utility of the CONSORT calculator for FSH starting doses: A prospective observational study.
        Reproductive BioMedicine Online. 2015; https://doi.org/10.1016/j.rbmo.2015.06.001
        • Steward R.G.
        • Lan L.
        • Shah A.A.
        • Yeh J.S.
        • Price T.M.
        • Goldfarb J.M.
        • Muasher S.J.
        Oocyte number as a predictor for ovarian hyperstimulation syndrome and live birth: An analysis of 256,381 in vitro fertilization cycles.
        Fertility and Sterility. 2014; 101: 967-973 https://doi.org/10.1016/j.fertnstert.2013.12.026
        • Sunkara S.K.
        • Rittenberg V.
        • Raine-Fenning N.
        • Bhattacharya S.
        • Zamora J.
        • Coomarasamy A.
        Association between the number of eggs and live birth in IVF treatment: An analysis of 400 135 treatment cycles.
        Human Reproduction. 2011; 26: 1768-1774 https://doi.org/10.1093/humrep/der106
        • Vaiarelli A.
        • Cimadomo D.
        • Conforti A.
        • Schimberni M.
        • Giuliani M.
        • D'Alessandro P.
        • Ubaldi F.M.
        Luteal phase after conventional stimulation in the same ovarian cycle might improve the management of poor responder patients fulfilling the Bologna criteria: a case series.
        Fertility and Sterility. 2020; 113: 121-130 https://doi.org/10.1016/j.fertnstert.2019.09.012

      Biography

      Núria Correa is senior clinical embryologist and researcher at the R&D department of the Eugin Group. She is a PhD candidate at the Universitat Autònoma de Barcelona, working on a research project centred on the application of artificial intelligence in assisted reproduction.
      Key message
      A machine learning model was trained to recommend first FSH doses for ovarian stimulation. Compared with clinicians, the model achieved consistently better performance scores. The model could be used as a second opinion and as a learning tool for new clinicians to avoid as many non-optimal outcomes as possible.