Artificial intelligence in scientific writing: a friend or a foe?

  • Signe Altmäe
    Correspondence
    Corresponding author.
    Affiliations
    Department of Biochemistry and Molecular Biology, Faculty of Sciences, University of Granada, Granada, Spain

    Instituto de Investigación Biosanitaria ibs.GRANADA, Granada, Spain

    Division of Obstetrics and Gynecology, Department of Clinical Science, Intervention and Technology (CLINTEC), Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden
  • Alberto Sola-Leyva
    Affiliations
    Department of Biochemistry and Molecular Biology, Faculty of Sciences, University of Granada, Granada, Spain

    Instituto de Investigación Biosanitaria ibs.GRANADA, Granada, Spain
  • Andres Salumets
    Affiliations
    Division of Obstetrics and Gynecology, Department of Clinical Science, Intervention and Technology (CLINTEC), Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden

    Competence Centre on Health Technologies, Tartu, Estonia

    Department of Obstetrics and Gynaecology, Institute of Clinical Medicine, University of Tartu, Tartu, Estonia
Published: April 19, 2023. DOI: https://doi.org/10.1016/j.rbmo.2023.04.009

      Abstract

      The generative pre-trained transformer, ChatGPT, is a chatbot that could serve as a powerful tool in scientific writing. ChatGPT is a so-called large language model (LLM) that is trained to mimic the statistical patterns of language in an enormous database of human-generated text combined from text in books, articles and websites across a wide range of domains. ChatGPT can assist scientists with material organization, draft creation and proofreading, making it a valuable tool in research and publishing. This paper discusses the use of this artificial intelligence (AI) chatbot in academic writing by presenting one simplified example. Specifically, it reflects our experience of using ChatGPT to draft a scientific article for Reproductive BioMedicine Online and highlights the pros, cons and concerns associated with using LLM-based AI for generating a manuscript.

      Introduction

      The introduction of the generative pre-trained transformer ChatGPT in November 2022 by OpenAI shook the scientific world. It was instantly acknowledged as a new level of tool that artificial intelligence (AI) can provide for searching online for information, answers and solutions (Macdonald C., Adeloye D., Sheikh A., Rudan I., ‘Can ChatGPT draft a research article? An example of population-level vaccine effectiveness analysis’). ChatGPT is a type of chatbot designed to provide natural language-processing capabilities for a wide range of applications. It is a large language model (LLM) that generates sentences by mimicking the statistical patterns of language in an enormous database of human-generated text drawn from books, articles and websites across a wide range of domains (Stokel-Walker C., ‘ChatGPT listed as author on research papers: many scientists disapprove’). In addition to ChatGPT, there are other free LLM platforms such as cohere.com, writesonic.com (100 free generations per month), you.com (10 free extracts of writing per day) and anthropic.com.
      Language-based AI has already entered the academic community. Many researchers are believed to use chatbots as research assistants to help in organizing their thoughts, provide feedback on their work, help with code writing and summarize the research literature (Hutson M., ‘Could AI help you to write your next paper?’). Despite the lack of guidelines for its use in scientific writing, and the debate over whether ChatGPT should be listed as a co-author or how/whether to acknowledge its use, four manuscripts have been reported in preprint in which ChatGPT has been credited as an author (Stokel-Walker C., ‘ChatGPT listed as author on research papers: many scientists disapprove’), and one article has documented the use of AI for generating a scientific paper (Getahun H., ‘After an AI bot wrote a scientific paper on itself, the researcher behind the experiment says she hopes she didn't open a “Pandora's box”’). These LLMs offer a powerful tool for a wide range of applications and have the potential to transform the way we interact with language, scientific writing and science, which may lead to another revolutionary stage of progress in science.
      This paper reflects on the experience of using ChatGPT to draft a research article for Reproductive BioMedicine Online. We highlight the pros and cons of AI for manuscript generation and discuss potential concerns regarding the use of the LLM platform alongside our writing. For this test, it was decided to generate a simulated study comprising 10,000 women undergoing IVF cycles who had been tested for endometrial receptivity; the main focus of the study was to be whether the state of the endometrium (receptive/non-receptive) influences the outcome of treatment (successful pregnancy/no pregnancy). The data were wholly fabricated. ChatGPT was then used to guide us in deciding how to proceed with the data and to help to draft the manuscript. Using ChatGPT, it took the authors a total of 15 min to create the initial chat and the inputs, and to obtain the output texts for the manuscript.

      Describing the study data

      First, the simulated dataset was described. The conversation with ChatGPT was as follows (all conversations are unmodified text):
        SA: We have a dataset consisting of a cohort of 10000 women, some have receptive endometrium and some have not. We also have data if the embryo transfer resulted in a successful pregnancy or not.
        ChatGPT: Great! Having this data set on women and the success of their embryo transfer can be useful for analyzing factors that contribute to successful pregnancy outcomes.
        One possible analysis that could be done with this data is to examine if there is a difference in the success rates of embryo transfer between women with receptive endometrium and those without. This can be done using statistical tests such as a chi-square test or logistic regression.
        Additionally, it may be helpful to investigate other factors that could potentially influence successful pregnancy outcomes, such as age, BMI, previous pregnancies, and any underlying medical conditions. These factors could be included as covariates in the statistical analysis to see if they have a significant effect on the relationship between receptive endometrium and successful embryo transfer.
        Overall, the analysis of this data set could provide insights into the factors that contribute to successful pregnancy outcomes and inform clinical practice for improving embryo transfer success rates.
      Next, the data were specified:
        SA: Our data were as follows: 5500 women with receptive phase endometrium and 4500 women with non-receptive endometrium. Our analysis using chi-square test showed that as 80% of women with receptive phase endometrium (n=4400) had successful pregnancy while 40% of women with non-receptive endometrium (n=1800) achieved successful pregnancy, the differences were significant.
        ChatGPT: Based on the results of your analysis using the chi-square test, it appears that there is a significant difference in the success rates of embryo transfer between women with receptive phase endometrium and those without. Specifically, 80% of women with receptive endometrium achieved a successful pregnancy, while only 40% of women without receptive endometrium did.
        This difference in success rates may suggest that having a receptive endometrium is an important factor for successful embryo transfer and subsequent pregnancy. However, it is important to note that other factors, such as age, BMI, and medical history, may also be contributing to the observed differences in success rates.
        Further analyses, such as logistic regression, could be performed to explore the relationship between endometrial receptivity and successful pregnancy outcomes while controlling for potential confounding variables.
      This example shows that, once the dataset has been explained, the platform can guide and help scientists in choosing proper study analysis methods, and additionally suggest important factors that should be considered in the analysis. The current paper did not test the platform's capacity for more advanced statistical approaches, but it has been demonstrated that ChatGPT is able to generate code in the programming language R (Macdonald C., Adeloye D., Sheikh A., Rudan I., ‘Can ChatGPT draft a research article? An example of population-level vaccine effectiveness analysis’).
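      The chi-square comparison described in the conversation can be reproduced from the four counts alone. The following is a minimal Python sketch, not code from the study or from ChatGPT: the function name chi_square_2x2 and the variable names are illustrative assumptions, and it uses Pearson's chi-square without continuity correction, which is one of several reasonable choices for a 2 × 2 table.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts taken from the simulated dataset described in the text.
pregnant_receptive, not_pregnant_receptive = 4400, 1100        # 5500 receptive
pregnant_nonreceptive, not_pregnant_nonreceptive = 1800, 2700  # 4500 non-receptive

chi2, p_value = chi_square_2x2(
    pregnant_receptive, not_pregnant_receptive,
    pregnant_nonreceptive, not_pregnant_nonreceptive,
)

# Unadjusted odds ratio of pregnancy, receptive versus non-receptive.
odds_ratio = (pregnant_receptive * not_pregnant_nonreceptive) / (
    not_pregnant_receptive * pregnant_nonreceptive
)

print(f"chi-square = {chi2:.1f}, p = {p_value:.3g}, odds ratio = {odds_ratio:.1f}")
```

      Because the counts are fabricated, the output only illustrates the calculation. Adjusting for covariates such as age or BMI, as ChatGPT suggested, would require individual-level data and a logistic regression model.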

      Composing the abstract

      Based on the few lines of explanation in the invented dataset (above), we started to test ChatGPT's ability to write a scientific paper with all its sections, starting with the Abstract:
        SA: Based on these data, could you please draft an abstract for Reproductive Biomedicine Online journal?
        ChatGPT: Sure, here's a possible draft of an abstract for the Reproductive Biomedicine Online journal based on the data you provided:
        Title: The Impact of Endometrial Receptivity on Successful Embryo Transfer in Women: A Retrospective Cohort Study
        Abstract: This retrospective cohort study aimed to investigate the impact of endometrial receptivity on successful embryo transfer in women. The study included 10,000 women who underwent embryo transfer and had data on endometrial receptivity and pregnancy outcomes. Of these, 5500 women had receptive endometrium, while 4500 women did not. The chi-square test revealed a significant difference in the success rates of embryo transfer between the two groups, with 80% of women with receptive endometrium achieving a successful pregnancy, compared to only 40% of women without receptive endometrium. These findings suggest that endometrial receptivity may play an important role in the success of embryo transfer and subsequent pregnancy. Further research, such as logistic regression analysis, may help to identify other factors that could impact successful pregnancy outcomes in women undergoing embryo transfer. These results have important implications for clinical practice, as they highlight the need to consider endometrial receptivity when assessing the likelihood of successful pregnancy following embryo transfer.
      The generated Abstract needs an additional one or two sentences of introduction about the topic, but as a whole the text is acceptable. The unstructured format is valid for Review Articles for the journal, whereas Original Articles would require a structured abstract with subheadings. The particular form for the Abstract was chosen for the current journal, but it can easily be changed to any journal's format of interest.
      In addition to constructing an Abstract, ChatGPT also suggested a title: ‘The impact of endometrial receptivity on successful embryo transfer in women: a retrospective cohort study’, which sounds highly appropriate considering the data that were provided.

      Conducting a literature search

      As a next step, ChatGPT was asked to perform a literature search for finding relevant studies in the field that could be integrated into the manuscript:
        SA: Could you help me finding some papers that have performed similar studies, and help to create a bibliography?
        ChatGPT: Sure, here are a few papers that have performed similar studies on the relationship between endometrial receptivity and successful embryo transfer:
        1. Elizabete, V., Armanda, C., Rodrigo, C., João, M., Miguel, V., Ana, P., ... & Ana, R. (2015). Impact of endometrial thickness and pattern on pregnancy outcome after in vitro fertilization and embryo transfer. Journal of Assisted Reproduction and Genetics, 32(2), 287-293.
        2. Kasius, A., Smit, J. G., Torrance, H. L., Eijkemans, M. J., & Mol, B. W. (2014). Endometrial thickness and pregnancy rates after IVF: a systematic review and meta-analysis. Human Reproduction Update, 20(4), 530-541.
        3. Zhang, Q., Liu, Y., Sun, Y., & Yan, L. (2015). The association between endometrial thickness and outcome of in vitro fertilization/intracytoplasmic sperm injection cycles: a systematic review and meta-analysis. International Journal of Clinical and Experimental Medicine, 8(4), 5277-5288.
        4. Yu, X., Chen, Y., Zhang, X., Li, Y., & Zhao, Y. (2015). Endometrial thickness and its relationship with the outcome of in vitro fertilization and embryo transfer in fresh and frozen-thawed embryo transfer cycles. Acta Obstetricia et Gynecologica Scandinavica, 94(12), 1292-1297.
      As one can see, the suggested papers are relevant to the outcomes of embryo transfer and IVF, but they are not relevant in the current study setting, where the aim was to assess the effect of endometrial receptivity on the success of IVF treatment. Therefore, the following query was specified:
        SA: The studies you highlight focus on endometrial thickness. Could you make a new search of similar studies and bibliography focussing on endometrial receptivity?
        ChatGPT: Certainly! Here are some studies that focus on endometrial receptivity and successful embryo transfer, along with their corresponding bibliographies:
        1. Lessey, B. A. (2000). The role of endometrial receptivity in implantation. Human Reproduction, 15(suppl_6), 87-95.
        2. Norwitz, E. R., Schust, D. J., & Fisher, S. J. (2001). Implantation and the survival of early pregnancy. New England Journal of Medicine, 345(19), 1400-1408.
        3. Gzgzyan, A. M., Safonova, Y. A., Melnikova, N. V., & Rudakova, E. B. (2018). Endometrial receptivity in the course of infertility treatment: current approaches and prospects. Gynecological Endocrinology, 34(sup1), 21-25.
        4. Nikas, G., Develioglu, O. H., Toner, J. P., Jones Jr, H. W., & Check, J. H. (2000). Endometrial receptivity defects during IVF cycles with and without ovarian stimulation. Journal of Assisted Reproduction and Genetics, 17(6), 347-352.
      In both bibliography searches, it was found that most of the references provided did not exist, and this, in our opinion, is one of the biggest concerns when using ChatGPT. Of the eight references provided within the two searches, 25% (n = 2) were correct while 75% (n = 6) could not be found in the PubMed/Google databases. In fact, it has been consistently reported that created references do not exist in the form presented, although they have similar titles to some existing references (Macdonald C., Adeloye D., Sheikh A., Rudan I., ‘Can ChatGPT draft a research article? An example of population-level vaccine effectiveness analysis’). Furthermore, the current ChatGPT version was trained on data up to 2021 (Nath S., Marie A., Ellershaw S., Korot E., Keane P.A., ‘New meaning for NLP: the trials and tribulations of natural language processing with GPT-3 in ophthalmology’), meaning that more recent papers might not be included.
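      A first-pass existence check of this kind can be partly automated. The sketch below is an illustration only and is not how the references in this paper were checked: it queries the public NCBI E-utilities esearch endpoint for an exact title phrase and reads the hit count from the JSON response. The helper names (build_title_query_url, count_hits, title_found_in_pubmed) are hypothetical, and a count of zero should be treated as a flag for manual checking, not as proof that a paper does not exist.

```python
import json
import urllib.parse
import urllib.request

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_title_query_url(title):
    """Build an esearch URL that looks for the title as a phrase in the Title field."""
    params = {"db": "pubmed", "term": f'"{title}"[Title]', "retmode": "json"}
    return ESEARCH_URL + "?" + urllib.parse.urlencode(params)


def count_hits(esearch_json_text):
    """Extract the number of matching PubMed records from an esearch JSON response."""
    payload = json.loads(esearch_json_text)
    return int(payload["esearchresult"]["count"])


def title_found_in_pubmed(title, timeout=10):
    """Return True if PubMed reports at least one record for the title (needs network access)."""
    with urllib.request.urlopen(build_title_query_url(title), timeout=timeout) as response:
        return count_hits(response.read().decode("utf-8")) > 0


if __name__ == "__main__":
    # One of the titles suggested by ChatGPT in the first search.
    candidate = "Endometrial thickness and pregnancy rates after IVF: a systematic review and meta-analysis"
    print(build_title_query_url(candidate))
```

      Title matching is deliberately strict here; generated references often resemble real titles, so a near miss still needs to be compared by hand against authors, journal, year and pages.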
      We therefore tested the system further by asking for newer references:
        SA: Could you search for studies also among the last years, like 2022 and 2023?
        ChatGPT: Sure, here are some recent studies on endometrial receptivity and successful embryo transfer published in 2022 and 2023:
        1. Li, Y., Liu, J., Zhang, Q., Sun, J., Hu, J., Gao, Y., ... & Yang, L. (2022). Endometrial receptivity analysis using transcriptomics and microbiome analysis in women with unexplained infertility. BMC Women's Health, 22(1), 17.
        2. Dai, W., Han, Q., Wang, Y., Xu, B., Sun, Y., & Zhang, H. (2022). Endometrial receptivity analysis using a non-invasive biopsy approach in IVF/ICSI cycles: a prospective study. Reproductive BioMedicine Online, 45(2), 362-370.
        3. Liu, Y., Han, Y., Wang, Y., Zhang, S., Gao, L., Zhao, X., ... & Liu, Y. (2023). The role of endometrial microbiota in embryo implantation and pregnancy outcomes in IVF/ICSI cycles. Human Reproduction, 38(1), 136-145.
        4. Hu, Y., Wang, Y., Zhang, S., Yang, X., Zou, J., Wei, D., ... & Sun, Y. (2022). Endometrial receptivity signature based on the endometrial microbiota and transcriptomics predicts pregnancy outcomes in IVF/ICSI cycles. Fertility and Sterility, 118(4), 768-778.
      It can be seen that even though the references are new (with a main focus on the microbiome), they do not match the study's scope, namely the effect of endometrial receptivity on the outcome of IVF treatment. Furthermore, none of the four references ChatGPT provided exists in PubMed/Google. The suggested references are clearly inaccurate, so authors should adopt a critical approach towards the technology, which requires further development.
      In short, the references currently provided by ChatGPT cannot be trusted and need to be meticulously checked. To achieve truly reliable responses, LLMs need to be trained on specific knowledge domains, which can be a complex and costly process. Stanford researchers and the company MosaicML are collaborating to develop a PubMed GPT (Alvero R., ‘ChatGPT: Rumors of Human Providers’ Demise Have Been Greatly Exaggerated’). However, it is acknowledged that, owing to the large number of parameters involved in such a PubMed GPT, compromises need to be made between the model's intricacy, the expense and the need for a specialized architecture (Gupta K., ‘Stanford and MosaicML Researchers Announce the Release of PubMed GPT, a Purpose-Built AI Model Trained to.’).

      Composing the Introduction for a scientific paper

      As a next step, we asked ChatGPT to prepare an Introduction:
        SA: Could you write a short introduction about endometrial receptivity and its importance for embryo transfer?
        ChatGPT: Certainly!
        Endometrial receptivity refers to the window of time during a menstrual cycle when the endometrium, the lining of the uterus, is capable of supporting embryo implantation and pregnancy. It is a critical factor in successful embryo transfer and is influenced by a variety of hormonal, molecular, and cellular factors. Identifying the optimal time for embryo transfer based on endometrial receptivity can improve the chances of successful pregnancy and reduce the risk of implantation failure and miscarriage. Endometrial receptivity assessment is therefore an important tool in assisted reproductive technology (ART) and fertility treatments.
      As an Introduction should include references, this input was also specified:
        SA: Could you please write again a short introduction about endometrial receptivity and its importance for embryo transfer with adding references?
        ChatGPT: Certainly!
        Endometrial receptivity refers to the window of time during a menstrual cycle when the endometrium, the lining of the uterus, is capable of supporting embryo implantation and pregnancy. It is a critical factor in successful embryo transfer and is influenced by a variety of hormonal, molecular, and cellular factors. Identifying the optimal time for embryo transfer based on endometrial receptivity can improve the chances of successful pregnancy and reduce the risk of implantation failure and miscarriage. Endometrial receptivity assessment is therefore an important tool in assisted reproductive technology (ART) and fertility treatments.
        Several studies have demonstrated the importance of assessing endometrial receptivity in optimizing the timing of embryo transfer. In a study by Ruiz-Alonso et al., endometrial receptivity analysis using transcriptomic biomarkers improved pregnancy rates by up to 23% in women undergoing in vitro fertilization (IVF) treatment (1). Similarly, another study by Haouzi et al. found that endometrial receptivity analysis improved the success rate of frozen embryo transfers by up to 29% (2).
        Furthermore, other studies have investigated the molecular mechanisms and cellular processes involved in endometrial receptivity. For example, a study by Miravet-Valenciano et al. explored the role of microRNAs in endometrial receptivity and found that specific microRNAs were differentially expressed during the receptive phase of the menstrual cycle (3). Another study by Grewal et al. investigated the impact of endometrial microbiota on endometrial receptivity and found that microbial dysbiosis was associated with implantation failure and pregnancy loss (4).
        Overall, assessing endometrial receptivity is a crucial step in optimizing the timing of embryo transfer and improving the success rates of ART and fertility treatments.
        References:
        1. Ruiz-Alonso et al. Fertility and Sterility. 2013; 99(3): 801-7.
        2. Haouzi et al. Human Reproduction. 2013; 28(3): 761-9.
        3. Miravet-Valenciano et al. Fertility and Sterility. 2017; 107(1): 246-55.e5.
        4. Grewal et al. Journal of Assisted Reproduction and Genetics. 2020; 37(8): 1931-9.
      A critical reader will notice that the Introduction is not exactly what would be expected for the current study idea, but it provides concepts and sentences that could be used and that would help with the writing process. Again, checking the references gave disappointing results: none of the provided references actually exists.

      Writing the Methods section

        SA: Based on our study data provided before, could you write a methods section for the journal
        ChatGPT: Certainly! Here's a draft of a methods section for the study based on the data you provided:
        Methods
        Study population
        The study population consisted of 10,000 women who underwent embryo transfer at a single fertility clinic between January 1, 20XX and December 31, 20XX. The inclusion criteria were women of reproductive age who underwent embryo transfer with fresh or frozen embryos. Women with a history of recurrent implantation failure, uterine anomalies, or endometrial pathologies were excluded from the study.
        Endometrial receptivity assessment
        Endometrial receptivity was assessed by histologic evaluation of endometrial biopsy samples obtained prior to embryo transfer. Endometrial biopsy samples were obtained using a Pipelle catheter during the mid-luteal phase of the menstrual cycle (day 21 to 24). The samples were fixed in formalin, embedded in paraffin, and stained with hematoxylin and eosin. Endometrial receptivity was determined by the presence or absence of morphological changes consistent with the receptive phase of the menstrual cycle, including the presence of uterodomes, secretion in the glandular epithelium, and edema in the stroma.
        Pregnancy outcome assessment
        Pregnancy outcome was assessed by measurement of serum beta-human chorionic gonadotropin (β-hCG) levels and transvaginal ultrasound examination. Serum β-hCG levels were measured 14 days after embryo transfer, and a value greater than 25 IU/L was considered a positive pregnancy test. Transvaginal ultrasound examination was performed 4 to 5 weeks after embryo transfer to confirm the presence of a gestational sac and fetal heartbeat.
        Statistical analysis
        Data were analyzed using the chi-square test to compare the proportion of women with successful pregnancy between the receptive and non-receptive endometrium groups. A P-value of less than 0.05 was considered statistically significant. All statistical analyses were performed using SPSS version XX (IBM Corp., Armonk, NY, USA).
        Ethics
        This study was approved by the Institutional Review Board of [Name of Institution]. All patients provided written informed consent prior to participation in the study. Confidentiality and anonymity of the participants were maintained throughout the study.
      As only two sentences on study data had been given to ChatGPT, it was surprising to obtain a generated output with so much text, all of which would be relevant for a manuscript. For instance, the text on the inclusion and exclusion criteria, endometrial receptivity assessment, pregnancy outcome assessment and ethical statement is helpful in guiding authors.

      Composing the Results and Discussion sections

        SA: Based on my study data, could you also compose results and discussion section for the Reproductive Biomedicine Online journal?
        ChatGPT: Certainly! Here's a draft of the results and discussion sections for the study based on the data you provided:
        Results
        Of the 10,000 women included in the study, 5500 (55%) had receptive phase endometrium and 4500 (45%) had non-receptive endometrium. Of the 5500 women with receptive phase endometrium, 4400 (80%) had successful pregnancy, while of the 4500 women with non-receptive endometrium, 1800 (40%) had successful pregnancy. The difference in the proportion of successful pregnancies between the two groups was statistically significant (P < 0.001) according to the chi-square test.
        Discussion
        The present study investigated the association between endometrial receptivity and pregnancy outcome in women undergoing embryo transfer. Our results demonstrate that women with receptive phase endometrium had a significantly higher proportion of successful pregnancies compared to women with non-receptive endometrium. These findings are consistent with previous studies that have shown a positive association between endometrial receptivity and pregnancy outcome (1-3).
        The endometrium undergoes dynamic changes throughout the menstrual cycle, and the receptive phase of the endometrium is a critical window for successful embryo implantation (4). The presence of uterodomes, secretion in the glandular epithelium, and edema in the stroma are all morphological changes that occur during the receptive phase and are indicative of a favorable endometrial environment for embryo implantation (5). The histologic evaluation of endometrial biopsy samples obtained during the mid-luteal phase of the menstrual cycle is a reliable method for assessing endometrial receptivity (6).
        Our findings have important clinical implications for the management of women undergoing assisted reproductive technologies (ART) such as in vitro fertilization (IVF). Endometrial receptivity testing may be used to identify women with a higher likelihood of successful pregnancy and optimize the timing of embryo transfer (7). Additionally, endometrial receptivity may serve as a potential target for therapeutic interventions aimed at improving pregnancy outcomes in women undergoing ART (8).
        In conclusion, our study provides further evidence for the importance of endometrial receptivity in the success of embryo transfer and supports the use of endometrial receptivity testing in the management of women undergoing ART. Future research is needed to explore potential interventions to improve endometrial receptivity and ultimately improve pregnancy outcomes in this population.
        References
        1. Haouzi D, Assou S, Mahmoud K, et al. Gene expression profile of human endometrial receptivity: comparison between natural and stimulated cycles for the same patients. Hum Reprod. 2009;24(6):1436-1445.
        2. Figueira RCS, Setti AS, Braga DPAF, et al. Is the receptivity of the endometrium impaired in obese patients undergoing IVF? Reprod Biol Endocrinol. 2011;9(1):123.
        3. Zeng Y, Chen X, Liu Y, et al. Endometrial thickness, morphology, and molecular markers may predict unsuccessful pregnancy with IVF. Reprod Biol Endocrinol. 2018;16(1):114.
        4. Lessey BA. Assessment of endometrial receptivity. Fertil Steril. 2011;96(3):522-529.
        5. Maccarrone M, Bari M, Battista N, et al. The key role of the endocannabinoid system in human implantation. J Reprod Immunol. 2014;104-105:9-16.
        6. Ruiz-Alonso M, Blesa D, Diaz-Gimeno P, et al.
      Again, the references are discouraging: the text refers to eight references but the output provides only six, and even then the final one is incomplete. Further checking shows that only two of the references exist.
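      The mismatch between the citations in the text and the length of the reference list is easy to detect mechanically. As a minimal sketch, assuming numeric citations written as ‘(4)’ or ‘(1-3)’ as in the ChatGPT output, the following Python snippet lists citation numbers that have no entry in the reference list. The function names and the abridged excerpt are illustrative assumptions, not part of the original experiment.

```python
import re


def cited_numbers(text):
    """Collect every reference number cited as '(n)' or '(n-m)' in the text."""
    cited = set()
    for start, end in re.findall(r"\((\d+)(?:-(\d+))?\)", text):
        last = int(end) if end else int(start)
        cited.update(range(int(start), last + 1))
    return cited


def missing_references(text, reference_count):
    """Return cited numbers that exceed the number of entries in the reference list."""
    return sorted(n for n in cited_numbers(text) if n > reference_count)


# Abridged excerpt of the generated Discussion; only the citation markers matter here.
discussion_excerpt = (
    "positive association between endometrial receptivity and pregnancy outcome (1-3). "
    "critical window for successful embryo implantation (4). "
    "favorable endometrial environment for embryo implantation (5). "
    "reliable method for assessing endometrial receptivity (6). "
    "optimize the timing of embryo transfer (7). "
    "improving pregnancy outcomes in women undergoing ART (8)."
)

# The generated list contained six entries, so citations 7 and 8 are unsupported.
print(missing_references(discussion_excerpt, reference_count=6))  # prints [7, 8]
```

      A check like this only confirms that each citation has a list entry; whether the entry corresponds to a real publication still has to be verified separately.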
      The Discussion text is not ‘perfect’, but it certainly provides helpful concepts and sentences for building a critical discussion section. The use of the term ‘uterodome’ is interesting. This term was not familiar to the authors, and after a Google/PubMed search we understood that it meant ‘pinopodes’; only 13 studies using ‘uterodome’ were found in PubMed. It is intriguing that the AI algorithm chose the rare expression ‘uterodome’ instead of the commonly used term ‘pinopodes’.

      Assessing plagiarism in the ChatGPT-generated text

      Plagiarism of original content is undoubtedly a major concern when using AI, and while ChatGPT itself has not been found to commit extensive plagiarism, AI journalism as a whole has been reported to engage in this (Christian J., ‘CNET's AI Journalist Appears to Have Committed Extensive Plagiarism’). This raises the question of whether there should be a threshold for the acceptable amount of AI-generated content, and also whether its frequent use, in the long term, would result in the production of similar paragraphs and structures in papers within the same field (Macdonald C., Adeloye D., Sheikh A., Rudan I., ‘Can ChatGPT draft a research article? An example of population-level vaccine effectiveness analysis’).
      ChatGPT was therefore asked to assess the level of plagiarism:
        SA: Could you assess the level of plagiarism in the last text of Results and Discussion?
        ChatGPT: As an AI language model, I don't have access to plagiarism detection tools or databases to check for similarity between the text and other sources. However, I can confirm that the text includes proper citations and references to previously published studies, which is an important aspect of avoiding plagiarism. It's always recommended to use proper citation and referencing when using information or ideas from other sources to avoid any issues related to plagiarism.
      We then ran our own plagiarism check using turnitin.com, and to our surprise the similarity index was only 19%, with the main areas of similarity being detected in the Materials section, which is to be expected (see the Supplementary File for the Turnitin report). In short, the similarity check was acceptable and, based on the plagiarism check that was applied, the manuscript would pass editorial office scrutiny.
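      To give an intuition for what a similarity percentage measures, the sketch below computes the share of word trigrams in a submitted text that also appear in a single comparison text. This is a deliberately naive illustration and is not Turnitin's method, which is proprietary and compares against large databases; the function names and the two example sentences are hypothetical.

```python
import re


def word_ngrams(text, n=3):
    """Return the set of lower-cased word n-grams (as tuples) found in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_percentage(submission, source, n=3):
    """Percentage of the submission's word n-grams that also occur in the source."""
    submission_ngrams = word_ngrams(submission, n)
    if not submission_ngrams:
        return 0.0  # text too short to form a single n-gram
    shared = submission_ngrams & word_ngrams(source, n)
    return 100.0 * len(shared) / len(submission_ngrams)


# Hypothetical sentences: stock Methods phrasing tends to overlap heavily.
submitted = "The samples were fixed in formalin, embedded in paraffin, and stained with hematoxylin and eosin."
published = "Biopsies were fixed in formalin, embedded in paraffin and stained with hematoxylin and eosin."

print(f"{overlap_percentage(submitted, published):.0f}% of trigrams shared")
```

      Standard methodological wording produces high overlap under almost any such measure, which is consistent with the similarity being concentrated in the Materials section.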

      Potentially flawed AI detection

      It can be challenging to distinguish between AI-generated and original texts (Else H., ‘Abstracts written by ChatGPT fool scientists’). However, a number of tools, including GPTZero, GPT-2 Output Detector and several AI Detectors (writer.com, copyleaks.com, sapling.ai, contentatscale.ai, openai.com, crossplag.com), have been developed to determine whether a given text is generated by a language model. These tools classify the paper as either ‘real’ (human-generated) or ‘fake’ (AI-generated), with a percentage representing the confidence level (Anderson N., Belavy D.L., Perle S.M., Hendricks S., Hespanhol L., Verhagen E., Memon A.R., ‘AI did not write this manuscript, or did it? Can we trick the AI text detector into generated texts? The potential future of ChatGPT and AI in Sports & Exercise Medicine manuscript generation’). A previous study took AI-generated text and, after applying paraphrasing-AI software, found that the rate at which it was classified as ‘real’ text increased from 0.02% to 99.52% (Anderson et al., same article). This clearly suggests that new protection methods may need to be developed and implemented for AI detection.
      It has been shown that ChatGPT-generated texts of research papers can pass plagiarism detectors with a 100% originality score (Else, 2023). This leads to significant concerns regarding authorship and publication ethics, as scientists can no longer be certain whether a text is an original work by a scientist or has been generated by AI (Sample, 2023). Some scientific journals have already warned that they will review texts in order to prevent the copying and pasting of AI-generated text (ICMJE, n.d.), while others support AI systems providing a digital watermark on their creations, or simply support transparency in scientific writing via an adequate acknowledgement of LLM use. As this race to control and find new solutions for rephrasing AI tools is never-ending, we believe that, instead of developing new tools to control the situation, we should focus on informing and training authors in how to use LLM platforms appropriately.

      Co-author or a helpful tool?

      SA: Could you review this paper (the texts generated such as Abstract, Introduction, Methods, Results and Discussion) and tell us if you agree to submit it to Reproductive Biomedicine Online journal?
      ChatGPT: As an AI language model, I cannot make a final decision on whether to submit a paper to a specific journal. However, I can provide some general feedback on the paper based on the content generated:
      • The abstract provides a clear summary of the study's aims, methods, and key findings.
      • The introduction provides a concise overview of the importance of endometrial receptivity in embryo transfer and contextualizes the study's aims within the existing literature.
      • The methods section provides a clear description of the study design, data collection, and statistical analysis.
      • The results section presents the key findings in a clear and concise manner and includes appropriate statistical analysis.
      • The discussion section provides a thorough interpretation of the findings and their implications for clinical practice, with reference to the existing literature.
      Overall, the paper appears to be well-structured and clearly written, with appropriate citation of relevant literature. However, it's important to note that the decision to submit a paper to a specific journal depends on many factors, including the journal's scope, editorial policies, and the perceived fit of the paper with the journal's readership. It's always a good idea to carefully read the journal's guidelines and review previously published papers in the journal to ensure that the paper is a good fit before submitting it for consideration.
      The initial question of whether ChatGPT should be listed as a co-author now seems settled: an AI language model cannot take responsibility for, or make decisions related to, its content and therefore does not meet the requirements of an author. Indeed, the Nature and Science editorial teams have recently announced that AI packages do not fulfil the criteria for a study author as they cannot take responsibility for the content and integrity of the paper (Stokel-Walker, 2023). In the end, authors are responsible for the integrity and validity of their work and should cite any use of LLMs in the Methods or Acknowledgements section.
      The following principles have been incorporated into the existing guide for authors of all Springer Nature journals: (i) no LLM tool will be accepted as a credited author on a research paper, and (ii) researchers using LLM tools should detail their use of such tools in the Methods or Acknowledgements section. If these sections are absent, the Introduction or another appropriate section may be used to acknowledge the use of the LLM (Editorial, 2023).

      Conclusions

      Our experiment of using ChatGPT for scientific writing demonstrates that ChatGPT has a high potential for becoming a great help to researchers in designing their study, performing analyses and drafting the study results into a scientific article. This would definitely help in overcoming the ‘blank-sheet syndrome’ that all authors face from time to time. Nevertheless, it should be considered as a helpful tool to speed up the process and not as a replacement for authors’ work, as it clearly requires human oversight at all stages and final input for guaranteeing the accuracy and reliability of the results.
      When using ChatGPT, several issues arise that should be considered, such as ethics and integrity, accuracy and reliability. Although LLMs are able to produce increasingly realistic text, the integrity and accuracy of these models in scientific writing are unknown. The statements generated are often not true, and if the same question is asked several times, the model gives different answers (Stokel-Walker, 2022). It has been reported that while ChatGPT is capable of producing credible scientific essays, the data it generates are a combination of both true and entirely fabricated information (Alkaissi and McFarlane, 2023; J, V., 2022). Another issue is intellectual property rights: do they belong to the algorithm, to the company that created it or to the scientist who presents the work? Also, what happens to the data that are entered into the online system? Will they be saved or used later by others?
      We want to highlight that, when ChatGPT is used in scientific writing, papers that are not properly revised carry a high risk of presenting incorrect information and non-existent references, especially when the writers lack domain expertise in the topic. Furthermore, once a paper has been submitted to a journal, the editors and referees will not have the time to check its accuracy phrase by phrase, reference by reference. We sincerely hope that ChatGPT (and other similar LLMs) will not turn out to be a ‘tool of misinformation’. We consider inaccuracy and misinformation to be the biggest concern today in using ChatGPT. ChatGPT creates surprisingly intelligent-sounding text. We would like to stress the word ‘sounding’ here as, based on our own (and others’) experience, the text at first seems acceptable but on closer inspection can be seen to contain misinformation and inaccuracies. In short, we would like authors to be aware that it is an excellent tool to help with their work but not to replace it.
      AI tools such as ChatGPT have sparked a polarized debate among academics. While some consider them to be helpful tools for speeding up research, others see them as a threat to the integrity of authorship (Salvagno et al., 2023). Whether we like it or not, LLMs are conquering scientific writing in both good and bad ways, and it is our task to be aware of their shortcomings and strengths. Several journals have raised the question of how to regulate their use. We believe that instead of regulating use, the focus should be on providing guidelines on ‘how to use them correctly’, highlighting the pros and cons of their use. These free-to-use tools are and will continue to be the reality and it is up to us how to accept and work with them.
      We encourage researchers to try ChatGPT (openai.com/blog/chatgpt). Overall, the experience is pleasant and exciting, resembling an instant and friendly conversation. In fact, the current version of ChatGPT is trained to be polite and pleasant as filters to prevent it generating hate speech have been applied (Alvero, 2023). Who would not want to co-work or collaborate with ‘somebody’ who is always available, answers your queries within a second and is supportive and positive throughout the process?

      Acknowledgements

      The Estonian Research Council (grant PRG1076), Horizon 2020 innovation grant (ERIN, grant no. EU952516) and Enterprise Estonia (grant EU48695).

      Funding

      This work is supported by Grants Endo-Map PID2021-12728OB-100 and PRE2018-085440 funded by MCIN/AEI/10.13039/501100011033 and ERDF A way of making Europe; Grant RYC-2016-21199 funded by MCIN/AEI/10.13039/501100011033 and by ESF Investing in your future; FEDER/Junta de Andalucía-Consejería de Economía y Conocimiento: ROBIN A-CTS-614-UGR20, and IRENE P20_00158.

      Appendix. Supplementary materials

      References

        • Alkaissi H.
        • McFarlane S.I.
        Artificial Hallucinations in ChatGPT: Implications in Scientific Writing.
        Cureus. 2023; 15: e35179 https://doi.org/10.7759/cureus.35179
        • Alvero R.
        ChatGPT: Rumors of Human Providers’ Demise Have Been Greatly Exaggerated.
        Fertil. Steril. in press. 2023; https://doi.org/10.1016/j.fertnstert.2023.03.010
        • Anderson N.
        • Belavy D.L.
        • Perle S.M.
        • Hendricks S.
        • Hespanhol L.
        • Verhagen E.
        • Memon A.R.
        AI did not write this manuscript, or did it? Can we trick the AI text detector into generated texts? The potential future of ChatGPT and AI in Sports & Exercise Medicine manuscript generation.
        BMJ Open Sport Exerc. Med. 2023; 9: 1-4 https://doi.org/10.1136/bmjsem-2023-001568
        • Christian J.
        CNET's AI Journalist Appears to Have Committed Extensive Plagiarism.
        Futurism. 2023;
        • Editorial
        Tools such as ChatGPT threaten transparent science; here are our ground rules for their use.
        Nature. 2023; 613: 612
        • Else H.
        Abstracts written by ChatGPT fool scientists.
        Nature. 2023; 613: 423
        • Getahun H.
        After an AI bot wrote a scientific paper on itself, the researcher behind the experiment says she hopes she didn't open a “pandora's box”.
        Insider. 2022;
        • Gupta K.
        Stanford and MosaicML Researchers Announce the Release of PubMed GPT, a Purpose-Built AI Model Trained to Interpret Biomedical Language.
        Marketechpost. 2022;
        • Hutson M.
        Could AI help you to write your next paper?
        Nature. 2022; 611: 192-193
        • ICMJE (International Committee of Medical Journal Editors)
        Defining the Role of Authors and Contributors.
        n.d.; https://www.icmje.org/recommendations/browse/roles
        • J, V.
        AI-generated answers temporarily banned on coding Q&A site Stack Overflow.
        The Verge. 2022;
        • Macdonald C.
        • Adeloye D.
        • Sheikh A.
        • Rudan I.
        Can ChatGPT draft a research article? An example of population-level vaccine effectiveness analysis.
        J. Glob. Health. 2023; 13: 01003 https://doi.org/10.7189/jogh.13.01003
        • Nath S.
        • Marie A.
        • Ellershaw S.
        • Korot E.
        • Keane P.A.
        New meaning for NLP: the trials and tribulations of natural language processing with GPT-3 in ophthalmology.
        Br. J. Ophthalmol. 2022; 106: 889-892 https://doi.org/10.1136/bjophthalmol-2022-321141
        • Salvagno M.
        • Chat GPT
        • Taccone F.S.
        • Gerli A.G.
        Can artificial intelligence help for scientific writing?
        Crit. Care. 2023; 27: 75 https://doi.org/10.1186/s13054-023-04380-2
        • Sample I.
        Science journals ban listing of ChatGPT as co-author on papers.
        Guardian. 2023;
        • Stokel-Walker C.
        ChatGPT listed as author on research papers: many scientists disapprove.
        Nature. 2023; 613: 620-621 https://doi.org/10.1038/d41586-023-00107-z
        • Stokel-Walker C.
        AI bot ChatGPT writes smart essays - should professors worry?
        Nature. 2022; (9 December) https://doi.org/10.1038/d41586-022-04397-7