Bias and unclear outcomes in clinical trials of diabetic retinopathy: a cross-sectional analysis of literature

BACKGROUND: Clinical trials are carefully designed studies that aim to answer real-world questions. However, they sometimes present missing, dubious, or unclear outcomes that are difficult to apply in practice. OBJECTIVE: To evaluate how, and how often, outcomes in randomized clinical trials (RCTs) of interventions for diabetic retinopathy (DR) are presented unclearly to readers, and to analyze how these dubious presentations can lead to misinterpretation, why they occur, and how they can be remedied. METHODS AND MATERIALS: We searched PubMed for RCTs of DR interventions published over the past five years. RESULTS: Seventy RCTs were included, 27 in peripheral diabetic retinopathy (PDR) and 43 in diabetic macular edema (DME). In the DME group, 25.6% had reporting and publication bias, 34.9% had subjective outcomes, 44.1% did not present baseline data, and 51.2% underreported adverse events. In the PDR group, 29.6% had reporting and publication bias, 44.4% had subjective outcomes, 14.8% did not present baseline data, and 62.9% underreported adverse events. CONCLUSION: In addition to outcome bias, we found other unclear forms of outcome presentation in RCTs on DR. Most resulted from non-compliance with CONSORT. Readers must be alert to recognize them and understand how they can influence the interpretation of the data.


Introduction
The number of people worldwide affected by diabetic retinopathy (DR) was predicted to increase from 2.6 million in 2015 to 3.2 million in 2020. 1 As the number of cases grows, so will the demand for more effective treatments from decision-makers, who look for answers in randomized controlled trials (RCTs). The RCT is the appropriate study design for assessing the impact of an intervention. 2 The global number of registered clinical trials increased fivefold between 2004 and 2013. 3 The RCT occupies one of the highest levels of the evidence pyramid, immediately below the systematic review. 4 It is the most important study design for answering real-world questions.
However, clinical trials take a long time and require large teams and high costs 5 , which can lead authors to cut corners and introduce bias, such as inadequate randomization, allocation errors, failure to mask patients and outcome assessors, and incomplete or selective outcome reporting, among other systematic errors 6 that can mislead the reader when evaluating outcomes. 7 Beyond bias, there are other dubious forms of outcome reporting that can also lead the decision-maker to make mistakes. 8 This article aims to present some biased and unclear ways of presenting outcomes in RCTs on DR, to alert readers to how and why they arise, how they can influence the interpretation of outcomes, and how they can be avoided in the future.

Methods
This is a cross-sectional analysis of the literature conducted at the Department of Ophthalmology, Universidade Federal de São Paulo (UNIFESP), São Paulo, Brazil. We searched PubMed for clinical trials using the strategy shown in Table 1, filtered to the past five years, with no restrictions on language or journal impact factor. The search was run on 08 April 2020 and returned 288 articles. Two authors independently selected only RCTs of interventions in DR. Duplicate papers were removed.
We considered only parallel-design RCTs in humans of any sex and age, with any length of follow-up, in which the intervention group received any treatment for diabetic retinopathy (surgery, laser, or intravitreal injection) and the control group received any intervention or no intervention.
Two independent authors screened articles by title and abstract using the web application Rayyan 9 according to the inclusion criteria, and discrepancies were resolved by consensus. The selected papers were read in full, and primary and secondary outcomes and adverse events were extracted and classified according to the criteria proposed by Heneghan et al. 8 , as described below.
Poorly chosen outcomes, or study design flaws, comprise surrogate, composite, and subjective outcomes, complex scales, and lack of relevance to patients and decision-makers.
Surrogate outcomes: Substitute markers that are strongly associated with the outcome of genuine interest and are used to infer it. They are usually cheaper and quicker to observe than the clinical result of real interest. 10
Composite outcomes: Combined outcome measures in which several signs, symptoms, or causes are grouped into a single result. 11 Their use reduces the required sample size, because it increases the chance that some outcome is reached, and consequently lowers the cost of the study.
Subjective outcomes: Occur when the result depends on the observer's judgment because there is no objective measure, or when the result is self-reported by the subject. 12
Complex scales: Combinations of symptoms and signs forming a scale that is defined by the author and not standardized. 13
Lack of relevance to patients and decision-makers: Results that serve no practical purpose for either the patient or the decision-maker, that is, outcomes with little or no real-world applicability. 14
Poorly collected outcomes are failures of method and comprise missing data and poorly specified outcomes.
Missing data: The attrition bias of incomplete outcome data, considered as a loss of data relative to the baseline. The '5 and 20 rule' (ie, if >20% of data are missing, the study is highly biased; if <5%, the risk of bias is low) helps the reader judge the extent of the loss. 8,16
Poorly specified outcomes: The protocol and the methods must make clear how the problem is defined, how the evaluation is technically performed, and how the outcomes will be handled, because poorly specified outcomes can cause confusion. Self-reported measures are particularly prone to bias because of their subjectivity. 12,15
Selectively reported outcomes, or problems related to publication, comprise publication bias, reporting bias, and underreporting of adverse events.
Publication bias: A true bias. It is directly related to registering the protocol, which sets out the study design in detail, before participant enrolment begins. 16
Reporting bias: A true bias, that is, a systematic error, and as such it distorts the results of clinical trials. Selective reporting occurs when a study is published but some of the measured and analyzed results are not reported. It is directly related to publication of the protocol, which serves as the reference for the research's original interest regardless of the results obtained. 16
Underreporting of adverse events: The absence of any mention of adverse events or side effects, whether mild, moderate, or severe, and whether beneficial or not. 8
Inappropriately interpreted outcomes comprise relative measures, spin, multiplicity, and core outcome sets.
Relative measures: Outcomes should always be presented in both relative and absolute terms, as required by CONSORT. 18 This allows the results to be recalculated, which lends credibility to the study; allows the data to be used in meta-analyses by other authors; and prevents effects from being overvalued, as can happen when only relative measures are presented.
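To illustrate why a strictly relative presentation can overvalue an effect, the following is a minimal sketch of the standard absolute and relative effect measures for a binary outcome in a two-arm trial. The function name and the event counts in the usage note are hypothetical and are not taken from any trial in this sample.

```python
def effect_measures(events_treated, n_treated, events_control, n_control):
    """Absolute and relative effect measures for a binary outcome.

    All inputs are counts; the function and argument names are illustrative.
    """
    if n_treated <= 0 or n_control <= 0:
        raise ValueError("group sizes must be positive")
    if events_control == 0:
        raise ValueError("relative risk is undefined with zero control events")

    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control

    arr = risk_control - risk_treated      # absolute risk reduction
    rr = risk_treated / risk_control       # relative risk
    rrr = 1 - rr                           # relative risk reduction
    # Number needed to treat; infinite when there is no absolute difference.
    nnt = 1 / arr if arr != 0 else float("inf")

    return {"arr": arr, "rr": rr, "rrr": rrr, "nnt": nnt}


# Hypothetical counts: both trials halve the risk in relative terms.
common_event = effect_measures(10, 100, 20, 100)
rare_event = effect_measures(1, 1000, 2, 1000)
```

With these hypothetical counts, both trials show the same relative risk reduction of one half, but the absolute risk reduction is about 10 percentage points in the first and about 0.1 percentage points in the second, so the number needed to treat differs by roughly a factor of 100. A reader given only the relative figure could not tell the two apart.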
Spin: Language in the abstract that overstates the findings relative to the data obtained, or that diverts the reader by presenting secondary results or subgroup analyses instead of the primary result, shifting attention to another study objective and away from a statistically non-significant result.
Multiplicity: The reporting of many results, particularly when the outcome is evaluated at several time points. 8
Core outcome sets: Concern the multiple outcomes that may take the central focus of the research, that is, the set of outcomes of real interest to the patient that must be evaluated. As with lack of relevance, we consider the set of outcomes 8 and the need for interpretation.
The results were compiled in Table 2 and are presented as a descriptive analysis.

DME outcomes
Among the 43 studies addressing diabetic macular edema, no surrogate outcomes were found. Three (6.9%) used composite outcomes and 15 (34.9%) used subjective outcomes (nine because the unit of analysis was the individual and six because the unit of analysis was uncertain). Two (4.7%) used a complex scale, and three (6.9%) reported outcomes lacking relevance to patients and decision-makers. One (2.3%) trial had missing data, and in 19 (44.1%) this could not be determined because the baseline was not presented, which characterizes poorly specified data. Eleven (25.6%) had publication bias, that is, lacked timely protocol registration (two of them were registered late, after the collection of results had begun); the same trials were counted as having reporting bias, because its assessment depends on timely protocol registration, and the remaining trials showed no selective reporting of data. Twenty-two (51.2%) underreported adverse events, and the same number presented only relative measures; four (9.3%) showed spin, six (13.9%) showed multiplicity, and none showed core outcome set inconsistencies.

PDR outcomes
Among the 27 clinical trials of peripheral diabetic retinopathy, none presented surrogate outcomes. One (3.7%) used a composite outcome, 12 (44.4%) used subjective outcomes, six (22.2%) used complex scales, and three (11.1%) reported outcomes lacking relevance. One (3.7%) trial had missing data, and in four (14.8%) this could not be determined because the baseline was not presented. Fourteen (51.9%) had poorly specified outcomes. Eight (29.6%) had publication bias (one of them with late protocol registration), and the same trials were counted as having reporting bias, which depends directly on protocol registration. Seventeen (62.9%) underreported adverse events, 14 (51.9%) presented only relative measures, four (14.8%) showed spin, four (14.8%) showed multiplicity, and none showed core outcome set inconsistencies.

Discussion
The primary purpose of all medical science is to safeguard human quality of life. For RCTs to fulfill this function, their results must be presented in a way that is relevant, appropriate, and meaningful to patients in the real world. In a clinical trial, bias is a dangerous error, intentional or not, in collecting, analyzing, interpreting, publishing, or reviewing data that can lead to conclusions that differ from the truth.
According to Heneghan et al. 8 , there are dubious ways of reporting outcomes in RCTs that, like bias, can lead to misinterpretation.
In the present study, we observed several of these unclear ways of presenting outcomes. We separated the analysis into central diabetic retinopathy (diabetic macular edema, DME) and peripheral diabetic retinopathy (outside the major vascular arcades) because some treatments and outcomes are assessed differently.
In this sample, we found no surrogate outcomes in either the macular or the peripheral retinal studies, probably because there are well-established methods for evaluating well-known outcomes in diabetic retinopathy. 18 When surrogate outcomes are used, caution is required in interpretation, because they provide less direct evidence than the clinically relevant outcomes. 19
Three (6.9%) and one (3.7%) composite outcomes were found in DME and peripheral DR, respectively. An advantage of composite outcomes is increased statistical efficiency, which makes the study shorter and cheaper 11 ; however, they can lead to exaggerated treatment estimates. 13 The practice is highly prevalent in cardiology studies. Composite outcomes should be suspected in conditions with multiple clinical signs and symptoms. Their use requires data imputation, for which there is still no reliable rule and which therefore tends to introduce errors.
In this sample, we found subjective outcomes in 15 (34.9%) DME trials and 12 (44.4%) peripheral DR trials. Most occurred because the individual was taken as the unit of analysis. Subjective outcomes are usually overvalued and tend to give positive results, and papers with positive results are more likely to be published. 12,15,21,22
A complex scale is another unclear form of outcome presentation, in which a non-standardized scale is built from multiple signs and symptoms. In this sample, we observed two (4.7%) such trials in DME and six (22.2%) in peripheral DR. Studies without validated scales usually give more favorable results, which are more likely to be published. 15 Creating new scales to classify diabetic retinopathy is not justified, since the existing classification is well established and fulfills its purpose of guiding treatment and determining prognosis.
A remark on ophthalmological studies is appropriate here. Visual acuity (VA) can be measured with different standardized scales; Snellen and logMAR are the two most common in scientific publications. 23 The choice of scale cannot be classified as a distorted presentation of outcomes, since both scales are standardized, well known, and readily converted into each other, but the recommendation to use the logarithmic scale as the ideal form for publications persists (24). In the DME sample, two (4.7%) studies assessed VA by Snellen, 38 (88.3%) by logMAR, and three (7%) did not assess VA. In the peripheral DR sample, one (3.7%) trial measured VA by Snellen and 16 (59.3%) by logMAR.
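The conversion between the two scales mentioned above follows from the definition of logMAR as the base-10 logarithm of the minimum angle of resolution, which is the inverse of the Snellen fraction. A minimal sketch, with illustrative function names, might look like this. It covers only chart-based acuities and makes no assumption about off-chart categories such as counting fingers or hand motion.

```python
import math


def snellen_to_logmar(numerator, denominator):
    """Convert a Snellen fraction (for example 20/40 or 6/12) to logMAR."""
    if numerator <= 0 or denominator <= 0:
        raise ValueError("Snellen numerator and denominator must be positive")
    # logMAR = log10(MAR), where MAR = denominator / numerator.
    return math.log10(denominator / numerator)


def logmar_to_snellen_denominator(logmar, numerator=20):
    """Return the Snellen denominator for a given logMAR and test distance."""
    return numerator * 10 ** logmar
```

For example, 20/20 corresponds to logMAR 0.0, 20/200 to logMAR 1.0, and 20/40 to approximately 0.30, so a study that reports Snellen values can still be pooled with logMAR studies once converted.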
Lack of relevance to patients and decision-makers occurs when the outcome is, for example, a laboratory marker with no clinical correlation with outcomes that matter to patients or to real-world decisions, although every study has its importance. In our sample, we identified lack of relevance in six trials, three in each subgroup, such as the measurement of serum markers, which, although essential for research, are of little practical interest to the patient.
VA, quality of life (QoL), and comfort without pain are what matter most to the patient in ophthalmological studies. Only four trials (5.7%) assessed QoL, which is very low, but the most surprising finding is that 13 (18.6%) studies did not measure VA. The ultimate aim of all science is the human being, and what matters to the individual is well-being, so VA and QoL are among the most important outcomes in ophthalmological studies. We cannot forget this.
Missing data were found in one (2.3%) trial in the DME sample, and in 19 (44.1%) this could not be determined because the baseline was absent or inadequate, giving 20 (46.5%) trials with poorly specified data. In the peripheral DR sample, one (3.7%) trial had missing data, and in four (14.8%) this could not be determined because the baseline was not presented. Missing data are considered a bias, 16 a systematic error that affects all outcomes. Data loss during a study is frequent, particularly in long studies; if >20% of data are missing, the study is highly biased, and if <5%, the risk of bias is low. Comparison of the outcome with the baseline is decisive in demonstrating the study's result; therefore, omitting the baseline raises suspicion. 24
Poorly specified outcomes are another methodological distortion and arise from an inadequate description of the outcome, the intervention, the evaluation technique, and how the study methods and protocol will be handled. We found 16 (37.2%) in the DME sample and 14 (51.9%) in the peripheral DR sample.
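The 5 and 20 rule for missing data cited above can be expressed as a simple check on the proportion of randomized participants lost to analysis. The sketch below is illustrative only: the function name is hypothetical, and the label for the 5% to 20% range is an assumption, since the rule as stated defines only the two extremes.

```python
def attrition_risk(n_randomized, n_analyzed):
    """Classify attrition risk of bias using the 5 and 20 rule.

    Returns "high" above 20% missing and "low" below 5% missing.
    The "intermediate" label for values in between is an assumption;
    the rule itself does not name that range.
    """
    if n_randomized <= 0:
        raise ValueError("n_randomized must be positive")
    if not 0 <= n_analyzed <= n_randomized:
        raise ValueError("n_analyzed must be between 0 and n_randomized")

    missing_fraction = (n_randomized - n_analyzed) / n_randomized

    if missing_fraction > 0.20:
        return "high"
    if missing_fraction < 0.05:
        return "low"
    return "intermediate"
```

The check needs both the number randomized and the number analyzed, which is why a trial that does not present its baseline cannot be classified at all.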
It is worth remembering that, in ophthalmological studies, the individual is an appropriate unit of analysis only for economic analyses, systemic effects, adverse events, and QoL. For all other outcomes, the unit of analysis must be the eye, and it must always be clearly defined in the methods. 25 In the DME sample, two trials used both units of analysis in the same study, seven used the individual, and two left the unit undetermined. In the peripheral DR sample, six trials presented outcomes with the individual as the unit of analysis, and another six did not make the unit clear, requiring interpretation by the reader.
Publication bias is a true bias related to failure to publish the study protocol. It was found in 11 (25.6%) DME trials and eight (29.6%) peripheral DR trials in our sample. The purpose of protocol registration is to show that the author's original plan was not altered after the results were collected. 26 Failure to register a protocol is not justified, as registration is easy and free of cost.
Reporting bias is also a true bias. It is the publication of only selected outcomes rather than all prespecified outcomes. It can be detected only when the protocol has been published, so it cannot be assessed without protocol registration. It occurred in 11 (25.6%) and eight (29.6%) trials in DME and peripheral DR, respectively. Comparing the reported outcomes with the original research plan is decisive in showing that the results are presented without hiding unfavorable findings; therefore, presenting the protocol gives the publication credibility. 26 In our sample, two PDR trials and four DME trials were registered on a platform other than clinicaltrials.gov, but all of the registries are public. Two protocols in DME trials and one in a peripheral DR trial were registered after results were available. Several years have passed since the lack of registration was first criticized, and the problem persists. 27
The underreporting of adverse events observed in this work, 55.7%, is surprising. In DME, 22 trials (51.2%) underreported adverse events: 17 did not mention having evaluated any adverse events (11 of these evaluated the use of VEGF), and five assessed strictly ocular events. In PDR, 17 trials (62.9%) underreported adverse events: 15 did not mention having evaluated any adverse events (four of these evaluated VEGF), and two assessed strictly ocular events. Studies have shown that journal publications underestimate side effects and exaggerate treatment benefits when assessing the risk-benefit ratio. 27,28
In the interpretation block, 22 (51.2%) DME trials and 14 (51.8%) PDR trials presented results only in relative terms, which can exaggerate their interpretation. As indicated by CONSORT (17), results must be presented in both absolute and relative terms to make them clear. In addition, 25 (35.7%) trials were reported by their authors to have an insufficient sample size, a problem that systematic reviews could help to solve. For that to be possible, trials need to present outcomes in both absolute and relative terms to facilitate the reviewer's work.
In the spin evaluation, there were no misleading titles in our sample, but four (9.3%) macular trials and four (14.8%) peripheral retinal trials showed spin, all related to loss of focus on the primary outcome through the presentation of subgroup analyses and secondary outcomes. Spin shifts the reader's focus to a study objective other than the primary outcome, drawing attention away from a statistically non-significant result. 29
All trials without protocol registration may have incurred a multiplicity error, since without access to the original proposal the author is free to enrich the study with additional analyses. In addition to the lack of registration, we found six (13.9%) macular and four (14.8%) non-central retinal trials with misleading graphs without scales and unnecessary photographs that draw focus away. The risk of multiplicity is that the greater the number of outcomes, the greater the chance of false-positive results and unfounded claims of effectiveness. 30
Core outcome sets, as proposed by COMET 31 , bring together relevant resources to facilitate the development of a proposal with a limited number of outcomes, so that the reader does not get lost. In our sample, we did not observe problems with core outcome sets. Twenty-four trials had limited follow-up, and 25 had an insufficient sample size as reported by the author. Systematic reviews can help with sample size by pooling trials with the same PICO (Population, Intervention, Comparison, Outcome), but the analytical technique must be uniform for the trials to be combined in a single meta-analysis.
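The multiplicity risk described above, that more outcomes mean a greater chance of a false-positive result, can be made concrete with the family-wise error rate. The sketch below assumes that the outcomes are tested independently at the same significance level, which is a simplification, since outcomes within one trial are usually correlated. The function names are illustrative, and the Bonferroni threshold is shown only as one common correction, not as the method used by any trial in this sample.

```python
def familywise_error_rate(n_outcomes, alpha=0.05):
    """Probability of at least one false positive among independent tests."""
    if n_outcomes < 1:
        raise ValueError("n_outcomes must be at least 1")
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    # Each test avoids a false positive with probability (1 - alpha).
    return 1 - (1 - alpha) ** n_outcomes


def bonferroni_alpha(n_outcomes, alpha=0.05):
    """Per-test significance threshold under the Bonferroni correction."""
    if n_outcomes < 1:
        raise ValueError("n_outcomes must be at least 1")
    return alpha / n_outcomes
```

Under these assumptions, a trial testing 10 outcomes at a 0.05 threshold has roughly a 40% chance of at least one spurious significant finding, which is why a prespecified primary outcome in a registered protocol matters.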
This study aimed to show how outcome bias and other dubious presentations of outcomes can appear in RCTs on DR and to alert readers to read articles more critically. Many of these biases have already been reported in the literature 32 and could be corrected through strict compliance with CONSORT. 17 We observed a persistent lack of standardization in outcome assessment, which makes systematic reviews difficult to carry out.
One limitation of this paper is that we did not restrict the RCTs analyzed by journal impact factor. On the other hand, our objective was not to evaluate the quality of publications but rather the quality of the RCTs in the literature that guide decision-makers' conduct. The impact factor is another metric that readers should view with great care, because it is multifactorial and can sway inattentive readers; this would be a topic for future research. The message that remains is to be very careful when reading scientific articles.

Conclusion
This study showed that outcome bias and other unclear forms of outcome presentation are frequent in RCTs on DR, and that most would not exist under strict compliance with CONSORT. Given the risk of misunderstanding that these errors create, readers must be alert so as not to be misled when making decisions.