Systematic review and meta‐analysis on the agreement of non‐cycloplegic and cycloplegic refraction in children

Abstract

Objective
To determine the diagnostic agreement of non‐cycloplegic and cycloplegic refraction in children.

Method
The study methodology followed Preferred Reporting Items for Systematic Reviews and Meta‐Analyses (PRISMA) guidelines. Electronic databases were searched for comparative studies exploring refraction performed on children under non‐cycloplegic and cycloplegic conditions. There was no restriction on the year of publication; however, only publications in the English language were eligible. Inclusion criteria consisted of children aged ≤12 years, any degree or type of refractive error, either sex and no ocular or binocular co‐morbidities. The QUADAS‐2 tool was used to evaluate the risk of bias. Meta‐analysis was conducted to synthesise data from all included studies. Subgroup and sensitivity analyses were undertaken for those studies with a risk of bias.

Results
Ten studies consisting of 2724 participants were eligible and included in the meta‐analysis. The test for overall effect was not significant when comparing non‐cycloplegic Plusoptix and cycloplegic autorefractors (Z = 0.34, p = 0.74). The pooled mean difference (MD) was −0.08 D (95% CI −0.54 D, +0.38 D) with a prediction interval of −1.72 D to +1.56 D. At less than 0.25 D, this indicates marginal overestimation of myopia and underestimation of hyperopia under non‐cycloplegic conditions. When comparing non‐cycloplegic autorefraction with a Retinomax and Canon autorefractor to cycloplegic refraction, a significant difference was found (Z = 9.79, p < 0.001) and (Z = 4.61, p < 0.001), respectively.

Discussion
Non‐cycloplegic Plusoptix is the most useful autorefractor for estimating refractive error in young children with low to moderate levels of hyperopia. Results also suggest that cycloplegic refraction must remain the test of choice when measuring refractive error ≤12 years of age. There were insufficient data to explore possible reasons for heterogeneity. Further research is needed to investigate the agreement between non‐cycloplegic and cycloplegic refraction in relation to the type and level of refractive error at different ages.


INTRODUCTION
Cyclopentolate hydrochloride is a synthetic antimuscarinic cycloplegic agent. 1 It is the first choice for providing excellent short-term paralysis of accommodation. 2,3 Cycloplegic refraction (CR) is an effective way of reducing fluctuations in accommodation or spasm of the ciliary muscle. [4][5][6] The temporary paralysis of accommodation is useful when refracting young children as their accommodative system is vigorous, leading to inaccuracies in non-cycloplegic refraction (NCR) 7 with overestimation of myopia and underestimation of hyperopia. 8 Cyclopentolate hydrochloride has several ocular side effects, including irritation, lacrimation, allergic blepharoconjunctivitis, conjunctival hyperaemia and systemic side effects such as drowsiness, disorientation, incoherent speech and visual hallucinations. 9
From a clinical perspective, CR is considered the gold standard for measuring refractive errors due to high accuracy. 10 However, CR is an invasive procedure involving the use of eye drops which many patients find uncomfortable, potentially causing distress amongst younger children. 11 Many parents and children refuse cycloplegia due to the stinging sensation experienced on insertion of the drop, the resulting blurred vision and other side effects such as light sensitivity. [11][12][13][14] Therefore, the use of these diagnostic drops could deter parents and children from attending an eye examination. 15 The cycloplegic effects begin between 25 and 75 min after the administration of the drug with recovery up to 24 h later which may prove a deterrent for the patient [16][17][18][19] and has significant cost implications for the UK National Health Service (NHS) with finite resources. 20 The cost-effectiveness of cycloplegic use was explored in 2005, and in a cohort of 78 children, a median total cost of 2.08 Pounds Sterling per patient for this agent was reported. 21
Despite the side effects, time and cost implications, the Royal College of Ophthalmologists (UK) recommends that children under the age of 12 years require CR. 22 This is particularly true when hyperopia is suspected or when a binocular vision anomaly is present and the full hyperopic prescription needs to be known. 22 This is also the case when prescribing the least minus prescription to a myopic child, as over-prescribing can drive accommodation and axial length growth resulting in myopic progression. 23 In addition, the College of Optometrists (UK) advises optometrists to consider performing cycloplegic examination when refracting young children (without age specification) to obtain an accurate refraction and the best possible view of the fundus. 24
Research suggests that there might be differences in the diagnostic agreement of NCR and CR based on the type of refractive error, 8 the level of refractive error, 11,13 the patient's age 10 or the method of refraction. 25,26 Older patients are less likely to show significant differences between non-cycloplegic (NC) and cycloplegic (C) refractive error measurements. 27,28 Previous work has explored NC and C retinoscopy and autorefraction and found more myopic/less hyperopic measurements during NC assessments, resulting in underestimation of hyperopia and overestimation of myopia in children. 11,13,26 Therefore, it is important to understand whether NCR and CR are comparable in some clinical scenarios. Evidence suggests that the NCR with the Plusoptix autorefractor (plusoptix.com) has high sensitivity in the detection of myopia and astigmatism 29,30 and shows good agreement with C retinoscopy, the latter being conducted by a paediatric ophthalmologist. 31
Uncorrected refractive error is the leading cause of moderate to severe visual impairment and the second most common cause of blindness. 32 In addition, a refractive error that remains uncorrected or not appropriately corrected can lead to manifest strabismus and anisometropia, which are common risk factors for amblyopia, and have the potential to result in the permanent loss of binocular function. [33][34][35] Accurate measurement of refractive error is essential to ensure that children achieve optimal visual acuity and binocular status. Despite ample research investigating NCR versus CR, the inclusion criteria, results and quality of research are variable. While comparisons have been made between cycloplegic and other refraction methods in children, there are no clear indications of the need for cycloplegia in children, based on age, type or level of refractive error. Therefore, more guidance is required on the need for CR in infants and children and whether NC options are appropriate in some scenarios.
The present review and analysis aims to provide guidance on the use of CR in children ≤12 years old by synthesising data from relevant studies at low risk of bias, and to determine the diagnostic agreement of NCR and CR in children ≤12 years of age.

Key points
• Without cycloplegia, the Plusoptix is the most accurate machine to measure the spectacle prescription in children below 12 years of age.
• However, the Plusoptix cannot be substituted for a measurement which includes the use of cycloplegic drops.
• In future studies, the precision of cycloplegia needs to be examined across different spectacle prescriptions and children's ages.

Eligibility criteria
Comparative studies including participants who had undergone refraction with and without cyclopentolate hydrochloride as a cycloplegic agent were included; studies using all other cycloplegic agents were excluded. There was no restriction on the year of publication or location of study; however, only publications in the English language were eligible due to the reviewers' capabilities. Published and unpublished journals (abstracts and dissertations) were searched. The reference lists from relevant retrieved studies were also searched. Citation alerts were used to ensure that the more recently published studies were included. Studies were included if the study participants met the following inclusion criteria: children ≤12 years old of either sex, with any type or level of refractive error and without other ocular or binocular vision anomalies. To avoid duplication of results, studies were excluded if they reported re-analysis or republication from initial data.

Search methods
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed. 36 Electronic searches were conducted using both thesaurus-controlled and free-text terms to increase sensitivity. The following resources were checked on 24 June 2020 to identify whether a systematic review on the same or a similar research question had been proposed or published: PROSPERO, Cochrane Library, National Institute for Health and Care Excellence (NICE) evidence, TRIP, EBSCOhost and OVID online. No current review protocol on NCR and CR exists in PROSPERO (https://www.crd.york.ac.uk/PROSPERO/).

Selection process
Two review authors (SW and MC) independently assessed the titles and abstracts of all investigations retrieved via the electronic searches to exclude irrelevant publications. Studies were marked as 'definitely relevant', 'possibly relevant' or 'definitely not relevant'. Those investigations marked as 'definitely not relevant' by both review authors were excluded. Studies marked as 'definitely relevant' or 'possibly relevant' by both review authors were independently assessed against the inclusion criteria to determine whether they were relevant to this review. This was done by obtaining full copies of the relevant papers. Any disparities at this stage were initially solved through discussion to reach consensus.

Data collection process
The two review authors (SW and MC) independently extracted data from the included studies using a standardised data collection form. If disagreements arose during this process, they were resolved through discussion until consensus was achieved. Data were then entered into Review Manager 5 (RevMan 5). 37

Study risk of bias assessment
The two reviewing authors independently assessed each eligible study for risk of bias and assessed the quality of the body of evidence in this review using the QUADAS-2 tool. 38 This assessment tool has undergone evaluation to ensure validity and usefulness. 38,39 Bias is a systematic error or a deviation from the truth in the results or inferences leading to over or underestimating test accuracy. 39 A checklist approach was used with categories 'yes', 'no' and 'unclear' to assess the quality of each domain (patient selection, index test, reference standard and flow and timing). 40 Any discrepancies were resolved through discussion until consensus had been obtained. The questions used to evaluate the risk of bias can be found in Table A1 in Appendix 1.

Effect measures
The main outcome measure was the agreement between NCR and CR. In addition, the mean difference (MD) with a corresponding 95% confidence interval (CI) was calculated. 41 The 95% limits of agreement (LoA) were not obtained due to limitations with the available data. Having reviewed previous research in this area, we used a value of ±0.85 D which falls outside the published intra-examiner limits of repeatability for C retinoscopy in 4-year-olds, 42,43 and was rounded up to 1.00 D as refractive error is measured in 0.25 D steps. 44 The 95% prediction interval was calculated to quantify the impact of between-study heterogeneity. The pooled estimate of MD and random error was calculated using the DerSimonian and Laird random-effects method. 45 Moreover, the mean difference between NC and C conditions with the Plusoptix autorefractor could not be undertaken, as the Plusoptix has been designed to be used on undilated pupils. 46
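The pooling step described above can be sketched as follows. This is a minimal illustration of the DerSimonian and Laird random-effects estimator, not the RevMan 5 implementation; the function name `dersimonian_laird` and the three example studies are hypothetical and are not data from the included studies.

```python
import math


def dersimonian_laird(effects, std_errors):
    """Pool study-level mean differences (in dioptres) with a random-effects model."""
    k = len(effects)
    # Inverse-variance (fixed-effect) weights
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Method-of-moments estimate of the between-study variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    return pooled, se_pooled, tau2, ci


# Hypothetical mean differences (NC minus C, in D) and their standard errors
md, se, tau2, ci = dersimonian_laird([-0.30, 0.10, -0.50], [0.10, 0.15, 0.20])
print(f"pooled MD = {md:.2f} D, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), tau^2 = {tau2:.3f}")
```

When the studies disagree by more than their standard errors would predict, tau² becomes positive, the weights become more equal and the confidence interval widens.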

Synthesis methods
Statistical analysis and data synthesis were conducted in line with methods described in Chapter 10 of the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy. 40 We included studies that either evaluated only one eye of each participant or both eyes. This was due to the nature of the data available which was averaged for each sample making extraction of individual data unfeasible. However, this is likely to overestimate the mean difference when the analysis is concluded. The eligible studies were tabulated according to the number and age of participants, range of refractive error explored and the type of refraction undertaken.
Heterogeneity was evaluated using Cochran's Q test and the I² index. 47 To explore between-study variance, tau-squared (τ²) was calculated. 48 The prediction interval, which is the index of dispersion, was also calculated to examine how widely the scores varied and the variance of the effect size.
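As a rough illustration of these heterogeneity measures, the sketch below computes I² from Q and its degrees of freedom, and a 95% prediction interval from a pooled estimate. It assumes the commonly used formulas I² = (Q − df)/Q and pooled MD ± t × √(τ² + SE²) with k − 2 degrees of freedom; the function names are hypothetical, the t quantile is passed in by hand, and the prediction-interval inputs are invented for illustration only.

```python
import math


def i_squared(q, df):
    """Percentage of observed variance attributed to between-study heterogeneity."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def prediction_interval(pooled, se_pooled, tau2, t_crit):
    """95% prediction interval; t_crit is the 0.975 t quantile with k - 2 df."""
    half_width = t_crit * math.sqrt(tau2 + se_pooled ** 2)
    return pooled - half_width, pooled + half_width


# Q and df as reported in the results for NC autorefraction versus C retinoscopy
print(round(i_squared(84.50, 5), 1))  # 94.1

# Hypothetical inputs: 5 studies, so df = 3 and t_crit = 3.182
low, high = prediction_interval(-0.08, 0.30, 0.16, 3.182)
print(f"prediction interval = ({low:.2f} D, {high:.2f} D)")
```

The prediction interval is wider than the confidence interval because it adds the between-study variance τ² to the uncertainty of the pooled mean.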
Sensitivity analyses were undertaken to assess the impact of the risk of bias on test accuracy by repeating the analysis after removing the studies with a high risk of bias. Heterogeneity was explored using subgroup analysis based on the mode of refraction (retinoscopy or autorefractor) and the child's age.

Study selection
Of 150 reports, 131 full text manuscripts were obtained. Ten of these studies met the inclusion criteria (Figure 1). During the study selection process, 119 reports were excluded as they did not meet the review criteria. Those studies that were eligible for this review are shown in Table 1. The risk of bias in the included studies was due to a different criterion being used for selecting children that would have NCR, a refractive threshold introduced for NCR limiting the range of refractive error being explored, lack of masking or an insufficient interval between NCR and CR.

Risk of bias in studies
Unclear concern about applicability was found for only one study due to participant selection and refraction criteria.

Comparison of NC autorefractors and C retinoscopy
The forest plot in Table 3 compares NC autorefraction and C retinoscopy. The overall effect (MD) is −0.55 D with a 95% confidence interval of −1.13 D to +0.04 D (Z = 1.84, p = 0.07). Therefore, on average, refractive error estimation is more myopic under NC than C conditions by ≈0.50 D, although the overall effect did not reach statistical significance. The Q-value is 84.50 with 5 degrees of freedom and p < 0.0001, indicating significant heterogeneity between studies.
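The reported Z and p values can be checked against the mean difference and its confidence interval. The sketch below assumes a normal approximation and a symmetric 95% interval; the function name `z_test_from_ci` is hypothetical.

```python
from statistics import NormalDist


def z_test_from_ci(mean_diff, ci_low, ci_high, level=0.95):
    """Recover the Z statistic and two-sided p value from an MD and its CI."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    se = (ci_high - ci_low) / (2 * z_crit)     # standard error implied by the CI width
    z = abs(mean_diff) / se
    p = 2 * (1 - normal.cdf(z))
    return z, p


# Values reported for NC autorefraction versus C retinoscopy
z, p = z_test_from_ci(-0.55, -1.13, 0.04)
print(f"Z = {z:.2f}, p = {p:.2f}")  # Z = 1.84, p = 0.07
```

Because the interval includes zero, the two-sided p value is above 0.05, consistent with the reported result.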
The true effect size varies between studies. The I² statistic tells us that the proportion of the observed variance

FIGURE 1 Preferred reporting items for systematic reviews and meta-analyses (PRISMA) flow diagram illustrating the screening process. 36

Comparison of NCR and CR in children under and over 5 years of age
In comparison between NCR and CR in the different age subgroups (Table 4), the test for overall effect is significant

Sensitivity analysis
Following the removal of studies with a high risk of bias (Tables 5-7), there was no change in terms of significance when assessing the type of autorefractor under NC and C conditions (Q = 13.14, df = 2, p = 0.001). Neither was there a change when comparing between age subgroups (Q = 0.01, df = 1, p = 0.93). Comparisons between NC and C autorefractors, NC autorefractors and C retinoscopy and NCR and CR in children either under or over 5 years of age are shown in Tables 5, 6 and 7, respectively.

DISCUSSION
This systematic review and meta-analysis aimed to determine the diagnostic agreement between NCR and CR in children. We found the mean effect size for the Righton Retinomax (righton-oph.com) to be −1.17 D. A negative mean effect size indicates that this method provides a higher estimate of myopia under NC conditions. As the 95% confidence intervals for the NC Retinomax fall outside the intra-examiner limits of repeatability for C retinoscopy (>1.00 D), these results suggest that the Retinomax autorefractor is an inaccurate method for measuring refractive error under NC conditions. Similar findings were found with the NC Canon (RK-F1) autorefractor (canon.ca). These findings are in line with previous reports that under NC conditions autorefractors over- and under-estimate myopia and hyperopia, respectively. 5,42,58 The confidence intervals suggest that under NC conditions, both autorefractors are inaccurate methods for measuring refractive error in children under and over 5 years of age and up to 12 years.
Meta-analysis reveals that NC Plusoptix autorefraction showed reasonable agreement with C retinoscopy and the least variability in the refractive error measurements with a MD close to zero. Since the 95% confidence intervals fall within the intra-examiner limits of repeatability for C retinoscopy (<1.00 D), the results suggest that the NC Plusoptix is an accurate method for examining refractive error. Binocular open-field autorefractors were developed to avoid accommodation that a monocular closed-field autorefractor would generate without cycloplegia. 59 Research has shown less myopic findings with a binocular open-field instrument than a monocular closed-field 60 autorefractor, which may explain the agreement between the NC Plusoptix and C retinoscopy. However, these findings should be interpreted with caution because they specify the accuracy of the mean effect for all of the five studies included, but not the variance of the effect size within the population as a whole. Therefore, results cannot be extrapolated to patients with clinical characteristics beyond the five studies included.
Estimating refractive error with eccentric infrared photorefraction depends on calibration of the luminance slopes in the pupil (conversion of the distribution of light reflected across the pupil to refractive error), and research has shown that better agreement can be found with low levels of refractive error and significant errors can arise as the refractive error increases. 61 These findings and the prediction interval suggest that the NC Plusoptix should not be used exclusively due to the potential variability being significantly larger than 1.00 D. The level of variability exceeded the limits of maximum acceptable difference, meaning that NC Plusoptix cannot be substituted for a CR. In addition, the total sample size for studies with the NC Plusoptix was small (n = 650), which could have potentially led to an underestimation of the difference between NCR and CR.
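The reasoning in the two preceding paragraphs amounts to comparing an interval with the ±1.00 D limit. The following sketch applies that comparison to the Plusoptix confidence and prediction intervals reported in the abstract; the function name `within_limit` is hypothetical.

```python
def within_limit(lower, upper, limit=1.00):
    """True if the whole interval lies inside +/- limit dioptres."""
    return -limit < lower and upper < limit


# 95% confidence interval of the pooled MD for the NC Plusoptix
ci_ok = within_limit(-0.54, 0.38)
# 95% prediction interval for the NC Plusoptix
pi_ok = within_limit(-1.72, 1.56)
print(ci_ok, pi_ok)  # True False
```

The confidence interval describes the uncertainty of the average difference, whereas the prediction interval describes the range expected in a new setting, which is why the two checks can disagree.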
Our findings suggest that when compared with C retinoscopy, NC autorefraction is an inaccurate method of measuring refractive error, as it often results in an overestimate of myopia and a wide range of effect sizes. Comparison between NC retinoscopy and C autorefraction or C retinoscopy was not possible as the relevant data were not available from the included papers. Future studies should examine this relationship as practitioners may find it easier to obtain a measurement using a retinoscope because younger children like to move around. In addition, a practitioner has some control over accommodation during NC retinoscopy, unlike an autorefractor with its inherently inflexible design. Moreover, it has been reported that 80% of intra- and inter-examiner repeatability for C retinoscopy falls within ±0.50 D. 42 Accommodation is known to affect the precision of refractive error assessment as active accommodation produces a negative shift in the measurements. Rosenfield and Benzoni found a considerable decline in the amplitude of accommodation (3.80 D) between the age of 5-10 years, suggesting that inaccuracy of NCR is particularly likely in younger children. 62 Most of the evidence on refraction in children <12 years old indicates that myopia is overestimated in NC compared with C autorefraction 5,41,48 with greater differences in younger children. 63 These findings suggest that younger children on average should have larger effect sizes with increased variability. Results from the current study indicate a smaller effect size with less variability in children <5 years of age than for those >5 years old. One possible explanation for this unusual finding is that the type of autorefractor also differed between the studies and therefore confounded the results. This possibility is supported by the high I² value suggesting that there is significant heterogeneity between the groups (e.g., a range of different autorefractors). The prediction intervals were also extremely large in both groups indicating that the effect size is hard to predict due to the large amount of heterogeneity. Unfortunately, due to the low number of studies included, we were unable to carry out a meta-regression to verify this.
As indicated above, an important finding of this review was the level of heterogeneity between the results of individual studies. The meta-analysis quantified the degree to which the findings differed between studies, and showed inconsistencies and variation between them. Several factors in the individual study findings limit the extent to which we can accurately represent the evidence and explore reasons for heterogeneity. Unfortunately, many investigations averaged all types and levels of refractive error measurements into one overall figure. As a direct result, we were unable to investigate how types and levels of refractive error influenced the differences between NCR and CR. In addition, several studies included both eyes per child, which resulted in the clustering of intra-individual data 64 making it impossible to extract data for just one eye. Finally, accommodation can influence refractive error measurements and potentially affect the agreement between NCR and CR. These could not be formally investigated because of the limitations of the reported data. Recent findings have shown that a clinically significant change in spherical equivalent (≥0.50 D) between NC and C autorefraction is more likely to occur in children who have a lag of accommodation <1.15 D. 12 Additional work is therefore required to create guidance on the application of cyclopentolate, and when it can be avoided.
In future studies, bias should be minimised by ensuring that all subjects, irrespective of their level and type of refractive error, are randomised into one arm of the trial. Studies should specify their cycloplegia regimen, including the interval between administration of drops and refraction. In addition, future studies should implement the assessment of accommodative function and pupil size both pre- and post-instillation of cyclopentolate.
The strengths of this review include the methodological rigour used to conduct the review process which is consistent with best practice. 65 A comprehensive search strategy was used to identify as many potential studies for inclusion as possible, with no clinical setting, study design or publication year restriction. Two review authors independently screened all titles and abstracts. The data extraction and quality assessment for each study using the QUADAS-2 Tool was conducted independently.
There are some limitations that are worth highlighting. Many studies had a high or unclear risk of bias in at least one domain with substantial heterogeneity observed between studies. This should be taken into consideration when interpreting the review findings. In addition, we were unable to conduct the planned sensitivity analysis. A lack of data from various degrees and types of refractive error might affect the overall estimate in this review. The range of refractive error was not always disclosed in the included publications, and therefore, the level of agreement between NCR and CR was unknown. Accordingly, the findings should be interpreted with caution until more data are available. The assumptions made from this review are restricted to the autorefractors included in this study. In addition, only the analysis of low to moderate hyperopia can be established. Agreement for hyperopia >5.00 D is yet to be explored. Moreover, analysis of the NC Plusoptix and C Plusoptix (plusoptix.com) could not be undertaken to establish a mean difference between these autorefractors with and without cycloplegia.
The present systematic review and meta-analysis highlight substantial gaps in our knowledge of the accuracy of refracting young children without cycloplegia. Unfortunately, as most studies averaged different types and levels of refractive error, we were unable to determine whether cycloplegia is needed for all children or whether it can be safely administered only to children with specific types or levels of refractive error. Further quality research is needed to allow this analysis to be conducted. This could either be addressed by a large primary study or potentially via a meta-analysis of individual patient data obtained from study authors. In conclusion, many different forms of autorefractors can be used to help evaluate refractive error objectively. However, CR is still recommended to ensure diagnostic accuracy in children younger than 12 years of age.

CONFLICT OF INTEREST
The authors have no proprietary or commercial interest in any materials discussed in this article.