Utility of unidimensional and functional pain assessment tools in adult postoperative patients: a systematic review

Background We aimed to appraise the evidence relating to the measurement properties of unidimensional tools to quantify pain after surgery. Furthermore, we wished to identify the tools used to assess interference of pain with functional recovery. Methods Four electronic sources (MEDLINE, Embase, CINAHL, PsycINFO) were searched in August 2020. Two reviewers independently screened articles and assessed risk of bias using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. Results Thirty-one studies with a total of 12 498 participants were included. Most of the studies failed to meet the methodological quality standards required by COSMIN. Studies of unidimensional assessment tools were underpinned by low-quality evidence for reliability (five studies), and responsiveness (seven studies). Convergent validity was the most studied property (13 studies) with moderate to high correlation ranging from 0.5 to 0.9 between unidimensional tools. Interpretability results were available only for the visual analogue scale (seven studies) and numerical rating scale (four studies). Studies on functional assessment tools were scarce; only one study included an ‘Objective Pain Score,’ a tool assessing pain interference with respiratory function, and it had low-quality for convergent validity. Conclusions This systematic review challenges the validity and reliability of unidimensional tools in adult patients after surgery. We found no evidence that any one unidimensional tool has superior measurement properties in assessing postoperative pain. In addition, because promoting function is a crucial perioperative goal, psychometric validation studies of functional pain assessment tools are needed to improve pain assessment and management. Clinical trial registration PROSPERO CRD42020213495.

Patients experience acute pain after surgery as a result of tissue damage and inflammation at the operation site. 1–3 Careful assessment of pain using a valid and reliable tool 4 is the first step towards a rational choice of analgesic therapy, 5 which is essential for ensuring patient comfort, mobility, and satisfaction and reducing healthcare costs. 6 The most commonly used tools for the assessment of postoperative pain are unidimensional and assess only pain intensity. 4 These include the visual analogue scale (VAS), 7 numerical rating scale (NRS), 8 verbal rating scale (VRS), 9 sometimes referred to as the verbal descriptor scale (VDS), 10 and faces pain scales (FPS). 11 They are quick to administer and do not encroach on the time required for usual care. 12 Despite their extensive use, the reliance on these unidimensional tools as the sole approach to measuring pain is currently insufficient as the cut-off points commonly used by healthcare providers do not reflect the patient's desire for additional analgesics. 13,14 Furthermore, patients have reported difficulties in describing the complexity of their pain experience by a single numerical value, descriptive words, or as a mark on a line. 12 Striving to lower pain intensity scores to zero as suggested by the 'Pain as the 5th Vital Sign' campaign has not improved pain outcomes, 15–17 and resulted in increased opioid analgesic use in the post-anaesthesia care unit (PACU). 17 Furthermore, Vila and colleagues 18 highlighted the potential hazard associated with a pain score-based treatment algorithm in increasing the prevalence of sedation-related side-effects by more than twofold. Treating pain as the fifth vital sign has now been abandoned as it may have contributed to the current US opioid epidemic. 19,20
Restoration of function by allowing the patient to breathe, cough, ambulate, and turn in bed is important for postoperative pain relief. 21,22 Therefore, assessing the functional impact of pain, which includes patient-centred objective assessment by a healthcare provider who judges if the pain prevents the patient from performing activities that help accelerate recovery, could be an appropriate alternative to achieve better pain assessment. 23 Hence, options to treat pain will be used to maximise functional capacity, rather than striving to reduce the patient's postoperative pain score to below a specified numerical value. 4,20 Despite being used widely, the validity, reliability, and utility of unidimensional pain assessment tools for postoperative patients have not been reviewed systematically. The aim of this systematic review was to appraise the available evidence concerning the measurement properties of different unidimensional and functional pain assessment tools when used to assess postoperative pain in hospitalised adults.

Methods
We performed this systematic review according to the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) (http://www.cosmin.nl/) guidelines, and reported it according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement guidelines. 24

Search strategy
We performed a systematic search of the MEDLINE, Embase, PsycINFO (all via OVID) and CINAHL (via EBSCOhost) databases from their inception to August 2020. Our search strategy consisted of four search concepts: (1) measurement properties or outcome terms, (2) pain assessment tool terms, (3) acute postoperative pain, and (4) limits (English language or English translation, human adults ≥18 yr old). We combined the first three using the Boolean operator AND, which narrows the search to records matching all three concepts. This was then combined with the result string of the fourth concept to limit the results. We performed these steps separately for each pain assessment tool. We also carried out backward citation tracking by checking the reference lists of eligible studies. The comprehensive search strategy used is provided in Supplementary material, Appendix S1.
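The way the four search concepts are combined can be illustrated with a minimal sketch. The terms below are hypothetical placeholders chosen for illustration only; the actual search strings are those in Appendix S1, and the database-specific syntax (OVID, EBSCOhost) is not reproduced here.

```python
def or_group(terms):
    """Join the synonyms for one concept with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(concepts, limits):
    """AND the concept groups together, then AND each limit onto the result."""
    core = " AND ".join(or_group(terms) for terms in concepts)
    return " AND ".join([f"({core})"] + list(limits))


# Hypothetical example terms (not the review's actual strategy).
concepts = [
    ["validity", "reliability"],          # concept 1: measurement properties
    ["visual analogue scale", "VAS"],     # concept 2: one pain assessment tool
    ["postoperative pain"],               # concept 3: acute postoperative pain
]
limits = ["english", "adult"]             # concept 4: limits

query = build_query(concepts, limits)
print(query)
```

Running the sketch once per tool, with only the second concept changed, mirrors the statement that the steps were performed separately for each pain assessment tool.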

Inclusion criteria
We included any of the following pain measurement tools used to assess acute pain in hospitalised adult patients from all surgical specialties: unidimensional pain assessment tools (including the numerical pain rating scale, VRS, VAS, and faces scales [Wong-Baker FACES, Faces Pain Scale – Revised]), and functional pain assessment tools, defined as any tool that helps assess acute pain based on its interference with functional activity, including walking, breathing, turning in bed, and coughing. Included functional pain assessment tools could be used objectively by the clinician or self-reported by patients.
We included instrument validation or instrument evaluation studies. Studies were considered if they included at least one of the instruments to evaluate postoperative pain and assessed at least one of the nine measurement properties identified by the COSMIN taxonomy: internal consistency, test–retest reliability, measurement error, content validity, structural validity, construct validity, hypothesis testing, cross-cultural validity, criterion validity, and responsiveness (Appendix S2). In addition, we included any study that evaluated any of the specified additional outcomes of the tools, including feasibility, interpretability, and desire for analgesia.

Exclusion criteria
We excluded abstracts, editorials, reviews, and studies that included paediatric or adolescent populations, or sedated, mechanically ventilated and critically ill patients.

Editor's key points
Well validated assessment tools are essential for measuring postoperative pain intensity and impact. This systematic review shows that, despite the many tools available, evidence regarding their validity or reliability is scarce. After surgery, the Visual Analogue Scale (VAS) showed the highest error rate in general and was the least preferred compared with the 0–10 Numerical Rating Scale (NRS). Importantly, statistically significant changes in VAS or NRS do not necessarily indicate clinically important changes, and NRS cut-off points used by healthcare professionals to determine acute pain severity do not always reflect patients' desire for analgesics.

Selection of articles
After our database search, we collated and uploaded all identified citations to EndNote X9 (Clarivate Analytics, Philadelphia, PA, USA) and removed duplicates. The identified studies were uploaded to Rayyan QCRI online software. 25 Two reviewers (RMB and AI) independently applied the inclusion criteria to the titles, then to relevant abstracts. Afterwards, we thoroughly examined potentially eligible full texts for inclusion. We documented the full search results in the PRISMA flow diagram (Fig. 1). Excluded studies and the reasons for their exclusion are provided in Appendix S3.

Data extraction
One reviewer (RMB) extracted data from the included full-text articles, with the extraction verified by a second reviewer (AI). The two reviewers resolved any disagreements through discussion, or consultation with other reviewers (RDK, LST, or DNL) when necessary. The extracted data included specific details about the assessment tool used, country, language of scale administration, study design, patient characteristics, surgical procedure, the specific measurement properties assessed, outcomes related to the review question and objectives, and the main statistical analysis.

Assessment of methodology
Two independent reviewers (RMB and AI) critically appraised the methodological quality of studies looking at feasibility and interpretability using a modified version of the Newcastle–Ottawa Scale 26 (Appendix S4). For validation studies, we assessed the quality using the COSMIN criteria for methodological quality. 27–29 We included three phases in the assessment of each measurement property. First, we assessed the risk of bias, which pertains to methodological quality in each study: very good, adequate, doubtful, or inadequate quality was assigned to each study. Second, we related the results to a measurement property rated against criteria for 'sufficient measurement properties', and the results were classified as sufficient, insufficient, or indeterminate (Appendix S5). Third, we combined the results from each study and graded the quality of evidence for each pain assessment tool. A summary of the scoring criteria and appraisals is provided in Appendices S6 and S7.

Protocol registration
The protocol was registered (No. CRD42020213495) with the PROSPERO database and can be accessed at https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=213495.

Results
The search identified 14 216 potential studies after removal of duplicates. After reviewing the titles, we excluded 13 798 for irrelevance and another 380 after abstract screening. Of the 38 remaining studies, we excluded 19 after examination of the full texts against the inclusion criteria (Appendix S2). An additional 12 studies were identified through searching the bibliography of eligible studies, so a total of 31 studies 2,3,6,13,30–56 (Fig. 1) with 12 498 participants were included. The number of participants in individual studies ranged from 35 30 to 3045. 31 The distribution of male and female participants in the studies varied, with some studies including only female participants 30 or only male participants 40 and others not reporting sex distribution. 38,50,52,53 The studies matching our inclusion criteria were published between 1982 52 and 2018, 37 and assessed postoperative pain after different types of surgical procedures (Table 1). Nine studies included only cognitively intact patients, 6,32,35,38,47,49,51,54,55 whereas two studies included mild cognitively impaired participants. 46,56 The remaining 20 studies did not report on cognitive function. 2,3,13,30,32–36,39–45,48,50,52,53 Seven studies were performed in the USA, 3,36–38,44,45,52 three in China, 46,47,56 three in Australia, 48–50 and two each in the UK, 35,43 the Netherlands, 13,54 Ghana, 33,42 France, 32 and Canada. 6,40 One study each was performed in Finland, 51 Spain, 34 Nigeria, 30 Iran, 39 India, 53 Vietnam, 55 Israel, 2 and Germany. 41 Although all the included studies were reported in English, some of the tools were administered in other languages: Chinese, 46,47,56 Twi, 33,42 Vietnamese, 55 Finnish, 51 and both English and Yoruba. 30
Using the modified Newcastle–Ottawa Scale, the majority of studies looking at feasibility were of medium 2,30,32,33,37,39,49,54 or high quality. 3,6,13,35,36,41,46–48,50,51 The methodological quality of three secondary analysis studies that looked at VAS interpretability could not be assessed. 44,45,52 The methodological quality for other measurement properties is described under each measurement property section.
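The study-selection counts reported at the start of the Results can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in the text; the variable and function names are illustrative.

```python
# Counts taken from the study-selection paragraph of the Results.
identified_after_deduplication = 14_216
excluded_on_title = 13_798
excluded_on_abstract = 380
excluded_on_full_text = 19
added_from_reference_lists = 12


def full_texts_assessed():
    """Records remaining after title and abstract screening."""
    return identified_after_deduplication - excluded_on_title - excluded_on_abstract


def studies_included():
    """Full texts retained, plus studies found by backward citation tracking."""
    return full_texts_assessed() - excluded_on_full_text + added_from_reference_lists


print(full_texts_assessed(), studies_included())
```

The two results agree with the 38 full texts examined and the 31 studies included.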
Cross-cultural validity. One study 42 established the validity of a Twi (Ghanaian) version of the VAS. The pain scores reported by patients using the new instrument correlated significantly with those reported by patients using the original (English) version of the VAS, with the highest correlation on the fifth postoperative day. Because of inadequate quality owing to an extremely serious risk of bias and imprecision, very low quality evidence was reported for cross-cultural validity of the VAS.
Responsiveness. Seven studies 33,40,43,45–47,55 reported responsiveness results for the four unidimensional pain assessment tools and provided low-quality evidence because of a very serious risk of bias (Table 4). The identified risk of bias was mainly related to the use of inappropriate measures of responsiveness such as effect size and statistical tests used.
Measurement error. Only one study assessed measurement error of VAS by determining the minimal detectable change (MDC), 37 which describes the smallest change outside of inherent measurement error that the VAS can detect. The study showed that the MDC on a 100 mm VAS was 15 mm for total hip arthroplasty and 16 mm for total knee arthroplasty. 37 We rated the evidence regarding VAS measurement error as moderate quality because of the risk of bias, and the result as indeterminate because we could not determine the minimal important change for VAS in acute pain to compare with the MDC.
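As an illustration of how an MDC relates to measurement error, the sketch below uses one common textbook formulation: the standard error of measurement is SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM. This is an assumption for illustration; the cited study may have derived its MDC differently, and the input values are hypothetical rather than taken from any included study.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-patient SD and a reliability coefficient (ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sd, icc, z=1.96):
    """MDC at the confidence level implied by z (1.96 for 95%)."""
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


def change_exceeds_error(mdc, minimal_important_change):
    """True when the MDC is smaller than the minimal important change,
    i.e. an important change can be distinguished from measurement error."""
    return mdc < minimal_important_change


# Hypothetical inputs: SD of 20 mm on a 100 mm VAS and an ICC of 0.90.
mdc = minimal_detectable_change(sd=20.0, icc=0.90)
print(round(mdc, 2))
```

The last helper shows why the missing minimal important change matters: without it, the comparison cannot be made and the result stays indeterminate.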

Functional pain assessment tool
Only one study examined the 'Objective Pain Score', which assesses the interference of pain with respiratory function. 53 The study evaluated the correlation between scores obtained from the Objective Pain Score and NRS. Whilst patients rated their pain using a printed NRS, the clinician rated pain using the Objective Pain Score. A linear regression model determined the relationship between NRS and Objective Pain Score, and showed that, for every unit increase in the NRS, the Objective Pain Score decreased by 0.334. The study reported sufficient convergent validity with the NRS, although with low-quality evidence because of risk of bias and imprecision. A summary of findings on all assessed measurement properties is provided (Table 2).

Other outcomes
Interpretability and desire for analgesics
Visual analogue scale. Seven studies 31,37,44,48–50,52 looked at the interpretability of VAS, and one study 3 included the desire for analgesics as an outcome. Several studies 31,44,52 reported nearly similar cut-off points for VAS, indicating that VAS ratings of 0–5 mm were very likely to be rated as no pain by patients, 6–44 mm were considered mild pain, 45–69 mm were considered moderate pain, and VAS ratings ≥70 mm were suggestive of severe pain. Two studies 37,48 determined the interpretability of VAS by identifying the minimal clinically important difference (MCID) defined as the minimal change in score indicating a meaningful change in pain status. 57 The use of a combination of distribution- and anchor-based methods resulted in an MCID of 9.9 mm for VAS in assessing several types of surgical procedures. 48 In contrast, Danoff and colleagues 37 reported higher MCID values for pain improvement in patients undergoing total hip or knee arthroplasty. Pain was improving clinically when the VAS decreased by 19 and 23 mm, respectively.
Table 1 Characteristics of included studies. BNS, box numerical rating scale; CAS, coloured analogue scale; CCPS, colour circle pain scale; ENT, ear, nose and throat; FPS, face pain scale; ICU, intensive care unit; MPQ, McGill pain questionnaire; M-VRS, modified verbal rating scale with 11 descriptions of pain intensity; NR, not reported; NRS, numerical rating scale; OPS, objective pain score; PCA, patient controlled analgesia; PPI, present pain intensity.
Bodian and colleagues 3 found that the proportion of patients requesting additional analgesia after abdominal surgery increased as VAS increased (4%, 43%, and 80% with VAS scores of 30 mm or less, 31–70 mm, and greater than 70 mm, respectively).
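The VAS cut-off points reported in this section (0–5 mm no pain, 6–44 mm mild, 45–69 mm moderate, ≥70 mm severe) can be written as a small lookup. This is a sketch for illustration only; the function name is hypothetical, and the handling of non-integer readings between bands (for example 5.5 mm) is an assumption, since the source states the bands in whole millimetres.

```python
def vas_category(vas_mm):
    """Map a 100 mm VAS reading to the severity bands reported in the text."""
    if not 0 <= vas_mm <= 100:
        raise ValueError("VAS reading must be between 0 and 100 mm")
    if vas_mm <= 5:
        return "no pain"
    if vas_mm <= 44:   # assumption: readings above 5 mm count as mild
        return "mild"
    if vas_mm <= 69:   # assumption: readings above 44 mm count as moderate
        return "moderate"
    return "severe"


for reading in (3, 30, 55, 82):
    print(reading, vas_category(reading))
```

As the Discussion notes, such bands describe how patients label their pain and are not validated thresholds for giving analgesics.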
Numerical rating scale. Four studies 2,36,41,54 looked at interpretability of the NRS, and one study included desire for analgesics as an outcome. 13 Sloman and colleagues 2 determined the meaning of changes in NRS in relation to perceived pain relief before and after treatment. Patients who rated their pain relief as 'minimal' had, on average, a 35% reduction in NRS. NRS was less sensitive in detecting changes from 'moderate' to 'much', as there was a 67% reduction for those who rated their reduction as 'moderate', a 70% decrease for those who rated it as 'much', and a 94% reduction for those who assessed their pain reduction as 'complete'. 2 Inconsistent cut-off points between moderate and severe pain were identified for NRS. For example, Gerbershagen and colleagues 41 determined NRS 4 as a cut-off point for moderate pain, whereas 'pain interfering with function' resulted in a lower cut-off point of NRS 3. Using receiver operating characteristic analysis in another study, Van Dijk and colleagues 54 found that the sensitivity of NRS to differentiate bearable pain (VRS ≤2) from unbearable pain (VRS >2) reached higher values (94%) for a high cut-off point of NRS >5 compared with lower cut-off points of 3 and 4 (sensitivity 72% and 83%, respectively).
In another study, Van Dijk and colleagues 13 showed that 19% of patients with NRS scores ranging from 5 to 10 had no desire for additional opioids; meanwhile, 62% reported that they did not want additional opioids because their pain was tolerable. When patients were asked at which score they would request opioids, both the median and the modal pain scores were an NRS of 8.
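The cut-off analysis described for the NRS can be sketched as follows. The data are hypothetical, and the convention used here (NRS above the cut-off is 'test positive', and unbearable pain is the condition to be detected) is an assumption for illustration; the cited studies may define the reference category and the direction of classification differently.

```python
def sensitivity_specificity(nrs_scores, unbearable, cutoff):
    """Sensitivity and specificity of the rule 'NRS > cutoff' for detecting
    pain that patients rated as unbearable. Assumes both classes are present."""
    pairs = list(zip(nrs_scores, unbearable))
    tp = sum(1 for s, u in pairs if s > cutoff and u)
    fn = sum(1 for s, u in pairs if s <= cutoff and u)
    tn = sum(1 for s, u in pairs if s <= cutoff and not u)
    fp = sum(1 for s, u in pairs if s > cutoff and not u)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical paired observations: an NRS score and whether the same
# patient described the pain as unbearable on a verbal scale.
scores = [2, 3, 4, 5, 6, 7, 8, 9]
is_unbearable = [False, False, False, False, True, False, True, True]

for cutoff in (3, 5, 7):
    sens, spec = sensitivity_specificity(scores, is_unbearable, cutoff)
    print(cutoff, round(sens, 2), round(spec, 2))
```

Repeating the calculation over every candidate cut-off gives the points of a receiver operating characteristic curve, which is how a single cut-off is usually chosen.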

Feasibility
Eight studies included feasibility of pain assessment tools as an outcome measure. 6,32,33,35,46,47,51,56 Error rates were reported as an inability to understand the tool, responses that could not be scored reliably, and lack of responses. 6,35,47,51 Some studies reported the most preferred scale or the easiest to complete ones. 6,33,46,56 There was a lack of studies that assessed the time required to complete the tool or time taken to train patients or nurses.
For multiple types of surgical procedures and in different populations, VDS or VRS was more successful when compared with other tools. Using VRS in patients aged ≥75 yr after cardiac surgery showed a higher success rate (81%) compared with VAS (60%) and the FPS (44%). These rates varied significantly on all postoperative days (P<0.02). 51 The reported reasons for failure, which was defined as failure to understand or express the level of pain using the assessment tool, were postoperative confusion, delirium, exhaustion, and an inability to differentiate between facial expressions. 51 Similarly, VRS was better suited than VAS for compliance and ease of use after orthopaedic surgery: 56% of patients included in the study did not understand how to complete VAS, and one-third could not perform the assessment using VAS because of visual or hearing impairment. 35 Moreover, VAS showed the highest error rate of 12.3% when used in Chinese populations, whereas VRS had the lowest error rate (0.8%), a statistically significant difference (P<0.05). 47 Interestingly, 40% of the patients rated NRS as the easiest, most preferred tool for assessment; in contrast, VAS was reported as the least preferred. 6 From the nurses' perspectives in PACUs, NRS was the most preferred tool in 60% of the included sample. 32 Even though the VAS was the recommended tool in the institution where the study was conducted, 50% of the nurses preferred to use either NRS or VRS because the complexity of the VAS made it difficult for patients to understand. 32 Three studies reported FPS as the preferred tool among a Chinese population, 47 for women, 46 middle-aged adults, and older patients without and with mild cognitive impairment, followed by VRS and NRS. 56 Likewise, FPS (55%) was preferred to NRS (33%) among a Ghanaian population. 33

Discussion
This systematic review presents a comprehensive examination of the measurement properties of unidimensional and functional assessment tools used for adult postoperative patients. The quality of evidence for the measurement properties and utility of the VAS, VDS, NRS, and FPS was suboptimal. Overall, construct validity (convergent validity) was most commonly assessed across measures. Content validity, internal consistency, and structural validity were not assessed as these measures are not designed for single-item scales. The VAS had the greatest number of studies assessing its measurement properties in the postoperative setting, followed by the NRS. Studies on functional pain assessment tools were scarce. Most of the reviewed studies failed to meet the COSMIN methodological standards required. Good-quality studies were found for interpretability and feasibility as assessed by the Newcastle–Ottawa Scale. 26
Most of the studies reported sufficient convergent validity of several unidimensional pain assessment tools, indicating that the scales tended to measure score variations in the same direction. 58 Similar positive findings of good convergent validity results were reported when these tools were used to assess pain associated with rheumatoid arthritis, 59 osteoarthritis, 60 and low back pain. 61 However, the methodology used to measure convergent validity was limited. Because no gold standard tool exists for assessing pain, most studies assessed the correlation of scores obtained from one unidimensional tool with another, measuring only pain intensity. However, when a multidimensional tool such as the McGill Pain Questionnaire was used as a comparator, studies reported lower correlation scores. 6,40,62 This variation may be related to assessor and patient fatigue during the detailed pain assessment.
Table 3 Reliability of unidimensional pain assessment tools in surgical patients. *Average interclass correlation coefficient calculated for 7 days. †No separate result for each scale. ‡Results categorised in 20–44 yr (n=43), 45–59 yr (n=39), ≥60 yr without cognitive impairment (n=40), ≥60 yr with mild cognitive impairment (n=31). ¶95% confidence interval. FPS, faces pain scale; n, number of patients; NRS, numerical rating scale; PROM/s, patient-reported outcome measures; SD, standard deviation; VAS, visual analogue scale; VDS, verbal descriptor scale.
There was good reliability of pain assessment for all unidimensional tools. However, the quality of evidence was low for all four scales because of serious risk of bias owing to unreported intervals for repeated measures or the use of inappropriate reliability measures by treating ranked NRS, VDS, or FPS scores as a continuous value. Measurement error was only available for VAS; however, the study outcome was indeterminate because we could not determine the minimal important change for VAS in acute pain to compare it with the MDC. When the MDC is smaller than the minimal important change, significant change can be distinguished from measurement error. 63 Small, albeit statistically significant changes in VAS do not necessarily indicate clinically important changes to guide the interpretation of studies evaluating analgesic therapies. 37 Therefore, obtaining an accurate MCID is crucial. 64 Previous studies have shown that the MCID differs by patient population and diagnosis. We identified two studies reporting inconsistent MCID values for the postoperative population. 37,48 The MCID tended to be higher in patients who underwent joint arthroplasties than other procedures. 48 One explanation might be that patients reporting severe, acute pain need a larger reduction in pain to be clinically meaningful. 65 Responsiveness is an important psychometric property that reflects sensitivity to change in pain over time. 66 Measures of responsiveness used included effect size, standardised response mean, and scores before and after intervention. 33,40,43,44,46,47,55 According to COSMIN methodology, effect size and standardised response mean are inappropriate to assess responsiveness because they measure the size of the change scores rather than their validity. Moreover, the P-value of statistical tests only measures the statistical significance of the change in scores rather than their validity. 63
Pain assessment tools help diagnose surgical catastrophes, allow communication between healthcare providers, and are used to assess efficacy of analgesic treatments and allow comparison between therapies. As no agreement exists on how to identify the optimal cut-off point of a unidimensional pain assessment tool, various arbitrarily chosen values are used. 41 In general, VAS cut-off points of 30, 70, and 100 mm indicate the upper boundaries of mild, moderate, and severe pain, respectively. However, a recent study found a higher cut-off point between mild and moderate pain of around 55 mm on the VAS, which is greater than the values reported by most earlier studies and physicians' consensus. 44,67–69 Previous studies have also found that a high proportion of patients with pain scores >4 did not demand analgesics (28% of patients visiting an emergency department 70 and 42% of children after surgery 71 ). Cho and colleagues 62 showed that postoperative patients requested an analgesic when their pain was VAS 5.5, NRS 6, FPS-R 6, or VRS 2 (moderate or severe pain). This might be influenced by a general refusal for analgesic medicines, or fear of side-effects or addiction, especially with opioids. 13,72,73 Cut-off points, although important, are not validated to guide analgesic interventions. Previously, postoperative pain assessment and management was focused on providing humanitarian pain relief, which constitutes only one objective to tackle a complex experience, and that was achieved by using unidimensional scores.
However, healthcare providers should address pain by several approaches to determine if the pain is tolerable, is hindering recovery, or requires intervention. 62 Efforts have been made to encourage use of multidimensional tools to assess postoperative pain. A recent systematic review indicated that the Brief Pain Inventory and the American Pain Society Pain Outcomes Questionnaire – Revised were the two commonly used and studied multidimensional pain assessment tools for patients after surgery, followed by the McGill Pain Questionnaire. These multidimensional tools showed good ratings for some psychometric properties such as internal consistency. However, this recommendation was based on low- to moderate-quality evidence. 66 Moreover, these tools involve a detailed assessment that can range from 5 to 30 min, 74 hindering routine use for frequent assessment in a busy surgical ward. 20 Alternatively, functional pain assessment has been recommended. 14,75 However, as no gold standard objective measures exist for pain-related functional capacity in postoperative patients, 76 we included objective tools assessing the impact of pain on function. Only one study reported sufficient convergent validity of functional assessment based on pain interference with normal breathing and NRS score. 53 The low methodological quality of the study limits the generalisability of the result. Other researchers have tried to incorporate a non-formally validated three-level 'Functional Activity Score' 20 into clinical practice. One study in a Chinese population combining the Functional Activity Score and dynamic NRS found that this allowed nurses to guide and educate patients to better use patient-controlled analgesia to facilitate functional recovery. 77 In addition, a pilot study in hospitalised patients validated a four-level scale (no interference, interference with some or most activities, or inability to do any activity). 78 It established the convergent validity of this tool compared with NRS and VAS in cognitively intact patients. Patients aged 40 yr also preferred a functional assessment scale, 78 possibly because functional assessment considered the impact of pain on activity.
The heterogeneity of study designs, including the assessment scales used, surgical procedures, sample sizes, countries in which the studies were conducted, and the languages used, makes determining the most feasible assessment tool difficult. However, the VAS showed the highest error rate and was the least preferred in several studies, whereas the VRS showed the lowest error rate. Difficulties comprehending the VAS and linearly quantifying pain resulted in a higher frequency of incomplete responses, especially for older patients. 12,13 Therefore, older adults and children who have less abstract thinking ability might prefer a categorical scale such as the VRS for easier use. 14 Interestingly, although the FPS is commonly used in paediatric populations, it was also the most preferred tool in the Ghanaian and Chinese adult populations. This might be because of the simplicity of facial expressions, which can quickly reflect pain. Alternatively, cultural aspects may explain why the FPS was preferred. 79

Strengths and limitations
The main strength of this review is that it includes the most frequently used unidimensional and functional pain assessment tools. In addition, we put no limits on publication date, enabling us to obtain information on early studies of these tools. To our knowledge, this is the first review to evaluate the validity of these tools, focusing solely on postsurgical populations and applying COSMIN methodology.
Potential limitations include the fact that the search strategy may have excluded grey literature and studies published in languages other than English. However, we tried to limit the effect of language and publication biases by searching the references of included studies. In addition, the clinical diversity and the limitations in the methodologies and quality of the included studies may have reduced the strength of the conclusions.

Conclusions
This systematic review challenges the validity and reliability of unidimensional tools to quantify pain in adult patients after surgery. Despite their extensive use, no evidence clearly suggests that one tool has superior measurement properties in assessing postoperative pain. Therefore, future studies should be prioritised to assess their validity, reliability, measurement error, and responsiveness using COSMIN methodology. Moreover, adequate quality head-to-head comparison studies are required to assess several unidimensional pain assessment tools alongside other tools covering multiple dimensions of the pain experience. In addition, because promoting function is a crucial perioperative goal, psychometric validation studies of functional pain assessment tools are warranted to identify patients who need additional interventions to promote recovery and improve postoperative pain assessment and management.