Differential Diagnosis of Inflammatory Arthropathies by Musculoskeletal Ultrasonography: A Systematic Literature Review

Background: Differential diagnosis of arthritis is challenging, especially early after symptom onset. Several studies have applied musculoskeletal ultrasound in this setting; however, its role in supporting diagnosis has yet to be clearly defined. The purpose of this work is to systematically assess the diagnostic applications of ultrasonography in early arthritis in order to summarize the available evidence and highlight possible gaps in knowledge. Methods: In December 2017, existing systematic literature reviews (SLRs) on rheumatoid arthritis (RA), osteoarthritis (OA), psoriatic arthritis (PsA), polymyalgia rheumatica (PMR), calcium pyrophosphate deposition disease (CPPD), and gout were retrieved. Studies using ultrasound to diagnose the target conditions and to detect elementary lesions (such as synovitis, tenosynovitis, enthesitis, bone erosions, osteophytes) were extracted from the SLRs. The searches of the previous reviews were updated and data from new studies fulfilling the inclusion criteria were extracted. Groups of reviewers worked separately for each disease; when possible, diagnostic accuracy (sensitivities, specificities) was calculated from primary studies. When available, the reliability of ultrasound to detect elementary lesions was extracted. Results: Recent SLRs were available for all the examined diseases. The new searches identified 27 eligible articles, with 87 articles included from the previous SLRs. The diagnostic performance of ultrasound in identifying diseases was addressed by 75 studies; in most of them, a single elementary lesion was used to define diagnosis, except for PMR. Only studies on RA included consecutive patients with new onset of arthritis, while studies on gout and CPPD often focused on subjects with mono-arthritis. Most of the remaining studies enrolled patients with a defined diagnosis. Synovitis was the most frequently detected lesion; clinical diagnosis was the most common reference standard. The diagnostic performance of ultrasound across different conditions was extremely variable. The ability of ultrasound to identify elementary lesions was assessed in 38 studies in OA, gout and CPPD. Its performance in OA was very variable, with better results in CPPD and gout. The reliability of ultrasound was moderate to good for most lesions. Conclusions: Although a considerable amount of literature has investigated the diagnostic application of ultrasound, its additional value over clinical diagnosis was tested in only a minority of cases. This SLR underlines the need for studies with a pragmatic design to identify the placement of ultrasound in the diagnostic pathway of new-onset arthritis.


INTRODUCTION
With effective treatment strategies for inflammatory arthropathies becoming widely available, in the last decade a prompt diagnosis, allowing intervention within the window of opportunity, has become a critical point in the management of early arthritis (1). However, in rheumatology diagnosis can be achieved with certainty in only a minority of cases, and this is particularly true when patients are assessed at very early stages of disease. While in some cases the presence of valuable biomarkers, such as anti-cyclic citrullinated peptide antibodies (ACPA), drives the diagnostic process, in seronegative early arthritis the degree of uncertainty remains high. Moreover, the current classification criteria for the main rheumatic diseases, which are often inappropriately used to help diagnosis, require differential diagnosis to be performed before they are applied (2). This difficulty in correctly defining diagnoses at early stages might lead to inappropriate management, delaying the start of effective treatment while also exposing patients to useless and potentially toxic drugs. In addition, in a research setting, an imprecise diagnosis makes it impossible to reliably measure the effect of innovative treatments in early phases. In this context, there is great interest in the search for new biomarkers and new tools to help the diagnostic process.
Musculoskeletal ultrasonography has been widely applied in rheumatic diseases and has proved to be a valid and reproducible tool in both inflammatory and non-inflammatory pathologies. The relevance of this instrument has also been recognized by the European League Against Rheumatism (EULAR), which recommends ultrasound among the imaging techniques that can be considered to help the clinical management of several conditions (3-5). The applications of ultrasound cover diagnosis, assessment of prognosis, follow-up of diseases and guidance for intra-articular and peri-tendinous procedures. In the field of diagnosis, most of the studies on ultrasound investigated the frequency of elementary lesions characteristic of diseases, thus providing information on the diagnostic performance of this tool in detecting single abnormalities or on the performance of single lesions in diagnosing a disease. On the other hand, only a minority of studies tested the diagnostic value of combinations of lesions assessed at the same time. Moreover, in this context elementary lesions are not selected based on their diagnostic properties and specificity for a certain condition. Only a minority of studies, in which the added value of ultrasound is tested jointly with clinical evaluation (6,7), apply a pragmatic design that reproduces the clinical context. The lack of information on the application of ultrasound in a realistic clinical process of diagnosis translates into the limited weight given to this imaging technique in classification criteria. For instance, the only role for ultrasound in rheumatoid arthritis (RA) classification is the possible confirmation of the presence of synovitis (2), while to date the only classification criteria including ultrasound are those for polymyalgia rheumatica (PMR) (8).
Given the limited availability of methodologically sound studies addressing the diagnostic performance of ultrasound in a realistic clinical context of differential diagnosis of inflammatory arthropathies, the Ultrasound Study Group of the Italian Society for Rheumatology (SIR) prioritized its research on this subject. The present study represents the first step of this project. The aim of the present work was to evaluate the available literature on the diagnostic application of ultrasound in inflammatory arthropathies.

MATERIALS AND METHODS
As a first step, the most relevant differential diagnoses in patients with suspected inflammatory arthropathies were identified, with osteoarthritis (OA) also included as a relevant differential diagnosis. We then identified two research questions, rephrased following the PICOs (Patient, Intervention, Comparator, Outcome, Study type) methodology to provide inclusion and exclusion criteria (Table 1). On this basis, we planned separate systematic literature reviews (SLRs) to assess the diagnostic performance of ultrasound in diagnosing OA, RA, psoriatic arthritis (PsA), PMR, gout, and calcium pyrophosphate deposition disease (CPPD). The SLRs were not registered, but a common protocol was available to all researchers beforehand. If studies on diagnostic performance also reported data on intra- and inter-reader reliability for elementary lesions, that information was also extracted. Working groups composed of supervisors and fellows were created to work separately on each topic; participants were selected based on their expertise in the specific disease and in SLR methodology, so as to create uniform groups. The most recent SLRs on ultrasound in the same diseases were first sought in electronic databases (5, 9-13). Some of the authors involved in the present project were also co-authors of these SLRs and could provide background material (AA, ABa, AI, AZ, CS, EF, GF, GS). Since many of the existing SLRs had a broader focus, only primary studies focusing on the diagnostic use of ultrasound were taken into account for the present work. The search strategies of the previous SLRs were applied in PubMed and Embase, starting from the date of the last search of the previous reviews (5, 9-13). Searches were last run on November 30th 2017. PubMed and Embase were selected because we expected them to retrieve all the relevant literature, and we did not expect to find further evidence by including other databases.
The records retrieved from the new searches were transferred into a bibliographic manager software (Zotero, RRID:SCR_013784) and libraries shared with each working group. The titles and abstracts of the retrieved records were evaluated by pairs of reviewers to assess the eligibility for full-text review according to the pre-specified criteria. Full-texts were afterwards evaluated by the same criteria and data from the included studies extracted into a standardized form, including 2 × 2 tables of diagnostic performance. A flow-chart describing the selection process was separately generated for each SLR. Results were summarized through summary of findings tables, describing both studies included in the previous reviews and those identified by the present ones.
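As an illustration of how sensitivity and specificity follow from a 2 × 2 table of the kind extracted here, the sketch below computes both from the four cell counts. The class name `TwoByTwo` and the counts are hypothetical and are not taken from any included study or from the review's extraction form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of a 2 x 2 table: ultrasound (index test) vs. reference standard."""
    tp: int  # ultrasound positive, reference standard positive
    fp: int  # ultrasound positive, reference standard negative
    fn: int  # ultrasound negative, reference standard positive
    tn: int  # ultrasound negative, reference standard negative

    def sensitivity(self) -> float:
        # Proportion of reference-positive subjects that ultrasound detects.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of reference-negative subjects that ultrasound classifies as negative.
        return self.tn / (self.tn + self.fp)


# Hypothetical counts chosen only for illustration.
table = TwoByTwo(tp=40, fp=5, fn=10, tn=45)
print(f"sensitivity = {table.sensitivity():.2f}")  # 0.80
print(f"specificity = {table.specificity():.2f}")  # 0.90
```

With these illustrative counts, 40 of 50 reference-positive subjects are detected and 45 of 50 reference-negative subjects are correctly classified as negative.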

RESULTS
In total, the search strategies retrieved 943 references since the date of the last search of the previous SLRs. The highest numbers of references belonged to the fields of PsA and gout (Additional Online File). After reviewing the abstracts, 27 papers were finally included, together with 87 articles from previous SLRs meeting the inclusion criteria, for a total of 114 papers included in the present SLR (Table 2). The PRISMA flow-chart of the SLR for each disease is available in the Additional Online File, as well as the full results, presented through summary of findings tables.

Ultrasound for the Clinical Diagnosis of Inflammatory Arthropathy
Information regarding the value of ultrasound to diagnose diseases could be extracted from 75 studies. The greatest amount of evidence was available for PsA, with 29 studies assessing the diagnostic performance of ultrasound.
There were meaningful differences in terms of enrolled populations across different diseases. In fact, in studies addressing PsA and OA, the primary aim was mostly to report the prevalence of different lesions. The frequency of each lesion was compared in patients with already known PsA or OA and healthy controls or patients with other definite diseases. A realistic clinical scenario of consecutive patients referred for suspicion of inflammatory arthropathy was rarely available (6).
Conversely, studies on RA evaluated the added value of ultrasound over classification criteria (14-17), the added value for diagnosis on top of clinical findings (18-20), or its prognostic value for the future development of RA (21-25), using cross-sectional or longitudinal study designs.
Studies dealing with PMR mostly included populations of consecutive patients with shoulder pain (8,13,26) and some of them evaluated the additional value of ultrasound on the diagnostic performance of the 2012 classification criteria (8,26,27).
In the fields of both gout and CPPD, most of the studies included patients presenting with mono-arthritis and with suspected crystal-related arthritis.
Despite these discrepancies across different conditions, there were only a few studies, mainly focused on RA, that enrolled a population of consecutive patients with joint pain (6, 16-20, 22, 24, 28). The interventions used to help diagnosis were also variable. Since most of the studies did not have diagnostic accuracy as primary objective, data on the diagnosis of disease were based on single elementary lesions. A relevant exception was represented by PMR, for which some studies addressed different lesions (tenosynovitis, bursitis and synovitis) in combination (8,26,27). Graph 1 summarizes all the different lesions used to define diagnosis.
The confirmation of the diagnosis was based on a variety of reference standards, which depended on the diagnostic suspicion, as expected. Clinical diagnosis was frequently considered in RA and PMR; for PsA, the confirmation of diagnosis mostly relied on clinical diagnosis and classification criteria, while synovial fluid analysis was frequently considered in crystal-related arthropathies (Graph 2).
GRAPH 2 | Reference standards adopted to confirm diagnoses. PMR, polymyalgia rheumatica; CPPD, calcium pyrophosphate deposition disease; OA, osteoarthritis; PsA, psoriatic arthritis; RA, rheumatoid arthritis; SFA, synovial fluid analysis.
In OA, adding ultrasound information to the clinical evaluation increased the certainty of the diagnosis made by the clinician (6), while the likelihood of OA, compared to being healthy, increased with the finding of bone erosions (44).
In the field of RA, some studies supported the possibility of integrating clinical and ultrasound findings to reclassify undifferentiated arthritis (14-17, 21), while in other studies ultrasound information was applied to confirm a diagnosis of RA or tested against a clinical diagnosis (18-20), leading in general to an increase in diagnostic performance. The prognostic value of ultrasound in predicting the future development of the disease or the need for specific treatment has also been tested, once again with positive results supporting this application (22-24, 28) (Tables 3, 4). The most specific lesions for diagnosing RA were bone erosions, with a specificity of 1 reported by a single study (19), although the specificity of PD-positive synovitis was also high (ranging from 0.88 to 0.93).
Despite a higher number of studies with a focus on PsA, in this area there was a greater variability, due to the many different lesions, tissues and sites assessed. Many studies (14 studies) focused on the assessment of entheseal abnormalities (30, 34, 35, 39, 40, 42, 45-52) and the joints (6 studies) (29, 32, 33, 38, 41, 53), while only a few studies assessed the fingers (considering joints, tendons, soft tissues and entheses) (31, 36, 37) or the nails (43, 54, 55). The primary aim of the included studies rarely addressed diagnostic accuracy. In fact, most of the studies compared the prevalence of lesions in PsA and other diseases. For this reason too, the diagnostic performance of ultrasound findings, which were usually considered alone and not in combination or in addition to clinical findings, was extremely variable across lesions and sites (Table 5). Among the tested lesions, those proving to be more specific for PsA were those at the level of the entheses. In fact, the specificity of entheseal PD ranged from 0.33 to 0.99, that of enthesophytes from 0.52 to 1 and that of calcifications from 0.86 to 0.97. Peritenonitis was also very specific (from 0.95 to 1 when PD signal was present).
GRAPH 3 | Study design of the included studies, depending on the assessed disease. PMR, polymyalgia rheumatica; CPPD, calcium pyrophosphate deposition disease; OA, osteoarthritis; PsA, psoriatic arthritis; RA, rheumatoid arthritis; SFA, synovial fluid analysis.
Studies focusing on ultrasound of the hips and the shoulders in PMR had a more variable design. In fact, along with some older studies with a case-control design (8, 56-59), several cohort studies, including that on which the current classification criteria are based (8), included consecutive patients with shoulder pain (34,60). Moreover, several recent studies provided external validation for the classification criteria (26). Again, in terms of accuracy, studies yielded very heterogeneous results (Table 6). In general, bilateral findings seemed to be more specific for PMR. The specificity of bilateral subacromiodeltoid bursitis ranged from 0.68 to 0.99, while that of bilateral tenosynovitis of the long head of the biceps ranged from 0.62 to 0.98.
Studies in CPPD evaluated several different sites, including the knees (61-68), the wrist (69, 70), the affected joint or all joints (71). Study design was variable, including both case-control and cohort studies. The diagnosis of CPPD was confirmed more frequently by synovial fluid analysis, while in some cases a clinical diagnosis (70,71) or histology (68) were used as references. In general, ultrasound seemed to perform well in identifying this condition, especially at the knee and the wrist. The specificity to confirm CPPD at the knee (considering all the assessed sites) ranged from 0.66 to 1, while at the wrist from 0.81 to 0.91.
In the field of gout, the type of joint under investigation was widely variable. All studies (72-74) but two (75, 76) adopted synovial fluid analysis as the reference standard to diagnose the disease. Four of the six studies had a cross-sectional design, while the two remaining were a prospective (73) and a retrospective (72) study. While 4 studies reported a satisfactory performance of ultrasound (73-75, 77), for 2 studies sensitivity was low (72,76). Considering the combination of all possible elementary lesions (e.g., double contour, aggregates, tophi), the specificity of ultrasound to diagnose gout ranged from 0.42 to 0.87.
Summary of sensitivities, specificities and reliability across studies assessing the performance of ultrasound to diagnose RA elementary lesions. Of the 13 papers included, only four reported sensitivity using gray scale (GS) and/or power Doppler (PD) ≥ 2. *Hands (including proximal interphalangeal joints) and wrists.
The typical population enrolled was represented by subjects with confirmed disease, in which ultrasound was compared to a reference standard to confirm the presence of a lesion.
As expected, the reference standard was also variable, in particular for OA. For CPPD, the only assessed target lesion was CPP deposition, which was evaluated by conventional radiography (2 studies), synovial fluid analysis (6 studies), and microscopic analysis (2 studies). All studies on gout but one (107), in which conventional radiography was used, adopted synovial fluid analysis as the reference standard.
Most of the studies assessing ultrasound to detect elementary lesions had a cross-sectional design, in particular, all the studies on OA, 4 (74-77) and 9 (62, 69, 70, 99-101, 103) studies for gout and CPPD, respectively, while the remaining studies for these two conditions had a cohort design.
In OA, results on the performance of ultrasound were once again widely variable across studies. This was also due to the variability of the reference standards adopted to define each separate lesion and the assessment of different anatomical areas.
Most of the studies on CPPD reported good performance of ultrasound to detect deposits, and this was true especially for specificity. The same conclusions can be drawn from the included articles on gout.

Reliability
Most of the studies on OA in which reliability data were presented reported good reliability for the assessment of osteophytes, erosions, effusion, cartilage damage, synovitis and cysts ( Table 7).
In RA, the available evidence supported a good intra-reader and inter-reader reliability for erosions, GS and PD synovitis across all the assessed sites (Tables 3, 4).
There was less information about reliability in the ultrasound assessment of PsA; entheseal PD, synovial hypertrophy and bursitis were the only lesions for which reliability was available. Inter-reader reliability was good for synovial hypertrophy and bursitis, as well as intra-reader for entheseal PD and bursitis ( Table 5). Among the included studies on PMR, none reported information on the reliability for the assessed lesions.
For CPPD, some studies reported a good inter-reader reliability to assess both the meniscal fibrocartilage and the hyaline cartilage at the level of the knee ( Table 8). In gout during acute attacks, very good intra-reader reliability was reported for double contour, aggregates, erosions and hypervascularisation. Inter-reader reliability was assessed for tophi, erosions, double contour, hypervascularisation and aggregates, still with good values ( Table 9). The reliability on the same lesions was also assessed in the intercritical phases, with still good, although in general lower, results (Table 10).
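To make the notion of inter-reader reliability concrete, the sketch below computes Cohen's kappa for two readers scoring the same joints as lesion present (1) or absent (0). It assumes kappa as the agreement statistic, which is a common choice but is not stated in this section; the function name `cohen_kappa` and the ratings are hypothetical.

```python
from typing import Sequence


def cohen_kappa(reader_a: Sequence[int], reader_b: Sequence[int]) -> float:
    """Cohen's kappa for two readers' binary (0/1) ratings of the same items."""
    if len(reader_a) != len(reader_b) or len(reader_a) == 0:
        raise ValueError("both readers must rate the same non-empty set of items")
    n = len(reader_a)
    # Observed agreement: fraction of items given the same rating by both readers.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance, from each reader's marginal rate of positives.
    p_a = sum(reader_a) / n
    p_b = sum(reader_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both readers use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of ten joints (1 = lesion present, 0 = absent).
reader_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
reader_2 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(f"kappa = {cohen_kappa(reader_1, reader_2):.2f}")  # 0.60
```

Here the readers agree on 8 of 10 joints (observed agreement 0.8) against a chance agreement of 0.5, giving a kappa of 0.6.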

DISCUSSION
The aim of our SLR was to retrieve all the available evidence to support future studies on the integration of the information provided by ultrasound in the diagnostic process. Several groups had already focused on this aspect, since recent SLRs were available for all of the conditions of interest (5,9,11,13,104). The existing reviews presented a summary of the diagnostic use of ultrasound deriving from a relevant number of studies for each considered disease. Despite all the reviews being relatively recent, we found additional studies in the subsequent literature from which we could retrieve further evidence. The number of SLRs and eligible studies is an indication of the interest that ultrasound has raised as a diagnostic tool. The easier availability of high-end ultrasound equipment, the accessibility of training and the possibility of directly applying the information provided by ultrasound during a routine visit are likely the features that have driven the enthusiasm about the technique. However, when analyzing the available literature in depth, there is an evident gap between the interest in the diagnostic applications of ultrasound and the quality of the studies produced so far in this field. In fact, with some important exceptions, the main objective of the studies was to describe the prevalence of different lesions and to compare groups of patients in terms of ultrasound findings. Although information on diagnostic accuracy can also be retrieved from such study designs, these results cannot be generalized to external populations, since a realistic clinical setting is not reproduced. Many studies, in fact, included patients with a definite and longstanding diagnosis and adopted a case-control design, with controls that were unlikely to be very similar to the true differential diagnoses of the disease. This is particularly true for PsA, for which most of the studies had a case-control design.
There was limited evidence regarding the diagnosis of OA (6), while for RA and PMR the studies reproduced a more pragmatic context. In fact, in RA, some studies evaluated patients with new-onset arthralgia and tested the ability of ultrasound to help confirm diagnosis (19), while some others integrated ultrasound on top of classification criteria (16,17). There were also some studies testing the prognostic value of ultrasound on the future development of RA (22).
In the context of PMR, some older studies still adopted a case-control design (56); however, since the development of the new classification criteria (8), the interest has shifted to the evaluation of the additional impact of ultrasound on classification (26). The performance of US in this context was highly variable. Such heterogeneous results might be due to the disease, which may present with variable abnormalities, thus affecting the US sensitivity. Bilateral pathologic conditions appear to be the most specific US findings.
In the field of crystal-related arthropathies, several studies evaluated both patients during the acute presentation and the inter-critical periods. The population of interest was that of patients presenting with monoarthritis, representing a realistic clinical scenario for this diagnostic suspicion, although quite specific.
The ability of ultrasound to correctly identify elementary lesions typical of each disease seemed to be good, and this was especially true for inflammatory lesions. When a suboptimal performance was achieved, it must be kept in mind that in several studies the reference standard adopted to define a lesion (e.g., physical examination) could not be considered the optimal one for the specific lesion.
Although this was not the primary objective of this SLR, we extracted information on intra- and inter-reader reliability, if available. The information from the primary studies supported good reliability of ultrasound in identifying inflammatory lesions and signs of damage at the level of joints and entheses, as well as deposition of crystals. It must however be considered that rheumatologists taking part in ultrasound studies might have greater expertise on a specific lesion or disease than average, so that such reliabilities might not be reproduced in a clinical setting.
The present SLR has some limitations. First, only two databases were searched, and, although probably the greatest part of the literature has been covered, we cannot exclude the presence of further studies, even among gray literature. Due to the clinical heterogeneity of the results, we did not perform a pooled estimate of the diagnostic performance. Moreover, a formal assessment of quality and risk of bias was not performed. However, the present work is, to our knowledge, the first one to provide a comprehensive overview on the diagnostic use of ultrasound in arthritis, with a focus on the general question and without concentrating on a single disease.
What emerges from the overview of the results of our SLR is that very few studies (6,16,19,22,24) investigated the additional impact of ultrasound findings in making a diagnosis in consecutive patients presenting with joint symptoms, which is indeed the typical scenario of a rheumatologist's everyday work.
In most studies, clinical and ultrasound assessments were performed separately, and ultrasound findings were not evaluated on top of clinical findings but validated against clinical diagnosis. With this being almost the only evidence available today, it is no surprise that the relevance of ultrasound in recommendations on the diagnosis and management of rheumatic diseases and in classification criteria has so far been so limited. This happens despite ultrasound being an ideal tool in this context: adequate ultrasound equipment is now easily accessible, and it can be used during scheduled visits to provide immediately helpful information. Multiple sites can be assessed at the same time with good acceptability by the patients. Several other modern imaging techniques have been applied in the setting of early arthritis, such as magnetic resonance imaging (MRI), positron emission tomography (PET) or dual energy CT (DECT); however, compared to ultrasound they present limited feasibility, limited availability, higher costs and, in some cases, limited data in the clinical setting. Since the accuracy of ultrasound in detecting elementary lesions has been established and the increasing ultrasound expertise among rheumatologists allows at least some findings to be detected reliably, the time has come to test the real potential of ultrasound during the first evaluation for the suspicion of inflammatory arthropathy. The Musculoskeletal Ultrasound Study Group of the Italian Society for Rheumatology has recently focused on the design of such a study, which implies defining the ideal combination of joints to be assessed based on the clinical suspicion and confirming diagnoses after a follow-up. Before the application of ultrasound, an initial set of differential diagnoses should be defined for each patient, based on clinical features.
The additional value of an ultrasound examination, targeted on the clinical suspicion, would afterwards be tested in terms of correct and timely diagnosis. We expect that these results will help clarify the real role of ultrasound in the diagnostic process and provide new insight into its correct placement in the management of inflammatory arthropathies.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request from the corresponding author.

AUTHOR CONTRIBUTIONS
GS, SC, GF, and AI conceived the study and supervised its conduct. SC performed the searches in the electronic databases. AA, ABa, ABo, AD, OD, CD, OE, EF, LI, AP, AZ, and GS performed the systematic literature review. GS drafted the manuscript. All authors critically revised the article, and read and approved its final version for submission.

FUNDING
This work was supported by the Italian Society of Rheumatology, which covered the open access fees.