Reliability of assessing skeletal muscle architecture and tissue organization of the gastrocnemius medialis and vastus lateralis muscle using ultrasound and spatial frequency analysis

Introduction The purpose of this study was to investigate inter- and intra-rater reliability as well as the inter-rater interpretation error of ultrasound measurements assessing skeletal muscle architecture and tissue organization of the gastrocnemius medialis (GM) and vastus lateralis (VL) muscle. Methods The GM and VL of 13 healthy adults (22 ± 3 years) were examined thrice with sagittal B-mode ultrasound: intraday test-retest examination by one investigator (intra-rater) and separate examinations by two investigators (inter-rater). Additionally, images from one investigator were analysed by two interpretators (interpretation error). Muscle architecture was assessed by muscle thickness [MT], fascicle length [FL], as well as superior and inferior pennation angle [PA]. Muscle tissue organization was determined by spatial frequency analysis (SFA: peak spatial frequency radius, peak −6 dB width, PSFR/P6, normalized peak value of amplitude spectrum [Amax], power within peak [PWP], peak power percent). Reliability of ultrasound examination and image interpretation are presented as intraclass correlation coefficient (ICC), test-retest variability, standard error of measurement as well as bias and limits of agreement. Results GM and VL demonstrated excellent ICCs for inter- and intra-rater reliability, along with excellent ICCs for interpretation error of MT (0.91–0.99), showing minimal variability (<5%) and SEM% (<5%). Systematic bias for MT was less than 1 mm. For PA and FL poor to good ICCs for inter- and intra-rater reliability were revealed (0.41–0.90), with moderate variability (<12%), low SEM% (<10%) and systematic bias between 0.1–1.4°. Tissue organization analysis indicated moderate to good ICCs for inter- and intra-rater reliability. 
Notably, Amax and PWP consistently held the highest ICC values (0.77–0.87) across all analyses but with higher variability (<24%) and SEM% (<18%), compared to lower variability (<9%) and SEM% (<8%) in other tissue organization parameters. Interpretation error of all muscle tissue organization parameters showed excellent ICCs (0.96–0.999) with very low variability (≤1%) and SEM% (<2%), except Amax & PWP (TRV%: <6%; SEM%: <7%). Conclusion Our findings demonstrated excellent inter- and intra-rater reliability for MT. However, agreement for PA, FL, and SFA parameters was not as strong. Additionally, MT and all SFA parameters exhibited excellent agreement for inter-rater interpretation error. Therefore, the SFA seems to offer the possibility of objectively and reliably evaluating ultrasound images.



Introduction
The architecture of human skeletal muscles plays an important role in their function (1-3). Studies have shown that muscle thickness (MT), fascicle length (FL), and fascicle pennation angle (PA) are important determinants of the force-generating capacity of a particular muscle (1, 2, 4). Muscle architecture seems to be highly adaptable in response to different stimuli, which can partly explain the changes in function following training and injury (3). Significant alterations in muscle architecture are evident following both an acute bout of resistance exercise and long-term resistance training (3, 5-8). For instance, Vieira et al. (8) found that the fatigue-induced drop in performance after an acute bout of concentric isokinetic knee extension exercise was associated with changes in muscle architecture (VL: MT +11%-14%, PA: +39%). Limited evidence exists to characterise the effect of injury on muscle architecture. Timmins et al. (9) provided evidence for shorter fascicles and greater pennation angles in individuals with a history of strain injury. Due to a lack of prospective studies, it is still unclear whether these architectural changes are the cause or consequence of injury.
To investigate human skeletal muscle architecture (i.e., MT, FL, and PA) in vivo, B-mode ultrasound imaging has become a popular method due to its inexpensive, portable, safe, and non-invasive nature (10). Recently, spatial frequency analysis (SFA) of ultrasound images has also been used to assess micromorphological characteristics such as tissue density and organization of tendons resulting from pathological or training-related adaptations (11, 12). Because tendon and muscle tissue share a similar hierarchical structure, recent investigations successfully adapted and extended SFA for application in skeletal muscle (13, 14). In a first examination, Crawford et al. (13) characterized muscle injury and recovery using SFA parameters, providing quantitative information on both fascicular disruption and edema presence in acute hamstring strain injury. Afterwards, Crawford et al. (15) conducted a reliability study investigating hamstring muscles and focusing on four SFA parameters (i.e., PSFR, Mmax, Mmax%, Sum). They found excellent intraclass correlation coefficients (ICCs) for the inter-rater interpretation error between different interpreters for the extracted spatial frequency parameters (ICC: 0.95-0.98). Therefore, SFA may be an objective method to determine training-induced acute and chronic micromorphological adaptations or changes in skeletal muscle organization due to injury and pathologies.
Methodological limitations of ultrasound imaging include inconsistency of image acquisition [probe placement, probe rotation, probe orientation (e.g., angle), and probe pressure] and interpretation. Whether or not a standardization protocol is sufficient to overcome these methodological limitations can be assessed by investigating inter-rater reliability (investigator 1 vs. investigator 2), intra-rater reliability (measurement 1 vs. measurement 2 of one investigator), and interpretation error (same image: interpretator 1 vs. interpretator 2).
Ten years ago, Kwah et al. (16) conducted a systematic review on the reliability and validity of ultrasound measurements of PA and FL in human muscles. Among other subanalyses, they took a closer look at the inter- and intra-rater reliability as well as the inter-rater interpretation error. They found good to excellent inter-rater reliability (PA: ICC 0.80; FL: ICC 0.80-0.97), moderate to excellent intra-rater reliability (PA: ICC 0.51-1.00; FL: ICC 0.62-0.99), and good to excellent interpretation error (PA: ICC 0.85-1.00; FL: ICC 0.87-0.99) for the measurement of PA and FL in humans. However, Kwah et al. (16) summarized findings from studies that investigated different muscles (e.g., shoulder, arm, shank, and thigh muscles) and populations (e.g., live subjects vs. cadavers). The feasibility of the ultrasound measurement is muscle-specific due to differences in muscle location and size. Therefore, it is assumed that the reliability of ultrasound measurement is also muscle-specific. The gastrocnemius medialis (GM) and vastus lateralis (VL) are two of the main locomotor muscles and are active, for instance, during walking, running, and jumping. For the GM and VL muscle in particular, no study has comprehensively investigated the inter- and intra-rater reliability as well as the inter-rater interpretation error of ultrasound measurements of muscle architecture (i.e., MT, FL, PA) and, especially, of muscle tissue organization using SFA. Furthermore, the assessment of measurement errors in our laboratory setup is vital for accurately interpreting potential intervention effects in subsequent intervention studies.
Hence, the aim of this study was to assess the inter- and intra-rater reliability as well as the inter-rater interpretation error of skeletal muscle architecture and tissue organization measurements using ultrasound and SFA of the GM and VL muscle in healthy young adults.

Study design
A single-group study design was used to examine inter-rater and intraday intra-rater reliability as well as inter-rater interpretation error of sonographic examinations evaluating skeletal muscle architecture and tissue organization. To determine inter-rater reliability of ultrasound assessment, all participants were examined by two investigators independently (investigator 1 vs. investigator 2; Figure 1). Both investigators held the probe manually at the predefined location (for more details see below). Both investigators had several weeks/months of experience with sonographic assessments. According to König et al. (17), investigators with only several weeks/months of experience can assess muscle architecture with good to high reliability. To allow intraday intra-rater reliability assessment, all participants were measured twice by the same investigator (investigator 1: measurement 1 vs. measurement 2; Figure 1). To assess inter-rater interpretation error, ultrasound scans were interpreted twice (same image: interpretator 1 vs. interpretator 2; Figure 1).

Participants
Thirteen healthy, young, physically active adults (i.e., sport students) aged 19 to 30 years volunteered to participate in this study (Table 1). Exclusion criteria were defined a priori as any musculoskeletal, neurological, and/or orthopedic disorders in the lower extremities that occurred within the last six months prior to the start of the study. Written informed consent was obtained from all participants before study inclusion. The study was approved by the local ethics committee. All experiments were conducted according to the latest version of the Declaration of Helsinki (18).

Measurement procedure
At the beginning of the testing session, anthropometric and body composition tests were performed under strictly standardized conditions. Body composition was analyzed with the InBody720 (Biospace; Seoul, South Korea). Afterwards, the participants lay prone on an examination table with the right leg supported on an inclined foam wedge (ankle at 40° of plantar flexion; Figure 2) to assess the right GM muscle. For assessing the right VL muscle, participants lay supine on an examination table with the right leg supported on an inclined foam wedge (knee flexed at 25°; Figure 3) to avoid tension in the VL muscle. The knee and ankle joint angles were consistent within each subject during repeated measurements because the wedge was always placed in the same position. GM and VL were assessed under resting conditions. Longitudinal ultrasound scans (Vivid q; GE Healthcare, Tirat Carmel, Israel) of the GM and VL muscle belly were conducted with a 7.5-MHz linear ultrasound array (6 cm, 4-13 MHz). The preset was standardized (frequency: 11 MHz; depth: 4.5 cm; gain: 38%; dynamic range: 102; foci for GM: 1.2 cm and 2.5 cm, foci for VL: 1.625 cm and 3.125 cm) and kept constant for all image acquisitions. Care was taken to apply minimal pressure on the probe to prevent compression of the muscle.
To standardize the measurement location of the muscles, the placement of the probe was marked for GM and VL using a marker pen. The GM location was defined at a point one-third of the distance between the popliteal crease (tendon of semitendinosus muscle) and the medial malleolus. The VL location was defined half-way between the middle of the patella and the greater trochanter (19). To recover the marked location in the ultrasound images, a thin strip of echoabsorptive tape was placed 1.5 cm proximal to the previously marked location (20). The probe was aligned longitudinally to the leg.
After image acquisition by one investigator was completed, the participant sat on the exam table for 60 s (15) before lying back down to be measured again by the same investigator (measurement 2) or by the second investigator, respectively. The measurements were performed in randomized order (Figure 4). For each condition, 3 scans of the GM and VL muscle were conducted at the same location, resulting in a total of 117 images for each muscle.

Data analysis
Images were saved and transferred to a computer. Muscle architecture was analyzed with the free Java-based ImageJ software (National Institutes of Health, Bethesda, MD, USA; Figure 5). Using a calibration image, the ImageJ software was provided with precise information regarding the distance in the captured images that corresponds to 1 cm. After this calibration of the system, distances and angles can be determined by simply clicking on points in the image. Superior PA was measured as the angle between the upper aponeurosis and the fascicle. Inferior PA was measured as the angle between the lower aponeurosis and the fascicle. MT was measured at the predefined position (1.5 cm distal from the echoabsorptive tape) by taking the distance between the upper and lower aponeuroses.

[Figure: A single-group study design was conducted to examine inter-rater and intraday intra-rater reliability as well as inter-rater interpretation error of the skeletal muscle architecture and tissue organization assessment via B-mode ultrasound scans.]
If the fascicle extended beyond the field of view of the ultrasound image, FL was calculated according to Baudry et al. (21) as the sum of the visible FL (FL1) and the calculated FL of the non-visible part (FL2), i.e., FL = FL1 + FL2. FL2 is derived from the following quantities: d is the height between the inferior aponeurosis and the most distal part of the fascicle in the field of view, β is the angle between the inferior aponeurosis and the image border, and θ is the inferior PA.
To analyze tissue organization, all ultrasound images were imported into MATLAB R2016a (MathWorks, USA) to conduct the SFA described by Bashford et al. (11). The SFA is a quantitative ultrasound method that analyzes the anisotropic B-mode speckle pattern arising from within a tissue type in the spatial frequency domain. Briefly, a polygonal region of interest (ROI) is manually drawn in an image, and within this ROI, smaller sub-regions termed "kernels" are analyzed in the spatial frequency domain. For each kernel that fits within the ROI, parameters are extracted from the FFT-derived spatial frequency spectral estimate. The parameters may be used for statistical comparisons such as classification. This method was recently updated by Bashford (unpublished work, 2023). In the previous method, the maximum value within each spatial spectrum was used to identify a spectral region of interest. In the new method, each spectral estimate F(u,v) is modelled as a two-dimensional narrowband spatial signal with a dominant spatial peak consisting of an elliptical Gaussian, where A_m denotes the amplitude of the elliptical Gaussian, (u_m, v_m) denotes its center, and the helper variables a, b, and c describe the spread; here (σ_u^2, σ_v^2) are the variances of the Gaussian in the u- and v-directions, respectively, and θ is the rotation of the Gaussian ellipsoid. The spectral data were fit to the ellipsoidal Gaussian using an unconstrained multivariable simplex method (22). Spatial frequency parameters, including four new parameters (Table 2), were extracted from the model after fitting was achieved.
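The fitting step can be sketched as follows. This is a minimal illustration, not the MATLAB implementation used in the study: it assumes the common textbook parameterization of a rotated elliptical Gaussian, F(u,v) = A_m · exp(−[a(u − u_m)^2 + 2b(u − u_m)(v − v_m) + c(v − v_m)^2]), with a, b, and c built from σ_u, σ_v, and θ, and it uses SciPy's Nelder-Mead simplex as a stand-in for the unconstrained simplex method. The function names, the synthetic spectrum, and the starting values are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def elliptical_gaussian(params, u, v):
    """Rotated elliptical Gaussian (textbook form; assumed, not taken from the study)."""
    amp, u_m, v_m, s_u, s_v, theta = params
    a = np.cos(theta) ** 2 / (2 * s_u**2) + np.sin(theta) ** 2 / (2 * s_v**2)
    b = -np.sin(2 * theta) / (4 * s_u**2) + np.sin(2 * theta) / (4 * s_v**2)
    c = np.sin(theta) ** 2 / (2 * s_u**2) + np.cos(theta) ** 2 / (2 * s_v**2)
    du, dv = u - u_m, v - v_m
    return amp * np.exp(-(a * du**2 + 2 * b * du * dv + c * dv**2))


def sum_squared_error(params, spectrum, u, v):
    """Least-squares cost between the model and a spectral estimate."""
    return float(np.sum((elliptical_gaussian(params, u, v) - spectrum) ** 2))


def fit_peak(spectrum, u, v, start):
    """Fit the model with the Nelder-Mead simplex method (unconstrained)."""
    result = minimize(
        sum_squared_error, start, args=(spectrum, u, v), method="Nelder-Mead",
        options={"xatol": 1e-6, "fatol": 1e-10, "maxiter": 5000},
    )
    return result.x


# Synthetic, noise-free "spectrum" with known parameters (hypothetical values).
u, v = np.meshgrid(np.linspace(-2, 2, 41), np.linspace(-2, 2, 41))
true_params = np.array([1.0, 0.5, -0.3, 0.4, 0.7, 0.3])
spectrum = elliptical_gaussian(true_params, u, v)

start = np.array([0.8, 0.4, -0.2, 0.5, 0.6, 0.2])
fitted = fit_peak(spectrum, u, v, start)
cost_start = sum_squared_error(start, spectrum, u, v)
cost_fit = sum_squared_error(fitted, spectrum, u, v)
```

On real spectra the starting values matter; a natural choice, assumed here, would be the location and height of the spectral maximum.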
The kernel dimensions (number of pixels) were set to the same values as in Bashford et al. (11), as the structures under analysis were at similar image depths and the ROI sizes were similar. A standardized ROI was selected by measuring a 1 cm wide rectangular area of the GM and VL from the upper to the lower aponeuroses at the muscle belly, at a predefined position located such that the center of the ROI was 1.5 cm from the echoabsorptive tape; i.e., the vertical edges of the ROI were 1 cm and 2 cm distal from the echoabsorptive tape (Figure 6). The SFA yielded six spatial frequency parameters: peak spatial frequency radius [PSFR], peak −6 dB width [P6], PSFR/P6 [Q6], normalized peak value of amplitude spectrum [Amax], power within peak [PWP], and peak power percent [PPP] (Table 2).
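To illustrate what a spatial frequency parameter measures, the sketch below computes a PSFR-like value for a single kernel. It assumes that PSFR is the distance from the origin of the two-dimensional amplitude spectrum to its strongest peak, and it locates that peak by a simple maximum search rather than by the Gaussian model fit of the updated method. The kernel size, pixel spacing, and the synthetic banding pattern are hypothetical.

```python
import numpy as np


def peak_spatial_frequency_radius(kernel, pixel_mm):
    """Radius (cycles/mm) of the strongest non-DC peak in the 2D amplitude spectrum."""
    centered = kernel - kernel.mean()  # remove the DC component
    amplitude = np.abs(np.fft.fftshift(np.fft.fft2(centered)))
    freq_y = np.fft.fftshift(np.fft.fftfreq(kernel.shape[0], d=pixel_mm))
    freq_x = np.fft.fftshift(np.fft.fftfreq(kernel.shape[1], d=pixel_mm))
    row, col = np.unravel_index(np.argmax(amplitude), amplitude.shape)
    return float(np.hypot(freq_x[col], freq_y[row]))


# Synthetic kernel: oblique light-dark banding with a known spatial frequency.
n, pixel_mm = 64, 0.05                      # hypothetical kernel size and pixel spacing
y, x = np.mgrid[0:n, 0:n] * pixel_mm        # pixel coordinates in mm
fx0 = 4 / (n * pixel_mm)                    # 1.25 cycles/mm along x
fy0 = 3 / (n * pixel_mm)                    # 0.9375 cycles/mm along y
kernel = np.cos(2 * np.pi * (fx0 * x + fy0 * y))

psfr = peak_spatial_frequency_radius(kernel, pixel_mm)
expected = float(np.hypot(fx0, fy0))        # 1.5625 cycles/mm
```

In a full analysis this computation would be repeated for every kernel that fits inside the ROI and the per-kernel values summarized.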
To ensure blinded image evaluation, all images were coded with randomly assigned numbers throughout the entire interpretation process using software that additionally reordered the images. The analysis was consequently carried out in a shuffled sequence without any knowledge of the participant ID, the investigator, the measurement number, or the trial. Following the image analysis, the measured results were matched to the participants based on the random number codes in the master file.
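The blinding procedure can be sketched in a few lines. The software actually used is not named in the text, so the following is only an assumed, minimal equivalent: each image receives a unique random code, the presentation order is shuffled, and a master mapping allows results to be matched back afterwards. The file names and the code range are hypothetical.

```python
import random


def blind_images(image_names, seed=None):
    """Assign unique random codes and return (shuffled code order, master mapping)."""
    rng = random.Random(seed)
    codes = rng.sample(range(1000, 10000), len(image_names))  # unique 4-digit codes
    master = dict(zip(codes, image_names))                    # code -> original image
    order = list(master)
    rng.shuffle(order)                                        # presentation order
    return order, master


# Hypothetical file names: participant, muscle, investigator, scan.
names = [f"P{p:02d}_GM_inv1_scan{s}.png" for p in range(1, 4) for s in range(1, 4)]
order, master = blind_images(names, seed=1)

# After interpretation, results keyed by code are matched back via the master mapping.
unblinded = [master[code] for code in order]
```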

Statistical analysis
Participant characteristics as well as muscle outcome parameters are presented descriptively as mean ± standard deviation (SD).
For muscle parameters, the respective means of the 3 scans for each condition (Figure 4) were calculated and used for further statistical analysis. To assess inter-rater and intraday intra-rater comparisons as well as interpretation error, the ICC 2.1 (23) was calculated, defining the level of reliability as poor (ICC < 0.5), moderate (0.5 ≤ ICC < 0.75), good (0.75 ≤ ICC ≤ 0.9), or excellent (ICC > 0.9) (23). Furthermore, Bland-Altman analyses were conducted to determine the bias (mean difference) and the 95% limits of agreement (LoA). To ensure valid LoA data, homoscedasticity was checked by applying the Pearson product-moment correlation to the variables "mean" and "difference" of the two measurements (24). If homoscedasticity was not present in a data set, LoA were derived from log-transformed data that were finally back-transformed to give LoA for the ratio of the actual measurements (24). Moreover, test-retest variability (TRV%) was calculated for each participant as TRV% = (difference / mean) × 100%. The level of variability was defined as very low (TRV% < 5%), low (5% ≤ TRV% < 10%), moderate (10% ≤ TRV% < 20%), and high (TRV% > 20%). To provide an estimate of the precision of measurements that keeps the unit of the parameter, the standard error of measurement (SEM) was calculated as SEM = SD × √(1 − r), where SD is the average standard deviation of the means of the 2 measurements and r is the calculated ICC (25, 26). SEM% was calculated as SEM% = (SEM / mean) × 100%, where mean is the mean score of all trials. The calculation of the ICCs, Pearson product-moment correlations, and log transformations was performed with IBM SPSS Statistics Version 26.0 (IBM Corporation, Armonk, NY). The calculation of bias, LoA, TRV%, SEM, and SEM% was performed in Microsoft Excel 2016 (Microsoft Corporation, Redmond, WA).

Results

Muscle architecture and tissue organization parameters and their inter-rater and intraday intra-rater reliability
The mean values ± SD as well as the calculated parameters of inter- and intra-rater reliability for GM and VL muscle architecture and tissue organization are presented in Tables 3, 4 as well as in the Supplementary Files (Supplementary Figures S1-S4). The analyses indicated excellent ICCs (0.91-0.99) for inter- and intra-rater reliability of MT for both muscles, with very low variability (TRV%: <5%), a systematic bias of −0.2 to 0.7 mm, and SEM% of 1.5 to 4.2%. Furthermore, the analyses revealed poor to good ICCs (0.41-0.79) for inter- and intra-rater reliability of PA and FL for GM, with low variability (TRV%: <9%) and SEM% (<7%). Systematic bias was between 0 and 1° for PA and between 1.2 and 1.5 mm for FL. For VL, good ICC values (0.82-0.90) were obtained for inter- and intra-rater reliability of PA and FL, with moderate variability (TRV%: <12%) and low SEM% (<10%). Systematic bias was at most 0.4° for PA and between 0.3 and 3.3 mm for FL. Regarding GM and VL muscle tissue organization parameters, analyses showed moderate to good ICCs for inter- and intra-rater reliability (ICCs: 0.58-0.87), except for PPP of GM for inter-rater reliability (ICC: 0.29) and P6 of VL for intra-rater reliability (ICC: 0.37). Amax and PWP consistently showed the highest ICC values (0.77-0.87) across all analyses but also high variability (TRV%: <24%) and SEM% (7.4%-17.4%) compared to low variability (TRV%: <9%) and SEM% (<8%) of all other tissue organization parameters. For systematic bias and LoA of SFA parameters see Tables 3, 4.
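For readers who want to compute comparable indices from their own paired measurements, the following is a minimal sketch and not the SPSS/Excel workflow used in this study. It assumes the standard two-way random-effects, absolute-agreement, single-measurement form of ICC(2,1), takes TRV% as the absolute difference relative to the pair mean averaged over participants, and omits the homoscedasticity check and log transformation. Function names and the example values are hypothetical.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement."""
    x = np.asarray(scores, dtype=float)          # rows: participants, columns: raters
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def reliability_summary(m1, m2):
    """Bias, 95% LoA, TRV%, ICC(2,1), SEM, and SEM% for two paired measurement series."""
    m1, m2 = np.asarray(m1, dtype=float), np.asarray(m2, dtype=float)
    diff = m1 - m2
    bias = diff.mean()
    sd_diff = diff.std(ddof=1)
    icc = icc_2_1(np.column_stack([m1, m2]))
    sd = np.mean([m1.std(ddof=1), m2.std(ddof=1)])   # average SD of both series
    sem = sd * np.sqrt(max(0.0, 1.0 - icc))          # guard against rounding above 1
    return {
        "bias": bias,
        "loa": (bias - 1.96 * sd_diff, bias + 1.96 * sd_diff),
        "trv_pct": float(np.mean(np.abs(diff) / ((m1 + m2) / 2) * 100)),  # assumed form
        "icc": icc,
        "sem": sem,
        "sem_pct": sem / np.mean([m1.mean(), m2.mean()]) * 100,
    }


# Hypothetical muscle thickness values (mm) from two investigators.
inv1 = [18.2, 21.5, 19.8, 24.1, 20.3, 22.7]
inv2 = [18.6, 21.1, 20.2, 23.8, 20.9, 22.4]
summary = reliability_summary(inv1, inv2)
```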

Muscle architecture and tissue organization parameters and their inter-rater interpretation error
The mean values ± SD as well as the calculated parameters of inter-rater interpretation error for GM and VL muscle architecture and tissue organization are presented in Table 5 and in the Supplementary Files (Supplementary Figures S1-S4). The analyses indicated excellent ICCs (0.96-0.99) for inter-rater interpretation error for MT and all tissue organization parameters for both muscles. Variability (TRV%: <1%) and SEM% (<2%) for these parameters were very low, except for Amax and PWP (TRV%: <6%; SEM%: <7%). Systematic bias for MT was at most 0.1 mm. For SFA parameter-specific systematic bias and LoA see Table 5. Regarding PA and FL, analyses of both muscles showed moderate to good ICCs (0.69-0.87) for the inter-rater interpretation error with moderate variability (TRV%: <12%) and low SEM% (<10%), except for an excellent ICC (0.94) for superior PA of VL. Systematic bias was less than 1° for PA and between 3.2 and 6.5 mm for FL.

[Figure: Selected region of interest for the spatial frequency analysis.]

Discussion
The aim of this study was to assess inter-rater and intraday intra-rater reliability as well as inter-rater interpretation error of skeletal muscle architecture and tissue organization measurements using ultrasound and SFA of the GM and VL muscle in healthy young adults. Findings revealed that GM had lower MT, shorter FL, and greater PA compared to VL. In terms of reliability testing, GM and VL showed excellent ICC values for inter-rater and intraday intra-rater reliability as well as excellent ICCs for inter-rater interpretation error for MT, with very low variability and SEM% (<5%). Systematic bias for MT was less than 1 mm. Furthermore, the analyses revealed poor to good ICCs for inter-rater and intraday intra-rater reliability of PA and FL for both muscles, with moderate variability (<12%), low SEM% (<10%), and systematic bias between 0.1 and 1.4°. Reliability testing of tissue organization indicated moderate to good ICCs for inter-rater and intraday intra-rater reliability. Amax and PWP consistently showed the highest ICC values (0.77-0.87) across all analyses, but also the highest variability (<24%) and SEM% (<18%), compared to the low variability (<9%) and SEM% (<8%) of all other tissue organization parameters. For the inter-rater interpretation error in particular, ICCs of all muscle tissue organization parameters were excellent with very low variability (≤1%) and SEM% (<2%) between interpretators, except for Amax and PWP (TRV%: <6%, SEM%: <7%).
Several years ago, Kwah et al. (16) conducted a systematic review indicating good to excellent ICCs regarding the inter-rater reliability, moderate to excellent ICCs for the intra-rater reliability, and good to excellent ICCs concerning interpretation error of ultrasound measurements of muscle architecture (i.e., PA and FL) in humans. However, they summarized findings from studies that investigated different muscles (e.g., shoulder, arm, shank, and thigh muscles), populations (e.g., in vivo vs. cadaver), and bases of calculation (e.g., single scan vs. mean of 3 scans). Only a few studies have looked specifically at inter- and intra-rater reliability as well as inter-rater interpretation error of GM and VL ultrasound imaging in live humans (17, 27-30). For instance, König et al. (17) investigated inter-rater reliability as well as inter-rater interpretation error of ultrasound measurements (in clinical settings) of the GM architecture in healthy female and male adults (6 males: 29 ± 5 years, 9 females: 28 ± 3 years). They found good inter-rater reliability (ICC: 0.77-0.90; SEM: 0.1 cm [MT], 1.0-1.1° [PA], 0.4 cm [FL]) and good to excellent interpretation error (ICC: 0.76-0.96; SEM: 0.05 cm [MT], 1.3-1.7° [PA], 0.2 cm [FL]). They used the mean of 3 scans of each participant for their analysis, and their findings are comparable to the present results for GM. Concerning intra-rater reliability, Raj et al. (29) investigated the ultrasound measurements of GM and VL muscle architecture in older adults (11 males, 10 females; 68 ± 5 years) and used the mean of 3 scans of each participant for their analysis. Measures were taken on two separate occasions and indicated good to excellent ICC values (0.80-0.97) in terms of intra-rater test-retest reliability for MT, PA, and FL of the GM and VL muscle. Likewise, May et al. (27) investigated the intra-rater test-retest reliability of ultrasonographic measurement of GM muscle architecture. They examined 87 participants (44 males, 43 females; 22 ± 9 years) on two separate occasions and found moderate to excellent ICCs (0.63-0.91) regarding the intra-rater test-retest reliability for MT, PA, and FL. These findings are comparable to the present results for intraday intra-rater reliability of GM.
Our findings showed excellent ICCs for MT between different investigators, between different measurements of the same investigator, and between different interpretators. Regardless of image acquisition and interpretation, the variation in assessed MT was very low. Thus, with a minimum of standardization, sonographic MT measurements of GM and VL can be conducted very reliably by different investigators and by the same investigator at different measurements, and can be evaluated very reliably by different interpretators. In contrast, inter-rater and intraday intra-rater reliability testing showed lower ICCs and higher variability of PA and FL for GM and VL. Because ultrasound imaging is usually limited to a two-dimensional view, standardizing the plane of visualization (probe rotation and orientation) of the three-dimensional muscle structure is crucial. Even very slight variations in probe rotation and orientation (e.g., deviation from a perpendicular probe orientation) can lead to under- or overestimation of PA and FL (31). For instance, Klimstra et al. (32) highlighted that changes in probe rotation and orientation can result in a 12% difference in the reported PA. Therefore, variations in PA and FL might be due to subtle differences in probe orientation between different investigators, but also between the measurements of the same investigator. Furthermore, inter-rater interpretation error was also higher for PA and FL than for MT in both GM and VL. Superior and inferior PA as well as FL had to be manually detected on ultrasound images frame by frame. Thus, manual assessment of PA and FL in ultrasound images seems to be very subjective.
Additionally, when comparing the reliability of FL between GM and VL, the variations in FL are greater for VL. This might be due to the difference in FL between the GM and VL muscles. In fact, the FL of the VL muscle is longer (≈11.1 cm) than that of the GM muscle (≈4.5 cm). The fascicles of the VL muscle extended beyond the field of view of the ultrasound image. Therefore, the true length of the fascicle has to be estimated rather than measured directly, which makes it more sensitive to errors. In this context, it is also noticeable that while the variability of FL is greater for VL compared to GM, the ICC values of FL are higher for VL compared to GM. This discrepancy between the ICC and variability may be explained by the dependence of the ICC on several factors: the sample size, the range of the measuring scale, and the ratio of variances (33). Specifically, the larger the sample size and the wider the measuring scale, the more accurate the ICC. Additionally, higher variability within groups compared to between groups can result in lower ICC values.
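The dependence of the ICC on the ratio of variances can be made concrete with a small numerical sketch. It assumes the simplest variance-components view, in which the ICC is approximately the between-subject variance divided by the sum of between-subject and error variance, with no systematic rater bias. The mean fascicle lengths are taken from the text; the between-subject and error SDs are hypothetical and chosen only to illustrate the effect.

```python
def expected_icc(sd_between, sd_error):
    """Approximate ICC under a simple variance-components model (an assumption)."""
    return sd_between**2 / (sd_between**2 + sd_error**2)


def relative_error_pct(sd_error, mean_value):
    """Measurement error expressed as a percentage of the mean."""
    return sd_error / mean_value * 100


# Mean FL from the text: GM about 4.5 cm, VL about 11.1 cm. SDs below are hypothetical.
icc_gm = expected_icc(sd_between=0.3, sd_error=0.25)
icc_vl = expected_icc(sd_between=1.5, sd_error=0.8)
rel_gm = relative_error_pct(0.25, 4.5)
rel_vl = relative_error_pct(0.8, 11.1)

print(f"GM: ICC ~ {icc_gm:.2f}, relative error ~ {rel_gm:.1f}%")
print(f"VL: ICC ~ {icc_vl:.2f}, relative error ~ {rel_vl:.1f}%")
```

With these illustrative numbers the muscle with the larger relative error still obtains the higher ICC, because its between-subject spread is proportionally larger.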
In typical sonographic images of healthy skeletal muscle, the hierarchical arrangement is visually represented by parallel striations of hypoechoic muscle fibers and hyperechoic perimysium. SFA is employed to quantify the speckle pattern and the light-dark banding pattern seen in longitudinal B-mode images of healthy muscle. While SFA has been predominantly used for investigating tendon tissue, recent research has successfully adapted and extended it for application in skeletal muscle (13, 14). Consequently, SFA may prove to be useful in assessing differences in muscle structure, such as those resulting from training interventions. To utilize SFA for this purpose, it is necessary to verify the reliability of the various SFA parameters in order to distinguish measurement inaccuracies from training adaptations. Crawford et al. (15) conducted a reliability study investigating hamstring muscles and focusing on four SFA parameters (i.e., PSFR, Mmax, Mmax%, Sum). They found excellent ICCs for the inter-rater interpretation error between different interpreters for the extracted spatial frequency parameters (0.95-0.98) and concluded that SFA may be an objective method for examining changes in muscle tissue due to muscle hypertrophy, swelling, localized edema, or mechanical disruption of the perimysium. Recent updates to the SFA algorithm have introduced other parameters that could be crucial, especially in the context of muscle investigations, for better quantifying aspects like the alignment and packing of muscle fibers (see Table 2). Therefore, the reliability of the new SFA parameters has to be tested, too.
Our analyses regarding inter-rater interpretation error indicate excellent ICC values (0.96-0.999) for all SFA parameters, showing that SFA evaluation of the same images with a standardized ROI (performed by different interpreters) is extremely reliable. Nevertheless, in terms of inter-rater and intraday intra-rater reliability, we found moderate to good ICC values with low to high variability. Specifically, Amax and PWP consistently showed the highest ICC values across all analyses but also exhibited the highest variability (ranging from low to high) compared to the low variability of all other tissue organization parameters. Again, this discrepancy between the ICC and variability may be explained by the previously mentioned dependence of the ICC on several factors (i.e., sample size, range of the measuring scale, ratio of variances). Thus, the SFA parameters appear to respond with different sensitivity to image variations caused by slight changes in the probe position, orientation, and rotation when repositioning the probe. Considering this, the use of special foam casts (17, 20, 34) to standardize probe orientation may be helpful, enabling a constant probe orientation due to rigid probe-skin surface fixation.
One limitation of this study is that all measurements for both the test and retest sessions were performed consecutively within a single day. This experimental design was intentionally selected to mitigate the influence of confounding variables. However, it is possible that sitting and subsequently lying down between repeated measurements led to a temporary shift of the skin over the muscle. Insufficient time before the next measurement may have prevented the skin from returning to its original position above the muscle, so that the skin marker no longer identified precisely the same analysis spot. Nevertheless, we examined the images of several participants for the distance between specific muscular features in the ultrasound image and the skin marker across repeated measurements and found no discernible difference in distance. Moreover, calculating reliability from averaged data instead of considering each trial separately might be perceived as a form of data smoothing, potentially obscuring some of the variability in the data. Additionally, the analyses revealed a disparity between the ICC and the variability (TRV%). As previously mentioned, this incongruity could arise from the ICC's sensitivity to factors such as sample size, the range of the measurement scale, and the variance ratios.
In summary, the present findings demonstrated excellent inter-rater and intraday intra-rater reliability for MT, with minimal variability and systematic bias. However, the agreement was not as robust for the PA, FL, and SFA parameters. Amax and PWP in particular should be interpreted with caution, as they consistently showed the highest variability (TRV%: <24%). Notably, MT and all SFA parameters exhibited excellent agreement for inter-rater interpretation error. This implies that the updated SFA algorithm holds promise for objectively and consistently evaluating ultrasound images. To minimize measurement errors, it is advised to standardize probe rotation and orientation. The incorporation of foam casts could facilitate consistent probe orientation by establishing a rigid fixation between the probe and the skin surface. Future research should evaluate the reliability of foam cast scans to ensure the accurate detection of small adaptive changes in muscle architecture and tissue organization.

FIGURE 2 Measurement location of the right gastrocnemius medialis muscle.

FIGURE 3 Measurement location of the right vastus lateralis muscle.

FIGURE 5 Muscle architecture parameters: muscle thickness 1.5 cm distal from echoabsorptive tape (a), fascicle length (visible length (FL1) + calculated length (FL2) as described by Baudry et al. (21)), superior pennation angle (α), inferior pennation angle (θ), angle between the inferior aponeurosis and the image border (β), and height between the inferior aponeurosis and the most distal part of the fascicle in the field of view (d).

TABLE 1 Characteristics of study participants.

TABLE 2 Mathematical description and physiological correlate of spatial frequency parameters.