Sangsoo Han1, Miyoung Choi2, Bora Lee3, Hye-Won Lee4, Seong Hee Kang5, Yuri Cho6, Sang Bong Ahn7, Do Seon Song8, Dae Won Jun9, Jieun Lee10, Jeong-Ju Yoo11
Correspondence to: Dae Won Jun
ORCID https://orcid.org/0000-0002-2875-6139
E-mail noshin@hanyang.ac.kr
Jeong-Ju Yoo
ORCID https://orcid.org/0000-0002-7802-0381
E-mail puby17@naver.com
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Gut Liver 2022;16(6):952-963. https://doi.org/10.5009/gnl210391
Published online February 23, 2022, Published date November 15, 2022
Copyright © Gut and Liver.
Background/Aims: Several noninvasive scoring systems have been developed to determine the risk of advanced fibrosis in nonalcoholic fatty liver disease (NAFLD). We examined the diagnostic accuracy of the fibrosis-4 (FIB-4) score and NAFLD fibrosis score (NFS) in patients with biopsy-proven NAFLD.
Methods: For this meta-analysis, various databases including PubMed (MEDLINE), EMBASE, OVID Medline and the Cochrane Library were systematically searched. After the acquired abstracts were reviewed by two investigators, manuscripts were chosen for a full-text examination.
Results: Thirty-six studies evaluating biopsy-proven NAFLD were selected for meta-analysis, comprising a total of 14,992 patients. At the lower cutoff, the FIB-4 score predicted histological fibrosis stage 3 or more (≥F3) with a sensitivity of 69%, specificity of 64%, positive likelihood ratio (LR+) of 1.96, and negative likelihood ratio (LR–) of 0.47. At the lower cutoff, the NFS predicted ≥F3 with a sensitivity of 70%, specificity of 61%, LR+ of 1.83, and LR– of 0.48. The area under the receiver operating characteristic curve (AUC) values of the FIB-4 score for predicting ≥F3 and ≥F2 were 76% and 68%, respectively. The AUC values of the NFS for predicting ≥F3 and ≥F2 were 74% and 60%, respectively.
Conclusions: The FIB-4 score or NFS can be used to predict the degree of liver fibrosis in NAFLD, and their diagnostic accuracy was relatively high for fibrosis stage F3 or higher.
Keywords: Liver fibrosis, Meta-analysis, Nonalcoholic fatty liver disease, Predictive value of tests
With a prevalence of 25% to 40% in the general population, nonalcoholic fatty liver disease (NAFLD) is the most common liver disease worldwide and a pressing health concern associated with insulin resistance and metabolic syndrome.1,2 NAFLD affects nearly 100 million individuals in the United States and occurs in 90% of the obese population.1,3 Given this disease burden, early identification of patients at high risk of NAFLD-associated morbidity and mortality is essential.
NAFLD can be categorized into various stages, from simple steatosis without fibrosis to nonalcoholic steatohepatitis related cirrhosis.1 The severity of NAFLD is determined by three factors: steatosis, inflammation, and fibrosis. Among these factors, the degree of hepatic fibrosis is the most essential factor in clinical settings, allowing clinicians to estimate the long-term prognosis in patients with NAFLD, such as the development of hepatocellular carcinoma, liver-related death or cardiovascular mortality.1,4 In fact, while simple steatosis is considered a non-progressive condition, nonalcoholic steatohepatitis or significant fibrosis (SF) is regarded as one of the main causes of liver transplantation.5 Therefore, it is vital to promptly identify advanced fibrosis (AF; ≥stage 3 fibrosis) or SF (≥stage 2 fibrosis) in such patients.6
Liver biopsy is the gold standard for staging and identifying fibrosis in NAFLD patients.7 However, it is not suitable for routine screening because of its invasive nature, potential complications, possibility of sampling error, and high cost.7,8 Therefore, a simple, inexpensive, and noninvasive panel to identify and quantify liver fibrosis is necessary. Although techniques such as magnetic resonance elastography and transient elastography have recently been developed, their high cost prevents their use as routine screening tests. Thus, noninvasive fibrosis scoring systems based on serologic tests, such as the fibrosis-4 (FIB-4) index and NAFLD fibrosis score (NFS), have been developed and are widely used as screening tools to assess the degree of fibrosis.9,10 However, a comprehensive study of these scoring systems is needed, because the FIB-4 score was originally developed for patients with viral hepatitis and the reported accuracy of these serologic scoring systems in NAFLD patients differs among studies. The aim of this systematic review and meta-analysis is to evaluate the diagnostic accuracy of noninvasive fibrosis scoring systems (FIB-4 and NFS), compared with the corresponding liver histologic data, for predicting AF and SF in patients with NAFLD.
This meta-analysis adhered to the protocol previously registered with PROSPERO (International Prospective Register of Systematic Reviews, CRD42021241243). We conducted this systematic review and meta-analysis following the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Diagnostic Test Accuracy.
Studies that documented the accuracy of FIB-4 and NFS, evaluated against the corresponding liver histology results in NAFLD patients, were considered eligible for inclusion. The following criteria were required for studies to be selected: (1) patients with NAFLD; (2) reports of the accuracy of FIB-4 and NFS based on liver histology results. The state of fatty liver was determined by histologic characteristics. Eligible study designs were randomized controlled trials, cross-sectional studies, and cohort studies, both prospective and retrospective. Studies were excluded according to the following criteria: (1) case reports; (2) case series involving fewer than five patients in total; (3) reviews; (4) cell or animal studies; (5) chronic viral hepatitis, such as hepatitis B or hepatitis C; (6) human immunodeficiency virus; (7) significant alcohol consumption; (8) fatty liver defined by imaging or serologic criteria, without any histology result provided; or (9) non-English studies.
The primary outcome of this meta-analysis was the diagnostic accuracy of the FIB-4 score and NFS, compared to the corresponding liver histology in patients with NAFLD.
We searched PubMed (MEDLINE), EMBASE, the Cochrane Library, Korean Medical Database, and Korean Studies Information Service System to identify studies published in English between January 1, 1997, and October 31, 2020. The keywords used in the Patient/Problem, Intervention, Comparison, and Outcome model are provided in the Supplementary Material. The search terms were NAFLD index words combined with FIB-4-related or NFS-related index words. We combined free-text words and controlled terms such as Medical Subject Headings and EMTREE according to the database. The search strategy and the results for each database are provided in the Supplementary Material. The entire search process was conducted by a professional librarian (M.C.).
During the process of study selection, two reviewers (S.H. and J.J.Y.) first independently extracted relevant titles and abstracts. After an independent examination of the full-text articles, any resulting disparity between the two reviewers was resolved by a discussion with a third reviewer (H.W.L. or S.H.K.).
In addition, the two reviewers thoroughly examined the remaining procedures, such as screening full-text articles and assessing the risk of bias. The extraction of study characteristics and outcomes was conducted independently and documented in a standardized format by the two reviewers. Any discrepancy was settled by a discussion with Y.C. and S.B.A.
To determine the risk of bias, we used the Cochrane risk of bias tool, with the relevant information provided in the Supplementary Material. Any discrepancy was again resolved by discussion with additional reviewers (D.S.S. and D.W.J.). Risk of bias was evaluated using two tools, QUADAS11 and QUIPS.12 The overall risk-of-bias results are provided in the Supplementary Material. Publication bias was evaluated with a funnel plot.
The meta-analysis of sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio proceeded as follows: (1) transform each proportion using the Freeman-Tukey variant of the arcsine square root transformation; (2) calculate the pooled prevalence as the back-transformation of the weighted mean of the transformed prevalence, using DerSimonian-Laird weights under the random-effects model; (3) calculate the confidence interval with the Clopper-Pearson method. To further analyze the heterogeneity among the studies, we conducted a meta-regression to assess the influence of other factors on diagnostic accuracy. RevMan 5 (Cochrane Library) or the meta package in R version 4.1.0 (The R Foundation for Statistical Computing, Vienna, Austria) was used for the statistical analyses.
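The three steps above can be sketched in plain Python. This is an illustrative sketch only, not the authors' code (they report using RevMan 5 or the R meta package). It assumes the half-sum form of the Freeman-Tukey double-arcsine transform with approximate variance 1/(4n+2), the simple back-transformation sin²(t), and a Clopper-Pearson interval solved by bisection for a single proportion. All function names are hypothetical.

```python
import math

def freeman_tukey(events, n):
    """Step 1: Freeman-Tukey double-arcsine transform and its approximate variance."""
    t = 0.5 * (math.asin(math.sqrt(events / (n + 1)))
               + math.asin(math.sqrt((events + 1) / (n + 1))))
    return t, 1.0 / (4 * n + 2)

def pool_dersimonian_laird(studies):
    """Step 2: random-effects pooling. studies is a list of (events, n) pairs.
    Returns (pooled proportion, tau^2, I^2 in percent)."""
    tv = [freeman_tukey(x, n) for x, n in studies]
    w = [1.0 / v for _, v in tv]                       # inverse-variance weights
    fixed = sum(wi * t for wi, (t, _) in zip(w, tv)) / sum(w)
    q = sum(wi * (t - fixed) ** 2 for wi, (t, _) in zip(w, tv))  # Cochran's Q
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0    # between-study variance
    w_star = [1.0 / (v + tau2) for _, v in tv]         # DerSimonian-Laird weights
    pooled_t = sum(ws * t for ws, (t, _) in zip(w_star, tv)) / sum(w_star)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return math.sin(pooled_t) ** 2, tau2, i2           # simple back-transformation

def _binom_cdf(k, n, p):
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def _bisect(func, target, increasing):
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(events, n, alpha=0.05):
    """Step 3: exact (Clopper-Pearson) interval for one proportion events/n."""
    lower = 0.0 if events == 0 else _bisect(
        lambda p: 1 - _binom_cdf(events - 1, n, p), alpha / 2, True)
    upper = 1.0 if events == n else _bisect(
        lambda p: _binom_cdf(events, n, p), alpha / 2, False)
    return lower, upper
```

For example, `pool_dersimonian_laird([(69, 100), (35, 60)])` pools two hypothetical sensitivity estimates, and the returned I² is the same heterogeneity statistic reported alongside the pooled DOR in the Results.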
A thorough database search of titles and abstracts resulted in 86 relevant studies. Of these, 50 studies were excluded because of an inappropriate patient population (n=1), inappropriate outcome measurement (n=42), overlapping population (n=6), or insufficient data (n=1). Finally, 36 studies were eligible for inclusion in this review (Fig. 1). Detailed characteristics of the studies included in this meta-analysis are provided in Table 1.13-48 A total of 14,992 patients were analyzed, with a mean age of 48.57±6.13 years. Studies were conducted in various regions of the world (Asia 16, Europe 8, America 9, two or more continents 2, and Australia 1). The median comorbidity rates of diabetes, hypertension, and dyslipidemia were 36.2%, 43.5%, and 54.3%, respectively. Median aspartate aminotransferase and alanine aminotransferase levels were 43 U/L and 62 U/L, respectively.
Table 1. Characteristics and Results of the Included Studies
Author (year) | Location | Method | No. of samples | Mean age, yr | Male, % | DM, % | HTN, % | Dyslipidemia, % | Mean BMI, kg/m2 | Mean waist, cm | Mean AST, U/L | Mean ALT, U/L |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Aida (2015)13 | Japan | FIB-4 | 148 | 61 | 36 | - | - | - | 26.9 | - | 42 | 52 |
Anstee (2019)14 | Global | FIB-4/NFS | 3,123 | 59 | 42 | 68 | - | - | - | - | 44 | 45 |
Balakrishnan (2020)15 | USA | FIB-4/NFS | 99 | 44.7 | 33.9 | 43.5 | - | - | 32.1 | - | 59 | 108 |
Boursier (2019)16 | France | FIB-4/NFS | 938 | 56.5 | 58.5 | 51.1 | - | - | 31.8 | - | 39 | 56 |
Chan (2015)17 | Malaysia | NFS | 147 | 50.5 | 54.4 | 52.4 | 89.1 | - | 29.3 | 98.2 | 41 | 71 |
Chan (2019)18 | Asia | FIB-4/NFS | 583 | 50.9 | 52.9 | 52.3 | 55.1 | 74.5 | 28.9 | 96.5 | 38 | 63 |
Cui (2015)19 | USA | FIB-4 | 102 | 51 | 41.2 | 25.5 | - | - | 31.7 | - | 42.3 | 58 |
Demir (2013)20 | Germany | NFS | 120 | 43.8 | 47.2 | 19.9 | 41.9 | - | 37 | - | 36.8 | 56.6 |
de Carli (2019)21 | Brazil | NFS | 266 | 36.5 | 20.4 | 10.5 | - | - | 44.2 | 123.9 | 24.9 | 32.3 |
Goh (2015)22 | USA | NFS | 238 | 52 | 29.3 | 100 | - | 42 | 37.1 | - | 53.5 | 64 |
Goh (2015)22 | USA | NFS | 263 | 46 | 46 | 0 | - | 15.2 | 35.2 | - | 58.5 | 78 |
Joo (2017)23 | Korea | FIB-4/NFS | 315 | 55 | 50.8 | 37.8 | 38.4 | - | 27 | 94.6 | 36 | 43 |
Jun (2017)24 | Korea | FIB-4/NFS | 328 | 36.4 | 70.7 | 33 | 14.6 | - | 28.6 | 96.4 | 91.3 | 98.5 |
Kakisaka (2018)25 | Japan | FIB-4 | 63 | 54.9 | 58 | - | - | - | 28.1 | - | 62 | 94 |
Kao (2020)26 | Taiwan | FIB-4 | 73 | 35.3 | 31.5 | 16.9 | 26.8 | - | 41 | 118.3 | 38.2 | 55 |
Kaya (2020)27 | Turkey | FIB-4/NFS | 463 | 46 | 47.5 | 37.8 | 34.8 | - | 31.7 | 104 | 42 | 66 |
Kim (2013)28 | USA | FIB-4/NFS | 142 | 52.8 | 26.8 | 27.5 | 45.1 | - | 36.32 | - | 47.2 | 60.4 |
Labenz (2018)29 | Germany | FIB-4/NFS | 261 | 51 | 52.5 | 29.9 | - | 37.5 | 30.9 | - | 48 | 60 |
Lang (2020)30 | Germany | FIB-4/NFS | 95 | 50 | 46.2 | 10.8 | 56.9 | - | 30 | 105 | 32.5 | 50.5 |
Lum (2020)31 | Singapore | FIB-4/NFS | 263 | 50.4 | 52.5 | 49 | - | 66.5 | 30.4 | 113.2 | - | - |
McPherson (2013)32 | UK | FIB-4/NFS | 70 | 54 | 56 | 43 | - | - | 32.9 | 105 | 28 | 28 |
McPherson (2013)32 | UK | FIB-4/NFS | 235 | 48 | 63 | 40 | - | - | 34.4 | 110 | 59 | 95 |
Meneses (2020)33 | Spain | FIB-4/NFS | 50 | 49 | 30 | 26 | 52 | 28 | 44.3 | 135 | 21 | 25 |
Nasr (2016)34 | Sweden | FIB-4/NFS | 58 | 60.4 | 71 | 53 | 93 | - | 28 | 102 | 34 | 60 |
Ooi (2017)35 | Australia | FIB-4/NFS | 101 | 49 | 33.7 | 34.7 | 79.2 | 73.3 | 41.9 | - | - | - |
Ooi (2017)35 | Australia | FIB-4/NFS | 53 | 43 | 30.2 | 24.5 | 45.3 | 19.2 | 46.6 | - | - | - |
Patel (2018)36 | USA | FIB-4/NFS | 114 | 41.8 | 79 | 30 | - | - | 33.9 | - | 49 | 84 |
Patel (2018)36 | USA | FIB-4/NFS | 151 | 60 | 95 | 70 | - | - | 33.7 | - | 48.2 | 54.6 |
Pérez-Gutiérrez (2013)37 | Mexico | FIB-4/NFS | 243 | 48.6 | 49 | 21.5 | - | - | - | - | 57.6 | 73 |
Petta (2015)38 | Italy | FIB-4/NFS | 179 | 45.4 | 67.5 | 19.5 | 24 | - | 29.3 | - | 45.7 | 80.3 |
Petta (2015)38 | Italy | FIB-4/NFS | 142 | 43.9 | 71.8 | 15.4 | 11.9 | - | 27.4 | - | 42.2 | 75.6 |
Petta (2019)39 | Global | FIB-4/NFS | 968 | 50.1 | 62.9 | 37 | 39.4 | - | 29.3 | - | 46.1 | 76.1 |
Siddiqui (2020)40 | USA | FIB-4/NFS | 1,904 | 50.3 | 37 | 39 | 58 | 62 | 34.4 | - | 51.2 | 69.8 |
Singh (2020)41 | USA | FIB-4/NFS | 1,134 | 51.1 | 35.4 | 100 | 74.5 | 70.8 | 35.5 | - | 27 | 28 |
Treeprasertsuk (2016)42 | Thailand | FIB-4/NFS | 139 | 40.9 | 47 | 38 | - | - | 36.1 | - | 38 | 56 |
Wong (2008)43 | Hong Kong | NFS | 128 | 46 | 59 | 57 | 48 | - | 28.5 | 95 | 43 | 75 |
Wong (2010)44 | Hong Kong | FIB-4/NFS | 246 | 51 | 54.9 | 36.2 | 40.2 | - | 28 | 94 | - | 75 |
Xun (2012)45 | China | FIB-4/NFS | 152 | 37.1 | 79.6 | 32.2 | - | - | 26.1 | - | 61 | 100 |
Yang (2019)46 | China | FIB-4/NFS | 453 | 36.56 | 58.9 | 30.2 | 34.8 | - | 26.93 | - | 74.12 | 135.11 |
Yoneda (2013)47 | Japan | FIB-4/NFS | 235 | 59.9 | - | 46 | - | 63.8 | 26.9 | - | 24.7 | 23.7 |
Zhou (2019)48 | China | FIB-4/NFS | 207 | 41.8 | 73 | 24.6 | 20.3 | 46.6 | 27 | 91.2 | 45.7 | 49 |
DM, diabetes mellitus; HTN, hypertension; BMI, body mass index; AST, aspartate aminotransferase; ALT, alanine aminotransferase; FIB-4, Fibrosis-4; NFS, nonalcoholic fatty liver disease (NAFLD) fibrosis score.
Analyzing the diagnostic accuracy of FIB-4 for predicting AF involved 13,764 patients from 32 studies (Table 2). As a lower cutoff value for predicting AF, a value from 1.02 to 1.45 was most frequently used (20 studies), and as a higher cutoff value, 2.67 was predominantly used (18 studies). Regarding the FIB-4 index, pooled sensitivity was 0.42 (95% confidence interval [CI], 0.33 to 0.51) and pooled specificity was 0.93 (95% CI, 0.91 to 0.95). Pooled diagnostic odds ratio (DOR) with 95% CI was 10.83 (7.55 to 15.54) with I2 of 85% (p<0.01). Summary statistics of FIB-4 at various thresholds for prediction of AF and forest plots are presented in Supplementary Table 1 and Supplementary Fig. 1. The area under the receiver operating characteristic curve (AUC) of summary receiver operating characteristic (SROC) was 0.76 (95% CI, 0.74 to 0.81) (Fig. 2A).
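For readers unfamiliar with how the index is computed, the following sketch applies the standard published FIB-4 formula, age × AST / (platelet count × √ALT), and sorts the result with a dual cutoff. The formula is not restated in this article, the function names are hypothetical, and the default lower cutoff of 1.30 is an assumption chosen from within the 1.02 to 1.45 range reported above. The upper cutoff of 2.67 is the value most often used in the included studies.

```python
import math

def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_per_l):
    """FIB-4 index: age x AST / (platelets x sqrt(ALT)).
    AST and ALT in U/L, platelet count in 10^9/L."""
    return (age_years * ast_u_l) / (platelets_10e9_per_l * math.sqrt(alt_u_l))

def classify_fib4(score, lower=1.30, upper=2.67):
    """Dual-cutoff reading: below `lower` suggests low risk of advanced fibrosis,
    above `upper` suggests high risk. The lower default is an assumption."""
    if score < lower:
        return "low"
    if score > upper:
        return "high"
    return "indeterminate"

# Hypothetical patient: 50 years old, AST 40 U/L, ALT 36 U/L, platelets 200 x 10^9/L
score = fib4(50, 40, 36, 200)
category = classify_fib4(score)
```

Passing `lower` and `upper` explicitly makes it easy to reproduce any of the threshold rows in Table 2.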
Table 2. Summary Sensitivities, Specificities, Diagnostic Odds Ratio, Positive Likelihood Ratio, Negative Likelihood Ratio and AUC of FIB-4 Index and NFS at Various Diagnostic Thresholds for the Prediction of Advanced Fibrosis and Significant Fibrosis
Variable | Cutoff | No. of samples (No. of patients) | Sensitivity (95% CI) | Specificity (95% CI) | DOR (95% CI) | LR+ (95% CI) | LR– (95% CI) | AUC (95% CI) |
---|---|---|---|---|---|---|---|---|
Advanced fibrosis | | | | | | | | |
FIB-4 | All | 32 (13,764) | 0.42 (0.33–0.51) | 0.93 (0.91–0.95) | 10.83 (7.55–15.54) | 6.65 (5.01–8.83) | 0.61 (0.52–0.71) | 0.76 (0.74–0.81) |
FIB-4 | 1.02 to 1.45 | 20 (10,304) | 0.69 (0.59–0.77) | 0.64 (0.57–0.71) | 4.11 (2.24–7.55) | 1.96 (1.49–2.57) | 0.47 (0.33–0.67) | 0.73 (0.67–0.80) |
FIB-4 | 1.515 to 2.09 | 5 (2,408) | 0.74 (0.58–0.85) | 0.80 (0.72–0.86) | 11.97 (7.48–19.16) | 3.82 (3.00–4.88) | 0.32 (0.20–0.51) | 0.82 (0.78–0.87) |
FIB-4 | 2.67 | 18 (8,731) | 0.34 (0.27–0.42) | 0.95 (0.92–0.96) | 9.32 (6.26–13.87) | 6.46 (4.66–8.95) | 0.69 (0.62–0.77) | 0.74 (0.70–0.76) |
FIB-4 | 3.25 | 9 (2,721) | 0.39 (0.22–0.59) | 0.95 (0.93–0.97) | 14.23 (5.92–34.19) | 8.97 (4.87–16.50) | 0.63 (0.45–0.86) | 0.76 (0.77–0.88) |
NFS | All | 33 (13,337) | 0.38 (0.28–0.50) | 0.94 (0.90–0.96) | 10.16 (7.18–14.37) | 6.60 (4.85–8.97) | 0.64 (0.54–0.76) | 0.74 (0.71–0.79) |
NFS | –1.98 to –1.036 | 23 (10,158) | 0.70 (0.57–0.80) | 0.61 (0.53–0.69) | 3.77 (2.16–6.56) | 1.83 (1.46–2.28) | 0.48 (0.33–0.70) | 0.72 (0.66–0.79) |
NFS | –0.126 to 0.19 | 2 (1,962) | 0.61 (0.28–0.86) | 0.87 (0.79–0.92) | 10.79 (3.65–31.84) | 4.79 (3.13–7.35) | 0.44 (0.20–0.97) | 0.80 (0.78–0.88) |
NFS | 4.39 to 4.8 | 31 (11,471) | 0.31 (0.23–0.41) | 0.95 (0.93–0.97) | 10.17 (7.06–14.63) | 7.29 (5.34–9.93) | 0.71 (0.63–0.81) | 0.73 (0.71–0.81) |
Significant fibrosis | | | | | | | | |
FIB-4 | All | 6 (547) | 0.42 (0.16–0.73) | 0.93 (0.56–0.99) | 9.71 (2.43–38.70) | 6.05 (1.25–29.29) | 0.62 (0.40–0.95) | 0.68 (0.65–0.76) |
FIB-4 | 0.66 to 0.89 | 4 (434) | 0.69 (0.55–0.80) | 0.61 (0.44–0.76) | 3.58 (1.96–6.53) | 1.79 (1.26–2.55) | 0.50 (0.35–0.70) | 0.66 (0.61–0.77) |
FIB-4 | 1.4 to 1.9 | 2 (136) | 0.65 (0.53–0.75) | 0.66 (0.51–0.79) | 3.69 (1.61–8.49) | 1.94 (1.21–3.09) | 0.52 (0.35–0.78) | 0.74 (0.54–0.85) |
FIB-4 | 2.67 to 3.25 | 2 (151) | 0.06 (0.02–0.22) | 0.98 (0.94–1.00) | 4.25 (0.57–31.68) | 10.82 (0.47–257.67) | 0.93 (0.85–1.02) | 0.67 (0.61–0.73) |
NFS | All | 5 (539) | 0.25 (0.02–0.82) | 0.76 (0.37–0.94) | 1.16 (0.37–3.57) | 1.12 (0.50–2.49) | 0.96 (0.69–1.33) | 0.60 (0.52–0.69) |
NFS | –3.168 to –1.455 | 2 (335) | 0.57 (0.28–0.81) | 0.68 (0.37–0.88) | 2.82 (1.53–5.18) | 1.78 (1.11–2.84) | 0.63 (0.42–0.94) | 0.61 (0.58–0.72) |
NFS | 0.676 | 3 (279) | 0.04 (0.01–0.25) | 0.92 (0.77–0.98) | 0.52 (0.07–3.59) | 0.54 (0.08–3.49) | 1.03 (0.96–1.12) | 0.53 (0.42–0.73) |
NFS | 1.292 | 2 (154) | 0.81 (0.59–0.92) | 0.27 (0.16–0.44) | 1.65 (0.50–5.36) | 1.12 (0.87–1.44) | 0.67 (0.26–1.73) | 0.67 (0.57–0.75) |
AUC, area under curve; FIB-4, fibrosis-4; NFS, nonalcoholic fatty liver disease (NAFLD) fibrosis score; CI, confidence interval; DOR, diagnostic odds ratio; LR+, positive likelihood ratio; LR–, negative likelihood ratio.
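The summary statistics in Table 2 are linked by simple identities, which the sketch below illustrates for a single hypothetical 2×2 table. The counts and function names are illustrative assumptions, not data from the included studies. Because the pooled estimates in Table 2 were each meta-analyzed separately, they will not satisfy these identities exactly.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Sensitivity, specificity, likelihood ratios and DOR from one 2x2 table."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1 - spec)      # LR+ = sensitivity / (1 - specificity)
    lr_neg = (1 - sens) / spec      # LR- = (1 - sensitivity) / specificity
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": lr_pos / lr_neg,     # equals (tp * tn) / (fp * fn)
    }

def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    odds = pre_test_probability / (1 - pre_test_probability) * likelihood_ratio
    return odds / (1 + odds)

# Hypothetical study matching the pooled lower-cutoff FIB-4 sensitivity (0.69)
# and specificity (0.64): 100 patients with and 100 without advanced fibrosis.
summary = diagnostic_summary(tp=69, fp=36, fn=31, tn=64)
```

With these counts the single-table LR+ is about 1.92 and the DOR about 3.96, close to but not identical to the pooled 1.96 and 4.11 in Table 2, as expected for separately pooled quantities.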
Analyzing the diagnostic accuracy of NFS for predicting AF involved 13,337 patients from 33 studies (Table 2). As a lower cutoff value for predicting AF, a value from –1.98 to –1.03 was most frequently used (23 studies), and as a higher cutoff value, 4.39 to 4.8 was predominantly used (31 studies). For NFS, pooled sensitivity was 0.38 (95% CI, 0.28 to 0.50) and pooled specificity was 0.94 (95% CI, 0.90 to 0.96) (Table 2). Pooled DOR with 95% CI was 10.16 (7.18 to 14.37) with I2 of 85% (p<0.01), indicating heterogeneity among the studies (Table 2, Supplementary Fig. 2). Summary statistics of NFS at various thresholds for prediction of AF and forest plots are presented in Supplementary Table 2 and Supplementary Fig. 2. The AUC of SROC was 0.74 (95% CI, 0.71 to 0.79) (Fig. 2B).
Studies on SF were relatively scarce compared with studies on AF (6 studies vs 32 studies) (Table 2). Also, while the cutoff for AF was consistent across studies, the cutoff for SF differed substantially between studies. For the FIB-4 index, the pooled sensitivity was 0.42 (95% CI, 0.16 to 0.73) and pooled specificity was 0.93 (95% CI, 0.56 to 0.99). The pooled DOR with 95% CI was 9.71 (2.43 to 38.70) with I2 of 0% (p=0.54). The AUC of SROC was 0.68 (95% CI, 0.65 to 0.76) (Fig. 2C). Summary statistics of FIB-4 at various thresholds for prediction of SF and forest plots are presented in Supplementary Table 3 and Supplementary Fig. 3.
In regard to NFS, the pooled sensitivity was 0.25 (95% CI, 0.02 to 0.82) and pooled specificity was 0.76 (95% CI, 0.37 to 0.94). The pooled DOR with 95% CI was 1.16 (0.37 to 3.57) with I2 of 0% (p=0.54), indicating homogeneity. The AUC of SROC was 0.60 (95% CI, 0.52 to 0.69) (Fig. 2D). Summary statistics of NFS at various thresholds for prediction of SF and forest plots are presented in Supplementary Table 4 and Supplementary Fig. 4.
Finally, we analyzed whether the accuracy of NFS or FIB-4 differs according to study region and body mass index (BMI) (Table 3). Regions were classified into three categories (Asia, Europe, and America), excluding the two studies that were conducted globally on two or more continents. We found that the accuracy of FIB-4 or NFS for predicting F3 was higher in Europe (FIB-4: pooled DOR 16.37, NFS: pooled DOR 21.94) than in Asia (FIB-4: pooled DOR 6.09, NFS: pooled DOR 6.22) or America (FIB-4: pooled DOR 6.23, NFS: pooled DOR 3.70). For BMI, individual patient data could not be obtained, so the average BMI of each study was used. In all studies, mean BMI was 25 kg/m2 or higher (minimum, 26.1 kg/m2), and BMI was classified into three groups: <30 kg/m2, 30 to <35 kg/m2, and 35 kg/m2 or higher. In the stratified analysis, BMI did not significantly affect the accuracy of NFS or FIB-4. A meta-regression analysis of the effect of BMI on the DOR of FIB-4 or NFS was also performed, but it was not significant either (Table 4).
Table 3. Summary Sensitivities, Specificities, Diagnostic Odds Ratio, Positive Likelihood Ratio, Negative Likelihood Ratio and AUC of FIB-4 Index and NFS According to Region or Body Mass Index for Prediction of Advanced Fibrosis
Variable | Subgroup | No. of samples (No. of patients) | Sensitivity (95% CI) | Specificity (95% CI) | DOR (95% CI) | LR+ (95% CI) | LR– (95% CI) | AUC (95% CI) |
---|---|---|---|---|---|---|---|---|
Region | | | | | | | | |
FIB-4 (F3) | Asia | 12 (3,388) | 0.48 (0.36–0.60) | 0.86 (0.80–0.91) | 6.09 (4.44–8.36) | 3.63 (2.77–4.75) | 0.59 (0.49–0.72) | 0.74 (0.71–0.77) |
FIB-4 (F3) | Europe | 6 (1,978) | 0.61 (0.43–0.76) | 0.91 (0.83–0.95) | 16.37 (7.93–33.74) | 6.95 (3.91–12.32) | 0.42 (0.27–0.64) | 0.80 (0.78–0.82) |
FIB-4 (F3) | America | 8 (4,155) | 0.51 (0.37–0.65) | 0.85 (0.71–0.93) | 6.23 (2.51–15.47) | 3.52 (1.72–7.22) | 0.56 (0.42–0.75) | 0.73 (0.68–0.78) |
NFS (F3) | Asia | 12 (3,407) | 0.38 (0.24–0.54) | 0.90 (0.83–0.95) | 6.22 (4.52–8.57) | 4.20 (3.07–5.76) | 0.67 (0.55–0.82) | 0.70 (0.65–0.74) |
NFS (F3) | Europe | 7 (2,098) | 0.55 (0.39–0.71) | 0.94 (0.85–0.98) | 21.94 (10.03–47.94) | 10.23 (4.43–23.64) | 0.46 (0.33–0.65) | 0.83 (0.80–0.86) |
NFS (F3) | America | 8 (4,447) | 0.50 (0.33–0.68) | 0.78 (0.64–0.87) | 3.70 (1.77–7.74) | 2.33 (1.44–3.77) | 0.62 (0.45–0.88) | 0.72 (0.69–0.76) |
Body mass index | | | | | | | | |
FIB-4 (F3) | <30 kg/m2 | 12 (3,870) | 0.50 (0.38–0.61) | 0.88 (0.82–0.92) | 7.67 (5.46–10.78) | 4.33 (3.21–5.85) | 0.56 (0.46–0.68) | 0.75 (0.71–0.82) |
FIB-4 (F3) | 30 to <35 kg/m2 | 10 (1,331) | 0.51 (0.38–0.64) | 0.88 (0.78–0.93) | 7.92 (3.73–16.81) | 4.33 (2.39–7.85) | 0.54 (0.41–0.71) | 0.74 (0.72–0.76) |
FIB-4 (F3) | ≥35 kg/m2 | 4 (675) | 0.60 (0.48–0.71) | 0.82 (0.69–0.90) | 7.31 (5.29–10.12) | 3.49 (2.26–5.40) | 0.47 (0.39–0.57) | 0.77 (0.74–0.80) |
NFS (F3) | <30 kg/m2 | 12 (3,889) | 0.35 (0.23–0.50) | 0.93 (0.88–0.96) | 8.17 (5.54–12.05) | 5.60 (3.82–8.21) | 0.68 (0.57–0.82) | 0.71 (0.67–0.75) |
NFS (F3) | 30 to <35 kg/m2 | 9 (4,583) | 0.51 (0.35–0.66) | 0.84 (0.75–0.91) | 5.96 (2.77–12.85) | 3.41 (1.98–5.86) | 0.57 (0.41–0.78) | 0.73 (0.70–0.67) |
NFS (F3) | ≥35 kg/m2 | 6 (2,205) | 0.55 (0.33–0.76) | 0.83 (0.61–0.94) | 6.37 (3.82–10.63) | 3.38 (1.81–6.32) | 0.53 (0.37–0.75) | 0.80 (0.74–0.85) |
AUC, area under curve; FIB-4, fibrosis-4; NFS, nonalcoholic fatty liver disease (NAFLD) fibrosis score; CI, confidence interval; DOR, diagnostic odds ratio; LR+, positive likelihood ratio; LR–, negative likelihood ratio.
Table 4. Meta-Regression for Diagnostic Odds Ratio of Each Measurement
Variable | FIB-4 for AF, coefficient (95% CI) | FIB-4 for AF, p-value | NFS for AF, coefficient (95% CI) | NFS for AF, p-value |
---|---|---|---|---|
Mean age, yr | 0.063 (0.022 to 0.104) | 0.002 | 0.095 (0.048 to 0.143) | <0.001 |
Proportion of male, % | –0.019 (–0.037 to –0.001) | 0.034 | –0.022 (–0.043 to –0.001) | 0.040 |
DM, % | 0.005 (–0.011 to 0.021) | 0.561 | 0.013 (–0.001 to 0.028) | 0.065 |
HTN, % | 0.028 (–0.001 to 0.045) | 0.055 | 0.021 (–0.003 to 0.046) | 0.089 |
Dyslipidemia, % | –0.009 (–0.067 to 0.048) | 0.747 | 0.005 (–0.028 to 0.039) | 0.758 |
Mean BMI, kg/m2 | 0.004 (–0.086 to 0.095) | 0.919 | 0.062 (–0.032 to 0.156) | 0.196 |
Mean waist, cm | –0.009 (–0.075 to 0.056) | 0.774 | 0.001 (–0.070 to 0.071) | 0.997 |
Mean AST | –0.020 (–0.041 to 0.001) | 0.060 | –0.023 (–0.047 to 0.001) | 0.055 |
Mean ALT | –0.014 (–0.026 to –0.002) | 0.017 | –0.018 (–0.032 to –0.005) | 0.007 |
FIB-4, fibrosis-4; AF, advanced fibrosis; NFS, nonalcoholic fatty liver disease (NAFLD) fibrosis score; CI, confidence interval; DM, diabetes mellitus; HTN, hypertension; BMI, body mass index; AST, aspartate aminotransferase; ALT, alanine aminotransferase.
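One way to read the coefficients in Table 4, sketched below, is as multiplicative effects on the DOR. This rests on an assumption that the text does not state explicitly: that the meta-regression was fitted on the natural-log DOR scale, which is the usual convention. Under that assumption a coefficient b implies the DOR is multiplied by exp(b) per one-unit increase in the covariate. The function name is hypothetical.

```python
import math

def dor_multiplier(coefficient, delta=1.0):
    """Multiplicative change in DOR for a `delta`-unit covariate increase,
    ASSUMING the coefficient is on the natural-log DOR scale."""
    return math.exp(coefficient * delta)

# Mean age coefficient for FIB-4 from Table 4: 0.063 per year of study mean age
per_year = dor_multiplier(0.063)          # roughly a 6.5% higher DOR per year
per_decade = dor_multiplier(0.063, 10)    # roughly 1.88-fold per 10 years
```

Because the covariates are study-level means, such effects describe differences between studies rather than between individual patients.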
This systematic review and meta-analysis of 36 relevant studies indicated that the FIB-4 index and NFS can be effectively used to predict the degree of liver fibrosis in NAFLD. Additionally, our results demonstrated that the diagnostic accuracy of FIB-4 and NFS is relatively higher in predicting AF than in SF. Our study holds significance in its ability to assist clinicians in deciding treatments for NAFLD patients, by accurately predicting the degree of liver fibrosis.
Of all panels based on serological markers, the NFS was the most frequently studied. The NFS is a scoring system derived from 733 NAFLD patients diagnosed by liver biopsy.10 In previous studies, the AUC of NFS for diagnosing hepatic fibrosis was 0.82 to 0.88. On that basis, two cutoff values (<–1.455 [low probability, negative predictive value 88% to 93%] and >0.676 [high probability, positive predictive value 82% to 90%]) were proposed.10 Existing meta-analyses of the diagnostic ability of NFS in NAFLD have reported AUCs of 0.73 to 0.86, which is consistent with the results of our study.49-51 However, the NFS leaves some patients with an indeterminate probability, classified as neither high nor low probability for advanced liver fibrosis; in such cases, a liver biopsy may be necessary.52
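The dual-cutoff logic described above can be sketched as follows. The regression formula is the standard published NFS equation, which this article does not restate, so treat the coefficients as an assumption to verify against reference 10. The function names and the example patient are hypothetical. The cutoffs –1.455 and 0.676 are those quoted in the text.

```python
def nafld_fibrosis_score(age_years, bmi_kg_m2, ifg_or_diabetes,
                         ast_u_l, alt_u_l, platelets_10e9_per_l, albumin_g_dl):
    """NFS = -1.675 + 0.037*age + 0.094*BMI + 1.13*(IFG or diabetes, yes=1)
             + 0.99*(AST/ALT) - 0.013*platelets - 0.66*albumin."""
    return (-1.675
            + 0.037 * age_years
            + 0.094 * bmi_kg_m2
            + 1.13 * (1 if ifg_or_diabetes else 0)
            + 0.99 * (ast_u_l / alt_u_l)
            - 0.013 * platelets_10e9_per_l
            - 0.66 * albumin_g_dl)

def classify_nfs(score, low_cutoff=-1.455, high_cutoff=0.676):
    """Low probability below low_cutoff, high probability above high_cutoff,
    otherwise indeterminate (where a liver biopsy may be necessary)."""
    if score < low_cutoff:
        return "low probability"
    if score > high_cutoff:
        return "high probability"
    return "indeterminate"

# Hypothetical patient: 50 years, BMI 30, diabetic, AST 40, ALT 40,
# platelets 200 x 10^9/L, albumin 4.0 g/dL
score = nafld_fibrosis_score(50, 30, True, 40, 40, 200, 4.0)
category = classify_nfs(score)
```

This hypothetical patient scores about –0.125 and falls in the indeterminate band, which illustrates the limitation noted in the paragraph above.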
On the other hand, the FIB-4 index was created by Sterling
However, in NAFLD, NFS and FIB-4 showed higher diagnostic ability than other noninvasive panels. Notably, both markers demonstrated an ability to diagnose AF similar to that of magnetic resonance elastography.56 Therefore, most NAFLD guidelines recommend NFS and FIB-4 as screening tools for diagnosing AF.57-59
First, our study found that FIB-4 has better diagnostic performance than NFS in predicting AF (AUC 0.76 vs AUC 0.74). Similar results have been reported in previous studies. According to a meta-analysis conducted on 13,046 NAFLD subjects in 2017, the AUROCs of FIB-4 and NFS for the prediction of AF were 0.80 and 0.78, respectively, indicating that the FIB-4 index has higher diagnostic accuracy than NFS.49 Also, in a meta-analysis of 5,735 NAFLD patients in 2021, the AUROCs of the FIB-4 index and NFS were 0.76 (95% CI, 0.74 to 0.77) and 0.73 (95% CI, 0.71 to 0.75), respectively.60
Second, according to our data, both markers predicted SF less accurately than AF. This suggests that the diagnostic accuracy of each scoring system is higher in patients with more severe fibrosis, as supported by previous studies. According to a study by Xiao
The third finding of our study is that the ability of FIB-4 or NFS to predict AF was lower than in previous reports. We attribute this to the diversity of the patient population included in our analysis. In fact, when existing studies and meta-analyses are compared, the AUC of FIB-4 or NFS gradually decreases as the number of patients or studies analyzed increases. In an analysis of 145 patients, the AUCs of FIB-4 and NFS were 0.86 and 0.81, respectively.61 In 1,038 patients, the AUC of FIB-4 was 0.849.50 In an analysis of 5,735 patients from 37 studies, the AUCs of FIB-4 and NFS were 0.76 and 0.73, which is almost consistent with our results.60 In our study, the AUCs of FIB-4 and NFS were 0.76 and 0.74, respectively.
Finally, our study found that the accuracy of FIB-4 or NFS for predicting F3 was higher in Europe than in Asia or America. We attribute this to the relatively high proportion of Caucasians in the cohorts in which FIB-4 and NFS were developed: 79% for FIB-4 and 90% for NFS.9,10,62 Therefore, the accuracy of FIB-4 or NFS is relatively low in Asian countries and in the multiracial Americas.
Advantages of these markers in NAFLD include low cost, quick diagnosis, and easy repeatability. In primary care, the use of noninvasive markers can increase early detection of AF, decrease avoidable referral of patients with mild diseases and ensure cost-effectiveness.63 Therefore, many experts recommend implementing a two-tier approach to improve resource utilization.18,57,64 On the other hand, compared to transient elastography or magnetic resonance elastography, disadvantages such as the low ability to diagnose AF should always be noted.60
The strength of our meta-analysis is its focus on comparing the diagnostic accuracies of two noninvasive and routinely usable scoring systems for predicting the degree of liver fibrosis in NAFLD patients. To the best of our knowledge, this meta-analysis has the largest sample size among those that compare the diagnostic accuracy of noninvasive scoring systems for AF and SF in NAFLD patients. However, there are some limitations to consider. First, the biggest limitation of this study is that it adds little new information to the existing meta-analyses of FIB-4 or NFS. Before starting the study, we reviewed the existing meta-analyses and found that more of the literature than expected had not been included in them. We therefore reapplied precise inclusion criteria to obtain more accurate results, and as a result could analyze a wider range of papers than the existing meta-analyses. In the end, the results of our study are not very different from those of the existing meta-analyses and thus provide few new findings. However, our study has clinical significance as the most extensive analysis of this subject. Second, limiting our review to manuscripts published in English may have caused publication bias. We assessed publication bias with a funnel plot and Egger's test, which showed no publication bias for FIB-4 for AF (p=0.135). However, there was publication bias for FIB-4 for SF and for NFS for AF and SF (p<0.05), which may limit the credibility of our results. Third, only two noninvasive scoring systems were the focus of our analysis. Other noninvasive scoring systems such as the BMI, aspartate aminotransferase/alanine aminotransferase ratio, diabetes (BARD) score and the aspartate aminotransferase to platelet ratio index were not considered.50,65 Fourth, our research focused on liver fibrosis and did not consider the degree of hepatic steatosis.
Furthermore, the included studies did not provide sufficient information on the patients’ duration of NAFLD or past treatments, which can affect the incidence and severity of liver fibrosis.
In summary, both FIB-4 index and NFS were useful in predicting the degree of liver fibrosis in NAFLD. The diagnostic accuracy of these scoring systems was higher in predicting AF than in SF. Thus, the FIB-4 index and NFS may be considered as alternative diagnostic methods to liver biopsy when predicting the level of fibrosis in NAFLD.
Supplementary materials can be accessed at https://doi.org/10.5009/gnl210391.
This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2021R1G1A1007886), and in part by the Soonchunhyang University Research Fund.
No potential conflict of interest relevant to this article was reported.
Study concept and design: D.W.J. Provision of study materials or patients: M.C., H.W.L. Collection and assembly of data: S.H.K., Y.C., S.B.A., D.S.S. Data analysis and interpretation: B.L., S.H., J.J.Y. Manuscript writing: S.H., J.L., J.J.Y. Final approval of manuscript: all authors.
Gut and Liver 2022; 16(6): 952-963
Published online November 15, 2022 https://doi.org/10.5009/gnl210391
Copyright © Gut and Liver.
Sangsoo Han1, Miyoung Choi2, Bora Lee3, Hye-Won Lee4, Seong Hee Kang5, Yuri Cho6, Sang Bong Ahn7, Do Seon Song8, Dae Won Jun9, Jieun Lee10, Jeong-Ju Yoo11
1Department of Emergency Medicine, Soonchunhyang University Bucheon Hospital, Bucheon, 2Clinical Evidence Research, National Evidence-based Healthcare Collaborating Agency (NECA), 3Department of Statistics, Graduate School of Chung-Ang University, 4Department of Internal Medicine, Yonsei University College of Medicine, Seoul, 5Department of Internal Medicine, Wonju Severance Christian Hospital, Yonsei University Wonju College of Medicine, Wonju, 6Center for Liver and Pancreatobiliary Cancer, National Cancer Center, Goyang, 7Department of Internal Medicine, Nowon Eulji Medical Center, Eulji University College of Medicine, Seoul, 8Department of Internal Medicine, St. Vincent’s Hospital, College of Medicine, The Catholic University of Korea, Suwon, 9Department of Internal Medicine, Hanyang University College of Medicine, Seoul, 10Department of Internal Medicine, Soonchunhyang University College of Medicine, Cheonan, and 11Department of Internal Medicine, Soonchunhyang University Bucheon Hospital, Bucheon, Korea
Correspondence to: Dae Won Jun
ORCID https://orcid.org/0000-0002-2875-6139
E-mail noshin@hanyang.ac.kr
Jeong-Ju Yoo
ORCID https://orcid.org/0000-0002-7802-0381
E-mail puby17@naver.com
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Background/Aims: Several noninvasive scoring systems have been developed to determine the risk of advanced fibrosis in nonalcoholic fatty liver disease (NAFLD). We examined the diagnostic accuracy of the fibrosis-4 (FIB-4) score and NAFLD fibrosis score (NFS) in patients with biopsy-proven NAFLD.
Methods: For this meta-analysis, various databases including PubMed (MEDLINE), EMBASE, OVID Medline and the Cochrane Library were systematically searched. After the acquired abstracts were reviewed by two investigators, manuscripts were chosen for a full-text examination.
Results: Thirty-six studies evaluating biopsy-proven NAFLD were selected for meta-analysis. A total of 14,992 patients were analyzed. At the lower cutoff, the sensitivity of the FIB-4 score for predicting histological fibrosis stage 3 or more (≥F3) was 69%, with a specificity of 64%, positive likelihood ratio (LR+) of 1.96, and negative likelihood ratio (LR–) of 0.47. At the lower cutoff, the sensitivity of the NFS for predicting ≥F3 was 70%, with a specificity of 61%, LR+ of 1.83, and LR– of 0.48. The area under the receiver operating characteristic curve (AUC) values of the FIB-4 score for predicting ≥F3 and ≥F2 were 76% and 68%, respectively. The AUC values of the NFS for predicting ≥F3 and ≥F2 were 74% and 60%, respectively.
Conclusions: The FIB-4 or NFS test can be used to predict the degree of liver fibrosis in NAFLD, and the diagnostic accuracy was relatively high for fibrosis stage F3 or higher.
Keywords: Liver fibrosis, Meta-analysis, Nonalcoholic fatty liver disease, Predictive value of tests
With a prevalence of 25% to 40% in the general population, nonalcoholic fatty liver disease (NAFLD) is the most common liver disease worldwide, a pressing health concern associated with insulin resistance and metabolic syndrome.1,2 NAFLD affects nearly 100 million individuals in the United States and occurs in 90% of the obese population.1,3 Given this disease burden, early identification of patients at high risk of NAFLD-associated morbidity and mortality is essential.
NAFLD can be categorized into various stages, from simple steatosis without fibrosis to nonalcoholic steatohepatitis related cirrhosis.1 The severity of NAFLD is determined by three factors: steatosis, inflammation, and fibrosis. Among these factors, the degree of hepatic fibrosis is the most essential factor in clinical settings, allowing clinicians to estimate the long-term prognosis in patients with NAFLD, such as the development of hepatocellular carcinoma, liver-related death or cardiovascular mortality.1,4 In fact, while simple steatosis is considered a non-progressive condition, nonalcoholic steatohepatitis or significant fibrosis (SF) is regarded as one of the main causes of liver transplantation.5 Therefore, it is vital to promptly identify advanced fibrosis (AF; ≥stage 3 fibrosis) or SF (≥stage 2 fibrosis) in such patients.6
Liver biopsy is the gold standard for staging and identifying fibrosis in NAFLD patients.7 However, it is not suitable for routine screening because of its invasive nature, potential complications, possibility of sampling error, and high cost.7,8 Therefore, a simple, inexpensive, and noninvasive panel to identify and quantify liver fibrosis is necessary. Although techniques such as magnetic resonance elastography and transient elastography have recently been developed, their high cost prevents their use as routine screening tests. Thus, noninvasive fibrosis scoring systems based on serologic tests, such as the fibrosis-4 (FIB-4) index and NAFLD fibrosis score (NFS), have been developed and widely used as screening tools to assess the degree of fibrosis.9,10 However, a comprehensive study of these scoring systems is needed, because the FIB-4 score was originally developed for patients with viral hepatitis and the reported accuracy of these serologic scoring systems in NAFLD patients differs among studies. The aim of this systematic review and meta-analysis is to evaluate the diagnostic accuracy of noninvasive fibrosis scoring systems (FIB-4 and NFS), using the corresponding liver histologic data as the reference, for predicting AF and SF in patients with NAFLD.
This meta-analysis adhered to the protocol previously registered with PROSPERO (International Prospective Register of Systematic Reviews, CRD42021241243). We conducted this systematic review and meta-analysis following the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Diagnostic Test Accuracy.
Studies that documented the accuracy of FIB-4 and NFS, evaluated against the corresponding liver histology results in NAFLD patients, were considered eligible for inclusion. The following criteria were required for studies to be selected: (1) patients with NAFLD; (2) reports of the accuracy of FIB-4 and NFS based on liver histology results. The state of fatty liver was determined by histologic characteristics. Eligible study designs were randomized controlled trials, cross-sectional studies, and cohort studies, both prospective and retrospective. Studies were excluded by the following criteria: (1) case reports; (2) case series involving fewer than five patients in total; (3) reviews; (4) cell or animal studies; (5) chronic viral hepatitis, such as hepatitis B or hepatitis C; (6) human immunodeficiency virus; (7) significant alcohol consumption; (8) fatty liver defined by imaging or serologic criteria, without any histology result provided; or (9) non-English studies.
The primary outcome of this meta-analysis was the diagnostic accuracy of the FIB-4 score and NFS, compared to the corresponding liver histology in patients with NAFLD.
We searched PubMed (MEDLINE), EMBASE, the Cochrane Library, Korean Medical Database, and Korean Studies Information Service System to identify studies published in English between January 1, 1997, and October 31, 2020. The keywords used in the Patient/Problem, Intervention, Comparison, and Outcome model are provided in the Supplementary Material. The search words were NAFLD index words, FIB-4-related index words or NFS-related index words. We combined free-text words and controlled terms such as Medical Subject Headings and EMTREE according to the databases. The search strategy and the results for each database are provided in the Supplementary Material. The entire search process was conducted by a professional librarian (M.C.).
During the process of study selection, two reviewers (S.H. and J.J.Y.) first independently extracted relevant titles and abstracts. After an independent examination of the full-text articles, any resulting disparity between the two reviewers was resolved by a discussion with a third reviewer (H.W.L. or S.H.K.).
In addition, the two reviewers thoroughly examined the remaining procedures, such as screening full-text articles and assessing the risk of bias. The extraction of study characteristics and outcomes was conducted independently and documented in a standardized format by the two reviewers. Any discrepancy was settled by a discussion with Y.C. and S.B.A.
To determine the risk of bias, we used the Cochrane risk of bias tool; the relevant information is provided in the Supplementary Material. Again, any discrepancy was resolved by discussion with additional reviewers (D.S.S. and D.W.J.). Risk of bias was evaluated using two tools, QUADAS11 and QUIPS.12 The overall outcome of the risk of bias assessment is provided in the Supplementary Material. Publication bias was evaluated using a funnel plot.
The meta-analysis of sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio proceeded as follows: (1) each proportion was transformed using the Freeman-Tukey variant of the arcsine square root transformation; (2) the pooled proportion was calculated as the back-transformation of the weighted mean of the transformed proportions, using DerSimonian-Laird weights under a random-effects model; (3) the confidence interval was calculated as the Clopper-Pearson interval. To further analyze the heterogeneity among the studies, we conducted a meta-regression to assess the influence of other factors on diagnostic accuracy. RevMan 5 (Cochrane Library) and the meta package in R version 4.1.0 (The R Foundation for Statistical Computing, Vienna, Austria) were used for the statistical analyses.
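Steps (1) and (2) can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names and the example counts are hypothetical, the back-transformation uses the simple approximation sin²(t/2) rather than a more exact inversion, and the Clopper-Pearson interval of step (3) is omitted.

```python
import math

def freeman_tukey(events, n):
    """Freeman-Tukey double arcsine transform and its approximate variance."""
    t = (math.asin(math.sqrt(events / (n + 1)))
         + math.asin(math.sqrt((events + 1) / (n + 1))))
    return t, 1.0 / (n + 0.5)

def dersimonian_laird(effects, variances):
    """Random-effects weighted mean using the DerSimonian-Laird tau^2."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return pooled, tau2

def pooled_proportion(counts):
    """counts: (events, total) per study, e.g. (true positives, diseased)."""
    transformed = [freeman_tukey(x, n) for x, n in counts]
    pooled_t, tau2 = dersimonian_laird(
        [t for t, _ in transformed], [v for _, v in transformed]
    )
    # Simplified back-transformation (an assumption of this sketch).
    return math.sin(pooled_t / 2) ** 2, tau2

# Hypothetical per-study counts, for illustration only.
sensitivity, tau2 = pooled_proportion([(40, 60), (55, 90), (20, 45)])
print(round(sensitivity, 3), round(tau2, 4))
```

The same routine would apply to specificity by passing (true negatives, non-diseased) pairs instead.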
A thorough database search of titles and abstracts resulted in 86 relevant studies. Of these 86, 50 studies were excluded due to inappropriate patient population (n=1), inappropriate outcome measurement (n=42), overlapping population (n=6), or insufficient data (n=1). Finally, 36 studies were eligible for inclusion in this review (Fig. 1). Detailed characteristics of the studies included in this meta-analysis are provided in Table 1.13-48 A total of 14,992 patients were analyzed, with a mean age of 48.57±6.13 years. Studies were conducted in various regions of the world (Asia 16, Europe 8, America 9, two or more continents 2, and Australia 1). The median co-morbidity rates of diabetes, hypertension, and dyslipidemia were 36.2%, 43.5%, and 54.3%, respectively. Median aspartate aminotransferase and alanine aminotransferase levels were 43 U/L and 62 U/L, respectively.
Table 1. Characteristics and Results of the Included Studies.
Author (year) | Location | Method | No. of samples | Mean age, yr | Male, % | DM, % | HTN, % | Dyslipidemia, % | Mean BMI, kg/m2 | Mean waist, cm | Mean AST, U/L | Mean ALT, U/L |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Aida (2015)13 | Japan | FIB-4 | 148 | 61 | 36 | - | - | - | 26.9 | - | 42 | 52 |
Anstee (2019)14 | Global | FIB-4/NFS | 3,123 | 59 | 42 | 68 | - | - | - | - | 44 | 45 |
Balakrishnan (2020)15 | USA | FIB-4/NFS | 99 | 44.7 | 33.9 | 43.5 | - | - | 32.1 | - | 59 | 108 |
Boursier (2019)16 | France | FIB-4/NFS | 938 | 56.5 | 58.5 | 51.1 | - | - | 31.8 | - | 39 | 56 |
Chan (2015)17 | Malaysia | NFS | 147 | 50.5 | 54.4 | 52.4 | 89.1 | - | 29.3 | 98.2 | 41 | 71 |
Chan (2019)18 | Asia | FIB-4/NFS | 583 | 50.9 | 52.9 | 52.3 | 55.1 | 74.5 | 28.9 | 96.5 | 38 | 63 |
Cui (2015)19 | USA | FIB-4 | 102 | 51 | 41.2 | 25.5 | - | - | 31.7 | - | 42.3 | 58 |
Demir (2013)20 | Germany | NFS | 120 | 43.8 | 47.2 | 19.9 | 41.9 | - | 37 | - | 36.8 | 56.6 |
de Carli (2019)21 | Brazil | NFS | 266 | 36.5 | 20.4 | 10.5 | - | - | 44.2 | 123.9 | 24.9 | 32.3 |
Goh (2015)22 | USA | NFS | 238 | 52 | 29.3 | 100 | - | 42 | 37.1 | - | 53.5 | 64 |
Goh (2015)22 | USA | NFS | 263 | 46 | 46 | 0 | - | 15.2 | 35.2 | - | 58.5 | 78 |
Joo (2017)23 | Korea | FIB-4/NFS | 315 | 55 | 50.8 | 37.8 | 38.4 | - | 27 | 94.6 | 36 | 43 |
Jun (2017)24 | Korea | FIB-4/NFS | 328 | 36.4 | 70.7 | 33 | 14.6 | - | 28.6 | 96.4 | 91.3 | 98.5 |
Kakisaka (2018)25 | Japan | FIB-4 | 63 | 54.9 | 58 | - | - | - | 28.1 | - | 62 | 94 |
Kao (2020)26 | Taiwan | FIB-4 | 73 | 35.3 | 31.5 | 16.9 | 26.8 | - | 41 | 118.3 | 38.2 | 55 |
Kaya (2020)27 | Turkey | FIB-4/NFS | 463 | 46 | 47.5 | 37.8 | 34.8 | - | 31.7 | 104 | 42 | 66 |
Kim (2013)28 | USA | FIB-4/NFS | 142 | 52.8 | 26.8 | 27.5 | 45.1 | - | 36.32 | - | 47.2 | 60.4 |
Labenz (2018)29 | Germany | FIB-4/NFS | 261 | 51 | 52.5 | 29.9 | - | 37.5 | 30.9 | - | 48 | 60 |
Lang (2020)30 | Germany | FIB-4/NFS | 95 | 50 | 46.2 | 10.8 | 56.9 | - | 30 | 105 | 32.5 | 50.5 |
Lum (2020)31 | Singapore | FIB-4/NFS | 263 | 50.4 | 52.5 | 49 | - | 66.5 | 30.4 | 113.2 | - | - |
McPherson (2013)32 | UK | FIB-4/NFS | 70 | 54 | 56 | 43 | - | - | 32.9 | 105 | 28 | 28 |
McPherson (2013)32 | UK | FIB-4/NFS | 235 | 48 | 63 | 40 | - | - | 34.4 | 110 | 59 | 95 |
Meneses (2020)33 | Spain | FIB-4/NFS | 50 | 49 | 30 | 26 | 52 | 28 | 44.3 | 135 | 21 | 25 |
Nasr (2016)34 | Sweden | FIB-4/NFS | 58 | 60.4 | 71 | 53 | 93 | - | 28 | 102 | 34 | 60 |
Ooi (2017)35 | Australia | FIB-4/NFS | 101 | 49 | 33.7 | 34.7 | 79.2 | 73.3 | 41.9 | - | - | - |
Ooi (2017)35 | Australia | FIB-4/NFS | 53 | 43 | 30.2 | 24.5 | 45.3 | 19.2 | 46.6 | - | - | - |
Patel (2018)36 | USA | FIB-4/NFS | 114 | 41.8 | 79 | 30 | - | - | 33.9 | - | 49 | 84 |
Patel (2018)36 | USA | FIB-4/NFS | 151 | 60 | 95 | 70 | - | - | 33.7 | - | 48.2 | 54.6 |
Pérez-Gutiérrez (2013)37 | Mexico | FIB-4/NFS | 243 | 48.6 | 49 | 21.5 | - | - | - | - | 57.6 | 73 |
Petta (2015)38 | Italy | FIB-4/NFS | 179 | 45.4 | 67.5 | 19.5 | 24 | - | 29.3 | - | 45.7 | 80.3 |
Petta (2015)38 | Italy | FIB-4/NFS | 142 | 43.9 | 71.8 | 15.4 | 11.9 | - | 27.4 | - | 42.2 | 75.6 |
Petta (2019)39 | Global | FIB-4/NFS | 968 | 50.1 | 62.9 | 37 | 39.4 | - | 29.3 | - | 46.1 | 76.1 |
Siddiqui (2020)40 | USA | FIB-4/NFS | 1,904 | 50.3 | 37 | 39 | 58 | 62 | 34.4 | - | 51.2 | 69.8 |
Singh (2020)41 | USA | FIB-4/NFS | 1,134 | 51.1 | 35.4 | 100 | 74.5 | 70.8 | 35.5 | - | 27 | 28 |
Treeprasertsuk (2016)42 | Thailand | FIB-4/NFS | 139 | 40.9 | 47 | 38 | - | - | 36.1 | - | 38 | 56 |
Wong (2008)43 | Hong Kong | NFS | 128 | 46 | 59 | 57 | 48 | - | 28.5 | 95 | 43 | 75 |
Wong (2010)44 | Hong Kong | FIB-4/NFS | 246 | 51 | 54.9 | 36.2 | 40.2 | - | 28 | 94 | - | 75 |
Xun (2012)45 | China | FIB-4/NFS | 152 | 37.1 | 79.6 | 32.2 | - | - | 26.1 | - | 61 | 100 |
Yang (2019)46 | China | FIB-4/NFS | 453 | 36.56 | 58.9 | 30.2 | 34.8 | - | 26.93 | - | 74.12 | 135.11 |
Yoneda (2013)47 | Japan | FIB-4/NFS | 235 | 59.9 | - | 46 | - | 63.8 | 26.9 | - | 24.7 | 23.7 |
Zhou (2019)48 | China | FIB-4/NFS | 207 | 41.8 | 73 | 24.6 | 20.3 | 46.6 | 27 | 91.2 | 45.7 | 49 |
DM, diabetes mellitus; HTN, hypertension; BMI, body mass index; AST, aspartate aminotransferase; ALT, alanine aminotransferase; FIB-4, fibrosis-4; NFS, nonalcoholic fatty liver disease (NAFLD) fibrosis score.
Analyzing the diagnostic accuracy of FIB-4 for predicting AF involved 13,764 patients from 32 studies (Table 2). As a lower cutoff value for predicting AF, a value from 1.02 to 1.45 was most frequently used (20 studies), and as a higher cutoff value, 2.67 was predominantly used (18 studies). Regarding the FIB-4 index, pooled sensitivity was 0.42 (95% confidence interval [CI], 0.33 to 0.51) and pooled specificity was 0.93 (95% CI, 0.91 to 0.95). Pooled diagnostic odds ratio (DOR) with 95% CI was 10.83 (7.55 to 15.54) with I2 of 85% (p<0.01). Summary statistics of FIB-4 at various thresholds for prediction of AF and forest plots are presented in Supplementary Table 1 and Supplementary Fig. 1. The area under the receiver operating characteristic curve (AUC) of summary receiver operating characteristic (SROC) was 0.76 (95% CI, 0.74 to 0.81) (Fig. 2A).
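For orientation, the sketch below shows how a FIB-4 value is computed and read against a lower and a higher cutoff. The formula is the commonly cited definition of the index and is not restated in this article. The function names are hypothetical, and the default lower cutoff of 1.30 is an assumption chosen from within the 1.02 to 1.45 range reported above; 2.67 is the higher cutoff most often used by the included studies.

```python
import math

def fib4_index(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 = age x AST / (platelet count x sqrt(ALT)).

    Units: age in years, AST and ALT in U/L, platelets in 10^9/L.
    """
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))

def classify_fib4(score, lower=1.30, upper=2.67):
    """Three-way reading of a FIB-4 value against two cutoffs.

    The default cutoffs are illustrative assumptions, not a recommendation.
    """
    if score < lower:
        return "AF unlikely"
    if score > upper:
        return "AF likely"
    return "indeterminate"

# Hypothetical patient: 60 years, AST 50 U/L, ALT 36 U/L, platelets 200 x 10^9/L.
score = fib4_index(60, 50, 36, 200)
print(score, classify_fib4(score))  # 2.5 indeterminate
```

Using two cutoffs rather than one is what produces the high-specificity, low-sensitivity profile at 2.67 and the reverse profile at the lower cutoff.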
Table 2. Summary Sensitivities, Specificities, Diagnostic Odds Ratio, Positive Likelihood Ratio, Negative Likelihood Ratio and AUC of FIB-4 Index and NFS at Various Diagnostic Thresholds for the Prediction of Advanced Fibrosis and Significant Fibrosis.
Variable | Cutoff | No. of samples (No. of patients) | Sensitivity (95% CI) | Specificity (95% CI) | DOR (95% CI) | LR+ (95% CI) | LR– (95% CI) | AUC (95% CI) |
---|---|---|---|---|---|---|---|---|
Advanced fibrosis | | | | | | | | |
FIB-4 | All | 32 (13,764) | 0.42 (0.33–0.51) | 0.93 (0.91–0.95) | 10.83 (7.55–15.54) | 6.65 (5.01–8.83) | 0.61 (0.52–0.71) | 0.76 (0.74–0.81) |
FIB-4 | 1.02 to 1.45 | 20 (10,304) | 0.69 (0.59–0.77) | 0.64 (0.57–0.71) | 4.11 (2.24–7.55) | 1.96 (1.49–2.57) | 0.47 (0.33–0.67) | 0.73 (0.67–0.80) |
FIB-4 | 1.515 to 2.09 | 5 (2,408) | 0.74 (0.58–0.85) | 0.80 (0.72–0.86) | 11.97 (7.48–19.16) | 3.82 (3.00–4.88) | 0.32 (0.20–0.51) | 0.82 (0.78–0.87) |
FIB-4 | 2.67 | 18 (8,731) | 0.34 (0.27–0.42) | 0.95 (0.92–0.96) | 9.32 (6.26–13.87) | 6.46 (4.66–8.95) | 0.69 (0.62–0.77) | 0.74 (0.70–0.76) |
FIB-4 | 3.25 | 9 (2,721) | 0.39 (0.22–0.59) | 0.95 (0.93–0.97) | 14.23 (5.92–34.19) | 8.97 (4.87–16.50) | 0.63 (0.45–0.86) | 0.76 (0.77–0.88) |
NFS | All | 33 (13,337) | 0.38 (0.28–0.50) | 0.94 (0.90–0.96) | 10.16 (7.18–14.37) | 6.60 (4.85–8.97) | 0.64 (0.54–0.76) | 0.74 (0.71–0.79) |
NFS | –1.98 to –1.036 | 23 (10,158) | 0.70 (0.57–0.80) | 0.61 (0.53–0.69) | 3.77 (2.16–6.56) | 1.83 (1.46–2.28) | 0.48 (0.33–0.70) | 0.72 (0.66–0.79) |
NFS | –0.126 to 0.19 | 2 (1,962) | 0.61 (0.28–0.86) | 0.87 (0.79–0.92) | 10.79 (3.65–31.84) | 4.79 (3.13–7.35) | 0.44 (0.20–0.97) | 0.80 (0.78–0.88) |
NFS | 4.39 to 4.8 | 31 (11,471) | 0.31 (0.23–0.41) | 0.95 (0.93–0.97) | 10.17 (7.06–14.63) | 7.29 (5.34–9.93) | 0.71 (0.63–0.81) | 0.73 (0.71–0.81) |
Significant fibrosis | | | | | | | | |
FIB-4 | All | 6 (547) | 0.42 (0.16–0.73) | 0.93 (0.56–0.99) | 9.71 (2.43–38.70) | 6.05 (1.25–29.29) | 0.62 (0.40–0.95) | 0.68 (0.65–0.76) |
FIB-4 | 0.66 to 0.89 | 4 (434) | 0.69 (0.55–0.80) | 0.61 (0.44–0.76) | 3.58 (1.96–6.53) | 1.79 (1.26–2.55) | 0.50 (0.35–0.70) | 0.66 (0.61–0.77) |
FIB-4 | 1.4 to 1.9 | 2 (136) | 0.65 (0.53–0.75) | 0.66 (0.51–0.79) | 3.69 (1.61–8.49) | 1.94 (1.21–3.09) | 0.52 (0.35–0.78) | 0.74 (0.54–0.85) |
FIB-4 | 2.67 to 3.25 | 2 (151) | 0.06 (0.02–0.22) | 0.98 (0.94–1.00) | 4.25 (0.57–31.68) | 10.82 (0.47–257.67) | 0.93 (0.85–1.02) | 0.67 (0.61–0.73) |
NFS | All | 5 (539) | 0.25 (0.02–0.82) | 0.76 (0.37–0.94) | 1.16 (0.37–3.57) | 1.12 (0.50–2.49) | 0.96 (0.69–1.33) | 0.60 (0.52–0.69) |
NFS | –3.168 to –1.455 | 2 (335) | 0.57 (0.28–0.81) | 0.68 (0.37–0.88) | 2.82 (1.53–5.18) | 1.78 (1.11–2.84) | 0.63 (0.42–0.94) | 0.61 (0.58–0.72) |
NFS | 0.676 | 3 (279) | 0.04 (0.01–0.25) | 0.92 (0.77–0.98) | 0.52 (0.07–3.59) | 0.54 (0.08–3.49) | 1.03 (0.96–1.12) | 0.53 (0.42–0.73) |
NFS | 1.292 | 2 (154) | 0.81 (0.59–0.92) | 0.27 (0.16–0.44) | 1.65 (0.50–5.36) | 1.12 (0.87–1.44) | 0.67 (0.26–1.73) | 0.67 (0.57–0.75) |
AUC, area under curve; FIB-4, fibrosis-4; NFS, nonalcoholic fatty liver disease (NAFLD) fibrosis score; CI, confidence interval; DOR, diagnostic odds ratio; LR+, positive likelihood ratio; LR–, negative likelihood ratio.
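The likelihood ratios and diagnostic odds ratio in Table 2 are defined from sensitivity and specificity. The short sketch below applies those textbook definitions to the point estimates for FIB-4 at the lower cutoff; the function name is hypothetical. Because each pooled statistic in the table is estimated across studies in its own right, values derived from the pooled sensitivity and specificity are close to, but not identical with, the tabulated LR+, LR–, and DOR.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) from a single sensitivity/specificity pair."""
    lr_pos = sensitivity / (1.0 - specificity)    # how much a positive result raises the odds
    lr_neg = (1.0 - sensitivity) / specificity    # how much a negative result lowers the odds
    dor = lr_pos / lr_neg                         # diagnostic odds ratio
    return lr_pos, lr_neg, dor

# FIB-4 at the 1.02 to 1.45 cutoff for AF (point estimates from Table 2).
lr_pos, lr_neg, dor = likelihood_ratios(0.69, 0.64)
print(round(lr_pos, 2), round(lr_neg, 2), round(dor, 2))  # 1.92 0.48 3.96
```

An LR+ near 2 and an LR– near 0.5 each shift the pre-test odds by only about a factor of two, which is why the lower cutoff is modest both for ruling in and for ruling out AF.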
Analyzing the diagnostic accuracy of NFS for predicting AF involved 13,337 patients from 33 studies (Table 2). As a lower cutoff value for predicting AF, a value from –1.98 to –1.03 was most frequently used (23 studies), and as a higher cutoff value, 4.39 to 4.8 was predominantly used (31 studies). For NFS, pooled sensitivity was 0.38 (95% CI, 0.28 to 0.50) and pooled specificity was 0.94 (95% CI, 0.90 to 0.96) (Table 2). Pooled DOR with 95% CI was 10.16 (7.18 to 14.37) with I2 of 85% (p<0.01), indicating heterogeneity among the studies (Table 2, Supplementary Fig. 2). Summary statistics of NFS at various thresholds for prediction of AF and forest plots are presented in Supplementary Table 2 and Supplementary Fig. 2. The AUC of SROC was 0.74 (95% CI, 0.71 to 0.79) (Fig. 2B).
Studies on SF were relatively scarce compared to studies on AF (6 studies vs 32 studies) (Table 2). Also, while the cutoffs for AF were largely consistent across studies, the cutoffs for SF differed considerably between studies. In regard to the FIB-4 index, the pooled sensitivity was 0.42 (95% CI, 0.16 to 0.73) and pooled specificity was 0.93 (95% CI, 0.56 to 0.99). The pooled DOR with 95% CI was 9.71 (2.43 to 38.70) with I2 of 0% (p=0.54). The AUC of SROC was 0.68 (95% CI, 0.65 to 0.76) (Fig. 2C). Summary statistics of FIB-4 at various thresholds for prediction of SF and forest plots are presented in Supplementary Table 3 and Supplementary Fig. 3.
In regard to NFS, the pooled sensitivity was 0.25 (95% CI, 0.02 to 0.82) and pooled specificity was 0.76 (95% CI, 0.37 to 0.94). The pooled DOR with 95% CI was 1.16 (0.37 to 3.57) with I2 of 0% (p=0.54), indicating homogeneity. The AUC of SROC was 0.60 (95% CI, 0.52 to 0.69) (Fig. 2D). Summary statistics of NFS at various thresholds for prediction of SF and forest plots are presented in Supplementary Table 4 and Supplementary Fig. 4.
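The I2 values quoted in these paragraphs summarize between-study heterogeneity. A minimal sketch of the usual calculation from Cochran's Q follows, assuming each study contributes an effect estimate (for example a log DOR) and its variance. The function name and the input values are hypothetical and are not data from the included studies.

```python
def cochran_q_i2(effects, variances):
    """Cochran's Q and the I^2 statistic (in percent) for a set of study effects."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2 is the share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2

# Hypothetical log DORs and their variances, for illustration only.
log_dors = [1.9, 2.6, 1.2, 3.1]
variances = [0.10, 0.15, 0.08, 0.20]
q, i2 = cochran_q_i2(log_dors, variances)
print(round(q, 2), round(i2, 1))
```

When Q does not exceed its degrees of freedom, I2 is reported as 0%, which is the situation described for the SF analyses above.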
Finally, we analyzed whether the accuracy of NFS or FIB-4 differs according to study region and body mass index (BMI) (Table 3). Regions were classified into three categories (Asia, Europe, and America), excluding the two studies that were conducted globally on two or more continents. The accuracy of FIB-4 or NFS for predicting F3 was relatively higher in Europe (FIB-4: pooled DOR 16.37, NFS: pooled DOR 21.94) than in Asia (FIB-4: pooled DOR 6.09, NFS: pooled DOR 6.22) or America (FIB-4: pooled DOR 6.23, NFS: pooled DOR 3.70). For BMI, individual patient data could not be obtained, so the mean BMI of each study was used. In all studies, the mean BMI was 25 kg/m2 or higher (minimum, 26.1 kg/m2), and studies were classified into three groups: <30 kg/m2, 30 to <35 kg/m2, and 35 kg/m2 or higher. In the stratified analysis, BMI did not significantly affect the accuracy of NFS or FIB-4. A meta-regression analysis of the effect of BMI on the DOR of FIB-4 or NFS was also not significant (Table 4).
Table 3. Summary Sensitivities, Specificities, Diagnostic Odds Ratio, Positive Likelihood Ratio, Negative Likelihood Ratio and AUC of FIB-4 Index and NFS According to Region or Body Mass Index for Prediction of Advanced Fibrosis.
Variable | Subgroup | No. of samples (No. of patients) | Sensitivity (95% CI) | Specificity (95% CI) | DOR (95% CI) | LR+ (95% CI) | LR– (95% CI) | AUC (95% CI) |
---|---|---|---|---|---|---|---|---|
Region | | | | | | | | |
FIB-4 (F3) | Asia | 12 (3,388) | 0.48 (0.36–0.60) | 0.86 (0.80–0.91) | 6.09 (4.44–8.36) | 3.63 (2.77–4.75) | 0.59 (0.49–0.72) | 0.74 (0.71–0.77) |
FIB-4 (F3) | Europe | 6 (1,978) | 0.61 (0.43–0.76) | 0.91 (0.83–0.95) | 16.37 (7.93–33.74) | 6.95 (3.91–12.32) | 0.42 (0.27–0.64) | 0.80 (0.78–0.82) |
FIB-4 (F3) | America | 8 (4,155) | 0.51 (0.37–0.65) | 0.85 (0.71–0.93) | 6.23 (2.51–15.47) | 3.52 (1.72–7.22) | 0.56 (0.42–0.75) | 0.73 (0.68–0.78) |
NFS (F3) | Asia | 12 (3,407) | 0.38 (0.24–0.54) | 0.90 (0.83–0.95) | 6.22 (4.52–8.57) | 4.20 (3.07–5.76) | 0.67 (0.55–0.82) | 0.70 (0.65–0.74) |
NFS (F3) | Europe | 7 (2,098) | 0.55 (0.39–0.71) | 0.94 (0.85–0.98) | 21.94 (10.03–47.94) | 10.23 (4.43–23.64) | 0.46 (0.33–0.65) | 0.83 (0.80–0.86) |
NFS (F3) | America | 8 (4,447) | 0.50 (0.33–0.68) | 0.78 (0.64–0.87) | 3.70 (1.77–7.74) | 2.33 (1.44–3.77) | 0.62 (0.45–0.88) | 0.72 (0.69–0.76) |
Body mass index | | | | | | | | |
FIB-4 (F3) | <30 kg/m2 | 12 (3,870) | 0.50 (0.38–0.61) | 0.88 (0.82–0.92) | 7.67 (5.46–10.78) | 4.33 (3.21–5.85) | 0.56 (0.46–0.68) | 0.75 (0.71–0.82) |
FIB-4 (F3) | 30 to <35 kg/m2 | 10 (1,331) | 0.51 (0.38–0.64) | 0.88 (0.78–0.93) | 7.92 (3.73–16.81) | 4.33 (2.39–7.85) | 0.54 (0.41–0.71) | 0.74 (0.72–0.76) |
FIB-4 (F3) | ≥35 kg/m2 | 4 (675) | 0.60 (0.48–0.71) | 0.82 (0.69–0.90) | 7.31 (5.29–10.12) | 3.49 (2.26–5.40) | 0.47 (0.39–0.57) | 0.77 (0.74–0.80) |
NFS (F3) | <30 kg/m2 | 12 (3,889) | 0.35 (0.23–0.50) | 0.93 (0.88–0.96) | 8.17 (5.54–12.05) | 5.60 (3.82–8.21) | 0.68 (0.57–0.82) | 0.71 (0.67–0.75) |
NFS (F3) | 30 to <35 kg/m2 | 9 (4,583) | 0.51 (0.35–0.66) | 0.84 (0.75–0.91) | 5.96 (2.77–12.85) | 3.41 (1.98–5.86) | 0.57 (0.41–0.78) | 0.73 (0.70–0.67) |
NFS (F3) | ≥35 kg/m2 | 6 (2,205) | 0.55 (0.33–0.76) | 0.83 (0.61–0.94) | 6.37 (3.82–10.63) | 3.38 (1.81–6.32) | 0.53 (0.37–0.75) | 0.80 (0.74–0.85) |
AUC, area under curve; FIB-4, fibrosis-4; NFS, nonalcoholic fatty liver disease (NAFLD) fibrosis score; CI, confidence interval; DOR, diagnostic odds ratio; LR+, positive likelihood ratio; LR–, negative likelihood ratio.
Table 4. Meta-Regression for Diagnostic Odds Ratio of Each Measurement.
Variable | FIB-4 for AF, coefficient (95% CI) | FIB-4 for AF, p-value | NFS for AF, coefficient (95% CI) | NFS for AF, p-value |
---|---|---|---|---|
Mean age, yr | 0.063 (0.022 to 0.104) | 0.002 | 0.095 (0.048 to 0.143) | <0.001 |
Proportion of male, % | –0.019 (–0.037 to –0.001) | 0.034 | –0.022 (–0.043 to –0.001) | 0.040 |
DM, % | 0.005 (–0.011 to 0.021) | 0.561 | 0.013 (–0.001 to 0.028) | 0.065 |
HTN, % | 0.028 (–0.001 to 0.045) | 0.055 | 0.021 (–0.003 to 0.046) | 0.089 |
Dyslipidemia, % | –0.009 (–0.067 to 0.048) | 0.747 | 0.005 (–0.028 to 0.039) | 0.758 |
Mean BMI, kg/m2 | 0.004 (–0.086 to 0.095) | 0.919 | 0.062 (–0.032 to 0.156) | 0.196 |
Mean waist, cm | –0.009 (–0.075 to 0.056) | 0.774 | 0.001 (–0.070 to 0.071) | 0.997 |
Mean AST | –0.020 (–0.041 to 0.001) | 0.060 | –0.023 (–0.047 to 0.001) | 0.055 |
Mean ALT | –0.014 (–0.026 to –0.002) | 0.017 | –0.018 (–0.032 to –0.005) | 0.007 |
FIB-4, fibrosis-4; AF, advanced fibrosis; NFS, nonalcoholic fatty liver disease (NAFLD) fibrosis score; CI, confidence interval; DM, diabetes mellitus; HTN, hypertension; BMI, body mass index; AST, aspartate aminotransferase; ALT, alanine aminotransferase.
This systematic review and meta-analysis of 36 relevant studies indicated that the FIB-4 index and NFS can be effectively used to predict the degree of liver fibrosis in NAFLD. Additionally, our results demonstrated that the diagnostic accuracy of FIB-4 and NFS is relatively higher in predicting AF than in SF. Our study holds significance in its ability to assist clinicians in deciding treatments for NAFLD patients, by accurately predicting the degree of liver fibrosis.
Of all panels based on serological markers, the NFS has been the most frequently studied. NFS is a scoring system derived from 733 NAFLD patients diagnosed by liver biopsy.10 In previous studies, the AUC of NFS for hepatic fibrosis was 0.82 to 0.88. On that basis, two cutoff values (<–1.455 [low probability, negative predictive value 88% to 93%] and >0.676 [high probability, positive predictive value 82% to 90%]) were proposed.10 Existing meta-analyses of the diagnostic ability of NFS in NAFLD have reported AUCs of 0.73 to 0.86, which is consistent with the results of our study.49-51 However, the NFS cannot classify some patients as having either a high or a low probability of advanced liver fibrosis (indeterminate probability); in such cases, a liver biopsy may be necessary.52
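The two-cutoff reading of the NFS described above can be sketched as follows. The coefficients are those of the commonly cited NFS formula and are not restated in this article, so they should be checked against the original publication before any use. The function names and the example patient are hypothetical.

```python
def nafld_fibrosis_score(age_years, bmi_kg_m2, diabetes_or_ifg,
                         ast_u_l, alt_u_l, platelets_10e9_l, albumin_g_dl):
    """NAFLD fibrosis score using the commonly cited regression coefficients."""
    return (-1.675
            + 0.037 * age_years
            + 0.094 * bmi_kg_m2
            + 1.13 * (1 if diabetes_or_ifg else 0)   # impaired fasting glucose or diabetes
            + 0.99 * (ast_u_l / alt_u_l)
            - 0.013 * platelets_10e9_l
            - 0.66 * albumin_g_dl)

def classify_nfs(score, low=-1.455, high=0.676):
    """Apply the two cutoffs quoted in the text."""
    if score < low:
        return "low probability"
    if score > high:
        return "high probability"
    return "indeterminate"

# Hypothetical patient: 50 years, BMI 30, no diabetes, AST 40, ALT 40,
# platelets 250 x 10^9/L, albumin 4.0 g/dL.
score = nafld_fibrosis_score(50, 30.0, False, 40, 40, 250, 4.0)
print(round(score, 3), classify_nfs(score))  # -1.905 low probability
```

Scores falling between the two cutoffs land in the indeterminate group, which is the situation in which a liver biopsy may still be needed.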
On the other hand, the FIB-4 index was created by Sterling
However, in NAFLD, NFS and FIB-4 have shown higher diagnostic ability than other noninvasive panels. Notably, both markers demonstrated a diagnostic ability for AF similar to that of magnetic resonance elastography.56 Therefore, most NAFLD guidelines recommend NFS and FIB-4 as screening tools for diagnosing AF.57-59
First, our study found that FIB-4 has better diagnostic performance than NFS in predicting AF (AUC 0.76 vs AUC 0.74). Similar results have been reported in previous studies. According to a meta-analysis of 13,046 NAFLD subjects in 2017, the AUROCs of FIB-4 and NFS for the prediction of AF were 0.80 and 0.78, respectively, indicating that the FIB-4 index has higher diagnostic accuracy than NFS.49 Also, in a meta-analysis of 5,735 NAFLD patients in 2021, the AUROCs of the FIB-4 index and NFS were 0.76 (95% CI, 0.74 to 0.77) and 0.73 (95% CI, 0.71 to 0.75), respectively.60
Second, according to our data, both markers predicted SF less accurately than AF. The diagnostic accuracy of each scoring system appears to be higher in patients with more severe fibrosis, as supported by previous studies. According to a study by Xiao
The third finding of our study is that the ability of FIB-4 or NFS to predict AF is lower than in previous reports. This is likely attributable to the diversity of the patient population included in our analysis. In fact, across existing studies and meta-analyses, the AUC of FIB-4 or NFS gradually decreases as the number of patients or studies analyzed increases. In an analysis of 145 patients, the AUCs of FIB-4 and NFS were 0.86 and 0.81, respectively.61 In 1,038 patients, the AUC of FIB-4 was 0.849.50 In 5,735 patients from 37 studies, the AUCs of FIB-4 and NFS were 0.76 and 0.73, which is almost consistent with our results.60 In our study, the AUCs of FIB-4 and NFS were 0.76 and 0.74, respectively.
Finally, our study found that the accuracy of FIB-4 or NFS for predicting F3 was relatively higher in Europe than in Asia or America. A likely explanation is that the proportion of Caucasians in the cohorts in which FIB-4 and NFS were developed was relatively high, 79% for FIB-4 and 90% for NFS.9,10,62 Accordingly, the accuracy of FIB-4 or NFS is relatively low in Asian countries or in the multiracial American continent.
Advantages of these markers in NAFLD include low cost, quick diagnosis, and easy repeatability. In primary care, the use of noninvasive markers can increase early detection of AF, decrease avoidable referral of patients with mild diseases and ensure cost-effectiveness.63 Therefore, many experts recommend implementing a two-tier approach to improve resource utilization.18,57,64 On the other hand, compared to transient elastography or magnetic resonance elastography, disadvantages such as the low ability to diagnose AF should always be noted.60
The strength of our meta-analysis is its focus on comparing the diagnostic accuracies of two noninvasive and routinely usable scoring systems to predict the degree of liver fibrosis in NAFLD patients. To the best of our knowledge, this meta-analysis has the largest sample size, amongst those that compare the diagnostic accuracy of noninvasive scoring systems for AF and SF in NAFLD patients. However, there are some limitations to consider. First, the biggest limitation of this study is that it lacks new information compared to the existing meta-analysis related to FIB-4 or NFS. Before starting the study, we reviewed the existing meta-analysis literature and found that more literature than expected was not included in the analysis. Therefore, in order to obtain more accurate results, it was determined that accurate inclusion criteria should be reapplied, and as a result, a wider range of papers could be analyzed than existing meta-analysis papers. Eventually, the results of our study are not very different from the existing meta-analyses, and thus provide few new revelations. However, our study has clinical significance as the most extensive analytical research on this subject. Second, limiting our review to manuscripts published only in English may have caused publication bias. We conducted a funnel plot and Egger’s test, later to confirm that our study has no publication bias with p=0.135 in FIB-4 for AF. However, there was a publication bias in FIB-4 for SF, and NSF for AF and SF with p<0.05, which may limit the credibility of our results. Third, only two noninvasive scoring systems were the focus of our analysis. Other noninvasive scoring systems such as BMI, aspartate aminotransferase/alanine aminotransferase ratio, diabetes (BARD) score, and aspartate aminotransferase to platelet ratio tests were not considered.50,65 Fourth, our research focused on liver fibrosis and did not consider the degree of hepatic steatosis. 
Furthermore, the included studies did not provide sufficient information on the patients’ duration of NAFLD or past treatments, which can affect the incidence and severity of liver fibrosis.
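For readers unfamiliar with Egger’s test mentioned above, the following is a minimal sketch of the underlying regression: each study's standardized effect (effect divided by its standard error) is regressed on its precision (one over the standard error), and an intercept far from zero suggests funnel plot asymmetry. The software and exact variant used in this study are not stated in this section, so the function name, its inputs (for example, log diagnostic odds ratios), and the toy numbers are assumptions for illustration only.

```python
import math


def egger_regression(effects, standard_errors):
    """Ordinary least squares of effect/SE on 1/SE.

    Returns (intercept, slope, se_intercept). The Egger statistic is
    t = intercept / se_intercept with n - 2 degrees of freedom; the
    p-value lookup is left out to keep this sketch dependency-free.
    """
    n = len(effects)
    if n < 3 or n != len(standard_errors):
        raise ValueError("need at least three studies with matching inputs")
    y = [e / s for e, s in zip(effects, standard_errors)]   # standardized effect
    x = [1.0 / s for s in standard_errors]                   # precision
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    residual_variance = rss / (n - 2)
    se_intercept = math.sqrt(residual_variance * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, slope, se_intercept


# Toy data (not from the paper): effects grow with the standard error,
# which is the small-study pattern the test is designed to pick up.
toy_se = [0.1, 0.2, 0.25, 0.5]
toy_effects = [0.5 + 2.0 * s for s in toy_se]
intercept, slope, se_intercept = egger_regression(toy_effects, toy_se)
```

In the toy data the effects were built as 0.5 plus twice the standard error, so the regression recovers an intercept of 2 and a slope of 0.5; with real data the intercept would be judged against its standard error.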
In summary, both the FIB-4 index and NFS were useful in predicting the degree of liver fibrosis in NAFLD. The diagnostic accuracy of these scoring systems was higher for predicting AF than for predicting SF. Thus, the FIB-4 index and NFS may be considered as alternative diagnostic methods to liver biopsy when predicting the level of fibrosis in NAFLD.
Supplementary materials can be accessed at https://doi.org/10.5009/gnl210391.
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2021R1G1A1007886), and in part by the Soonchunhyang University Research Fund.
No potential conflict of interest relevant to this article was reported.
Study concept and design: D.W.J. Provision of study materials or patients: M.C., H.W.L. Collection and assembly of data: S.H.K., Y.C., S.B.A., D.S.S. Data analysis and interpretation: B.L., S.H., J.J.Y. Manuscript writing: S.H., J.L., J.J.Y. Final approval of manuscript: all authors.
Table 1 Characteristics and Results of the Included Studies
Author (year) | Location | Method | No. of samples | Mean age, yr | Male, % | DM, % | HTN, % | Dyslipidemia, % | Mean BMI, kg/m² | Mean waist, cm | Mean AST, U/L | Mean ALT, U/L |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Aida (2015)13 | Japan | FIB-4 | 148 | 61 | 36 | - | - | - | 26.9 | - | 42 | 52 |
Anstee (2019)14 | Global | FIB-4/NFS | 3,123 | 59 | 42 | 68 | - | - | - | - | 44 | 45 |
Balakrishnan (2020)15 | USA | FIB-4/NFS | 99 | 44.7 | 33.9 | 43.5 | - | - | 32.1 | - | 59 | 108 |
Boursier (2019)16 | France | FIB-4/NFS | 938 | 56.5 | 58.5 | 51.1 | - | - | 31.8 | - | 39 | 56 |
Chan (2015)17 | Malaysia | NFS | 147 | 50.5 | 54.4 | 52.4 | 89.1 | - | 29.3 | 98.2 | 41 | 71 |
Chan (2019)18 | Asia | FIB-4/NFS | 583 | 50.9 | 52.9 | 52.3 | 55.1 | 74.5 | 28.9 | 96.5 | 38 | 63 |
Cui (2015)19 | USA | FIB-4 | 102 | 51 | 41.2 | 25.5 | - | - | 31.7 | - | 42.3 | 58 |
Demir (2013)20 | Germany | NFS | 120 | 43.8 | 47.2 | 19.9 | 41.9 | - | 37 | - | 36.8 | 56.6 |
de Carli (2019)21 | Brazil | NFS | 266 | 36.5 | 20.4 | 10.5 | - | - | 44.2 | 123.9 | 24.9 | 32.3 |
Goh (2015)22 | USA | NFS | 238 | 52 | 29.3 | 100 | - | 42 | 37.1 | - | 53.5 | 64 |
Goh (2015)22 | USA | NFS | 263 | 46 | 46 | 0 | - | 15.2 | 35.2 | - | 58.5 | 78 |
Joo (2017)23 | Korea | FIB-4/NFS | 315 | 55 | 50.8 | 37.8 | 38.4 | - | 27 | 94.6 | 36 | 43 |
Jun (2017)24 | Korea | FIB-4/NFS | 328 | 36.4 | 70.7 | 33 | 14.6 | - | 28.6 | 96.4 | 91.3 | 98.5 |
Kakisaka (2018)25 | Japan | FIB-4 | 63 | 54.9 | 58 | - | - | - | 28.1 | - | 62 | 94 |
Kao (2020)26 | Taiwan | FIB-4 | 73 | 35.3 | 31.5 | 16.9 | 26.8 | - | 41 | 118.3 | 38.2 | 55 |
Kaya (2020)27 | Turkey | FIB-4/NFS | 463 | 46 | 47.5 | 37.8 | 34.8 | - | 31.7 | 104 | 42 | 66 |
Kim (2013)28 | USA | FIB-4/NFS | 142 | 52.8 | 26.8 | 27.5 | 45.1 | - | 36.32 | - | 47.2 | 60.4 |
Labenz (2018)29 | Germany | FIB-4/NFS | 261 | 51 | 52.5 | 29.9 | - | 37.5 | 30.9 | - | 48 | 60 |
Lang (2020)30 | Germany | FIB-4/NFS | 95 | 50 | 46.2 | 10.8 | 56.9 | - | 30 | 105 | 32.5 | 50.5 |
Lum (2020)31 | Singapore | FIB-4/NFS | 263 | 50.4 | 52.5 | 49 | - | 66.5 | 30.4 | 113.2 | - | - |
McPherson (2013)32 | UK | FIB-4/NFS | 70 | 54 | 56 | 43 | - | - | 32.9 | 105 | 28 | 28 |
McPherson (2013)32 | UK | FIB-4/NFS | 235 | 48 | 63 | 40 | - | - | 34.4 | 110 | 59 | 95 |
Meneses (2020)33 | Spain | FIB-4/NFS | 50 | 49 | 30 | 26 | 52 | 28 | 44.3 | 135 | 21 | 25 |
Nasr (2016)34 | Sweden | FIB-4/NFS | 58 | 60.4 | 71 | 53 | 93 | - | 28 | 102 | 34 | 60 |
Ooi (2017)35 | Australia | FIB-4/NFS | 101 | 49 | 33.7 | 34.7 | 79.2 | 73.3 | 41.9 | - | - | - |
Ooi (2017)35 | Australia | FIB-4/NFS | 53 | 43 | 30.2 | 24.5 | 45.3 | 19.2 | 46.6 | - | - | - |
Patel (2018)36 | USA | FIB-4/NFS | 114 | 41.8 | 79 | 30 | - | - | 33.9 | - | 49 | 84 |
Patel (2018)36 | USA | FIB-4/NFS | 151 | 60 | 95 | 70 | - | - | 33.7 | - | 48.2 | 54.6 |
Pérez-Gutiérrez (2013)37 | Mexico | FIB-4/NFS | 243 | 48.6 | 49 | 21.5 | - | - | - | - | 57.6 | 73 |
Petta (2015)38 | Italy | FIB-4/NFS | 179 | 45.4 | 67.5 | 19.5 | 24 | - | 29.3 | - | 45.7 | 80.3 |
Petta (2015)38 | Italy | FIB-4/NFS | 142 | 43.9 | 71.8 | 15.4 | 11.9 | - | 27.4 | - | 42.2 | 75.6 |
Petta (2019)39 | Global | FIB-4/NFS | 968 | 50.1 | 62.9 | 37 | 39.4 | - | 29.3 | - | 46.1 | 76.1 |
Siddiqui (2020)40 | USA | FIB-4/NFS | 1,904 | 50.3 | 37 | 39 | 58 | 62 | 34.4 | - | 51.2 | 69.8 |
Singh (2020)41 | USA | FIB-4/NFS | 1,134 | 51.1 | 35.4 | 100 | 74.5 | 70.8 | 35.5 | - | 27 | 28 |
Treeprasertsuk (2016)42 | Thailand | FIB-4/NFS | 139 | 40.9 | 47 | 38 | - | - | 36.1 | - | 38 | 56 |
Wong (2008)43 | Hong Kong | NFS | 128 | 46 | 59 | 57 | 48 | - | 28.5 | 95 | 43 | 75 |
Wong (2010)44 | Hong Kong | FIB-4/NFS | 246 | 51 | 54.9 | 36.2 | 40.2 | - | 28 | 94 | - | 75 |
Xun (2012)45 | China | FIB-4/NFS | 152 | 37.1 | 79.6 | 32.2 | - | - | 26.1 | - | 61 | 100 |
Yang (2019)46 | China | FIB-4/NFS | 453 | 36.56 | 58.9 | 30.2 | 34.8 | - | 26.93 | - | 74.12 | 135.11 |
Yoneda (2013)47 | Japan | FIB-4/NFS | 235 | 59.9 | - | 46 | - | 63.8 | 26.9 | - | 24.7 | 23.7 |
Zhou (2019)48 | China | FIB-4/NFS | 207 | 41.8 | 73 | 24.6 | 20.3 | 46.6 | 27 | 91.2 | 45.7 | 49 |
DM, diabetes mellitus; HTN, hypertension; BMI, body mass index; AST, aspartate aminotransferase; ALT, alanine aminotransferase; FIB-4, Fibrosis-4; NFS, nonalcoholic fatty liver disease (NAFLD) fibrosis score.
Table 2 Summary Sensitivities, Specificities, Diagnostic Odds Ratio, Positive Likelihood Ratio, Negative Likelihood Ratio and AUC of FIB-4 Index and NFS at Various Diagnostic Thresholds for the Prediction of Advanced Fibrosis and Significant Fibrosis
Variable | Cutoff | No. of samples (No. of patients) | Sensitivity (95% CI) | Specificity (95% CI) | DOR (95% CI) | LR+ (95% CI) | LR– (95% CI) | AUC (95% CI) |
---|---|---|---|---|---|---|---|---|
Advanced fibrosis | ||||||||
FIB-4 | All | 32 (13,764) | 0.42 (0.33–0.51) | 0.93 (0.91–0.95) | 10.83 (7.55–15.54) | 6.65 (5.01–8.83) | 0.61 (0.52–0.71) | 0.76 (0.74–0.81) |
FIB-4 | 1.02 to 1.45 | 20 (10,304) | 0.69 (0.59–0.77) | 0.64 (0.57–0.71) | 4.11 (2.24–7.55) | 1.96 (1.49–2.57) | 0.47 (0.33–0.67) | 0.73 (0.67–0.80) |
FIB-4 | 1.515 to 2.09 | 5 (2,408) | 0.74 (0.58–0.85) | 0.80 (0.72–0.86) | 11.97 (7.48–19.16) | 3.82 (3.00–4.88) | 0.32 (0.20–0.51) | 0.82 (0.78–0.87) |
FIB-4 | 2.67 | 18 (8,731) | 0.34 (0.27–0.42) | 0.95 (0.92–0.96) | 9.32 (6.26–13.87) | 6.46 (4.66–8.95) | 0.69 (0.62–0.77) | 0.74 (0.70–0.76) |
FIB-4 | 3.25 | 9 (2,721) | 0.39 (0.22–0.59) | 0.95 (0.93–0.97) | 14.23 (5.92–34.19) | 8.97 (4.87–16.50) | 0.63 (0.45–0.86) | 0.76 (0.77–0.88) |
NFS | All | 33 (13,337) | 0.38 (0.28–0.50) | 0.94 (0.90–0.96) | 10.16 (7.18–14.37) | 6.60 (4.85–8.97) | 0.64 (0.54–0.76) | 0.74 (0.71–0.79) |
NFS | –1.98 to –1.036 | 23 (10,158) | 0.70 (0.57–0.80) | 0.61 (0.53–0.69) | 3.77 (2.16–6.56) | 1.83 (1.46–2.28) | 0.48 (0.33–0.70) | 0.72 (0.66–0.79) |
NFS | –0.126 to 0.19 | 2 (1,962) | 0.61 (0.28–0.86) | 0.87 (0.79–0.92) | 10.79 (3.65–31.84) | 4.79 (3.13–7.35) | 0.44 (0.20–0.97) | 0.80 (0.78–0.88) |
NFS | 4.39 to 4.8 | 31 (11,471) | 0.31 (0.23–0.41) | 0.95 (0.93–0.97) | 10.17 (7.06–14.63) | 7.29 (5.34–9.93) | 0.71 (0.63–0.81) | 0.73 (0.71–0.81) |
Significant fibrosis | ||||||||
FIB-4 | All | 6 (547) | 0.42 (0.16–0.73) | 0.93 (0.56–0.99) | 9.71 (2.43–38.70) | 6.05 (1.25–29.29) | 0.62 (0.40–0.95) | 0.68 (0.65–0.76) |
FIB-4 | 0.66 to 0.89 | 4 (434) | 0.69 (0.55–0.80) | 0.61 (0.44–0.76) | 3.58 (1.96–6.53) | 1.79 (1.26–2.55) | 0.50 (0.35–0.70) | 0.66 (0.61–0.77) |
FIB-4 | 1.4 to 1.9 | 2 (136) | 0.65 (0.53–0.75) | 0.66 (0.51–0.79) | 3.69 (1.61–8.49) | 1.94 (1.21–3.09) | 0.52 (0.35–0.78) | 0.74 (0.54–0.85) |
FIB-4 | 2.67 to 3.25 | 2 (151) | 0.06 (0.02–0.22) | 0.98 (0.94–1.00) | 4.25 (0.57–31.68) | 10.82 (0.47–257.67) | 0.93 (0.85–1.02) | 0.67 (0.61–0.73) |
NFS | All | 5 (539) | 0.25 (0.02–0.82) | 0.76 (0.37–0.94) | 1.16 (0.37–3.57) | 1.12 (0.50–2.49) | 0.96 (0.69–1.33) | 0.60 (0.52–0.69) |
NFS | –3.168 to –1.455 | 2 (335) | 0.57 (0.28–0.81) | 0.68 (0.37–0.88) | 2.82 (1.53–5.18) | 1.78 (1.11–2.84) | 0.63 (0.42–0.94) | 0.61 (0.58–0.72) |
NFS | 0.676 | 3 (279) | 0.04 (0.01–0.25) | 0.92 (0.77–0.98) | 0.52 (0.07–3.59) | 0.54 (0.08–3.49) | 1.03 (0.96–1.12) | 0.53 (0.42–0.73) |
NFS | 1.292 | 2 (154) | 0.81 (0.59–0.92) | 0.27 (0.16–0.44) | 1.65 (0.50–5.36) | 1.12 (0.87–1.44) | 0.67 (0.26–1.73) | 0.67 (0.57–0.75) |
AUC, area under curve; FIB-4, fibrosis-4; NFS, nonalcoholic fatty liver disease (NAFLD) fibrosis score; CI, confidence interval; DOR, diagnostic odds ratio; LR+, positive likelihood ratio; LR–, negative likelihood ratio.
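The summary statistics in Table 2 are linked by standard identities, which the sketch below spells out together with the conversion of a likelihood ratio into a post-test probability. The function names and the 20% pretest probability are assumptions for illustration; the pooled LR+, LR– and DOR in the table come from the meta-analytic model, so they will not exactly match values recomputed from the pooled sensitivity and specificity.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) from a single sensitivity/specificity pair."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    diagnostic_odds_ratio = lr_positive / lr_negative
    return lr_positive, lr_negative, diagnostic_odds_ratio


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Pooled FIB-4 values for advanced fibrosis ("All" row of Table 2)
lr_pos, lr_neg, dor = likelihood_ratios(0.42, 0.93)

# Hypothetical pretest probability of advanced fibrosis: 20%
after_positive = post_test_probability(0.20, 6.65)   # pooled LR+ from Table 2
after_negative = post_test_probability(0.20, 0.61)   # pooled LR- from Table 2
```

Recomputing from the pooled sensitivity and specificity gives an LR+ of about 6.0 and a DOR of about 9.6, close to but not identical with the tabulated 6.65 and 10.83. Under the assumed 20% pretest probability, a positive FIB-4 result raises the probability of AF to roughly 62%, while a negative result lowers it to roughly 13%.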
Table 3 Summary Sensitivities, Specificities, Diagnostic Odds Ratio, Positive Likelihood Ratio, Negative Likelihood Ratio and AUC of FIB-4 Index and NFS According to Region or Body Mass Index for Prediction of Advanced Fibrosis
Variable | Subgroup | No. of samples (No. of patients) | Sensitivity (95% CI) | Specificity (95% CI) | DOR (95% CI) | LR+ (95% CI) | LR– (95% CI) | AUC (95% CI) |
---|---|---|---|---|---|---|---|---|
Region | ||||||||
FIB-4 (F3) | Asia | 12 (3,388) | 0.48 (0.36–0.60) | 0.86 (0.80–0.91) | 6.09 (4.44–8.36) | 3.63 (2.77–4.75) | 0.59 (0.49–0.72) | 0.74 (0.71–0.77) |
FIB-4 (F3) | Europe | 6 (1,978) | 0.61 (0.43–0.76) | 0.91 (0.83–0.95) | 16.37 (7.93–33.74) | 6.95 (3.91–12.32) | 0.42 (0.27–0.64) | 0.80 (0.78–0.82) |
FIB-4 (F3) | America | 8 (4,155) | 0.51 (0.37–0.65) | 0.85 (0.71–0.93) | 6.23 (2.51–15.47) | 3.52 (1.72–7.22) | 0.56 (0.42–0.75) | 0.73 (0.68–0.78) |
NFS (F3) | Asia | 12 (3,407) | 0.38 (0.24–0.54) | 0.90 (0.83–0.95) | 6.22 (4.52–8.57) | 4.20 (3.07–5.76) | 0.67 (0.55–0.82) | 0.70 (0.65–0.74) |
NFS (F3) | Europe | 7 (2,098) | 0.55 (0.39–0.71) | 0.94 (0.85–0.98) | 21.94 (10.03–47.94) | 10.23 (4.43–23.64) | 0.46 (0.33–0.65) | 0.83 (0.80–0.86) |
NFS (F3) | America | 8 (4,447) | 0.50 (0.33–0.68) | 0.78 (0.64–0.87) | 3.70 (1.77–7.74) | 2.33 (1.44–3.77) | 0.62 (0.45–0.88) | 0.72 (0.69–0.76) |
Body mass index | ||||||||
FIB-4 (F3) | <30 kg/m² | 12 (3,870) | 0.50 (0.38–0.61) | 0.88 (0.82–0.92) | 7.67 (5.46–10.78) | 4.33 (3.21–5.85) | 0.56 (0.46–0.68) | 0.75 (0.71–0.82) |
FIB-4 (F3) | 30 to <35 kg/m² | 10 (1,331) | 0.51 (0.38–0.64) | 0.88 (0.78–0.93) | 7.92 (3.73–16.81) | 4.33 (2.39–7.85) | 0.54 (0.41–0.71) | 0.74 (0.72–0.76) |
FIB-4 (F3) | ≥35 kg/m² | 4 (675) | 0.60 (0.48–0.71) | 0.82 (0.69–0.90) | 7.31 (5.29–10.12) | 3.49 (2.26–5.40) | 0.47 (0.39–0.57) | 0.77 (0.74–0.80) |
NFS (F3) | <30 kg/m² | 12 (3,889) | 0.35 (0.23–0.50) | 0.93 (0.88–0.96) | 8.17 (5.54–12.05) | 5.60 (3.82–8.21) | 0.68 (0.57–0.82) | 0.71 (0.67–0.75) |
NFS (F3) | 30 to <35 kg/m² | 9 (4,583) | 0.51 (0.35–0.66) | 0.84 (0.75–0.91) | 5.96 (2.77–12.85) | 3.41 (1.98–5.86) | 0.57 (0.41–0.78) | 0.73 (0.70–0.67) |
NFS (F3) | ≥35 kg/m² | 6 (2,205) | 0.55 (0.33–0.76) | 0.83 (0.61–0.94) | 6.37 (3.82–10.63) | 3.38 (1.81–6.32) | 0.53 (0.37–0.75) | 0.80 (0.74–0.85) |
AUC, area under curve; FIB-4, fibrosis-4; NFS, nonalcoholic fatty liver disease (NAFLD) fibrosis score; CI, confidence interval; DOR, diagnostic odds ratio; LR+, positive likelihood ratio; LR–, negative likelihood ratio.
Table 4 Meta-Regression for Diagnostic Odds Ratio of Each Measurement
Variable | FIB-4 for AF, coefficient (95% CI) | FIB-4 for AF, p-value | NFS for AF, coefficient (95% CI) | NFS for AF, p-value |
---|---|---|---|---|
Mean age, yr | 0.063 (0.022 to 0.104) | 0.002 | 0.095 (0.048 to 0.143) | <0.001 |
Proportion of male, % | –0.019 (–0.037 to –0.001) | 0.034 | –0.022 (–0.043 to –0.001) | 0.040 |
DM, % | 0.005 (–0.011 to 0.021) | 0.561 | 0.013 (–0.001 to 0.028) | 0.065 |
HTN, % | 0.028 (–0.001 to 0.045) | 0.055 | 0.021 (–0.003 to 0.046) | 0.089 |
Dyslipidemia, % | –0.009 (–0.067 to 0.048) | 0.747 | 0.005 (–0.028 to 0.039) | 0.758 |
Mean BMI, kg/m² | 0.004 (–0.086 to 0.095) | 0.919 | 0.062 (–0.032 to 0.156) | 0.196 |
Mean waist, cm | –0.009 (–0.075 to 0.056) | 0.774 | 0.001 (–0.070 to 0.071) | 0.997 |
Mean AST | –0.020 (–0.041 to 0.001) | 0.060 | –0.023 (–0.047 to 0.001) | 0.055 |
Mean ALT | –0.014 (–0.026 to –0.002) | 0.017 | –0.018 (–0.032 to –0.005) | 0.007 |
FIB-4, fibrosis-4; AF, advanced fibrosis; NFS, nonalcoholic fatty liver disease (NAFLD) fibrosis score; CI, confidence interval; DM, diabetes mellitus; HTN, hypertension; BMI, body mass index; AST, aspartate aminotransferase; ALT, alanine aminotransferase.
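One way to read the coefficients in Table 4 is sketched below, under the assumption (not stated in this section) that the meta-regression was fitted on the natural-log DOR scale, as is common for this type of analysis. If that holds, exponentiating a coefficient multiplied by a change in the covariate gives the corresponding multiplicative change in the DOR. The function name is hypothetical.

```python
import math


def relative_dor(coefficient, covariate_change):
    """Multiplicative change in DOR for a given covariate change.

    Assumes the coefficient is expressed on the ln(DOR) scale.
    """
    return math.exp(coefficient * covariate_change)


# FIB-4 for AF, mean age: coefficient 0.063 (95% CI 0.022 to 0.104) from Table 4
per_decade = relative_dor(0.063, 10)
per_decade_ci = (relative_dor(0.022, 10), relative_dor(0.104, 10))
```

Under that assumption, a study population whose mean age is 10 years older would be expected to show a FIB-4 DOR roughly 1.9 times higher, with the same transformation applied to the confidence limits.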