Iterative reconstruction (IR) was developed to decrease image noise [1, 2, 3]. However, conventional IR has two major limitations: a long reconstruction time and an unnatural image texture [4, 5, 6]. Adaptive statistical iterative reconstruction-V (AV) offers a short reconstruction time [7, 8, 9] but involves a trade-off between image noise and texture.
Recently, image denoising algorithms using artificial neural networks, termed deep learning-based denoising algorithms (DLA), have been developed to overcome the drawbacks of IR [11, 12]. Shin et al. showed that although their DLA achieved less noise than filtered back projection (FBP) and advanced modeled iterative reconstruction (ADMIRE) in low-dose CT, it did not maintain spatial resolution. Jensen et al. reported that TrueFidelity, a type of DLA, improves image quality in routine-dose CT through noise reduction and an increased contrast-to-noise ratio (CNR).
Therefore, this study aimed to compare the image quality, including noise and spatial resolution, of phantom and abdominal CT at a decreased radiation dose reconstructed with a deep learning-based image reconstruction (DLIR) engine (TrueFidelity, GE Healthcare) with that of CT reconstructed with AV, which is commonly used in abdominal CT.
The raw data were reconstructed into seven different axial image sets: FBP; AV with blending factors of 30%, 50%, and 100% (AV30, AV50, and AV100, respectively); and DLIR at low, medium, and high strength (DLIR-L, DLIR-M, and DLIR-H, respectively). The noise power spectrum (NPS), calculated by the standard Fourier transform technique, describes the amount of noise (magnitude) and the noise characteristics (texture) in the spatial frequency domain [15, 16, 17]. To measure the NPS, we calculated the peak and the average spatial frequency in module 3 of the American College of Radiology (ACR) phantom (Gammex 464, Sun Nuclear, Middleton, WI, USA) at multiple doses (Figure 1). Computed tomography (CT) was performed using the following parameters: peak kilovoltage (kVp), 100; beam collimation, 0.625 × 64 mm; and tube current modulation range, 50–250 mAs. The task-based transfer function (TTF) is a representative metric of spatial resolution. We measured the TTF for two materials (bone and acrylic) in module 1. To quantify the TTF, we calculated the spatial frequency at which the measured TTF curve fell to 0.5 (TTF50%). The NPS was calculated in MATLAB (Version R2017a, The MathWorks, Inc., Natick, MA, USA), and the TTF was measured with imQuest software (Duke University), which is implemented in MATLAB.
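The NPS measurement described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code, which is not shown in this article; the function names, the mean-subtraction detrending, the radial binning, and the definition of the average spatial frequency as the first moment of the radially averaged NPS are all assumptions made for illustration.

```python
import numpy as np


def nps_2d(rois, pixel_mm):
    """Ensemble-averaged 2D NPS (HU^2 mm^2) from a stack of noise-only ROIs.

    rois has shape (n_rois, ny, nx); pixel_mm is the (assumed isotropic) pixel size.
    """
    rois = np.asarray(rois, dtype=float)
    _, ny, nx = rois.shape
    # Subtract each ROI's mean so that only the noise fluctuations remain.
    detrended = rois - rois.mean(axis=(1, 2), keepdims=True)
    dft = np.fft.fftshift(np.fft.fft2(detrended), axes=(1, 2))
    return (np.abs(dft) ** 2).mean(axis=0) * pixel_mm * pixel_mm / (nx * ny)


def radial_profile(nps, pixel_mm, n_bins=64):
    """Radially average a 2D NPS up to the Nyquist frequency."""
    ny, nx = nps.shape
    fy = np.fft.fftshift(np.fft.fftfreq(ny, d=pixel_mm))
    fx = np.fft.fftshift(np.fft.fftfreq(nx, d=pixel_mm))
    fr = np.hypot(*np.meshgrid(fx, fy))
    edges = np.linspace(0.0, 0.5 / pixel_mm, n_bins + 1)
    idx = np.digitize(fr.ravel(), edges) - 1
    valid = (idx >= 0) & (idx < n_bins)
    sums = np.bincount(idx[valid], weights=nps.ravel()[valid], minlength=n_bins)
    counts = np.bincount(idx[valid], minlength=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = counts > 0
    return centers[keep], sums[keep] / counts[keep]


def nps_summary(freq, nps_1d):
    """Return the NPS peak and the NPS-weighted average spatial frequency."""
    peak = nps_1d.max()
    f_avg = (freq * nps_1d).sum() / nps_1d.sum()
    return peak, f_avg
```

With this normalization, the 2D NPS integrated over frequency equals the pixel variance of the ROIs, which is a convenient sanity check. A lower average spatial frequency indicates a shift toward coarser noise texture.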
This retrospective study was approved by the Institutional Review Board. Two hundred and three patients underwent abdominal CT (Revolution CT; GE Healthcare) from February 2020 to April 2020. Seventy CT scans with a different combination of reconstructions, eight with large hepatic lesions (> 2 cm), and five with poor image quality were excluded. The CT scans of the remaining 120 individuals were retrospectively reviewed (Table 1). The mean body mass index of patients in this study was 23.6 ± 3.6 (SD).
|Characteristic|Mean ± SD|
|---|---|
|Age (years)|54.4 ± 20.6|
|Body mass index|23.1 ± 3.6|
|CTDIvol (mGy)|5.06 ± 1.85|
|DLP (mGy·cm)|281.29 ± 92.69|
All patients underwent abdominal CT on a CT system (Revolution, GE Healthcare) that supports both AV and DLIR reconstruction. CT was performed using the following parameters: peak kilovoltage (kVp), 100; beam collimation, 0.625 × 128 mm; tube current modulation range, 100–550 mAs; noise index, 17; gantry rotation time, 0.6 s; coverage speed, 132.29 mm/s; pitch, 0.992:1; and slice thickness, 2.5 mm. The mean volume CT dose index (CTDIvol) was 5.06 ± 1.85 (SD) mGy, and the mean dose length product (DLP) was 281.29 ± 92.69 (SD) mGy·cm. A nonionic iodinated contrast medium (ioversol 320 mg/mL; 2 mL/kg body weight) was administered for contrast enhancement. The portal venous phase was acquired with a fixed delay of 90 s after contrast administration. The raw data were reconstructed into six different image sets: FBP, AV30, AV50, and DLIR at three strength levels (DLIR-Low, DLIR-Medium, and DLIR-High; hereafter DLIR-L, DLIR-M, and DLIR-H).
One radiologist placed three circular regions of interest (ROIs) to measure the mean attenuation (HU) and noise (SD) (Figure 2). The ROIs were placed in the right lobe of the liver at the level of the right portal vein, in the abdominal aorta below the origins of both renal arteries, and in the subcutaneous fat of the right buttock. Each ROI was positioned to avoid confounding structures, such as large vessels.
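The ROI measurement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the (row, column) center convention, the radius in pixels, and the use of the sample standard deviation are hypothetical choices, not details taken from the PACS software used in the study.

```python
import numpy as np


def circular_roi_stats(image_hu, center_rc, radius_px):
    """Mean attenuation (HU) and noise (SD) inside a circular ROI.

    image_hu is a 2D array of HU values; center_rc is (row, col) in pixels.
    """
    rows, cols = np.ogrid[: image_hu.shape[0], : image_hu.shape[1]]
    mask = (rows - center_rc[0]) ** 2 + (cols - center_rc[1]) ** 2 <= radius_px ** 2
    values = image_hu[mask]
    # Sample SD (ddof=1) is assumed here; the source does not state the convention.
    return values.mean(), values.std(ddof=1)
```

Applying the same center and radius to every reconstruction of a given slice keeps the comparison between reconstructions free of placement differences.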
Two radiologists with 12 and 5 years of experience evaluated the five image sets (all reconstructions except FBP). To standardize the evaluation, a coaching session was held for the participating radiologists. Readers were blinded to the reconstruction methods, and the order of the image sets was randomized for each patient. Each reader independently graded the images with a pair-wise approach using a two-monitor high-resolution PACS workstation (EIZO RX 240). The results of one radiologist were used for the analysis, and those of the other were used to evaluate the inter-reader agreement. The image sets were ranked against one another on a comparative scale for overall image quality, image noise, and image sharpness. A score of 5 was assigned to the images with the best quality. Image sharpness was rated on the basis of the liver parenchyma, the pancreas contour, and the kidneys.
Repeated measures analysis of variance with the Bonferroni post hoc test was used to compare the NPS and TTF in the phantom and the HU and noise in patients across reconstructions. The Friedman test was used for the qualitative analysis. The weighted Cohen’s kappa statistic was used to evaluate inter-reader agreement. Statistical significance was set at p < 0.05. Statistical analyses were performed with SPSS software version 21.0 (IBM Corp.).
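For readers unfamiliar with the agreement statistic, the following Python sketch computes a weighted Cohen's kappa for two readers' ordinal scores. The analysis in this study was run in SPSS; the function below is a generic illustration, and the choice of linear weights is an assumption because the weighting scheme is not stated.

```python
import numpy as np


def weighted_kappa(reader1, reader2, categories, weights="linear"):
    """Weighted Cohen's kappa for two readers' ordinal scores."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    observed = np.zeros((k, k))
    for a, b in zip(reader1, reader2):
        observed[index[a], index[b]] += 1
    observed /= observed.sum()
    # Agreement expected by chance, from the two readers' marginal distributions.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    i, j = np.indices((k, k))
    distance = np.abs(i - j) / (k - 1)
    w = distance if weights == "linear" else distance ** 2
    return 1.0 - (w * observed).sum() / (w * expected).sum()
```

For example, `weighted_kappa(scores_a, scores_b, categories=[1, 2, 3, 4, 5])` returns 1 for identical scores; disagreements by one rank are penalized less than disagreements by several ranks.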
The phantom was scanned at CTDIvol values of 2.1, 4.2, 6.3, 8.4, and 10.5 mGy. The NPS peak decreased in the order DLIR-L, DLIR-M, and DLIR-H. Overall, the NPS peak of DLIR was smaller than that of AV30 or AV50 (Table 2).
[Table 2: NPS peak (HU²·mm²) and NPS average spatial frequency (mm⁻¹)]
The highest values of the NPS average spatial frequency were obtained for FBP. The NPS average spatial frequency decreased as the AV blending factor increased and as the DLIR strength level increased (Figure 3). Compared with AV30, the NPS average spatial frequencies were 5 to 10% higher with DLIR-L or DLIR-M. Compared with AV50, the NPS average spatial frequencies were 10 to 20% higher for all DLIR levels.
For the lower-contrast object (acrylic), TTF values in images with DLIR were higher than those with AV (Table 3). The differences in TTF were greater at low doses. For the higher-contrast object (bone), TTF values did not differ significantly between images with DLIR and those with AV.
|CTDIvol (mGy)|TTF50% (mm⁻¹) of ROI 1 (bone)|TTF50% (mm⁻¹) of ROI 2 (acrylic)|
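The TTF50% metric is the spatial frequency at which the measured TTF curve falls to 0.5. A minimal Python sketch of reading this value from a sampled curve is given below; it uses linear interpolation between the two samples that bracket the crossing, which is an assumption for illustration and not a description of how imQuest computes the value.

```python
import numpy as np


def frequency_at_level(freq, ttf, level=0.5):
    """Spatial frequency at which a sampled TTF curve first falls to `level`."""
    freq = np.asarray(freq, dtype=float)
    ttf = np.asarray(ttf, dtype=float)
    below = np.nonzero(ttf <= level)[0]
    if below.size == 0:
        raise ValueError("TTF never falls to the requested level")
    i = below[0]
    if i == 0:
        return freq[0]
    # Linear interpolation between the last sample above and the first at or below.
    f0, f1 = freq[i - 1], freq[i]
    t0, t1 = ttf[i - 1], ttf[i]
    return f0 + (t0 - level) * (f1 - f0) / (t0 - t1)
```

A higher TTF50% means that the curve stays above 0.5 up to a higher spatial frequency, that is, sharper rendering of the edge of that insert.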
The mean HU did not differ significantly among the six reconstructions. The SD in the liver and aorta differed significantly among all reconstructions (p < 0.001) (Table 4). The SD in fat differed significantly among reconstructions (p < 0.001), except between AV50 and DLIR-L. A higher blending factor in AV (AV50 vs. AV30) and a higher strength in DLIR (DLIR-H vs. DLIR-M vs. DLIR-L) yielded a significantly lower SD. The SD with DLIR-H and DLIR-M was 10 to 50% lower than that with both AV30 and AV50 (p < 0.001).
|ROI|Parameter|FBP|AV30|AV50|DLIR-L|DLIR-M|DLIR-H|p value|
|---|---|---|---|---|---|---|---|---|
|Liver|HU|130.46 ± 22.91|130.46 ± 22.91|130.47 ± 22.91|130.63 ± 22.85|130.74 ± 22.86|130.76 ± 22.86|1.000|
|Liver|SD|25.65 ± 1.81|20.03 ± 1.51|16.36 ± 1.34|18.43 ± 1.56|14.40 ± 1.26|10.05 ± 1.00|<.001|
|Aorta|HU|206.21 ± 50.56|206.47 ± 50.08|206.43 ± 50.07|208.01 ± 50.11|208.11 ± 50.07|206.47 ± 50.08|1.000|
|Aorta|SD|27.01 ± 2.51|20.72 ± 2.10|16.69 ± 1.91|19.41 ± 1.97|15.13 ± 1.52|10.50 ± 1.30|<.001|
|Fat|HU|−107.59 ± 17.71|−107.51 ± 17.73|−107.49 ± 17.71|−106.06 ± 19.58|−106.79 ± 17.55|−106.58 ± 17.52|1.000|
|Fat|SD|22.56 ± 2.10|17.88 ± 1.77|14.88 ± 1.64|14.82 ± 1.54|11.31 ± 1.32|7.56 ± 1.18|<.001|
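The relative noise reduction can be checked with simple arithmetic. The Python sketch below uses the mean SD values from the first SD row of Table 4, assuming that this row is the liver ROI and that its six value columns are FBP, AV30, AV50, DLIR-L, DLIR-M, and DLIR-H in that order; the variable and function names are illustrative.

```python
# Mean noise (SD, HU) from the first SD row of Table 4 (assumed liver ROI,
# assumed column order FBP, AV30, AV50, DLIR-L, DLIR-M, DLIR-H).
sd = {
    "FBP": 25.65,
    "AV30": 20.03,
    "AV50": 16.36,
    "DLIR-L": 18.43,
    "DLIR-M": 14.40,
    "DLIR-H": 10.05,
}


def percent_reduction(reference, test):
    """Relative decrease of `test` with respect to `reference`, in percent."""
    return 100.0 * (reference - test) / reference


reductions = {
    (dlir, av): round(percent_reduction(sd[av], sd[dlir]), 1)
    for dlir in ("DLIR-M", "DLIR-H")
    for av in ("AV30", "AV50")
}

for (dlir, av), value in reductions.items():
    print(f"{dlir} vs {av}: {value}% lower SD")
```

For this row, the reductions range from 12.0% (DLIR-M vs. AV50) to 49.8% (DLIR-H vs. AV30).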
The five reconstruction protocols differed significantly in the qualitative analysis (p < 0.001). The overall image quality was best for DLIR-M (p < 0.001) (Table 5). DLIR-H had the best ranking score for noise, but it provided worse image sharpness than DLIR-M and DLIR-L (p < 0.001). AV30 and AV50 had lower ranking scores than DLIR for all aspects (p < 0.001). Inter-reader agreement was moderate for overall image quality (κ = 0.48, p < 0.001), very good for noise (κ = 0.92, p < 0.001), and fair for image sharpness (κ = 0.24, p < 0.001).
|Criterion|AV30|AV50|DLIR-L|DLIR-M|DLIR-H|
|---|---|---|---|---|---|
|Overall image quality|1.93 ± 1.1|1.63 ± 0.78|4.04 ± 0.76|4.51 ± 0.75|2.89 ± 0.84|
|Noise|1.18 ± 0.39|1.83 ± 0.40|2.99 ± 0.09|4.00 ± 0.00|5.00 ± 0.00|
|Spatial resolution|2.18 ± 0.67|1.27 ± 0.72|4.67 ± 0.57|4.19 ± 0.60|2.69 ± 0.63|
Our study, which used a phantom and abdominal CT, demonstrated that CT reconstructed with DLIR showed a lower noise magnitude than CT reconstructed with AV30 or AV50, with noise texture and image sharpness similar to those of FBP.
DLIR was designed to differentiate signal from noise without changing the image texture. In the phantom study, DLIR images at any strength level showed a lower noise magnitude than images with AV30 or AV50, which are commonly used in clinical settings for abdominal CT. Judged by the NPS average spatial frequency, images at all DLIR levels showed better texture, closer to that of FBP, than those with AV50 or AV100. Moreover, images with DLIR-L or DLIR-M showed better texture than those with AV30, and images with DLIR-H showed texture comparable to that of AV30.
For lower-contrast objects, images with DLIR showed better image sharpness than those with AV. For higher-contrast objects, there were no significant differences between the AV and DLIR images. A previous study reported that the difference in image sharpness between DLIR and AV50 or AV100 was greater for low-contrast objects; however, it also found differences for high-contrast objects. Our results may differ because our study did not include extremely low doses.
In the patient study, CT with DLIR-M or DLIR-H had lower measured noise than CT with AV30 or AV50. CT with DLIR-L did not show significantly different noise compared with AV50. These results differed from those of our phantom study, which showed significantly lower noise in the DLIR-L images.
In the qualitative analysis, DLIR effectively reduced noise. Jensen et al. showed that readers rated images with DLIR-H as having the best overall image quality. Those authors performed CT with a noise index of 10, whereas we performed CT with a noise index of 17. In our study, CT with DLIR-M showed the best overall image quality, although DLIR-H showed lower noise. This could be due to image sharpness and texture characteristics. In the phantom study, the NPS average spatial frequency was higher with DLIR-L and DLIR-M than with AV30, whereas that with DLIR-H did not differ significantly from that with AV30. In the patient study, the evaluation of spatial resolution showed only fair inter-reader agreement, and further research on this point is needed. The time required for reconstruction is similar between DLIR and AV. Our study suggests that DLIR is suitable as the first-choice reconstruction in daily practice.
The present study had several limitations. First, the phantom we used does not closely approximate the human body. The acrylic insert has a lower HU than bone, and we considered that it could represent materials with attenuation between those of water and bone. Further studies are needed for low-contrast materials. Second, this study did not compare diagnostic performance.
In conclusion, the phantom data suggest that DLIR provides improved spatial resolution, FBP-like image texture, and effective noise reduction at a decreased radiation dose. The patient data suggest that DLIR provides effective noise reduction while preserving image quality. DLIR-M ranked higher than AV30 and AV50 in both image quality and image sharpness in abdominal CT.
This research was supported by Korea University Medical School (K1925111) and Korea University Ansan Hospital (K2111051).
We thank Min Baeggi for their technical support.
The authors have no competing interests to declare.
Pontana F, Pagniez J, Duhamel A, et al. Reduced-dose low-voltage chest CT angiography with Sinogram-affirmed iterative reconstruction versus standard-dose filtered back projection. Radiology. 2013; 267: 609–618. DOI: https://doi.org/10.1148/radiol.12120414
Pontana F, Pagniez J, Flohr T, et al. Chest computed tomography using iterative reconstruction vs. filtered back projection (Part 1): Evaluation of image noise reduction in 32 patients. Eur Radiol. 2011; 21: 627–635. DOI: https://doi.org/10.1007/s00330-010-1990-5
Singh S, Kalra MK, Hsieh J, et al. Abdominal CT: Comparison of adaptive statistical iterative and filtered back projection reconstruction techniques. Radiology. 2010; 257: 373–383. DOI: https://doi.org/10.1148/radiol.10092212
Samei E, Richard S. Assessment of the dose reduction potential of a model-based iterative reconstruction algorithm using a task-based performance metrology. Med Phys. 2015; 42: 314–323. DOI: https://doi.org/10.1118/1.4903899
Ott JG, Becce F, Monnin P, Schmidt S, Bochud FO, Verdun FR. Update on the non-prewhitening model observer in computed tomography for the assessment of the adaptive statistical and model-based iterative reconstruction algorithms. Phys Med Biol. 2014; 59: 4047–4064. DOI: https://doi.org/10.1088/0031-9155/59/4/4047
Greffier J, Frandon J, Larbi A, Beregi JP, Pereira F. CT iterative reconstruction algorithms: A task-based image quality assessment. Eur Radiol. 2020; 30: 487–500. DOI: https://doi.org/10.1007/s00330-019-06359-6
Lee NK, Kim S, Hong SB, et al. Low-dose CT with the adaptive statistical iterative reconstruction v technique in abdominal organ injury: Comparison with routine-dose CT with filtered back projection. Am J Roentgenol. 2019; 213: 659–666. DOI: https://doi.org/10.2214/AJR.18.20827
Park C, Choo KS, Kim JH, Nam KJ, Lee JW, Kim JY. Image quality and radiation dose in CT venography using model-based iterative reconstruction at 80 kVp versus adaptive statistical iterative reconstruction-v at 70 kVp. Korean J Radiol. 2019; 20: 1167–1175. DOI: https://doi.org/10.3348/kjr.2018.0897
Ren Q, Dewan SK, Li M, et al. Comparison of adaptive statistical iterative and filtered back projection reconstruction techniques in brain CT. Eur J Radiol. 2012; 81: 2597–2601. DOI: https://doi.org/10.1016/j.ejrad.2011.12.041
Padole A, Ali Khawaja RD, Kalra MK, Singh S. CT radiation dose and iterative reconstruction techniques. Am J Roentgenol. 2015; 204: W384–W392. DOI: https://doi.org/10.2214/AJR.14.13241
Kang E, Min J, Ye JC. A deep convolutional neural network using directional wavelets for low-dose X-ray CT reconstruction. Med Phys. 2017; 44: e360–e375. DOI: https://doi.org/10.1002/mp.12344
Chen H, Zhang Y, Zhang W, et al. Low-dose CT via convolutional neural network. Biomed Opt Express. 2017; 8: 679–694. DOI: https://doi.org/10.1364/BOE.8.000679
Shin YJ, Chang W, Ye JC, et al. Low-dose abdominal CT using a deep learning-based denoising algorithm: A comparison with CT reconstructed with filtered back projection or iterative reconstruction algorithm. Korean J Radiol. 2020; 21: 356–364. DOI: https://doi.org/10.3348/kjr.2019.0413
Jensen CT, Liu X, Tamm EP, et al. Image quality assessment of abdominal CT by use of new deep learning image reconstruction: Initial experience. Am J Roentgenol. 2020; 215: 50–57. DOI: https://doi.org/10.2214/AJR.19.22332
Ehman EC, Yu L, Manduca A, et al. Methods for clinical evaluation of noise reduction techniques in abdominopelvic CT. Radiographics. 2014; 34: 849–862. DOI: https://doi.org/10.1148/rg.344135128
Kijewski MF, Judy PF. The noise power spectrum of CT images. Phys Med Biol. 1987; 32: 565–575. DOI: https://doi.org/10.1088/0031-9155/32/5/003
Friedman SN, Fung GS, Siewerdsen JH, Tsui BM. A simple approach to measure computed tomography (CT) modulation transfer function (MTF) and noise-power spectrum (NPS) using the American College of Radiology (ACR) accreditation phantom. Med Phys. 2013; 40: 051907. DOI: https://doi.org/10.1118/1.4800795
Park C, Choo KS, Jung Y, Jeong HS, Hwang JY, Yun MS. CT iterative vs. deep learning reconstruction: Comparison of noise and sharpness. Eur Radiol. 2021; 31: 3156–3164. DOI: https://doi.org/10.1007/s00330-020-07358-8
Greffier J, Hamard A, Pereira F, et al. Image quality and dose reduction opportunity of deep learning image reconstruction algorithm for CT: A phantom study. Eur Radiol. 2020; 30: 3951–3959. DOI: https://doi.org/10.1007/s00330-020-06724-w