Multi-Projection Segmentation on Dental Cone Beam Computed Tomography Images Using Level Set Method

Segmentation of dental cone-beam computed tomography (CBCT) images based on boundary tracking has been widely used in recent decades. Generally, the process uses only the axial projection of the CBCT data, in which the slice images that represent the tip of the tooth have reduced contrast, so the tooth is difficult to distinguish from the background or other elements. In this paper we propose a multi-projection segmentation method that combines the level set segmentation results of three projections to detect the tooth object more reliably. Multi-projection is performed by decomposing the CBCT data into three projections: axial, sagittal, and coronal. Segmentation based on the level set method is then applied to the slice images of the three projections, and the results of the three projections are combined to obtain the final result. The proposed method achieves an accuracy, sensitivity, and specificity of 97.18%, 88.62%, and 97.61%, respectively.


I. INTRODUCTION
The contribution of dental cone-beam computed tomography (CBCT) images to dentistry as a support tool for analysis and diagnosis has triggered related research in computer science [1]. This research aims to model dental objects in CBCT images for use in dental treatment planning and simulation [2]. Segmentation of CBCT images is a fundamental step towards achieving this objective.
In the last few decades, several segmentation methodologies have been used to generate tooth models from CBCT images. Generally, CBCT image segmentation methods can be categorized into three types: manual, interactive, and automatic [3]. Computer science research related to biomedical technology focuses on interactive and automatic segmentation. Interactive segmentation involves human interaction to mark "objectivity" as reference data in the segmentation process [4]. Meanwhile, automatic segmentation does not require human interaction [5].
Interactive and automatic segmentation methods can be classified into two approaches, namely label propagation and boundary tracking [4]. The label propagation approach analyzes and processes the value of a single pixel or of a region of pixels [6]. One method of this kind is region splitting/merging [7]. The boundary tracking approach tracks the contours or edges of objects in the image [8]. The method most commonly used and developed in this approach is the active contour [9].
Region segmentation is a label propagation approach that determines the similarity of subregions based on several properties, namely intensity, color, and texture [10]. Indraswari et al. (2018) developed a segmentation method that utilizes 3D CBCT information with a region merging algorithm to identify tooth elements whose intensity is similar to that of other elements [11]. A method that falls outside the two approaches mentioned above has also been developed: deep learning is implemented in a multi-projection network that learns to recognize dental objects in CBCT images [12].
Noise in dental CBCT images is a common problem for several segmentation methods. Region-based segmentation is fast but less than optimal for this problem [13], because the division of regions depends on segmentation parameters that are affected by noise, for example intensity, color, and texture. Related research also identifies this problem as a subject for further research [11]. Deep learning methods are a possible solution. However, additional training on dental objects would be required so that the network learns to recognize the noise content in dental CBCT images.
Segmentation using the boundary tracking approach is feasible on dental CBCT images that contain noise [1]. This approach focuses on finding the contours of an object by tracing its edges. The boundary tracking method that is most widely used and developed is the level set method [14]. A disadvantage of the level set method is the re-initialization step needed to keep the extracted object boundaries from deviating. The Distance Regularized Level Set Evolution (DRLSE) method was developed to maintain the regularity of the level set function intrinsically [15].
In the last decade, segmentation of dental CBCT images using level sets has been widely researched and developed. Gao and Chae [16] developed a dental CBCT segmentation method based on the level set method with shape and intensity priors from the previous slice to segment the tooth, and achieved promising results. Ji et al. [17] developed a level set framework for anterior teeth segmentation. A hybrid level set model that integrates several methods was developed to generate optimal results [18]. Yau et al. [19] applied a level set method that focuses on reconstruction based on data fusion. Xia et al. [20] applied a level set method that focuses on the maxillary and mandibular tooth structures.
Generally, 3D CBCT images can be modeled as three types of scan projection, namely axial, sagittal, and coronal. Research in the last decade has implemented and developed the level set method with axial projection images as the input data. The axial projection can be considered the best of the three because the arrangement of axial slice images follows the direction of the tooth shape. However, the slice images that represent the tip of the tooth have reduced contrast. The intensity threshold of the object is almost the same as that of the background or other elements, so the object's contours are difficult to distinguish during segmentation [21]. Therefore, combining the axial projection with the sagittal and coronal projections is a feasible remedy, because several slice images in those projections can represent the whole tooth object.
In this paper we propose a multi-projection segmentation method that combines the level set segmentation results of three projections to detect the tooth object more reliably. Multi-projection is performed by decomposing the CBCT data into three projections: axial, sagittal, and coronal. Segmentation based on the level set method is then applied to the slice images of the three projections.

A. Dental Cone Beam Computed Tomography
The dataset used in this research consists of dental CBCT scan images of the human jaw. Cone beam computed tomography is a radiographic imaging method that allows accurate three-dimensional (3D) imaging of hard tissue structures. CBCT is the most significant of the recently emerging medical diagnostic imaging modalities [14]. The 3D image is obtained from radiographic rays acquired along 180 to 360 degrees of rotation and projected onto a receptor. The results are rendered into a three-dimensional volumetric image of the tooth structure, as shown in Figure 1a [22].
Generally, a dental CBCT image is represented as a collection of two-dimensional images. The two-dimensional images are obtained by slicing the 3D CBCT image of the teeth along the three coordinate axes. There are three ways of slicing the 3D image into 2D images: from top to bottom, from left to right, and from front to back. The axial projection shown in Figure 1b is the collection of 2D CBCT images produced by top-to-bottom slicing. The sagittal projection image in Figure 1c is a 2D CBCT image produced by left-to-right slicing, and the coronal projection image in Figure 1d is a 2D CBCT image produced by front-to-back slicing of the 3D dental CBCT image.
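As an illustration of the three slicing directions, the following minimal sketch decomposes a volume stored as a NumPy array. The axis order (z, y, x), the function name decompose_projections, and the toy volume are assumptions made for this example; they are not taken from the dataset or from the authors' implementation.

```python
import numpy as np

def decompose_projections(volume):
    """Split a volume indexed as (z, y, x) into three stacks of 2D slices.

    Assumption: axis 0 runs top to bottom, axis 1 front to back and
    axis 2 left to right. A real scan may use a different axis order.
    """
    depth, height, width = volume.shape
    axial = [volume[z, :, :] for z in range(depth)]      # top to bottom
    sagittal = [volume[:, :, x] for x in range(width)]   # left to right
    coronal = [volume[:, y, :] for y in range(height)]   # front to back
    return axial, sagittal, coronal

# Toy volume: 4 axial slices of 5 x 6 pixels.
volume = np.arange(4 * 5 * 6).reshape(4, 5, 6)
axial, sagittal, coronal = decompose_projections(volume)
```

With this layout the number of sagittal and coronal slices equals the width and height of one axial slice, and the height of each sagittal or coronal slice equals the number of axial slices.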

B. Level Set Method
The level set method has been widely used to segment medical images such as CBCT in recent decades [23]. It was first introduced by Osher and Sethian in 1988 [14]. Level set methods use dynamic variational boundaries to segment an object by means of an active contour. The segmentation process is formulated as a time-dependent partial differential equation (PDE) in the function φ(t; x; y) [24]. This function gives the level set value that represents the contour of the segmented object. The variable t represents the time evolution of the active contour, which implicitly tracks the zero level set Γ(t), the actual contour representation of the object. The value of F(t; x; y) determines the position in the proximity of the active contour Γ(t).
In the level set method, the shape curve C of the object is represented implicitly by a distance function φ,

$$C = \{(x, y) \mid \phi(x, y) = 0\},$$

where all of the coordinate points on the boundary of the object satisfy φ(x; y) = 0. This means that the curve that passes through the outline of the object consists of the points at which φ is zero. This condition defines the zero level set, and φ is called the level set function (LSF). The evolution of the LSF is written as

$$\frac{\partial \phi}{\partial t} = F\,|\nabla \phi|, \qquad (3)$$

where F is the speed function in the image segmentation, which depends on the image data and the LSF. Equation 3 is the basic or traditional equation of the level set method. The development of this method avoids the re-initialization scheme [16]. The energy minimization formulation of the LSF evolution can be written as

$$\mathcal{E}(\phi) = \mu\,\mathcal{R}_p(\phi) + \lambda\,\mathcal{L}_g(\phi) + \alpha\,\mathcal{A}_g(\phi),$$

where the energy minimization is defined on the domain Ω, a slice of the 2D CBCT images. $\mathcal{R}_p(\phi)$ is the level set regularization term, weighted by a constant μ > 0, and is also referred to as the internal energy of the LSF. The regularization term is defined as

$$\mathcal{R}_p(\phi) = \int_{\Omega} p(|\nabla \phi|)\,d\mathbf{x},$$

where p is a potential function (or energy density) $p : [0, \infty) \to \mathbb{R}$. In the energy functional, $\mathcal{L}_g(\phi)$ and $\mathcal{A}_g(\phi)$ are the energy functions that evolve the curve of the level set function. These two terms are also called the external energy of the level set function. They are defined as

$$\mathcal{L}_g(\phi) = \int_{\Omega} g\,\delta(\phi)\,|\nabla \phi|\,d\mathbf{x}$$

and

$$\mathcal{A}_g(\phi) = \int_{\Omega} g\,H(-\phi)\,d\mathbf{x},$$

where δ and H are the Dirac delta and Heaviside functions. g is an edge detector on the image, defined as a positive and decreasing function of the image gradient on the domain Ω:

$$g = \frac{1}{1 + |\nabla G_\sigma * I|^2}.$$

Here $G_\sigma * I$ is a convolution that smooths the image I, and $G_\sigma$ is a Gaussian kernel with standard deviation σ. The convolution reduces the noise contained in the test image. The range of g is between 0 and 1. With this edge detector, the value at the object's boundary is close to 0, whereas the value approaches 1 in the homogeneous background.
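To illustrate the behaviour of the edge detector, the sketch below computes an indicator of the form g = 1 / (1 + |∇(G_σ * I)|²), which is the form commonly used in the DRLSE literature. It is a NumPy-only sketch: the helper names, the default σ = 1.5, the kernel radius, and the edge padding are assumptions for this example, not the settings used in the experiments.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Normalised 1D Gaussian kernel truncated at about 3 sigma (assumed radius)."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_smooth(image, sigma):
    """Separable Gaussian smoothing with edge padding (keeps the image size)."""
    kernel = gaussian_kernel_1d(sigma)
    pad = len(kernel) // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def edge_indicator(image, sigma=1.5):
    """Return g in (0, 1]: near 0 on strong edges, 1 on flat regions."""
    smoothed = gaussian_smooth(image, sigma)
    grad_y, grad_x = np.gradient(smoothed)
    return 1.0 / (1.0 + grad_x ** 2 + grad_y ** 2)

# Synthetic slice: dark left half, bright right half (vertical edge at column 16).
image = np.zeros((32, 32))
image[:, 16:] = 100.0
g = edge_indicator(image)
```

On this synthetic slice g stays at 1 far from the intensity step and drops towards 0 on the step itself, which is the behaviour described for the object boundary and the homogeneous background.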

C. Evaluation Method
Evaluation is carried out to determine the effectiveness of the developed method. The result of the developed method is compared with the ground truth using commonly used evaluation parameters. The confusion matrix, shown in Table I, is the tool used for this evaluation. It compares the test image produced by the developed method (predicted condition) with the ground truth (true condition). Each pixel of the CBCT image produced by the segmentation method is compared with the corresponding pixel of the ground truth image. Four conditions are counted:
1) True Positive (TP): the pixel is positive in the resulting image and positive in the ground truth.
2) True Negative (TN): the pixel is negative in the resulting image and negative in the ground truth.
3) False Positive (FP): a type of error in which the predicted pixel is positive but the ground truth is negative.
4) False Negative (FN): a type of error in which the predicted pixel is negative but the ground truth is positive.
From these counts, the evaluation parameters accuracy, sensitivity, and specificity are calculated using Eq. 9 to Eq. 11:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (9)$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \qquad (10)$$

$$\text{Specificity} = \frac{TN}{TN + FP} \qquad (11)$$
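The pixel-wise evaluation can be sketched as follows. The function names and the tiny example masks are assumptions for illustration; the formulas are the standard definitions of accuracy, sensitivity, and specificity.

```python
import numpy as np

def confusion_counts(predicted, truth):
    """Count TP, TN, FP, FN between two binary masks of equal shape."""
    predicted = np.asarray(predicted, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(predicted & truth))
    tn = int(np.sum(~predicted & ~truth))
    fp = int(np.sum(predicted & ~truth))
    fn = int(np.sum(~predicted & truth))
    return tp, tn, fp, fn

def evaluate(predicted, truth):
    """Return accuracy, sensitivity and specificity as fractions in [0, 1].

    Assumes the ground truth contains at least one positive and one
    negative pixel, otherwise a division by zero occurs.
    """
    tp, tn, fp, fn = confusion_counts(predicted, truth)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Toy masks: 1 marks a tooth pixel, 0 marks background.
truth = [[1, 1, 0, 0], [0, 0, 0, 0]]
predicted = [[1, 0, 1, 0], [0, 0, 0, 0]]
scores = evaluate(predicted, truth)
```

In this toy case there is one true positive, one false negative, one false positive, and five true negatives, so the accuracy is 6/8 and the sensitivity is 1/2.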

III. PROPOSED METHOD
The procedure of the proposed method for multi-projection segmentation of dental cone beam computed tomography images is shown in Figure 2. The segmentation procedure consists of four steps:
1) Voxel intensity clustering to determine the tooth object threshold.
2) Segmentation of initialization slices using the level set method based on the dental ROI area.
3) Slice-by-slice segmentation of CBCT data on three projections (axial, sagittal, coronal) based on the initialization slices.
4) Combination of the slice-by-slice segmentation results from the three projections (axial, sagittal, coronal).

Step 1: Voxel intensity clustering to determine the tooth object threshold
Voxel intensity clustering is used to determine the threshold of the slices of the dental CBCT image. Voxelization is carried out because the stacked CBCT slices form a three-dimensional volume. Voxelization describes this 3D volume in discrete blocks; in this paper the voxel size is 2x2x2. The voxel value is the average value of the points in the voxel, computed using Equation (10). Voxel clustering uses the Hierarchical Cluster Analysis (HCA) method, in which three clusters are determined with two thresholds. The two thresholds separate the intensity range that represents bone and teeth. In CBCT images, bones and teeth are the parts of interest, and they have the highest intensity compared to other parts.

Step 2: Segmentation of initialization slices using Level Set based on Dental ROI Area
Level set segmentation based on the dental ROI area is performed on the initialization slices. The initialization slices are selected manually. As shown in Figure 3, one initialization slice is taken for each of two parts, namely the maxillary and the mandibular. The selected slice should show all of the tooth objects.

Fig. 3. Example of CBCT slice initialization retrieval

As shown in Figure 5, segmentation of an initialization slice begins with a binarization process that converts it to a binary image [25]. Salt-and-pepper noise is then removed with a median filter to improve the binarization result. Morphological dilation and erosion are then performed on the binary image. The dilation operation produces the ROI of the tooth structure, whereas the erosion operation generates the point of the tooth object.
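The preprocessing chain of Step 2 (binarization, median filtering, dilation, erosion) can be sketched with NumPy alone. The 3x3 neighbourhood, the zero padding, the threshold value, and all function names are assumptions for this example; the source does not state the filter or structuring element sizes.

```python
import numpy as np

def neighborhood_stack(image, pad_value):
    """Stack the 3x3 neighbourhood of every pixel along a new first axis."""
    padded = np.pad(image, 1, mode="constant", constant_values=pad_value)
    height, width = image.shape
    return np.stack([padded[dy:dy + height, dx:dx + width]
                     for dy in range(3) for dx in range(3)])

def binarize(image, threshold):
    """Pixels at or above the threshold become foreground."""
    return image >= threshold

def median_filter3(binary):
    """3x3 median filter on a binary image (removes salt-and-pepper noise)."""
    stack = neighborhood_stack(binary.astype(np.uint8), 0)
    return np.median(stack, axis=0) > 0.5

def dilate3(binary):
    """3x3 dilation: a pixel is set if any neighbour is set."""
    return neighborhood_stack(binary, False).any(axis=0)

def erode3(binary):
    """3x3 erosion: a pixel is kept only if all neighbours are set."""
    return neighborhood_stack(binary, False).all(axis=0)

# Toy slice: one bright 5x5 "tooth" and a single bright noise pixel.
image = np.full((9, 9), 10)
image[2:7, 2:7] = 200
image[0, 8] = 255
binary = binarize(image, threshold=128)   # threshold value is illustrative
cleaned = median_filter3(binary)
roi = dilate3(cleaned)                    # enlarged region around the tooth
core = erode3(cleaned)                    # shrunken interior of the tooth
```

In this toy slice the isolated noise pixel disappears after the median filter, the dilated mask encloses the cleaned tooth region, and the eroded mask lies inside it, which mirrors the roles of the ROI and the tooth point in the text.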
Furthermore, the ROI of the tooth structure is refined into the ROI of each tooth object. The polynomial fitting method generates a curve through the points of the tooth objects. The ROI of a tooth object is obtained by determining the dividing line between the object's point and its neighbors: the point with the lowest grayscale value becomes the dividing line point between two tooth objects. Figure 4 shows how the ROI of the tooth object is obtained from the curve and the dividing lines. Finally, tooth object tracking is implemented by evolving the ROI of each tooth object. We use the Distance Regularized Level Set Evolution (DRLSE) method for this evolution, which yields the segmentation results of the initialization slices.
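The curve fitting and the search for a dividing point can be sketched with NumPy's polynomial routines. The polynomial degree, the function names, and the way the curve is sampled at integer x positions are assumptions for this example; the source only states that a polynomial is fitted and that the lowest grayscale value marks the division.

```python
import numpy as np

def fit_arch_curve(points_xy, degree=2):
    """Fit y = f(x) through the tooth points; the degree is an assumption."""
    points = np.asarray(points_xy, dtype=float)
    return np.polyfit(points[:, 0], points[:, 1], degree)

def dividing_point(image, coefficients, x_start, x_end):
    """Return the (x, y) position with the lowest intensity along the curve
    between two neighbouring tooth points (inclusive)."""
    xs = np.arange(x_start, x_end + 1)
    ys = np.rint(np.polyval(coefficients, xs)).astype(int)
    ys = np.clip(ys, 0, image.shape[0] - 1)
    index = int(np.argmin(image[ys, xs]))
    return int(xs[index]), int(ys[index])

# Toy data: five tooth points that lie on y = x^2 - 4x + 6.
tooth_points = [(0, 6), (1, 3), (2, 2), (3, 3), (4, 6)]
coefficients = fit_arch_curve(tooth_points)

# Bright image with one dark gap on the curve at (x=2, y=2).
image = np.full((10, 10), 200)
image[2, 2] = 30
gap = dividing_point(image, coefficients, 0, 4)
```

A second-degree polynomial is a simple stand-in for the shape of a dental arch; a higher degree may be needed on real slices.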

Step 3: Segmentation of all slices of each projection (Axial, Sagittal, and Coronal)
Segmentation of all slices of each projection (axial, sagittal, and coronal) is based on the segmentation results of the initialization slices, which determine the starting slice and the initial contour (φ0) of the level set.

The stacked axial slices of the CBCT image form a three-dimensional volume with coordinates X, Y, and Z. As shown in Figure 6, the sagittal and coronal slices are obtained by slicing this volume one pixel at a time along their respective coordinates, namely x and y.

Sagittal and coronal projection CBCT slices together with axial

Segmentation is then performed on all slices in each projection. The segmentation result of the initialization slice determines the zero level set φ0, or initial contour, for the level set in the next slice.

Axial projection segmentation directly uses the segmentation results of the maxillary and mandibular initialization slices, whose slice numbers serve as the starting slices. The segmentation result of an initialization slice becomes the initial contour, or zero level set, for tracking dental objects in its neighboring slice, which is segmented by the same method. That result is in turn used as the initial contour for the next neighboring slice. The process is repeated until all slices are processed, as shown in Figure 7.

The segmentation process in the sagittal and coronal projections is shown in Figure 9. It uses the segmentation result of the maxillary or mandibular initialization slice and is carried out for each object detected in that result. For each object, the centroid is determined to obtain its x and y coordinates. The x value determines the starting slice in the coronal projection, while the y value determines the starting slice in the sagittal projection. An initial contour, such as a line, is then determined from the centroid point and the horizontal and vertical boundaries of the object.

After that, the initial contour is evolved with the level set method to track the tooth object in the sagittal and coronal starting slices. Neighboring slices are then segmented using the results of the previous slices until all parts of the tooth object have been segmented.
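The slice-by-slice propagation and the choice of starting slices can be sketched as follows. The callable segment_slice is a hypothetical placeholder for the level set evolution on one slice, and the function names are assumptions; the mapping of the centroid's x value to the coronal start and its y value to the sagittal start follows the description above.

```python
import numpy as np

def propagate(slices, start, initial_mask, segment_slice):
    """Segment every slice, moving outward from the starting slice.

    segment_slice(image, initial_contour) is a hypothetical callable that
    stands in for the level set evolution; each result seeds its neighbour.
    """
    results = {start: segment_slice(slices[start], initial_mask)}
    for step in (1, -1):                      # first forward, then backward
        previous = results[start]
        index = start + step
        while 0 <= index < len(slices):
            previous = segment_slice(slices[index], previous)
            results[index] = previous
            index += step
    return [results[i] for i in range(len(slices))]

def starting_slices(object_mask):
    """Centroid of one segmented object in an axial initialization slice."""
    ys, xs = np.nonzero(object_mask)
    centroid_x = int(round(float(xs.mean())))
    centroid_y = int(round(float(ys.mean())))
    return {"coronal": centroid_x, "sagittal": centroid_y}

# Toy object occupying rows 2..4 and columns 6..8 of a 10x10 slice.
mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 6:9] = True
starts = starting_slices(mask)
```

Because each slice is seeded with the result of its neighbour, an error on one slice can carry over to the following slices, which is one reason the choice of a clear initialization slice matters.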

Step 4: Combination of segmentation result for each projection (Axial, Sagittal, Coronal)
Finally, the segmentation results of all slices of the axial, sagittal, and coronal projections are combined. Figure 10 illustrates how a slice of the axial, sagittal, and coronal projections is combined. The

IV. EXPERIMENT RESULT
This research uses grayscale image data sourced from the Dental and Oral Hospital, Universitas Airlangga (RSGM UNAIR) (Indraswari et al., 2018, 2019). The data consist of axial CBCT image slices of 7 patients, with 200 slices per patient. Each slice image is paired with a ground truth image for the evaluation of the proposed method. The ground truth images were created with the help of dentists, who determined the object in each slice manually.
The first experiment in this study was the determination of the maxillary and mandibular initialization slices, which were selected manually. The initialization slices selected in the maxillary and mandibular sections for the seven subjects are shown in Table II. A selected initialization slice should contain all of the tooth objects.
In the proposed method, the first step is voxel intensity clustering to determine the tooth object threshold. The threshold provides the intensity of the tooth object used in the initial binarization process. Table II shows the threshold of the CBCT data for each research subject. The ROI of the tooth object is formed from the result of the binarization. The ROI is then used as the zero level set, which is evolved to form the tooth object. Table II also shows the results of this process, in which the dental objects appear in the object tracking results.
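The 2x2x2 voxel averaging of Step 1 can be sketched as a block mean over the volume. The function name and the cropping of dimensions that are not a multiple of the voxel size are assumptions for this example; the hierarchical clustering that follows is not shown.

```python
import numpy as np

def voxel_average(volume, size=2):
    """Average non-overlapping size x size x size blocks of a 3D volume.

    Assumption: trailing rows, columns or slices that do not fill a whole
    block are cropped before averaging.
    """
    z, y, x = (dim - dim % size for dim in volume.shape)
    cropped = volume[:z, :y, :x].astype(float)
    blocks = cropped.reshape(z // size, size, y // size, size, x // size, size)
    return blocks.mean(axis=(1, 3, 5))

# Toy volume of 4 x 4 x 4 points gives 2 x 2 x 2 voxels.
volume = np.arange(4 * 4 * 4).reshape(4, 4, 4)
voxels = voxel_average(volume)
```

The resulting voxel intensities are the values that would be passed to the clustering step to obtain the two thresholds.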
Visually, the results show that the segmentation process succeeds in capturing the tooth objects on the initialization slices. To confirm the results quantitatively, the evaluation method is applied. Table III shows the accuracy, sensitivity, and specificity of the designed method. In general, the values obtained are above ninety percent. This indicates that the method performs well enough for its results to be used in the next step of the proposed method.
Next, segmentation of all axial, sagittal, and coronal projection images of the seven subjects was performed. The sagittal and coronal slices were obtained by decomposing the axial CBCT data, as described in the previous section of this paper. Segmentation is performed on each projection by using the segmentation results of the initialization slices to determine the initial contour, or zero level set, required by the DRLSE method. In addition, the results of the initialization slices are used to determine the starting slice for segmentation of the sagittal and coronal projections.
The initial data used in this study is a collection of 200 axial slices, each with dimensions of 266x266 pixels. Slicing is then carried out along the coordinates described previously. The decomposition produces 266 slices each for the sagittal and coronal projections, with each slice having dimensions of 266x200 pixels.
The segmentation process is carried out on all of these slices, with the results of the initialization slice segmentation used as the input data, as described in step three of the proposed method. Table IV shows sample slices of the segmentation results of the three projections. It is not practical to display all of the result slices in this paper because the axial, sagittal, and coronal projections contain 200, 266, and 266 slices, respectively. However, the samples show that the proposed method is able to obtain the dental objects in each projection. Quantitatively, the average evaluation results of the three projections are shown in Table V.
Finally, the last stage of the proposed method is the combination of the segmentation results of each projection (axial, sagittal, coronal). All slices in the axial, sagittal, and coronal projections are combined into one result, which forms a complete projection in the axial direction.
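The rule used to merge the three results is not spelled out here, so the following sketch is only one plausible reading. Each projection's result is brought back to a common (z, y, x) layout, and a voxel is kept when at least min_votes projections mark it as tooth: min_votes=1 gives a union, 2 a majority vote, and 3 an intersection. The stack layouts, the function name, and the min_votes parameter are all assumptions for this example.

```python
import numpy as np

def combine_projections(axial_stack, sagittal_stack, coronal_stack, min_votes=2):
    """Merge binary results of three projections into one axial volume.

    Assumed layouts: axial (Z, Y, X), sagittal (X, Z, Y), coronal (Y, Z, X).
    """
    axial = np.asarray(axial_stack, dtype=bool)
    sagittal = np.asarray(sagittal_stack, dtype=bool).transpose(1, 2, 0)
    coronal = np.asarray(coronal_stack, dtype=bool).transpose(1, 0, 2)
    votes = axial.astype(int) + sagittal.astype(int) + coronal.astype(int)
    return votes >= min_votes

# Toy results on a 2 x 3 x 4 volume, written first in (z, y, x) order.
shape = (2, 3, 4)
from_axial = np.zeros(shape, dtype=bool)
from_axial[0, 1, 2] = True
from_sagittal = np.zeros(shape, dtype=bool)
from_sagittal[0, 1, 2] = True
from_sagittal[1, 0, 0] = True
from_coronal = np.zeros(shape, dtype=bool)
from_coronal[1, 2, 3] = True

# Re-slice the sagittal and coronal results into their own stack layouts.
sagittal_stack = np.stack([from_sagittal[:, :, x] for x in range(shape[2])])
coronal_stack = np.stack([from_coronal[:, y, :] for y in range(shape[1])])

union = combine_projections(from_axial, sagittal_stack, coronal_stack, min_votes=1)
majority = combine_projections(from_axial, sagittal_stack, coronal_stack, min_votes=2)
```

A union recovers objects that are missed in the axial projection but also admits extra noise, while a stricter vote suppresses noise at the cost of missed pixels; which trade-off matches the reported results cannot be determined from the text.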
The axial projection is the main projection used in the proposed method. Its disadvantage is that it does not represent the whole object in each slice. The sagittal and coronal projections display the entire tooth from the root to the crown, which can complement the axial projection result. Table VI compares the segmentation results of the axial projection with the combined results of the three projections.
In general, there is a reduction in the sensitivity parameter between the axial segmentation and the proposed method. However, the results differ visually. In the results of axial segmentation, there are cases where tooth objects are not detected. In contrast, the segmentation of the proposed method can be refined to match the actual number of objects, although the combination introduces a small amount of noise.
Furthermore, the proposed method was compared with segmentation of the axial projection only, the 3D Region Merging method, and the Multi-projection Deep Learning method. Table IX shows the evaluation parameters of each method in the comparison. The accuracy and specificity of the proposed method are not too different from those of the 3D Region Merging method. However, the proposed method is slightly superior in terms of the specificity parameter, that is, in terms of distinguishing dental objects from other elements in dental CBCT images. The proposed method also produces slightly higher accuracy than the Multi-projection Deep Learning method, even though that method has an advantage in terms of specificity.

V. CONCLUSION AND FUTURE WORK
This paper proposes a multi-projection segmentation method for dental CBCT images using a level set and reports the results of its experiments. Based on the evaluation results, the proposed method achieves 97.18% accuracy, 88.62% sensitivity, and 97.61% specificity. Clustering of voxel intensity values in dental CBCT images successfully determined the threshold used to form the ROI area of the dental object. Using the ROI area of the tooth object as the initial contour, or the variable phi, in the level set method makes it possible to track the tooth object on the initialization slice image, with best evaluation results for accuracy, sensitivity, and specificity of 98.99%, 91.06%, and 99.52%, respectively. Slice image segmentation on the axial, sagittal, and coronal projections obtained from the decomposition of dental CBCT data was carried out successfully, with best accuracy values of 99.02%, 98.36%, and 98.78%, respectively. Finally, combining the axial projection with the sagittal and coronal projections (multi-projection segmentation) improves on the axial projection segmentation, which is weaker at capturing the entire tooth, with an average increase in the result from 96.87% to 97.18%.
In the proposed method, some images still lead to less than optimal segmentation results. This is due to patients wearing braces when the data were collected with the scanning device. Noise, which can be described as diffusion or light reflection, appears in the test image. The intensity of the noise tends to be similar to that of the tooth object and interferes with its shape. Further research can be done to improve the results on CBCT images that have these problems.