Ashen God

Chapter 895: Update Coming Soon


Today I still have a bit of diarrhea, so I'll update later, probably around one in the morning. Just refresh this chapter then.

...

Abstract: To ensure network security, a method for mining and estimating network security risks based on big data analysis is proposed. The Map and Reduce functions of the Hadoop platform are used to mine association rules of network security events. The mined association rules serve as features of network security events and are input into a Support Vector Machine with a radial basis kernel function, which is trained to establish a network security risk estimation model. The optimal parameters of the Support Vector Machine are found with the quantum-behaved particle swarm optimization (QPSO) algorithm. Experimental results indicate that the method improves the precision of network security risk estimation and provides significant reference value for defending against network security risks.

Keywords: Big data analysis; Network security risk; Association rules; Support Vector Machine

1 Introduction

Internet technology is developing extremely rapidly, and the internet environment is highly open. Some attackers exploit the uncertainty and diversity of the network to attack it, seriously threatening the secure operation of networks [1-2]. Previous network defense methods used only the information contained in data packets to obtain risk estimation results, so the accuracy of those results was low. To keep a network operating securely, network administrators need a clear, real-time understanding of its operational status, so that they can identify network security risks in advance and apply appropriate defense measures. This is the fundamental basis for maintaining secure network operations [3-5].

Many researchers are currently studying network security risks. Han Xiaolu, He Chunrong, and others employ intuitionistic fuzzy sets and attention mechanisms, respectively, to assess network security status [6-7]. However, deficiencies remain in network security risk assessment, such as excessive alert volume and high false alarm rates caused by large data volumes. Extracting useful network security risk data from massive network big data is crucial for evaluating network security risks precisely. When a network is attacked, a large amount of alert information of various types is generated, which increases the difficulty of data mining [8]. An efficient big data mining method is therefore extremely important for improving the accuracy of network security risk assessment. For this reason, this paper proposes a method for mining and estimating network security risks based on big data analysis, and tests and analyzes its performance.

2 Network Security Risk Mining and Estimation Method Based on Big Data Analysis

2.1 Extraction of Association Rules in Data Mining

Security events collected from massive network data differ significantly in format, so they must be normalized before the association rules they contain can be mined. The mined association rules are then used to analyze network security risks such as similar viruses [9], similar vulnerabilities, and other attack behaviors, which improves the precision of network security risk assessment. The data mining method of big data analysis technology is used to extract the association rules of network security events.

Let $W = \{w_1, w_2, \ldots, w_n\}$ denote the set of security event elements and $R = \{r_1, r_2, \ldots, r_n\}$ the dataset, where each element $r_i$ of the dataset R is a set built from W, i.e., $r_i \subseteq W$.

Definition 1: Let C be a set built from the elements that appear in R. When the number of elements $r_i$ in the dataset that satisfy $C \subseteq r_i$ is $l$, the support of C in the dataset R is:

$$\mathrm{sup}(C) = \frac{l}{|R|} \quad (1)$$

Definition 2: Let C and D be sets that satisfy $C \subseteq W$, $D \subseteq W$, and $C \cap D = \emptyset$. The confidence of $C \to D$ is then expressed as $\mathrm{conf}(C \to D) = \mathrm{sup}(C \cup D) / \mathrm{sup}(C)$.

The rules $C \to D$ that meet both the minimum confidence and the minimum support are the association rules sought by the big data mining method. Association rules are obtained by mining the frequent item sets within the transaction sets and extracting the rules that hold among different transactions. Network security events are characterized by their huge scale [10], so the Hadoop cloud computing platform is used to mine association rules from massive network security events. Association rule mining with big data analysis technology is divided into two parts: (1) mining the frequent item sets that meet the minimum support; (2) using the mined frequent item sets to extract the association rules that meet the minimum confidence.
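The two definitions above can be illustrated with a small sketch. It assumes the usual reading of the definitions, namely that support is the fraction of records containing an item set and that confidence is sup(C ∪ D) / sup(C). The function names and the toy event labels are hypothetical and do not come from the paper.

```python
from typing import FrozenSet, List

Record = FrozenSet[str]


def support(itemset: Record, records: List[Record]) -> float:
    """Fraction of records r_i with itemset <= r_i (assumed form of Definition 1)."""
    hits = sum(1 for r in records if itemset <= r)
    return hits / len(records)


def confidence(c: Record, d: Record, records: List[Record]) -> float:
    """sup(C u D) / sup(C) for the rule C -> D (assumed form of Definition 2)."""
    sup_c = support(c, records)
    if sup_c == 0:
        return 0.0
    return support(c | d, records) / sup_c


# Hypothetical normalized security events, one set of event elements per record.
records = [
    frozenset({"port_scan", "brute_force"}),
    frozenset({"port_scan", "brute_force", "malware"}),
    frozenset({"port_scan"}),
    frozenset({"malware"}),
]

sup_scan = support(frozenset({"port_scan"}), records)  # 3 of 4 records
conf_rule = confidence(
    frozenset({"port_scan"}), frozenset({"brute_force"}), records
)  # (2/4) / (3/4)
```

With these toy records, the rule port_scan → brute_force has support 0.5 and confidence 2/3, so it would be rejected under a minimum confidence of 0.7.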
The Hadoop platform uses Map and Reduce functions to obtain item subsets and aggregate their support counts. Analyzing the support of all subsets yields the support of the frequent items in the network security events, from which the frequent item sets contained in the network security event dataset are mined. The frequent item set mining job takes the minimum support β and the original network security event dataset R as input, and outputs the frequent items that meet the minimum support.

Map task:
(1) Split the original network security dataset, located by the input file path, into data subsets of size n, and format each subset into key-value pairs, where the key is the character offset and the value is the data record.
(2) Use the Map function to read the key-value pairs of each subset, parse the value with the split function, and store the parsed result in a set.
(3) Emit every subset of that set as an output key, with its value set to 1.
(4) Invoke the optional Combine function on each Map side to merge the key-value pairs that share the same key. This limits the loss of computational efficiency caused by sending every key-value pair to the Reduce side over the network.

Reduce task:
(1) Sort the key-value pairs sent by the Combine function and merge the pairs that share the same key, then use the Reduce function to read and accumulate the values in L() within the merged key-value pairs. The result is the support count of the key's item set in the network security dataset R, i.e., the global support of the candidate frequent item set on the Reduce side.
(2) Write the candidate item sets whose support exceeds the minimum support to an external table of stored data. The external table is used to query the mined frequent item sets, which serve as the input and related files of the subsequent MapReduce program.

The rule extraction job takes the minimum confidence δ as input and outputs the association rules that meet the minimum confidence δ. The calculation process is as follows:
(1) In the Map function, call the setup method to connect to the database.
(2) Split the frequent item sets in the external table into data subsets of size n, and format all data into key-value pairs.
(3) Parse the elements of the value of each frequent item set, represent the parsed result as (C, D, SValue), and store (C, D) in a set.
(4) For each subset C of the elements of the frequent item set, read its support sup(C) and compute the confidence of C→D.
(5) When the confidence exceeds the set threshold, the elements of the frequent item set outside the subset form an association rule with the subset. Use the subset and the difference set as the key, and the confidence as the value.

The association rules of network security events are mined through the above process, and the Support Vector Machine method is then used to estimate network security risk based on the mined association rules.
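The Map, Combine, and Reduce steps and the subsequent rule extraction can be mimicked in plain Python. This is only a single-process sketch of the data flow, not Hadoop code: the function names, the `max_size` cap on subset size, and the toy events are assumptions introduced for illustration.

```python
from collections import Counter
from itertools import combinations


def map_with_combiner(split, max_size=2):
    """Map: emit (subset, 1) per record; Combine: merge equal keys locally."""
    local = Counter()
    for record in split:
        items = sorted(set(record))
        for k in range(1, min(max_size, len(items)) + 1):
            for subset in combinations(items, k):
                local[frozenset(subset)] += 1
    return local


def reduce_support(partials, n_records, min_support):
    """Reduce: sum the per-split counts and keep item sets meeting min_support."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return {s: c / n_records for s, c in total.items() if c / n_records >= min_support}


def extract_rules(frequent, min_confidence):
    """For each frequent item set, test every split into subset C and remainder D."""
    rules = []
    for itemset, sup in frequent.items():
        for k in range(1, len(itemset)):
            for c in combinations(sorted(itemset), k):
                c = frozenset(c)
                conf = sup / frequent[c]  # every subset of a frequent set is frequent
                if conf >= min_confidence:
                    rules.append((c, itemset - c, conf))
    return rules


# Two hypothetical input splits of normalized security events.
splits = [
    [["scan", "brute"], ["scan", "brute", "malware"]],
    [["scan"], ["malware"]],
]
partials = [map_with_combiner(s) for s in splits]
frequent = reduce_support(partials, n_records=4, min_support=0.5)
rules = extract_rules(frequent, min_confidence=0.7)
```

Merging counts inside `map_with_combiner` plays the role of the Combine function: each split sends one count per distinct key to the reducer instead of one pair per occurrence.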

2.2 Network Security Risk Estimation Method

The mined association rules are used as network security event features to estimate network security risks. The training sample set of network security events is represented by pairs $(x_i, y_i)$ of sample input $x_i$ and sample output $y_i$, which satisfy $x_i \in \mathbb{R}^n$, $y_i \in \mathbb{R}$. The samples in the set are mapped to a high-dimensional feature space by the nonlinear mapping function $\varphi(\cdot)$, which gives the optimal linear regression function for network security event assessment:

$$f(x) = w^{T}\varphi(x) + b \quad (2)$$

where b and w represent the bias and the weight, respectively. Using the principle of structural risk minimization, the LSSVM regression model is obtained as:

$$\min_{w,b,e} J(w, e) = \frac{1}{2} w^{T} w + \frac{C}{2} \sum_{i} e_i^{2} \quad (3)$$

$$\text{s.t.} \quad y_i = w^{T}\varphi(x_i) + b + e_i \quad (4)$$

where $e_i$ and C represent the error between the regression function and the actual result, and the penalty factor, respectively. Introducing Lagrange multipliers into the constrained optimization problem of formula (4) gives:

$$L(w, b, e, a) = \frac{1}{2} w^{T} w + \frac{C}{2} \sum_{i} e_i^{2} - \sum_{i} a_i \left[ w^{T}\varphi(x_i) + b + e_i - y_i \right] \quad (5)$$

where $a_i$ represents the Lagrange multipliers. According to the Mercer condition, the kernel function is defined as:

$$K(x_i, x_j) = \varphi(x_i)^{T} \varphi(x_j) \quad (6)$$

The radial basis kernel function is selected as the kernel function for network security risk estimation:

$$K(x, x_i) = \exp\left( -\frac{\lVert x - x_i \rVert^{2}}{2\sigma^{2}} \right) \quad (7)$$

The final Support Vector Machine regression model is:

$$f(x) = \sum_{i} a_i \exp\left( -\frac{\lVert x - x_i \rVert^{2}}{2\sigma^{2}} \right) + b \quad (8)$$

where σ is the width of the radial basis kernel function.

The parameters of the Support Vector Machine determine its estimation accuracy, and selecting appropriate parameters improves the precision of network security risk estimation. The QPSO algorithm is selected to optimize the parameters of the Support Vector Machine. The QPSO algorithm places m particles in a D-dimensional search space, where $x_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$ is the initial position of particle i, $PB = (pb_1, pb_2, \ldots, pb_D)$ is the current optimal position, and $GB = (gb_1, gb_2, \ldots, gb_D)$ is the global optimal position.
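As a rough illustration of the regression model with the radial basis kernel, the sketch below trains an LSSVM by solving the standard linear system that follows from setting the derivatives of the Lagrangian to zero. It assumes the kernel form exp(-||x - z||² / (2σ²)) and the penalty term C/2 · Σ e_i²; the helper names and the toy data are hypothetical, and the tiny Gaussian elimination routine only stands in for a proper linear algebra library.

```python
import math


def rbf(x, z, sigma):
    """Radial basis kernel with width sigma (assumed form)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lssvm_fit(X, y, C, sigma):
    """Solve [[0, 1^T], [1, K + I/C]] [b, a]^T = [0, y]^T for bias b and multipliers a."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [rbf(X[i], X[j], sigma) + (1.0 / C if i == j else 0.0)
                          for j in range(n)])
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]


def lssvm_predict(x, X, a, b, sigma):
    """f(x) = sum_i a_i K(x, x_i) + b."""
    return sum(ai * rbf(x, xi, sigma) for ai, xi in zip(a, X)) + b


# Hypothetical one-dimensional features and risk values.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 4.0, 9.0]
b, a = lssvm_fit(X, y, C=1000.0, sigma=1.0)
preds = [lssvm_predict(x, X, a, b, sigma=1.0) for x in X]
```

In this formulation the training residual of sample i equals a_i / C, so a larger penalty factor C pulls the fit closer to the training outputs.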
The expression for particle evolution is as follows: (9) where mbest and β represent the mean best position of the particles within the swarm and the contraction-expansion coefficient that controls the algorithm's convergence speed, respectively. When the iteration count is t, the formula for the algorithm convergence speed is as follows: (10)

The process of network security risk assessment is as follows:
(1) Determine the number of particles in the swarm according to the scale of the network security risk assessment. The dimensions of each particle represent the parameters C and σ that the Support Vector Machine uses to estimate network security risk.
(2) Set the parameters of the particle swarm algorithm that optimizes the Support Vector Machine parameters, including the maximum number of iterations.
(3) Obtain the fitness function value of the particles.
(4) Calculate the individual optimal position and the global optimal position of the particles, and establish the network security information database.
(5) Update the position of each particle in the swarm.
(6) Repeat the iterative calculation and check whether the termination conditions are met. If they are met, proceed to step (7); otherwise, return to step (3).
(7) Use the optimal particle obtained from the above process as the Support Vector Machine parameters to complete the network security risk estimation model, and obtain the network security risk estimation results from the established model.
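The search loop in steps (1) to (7) can be sketched with a generic QPSO minimizer. The position update below is the commonly used QPSO form (a random attractor between the personal and global best, plus or minus β·|mbest − x|·ln(1/u)); the linear decrease of β from 1.0 to 0.5 is an assumption, and neither is confirmed to match the paper's own formulas. The fitness function is a hypothetical stand-in for the validation error of a Support Vector Machine trained with parameters (C, σ).

```python
import math
import random


def qpso(fitness, bounds, n_particles=20, max_iter=100, seed=0):
    """Minimize fitness over box bounds with a basic QPSO loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_fit = [fitness(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for t in range(max_iter):
        beta = 1.0 - 0.5 * t / max_iter  # assumed schedule, 1.0 down to 0.5
        mbest = [sum(p[d] for p in pbest) / n_particles for d in range(dim)]
        for i in range(n_particles):
            for d in range(dim):
                phi = rng.random()
                attractor = phi * pbest[i][d] + (1.0 - phi) * gbest[d]
                u = 1.0 - rng.random()  # in (0, 1], so log(1/u) is finite
                step = beta * abs(mbest[d] - x[i][d]) * math.log(1.0 / u)
                new = attractor + step if rng.random() < 0.5 else attractor - step
                lo, hi = bounds[d]
                x[i][d] = min(max(new, lo), hi)  # keep the particle inside the box
            f = fitness(x[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = x[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = x[i][:], f
    return gbest, gbest_fit


def toy_fitness(params):
    """Hypothetical smooth error surface with its minimum at (3, -2)."""
    c, sigma = params
    return (c - 3.0) ** 2 + (sigma + 2.0) ** 2


bounds = [(-10.0, 10.0), (-10.0, 10.0)]
best, best_fit = qpso(toy_fitness, bounds)
```

In the setting of this paper, each particle would hold a candidate (C, σ) pair and `fitness` would train and score the regression model, which makes every evaluation far more expensive than this toy function.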

3 Case Analysis

A communication network is chosen as the test object. Communication data from 60 minutes of its operation are collected, 5,846,544 samples in total, and the method in this paper is used to assess network security risk. The intuitionistic fuzzy set method (reference [6]) and the attention mechanism method (reference [7]) are selected as comparison methods.

The method in this paper uses big data analysis technology to mine the association rules present in massive network communication data. The number of association rules mined under different minimum confidence and minimum support conditions is shown in Figure 1. The results in Figure 1 show that a greater number of association rules can be mined when the minimum confidence and minimum support are 0.7 and 0.3, respectively. When mining massive network data with the method in this paper, the minimum confidence δ is therefore set to 0.7 and the minimum support β to 0.3. The method performs well in mining association rules and maintains high mining efficiency when applied to massive network communication data.

After the association rule mining is complete, the QPSO algorithm is used to obtain the optimal parameters for the Support Vector Machine. The convergence of the QPSO algorithm at different iteration counts is shown in Figure 2. The results in Figure 2 show that the QPSO algorithm needs roughly 40 iterations to find the optimal Support Vector Machine parameters for evaluating network security risks. The QPSO algorithm thus optimizes efficiently and obtains the optimal parameters in a short time, which improves network security risk estimation performance. The optimal parameters of the Support Vector Machine obtained through the QPSO algorithm are C=130 and σ=135.
The network security risk assessment model is established with the optimal parameters obtained by the QPSO algorithm and is used to assess the number of security risk events during 5 h of network operation. The comparison of the method in this paper with the other two methods is shown in Figure 3. The results in Figure 3 show that the risks assessed by the method in this paper are very close to the actual network security risks, and that the fluctuation trends are highly consistent. The comparison indicates that this method predicts network security risks effectively and reliably, and can serve as evidence for network administrators in managing network security.

After multiple tests, the network security risk assessment performance of the three methods is compared, with the results shown in Figure 4. The results in Figure 4 show that this method mitigates shortcomings such as the need for a large amount of historical data and sensitivity to missing data, and is highly reliable when applied to network security risk assessment.

The security risk situation of the test network on January 3, 2020, from 7:00 to 24:00 (17 hours in total), as evaluated by the method in this paper, is shown in Table 1. For the network security events listed in Table 1, the attack types of the risk events are evaluated with the method in this paper, and the results are shown in Table 2. Table 2 shows that this method can assess security risk events and effectively determine the specific attack behavior behind each one, which verifies the validity of its risk event assessment.

4 Conclusion

Network security risk estimation is an important part of current network defense systems. As the volume of data within networks increases, the requirements placed on network security risk estimation grow. The method in this paper takes the attack scenarios that arise during network operation into account and applies big data analysis technology to network security risk estimation. It exploits the strength of big data analysis in processing massive data to mine the association rules present in network security events, and uses those rules to assess network security risks. Experiments confirm that the method can effectively estimate network security risks and support the protection of network security in environments that handle massive data.
