Peer-to-Peer IP Traffic Classification Using Decision Tree and IP Layer Attributes
We present a new approach that uses data mining, specifically a decision tree, to classify peer-to-peer (P2P) traffic in IP networks. We captured Internet traffic at a main gateway router, preprocessed the data, selected the most significant attributes, and prepared a training data set to which the decision-tree algorithm was applied. We built several models using combinations of attribute sets and different ratios of P2P to non-P2P traffic in the training data. We observed that the accuracy of the model increases significantly when the attributes “Src IP addr” and “Dst IP addr” are included in building the model. By detecting communities of peers, we achieved a classification accuracy above 98%. Consequently, we recommend that (a) the classification be done within the authority of the Internet service provider (ISP) in order to detect communities of peers, and (b) the decision tree be retrained frequently to ensure the fairness and correctness of the classification algorithm. Our approach relies only on information in the IP layer, eliminating the privacy issues associated with deep-packet inspection.
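As an illustration of the idea described in the abstract, the following minimal sketch trains a small entropy-based (ID3-style) decision tree on flow records, once with and once without the source and destination addresses. Everything in it is an assumption made for illustration: the abstract does not say which decision-tree algorithm or which attributes beyond “Src IP addr” and “Dst IP addr” were used, the attribute names `protocol` and `size` are hypothetical, and the six flows are invented toy data using documentation address ranges, not the captured traffic.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    """Return the attribute whose split gives the lowest weighted entropy."""
    def split_entropy(attr):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[attr], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return min(attributes, key=split_entropy)

def build_tree(rows, labels, attributes):
    """Recursively build a tree; leaves are class labels, nodes are dicts."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attributes:
        return majority
    attr = best_attribute(rows, labels, attributes)
    remaining = [a for a in attributes if a != attr]
    branches = {}
    for value in {row[attr] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[attr] == value]
        sub_rows, sub_labels = zip(*subset)
        branches[value] = build_tree(list(sub_rows), list(sub_labels), remaining)
    return {"attr": attr, "branches": branches, "default": majority}

def classify(tree, row):
    """Walk the tree; unseen attribute values fall back to the node majority."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["attr"]], tree["default"])
    return tree

def accuracy(tree, rows, labels):
    """Fraction of rows whose predicted label matches the given label."""
    hits = sum(classify(tree, r) == l for r, l in zip(rows, labels))
    return hits / len(rows)

# Invented toy flows (hypothetical attributes, documentation IP ranges).
FLOWS = [
    ({"src_ip": "192.0.2.10", "dst_ip": "198.51.100.7", "protocol": "TCP", "size": "large"}, "P2P"),
    ({"src_ip": "192.0.2.10", "dst_ip": "198.51.100.8", "protocol": "UDP", "size": "small"}, "P2P"),
    ({"src_ip": "192.0.2.11", "dst_ip": "198.51.100.7", "protocol": "TCP", "size": "large"}, "P2P"),
    ({"src_ip": "192.0.2.20", "dst_ip": "203.0.113.5", "protocol": "TCP", "size": "large"}, "non-P2P"),
    ({"src_ip": "192.0.2.21", "dst_ip": "203.0.113.5", "protocol": "TCP", "size": "small"}, "non-P2P"),
    ({"src_ip": "192.0.2.20", "dst_ip": "203.0.113.6", "protocol": "UDP", "size": "small"}, "non-P2P"),
]
ROWS = [row for row, _ in FLOWS]
LABELS = [label for _, label in FLOWS]

# Model A uses the address attributes; model B omits them.
tree_with_ips = build_tree(ROWS, LABELS, ["src_ip", "dst_ip", "protocol", "size"])
tree_without_ips = build_tree(ROWS, LABELS, ["protocol", "size"])

acc_with = accuracy(tree_with_ips, ROWS, LABELS)
acc_without = accuracy(tree_without_ips, ROWS, LABELS)
print(f"with addresses: {acc_with:.2f}, without addresses: {acc_without:.2f}")
```

In this toy data the address attributes separate the two groups of hosts, which loosely mirrors the "communities of peers" effect the abstract reports. The sketch scores the models on their own training rows for brevity; a real evaluation would use held-out traffic, and the figures it prints say nothing about the accuracy reported in the paper.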
| Keywords | data mining, decision tree, IP traffic classification, peer-to-peer traffic |
| Journal | International Journal of Business Data Communications and Networking (IJBDCN) |
Raahemi, B. (Bijan), Hayajneh, A. (Ahmad), & Rabinovitch, P. (Peter). (2007). Peer-to-Peer IP Traffic Classification Using Decision Tree and IP Layer Attributes. International Journal of Business Data Communications and Networking (IJBDCN), 3(4), 60–74. doi:10.4018/jbdcn.2007100104