=== Intrusion detection systems ===
Chapter-1
Introduction
Intrusion detection is a network security mechanism for protecting computer networks from intrusions or attacks. Advances in network technology have given hackers and intruders openings to find unauthorized ways into other systems. As technologies advance, new threats emerge with them. When a new type of intrusion appears, an intrusion detection system (IDS) should therefore be able to act effectively and in a timely manner to avoid harmful effects. Today, the greatest challenge may be handling 'big data', i.e., the enormous volume of network traffic data that is collected dynamically in network communications [7]. Intrusion detection has thus been one of the core areas of computer security, whose goal is to identify malicious activities in network traffic and, importantly, to protect resources from the threat. Most IDSs attempt to perform their task in real time but fall short for various reasons. Depending on circumstances such as the level of analysis and the computation required, real-time performance is not always possible.
In our approach, we inspect the TCP header information in Transmission Control Protocol/Internet Protocol (TCP/IP) packets and detect attacks as outliers based on analysis of this header information. The TCP/IP protocol suite underlies most internet data communication applications, including the World Wide Web hypermedia system, which uses HTTP (Hypertext Transfer Protocol). Other examples are common network protocols such as FTP (File Transfer Protocol), SMTP (Simple Mail Transfer Protocol) and Telnet. Because TCP is so widely used, it is a likely target for misuse and various forms of attack. Malicious behavior can therefore be carried out through the TCP/IP protocols without being blocked or even noticed by firewalls, or detected by an IDS, because the most commonly used IDSs do not recognize new variations of attacks.
The measure of detection is the deviation of anomalous TCP header data from normal TCP header data. This allows detection on high-speed networks, since TCP/IP header information can be extracted in minimal time. The useful fields in the TCP/IP headers are IP length, TCP flags, TCP window size, checksum and time-to-live (TTL). Many researchers have highlighted that anomalous connections show different patterns of TCP flags from normal ones.
Machine learning and data mining are two related terms; they often use the same methods and overlap significantly, but machine learning is about prediction based on known properties learned from training data, whereas data mining is about discovering unknown properties of the data. The essence of using a statistical approach such as outlier detection to detect anomalies lies in analyzing and mining information from raw data, which improves the ability to learn a model of the system's normal behavior. To ensure a high detection rate, the normal data must be modeled properly. Compared with rule-based approaches, these statistical approaches can execute faster but often produce high false positive rates. Intrusion patterns and normal patterns do not always follow particular distributions, nor are they always linearly separable, which causes problems when applying statistical learning methods such as SVM to intrusion detection.
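To make the header fields listed above concrete, the following is a minimal sketch of how they could be read from a raw packet. It assumes an IPv4 packet carrying TCP with the link-layer header already removed; the function name `parse_ipv4_tcp_headers` and the returned dictionary keys are illustrative choices, not part of the approach described here.

```python
import struct

def parse_ipv4_tcp_headers(packet: bytes) -> dict:
    """Extract IP length, TTL, TCP flags, window size and checksum.

    Assumes `packet` starts at the IPv4 header (no Ethernet header).
    """
    # IPv4 fixed header (20 bytes): version/IHL, TOS, total length, ID,
    # flags/fragment offset, TTL, protocol, checksum, source, destination.
    (ver_ihl, _tos, total_length, _ident, _frag,
     ttl, proto, _ip_csum, _src, _dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    if ver_ihl >> 4 != 4 or proto != 6:
        raise ValueError("not an IPv4 packet carrying TCP")
    ip_header_len = (ver_ihl & 0x0F) * 4  # IHL is counted in 32-bit words

    # TCP fixed header (20 bytes): ports, sequence number, acknowledgement
    # number, data offset + flags, window, checksum, urgent pointer.
    tcp = packet[ip_header_len:ip_header_len + 20]
    (_sport, _dport, _seq, _ack,
     off_flags, window, tcp_csum, _urg) = struct.unpack("!HHIIHHHH", tcp)

    return {
        "ip_length": total_length,
        "ttl": ttl,
        "tcp_flags": off_flags & 0x00FF,  # CWR ECE URG ACK PSH RST SYN FIN
        "window": window,
        "tcp_checksum": tcp_csum,
    }
```

Only fixed-offset reads are involved, which is consistent with the point that header extraction is cheap compared with payload inspection.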
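As an illustration of statistical outlier detection against a model of normal behavior, the sketch below fits a per-feature mean and standard deviation on attack-free samples and flags a new sample whose largest z-score exceeds a threshold. This is a deliberately simple stand-in, not the method proposed in this work; the function names and the threshold of 3.0 are assumptions made for the example.

```python
from statistics import mean, pstdev

def fit_normal_profile(samples):
    """Return a (mean, std) pair per feature, learned from normal traffic."""
    columns = list(zip(*samples))
    return [(mean(col), pstdev(col)) for col in columns]

def outlier_score(profile, x):
    """Largest absolute z-score of sample x across all features."""
    score = 0.0
    for (mu, sigma), value in zip(profile, x):
        if sigma == 0:
            # A constant feature: any different value is maximally unusual.
            z = 0.0 if value == mu else float("inf")
        else:
            z = abs(value - mu) / sigma
        score = max(score, z)
    return score

def is_anomalous(profile, x, threshold=3.0):
    """Flag x as an outlier; the threshold is a hypothetical choice."""
    return outlier_score(profile, x) > threshold
```

A tighter threshold raises the detection rate at the cost of more false positives, which is the trade-off noted above.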
Problem Definition
One of the main reasons for the under-deployment of anomaly-based IDSs is that they cannot be deployed easily. Most anomaly-based IDSs use data mining and machine learning algorithms that take feature vectors containing complicated attributes as input. However, extracting such features from real-time TCP network traffic is not straightforward. To the best of our knowledge, there is no publicly available software or script that automatically extracts those features from raw TCP data. It is therefore virtually impossible for an average network administrator to deploy an anomaly-based IDS that relies on such feature vectors.
The problem with current intrusion detection techniques is that they require continuous human interaction to label attacks and normal traffic behavior in support of machine learning techniques [7]. Statistical machine learning has been widely used and has become an important tool in various applications for detecting and preventing malicious activity. Moreover, big data containing redundant records makes intrusion detection a complicated task. Most IDSs try to perform their task in real time, but their performance suffers. Depending on circumstances such as the level of analysis, or the reaction needed to limit the damage from an intrusion by terminating the network connection, real-time performance is not always achieved. The main purpose of this research is therefore to propose an efficient technique for intrusion detection that takes two things into consideration. First, efficient use of data mining techniques to extract and filter a suitable subset of data, for which we opt to use real-time TCP data. Second, efficient machine learning techniques for extracting intrusion detection patterns in real time, as machine learning approaches provide an opportunity to improve the quality of IDSs and make them easier to administer.
Motivation
We cannot imagine life without security, as it is one of our basic needs. Similarly, computer security is at the heart of today's technological world, and intrusion detection is one of the core areas of network security that needs to be highly effective. We therefore need to be concerned with having safe and secure networking channels. Any unauthorized access to these networks causes many problems and, ultimately, losses, whether financial or in resources. Before we think about the future, we need to secure our present networks, which begins with detecting intrusions in the existing network. Many IDSs based on machine learning approaches have been proposed, each with its own advantages and disadvantages. Most of them cannot perform well on big data and in real time, while the others cannot track evolving malicious attacks, which leaves a large gap in current IDSs.
This research began with work on simulating intrusion data in a simulated environment. As we progressed through the simulation process, we gained more intuition about going to the root of intrusion detection using packet capture files, i.e., TCP dump files. The main motivation for this research is our finding that most of the research on this subject is tested on data that is a modified version of the MIT DARPA dataset, which varies significantly from real network TCP-dump data. Very few efforts have been tested on raw network data. Our approach is therefore an attempt to explore the possibility of developing an IDS for raw network data using 'TCP-dump' files, which would allow the system to be implemented in a practical environment. It may not be feasible to render a network system immune to intrusions, for many reasons. Systems have become more complex, and their security design has difficulty anticipating all the conditions that might occur during data transfer, or understanding the precise implications of even small deviations in those conditions. These problems give us the motivation to conduct research on IDSs based on machine learning techniques.
Chapter-2
Background
2.1 Transmission Control Protocol
TCP and IP are the most widely used protocols for transferring data through a network; they work together at different layers (the transport and network layers, respectively) [24]. TCP is a connection-oriented protocol that enables reliable, ordered transmission of traffic for internet applications. Applications that use TCP/IP include HTTP, FTP, Telnet and SMTP. Figure 2.1 shows the layers of the TCP/IP protocol suite [35] through which transmitted data passes on an Ethernet network. When an application sends data, the data starts at the application layer, then passes to the transport layer, then to the network layer, and finally to the network interface layer. At each layer, the data is encapsulated with a header containing information for that layer.
Figure 2.1: TCP data transfer encapsulation process
When a host receives a packet, the data is decapsulated: each layer strips off its corresponding header as the data makes its way from the network interface layer up to the application layer. This process is defined in RFC 894 [39]. The TCP and IP protocol headers [35] [31] are shown in Figure 2.2.
Figure 2.2: Layout of TCP and IP header
The TCP header contains information that is essential for establishing a reliable connection. The TCP flags control the state of a TCP connection. The TCP sequence and acknowledgement numbers provide the client and server with unique identifiers for the connection and confirm that data was received. Table 2.1 describes all eight TCP flags; two of them, ECE and CWR, were originally reserved bits but are now used to communicate congestion control capabilities as defined in RFC 3168 [28]. A typical TCP session for transferring data is shown in Figure 2.3. The client and server must complete a three-way handshake before any data exchange takes place.
Table 2.1: Description of TCP session flags
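The eight flags occupy one bit each in the TCP header, so a flag byte can be decoded with simple masks. The sketch below uses the standard bit positions; the helper names and the `has_syn_fin` check are illustrative additions, the latter included only because SYN and FIN set together is a commonly cited abnormal flag pattern.

```python
# Bit masks for the eight TCP flags, from most to least significant bit.
TCP_FLAGS = [
    (0x80, "CWR"), (0x40, "ECE"), (0x20, "URG"), (0x10, "ACK"),
    (0x08, "PSH"), (0x04, "RST"), (0x02, "SYN"), (0x01, "FIN"),
]

def decode_tcp_flags(flag_byte: int) -> list:
    """Return the names of all flags set in the given flag byte."""
    return [name for mask, name in TCP_FLAGS if flag_byte & mask]

def has_syn_fin(flag_byte: int) -> bool:
    """True if SYN and FIN are both set, which a normal session never sends."""
    return (flag_byte & 0x03) == 0x03
```

For example, the second packet of a handshake carries the flag byte `0x12`, which decodes to ACK and SYN.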
We can summarize a normal TCP connection in the following steps:
The client sends a SYN packet with sequence number J.
The server receives the SYN packet and sends its own SYN packet with sequence number K. At the same time, it acknowledges the client's SYN packet by sending an acknowledgement (ACK) with acknowledgement number J + 1.
The client acknowledges the server's SYN packet by sending an ACK packet with acknowledgement number K + 1. This establishes a full connection. The three-way handshake is complete and the client can then transmit data to the server.
The client and server exchange data; each sends an acknowledgement when data is received.
The server closes its side of the connection by transmitting a FIN packet with sequence number M.
The client acknowledges the FIN packet with an ACK whose acknowledgement number is M + 1 (assuming the FIN packet carries no data). If the FIN packet carries B bytes of data, the acknowledgement number is M + 1 + B. The client may still send more data.
The client closes the connection by transmitting its own FIN packet with sequence number N.
Finally, the server transmits an ACK packet with acknowledgement number N + 1, which terminates the connection.
Figure 2.3: TCP data exchange procedure in data transfer
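The sequence and acknowledgement arithmetic in the steps above can be checked with a few lines of code. This is a minimal sketch: packets are represented as plain dictionaries with hypothetical keys `flags`, `seq` and `ack`, and the server's SYN and ACK are assumed to travel in a single SYN+ACK segment. Sequence numbers are 32-bit values, so the arithmetic wraps modulo 2^32.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around

def handshake_is_valid(syn, syn_ack, ack):
    """Check the three-way handshake: SYN(J), SYN+ACK(K, J+1), ACK(K+1)."""
    return (
        syn["flags"] == {"SYN"}
        and syn_ack["flags"] == {"SYN", "ACK"}
        and syn_ack["ack"] == (syn["seq"] + 1) % SEQ_MODULUS
        and ack["flags"] == {"ACK"}
        and ack["ack"] == (syn_ack["seq"] + 1) % SEQ_MODULUS
    )

def fin_ack_number(fin_seq, data_bytes=0):
    """Acknowledgement number for a FIN with sequence number fin_seq.

    The FIN itself consumes one sequence number; any data carried in the
    same segment consumes data_bytes more.
    """
    return (fin_seq + 1 + data_bytes) % SEQ_MODULUS
```

A handshake whose acknowledgement numbers do not satisfy these relations does not belong to a normally established connection, which is one kind of deviation a header-based detector can look for.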
Either party can initiate termination of the connection, so the termination steps may differ from those described above. The connection can also be terminated at any time by resetting it with a TCP RST packet. Intruders can exploit this behavior to mount attacks.
Figure 2.4: Model proposed by Denning
Intrusion Detection Systems
Intrusion Detection Model
In 1986, Denning proposed the first intrusion detection model, shown in Figure 2.4. The model was not specific to any particular system or type of input; instead, it relied on reference values for its inputs. It generates a number of profiles and monitors changes in them based on the audit log data of the host system, and in this way finds intrusions in the system [38].
Intrusion detection systems can be categorized into two groups based on their deployment:
Host Based IDS
Network Based IDS
Host-based systems mostly use system logs to monitor attacks on individual hosts, whereas network-based systems are enterprise-level security systems that focus on large volumes of network traffic and analyze attacks from outside. Host-based systems are small and easy to deploy, and thus preferable for personal use, while network-based systems may require higher-end hardware and software in order to protect a whole subnet. Network-based IDSs can be further divided into network monitoring systems and composite systems that monitor both hosts and the surrounding network.
Network Security Monitor (NSM) [36], Bro, Network Flight Recorder (NFR) and Network Statistics (NetSTAT) are examples of strictly network-based systems. NSM was designed to monitor the traffic between hosts. Bro acts as a high-speed passive network monitor that filters traffic for certain applications. NFR was a tool for filtering network data, while NetSTAT offered customization and filtering of network events. Emerald, Grids and the Distributed Intrusion Detection System (DIDS) are examples of systems that monitor both hosts and the network. Emerald was able to detect intrusions in large distributed networks; it responded based on local targets and managed its monitors to form an analysis hierarchy for network-wide threats. Grids allows attacks to be viewed easily as graphs by collecting results from both host-based and network-based components. DIDS is like an extension of NSM that uses data from both host auditing systems and LAN traffic to detect intrusions.
Types of Intrusion Detection Methods
Intrusion detection methods can be categorized into two types based on their analysis method:
Anomaly intrusion detection method
Misuse (signature-based or rule-based) intrusion detection method
To detect intrusions, anomaly detection analyzes deviations from normal behavior at the user or system level, whereas misuse detection matches sample data against known intrusive rules or patterns. Machine learning frequently forms the basis of the anomaly method, which can detect novel attacks by comparing suspicious traffic with normal traffic but has a high false alarm rate because of the difficulty of modeling normal behavior for the protected system [22].
With the misuse method, pattern matching on known signatures gives high accuracy in detecting known threats, but it cannot detect novel attacks because their signatures are not yet available for matching. Most current IDSs follow a signature-based or misuse approach similar to that of virus scanners, in which events are detected by matching them against specific predefined patterns known as signatures. The limitation of signature-based IDSs is their failure to detect new attacks and even minor variations of known patterns. In addition, maintaining signature databases carries significant administrative overhead.
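The limitation of signature matching can be seen in a toy example. The sketch below matches payloads against a small table of byte patterns; the signature names and patterns are hypothetical and are not taken from any real rule set.

```python
# Hypothetical signatures for illustration only, not from a real IDS rule set.
SIGNATURES = {
    "example-traversal": b"../../etc/passwd",
    "example-shell": b"/bin/sh",
}

def match_signatures(payload: bytes, signatures=SIGNATURES) -> list:
    """Return the sorted names of all signatures found in the payload."""
    return sorted(name for name, pattern in signatures.items() if pattern in payload)

# A request containing the exact pattern is detected:
#   match_signatures(b"GET /../../etc/passwd HTTP/1.0")
# A URL-encoded variant of the same request matches nothing:
#   match_signatures(b"GET /..%2f..%2fetc/passwd HTTP/1.0")
```

The second request carries the same intent but a slightly different byte sequence, so it passes unnoticed until a new signature is added to the table.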
Anomaly-based detection began with statistics. Haystack [37] used statistics to analyze changes in user activities. NSM also used statistics, along with rules, to analyze and monitor LAN traffic. The NIDES [34] statistical component set the standard for statistics-based intrusion detection: it computes a historical distribution of continuous and categorical attributes that is updated over time, and deviations are found using chi-square tests. The Emerald statistical component was inherited from the NIDES system.
Most network IDSs, such as Bro, NSM, NFR and NetSTAT, use rule-based methods for detection. Snort [5] is a rule-based network IDS whose features include a simple rule format based on packet payload inspection, content pattern matching, and a streamlined architecture based upon source and destination. Network Analysis of Anomalous Traffic Events (NATE) [27] and the Light-weight Intrusion Detection System (LISYS) [30] are two lightweight anomaly-based IDSs that share similar attributes. In the early 1990s, the Time-based Inductive Machine (TIM) was another anomaly-based IDS; it used inductive learning of sequential user patterns, implemented in Common Lisp on a VAX 3500 computer.
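The idea of comparing current activity with a historical distribution can be sketched with the Pearson chi-square statistic for a categorical attribute. This is a generic textbook form, not the NIDES implementation; the function name and the treatment of never-seen categories are assumptions made for the example.

```python
def chi_square_statistic(observed_counts, historical_probs):
    """Pearson chi-square statistic of observed counts against a historical
    categorical distribution (category -> probability)."""
    # A category never seen historically counts as an unbounded deviation.
    for category, count in observed_counts.items():
        if count > 0 and historical_probs.get(category, 0) == 0:
            return float("inf")

    total = sum(observed_counts.values())
    statistic = 0.0
    for category, prob in historical_probs.items():
        expected = total * prob
        if expected > 0:
            observed = observed_counts.get(category, 0)
            statistic += (observed - expected) ** 2 / expected
    return statistic
```

A window of traffic whose flag counts match the historical proportions yields a statistic near zero, while a window dominated by SYN packets yields a large value that can be compared against a chosen critical value.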
