A Fault Tolerance Mechanism for Network Intrusion Detection System based on Intelligent Agents (NIDIA)

Lindonete Siqueira and Zair Abdelouahab
Universidade Federal do Maranhão, Campus do Bacanga, 65080-040 São Luís - MA - Brasil
[email protected], [email protected]

Abstract

The objective of an Intrusion Detection System (IDS) is to identify individuals who try to use a system in an unauthorized way, or who are authorized to use it but abuse their privileges. To accomplish its function, the IDS must in some way guarantee the reliability and availability of its own application, so that it can continue to provide its services even in case of faults, mainly faults caused by malicious agents. This paper proposes an adaptive fault tolerance mechanism for the Network Intrusion Detection System based on Intelligent Agents (NIDIA). We propose the creation of a society of agents that monitors the system to collect information related to agents and hosts. Using the collected information, it is possible to detect which agents are still active, which agents should be replicated, and which replication strategy should be used. The replication process depends on each type of agent and on its importance to the system at different moments of processing. We use some agents as sentinels for monitoring, which allows us to accomplish important tasks such as load balancing, migration, and detection of malicious agents, and thus to guarantee the security of the IDS itself.

1 Introduction

Security is one of the largest concerns currently faced by network administrators, who must keep their organizations safe from attacks. However, preventing the theft and modification of information or the interruption of systems is a constant challenge. Information security recognizes the importance of information by protecting it from several types of threats in order to guarantee the continuity of operation. To give an idea of the growth of security incidents registered between 1999 and 2004, CERT.br (Computer Emergency Response Team Brazil [2]) has reported an increase of 2,000% in incident notifications. It reported 3,107 incidents in 1999 and a total of 75,722 incidents in 2004. Statistics show that, up to September 2005, network administrators had reported a total of 46,205 security incidents.

An intrusion detection system helps the network administrator to prevent attacks and to act when an attack begins or is detected. A system whose objective is to provide security must not have its operations interrupted, accidentally or maliciously. However, according to [1], "one of the responsible of a security organization, intrusion detection system is becoming the target of most experienced attackers..."

Intrusion detection systems have been built on multi-agent architectures. At the Federal University of Maranhão (UFMA), the Network Intrusion Detection System based on Intelligent Agents (NIDIA) [5] is being developed as a cooperative of intelligent agents that detects new attacks using neural network techniques and is also able to respond to attacks.

Our objective is to provide fault tolerance to the NIDIA. Fault tolerance aims to guarantee the reliability and availability of the system; in other words, the service should continue even in the presence of faults. We propose the creation of agents, using the concept of sentinels [9], that monitor the system to collect information related to agents and hosts (where the agents of the NIDIA are located). Based on the collected information, it is possible to detect which agents are still active, which agents should be replicated, and which replication strategy should be used. The replication process depends on each type of agent and on its importance to the system at different moments of processing. Using sentinels for monitoring allows us to accomplish important tasks such as load balancing, migration of agents, and detection of malicious agents, thus guaranteeing the security of the IDS itself (self-protection).

Proceedings of the Fourth IEEE Workshop on Software Technologies for Future Embedded and Ubiquitous Systems and Second International Workshop on Collaborative Computing, Integration, and Assurance (SEUS-WCCIA’06) 0-7695-2560-1/06 $20.00 © 2006

IEEE

This paper is organized as follows. Section 2 presents multi-agent systems, intrusion detection systems, and fault tolerance, as well as a brief look at replication in distributed systems. Section 3 describes the NIDIA, and Section 4 the adaptive fault tolerance mechanism for NIDIA. Section 5 presents future work. Finally, conclusions are drawn in Section 6.

2 Background

2.1 Multi-agent systems (MAS)

An agent is a program module that functions continuously in a particular environment. It is able to carry out activities in a flexible and intelligent manner that is responsive to changes in the environment. An agent is able to learn from its experiences. It is autonomous and takes actions based on its built-in knowledge and its past experiences [10]. A multi-agent system consists of multiple agents that can interact to learn or exchange experiences. Building multi-agent applications is desirable because of the following characteristics [10]:

Autonomy: Agents can be added to or removed from the system (scalability), reconfigured, or upgraded without altering other system components, as long as their external interface remains the same. The system therefore does not have to be restarted, and no intervention by a human or by another agent is required.
Reactivity: An agent is able to perceive its environment and to respond to the changes that occur in it, in order to satisfy its design objectives.
Sociability: An agent interacts with other agents through a language. An agent may be a member of a group of agents that perform different simple functions.
Mobility: An agent is able to move within a computer network.

These characteristics are the main reason for the popularity of MAS. The effectiveness of multi-agent systems can be measured by the number of applications in which they can be deployed. Typical agent applications are web services, e-commerce, network management, digital tourism, supply chain management, support systems, medical and grid applications, information retrieval and processing, etc. [11].

2.2 Intrusion detection systems

An intrusion detection system (IDS), in agreement with [10], can be defined as a system that can identify individuals who are using a computer system without authorization, or who have legitimate access to the system but are abusing their privileges. It should be fast enough to catch different types of intruders before harm is done, and it can react to attacks when they are detected. An IDS can be host based, in which case it examines and analyzes data that is internal to the system, or network based (NIDS), in which case network traffic is analyzed. Network-based detection involves gathering information from various sites and subsequently analyzing the data. It can therefore help to identify attacks or potential attacks that could not be identified if only a single site's log were analyzed.

Several intrusion detection systems are constructed using multi-agent technology. Some agent-based IDSs are briefly described below.

AAFID [15] uses autonomous agents at the lower levels to collect, analyze, and filter data. These agents run on every host and are controlled by local transceivers. Transceivers in turn report to monitors. A single monitor may control several transceivers located at different hosts and may itself report to other monitors above it. Monitors get a global view of the network, which facilitates the decision process.

Mobile Distributed Agent IDS (MADIDS) [7] is a system designed to process the large flow of data transferred in high-speed networks. Agents are placed on each node and process the data transfer using distributed computation. In addition, by reconfiguring the mobile agents, load balancing can be implemented dynamically to improve performance in data flow transfer.

The system in [14] is a distributed intrusion detection system based on mobile agents. The system has no central coordinator. It can be updated easily and scales well to large area networks. The absence of a central coordinator implies that all sites are peers. Each site belongs to a virtual neighborhood. Information related to the safety of a site is distributed among its neighbors, and the neighbors check on each other periodically to make sure that no intrusion has occurred.

2.3 Fault Tolerance

Fault tolerance is a means of achieving dependability (footnote 1). It works under the assumption that a system contains faults and aims at providing the specified services in spite of their presence. Fault tolerance, together with fault avoidance, removal, and forecasting, is known as a dependability means [4]. Distributed applications with fault tolerance requirements are difficult to implement and maintain, mainly because of the complexity of large-scale environments. Replication mechanisms have been successfully applied to various MAS [8]. Replicating specific agents, identified as crucial to the application, may make it easy to bypass this problem. The intrinsic properties of multi-agent systems, in particular the fact that they are very flexible and dynamic, make them a strong basis for building a fault tolerance mechanism that allows on-the-fly changes.

Footnote 1: Dependability is a vital property of any system, justifying the reliance that can be placed on the service it delivers.


The authors of [6] proposed a proxy-based approach, in which a proxy transparently handles a replication group based on predefined policies. A proxy is simply a computational entity that provides an interface to a set of agent replicas. However, this technique suffers from the centralized bottleneck of the proxy itself, and it concentrates only on the replication of the agents of a multi-agent system. It is also costly, because replication groups must be formed for all the agents in the multi-agent system. A framework called Darx [8, 12] was built for designing reliable distributed applications that consist of a set of distributed, communicating agents. Each agent can be replicated any number of times with different replication strategies. Darx includes group management for maintaining the replicas.

Figure 1. Architecture in layers of the NIDIA (layers shown: Administration, Updating, Database, Reaction, Analysis, Monitoring)

3 Network Intrusion Detection System based on Intelligent Agents (NIDIA)

NIDIA is an intrusion detection system based on the notion of a society of intelligent agents capable of detecting new attacks in real time. For monitoring, the system adopts a combination of sensor agents that are installed at strategic points of the network and on critical hosts, with the objective of capturing suspicious packets and malicious activities. It provides both misuse and anomaly detection to make the system more robust. It is able to interact with a firewall in order to reduce problems and reach a higher level of security, since the two systems have complementary characteristics [5].

The architecture presented for NIDIA is inspired by the logical model of CIDF [3, ?]. However, as the needs of the NIDIA evolved, the initial project was restructured and reorganized, and a new architecture based on layers of agents was developed and adopted. The new architecture is shown in Figure 1, and its layers are described as follows:

Monitoring Layer: It is responsible for capturing the occurrence of events and supplying information to the rest of the system. The sensor agents are located in this layer.
Analysis Layer: It is responsible for analyzing the events of the underlying layer. It formats the events collected by the monitoring layer, determines whether a true attack occurred, and identifies the type of attack.
Reaction Layer: It is responsible for reacting when an attack is detected.
Updating Layer: It is responsible for updating the databases. Agents of any layer can query the databases; however, inserts can be made only by agents of this layer. It is also responsible for maintaining the integrity and consistency of the stored information.
Administration Layer: It is responsible for the administration and integrity of all agents of the system.
Database Layer: It is responsible for the storage of the information.

4 Fault Tolerance in NIDIA

The objective of an IDS is to identify intrusions (footnote 2) in computational systems. It must also guarantee the reliability and availability of its own application, so that it can continue to provide its services in case of faults, mainly faults caused by malicious agents. In [15], the following desirable characteristics, among others, are identified for an IDS: i) It must be fault tolerant, in the sense that it must be able to recover from system crashes, either accidental or caused by malicious activity; ii) It must resist subversion: the IDS must be able to monitor itself and detect whether an attacker has modified it. To achieve these characteristics, agent-based IDSs face three great challenges: i) protect the hosts; ii) protect the communication network; iii) protect their agents (to avoid subversion).

Both accidental and malicious faults can occur in an IDS. Accidental faults can lead to crashes of hosts and agents, as well as to loss of communication. An attack can tamper with the code or internal state of an agent and thus modify its behavior, so that the agent behaves differently than originally defined. The IDS must protect the hosts and the agents (their code and data integrity) from malicious agents. A malicious agent is one that executes, or requests the execution of, non-authorized actions. It can corrupt either a host or an agent.

Our proposed work seeks to provide fault tolerance to NIDIA, with the objective of guaranteeing availability and reliability so that it can offer its services even in the presence of faults, mainly malicious faults. Possible faults should be detected as fast as possible. A failure of an agent or host should be identified through an appropriate diagnosis, and recovery should be provided in the shortest possible time and with minimum impact on the rest of the system. Our approach is based on the monitoring, replication, and migration of agents and has the following objectives: a) detect which agents and hosts are active and replying to requests; b) detect the importance of the agents during processing and the availability of resources on the hosts where the agents are executing; c) detect the action of malicious agents. We propose an agent called the System Fault Tolerance Agent (SFTA). This agent is introduced within the administration layer and is responsible for maintaining the integrity of the system.

Footnote 2: An intrusion is defined as "any set of actions that attempt to compromise the integrity, confidentiality, or availability of a resource" [15].

4.1 SFTA Overview

The proposed agent is modeled as a society of agents that cooperate with each other and accomplish their tasks independently, with the objective of maintaining an acceptable level of performance and allowing flexible updates. The society has three agents: the System Sentinel Agent (SSA), the System Fault Evaluation Agent (SFEA), and the System Replication Agent (SRA). The SSA makes use of the Profile Database (PRDB) to help accomplish its tasks. Figure 2 shows the architecture of the society of agents and the relationships among them.

Figure 2. Architecture of the SFTA

The SSA is responsible for monitoring all agents of the NIDIA and the hosts where the agents are executing. It collects information related to the agents, such as the number of messages received, and information related to the hosts, such as the amount of available memory. This information is sent to the SFEA, which obtains a global view of the system: the resources of the hosts, the needs of the agents, and the agents and hosts that are inactive. The SFEA can therefore choose the best recovery solution when a fault is detected. The SRA is responsible for replication management and for executing the recovery actions.

System Sentinel Agent (SSA)
The SSA is the main agent in the fault tolerance architecture. It is responsible for monitoring the system. It collects information from the agents and the hosts, some of which is described below.

In the agent:
i) The number of received requests. The number of requests an agent receives determines its importance to the system at a given moment of processing;
ii) Processor usage. This information is compared with the information collected on the hosts to verify whether an agent needs to be migrated;
iii) Memory usage. This information is compared with the memory available on the host to evaluate the need for migration;
iv) The action that the agent is executing. This information is compared with the information stored in the PRDB to detect when an agent is malicious.

In the host:
i) Processor usage;
ii) The amount of free and total memory in the system;
iii) The amount of free and used disk space.

The information collected on the hosts is used to compare the available resources with the needs of the agents. Based on this information, it is possible to determine whether replication and migration of agents are necessary. The SSA monitors the hosts and agents to detect whether they are inactive. It also checks for non-authorized actions, which are characteristic of malicious agents.

System Fault Evaluation Agent (SFEA)
The SFEA is responsible for analyzing the information collected by the SSAs. It has a global view of the system and knows:
i) The availability of resources on the hosts (memory, disk space, etc.). It knows which hosts are overloaded and which can receive agents;
ii) The needs of the agents of the NIDIA, for example whether they need more memory. It identifies when an agent is highly or barely needed, which implies that its replication strategy should be reviewed and possibly altered;
iii) Which recovery action must be executed for each type of detected fault. It can request a change in the replication strategy of a group of agents, a migration, or the creation of new agents.

System Replication Agent (SRA)
The SRA is responsible for replication and recovery management; in other words, for the organization of the groups of replicated agents, adding or removing an agent from a group; for the replication strategy of each group and any changes to it that are necessary; and for the consistency of the replicas. Replication is transparent to the agents. Only the SRA knows the groups, the strategies, and the number of replicas in each group. The SRA knows the location of the active agents and their replicas. Each layer of the system will have an SRA responsible for managing the groups of replicas of the agents within that layer.

Profile Database (PRDB)
The PRDB is the database of profiles. It stores the information related to the standard behavior of each agent of the NIDIA. All the authorized actions for each agent of the system should be registered, as well as the agents that are allowed to send requests to each registered agent. The SSA uses this information in the process of detecting malicious agents.
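The comparison between an agent's needs and a host's resources, which the SSA data makes possible for the SFEA, can be sketched roughly as follows. This is a minimal illustration in Python, not the NIDIA implementation: the class names, field names, and thresholds (`cpu_limit`, `headroom`) are assumptions introduced here for the example.

```python
from dataclasses import dataclass


@dataclass
class AgentReport:
    """Per-agent data collected by an SSA (field names are hypothetical)."""
    name: str
    requests_received: int  # indicates the agent's current importance
    cpu_share: float        # fraction of the host CPU used by the agent
    memory_mb: float        # memory currently used by the agent


@dataclass
class HostReport:
    """Per-host data collected by an SSA (field names are hypothetical)."""
    ip: str
    cpu_load: float         # overall processor usage, 0.0 to 1.0
    free_memory_mb: float
    total_memory_mb: float


def needs_migration(agent: AgentReport, host: HostReport,
                    cpu_limit: float = 0.9, headroom: float = 0.25) -> bool:
    """Return True when the host looks too loaded for the agent.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    # Host processor is saturated.
    cpu_pressure = host.cpu_load >= cpu_limit
    # Host cannot offer the agent room to grow relative to its current usage.
    memory_pressure = host.free_memory_mb < agent.memory_mb * headroom
    return cpu_pressure or memory_pressure
```

With these assumed thresholds, an agent using 200 MB would be flagged for migration on a host with 95% processor load, or on a host with less than 50 MB of free memory.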

4.2 Application of SFTA to NIDIA

The LSIA is the first agent started in the system. It is responsible for managing the agents that compose the detection system and for the interface between the system and the security administrator. Through it, the administrator can manage the status and configuration of the agents, maintain the database, and register the detected occurrences. It is the only agent that communicates directly with the administrator of the system.

The LSIA activates the agents of the NIDIA based on a configuration file filled out by the security administrator. This file contains the name of each agent and the IP address of the host where the agent will be started, since the agents of the NIDIA are distributed over the network.

The LSIA starts an SRA for each layer defined in Figure 1. Each SRA monitors agents with similar characteristics and a specific function. The started agents are registered with the SRA of the corresponding layer. The SRA creates a group of agents containing one active replica and stores this information together with the location of the created agent.

The LSIA activates one SSA on each host where NIDIA agents are executing. If an SSA is already running there, the LSIA only informs it that a new agent must be monitored. The SSA keeps a list of all the agents that it should monitor.

All groups of agents are started with a passive replication strategy, in which there is one active replica and one passive replica. However, depending on the information sent by the SSAs to the SFEA, the strategy can be altered. The number of replicas in each group can also vary. Thus, agents that are important to the current processing will have a strategy and a number of replicas different from those of other agents.

Finally, the LSIA activates the SFEA, which has a global view of the network that facilitates the recovery decision process.
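The initial passive replication setup described above (one active replica and one passive replica per group) can be illustrated with a small sketch. The `Replica` and `ReplicaGroup` classes and the `fail_over` method are hypothetical names chosen for this example; the paper does not specify how the SRA represents its groups.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Replica:
    agent_name: str
    host_ip: str
    active: bool = False
    alive: bool = True


@dataclass
class ReplicaGroup:
    """A replica group as an SRA might keep it (hypothetical structure)."""
    agent_name: str
    strategy: str = "passive"
    replicas: List[Replica] = field(default_factory=list)

    def add(self, host_ip: str) -> Replica:
        # The first replica added to the group becomes the active one.
        replica = Replica(self.agent_name, host_ip, active=not self.replicas)
        self.replicas.append(replica)
        return replica

    def active_replica(self) -> Optional[Replica]:
        for replica in self.replicas:
            if replica.active and replica.alive:
                return replica
        return None

    def fail_over(self, failed_host_ip: str) -> Optional[Replica]:
        """Mark the replica on the failed host as dead, then promote
        the first live passive replica if no active one remains."""
        for replica in self.replicas:
            if replica.host_ip == failed_host_ip:
                replica.alive = False
                replica.active = False
        current = self.active_replica()
        if current is not None:
            return current
        for replica in self.replicas:
            if replica.alive:
                replica.active = True
                return replica
        return None  # no replica left; a new agent would have to be created
```

Because only the group object knows which replica is active, the other agents never need to be told about a fail-over, which matches the transparency requirement stated for the SRA.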

4.3 Detection of faults

Detection of Malicious Agent

The detection of malicious agents is accomplished in two phases:

a) Agents Security
The Agents Security phase consists of two steps that are executed by the agent when it is started. The first step is proposed by [13]. The steps of this phase are described below:

1. Authentication of the digital signature and reliability of cryptography: All the agents of the NIDIA generate a pair of keys, a public key and a private key, when they are started. Messages exchanged among agents are in XML format. The agents keep the private key to themselves and send the public key to an XKMS server. Messages are exchanged using the public key to send a message and the private key to receive it. Certification is used to guarantee the authenticity of the agents that send the messages.

2. Authorization: An agent has a list of capacities that describes which agents can send requests to it. This information is stored in the PRDB, and the agent consults the database to obtain it when it is started. When an agent receives a request, it checks whether the sending agent is registered in its list. If the name of the sender is not part of its list of capacities, it immediately informs the SSA and does not execute the request.
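The authorization step can be sketched as a simple lookup in the receiving agent's list of capacities. The function name, the dictionary layout, and the `notify_ssa` callback below are assumptions made for illustration; the paper only states that the list is loaded from the PRDB and that the SSA is informed of rejected requests.

```python
from typing import Callable, Dict, Set


def handle_request(receiver: str, sender: str, action: str,
                   capacities: Dict[str, Set[str]],
                   notify_ssa: Callable[[str, str, str], None]) -> bool:
    """Return True if the request may be executed, False if it is refused.

    `capacities` maps each agent name to the set of agents allowed to send
    it requests (hypothetical in-memory copy of the PRDB information).
    """
    allowed_senders = capacities.get(receiver, set())
    if sender not in allowed_senders:
        # Unknown sender: report to the SSA and do not execute the request.
        notify_ssa(receiver, sender, action)
        return False
    return True
```

A usage example with made-up agent names: if `capacities = {"SMA": {"SensorAgent"}}`, a request sent to `"SMA"` by `"SensorAgent"` is accepted, while one sent by any other name is refused and reported through `notify_ssa`.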

b) Agents Monitoring
Using an agent that belongs to the analysis layer as an example, we demonstrate how the SFTA works to detect a malicious agent. The agent chosen is the SMA, which is responsible for organizing and pre-formatting the events collected by the sensor agents. The steps of the detection process are as follows:

[i.] The SSA periodically consults the SMA to verify which actions are being executed. For each executed action, the SSA queries the PRDB.
[ii.] It checks whether the action is authorized for the agent that executes it.
[iii.] If the action is authorized, the monitoring process continues.
[iv.] If the action is not authorized, a message is sent to the SFEA. This message contains the following information: type of fault (in this case, malicious agent); agent name; host IP; executed action.
[v.] The SFEA looks up in its rule base which recovery action must be executed for this type of fault.
[vi.] After the analysis, the SFEA informs the SRA that a malicious agent must be disabled and another replica must be activated.

An important point is to know whether the agent that executes the action is itself malicious or whether it received the request as a result of the malicious action of another agent. An agent only accepts requests from authorized agents that are in its list of capacities.
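Steps [i.] to [vi.] can be summarized in a short sketch. The message fields follow the list given in step [iv.]; everything else (the `FaultMessage` class, the function names, the in-memory dictionaries standing in for the PRDB and the SFEA rule base, and the rule text) is a hypothetical illustration rather than the actual NIDIA code.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class FaultMessage:
    """Message sent from the SSA to the SFEA (fields from step [iv.])."""
    fault_type: str
    agent_name: str
    host_ip: str
    executed_action: str


def check_agent(agent_name: str, host_ip: str,
                observed_actions: Iterable[str],
                prdb: Dict[str, Set[str]]) -> List[FaultMessage]:
    """SSA side: compare each observed action with the agent's PRDB profile
    and build one fault message per unauthorized action."""
    authorized = prdb.get(agent_name, set())
    return [
        FaultMessage("malicious agent", agent_name, host_ip, action)
        for action in observed_actions
        if action not in authorized
    ]


# SFEA side: a stand-in rule base mapping fault types to recovery actions.
RECOVERY_RULES: Dict[str, str] = {
    "malicious agent": "disable agent and activate another replica",
}


def choose_recovery(message: FaultMessage,
                    rules: Dict[str, str] = RECOVERY_RULES) -> str:
    """Look up the recovery action that the SFEA would pass on to the SRA."""
    return rules.get(message.fault_type, "no rule defined")
```

In this sketch the SSA only reports; the decision stays with the SFEA and the execution with the SRA, mirroring the division of responsibilities described in Section 4.1.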

5 Future Work

In the present work, we have shown a fault tolerance mechanism for NIDIA, an agent-based intrusion detection system. The mechanism aims to cover all the possible faults in the NIDIA, whether accidental or malicious. The architecture of a society of fault tolerance agents was presented, and the detection of a malicious agent was demonstrated. We intend to extend this demonstration to all the detection methods: detecting which agents and hosts are active, and detecting the importance of the agents during processing and the availability of resources on the hosts where they are executing. We plan to extend our work by applying this fault tolerance mechanism to other agent-based intrusion detection systems and later to other multi-agent systems. An appropriate load balancing algorithm for the SFEA should also be investigated.

6 Conclusion

The use of IDSs is expected to grow in different organizations. However, the lack of fault tolerance can lead to severe problems and can even keep the IDS paradigm from being exploited in emerging applications. An IDS must be fault tolerant in order to resist subversion. It must be able to monitor itself and judge whether it has been tampered with by attacks. If it finds that it has been tampered with, it must be able to recover and reconfigure itself. Continuity of service is one of the useful measures of effectiveness in IDSs. Fault tolerance is introduced in NIDIA through the SFTA society of agents. Providing fault tolerance through monitoring, replication, and migration of the agents can lead the system to a safer and more resistant state. The proposed mechanism seeks to cover all the possible faults in a multi-agent system. With this mechanism, it is possible to detect which agents and hosts are active and replying to requests; to detect the importance of the agents during processing and the availability of resources on the hosts where the agents are executing; and to detect the action of malicious agents. The detection of malicious agents is performed using a list of capacities for each agent and monitoring the actions carried out by each agent of the system.

References

[1] R. S. Campello, R. F. Weber, V. da Silveira Serafim, and V. G. Ribeiro. O sistema de detecção de intrusão Asgaard. Workshop de Segurança - WSeg 2001, 2001.
[2] Centro de Estudos, Resposta e Tratamento de Incidentes de Segurança no Brasil (CERT.br). Available at http://www.cert.br. Access date: 24/10/2005.
[3] Common Intrusion Detection Framework (CIDF). http://www.isi.edu/gost/cidf/. Access date: 10/09/2005.
[4] R. de Lemos and J. L. Fiadeiro. An architectural support for self-adaptive software for treating faults. In A. W. D. Garlan, J. Kramer, editor, Proceedings of the 1st ACM SIGSOFT Workshop on Self-Healing Systems (WOSS'02), pages 39–42, Charleston, SC, USA, November 2002.
[5] C. F. L. et al. The NIDIA project: network intrusion detection system based on intelligent agents. Proceedings of Tenth Latin-Ibero-American Congress on Operations Research and Systems, pages 212–217, 2000.
[6] A. Fedoruk and R. Deters. Improving fault-tolerance by replicating agents. In AAMAS '02: Proceedings of the first international joint conference on Autonomous agents and multiagent systems, pages 737–744, New York, NY, USA, 2002. ACM Press.
[7] L. Guangchun, L. Xianliang, L. Jiong, and Z. Jun. MADIDS: a novel distributed IDS based on mobile agent. SIGOPS Oper. Syst. Rev., 37(1):46–53, 2003.
[8] Z. Guessoum, J.-P. Briot, S. Charpentier, O. Marin, and P. Sens. A fault-tolerant multi-agent framework. In AAMAS, pages 672–673. ACM, 2002.
[9] S. Haegg. A sentinel approach to fault handling in multi-agent systems. In Revised Papers from the Second Australian Workshop on Distributed Artificial Intelligence, pages 181–195, London, UK, 1997. Springer-Verlag.
[10] I. M. Hegazy, T. Al-Arif, Z. T. Fayed, and H. M. Faheem. A multi-agent based system for intrusion detection. IEEE Potentials, 22:28–31, October/November 2003.
[11] F. A. P. Jr., Z. Abdelouahab, and E. Nascimento. Proposal of model and message format for sharing information between CSIRTs and IDSs. In Proceedings of 14th International Congress on Computing, México Federal District, México, 2005.
[12] Z. Khan, S. Shahid, H. Ahmad, A. Ali, and H. Suguri. Decentralized architecture for fault tolerant multi agent system. In Autonomous Decentralized Systems, 2005. ISADS 2005, pages 167–174. IEEE, 2005.
[13] O. Marin, P. Sens, J.-P. Briot, and Z. Guessoum. Towards adaptive fault-tolerance for distributed multi-agent systems. In Proceedings of the 3rd European Research Seminar on Advanced Distributed Systems (ERSADS'2001), pages 195–201, 2001.
[14] E. J. S. Oliveira and Z. Abdelouahab. Secure agent communication languages with XML security standards. 2005.
[15] G. Ramachandran and D. Hart. A P2P intrusion detection system based on mobile agents. In ACM-SE 42: Proceedings of the 42nd annual Southeast regional conference, pages 185–190, New York, NY, USA, 2004. ACM Press.
[16] E. H. Spafford and D. Zamboni. Intrusion detection using autonomous agents. Comput. Networks, 34(4):547–570, 2000.
