With the vast amounts of information now available, turning that information into usable knowledge is vital for today’s society. The exchange of processed information should not violate the confidentiality or privacy of either the information owners or those referred to in it. To deliver technologies that can effectively meet these goals is an important c
The SCIIMS project focuses on combating People Trafficking and People Smuggling as part of the fight against organised crime and the improvement of the security of European citizens. Specific scenarios based on this exemplar will be used for both experimentation and demonstration in order to validate and test new algorithms and technologies.
Development and application of Information Management Techniques enabling information to be fused and shared nationally and trans-nationally within a secure information infrastructure in accordance with European agencies’ information needs;
Development and application of tools to assist decision making in order to predict events and analyse likely consequences and effects on the security of EU citizens.
The project will use existing state-of-the-art products and develop and incorporate new capabilities (beyond the state of the art) in order to speed the introduction of innovative techniques, technologies and systems, which is vital to improving information management and exploitation.
The project will:
Conduct capability focus and technology research, including involvement of User Groups, to identify specific capability gaps. This will include an analysis of legislation (e.g. Data Protection) as well as security infrastructures;
Produce collaborative requirements sets, including User and System Requirements, in order to inform the design of systems and innovative applications;
Design and develop applications and algorithms to enable Trend Analysis of information, Information Fusion, Data Mining/Integration and Decision Tools;
Define and research technology route maps and exploitation plans to enable the Consortium and User Groups to exploit the developed capabilities and technologies, and to disseminate findings and recommendations to the user community and the European Commission;
Conduct system tests, including demonstrations to selected User Groups, in order to seek input and recommendations for other capabilities;
Plan and conduct experimentation in order to verify and measure the improvements and advantages of the developed capabilities and technologies over and above an agreed baseline. Completed experimentation will be analysed in order to inform further iterations of development within the programme.
Some of the key features of the SCIIMS system design include:
The ontology is central to SCIIMS and has a structure matched to the system domain. It defines the semantics and terms used, enabling the user and the SCIIMS applications and tools to interact with the underlying data sources.
Service Oriented Architecture
An Enterprise Service Bus (ESB) will be used to support the SCIIMS Service Oriented Architecture. It will serve as an integration platform that enables existing modules and applications to be exposed as services. The ESB is based on open standards.
SCIIMS will have a Human Computer Interface (HCI) that reflects the business processes followed by a user. The HCI design will take into account the need for language independence and ease of understanding. Novel visualisation techniques for large heterogeneous data sets are key to the success of the research project.
Open Standards and Tools
Where possible, SCIIMS uses open standards and tools. These will aid the future exploitation of the SCIIMS capabilities and technologies and ensure ease of integration into future systems.
Current Research areas include:
Visualisation of Heterogeneous Data Sets
Heterogeneous data sets are visualised as a network, mapping the content of the data warehouse according to defined business logic.
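As a rough illustration of mapping mixed record types onto a network, the sketch below builds an adjacency structure from a few made-up records. The record types, field names and the `LINK_RULES` table are assumptions standing in for the "defined business logic"; they are not taken from the SCIIMS design.

```python
from collections import defaultdict

# Hypothetical records of different types, standing in for data warehouse rows.
records = [
    {"type": "phone_call", "caller": "P1", "callee": "P2"},
    {"type": "vehicle_owner", "person": "P2", "vehicle": "V1"},
    {"type": "border_crossing", "vehicle": "V1", "location": "L1"},
]

# Assumed business logic: which two fields of each record type form a link.
LINK_RULES = {
    "phone_call": ("caller", "callee"),
    "vehicle_owner": ("person", "vehicle"),
    "border_crossing": ("vehicle", "location"),
}


def build_network(rows, rules):
    """Return an undirected graph as {node: {(neighbour, link_type), ...}}."""
    graph = defaultdict(set)
    for row in rows:
        rule = rules.get(row["type"])
        if rule is None:
            continue  # no mapping defined for this record type
        a, b = row[rule[0]], row[rule[1]]
        graph[a].add((b, row["type"]))
        graph[b].add((a, row["type"]))
    return graph


network = build_network(records, LINK_RULES)
for node in sorted(network):
    print(node, sorted(network[node]))
```

A structure like this could then be handed to any graph drawing tool; the choice of tool is outside the scope of this sketch.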
Named Entity Recognition and Classification
This finds and categorises specific types of elements within written text.
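The idea can be illustrated with a very small rule-based recogniser. This is only a sketch under simplifying assumptions: the gazetteer entries, the date pattern and the example sentence are invented, and a research system would typically use trained statistical models rather than fixed lists.

```python
import re

# Hypothetical gazetteer: known surface forms grouped by category.
GAZETTEER = {
    "LOCATION": {"Calais", "Dover"},
    "ORGANISATION": {"Europol"},
}

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
# Assumed date format: "3 March 2010".
DATE_RE = re.compile(r"\b\d{1,2} (?:" + MONTHS + r") \d{4}\b")


def recognise(text):
    """Return (start_offset, surface_form, label) tuples in text order."""
    found = []
    for label, terms in GAZETTEER.items():
        for term in terms:
            for m in re.finditer(r"\b" + re.escape(term) + r"\b", text):
                found.append((m.start(), m.group(), label))
    for m in DATE_RE.finditer(text):
        found.append((m.start(), m.group(), "DATE"))
    return sorted(found)


sentence = ("Europol reported a vehicle travelling from Calais "
            "to Dover on 3 March 2010.")
entities = recognise(sentence)
for start, surface, label in entities:
    print(f"{start:3d}  {label:13s} {surface}")
```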
Entity Resolution (ER) techniques
These aim to identify and group all data elements in heterogeneous data sources that represent the same underlying real-world entity. Maximum entropy learning techniques are used, and the entities can be visualised and managed graphically.
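The grouping step can be sketched as follows. Note that this example deliberately swaps in a simple hand-written matching rule (string similarity on names plus an exact date-of-birth match) in place of the maximum entropy learning mentioned above; the records, field names and the 0.85 threshold are all illustrative assumptions.

```python
from difflib import SequenceMatcher

# Fabricated records, as if drawn from several different sources.
records = [
    {"id": 1, "name": "Jon Smith", "dob": "1980-02-01"},
    {"id": 2, "name": "John Smith", "dob": "1980-02-01"},
    {"id": 3, "name": "Maria Garcia", "dob": "1975-07-12"},
    {"id": 4, "name": "John Smith", "dob": "1991-11-30"},
]


def same_entity(a, b, threshold=0.85):
    """Assumed rule: same date of birth and sufficiently similar names."""
    sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return a["dob"] == b["dob"] and sim >= threshold


def resolve(rows):
    """Group record ids into clusters using pairwise matches (union-find)."""
    parent = {r["id"]: r["id"] for r in rows}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, a in enumerate(rows):
        for b in rows[i + 1:]:
            if same_entity(a, b):
                parent[find(b["id"])] = find(a["id"])

    clusters = {}
    for r in rows:
        clusters.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(clusters.values())


clusters = resolve(records)
print(clusters)
```

Records 1 and 2 end up in one cluster, while record 4 stays separate despite the identical name because its date of birth differs. A learned model would replace `same_entity` with a probability estimated from labelled pairs.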
Anomaly detection identifies patterns in a data set that do not conform to an established pattern, thus allowing abnormal behaviour to be detected.
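One simple way to flag values that depart from an established pattern is a robust outlier score based on the median. The sketch below is an assumption for illustration only: the source does not say which detection method SCIIMS uses, and the daily counts and the 3.5 cut-off are made up.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return (index, value) pairs whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so a single extreme value does not distort the baseline.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread in the data, so nothing can be scored
    return [
        (i, v)
        for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Fabricated daily event counts with one unusual day.
daily_counts = [12, 11, 13, 12, 14, 11, 12, 48, 13, 12]
anomalies = mad_anomalies(daily_counts)
print(anomalies)
```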
Web Data Extraction Techniques
Research is being carried out into the design of techniques and systems that can automatically explore selected sets of websites and find, detect, and extract relevant data for future exploitation of the derived knowledge. Both semi-automatic and automatic crawling techniques are being researched. The semi-automatic approach is supervised and requires human intervention: an ad-hoc wrapper is created for each target data source, either programmatically or with the help of a tool, to extract the contents from the web pages. Automatic web data extraction uses unsupervised techniques that require no human intervention beyond an initial description of the intended task.
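To make the semi-automatic approach concrete, here is a minimal sketch of a hand-written wrapper for one hypothetical page layout, using only the Python standard library. The HTML snippet, the class names `listing`, `title` and `phone`, and the `ListingWrapper` class are invented for the example and do not describe any real target site.

```python
from html.parser import HTMLParser

# Fabricated page fragment standing in for a target data source.
PAGE = """
<html><body>
<div class="listing"><span class="title">Van for hire</span><span class="phone">+44 1632 960000</span></div>
<div class="listing"><span class="title">Boat charter</span><span class="phone">+44 1632 960001</span></div>
</body></html>
"""


class ListingWrapper(HTMLParser):
    """Ad-hoc wrapper: knows this one layout and nothing else."""

    FIELDS = {"title", "phone"}

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per listing
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls == "listing":
            self.rows.append({})
        elif tag == "span" and cls in self.FIELDS and self.rows:
            self._field = cls

    def handle_data(self, data):
        if self._field:
            row = self.rows[-1]
            row[self._field] = row.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


wrapper = ListingWrapper()
wrapper.feed(PAGE)
rows = wrapper.rows
print(rows)
```

The wrapper breaks as soon as the site changes its markup, which is the maintenance cost that motivates research into fully automatic extraction.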
Alternative Competing Hypotheses
This provides a user with information on the strengths and weaknesses of their hypotheses for a particular investigation. Aspects being considered include:
Matching Alternative Competing Hypotheses to a user’s business processes.
Providing a visual representation of the logic and probabilities of the information leading to the hypotheses.
Providing information on where further information may be needed and the key information contributions to the overall hypothesis.
Linking Alternative Competing Hypotheses to the Ontology.
Using signal processing to identify patterns in the data.
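The kind of information such a tool could give a user can be sketched with a small evidence-versus-hypothesis matrix. The scoring rule here (summing the weights of evidence that contradicts each hypothesis) is one common convention for competing-hypothesis analysis and is an assumption, not the SCIIMS algorithm; the hypotheses, evidence items, weights and ratings are fabricated.

```python
HYPOTHESES = ["H1", "H2", "H3"]

# (evidence id, assumed weight, rating per hypothesis)
# Ratings: +1 consistent, 0 neutral, -1 inconsistent.
EVIDENCE = [
    ("E1", 1.0, {"H1": 1, "H2": 1, "H3": 1}),
    ("E2", 2.0, {"H1": 1, "H2": -1, "H3": 0}),
    ("E3", 1.0, {"H1": 0, "H2": -1, "H3": -1}),
]


def inconsistency_scores(evidence, hypotheses):
    """Sum the weights of all evidence that contradicts each hypothesis."""
    scores = {h: 0.0 for h in hypotheses}
    for _, weight, ratings in evidence:
        for h in hypotheses:
            if ratings[h] < 0:
                scores[h] += weight
    return scores


def diagnostic_items(evidence):
    """Evidence rated differently across hypotheses helps tell them apart."""
    return [name for name, _, ratings in evidence
            if len(set(ratings.values())) > 1]


scores = inconsistency_scores(EVIDENCE, HYPOTHESES)
ranking = sorted(HYPOTHESES, key=lambda h: scores[h])
diagnostic = diagnostic_items(EVIDENCE)
print(scores, ranking, diagnostic)
```

In this toy matrix H1 has the least contradicting evidence, and E1 contributes nothing to the comparison because it is consistent with every hypothesis, which hints at where further information might be worth collecting.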
The SCIIMS project has established an Independent Ethics Advisory Board (EAB). The EAB has a direct reporting line to the European Commission and is responsible for advising the project on any ethical issues raised and for ensuring compliance with any required legislation and/or guidance. The project is using fabricated datasets to stimulate the system; however, the relevant security measures for the protection of data and information are taken into account to ensure that any future fielded system would be compliant.
Collaborative research project funded by the European Commission 7th Framework Programme (2007-2013) under grant agreement No. 218223. Copyright © SCIIMS.