The 6th IEEE International Conference on E-Health and Bioengineering – EHB 2017

Grigore T. Popa University of Medicine and Pharmacy, Sinaia, Romania, June 22-24, 2017

978-1-5386-0358-1/17/$31.00 ©2017 IEEE

Advanced solutions for medical information storing: Clinical data warehouse

Abstract: The paper presents a number of research challenges for
storing medical data in Health Information Systems (HIS), such as
complex-data modeling features, advanced classification structures
and the integration of very complex data, and demonstrates how this
area may benefit from the functionality offered by data warehousing.
In addition, a case study is presented which configures a data
warehouse, developed using multi-agent technology, that integrates
information from the heterogeneous sources of a healthcare unit.
Keywords: healthcare, clinical data warehouse, electronic health
records, data models, multi-agent technology
I. INTRODUCTION
Health Information Systems (HIS) are the most promising solution for
optimizing the use of existing healthcare infrastructure. The
clinical field requires data processing models that are stronger and
more comprehensive than those found in the conventional
multidimensional approaches used so far. In addition, the data model
should provide enhanced support for searching, recognizing, sorting
and classifying the data being evaluated.
Two concepts are now associated with medical data storing: Big Data
(related to the huge amount of information to be stored and analyzed)
and the Data Warehouse (DW), as the best logistic support for these
operations. The main repositories used by healthcare units for
maintaining the data processed in everyday operations are Relational
Database Management Systems (RDBMS). RDBMS have been optimized to
perform these operations efficiently; such applications are usually
named online transaction processing (OLTP) applications.
Today, classical RDBMS are complemented by alternative Data
Management Systems (DMS), specially designed to cope with the huge
volumes of data encountered in Big Data [1].
Besides managing and processing data, HIS increasingly resort to
artificial intelligence algorithms and expert systems involved in the
decision-making process, thus constituting Decision Support Systems
(DSS). Even dedicated solutions such as SQL have been extended with
new constructions (e.g. NewSQL or NoSQL) that apply new indexing and
query optimization techniques to run complex queries fast [2].
We should not overlook the fact that medical educational
institutions, too, manage increasing volumes of Big Data every year.
Using a DW in the process of accumulating new knowledge facilitates
data mining operations such as the analysis of key performance
indicators [3] or the integration of information from several
databases based on semantic criteria [4].
In the last 2-3 years, many applications operating on massive data
stores have turned to Cloud Computing. DW also benefit from this new
paradigm in order to provide analytical views online and in real
time. Among the advantages of the Cloud from which DW benefit are
policy flexibility, availability, adaptability, scalability,
virtualization, etc. [5].
Given the trends mentioned above, DW clearly play a decisive role in
healthcare (HC) and clinical procedures, by improving the efficiency
of the logistic support for data storage and for decision makers.
Therefore, this paper analyses new solutions for linking patient data
from many databases more efficiently into one data warehouse in order
to perform clinical analytics.
II. DATA WAREHOUSES IN MEDICAL APPLICATIONS
A. Characteristics of Data Warehousing
A data warehouse is a special database, built through specific
methods from the operational data held by an organization. The
purpose of such a repository is to provide executive management with
architectures and tools useful for a systemic understanding of the
organization and for the use of data in strategic decision making.
From a historical perspective, William Inmon is considered to be the
founder of the concept of "data warehousing" and the author of its
first rigorous definition: "A data warehouse is a subject oriented,
integrated, non-volatile and time-variant collection of data in
support of management's decisions" [6]. To see to what extent the
terms listed in the above definition remain relevant, each of them is
briefly commented on below, with an emphasis on its customization for
medical applications.
Subject oriented. A data warehouse is organized around major subjects
such as customers, suppliers, libraries, etc. These subjects require
information from various sources, but the DW is focused on modeling
and analyzing data for decision making, and excludes data that are
not useful in this process.
Integrated. This feature refers to data consistency, achieved by
expressing data from multiple heterogeneous sources in a unified form
(in HC: files, medical images, electronic records). The term
currently used for this essential requirement is interoperability,
and it applies to all data processing operations.
Non-volatile. This refers to data persistence, meaning that an update
of the data warehouse made as a result of changes to the source data
should not alter or delete existing data.
Time-variant. Data is stored to provide information in a historical
perspective, reaching back many years in the case of medical
information, so that decision makers can see the successive values of
the same data item, determine its evolution over time and calculate
trends in certain indicators.
Management's decisions. This phrase indicates that DW data is
traditionally used at the strategic level (this means management) and
is optimized for data analysis (this means decision). In HC,
eliminating data redundancy is essential for simplifying
content-based search, in particular in medical imaging. Due to the
large volume of data in repositories, special tools and technologies
are needed to extract the data relevant for decision support. These
can be grouped into two essential categories: the first relates to
dynamic, real-time multidimensional analysis (OLAP – On-Line
Analytical Processing), the other to the use of statistical methods
for extracting knowledge from data (data mining), focused on the
discovery of significant patterns in data collections.
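The non-volatile and time-variant properties can be illustrated with an append-only table in which every load adds a time-stamped row instead of overwriting the previous value. The following is a minimal sketch using Python's built-in sqlite3 module; the table name `lab_result_history`, its columns and the sample values are assumptions made for this illustration and are not taken from the system discussed in this paper.

```python
import sqlite3

def create_store() -> sqlite3.Connection:
    """Create an in-memory store with one append-only table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE lab_result_history (
               patient_id  TEXT NOT NULL,
               indicator   TEXT NOT NULL,
               value       REAL NOT NULL,
               recorded_on TEXT NOT NULL   -- ISO date: the time-variant key
           )"""
    )
    return conn

def load(conn, patient_id, indicator, value, recorded_on):
    """Non-volatile load: always INSERT a new row, never UPDATE or DELETE."""
    conn.execute(
        "INSERT INTO lab_result_history VALUES (?, ?, ?, ?)",
        (patient_id, indicator, value, recorded_on),
    )

def history(conn, patient_id, indicator):
    """Return the successive values of one indicator, oldest first."""
    rows = conn.execute(
        "SELECT recorded_on, value FROM lab_result_history "
        "WHERE patient_id = ? AND indicator = ? ORDER BY recorded_on",
        (patient_id, indicator),
    )
    return rows.fetchall()

def trend(conn, patient_id, indicator):
    """Difference between the latest and the earliest stored value."""
    values = [value for _, value in history(conn, patient_id, indicator)]
    return values[-1] - values[0] if values else 0.0

# Illustrative sample data only.
conn = create_store()
load(conn, "P001", "glucose", 5.5, "2015-03-01")
load(conn, "P001", "glucose", 6.0, "2016-03-01")
load(conn, "P001", "glucose", 7.0, "2017-03-01")
print(history(conn, "P001", "glucose"))
print(trend(conn, "P001", "glucose"))
```

Because earlier rows are never modified, the full sequence of values stays available for trend calculations, at the cost of storage that grows with every load.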
B. Clinical Data Warehouse architecture
Even if the DW architecture varies depending on the specifics of each
organization, a basic architecture model can be defined with three
main components: 1) the system for acquiring data from OLTP systems
and other sources; 2) the data warehouse itself and its data
management system; 3) the system for analyzing and presenting data
from the data warehouse. A more complex design, suitable for
applications where data come from multiple heterogeneous sources,
such as HC applications, includes an additional layer for data
integration. Such a complex architecture is divided into four
distinct levels of data processing (see Fig. 1), as follows:
• The data sources – heterogeneous data collected from
various operational systems of the organization. As a
rule, these data are integrated through a separate
module, called the data warehouse source module;
• Level of data transformation – uses a process of
extracting, transforming and loading data (ETL –
Extract, Transform, Load), which involves processing
the data in terms of integrity, accuracy and format;
• The data warehouse – contains the processed data
loaded into multi-dimensional structures and
aggregated at different levels, prepared for use in
analysis. At this level the system can be designed as
multiple data marts (the term denotes a subset of the
DW oriented towards transactional operations);
• Presentation, reporting and data mining – involves the
use of Business Intelligence tools for the analysis and
interpretation of the information provided. This level
uses OLAP analysis functionality.

Fig. 1. CDW basic architecture
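The four levels listed above can be sketched end to end in a few lines. The example below is only an illustration under stated assumptions: the two sources, the field names, the tables `dim_patient` and `fact_lab` and the validation rule are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Level 1: two heterogeneous sources (hypothetical sample data).
LAB_CSV = (
    "patient;date;test;value\n"
    "p001;01/03/2017;glucose;5.5\n"
    "P002;02/03/2017;glucose;\n"
    "P002;03/03/2017;glucose;6.5\n"
)
ADMISSIONS = [
    {"pid": "P001", "ward": "Cardiology"},
    {"pid": "P002", "ward": "Endocrinology"},
]

def extract_lab(text):
    """Extract: parse the semicolon-separated lab export into dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))

def transform_lab(rows):
    """Transform: normalise ids and dates, drop rows that fail validation."""
    clean = []
    for row in rows:
        if not row["value"]:  # integrity check: a measurement is required
            continue
        day, month, year = row["date"].split("/")
        clean.append({
            "patient_id": row["patient"].upper(),
            "date": f"{year}-{month}-{day}",  # unified ISO date format
            "test": row["test"],
            "value": float(row["value"]),
        })
    return clean

def load(conn, lab_rows, admissions):
    """Load: write the cleaned data into warehouse tables."""
    conn.execute("CREATE TABLE dim_patient (patient_id TEXT PRIMARY KEY, ward TEXT)")
    conn.execute("CREATE TABLE fact_lab (patient_id TEXT, date TEXT, test TEXT, value REAL)")
    conn.executemany("INSERT INTO dim_patient VALUES (:pid, :ward)", admissions)
    conn.executemany(
        "INSERT INTO fact_lab VALUES (:patient_id, :date, :test, :value)", lab_rows
    )

def report(conn):
    """Presentation: average value per ward and test."""
    return conn.execute(
        "SELECT p.ward, f.test, AVG(f.value) FROM fact_lab f "
        "JOIN dim_patient p USING (patient_id) "
        "GROUP BY p.ward, f.test ORDER BY p.ward"
    ).fetchall()

conn = sqlite3.connect(":memory:")
load(conn, transform_lab(extract_lab(LAB_CSV)), ADMISSIONS)
print(report(conn))
```

The row with the empty measurement is rejected during transformation, so it never reaches the warehouse tables or the report.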
Regardless of the architecture used, three functional levels can be
found in a data warehouse, which can be grouped into modules:
1) Operational module – represented by the institution's data, which
can come from different applications or distributed systems.
Integrating these data involves a process of extraction,
conditioning, cleaning, fusion, validation and loading.
2) The central module of the data warehouse – represented by the
general database system and the server on which it runs. For
implementation, one can choose either a distributed, decentralized
system, where the data is stored in independent units (Independent
Data Marts), or a single centralized data source. The first option
currently seems to prevail, through the use of MAS (Multi-Agent
Systems) architectures.
3) Strategic module (decision). By using the different means of
access to information and the processing technologies available,
users can obtain information that helps determine the strategy for
action in various decision-making and analysis processes.
C. Clinical Data Warehouse data flow and standards
In the elaboration of a CDW project, all documents in the data
processing chain, organized in standardized forms called Case Report
Forms (CRFs), must be compatible with recognized standards. Fig. 2
shows the data flow in a typical CDW architecture. The central CDW
block has one main input, from the Order Communication System (OCS),
and two outputs: one for the standard End of Treatment (EoT) reports
and one feeding the XML formats of a medical standard such as CDISC.
The Clinical Data Interchange Standards Consortium (CDISC) is a
nonprofit organization that develops healthcare standards for both
clinical practice and research. CDISC Foundational Standards (see
https://www.cdisc.org/standards/foundational) include specifications
and models for data representation, among them the Protocol
Representation Model (PRM), the Study Data Tabulation Model (SDTM),
the Standard for Exchange of Nonclinical Data (SEND), the Analysis
Data Model (ADaM) and Define-XML.

Fig. 2 Clinical Data Warehouse (CDW) Data Flows and Standards
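The two outputs of the central CDW block can be pictured as two renderings of the same stored record: a human-readable EoT report and an XML document for exchange. The sketch below uses Python's standard `xml.etree.ElementTree` module. The record fields and the element and attribute names are hypothetical and do not follow any CDISC schema; a real feed would have to conform to the chosen CDISC specification.

```python
import xml.etree.ElementTree as ET

# A treatment record as it might be stored in the CDW (hypothetical fields).
record = {
    "patient_id": "P001",
    "order_id": "OCS-42",
    "treatment": "physiotherapy",
    "sessions": 10,
    "outcome": "improved",
}

def eot_report(rec):
    """Output 1: a plain-text End of Treatment (EoT) summary."""
    return (
        f"End of Treatment report for patient {rec['patient_id']}\n"
        f"Order: {rec['order_id']}\n"
        f"Treatment: {rec['treatment']} ({rec['sessions']} sessions)\n"
        f"Outcome: {rec['outcome']}"
    )

def xml_feed(rec):
    """Output 2: the same record serialised as XML (illustrative element names)."""
    root = ET.Element("TreatmentRecord", {"patientId": rec["patient_id"]})
    ET.SubElement(root, "Order").text = rec["order_id"]
    treatment = ET.SubElement(root, "Treatment", {"sessions": str(rec["sessions"])})
    treatment.text = rec["treatment"]
    ET.SubElement(root, "Outcome").text = rec["outcome"]
    return ET.tostring(root, encoding="unicode")

print(eot_report(record))
print(xml_feed(record))
```

Keeping both outputs as pure functions of the stored record means the report and the XML feed cannot drift apart in content.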
The foundation for a CDW is the electronic health record (EHR)
concept and its associated notions of electronic medical record (EMR)
and electronic patient record (EPR). The terms EHR, EMR and EPR were
often used interchangeably, although clear differences between them
have since been defined. The EHR is defined as a collection of
electronic health information of individual patients [7]. The EMR is
the patient record created in hospitals and can serve as a source for
the EHR. The EPR is an application for recording the personal
electronic health data that each patient makes available to health
care providers. The EHR is the central component of the IT
infrastructure of a modern medical care unit, but the spread of
information across all the different levels of recording and storage
creates the need for a common standard for patients' medical data.
One accepted standard of this kind is openEHR [7], an open standard
specification in health informatics that describes the management,
storage, retrieval and exchange of health data in electronic health
records (EHRs).
The openEHR approach distinguishes a two-level model structure: the
"Reference Model" (RM) and "archetypes". The RM is used to represent
the generic properties of medical records. Archetypes are meta-data
used to define data model specifications for the particular
requirements of a medical profession or specialty service. The RM
includes the primary information models (IM), while the archetype
model (AM) includes the Archetype Definition Language (ADL) and the
Archetype Object Model (AOM). Fig. 3 illustrates the relationships
between packages in the openEHR architecture.
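The two-level idea can be illustrated with a small analogy: a generic class stores any entry without clinical knowledge, while an archetype expresses the constraints of one clinical concept as data. The sketch below is plain Python and does not use ADL or any openEHR library; `Entry`, `Archetype` and the blood-pressure constraints are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    """Reference-model level: a generic record with no clinical knowledge."""
    archetype_id: str
    values: dict = field(default_factory=dict)

@dataclass
class Archetype:
    """Archetype level: constraints for one clinical concept, expressed as data."""
    archetype_id: str
    ranges: dict  # attribute name -> (minimum, maximum), both inclusive

    def validate(self, entry: Entry) -> list:
        """Return a list of violations (empty if the entry conforms)."""
        if entry.archetype_id != self.archetype_id:
            return [f"entry is not a {self.archetype_id}"]
        problems = []
        for name, (low, high) in self.ranges.items():
            if name not in entry.values:
                problems.append(f"missing attribute: {name}")
            elif not low <= entry.values[name] <= high:
                problems.append(f"{name} out of range: {entry.values[name]}")
        return problems

# Hypothetical archetype; the limits are illustrative, not clinical guidance.
blood_pressure = Archetype(
    archetype_id="blood_pressure",
    ranges={"systolic": (0, 1000), "diastolic": (0, 1000)},
)

ok = Entry("blood_pressure", {"systolic": 120, "diastolic": 80})
bad = Entry("blood_pressure", {"systolic": -5})

print(blood_pressure.validate(ok))
print(blood_pressure.validate(bad))
```

New clinical concepts are then added by defining further `Archetype` instances, while the storage class `Entry` stays unchanged, which is the separation the two-level approach aims for.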

Fig. 3. The relationship between packages in openEHR
III. CLINICAL DATA WAREHOUSE ARCHITECTURE BASED ON OLAP AND MAS TECHNOLOGIES

In the multidimensional model, CDW data is organized into multiple
dimensions, and each dimension represents a hierarchy of categories.
The data warehouse content is analyzed using OLAP technology in order
to discover trends, behavior patterns, anomalies or various
dependencies between data. OLAP being a relatively new technology,
the architecture model (Fig. 4) that has become established for
multidimensional analysis-oriented systems is a client/server model
with three layers.

Fig. 4. The architecture of an OLAP system
• The Data Warehouse forms the lowest layer, responsible
for storing and retrieving data. Transactional
applications typically use relational systems, but for
data warehouses multidimensional systems are also
used. Given the large volume of data, it is advisable
that the RDBMS used provide support for parallel and
distributed processing, have mechanisms for indexing
and optimization, and provide a high level of safety;
• The OLAP analytic engine (OLAP engine) has the task of
taking the requests expressed by users and, by
consulting the metadata, generating the queries needed
to obtain the data that is returned to the clients. In
addition, the data obtained is processed at this
level: generating a series of queries, manipulating
data and synthesizing results;
• Metadata – stores the dimensions, their members and
the multiple hierarchical structures of the
dimensions, i.e. the information that appears on the
axes of the cubes and is presented to the user as row
or column names (pivot table);
• OLAP applications are the tools used by the end user.
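The interplay of the layers described above can be sketched as follows: a relational store holds the facts, a metadata dictionary describes the dimension hierarchies, and a small engine function turns a user request into a roll-up query. This is an illustrative sketch only; the table `fact_admission`, the dimension names and the sample figures are assumptions, and SQLite stands in for the storage layer.

```python
import sqlite3

# Storage layer: a small fact table (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_admission (year INTEGER, month INTEGER, "
    "department TEXT, ward TEXT, patients INTEGER)"
)
conn.executemany(
    "INSERT INTO fact_admission VALUES (?, ?, ?, ?, ?)",
    [
        (2016, 1, "Internal", "Cardiology", 30),
        (2016, 2, "Internal", "Cardiology", 25),
        (2016, 1, "Internal", "Endocrinology", 10),
        (2017, 1, "Surgery", "Orthopaedics", 12),
    ],
)

# Metadata layer: each dimension is a hierarchy of levels, coarsest first.
METADATA = {
    "time": ["year", "month"],
    "unit": ["department", "ward"],
}

def rollup_query(levels):
    """OLAP engine: translate a request {dimension: level} into SQL."""
    columns = []
    for dimension, level in levels.items():
        hierarchy = METADATA[dimension]
        # Grouping by a level implies grouping by all coarser levels above it.
        columns.extend(hierarchy[: hierarchy.index(level) + 1])
    cols = ", ".join(columns)
    return (
        f"SELECT {cols}, SUM(patients) FROM fact_admission "
        f"GROUP BY {cols} ORDER BY {cols}"
    )

def analyse(levels):
    """Client-facing call: run the generated query and return the rows."""
    return conn.execute(rollup_query(levels)).fetchall()

print(analyse({"time": "year"}))
print(analyse({"time": "year", "unit": "department"}))
```

Column names are taken only from `METADATA`, never from free user input, so an unknown dimension or level raises an error instead of reaching the database.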
From the end user's point of view, the application must provide two
important features: free navigation through the data warehouse in
search of relevant information, and different ways of presenting
data. Requirements management and development for OLAP, although
similar to those for query and reporting, are generally more complex.
Commissioning an OLAP system and its data access software requires a
clear understanding of the institution's data model and of the
analytical functions required by executive and strategic management.
Multi-agent Systems (MAS) offer a new way of analyzing problems and
designing systems, and of dealing with complexity, distribution and
interactivity. Multi-agent healthcare systems have provided a clear
means of monitoring the agents' behavior, with significant impact on
their process of knowledge acquisition and validation. By including
MAS as a core component of the CDW architecture, it is possible to
conceive agents that send messages and interact with people or other
agents, aiming at distributed interactive simulation environments.
The MAS structure facilitates access to the information stored in the
databases, which is carried out through queries on the associated
operational databases. Users often need to make several related
queries. An RDBMS has no way to recognize and exploit the
optimization opportunities arising from executing many related
queries together. Here, agent technology may give special support in
order to help users. The data of the CDW is accessed using not only
OLAP query engines, but also knowledge extraction algorithms,
information visualization tools, statistical packages and report
generators.
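One way an agent could exploit related queries is to collect them and answer them from a single pass over the data. The sketch below is an illustrative assumption, not the mechanism of the system described in this paper: the `QueryAgent` class, its message format and the `stay` table are hypothetical.

```python
import sqlite3
from collections import defaultdict

class QueryAgent:
    """Collects related aggregate requests and answers them with one query."""

    def __init__(self, conn):
        self.conn = conn
        self.inbox = []  # pending (sender, ward) request messages

    def receive(self, sender, ward):
        """Message from a user or another agent: 'average stay for this ward'."""
        self.inbox.append((sender, ward))

    def process(self):
        """Answer all pending requests with a single scan of the table."""
        if not self.inbox:
            return {}
        wards = sorted({ward for _, ward in self.inbox})
        placeholders = ", ".join("?" for _ in wards)
        rows = self.conn.execute(
            f"SELECT ward, AVG(days) FROM stay "
            f"WHERE ward IN ({placeholders}) GROUP BY ward",
            wards,
        ).fetchall()
        averages = dict(rows)
        replies = defaultdict(dict)
        for sender, ward in self.inbox:
            replies[sender][ward] = averages.get(ward)  # None if ward unknown
        self.inbox.clear()
        return dict(replies)

# Hypothetical operational data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stay (ward TEXT, days INTEGER)")
conn.executemany(
    "INSERT INTO stay VALUES (?, ?)",
    [("Cardiology", 4), ("Cardiology", 6), ("Surgery", 3)],
)

agent = QueryAgent(conn)
agent.receive("dr_a", "Cardiology")
agent.receive("dr_b", "Surgery")
agent.receive("dr_b", "Cardiology")
print(agent.process())
```

Three requests from two senders are served by one database query; a fuller agent would also decide when to flush its inbox, for example after a time window or a maximum batch size.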
IV. CONCLUSIONS
Motivated by the increasing use of multidimensional databases for
data analysis in complex application areas such as healthcare, this
paper has investigated several aspects of data modeling and query
processing for complex multidimensional data. Some important remarks
were derived from the analysis of the specific requirements for
Clinical Data Warehouses. Data size and data volume are important
parameters to take into consideration when dimensioning the CDW and
developing agents. The need for consistent availability and good
performance under continuously demanding and changing use interferes
with system stability. A growing set of applications sharing the data
must be supported, and the data warehouse must be adaptive, i.e.,
able to deal with rapid change in business environments, strategies
and activities. Consequently, as the CDW becomes larger and more
complex, the difficulty of managing it rises, and these requirements
can be addressed only partly by the creation of external tools and
management facilities.
As a proof of concept, an extended multidimensional data model was
presented, which could be implemented using standard OLAP technology
and techniques such as RDBMSs and pre-aggregation, built around a
multi-agent system. The system considerably eased the integration of
OLAP data with complex external data and allowed data to be handled
using the most appropriate data model and technology. The proposed
CDW architecture supports, at the same time, small transactions
executed by users and very large transactions executed by software
agents during the loading of data into the data warehouse.
REFERENCES

[1] A. Moniruzzaman, NewSQL: Towards Next-Generation Scalable
RDBMS for Online Transaction Processing (OLTP) for Big Data
Management, arXiv:1411.7343, pp. 1-14, 2014
[2] M. Ayadi, R. Bouslimi, J. Akaichi, A framework for medical and health
care databases and data warehouses conceptual modeling support,
Network Modeling Analysis in Health Informatics and Bioinformatics,
Volume 5, Issue 1, pp. 1-21, 2016
[3] O. Moscoso-Zea, A. Sampedro, S. Lujan-Mora, Datawarehouse design
for educational data mining, 15th International Conference on
Information Technology Based Higher Education and Training, pp. 1-6,
2016
[4] F. Nammour, K. Danas, N. Mansour, CorporateMeasures: A clinical
analytics framework leading to clinical intelligence, IEEE 18th Int.
Conf. on E-health Networking, Applications and Services, pp. 1-6, 2016
[5] A. Ettaoufik, M. Ouzzif, Query's optimization in data warehouse on the
cloud using fragmentation, International Conference on Next Generation
Networks and Services, pp. 145-148, 2014
[6] W. H. Inmon, Building the data warehouse, Wiley-QED, 1992
[7] P. Kierkegaard, Electronic health record: Wiring Europe's healthcare,
Computer Law & Security Review, 27 (5), pp. 503–515, 2011
[8] G. Ellingsen, B. Christensen, L. Silsand, Developing Large-scale
Electronic Patient Records Conforming to the openEHR Architecture,
Procedia Technology, Vol. 16, pp. 1281-1286, 2014
