Cryptographic accelerator – state of the art [629057]

Cryptographic accelerator – state of the art
Roxana Gabriela Sorbun1
coordinator: s. l. dr. ing. Vlad Ciobanu
Abstract — This paper describes the content of a research
conducted in the cryptographic accelerators domain that will
help me to implement a cryptographic accelerator using FPGA
and to make the encrypted communication between FPGA and
PC.
I. INTRODUCTION
In the following section, a state of the art is presented
for cryptographic accelerators domain. After conducting this
research an encrypted communication between a PC and an
FPGA will be developed.
For developing a cryptographic accelerator two main arti-
cles were studied and an elaborate research in Cryptography
andFPGA domains were done. To realize the communication
between a PC and FPGA an elaborate study in this aria was
ass well done.
II. C RYPTOGRAPHY
Cryptography is the study of mathematical methods related
to information security, able to ensure the confidentiality,
authentication, and non-repudiation of messages, and data
integrity circulated.
Next it will be useful to define the principal notions used
in cryptography:
cryptography (science of secret writing) – is the defen-
sive side of cryptology, having as activity object the
development of cryptographic systems and the rules
used;
cryptanalysis – is the offensive side of cryptology,
having as activity object the study of their own crypto-
graphic systems to provide the necessary characteristics
for them to complete the function that they were de-
signed for;
cryptographic algorithm – is an array of reversible trans-
formations through the array of clear messages from a
language is transformed into an array of cryptograms;
encryption key – is a particular convention, materialized
by a word, phrase, number, numeric string, etc. and
regulating the operation of encryption;
cryptographic protocol – a set of rules, between two
or more partners through which occurs authentication
operation and/or transfer of key or message.
cryptographic system – is composed of three elements:
cryptographic algorithm, key generator system, and
distribution protocol for encryption keys;
1Roxana S. is with Faculty of Automatic Control and Computer Sci-
ence, Master of Science in Advanced Computer Architectures, University
Politehnica of Bucharest, Bucharest, Romania
coordinator: s. l. dr. ing. Vlad Ciobanudeciphering – is the inverse operation of ciphering and
consists of applying the known ciphering system onto
the cryptograms to find out the clear message;
decryption – is the operation whereby, only according to
the cryptograms made with an unknown cipher, the clear
message which was encrypted is highlighted and the
cryptographic system characteristics which were used
are determined.
Cryptography is a component from a bigger field known
as information security. The objectives that are prior to this
field are:
privacy – property to keeping the information secret and
accessible only to authorized persons;
data integrity – property to avoid any unauthorized
modification of information (insertion, deletion, substi-
tution);
authentication – property to identify an entity in accor-
dance with certain standards, it is composed of entity
authentication, information source authentication;
non-repudiation – property to prevent an entity from
denying a previous commitments or actions.
Cryptography’s fundamental goal is to adequately address
these four areas in both theory and practice. There are a
number of basic cryptographic tools (primitives) used to
provide information security. Some example of primitives
includes encryption schemes, hash functions, and digital
signature schemes. In Fig. 1 it can be observed how these
primitives are listed as a schematic and how they relate to
each other. These primitives should be evaluated with respect
to various criteria such as:
level of security – usually defined by the amount of work
necessary to defeat the objective;
functionality – to meet various security objectives the
primitives need to be combined, the basic property of
the primitive defines which primitives are more effective
for a given objective;
methods of operation – when a primitive is applied in
different ways and with different inputs it can inhibit
different characteristics;
performance – efficiency of a primitive in a particular
mode of operation;
ease of implementation – the difficulty of realizing a
primitive and the complexity of implementing it in a
software or hardware environment.
Cryptography is an intersection of mathematics, computer
science, and electrical engineering disciplines. Places, where
cryptography is most common found, are ATM cards, com-

Fig. 1
A taxonomy of cryptographic primitives
puter passwords, and electric commerce.
Cryptography refers to encryption, which is the process
of converting ordinary information (plaintext) into unintelli-
gible information (ciphertext). Decryption is the reverse of
encryption, which is the process of converting unintelligible
information to ordinary information. A cipher (or chyper) is
the equivalent name for an algorithm used for encryption and
decryption.
Symmetric-key cryptography was the only encryption
method known publicly until June 1976 and it refers to
encryption methods in which the sender and receiver share
the same key.
Data Encryption Standard (DES) and Advanced Encryp-
tion Standard (AES) are block cipher designs that have been
designed by the US government. DES is quite popular and
is used across a wide range of applications, from ATM
encryption to e-mail privacy and secure remote access.
Cryptographic hash functions are a type of cryptographic
algorithm. They take a message of any length as an input
and as an output they take a short fixed length hash (any
function that can be used to map data of arbitrary size to
data of fixed size).
There are many cryptographic algorithms[1], but the most
known ones are:
DES (Data Encryption Standard): operates on 64-bit
blocks of data, using a 56-bit key;
RSA (public key algorithm invented by Rivest, Shamir
and Adleman): the key used for encryption is different
from (but related to) the key used for decryption;
HASH (hash algorithm): used for computing a con-
densed representation of a fixed length message/file;
MD5: takes a message of up to 264 bits and reduces it
to a digest of 128 bits (16 bytes);AES (Advanced Encryption Standard);
SHA-1: similar in structure to MD5, but producing a
digest of 160 bits (20 bytes);
HMAC (Hash Message Authentication Code): a hashing
method that uses a key in conjunction with an algo-
rithm such as MD5 (HMAC-MD5) or SHA-1 (HMAC-
SHA1).
III. C RYPTOGRAPHIC ALGORITHMS
In this section, I will describe a few algorithms that are
well known in cryptography. The most known cryptographic
algorithms are DES (Data Encryption Standard), RSA (pub-
lic key algorithm invented by Rivest, Shamir and Adleman),
HASH (hash algorithm), AES (Advanced Encryption Stan-
dard), SHA, HMAC (Hash Message Authentication Code).
Cryptography algorithms are ranked by their speed in
encrypting/decrypting data and their robustness to withstand
attacks. Real-time processing of data encryption/decryption
is essential in network based applications to keep pace with
the input data inhalation rate. The encryption/decryption
steps are computationally intensive and exhibit high degree
of parallelism. Field programmable gate arrays (FPGA) and
graphics processing units (GPU) are being employed as
cryptographic co-processors to target different cryptography
algorithms.
A. Caesar encrypting system
Caesar encrypting algorithm is a monoalphabetic encryp-
tion system for which the clear text is built from the roman
alphabet (A:::Z )and the encryption key is represented by a
numberk2f0;:::;25g.
In the preprocessing phase, the space between words is
ignored or replaced with the less used character/letter from
that language.
Each letter from the source text is associated with the
lexicographical order x. For encryption, it is replaced with
the code character (x+k) mod 26 and for decryption the
inverse rule is used (xk) mod 26 .
B. Substitution method
The encryption operation consists of a bi-univocal corre-
spondence between the clean alphabet and the encrypting
alphabet. Presumably, the clean alphabet consists of the 26
letters of the roman alphabet and the character for the space
between words. The encrypting alphabet consists of the same
characters or just the 26 letters in which case the space
between words will be replaced by the less used letter from
that alphabet or it will be ignored.
The correspondence between the clean alphabet and the
encryption alphabet can be random or pseudo-random (a
password is used to start the encryption alphabet).
For the pseudo-random correspondence the following al-
gorithm is applied:
the letters from password are written only once in the
order of appearance;
the rest of the alphabet that does not appear in the
password are written in order.

The correspondence between the alphabets (encryption al-
phabet and clean alphabet) is realized after the rule alphabet
in alphabet and a fixed permutation . Depending on the
permutation the substitution can be:
direct – the encrypting alphabet has the same lexical
meaning with the clean alphabet;
A B C D E F G H IJK L M
G H I JK LM N O PQ R S
N O P Q R S TU V W X Y Z
T U V W X Y ZA B C D E F
reverse – the encrypting alphabet has the reversed lexical
meaning with the clean alphabet.
A B C D E FG H I JK LM
U T S R Q PO N M LK J I
N O PQ R S T U V W X Y Z
H G FE D C B A Z Y X W V
Below are three of the most famous substitution methods (the
old Hebraic codes) – if XsubstitutesY, thenYsubstitutes
X:
atbash – the first half of the alphabet is mapped in the
next half of the alphabet written in reverse:
A B C D E F G H I JK L M
Z Y X W V U T S R Q P O N
albam – the first half of the alphabet is mapped in the
next half of the alphabet written in lexical order:
A B C D E FG H I J K L M
N O PQ R ST U V W X Y Z
atbah:
A B C D JK LM E S T U V
IH G F R Q P O N ZY X W
C. Playfair encrypting system
The Playfair encrypting system was proposed in 1854 by
Charles Wheatstone and promoted by Lord Playfair. It is one
of the most known digraphic ciphering systems. This system
is relatively simple and safer than the other systems based
on monoalphabetic substitution.
The system for the roman alphabet consists of a box of
size55in which the letters (the letter Iis associated
with the letter J) are placed. The plaintext is preprocessed
to be compatible with the ciphering array: the space between
words is ignored or replaced with the less used letter of the
alphabet, the letter Iis associated with the letter J, and
another letter is added if the text is not an even number of
diagrams (a group of two letters).The encryption rules:
if the diagram does not have letters on the same row or
column, then the rule for ciphering is the rectangle rule
(the first letter of the diagram is the letter that is on the
same line and the last letter of the diagram is the letter
next to it in the same line, the substituted letters need
to be a corner of the rectangle);
if the diagram has letters on the same row the rule is
right ciphering, left deciphering ;
if the diagram has letters on the same column the rule
isdown ciphering, up deciphering .
D. Hill encrypting system
Hill encrypting system is a polygraphic substitution
method based on the calculations in algebra modp.
In the preprocessing phase, the space between words is
ignored or replaced with the less used letter from the alphabet
used in the clean text.
The algorithm processes a data block Mwithncharacters
(letters), the encryption key is represented by a Karray of
sizenn, reversible modp.
Two subclasses exist in the Hill algorithm for which the
ciphering rules are different (multiplication order):
the first subclass has the ciphering rule in which the
multiplication order is C=MK and the deciphering
rule isM=CK1;
the second subclass has the chipering rule in which the
multiplication order is C=KM and the deciphering
rule isM=K1C.
E. Polyalphabetic encrypting system
A polyalphabetic substitution encryption system is a gen-
eralization of the monoalphabetic substitution encryption
system and consists of Nalphabets. Each alphabet repre-
sents a permutation (determined by password) of the input
alphabet. The ciphering algorithm consists of substituting the
ithlettermfrom the clean text with the corresponding letter
from theithmodNalphabet.
This kind of system is easy to identify using the occur-
rence frequency of letters in decimated sequences from the
ciphertext.
An example of such a system is the Vigen `ere algorithm in
which the password k;:::;k nis used periodically to transform
themj2fA;:::;Zgcharacter from the plaintext acording
tocj= (mj+kjmod n) mod 26 formula.
Attacking a polyalphabetic system is like attacking a
Nmonoalphabetic substitution system. Using the divide et
impera method (O(N)) we have the following algorithm:
input – ciphertext of Mlength;
output – plaintext corresponding to the polyalphabetic
encryption system;
first step – determine the Nnumber of alphabets;
second step – for j= 0 to4run: fori= 1 toNj
run: apply the partial reconstruction procedure (based
on(j+ 1)gram frequency) of i;:::;i +jalphabets;
third step – according to the Nalphabets rebuild the
plaintext.

F . Transposition method
Transposition method assures the diffusion achievement:
spreading the static properties of the plaintext in the ci-
phertext. For example, in the column transposition case the
plaintext is written row by row in a tubular form with n
columns, then it is read on columns based on the encryption
key represented by a npermutation.
If the size of the plaintext is not a nmultiple then it can be
filled or not with a specified character. In the preprocessing
phase the space between words is ignored or replaced with
the less used character from the language in which the
plaintext was written.
G. Mixed systems
Mixed systems have as a base the successive ciphering
of the message using the substitution method and then the
transposition method or the other way.
Attacking such a system is realizable by accessing the
last component and going until the first one. The simple
substitution is commutative with the transposition operation,
therefore at any time, the substitution method can be ap-
proached first and then the transposition method.
When using a polyalphabetic system, with an unknown
number of alphabets, it is recommended that after deter-
mining the number of alphabets (using a static method) to
simultaneously approach the identification of the alphabets
and the transposition used.
When using a polygraphic system and a transposition it is
recommended to use a backtracking technique.
H. RIJNDAEL algorithm – AES
Rijndael algorithm is known as AES (Advanced Encryp-
tion Standard) and is an encrypting method established by
the U.S. National Institute of Standards and Technology
(NIST) in 2001. The Rijndael cipher was developed by two
Belgian cryptographers, Joan Daemen and Vincent Rjimen.
The cipher has different keys and blocks sizes.
Rijndael algorithm is designed on the substitution-
permutation network principle, combining the substitution
method and the permutation method, and it is fast in both
hardware and software.
The Rijndael algorithm is an encryption algorithm in
which the length of blocks and keys is independent (128
bits, 192 bits, 256 bits). AES standardized all the keys sizes
but restricted the blocks size to just one, 128 bits.
In FIPS 192, the AES operations are defined as array
operations, where both key and block are written in an array
form. The block is stored in an array (table) called state , the
first four bytes are stored in the first column, the second four
bytes are stored in the next array column and so on until the
last four bytes.
2
664b0b4b8b12
b1b5b9b13
b2b6b10b14
b3b7b11b153
775AES cipher uses key size which specifies the number of
repetitions of transformation rounds that convert the input
(plaintext) into output (ciphertext). The number of cycles of
repetition depends on the key size:
10 cycles of repetition for 128-bit keys;
12 cycles of repetition for 192-bit keys;
14 cycles of repetition for 256-bit keys.
AES algorithm consists of:
KeyExpansions – round keys are derived from the cipher
key using Rijndael’s key chedule;
InitialRound:
–AddRoundKey
Rounds:
–SubBytes (fig. 2) – is a substitution cipher named
Rijndael S-box, the transformation is non-linear
which results in high level of security;
–ShiftRows (fig. 3) – operates on rows level, cyclical
movement of bytes on the rows;
–MixColumns (fig. 4) – operates on the columns of
thestate , combines four bytes in each column;
–AddRoundKey (fig. 5) – each byte of the state is
combined with bitwise XOR ().
Fig. 2
SubBytes
Fig. 3
ShiftRows
FinalRound:
–SubBytes
–ShiftRows
–AddRoundKey
I. Merkle-Hellman encryption system
Merkle-Hellman encryption algorithm consists of cipher-
ing the message as a solution to a problem of knapsack
type for which the fM1;:::;M ngweight factor represents

Fig. 4
MixColumns
Fig. 5
AddRoundKey
the ciphering key, and the plaintext fb1;:::;b ngcorresponds
to the ciphertext:
n
i=1biMi
Definition: A string of weight factors fM1;:::;M ngis called
superincresing if
Mk>k1
i=1Mi
for anyk.
The superincreasing knapsack problem is easy to solve
using the following statement: for k=n;:::; 1:
ifMk<S, thenbk= 1 andS=SMk;
otherwisebk= 0.
The knapsack algorithms which are not superincreasing are
not very easy to solve and a rapid algorithm to solve the
problem does not exist. The only known method to determine
ifbi= 1 consists of testing all the solutions. The fastest
testing algorithms have an exponential complexity.
Merkle-Hellman algorithm is based on the following
property: the private key is a string of weight factors for
superincreasing knapsack and the public key is the string of
weight factors for a knapsack which has the same solution,
but is not superincreasing. Merkle and Hellman found a
method to transform a superincreasing knapsack problem
into a normal knapsack problem. The converting techniqueuses the modular arithmetic.
If a superincreasing knapsack problem (private key) is
known with weight factors fM1;:::;M ngthen it can be trans-
formed into a normal knapsack problem (public key) with
the string of weight factors fmM 1modp;:::;mM nmodpg,
wheremandpare prime natural numbers (they are part of
the private key) and p>n
i=1Mi.
To cipher a binary message, the binary message needs to
be divided into blocks equal in length with the cardinal of
weight factors array. The ciphering of a b1;:::;b nblock will
be the natural number:
n
i=1bi(mM imodp)
To decipher the message the receiver knows the private
key: the original weight factors, and mandpvalues. The
receiver will calculate first m1modp. The ciphertext will
be multiplied with m1modpand then the superincreasing
knapsack problem will be solved to recover the plaintext.
J. RSA encryption system
The RSA algorithm was invented by Ron Rivest, Adi
Shamir, and Leorad Adleman. The RSAs security is based
on the big numbers factorization difficulty. The public and
private keys are functions of big prime numbers pairs (200
digits or greater). The factorization of two prime numbers
product involves the recovery of the plaintext from the
ciphertext, knowing the public key.
To generate two keys (public and private), two big prime
numberspandqare randomly chosen. By security reasoning,
pandqmust have the same size order. The n=pgproduct
will be calculated. Then randomly the public exponent e(for
ciphering) is chosen such as eand(p1)(q1)to be
relatively prime. Using the extended Euclid algorithm the
private exponent d(for deciphering) is calculated such as:
ed1 mod (p1)(q1)
With other words:
de1mod (p1)(q1)
Note thatdandnare relatively prime. The pair (e;n)
represents the public key and the pair (d;p;q )represents the
private key. The pandqnumbers are no longer necessary
for ciphering/deciphering, but they will never be made public
(knowing the numbers and ciphering exponent leads to the
determination of the deciphering coefficient, therefore the
system becomes useless).
To cipher aMmessage it will be divided into blocks with
a smaller length n. Therefore pandqare prime numbers
with 100 digits when nwill have less than 200 digits and
every block message Miwill have less than 200 digits. If
fixed length blocks need to be ciphered then the 0 padding
operation will be used. The ciphered Cmessage will be
obtained by concatenating Cimessages which have almost
the same length. The ciphering formula is:
CiMe
imodn
To decipher a message the following calculus is needed:

MiCd
imodn,
because
Cd
i(Me
i)dMed
iMk(p1)(q1)+1
i
MiMk(p1)(q1)
iMimodn
IV. FPGA
Field programmable gate arrays (FPGAs) are integrated
circuits that enable designers to program customized digital
logic in the field.
Below are enumerated a few inventions[2] that led to
FPGA development:
1960: MOSFET (MetalOxideSemiconductor Field-
Effect Transistor) – one of the most basic elements in
an FPGA;
1961: communication IC (Integrated Circuit);
1962: TTL (Transistor-Transistor Logic) – a family of
integrated circuits;
1963: CMOS (Complementary Metal-Oxide Semicon-
ductor);
1965: Moore’s Law – based on the prediction that the
number of transistors on circuit doubles every year;
1970: PROM (Programmable Read-Only Memory) –
one of the first types of programmable memory;
1971: EPROM (Erasable Programmable Read-Only
Memory) – second step in programmable memory, you
could erase it;
1972: DST(Depleted-Substrate Transistor);
1975: PLA (Programmable Logic Array);
1978: PAL (Programmable Array Logic);
1983: EEPROM (Electrically Erasable Programmable
Read-Only Memory);
1983: GAL (Generic Array Logic);
1984: FLASH memory – a type of EEPROM, the most
common used non-volatile memory;
1985: FPGA (Field Programmable Gate Array).
In the early days, to use an FPGA in your designed a lot of
programming was needed for it to perform simple functions
and because of this fact a lot of designers avoided them.
An FPGA is a semiconductor device on which the function
can be defined after the manufacturing. FPGA enables the
designer or user to program product features and functions,
to adapt to new standards and to reconfigure hardware for
specific applications even after the product has been installed.
The FPGA has evolved from an useful but humble inter-
face device into a system-level integrated circuit (IC) with
its own microprocessors, memory blocks, and interfaces.
FPGAs use congurations in general specied with the help
of a Hardware Description Language (HDL). In order to
dene the behavior of an FPGA, the user or designer needs
to provide a hardware description language (HDL) or a
schematic design. The most common HDLs are VHDL
(VHSIC Hardware Description Language) and Verilog.
The principal FPGA manufacturers are Xilinx and Altera,
and they provide design software, for Linux and Windows,
which enable engineers to design, analyze, simulate and
synthesize (compile) their designs.Because of their programmable nature, FPGAs can be used
in different markets and applications such as:
aerospace and defense;
ASIC prototyping;
audio;
automotive;
broadcast;
consumer electronics;
data center;
high performance computing and data storage;
medical;
industrial;
security;
video and image processing;
wired communications;
wireless communications.
V. PC AND FPGA
A personal computer (PC) is a multi-purpose pro-
grammable electronic device. Computers were invented to
compute (to solve complex mathematical problems) but
nowadays they are more used for entertainment than to solve
complex mathematical problems. Almost every person has
access to a computer. A range of software applications is
available for personal computers including word processing,
spreadsheets, databases, web browsers, email, digital media
playback, video games, and many personal productivities and
special-purpose software applications. In the 2010s, personal
computers are connected to the Internet giving the user
access to the World Wide Web. PCs are connected to the
Internet by a local area network (LAN), either by a cable or
a wireless connection.
The earliest electronic general-purpose computer was
ENIAC (Electronic Numerical Integrator And Computer)
which became operational 1946, but it was not affordable
or easy to use. One of the early single-user computers was
LPG-30 (standing for Librascope General Purpose and then
Librascope General Precision) manufactured by Librascope
and sold and serviced by the Royal Precision Electronic
Computer Company in 1956 with a retail price of $47,000 –
equivalent to about $414,000 today.
Between 1965 and 1966 was developed a series of soviet
computer named MIR (Machine for Engineering Calcula-
tions) which could be considered the first personal computer.
It was considered innovative at that time because it had user
interface combining a keyboard with a monitor and light pen
for correcting text and drawing on the screen.
By the early 1970s, people in academic or research
institutions had access to single-person use of a computer
system in interactive mode. Early personal computers, also
called microcomputers, were sold in a kit form and in lim-
ited volumes. Minimal programming was done with toggle
switches to enter instructions, and output was provided by
front panel lamps. In 1973 Micral N was a commercial
non-kit microcomputer based on a microprocessor, the Intel
8008. A big step in the development of personal computers
was 1973 with Xerox Alto which had a graphical user

TABLE I
FPGA Algorithm Acceleration [3]
Application Processor only FPGA Co-processing Acceleration Factor
Hough and Inverse Hough
Processing (1)12 minutes processing time
Pentium 4-3 GHz2 seconds of processing time at
20 MHz370X
Spatial Statistics (Two-Point
Angular Correlation Cosmology)
(2)3,397 CPU hours with 2.8-GHz
Pentium (approximate solution)36 Hours (exact solution) 96X
Black-Scholes (Financial
Application (single precision FP
2M points)) (2)2.3M experiments/sec with a
2.8-GHz Processor299M experiments/sec 130X
Smith Waterman SS search 34
from FASTA (1)6461 sec processing time
(Opteron)100-sec FPGA processing 64X
Prewitt Edge Detection (compute
intensive video and image
processing) (3)327M clocks (1-GHz processing
power)131K clocks at 0.33 MHz 83X
Monte Carlo Radiative Heat
Transfer (1)60-ns processing time (3-GHz
processor)6.12 ns of processing time 10X
BJM Financial Analysis (5M
paths) (1)6300 sec processing time
(Pentium 4-1.5-GHz)242 sec of processing at 61 MHz 26X
(1) Source: Celoxica
(2) Source: SRC Computers
(3) Source: XtremeData
interface (GUI) that inspired today’s graphical user interfaces
(Windows, MacOS). Xerox Alto was not commercialized
because it was too expensive to build and it remained just
a demonstration project. Also in 1973 was introduced fully
BASIC programmable microcomputers that fit entirely on
top of a desk, including a keyboard, a small one-line display
and printer. In 1974 was introduced Altair 8800 which was
considered by many the first personal computer. It was
based on the 8-bit Intel 8800 microprocessor and the first
programming language for this machine was Microsoft’s
founding product, Altair BASIC. In 1976 was introduced the
Apple I computer circuit board developed by Steve Jobs and
Steve Wozniak. In 1977 the Commodore PET, Apple II and
TRS-80 Micro Computer System were introduced.
During the early 1980s, the personal computer area de-
veloped drastically for programming and games and they
can be used with the television already installed at home.
In this period the following appeared: ZX Series – ZX80
(1980), ZX81 (1981), ZX Spectrum (1982), Commodore 64
(1982), PC-9800 series (1982 – 2000), Commodore Amiga
1000 (1985).
Over the years the personal computers developed in a rapid
manner and today there is almost no difference between per-
sonal computers and work or engineering computers. They
use basically the same components and operating systems.
There are two types of computers stationary(workstation,
desktop computer, gaming computer, single unit, nettop,
home theater PC) and portable(laptop/notebook, desktop re-
placement computer, netbook, tablet, ultra-mobile PC, pocket
PC).
Personal computers have operating systems such as Mi-
crosoft Windows, OS X, Linux, Solaris and FreeBSD. PCs
hardware usually consists of: computer case, power supply
unit, processor (CPU – Central Processing Unit), mother-
board (system board or main board), main memory (RAM,DRAM, SDRAM or SRAM), hard disk, visual display unit
(computer monitor or just display), video card (graphics card,
graphics adapter or video adapter), keyboard, mouse and
other components.
Because of the necessity of a better processing power
the FPGAs started to be implemented in PCs. Because
of affordable FPGAs, students and engineers can recreate
vintage computers or implement more novel architectures.
A fully based computer is COPACOBANA which is a code
breaker.
FPGAs can be found in computers and in embedded
systems:
PCIe (Peripheral Component Interconnect Express) –
often the FPGA connected to the main CPU as a PCIe
device;
electronic board – all recent FPGA can interface with
high-speed I/O interfaces, no special FPGA is required
to connect the electronic board with DDR3, PCI, or
PCIe;
between sensor and CPU – often the FPGA is connected
between an external interface and the main processor ;
embedded systems – can use FPGA as their main
processor;
soft-core – for some embedded systems a small micro-
controller is required but instead of having it on the
electronic board alongside the FPGA, it can be imple-
mented inside the FPGA as a soft-core;
FPGA SoC – the processor and the FPGA are on the
same chip.
In Table 1 is shown a practical experience with FPGA-based
co-processors that shows at least a ten-fold improvement in
the execution of the algorithms as compared to processors
alone. The speed is more than 100 times faster.
The communications between FPGAs and PCs can be
made using:

USB ports;
parallel ports;
serial ports;
ethernet
VI. C RYPTOGRAPHIC ACCELERATORS
A cryptographic accelerator is a co-processor designed to
perform computationally intensive cryptographic operations,
the speed of processing is greater than the CPUs speed. These
co-processors can increase the speed of servers that process
cryptographic operations.
A hardware cryptographic accelerator can be:
integrated into te SoC as a separate processor;
integrated into a co-processor in a circuit board;
on a chip which is on an extended circuit board;
an ISA extension.
Data encryption and authentication utilize complex mathe-
matical processes that require significant system resources.
Organizations try to stay up to date with the technologies so
that the customer can have fast and secure encryption/decryp-
tion systems. The CPU performance is constantly increasing
but the cryptographic processes often lags ordinary process-
ing performance and a cryptographic accelerator is used.
On the market a lot of hardware cryptographic accelerators
such as:
nShield family of multi-purpose HSMs;
payShield family of payment HSMs;
keyAuthority;
Datacryptor Link and Layer 2 Encryption;
Datacryptor 5000 Series;
SafeNet PCIe HSM formerly Luna PCI-E;
SafeNet USB HSM formerly Luna G5;
SafeNet Network HSM formerly Luna SA;
Soekris VPN1411;
NITROX V Security Processor Family;
NITROX III Security Processor Family.
The nShield family of multi-purpose HSMs supports the
fastest cryptographic algorithms and safeguarding high-
volume sensitive data transactions. It supports a wide array
of APIs and operating systems.
The payShield family of payment HSMs is dedicated to
the payment industry for transaction processing and key
management. It is used in almost 80% of payment card
transactions.
SafeNet PCIe HSM can be embedded directly in an
appliance or application server for an easy-to-integrate and
cost-efficient solution for cryptographic acceleration and
security. The high-security hardware design of SafeNet PCIe
HSM ensures the integrity and protection of encryption keys
throughout their life cycle.
SafeNet USB HSM delivers industry leading key manage-
ment in a portable appliance with a USB interface. All key
materials are maintained exclusively within the confines of
the hardware.
SafeNet Network HSM is the most trusted general purpose
HSM on the market in part because of the unique approach toprotecting cryptographic keys. Unlike other methods of key
storage which move keys outside of the HSM into a trusted
layer, the keys-in-hardware approach protects the entire key
lifecycle within the FIPS 140-2 validated confines of the
SafeNet Network HSM appliance.
The Soekris VPN1411 hardware security accelerator de-
livers excellent performance off-loading the CPU from the
computing intensive tasks of encryption and compression.
Encrypts, 128 AES at up to 19 Mbps, 3-DES at up to 40
Mbps.
The NITROX V security processor family integrates up
to 288 purpose-built security cores with high-performance
compression engine and virtualization hardware with PCIe
Gen 3 and Interlaken interface. It provides an excellent
solution to data center security and compression challenges.
It can perform from 20,000 up to 120,000 RSA 2018
operations/second, can process full IPsec processes or full
SSL processes from 15 Gbps up to 100 Gbps, and the same
for compression speed, depending on the processor from
NITROX V security processor family.
NITROX III processors are highly scalable and deliver
performance ranging from 5 Gbps to 32 Gbps for full IPsec
protocol or SSL offload, 1024bit exponent RSA from 35K
to 230K RSA operations per second, and 2048bit exponent
RSA up to 41K RSA operations per second enabling an
optimal solution for applications.
For this paper, the following two articles were reviewed
A Low-power Pairing-Based Cryptographic Accelerator for
Embedded Security Applications [14] and Cryptographic Ac-
celerator for 802.15.4 Transceivers with Key Agreement
Engine based on Montgomery Arithmetic [16].
A. A Low-power Pairing-Based Cryptographic Accelerator
for Embedded Security Applications [14]
In [14] the authors implemented an IP core for Pairing-
based cryptography that performs an elliptic curve cryp-
tographic operation called the Tate Pairing over the field
GF(2251)and it was implemented in TSMC 65nm GP
CMOS standard cells and optimized for low-power opera-
tions. The resulting core form this implementation computes
the pairing in 1.5ms and consumes less than 4mW.
Tate Pairing is the most widely used mathematical oper-
ation known as bilinear pairing. In [14] the authors report
on a low-power ASIC implementation of a dual-mode Tate
Pairing/point-scalar multiplication processor.
The core architecture is made in such a way so that it
can compute the Tate Pairing over GF(2m)using Algorithm
1(a modified version of BKLS/GHS algorithm). Algorithm
1involves eight distinct operations (elliptic curve point
addition, subtraction and doubling over GF(2m)and line
function, multiplication, squaring, inversion and exponentia-
tion overGF(24m)). Fig. 6 are represented the code for a
modified version of the BKLS/GHS algorithm and the core
architecture illustration. Using multiplier digit size of d= 1
Tate Pairing runtime is approximately 75,000 clock cycles,
or 1.5ms at Fclk= 50Mhz .

Fig. 6
A modified version of the BKLS/GHS algorithm and the core
architecture
In the below table are shown the Scenario 1 activities with
the following notations: Mult – multiplication, Sqr – squaring,
ECP – elliptic curve point addition, LFP – line function, Inv
– inversion, and Exp – exponentiation over GF(24m).
Scenario Mult Sqr ECP LFP Inv Exp
1 Busy Busy Busy Busy Idle Idle
In Fig. 7 the results of Scenario 1 process are shown.
B. Cryptographic Accelerator for 802.15.4 Transceivers with
Key Agreement Engine based on Montgomery Arithmetic
[16]
In [16] authors present the design of a cryptographic core
compliant with the IEEE 802.15.4 standard and based on
FPGA. The IEEE 801.15.4 standard utilizes cryptographic
Fig. 7
Results for peak power waveforms for Scenario 1
techniques based on symmetric-key to protect information
and leaves key management out of its scope.
Authors presented the design of a cryptographic core
implemented in VHDL and based on FPGA trying to assure
some problems about 802.15.4 security suite that appear
when implemented in hardware.
In Fig. 8 are enumerated all eight IEEE 802.15.4 standard
security suites supported.
Fig. 8
Security suites supported by IEEE 802.15.4 standard
The cryptographic core was designed using:
AES engine performing AES counter mode encryption
and CBC-MAC authentication;
ACL of 255 entries implemented in a CAM (Content
Addressable Memory);
Montgomery modular exponentiation unit performing
2018-bit operations;
two plaintext and ciphertext FIFOs of 1024 bytes;
a slave bus interface.
The cryptographic core has been synthesized using Xilinx
XST 12 on the target FPGA, the Spartan 6 xc6slx75-3csg484
based on a 45 nm process. A 30% area is allocated for
physical implementation and medium access control layer
of the IEEE 802.15.4 protocol by the cryptographic core
implemented in FPGA.
VII. F UTURE WORK
Because this research paper will turn into my dissertation
I divided the main project into four papers that will combine

in the end as a dissertation.
The first part consists of this paper in which I studied
cryptography, cryptographic algorithms, FPGAs, the PC and
FPGA combination and cryptographic accelerators which are
the main subject of this paper and the future papers.
For the second semester, I will implement an algorithm
that I studied in this paper and create a cryptographic
accelerator using FPGA. I will highlight the difference of
mathematic operations executed by the computer alone (PC
using the CPU) and the computer using the cryptographic ac-
celerator with FPGA, implemented by me, as a co-processor.
In the actual stage that I am now, I think the AES algorithm
will be suitable.
As for third and fourth semesters, I will try to implement
other algorithms and use two FPGAs that will send encrypted
date to each other and so to create an encrypted communi-
cation between these two.
VIII. CONCLUSIONS
This research paper helped me to understand what cryp-
tography is, a few simple cryptographic algorithms and some
complicated ones, an FPGA is and how it works, how the
PC communicates with the FPGA, and what a cryptographic
accelerator is and how it can be used.
I consider that I managed to accomplish what I had in
mind when I started the research for this paper.
APPENDIX
AES – Advanced Encryption Standard
ATM – Automated Teller Machine
ASIC – Application Specific Integrated Circuits
BKLS/GHS – algorithms invented by Barreto, Kim, Lynn,
Scott and Galbraith, Harris, Soldera
CMOS – Complementary Metal-Oxide-Semiconductor
CPU – Central Processing Unit
COPACOBANA – Cost-Optimized PArallel COde Breaker
DST – Depleted-Substrate Transistor
DDR3 – Double Data Rate type 3
DRAM – Dynamic Random Access Memory
DES – Data Encryption Standard
EPROM – Erasable Programmable Read-Only Memory
EEPROM – Electrically Erasable Programmable Read-
Only Memory
ENIAC – Electronic Numerical Integrator And Computer
GAL – Generic Array Logic
GPU – Graphics Processing Unit
GP – General Purpose
FPGA – Field Programmable Gate Array
FIPS – Federal Information Processing Standard
HSM – Hardware Security Module
HMAC – Keyed-hash Message Authentication Code
HDL – Hardware Description Language
I/O – Input/Output
IC – Integrated Circuit
IEEE – Institute of Electrical and Electronics Engineers
ISA – Instruction Set Architecture
LAN – Local Area NetworkLGP – Librascope General Precision
MOSFET – Metal Oxide Semiconductor Field Effect Tran-
sistor
NIST – National Institute of Standards and Technology
PC – Personal Computer
PET – Personal Electronic Transactor
PCI – Peripheral Component Interconnect
PCIe – Peripheral Component Interconnect Express
PLA – Programmable Logic Array
PAL – Programmable Array Logic
PROM – Programmable Read-Only Memory
RSA – an algorithm designed by Rivest, Shamir and
Adleman
RAM – Random Access Memory
SDRAM – Synchronous Dynamic Random Access Mem-
ory
SRAM – Static Random Access Memory
SoC – System on a Chip
TTL – Transistor-Transistor Logic
TRS 80 – Trandy/Radio Shack with Z80 microprocessor
TSMC – Taiwan Semiconductor Manufacturing Company
USB – Universal Serial Bus
VHDL – VHSIC Hardware Description Language
VHSIC – Very High Speed Integrated Circuit
REFERENCES
[1] Cryptographic Algorithms (http://www.cryptographyworld.com)
[2] History of FPGA (https://blog.digilentinc.com/history-of-the-fpga)
[3] Accelerating High-Performance Computing With FPGAs
(www.altera.com)
[4] A. Atanasiu, Securitatea Informatiei, vol. 1, Criptografie, ed. InfoData,
Cluj, 2008
[5] A. Atanasiu, Securitatea Informatiei, vol. 2, Protocoale de securitate,
ed. InfoData, Cluj, 2009
[6] A.J. Menenzes, Handbook of Applied Cryptography, CRC Press, 1997
[7] E. Simion, V . Preda, A. Popescu, Criptanaliza. Rezultate si Tehnici
Matematice,Ed. Univ. Buc., ISBN 973575975-6, 2004
[8] B. Schneier, Applied Cryptography, Adison-Wesley, 1998
[9] Adrian Atanasiu, Lecture Notes: Cryptography I, Faculty of Mathe-
matics, Bucharest University
[10] Adrian Atanasiu, Lecture Notes: Cryptography II, Faculty of Mathe-
matics, Bucharest University
[11] Chey Chob, Cryptography for dummies, John Wiley & Sons Inc.,
Hoboken, New Jersey, 2004
[12] Andrew Moore, FPGAs for dummies, John Wiley & Sons Inc.,
Hoboken, New Jersey, 2014
[13] Vlad Ciobanu, Lecture Notes, Faculty of Automation and Computer
Science, Politehnica University of Bucharest, 2016
[14] Tom English, Maurice Keller, Ka Lok Man, Emanuel Popovici, Michel
Schellekens, William Marnane, A Low-power Pairing-Based Crypto-
graphic Accelerator for Embedded Security Applications, Published
in: SOC Conference, 2009. SOCC 2009. IEEE International, 2009
[15] Domenico Argenziano, A research project on FPGA-accelerated cryp-
tographic computing, Published in: P2P, Parallel, Grid, Cloud and
Internet Computing (3PGCIC), 2015 10th International Conference
[16] Antonio de la Piedra, Abdellah Touhafi, Gianluca Cornetta, Crypto-
graphic Accelerator for 802.15.4 Transceivers with Key Agreement
Engine based on Montgomery Arithmetic, Published in: Communica-
tions and Vehicular Technology in the Benelux (SCVT), 2011 18th
IEEE Symposium
[17] Yi Wang, Yajun Ha, FPGA based Rekeying for cryptographic key man-
agement in Storage Area Network, Published in: Field Programmable
Logic and Applications (FPL), 2013 23rd International Conference
[18] Vivek Venugopal, Devu Manikantan Shila, High Throughput Imple-
mentations of Cryptography algorithms on GPU and FPGA, Pub-
lished in: Instrumentation and Measurement Technology Conference
(I2MTC), 2013 IEEE International

[19] Jonathan Lutz, Anwarul Hasan, High Performance FPGA based El-
liptic Curve Cryptographic Co-Processor, Published in: Information
Technology: Coding and Computing, 2004. Proceedings. ITCC 2004.
International Conference
[20] Nikolaos Alachiotis, Simon A. Berger, Alexandros Stamatakis, Ef-
ficient PC-FPGA communication over gigabit ethernet, Published
in: Computer and Information Technology (CIT), 2010 IEEE 10th
International Conference
[21] Shuai Liu, Lei Ju, Xiaojun Cai, Zhiping Jia, Zhiyong Zhang, High
Performance FPGA Implementation of Elliptic Curve Cryptography
over Binary Fields, Published in: Trust, Security and Privacy in
Computing and Communications (TrustCom), 2014 IEEE 13th Inter-
national Conference
[22] Nan Liao, Xiaoxin Cui, Tian Wang, Kai Liao, Yewen Ni, Dunshan
Yu, Xiaole Cui, A High-efficient and Accurate Fault Model Aiming
at FPGA-based AES Cryptographic Applications, Published in: ASIC
(ASICON), 2015 IEEE 11th International Conference
[23] Sammy H. M. Kwok, Edmund Y . Lam, FPGA-based High-speed True
Random Number Generator for Cryptographic Applications, Published
in: TENCON 2006. 2006 IEEE Region 10 Conference
[24] Michael Kasper, Werner Schindler, Marc Stottinger, A Stochastic
Method for Security Evaluation of Cryptographic FPGA Implemen-
tations, Published in: Field-Programmable Technology (FPT), 2010
International Conference
[25] Junfeng Fan, Lejla Batina, Kazuo Sakiyama, Ingrid Verbauwhede,
FPGA Design for Algebraic Tori-Based Public-Key Cryptography,
Published in: Design, Automation and Test in Europe, 2008. DATE
’08
[26] Wael Adi, Rolf Ernst, Bassel Soudan, Abdulrahman Hanoun, VLSI
Design Exchange with Intellectual Property Protection in FPGA En-
vironment Using both Secret and Public-Key Cryptography, Published
in: Emerging VLSI Technologies and Architectures, 2006. IEEE
Computer Society Annual Symposium

Similar Posts