Development of Secure Mobile Communication Using Voice and Fingerprint Identification

G. Narayana (1), Naveen Kumar R (2), Prof. M. Padmavathamma (3)
(1, 2) Research Scholar, (3) Professor
(1, 2, 3) Dept. of Computer Science, S.V. University, Tirupati
(1) [email protected], (2) [email protected], (3) [email protected]

Abstract: Mobile handheld devices are popular devices used for secure, private, authentic, and accurate communication and exchange of confidential information. In this paper we propose a technique to solve the authenticity problem in mobile communication. The technique is based on using voice and fingerprint to identify both the speaker and the sender. It is simple, requires less computation than public/private key techniques, provides stronger assurance of authenticity than a digital signature, and eliminates the need for a third party. Moreover, when applied to mobile phones, the technique resists forgery attempts by another party.

Keywords: Cryptography, Secure Telecommunication, Fingerprint, Voice recognition

1. INTRODUCTION
Many mobile users rely on messaging or calling as their main communication tool without regard for the safety level of such a communication system; if a phone is lost or shared, anyone can access the data on it. In cryptography, this is known as the authenticity problem. Researchers therefore need a concept that minimizes the risk associated with losing or sharing a phone, thus offering a safe environment for communication. This paper presents a solution to this problem. We consider combined voice identification and fingerprint identification to be the most effective technique for solving it. The technique works on the basis of voice recognition, whereby the phone can be accessed only when it identifies the voice of its user(s); finger-based scanning, in turn, is one of the oldest methods used for verification.

2. RELATED WORK
A digital signature can serve as a secure basis in such applications because it provides authentication services. Traditional digital signature schemes are based on asymmetric cryptographic techniques, which make signature computation very expensive. Although handheld devices come in many shapes and serve different purposes, they share some limitations: limited computational capability and short battery life. Many digital signature schemes have been proposed in the literature. According to their basis, these schemes can be classified into two general categories: message digest based schemes and message recovery based schemes. In a message digest based digital signature scheme, the original message is first mapped by a one-way function to a checksum, which is used to provide data integrity. This checksum (message digest) is then used to generate a digital signature. In a message recovery based scheme, on the other hand, the receiver can recover the original message from the received signature by means of the message redundancy scheme. There also exists some work on digital signatures for mobile devices.

[Figure: Conceptual Overview. Blocks: Initial Voice, Sample Extraction, Conversion, Storage, Fingerprinting. The same storage is used for recalling hashed items.]

3. Fingerprint as a hash value
Briefly, a voice fingerprint can be seen as a compressed summary of the corresponding voice object. Mathematically speaking, a fingerprint function F maps a voice object X, consisting of a large number of bits, to a fingerprint of only a limited number of bits. The entire mechanism of fingerprint extraction can thus be seen as a hash function H, which maps an object X to a hash (message digest). Hash functions are widely used in computer science mainly because they allow two large objects X and Y to be compared by comparing only their respective hash values H(X) and H(Y). Equality of the hash values implies equality of X and Y with a very low error probability. The following table gives a conceptual view of how a hash function works.

Hash Functions

Key    Hash bin
abc    101
def    313
ghi    876
jkl    89
mno    35

Here, string values act as keys that map into hash bins (integers), which are smaller by design and much easier to compare. Nevertheless, there is always a small probability that completely different keys will occupy the same bin (a collision). So why do we need to develop an entirely new system based on fingerprints, rather than simply use cryptographic hash functions? The answer is simple. Although the perceptual difference between a voice object compressed to AMR format and the same object stored in WAV format is very small, their binary representations are totally different, meaning that H(amr(X)) and H(wave(X)) will result in completely dissimilar message digests (hash bins). Even worse, cryptographic hash functions are usually very sensitive: a single bit of difference in the original object results in a completely different hash value. That is not what we want. We would like a system that ignores low-level details (binary signatures) and analyzes only the voice signal, as a human listener does.
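The sensitivity of cryptographic hash functions described above can be checked with a short sketch. It assumes SHA-256 from Python's standard library as a representative cryptographic hash (the paper does not name one); the byte string is a made-up stand-in for an encoded voice sample, and `differing_bits` is a helper introduced only for this illustration.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Hypothetical stand-in for the bytes of an encoded voice sample.
original = b"voice sample: hello"
# The same data with exactly one bit flipped in the first byte.
flipped = bytes([original[0] ^ 0x01]) + original[1:]

digest_original = hashlib.sha256(original).digest()
digest_flipped = hashlib.sha256(flipped).digest()

print("input bits changed: ", differing_bits(original, flipped))
print("digest bits changed:", differing_bits(digest_original, digest_flipped),
      "of", len(digest_original) * 8)
```

A one-bit change in the input changes a large share of the digest bits, so two encodings of the same voice cannot be matched by comparing cryptographic digests. This is the motivation for a perceptual fingerprint.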

4. General Schema for Fingerprint Creation
The framework to be built consists of several conceptual parts. The next figure shows the activity diagram that abstractly describes the logical flow of fingerprint creation; each activity and the component responsible for it is then described in more detail. Broadly speaking, the algorithm can be logically decomposed into two main parts: fingerprint creation (extracting unique perceptual features from the voice) and fingerprint lookup (querying the database). Because many of the components are involved in both activities, they are described together.

[Figure: activity diagram of fingerprint creation, first steps: Input File, Preprocess the signal.]

4.1 PROCESSING THE INPUT SIGNAL

Every voice file stored on a device is encoded in some format. Therefore, the first step (1) in the algorithm is decoding the input file to PCM (Pulse Code Modulation) format. PCM can be considered the raw digital format of an analog signal, in which the magnitude of the signal is sampled at uniform intervals and each sample is quantized to the nearest value within a range of digital steps. Decoding is followed by mono conversion and sampling rate reduction. The sampling rate, or sampling frequency, defines the number of samples per second (or per other unit) taken from a continuous signal to make a discrete signal. For time-domain signals, the unit of sampling rate is 1/s (hertz). The inverse of the sampling frequency is the sampling period or sampling interval, which is the time between samples. Since the frequency band analyzed by the algorithm lies between 318 Hz and 2000 Hz, downsampling the input to 5512 Hz is both safe and required: a 5512 Hz signal can represent frequencies up to 2756 Hz (half the sampling rate), which still covers the analyzed band.

4.2 SPECTROGRAM CREATION

After the signal has been preprocessed to 5512 Hz PCM, the next step (2) in the fingerprinting algorithm is building the spectrogram of the voice input. In digital signal processing, a spectrogram is an image that shows how the spectral density of a signal varies over time. Converting the signal to a spectrogram involves several steps. First, the signal is sliced into overlapping frames. Each frame is then passed through a Fast Fourier Transform to obtain the spectral density as it varies over time. The parameters used in these steps are those found to work well in other fingerprinting studies (specifically in A Highly Robust Audio Fingerprinting System): voice frames that are 371 ms long (2048 samples), taken every 11.6 ms (64 samples), thus having an overlap of 31/32.

[Figure: slicing the input signal into overlapping frames.]
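The preprocessing and framing steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the input is assumed to be already decoded to 44100 Hz stereo PCM held in Python lists, downsampling is done by naive block averaging with an integer factor of 8 (which gives 5512.5 Hz, close to the 5512 Hz in the text), and all function names are hypothetical. A real system would use a proper anti-aliasing resampler and then apply an FFT to each frame, which is omitted here.

```python
import math

SOURCE_RATE = 44100   # assumed rate of the decoded input (Hz)
TARGET_RATE = 5512    # rate used by the algorithm (Hz)
FRAME_SIZE = 2048     # samples per frame, about 371 ms at 5512 Hz
HOP_SIZE = 64         # samples between frame starts, about 11.6 ms


def to_mono(left, right):
    """Average the two stereo channels into one mono channel."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def downsample(samples, factor):
    """Naive decimation: replace each block of `factor` samples by its mean.

    Block averaging acts only as a crude low-pass filter; it is a stand-in
    for a real resampler.
    """
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, usable, factor)]


def frames(samples, size=FRAME_SIZE, hop=HOP_SIZE):
    """Slice the signal into overlapping frames of `size` samples."""
    return [samples[start:start + size]
            for start in range(0, len(samples) - size + 1, hop)]


# One second of a synthetic 440 Hz tone on both channels (made-up test input).
tone = [math.sin(2 * math.pi * 440 * n / SOURCE_RATE) for n in range(SOURCE_RATE)]
mono = to_mono(tone, tone)
factor = round(SOURCE_RATE / TARGET_RATE)
reduced = downsample(mono, factor)
framed = frames(reduced)
overlap = (FRAME_SIZE - HOP_SIZE) / FRAME_SIZE

print("samples after downsampling:", len(reduced))
print("number of frames:", len(framed))
print("overlap between consecutive frames:", overlap)
```

The overlap value follows directly from the frame and hop sizes: (2048 - 64) / 2048 = 31/32, as stated in the text.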

[Figure: remaining activity-diagram steps: Compute Spectrogram, Band filtering, Compute wavelets, Min Hash Transform, Group LSH tables, Store in database.]

4.3 BAND FILTERING

After the FFT has been applied to each slice, the output spectrogram is cut so that only the 318 Hz to 2000 Hz domain is kept. This domain can be considered one of the most relevant frequency ranges for the human auditory system. While the range of frequencies that an individual can hear is largely related to environmental factors, the generally accepted standard range of audible frequencies is 20 Hz to 20000 Hz. Frequencies below 20 Hz can usually be felt rather than heard, assuming the amplitude of the vibration is high enough. Frequencies above 20000 Hz can sometimes be sensed by young people, but high frequencies are the first to be affected by hearing loss due to age and/or prolonged exposure to very loud noises. The following table lists the frequency ranges and their main descriptions. As can be seen, the most relevant voice content lies within the 512 Hz to 8192 Hz range, as it defines normal speech.

Frequency ranges:

Frequency (Hz)    Octave    Description
16 – 32           1         The human threshold of feeling, and the lowest pedal notes of a pipe organ.
32 – 512          2–5       Rhythm frequencies, where the lower and upper bass notes lie.
512 – 2048        6–7       Defines human speech intelligibility, gives a horn-like or tinny quality to sound.
2048 – 8192       8–9       Gives presence to speech, where labial and fricative sounds lie.
8192 – 16384      10        Brilliance, the sounds of bells and the ringing of cymbals. In speech, the sound of the letter "S" (8000-11000 Hz).
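The band-filtering step can be made concrete with a small calculation. The sketch below assumes the 2048-point FFT and 5512 Hz sampling rate given in Section 4.2, and it assumes that the 32 logarithmic bins mentioned in Section 4.4 are geometrically spaced between 318 Hz and 2000 Hz; the paper does not state the exact spacing, and the function names are hypothetical.

```python
import math

SAMPLE_RATE = 5512    # Hz, after downsampling
FFT_SIZE = 2048       # samples per frame
MIN_FREQ = 318.0      # Hz, lower edge of the analyzed band
MAX_FREQ = 2000.0     # Hz, upper edge of the analyzed band
NUM_BANDS = 32        # number of logarithmic bins (spacing is an assumption)


def fft_bin_range(min_freq=MIN_FREQ, max_freq=MAX_FREQ):
    """Return the first and last FFT bin indices that fall inside the band."""
    resolution = SAMPLE_RATE / FFT_SIZE          # width of one FFT bin in Hz
    first = math.ceil(min_freq / resolution)
    last = math.floor(max_freq / resolution)
    return first, last


def log_band_edges(min_freq=MIN_FREQ, max_freq=MAX_FREQ, bands=NUM_BANDS):
    """Return bands + 1 geometrically spaced edge frequencies in Hz."""
    ratio = max_freq / min_freq
    return [min_freq * ratio ** (i / bands) for i in range(bands + 1)]


first_bin, last_bin = fft_bin_range()
edges = log_band_edges()

print("FFT bins kept:", first_bin, "to", last_bin)
print("first band: %.1f Hz to %.1f Hz" % (edges[0], edges[1]))
print("last band:  %.1f Hz to %.1f Hz" % (edges[-2], edges[-1]))
```

With geometric spacing, low bands are narrow and high bands are wide, which mirrors the octave-based view of hearing in the table above.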

4.4 WAVELET DECOMPOSITION

Once the logarithmic spectrogram is "drawn" (with 32 bins spread over the 318 Hz to 2000 Hz range for each 11.6 ms step), the next step in the algorithm (4) is dividing the entire image into slices (spectral sub-images of 128x32) that correspond to a granularity of 1.48 s. In programming terms, after the log-spectrogram is built, its entire image is encoded in an [N][32] two-dimensional float array, where N equals the length of the signal in milliseconds divided by 11.6 ms, and 32 is the number of frequency bands. The spectral sub-images are then further reduced by applying Haar wavelet decomposition to each of them. Wavelets are used in this voice-retrieval task because of their successful use in image retrieval (Fast Multiresolution Image Querying). A wavelet is a wave-like oscillation with an amplitude that starts at zero, increases, and then decreases back to zero. It can typically be visualized as a "brief oscillation" like the one seen on a seismograph or heart monitor. Wavelets are purposefully crafted to have specific properties that make them useful for signal processing. They can be combined, using a "shift, multiply and sum" technique called convolution, with portions of an unknown signal to extract information from that signal. The fundamental idea behind wavelets is to analyze according to scale. This step returns the actual fingerprints, whose dimensionality is further reduced through the Min-Hash + LSH technique.

4.5 MIN-HASHING THE FINGERPRINT
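The fingerprints that enter the Min-Hash step come from the Haar decomposition of Section 4.4. A minimal sketch of that step is given here under several assumptions: the standard (rows, then columns) Haar decomposition is used, only the 200 largest-magnitude coefficients are kept (the figure of 200 appears in the conclusion), and each coefficient is encoded with two bits for its sign, which would explain the 8192-bit length (128 x 32 x 2). The two-bit encoding and all function names are assumptions made for illustration.

```python
import math


def haar_1d(values):
    """Full 1-D Haar decomposition; the length must be a power of two."""
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        averages = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        details = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:n] = averages + details
        n = half
    return out


def haar_2d(image):
    """Standard 2-D decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in image]
    columns = [haar_1d(column) for column in zip(*rows)]
    return [list(row) for row in zip(*columns)]


def top_wavelet_bits(coefficients, top=200):
    """Keep the `top` largest-magnitude coefficients and encode their signs.

    Assumed encoding: bit 2*i is set for a kept positive coefficient,
    bit 2*i + 1 for a kept negative one; all other bits stay 0.
    """
    flat = [c for row in coefficients for c in row]
    keep = sorted(range(len(flat)), key=lambda i: abs(flat[i]), reverse=True)[:top]
    bits = [0] * (2 * len(flat))
    for i in keep:
        if flat[i] > 0:
            bits[2 * i] = 1
        elif flat[i] < 0:
            bits[2 * i + 1] = 1
    return bits


# A constant 128x32 sub-image (made-up input): all energy lands in one coefficient.
sub_image = [[1.0] * 32 for _ in range(128)]
coefficients = haar_2d(sub_image)
fingerprint = top_wavelet_bits(coefficients)
print("fingerprint length:", len(fingerprint), "bits set:", sum(fingerprint))
```

Because at most 200 of the 8192 bits are set, the resulting bit vectors are sparse, which is what makes Min-Hash a natural next step.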

At this stage of processing, we have fingerprints that are 8192 bits long, which we would like to reduce further by creating a compact representation of each item. Therefore, in the next step (5) of the algorithm, we use Min-Hash to compute sub-fingerprints for these sparse bit vectors. Min-Hash has proven effective in the efficient search of similar sets. It works as follows: think of a column as the set of rows in which the column has a 1 (consider C1 and C2 as two fingerprints). Then the similarity of two columns C1 and C2 is Sim(C1, C2) = |C1 ∩ C2| / |C1 ∪ C2|. Each row falls into one of four types:

Type    C1    C2
A       1     1
B       0     1
C       1     0
D       0     0

Formula: Sim(C1, C2) = a / (a + b + c), where a, b, and c are the numbers of rows of type A, B, and C respectively. It is a number between 0 and 1; it is 0 when the two sets are disjoint, 1 when they are equal, and strictly between 0 and 1 otherwise. Before exploring how MinHash works, we point out a very useful property: for a randomly chosen row permutation, the probability that MinHash(C1) = MinHash(C2) equals Sim(C1, C2).
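The relation between the similarity measure and Min-Hash can be illustrated with a short sketch. It represents each fingerprint as the set of row indices that hold a 1 and uses explicit random permutations. The toy vector length of 1024 rows, the 300 permutations, the fixed seeds, and the function names are all assumptions chosen for a quick demonstration; the paper's fingerprints have 8192 rows, and the authors' permutation generator is not specified.

```python
import random


def jaccard(c1, c2):
    """Sim(C1, C2) = |C1 ∩ C2| / |C1 ∪ C2| for two sets of row indices."""
    union = c1 | c2
    return len(c1 & c2) / len(union) if union else 1.0


def random_permutations(num_rows, count, seed=0):
    """Generate `count` random permutations of the row indices."""
    rng = random.Random(seed)
    permutations = []
    for _ in range(count):
        order = list(range(num_rows))
        rng.shuffle(order)
        permutations.append(order)
    return permutations


def minhash_signature(rows, permutations):
    """For each permutation, take the smallest permuted position of any row with a 1.

    `rows` must be non-empty.
    """
    return [min(perm[r] for r in rows) for perm in permutations]


def estimated_similarity(sig1, sig2):
    """Fraction of signature positions on which the two signatures agree."""
    return sum(a == b for a, b in zip(sig1, sig2)) / len(sig1)


# The four-row table above, with rows A, B, C, D numbered 0 to 3.
c1 = {0, 2}
c2 = {0, 1}
table_similarity = jaccard(c1, c2)   # a / (a + b + c) with a = b = c = 1

# Two made-up sparse fingerprints that share most of their set bits.
NUM_ROWS = 1024
rng = random.Random(1)
f1 = set(rng.sample(range(NUM_ROWS), 100))
f2 = set(sorted(f1)[:75]) | set(rng.sample(range(NUM_ROWS), 25))

permutations = random_permutations(NUM_ROWS, 300)
true_similarity = jaccard(f1, f2)
estimate = estimated_similarity(minhash_signature(f1, permutations),
                                minhash_signature(f2, permutations))
print("table example:", table_similarity)
print("true similarity: %.3f, Min-Hash estimate: %.3f" % (true_similarity, estimate))
```

The estimate approaches the true similarity as the number of permutations grows, so a signature of a few hundred small integers can stand in for the full bit vector when searching for similar fingerprints.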

5. OVERVIEW OF VOICE IDENTITY MATCHING ALGORITHM

Voice identity verification is a quick, convenient, and effective method of establishing an individual’s identity. Among biometric techniques, a voice print is a recording of a set of words, or the wave pattern of a voice. It cannot be played back or used for any purpose other than comparison with subsequent voice prints. Minutia extraction analyzes and identifies the key features of the voice pattern, such as the location and direction of the ridges. When the voice frequency is analyzed, the minutiae points are extracted and translated into a code that serves as a template. Templates usually have a size of between 40 and 1000 bytes, often around 256 bytes. Matching algorithms are then used to resolve voice identity: if the voice identity matches the stored template, the voice is recognized; otherwise it is rejected.

6. PROPOSED DESIGN
As mentioned in the previous sections, existing solutions to the authenticity problem are based on the calculation of public and private keys. The main drawback of these solutions is their performance, in terms of execution time for encryption and decryption, given the limitations of the mobile device. To overcome this drawback, our design is based on the fingerprint process. The scenario is as follows:

Design of Secure Mobile Communication using fingerprint

1. When a person buys a SIM card from a seller, his or her voice should be recorded, sent to the database of the service provider, and saved alongside the SIM card's number.
2. New options should be added to the mobile system, for instance talking with voice identity.
3. The first step in using this system is downloading the user’s voice from the service provider database and saving it to the SIM card. Note that if the SIM card is resold, the voice can be removed.
4. When the caller/sender uses these options to authenticate himself or herself to the receiver, the mobile system asks for a voice sample and matches it against the one on the SIM card. If the matching succeeds, a true symbol is added to the number and appears on the receiver's mobile device to prove the authenticity of the caller or sender (fig. 4).
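The four steps above can be sketched as a small simulation. Everything in it is hypothetical and serves only to make the message flow concrete: the class and function names, the representation of a voice template as a set of integers (for example, the indices of the set bits of a fingerprint), the similarity threshold of 0.8, and the "[verified]" marker standing in for the true symbol.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

MATCH_THRESHOLD = 0.8  # hypothetical acceptance threshold


def similarity(a: Set[int], b: Set[int]) -> float:
    """Set similarity between a stored template and a fresh sample."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


@dataclass
class ServiceProvider:
    """Step 1: the provider stores each subscriber's voice next to the number."""
    voices: Dict[str, Set[int]] = field(default_factory=dict)

    def register(self, number: str, template: Set[int]) -> None:
        self.voices[number] = template

    def download(self, number: str) -> Set[int]:
        return self.voices[number]


@dataclass
class SimCard:
    """Step 3: the SIM card holds a downloaded copy of the voice template."""
    number: str
    template: Optional[Set[int]] = None

    def clear_voice(self) -> None:
        """Remove the stored voice, for example before the card is resold."""
        self.template = None


def caller_id(sim: SimCard, spoken: Set[int]) -> str:
    """Step 4: return the number shown to the receiver, marked if verified."""
    verified = (sim.template is not None
                and similarity(sim.template, spoken) >= MATCH_THRESHOLD)
    return sim.number + " [verified]" if verified else sim.number


provider = ServiceProvider()
bob_voice = set(range(0, 100))                   # made-up template for Bob
provider.register("555-0100", bob_voice)

bob_sim = SimCard("555-0100")
bob_sim.template = provider.download("555-0100")

genuine = caller_id(bob_sim, set(range(5, 100)))      # close to Bob's template
impostor = caller_id(bob_sim, set(range(500, 600)))   # unrelated voice
print(genuine)
print(impostor)
```

In this sketch the comparison runs on the handset against the SIM copy, so no third party is contacted at call time, which matches the claim in the abstract.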

[Figure: SMS with Voice Identification Process. Menu screens: List, About voice, Download voice, SMS + voice.]

[Figure: Call with Voice Identification Process. Menu screens: Call, About voice, Download voice, Call + voice.]

[Figure: Flowchart of the Proposed Design. Bob wants to call Alice with voice identity → The phone asks Bob to provide his identity → Matching → The phone system adds the true symbol to the number and calls Alice → Alice sees Bob’s number with the true symbol.]

CONCLUSION
This paper introduces a new design to solve the authenticity problem in mobile communication. The fingerprinting scheme based on wavelets still leaves things that can be improved. Specifically, the following topics deserve more attention: the normalization procedures applied to the amplitude level of the voice and to the wavelets, the generator of MinHash random permutations, and wavelet extraction. A further possible improvement is to use neural networks as generators of forgiving hash functions.

Building the fingerprints requires many steps, so it is easy to lose track of how they connect, and the algorithm may take a while to understand. To simplify the explanation, the original figure showed a full example of processing a 44100 Hz, stereo .mp3 file (Prodigy - No Good). The activity flow is as follows:
1. The input file has two channels (stereo). In the first step, the voice is downsampled to 5512 Hz, mono. Note that the shape of the input signal does not change.
2. Once the signal is downsampled, its corresponding spectrogram is built.
3. The spectrogram is logarithmized and divided into spectral images, which go on to represent the fingerprints themselves.
4. Each spectral frame is transformed using wavelet decomposition, and the top 200 wavelets are kept in the original image. Note the distinctive pattern between consecutive fingerprints. Having similar wavelet signatures across time helps in identifying the voice even if there is a small time misalignment between stored and query fingerprints.
5. Each fingerprint is encoded as a bit vector (length 8192).
6. The Min-Hash algorithm is used to generate the corresponding signatures.
7. Finally, using Locality Sensitive Hashing, the Min-Hash signature of each fingerprint is spread over 25 hash tables, which are later used in lookup.
