MORE ADVANCED STEGANOGRAPHY USING BPCS

Rituraj Rusia 1, Munendra Kumar Mishra 2, R. K. Tiwari 3
1 Ph.D. (CS) Research Scholar, MGCGVV, Chitrakoot (MP)
2 Vindhya Institute of Technology and Science (VITS), Satna (MP)
3 Head of Dept. of Physics, Govt. M. S. Golvarkar New Science College, Rewa (MP)

ABSTRACT: Steganography has advanced tremendously in the last few years, and simple concepts have even been presented on mainstream TV. However, more sophisticated techniques are less well known and may be overlooked by forensic analysts and even by steganalysis software. This paper showcases several more advanced steganographic techniques, some with very high data-hiding capacities. One technique successfully hides message data amounting to 15% to 20% of a JPEG file, and YOU can't tell! That means your 8 MB JPEG image may contain up to 1.6 MB of covert data! An audio CD contains about 700 MB of data; even a modest 1% capacity allows for 7 MB of hidden data. The techniques are illustrated by actual software so YOU can decide the effectiveness for yourself. Can you see or hear it? Will it be flagged by steganalysis programs? We shall see, or not!

Keywords: Steganography, Cryptography, RSA, BPCS, Complexity, Image, Embedding, Extracting, EZW encoder

[1] INTRODUCTION

Information hiding is the science of concealing the existence of data even when it is being sought. Cryptography may very well conceal the meaning of the data, but in some cases this is inadequate. Oftentimes, breaking "unbreakable" cryptography is as simple as a gun to the head, a briefcase full of money, or both! Steganography is a sub-discipline of the broader science of information hiding and employs numerous technologies to achieve its goals: digital signal processing, cryptography, information theory, data compression, mathematics, and human audio/visual perception, just to name a few.
Steganography has two primary goals: 1) Security: is the hidden data perceptible by either a person or a computer? and 2) Capacity: how much data can be hidden in a given cover file? [1]. These two goals are often in competition. The more data you hide, the more likely it is to be found, i.e. it has less security, and vice versa. A third goal, robustness, is what separates steganography from watermarking (a second sub-discipline of information hiding). Robustness is the resilience of your hidden data to image/audio manipulation such as contrast, brightness, cropping, stretching, analog-to-digital-to-analog conversion, etc. There is a large commercial interest in watermarking for digital rights management. Since there is also a trade-off between robustness and capacity, steganographic programs often do not attempt to be robust, and the techniques presented here are no exception.

There are three levels of failure for steganography: 1) detection, 2) extraction, and 3) destruction [2]. When hidden data is detected, it is generally game over. However, if the data cannot be extracted, your objective may still be met. Extraction can be made more difficult by encrypting and/or scrambling the message data. Preventing destruction refers to maintaining the integrity of the hidden data without significant damage to the cover file. Certainly, one could always delete or overwrite the file in question, but preventing an opponent from destroying your data while keeping the value in the digital work is a challenge. For steganography, once the algorithm is known, an opponent can use the same algorithm to insert randomized data into the same bits that carry the message. Message destroyed, image no worse off. Finally, for the purpose of discussion, we can rate perceptibility in three easy levels: 1) indistinguishable, 2) distortion can be seen or heard when looking or listening closely for it, 3) blatantly obvious to a casual observer.

A. Advantage of Steganography: The advantage of steganography over cryptography alone is that messages do not attract attention to themselves. Plainly visible encrypted messages, no matter how unbreakable, will arouse suspicion, and may in themselves be incriminating in countries where encryption is illegal. Therefore, whereas cryptography protects the contents of a message, steganography can be said to protect both messages and communicating parties. However, it can also pose serious problems because it is difficult to detect. Network surveillance and monitoring systems will not flag messages or files that contain steganographic data. Therefore, if someone attempted to steal confidential data, they could conceal it within another file and send it in an innocent-looking email [3].

B. Conversion of Text Information: The meaningful text messages need to be translated into an encrypted binary data stream.
This data stream is what gets embedded in the carrier image, so the conversion prepares the message for steganography [4]. The conversion is shown in Figure-1.

Figure-1: Conversion of text information

C. RSA Algorithm: The RSA algorithm, named for its creators Ron Rivest, Adi Shamir, and Leonard Adleman, is currently one of the most popular public key encryption methods. The RSA algorithm [5] is applied for the encryption of text information which is secure against a man-in-the-middle attack. It is very simple to multiply numbers together, especially with computers, but it can be very difficult to factor the product back into its prime factors. Steps involved in RSA algorithm:

a). Key generation:
1. Choose two distinct prime numbers $p$ and $q$.
2. Compute $n = pq$.
3. Compute $\varphi(pq) = (p - 1)(q - 1)$, where $\varphi$ is Euler's totient function.
4. Choose an integer $e$ such that $1 < e < \varphi(pq)$, and $e$ and $\varphi(pq)$ share no divisors other than 1 (i.e., $e$ and $\varphi(pq)$ are coprime). $e$ is released as the public key exponent.
5. Determine $d$ (using modular arithmetic) which satisfies the congruence relation $d \cdot e \equiv 1 \pmod{\varphi(pq)}$. $d$ is kept as the private key exponent.
The public key consists of the modulus and the public (or encryption) exponent. The private key consists of the private (or decryption) exponent, which must be kept secret.
b). Encryption: $c = m^e \bmod n$, where $m$ is an integer with $0 < m < n$.
c). Decryption: $m = c^d \bmod n$.

D. Chaos Theory: Chaos is a kind of behavior governed by the laws of nonlinear dynamics. This paper adopts the logistic mapping method to generate a chaotic sequence:

$$a_{k+1} = \mu \, a_k \, (1 - a_k), \quad k = 0, 1, 2, \ldots$$

The values $a_k$ range over the interval [0, 1], and $\mu$ is a control parameter, also called the bifurcation parameter. When $3.5699456\ldots < \mu \le 4$, the logistic map works in a chaotic state. The data stream generated is disordered, and it is similar to random noise. The new binary sequence, which is the binarization of the acquired chaotic sequence, has two main functions in this paper.
1. It is used for the encryption of the text information before steganography.
2. It is used to simulate the binary data stream, which facilitates the various experiments.

[2] BIT PLANE COMPLEXITY SEGMENTATION (BPCS)

Bit Plane Complexity Segmentation (BPCS) was introduced in 1998 by Eiji Kawaguchi and Richard O. Eason [6] to overcome the shortcomings of the traditional Least Significant Bit (LSB) manipulation techniques [7]. While the LSB manipulation technique works very well for most grayscale and RGB color images, it is severely crippled by its limitation in capacity, which is restricted to about one-eighth the size of the base image.
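Returning briefly to the text-conversion stage of Section 1, the RSA steps and the logistic map can be tried out with a few lines of Python. This is only a minimal sketch under stated assumptions: the toy primes, the function names, the starting value `a0`, and the 0.5 binarization cutoff are all illustrative choices that do not come from the paper, and real RSA needs large primes and padding.

```python
from math import gcd


def rsa_keygen(p, q, e):
    """Follow key-generation steps 1-5 for toy primes; returns ((n, e), d)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if not (1 < e < phi) or gcd(e, phi) != 1:
        raise ValueError("e must satisfy 1 < e < phi and be coprime to phi")
    d = pow(e, -1, phi)  # modular inverse of e, needs Python 3.8+
    return (n, e), d


def rsa_encrypt(m, public_key):
    n, e = public_key
    return pow(m, e, n)  # c = m^e mod n


def rsa_decrypt(c, d, n):
    return pow(c, d, n)  # m = c^d mod n


def logistic_bits(a0, mu, count, cutoff=0.5):
    """Iterate a_{k+1} = mu * a_k * (1 - a_k) and binarize each value.

    The cutoff of 0.5 is an assumption; the paper does not state how the
    chaotic sequence is binarized.
    """
    bits = []
    a = a0
    for _ in range(count):
        a = mu * a * (1.0 - a)
        bits.append(1 if a >= cutoff else 0)
    return bits


# Toy values, far too small for real use.
public_key, d = rsa_keygen(p=61, q=53, e=17)
n, _ = public_key
cipher = rsa_encrypt(65, public_key)
plain = rsa_decrypt(cipher, d, n)
noise_bits = logistic_bits(a0=0.3, mu=3.99, count=16)
```

With these toy values the modulus is 3233 and decryption returns the original message 65. How the binarized chaotic sequence is combined with the text data is not specified in the paper, so the sketch stops at generating the bits.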
BPCS is based on the simple idea that the higher bit planes could also be used for embedding information provided they are hidden in seemingly "complex" regions. BPCS is another substitution type method, but rather than replacing specific bits, BPCS scans for complex areas of an image, and replaces those with the message data. The idea is that a human cannot distinguish between one complex patch and another complex patch [8].

Figure-2: Randomized data patches

Certainly, looking at these images side by side and comparing, you can see differences. But if you were to look at one as a small piece of a larger image, and it was later replaced by the other, you would likely not notice a difference. These images are large, 512 x 512, but the BPCS algorithm uses 8 x 8 patches, making perceptible detection even less likely. BPCS segments an image into bit planes, and in each plane the value is either zero or one. Then BPCS scans an 8x8 patch and determines its complexity: how much change is there? For instance, a pure black or pure white patch has zero complexity, i.e. no change. A checkerboard pattern of alternating black and white has the maximum complexity: there are 112 changes when scanned by row, then by column. A simple complexity measure is to divide the number of changes in the image sample by the maximum, giving a value from 0 to 1. Experimentally, a good threshold was determined to be 0.3. (It MUST be less than 0.5.) So, if there are at least 34 changes (34/112 = 0.304), then the image sample is complex and we can hide our data there. If the threshold is not met, BPCS leaves that patch unchanged and continues to the next 8x8 matrix. Next, the 64 bits are replaced by the message data. Now the problem is this: what if the message data is not complex? During extraction, the program will skip this patch. The solution is to conjugate the data by exclusive-ORing it with a checkerboard pattern. The conjugate complexity is always one minus the complexity of the non-conjugate data. This is why the threshold MUST be less than 0.5; otherwise the conjugation solution would not work (if the threshold is 0.7, and the message data's complexity is 0.6, its conjugate has complexity 0.4 and neither form meets the threshold). Now, you must indicate which data is conjugated. The solution in the original paper was to use one bit in the 8x8 matrix to indicate if it is conjugated.
Other solutions have been proposed, but this one is simple and effective.

Figure-3: Conjugation example. P is non-complex data, Wc is the checkerboard pattern, and P* is the result of conjugation.
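The complexity measure and the conjugation step described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' software: the function names are hypothetical, the checkerboard is assumed to start with 0 in the top-left corner, and the conjugation flag is simply returned instead of being stored in one bit of the block as in the original BPCS paper.

```python
THRESHOLD = 0.3  # value quoted in the text; must stay below 0.5


def complexity(patch):
    """Changes between adjacent bits (row and column scans) over the maximum."""
    size = len(patch)
    changes = 0
    for r in range(size):
        for c in range(size):
            if c + 1 < size and patch[r][c] != patch[r][c + 1]:
                changes += 1
            if r + 1 < size and patch[r][c] != patch[r + 1][c]:
                changes += 1
    return changes / (2 * size * (size - 1))


def checkerboard(size):
    """Alternating 0/1 pattern; starting with 0 at the top-left is an assumption."""
    return [[(r + c) % 2 for c in range(size)] for r in range(size)]


def conjugate(patch):
    """XOR the patch with the checkerboard pattern."""
    board = checkerboard(len(patch))
    return [[bit ^ mask for bit, mask in zip(row, mask_row)]
            for row, mask_row in zip(patch, board)]


def prepare_block(message_block):
    """Return (block_to_embed, was_conjugated) so the block meets the threshold."""
    if complexity(message_block) >= THRESHOLD:
        return message_block, False
    return conjugate(message_block), True


flat = [[0] * 8 for _ in range(8)]                      # no changes at all
simple = [[0, 0, 0, 0, 1, 1, 1, 1] for _ in range(8)]   # 8 changes out of 112
block, flagged = prepare_block(simple)
```

Here `simple` has complexity 8/112, well below the threshold, so it is conjugated and the resulting block has complexity 104/112. Applying `conjugate` a second time restores the original message block, which is what the extractor relies on.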

The following images illustrate BPCS in action. The histograms show that this technique can be easily detected statistically.

Figure-4: Original image of a Baboon, and the image with hidden data and a threshold of 0.3

Figure-5: Image with hidden data and a threshold of 0.2 and 0.1 respectively

Better complexity measures have been developed since the original inception. These reduce capacity, but prevent highly patterned patches from being considered complex. For instance, a checkerboard pattern is complex, but if it is modified, humans will perceive the change in pattern.

Figure-6: Original image histogram, and histogram of the image with a threshold of 0.3

Figure-7: Histograms of images with a threshold of 0.2 and 0.1 respectively

High Capacity Hiding in JPEG Images (JPEG)

JPEG files require a completely different hiding approach than altering bits in the cover file, as these bits will be distorted by the lossy compression process. Before discussing hiding, a brief overview of the compression process is required. JPEG is designed to work best with 24-bit natural color images, but it also works with grayscale images. JPEG converts the color planes from RGB (red, green, blue) to YCbCr (luminance and chrominance), examines the image in 8x8 blocks of pixels, applies a discrete cosine transform to each block, quantizes the results (this is the primary source of loss), and entropy encodes the rest.

Figure-8: JPEG compression process

On computer and television screens, the smallest division of color data is a pixel. In computer memory, each pixel is represented by a binary value. The more bits that are used to represent each value, the wider the range of colors for each pixel. Typical numbers of bits per pixel (bpp) are 8, 24, and 32. With these binary pixel values, and knowledge of which part of the picture each one represents, we can construct bit planes. A bit plane is a data structure made from all the bits at a certain significance position of each binary value, with the spatial location preserved. In Figure-9, position (0, 0) in bit plane 2 is bit 2 of pixel (0, 0) in the image. BPCS addresses the embedding limit by working to disguise the visual artifacts that are produced by the steganographic process. Optometric studies have shown that the human visual

system is very good at spotting anomalies in areas of homogeneous color, but less adept at seeing them in visually complex areas. When an image is deconstructed into bit planes, the complexity of each region can be measured. Areas of low complexity, such as homogeneous color or simple shapes, appear as uniform areas with very few changes between one and zero. Complex areas, such as a picture of a forest, appear as noise-like regions with many changes between one and zero. These random-seeming regions in each bit plane can then be replaced with hidden data, which is ideally also noise-like. Because it is difficult for the human eye to distinguish between two noise-like areas, we are able to disguise the changes to the image. Additionally, since complex areas of an image tend to be complex through many of their bit planes, much more data can be embedded with this technique than with those that are limited to only the lowest planes.

Figure-9: Image pixel location (0,0) has the binary value 01001110. In these bit planes, black is a 0 and white is a 1. In the first bit plane in the figure, position (0,0), there is a black zero. In the second bit plane, there is a white one, and so on down to the last bit plane.

Figure-10: Noise-like patch (a) and informative patch (b): (a) complexity 69, (b) complexity 29.

In BPCS, the complexity of each subsection of a bit plane is defined as the number of non-edge transitions from 1 to 0 and 0 to 1, both horizontally and vertically. Thus the complexity of each section is not determined only by the number of ones or zeros it contains. Generally, for any square of $2^n \times 2^n$ pixels, the maximum complexity is $2 \times 2^n \times (2^n - 1)$ and the minimum is of course 0. Most versions of image BPCS use an 8 pixel square, where the maximum complexity is 112. In Figure-10, white represents a one and black a zero.
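The bit-plane construction of Figure-9 and the maximum-complexity formula can be checked with a short Python sketch. The function names and the tiny 2x2 test image are assumptions made for illustration; planes are listed most significant bit first, matching the order in which Figure-9 reads the value 01001110.

```python
def bit_planes(pixels, depth=8):
    """Split a 2-D list of integer pixels into `depth` bit planes, MSB first."""
    planes = []
    for bit in range(depth - 1, -1, -1):
        planes.append([[(value >> bit) & 1 for value in row] for row in pixels])
    return planes


def max_complexity(side):
    """Maximum number of 0/1 transitions in a side x side patch: 2 * side * (side - 1)."""
    return 2 * side * (side - 1)


# Hypothetical 2x2 grayscale image; pixel (0, 0) is the value from Figure-9.
image = [[0b01001110, 0b11111111],
         [0b00000000, 0b10000001]]
planes = bit_planes(image)
bits_at_origin = [plane[0][0] for plane in planes]
```

Reading position (0, 0) through the eight planes gives back the bits 0, 1, 0, 0, 1, 1, 1, 0, and `max_complexity` returns 112 for an 8 pixel square and 24 for a 4 pixel square.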

Both squares, or patches, have the same number of ones and zeros, but very different complexities. This shows that one contains much more visual information than the other. The complex patch (a) carries very little visual information, therefore it can be replaced with secret data with a very low effect on the image's quality. However, if the more visually informative patch (b) were replaced, it would cause noise-like distortion of the definite edges and shapes. This technique works very well with natural images, as they tend to have many areas of high complexity. Images with many complex textures and well shaded objects usually have a high embedded data capacity. BPCS works much less well with computer generated images and line art, as those classes of images tend to have large areas of uniformity and sharply defined border areas. With these types of images, there is very little complexity to exploit and any changes tend to generate very obvious artifacts. This is one flaw BPCS shares with traditional steganography, though for slightly different reasons. Traditional steganography works poorly with computer generated pictures because the static distortion effect produced by embedding is very obvious in areas of homogeneous color. Another shared flaw is the fragility of the secret data with respect to changes in the post-embedding image. Any lossy compression will corrupt the hidden data, as will most transformations and filters. Since this makes the hidden data very vulnerable to any destructive attack, BPCS is almost useless for watermarking purposes. Despite these drawbacks, BPCS is very effective. With visually complex images, embedding rates of 30% to 50% are possible with low degradation. Even at high embedding rates, the artifacts generated are often overlooked because they are disguised in complex visual areas.
This research proposes a way to combine BPCS with wavelet image compression and EZW encoding to create a system ideal for Internet use. The merits of BPCS steganography are as follows [9].
1) The information hiding capacity of a true color image is around 50%.
2) A sharpening operation on the dummy image increases the embedding capacity quite a bit.
3) Randomization of the secret data by a compression operation makes the embedded data more intangible.
4) Customization of a BPCS steganography program for each user is easy. It further protects against eavesdropping on the embedded information.
5) It provides a high level of security.
6) Steganography, de-steganography, mail, and file format conversion on a thin client.
7) Less prone to typical attacks, viruses, worms, unpatched clients, and vulnerabilities.
8) Sensitive data is stored on secure servers rather than scattered across multiple potentially unprotected and vulnerable clients (e.g. smart phones and laptops).
9) Encrypted transmission of all data between server and clients.
10) Software management features (above) accommodate quick and easy application of security advisories on the server side.

[3] WAVELET COMPRESSION AND THE EZW ENCODER

The original JPEG standard made use of the block Discrete Cosine Transform (DCT) for its compression. This standard has been in wide use for some time, having gained popularity in part because of the demand for a good standard compression scheme to speed the download of images from the newly popular World Wide Web. At the same time, wavelet image compression was in the early stages of research and beginning to gain acceptance in the academic community. After being refined, wavelet techniques achieved even better compression than the DCT, with fewer artifacts and distortions. There have been great advances in the field of wavelet compression within recent years, and many of today's best image, audio, and video COmpressor/DECompressors (CODECs) are based on wavelets. The Discrete Wavelet Transform (DWT), when used on images, generally creates a lossy representation of that image. The image can then be reconstructed from the transform coefficients by using the inverse DWT. The coefficients produced have some image-like properties, which are exploited in many encoders, and which are used by EZW BPCS as explained in the next section. Figure-11 describes how one property of the DWT coefficients is exploited to improve encoding efficiency. The seminal paper on wavelet image compression is the 1993 paper "Embedded Image Coding Using Zerotrees of Wavelet Coefficients" [10] by J. M. Shapiro. The EZW encoder is simple and fast, and provides very good compression rates. It takes advantage of the correlations between subbands in a wavelet coefficient set to lower the number of bits needed to represent them. The successive approximation method used by EZW encodes the wavelet coefficients one bit plane at a time, starting with the most significant bits. The encoding is lossless as long as all bit planes are processed. However, significant compression can be achieved by not encoding all the bit planes.
In fact, with many images, a very good representation can be achieved by using only half of the available planes. A custom fast recursive indexed EZW encoder was written for this project, which trades extra memory usage for an increase in speed.

Figure-11: This figure shows the correlation between subbands in a wavelet coefficient set, and how EZW exploits it. A zerotree rooted at the first highlighted pixel in the upper left corner would mean that all the other highlighted pixels could be represented with only one symbol.
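The effect of encoding only the most significant bit planes can be illustrated with a small Python sketch. It is deliberately simplified and is not an EZW encoder: there is no zerotree coding, coefficients are assumed to be integers, and dropped planes are simply set to zero, whereas a real successive-approximation decoder may reconstruct values differently. The function name and the sample coefficients are hypothetical.

```python
def keep_top_planes(coefficients, total_planes, kept_planes):
    """Zero the lowest (total_planes - kept_planes) magnitude bits of each coefficient."""
    dropped = total_planes - kept_planes
    mask = ((1 << total_planes) - 1) ^ ((1 << dropped) - 1)
    result = []
    for value in coefficients:
        magnitude = abs(value) & mask
        result.append(-magnitude if value < 0 else magnitude)
    return result


# Hypothetical integer wavelet coefficients whose magnitudes fit in 6 bit planes.
coefficients = [57, -37, 29, 30, -6, 5, 2, -1]
all_planes = keep_top_planes(coefficients, total_planes=6, kept_planes=6)
half_planes = keep_top_planes(coefficients, total_planes=6, kept_planes=3)
```

Keeping all six planes returns the coefficients unchanged, which corresponds to the lossless case. Keeping only the top three planes preserves the large coefficients approximately (57 becomes 56, -37 becomes -32) and zeroes the small ones, which is where the compression gain comes from.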

[3.1] EZW BASED BPCS

This research proposes a method of embedding secret data into a DWT transformed image using the previously described BPCS. The coefficients of the DWT have many image-like properties, and BPCS is ideal for exploiting them. The main properties leveraged for BPCS are:

Correspondence: Spatial areas in each section of the coefficient subbands correspond directly to areas in the original image.
Complexity: The bit planes at corresponding significance levels of the wavelet coefficients and the original image are usually proportionally complex.
Resilience: Changes in the values of the wavelet coefficients do not create disproportionately large changes in the reconstructed image.

The property of correspondence states that in each subband of the wavelet coefficients, any subsection of that subband directly corresponds to a section of the original image. This correspondence is of course proportional, as the subbands decrease in size by a factor of two with each pass of the DWT. For example, an 8x8 patch of pixels in the original image corresponds to a 4x4 patch of coefficients in the largest subband. This allows the same complexity metrics to be used on the wavelet coefficients as are used on the original image. In the wavelet coefficients, the complexity of any subsection is related to the complexity of the corresponding subsection of the original image. While the amount of complexity in the wavelet coefficients is very important, the distribution of the complexity is also important. In the wavelet coefficients, the bits are ordered in decreasing significance, just as in the original image. Because of this, bit planes tend to become more complex towards the least significant bits. This is good for BPCS because this is where changes will have the smallest impact. The capacity of a container image is limited not only by its complexity, but also by the decoder's resilience to changes made in the coefficients.
Resilience indicates the ability of the wavelet coefficients to absorb changes in value without changing the final image. The more resilient they are, the more changes can be made and thus the more data can be embedded. The inverse DWT is quite resilient to small changes in the coefficient values, and large changes experience a blending and blurring effect from the smoothing nature of the wavelet transform. This property is extremely useful for BPCS, as many slight changes in the coefficients are blended out and result in very little visual impact on the reconstructed image. However, there are many differences between the wavelet coefficients and the original image that they are generated from. The main difference that has to be accounted for is the subband structure. Normally in BPCS, an 8x8 square block of pixels called a patch is used for complexity measuring and embedding. In wavelet coefficients, the largest subband sections are only one quarter the size of the original image. Embedding into an 8x8 block of coefficients in this subband would be like embedding into a 16x16 block in the original image. A smaller block size of 4x4 can be used to compensate for this. Unfortunately, a 4x4 patch has a much smaller complexity range: a maximum of 24, compared to the 8x8 maximum of 112. The smaller range results in a much coarser change gradient in the amount of both distortion artifacts and embedding capacity. However, at proportional complexity

values, the 4x4 patches seem to provide better overall results for both distortion and capacity than the 8x8 patches. Another difference has to do with non-uniform significance across the subbands within each bit plane. The smaller subbands respond differently to changes than the larger subbands. A subband based weighting scheme was devised to increase the complexity level required for embedding in the more significant subbands. This decreased the embedding potential for each bit plane, but resulted in vastly reduced visual distortion. An important but easily overlooked factor is the quality of the secret data to be embedded. If the data is not random-seeming, the distortion of the output image may be greatly increased. Replacing a complex, noise-like area in an image with data that is all zeros or all ones would result in a high amount of distortion. Encryption and compression of the data before embedding solves this problem, and also allows for much greater amounts of data to be embedded. If the data is random-seeming, the post-embedding encoding will not be able to compress it much. Because of this, the final output file size usually increases by about the same amount as the size of the data that is embedded. However, the larger the file, the better the arithmetic encoder performs, so this increase varies depending on how much data is embedded. Below is the embedding algorithm (see Figure-12):
1: The image is converted into raw pixel values.
2: The DWT is applied to the image.
3: The wavelet coefficients are encoded to the desired resolution by the EZW encoder.
4: The wavelet coefficient bit planes are decoded and reconstructed to the encoded resolution.
5: BPCS is performed on the wavelet coefficients.
6: The wavelet coefficients are re-encoded to the previous resolution by the EZW encoder.
7: The EZW file is arithmetically encoded.

Figure-12: The algorithm in graphical form. The arrows represent input/output data streams.
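Step 2 and the correspondence property can be made concrete with a one-pass 2-D wavelet transform. The paper does not say which wavelet was used, so the sketch below assumes the simple Haar average/difference filter purely for illustration; the function names, the dictionary keys, and the test image are hypothetical, and the image is assumed to have even width and height.

```python
def haar_pairs(values):
    """Replace neighbouring pairs with their averages, followed by their half-differences."""
    averages = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    details = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return averages + details


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def haar_one_pass(image):
    """One DWT pass (rows, then columns); returns the four quarter-size subbands."""
    rows_done = [haar_pairs(row) for row in image]
    cols_done = transpose([haar_pairs(col) for col in transpose(rows_done)])
    half_r, half_c = len(image) // 2, len(image[0]) // 2
    return {
        "approximation": [row[:half_c] for row in cols_done[:half_r]],
        "detail_top_right": [row[half_c:] for row in cols_done[:half_r]],
        "detail_bottom_left": [row[:half_c] for row in cols_done[half_r:]],
        "detail_bottom_right": [row[half_c:] for row in cols_done[half_r:]],
    }


# Hypothetical 8x8 grayscale patch with values 0..63.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
subbands = haar_one_pass(image)
```

Each of the four subbands of the 8x8 patch is 4x4, so every coefficient summarizes a 2x2 block of pixels. This is the reason an 8x8 image patch lines up with a 4x4 patch in the largest subbands, and why the smaller patch size is used there.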
EZW is a progressive encoder, so it encodes one bit plane at a time. Steps 3 and 4 are performed so that portions of the final bit plane can be omitted, so as to meet bpp requirements. It is also very easy to construct bit planes from the encoded coefficients during the decoding process. Any transform like the DWT that results in coefficients with progressively significant bits can be used, as long as a suitable complexity metric can be

found. The arithmetic encoding in step 7 is useful for further compressing the final symbolic output. Theoretically, the image can be of any color depth or aspect ratio. A trivial way to extend this process to 24bpp RGB color would be to separate the pixel values into three 8-bit color planes. The same algorithm as above could then be applied from step 2. An additional step would be needed after step 7 to join the three separate streams into a final output file. Using an RGB image would yield a much higher embedding capacity, as well as a much larger post-embedding file, than 8bpp grayscale. Applying this algorithm to 32bpp RGBA color would be more difficult. The changes in the Alpha values could cause very obvious transparency problems, with point distortions being very easy to see. Also, Alpha values tend not to be very complex. Most times they define large regions of the image to be either completely transparent or completely opaque. The second most common use is to define a smooth gradient of transparency. In either case, the Alpha channel would likely have a low embedding capacity and be prone to producing large distortions. It would likely be best to embed in only the RGB color planes and to ignore the Alpha channel completely.

[4] PERFORMANCE MEASUREMENT PARAMETERS

The performance of various steganographic methods can be rated by three main parameters: (i) capacity, (ii) security, and (iii) imperceptibility [11] [12]. Recently, two more parameters, (iv) tamper resistance and (v) computational complexity, have also been introduced in the literature [13]. The hiding capacity refers to the maximum amount of information that can be hidden in the image. It is expressed in bits per byte, in bits per pixel, or in total as a number of bytes or kilobytes. It should be as high as possible. Security means the ability to survive transformations like cropping, scaling, filtering, and addition of noise, as well as different attacks.
The different attacks are (i) the steganography-only attack, (ii) the known-carrier attack, (iii) the chosen steganography attack, and (iv) the known steganography attack [14]. In the steganography-only attack, only the stego-image is available to the intruder for analysis. In the known-carrier attack, both the original image and the stego-image are available to the intruder for analysis. In the chosen steganography attack, the steganographic algorithm is available to the intruder along with the known message. In the fourth category, i.e. the known steganography attack, the original image, the stego-image and the steganography algorithm are available to the intruder for analysis. A good steganographic technique should withstand all these varieties of attack. Imperceptibility refers to perceptual transparency, i.e. no visual artifacts in the stego-image. It should be as high as possible. Tamper resistance means the survival of the embedded data in the stego-image when an attempt is made to modify it. Finally, computational complexity refers to the computational cost of embedding and extraction. It should be as low as possible. The distortion in the stego-image can be measured by parameters like mean square error (MSE), peak signal-to-noise ratio (PSNR), and correlation (r) [15]. Less distortion means a lower MSE but a higher PSNR. If p is an M × N grayscale image and q is its stego-image, then the MSE and PSNR values are computed using (1) and (2). The $p_{ij}$ and $q_{ij}$ are the original image pixel value and the stego-image pixel value at the i-th row and j-th column respectively. For

color images a pixel comprises 3 bytes. Each byte can be treated as a pixel and the same equations can be used to calculate the MSE and PSNR.

$$\mathrm{MSE} = \frac{1}{M N} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( p_{ij} - q_{ij} \right)^2 \qquad (1)$$

$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{C_{max}^2}{\mathrm{MSE}} \right) \ \mathrm{dB} \qquad (2)$$

The $C_{max}$ represents the actual maximum pixel value in the image. PSNR values falling below 30 dB indicate a fairly low quality stego-image, i.e. the distortion caused by embedding is severe. A high quality stego-image should have a PSNR value of more than 40 dB [16]. The correlation, denoted by the letter r, is a measure of the similarity between the original image and the stego-image. It is measured using (3).

$$r = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} \left( p_{ij} - \bar{p} \right) \left( q_{ij} - \bar{q} \right)}{\sqrt{\left( \sum_{i=1}^{M} \sum_{j=1}^{N} \left( p_{ij} - \bar{p} \right)^2 \right) \left( \sum_{i=1}^{M} \sum_{j=1}^{N} \left( q_{ij} - \bar{q} \right)^2 \right)}} \qquad (3)$$

The $\bar{p}$ and $\bar{q}$ are the average pixel values in the original image and the stego-image respectively. MATLAB and SCILAB provide a function corr2(p, q) that evaluates the correlation between the cover image, p, and the stego-image, q. The maximum value of corr2(p, q) is 1, reached if p and q are the same image. So if the distortion is lower, the correlation is higher. It has been experimentally investigated that stego-images bearing a secret message are statistically natural images [17]. Adding the unnatural message inside a natural image changes the statistics, but this change is so small that it does not allow for reliable detection. Every year new steganographic techniques are developed, and at the same time new types of attacks are introduced. As per information theory, the entropy measure can be used as an attack [18]. The entropy of the stego-image, S, will be equal to the entropy of the cover image, C, plus the entropy of the embedded data, E.

[5] EXPERIMENTAL RESULTS

The process described in the previous sections was implemented and tested on several standard images including Girl and Dog. The initial results are very good, and further refinements to the technique should be able to boost embedding rates even more. At embedding levels up to 25% of the final compressed image size, there is an absence of large, obvious distortions in the post-embedding image.
At higher levels, artifacts that do appear tend to show as a blurred mottling of the image in the areas of high complexity [19]. When more significant bit planes are used, there are sometimes point irregularities where a section with a low amplitude trend has a high amplitude spike caused by embedding. This spike shows as a light spot on a darker background, or vice versa. The spot fades into the surroundings, but is sometimes noticeable. As in regular steganography, for this reason as well as others, use of the higher bit planes should be limited. Of the four algorithms used (DWT, EZW, BPCS, and arithmetic encoding), BPCS is the most computationally intensive. Even so, on a middle range 333 MHz Pentium running Red Hat Linux 7, the entire process from step 1 to step 7 in Section 3.1 takes less than 10 seconds on average. On a 400 MHz G3 Apple Powerbook, the process takes less than 5 seconds [20]. As with many techniques, in BPCS steganography

More Advanced Steganography Using BPCS memory usage can be traded for speed. With minor optimizations for memory, peak memory usage was cut in half, while the time taken was less than doubled. The results listed are from an implementation of the EZW BPCS technique. The Moffat [21] adaptive arithmetic encoder was used as the final step in Figure-12. The system was tested on 8bpp greyscale images, which were 512x512 pixels in size [22]. The 5 most significant bit planes were not used, as experiments show that these planes tend to have both low embedding capacity and low resilience to change. Also in these results, the 4x4 patch size was used as it gave better performance. Table-1 is of Girl, encoded with 8 and 9 bit planes, respectively. Figures-13 and 14 show the post embedding file for two of the results listed in the table. Generally, the system was able to achieve embedding rates of 20% to 25% with little or no obvious degradation in image quality. Rates of over 50% were attained but some distortion was observed. Figure-13: Test image Girl, 3 of 8 bit planes used, with a complexity threshold of 6. Image Figure-14: Test image Girl, 3 of 8 bit planes used, with a complexity threshold of 2. #planes for embedding #planes used threshold Complexity Embedded data (bytes) Compressed (bytes) PSNR (db) (a) 8 19333 35.9 (b) 8 3 6 5104 24219 31.6 (c) 8 3 4 7216 26599 30.3 (d) 8 3 2 9934 30482 29.1 (e) 9 37630 39.1 (f) 9 4 6 12442 47791 33.0 (g) 9 4 4 16750 52073 31.5 (h) 9 4 2 22908 60032 30.0 Table-1: Experimental results for Girl 277
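The PSNR values reported in Table-1, together with the MSE and the correlation r used as quality measures, can be computed along the following lines. This is a minimal sketch rather than the implementation used for the experiments: it assumes greyscale images stored as lists of pixel rows, it takes the maximum pixel value from the cover image unless one is supplied, and the function names mse, psnr and correlation are illustrative only.

```python
import math


def mse(cover, stego):
    """Mean squared error between two equally sized images (lists of pixel rows)."""
    rows, cols = len(cover), len(cover[0])
    total = sum((p - q) ** 2
                for row_p, row_q in zip(cover, stego)
                for p, q in zip(row_p, row_q))
    return total / (rows * cols)


def psnr(cover, stego, c_max=None):
    """PSNR in dB. c_max defaults to the largest pixel value in the cover (assumption)."""
    if c_max is None:
        c_max = max(max(row) for row in cover)
    error = mse(cover, stego)
    if error == 0:
        return math.inf  # identical images: no distortion at all
    return 10 * math.log10(c_max ** 2 / error)


def correlation(cover, stego):
    """Pearson correlation r between the pixels of two images."""
    p = [v for row in cover for v in row]
    q = [v for row in stego for v in row]
    p_mean, q_mean = sum(p) / len(p), sum(q) / len(q)
    numerator = sum((a - p_mean) * (b - q_mean) for a, b in zip(p, q))
    denominator = math.sqrt(sum((a - p_mean) ** 2 for a in p)
                            * sum((b - q_mean) ** 2 for b in q))
    return numerator / denominator  # raises ZeroDivisionError for a constant image


# Toy 2x2 "images": two pixels differ by one grey level.
cover = [[10, 20], [30, 40]]
stego = [[11, 20], [30, 41]]
print(mse(cover, stego))                    # 0.5
print(round(psnr(cover, stego), 2))         # 35.05
print(round(correlation(cover, stego), 4))  # 0.999
```

For a color image, the same functions can be applied after treating each of the 3 bytes of a pixel as a separate pixel value.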

[6] DESIGN AND IMPLEMENTATION

The improved text steganography technique based on chaos, RSA and BPCS [4] is designed as shown in Figure-15. To preserve the inserted message data, we must hide it after the lossy part of the compression. Since that point comes after quantization, we hide the data in the resulting quantized DCT coefficients. This technique offers a solid capacity of 15 to 20% for a high-quality jpeg image. Interestingly, at lower qualities the alterations are easily noticed, as the sample images illustrate.

This technique is essentially an adaptive LSB method for the DCT coefficients. The log2 of the magnitude of the DCT coefficient is compared to the log2 of an alpha factor times the corresponding value in the quantization table. The lesser of these two values is the number of bits that can be hidden. That number of bits in the DCT coefficient is replaced by an equal number of bits from the message.

There are a couple of additional considerations. The DC component of the DCT results is altered less: it is more significant, so we do not want to change it as much. We also employ a block classification routine to increase capacity. By finding blocks that are less uniform (i.e. more complex), we can adapt the number of bits to hide, since a busy picture is still a better cover file than a uniform one. However, the results are not nearly so pronounced as in the LSB or BPCS techniques.

Both of the next two images have roughly 22% of their data replaced with message data. Can YOU tell there is anything amiss? Large images are presented so you can take a really close look.

Figure-15: The technique of improved steganography text based on chaos, RSA and BPCS

Figure-16: Baboon with 22% hidden data, 95% quality
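The per-coefficient capacity rule described in this section (the lesser of the log2 of the coefficient magnitude and the log2 of alpha times the quantization table entry) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the logarithms are rounded down to whole bits, the default alpha of 1.0 is a placeholder, and the names hideable_bits, embed and extract are hypothetical. The reduced alteration of the DC component and the block classification routine are not modeled.

```python
import math


def hideable_bits(coeff, q_entry, alpha=1.0):
    """Bits that may be replaced in one quantized DCT coefficient.

    Takes the lesser of log2(|coeff|) and log2(alpha * q_entry), rounded down.
    The rounding and the default alpha are assumptions of this sketch.
    """
    magnitude = abs(coeff)
    scaled = alpha * q_entry
    if magnitude < 2 or scaled < 2:
        return 0  # log2 would be below 1 (or undefined for 0)
    by_magnitude = magnitude.bit_length() - 1  # floor(log2(|coeff|))
    by_table = math.floor(math.log2(scaled))
    return min(by_magnitude, by_table)


def embed(coeff, message_bits, n):
    """Replace the n low-order bits of |coeff| with n message bits, keeping the sign."""
    if n == 0:
        return coeff
    payload = 0
    for bit in message_bits[:n]:  # caller is expected to supply at least n bits
        payload = (payload << 1) | bit
    magnitude = ((abs(coeff) >> n) << n) | payload
    return magnitude if coeff >= 0 else -magnitude


def extract(coeff, n):
    """Read back the n low-order bits of |coeff|, most significant first."""
    return [(abs(coeff) >> shift) & 1 for shift in range(n - 1, -1, -1)]


# One coefficient, one quantization entry, and an illustrative alpha.
coeff, q_entry, alpha = -37, 16, 0.5
n = hideable_bits(coeff, q_entry, alpha)  # min(floor(log2 37), floor(log2 8)) = 3
stego_coeff = embed(coeff, [1, 1, 0], n)
print(n, stego_coeff, extract(stego_coeff, n))  # 3 -38 [1, 1, 0]
```

Because n never exceeds the floor of log2 of the magnitude, the leading 1 bit of the coefficient is left untouched, so in this sketch the extractor can recompute n from the modified coefficient and the same quantization table.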

Figure-17: Dog with 22% hidden data, 95% quality

The quality of the jpeg image is an important consideration with this technique. It works well down to roughly 60 to 65% quality; below that, there is noticeable distortion. The next image has only 18% of hidden data at 50% quality.

Figure-18: Dog with 18% hidden data, 50% quality

To detect this type of steganography, you must examine the DCT coefficients themselves; a simple histogram of the file is not effective, as seen next. There is nothing distinct about the histograms with or without the hidden data.

[7] APPLICATIONS

In discussing applications of BPCS Steganography, it is instructive to note that it differs from digital watermarking in two fundamental ways. The first is that for full color (e.g., 24-bit) images, it has a very large embedding capacity. As described previously, our experiments with BMP images have shown capacities exceeding 50% of the original image size. Although the results presented in this paper are for 24-bit images, we have also been working with other formats, such as 256 color images, which utilize a palette. Although the capacity is lower, the same concepts can be applied.

The second difference is that BPCS Steganography is not robust to even small changes in the image. This can be viewed as a good thing in applications where an unknowing user might acquire an embedded image. Any alteration, such as clipping, sharpening or lossy compression, would "destroy the evidence" and make it unusable for later extraction. Extracting the embedded information requires a deliberate attempt by a knowledgeable user on an unaltered image. The lack of robustness also ties in to the fact that a malicious user cannot alter the embedded data without knowledge of the customization parameters.

The more obvious applications of BPCS Steganography relate to secret communications. For example, a person, group, or company can have a web page containing secret information meant for another. Anyone can download the web page, so when the intended recipient does so, it does not draw any attention. Extracting the embedded information would require software customized with the proper parameters. Encryption of the embedded data would further improve security. This scenario is analogous to putting something in a very secure safe and then hiding the safe in a hard-to-find place.

In some applications, the presence of the embedded data may be known, but without the customization parameters, the data is inseparable from the image. In such cases, the image can be viewed by regular means, but the data is tied to the image and can't readily be replaced with other data. Others may know the data is there, but without the customization parameters, they cannot alter it and still make it readable by the customized software.

Applications of BPCS Steganography are not limited to those related to secrecy. For such applications, the presence of the embedded data may be known, and the software for extraction and embedding can be standardized to a common set of customization parameters. An example of this is a digital photo album, where information related to a photo, such as date and time taken, exposure parameters, and scene content, can be embedded in the photo itself.

[8] CONCLUSIONS AND FUTURE WORK

In 2012, I mined thousands of papers on steganography alone, and that number was a small fraction of the number of papers on watermarking. Several steganographic techniques can successfully hide and extract arbitrary data and remain visually undetectable. The recent revelation that Russian spies used steganography to communicate only highlights the need for continued research. These programs are a stepping stone to truly sophisticated and nearly undetectable steganography.
The objective of this paper was to demonstrate our BPCS-Steganography, which is based on a property of the human visual system. The most important point for this technique is that humans cannot see any information in the bit-planes of a color image if it is very complex. We have discussed the following points and shown our experiments.

(1) We can categorize the bit-planes of a natural image into informative areas and noise-like areas by complexity thresholding.
(2) Humans see informative information only in a very simple binary pattern.
(3) We can replace complex regions with secret information in the bit-planes of a natural image without changing the image quality. This leads to our BPCS-Steganography.
(4) Gray coding provides a better means of identifying which regions of the higher bit planes can be embedded.
(5) A BPCS-Steganography program can be customized for each user. Thus it guarantees secret Internet communication.

We are convinced that this steganography is a very strong information security technique, especially when combined with encrypted embedded data. Furthermore, it can be applied to areas other than secret communication. Future research will include the application to vessels other than 24-bit images, identifying and formalizing the customization parameters, and developing new applications.
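Points (1) and (4) above can be illustrated with a short sketch. It assumes the commonly used border-based complexity measure (the number of black-and-white transitions between neighbouring pixels of a binary patch, divided by the maximum possible number) and the standard binary-to-Gray conversion. The threshold value 0.3 and the function names are placeholders chosen for this example and are not the settings used in the experiments reported here.

```python
def to_gray(value):
    """Convert a pure-binary pixel value to canonical Gray code."""
    return value ^ (value >> 1)


def bit_plane(block, plane):
    """Extract one bit plane (0 = least significant) from a block of pixel values."""
    return [[(v >> plane) & 1 for v in row] for row in block]


def complexity(plane):
    """Border complexity of a binary patch, normalized to the range 0..1."""
    rows, cols = len(plane), len(plane[0])
    changes = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and plane[r][c] != plane[r][c + 1]:
                changes += 1  # horizontal transition
            if r + 1 < rows and plane[r][c] != plane[r + 1][c]:
                changes += 1  # vertical transition
    max_changes = rows * (cols - 1) + cols * (rows - 1)
    return changes / max_changes


def is_noise_like(plane, threshold=0.3):
    """True if the patch is complex enough to be replaced (threshold is hypothetical)."""
    return complexity(plane) >= threshold


# A 4x4 checkerboard is maximally complex; a flat patch is not complex at all.
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]
flat = [[0] * 4 for _ in range(4)]
print(complexity(checkerboard), is_noise_like(checkerboard))  # 1.0 True
print(complexity(flat), is_noise_like(flat))                  # 0.0 False

# Neighbouring grey levels 127 and 128 differ in all 8 bits in pure binary,
# but in only one bit after Gray coding.
print(bin(127 ^ 128).count("1"), bin(to_gray(127) ^ to_gray(128)).count("1"))  # 8 1
```

The last line shows why Gray coding helps on the higher bit planes: a one-level change in brightness no longer flips many planes at once, so a smooth region does not look artificially complex.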

REFERENCES

[1] William Stallings, Cryptography and Network Security: Principles and Practice, Prentice Hall International, Inc., 2002.
[2] Oded Goldreich, Foundations of Cryptography, China Machine Press, 2003.
[3] Jae K. Shim, Anique A. Qureshi and Joel G. Siegel, The International Handbook of Computer Security, Glenlake Publishing Company, Ltd., 2000.
[4] Pradnya R. Rudramath, M. R. Madki, International Journal of Scientific and Research Publications (IJSRP), Volume 2, Issue 7, July 2012.
[5] http://www.datahide.com/bpcse/qtechhv-program-e.html
[6] Eiji Kawaguchi, Richard O. Eason, "Principle and Applications of BPCS-Steganography," SPIE's International Symposium on Voice, Video and Data Communications, (1998-11).
[7] Eiji Kawaguchi and Richard O. Eason, Principle and Applications of BPCS-Steganography, Kyushu Institute of Technology, Kitakyushu, Japan; University of Maine, Orono, Maine.
[8] Neil F. Johnson, Zoran Duric, Sushil Jajodia, Information Hiding: Steganography and Watermarking - Attacks and Countermeasures, Kluwer Academic Publishers, 2001.
[9] Sheetal Mehta, Kaveri Dighe, Meera Jagtap, Anju Ekre, IEEE paper on Web Based BPCS Steganography, IJCTEE, Volume-2, Issue-2, April 2012.
[10] J. M. Shapiro, Embedded Image Coding Using Zerotrees of Wavelet Coefficients, IEEE Transactions on Signal Processing, pp. 3445-62, 1993.
[11] B. Li, J. He, J. Huang, and Y.Q. Shi, A survey on image steganography and steganalysis, Journal of Information Hiding and Multimedia Signal Processing, Vol.2, no.2, pp.142-172, 2011.
[12] N. Hamid, A. Yahya, R.B. Ahmad, D. Nejim, and L. Kannon, Steganography in image files: a survey, Australian Journal of Basic and Applied Sciences, Vol.7, no.1, pp.35-55, 2013.
[13] M. Hussain, A survey of image steganography techniques, International Journal of Advanced Science and Technology, Vol.54, pp.113-123, 2013.
[14] A. Bhatacharya, I. Banerjee, and G. Sanyal, A survey of steganography and steganalysis techniques in image, text, audio and video cover carrier, Journal of Global Research in Computer Science, Vol.2, no.4, pp.1-16, 2011.
[15] A. P. S. Pharwaha, Secure data communication using moderate bit substitution for data hiding with three layer security, IE(I) Journal-ET, Vol.91, pp.45-50, 2010.
[16] A. Cheddad, J. Condell, K. Curran, and P.M. Kevitt, Digital image steganography: survey and analysis of current methods, Signal Processing, Vol.90, pp.727-752, 2010.
[17] A. Martin, G. Sapiro, and G. Seroussi, Is image steganography natural, IEEE Transactions on Image Processing, Vol.14, no.12, pp.2040-2050, 2005.
[18] R. J. Anderson and F. A. P. Petitcolas, On the limits of steganography, IEEE Journal of Selected Areas in Communications, Vol.16, no.4, pp.474-481, 1998.
[19] Kawaguchi, E. and Taniguchi, R., Complexity of binary pictures and image thresholding - An application of DF-Expression to the thresholding problem, Proceedings of 8th ICPR, Vol.2, pp.1221-1225, 1986.

[20] Kawaguchi, E. and Niimi, M., Modeling Digital Image into Informative and Noise-Like Regions by Complexity Measure, Preprint of the 7th European-Japanese Conference on Information Modeling and Knowledge Bases, pp.268-278, May, Toulouse, 1997.
[21] A. Moffat, R. Neal, I. H. Witten, Arithmetic Coding Revisited, ACM Transactions on Information Systems, 16(3):256-294, 1998.
[22] Ian H. Witten, Radford M. Neal, John G. Cleary, "Arithmetic Coding for Data Compression," Communications of the ACM, Volume 30, No. 6, June 1987.