Steganography is the art and science of hiding information by embedding a message within another, seemingly harmless message. Ideally, only the sender and the intended recipient know that hidden information is present.

The word steganography originates from the Greek words steganos (covered or protected) and graphein (to write).

Steganography tries to hide the presence of a message; cryptography tries to hide the contents of a message.

The ancient Greeks hid secret messages by writing them on the wooden backing of a wax tablet before applying the beeswax, or by tattooing them onto the shaved head of a slave and letting the hair grow back.

There are several ways to hide information in digital media:

  • By using the noise in an image or a sound file
  • By spreading the information (over several pixels,…)
  • By adopting a statistical model (e.g. q is very often followed by u in languages like English or German)
  • By miming real content (e.g. of a text)
  • By replacing randomness (with hidden information)
  • By changing order
  • By splitting the information into pieces (which can/should take different routes to their destination)
  • By hiding the origin
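
The first item in the list above, using the noise in an image or a sound file, is commonly realized by overwriting the least significant bit (LSB) of each sample, since that bit mostly carries noise. The following is a minimal sketch under stated assumptions, not a hardened tool: the cover is treated as a plain sequence of bytes (e.g. raw pixel or audio samples), the function names embed_lsb and extract_lsb are hypothetical, and the 4-byte length prefix is an illustrative framing choice.

```python
LENGTH_PREFIX_BYTES = 4  # assumed framing: message length stored before the message


def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Hide `message` in the least significant bits of the cover samples."""
    payload = len(message).to_bytes(LENGTH_PREFIX_BYTES, "big") + message
    # Split the payload into single bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        # Clear the lowest bit of the sample, then set it to the payload bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(stego: bytes) -> bytes:
    """Recover a message written by embed_lsb."""

    def read_bytes(start_bit: int, count: int) -> bytes:
        out = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (stego[start_bit + 8 * b + k] & 1)
            out.append(value)
        return bytes(out)

    prefix_bits = 8 * LENGTH_PREFIX_BYTES
    if len(stego) < prefix_bits:
        raise ValueError("input is too short to contain a length prefix")
    length = int.from_bytes(read_bytes(0, LENGTH_PREFIX_BYTES), "big")
    if prefix_bits + 8 * length > len(stego):
        raise ValueError("length prefix does not fit the input")
    return read_bytes(prefix_bits, length)
```

Each sample changes by at most 1, which is why the modification is hard to see or hear. The sketch writes the bits into consecutive samples without a key, so anyone who knows the scheme can read the message.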

There is no doubt that criminals are highly interested in using steganography to hide their communication, but they are of course not the only group that might use it.

This technology can also support the weak, e.g.:

  • Anonymously seeking advice
  • Anonymously informing authorities
  • Anonymously leaking information to the media
  • Anonymously joining a discussion without losing face/…

Further uses are:

  • Digital watermarks
  • Document tracking tools
  • Digital signatures embedded in documents

In the following text I concentrate only on digital steganography and use the word steganography to mean exactly that.

According to the theory of information hiding, the legitimate or official communication between sender and recipient is called covertext; a message that carries embedded hidden information is called stegotext.

A stegosystem is an algorithm that creates a stegotext carrying an embedded message. Its goal is to produce a stegotext that looks just like covertext.

A more formal definition of a stegosystem is:

Let $\mathcal{C}$ be a distribution on a set C of covertexts. A stegosystem is a triple of probabilistic polynomial-time algorithms (SK, SE, SD) with the following properties.

  • The key generation algorithm SK takes as input the security parameter n and outputs a bit string sk, called the stego key.
  • The steganographic encoding algorithm SE takes as inputs the security parameter n, the stego key sk and a message m to be embedded and outputs an element c of the covertext space C, which is called the stegotext. The algorithm may access the covertext distribution $\mathcal{C}$.
  • The steganographic decoding algorithm SD takes as inputs the security parameter n, the stego key sk, and an element c of the covertext space C and outputs either a message m or a special symbol S. An output value of S indicates a decoding error, for example, when SD has determined that no message is embedded in c.
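
To make the triple (SK, SE, SD) more concrete, here is a toy sketch based on rejection sampling: SE keeps drawing covertexts until a keyed bit of the sample matches the next message bit. It rests on several assumptions that are not part of the definition: n is read as the key length in bytes, a stegotext is a list of covertexts (one per embedded bit), sample_covertext is a hypothetical stand-in for access to the covertext distribution, None plays the role of the error symbol S, and a 2-byte HMAC tag is added so that SD can notice a decoding error.

```python
import hashlib
import hmac
import random
import secrets
from typing import List, Optional

TAG_BYTES = 2   # assumed integrity tag length, lets SD detect decoding errors
MAX_TRIES = 64  # assumed rejection-sampling budget per embedded bit


def sample_covertext() -> str:
    """Hypothetical covertext source: one innocuous-looking weather report."""
    sky = random.choice(["sunny", "cloudy", "rainy", "windy"])
    return f"{sky}, {random.randrange(-10, 35)} degrees, reading #{random.randrange(10**6)}"


def keyed_bit(sk: bytes, index: int, c: str) -> int:
    """Pseudorandom bit derived from the stego key, the position and the covertext."""
    return hmac.new(sk, f"{index}|{c}".encode(), hashlib.sha256).digest()[0] & 1


def tag(sk: bytes, m: bytes) -> bytes:
    """Short keyed checksum appended to the message before embedding."""
    return hmac.new(sk, b"tag|" + m, hashlib.sha256).digest()[:TAG_BYTES]


def SK(n: int) -> bytes:
    """Key generation: returns a random stego key of n bytes."""
    return secrets.token_bytes(n)


def SE(n: int, sk: bytes, m: bytes) -> List[str]:
    """Encoding: one sampled covertext per payload bit (n is unused in this toy)."""
    payload = m + tag(sk, m)
    stegotext = []
    for index in range(8 * len(payload)):
        bit = (payload[index // 8] >> (7 - index % 8)) & 1
        for _ in range(MAX_TRIES):
            c = sample_covertext()
            if keyed_bit(sk, index, c) == bit:
                break
        # If every try failed, the last sample is sent anyway and SD may fail.
        stegotext.append(c)
    return stegotext


def SD(n: int, sk: bytes, stegotext: List[str]) -> Optional[bytes]:
    """Decoding: returns the message, or None (standing in for S) on error."""
    if len(stegotext) % 8 != 0 or len(stegotext) < 8 * TAG_BYTES:
        return None
    payload = bytearray(len(stegotext) // 8)
    for index, c in enumerate(stegotext):
        payload[index // 8] |= keyed_bit(sk, index, c) << (7 - index % 8)
    m, received = bytes(payload[:-TAG_BYTES]), bytes(payload[-TAG_BYTES:])
    return m if hmac.compare_digest(received, tag(sk, m)) else None
```

Every item in the output is an unmodified sample from the covertext source, so nothing is altered; the message lives only in which samples were chosen. The price is bandwidth: one whole covertext carries a single bit.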

The probability that the decoding algorithm outputs the correct embedded message is called the reliability of a stegosystem.
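
Written with the names from the definition above, the reliability for a message m can be sketched as the following probability. Reading it as taken over the random choices of SK and SE, including any samples SE draws from the covertext distribution, is an assumption about how the definition is meant.

```latex
\[
  \Pr\bigl[\, SD(n, sk, SE(n, sk, m)) = m \,\bigr],
  \qquad sk \leftarrow SK(n)
\]
```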

References:

  1. Steganography (Wikipedia, the free encyclopedia)
  2. Pfitzmann, B. (1996). Information hiding terminology. Information Hiding, First International Workshop, Lecture Notes in Computer Science, vol. 1174, ed. R. Anderson. Springer, Berlin, pages 347-350
  3. Encyclopedia of Cryptography and Security, edited by Henk C. A. van Tilborg, ISBN-13: 978-0387-23473-1, 2005, pages 159-164
  4. Wayner, Peter (2009). Disappearing cryptography: Information hiding: Steganography & watermarking, 3rd ed., Elsevier Inc, Burlington, MA 01803, USA