Every so often a university lecturer, or somebody selling the relevant technology, will argue that those in the electronic discovery sector are missing critical evidence from their cases because they are not looking for information that has been hidden using a technique known as steganography. Steganography involves hiding a document inside another file, often a picture or music file.
Is this really a problem? Are major cases being lost on a daily basis due to steganography? Are lawyers culling and reading millions of documents and saying “Wait! I think somebody has hidden an email in a picture, we need to find it and decode it”?
Before “No” is screamed out by all of those in the industry, we should first look at what electronic discovery is.
Electronic discovery is the collection and review of large amounts of documents, anywhere from tens of thousands to tens of millions. The collection is normally conducted by specialist companies, who cull the data through searches and filters and then load it into a review platform for the legal team to read. The legal team then review the files en masse, identifying the ones of relevance to the case. The scale of these reviews is massive, and dwarfs anything in a computer forensics case.
The aim of these investigations is not to find a single child abuse image, or deleted evidence of an individual being involved in terrorism. Rather, it is the review of a company's documents to find out how the company was behaving and what decisions were being made. Rarely is there a “smoking gun”. In fact, when law firms have finished their review, thousands of documents of relevance are often found, possibly tens of thousands.
The company being investigated may have disk encryption and/or email encryption in use, but the e-discovery company will know this because they will be working with the company's IT department, and so can get around both encryption problems.
The idea that boards of directors, whose job it is to run multinational companies, are routinely sending company documents hidden in a picture is quite ridiculous. This is especially so when you consider that these documents are also saved, unencrypted, on the company backup server, and that staff are expected to work at home, on a train, and most commonly on their BlackBerry. Using steganography on a day-to-day basis would not make for an easy working environment.
Adding further weight to the argument that encryption of this nature is not relevant are the very nature and time scale of these investigations. Many e-discovery projects relate to a deal that has gone sour, a breach of contract, or a complaint about actions that took place 3 to 5 years ago. Those involved in the investigation almost certainly did not expect their company to be sued 5 years later (even if they are still with the company), so why would they use such super-secret encryption in their day-to-day work? In fact, in many cases those involved in the investigation are the ones providing documents, and the legal team are dependent on them to hand over information and help the review.
Even if a company did decide to use steganography internally, how would they manage the documents? Imagine the scene in the office on a daily basis:
John: Hey Dave, I am looking for the spreadsheet of the accounts for Q1 of 2006, do you know where it is?
Dave: Is it in the picture called “Beach 2007”?
John: Nope, that's the sales forecast for 2008.
Dave: What about the MP3 “Oops I did it again”, by Ms Spears?
John: Nope, that’s our Sarbanes-Oxley policy.
John: Oh wait, I have found it, it's in the picture “Fishing Trip.gif”, NOT “fishtrip.jpg”. Any idea what the password was?
Dave: Nope, sorry, no idea.
Even if the time scale of the investigation, the nature of the investigation and internal document management are not enough to persuade the determined investigator not to look for steganography, there are three more issues to consider: keywords, cost and proportionality.
Keywords: Huge culls of the data are made right at the beginning of a case using keyword filtering, which may be court approved or sanctioned, and which removes huge volumes of data straight away. As a law firm could well have a data set of 100 million documents, it is simply not possible to review all of the data, so keywords have to be used. While they have their limitations, they have been used on cases around the world for many years. So, if a person wants to hide their document from a legal review team, they could just misspell all the keywords.
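To make the keyword point concrete, here is a minimal sketch of a keyword cull in Python. It is an illustration only, not how any particular review platform works: the function name `cull_by_keywords`, the file names, the sample text and the keywords are all made up for this example, and a real cull would also deal with stemming, attachments, metadata and so on.

```python
import re

def cull_by_keywords(documents, keywords):
    """Keep only documents containing at least one keyword.

    Hypothetical helper for illustration: matching is case-insensitive
    and on whole words only.
    """
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    return {name: text for name, text in documents.items() if pattern.search(text)}

# Made-up sample data: three tiny "documents".
documents = {
    "email_001.txt": "Please review the contract before the merger closes.",
    "email_002.txt": "Lunch on Friday?",
    "email_003.txt": "Please review the contrakt before the mergr closes.",
}
keywords = ["contract", "merger"]

kept = cull_by_keywords(documents, keywords)
print(sorted(kept))  # ['email_001.txt']
```

The third email says the same thing as the first, but with the keywords misspelled, and it is culled along with the lunch invitation. No steganography required.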
Costs: Somebody has to pay for the investigation and, unlike criminal investigations, the review is being conducted for profit. There is rarely a greater good; it's about money. Is the client willing to pay to try to find data hidden with steganography, which is not backed up or saved elsewhere in an unencrypted version, knowing that once the data is found it's possible that the encryption cannot be defeated, and knowing that the probability of even finding anything is slim? That’s a tough sell to the client.
Proportionality: Electronic discovery is not about doing anything to win; it's about proportionality. If there are 5 years of backup tapes, with a non-de-duplicated data set of 200 million documents, relating to a breach of contract in 2002, and the company defending the case is assisting with the location of documents, is it proportionate to search every file for one extra document that common sense suggests is not likely to exist? Especially when there is enough evidence anyway.
Steganography is not, on a regular basis, an issue for electronic discovery. It may be relevant to computer forensics investigations, the very detailed analysis of an individual's computer, but even that would be rare.
Some of the reasons that steganography is not relevant to e-discovery are highlighted below:
- Documents being reviewed are often provided voluntarily by the company being investigated. Why provide a hidden document?
- Often documents being reviewed are old, and there was no intent to commit a malicious act at the time. If there was, why write it down?
- Even if there was intent to commit a crime 5 years ago, document management would prohibit using steganography (it would be easier just to delete it).
- The review teams are rarely looking for a single document; rather, they are reviewing how a company behaved during a given time period.
- Tens or hundreds of millions of documents are collected, therefore looking for a single document is unlikely to be relevant or effective
- E-discovery is expensive enough. Who wants to pay extra to try to find a hidden document in data that somebody has voluntarily handed over?
- Huge culls of data are often conducted early on in a review process, commonly using keywords. It would be easier to misspell all the keywords than to bother with steganography (not that this is a realistic problem either).
Put another way: when a teenager is arrested for stealing from a shop, is the entire shop closed off and fingerprinted, the teenager checked for gunpowder residue, door-to-door enquiries made over a 2 mile radius, and a fingertip search conducted by specialist search teams in a 0.5 mile radius? No. The store owner gives a statement saying the teenager did it, CCTV may be collected, and that’s it.
If there is a terrorist bomb, it's the other extreme. Every investigation is about finding the right balance, and often the best value for money.