Friday, August 19, 2011

Fuzzy Hashing and E-Discovery

Recent work has made me consider an interesting role fuzzy hashes could play in E-Discovery.

In the last year I've worked a few intellectual property theft cases in which Company A sued Company B, claiming that Company B stole IP from Company A in the form of documents, design drawings, spreadsheets, contracts, etc.

In these cases Company A has requested that Company B turn over all documents that may pertain to Company A or Company A's work product, with specific search terms provided.

Company B argues that they can't comply with the request: they hold documents relating to Company A and Company A's work product as a result of market research done for strategic planning, and turning over all of those documents would damage Company B.

In such cases, if Company A is concerned that Company B has stolen specific documents, maybe a better approach would be to request that Company B run ssdeep or another fuzzy hashing tool against all of their documents and turn over the fuzzy hashes.

Company A can then review the fuzzy hash results from Company B without knowing anything about the documents those hashes came from. They can compare the set of hashes provided by Company B against the set of fuzzy hashes generated from their own documents. ssdeep scores each comparison from 0 (no similarity) to 100 (essentially identical), so Company A can make an argument to the judge to compel Company B to turn over only those documents that match beyond a certain threshold.
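As a rough illustration of that comparison step, here is a minimal Python sketch. It is not a finished tool: the function names, the opaque document IDs, and the default threshold of 80 are all assumptions made up for this example. The scoring function is passed in as `compare`, so the sketch does not depend on any particular library. It relies on one property of ssdeep signatures: the number before the first colon is the block size, and ssdeep only produces a meaningful score for two signatures whose block sizes are equal or differ by a factor of two.

```python
from typing import Callable, Dict, List, Tuple


def block_size(signature: str) -> int:
    """Return the block size, the number before the first ':' in an ssdeep signature."""
    return int(signature.split(":", 1)[0])


def comparable(sig_a: str, sig_b: str) -> bool:
    """True if the block sizes are equal or differ by a factor of two."""
    a, b = block_size(sig_a), block_size(sig_b)
    return a == b or a == 2 * b or b == 2 * a


def find_matches(
    ours: Dict[str, str],
    theirs: Dict[str, str],
    compare: Callable[[str, str], int],
    threshold: int = 80,  # assumed cutoff for illustration only
) -> List[Tuple[str, str, int]]:
    """Return (our_id, their_id, score) for every pair scoring at or above threshold.

    `ours` and `theirs` map a document identifier to its fuzzy hash. Company B's
    identifiers can be opaque (hypothetical IDs such as "B-001"), so nothing about
    the underlying documents is revealed.
    """
    hits = []
    for our_id, our_sig in ours.items():
        for their_id, their_sig in theirs.items():
            if not comparable(our_sig, their_sig):
                continue  # incompatible block sizes cannot score above zero
            score = compare(our_sig, their_sig)
            if score >= threshold:
                hits.append((our_id, their_id, score))
    # Strongest matches first, so the most compelling pairs lead the list.
    hits.sort(key=lambda hit: hit[2], reverse=True)
    return hits
```

With the Python ssdeep bindings installed, `ssdeep.compare` could be passed as the `compare` argument. The resulting list of ID pairs and scores is what Company A would take to the judge; the right threshold is a judgment call that would have to be argued case by case.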

24:DZL3MxMsqTzquAxQ+BP/te7hMHg9iGCTMyzGVmZWImQjXIvTvT/X7FJf8XLVw:J3oy+x/te7qmNmlYvX/xp8W
