Saturday, April 23, 2011

Forensic string searching

Can the principle of "least frequent occurrence" be applied to digital forensic string searches?

Late last night (or painfully early this morning) I published a new post over at the SANS Digital Forensics Blog. The post is called "Least frequently occurring strings?" and attempts to shed some light on that question.

I've used this approach on a couple of recent cases: one real, and one based on The Honeynet Project's forensic challenge image, found here. The data in the post comes from that image.

I knew nothing about the Honeynet challenge case, but in less than half an hour I'd located an IRC bot using the LFO approach to analyzing strings. Of course, the Honeynet case is quite small, so the technique worked well. On larger real-world cases, I expect it will take longer, or may not work at all. Nevertheless, LFO is a concept that other practitioners have been applying for some time now.
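For readers who want a feel for the idea, here is a minimal Python sketch of ranking strings by how rarely they occur. It is not the exact procedure from the SANS post; the function name `least_frequent_strings` and the `limit` parameter are made up for illustration, and the input is assumed to be one extracted string per line.

```python
from collections import Counter


def least_frequent_strings(lines, limit=20):
    """Return up to `limit` (string, count) pairs, rarest first.

    `lines` is any iterable of text lines, e.g. the saved output of a
    string-extraction tool. Blank lines are ignored.
    """
    counts = Counter(line.strip() for line in lines if line.strip())
    # Sort by ascending count; ties are broken alphabetically so the
    # ordering is stable from run to run.
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))[:limit]


if __name__ == "__main__":
    # Hypothetical sample data standing in for extracted strings.
    sample = [
        "GET /index.html",
        "GET /index.html",
        "GET /index.html",
        "PRIVMSG #chan :hello",
        "root",
        "root",
    ]
    for text, count in least_frequent_strings(sample, limit=5):
        print(count, text)
```

The rarest lines float to the top, which is where an oddity like a lone IRC command would be expected to show up in a small image. On a large image the list of singletons can itself be enormous, so some further filtering is likely to be needed.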

There are lots of other goodies in the post, like moving beyond just using strings to extract ASCII and Unicode text from disk images. If you have a decent system and a good dictionary file, you can reduce this data set even further to lines that actually contain English words.
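The dictionary reduction can be sketched roughly as follows. This is an assumption about how such a filter might look, not the method from the SANS post: the helper names, the three-letter minimum word length, and the `min_words` threshold are all hypothetical choices.

```python
import re

# Runs of three or more ASCII letters are treated as candidate words.
# The minimum length is an arbitrary choice to cut down on noise.
WORD_RE = re.compile(r"[A-Za-z]{3,}")


def load_dictionary(path):
    """Load a word list (one word per line) into a lowercase set."""
    with open(path, encoding="utf-8", errors="ignore") as fh:
        return {word.strip().lower() for word in fh if word.strip()}


def lines_with_words(lines, dictionary, min_words=1):
    """Keep only the lines containing at least `min_words` dictionary words."""
    kept = []
    for line in lines:
        tokens = WORD_RE.findall(line)
        hits = sum(1 for token in tokens if token.lower() in dictionary)
        if hits >= min_words:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # A tiny in-memory dictionary standing in for a real word list file.
    words = {"password", "connect", "server"}
    sample = ["xQz9@@!!", "connect to server", "PASSWORD=abc", "zzzz qqqq"]
    for line in lines_with_words(sample, words):
        print(line)
```

Holding the dictionary in a set keeps each lookup cheap, which matters when the input runs to millions of lines. Raising `min_words` trades recall for less noise.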

Check it out. I hope the world finds it useful.
