Richard Bucker

Programmer DNA in your code - In Plain Sight

Posted at — Mar 21, 2012

In a recent “In Plain Sight” episode the writers tried to suggest that the bad guys, given enough CPU cycles, could identify and geographically locate an individual programmer based on his code’s DNA.

Seriously?

I implemented some algorithms as an undergrad that could compare two documents and determine the likelihood of plagiarism. But that case involved (a) the English language and (b) comparing two known documents.

I would like to think that I’m not the only one who realizes that the chance of matching two snippets, or even complete source trees, to a single programmer is about as remote as the chance of proving program “correctness”. (When I was an undergrad that proof could not be completed.)

First of all, the number of syntax permutations is infinite. The set of problems they solve is equally large. Variable name substitution does not count, and with applications like gofmt and IDEs that reformat and inject comments… one is more likely to identify the IDE and not the programmer.
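To make the contrast concrete, here is a rough sketch of the kind of two-document comparison described above, plus a tiny normalizer showing why renamed variables and reformatting carry so little signal. This is not the undergrad code, just one common approach (word shingles scored with Jaccard similarity). The names `shingles`, `jaccard`, `normalize` and the `KEYWORDS` set are made up for illustration, and the tokenizer is deliberately naive.

```python
import re

# Hypothetical, deliberately tiny keyword list; a real tool would use the
# target language's full grammar rather than a regex.
KEYWORDS = {"for", "in", "if", "else", "while", "return", "def"}


def shingles(text, k=3):
    """Return the set of overlapping k-word windows found in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    """Score two known documents from 0.0 (nothing shared) to 1.0 (same shingles)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def normalize(code):
    """Rename identifiers to id0, id1, ... and collapse all whitespace.

    After this step, variable names and layout no longer distinguish
    two snippets, which is roughly what a formatter plus a rename does.
    """
    names = {}
    out = []
    for tok in re.findall(r"[A-Za-z_]\w*|\d+|\S", code):
        if re.match(r"[A-Za-z_]", tok) and tok not in KEYWORDS:
            tok = names.setdefault(tok, "id%d" % len(names))
        out.append(tok)
    return " ".join(out)


if __name__ == "__main__":
    print(jaccard("the quick brown fox jumps", "the quick brown fox sleeps"))
    first = "total = 0\nfor x in items:\n    total += x"
    second = "s=0\nfor item in xs: s+=item"
    print(normalize(first) == normalize(second))
```

The first part only works because both documents are in hand and written in a natural language with a lot of redundancy. The second part shows two loops by "different" authors collapsing to the same token stream once names and whitespace are stripped, which leaves very little to attribute to a person.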