Plagiarizing quicksort?
Posted by David Zaslavsky

I found an interesting tidbit in one of Princeton’s old academic integrity handbooks that I wanted to share before throwing it out.
They’re trying to demonstrate plagiarism in something other than the usual humanities-paper context. Which is a good goal, I suppose, but the execution leaves something to be desired. Here’s the “original” implementation of quicksort, from Bob Sedgewick’s Algorithms in C, as reproduced in the handbook:
    quicksort (int a[], int l, int r)
    {
        int v, i, j, t;
        if (r > l)
        {
            v = a[r]; i = l-1; j = r;
            for (;;)
            {
                while (a[++i] < v) ;
                while (a[--j] > v);
                if (i >= j) break;
                t = a[i]; a[i] = a[r]; a[r] = t;
            }
            t = a[i]; a[i] = a[r]; a[r] = t
            quicksort (a, l, i-1);
            quicksort (a, i+1, r);
        }
    }
And here’s their plagiarized example:
    #define Swap(A,B) {temp=(A); (A)=(B); (B)=(A); }
    void mysort (const int * data, int x, int y) {
        int temp;
        while (y > x) {
            int pivot = data[y];
            int i = x-1;
            int j = r;
            while (1) {
                while (data [++i] < pivot) { /*do nothing*/ }
                while (data [--j] > pivot) { /*do nothing*/ }
                if (i >= j) break;
                swap (data [i], data [y];
            }
            swap (data [i], data [j];
            mysort (data, x, i-1);
            x = i+1;
        }
    }
Yeah, neither of them can actually be compiled. I left all the errors from the handbook in there for your amusement ;-)
But the point I wanted to make is that, well, it’s quicksort. There’s only so much creativity that can be involved in this algorithm — when you’re programming with basic algorithms, what you’re trying to write mostly determines how you’re going to write it. Anyone who’s independently come up with their own implementation of quicksort can probably attest that it looks pretty similar to the one in any standard reference book. So who’s to say that something like the “plagiarized” code sample above was, in fact, plagiarized?
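For comparison, here is a minimal sketch of what a compilable version of the same scheme might look like. This is my own reconstruction, not the handbook's text and not a verbatim copy of Sedgewick's: the explicit `void` return type, the `j > l` bounds guard (standing in for the sentinel element the textbook version relies on), and swapping `a[i]` with `a[j]` inside the loop are all assumptions about what was intended.

```c
/* Sort a[l..r] in place (bounds inclusive), using a[r] as the pivot.
 * A sketch only: reconstructed from the general shape of the listings
 * above, with the obvious errors repaired. */
void quicksort(int a[], int l, int r)
{
    if (r <= l)
        return;                     /* zero or one element: already sorted */

    int v = a[r];                   /* pivot value */
    int i = l - 1;                  /* scans right, looking for elements >= v */
    int j = r;                      /* scans left, looking for elements <= v */

    for (;;) {
        while (a[++i] < v) { }      /* stops at r at the latest, since a[r] == v */
        while (j > l && a[--j] > v) { }  /* guard assumed; keeps j inside a[l..r] */
        if (i >= j)
            break;                  /* scans have crossed: partition is done */
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    /* Move the pivot into its final position. */
    int t = a[i]; a[i] = a[r]; a[r] = t;

    quicksort(a, l, i - 1);
    quicksort(a, i + 1, r);
}
```

Calling `quicksort(a, 0, n - 1)` sorts an `n`-element array. Notice how little room there is to vary it: pick a pivot, scan from both ends, swap, recurse. Replacing the second recursive call with a loop, as the "plagiarized" listing does, is a standard tail-call trick rather than a telltale disguise.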
This doesn’t mean that it’s impossible to commit plagiarism when writing computer code. For instance, every once in a while there’s a news story about some company that gets caught using GPL-licensed code without complying with the license. But accusations of copyright violation like that are based on large chunks of code implementing fairly distinctive algorithms, or in many cases entire programs. And the people who actually copy the code often make barely any effort to disguise it (maybe hoping nobody will bother to look). So in these high-profile cases, you can be pretty certain that copying is taking place. But I certainly wouldn’t make a judgment like that based on one little 20-line section that implements what may be one of the most common algorithms in the world of programming.