Researching Searching Algorithms? 62
NiN3x asks: "I have recently written a sorting algorithm that can be close to four times as fast as quicksort and never much slower. I was wondering if there are lots of sorting algorithms like this out there. I don't think I could be the only one that has thought of something like this. I am only in my second year of computer science, so I don't know a lot about these things. I have tried searching the net and can't find anything. My algorithm is NOT an inplace sorting algorithm. Can anyone point me to some sources for this type of thing?"
Publish it? (Score:5, Interesting)
Is there some reason why you cannot publish it here for a review? A basic C reference cannot be very many lines? This way people could give real input for you. I don't believe you have much to lose. If you do, attach your favorite license to it :)
Check out Perl's source code for more info. (Score:3, Interesting)
If your algorithm is general-purpose and happens to be faster, why not GPL/LGPL/BSD it and contribute to Perl's, Python's, and PHP's sorting and searching code?
Re:Funny that this topic came up... (Score:5, Interesting)
My guess, is that it really isn't. There are many algorithms that are faster than quicksort for smaller inputs. For less than 100 elements, insertion sort wins (hands down). For less than "really more than you usually need to sort, but not really extreme values", shell-sort usually wins.
Also, there are lots of algorithms that are faster than O(nlogn) when you have more information than that given to you by <:. One example would be bucket-sort.
Personally, I couldn't care less. The chances of a second-year student inventing a better sorting algorithm that isn't already published somewhere is nada. That doesn't mean that he is not smart, and that his algorithm is stupid. It *is* probably better than quicksort for the test-cases he has tested it on (and maybe for many more). What it means is that the field of sorting is so well-studied that it is very unlikely to get real contributions from somebody without at least two PhD's in mathematics these days.
Re:Apples and Oranges? (Score:4, Interesting)
Yeah, the sorting is O(N) but reading the result is O(N^2) and without knowing the result the sorting itself doesn't do much good. That's because you need N^2 of memory and in worst case you have to do linear scan through the whole memory. If you think about sorting it should be pretty clear that O(n log n) is best you can get if you have no information about the data and it's uniformly distributed. IIRC worst case for heap sort is O(n log n) and worst case for quicksort is O(n^2) so if you don't need in-place sorting you should use for example heap sorting. If you don't have many items to sort just do insertion sort because it has least overhead.
Well, post it. (Score:4, Interesting)
Post your algorithm, then!
There are many implementations of quicksort-like sorts, but when done well, quicksort is pretty damn fast. For instance, if you're measuring against a version that doesn't special case arrays of size 3, 4, and maybe even bigger, then it will run many times slower than a tuned version. You won't be able to beat quicksort (with proper pivot-picking) asymptotically, so there won't be any ground-breaking result here, but it's probably possible to beat its constant factors (which may be practically useful)... and there are probably many people here who'd be willing to look at the algorithm if you posted it.
Re:Funny that this topic came up... (Score:3, Interesting)
The interesting thing about O(nlogn) versus O(n) or O(whatever-you-want) is that an "operation" is not often the same between algorithms.
So an O(nlogn) implementation is still faster than an O(n) implementation when the input set is less than 1,000,000 items and the latter implementation requires 6 times the time per operation compares to the former.
So while it is unlikely that someone without a vast theoretical background will discover a better algorithm for all cases (or even the extreme cases), as you point out; there is a significantly greater liklihood that he could have discovered an algorithm which provides improvement for data sets which are commonly used in certain - even many - fields. Without having multiple doctorates.
Sorting in general is a well-studied field; but as the application of computers grows, the need for less general sorting grows as well. Many data structures and algorithms are not considered "sorted" because they are partitioned, or implicitly ordered, yet they make use of sorting theory.