Comment made by: markengelberg
I believe that one important measure of a language is the degree to which its internal abstractions, used for its own data structures, are open for extension. For that reason, I have long believed this is an issue worth addressing. The assumption baked into subseq and rsubseq that Sorted collections can only have unique sorted values was surely an oversight in the initial code, one that has not only prevented priority-map from hooking into subseq and rsubseq, but also several clever data structures by Marczyk. I don't think there's something interesting beyond what is in Sorted that we need to try to understand -- the problem is that implementing Sorted's seqFrom interface is not currently sufficient for subseq and rsubseq, as they are currently defined, to provide their specified behavior. You can view this as either a deficiency in Sorted or a deficiency in subseq/rsubseq.
Marco, the latest version of clojure.data.priority-map contains a patched version of subseq and rsubseq that doesn't rely on changes to the underlying Sorted interface to fix its incorrect behavior on sorted collections with duplicate sorted values. Feel free to use that implementation with priority-map or any other similar data structures you might be building.
I can't speak for the Clojure team, but I would think that if someone were to take the initiative to do extensive benchmarking on the subseq and rsubseq in clojure.data.priority-map and establish that there is no performance hit on existing sorted Clojure data structures, then they might be more inclined to move those patched versions into Clojure's core, since there'd be a clear benefit with zero downside. (Or alternatively, if there is a performance hit with respect to Clojure's existing sorted data structures, figure out how to tweak it so there is no performance penalty, which should be easy to do). The needed changes to subseq and rsubseq were simple: where they currently change a <= subsequence into a < subsequence by calling rest to drop the first item, I merely changed them to call drop-while to drop all equal items off the front of the sequence, since there might be more than one. When this issue was first raised, core team members indicated they would rather see this fixed at the protocol level, but if that is no longer feasible, I would like to see this change to subseq and rsubseq considered. It seems like an easy way to fix this longstanding issue that prevents subseq and rsubseq from working with all data structures that fulfill Sorted's seqFrom implementation.