Python group numbers into ranges
I referred to "Unsupervised clustering with unknown number of clusters", as suggested by @schwobaseggl, and changed the code a bit to fit my needs. Here is the new code:
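As an illustration only (not the code referred to above), here is a minimal sketch of threshold-based grouping. It assumes a simple gap rule: after sorting, a new group starts whenever the distance to the previous value exceeds `thres`. The function name `cluster_by_gap` and the sample data are hypothetical.

```python
def cluster_by_gap(values, thres):
    """Group values; start a new group when the gap to the previous
    sorted value exceeds thres (an assumed rule, for illustration)."""
    ordered = sorted(values)
    if not ordered:
        return []
    groups = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > thres:
            groups.append([cur])      # gap too large: open a new group
        else:
            groups[-1].append(cur)    # close enough: extend the current group
    return groups


data = [1, 2, 3, 20, 21, 40, 41, 42, 100]   # hypothetical sample data
thres = 0.11 * (max(data) - min(data))      # threshold as a fraction of the range
ranges = [(g[0], g[-1]) for g in cluster_by_gap(data, thres)]
print(ranges)
```

With this sample data the threshold is 10.89, so the result is `[(1, 3), (20, 21), (40, 42), (100, 100)]`.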
I have chosen the threshold value (thres) as 11% of the total range of the data, so the output for this dataset is:

Your use of the dictionary seems to be a way to allow the numbers to arrive out-of-order. Rather than sort them, you seem to be trying to use a dictionary (and associated hashing) to maximize efficiency. This doesn't really work out perfectly, since you wind up doing sequential searches for a given value.

Hashing a low:high range (a dictionary key:value pair) to avoid a search doesn't help much. Only the key gets hashed. It does help in the case where you're extending a range at the low end. But for extending a range at the high end, you have to resort to searches of the dictionary values.

What you're really creating is a collection of "partitions". Each partition is bounded by a low and high value. Rather than a dictionary, you can also use a trivial tuple of (low, high). To be most Pythonic, the (low, high) pair includes the low, but does not include the high. It's a "half-open interval".

Here's a version using a simple ... A binary search (using the ...

You start with an empty set of partitions; the first number creates a trivial partition of just that number. Each next number can lead to one of three things.
It's something like this.
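The partition idea can be sketched roughly as follows. This is a minimal sketch, assuming integer inputs and a plain list that is scanned linearly; the names `add_number`, `coalesce`, and `partition` are hypothetical. The final merge step is an added assumption, there to handle a number that bridges two existing partitions.

```python
def add_number(partitions, n):
    """Insert integer n into a list of half-open (low, high) partitions, in place."""
    for i, (low, high) in enumerate(partitions):
        if low <= n < high:
            return                        # already covered, nothing to do
        if n == high:
            partitions[i] = (low, n + 1)  # extend at the high end
            return
        if n + 1 == low:
            partitions[i] = (n, high)     # extend at the low end
            return
    partitions.append((n, n + 1))         # new trivial partition


def coalesce(partitions):
    """Sort partitions and merge any that touch or overlap."""
    merged = []
    for low, high in sorted(partitions):
        if merged and merged[-1][1] >= low:
            merged[-1] = (merged[-1][0], max(merged[-1][1], high))
        else:
            merged.append((low, high))
    return merged


def partition(numbers):
    """Build half-open partitions from numbers arriving in any order."""
    partitions = []
    for n in numbers:
        add_number(partitions, n)
    return coalesce(partitions)


print(partition([5, 1, 3, 2, 9, 4, 10]))
```

For the input shown, this prints `[(1, 6), (9, 11)]`: the half-open pair `(1, 6)` covers 1 through 5, and `(9, 11)` covers 9 and 10.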
This is a pretty radical rethinking of your algorithm, so it's not a proper code review.

Assembling a string with separators is simpler in Python. For your example of ","-separated strings, always think of doing it this way.
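A minimal sketch of the idiom in question, assuming it refers to `str.join`; the sample list is hypothetical. `join` places the separator only between items, so there is no trailing comma to strip afterwards.

```python
numbers = [1, 2, 3, 5]  # hypothetical sample values

# join() needs strings, so convert each item first.
text = ",".join(str(n) for n in numbers)
print(text)
```

This prints `1,2,3,5`.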
Once you have that, you're simply making a sequence of details. In this case, each detail is either "x-y" or "x" as the two forms that a range can take. So it has to be something like
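As a sketch of the two forms a detail can take, assuming each range is an inclusive `(low, high)` pair and a single number is stored as `(x, x)`; the helper name `format_detail` and the sample ranges are hypothetical.

```python
def format_detail(low, high):
    """Return 'x-y' for a real range, or just 'x' when low == high."""
    return f"{low}-{high}" if low != high else str(low)


ranges = [(1, 3), (5, 5), (7, 9)]  # hypothetical inclusive ranges
details = [format_detail(low, high) for low, high in ranges]
print(details)
```

This prints `['1-3', '5', '7-9']`.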
In this example, I've shown the format as an in-line ... Given your ranges, the overall function is this.
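One way the overall function might look, as a sketch: it assumes the ranges are inclusive `(low, high)` pairs and puts the per-range formatting in line, inside the `join` call. The name `format_ranges` is hypothetical.

```python
def format_ranges(ranges):
    """Render inclusive (low, high) pairs as a string such as '1-3,5,7-9'."""
    return ",".join(
        f"{low}-{high}" if low != high else str(low)
        for low, high in ranges
    )


print(format_ranges([(1, 3), (5, 5), (7, 9)]))
```

This prints `1-3,5,7-9`. An empty list of ranges gives an empty string.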