Author Topic: Why not to use arange - and the alternative! (Read 62473 times)

Anders Blom · « **on:** February 24, 2009, 11:44 »

As useful as it may seem, numpy.arange() is not a reliable function. I recommend that everyone consider using numpy.linspace() instead, whenever possible. From the documentation, it is pretty clear what arange() should deliver:

Code


arange(start,stop,step)

should give an array with values [start, start+step, start+2*step, ...], continuing for as long as start+N*step < stop. "stop" is not part of the interval, the manual says. So, let's try:

Code


arange(0.4, 1.1, 0.1)

Ok, this is what you expect: you get [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0], and as the documentation says the endpoint, 1.1, is not included. But, now try

Code


arange(0.5, 1.1, 0.1)

This time you get [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1] - the end-point is suddenly included!!!
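For anyone who wants to reproduce this, here is a minimal sketch. It assumes the length rule given in the NumPy documentation for arange(), ceil((stop - start)/step); the helper name arange_length is made up for this illustration and is not part of NumPy.

```python
import math
import numpy as np

def arange_length(start, stop, step):
    # Hypothetical helper: the length rule stated in the NumPy docs for arange().
    return math.ceil((stop - start) / step)

a = np.arange(0.4, 1.1, 0.1)
b = np.arange(0.5, 1.1, 0.1)

# Both arrays have 7 elements, but a ends near 1.0 while b ends near 1.1.
print(len(a), a[-1])
print(len(b), b[-1])

# The quotient that decides the length: rounding noise in the second case
# pushes it just above 6, so ceil() yields 7 points instead of 6.
print((1.1 - 0.4) / 0.1, arange_length(0.4, 1.1, 0.1))
print((1.1 - 0.5) / 0.1, arange_length(0.5, 1.1, 0.1))
```

The point is that the number of elements depends on a floating-point division, so a tiny rounding error can add one extra point at the end.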

The problem lies deeply buried in the bit representation of floating-point numbers. Due to the rule used to determine the number of points in the array, the last element in the output array may be larger than "stop". Actually, the manual is honest enough to mention this, but it makes it a bit hard to trust the output from this function!

It should be said that the implementation of arange() has been much debated on this point; see e.g. http://osdir.com/ml/python.numeric.general/2006-02/threads.html, and search for "arange" on that page. However, nothing has come of this, and so I recommend that arange() not be used if it can be avoided.

Fortunately, there is an alternative, which is actually even better (ok, it depends a bit on what you are trying to achieve). The function numpy.linspace() is not widely known but very useful, and above all more predictable! The syntax is very simple (I just paste the result of "help linspace" here, for convenience):

Quote

linspace(start, stop, num=50, endpoint=True, retstep=False)

    Return evenly spaced numbers over a specified interval.

    Returns `num` evenly spaced samples, calculated over the
    interval [`start`, `stop` ].

    The endpoint of the interval can optionally be excluded.

    Parameters
    ----------
    start : {float, int}
        The starting value of the sequence.
    stop : {float, int}
        The end value of the sequence, unless `endpoint` is set to False.
        In that case, the sequence consists of all but the last of ``num + 1``
        evenly spaced samples, so that `stop` is excluded.  Note that the step
        size changes when `endpoint` is False.
    num : int, optional
        Number of samples to generate. Default is 50.
    endpoint : bool, optional
        If True, `stop` is the last sample. Otherwise, it is not included.
        Default is True.
    retstep : bool, optional
        If True, return (`samples`, `step`), where `step` is the spacing
        between samples.

    Returns
    -------
    samples : ndarray
        There are `num` equally spaced samples in the closed interval
        ``[start, stop]`` or the half-open interval ``[start, stop)``
        (depending on whether `endpoint` is True or False).
    step : float (only if `retstep` is True)
        Size of spacing between samples.
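To illustrate the `endpoint` and `retstep` parameters quoted above, here is a small sketch using the same interval as in the arange() example. The variable names are just chosen for this example.

```python
import numpy as np

# Closed interval: both 0.5 and 1.1 are samples, 7 points in total.
closed, step_closed = np.linspace(0.5, 1.1, 7, retstep=True)

# Half-open interval: 1.1 is excluded, so 6 points give the same spacing.
half_open, step_open = np.linspace(0.5, 1.1, 6, endpoint=False, retstep=True)

print(closed, step_closed)
print(half_open, step_open)
```

In both calls the number of points is stated explicitly, so whether the endpoint is included is decided by the `endpoint` argument and not by rounding.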

Use this function whenever you want to create a sequence of real numbers between a start and an end point, with a specified number of points. This is often handier than specifying the point spacing, as arange() requires you to do.
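If you really do think in terms of a step size, one possible approach is to convert the step into an integer point count first and then call linspace(). The sketch below assumes that (stop - start) is meant to be a whole multiple of the step; the helper name grid and its include_stop argument are made up for this example and are not part of NumPy.

```python
import numpy as np

def grid(start, stop, step, include_stop=True):
    """Hypothetical helper: evenly spaced points with an explicit point count."""
    # Round to the nearest integer so floating-point noise in the division
    # cannot change the number of intervals.
    n_intervals = int(round((stop - start) / step))
    if include_stop:
        return np.linspace(start, stop, n_intervals + 1)
    return np.linspace(start, stop, n_intervals, endpoint=False)

with_stop = grid(0.5, 1.1, 0.1)                          # 7 points, last one is 1.1
without_stop = grid(0.5, 1.1, 0.1, include_stop=False)   # 6 points, 1.1 excluded

print(with_stop)
print(without_stop)
```

Here the decision to include the endpoint is made by you, explicitly, rather than by the last bit of a floating-point division.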

QuantumATK Forum
