Increasing Performance when Assigning Subsets of Arrays
When working with multi-dimensional data, as is the case with images, it is sometimes necessary to replace a chunk of data with another chunk of the same size. With images, this could be a spatial area that needs to be replaced because of clouds, or a spectral band that needs to be replaced.
In the case of a spatial replacement, say we want to replace a section of data with an array of the same size. For this example, the subset is 50 columns by 50 rows, and the larger array is 250 columns by 250 rows.
IDL> array = bytarr(250, 250) + 127B
IDL> sub = indgen(50, 50)
To replace a section of data in IDL, one might go about it this way:
IDL> array[50:99, 100:149] = sub
This is an acceptable method, but it is not the fastest way of doing this. With the replacement done above, IDL will replace each individual element one at a time. To replace the entire subset at once, IDL only requires the first index where the subset needs to be inserted:
IDL> array[50, 100] = sub
IDL will fill in data until the entirety of the subset is in the larger array. Be cautious, though: if the subset runs out of room, IDL will either give unexpected results or throw an out-of-bounds error. Here is an image of the array we just manipulated:
IDL> i = image(array)
An Array Within an Array
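The caution about running out of room can be checked before the assignment is made. The lines below are a minimal sketch, not part of the original example: the names x0, y0, adims, sdims, and fits are hypothetical, and the check simply compares the start index plus the subset dimensions against the dimensions of the larger array.

```idl
; Sketch: confirm the subset fits before using the single-index form.
array = bytarr(250, 250) + 127B
sub = indgen(50, 50)

; Hypothetical start index for the insertion (first and second dimension).
x0 = 50
y0 = 100

; Dimensions of the larger array and of the subset.
adims = size(array, /dimensions)
sdims = size(sub, /dimensions)

; The subset fits only if it ends at or before the edge in each dimension.
fits = (x0 + sdims[0] le adims[0]) && (y0 + sdims[1] le adims[1])
if fits then array[x0, y0] = sub else print, 'Subset does not fit at the requested start index.'
```

Each statement is a single line, so the sketch can be typed at the IDL prompt or saved and run as a batch file.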
The advantage, though, is that this is much faster than explicitly calling out the array elements to replace or using a wildcard (*) character. Let's take a look at wildcard characters in the next example: replacing a band.
If we wanted to take an image, and decrease the amount of red in that image, we could do that by multiplying the red band by a number smaller than one. First, let's get some image data:
IDL> file = file_which('rose.jpg')
IDL> read_jpeg, file, data
IDL> help, data
DATA BYTE = Array[3, 227, 149]
In order to get the red band, we will have to use wildcards. However, this will not be a performance issue, as we are assigning a chunk of data to a variable. Nothing is being replaced. As a good rule of thumb, it is best to avoid using wildcards on the left-hand side of an assignment when possible, but wildcards on the right-hand side of an assignment are okay.
IDL> red = data[0,*,*]
Now let's replace the red band with the original multiplied by 0.6. We could use wildcards again to do this:
IDL> data[0,*,*] = byte(red * 0.6)
But we now know that this is not the most efficient way to assign this band in IDL. IDL only needs the first index of the data that we want to replace:
IDL> data[0,0,0] = byte(red * 0.6)
Because the red band is still the same size, the entire band will be replaced all at once instead of one element at a time. For these small arrays, the speed increase is not huge, but with larger datasets, it becomes important to use this practice to achieve high levels of performance.
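As a quick sanity check, the two forms of the band assignment can be compared directly. This is a minimal sketch under stated assumptions: the copies a and b are hypothetical names introduced here, and ARRAY_EQUAL is used only to confirm that both assignments leave the same data behind.

```idl
; Sketch: confirm the wildcard and single-index forms give the same result.
file = file_which('rose.jpg')
read_jpeg, file, data

; Wildcards keep the leading dimension of size 1, so red is [1, 227, 149].
red = data[0, *, *]
help, red

; Work on two hypothetical copies so the original data stays untouched.
a = data
b = data
a[0, *, *] = byte(red * 0.6)   ; wildcard form
b[0, 0, 0] = byte(red * 0.6)   ; single-index form

; Should report that the arrays match if the two forms agree.
print, array_equal(a, b)
```

The single-index form works here because red keeps its leading dimension of 1, so its shape lines up with the first band of data.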
IDL> i = image(data)
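To see how much the two forms differ on a larger dataset, the assignments can be timed. The sketch below is illustrative only: the sizes n and m and the start offset are arbitrary values chosen here, SYSTIME(/SECONDS) is used as a simple wall-clock timer, and the measured times will vary with the machine and IDL version.

```idl
; Timing sketch: range-subscript assignment versus the single-index form.
n = 4000                     ; hypothetical size of the large array
m = 2000                     ; hypothetical size of the subset
start = 100                  ; hypothetical start offset in both dimensions
big = bytarr(n, n)
sub = bytarr(m, m) + 1B

; Time the explicit range-subscript form.
t0 = systime(/seconds)
big[start:start+m-1, start:start+m-1] = sub
t_range = systime(/seconds) - t0

; Time the single-index form.
t0 = systime(/seconds)
big[start, start] = sub
t_single = systime(/seconds) - t0

print, 'Range subscripts (s): ', t_range
print, 'Single index (s):     ', t_single
```

A single run can be noisy, so repeating each assignment several times and averaging gives a steadier comparison.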