Item 15: Avoid Striding and Slicing in a Single Expression

Notes

Python slices can be extended with a stride
- i.e. a_sequence[start:stop:stride]
- stride lets you specify \(n\) such that every \(n\)-th item is taken
- For example slicing even and odd indices in a list

x = ["red", "orange", "yellow", "green", "blue", "purple"]
odds = x[::2] # First, third, fifth
evens = x[1::2] # Second, fourth, sixth

print(odds)
print(evens)

['red', 'yellow', 'blue']
['orange', 'green', 'purple']

The stride syntax can cause unexpected behaviour
e.g. to reverse a string one normally slices with a stride of -1

x = b"Mongoose"
y = x[::-1]
print(y)

b'esoognoM'

Also works for unicode strings (See Item 10)
- But not when encoded as a UTF-8 byte string
- The individual bytes are reversed, no longer valid utf-8, so the decoding fails

print("Reversing a unicode string")
x = "☘️🪉"
y = x[::-1]
print(y)

print("Reversing the encoded UTF-8 representation")
x = "☘️🪉"
w = x.encode("utf-8")
y = w[::-1]
z = y.decode("utf-8")
print(z)

Reversing a unicode string
🪉️☘
Reversing the encoded UTF-8 representation

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
Cell In[3], line 10
      8 w = x.encode("utf-8")
      9 y = w[::-1]
---> 10 z = y.decode("utf-8")
     11 print(z)

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

Consider the following other strides
1. Here [::2] means select every 2nd element starting at the beginning
2. What does [::-2] mean?
  - Take every second element, starting at the end, moving backwards
3. What does [2::2] mean?
  - Every second item, starting at the third
4. What does [-2::-2] mean?
  - Select every second item, starting two from the end and moving backwards
5. What does [-2:2:-2] mean?
  - Select every second item, starting two from the end, moving backwards to the third index
6. What does [2:2:-2] mean?
  - Select every second element from the third element to the third element, moving backwards
  - So select nothing because [2:2] is empty

x = ["a", "b", "c", "d", "e", "f", "g", "h"]


print("x[::2]:", x[::2])
print("x[::-2]:", x[::-2])
print("x[2::2]:", x[2::2])
print("x[-2::-2]:", x[-2::-2])
print("x[-2:2:-2]:", x[-2:2:-2])
print("x[2:2:-2]:", x[2:2:-2])

x[::2]: ['a', 'c', 'e', 'g']
x[::-2]: ['h', 'f', 'd', 'b']
x[2::2]: ['c', 'e', 'g']
x[-2::-2]: ['g', 'e', 'c', 'a']
x[-2:2:-2]: ['g', 'e']
x[2:2:-2]: []

Combining strides and slices thus creates very dense and hard to parse expressions
- Especially when the stride is negative
Consider splitting into two steps
1. Stride first
2. Then slice

x = ["a", "b", "c", "d", "e", "f", "g", "h"]
y = x[::2] # take every second item
y = y[1:-1] # take every element from the first, to one from the end
print(y)

['c', 'e']

Striding + slicing results in an additional shallow copy
- Hence why we stride first -> reduces the memory footprint of the intermediate copy
- If this is still too memory intensive consider the itertools module
  - Provides islice which is a cleaner interface

Things to Remember

Specifying start, end and stride in one expression can result in overly dense expressions
If striding try to only use positive strides and mixed start or end indices
- Negative strides should be avoided due to unclear behaviour
If you need to start, end and stride consider splitting it into two operations
- Alternatively use islice from itertools