Item 14: Know How to Slice Sequences

Notes

  • Python has syntax for slicing sequences

  • This is a method for accessing a subset of sequence elements

  • Slicing is supported out of the box by the built-in sequence types (list, tuple, str, bytes)

  • Slicing can be extended to any class implementing the __getitem__ and __setitem__ dunder methods

  • The basic syntax is a_sequence[start:end]

    • start is inclusive
    • end is exclusive
a = ["a", "b", "c", "d", "e", "f", "g", "h"]

print("Middle two:  ", a[3:5])
print("All but ends: ", a[1:7])
Middle two:   ['d', 'e']
All but ends:  ['b', 'c', 'd', 'e', 'f', 'g']
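The notes above mention that slicing extends to any class implementing __getitem__ and __setitem__. A minimal sketch of what that might look like follows; the Playlist class and its _tracks attribute are hypothetical names invented for illustration, not something from the book.

```python
class Playlist:
    """Hypothetical container used only to illustrate slice support."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __getitem__(self, index):
        # For p[start:end], Python passes a slice object as the index.
        if isinstance(index, slice):
            return Playlist(self._tracks[index])
        return self._tracks[index]

    def __setitem__(self, index, value):
        # The inner list already handles both integer and slice indices.
        self._tracks[index] = value

    def __repr__(self):
        return f"Playlist({self._tracks!r})"


p = Playlist(["a", "b", "c", "d", "e"])
print(p[1:3])    # Playlist(['b', 'c'])
p[1:3] = ["x"]   # slice assignment goes through __setitem__
print(p)         # Playlist(['a', 'x', 'd', 'e'])
```

Returning a new Playlist from a slice is one design choice among several; returning a plain list would also work.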
  • When slicing from the start you can omit the 0
  • When slicing to the end you should omit the final index (the end of the sequence is assumed by default)
a = ["a", "b", "c", "d", "e", "f", "g", "h"]
assert a[:5] == a[0:5]
assert a[5:] == a[5:len(a)]
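The same syntax applies to the other built-in sequence types listed earlier. A short sketch, using made-up sample values, showing that a slice has the same type as the sequence it came from:

```python
word = "slicing"
point = (1, 2, 3, 4)
data = b"\x00\x01\x02\x03"

assert word[:3] == "sli"            # str slice is a str
assert point[1:] == (2, 3, 4)       # tuple slice is a tuple
assert data[1:3] == b"\x01\x02"     # bytes slice is bytes
assert type(point[:2]) is tuple
```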
  • Negative indices let you slice at offsets relative to the end of the sequence
    • A Python reader should be able to understand all of the following slices
a = ["a", "b", "c", "d", "e", "f", "g", "h"]

print(a[:])
print(a[:5])
print(a[:-1])
print(a[4:])
print(a[-3:])
print(a[2:5])
print(a[2:-1])
print(a[-3:-1])
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
['a', 'b', 'c', 'd', 'e']
['a', 'b', 'c', 'd', 'e', 'f', 'g']
['e', 'f', 'g', 'h']
['f', 'g', 'h']
['c', 'd', 'e']
['c', 'd', 'e', 'f', 'g']
['f', 'g']
  • You can pass slice indices that fall beyond the bounds of a sequence
    • Positions that don't exist are silently skipped
  • This lets you enforce a maximum sequence length
    • The example below restricts each sequence to at most five elements
a = ["a", "b", "c", "d", "e", "f", "g", "h"]
b = ["1", "2", "3"]

# slicing forward
print("Slicing forward")
print(a[:5])  # takes indices 0, 1, 2, 3, 4
print(b[:5])  # takes indices 0, 1, 2 (3, 4 don't exist)

print("Slicing backward")
print(a[-5:])
print(b[-5:])
Slicing forward
['a', 'b', 'c', 'd', 'e']
['1', '2', '3']
Slicing backward
['d', 'e', 'f', 'g', 'h']
['1', '2', '3']
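One way to package the maximum-length idea is a small helper. This is only a sketch: first_n is a hypothetical name, and it assumes limit is a non-negative integer (a negative limit would count from the end instead).

```python
def first_n(items, limit):
    """Hypothetical helper: return at most `limit` leading items.

    Assumes limit >= 0. Slicing never raises for an out-of-range
    end index, so short inputs come back unchanged.
    """
    return items[:limit]


assert first_n(["1", "2", "3"], 5) == ["1", "2", "3"]
assert first_n(list("abcdefgh"), 5) == ["a", "b", "c", "d", "e"]
assert first_n("abcdefgh", 5) == "abcde"   # works for str as well
```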
  • Accessing an out-of-range index directly raises an IndexError
a = ["a", "b", "c", "d", "e", "f", "g", "h"]
print(a[20])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[5], line 2
      1 a = ["a", "b", "c", "d", "e", "f", "g", "h"]
----> 2 print(a[20])

IndexError: list index out of range
Note

Indexing by a negated variable can lead to surprising results. For example, the expression a[-n:] works as expected when n > 0 (with n = 3 it is a[-3:]). When n is zero, however, a[-0:] is the same as a[:], which returns a copy of the entire list rather than an empty one.
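The pitfall is easy to reproduce. The sketch below also shows one possible guard; last_n is a hypothetical helper written for this note, not part of the book's text.

```python
a = ["a", "b", "c", "d", "e", "f", "g", "h"]

n = 3
assert a[-n:] == ["f", "g", "h"]   # the last three items, as intended

n = 0
assert a[-n:] == a                 # -0 is 0, so this is a[0:], the whole list


def last_n(items, n):
    """Hypothetical guard: last n items, empty when n is not positive."""
    return items[-n:] if n > 0 else items[:0]


assert last_n(a, 3) == ["f", "g", "h"]
assert last_n(a, 0) == []
```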

  • Slicing a list creates a new list
    • Modifying the new list doesn’t change the old list
    • The new list still refers to the same item objects, though
      • It's a shallow copy, not a deep copy
a = ["a", "b", "c", "d", "e", "f", "g", "h"]

b = a[3:]

print("Before:  ", b)
b[1] = 99
print("After:   ", b)
print("No change:   ", a)
Before:   ['d', 'e', 'f', 'g', 'h']
After:    ['d', 99, 'f', 'g', 'h']
No change:    ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
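The shallow-copy behaviour only becomes visible when the items themselves are mutable. A small sketch with nested lists (sample data chosen for illustration):

```python
a = [["x"], ["y"], ["z"]]
b = a[1:]                  # new outer list, same inner list objects

assert b is not a
assert b[0] is a[1]        # both names reach the very same inner list

b[0].append("changed")     # mutating a shared item is visible through a
assert a[1] == ["y", "changed"]

b[0] = ["new"]             # rebinding a slot in b leaves a untouched
assert a[1] == ["y", "changed"]
```

If fully independent items are needed, copy.deepcopy from the standard library is the usual tool.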
  • Slices can be used in assignments
    • The assigned values replace the specified range
    • The lengths don’t need to match
    • Values before and after the slice are preserved
    • This can result in the list size changing
print("Shrinking a list with a slice")

a = ["a", "b", "c", "d", "e", "f", "g", "h"]
print("Before   ", a)
a[2:7] = [99, 22, 14]
print("After    ", a)

print("Growing a list with a slice")

a = ["a", "b", "c", "d", "e", "f", "g", "h"]
print("Before   ", a)
a[2:3] = [47, 11]
print("After    ", a)
Shrinking a list with a slice
Before    ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
After     ['a', 'b', 99, 22, 14, 'h']
Growing a list with a slice
Before    ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
After     ['a', 'b', 47, 11, 'd', 'e', 'f', 'g', 'h']
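Two edge cases follow from the same rule and may be worth seeing once: assigning to an empty slice inserts without removing anything, and assigning an empty list deletes the range. A sketch with sample values:

```python
a = ["a", "b", "c", "d"]

a[2:2] = ["x", "y"]        # empty range: pure insertion before index 2
assert a == ["a", "b", "x", "y", "c", "d"]

a[1:3] = []                # empty replacement: removes indices 1 and 2
assert a == ["a", "y", "c", "d"]
```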
  • You can copy a whole list using [:]
a = ["a", "b", "c", "d", "e", "f", "g", "h"]

b = a[:]

assert a == b and b is not a
  • Assigning with [:] can be used to replace an entire list
a = ["a", "b", "c", "d", "e", "f", "g", "h"]

print("Before:  ", a)
a[:] = [101, 102, 103]
print("After:   ", a)
Before:   ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
After:    [101, 102, 103]
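Assigning through [:] changes the existing list object, which is different from rebinding the name to a new list. The sketch below uses a second name, alias, purely to make that difference observable:

```python
a = ["a", "b", "c"]
alias = a                  # second name for the same list object

a[:] = [101, 102, 103]     # replaces the contents in place
assert alias is a
assert alias == [101, 102, 103]

a = [1, 2]                 # rebinding: a now names a different list
assert alias is not a
assert alias == [101, 102, 103]
```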

Things to Remember

  • Avoid being verbose when slicing
    • 0 is the implicit start index
    • len(sequence) is the implicit stop index
  • Slicing is forgiving of start and end indices
    • Out-of-bounds indices are silently ignored
    • Can be used to enforce maximum lengths for sequences
  • Assigning to a sequence slice replaces that range in the sequence
    • Even when the lengths differ