Efficient Python

Although Python is a not a compiled/low-level language, there are certain optimisations that will make it reduce memory usage, become more efficient, and approximate zero-copying found in lower level languages. The best way to do this is to dive right in and see what changes we can make to 'naive' programs. TLDR;

Implementing tail in Python

The Unix tail tool is a program that, given some input, extracts only the last N lines from it, and prints it to the user. A trivial solution would be something like the following:

import sys

def main():
    n = int(sys.argv[1])
    lines = sys.stdin.readlines()
    for line in lines[-n:]:
        print(line, end='')

if __name__ == '__main__':
    main()

The way we'll run our program on Unix systems is:

$ cat head.py | python head.py 10

I think you can find a windows equivalent using pipes and what not. Skipping the deails: the code is quite straightforward, really -

  1. it expects an integer as the second argument
  2. reads all the lines from stdin and stores it in a list
  3. gets the last N lines[1] from the list

Notice that since we only have to display the last N elements, we don't really have to hold all the input in memory at once, we can really just hold the last N elements in memory:

import sys

def main():
    n = int(sys.argv[1])
    q = []
    for line in sys.stdin:3
        q.append(line)
        if len(q) > n:1
            del q[0]2

    for line in q:
        print(line, end='')

if __name__ == '__main__':
    main()

Although a little longer and uglier, the code still does what we set out to do quite elegantly. Let's walk through the important points:

Perhaps the most important change is 2. It reduces our storage requirements from O(size of input) to O(N+1). However this will not be possible without 3. When we use the generator version, what the runtime does for us is this:

  1. Create a new generator object. On a high level this is just an object that knows how to return the next element, and can signal for end by raising exceptions.[2]
  2. Convert our for-loop into something roughly like the following, but in lower level code:
    while True:
      try:
          next_item = get_next_value()
      except StopIteration:
          break
      # do things here
  3. This is more efficient than our previous version of reading all the values at once, primarily because this version uses less resources- we (ideally) only need as much resources to generate the current value.[3]

There's an article that explains generators much better than I can, and goes into more detail as well.

A little bit faster now

Aren't pops/deletes from the 'head' of a list way more expensive than deletes from the end? Yes[4]. Is there another datastructure that can support fast pops and pushes from both ends? Yes.

Solution: collections.deque. An often overlooked datastructure found in Python's standard library. Also it supports more arguments than meets the average, non stdlib-stalking eye: maxlen.

from collections import deque
import sys

def main():
    n = int(sys.argv[1])
    for line in deque(sys.stdin, maxlen=n):1
        print(line, end='')

if __name__ == '__main__':
    main()

What 1 maxlen does is that it restricts the size of the deque to at most N elements. It does it by removing the front/rear elements to make way for elements added to the rear/front.

Tips