Python methods are not functions


When I first looked more closely at Python, I was pleasantly surprised that its semantics seemed significantly simpler than they had appeared at first: everything's a dictionary, symbol resolution is a matter of walking up through scopes, and functions are first-class objects. It's practically a Scheme!

Not quite. Now I keep finding layers of special cases, and the sheen is wearing off somewhat. Today's surprise is that methods are significantly different from functions.

Python methods aren't functions

Should that be a surprise? Perhaps not. In languages like Java and C++, a method is fairly obviously related to, but distinct from, a function. It's very easy to get the impression that in Python, they're just the same thing, but with a calling convention.

Picking examples from the top of the Google-found list of tutorials, we can learn that ‘a function inside a class is known as a method’, or that ‘method is just a special term for functions that are part of a class’ (I'm not knocking these tutorials, by the way, but pointing out that it's very easy to pick up this wrong impression).

Even the Python.org tutorial says ‘By definition, all attributes of a class that are function objects define corresponding methods of its instances’, and the Python language specification, in its glossary, describes ‘method’ as ‘A function which is defined inside a class body. If called as an attribute of an instance of that class, the method will get the instance object as its first argument (which is usually called self).’

It appears from that, that a method is a function is a method is a function, and that what makes the function a method is the way it's called:

class C:
    def f(self, msg):
        print('msg={}'.format(msg))

c = C()
c.f('hello')

The explanations above seem to suggest (at least to me) that it's the calling convention that's important. Thus, in c.f('hello'), the c provides the scope which identifies which f to call, and additionally – as a piece of syntactic sugar – indicates that the function f should be called with arguments (c, 'hello').

Well... no

Well, yes, sort of, but this comes unstuck when you (OK, when I) try to use a method ‘function’ as just a function.

Say we define a class and function as follows:

class C:
    def f(msg):
        print('msg={}'.format(msg))

c = C()
print_msg = c.f  # look up the function

print_msg('hello')

That doesn't work: it fails with TypeError: f() takes 1 positional argument but 2 were given. That is, although in the call of print_msg there was no instance object – this is not ‘called as an attribute of an instance of that class’ – there was still a self object passed to the function f, preceding the msg argument 'hello'.
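A small sketch makes the contrast concrete. It assumes Python 3, and it returns a string rather than printing so that the results can be checked. The names via_class and error are mine, purely for illustration. Looked up on the class, f behaves as the plain one-argument function it was written as; looked up on the instance, the extra argument appears.

```python
class C:
    def f(msg):  # deliberately no 'self', as in the example above
        return 'msg={}'.format(msg)

c = C()

# Looked up on the class, f is the plain one-argument function (Python 3).
via_class = C.f('hello')

# Looked up on the instance, the instance is passed as an extra first argument,
# so the one-parameter function is called with two arguments and fails.
try:
    c.f('hello')
    error = None
except TypeError as exc:
    error = str(exc)

print(via_class)
print(error)
```

The exact wording of the error varies a little between Python versions (newer ones name the function as C.f()), but the complaint about one positional argument versus two is the same.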

Can we generate a single-argument function in any way?

What got me started on this huge parenthesis was the desire to pass a handler function generated within one instance, to be called by another, in such a way that the handler function can still access the original instance's internal state (ie, this is the same as being a callback, in another context). Can we do that?

One way is to prefix the function definition with @staticmethod:

class C:
    @staticmethod
    def fs(msg):
        print(msg)

c = C()
print_msg = c.fs

print_msg('hello')

That makes the function fs a static method, with only a single argument, as expected. But that means that the function fs has no access to the internal attributes of a particular instance. The alternative is to have an ordinary method build and return the handler:

class C:
    extra = None

    def __init__(self, extra):
        self.extra = extra

    def get_handler(self):
        def h(msg):
            print('h: msg={}  extra={}'.format(msg, self.extra))
        return h

c = C('wibble')
print_msg = c.get_handler()  # look up the function

print_msg('hello')

Here, the get_handler function turns into an instance method object, but when called it returns an ordinary function which isn't mutated into anything, but which does have access to the instance's internal data (ie, it's a closure).
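For what it's worth, the same callback can probably be had without the closure, precisely because of the behaviour that caused the trouble: a method looked up on an instance carries that instance with it. The following is only a sketch under that assumption. The Caller class and the handle method are hypothetical names of mine, standing in for ‘another instance’ that calls the handler.

```python
class C:
    def __init__(self, extra):
        self.extra = extra

    def handle(self, msg):
        # An ordinary method: 'self' gives access to the instance's state.
        return 'handle: msg={} extra={}'.format(msg, self.extra)


class Caller:
    """Hypothetical second class that only knows it has a one-argument handler."""

    def __init__(self, handler):
        self.handler = handler

    def fire(self, msg):
        return self.handler(msg)


c = C('wibble')
caller = Caller(c.handle)      # c.handle already has c attached to it
result = caller.fire('hello')  # called with one argument only
print(result)
```

Which of the two is preferable is a matter of taste; the closure makes the capture explicit, while the looked-up method relies on the lookup doing the capturing.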

So is a method a function or not?

This had me tearing my hair out. The explanation is in the Callable types part of the section on the Standard type hierarchy in the Python language specification. The resolution of the puzzle is that the f attribute stored in the class C is ‘just’ a function, but what you get back when you resolve c.f is not that function:

User-defined method objects may be created when getting an attribute of a class (perhaps via an instance of that class), if that attribute is a user-defined function object or a class method object.

When an instance method object is created by retrieving a user-defined function object from a class via one of its instances, ...

Instance method objects

So the f stored in the class is a function, but what's assigned to print_msg, in the example above, is a new instance method object which wraps that function together with the instance c, and which is treated as a distinct case by the definition of the semantics of callable objects (see the Callable types section of the language definition, as mentioned above). That is, it is the attribute lookup that creates the wrapper, and the call semantics of the wrapper that perform the magic of prepending the self object to the front of the function's list of arguments. The description of methods as just functions is a little white lie.
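This can be checked from the interpreter. The sketch below uses the standard types module, along with the __func__ and __self__ attributes of method objects; the variable names are mine. The last check (that each lookup produces a fresh object) is what I'd expect from CPython rather than something I'd claim the specification promises.

```python
import types

class C:
    def f(self, msg):
        return 'msg={}'.format(msg)

c = C()

stored = C.__dict__['f']   # what is actually stored in the class
looked_up = c.f            # what attribute lookup on the instance hands back

stored_is_function = isinstance(stored, types.FunctionType)
looked_up_is_method = isinstance(looked_up, types.MethodType)

# The method object wraps the original function and remembers the instance.
wraps_function = looked_up.__func__ is stored
remembers_instance = looked_up.__self__ is c

# Two lookups give equal but (in CPython) distinct method objects.
first, second = c.f, c.f
equal_methods = first == second
same_object = first is second

print(stored_is_function, looked_up_is_method)
print(wraps_function, remembers_instance)
print(equal_methods, same_object)
```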

It's not about the def

The magic isn't happening in the def, by the way, since if you define a class function through a lambda:

class C:
    fl = lambda msg: print('fl: msg={}'.format(msg))

c = C()
print_msg = c.fl
print_msg('hello')

you get exactly the same error.
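Conversely, and as a sketch only, giving the lambda a leading parameter to receive the instance makes it behave like any def-defined method, which suggests that the class lookup neither knows nor cares how the function object was created. The name lambda_type is mine.

```python
class C:
    # Same lambda as above, but with a first parameter to receive the instance.
    fl = lambda self, msg: 'fl: msg={}'.format(msg)

c = C()
print_msg = c.fl
result = print_msg('hello')

# A lambda and a def both produce objects of the same function type.
def plain(msg):
    return msg

lambda_type = type(C.__dict__['fl'])
same_type = lambda_type is type(plain)

print(result)
print(same_type)
```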

Does this matter?

Not really; not deeply. The semantics of method calls are pretty similar to those of function calls, and certainly close enough for any introductory text. Also, when you re-read the python.org tutorial, where it says ‘[b]y definition, all attributes of a class that are function objects define corresponding methods of its instances’, you realise that it does actually give a precisely correct description, but carefully phrased so that the method/function distinction doesn't distract a novice reader.

But the ugly thing, for me, is that in the statement print_msg = c.f, there's an awful lot happening behind that equals sign. It is emphatically not just a lookup and assignment, since the thing that ends up on the left has a different type from the thing stored in the class. Also, the conversion is triggered by a magic special case (retrieval of a class attribute when that attribute is a user-defined function), the explanation for which is pretty far down in the detailed description of callable objects; it's easy to miss, and I suspect you have to be a pretty seasoned Python user to actually be aware of the difference.
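As far as I can tell, the mechanism behind this special case is the descriptor protocol: function objects have a __get__ method, and attribute lookup through an instance calls it when the function is found in the class. The sketch below performs that step by hand. It also shows that a function stored directly on the instance is handed back untouched. The attribute g and the variable names are mine, for illustration only.

```python
class C:
    def f(self, msg):
        return 'msg={}'.format(msg)

c = C()

# Do by hand roughly what the lookup c.f does: fetch the function from the
# class, then ask it to bind itself to the instance.
func = C.__dict__['f']
bound = func.__get__(c, C)

by_hand = bound('hello')
by_lookup = c.f('hello')
bound_to_c = bound.__self__ is c

# A function stored on the instance itself is not converted: no self is added.
c.g = lambda msg: 'g: msg={}'.format(msg)
unbound_result = c.g('hello')
has_self = hasattr(c.g, '__self__')

print(by_hand, by_lookup, bound_to_c)
print(unbound_result, has_self)
```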

The Aha! that comes out of this

Perhaps I've been spoiled by Scheme and friends. In R5RS (number 5 in the sequence of de facto Scheme standards), the semantics of procedures and procedure calls are dealt with fully in just a few lines of sections 4.1.3 and 4.1.4. In Scheme, there's no magic, almost no special-casing, and no white lies.

Now, R5RS is a pretty damn minimal Scheme (and that's how folk like it – there was almost a riot when the Scheme Steering Committee produced a successor, R6RS, which had a few batteries included). Batteries-included members of the Scheme family, such as cousin Racket, add lots more structures, including modules and classes, but they do so without violating or even adjusting the underlying evaluation and call semantics.

Racket (and other Scheme/Lisp family members (IsSchemeLisp? – garhh!)) do this by adding extra syntax to the language through the more or less sophisticated macro systems they are famous for (see a discussion of Racket macros by Greg Hendershott, and Paul Graham's famous Beating the Averages for some Lisp lovin'). And I realise (this is the aha!) that all the macro magic is really only possible because of the simplicity and absolute consistency of the evaluation and procedure call semantics.

The thing that would frustrate a Python macro system (apart from the indentation, which is probably less of a big deal), and which hamstrings C's cruddy little macro system, and C++’s to the extent that you can call ‘templates’ a macro system, is that the language syntax and semantics are so closely bound up with one another that generating code rapidly becomes a bit of a nightmare: the assignment print_msg = c.f means different things depending on what the right-hand side is. In Scheme and other Lisps, on the other hand, a macro turns one bracketed sequence of terms into another bracketed sequence of terms, and the evaluation semantics are exactly the same in the result as they are in the starting expression.

Lisps do start off looking a little bit weird – all those parentheses! – but they're as analysable as you could possibly desire, leaving you to get on with turning thought into code, without worrying there's a half-forgotten sub-paragraph special case that's going to show up like a bad smell, and bite you on the bum.

Norman, 2015 March 28