2021 fall cs61A 学习笔记

Building Abstractions with Functions

用函数构建抽象

Higher-Order Functions

evaluate expression ——是计算这个表达式的意思

Functions as Arguments

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
>>> def sum_naturals(n):
'''自然数求和
'''
total, k = 0, 1
while k <= n:
total, k = total + k, k + 1
return total
>>> sum_naturals(100)
5050
>>> def sum_cubes(n):
'''求立方和
'''
total, k = 0, 1
while k <= n:
total, k = total + k*k*k, k + 1
return total
>>> sum_cubes(100)
25502500
>>> def pi_sum(n):
'''求pi
'''
total, k = 0, 1
while k <= n:
total, k = total + 8 / ((4*k-3) * (4*k-1)), k + 1
return total
>>> pi_sum(100)
3.1365926848388144

三个函数的功能都是一样的,都是进行求和,但是具体的求法有点不同,可以对其抽象,抽取共同部分。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
>>> def summation(n, term):
total, k = 0, 1
while k <= n:
total, k = total + term(k), k + 1
return total
>>> def identity(x):
return x
>>> def sum_naturals(n):
return summation(n, identity)
>>> sum_naturals(10)
55
>>> summation(10, square)
385
>>> def pi_term(x):
return 8 / ((4*x-3) * (4*x-1))
>>> def pi_sum(n):
return summation(n, pi_term)
>>> pi_sum(1e6)
3.141592153589902

Functions as Returned Values

We can achieve even more expressive power in our programs by creating functions whose returned values are themselves functions.

栗子:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
def square(x):
return x*x

def successor(x):
return x+1

def compose1(f,g):
'''返回一个函数'''
def h(x):
return f(g(x))
return h

def f(x):
'''Never called.'''
return -x

square_successor = compose1(square,successor)
result = square_successor(12)

Lambda expressions

In Python, we can create function values on the fly using lambda expressions, which evaluate to unnamed functions. A lambda expression evaluates to a function that has a single return expression as its body. Assignment and control statements are not allowed.

We can understand the structure of a lambda expression by constructing a corresponding English sentence:

lambda expression

Function Decorators

Python provides special syntax to apply higher-order functions as part of executing a def statement, called a decorator. Perhaps the most common example is a trace.

1
2
3
4
5
6
7
8
9
10
11
>>> def trace(fn):
def wrapped(x):
print('-> ', fn, '(', x, ')')
return fn(x)
return wrapped
>>> @trace
def triple(x):
return 3 * x
>>> triple(12)
-> <function triple at 0x102a39848> ( 12 )
36

上述的@tracedecorator的作用等同于下面这个代码

1
2
3
>>> def triple(x):
return 3 * x
>>> triple = trace(triple)

The decorator symbol @ may also be followed by a call expression.

Recursive Fcuntions

mutual recursion

As another example of mutual recursion, consider a two-player game in which there are n initial pebbles on a table. The players take turns, removing either one or two pebbles from the table, and the player who removes the final pebble wins. Suppose that Alice and Bob play this game, each using a simple strategy:

  • Alice always removes a single pebble
  • Bob removes two pebbles if an even number of pebbles is on the table, and one otherwise
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
def play_alice(n):
if n == 0:
print("Bob wins!")
else:
play_bob(n-1)

def play_bob(n):
if n == 0:
print("Alice wins!")
elif is_even(n):
play_alice(n-2)
else:
play_alice(n-1)

def is_even(n):
if n == 0:
return True
else:
if (n-1) == 0:
return False
else:
return is_even((n-1)-1)

Building Abstractions with Data

用数据构建抽象

Data Abstraction

Data abstraction is similar in character to functional abstraction. When we create a functional abstraction, the details of how a function is implemented can be suppressed, and the particular function itself can be replaced by any other function with the same overall behavior. In other words, we can make an abstraction that separates the way the function is used from the details of how the function is implemented. Analogously, data abstraction isolates how a compound data value is used from the details of how it is constructed.

数据抽象和函数抽象是相似的。咱不需要管这个数据是如何组织的,只需要知道如何去用这个数据。

Sequences

具有这两个行为的数据称为sequences:

  • finite length.
  • 可以通过索引(从0开始的)来选择元素

Extend behaviours

  1. Membership.Python has two operators inand not in that evaluate to True or False depending on whether an element appears in a sequence.
  2. Slicing.(切片)

Lists

A list value is a sequence that can have arbitrary length.

list的操作

  • add:两个list相加,将两个list扩展成一个大的list,就是拼接两个list
  • mul:一个list乘上一个常数,就是复制多少次这个list

List Comprehensions.(列表生成式) Many sequence processing operations can be expressed by evaluating a fixed expression for each element in a sequence and collecting the resulting values in a result sequence. In Python, a list comprehension is an expression that performs such a computation.

The general form of a list comprehension is:

1
[<map expression> for <name> in <sequence expression> if <filter expression>]

Ranges

A range is another built-in type of sequence in Python, which represents a range of integers.

Ranges commonly appear as the expression in a for header to specify the number of times that the suite should be executed: A common convention is to use a single underscore character for the name in the for header if the name is unused in the suite.

1
2
for _ in range(3):
print('Go Bears!')

Strings

Strings satisfy the two basic conditions of a sequence that we introduced at the beginning of this section: they have a length and they support element selection.

Note: Unlike many other programming languages, Python does not have a separate character type; any text is a string, and strings that represent single characters have a length of 1.

Membership. The behavior of strings diverges from other sequence types in Python. It matches substrings rather than elements.

1
2
>>> 'here' in "Where's Waldo?"
True

Multiline Literals. Strings aren't limited to a single line. Triple quotes delimit string literals that span multiple lines.

1
2
3
4
>>> """The Zen of Python
claims, Readability counts.
Read more: import this."""
'The Zen of Python\nclaims, "Readability counts."\nRead more: import this.'

String Coercion. A string can be created from any object in Python by calling the str constructor function with an object value as its argument.

Nesting lists (使用原生的sequeece构造复杂的数据结构)

Tree
1
2
3
>>> t = tree(3, [tree(1), tree(2, [tree(1), tree(1)])])
>>> t
[3, [1], [2, [1], [1]]]

第一个list节点为root节点,然后之后两个list为左右子树

Linked list 链表
1
four = [1, [2, [3, [4, 'empty']]]]
image-20230129114046273

Sequence Iteration

Sequence unpacking

1
2
3
4
5
6
pairs = [[1, 2], [2, 2], [2, 3], [4, 4]]
same_count = 0
for x, y in pairs:
if x == y:
same_count = same_count + 1
print(same_count)

This pattern of binding multiple names to multiple values in a fixed-length sequence is called sequence unpacking; it is the same pattern that we see in assignment statements that bind multiple names to multiple values.

Mutable Data

In fact, all values in Python are objects. That is, all values have behavior and attributes. They act like the values they represent.

List comprehensions. A list comprehension always creates a new list.

Tuples. A tuple, an instance of the built-in tuple type, is an immutable sequence.

Dictionaries do have some restrictions:

  • A key of a dictionary cannot be or contain a mutable value.
  • There can be at most one value for a given key.

nonlocal

The nonlocal statement declares that whenever we change the binding of the name balance, the binding is changed in the first frame in which balance is already bound. (常用于内部函数)

The key to correctly analyzing code with nonlocal assignment is to remember that only function calls can introduce new frames. Assignment statements always change bindings in existing frames.

Implict Sequences

A sequence can be represented without each element being stored explicitly in the memory of the computer.

Computing values on demand, rather than retrieving them from an existing representation, is an example of lazy computation.

range is a example for Implict sequences.

1
2
3
>>> r = range(10000, 1000000000)
>>> r[45006230]
45016230

Iterator

Calling iter on an iterator will return that iterator, not a copy.

Iterable

Any value that can produce iterators is called an iterable value. In Python, an iterable value is anything that can be passed to the built-in iter function.

Iterators are also iterables, because they can be passed to the iter function.

Built-in Iterator

Several built-in functions take as arguments iterable values and return iterators. These functions are used extensively for lazy sequence processing.

Such as map,filter,zip,reversed,etc.

For Statements

Objects are iterable (an interface) if they have an __iter__ method that returns an iterator.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
>>> counts = [1, 2, 3]
>>> for item in counts:
print(item)
1
2
3
>>> items = counts.__iter__()
>>> try:
while True:
item = items.__next__()
print(item)
except StopIteration:
pass
1
2
3

Generators and Yield Statements

1
2
3
4
5
6
7
8
9
10
11
>>> def letters_generator():
current = 'a'
while current <= 'd':
yield current
current = chr(ord(current)+1)
>>> for letter in letters_generator():
print(letter)
a
b
c
d

Even though we never explicitly defined __iter__ or __next__ methods, the yield statement indicates that we are defining a generator function. When called, a generator function doesn't return a particular yielded value, but instead a generator (which is a type of iterator) that itself can return the yielded values.

A generator object has __iter__ and __next__ methods

Iterable Interface

An object is iterable if it returns an iterator when its __iter__ method is invoked. Iterable values represent data collections, and they provide a fixed representation that may produce more than one iterator.

Iterator Interface

The Python iterator interface is defined using a method called __next__ that returns the next element of some underlying sequential series that it represents. In response to invoking __next__, an iterator can perform arbitrary computation in order to either retrieve or compute the next element.

built-in iter function is called on the iterable to create a corresponding iterator.

1
2
3
4
5
6
7
iterator = iter(iterable)
try:
while True:
elem = next(iterator)
# do something
except StopIteration:
pass

Object-Oriented Programming

简单的说明 对象是个啥

an object is a data value that has methods and attributes, accessible via dot notation. Every object also has a type, called its class. To create new types of data, we implement new classes.

Objects and Classes

object和class之间的关系

A class serves as a template for all objects whose type is that class. Every object is an instance of some particular class.

notes: 类名也可以作为参数传递给函数或者方法中

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
class Ant:
def __init__(self,mass):
self.mass = mass
class SmallAnt(Ant):
strength = 1 # 体力
speed = 10 # 移速
class BigAnt(Ant):
strength = 10 # 体力
speed = 1 # 移速
def make_ant(kind,mass):
new_ant = kind(mass)
return new_ant
# test
my_ant = make_ant(BigAnt,20)
type(my_ant)

__call__有点像,但是这个是将class当作函数在使用,而后者是将class的对象or实例当作函数在使用。

Defining Classes

1
2
class <name>:
<suit>

上述statement就会创建一个名为name的class

__init__函数是the constructor for the class

image-20230130112312832

The __init__ method for Account has two formal parameters. The first one, self, is bound to the newly created Account object. The second parameter, account_holder, is bound to the argument passed to the class when it is called to be instantiated.

By convention, we use the parameter name self for the first argument of a constructor, because it is bound to the object being instantiated. This convention is adopted in virtually all Python code.

1
>>> a = Account('Kirk')

调用class,创建了一个新对象

如何判断两个对象是否为同一个对象,可以用isis not操作符。

Object identity is compared using the is and is not operators.

Methods. Each method definition again includes a special first parameter self, which is bound to the object on which the method is invoked.

image-20230130114046713

当通过调用class里的函数时,比如

1
2
>>> myaccount = Account('yj')
>>> myaccount.deposit(100)

myaccount调用了deposit,此时myaccount赋予了deposit意义,deposit将myaccount这个对象里的钱存了xx。myaccount又作为参数,传递给deposit中的self

Message Passing and Dot Expressions

Dot expressions. The code fragment myaccount.deposit is called a dot expression. A dot expression consists of an expression, a dot, and a name:

1
<expression> . <name>

The built-in function getattr also returns an attribute for an object by name. It is the function equivalent of dot notation.

1
2
>>> getattr(myaccount, 'balance')
10

We can also test whether an object has a named attribute with hasattr.

1
2
>>> hasattr(myaccount, 'deposit')
True

看来对象中的方法也是一个attribute

Methods and functions. When a method is invoked on an object, that object is implicitly passed as the first argument to the method.

object里定义的函数与普通的函数不一样。用type看看这两种函数

1
2
3
4
>>> type(Account.deposit)
<class 'function'>
>>> type(myaccount.deposit)
<class 'method'>

As an attribute of a class, a method is just a function, but as an attribute of an instance, it is a bound method.

bound method的一些解释

bound methods, which couple together a function and the object on which that method will be invoked. A bound method value is already associated with its first argument, the instance on which it was invoked, which will be named self when the method is called.

再来看下上面给的例子,type(Account.deposit)这个函数,需要你提供两个参数——selfamount,它就是一个普通的函数。而下面myaccount.deposit这个方法,只需要你提供amount这个参数,其中self这个参数,自动绑定myaccount对象。

1
2
3
4
>>> Account.deposit(myaccount, 1001)
1011
>>> myaccount.deposit(1000)
2011

The function getattr behaves exactly like dot notation: if its first argument is an object but the name is a method defined in the class, then getattr returns a bound method value. On the other hand, if the first argument is a class, then getattr returns the attribute value directly, which is a plain function.

Naming Conventions. Class names are conventionally written using the CapWords convention (also called CamelCase because the capital letters in the middle of a name look like humps). Method names follow the standard convention of naming functions using lowercased words separated by underscores. (类名用骆驼峰式命名,方法用小写字母和下划线命名)

其他规定:如果一个attribute的名字以下划线开头,则这个attribute只能在类里面访问,不能在外部访问使用。

Class Attributes

Some attribute values are shared across all objects of a given class. Such attributes are associated with the class itself, rather than any individual instance of the class.

In the broader developer community, class attributes may also be called class variables or static variables. (类变量,静态变量)

1
2
3
4
5
6
>>> class Account:
interest = 0.02 # A class attribute
def __init__(self, account_holder):
self.balance = 0
self.holder = account_holder
# Additional methods would be defined here

This attribute can still be accessed from any instance of the class.

However, a single assignment statement to a class attribute changes the value of the attribute for all instances of the class.

1
2
3
4
5
>>> Account.interest = 0.04
>>> spock_account.interest
0.04
>>> kirk_account.interest
0.04

Attribute names. we could easily have a class attribute and an instance attribute with the same name. 类中静态变量的名字可以和实例中的变量的名字相同,但访问的优先级不同,实例中变量优先级更高,即先从局部变量找,再到全局变量找。

As we have seen, a dot expression consists of an expression, a dot, and a name:

1
<expression> . <name>

To evaluate a dot expression:

  1. Evaluate the <expression> to the left of the dot, which yields the object of the dot expression.
  2. <name> is matched against the instance attributes of that object; if an attribute with that name exists, its value is returned.
  3. If <name> does not appear among instance attributes, then <name> is looked up in the class, which yields a class attribute value.
  4. That value is returned unless it is a function, in which case a bound method is returned instead.

Attribute assignment. If the object is an instance, then assignment sets an instance attribute. If the object is a class, then assignment sets a class attribute. As a consequence of this rule, assignment to an attribute of an object cannot affect the attributes of its class. 当类的实例为类属性赋值时,实例中会创建一个新的属性(和类属性同名)

If we assign to the named attribute interest of an account instance, we create a new instance attribute that has the same name as the existing class attribute.

1
>>> kirk_account.interest = 0.08

and that attribute value will be returned from a dot expression.

1
2
>>> kirk_account.interest
0.08

However, the class attribute interest still retains its original value, which is returned for all other accounts.

1
2
>>> spock_account.interest
0.04

Changes to the class attribute interest will affect spock_account, but the instance attribute for kirk_account will be unaffected.

1
2
3
4
5
>>> Account.interest = 0.05  # changing the class attribute
>>> spock_account.interest # changes instances without like-named instance attributes
0.05
>>> kirk_account.interest # but the existing instance attribute is unaffected
0.08

Inheritance

简单的说明继承是啥

1
2
3
4
5
6
7
>>> ch = CheckingAccount('Spock')
>>> ch.interest # Lower interest rate for checking accounts
0.01
>>> ch.deposit(20) # Deposits are the same
20
>>> ch.withdraw(5) # withdrawals decrease balance by an extra charge
14

A CheckingAccount is a specialization of an Account. In OOP terminology, the generic account will serve as the base class of CheckingAccount, while CheckingAccount will be a subclass of Account.

A subclass inherits the attributes of its base class, but may override certain attributes, including certain methods. With inheritance, we only specify what is different between the subclass and the base class. Anything that we leave unspecified in the subclass is automatically assumed to behave just as it would for the base class.

给个例子:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
>>> class Account:
"""A bank account that has a non-negative balance."""
interest = 0.02
def __init__(self, account_holder):
self.balance = 0
self.holder = account_holder
def deposit(self, amount):
"""Increase the account balance by amount and return the new balance."""
self.balance = self.balance + amount
return self.balance
def withdraw(self, amount):
"""Decrease the account balance by amount and return the new balance."""
if amount > self.balance:
return 'Insufficient funds'
self.balance = self.balance - amount
return self.balance
1
2
3
4
5
6
>>> class CheckingAccount(Account):
"""A bank account that charges for withdrawals."""
withdraw_charge = 1
interest = 0.01
def withdraw(self, amount):
return Account.withdraw(self, amount + self.withdraw_charge)
1
2
3
4
5
6
7
>>> checking = CheckingAccount('Sam')
>>> checking.deposit(10)
10
>>> checking.withdraw(5)
4
>>> checking.interest
0.01

We can define this procedure recursively. To look up a name in a class.

  1. If it names an attribute in the class, return the attribute value.
  2. Otherwise, look up the name in the base class, if there is one.

Note: The class of an object stays constant throughout. Even though the deposit method was found in the Account class, deposit is called with self bound to an instance of CheckingAccount, not of Account.

Calling ancestors. Attributes that have been overridden are still accessible via class objects. For instance, we implemented the withdraw method of CheckingAccount by calling the withdraw method of Account with an argument that included the withdraw_charge.

Notice that we called self.withdraw_charge rather than the equivalentCheckingAccount.withdraw_charge. The benefit of the former over the latter is that a class that inherits from CheckingAccount might override the withdrawal charge. If that is the case, we would like our implementation of withdraw to find that new value instead of the old one.(就是用self可以保持更新,如果修改了类静态变量)

Interfaces. It is extremely common in object-oriented programs that different types of objects will share the same attribute names. An object interface is a collection of attributes and conditions on those attributes.

In some programming languages such as Java, interface implementations must be explicitly declared. In others such as Python, Ruby, and Go, any object with the appropriate names implements an interface.

例子:

1
2
3
>>> def deposit_all(winners, amount=5):
for account in winners:
account.deposit(amount)

例子中的account必须是要有Account.deposit的实现的任何类

Multiple Inheritance

Python supports the concept of a subclass inheriting attributes from multiple base classes, a language feature called multiple inheritance.

例子:

1
2
3
4
>>> class SavingsAccount(Account):
deposit_charge = 2
def deposit(self, amount):
return Account.deposit(self, amount - self.deposit_charge)
1
2
3
4
>>> class AsSeenOnTVAccount(CheckingAccount, SavingsAccount):
def __init__(self, account_holder):
self.holder = account_holder
self.balance = 1 # A free dollar!

AsSeenOnTVAccount同时继承了两个类。

img

AsSeenOnTVAccount的实例调用一个属性的时,若所有的base class都有该属性并重写了,该如何选择.

python 遵循:如上图,先从左到右,在从下往上。

所以顺序是:AsSeenOnTVAccount, CheckingAccount, SavingsAccount, Account, object

Further reading. Python resolves this name using a recursive algorithm called the C3 Method Resolution Ordering. The method resolution order of any class can be queried using the mro method on all classes.

1
2
>>> [c.__name__ for c in AsSeenOnTVAccount.mro()]
['AsSeenOnTVAccount', 'CheckingAccount', 'SavingsAccount', 'Account', 'object']

Object Abstraction

A central concept in object abstraction is a generic function(泛型函数), which is a function that can accept values of multiple different types.We will consider three different techniques for implementing generic functions: shared interfaces, type dispatching, and type coercion.

String Conversion

String values provide a fundamental medium for communicating information among humans.

python规定所有的objects都要有两个string representations:

  1. str,得到一个human-readable string;当print(any objects)时调用str
  2. repr,得到一个 Python expression,并且eval(repr(object)) == object;当在python 终端中直接输入objects时,调用repr

这两个虽然都返回一个字符串,但是还有有区别的。

举个例子:

1
2
3
4
5
6
>>> from datetime import date
>>> tues = date(2011, 9, 12)
>>> repr(tues)
'datetime.date(2011, 9, 12)'
>>> str(tues)
'2011-09-12'

想要能够对所有的对象都能调用repr或者str就需要将其变为generic function

The object system provides an elegant solution in this case: the repr function always invokes a method called __repr__ on its argument.

1
2
>>> tues.__repr__()
'datetime.date(2011, 9, 12)'
1
2
>>> tues.__str__()
'2011-09-12'

Special Methods

In Python, certain special names are invoked by the Python interpreter in special circumstances.

True and false values. We saw previously that numbers in Python have a truth value; more specifically, 0 is a false value and all other numbers are true values. In fact, all objects in Python have a truth value.但是你自己定义的class可以通过重写__bool__方法来改变。

举个例子:

1
2
3
4
5
6
>>> Account.__bool__ = lambda self: self.balance != 0
>>> bool(Account('Jack'))
False
>>> if not Account('Jack'):
print('Jack has nothing')
Jack has nothing

之前定义的Account类,改写其bool值,当账户余额为0时,为False

Sequence operations. We have seen that we can call the len function to determine the length of a sequence.

The len function invokes the __len__ method of its argument to determine its length. All built-in sequence types implement this method.

1
2
>>> 'Go Bears!'.__len__()
9

Python uses a sequence's length to determine its truth value, if it does not provide a __bool__method. Empty sequences are false, while non-empty sequences are true.

1
2
3
4
5
6
>>> bool('')
False
>>> bool([])
False
>>> bool('Go Bears!')
True

The __getitem__ method is invoked by the element selection operator, but it can also be invoked directly.

1
2
3
4
>>> 'Go Bears!'[3]
'B'
>>> 'Go Bears!'.__getitem__(3)
'B'

Callable objects. In Python, functions are first-class objects, so they can be passed around as data and have attributes like any other object. Python also allows us to define objects that can be "called" like functions by including a __call__ method. With this method, we can define a class that behaves like a higher-order function.

举个例子:

1
2
3
4
5
6
7
>>> def make_adder(n):
def adder(k):
return n + k
return adder
>>> add_three = make_adder(3)
>>> add_three(4)
7

上面是一个高阶函数

1
2
3
4
5
6
7
8
>>> class Adder(object):
def __init__(self, n):
self.n = n
def __call__(self, k):
return self.n + k
>>> add_three_obj = Adder(3)
>>> add_three_obj(4)
7

这个Adder类的功能和上述的高阶函数的功能一致

Here, the Adder class behaves like the make_adder higher-order function, and the add_three_objobject behaves like the add_three function.

Multiple Representations

Abstraction barriers allow us to separate the use and representation of data. However, in large programs, it may not always make sense to speak of "the underlying representation" for a data type in a program. For one thing, there might be more than one useful representation for a data object, and we might like to design systems that can deal with multiple representations.

举个例子:

复数,可以用直角坐标系进行表示,分为实部和虚部;也可以用极坐标进行表示,分为长度和角度;详细情况请参考这篇文章

从0开始实现一个复数:

1
2
3
4
5
6
7
8
9
>>> class Number:
def __add__(self, other):
'''special method 加法
'''
return self.add(other)
def __mul__(self, other):
'''special method 乘法
'''
return self.mul(other)

这是一个superclass,所有的数都继承与它。(类似于Java中的interface)

1
2
3
4
5
6
7
8
9
10
>>> class Complex(Number):
def add(self, other):
'''用直角坐标系的表示,进行加法
'''
return ComplexRI(self.real + other.real, self.imag + other.imag)
def mul(self, other):
'''用极坐标系的表示,进行乘法
'''
magnitude = self.magnitude * other.magnitude
return ComplexMA(magnitude, self.angle + other.angle)

下面分别是ComplexRIComplexMA的实现

1
2
3
4
5
6
7
8
9
10
11
12
13
>>> from math import atan2
>>> class ComplexRI(Complex):
def __init__(self, real, imag):
self.real = real
self.imag = imag
@property
def magnitude(self):
return (self.real ** 2 + self.imag ** 2) ** 0.5
@property
def angle(self):
return atan2(self.imag, self.real)
def __repr__(self):
return 'ComplexRI({0:g}, {1:g})'.format(self.real, self.imag)
1
2
3
4
5
6
7
8
9
10
11
12
13
>>> from math import sin, cos, pi
>>> class ComplexMA(Complex):
def __init__(self, magnitude, angle):
self.magnitude = magnitude
self.angle = angle
@property
def real(self):
return self.magnitude * cos(self.angle)
@property
def imag(self):
return self.magnitude * sin(self.angle)
def __repr__(self):
return 'ComplexMA({0:g}, {1:g} * pi)'.format(self.magnitude, self.angle/pi)

其中,The @property decorator allows functions to be called without call expression syntax (parentheses following an expression).

1
2
3
4
5
6
7
8
9
10
>>> ri = ComplexRI(5, 12)
>>> ri.real
5
>>> ri.magnitude
13.0
>>> ri.real = 9
>>> ri.real
9
>>> ri.magnitude
15.0

最后进行一个测试:

1
2
3
4
5
>>> from math import pi
>>> ComplexRI(1, 2) + ComplexMA(2, pi/2)
ComplexRI(1, 4)
>>> ComplexRI(0, 1) * ComplexRI(0, 1)
ComplexMA(1, 1 * pi)

Interfaces. Object attributes, which are a form of message passing, allows different data types to respond to the same message in different ways. A shared set of messages that elicit similar behavior from different classes is a powerful method of abstraction. An interface is a set of shared attribute names, along with a specification of their behavior. In the case of complex numbers, the interface needed to implement arithmetic consists of four attributes: real, imag, magnitude, and angle.

The interface approach to encoding multiple representations has appealing properties.

Multiple representations of data are closely related to the idea of data abstraction with which we began this chapter. Using data abstraction, we were able to change the implementation of a data type without changing the meaning of the program. With interfaces and message passing, we can have multiple different representations within the same program. In both cases, a set of names and corresponding behavior conditions define the abstraction that enables this flexibility.

Generic Functions

Generic functions are methods or functions that apply to arguments of different types.

但是两个不同类型的对象(虽然他们继承与同一个父类)如何进行交互,比如复数和分数直接如何加和乘。

本文中给出了两种方式:

  1. type dispatching 类型调度
  2. coercion 类型强制转换

先给出实现分数的代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
>>> from fractions import gcd
>>> class Rational(Number):
def __init__(self, numer, denom):
g = gcd(numer, denom)
self.numer = numer // g
self.denom = denom // g
def __repr__(self):
return 'Rational({0}, {1})'.format(self.numer, self.denom)
def add(self, other):
nx, dx = self.numer, self.denom
ny, dy = other.numer, other.denom
return Rational(nx * dy + ny * dx, dx * dy)
def mul(self, other):
numer = self.numer * other.numer
denom = self.denom * other.denom
return Rational(numer, denom)

Type dispatching. One way to implement cross-type operations is to select behavior based on the types of the arguments to a function or method. The idea of type dispatching is to write functions that inspect the type of arguments they receive, then execute code that is appropriate for those types.

The built-in function isinstance takes an object and a class. It returns true if the object has a class that either is or inherits from the given class.

1
2
3
4
5
6
7
>>> c = ComplexRI(1, 1)
>>> isinstance(c, ComplexRI)
True
>>> isinstance(c, Complex)
True
>>> isinstance(c, ComplexMA)
False

isinstance是一种判别方法,也可以使用type_tag变量来标记类型

For arithmetic, we will give a type_tagattribute to Rational and Complex instances that has a string value. When two values x and y have the same type_tag, then we can combine them directly with x.add(y). If not, we need a cross-type operation.

1
2
3
4
5
6
7
8
>>> Rational.type_tag = 'rat'
>>> Complex.type_tag = 'com'
>>> Rational(2, 5).type_tag == Rational(1, 2).type_tag
True
>>> ComplexRI(1, 1).type_tag == ComplexMA(2, pi/2).type_tag
True
>>> Rational(2, 5).type_tag == ComplexRI(1, 1).type_tag
False
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
>>> def add_complex_and_rational(c, r):
return ComplexRI(c.real + r.numer/r.denom, c.imag)
>>> def mul_complex_and_rational(c, r):
r_magnitude, r_angle = r.numer/r.denom, 0
if r_magnitude < 0:
r_magnitude, r_angle = -r_magnitude, pi
return ComplexMA(c.magnitude * r_magnitude, c.angle + r_angle)
>>> def add_rational_and_complex(r, c):
return add_complex_and_rational(c, r)
>>> def mul_rational_and_complex(r, c):
return mul_complex_and_rational(c, r)
>>> class Number:
def __add__(self, other):
if self.type_tag == other.type_tag:
return self.add(other)
elif (self.type_tag, other.type_tag) in self.adders:
return self.cross_apply(other, self.adders)
def __mul__(self, other):
if self.type_tag == other.type_tag:
return self.mul(other)
elif (self.type_tag, other.type_tag) in self.multipliers:
return self.cross_apply(other, self.multipliers)
def cross_apply(self, other, cross_fns):
cross_fn = cross_fns[(self.type_tag, other.type_tag)]
return cross_fn(self, other)
adders = {("com", "rat"): add_complex_and_rational,
("rat", "com"): add_rational_and_complex}
multipliers = {("com", "rat"): mul_complex_and_rational,
("rat", "com"): mul_rational_and_complex}

The role of type dispatching is to ensure that these cross-type operations are used at appropriate times.

In this new definition of the Number class, all cross-type implementations are indexed by pairs of type tags in the adders and multipliers dictionaries.

This dictionary-based approach to type dispatching is extensible. New subclasses of Number could install themselves into the system by declaring a type tag and adding cross-type operations to Number.adders and Number.multipliers. They could also define their own adders and multipliers in a subclass.

Coercion.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
>>> def rational_to_complex(r):
'''rational number to complex number
'''
return ComplexRI(r.numer/r.denom, 0)
>>> class Number:
def __add__(self, other):
x, y = self.coerce(other)
return x.add(y)
def __mul__(self, other):
x, y = self.coerce(other)
return x.mul(y)
def coerce(self, other):
if self.type_tag == other.type_tag:
return self, other
elif (self.type_tag, other.type_tag) in self.coercions:
return (self.coerce_to(other.type_tag), other)
elif (other.type_tag, self.type_tag) in self.coercions:
return (self, other.coerce_to(self.type_tag))
def coerce_to(self, other_tag):
coercion_fn = self.coercions[(self.type_tag, other_tag)]
return coercion_fn(self)
coercions = {('rat', 'com'): rational_to_complex}

This coercion scheme has some advantages over the method of defining explicit cross-type operations. Although we still need to write coercion functions to relate the types, we need to write only one function for each pair of types rather than a different function for each set of types and each generic operation.(比上面的方法,代码量少了很多)

Further advantages come from extending coercion. Some more sophisticated coercion schemes do not just try to coerce one type into another, but instead may try to coerce two different types each into a third common type. Consider a rhombus and a rectangle: neither is a special case of the other, but both can be viewed as quadrilaterals. Another extension to coercion is iterative coercion, in which one data type is coerced into another via intermediate types. Consider that an integer can be converted into a real number by first converting it into a rational number, then converting that rational number into a real number. Chaining coercion in this way can reduce the total number of coercion functions that are required by a program.

Despite its advantages, coercion does have potential drawbacks. For one, coercion functions can lose information when they are applied.

Recursive Objects

解释下:

Objects can have other objects as attribute values. When an object of some(某个) class has an attribute value of that same class, it is a recursive object.

Linked List Class

A linked list, introduced earlier in this chapter, is composed of a first element and the rest of the list. The rest of a linked list is itself a linked list — a recursive definition.

例子:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
>>> class Link:
"""A linked list with a first element and the rest."""
empty = ()
def __init__(self, first, rest=empty):
assert rest is Link.empty or isinstance(rest, Link)
self.first = first
self.rest = rest
def __getitem__(self, i):
if i == 0:
return self.first
else:
return self.rest[i-1]
def __len__(self):
return 1 + len(self.rest)
>>> s = Link(3, Link(4, Link(5)))
>>> len(s)
3
>>> s[1]
4
>>> def link_expression(s):
"""Return a string that would evaluate to s.
用来帮助debug
"""
if s.rest is Link.empty:
rest = ''
else:
rest = ', ' + link_expression(s.rest)
return 'Link({0}{1})'.format(s.first, rest)
>>> link_expression(s)
'Link(3, Link(4, Link(5)))'
1
2
3
>>> Link.__repr__ = link_expression
>>> s # 跟repr(s)的结果一致
Link(3, Link(4, Link(5)))

Python displays instances of user-defined classes by invoking their __repr__ method.

The Link class has the closure property.(闭包属性) Just as an element of a list can itself be a list, a Link can contain a Link as its first element.

1
2
3
>>> s_first = Link(s, Link(6))
>>> s_first
Link(Link(3, Link(4, Link(5))), Link(6))

The s_first linked list has only two elements, but its first element is a linked list with three elements.

1
2
3
4
5
6
>>> len(s_first)
2
>>> len(s_first[0])
3
>>> s_first[0][2]
5

递归函数天然的适合这种递归对象

1
2
3
4
5
6
7
8
9
10
>>> def extend_link(s, t):
if s is Link.empty:
return t
else:
return Link(s.first, extend_link(s.rest, t))
>>> extend_link(s, s)
Link(3, Link(4, Link(5, Link(3, Link(4, Link(5))))))
>>> Link.__add__ = extend_link
>>> s + s
Link(3, Link(4, Link(5, Link(3, Link(4, Link(5))))))

Tree Class

1
2
3
4
5
6
7
8
9
10
11
12
13
>>> class Tree:
def __init__(self, label, branches=()):
self.label = label
for branch in branches:
assert isinstance(branch, Tree)
self.branches = branches
def __repr__(self):
if self.branches:
return 'Tree({0}, {1})'.format(self.label, repr(self.branches))
else:
return 'Tree({0})'.format(repr(self.label))
def is_leaf(self):
return not self.branches

生成一个 Fibonacci Tree(斐波那契数列树)

1
2
3
4
5
6
7
8
9
10
11
>>> def fib_tree(n):
if n == 1:
return Tree(0)
elif n == 2:
return Tree(1)
else:
left = fib_tree(n-2)
right = fib_tree(n-1)
return Tree(left.label + right.label, (left, right))
>>> fib_tree(5)
Tree(3, (Tree(1, (Tree(0), Tree(1))), Tree(2, (Tree(1), Tree(1, (Tree(0), Tree(1)))))))

Sets

特点:无序,不可重复,set里的元素不可修改(immutable)

用上述两种数据结构构建一个 set

List 构建

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
>>> def empty(s):
return s is Link.empty
>>> def set_contains(s, v):
"""Return True if and only if set s contains v."""
if empty(s):
return False
elif s.first == v:
return True
else:
return set_contains(s.rest, v)
>>> s = Link(4, Link(1, Link(5)))
>>> set_contains(s, 2)
False
>>> set_contains(s, 5)
True
>>> def adjoin_set(s, v):
"""Return a set containing all elements of s and element v."""
if set_contains(s, v):
return s
else:
return Link(v, s)
>>> t = adjoin_set(s, 2)
>>> t
Link(2, Link(4, Link(1, Link(5))))

用list构建,时间复杂度上会比较多高;插入一个元素,需要O(n),如果元素有序后,可以用二分查找,降到O(logn);

Tree构建

用树构建,可以用二叉搜索树构建,利用其特点,将时间复杂度降下来,比如插入元素就是O(logn)了。

1
2
3
4
5
6
7
8
9
>>> def set_contains(s, v):
if s is None:
return False
elif s.entry == v:
return True
elif s.entry < v:
return set_contains(s.right, v)
elif s.entry > v:
return set_contains(s.left, v)

Python set implementation. The set type that is built into Python does not use any of these representations internally. Instead, Python uses a representation that gives constant-time membership tests and adjoin operations based on a technique called hashing.

Efficiency

Measuring Efficiency

定义一个高阶函数来计算某个函数调用的次数

1
2
3
4
5
6
>>> def count(f):
def counted(*args):
counted.call_count += 1
return f(*args)
counted.call_count = 0
return counted
1
2
3
4
5
6
7
8
9
10
11
12
13
>>> def fib(n):
if n == 0:
return 0
if n == 1:
return 1
return fib(n-2) + fib(n-1)
>>> fib(5)
5
>>> fib = count(fib)
>>> fib(19)
4181
>>> fib.call_count
13529

Space. 当一个frame处于active状态,这个frame就在占用内存,处于active状态就是有函数调用或者数据处理等。当frame处于inactive状态时,python会自动收回它使用的内存。An environment is active if it provides the evaluation context for some expression being evaluated. An environment becomes inactive whenever the function call for which its first frame was created finally returns.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
>>> def count_frames(f):
def counted(*args):
counted.open_count += 1
counted.max_count = max(counted.max_count, counted.open_count)
result = f(*args)
counted.open_count -= 1
return result
counted.open_count = 0
counted.max_count = 0
return counted
>>> fib = count_frames(fib)
>>> fib(19)
4181
>>> fib.open_count
0
>>> fib.max_count
19
>>> fib(24)
46368
>>> fib.max_count
24

上述高阶函数统计最大同时使用的frame数。

To summarize, the space requirement of the fib function, measured in active frames, is one less than the input, which tends to be small. The time requirement measured in total recursive calls is larger than the output, which tends to be huge.

Memoization

由于fib函数在计算的过程中,有许多需要重复计算的部分,所以可以使用一个字典来保存已经计算过的部分。

Memoization can be expressed naturally as a higher-order function, which can also be used as a decorator. The definition below creates a cache of previously computed results, indexed by the arguments from which they were computed. The use of a dictionary requires that the argument to the memoized function be immutable.

1
2
3
4
5
6
7
8
9
>>> def memo(f):
'''缓存
'''
cache = {}
def memoized(n):
if n not in cache:
cache[n] = f(n)
return cache[n]
return memoized

Decomposition

Modules

A Python module is a file typically containing function or class definitions.

例子:

link.py

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
class Link:
empty = ()

def __init__(self, first, rest=empty):
assert rest is Link.empty or isinstance(rest, Link)
self.first = first
self.rest = rest

def __repr__(self):
if self.rest:
rest_repr = ', ' + repr(self.rest)
else:
rest_repr = ''
return 'Link(' + repr(self.first) + rest_repr + ')'

def __str__(self):
string = '<'
while self.rest is not Link.empty:
string += str(self.first) + ' '
self = self.rest
return string + str(self.first) + '>'

Importing

Importing a whole module:

1
2
3
import link

ll = link.Link(3, link.Link(4, link.Link(5)))

Importing specific names:

1
2
3
from link import Link

ll = Link(3, Link(4, Link(5)))

Importing all names:

1
2
3
from link import *

ll = Link(3, Link(4, Link(5)))

Importing with alias

I don't recommend aliasing a class or function name:

1
2
3
from link import Link as LL

ll = LL(3, LL(4, LL(5)))

But aliasing a whole module is sometimes okay (and is common in data science):

1
2
3
import numpy as np

b = np.array([(1.5, 2, 3), (4, 5, 6)])

Running a module

This command runs a module:

python module.py

When run like that, Python sets a global variable __name__ to "main". That means you often see code at the bottom of modules like this:

1
2
if __name__ == "__main__":
# use the code in the module somehow

The code inside that condition will be executed as well, but only when the module is run directly.(只有直接运行该module时才会执行__main__这个里的内容,如果在其他文件中import里这个module,则该module里的__main__里的内容不会运行)

Packages

Python packages

A Python package is a way of bundling multiple related modules together. Popular packages are NumPy and Pillow.

Example package structure:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
sound/                        top-level package
__init__.py 初始化这个sound package
formats/ 子包:文件格式转换
__init__.py
wavread.py
wavwrite.py
aiffread.py
aiffwrite.py
auread.py
auwrite.py
...
effects/ Subpackage for sound effects
__init__.py
echo.py
surround.py
reverse.py
...
filters/ Subpackage for filters
__init__.py
equalizer.py
vocoder.py
karaoke.py
...

其中__init__.py可以是个空文件

Importing from a package

Importing a whole path:

1
2
3
import sound.effects.echo

sound.effects.echo.echofilter(input, output, delay=0.7, atten=4)

Importing a module from the path:

1
2
3
from sound.effects import echo

echo.echofilter(input, output, delay=0.7, atten=4)

Installing packages

The Python Package Index is a repository of packages for the Python language.

Once you find a package you like, pip is the standard way to install:

1
pip install nltk

You may need to use pip3 if your system defaults to Python 2.

Modularity

Modular design

A design principle: Isolate different parts of a program that address different concerns.

A modular component can be developed and tested independently.

Ways to isolate in Python:

  • Functions
  • Classes
  • Modules
  • Packages

Interpreting Computer Programs

上面两个部分讲的都是去用python这个语言,下面这部分讲如何设计一个编程语言,此时,咱们的角度要从使用者变为设计者。

这部分主要为Scheme(是一种计算机语言,比较古老,也叫Lisp)写一个编译器。

Interpreters for Languages with Combination

先实现部分scheme编译器的功能,然后在实现一个通用的编译器。

A Scheme-Syntax Calculator