Baby Steps - Algebra

Adding two Scalars

To get started with Theano and get a feel for what we’re working with, let’s make a simple function: add two numbers together. Here is how you do it:

>>> import numpy
>>> import theano.tensor as T
>>> from theano import function
>>> x = T.dscalar('x')
>>> y = T.dscalar('y')
>>> z = x + y
>>> f = function([x, y], z)

And now that we’ve created our function we can use it:

>>> f(2, 3)
array(5.0)
>>> numpy.allclose(f(16.3, 12.1), 28.4)
True

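Let’s break this down into several steps. The first step is to define two symbols (Variables) representing the quantities that you want to add. Note that from now on, we will use the term Variable to mean “symbol” (in other words, x, y, and z are all Variable objects). The output of the function f is a numpy.ndarray with zero dimensions.

If you are following along and typing into an interpreter, you may have noticed a slight delay when executing the function instruction. Behind the scenes, f was being compiled into C code.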
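To make the "zero dimensions" remark concrete, here is a minimal NumPy-only sketch. It does not call Theano; the array result is a stand-in built by hand, assumed to have the same shape and dtype as the value f(2, 3) returns above.

```python
import numpy as np

# Stand-in for the value returned by f(2, 3): a zero-dimensional float64 array.
# (Built by hand here; this sketch does not call Theano.)
result = np.array(5.0)

print(result.ndim)    # number of axes
print(result.shape)   # an empty tuple, since there are no axes
print(result.dtype)   # double precision, matching the "d" in dscalar

# Convert to a plain Python float when an ordinary scalar is needed.
as_float = result.item()
print(type(as_float), as_float)
```

A zero-dimensional array still supports the usual array arithmetic, so it can be passed to NumPy functions such as numpy.allclose without conversion.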

Step 1

>>> x = T.dscalar('x')
>>> y = T.dscalar('y')

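In Theano, all symbols must be typed. In particular, T.dscalar is the type we assign to “0-dimensional arrays (scalar) of doubles (d)”. It is a Theano Type.

dscalar is not a class. Therefore, neither x nor y is actually an instance of dscalar. They are instances of TensorVariable. However, the type field of both x and y is set to Theano’s dscalar Type, as you can see here: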

>>> type(x)
<class 'theano.tensor.var.TensorVariable'>
>>> x.type
TensorType(float64, scalar)
>>> T.dscalar
TensorType(float64, scalar)
>>> x.type is T.dscalar
True

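By calling T.dscalar with a string argument, you create a Variable with the given name that represents a floating-point scalar quantity. If you provide no argument, the symbol will be unnamed. Names are not required, but they can help with debugging.

More will be said in a moment about Theano’s inner structure. You can also learn more by looking into Graph Structures.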

Step 2

The second step is to combine x and y into their sum z:

>>> z = x + y

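z is yet another Variable, which represents the addition of x and y. You can use the pp function to pretty-print the computation associated with z.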

>>> from theano import pp
>>> print(pp(z))
(x + y)

Step 3

The last step is to create a function taking x and y as inputs and giving z as output:

>>> f = function([x, y], z)

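The first argument to function is a list of Variables that will be provided as inputs to the function. The second argument is either a single Variable or a list of Variables. In either case, the second argument is what we want to see as output when we apply the function. f may then be used like a normal Python function.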

Note

As a shortcut, you can skip step 3 and just use a variable’s eval method. The eval() method is not as flexible as function(), but it can do everything we’ve covered in the tutorial so far. It has the added benefit of not requiring you to import function(). Here is how eval() works:

>>> import numpy
>>> import theano.tensor as T
>>> x = T.dscalar('x')
>>> y = T.dscalar('y')
>>> z = x + y
>>> numpy.allclose(z.eval({x : 16.3, y : 12.1}), 28.4)
True

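We passed eval() a dictionary mapping symbolic Theano variables to the values to substitute for them, and it returned the numerical value of the expression.

eval() will be slow the first time you call it on a variable, because it needs to call function() to compile the expression behind the scenes. Subsequent calls to eval() on that same variable will be fast, because the variable caches the compiled function.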
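The caching behaviour described above can be pictured with a small pure-Python analogy. This is only a toy sketch, not Theano's implementation: the class CachedExpression, the helper build_sum, and the compile_count attribute are all hypothetical names, and the dictionary is keyed by strings rather than by symbolic variables.

```python
class CachedExpression:
    """Toy stand-in for a symbolic variable (hypothetical, not part of Theano)."""

    def __init__(self, build):
        self._build = build       # callable that plays the role of compilation
        self._compiled = None     # filled in on the first eval() call
        self.compile_count = 0    # how many times the build step has run

    def eval(self, values):
        # Build the callable only once, then reuse the cached copy.
        if self._compiled is None:
            self.compile_count += 1
            self._compiled = self._build()
        return self._compiled(**values)


def build_sum():
    # Stands in for the slow compilation step; returns a callable for x + y.
    return lambda x, y: x + y


z = CachedExpression(build_sum)
first = z.eval({"x": 16.3, "y": 12.1})   # first call runs the build step
second = z.eval({"x": 2.0, "y": 3.0})    # second call reuses the cached callable
print(first, second, z.compile_count)
```

The point of the design is that the cost of the build step is paid once per variable rather than once per call.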

Adding two Matrices

You might already have guessed how to do this. Indeed, the only change from the previous example is that you need to instantiate x and y using the matrix Types:

>>> x = T.dmatrix('x')
>>> y = T.dmatrix('y')
>>> z = x + y
>>> f = function([x, y], z)

dmatrix is the Type for matrices of doubles. We can then use our new function on 2D arrays:

>>> f([[1, 2], [3, 4]], [[10, 20], [30, 40]])
array([[ 11.,  22.],
       [ 33.,  44.]])

The output is a NumPy array. We can also use NumPy arrays directly as inputs:

>>> import numpy
>>> f(numpy.array([[1, 2], [3, 4]]), numpy.array([[10, 20], [30, 40]]))
array([[ 11.,  22.],
       [ 33.,  44.]])

It is possible to add scalars to matrices, vectors to matrices, scalars to vectors, and so on. The behavior of these operations is defined by broadcasting.

The following types are available:
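As a rough illustration of what broadcasting does to shapes, the sketch below uses plain NumPy rather than Theano. It assumes the simple cases named above behave the same way in both libraries. One known difference is that Theano decides which dimensions can be broadcast from a variable's type, not from the runtime shape. The names matrix, vector, and scalar are illustrative only.

```python
import numpy as np

matrix = np.array([[1.0, 2.0], [3.0, 4.0]])   # shape (2, 2)
vector = np.array([10.0, 20.0])               # shape (2,)
scalar = 100.0                                # no axes at all

# The scalar is stretched across every element of the matrix.
scalar_plus_matrix = scalar + matrix

# The vector is repeated along the rows, so it is added to each row.
vector_plus_matrix = vector + matrix

# The scalar is stretched across every element of the vector.
scalar_plus_vector = scalar + vector

print(scalar_plus_matrix)
print(vector_plus_matrix)
print(scalar_plus_vector)
```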

  • byte: bscalar, bvector, bmatrix, brow, bcol, btensor3, btensor4, btensor5
  • 16-bit integers: wscalar, wvector, wmatrix, wrow, wcol, wtensor3, wtensor4, wtensor5
  • 32-bit integers: iscalar, ivector, imatrix, irow, icol, itensor3, itensor4, itensor5
  • 64-bit integers: lscalar, lvector, lmatrix, lrow, lcol, ltensor3, ltensor4, ltensor5
  • float: fscalar, fvector, fmatrix, frow, fcol, ftensor3, ftensor4, ftensor5
  • double: dscalar, dvector, dmatrix, drow, dcol, dtensor3, dtensor4, dtensor5
  • complex: cscalar, cvector, cmatrix, crow, ccol, ctensor3, ctensor4, ctensor5

The previous list is not exhaustive. A guide to all types compatible with NumPy arrays can be found here: tensor creation.

Note

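You, the user, must choose whether your program will use 32-bit or 64-bit integers (i prefix vs. l prefix) and floats (f prefix vs. d prefix). The system architecture does not make this choice for you.

Exercise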
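To see what the choice of prefix means in practice, the following NumPy-only sketch maps each prefix to the NumPy dtype it is understood to correspond to and compares the rounding error of single and double precision. The dictionary prefix_to_dtype is a hypothetical helper written for this illustration, not a Theano object, and the mapping is an assumption read off the list of types above.

```python
import numpy as np

# Assumed correspondence between the type prefixes and NumPy dtypes.
prefix_to_dtype = {
    "b": np.int8,       # byte
    "w": np.int16,      # 16-bit integers
    "i": np.int32,      # 32-bit integers
    "l": np.int64,      # 64-bit integers
    "f": np.float32,    # float (single precision)
    "d": np.float64,    # double
    "c": np.complex64,  # complex
}

for prefix, dtype in prefix_to_dtype.items():
    info = np.dtype(dtype)
    print(prefix, info.name, info.itemsize, "bytes")

# The same sum carried out in single and in double precision.
single = np.float32(16.3) + np.float32(12.1)
double = np.float64(16.3) + np.float64(12.1)

# Distance of each result from 28.4; the single-precision error is larger.
single_error = abs(float(single) - 28.4)
double_error = abs(float(double) - 28.4)
print(single_error, double_error)
```

Single precision halves the memory per element, at the cost of fewer significant digits.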

import theano
a = theano.tensor.vector() # declare variable
out = a + a ** 10               # build symbolic expression
f = theano.function([a], out)   # compile function
print(f([0, 1, 2]))
[    0.     2.  1026.]

Modify and execute this code to compute this expression: a ** 2 + b ** 2 + 2 * a * b.
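Before writing the Theano version, it can help to know what numbers to expect. The sketch below is NumPy only and is not the exercise solution. It evaluates the target expression for one sample input, with b chosen arbitrarily for illustration, and checks it against the algebraically equivalent form (a + b) ** 2.

```python
import numpy as np

a = np.array([0.0, 1.0, 2.0])   # same sample input as the code above
b = np.array([1.0, 2.0, 3.0])   # hypothetical second input, chosen for illustration

# The expression the exercise asks for, evaluated numerically.
expected = a ** 2 + b ** 2 + 2 * a * b
print(expected)

# The expression is the expansion of (a + b) ** 2, which gives a quick sanity check.
matches_square = np.allclose(expected, (a + b) ** 2)
print(matches_square)
```

A compiled Theano function for the same expression should return the same values when called with these two inputs.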

Solution