Django 1.8.2 Chinese documentation

Writing custom model fields

Introduction

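The model reference documentation explains how to use Django’s standard field classes: CharField, DateField, and so on. For many purposes, those classes are all you’ll need. Sometimes, though, the Django version won’t meet your precise requirements, or you’ll want to use a field that is entirely different from those shipped with Django.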

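Django’s built-in field types don’t cover every possible database column type, only the common ones, such as VARCHAR and INTEGER. For more obscure column types, such as geographic location data or user-created types like PostgreSQL custom types, you can define your own Django Field subclasses.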

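There are two typical situations: you may have a complex Python object that can somehow be serialized to fit into a standard database column type, or you may need a column type Django does not provide. In both cases a Field subclass lets you use your object with your models.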

Our example object

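Creating custom fields requires attention to many details. To keep this document easy to follow, we’ll use a single example throughout: wrapping a Python object that represents the deal of cards in a hand of Bridge. Don’t worry, you don’t have to know how to play Bridge to follow this example. You only need to know that 52 cards are dealt out equally to four players, who are traditionally called north, east, south and west. Our class looks something like this: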

class Hand(object):
    """A hand of cards (bridge style)"""

    def __init__(self, north, east, south, west):
        # Input parameters are lists of cards ('Ah', '9s', etc)
        self.north = north
        self.east = east
        self.south = south
        self.west = west

    # ... (other possibly useful methods omitted) ...

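This is just an ordinary Python class, with nothing Django-specific about it. We’d like to be able to use Hand in our models like this (we assume the hand attribute on the model is an instance of Hand):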

example = MyModel.objects.get(pk=1)
print(example.hand.north)

new_hand = Hand(north, east, south, west)
example.hand = new_hand
example.save()

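We assign to and retrieve from the hand attribute in our model just like any other Python class. The trick is to tell Django how to handle saving and loading such an object.

In order to use the Hand class in our models, we do not have to change this class at all. This is ideal, because it means you can easily write model support for existing classes where you cannot change the source code.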

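Note

You might only want to take advantage of custom database column types and deal with the data as standard Python types in your models: strings or floats, for example. This case is similar to our Hand example, and we’ll note any differences as we go along.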

Background theory

Database storage

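The simplest way to think of a model field is that it provides a way to take a normal Python object (a boolean, a datetime, or something more complex like Hand) and convert it to and from a format that is useful when dealing with the database. (The same applies to serialization, but, as we’ll see later, that falls out fairly naturally once you have the database side under control.)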

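Fields in a model must somehow be converted to fit into an existing database column type. Different databases provide different sets of valid column types, but the rule is still the same: those are the only types you have to work with. Anything you want to store in the database must fit into one of those types.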

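Normally, you’re either writing a Django field to match a particular database column type, or there’s a fairly straightforward way to convert your data to, say, a string.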

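For our Hand example, we could convert the card data to a string of 104 characters by concatenating all the cards together in a pre-determined order: say, all the north cards first, then the east, south and west cards. So Hand objects can be saved to text or character columns in the database.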
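
The concatenation described above can be sketched in plain Python. This is only an illustration: hand_to_string is a hypothetical helper name, and writing the ten as 'T' is an assumed encoding chosen so that every card is exactly two characters long.

```python
class Hand(object):
    """A hand of cards (bridge style), as defined earlier in this document."""

    def __init__(self, north, east, south, west):
        self.north = north
        self.east = east
        self.south = south
        self.west = west


def hand_to_string(hand):
    """Concatenate the cards in the fixed order north, east, south, west."""
    seats = (hand.north, hand.east, hand.south, hand.west)
    return ''.join(''.join(cards) for cards in seats)


# Build an unshuffled 52-card deck. 'T' for ten is an assumption that
# keeps every card at exactly two characters (rank + suit).
deck = [rank + suit for suit in 'shdc' for rank in 'A23456789TJQK']
hand = Hand(deck[0:13], deck[13:26], deck[26:39], deck[39:52])

encoded = hand_to_string(hand)
print(len(encoded))  # 4 players * 13 cards * 2 characters = 104
```

Because the order and the card width are fixed, the string can later be split back into four groups of 13 cards without any separators.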

What does a field class do?

All of Django’s fields (and when we say fields in this document, we always mean model fields and not form fields) are subclasses of django.db.models.Field. Most of the information that Django records about a field is common to all fields: name, help text, uniqueness and so forth. Storing all that information is handled by Field. We’ll get into the precise details of what Field can do later on; for now, suffice it to say that everything descends from Field and then customizes key pieces of the class behavior.

It’s important to realize that a Django field class is not what is stored in your model attributes. The model attributes contain normal Python objects. The field classes you define in a model are actually stored in the Meta class when the model class is created (the precise details of how this is done are unimportant here). This is because the field classes aren’t necessary when you’re just creating and modifying attributes. Instead, they provide the machinery for converting between the attribute value and what is stored in the database or sent to the serializer.

Keep this in mind when creating your own custom fields. The Django Field subclass you write provides the machinery for converting between your Python instances and the database/serializer values in various ways (there are differences between storing a value and using a value for lookups, for example). If this sounds a bit tricky, don’t worry – it will become clearer in the examples below. Just remember that you will often end up creating two classes when you want a custom field:

  • The first class is the Python object that your users will manipulate. They will assign it to the model attribute, they will read from it for displaying purposes, things like that. This is the Hand class in our example.
  • The second class is the Field subclass. This is the class that knows how to convert your first class back and forth between its permanent storage form and the Python form.

Writing a field subclass

When planning your Field subclass, first give some thought to which existing Field class your new field is most similar to. Can you subclass an existing Django field and save yourself some work? If not, you should subclass the Field class, from which everything is descended.

Initializing your new field is a matter of separating out any arguments that are specific to your case from the common arguments and passing the latter to the __init__() method of Field (or your parent class).

In our example, we’ll call our field HandField. (It’s a good idea to call your Field subclass <Something>Field, so it’s easily identifiable as a Field subclass.) It doesn’t behave like any existing field, so we’ll subclass directly from Field:

from django.db import models

class HandField(models.Field):

    description = "A hand of cards (bridge style)"

    def __init__(self, *args, **kwargs):
        kwargs['max_length'] = 104
        super(HandField, self).__init__(*args, **kwargs)

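Our HandField accepts most of the standard field options (see the list below), but we ensure it has a fixed length, since it only needs to hold 52 card values plus their suits; 104 characters in total.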

Note

Many of Django’s model fields accept options that they don’t do anything with. For example, you can pass both editable and auto_now to a django.db.models.DateField and it will simply ignore the editable parameter (auto_now being set implies editable=False). No error is raised in this case.

This behavior simplifies the field classes, because they don’t need to check for options that aren’t necessary. They just pass all the options to the parent class and then don’t use them later on. It’s up to you whether you want your fields to be more strict about the options they select, or to use the simpler, more permissive behavior of the current fields.

The Field.__init__() method takes the following parameters:

All of the options without an explanation in the above list have the same meaning they do for normal Django fields. See the field documentation for examples and details.

Field deconstruction

New in Django 1.7:

deconstruct() is part of the migrations framework in Django 1.7 and above. If you have custom fields from previous versions they will need this method added before you can use them with migrations.

The counterpoint to writing your __init__() method is writing the deconstruct() method. This method tells Django how to take an instance of your new field and reduce it to a serialized form - in particular, what arguments to pass to __init__() to re-create it.

If you haven’t added any extra options on top of the field you inherited from, then there’s no need to write a new deconstruct() method. If, however, you’re changing the arguments passed in __init__() (like we are in HandField), you’ll need to supplement the values being passed.

The contract of deconstruct() is simple; it returns a tuple of four items: the field’s attribute name, the full import path of the field class, the positional arguments (as a list), and the keyword arguments (as a dict). Note this is different from the deconstruct() method for custom classes which returns a tuple of three things.

As a custom field author, you don’t need to care about the first two values; the base Field class has all the code to work out the field’s attribute name and import path. You do, however, have to care about the positional and keyword arguments, as these are likely the things you are changing.

For example, in our HandField class we’re always forcibly setting max_length in __init__(). The deconstruct() method on the base Field class will see this and try to return it in the keyword arguments; thus, we can drop it from the keyword arguments for readability:

from django.db import models

class HandField(models.Field):

    def __init__(self, *args, **kwargs):
        kwargs['max_length'] = 104
        super(HandField, self).__init__(*args, **kwargs)

    def deconstruct(self):
        name, path, args, kwargs = super(HandField, self).deconstruct()
        del kwargs["max_length"]
        return name, path, args, kwargs

If you add a new keyword argument, you need to write code to put its value into kwargs yourself:

from django.db import models

class CommaSepField(models.Field):
    "Implements comma-separated storage of lists"

    def __init__(self, separator=",", *args, **kwargs):
        self.separator = separator
        super(CommaSepField, self).__init__(*args, **kwargs)

    def deconstruct(self):
        name, path, args, kwargs = super(CommaSepField, self).deconstruct()
        # Only include kwarg if it's not the default
        if self.separator != ",":
            kwargs['separator'] = self.separator
        return name, path, args, kwargs

More complex examples are beyond the scope of this document, but remember - for any configuration of your Field instance, deconstruct() must return arguments that you can pass to __init__ to reconstruct that state.

Pay extra attention if you set new default values for arguments in the Field superclass; you want to make sure they’re always included, rather than disappearing if they take on the old default value.

In addition, try to avoid returning values as positional arguments; where possible, return values as keyword arguments for maximum future compatibility. Of course, if you change the names of things more often than their position in the constructor’s argument list, you might prefer positional, but bear in mind that people will be reconstructing your field from the serialized version for quite a while (possibly years), depending how long your migrations live for.

You can see the results of deconstruction by looking in migrations that include the field, and you can test deconstruction in unit tests by just deconstructing and reconstructing the field:

name, path, args, kwargs = my_field_instance.deconstruct()
new_instance = MyField(*args, **kwargs)
self.assertEqual(my_field_instance.some_attribute, new_instance.some_attribute)
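
The deconstruct-and-rebuild round trip can also be tried outside a Django project. The sketch below is an assumption-laden illustration: StandInField is a hypothetical stand-in that mimics only the four-item return value described above, and it is not Django’s Field implementation. CommaSepField is the example from earlier, rebased onto that stand-in so the snippet runs without Django.

```python
class StandInField(object):
    """Hypothetical stand-in for models.Field (NOT Django's real class).

    It mimics only the contract discussed here: deconstruct() returns
    (name, path, args, kwargs).
    """

    def __init__(self, max_length=None):
        self.name = None
        self.max_length = max_length

    def deconstruct(self):
        path = '%s.%s' % (self.__class__.__module__, self.__class__.__name__)
        kwargs = {}
        if self.max_length is not None:
            kwargs['max_length'] = self.max_length
        return self.name, path, [], kwargs


class CommaSepField(StandInField):
    """The CommaSepField example, built on the stand-in base class."""

    def __init__(self, separator=",", *args, **kwargs):
        self.separator = separator
        super(CommaSepField, self).__init__(*args, **kwargs)

    def deconstruct(self):
        name, path, args, kwargs = super(CommaSepField, self).deconstruct()
        # Only include the kwarg if it is not the default.
        if self.separator != ",":
            kwargs['separator'] = self.separator
        return name, path, args, kwargs


original = CommaSepField(separator=";", max_length=50)
name, path, args, kwargs = original.deconstruct()

# Rebuilding from the deconstructed arguments restores the same state.
rebuilt = CommaSepField(*args, **kwargs)
```

With a real Field subclass the same three steps apply: deconstruct, rebuild from args and kwargs, then compare the attributes that matter.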

Documenting your custom field

As always, you should document your field type, so users will know what it is. In addition to providing a docstring for it, which is useful for developers, you can also allow users of the admin app to see a short description of the field type via the django.contrib.admindocs application. To do this simply provide descriptive text in a description class attribute of your custom field. In the above example, the description displayed by the admindocs application for a HandField will be ‘A hand of cards (bridge style)’.

In the django.contrib.admindocs display, the field description is interpolated with field.__dict__ which allows the description to incorporate arguments of the field. For example, the description for CharField is:

description = _("String (up to %(max_length)s)")
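
The interpolation is ordinary Python %-formatting with a mapping. In this minimal sketch a plain dict stands in for field.__dict__ (an assumption made for illustration), and the translation wrapper _() is left out so the snippet runs on its own.

```python
description = "String (up to %(max_length)s)"

# A plain dict standing in for the __dict__ of a CharField(max_length=100).
# Extra keys are ignored by the % operator.
field_attributes = {'max_length': 100, 'name': 'title'}

rendered = description % field_attributes
print(rendered)  # String (up to 100)
```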

Useful methods

Once you’ve created your Field subclass, you might consider overriding a few standard methods, depending on your field’s behavior. The list of methods below is in approximately decreasing order of importance, so start from the top.

Custom database types

Say you’ve created a PostgreSQL custom type called mytype. You can subclass Field and implement the db_type() method, like so:

from django.db import models

class MytypeField(models.Field):
    def db_type(self, connection):
        return 'mytype'

Once you have MytypeField, you can use it in any model, just like any other Field type:

class Person(models.Model):
    name = models.CharField(max_length=80)
    something_else = MytypeField()

If you aim to build a database-agnostic application, you should account for differences in database column types. For example, the date/time column type in PostgreSQL is called timestamp, while the same column in MySQL is called datetime. The simplest way to handle this in a db_type() method is to check the connection.settings_dict['ENGINE'] attribute.

For example:

class MyDateField(models.Field):
    def db_type(self, connection):
        if connection.settings_dict['ENGINE'] == 'django.db.backends.mysql':
            return 'datetime'
        else:
            return 'timestamp'

The db_type() method is called by Django when the framework constructs the CREATE TABLE statements for your application – that is, when you first create your tables. It is also called when constructing a WHERE clause that includes the model field – that is, when you retrieve data using QuerySet methods like get(), filter(), and exclude() and have the model field as an argument. It’s not called at any other time, so it can afford to execute slightly complex code, such as the connection.settings_dict check in the above example.
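
If more than two backends need different column types, one alternative to the if/else chain is a lookup table keyed by engine name. This is a sketch of that swapped-in approach, not the method used in the example above: DATE_COLUMN_TYPES and date_column_type are hypothetical names, and SimpleNamespace objects stand in for real connection objects so the snippet runs without Django.

```python
from types import SimpleNamespace

# Column type per backend engine. Only MySQL is listed; every other
# engine falls back to 'timestamp', matching the example above.
DATE_COLUMN_TYPES = {
    'django.db.backends.mysql': 'datetime',
}


def date_column_type(connection):
    """Return the date/time column type for the given connection."""
    engine = connection.settings_dict['ENGINE']
    return DATE_COLUMN_TYPES.get(engine, 'timestamp')


# Stand-ins exposing only the settings_dict attribute used above.
mysql = SimpleNamespace(
    settings_dict={'ENGINE': 'django.db.backends.mysql'})
postgres = SimpleNamespace(
    settings_dict={'ENGINE': 'django.db.backends.postgresql_psycopg2'})

print(date_column_type(mysql))
print(date_column_type(postgres))
```

Inside a field, db_type() could simply return the result of such a helper.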

Some database column types accept parameters, such as CHAR(25), where the parameter 25 represents the maximum column length. In cases like these, it’s more flexible if the parameter is specified in the model rather than being hard-coded in the db_type() method. For example, it wouldn’t make much sense to have a CharMaxlength25Field, shown here:

# This is a silly example of hard-coded parameters.
class CharMaxlength25Field(models.Field):
    def db_type(self, connection):
        return 'char(25)'

# In the model:
class MyModel(models.Model):
    # ...
    my_field = CharMaxlength25Field()

The better way of doing this would be to make the parameter specifiable at run time – i.e., when the class is instantiated. To do that, just implement Field.__init__(), like so:

# This is a much more flexible example.
class BetterCharField(models.Field):
    def __init__(self, max_length, *args, **kwargs):
        self.max_length = max_length
        super(BetterCharField, self).__init__(*args, **kwargs)

    def db_type(self, connection):
        return 'char(%s)' % self.max_length

# In the model:
class MyModel(models.Model):
    # ...
    my_field = BetterCharField(25)

Finally, if your column requires truly complex SQL setup, return None from db_type(). This will cause Django’s SQL creation code to skip over this field. You are then responsible for creating the column in the right table in some other way, of course, but this gives you a way to tell Django to get out of the way.

Converting values to Python objects

Changed in Django 1.8:

Historically, Django provided a metaclass called SubfieldBase which always called to_python() on assignment. This did not play nicely with custom database transformations, aggregation, or values queries, so it has been replaced with from_db_value().

If your custom Field class deals with data structures that are more complex than strings, dates, integers, or floats, then you may need to override from_db_value() and to_python().

If present for the field subclass, from_db_value() will be called in all circumstances when the data is loaded from the database, including in aggregates and values() calls.

to_python() is called by deserialization and during the clean() method used from forms.

As a general rule, to_python() should deal gracefully with any of the following arguments:

  • An instance of the correct type (e.g., Hand in our ongoing example).
  • A string
  • None (if the field allows null=True)

In our HandField class, we’re storing the data as a VARCHAR field in the database, so we need to be able to process strings and None in from_db_value(). In to_python(), we need to also handle Hand instances:

import re

from django.core.exceptions import ValidationError
from django.db import models

def parse_hand(hand_string):
    """Takes a string of cards and splits into a full hand."""
    p1 = re.compile('.{26}')
    p2 = re.compile('..')
    args = [p2.findall(x) for x in p1.findall(hand_string)]
    if len(args) != 4:
        raise ValidationError("Invalid input for a Hand instance")
    return Hand(*args)

class HandField(models.Field):
    # ...

    def from_db_value(self, value, expression, connection, context):
        if value is None:
            return value
        return parse_hand(value)

    def to_python(self, value):
        if isinstance(value, Hand):
            return value

        if value is None:
            return value

        return parse_hand(value)

Notice that we always return a Hand instance from these methods. That’s the Python object type we want to store in the model’s attribute.

For to_python(), if anything goes wrong during value conversion, you should raise a ValidationError exception.
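
To see what the two regular expressions in parse_hand() do, the parsing step can be run on its own. This sketch makes two assumptions so that it works without Django: it raises ValueError where the field code raises ValidationError, and it writes the ten as 'T' to keep each card two characters long.

```python
import re


class Hand(object):
    """A hand of cards (bridge style), as defined earlier in this document."""

    def __init__(self, north, east, south, west):
        self.north = north
        self.east = east
        self.south = south
        self.west = west


def parse_hand(hand_string):
    """Split a 104-character string into four lists of 13 cards."""
    # First cut the string into 26-character chunks (one per player),
    # then cut each chunk into 2-character cards.
    chunks = re.findall('.{26}', hand_string)
    args = [re.findall('..', chunk) for chunk in chunks]
    if len(args) != 4:
        # ValueError stands in for Django's ValidationError here.
        raise ValueError("Invalid input for a Hand instance")
    return Hand(*args)


deck = [rank + suit for suit in 'shdc' for rank in 'A23456789TJQK']
stored = ''.join(deck)  # what the database column would hold
hand = parse_hand(stored)
print(hand.north[0], hand.west[-1])
```

A string that is too short yields fewer than four chunks, so it is rejected rather than producing a partial Hand.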

Converting Python objects to query values

Since using a database requires conversion in both ways, if you override to_python() you also have to override get_prep_value() to convert Python objects back to query values.

For example:

class HandField(models.Field):
    # ...

    def get_prep_value(self, value):
        return ''.join([''.join(l) for l in (value.north,
                value.east, value.south, value.west)])

Warning

If your custom field uses the CHAR, VARCHAR or TEXT types for MySQL, you must make sure that get_prep_value() always returns a string type. MySQL performs flexible and unexpected matching when a query is performed on these types and the provided value is an integer, which can cause queries to include unexpected objects in their results. This problem cannot occur if you always return a string type from get_prep_value().

Converting query values to database values

Some data types (for example, dates) need to be in a specific format before they can be used by a database backend. get_db_prep_value() is the method where those conversions should be made. The specific connection that will be used for the query is passed as the connection parameter. This allows you to use backend-specific conversion logic if it is required.

For example, Django uses the following method for its BinaryField:

def get_db_prep_value(self, value, connection, prepared=False):
    value = super(BinaryField, self).get_db_prep_value(value, connection, prepared)
    if value is not None:
        return connection.Database.Binary(value)
    return value

In case your custom field needs a special conversion when being saved that is not the same as the conversion used for normal query parameters, you can override get_db_prep_save().

Preprocessing values before saving

If you want to preprocess the value just before saving, you can use pre_save(). For example, Django’s DateTimeField uses this method to set the attribute correctly in the case of auto_now or auto_now_add.

If you do override this method, you must return the value of the attribute at the end. You should also update the model’s attribute if you make any changes to the value so that code holding references to the model will always see the correct value.
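
The two obligations above (return the value, and write it back to the model instance) can be sketched as follows. AutoTimestamp and Record are hypothetical stand-ins rather than Django classes; only the pre_save(model_instance, add) signature and the attname attribute are taken from Django’s Field.

```python
import datetime


class AutoTimestamp(object):
    """Hypothetical stand-in showing the pre_save() pattern (not a real field)."""

    def __init__(self, attname):
        self.attname = attname  # name of the attribute on the model instance

    def pre_save(self, model_instance, add):
        value = datetime.datetime.now()
        # Update the instance so code holding a reference sees the new value.
        setattr(model_instance, self.attname, value)
        # Always return the value that should be saved.
        return value


class Record(object):
    """Hypothetical stand-in for a model instance."""
    modified = None


record = Record()
field = AutoTimestamp('modified')
returned = field.pre_save(record, add=True)
```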

Preparing values for use in database lookups

As with value conversions, preparing a value for database lookups is a two phase process.

get_prep_lookup() performs the first phase of lookup preparation: type conversion and data validation.

get_prep_lookup() prepares the value for passing to the database when used in a lookup (a WHERE constraint in SQL). The lookup_type argument will be one of the valid Django filter lookups: exact, iexact, contains, icontains, gt, gte, lt, lte, in, startswith, istartswith, endswith, iendswith, range, year, month, day, isnull, search, regex, and iregex.

New in Django 1.7:

If you are using Custom lookups the lookup_type can be any lookup_name used by the project’s custom lookups.

Your method must be prepared to handle all of these lookup_type values and should raise either a ValueError if the value is of the wrong sort (a list when you were expecting an object, for example) or a TypeError if your field does not support that type of lookup. For many fields, you can get by with handling the lookup types that need special handling for your field and pass the rest to the get_db_prep_lookup() method of the parent class.

If you needed to implement get_db_prep_save(), you will usually need to implement get_prep_lookup(). If you don’t, get_prep_value() will be called by the default implementation, to manage exact, gt, gte, lt, lte, in and range lookups.

You may also want to implement this method to limit the lookup types that could be used with your custom field type.

Note that, for "range" and "in" lookups, get_prep_lookup will receive a list of objects (presumably of the right type) and will need to convert them to a list of things of the right type for passing to the database. Most of the time, you can reuse get_prep_value(), or at least factor out some common pieces.

For example, the following code implements get_prep_lookup to limit the accepted lookup types to exact and in:

class HandField(models.Field):
    # ...

    def get_prep_lookup(self, lookup_type, value):
        # We only handle 'exact' and 'in'. All others are errors.
        if lookup_type == 'exact':
            return self.get_prep_value(value)
        elif lookup_type == 'in':
            return [self.get_prep_value(v) for v in value]
        else:
            raise TypeError('Lookup type %r not supported.' % lookup_type)

For performing database-specific data conversions required by a lookup, you can override get_db_prep_lookup().

Specifying the form field for a model field

To customize the form field used by ModelForm, you can override formfield().

The form field class can be specified via the form_class and choices_form_class arguments; the latter is used if the field has choices specified, the former otherwise. If these arguments are not provided, CharField or TypedChoiceField will be used.

All of the kwargs dictionary is passed directly to the form field’s __init__() method. Normally, all you need to do is set up a good default for the form_class (and maybe choices_form_class) argument and then delegate further handling to the parent class. This might require you to write a custom form field (and even a form widget). See the forms documentation for information about this.

Continuing our ongoing example, we can write the formfield() method as:

class HandField(models.Field):
    # ...

    def formfield(self, **kwargs):
        # This is a fairly standard way to set up some defaults
        # while letting the caller override them.
        defaults = {'form_class': MyFormField}
        defaults.update(kwargs)
        return super(HandField, self).formfield(**defaults)

This assumes we’ve imported a MyFormField field class (which has its own default widget). This document doesn’t cover the details of writing custom form fields.
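
The defaults-then-update pattern used in formfield() is plain dictionary handling and can be checked in isolation. In this sketch build_formfield_kwargs is a hypothetical helper, and the strings 'MyFormField' and 'OtherFormField' are placeholders standing in for real form field classes.

```python
def build_formfield_kwargs(**kwargs):
    """Merge caller-supplied kwargs over the field's own defaults."""
    defaults = {'form_class': 'MyFormField'}
    # Anything the caller passes wins over the default with the same key.
    defaults.update(kwargs)
    return defaults


print(build_formfield_kwargs())
print(build_formfield_kwargs(form_class='OtherFormField', label='Hand'))
```

The field supplies a sensible default, while any caller that passes form_class explicitly still gets exactly what it asked for.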

Emulating built-in field types

If you have created a db_type() method, you don’t need to worry about get_internal_type() – it won’t be used much. Sometimes, though, your database storage is similar in type to some other field, so you can use that other field’s logic to create the right column.

For example:

class HandField(models.Field):
    # ...

    def get_internal_type(self):
        return 'CharField'

No matter which database backend we are using, this will mean that migrate and other SQL commands create the right column type for storing a string.

If get_internal_type() returns a string that is not known to Django for the database backend you are using – that is, it doesn’t appear in django.db.backends.<db_name>.base.DatabaseWrapper.data_types – the string will still be used by the serializer, but the default db_type() method will return None. See the documentation of db_type() for reasons why this might be useful. Putting a descriptive string in as the type of the field for the serializer is a useful idea if you’re ever going to be using the serializer output in some other place, outside of Django.

Converting field data for serialization

To customize how the values are serialized by a serializer, you can override value_to_string(). Calling Field._get_val_from_obj(obj) is the best way to get the value serialized. For example, since our HandField uses strings for its data storage anyway, we can reuse some existing conversion code:

class HandField(models.Field):
    # ...

    def value_to_string(self, obj):
        value = self._get_val_from_obj(obj)
        return self.get_prep_value(value)

Some general advice

Writing a custom field can be a tricky process, particularly if you’re doing complex conversions between your Python types and your database and serialization formats. Here are a couple of tips to make things go more smoothly:

  1. Look at the existing Django fields (in django/db/models/fields/__init__.py) for inspiration. Try to find a field that’s similar to what you want and extend it a little bit, instead of creating an entirely new field from scratch.
  2. Put a __str__() (__unicode__() on Python 2) method on the class you’re wrapping up as a field. There are a lot of places where the default behavior of the field code is to call force_text() on the value. (In our examples in this document, value would be a Hand instance, not a HandField.) So if your __str__() method (__unicode__() on Python 2) automatically converts to the string form of your Python object, you can save yourself a lot of work.

Writing a FileField

In addition to the above methods, fields that deal with files have a few other special requirements which must be taken into account. The majority of the mechanics provided by FileField, such as controlling database storage and retrieval, can remain unchanged, leaving subclasses to deal with the challenge of supporting a particular type of file.

Django provides a File class, which is used as a proxy to the file’s contents and operations. This can be subclassed to customize how the file is accessed, and what methods are available. It lives at django.db.models.fields.files, and its default behavior is explained in the file documentation.

Once a subclass of File is created, the new FileField subclass must be told to use it. To do so, simply assign the new File subclass to the special attr_class attribute of the FileField subclass.

A few suggestions

In addition to the above details, there are a few guidelines which can greatly improve the efficiency and readability of the field’s code.

  1. The source for Django’s own ImageField (in django/db/models/fields/files.py) is a great example of how to subclass FileField to support a particular type of file, as it incorporates all of the techniques described above.
  2. Cache file attributes wherever possible. Since files may be stored in remote storage systems, retrieving them may cost extra time, or even money, that isn’t always necessary. Once a file is retrieved to obtain some data about its content, cache as much of that data as possible to reduce the number of times the file must be retrieved on subsequent calls for that information.
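
The caching advice can be sketched independently of Django. CachedImageInfo and slow_read are hypothetical names: the callable stands in for whatever expensive operation fetches the file from remote storage and inspects it.

```python
class CachedImageInfo(object):
    """Hypothetical helper that reads the dimensions at most once."""

    def __init__(self, read_dimensions):
        self._read_dimensions = read_dimensions  # expensive callable
        self._dimensions = None

    @property
    def dimensions(self):
        # Only call the expensive function the first time it is needed.
        if self._dimensions is None:
            self._dimensions = self._read_dimensions()
        return self._dimensions


calls = []


def slow_read():
    """Stand-in for retrieving a remote file and measuring it."""
    calls.append(1)
    return (640, 480)


info = CachedImageInfo(slow_read)
first = info.dimensions
second = info.dimensions
print(len(calls))  # the expensive read ran once, not twice
```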