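pandas.DataFrame
class pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
Two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.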
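The label alignment mentioned in the description can be sketched as follows. This is a minimal illustration, not taken from the pandas documentation; the frames `left` and `right` and their labels are hypothetical.

```python
import pandas as pd

# Hypothetical frames that only partly share row and column labels.
left = pd.DataFrame({'a': [1, 2], 'b': [3, 4]}, index=['x', 'y'])
right = pd.DataFrame({'b': [10, 20], 'c': [30, 40]}, index=['y', 'z'])

# The + operator aligns on both axes; any cell missing from
# either operand becomes NaN in the result.
total = left + right

# add() with fill_value treats a value missing from one operand as 0;
# cells missing from both operands remain NaN.
filled = left.add(right, fill_value=0)

print(total)
print(filled)
```

Only the cell at row `y`, column `b` exists in both frames, so it is the only non-missing entry in `total`.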
Parameters:
data : numpy ndarray (structured or homogeneous), dict, or DataFrame
Dict can contain Series, arrays, constants, or list-like objects
index : Index or array-like
Index to use for resulting frame. Will default to np.arange(n) if no indexing information is part of the input data and no index is provided
columns : Index or array-like
Column labels to use for resulting frame. Will default to np.arange(n) if no column labels are provided
dtype : dtype, default None
Data type to force. Only a single dtype is allowed. If None, infer
copy : boolean, default False
Copy data from inputs. Only affects DataFrame / 2d ndarray input
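As a minimal sketch of how the index and columns parameters behave, the example below builds the same array twice, once without labels and once with explicit labels. The array contents and the label names (`r1`, `x`, and so on) are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical 2x3 array used only for illustration.
arr = np.array([[1, 2, 3], [4, 5, 6]])

# No labels supplied: both axes fall back to integer labels 0..n-1.
default_df = pd.DataFrame(arr)

# Explicit row and column labels.
labeled_df = pd.DataFrame(arr, index=['r1', 'r2'], columns=['x', 'y', 'z'])

print(list(default_df.index), list(default_df.columns))
print(list(labeled_df.index), list(labeled_df.columns))
```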
See also
DataFrame.from_records
- constructor from tuples, also record arrays
DataFrame.from_dict
- from dicts of Series, arrays, or dicts
DataFrame.from_items
- from sequence of (key, value) pairs
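The alternative constructors listed above can be sketched briefly. This example covers only from_records and from_dict; the data and column names are hypothetical, and orient='index' is one of several options from_dict accepts.

```python
import pandas as pd

# from_records: each tuple is one row; column names are passed separately.
records = [(1, 'a'), (2, 'b')]
df_records = pd.DataFrame.from_records(records, columns=['num', 'letter'])

# from_dict with orient='index': each outer key becomes a row label.
data = {'row1': {'x': 1, 'y': 2}, 'row2': {'x': 3, 'y': 4}}
df_dict = pd.DataFrame.from_dict(data, orient='index')

print(df_records)
print(df_dict)
```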
Examples
Constructing DataFrame from a dictionary.
>>> import numpy as np
>>> import pandas as pd
>>> d = {'col1': [1, 2], 'col2': [3, 4]}
>>> df = pd.DataFrame(data=d)
>>> df
   col1  col2
0     1     3
1     2     4
Notice that the inferred dtype is int64.
>>> df.dtypes
col1    int64
col2    int64
dtype: object
To enforce a single dtype:
>>> df = pd.DataFrame(data=d, dtype=np.int8)
>>> df.dtypes
col1    int8
col2    int8
dtype: object
Constructing DataFrame from numpy ndarray:
>>> df2 = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)),
...                    columns=['a', 'b', 'c', 'd', 'e'])
>>> df2
   a  b  c  d  e
0  2  8  8  3  4
1  4  2  9  0  9
2  1  0  7  8  0
3  5  1  7  1  3
4  6  0  2  4  2
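Attributes
T
Transpose index and columns.
at
Fast label-based scalar accessor.
axes
Return a list with the row axis labels and column axis labels as the only members.
blocks
Internal property, property synonym for as_blocks().
dtypes
Return the dtypes in this object.
empty
True if NDFrame is entirely empty [no items], meaning any of the axes are of length 0.
ftypes
Return the ftypes (indication of sparse/dense and dtype) in this object.
iat
Fast integer location scalar accessor.
iloc
Purely integer-location based indexing for selection by position.
is_copy
ix
A primarily label-location based indexer, with integer position fallback.
loc
Purely label-location based indexer for selection by label.
ndim
Number of axes / array dimensions.
shape
Return a tuple representing the dimensionality of the DataFrame.
size
Number of elements in the NDFrame.
style
Property returning a Styler object containing methods for building a styled HTML representation for the DataFrame.
values
Numpy representation of NDFrame.
Methods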
abs
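() Return an object with absolute values taken; only applicable to objects that are all numeric.
add
(other[, axis, level, fill_value]) Addition of dataframe and other, element-wise (binary operator add).
add_prefix
(prefix) Concatenate prefix string with panel items names.
add_suffix
(suffix) Concatenate suffix string with panel items names.
agg
(func[, axis]) Aggregate using callable, string, dict, or list of string/callables.
aggregate
(func[, axis]) Aggregate using callable, string, dict, or list of string/callables.
align
(other[, join, axis, level, copy, ...]) Align two objects on their axes.
all
([axis, bool_only, skipna, level]) Return whether all elements are True over requested axis.
any
([axis, bool_only, skipna, level]) Return whether any element is True over requested axis.
append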
(other[, ignore_index, verify_integrity]) Append rows of other to the end of this frame, returning a new object.
apply
(func[, axis, broadcast, raw, reduce, args]) Applies function along input axis of DataFrame.
applymap
(func) Apply a function to a DataFrame that is intended to operate elementwise.
as_blocks
([copy]) Convert the frame to a dict of dtype -> Constructor Types that each has a homogeneous dtype.
as_matrix
([columns]) Convert the frame to its Numpy-array representation.
asfreq
(freq[, method, how, normalize, ...]) Convert TimeSeries to specified frequency.
asof
(where[, subset]) The last row without any NaN is taken.
assign
(**kwargs) Assign new columns to a DataFrame, returning a new object (a copy) with all the original columns in addition to the new ones.
astype
(dtype[, copy, errors]) Cast a pandas object to a specified dtype dtype.
at_time
(time[, asof]) Select values at particular time of day.
between_time
(start_time, end_time[, ...]) Select values between particular times of the day (e.g., 9:00-9:30 AM).
bfill
([axis, inplace, limit, downcast]) Synonym for DataFrame.fillna(method='bfill')
bool
() Return the bool of a single element PandasObject.
boxplot
([column, by, ax, fontsize, rot, ...]) Make a box plot from DataFrame column optionally grouped by some columns.
clip
([lower, upper, axis, inplace]) Trim values at input threshold(s).
clip_lower
(threshold[, axis, inplace]) Return copy of the input with values below given value(s) truncated.
clip_upper
(threshold[, axis, inplace]) Return copy of input with values above given value(s) truncated.
combine
(other, func[, fill_value, overwrite]) Add two DataFrame objects and do not propagate NaN values.
combine_first
(other) Combine two DataFrame objects and default to non-null values in frame calling the method.
compound
([axis, skipna, level]) Return the compound percentage of the values for the requested axis.
consolidate
([inplace]) DEPRECATED: consolidate will be an internal implementation only.
convert_objects
([convert_dates, ...]) Deprecated.
copy
([deep]) Make a copy of this object's data.
corr
([method, min_periods]) Compute pairwise correlation of columns, excluding NA/null values.
corrwith
(other[, axis, drop]) Compute pairwise correlation between rows or columns of two DataFrame objects.
count
([axis, level, numeric_only]) Return Series with number of non-NA/null observations over requested axis.
cov
([min_periods]) Compute pairwise covariance of columns, excluding NA/null values.
cummax
([axis, skipna]) Return cumulative max over requested axis.
cummin
([axis, skipna]) Return cumulative minimum over requested axis.
cumprod
([axis, skipna]) Return cumulative product over requested axis.
cumsum
([axis, skipna]) Return cumulative sum over requested axis.
describe
([percentiles, include, exclude]) Generates descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.
diff
([periods, axis]) 1st discrete difference of object.
div
(other[, axis, level, fill_value]) Floating division of dataframe and other, element-wise (binary operator truediv).
divide
(other[, axis, level, fill_value]) Floating division of dataframe and other, element-wise (binary operator truediv).
dot
(other) Matrix multiplication with DataFrame or Series objects.
drop
([labels, axis, index, columns, level, ...]) Return new object with labels in requested axis removed.
drop_duplicates
([subset, keep, inplace]) Return DataFrame with duplicate rows removed.
dropna
([axis, how, thresh, subset, inplace]) Return object with labels on given axis omitted where data are missing.
duplicated
([subset, keep]) Return boolean Series denoting duplicate rows.
eq
(other[, axis, level]) Wrapper for flexible comparison methods eq.
equals
(other) Determines if two NDFrame objects contain the same elements.
eval
(expr[, inplace]) Evaluate an expression in the context of the calling DataFrame instance.
ewm
([com, span, halflife, alpha, ...]) Provides exponential weighted functions.
expanding
([min_periods, freq, center, axis]) Provides expanding transformations.
ffill
([axis, inplace, limit, downcast]) Synonym for DataFrame.fillna(method='ffill')
fillna
([value, method, axis, inplace, ...]) Fill NA/NaN values using the specified method.
filter
([items, like, regex, axis]) Subset rows or columns of dataframe according to labels in the specified index.
first
(offset) Convenience method for subsetting initial periods of time series data based on a date offset.
first_valid_index
() Return index for first non-NA/null value.
floordiv
(other[, axis, level, fill_value]) Integer division of dataframe and other, element-wise (binary operator floordiv).
from_csv
(path[, header, sep, index_col, ...]) Read CSV file (DEPRECATED, please use pandas.read_csv() instead).
from_dict
(data[, orient, dtype]) Construct DataFrame from dict of array-like or dicts.
from_items
(items[, columns, orient]) Convert (key, value) pairs to DataFrame.
from_records
(data[, index, exclude, ...]) Convert structured or record ndarray to DataFrame.
ge
(other[, axis, level]) Wrapper for flexible comparison methods ge.
get
(key[, default]) Get item from object for given key (DataFrame column, Panel slice, etc.).
get_dtype_counts
() Return the counts of dtypes in this object.
get_ftype_counts
() Return the counts of ftypes in this object.
get_value
(index, col[, takeable]) Quickly retrieve single value at passed column and index.
get_values
() Same as values (but handles sparseness conversions).
groupby
([by, axis, level, as_index, sort, ...]) Group series using mapper (dict or key function, apply given function to group, return result as series) or by a series of columns.
gt
(other[, axis, level]) Wrapper for flexible comparison methods gt.
head
([n]) Return the first n rows.
hist
(data[, column, by, grid, xlabelsize, ...]) Draw histogram of the DataFrame’s series using matplotlib / pylab.
idxmax
([axis, skipna]) Return index of first occurrence of maximum over requested axis.
idxmin
([axis, skipna]) Return index of first occurrence of minimum over requested axis.
infer_objects
() Attempt to infer better dtypes for object columns.
info
([verbose, buf, max_cols, memory_usage, ...]) Concise summary of a DataFrame.
insert
(loc, column, value[, allow_duplicates]) Insert column into DataFrame at specified location.
interpolate
([method, axis, limit, inplace, ...]) Interpolate values according to different methods.
isin
(values) Return boolean DataFrame showing whether each element in the DataFrame is contained in values.
isna
() Return a boolean same-sized object indicating if the values are NA.
isnull
() Return a boolean same-sized object indicating if the values are NA.
items
() Iterator over (column name, Series) pairs.
iteritems
() Iterator over (column name, Series) pairs.
iterrows
() Iterate over DataFrame rows as (index, Series) pairs.
itertuples
([index, name]) Iterate over DataFrame rows as namedtuples, with index value as first element of the tuple.
join
(other[, on, how, lsuffix, rsuffix, sort]) Join columns with other DataFrame either on index or on a key column.
keys
() Get the ‘info axis’ (see Indexing for more).
kurt
([axis, skipna, level, numeric_only]) Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0).
kurtosis
([axis, skipna, level, numeric_only]) Return unbiased kurtosis over requested axis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0).
last
(offset) Convenience method for subsetting final periods of time series data based on a date offset.
last_valid_index
() Return index for last non-NA/null value.
le
(other[, axis, level]) Wrapper for flexible comparison methods le.
lookup
(row_labels, col_labels) Label-based “fancy indexing” function for DataFrame.
lt
(other[, axis, level]) Wrapper for flexible comparison methods lt.
mad
([axis, skipna, level]) Return the mean absolute deviation of the values for the requested axis.
mask
(cond[, other, inplace, axis, level, ...]) Return an object of same shape as self and whose corresponding entries are from self where cond is False and otherwise are from other.
max
([axis, skipna, level, numeric_only]) This method returns the maximum of the values in the object.
mean
([axis, skipna, level, numeric_only]) Return the mean of the values for the requested axis.
median
([axis, skipna, level, numeric_only]) Return the median of the values for the requested axis.
melt
([id_vars, value_vars, var_name, ...]) “Unpivots” a DataFrame from wide format to long format.
memory_usage
([index, deep]) Memory usage of DataFrame columns.
merge
(right[, how, on, left_on, right_on, ...]) Merge DataFrame objects by performing a database-style join operation by columns or indexes.
min
([axis, skipna, level, numeric_only]) This method returns the minimum of the values in the object.
mod
(other[, axis, level, fill_value]) Modulo of dataframe and other, element-wise (binary operator mod).
mode
([axis, numeric_only]) Gets the mode(s) of each element along the axis selected.
mul
(other[, axis, level, fill_value]) Multiplication of dataframe and other, element-wise (binary operator mul).
multiply
(other[, axis, level, fill_value]) Multiplication of dataframe and other, element-wise (binary operator mul).
ne
(other[, axis, level]) Wrapper for flexible comparison methods ne.
nlargest
(n, columns[, keep]) Get the rows of a DataFrame sorted by the n largest values of columns.
notna
() Return a boolean same-sized object indicating if the values are not NA.
notnull
() Return a boolean same-sized object indicating if the values are not NA.
nsmallest
(n, columns[, keep]) Get the rows of a DataFrame sorted by the n smallest values of columns.
nunique
([axis, dropna]) Return Series with number of distinct observations over requested axis.
pct_change
([periods, fill_method, limit, freq]) Percent change over given number of periods.
pipe
(func, *args, **kwargs) Apply func(self, *args, **kwargs).
pivot
([index, columns, values]) Reshape data (produce a “pivot” table) based on column values.
pivot_table
([values, index, columns, ...]) Create a spreadsheet-style pivot table as a DataFrame.
plot
alias of FramePlotMethods
pop
(item) Return item and drop from frame.
pow
(other[, axis, level, fill_value]) Exponential power of dataframe and other, element-wise (binary operator pow).
prod
([axis, skipna, level, numeric_only, ...]) Return the product of the values for the requested axis.
product
([axis, skipna, level, numeric_only, ...]) Return the product of the values for the requested axis.
quantile
([q, axis, numeric_only, interpolation]) Return values at the given quantile over requested axis, a la numpy.percentile.
query
(expr[, inplace]) Query the columns of a frame with a boolean expression.
radd
(other[, axis, level, fill_value]) Addition of dataframe and other, element-wise (binary operator radd).
rank
([axis, method, numeric_only, ...]) Compute numerical data ranks (1 through n) along axis.
rdiv
(other[, axis, level, fill_value]) Floating division of dataframe and other, element-wise (binary operator rtruediv).
reindex
([labels, index, columns, axis, ...]) Conform DataFrame to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.
reindex_axis
(labels[, axis, method, level, ...]) Conform input object to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.
reindex_like
(other[, method, copy, limit, ...]) Return an object with matching indices to myself.
rename
([mapper, index, columns, axis, copy, ...]) Alter axes labels.
rename_axis
(mapper[, axis, copy, inplace]) Alter the name of the index or columns.
reorder_levels
(order[, axis]) Rearrange index levels using input order.
replace
([to_replace, value, inplace, limit, ...]) Replace values given in ‘to_replace’ with ‘value’.
resample
(rule[, how, axis, fill_method, ...]) Convenience method for frequency conversion and resampling of time series.
reset_index
([level, drop, inplace, ...]) For DataFrame with multi-level index, return new DataFrame with labeling information in the columns under the index names, defaulting to ‘level_0’, ‘level_1’, etc.
rfloordiv
(other[, axis, level, fill_value]) Integer division of dataframe and other, element-wise (binary operator rfloordiv).
rmod
(other[, axis, level, fill_value]) Modulo of dataframe and other, element-wise (binary operator rmod).
rmul
(other[, axis, level, fill_value]) Multiplication of dataframe and other, element-wise (binary operator rmul).
rolling
(window[, min_periods, freq, center, ...]) Provides rolling window calculations.
round
([decimals]) Round a DataFrame to a variable number of decimal places.
rpow
(other[, axis, level, fill_value]) Exponential power of dataframe and other, element-wise (binary operator rpow).
rsub
(other[, axis, level, fill_value]) Subtraction of dataframe and other, element-wise (binary operator rsub).
rtruediv
(other[, axis, level, fill_value]) Floating division of dataframe and other, element-wise (binary operator rtruediv).
sample
([n, frac, replace, weights, ...]) Returns a random sample of items from an axis of object.
select
(crit[, axis]) Return data corresponding to axis labels matching criteria.
select_dtypes
([include, exclude]) Return a subset of a DataFrame including/excluding columns based on their dtype.
sem
([axis, skipna, level, ddof, numeric_only]) Return unbiased standard error of the mean over requested axis.
set_axis
(labels[, axis, inplace]) Assign desired index to given axis.
set_index
(keys[, drop, append, inplace, ...]) Set the DataFrame index (row labels) using one or more existing columns.
set_value
(index, col, value[, takeable]) Put single value at passed column and index.
shift
([periods, freq, axis]) Shift index by desired number of periods with an optional time freq.
skew
([axis, skipna, level, numeric_only]) Return unbiased skew over requested axis.
slice_shift
([periods, axis]) Equivalent to shift without copying data.
sort_index
([axis, level, ascending, ...]) Sort object by labels (along an axis).
sort_values
(by[, axis, ascending, inplace, ...]) Sort by the values along either axis.
sortlevel
([level, axis, ascending, inplace, ...]) DEPRECATED: use DataFrame.sort_index()
squeeze
([axis]) Squeeze length 1 dimensions.
stack
([level, dropna]) Pivot a level of the (possibly hierarchical) column labels, returning a DataFrame (or Series in the case of an object with a single level of column labels) having a hierarchical index with a new inner-most level of row labels.
std
([axis, skipna, level, ddof, numeric_only]) Return sample standard deviation over requested axis.
sub
(other[, axis, level, fill_value]) Subtraction of dataframe and other, element-wise (binary operator sub).
subtract
(other[, axis, level, fill_value]) Subtraction of dataframe and other, element-wise (binary operator sub).
sum
([axis, skipna, level, numeric_only, ...]) Return the sum of the values for the requested axis.
swapaxes
(axis1, axis2[, copy]) Interchange axes and swap values axes appropriately.
swaplevel
([i, j, axis]) Swap levels i and j in a MultiIndex on a particular axis.
tail
([n]) Return the last n rows.
take
(indices[, axis, convert, is_copy]) Return the elements in the given positional indices along an axis.
to_clipboard
([excel, sep]) Attempt to write text representation of object to the system clipboard. This can be pasted into Excel, for example.
to_csv
([path_or_buf, sep, na_rep, ...]) Write DataFrame to a comma-separated values (csv) file.
to_dense
() Return dense representation of NDFrame (as opposed to sparse).
to_dict
([orient, into]) Convert DataFrame to dictionary.
to_excel
(excel_writer[, sheet_name, na_rep, ...]) Write DataFrame to an excel sheet.
to_feather
(fname) Write out the binary feather-format for DataFrames.
to_gbq
(destination_table, project_id[, ...]) Write a DataFrame to a Google BigQuery table.
to_hdf
(path_or_buf, key, **kwargs) Write the contained data to an HDF5 file using HDFStore.
to_html
([buf, columns, col_space, header, ...]) Render a DataFrame as an HTML table.
to_json
([path_or_buf, orient, date_format, ...]) Convert the object to a JSON string.
to_latex
([buf, columns, col_space, header, ...]) Render an object to a tabular environment table.
to_msgpack
([path_or_buf, encoding]) msgpack (serialize) object to input file path.
to_panel
() Transform long (stacked) format (DataFrame) into wide (3D, Panel) format.
to_parquet
(fname[, engine, compression]) Write a DataFrame to the binary parquet format.
to_period
([freq, axis, copy]) Convert DataFrame from DatetimeIndex to PeriodIndex.
to_pickle
(path[, compression, protocol]) Pickle (serialize) object to input file path.
to_records
([index, convert_datetime64]) Convert DataFrame to record array.
to_sparse
([fill_value, kind]) Convert to SparseDataFrame.
to_sql
(name, con[, flavor, schema, ...]) Write records stored in a DataFrame to a SQL database.
to_stata
(fname[, convert_dates, ...]) A class for writing Stata binary dta files from array-like objects.
to_string
([buf, columns, col_space, header, ...]) Render a DataFrame to a console-friendly tabular output.
to_timestamp
([freq, how, axis, copy]) Cast to DatetimeIndex of timestamps, at beginning of period.
to_xarray
() Return an xarray object from the pandas object.
transform
(func, *args, **kwargs) Call function producing a like-indexed NDFrame.
transpose
(*args, **kwargs) Transpose index and columns.
truediv
(other[, axis, level, fill_value]) Floating division of dataframe and other, element-wise (binary operator truediv).
truncate
([before, after, axis, copy]) Truncates a sorted DataFrame/Series before and/or after some particular index value.
tshift
([periods, freq, axis]) Shift the time index, using the index’s frequency if available.
tz_convert
(tz[, axis, level, copy]) Convert tz-aware axis to target time zone.
tz_localize
(tz[, axis, level, copy, ambiguous]) Localize tz-naive TimeSeries to target time zone.
unstack
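([level, fill_value]) Pivot a level of the (necessarily hierarchical) index labels, returning a DataFrame having a new level of column labels whose inner-most level consists of the pivoted index labels.
update
(other[, join, overwrite, ...]) Modify DataFrame in place using non-NA values from passed DataFrame.
var
([axis, skipna, level, ddof, numeric_only]) Return unbiased variance over requested axis.
where
(cond[, other, inplace, axis, level, ...]) Return an object of same shape as self and whose corresponding entries are from self where cond is True and otherwise are from other.
xs
(key[, axis, level, drop_level]) Returns a cross-section (row(s) or column(s)) from the Series/DataFrame.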
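The where and mask entries in the table above are complements of each other, which a short sketch may make clearer. The frame `df` and the replacement value 0 are hypothetical choices for illustration.

```python
import pandas as pd

# Hypothetical frame mixing positive and negative values.
df = pd.DataFrame({'a': [1, -2], 'b': [-3, 4]})

# where keeps entries where the condition is True and replaces the rest.
kept_positive = df.where(df > 0, other=0)

# mask does the opposite: it replaces entries where the condition is True.
masked_positive = df.mask(df > 0, other=0)

print(kept_positive)
print(masked_positive)
```

Adding the two results cell by cell gives back the original frame here, because each cell survives in exactly one of them and is 0 in the other.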