Series([data, index, dtype, name, copy, …])
Series
pandas-on-Spark Series that corresponds to pandas Series logically.
Series.index
The index (axis labels) Column of the Series.
Series.dtype
Return the dtype object of the underlying data.
Series.dtypes
Series.ndim
Return an int representing the number of array dimensions.
Series.name
Return name of the Series.
Series.shape
Return a tuple of the shape of the underlying data.
Series.axes
Return a list of the row axis labels.
Series.size
Return an int representing the number of elements in this object.
Series.empty
Returns true if the current object is empty.
Series.T
Return the transpose, which is by definition self.
Series.hasnans
Return True if it has any missing values.
Series.values
Return a Numpy representation of the DataFrame or the Series.
Series.astype(dtype)
Series.astype
Cast a pandas-on-Spark object to the specified dtype.
Series.copy([deep])
Series.copy
Make a copy of this object’s indices and data.
Series.bool()
Series.bool
Return the bool of a single element in the current object.
Series.at
Access a single value for a row/column label pair.
Series.iat
Access a single value for a row/column pair by integer position.
Series.loc
Access a group of rows and columns by label(s) or a boolean Series.
Series.iloc
Purely integer-location based indexing for selection by position.
Series.keys()
Series.keys
Return alias for index.
Series.pop(item)
Series.pop
Return item and drop from series.
Series.items()
Series.items
This is an alias of iteritems.
Series.iteritems()
Series.iteritems
Lazily iterate over (index, value) tuples.
Series.item()
Series.item
Return the first element of the underlying data as a Python scalar.
Series.xs(key[, level])
Series.xs
Return cross-section from the Series.
Series.get(key[, default])
Series.get
Get item from object for given key (DataFrame column, Panel slice, etc.).
Series.add(other)
Series.add
Return Addition of series and other, element-wise (binary operator +).
Series.div(other)
Series.div
Return Floating division of series and other, element-wise (binary operator /).
Series.mul(other)
Series.mul
Return Multiplication of series and other, element-wise (binary operator *).
Series.radd(other)
Series.radd
Return Reverse Addition of series and other, element-wise (binary operator +).
Series.rdiv(other)
Series.rdiv
Return Reverse Floating division of series and other, element-wise (binary operator /).
Series.rmul(other)
Series.rmul
Return Reverse Multiplication of series and other, element-wise (binary operator *).
Series.rsub(other)
Series.rsub
Return Reverse Subtraction of series and other, element-wise (binary operator -).
Series.rtruediv(other)
Series.rtruediv
Series.sub(other)
Series.sub
Return Subtraction of series and other, element-wise (binary operator -).
Series.truediv(other)
Series.truediv
Series.pow(other)
Series.pow
Return Exponential power of series and other, element-wise (binary operator **).
Series.rpow(other)
Series.rpow
Return Reverse Exponential power of series and other, element-wise (binary operator **).
Series.mod(other)
Series.mod
Return Modulo of series and other, element-wise (binary operator %).
Series.rmod(other)
Series.rmod
Return Reverse Modulo of series and other, element-wise (binary operator %).
Series.floordiv(other)
Series.floordiv
Return Integer division of series and other, element-wise (binary operator //).
Series.rfloordiv(other)
Series.rfloordiv
Return Reverse Integer division of series and other, element-wise (binary operator //).
Series.divmod(other)
Series.divmod
Return Integer division and modulo of series and other, element-wise (binary operator divmod).
Series.rdivmod(other)
Series.rdivmod
Return Integer division and modulo of series and other, element-wise (binary operator rdivmod).
Series.combine_first(other)
Series.combine_first
Combine Series values, choosing the calling Series’s values first.
Series.lt(other)
Series.lt
Compare if the current value is less than the other.
Series.gt(other)
Series.gt
Compare if the current value is greater than the other.
Series.le(other)
Series.le
Compare if the current value is less than or equal to the other.
Series.ge(other)
Series.ge
Compare if the current value is greater than or equal to the other.
Series.ne(other)
Series.ne
Compare if the current value is not equal to the other.
Series.eq(other)
Series.eq
Compare if the current value is equal to the other.
Series.product([axis, numeric_only, min_count])
Series.product
Return the product of the values.
Series.dot(other)
Series.dot
Compute the dot product between the Series and the columns of other.
Series.apply(func[, args])
Series.apply
Invoke function on values of Series.
Series.agg(func)
Series.agg
Aggregate using one or more operations over the specified axis.
Series.aggregate(func)
Series.aggregate
Series.transform(func[, axis])
Series.transform
Call func producing the same type as self with transformed values and that has the same axis length as input.
Series.map(arg)
Series.map
Map values of Series according to input correspondence.
Series.groupby(by[, axis, as_index, dropna])
Series.groupby
Group DataFrame or Series using a Series of columns.
Series.rolling(window[, min_periods])
Series.rolling
Provide rolling transformations.
Series.expanding([min_periods])
Series.expanding
Provide expanding transformations.
Series.pipe(func, *args, **kwargs)
Series.pipe
Apply func(self, *args, **kwargs).
Series.abs()
Series.abs
Return a Series/DataFrame with absolute numeric value of each element.
Series.all([axis])
Series.all
Return whether all elements are True.
Series.any([axis])
Series.any
Return whether any element is True.
Series.between(left, right[, inclusive])
Series.between
Return boolean Series equivalent to left <= series <= right.
Series.clip([lower, upper])
Series.clip
Trim values at input threshold(s).
Series.corr(other[, method])
Series.corr
Compute correlation with other Series, excluding missing values.
Series.count([axis, numeric_only])
Series.count
Count non-NA cells for each column.
Series.cummax([skipna])
Series.cummax
Return cumulative maximum over a DataFrame or Series axis.
Series.cummin([skipna])
Series.cummin
Return cumulative minimum over a DataFrame or Series axis.
Series.cumsum([skipna])
Series.cumsum
Return cumulative sum over a DataFrame or Series axis.
Series.cumprod([skipna])
Series.cumprod
Return cumulative product over a DataFrame or Series axis.
Series.describe([percentiles])
Series.describe
Generate descriptive statistics that summarize the central tendency, dispersion and shape of a dataset’s distribution, excluding NaN values.
Series.filter([items, like, regex, axis])
Series.filter
Subset rows or columns of dataframe according to labels in the specified index.
Series.kurt([axis, numeric_only])
Series.kurt
Return unbiased kurtosis using Fisher’s definition of kurtosis (kurtosis of normal == 0.0).
Series.mad()
Series.mad
Return the mean absolute deviation of values.
Series.max([axis, numeric_only])
Series.max
Return the maximum of the values.
Series.mean([axis, numeric_only])
Series.mean
Return the mean of the values.
Series.min([axis, numeric_only])
Series.min
Return the minimum of the values.
Series.mode([dropna])
Series.mode
Return the mode(s) of the dataset.
Series.nlargest([n])
Series.nlargest
Return the largest n elements.
Series.nsmallest([n])
Series.nsmallest
Return the smallest n elements.
Series.pct_change([periods])
Series.pct_change
Percentage change between the current and a prior element.
Series.prod([axis, numeric_only, min_count])
Series.prod
Series.nunique([dropna, approx, rsd])
Series.nunique
Return number of unique elements in the object.
Series.is_unique
Return boolean if values in the object are unique.
Series.quantile([q, accuracy])
Series.quantile
Return value at the given quantile.
Series.rank([method, ascending])
Series.rank
Compute numerical data ranks (1 through n) along axis.
Series.sem([axis, ddof, numeric_only])
Series.sem
Return unbiased standard error of the mean over requested axis.
Series.skew([axis, numeric_only])
Series.skew
Return unbiased skew normalized by N-1.
Series.std([axis, ddof, numeric_only])
Series.std
Return sample standard deviation.
Series.sum([axis, numeric_only, min_count])
Series.sum
Return the sum of the values.
Series.median([axis, numeric_only, accuracy])
Series.median
Return the median of the values for the requested axis.
Series.var([axis, ddof, numeric_only])
Series.var
Return unbiased variance.
Series.kurtosis([axis, numeric_only])
Series.kurtosis
Series.unique()
Series.unique
Return unique values of Series object.
Series.value_counts([normalize, sort, …])
Series.value_counts
Return a Series containing counts of unique values.
Series.round([decimals])
Series.round
Round each value in a Series to the given number of decimals.
Series.diff([periods])
Series.diff
First discrete difference of element.
Series.is_monotonic
Return boolean if values in the object are monotonically increasing.
Series.is_monotonic_increasing
Series.is_monotonic_decreasing
Return boolean if values in the object are monotonically decreasing.
Series.align(other[, join, axis, copy])
Series.align
Align two objects on their axes with the specified join method.
Series.drop([labels, index, level])
Series.drop
Return Series with specified index labels removed.
Series.droplevel(level)
Series.droplevel
Return Series with requested index level(s) removed.
Series.drop_duplicates([keep, inplace])
Series.drop_duplicates
Return Series with duplicate values removed.
Series.equals(other)
Series.equals
Series.add_prefix(prefix)
Series.add_prefix
Prefix labels with string prefix.
Series.add_suffix(suffix)
Series.add_suffix
Suffix labels with string suffix.
Series.first(offset)
Series.first
Select first periods of time series data based on a date offset.
Series.head([n])
Series.head
Return the first n rows.
Series.idxmax([skipna])
Series.idxmax
Return the row label of the maximum value.
Series.idxmin([skipna])
Series.idxmin
Return the row label of the minimum value.
Series.isin(values)
Series.isin
Check whether values are contained in Series or Index.
Series.last(offset)
Series.last
Select final periods of time series data based on a date offset.
Series.rename([index])
Series.rename
Alter Series name.
Series.rename_axis([mapper, index, inplace])
Series.rename_axis
Set the name of the axis for the index or columns.
Series.reindex([index, fill_value])
Series.reindex
Conform Series to new index with optional filling logic, placing NA/NaN in locations having no value in the previous index.
Series.reindex_like(other)
Series.reindex_like
Return a Series with matching indices as other object.
Series.reset_index([level, drop, name, inplace])
Series.reset_index
Generate a new DataFrame or Series with the index reset.
Series.sample([n, frac, replace, random_state])
Series.sample
Return a random sample of items from an axis of object.
Series.swaplevel([i, j, copy])
Series.swaplevel
Swap levels i and j in a MultiIndex.
Series.swapaxes(i, j[, copy])
Series.swapaxes
Interchange axes and swap values axes appropriately.
Series.take(indices)
Series.take
Return the elements in the given positional indices along an axis.
Series.tail([n])
Series.tail
Return the last n rows.
Series.where(cond[, other])
Series.where
Replace values where the condition is False.
Series.mask(cond[, other])
Series.mask
Replace values where the condition is True.
Series.truncate([before, after, axis, copy])
Series.truncate
Truncate a Series or DataFrame before and after some index value.
Series.backfill([axis, inplace, limit])
Series.backfill
Synonym for DataFrame.fillna() or Series.fillna() with method=`bfill`.
Series.bfill([axis, inplace, limit])
Series.bfill
Series.isna()
Series.isna
Detect missing values.
Series.isnull()
Series.isnull
Series.notna()
Series.notna
Series.notnull()
Series.notnull
Series.pad([axis, inplace, limit])
Series.pad
Synonym for DataFrame.fillna() or Series.fillna() with method=`ffill`.
Series.dropna([axis, inplace])
Series.dropna
Return a new Series with missing values removed.
Series.fillna([value, method, axis, …])
Series.fillna
Fill NA/NaN values.
Series.argsort()
Series.argsort
Return the integer indices that would sort the Series values.
Series.argmin()
Series.argmin
Return int position of the smallest value in the Series.
Series.argmax()
Series.argmax
Return int position of the largest value in the Series.
Series.sort_index([axis, level, ascending, …])
Series.sort_index
Sort object by labels (along an axis).
Series.sort_values([ascending, inplace, …])
Series.sort_values
Sort by the values.
Series.unstack([level])
Series.unstack
Unstack, also known as pivot, a Series with MultiIndex to produce a DataFrame.
Series.explode()
Series.explode
Transform each element of a list-like to a row.
Series.repeat(repeats)
Series.repeat
Repeat elements of a Series.
Series.squeeze([axis])
Series.squeeze
Squeeze 1 dimensional axis objects into scalars.
Series.factorize([sort, na_sentinel])
Series.factorize
Encode the object as an enumerated type or categorical variable.
Series.append(to_append[, ignore_index, …])
Series.append
Concatenate two or more Series.
Series.compare(other[, keep_shape, keep_equal])
Series.compare
Compare to another Series and show the differences.
Series.replace([to_replace, value, regex])
Series.replace
Replace values given in to_replace with value.
Series.update(other)
Series.update
Modify Series in place using non-NA values from passed Series.
Series.asof(where)
Series.asof
Return the last row(s) without any NaNs before where.
Series.shift([periods, fill_value])
Series.shift
Shift Series/Index by desired number of periods.
Series.first_valid_index()
Series.first_valid_index
Retrieves the index of the first valid value.
Series.last_valid_index()
Series.last_valid_index
Return index for last non-NA/null value.
Series.at_time(time[, asof, axis])
Series.at_time
Select values at particular time of day (example: 9:30AM).
Series.between_time(start_time, end_time[, …])
Series.between_time
Select values between particular times of the day (example: 9:00-9:30 AM).
Series.spark provides features that do not exist in pandas but are available in Spark. These can be accessed by Series.spark.<function/property>.
Series.spark.column
Spark Column object representing the Series/Index.
Series.spark.transform(func)
Series.spark.transform
Applies a function that takes and returns a Spark column.
Series.spark.apply(func)
Series.spark.apply
Pandas API on Spark provides dtype-specific methods under various accessors. These are separate namespaces within Series that only apply to specific data types.
Data Type    Accessor
Datetime     dt
String       str
Categorical  cat
Series.dt can be used to access the values of the series as datetimelike and return several properties. These can be accessed like Series.dt.<property>.
Series.dt.date
Returns a Series of python datetime.date objects (namely, the date part of Timestamps without timezone information).
Series.dt.year
The year of the datetime.
Series.dt.month
The month of the timestamp, where January = 1 and December = 12.
Series.dt.day
The days of the datetime.
Series.dt.hour
The hours of the datetime.
Series.dt.minute
The minutes of the datetime.
Series.dt.second
The seconds of the datetime.
Series.dt.microsecond
The microseconds of the datetime.
Series.dt.week
The week ordinal of the year.
Series.dt.weekofyear
Series.dt.dayofweek
The day of the week with Monday=0, Sunday=6.
Series.dt.weekday
Series.dt.dayofyear
The ordinal day of the year.
Series.dt.quarter
The quarter of the date.
Series.dt.is_month_start
Indicates whether the date is the first day of the month.
Series.dt.is_month_end
Indicates whether the date is the last day of the month.
Series.dt.is_quarter_start
Indicator for whether the date is the first day of a quarter.
Series.dt.is_quarter_end
Indicator for whether the date is the last day of a quarter.
Series.dt.is_year_start
Indicate whether the date is the first day of a year.
Series.dt.is_year_end
Indicate whether the date is the last day of the year.
Series.dt.is_leap_year
Boolean indicator if the date belongs to a leap year.
Series.dt.daysinmonth
The number of days in the month.
Series.dt.days_in_month
Series.dt.normalize()
Series.dt.normalize
Convert times to midnight.
Series.dt.strftime(date_format)
Series.dt.strftime
Convert to a string Series using specified date_format.
Series.dt.round(freq, *args, **kwargs)
Series.dt.round
Perform round operation on the data to the specified freq.
Series.dt.floor(freq, *args, **kwargs)
Series.dt.floor
Perform floor operation on the data to the specified freq.
Series.dt.ceil(freq, *args, **kwargs)
Series.dt.ceil
Perform ceil operation on the data to the specified freq.
Series.dt.month_name([locale])
Series.dt.month_name
Return the month names of the series with specified locale.
Series.dt.day_name([locale])
Series.dt.day_name
Return the day names of the series with specified locale.
Series.str can be used to access the values of the series as strings and apply several methods to it. These can be accessed like Series.str.<function/property>.
Series.str.capitalize()
Series.str.capitalize
Convert strings in the Series to be capitalized.
Series.str.cat([others, sep, na_rep, join])
Series.str.cat
Not supported.
Series.str.center(width[, fillchar])
Series.str.center
Fill the left and right sides of strings in the Series/Index with an additional character.
Series.str.contains(pat[, case, flags, na, …])
Series.str.contains
Test if pattern or regex is contained within a string of a Series.
Series.str.count(pat[, flags])
Series.str.count
Count occurrences of pattern in each string of the Series.
Series.str.decode(encoding[, errors])
Series.str.decode
Series.str.encode(encoding[, errors])
Series.str.encode
Series.str.endswith(pattern[, na])
Series.str.endswith
Test if the end of each string element matches a pattern.
Series.str.extract(pat[, flags, expand])
Series.str.extract
Series.str.extractall(pat[, flags])
Series.str.extractall
Series.str.find(sub[, start, end])
Series.str.find
Return the lowest index in each string in the Series where the substring is fully contained between [start:end].
Series.str.findall(pat[, flags])
Series.str.findall
Find all occurrences of pattern or regular expression in the Series.
Series.str.get(i)
Series.str.get
Extract element from each string or string list/tuple in the Series at the specified position.
Series.str.get_dummies([sep])
Series.str.get_dummies
Series.str.index(sub[, start, end])
Series.str.index
Return the lowest index in each string where the substring is fully contained between [start:end].
Series.str.isalnum()
Series.str.isalnum
Check whether all characters in each string are alphanumeric.
Series.str.isalpha()
Series.str.isalpha
Check whether all characters in each string are alphabetic.
Series.str.isdigit()
Series.str.isdigit
Check whether all characters in each string are digits.
Series.str.isspace()
Series.str.isspace
Check whether all characters in each string are whitespace.
Series.str.islower()
Series.str.islower
Check whether all characters in each string are lowercase.
Series.str.isupper()
Series.str.isupper
Check whether all characters in each string are uppercase.
Series.str.istitle()
Series.str.istitle
Check whether all characters in each string are titlecase.
Series.str.isnumeric()
Series.str.isnumeric
Check whether all characters in each string are numeric.
Series.str.isdecimal()
Series.str.isdecimal
Check whether all characters in each string are decimals.
Series.str.join(sep)
Series.str.join
Join lists contained as elements in the Series with passed delimiter.
Series.str.len()
Series.str.len
Computes the length of each element in the Series.
Series.str.ljust(width[, fillchar])
Series.str.ljust
Fill the right side of strings in the Series with an additional character.
Series.str.lower()
Series.str.lower
Convert strings in the Series/Index to all lowercase.
Series.str.lstrip([to_strip])
Series.str.lstrip
Remove leading characters.
Series.str.match(pat[, case, flags, na])
Series.str.match
Determine if each string matches a regular expression.
Series.str.normalize(form)
Series.str.normalize
Return the Unicode normal form for the strings in the Series.
Series.str.pad(width[, side, fillchar])
Series.str.pad
Pad strings in the Series up to width.
Series.str.partition([sep, expand])
Series.str.partition
Series.str.repeat(repeats)
Series.str.repeat
Duplicate each string in the Series.
Series.str.replace(pat, repl[, n, case, …])
Series.str.replace
Replace occurrences of pattern/regex in the Series with some other string.
Series.str.rfind(sub[, start, end])
Series.str.rfind
Return the highest index in each string in the Series where the substring is fully contained between [start:end].
Series.str.rindex(sub[, start, end])
Series.str.rindex
Return the highest index in each string where the substring is fully contained between [start:end].
Series.str.rjust(width[, fillchar])
Series.str.rjust
Fill the left side of strings in the Series with an additional character.
Series.str.rpartition([sep, expand])
Series.str.rpartition
Series.str.rsplit([pat, n, expand])
Series.str.rsplit
Split strings around given separator/delimiter.
Series.str.rstrip([to_strip])
Series.str.rstrip
Remove trailing characters.
Series.str.slice([start, stop, step])
Series.str.slice
Slice substrings from each element in the Series.
Series.str.slice_replace([start, stop, repl])
Series.str.slice_replace
Series.str.split([pat, n, expand])
Series.str.split
Series.str.startswith(pattern[, na])
Series.str.startswith
Test if the start of each string element matches a pattern.
Series.str.strip([to_strip])
Series.str.strip
Remove leading and trailing characters.
Series.str.swapcase()
Series.str.swapcase
Convert strings in the Series/Index to be swapcased.
Series.str.title()
Series.str.title
Convert strings in the Series to be titlecase.
Series.str.translate(table)
Series.str.translate
Map all characters in the string through the given mapping table.
Series.str.upper()
Series.str.upper
Convert strings in the Series/Index to all uppercase.
Series.str.wrap(width, **kwargs)
Series.str.wrap
Wrap long strings in the Series to be formatted in paragraphs with length less than a given width.
Series.str.zfill(width)
Series.str.zfill
Pad strings in the Series by prepending ‘0’ characters.
Categorical-dtype specific methods and attributes are available under the Series.cat accessor.
Series.cat.categories
The categories of this categorical.
Series.cat.ordered
Whether the categories have an ordered relationship.
Series.cat.codes
Return Series of codes as well as the index.
Series.cat.rename_categories(new_categories)
Series.cat.rename_categories
Rename categories.
Series.cat.reorder_categories(new_categories)
Series.cat.reorder_categories
Reorder categories as specified in new_categories.
Series.cat.add_categories(new_categories[, …])
Series.cat.add_categories
Add new categories.
Series.cat.remove_categories(removals[, inplace])
Series.cat.remove_categories
Remove the specified categories.
Series.cat.remove_unused_categories([inplace])
Series.cat.remove_unused_categories
Remove categories which are not used.
Series.cat.set_categories(new_categories[, …])
Series.cat.set_categories
Set the categories to the specified new_categories.
Series.cat.as_ordered([inplace])
Series.cat.as_ordered
Set the Categorical to be ordered.
Series.cat.as_unordered([inplace])
Series.cat.as_unordered
Set the Categorical to be unordered.
Series.plot is both a callable method and a namespace attribute for specific plotting methods of the form Series.plot.<kind>.
alias of pyspark.pandas.plot.core.PandasOnSparkPlotAccessor
Series.plot.area([x, y])
Series.plot.area
Draw a stacked area plot.
Series.plot.bar([x, y])
Series.plot.bar
Vertical bar plot.
Series.plot.barh([x, y])
Series.plot.barh
Make a horizontal bar plot.
Series.plot.box(**kwds)
Series.plot.box
Make a box plot of the Series columns.
Series.plot.density([bw_method, ind])
Series.plot.density
Generate Kernel Density Estimate plot using Gaussian kernels.
Series.plot.hist([bins])
Series.plot.hist
Draw one histogram of the DataFrame’s columns.
Series.plot.line([x, y])
Series.plot.line
Plot DataFrame/Series as lines.
Series.plot.pie(**kwds)
Series.plot.pie
Generate a pie plot.
Series.plot.kde([bw_method, ind])
Series.plot.kde
Series.hist([bins])
Series.hist
Series.to_pandas()
Series.to_pandas
Return a pandas Series.
Series.to_numpy()
Series.to_numpy
A NumPy ndarray representing the values in this DataFrame or Series.
Series.to_list()
Series.to_list
Return a list of the values.
Series.to_string([buf, na_rep, …])
Series.to_string
Render a string representation of the Series.
Series.to_dict([into])
Series.to_dict
Convert Series to {label -> value} dict or dict-like object.
Series.to_clipboard([excel, sep])
Series.to_clipboard
Copy object to the system clipboard.
Series.to_latex([buf, columns, col_space, …])
Series.to_latex
Render an object to a LaTeX tabular environment table.
Series.to_markdown([buf, mode])
Series.to_markdown
Print Series or DataFrame in Markdown-friendly format.
Series.to_json([path, compression, …])
Series.to_json
Convert the object to a JSON string.
Series.to_csv([path, sep, na_rep, columns, …])
Series.to_csv
Write object to a comma-separated values (csv) file.
Series.to_excel(excel_writer[, sheet_name, …])
Series.to_excel
Write object to an Excel sheet.
Series.to_frame([name])
Series.to_frame
Convert Series to DataFrame.
Series.pandas_on_spark provides pandas-on-Spark specific features that exist only in pandas API on Spark. These can be accessed by Series.pandas_on_spark.<function/property>.
Series.pandas_on_spark.transform_batch(func, …)
Series.pandas_on_spark.transform_batch
Transform the data with the function that takes pandas Series and outputs pandas Series.