GroupBy.nth
Take the nth row from each group.
New in version 3.4.0.
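Parameters
n : int
    The position of the row to take from each group (a single integer). Negative values count from the end of the group.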
See also
pyspark.pandas.Series.groupby
pyspark.pandas.DataFrame.groupby
Notes
There is a behavior difference between pandas-on-Spark and pandas:
the returned empty DataFrame may have an index with a different length (__len__).
Examples
>>> df = ps.DataFrame({'A': [1, 1, 2, 1, 2],
...                    'B': [np.nan, 2, 3, 4, 5]}, columns=['A', 'B'])
>>> g = df.groupby('A')
>>> g.nth(0)
     B
A
1  NaN
2  3.0
>>> g.nth(1)
     B
A
1  2.0
2  5.0
>>> g.nth(-1)
     B
A
1  4.0
2  5.0
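
The row selection in the examples can be sketched in plain Python to make the semantics explicit. This is only an illustration, not how pandas-on-Spark implements nth: the helper nth_per_group is hypothetical, and the choice to omit groups that have no row at position n is an assumption of this sketch.

```python
import math
from collections import defaultdict


def nth_per_group(keys, values, n):
    """Hypothetical helper: map each group key to the value of its nth row."""
    # Collect values per key, preserving the original row order in each group.
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)

    result = {}
    for key in sorted(groups):
        rows = groups[key]
        # Negative n counts from the end, as with ordinary Python indexing.
        # Assumption of this sketch: a group without an nth row is omitted.
        if -len(rows) <= n < len(rows):
            result[key] = rows[n]
    return result


# Same data as the example: column A is the group key, column B the values.
A = [1, 1, 2, 1, 2]
B = [math.nan, 2.0, 3.0, 4.0, 5.0]

first = nth_per_group(A, B, 0)    # NaN is kept, not skipped
second = nth_per_group(A, B, 1)
last = nth_per_group(A, B, -1)
third = nth_per_group(A, B, 2)    # group 2 has only two rows
```

Here first, second and last hold the same per-group values as g.nth(0), g.nth(1) and g.nth(-1) in the example. Note that the NaN in group 1 is returned as-is for n = 0, because rows are selected by position rather than by the first non-null value.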