DataFrame.mask
Replace values where the condition is True.
Parameters
cond : boolean DataFrame
    Where cond is False, keep the original value. Where cond is True, replace it with the corresponding value from other.
other : scalar or DataFrame
    Replacement for entries where cond is True. If omitted, those entries become NaN.
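The element-wise rule can be sketched in plain Python for a single column. This is only an illustration of the semantics, not how pandas-on-Spark implements it; `mask_column` is a hypothetical helper, and it assumes `other` is either a scalar or a list already aligned with the values.

```python
import math


def mask_column(values, cond, other=math.nan):
    """Hypothetical helper: element-wise sketch of mask for one column.

    Keeps the value where cond is False and takes the replacement where
    cond is True. `other` is a scalar or a list aligned with `values`.
    """
    if isinstance(other, list):
        replacements = other
    else:
        replacements = [other] * len(values)
    return [rep if c else v for v, c, rep in zip(values, cond, replacements)]


a = [0, 1, 2, 3, 4]
cond = [v > 1 for v in a]  # [False, False, True, True, True]

masked_scalar = mask_column(a, cond, 10)
masked_aligned = mask_column(a, cond, [0, -1, -2, -3, -4])
masked_default = mask_column(a, cond)  # replaced entries become NaN

print(masked_scalar)   # [0, 1, 10, 10, 10]
print(masked_aligned)  # [0, 1, -2, -3, -4]
```

The two printed lists correspond to column A when `other` is the scalar 10 and when it is a frame of negated values, respectively.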
Examples
>>> import pyspark.pandas as ps
>>> from pyspark.pandas.config import set_option, reset_option
>>> set_option("compute.ops_on_diff_frames", True)
>>> df1 = ps.DataFrame({'A': [0, 1, 2, 3, 4], 'B':[100, 200, 300, 400, 500]})
>>> df2 = ps.DataFrame({'A': [0, -1, -2, -3, -4], 'B':[-100, -200, -300, -400, -500]})
>>> df1
   A    B
0  0  100
1  1  200
2  2  300
3  3  400
4  4  500
>>> df2
   A    B
0  0 -100
1 -1 -200
2 -2 -300
3 -3 -400
4 -4 -500

>>> df1.mask(df1 > 0).sort_index()
     A   B
0  0.0 NaN
1  NaN NaN
2  NaN NaN
3  NaN NaN
4  NaN NaN

>>> df1.mask(df1 > 1, 10).sort_index()
    A   B
0   0  10
1   1  10
2  10  10
3  10  10
4  10  10

>>> df1.mask(df1 > 1, df1 + 100).sort_index()
     A    B
0    0  200
1    1  300
2  102  400
3  103  500
4  104  600

>>> df1.mask(df1 > 1, df2).sort_index()
   A    B
0  0 -100
1  1 -200
2 -2 -300
3 -3 -400
4 -4 -500

>>> reset_option("compute.ops_on_diff_frames")