Input/Output

DataStreamReader.csv(path[, schema, sep, …])

Loads a CSV file stream and returns the result as a DataFrame.

DataStreamReader.format(source)

Specifies the input data source format.

DataStreamReader.json(path[, schema, …])

Loads a JSON file stream and returns the result as a DataFrame.

DataStreamReader.load([path, format, schema])

Loads a data stream from a data source and returns it as a DataFrame.

DataStreamReader.option(key, value)

Adds an input option for the underlying data source.

DataStreamReader.options(**options)

Adds input options for the underlying data source.
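
To illustrate how format, option, options, and load chain together, here is a minimal sketch that builds a streaming DataFrame from Spark's built-in "rate" test source. The helper name read_rate_stream and the option values are assumptions made for this example; spark is assumed to be an existing SparkSession.

```python
def read_rate_stream(spark, rows_per_second=5, num_partitions=2):
    """Return a streaming DataFrame backed by the built-in "rate" source.

    `spark` is assumed to be an existing SparkSession (hypothetical helper).
    """
    return (
        spark.readStream
        .format("rate")                            # choose the input source
        .option("rowsPerSecond", rows_per_second)  # set a single option
        .options(numPartitions=num_partitions)     # set several options as keyword arguments
        .load()                                    # the rate source needs no path
    )
```

Calling read_rate_stream(spark) only defines the stream; no data flows until a DataStreamWriter starts a query on the returned DataFrame.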

DataStreamReader.orc(path[, mergeSchema, …])

Loads an ORC file stream, returning the result as a DataFrame.

DataStreamReader.parquet(path[, …])

Loads a Parquet file stream, returning the result as a DataFrame.

DataStreamReader.schema(schema)

Specifies the input schema.
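
File-based streaming sources generally need a schema up front unless schema inference has been enabled, so schema is usually called before csv, json, and similar methods. The sketch below passes a DDL-formatted string; the helper name read_csv_stream and the column names are made up for illustration, and spark is assumed to be an existing SparkSession.

```python
def read_csv_stream(spark, path):
    """Return a streaming DataFrame over CSV files that arrive under `path`."""
    # DDL-formatted schema string; these columns are hypothetical.
    schema = "id INT, name STRING, ts TIMESTAMP"
    return (
        spark.readStream
        .schema(schema)            # declare the input schema explicitly
        .option("header", "true")  # treat the first line of each file as a header
        .csv(path)
    )
```

A StructType object can be passed to schema in place of the DDL string.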

DataStreamReader.table(tableName)

Defines a streaming DataFrame on a table.

DataStreamReader.text(path[, wholetext, …])

Loads a text file stream and returns a DataFrame whose schema starts with a string column named “value”, followed by partition columns if there are any.

DataStreamWriter.foreach(f)

Sets the output of the streaming query to be processed using the provided writer f.
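
The writer f can be either a function that takes a row, or an object with a process(row) method and optional open(partition_id, epoch_id) and close(error) methods. The following is a minimal sketch of the object form; the class RowPrinter and the helper start_row_printer are hypothetical names, and streaming_df is assumed to be a streaming DataFrame.

```python
class RowPrinter:
    """Writer object for DataStreamWriter.foreach that prints every row it receives."""

    def open(self, partition_id, epoch_id):
        # Called once per partition and epoch; return True to process its rows.
        self.prefix = f"[partition {partition_id}, epoch {epoch_id}]"
        return True

    def process(self, row):
        # Called for each row of the partition.
        print(self.prefix, row)

    def close(self, error):
        # Called when the partition is done; `error` is None on success.
        if error is not None:
            print(self.prefix, "failed:", error)


def start_row_printer(streaming_df):
    """Start a query that hands every output row to a RowPrinter."""
    return streaming_df.writeStream.foreach(RowPrinter()).start()
```

Because the writer runs on the executors, anything it prints appears in executor logs rather than on the driver.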

DataStreamWriter.foreachBatch(func)

Sets the output of the streaming query to be processed using the provided function.
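
The provided function receives each micro-batch as a regular (non-streaming) DataFrame together with a batch identifier. A minimal sketch, assuming streaming_df is a streaming DataFrame; describe_batch, log_batch, and start_batch_logger are hypothetical names introduced for this example.

```python
def describe_batch(batch_df, batch_id):
    """Build a one-line summary of a micro-batch."""
    # batch_df is an ordinary DataFrame, so batch operations such as count() work.
    return f"batch {batch_id}: {batch_df.count()} rows"


def log_batch(batch_df, batch_id):
    """Function with the (DataFrame, batch id) signature that foreachBatch expects."""
    print(describe_batch(batch_df, batch_id))


def start_batch_logger(streaming_df):
    """Start a query that calls log_batch once per micro-batch."""
    return streaming_df.writeStream.foreachBatch(log_batch).start()
```

Inside the function, any batch writer can be applied to batch_df, which is a common way to reach sinks that have no streaming support.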

DataStreamWriter.format(source)

Specifies the underlying output data source.

DataStreamWriter.option(key, value)

Adds an output option for the underlying data source.

DataStreamWriter.options(**options)

Adds output options for the underlying data source.

DataStreamWriter.outputMode(outputMode)

Specifies how data of a streaming DataFrame/Dataset is written to a streaming sink.
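
The accepted modes are "append" (only new rows are written), "complete" (the whole result is rewritten on every trigger), and "update" (only rows that changed are written). The sketch below uses "complete" with a streaming aggregation and the console sink; the helper name start_counts_to_console is an assumption, and streaming_df is assumed to be a streaming DataFrame that contains the given column.

```python
def start_counts_to_console(streaming_df, key_column):
    """Count rows per key and print the full result table on every trigger."""
    counts = streaming_df.groupBy(key_column).count()
    return (
        counts.writeStream
        .outputMode("complete")  # emit the entire aggregated result each time
        .format("console")       # print results to standard output
        .start()
    )
```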

DataStreamWriter.partitionBy(*cols)

Partitions the output by the given columns on the file system.

DataStreamWriter.queryName(queryName)

Specifies the name of the StreamingQuery that can be started with start().

DataStreamWriter.start([path, format, …])

Streams the contents of the DataFrame to a data source.
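
As a sketch of how format, option, partitionBy, and start fit together for a file sink, the helper below writes a stream as partitioned Parquet files. The function name and its parameters are hypothetical; streaming_df is assumed to be a streaming DataFrame that contains partition_column.

```python
def start_parquet_sink(streaming_df, output_path, checkpoint_path, partition_column):
    """Start a query that writes the stream as Parquet files under `output_path`."""
    return (
        streaming_df.writeStream
        .format("parquet")
        .option("checkpointLocation", checkpoint_path)  # where query progress is tracked
        .partitionBy(partition_column)                  # one subdirectory per column value
        .start(output_path)                             # returns a StreamingQuery
    )
```

The returned StreamingQuery keeps running in the background; call its stop() method to end it, or awaitTermination() to block until it finishes.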

DataStreamWriter.toTable(tableName[, …])

Starts the execution of the streaming query, which will continually output results to the given table as new data arrives.

DataStreamWriter.trigger(*[, …])

Sets the trigger for the streaming query.
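
The trigger is passed as a single keyword argument, for example processingTime="10 seconds" or once=True; only one trigger can be set per query. A minimal sketch, assuming streaming_df is a streaming DataFrame; the helper name start_console_query and the query name "console_demo" are made up for this example.

```python
def start_console_query(streaming_df, interval="10 seconds"):
    """Start a named console query that runs a micro-batch every `interval`."""
    return (
        streaming_df.writeStream
        .format("console")
        .queryName("console_demo")          # hypothetical name for the StreamingQuery
        .trigger(processingTime=interval)   # fixed-interval micro-batches
        .start()
    )
```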