UDFRegistration.registerJavaFunction(name, javaClassName, returnType=None)
Register a Java user-defined function as a SQL function.
In addition to a name and the function itself, the return type can optionally be specified. When the return type is not specified, it is inferred via reflection.
New in version 2.3.0.
Changed in version 3.4.0: Supports Spark Connect.
Parameters
name : str
    name of the user-defined function
javaClassName : str
    fully qualified name of the Java class
returnType : pyspark.sql.types.DataType or str, optional
    the return type of the registered Java function. The value can be either a pyspark.sql.types.DataType object or a DDL-formatted type string.
Examples
>>> from pyspark.sql.types import IntegerType
>>> spark.udf.registerJavaFunction(
...     "javaStringLength", "test.org.apache.spark.sql.JavaStringLength", IntegerType())
...
>>> spark.sql("SELECT javaStringLength('test')").collect()
[Row(javaStringLength(test)=4)]

>>> spark.udf.registerJavaFunction(
...     "javaStringLength2", "test.org.apache.spark.sql.JavaStringLength")
...
>>> spark.sql("SELECT javaStringLength2('test')").collect()
[Row(javaStringLength2(test)=4)]

>>> spark.udf.registerJavaFunction(
...     "javaStringLength3", "test.org.apache.spark.sql.JavaStringLength", "integer")
...
>>> spark.sql("SELECT javaStringLength3('test')").collect()
[Row(javaStringLength3(test)=4)]
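The three call forms shown in the examples (a DataType object, no return type, and a DDL-formatted type string) can be wrapped in a small helper when several Java functions need to be registered at once. The following is a minimal sketch, not part of PySpark: `register_java_udfs` and `UdfSpec` are hypothetical names introduced here for illustration, and the only PySpark call used is `spark.udf.registerJavaFunction`.

```python
from typing import Any, Iterable, List, Optional, Tuple

# Hypothetical spec type: (sql_name, java_class_name, return_type).
# return_type may be None, a DDL-formatted type string, or a
# pyspark.sql.types.DataType object.
UdfSpec = Tuple[str, str, Optional[Any]]


def register_java_udfs(spark: Any, specs: Iterable[UdfSpec]) -> List[str]:
    """Register each Java UDF on the session and return the SQL names used.

    Hypothetical helper; `spark` is expected to be a SparkSession.
    """
    registered = []
    for sql_name, java_class_name, return_type in specs:
        if return_type is None:
            # No return type given: it is inferred via reflection.
            spark.udf.registerJavaFunction(sql_name, java_class_name)
        else:
            # Explicit return type: a DataType object or a DDL string.
            spark.udf.registerJavaFunction(sql_name, java_class_name, return_type)
        registered.append(sql_name)
    return registered
```

With an active session, a call such as `register_java_udfs(spark, [("javaStringLength", "test.org.apache.spark.sql.JavaStringLength", "integer")])` would register the function used in the examples, assuming that Java class is available on the classpath.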