Packages

  • package root
    Definition Classes
    root
  • package org
    Definition Classes
    root
  • package apache
    Definition Classes
    org
  • package spark

    Core Spark functionality. org.apache.spark.SparkContext serves as the main entry point to Spark, while org.apache.spark.rdd.RDD is the data type representing a distributed collection, and provides most parallel operations.

    In addition, org.apache.spark.rdd.PairRDDFunctions contains operations available only on RDDs of key-value pairs, such as groupByKey and join; org.apache.spark.rdd.DoubleRDDFunctions contains operations available only on RDDs of Doubles; and org.apache.spark.rdd.SequenceFileRDDFunctions contains operations available on RDDs that can be saved as SequenceFiles. These operations are automatically available on any RDD of the right type (e.g. RDD[(Int, Int)]) through implicit conversions (see the sketch following this package list).

    Java programmers should reference the org.apache.spark.api.java package for Spark programming APIs in Java.

    Classes and methods marked with Experimental are user-facing features which have not been officially adopted by the Spark project. These are subject to change or removal in minor releases.

    Classes and methods marked with Developer API are intended for advanced users who want to extend Spark through lower-level interfaces. These are subject to change or removal in minor releases.

    Definition Classes
    apache
  • package sql

    Allows the execution of relational queries, including those expressed in SQL using Spark.

    Definition Classes
    spark
  • package hive

    Support for running Spark SQL queries using functionality from Apache Hive (does not require an existing Hive installation). Supported Hive features include:

    • Using HiveQL to express queries.
    • Reading metadata from the Hive Metastore using HiveSerDes.
    • Hive UDFs, UDAFs, and UDTFs

    Users who would like access to this functionality should create a HiveContext instead of a SQLContext (see the sketch following this package list).

    Definition Classes
    sql
  • package execution
    Definition Classes
    hive
  • CreateHiveTableAsSelectCommand
  • HiveFileFormat
  • HiveOptions
  • HiveOutputWriter
  • HiveScriptIOSchema
  • InsertIntoHiveDirCommand
  • InsertIntoHiveTable
  • ScriptTransformationExec
  • package orc
    Definition Classes
    hive
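
As noted in the org.apache.spark package description above, key-value operations such as groupByKey become available on pair RDDs through implicit conversions. A minimal sketch, assuming a local master (the app name and data are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    object PairRddSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("pair-rdd-sketch").setMaster("local[*]"))
        // An RDD[(Int, Int)]: the implicit conversion to PairRDDFunctions
        // makes key-value operations like groupByKey and join available.
        val pairs = sc.parallelize(Seq((1, 2), (1, 3), (2, 4)))
        pairs.groupByKey().collect().foreach(println)
        sc.stop()
      }
    }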
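
To use the Hive support described above, create a HiveContext. A minimal sketch (the table and query are illustrative; note that in Spark 2.x, SparkSession.builder().enableHiveSupport() is the usual entry point instead):

    import org.apache.spark.SparkContext
    import org.apache.spark.sql.hive.HiveContext

    object HiveContextSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext("local[*]", "hive-sketch")
        // HiveContext layers HiveQL parsing, Hive SerDes, and Hive UDFs
        // on top of the plain SQLContext.
        val hiveCtx = new HiveContext(sc)
        hiveCtx.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
        hiveCtx.sql("SELECT key, value FROM src LIMIT 10").show()
        sc.stop()
      }
    }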

package execution

Type Members

  1. case class CreateHiveTableAsSelectCommand(tableDesc: CatalogTable, query: LogicalPlan, outputColumnNames: Seq[String], mode: SaveMode) extends LogicalPlan with DataWritingCommand with Product with Serializable

    Create the table and insert the query result into it (see the example following this member list).

    tableDesc

    the table description, which may contain serde, storage handler, etc.

    query

    the query whose result will be inserted into the new relation

    mode

    the SaveMode applied when the table already exists

  2. class HiveFileFormat extends FileFormat with DataSourceRegister with Logging

    FileFormat for writing Hive tables.

    TODO: implement the read logic.

  3. class HiveOptions extends Serializable

    Options for the Hive data source. Note that the rule DetermineHiveSerde will extract Hive serde/format information from these options (see the sketch following this member list).

  4. class HiveOutputWriter extends OutputWriter with HiveInspectors
  5. case class HiveScriptIOSchema(inputRowFormat: Seq[(String, String)], outputRowFormat: Seq[(String, String)], inputSerdeClass: Option[String], outputSerdeClass: Option[String], inputSerdeProps: Seq[(String, String)], outputSerdeProps: Seq[(String, String)], recordReaderClass: Option[String], recordWriterClass: Option[String], schemaLess: Boolean) extends HiveInspectors with Product with Serializable

    A wrapper class for the Hive input and output schema properties.

  6. case class InsertIntoHiveDirCommand(isLocal: Boolean, storage: CatalogStorageFormat, query: LogicalPlan, overwrite: Boolean, outputColumnNames: Seq[String]) extends LogicalPlan with SaveAsHiveFile with Product with Serializable

    Command for writing the results of query to a file system (see the example following this member list).

    The syntax for using this command in SQL is:

    INSERT OVERWRITE [LOCAL] DIRECTORY
    path
    [ROW FORMAT row_format]
    [STORED AS file_format]
    SELECT ...

    isLocal

    whether the path specified in storage is a local directory

    storage

    the storage format used to describe how the query result is stored

    query

    the logical plan representing the data to write

    overwrite

    whether to overwrite the existing directory

  7. case class InsertIntoHiveTable(table: CatalogTable, partition: Map[String, Option[String]], query: LogicalPlan, overwrite: Boolean, ifPartitionNotExists: Boolean, outputColumnNames: Seq[String]) extends LogicalPlan with SaveAsHiveFile with Product with Serializable

    Command for writing data out to a Hive table.

    This class is mostly a mess, for legacy reasons (since it evolved in organic ways and had to follow Hive's internal implementations closely, which were themselves a mess too). Please don't blame Reynold for this! He was just moving code around!

    In the future we should converge the write path for Hive with the normal data source write path, as defined in org.apache.spark.sql.execution.datasources.FileFormatWriter.

    table

    the metadata of the table.

    partition

    a map from the partition key to the partition value (optional). If a partition value is None, dynamic partition insert will be performed. As an example, INSERT INTO tbl PARTITION (a=1, b=2) AS ... would have Map('a' -> Some('1'), 'b' -> Some('2')), and INSERT INTO tbl PARTITION (a=1, b) AS ... would have Map('a' -> Some('1'), 'b' -> None).

    query

    the logical plan representing the data to write.

    overwrite

    whether to overwrite the existing table or partitions.

    ifPartitionNotExists

    If true, only write if the partition does not exist. Only valid for static partitions.

  8. case class ScriptTransformationExec(input: Seq[Expression], script: String, output: Seq[Attribute], child: SparkPlan, ioschema: HiveScriptIOSchema) extends SparkPlan with UnaryExecNode with Product with Serializable

    Transforms the input by forking and running the specified script (see the example following this member list).

    input

    the set of expressions that should be passed to the script.

    script

    the command that should be executed.

    output

    the attributes that are produced by the script.
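
As an illustration of CreateHiveTableAsSelectCommand, a CREATE TABLE ... AS SELECT statement is what this command ultimately executes. A minimal sketch, assuming a Hive-enabled session (the table names and query are illustrative):

    import org.apache.spark.sql.SparkSession

    object CtasSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("ctas-sketch")
          .master("local[*]")
          .enableHiveSupport() // requires Hive classes on the classpath
          .getOrCreate()
        // A Hive CTAS statement: the table is created, then the query
        // result is inserted into it.
        spark.sql(
          """CREATE TABLE dst STORED AS PARQUET
            |AS SELECT key, value FROM src WHERE key > 10""".stripMargin)
        spark.stop()
      }
    }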
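
A sketch of where HiveOptions come from: when a Hive-format table is created through the data source API, options such as fileFormat are carried here and picked up by the DetermineHiveSerde rule (the table name is illustrative):

    // Assuming the Hive-enabled `spark` session from the sketch above:
    // the fileFormat option selects the table's Hive serde and storage format.
    spark.sql("CREATE TABLE opts_demo (id INT, name STRING) USING hive OPTIONS(fileFormat 'parquet')")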
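
A concrete instance of the INSERT OVERWRITE DIRECTORY syntax handled by InsertIntoHiveDirCommand (a sketch; the path and source table are illustrative):

    // Assuming the Hive-enabled `spark` session from the sketch above.
    // LOCAL makes isLocal = true; ROW FORMAT / STORED AS populate `storage`.
    spark.sql(
      """INSERT OVERWRITE LOCAL DIRECTORY '/tmp/query_out'
        |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        |STORED AS TEXTFILE
        |SELECT key, value FROM src""".stripMargin)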
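
ScriptTransformationExec backs HiveQL's TRANSFORM clause. A sketch, using /bin/cat as a stand-in for a real transformation script:

    // Assuming the Hive-enabled `spark` session from the sketch above.
    // Input rows (key, value) are piped to the forked script's stdin; its
    // stdout is parsed back into the declared output attributes (k, v).
    spark.sql(
      """SELECT TRANSFORM (key, value)
        |USING '/bin/cat'
        |AS (k STRING, v STRING)
        |FROM src""".stripMargin).show()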

Value Members

  1. object HiveOptions extends Serializable
  2. object HiveScriptIOSchema extends Serializable
