package streaming
Spark Streaming functionality. org.apache.spark.streaming.StreamingContext serves as the main entry point to Spark Streaming, while org.apache.spark.streaming.dstream.DStream is the data type representing a continuous sequence of RDDs, representing a continuous stream of data.
In addition, org.apache.spark.streaming.dstream.PairDStreamFunctions contains operations
available only on DStreams
of key-value pairs, such as groupByKey
and reduceByKey
. These operations are automatically
available on any DStream of the right type (e.g. DStream[(Int, Int)] through implicit
conversions.
For the Java API of Spark Streaming, take a look at the org.apache.spark.streaming.api.java.JavaStreamingContext which serves as the entry point, and the org.apache.spark.streaming.api.java.JavaDStream and the org.apache.spark.streaming.api.java.JavaPairDStream which have the DStream functionality.
- Alphabetic
- By Inheritance
- streaming
- AnyRef
- Any
- Hide All
- Show All
- Public
- All
Type Members
- case class Duration(millis: Long) extends Product with Serializable
-
sealed abstract
class
State[S] extends AnyRef
:: Experimental :: Abstract class for getting and updating the state in mapping function used in the
mapWithState
operation of a pair DStream (Scala) or a JavaPairDStream (Java).:: Experimental :: Abstract class for getting and updating the state in mapping function used in the
mapWithState
operation of a pair DStream (Scala) or a JavaPairDStream (Java).Scala example of using
State
:// A mapping function that maintains an integer state and returns a String def mappingFunction(key: String, value: Option[Int], state: State[Int]): Option[String] = { // Check if state exists if (state.exists) { val existingState = state.get // Get the existing state val shouldRemove = ... // Decide whether to remove the state if (shouldRemove) { state.remove() // Remove the state } else { val newState = ... state.update(newState) // Set the new state } } else { val initialState = ... state.update(initialState) // Set the initial state } ... // return something }
Java example of using
State
:// A mapping function that maintains an integer state and returns a String Function3<String, Optional<Integer>, State<Integer>, String> mappingFunction = new Function3<String, Optional<Integer>, State<Integer>, String>() { @Override public String call(String key, Optional<Integer> value, State<Integer> state) { if (state.exists()) { int existingState = state.get(); // Get the existing state boolean shouldRemove = ...; // Decide whether to remove the state if (shouldRemove) { state.remove(); // Remove the state } else { int newState = ...; state.update(newState); // Set the new state } } else { int initialState = ...; // Set the initial state state.update(initialState); } // return something } };
- S
Class of the state
- Annotations
- @Experimental()
-
sealed abstract
class
StateSpec[KeyType, ValueType, StateType, MappedType] extends Serializable
:: Experimental :: Abstract class representing all the specifications of the DStream transformation
mapWithState
operation of a pair DStream (Scala) or a JavaPairDStream (Java).:: Experimental :: Abstract class representing all the specifications of the DStream transformation
mapWithState
operation of a pair DStream (Scala) or a JavaPairDStream (Java). Useorg.apache.spark.streaming.StateSpec.function()
factory methods to create instances of this class.Example in Scala:
// A mapping function that maintains an integer state and return a String def mappingFunction(key: String, value: Option[Int], state: State[Int]): Option[String] = { // Use state.exists(), state.get(), state.update() and state.remove() // to manage state, and return the necessary string } val spec = StateSpec.function(mappingFunction).numPartitions(10) val mapWithStateDStream = keyValueDStream.mapWithState[StateType, MappedType](spec)
Example in Java:
// A mapping function that maintains an integer state and return a string Function3<String, Optional<Integer>, State<Integer>, String> mappingFunction = new Function3<String, Optional<Integer>, State<Integer>, String>() { @Override public Optional<String> call(Optional<Integer> value, State<Integer> state) { // Use state.exists(), state.get(), state.update() and state.remove() // to manage state, and return the necessary string } }; JavaMapWithStateDStream<String, Integer, Integer, String> mapWithStateDStream = keyValueDStream.mapWithState(StateSpec.function(mappingFunc));
- KeyType
Class of the state key
- ValueType
Class of the state value
- StateType
Class of the state data
- MappedType
Class of the mapped elements
- Annotations
- @Experimental()
-
class
StreamingContext extends Logging
Main entry point for Spark Streaming functionality.
Main entry point for Spark Streaming functionality. It provides methods used to create org.apache.spark.streaming.dstream.DStreams from various input sources. It can be either created by providing a Spark master URL and an appName, or from a org.apache.spark.SparkConf configuration (see core Spark documentation), or from an existing org.apache.spark.SparkContext. The associated SparkContext can be accessed using
context.sparkContext
. After creating and transforming DStreams, the streaming computation can be started and stopped usingcontext.start()
andcontext.stop()
, respectively.context.awaitTermination()
allows the current thread to wait for the termination of the context bystop()
or by an exception. - sealed abstract final class StreamingContextState extends Enum[StreamingContextState]
-
case class
Time(millis: Long) extends Product with Serializable
This is a simple class that represents an absolute instant of time.
This is a simple class that represents an absolute instant of time. Internally, it represents time as the difference, measured in milliseconds, between the current time and midnight, January 1, 1970 UTC. This is the same format as what is returned by System.currentTimeMillis.
Value Members
- object Durations
-
object
Milliseconds
Helper object that creates instance of org.apache.spark.streaming.Duration representing a given number of milliseconds.
-
object
Minutes
Helper object that creates instance of org.apache.spark.streaming.Duration representing a given number of minutes.
-
object
Seconds
Helper object that creates instance of org.apache.spark.streaming.Duration representing a given number of seconds.
-
object
StateSpec extends Serializable
:: Experimental :: Builder object for creating instances of
org.apache.spark.streaming.StateSpec
that is used for specifying the parameters of the DStream transformationmapWithState
that is used for specifying the parameters of the DStream transformationmapWithState
operation of a pair DStream (Scala) or a JavaPairDStream (Java).:: Experimental :: Builder object for creating instances of
org.apache.spark.streaming.StateSpec
that is used for specifying the parameters of the DStream transformationmapWithState
that is used for specifying the parameters of the DStream transformationmapWithState
operation of a pair DStream (Scala) or a JavaPairDStream (Java).Example in Scala:
// A mapping function that maintains an integer state and return a String def mappingFunction(key: String, value: Option[Int], state: State[Int]): Option[String] = { // Use state.exists(), state.get(), state.update() and state.remove() // to manage state, and return the necessary string } val spec = StateSpec.function(mappingFunction).numPartitions(10) val mapWithStateDStream = keyValueDStream.mapWithState[StateType, MappedType](spec)
Example in Java:
// A mapping function that maintains an integer state and return a string Function3<String, Optional<Integer>, State<Integer>, String> mappingFunction = new Function3<String, Optional<Integer>, State<Integer>, String>() { @Override public Optional<String> call(Optional<Integer> value, State<Integer> state) { // Use state.exists(), state.get(), state.update() and state.remove() // to manage state, and return the necessary string } }; JavaMapWithStateDStream<String, Integer, Integer, String> mapWithStateDStream = keyValueDStream.mapWithState(StateSpec.function(mappingFunc));
- Annotations
- @Experimental()
-
object
StreamingContext extends Logging
StreamingContext object contains a number of utility functions related to the StreamingContext class.
- object Time extends Serializable