Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
  1. What is the difference between SparkContext, JavaSparkContext, SQLContext and SparkSession?
  2. Is there any method to convert or create a Context using a SparkSession?
  3. Can I completely replace all the Contexts using one single entry SparkSession?
  4. Are all the functions in SQLContext, SparkContext, and JavaSparkContext also in SparkSession?
  5. Some functions like parallelize have different behaviors in SparkContext and JavaSparkContext. How do they behave in SparkSession?
  6. How can I create the following using a SparkSession?

    • RDD
    • JavaRDD
    • JavaPairRDD
    • Dataset

Is there a method to transform a JavaPairRDD into a Dataset or a Dataset into a JavaPairRDD?



1 Answer

SparkContext is the Scala implementation's entry point, and JavaSparkContext is a Java wrapper around a SparkContext.

SQLContext is the entry point of Spark SQL and can be obtained from a SparkContext. Prior to Spark 2.x, RDD, DataFrame, and Dataset were three separate data abstractions. Since Spark 2.x, all three are unified, and SparkSession is the single unified entry point of Spark.
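As a minimal sketch of that unified entry point (the app name and the `local[*]` master URL are illustrative, not required values):

```java
import org.apache.spark.sql.SparkSession;

public class SparkSessionExample {
    public static void main(String[] args) {
        // Build (or reuse) the unified Spark 2.x entry point.
        SparkSession spark = SparkSession.builder()
                .appName("unified-entry-point")
                .master("local[*]")   // local run for illustration
                .getOrCreate();

        // The session exposes its Spark version as a simple sanity check.
        System.out.println(spark.version());
        spark.stop();
    }
}
```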

An additional note: RDDs are meant for unstructured, strongly typed data, while DataFrames are for structured, loosely typed data.

Is there any method to convert or create a Context using a SparkSession?

Yes. Use sparkSession.sparkContext() and, for SQL, sparkSession.sqlContext().
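In Java, a sketch of pulling the legacy contexts out of a session might look like this (the builder settings are illustrative):

```java
import org.apache.spark.SparkContext;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.SparkSession;

public class ContextsFromSession {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("contexts-from-session")
                .master("local[*]")
                .getOrCreate();

        // The underlying Scala SparkContext.
        SparkContext sc = spark.sparkContext();

        // A Java-friendly wrapper around that same SparkContext.
        JavaSparkContext jsc = JavaSparkContext.fromSparkContext(sc);

        // The legacy SQLContext, kept around for backward compatibility.
        SQLContext sqlContext = spark.sqlContext();

        System.out.println(sc.appName());
        spark.stop();
    }
}
```

Note that all three objects share the same underlying context; stopping the session stops them all.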

Can I completely replace all the Contexts using the single entry point SparkSession?

Yes. You can get the respective contexts from the SparkSession.

Are all the functions in SQLContext, SparkContext, JavaSparkContext, etc. also added to SparkSession?

Not directly. You have to get the respective context and make use of it; think of it as backward compatibility.

How can such functions be used with a SparkSession?

Get the respective context and make use of it.

How to create the following using SparkSession?

  1. RDD: can be created from sparkSession.sparkContext.parallelize(???)
  2. JavaRDD: the same applies, but through the Java implementation (JavaSparkContext)
  3. JavaPairRDD: sparkSession.sparkContext.parallelize(???).map(...) (mapping your data to key-value pairs is one way)
  4. Dataset: what SparkSession returns is a Dataset if it is structured data.
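The four creations above can be sketched in Java as follows (the sample data and column name `word` are illustrative):

```java
import java.util.Arrays;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import scala.Tuple2;

public class CreateAbstractions {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("create-abstractions")
                .master("local[*]")
                .getOrCreate();

        // Wrap the session's SparkContext for the Java API.
        JavaSparkContext jsc =
                JavaSparkContext.fromSparkContext(spark.sparkContext());

        // JavaRDD via the Java wrapper's parallelize.
        JavaRDD<String> words = jsc.parallelize(Arrays.asList("a", "b", "b"));

        // JavaPairRDD by mapping each element to a key-value pair.
        JavaPairRDD<String, Integer> pairs =
                words.mapToPair(w -> new Tuple2<>(w, 1));

        // Dataset<Row>: SparkSession hands back Datasets for structured data.
        Dataset<Row> df = spark
                .createDataset(Arrays.asList("a", "b", "b"), Encoders.STRING())
                .toDF("word");

        System.out.println(pairs.count()); // 3
        System.out.println(df.count());    // 3
        spark.stop();
    }
}
```

For the last question in the thread: a JavaPairRDD can reach a Dataset by first mapping its tuples into bean or Row objects and then calling spark.createDataFrame, and a Dataset can go back via its .javaRDD() view.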
