Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I'm trying to use DateTimeFormatter from java.time.format in Spark, but it does not appear to be serializable. This is the relevant chunk of code:

val pattern = "<some pattern>".r
val dtFormatter = DateTimeFormatter.ofPattern("<some non-ISO pattern>")

val logs = sc.wholeTextFiles(path)

val entries = logs.flatMap(fileContent => {
    val file = fileContent._1
    val content = fileContent._2
    content.split("\r?\n").map(line => line match {
      case pattern(dt, ev, seq) => Some(LogEntry(LocalDateTime.parse(dt, dtFormatter), ev, seq.toInt))
      case _ => logger.error(s"Cannot parse $file: $line"); None
    })
  })

How can I avoid the java.io.NotSerializableException: java.time.format.DateTimeFormatter exception? Is there a better library to parse timestamps? I've read that Joda is also not serializable and has been incorporated in Java 8's time library.



1 Answer

You can avoid serializing the formatter in two ways:

  1. Assuming its value can be constant, place the formatter in an object (making it "static"). Each worker then initializes and accesses its own copy of the value, instead of the driver serializing it and sending it to the workers:

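    import java.time.format.DateTimeFormatter

    object MyUtils {
      // Initialized once per JVM on first access, so it is never serialized
      val dtFormatter = DateTimeFormatter.ofPattern("<some non-ISO pattern>")
    }

    import MyUtils._
    logs.flatMap(fileContent => {
      // can safely use dtFormatter here
    })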
    
  2. Instantiate it inside the anonymous function, once per record (here, once per file, since wholeTextFiles yields one record per file). This carries a performance penalty because the formatter is rebuilt for every record, so use this option only if the first can't be applied:

    logs.flatMap(fileContent => {
      val dtFormatter = DateTimeFormatter.ofPattern("<some non-ISO pattern>")
      // use formatter here
    })
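
To make the first option concrete, here is a minimal sketch that runs without Spark. The LogEntry fields, the timestamp pattern, and the line regex are all assumptions standing in for the placeholders in the question, and the names LogParsing and parseLine are hypothetical. The point it illustrates is that the formatter lives in an object, so code that calls parseLine only refers to the object by name and never captures the formatter itself.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.util.Try

// Hypothetical record type; the question does not show LogEntry's definition.
final case class LogEntry(timestamp: LocalDateTime, event: String, seq: Int)

// Holder object: its fields are initialized once per JVM on first access,
// so a closure that calls LogParsing.parseLine does not capture the formatter.
object LogParsing {
  // Assumed pattern and regex, standing in for the question's placeholders.
  val dtFormatter: DateTimeFormatter =
    DateTimeFormatter.ofPattern("dd/MM/uuuu HH:mm:ss")
  private val linePattern =
    """(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) (\w+) (\d+)""".r

  // Returns None when the line does not match or a field fails to parse.
  def parseLine(line: String): Option[LogEntry] = line match {
    case linePattern(dt, ev, seq) =>
      Try(LogEntry(LocalDateTime.parse(dt, dtFormatter), ev, seq.toInt)).toOption
    case _ => None
  }
}
```

With Spark, the body of the flatMap from the question would then reduce to something like content.split("\r?\n").flatMap(line => LogParsing.parseLine(line)), which also drops the unparsable lines instead of keeping them as None.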
    
