[EDIT: Thanks to this post, the issue reported here has been resolved as of Spark 1.4.1 – see the comments below] While writing the previous post on Spark dataframes, I encountered unexpected behavior in the dataframe .filter method; but, on the one hand, I needed some more time to experiment and confirm it and, on the other hand, I knew …
Spark data frames from CSV files: handling headers & column types
If you come from the R (or Python/pandas) universe, like me, you probably take it for granted that working with CSV files is one of the most natural and straightforward things in a data analysis context. Indeed, if you have your data in a CSV file, practically the only thing you have to do from R is to fire …