Blog - Page 3 of 6 - Nodalpoint

Using ROracle with Oracle Instant Client 12c

Christos - Iraklis Tsatsoulis | February 18, 2016 | Oracle R, R

The other day, while setting up the new Oracle R Enterprise (ORE) 1.5 client packages on a Linux server, we installed the Oracle DB Instant Client v. 12.1, as advised in the relevant documentation. The problem was that ORE failed to load, in fact due to an ROracle failure. Truth is, the file libclntsh.so.11.1 did not exist, but this was expected, simply due …

Querying Big Data SQL tables with Oracle R Enterprise

Christos - Iraklis Tsatsoulis | February 15, 2016 | Big Data, Oracle Big Data SQL, Oracle R | 1 Comment

I was wondering recently if I could use Oracle R Enterprise (ORE) to query Big Data SQL tables (i.e. Oracle Database external tables based on HDFS or Hive data), since I have never seen such a combination mentioned in the relevant Oracle documentation and white papers. I am happy to announce that the answer is an unconditional yes. In this …

Nonlinear regression using Spark – Part 1: Nonlinear models

Constantinos Voglis | February 10, 2016 | Spark | 2 Comments

Regression constitutes a very important topic in supervised learning. Its goal is to predict the value of one or more continuous target variables (responses) given the value of a $D$-dimensional vector $\mathbf{x}$ of input variables (predictors). More specifically, given a training data set comprising $N$ observations $\{\mathbf{x}_n\}$, where $n = 1, \dots, N$, together with corresponding target values $\{t_n\}$, the goal is to predict the …

Caution when installing Oracle R Distribution in Oracle Linux using Yum

Christos - Iraklis Tsatsoulis | February 1, 2016 | Oracle R

Last week we tried to install Oracle R Distribution (ORD) on Oracle Linux 7.1 using Yum, which is the installation method recommended by Oracle. After closely following the instructions provided in the documentation, instead of Oracle R Distribution 3.2.0, we found ourselves with the latest (3.2.3) version of GNU R installed. What had happened is that in our /etc/yum.repos.d, …

Limitations of Spark MLlib linear algebra module

Christos - Iraklis Tsatsoulis | December 18, 2015 | Spark | 1 Comment

A couple of days ago I stumbled upon some unexpected behavior of Spark MLlib (v. 1.5.2) while trying some ultra-simple operations on vectors in PySpark. What happens is that the unary operator - (minus) for vectors fails, giving errors for expressions like -x and -y+x, although x-y behaves as expected. The result of the last operation, …
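
The original snippet is not part of this excerpt. As a minimal, hypothetical sketch in plain Python (not PySpark, and not MLlib's actual source code), the toy class below shows one way such an asymmetry can arise: a vector type that defines binary addition and subtraction but no unary negation. The `Vec` class and the `try_negate` helper are assumptions made purely for illustration.

```python
class Vec:
    """Hypothetical toy vector; NOT the MLlib vector implementation."""

    def __init__(self, values):
        self.values = list(values)

    def __sub__(self, other):
        # Binary minus: element-wise difference.
        return Vec(a - b for a, b in zip(self.values, other.values))

    def __add__(self, other):
        # Binary plus: element-wise sum.
        return Vec(a + b for a, b in zip(self.values, other.values))

    # No __neg__ is defined, so unary minus is unsupported for this class.


def try_negate(v):
    """Return -v, or the TypeError message if unary minus is unsupported."""
    try:
        return -v
    except TypeError as exc:
        return str(exc)


x = Vec([0.0, 1.0, 2.0])
y = Vec([3.0, 4.0, 5.0])

difference = (x - y).values  # binary minus works
negation = try_negate(y)     # unary minus fails; the error message is returned
```

With a type like this, an expression such as `-y + x` fails at the `-y` step, while the equivalent `x - y` goes through.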

How NOT to perform feature selection!

Christos - Iraklis Tsatsoulis | December 14, 2015 | Data Science | 9 Comments

Cross-validation (CV) is nowadays widely used for model assessment in predictive analytics tasks; nevertheless, cases where it is incorrectly applied are not uncommon, especially when the predictive model building includes a feature selection stage. I was reminded of such a situation while reading this recent Revolution Analytics blog post, where CV is used to assess both the feature selection …
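
The excerpt is truncated, so the sketch below does not reproduce the post's own code. It is a minimal, hedged illustration in plain Python of the commonly recommended arrangement: feature selection is repeated inside every CV fold, using the training part only, so that the held-out samples never influence which features are chosen. The data are synthetic noise, and all function names (`select_top_k`, `centroid`, `predict`, `cv_accuracy`) as well as the nearest-centroid classifier are assumptions chosen for illustration.

```python
import random


def select_top_k(rows, labels, k):
    """Rank features by absolute difference of class means; keep the top k."""
    def score(j):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        return abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    return sorted(range(len(rows[0])), key=score, reverse=True)[:k]


def centroid(rows, cols):
    """Mean of the selected columns over the given rows."""
    return [sum(r[j] for r in rows) / len(rows) for j in cols]


def predict(row, cols, c0, c1):
    """Nearest-centroid prediction using only the selected columns."""
    d0 = sum((row[j] - m) ** 2 for j, m in zip(cols, c0))
    d1 = sum((row[j] - m) ** 2 for j, m in zip(cols, c1))
    return 1 if d1 < d0 else 0


def cv_accuracy(rows, labels, k_features, n_folds):
    """K-fold CV in which feature selection sees the training fold only."""
    n = len(rows)
    correct = 0
    for fold in range(n_folds):
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        tr_rows = [rows[i] for i in train_idx]
        tr_labels = [labels[i] for i in train_idx]
        # Selection happens here, inside the fold, never on the full data set.
        cols = select_top_k(tr_rows, tr_labels, k_features)
        c0 = centroid([r for r, y in zip(tr_rows, tr_labels) if y == 0], cols)
        c1 = centroid([r for r, y in zip(tr_rows, tr_labels) if y == 1], cols)
        correct += sum(predict(rows[i], cols, c0, c1) == labels[i]
                       for i in test_idx)
    return correct / n


# Synthetic data: 40 samples, 50 pure-noise features, labels unrelated to them.
random.seed(0)
rows = [[random.gauss(0, 1) for _ in range(50)] for _ in range(40)]
labels = [(i // 5) % 2 for i in range(40)]

accuracy = cv_accuracy(rows, labels, k_features=5, n_folds=5)
```

Because the features here carry no information about the labels, an honest estimate should hover around chance level; selecting features on the full data set before splitting would tend to report an optimistically biased figure instead.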

Oracle R Enterprise 1.4: ore.make.names does not work for Oracle DB connections

Christos - Iraklis Tsatsoulis | November 19, 2015 | Oracle R

I have reported in the past on some unexpected behavior of the ore.make.names function in Oracle R Enterprise 1.4; nevertheless, back then I had only tried it with Hive connections. I tried to use it today with an Oracle database connection, and it doesn’t seem to work. Here is a reproducible example in Oracle Big Data Lite VM 4.2.1, using the …

Manipulating Hive tables with Oracle R connectors for Hadoop

Christos - Iraklis Tsatsoulis | November 13, 2015 | Hadoop, Hive, Oracle R | 2 Comments

In this post, we’ll have a look at how easy it is to manipulate Hive tables using Oracle R connectors for Hadoop (ORCH, presently known as Oracle R Advanced Analytics for Hadoop – ORAAH). We will use the weblog data from Athens Datathon 2015, which we have already loaded in a Hive table named weblogs, as described in more detail …

Using Ansible to install WebLogic 12c R2 and Fusion Middleware

Chris Vezalis | November 9, 2015 | Ansible, DEVOPS, Fusion Middleware, Linux, Oracle ADF, Oracle Linux, Vagrant, WebLogic

A couple of days ago Oracle released WebLogic 12c R2 (12.2.1). There are a lot of cool features, like Java EE 7 support and multitenancy support for WebLogic domains. Installing WebLogic Server along with the ADF runtime (Fusion Middleware Infrastructure) is not hard, but it requires a lot of parameters to be configured and a significant amount of time when you need to …

Augmenting PCA functionality in Spark 1.5

Christos - Iraklis Tsatsoulis | November 3, 2015 | Dimensionality Reduction, Spark | 7 Comments

Surprisingly enough, although the relatively new Spark ML library (not to be confused with Spark MLlib) includes a method for principal components analysis (PCA), there is no way to extract some very useful information regarding the PCA transformation, namely the resulting eigenvalues (check the Python API documentation); and, without the eigenvalues, one cannot compute the proportion of variance explained (PVE), …
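
To make the role of the eigenvalues concrete, here is a minimal sketch in plain Python (not Spark code) of how the PVE follows from them: each eigenvalue is divided by the sum of all eigenvalues. The function names and the eigenvalues used are hypothetical, chosen only for illustration.

```python
def proportion_of_variance_explained(eigenvalues):
    """PVE of each principal component: lambda_i / sum of all lambdas."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    return [ev / total for ev in eigenvalues]


def cumulative(values):
    """Running sum, e.g. cumulative PVE of the first k components."""
    out, running = [], 0.0
    for v in values:
        running += v
        out.append(running)
    return out


# Hypothetical eigenvalues of a covariance matrix, sorted in descending order.
eigenvalues = [4.0, 3.0, 2.0, 1.0]

pve = proportion_of_variance_explained(eigenvalues)
cumulative_pve = cumulative(pve)
```

The cumulative PVE is what one would typically inspect to decide how many components to keep, which is why the eigenvalues matter in practice.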
