Nonlinear regression using Spark – Part 2: sum-of-squares objective functions

Constantinos Voglis | Data Science, Spark | 4 Comments

This post is the second one in a series that discusses algorithmic and implementation issues in nonlinear regression using Spark. In the previous post we identified a small window for contribution to Spark MLlib by adding methods for nonlinear regression, starting with the definition and implementation of a general nonlinear model. We remind the reader that regression is essentially an …

Classification in Spark 2.0: “Input validation failed” and other wondrous tales

Christos - Iraklis Tsatsoulis | Data Science, Spark | 7 Comments

Spark 2.0 was released last July but, despite the numerous improvements and new features, several annoyances still remain and can cause headaches, especially in the Spark machine learning APIs. Today we’ll have a look at some of them, inspired by a recent answer of mine to a Stack Overflow question (the question was about Spark 1.6 but, as …

How to evaluate R models in Azure Machine Learning Studio

Constantinos Voglis | Azure Machine Learning Studio, Data Science, R | 6 Comments

Azure Machine Learning Studio is a GUI-based integrated development environment for constructing and operationalizing machine learning workflows. The basic computational unit of an Azure ML Studio workflow (or Experiment) is a module which implements machine learning algorithms, data conversion and transformation functions etc. Modules can be connected by data flows, thus implementing a machine learning pipeline. A typical pipeline in …

Installing the additional R packages in Oracle Big Data Lite VM 4.5.0

Christos - Iraklis Tsatsoulis | R | 2 Comments

Oracle has just released version 4.5.0 of the Big Data Lite VM which, when it comes to R, still suffers from the issues we had pinpointed for the previous version 4.4.0 (and then some). The first attempt to install the additional packages fails with a ‘cannot open URL’ error: Fortunately, the warning about the proxy helps to locate the issue, …

ViewCriteria issue when using more than once attribute with LOV based on switcher (Oracle JDeveloper 12.2.1.0)

Rigas Papazisis | Fusion Middleware, Oracle ADF | 1 Comment

Almost a year ago I wrote this post about a ViewCriteria issue when using the same attribute twice in Oracle JDeveloper 12.1.3.0. Now I have come across another ViewCriteria issue, this time in Oracle JDeveloper 12.2.1.0, again related to inserting an attribute multiple times but in a more complex scenario. Consider having an attribute with an applied LOV that …

Manifest entry Weblogic-Application-Version causes log messages not to be shown on EM 12.2.1

Rigas Papazisis | Fusion Middleware, Oracle ADF, WebLogic | 3 Comments

Recently we migrated an ADF app from 12.1.3 to 12.2.1 and we faced a problem with the log messages in the application server. Specifically, the deployed application returned zero log messages when navigating to em > application > Logs > View Log Messages. The log configuration and the search results are as you can see in the following screenshots: After many checks and dummy applications …

How to use SparkR in Cloudera Hadoop

Christos - Iraklis Tsatsoulis | Big Data, R, Spark | 20 Comments

Suppose you are an avid R user, and you would like to use SparkR in Cloudera Hadoop; unfortunately, as of the latest CDH version (5.7), SparkR is still not supported (and, according to a recent discussion in the Cloudera forums, we shouldn’t expect this to happen anytime soon). Is there anything you can do? Well, indeed there is. In this …

Bulk load data to HBase in Oracle Big Data Appliance

Christos - Iraklis Tsatsoulis | Big Data, HBase | 1 Comment

I ran into an issue recently, while trying to bulk load some data to HBase in Oracle Big Data Appliance. Following is a reproducible description and solution using the current version of Oracle Big Data Lite VM (4.4.0). Enabling HBase in Oracle Big Data Lite VM (Feel free to skip this section if you do not use Oracle Big Data …

Installing the additional R packages in Oracle Big Data Lite VM 4.4.0

Christos - Iraklis Tsatsoulis | R

In the just-released version 4.4.0 of Oracle Big Data Lite VM, as in the previous one (4.3.0.1), there is a rather large number of additional R packages to be installed by the provided script install_additional_packages.sh, i.e. 28 packages without counting their dependencies (the respective number in version 4.2.1 was only 10). Unfortunately, what has also changed is the form of …

Using ROracle with Oracle Instant Client 12c

Christos - Iraklis Tsatsoulis | Oracle R, R

The other day, while setting up the new Oracle R Enterprise (ORE) 1.5 client packages on a Linux server, we installed the Oracle DB Instant Client v. 12.1, as advised in the relevant documentation. Problem was, ORE failed to load, in fact due to ROracle failure: Truth is, the file libclntsh.so.11.1 did not exist, but this was expected, simply due …