This is the second part of our three-part blog post series (see the first part here), which deals with incremental data updates. In our scenario we assume that we acquire small batches of data updates using some kind of web scraping mechanism. We will not deal with the details of that mechanism, as it is beyond the scope of this …
Graph analysis of Stack Overflow tags with Oracle PGX – Part 1: Data Engineering
Introduction Oracle Parallel Graph Analytics (PGX) is a toolkit for graph analysis, both for running algorithms such as PageRank and for performing SQL-like pattern matching against graphs. Extreme performance is offered through algorithm parallelization, and graphs can be loaded from a variety of sources such as flat files, SQL and NoSQL databases, etc. So, in order to get a deeper feeling, …
Streaming data from Raspberry Pi to Oracle NoSQL via Node-RED
As of version 4.2, Oracle NoSQL offers drivers for Node.js and Python, in addition to the existing ones for Java, C, and C++; this is good news for data science people, like myself, since we are not normally accustomed to coding in Java or C/C++. So, I thought I would build a short demo project, putting to the test both the …
Bulk load data to HBase in Oracle Big Data Appliance
I recently ran into an issue while trying to bulk load some data to HBase in Oracle Big Data Appliance. What follows is a reproducible description and solution using the current version of Oracle Big Data Lite VM (4.4.0). Enabling HBase in Oracle Big Data Lite VM (Feel free to skip this section if you do not use Oracle Big Data …