Starting from version 4.2, Oracle NoSQL now offers drivers for Node.js and Python, in addition to the existing ones for Java, C, and C++; this is good news for data science people, like myself, since we are normally not accustomed to code in Java or C/C++. So, I thought to build a short demo project, putting into test both the new drivers as well as some recently acquired skills on IoT development and Node-RED in particular.
In this post we will walk through building a demo application, in which we get processor temperature measurements from a Raspberry Pi and store them to a remotely-hosted Oracle NoSQL database using a Node-RED flow; we will also get a quick glimpse at the Python driver, using it to write rows in Oracle NoSQL outside the main Node-RED flow.
Prerequisites & system configuration
Here is what we will use:
- Raspberry Pi 3
- Oracle NoSQL v4.3.10 (hosted in an Oracle Big Data Lite VM instance)
- Node.js v6.9.5 LTS
- Node-RED v0.16.2 (you may have issues with older versions)
- Python 2.7
- The nosqldb-oraclejs Node.js package v4.3.10 (install with
npm
) - The nosqldb Python package v4.3.10 (install with
pip
)
Despite the claims in the documentation, we need Python 2.7 – Python 2.6 will not work (see the discussion here); Also, the host running Oracle NoSQL must be known by name to the other devices involved (the IP will not suffice – see the discussion here), so we need to add the following entry to Pi’s /etc/hosts file:
192.168.70.74 bigdatalite.localdomain
Overview & preliminaries
Our application will involve two tables in Oracle NoSQL: a parent table named devices
, holding information about the device ID (MAC address), type, and location; and a child table named measurements
, holding the timestamped measurements from a particular device. Here are the necessary DDL statements from the kvlite console:
[oracle@bigdatalite ~]$ java -jar $KVHOME/lib/kvstore.jar runadmin -port 5000 -host localhost kv-> connect store -name kvstore Connected to kvstore at localhost:5000. kv-> execute "CREATE TABLE devices ( -> device_id STRING, -> type STRING, -> location STRING, -> PRIMARY KEY (device_id) -> )"; Statement completed successfully kv-> execute "CREATE TABLE devices.measurements ( -> time STRING, -> temperature DOUBLE, -> PRIMARY KEY (time) -> )"; Statement completed successfully kv-> show tables; Tables: [...] devices devices.measurements [...]
Notice that, although in the second DDL statement above it appears that our measurements
table has only the time
field as its primary key, the fact that it is a child table of devices
means that the parent’s primary key is included , not only as a field, but as a primary key as well:
kv-> execute "DESCRIBE AS JSON TABLE devices.measurements" { "type" : "table", "name" : "measurements", "owner" : null, "parent" : "devices", "shardKey" : [ "device_id" ], "primaryKey" : [ "device_id", "time" ], "fields" : [ { "name" : "device_id", "type" : "STRING", "nullable" : false, "default" : null }, { "name" : "time", "type" : "STRING", "nullable" : false, "default" : null }, { "name" : "temperature", "type" : "DOUBLE", "nullable" : true, "default" : null } ] }
This makes our demo project scalable in a real situation with possibly hundreds of devices, since the combination of MAC address (device_id
) and timestamp will always be unique, hence there is no danger of record loss due to primary key duplication. This is also the reason why we keep our time
field as type STRING instead of TIMESTAMP, since the latter type cannot be used as a table primary key in Oracle NoSQL.
Get the MAC address of the Pi
Getting the MAC address of our Pi is straightforward in Node.js:
pi@raspberrypi:~ $ node > require('os'); [...] > tt = os.networkInterfaces() { lo: [ { address: '127.0.0.1', netmask: '255.0.0.0', family: 'IPv4', mac: '00:00:00:00:00:00', internal: true }, { address: '::1', netmask: 'ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff', family: 'IPv6', mac: '00:00:00:00:00:00', scopeid: 0, internal: true } ], eth0: [ { address: '192.168.70.73', netmask: '255.255.255.0', family: 'IPv4', mac: 'b8:27:eb:0a:0e:e8', internal: false }, { address: 'fe80::7a93:3806:deea:da60', netmask: 'ffff:ffff:ffff:ffff::', family: 'IPv6', mac: 'b8:27:eb:0a:0e:e8', scopeid: 2, internal: false } ] } > tt.eth0[0].mac 'b8:27:eb:0a:0e:e8'
Proxy server and Python driver
Before proceeding to the Node-RED flow for capturing and storing our measurements, notice that we now have enough information to populate our parent table devices
with the first entry. For test & demonstration purposes, we choose to do it using the Python driver.
Both Python & Node.js drivers require a lightweight proxy server in order to communicate with Oracle NoSQL; the necessary jar files are already included in the respective packages nosqldb (Python) and nosqldb-oraclejs (Node.js), so we first need to provide their location in the CLASSPATH environment variable (empty by default in Pi) and then start the proxy:
pi@raspberrypi:~ $ export CLASSPATH="/usr/local/lib/python2.7/dist-packages/nosqldb/kvproxy/lib/*" pi@raspberrypi:~ $ java oracle.kv.proxy.KVProxy -port 7010 -helper-hosts 192.168.70.74:5000 -store kvstore Feb 08, 2017 4:35:55 PM oracle.kv.proxy.KVProxy <init> INFO: PS: Starting KVProxy server Feb 08, 2017 4:35:55 PM oracle.kv.proxy.KVProxy <init> INFO: PS: Connect to Oracle NoSQL Database kvstore nodes : 192.168.70.74:5000 Feb 08, 2017 4:35:56 PM oracle.kv.proxy.KVProxy <init> INFO: PS: ... connected successfully Feb 08, 2017 4:35:58 PM oracle.kv.proxy.KVProxy startServer INFO: PS: Starting listener ( Half-Sync/Half-Async server - 20 no of threads on port 7010, version 4.3.10, protocol version 3)
We have prepared two separate Python scripts – one for connecting to the store (nosqldb_connect.py) and one for inserting a row to the parent table devices
(nosqldb_write_parent.py); here they are:
# nosqldb_connect.py import logging, sys, os # set logging level to debug and log to stdout def setup_logging(): logger = logging.getLogger("nosqldb") logger = logging.StreamHandler(sys.stdout) logger.setLevel(logging.DEBUG) formatter = logging.Formatter('\t%(levelname)s - %(message)s') logger.setFormatter(formatter) rootLogger.addHandler(logger) from nosqldb import ConnectionException from nosqldb import IllegalArgumentException from nosqldb import Factory from nosqldb import StoreConfig from nosqldb import ProxyConfig storehost = "bigdatalite.localdomain:5000" proxy = "localhost:7010" # configure and open the store def open_store(): try: kvstoreconfig = StoreConfig('kvstore', [storehost]) kvproxyconfig = ProxyConfig() print 'Store connection open' return Factory.open(proxy, kvstoreconfig) except ConnectionException, ce: logging.error("Store connection failed.") logging.error(ce.message) sys.exit(-1) # follow by # store = open_store()
# nosqldb_write_parent.py row_parent = { 'device_id' : 'b8:27:eb:0a:0e:e8', 'type' : 'Raspberry Pi 3', 'location' : 'office' } try: store.put("devices", row_parent) print "Store write succeeded." logging.debug("Store write succeeded.") except IllegalArgumentException, iae: logging.error("Could not write to table.") logging.error(iae.message) # follow by # store.close()
Let us now run them from a Python console in our Pi:
>>> execfile('/home/pi/scripts/nosqldb_connect.py') >>> store = open_store() Store connection open >>> execfile('/home/pi/scripts/nosqldb_write_parent.py') Store write succeeded. >>> store.close()
and confirm that we indeed have one row inserted at the parent table devices
in kvstore:
kv-> execute 'SELECT * FROM devices'; +-------------------+----------------+----------+ | device_id | type | location | +-------------------+----------------+----------+ | b8:27:eb:0a:0e:e8 | Raspberry Pi 3 | office | +-------------------+----------------+----------+ 1 row returned
Looks we are OK!
We have kept the scripts for connection and writing to Oracle NoSQL deliberately distinct; that way, the writing script can be easily modified in order, say, to read the MAC addresses and locations of new devices programatically from a text file, and even to call the connection script once in the beginning with execfile()
.
The main Node-RED flow
Having finished with the preliminaries and the short demonstration of the Python driver, we now turn to the main task of our demo project, i.e. constructing a flow in Node-RED which will get processor temperature measurements from our Pi every 5 seconds and store them in the Oracle NoSQL store. Here is the flow schematic in Node-RED:
The unconnected sub-flow in the lower part is executed only once in the beginning, and it is needed in order to automatically get the MAC address of our Pi and use it in the measurements transmitted to Oracle NoSQL.
Recall how we got the Pi MAC address above, using require('os')
? Unfortunately, the require()
command is not available for Node-RED nodes; what we have to do is modify accordingly the functionGlobalContext
in the /home/pi/.node-red/settings.js file, so as to make any such additional requirements available to Node-RED through its global context. And apart from os, we have to include also the nosqldb-oraclejs package. In my case, the said function begins at line 159 of the file, and here is how it must be modified (highlighted lines):
functionGlobalContext: { os:require('os'), nosqldb:require('nosqldb-oraclejs'), // octalbonescript:require('octalbonescript'), // jfive:require("johnny-five"), // j5board:require("johnny-five").Board({repl:false}) crc:require("crc") },
We have to restart Node-RED in order for the changes to be applied. Notice that if you use an old version of Node-RED, you may encounter some issues here with nosqldb-oraclejs; but with Node-RED 0.16.2 (latest version at the time of writing), everything runs smoothly:
pi@raspberrypi:~ $ node-red-start Start Node-RED [...] Started Node-RED graphical event wiring tool.. Welcome to Node-RED =================== 8 Feb 19:42:56 - [info] Node-RED version: v0.16.2 8 Feb 19:42:56 - [info] Node.js version: v6.9.5 8 Feb 19:42:56 - [info] Linux 4.1.18-v7+ arm LE 8 Feb 19:42:56 - [info] Loading palette nodes pi : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/bin/python -u /usr/lib/node_modules/node-red/nodes/core/hardware/nrgpio.py info pam_unix(sudo:session): session opened for user root by (uid=0) pam_unix(sudo:session): session closed for user root 8 Feb 19:42:59 - [info] Settings file : /home/pi/.node-red/settings.js 8 Feb 19:42:59 - [info] User directory : /home/pi/.node-red 8 Feb 19:42:59 - [info] Flows file : /home/pi/.node-red/flows_raspberrypi.json 8 Feb 19:42:59 - [info] Server now running at http://127.0.0.1:1880/ 8 Feb 19:42:59 - [info] Starting flows 8 Feb 19:42:59 - [info] Started flows
Getting the Pi MAC address in Node-RED
Here is the Node.js code inside our get_mac
node:
// We need to include 'os' in functionGlobalContext // located at /home/pi/settings.js var tt = global.get('os').networkInterfaces(); global.set('mac', tt.eth0[0].mac);
Notice the difference in accessing the os module compared with our plain Node.js code above: now os is included in our global context, and we access it using global.get('os')
.
The format node
Here is the code inside our format
node:
var temp = parseFloat(msg.payload.replace( /[^\d\.]*/g, '')); var time = new Date().toISOString().replace(/\..+/, ''); var mac = global.get('mac'); msg.payload = {'device_id' : mac, 'measurements' : [ { 'time' : time }, { 'temperature' : temp } ] }; return msg;
As the name implies, its role is simply to parse the temperature and time measurements, bundle them with the global variable mac
(set in the node get_mac
), and pass the data along to the next node write_db
.
The write_db node
This is where we invoke the Node.js driver for connecting and writing to the Oracle NoSQL store; here is the node code:
var nosqldb = global.get('nosqldb'); // Create a configuration object var configuration = new nosqldb.Configuration(); configuration.proxy.startProxy = true; configuration.proxy.host = 'localhost:7010'; configuration.storeHelperHosts = ['192.168.70.74:5000', 'bigdatalite.localdomain:5000']; configuration.storeName = 'kvstore'; // Create a store with the specified configuration var store = nosqldb.createStore(configuration); // Create the row to be inserted into the child table 'devices.measurements' var childRow = {'device_id' : msg.payload.device_id, 'time' : msg.payload.measurements[0].time, 'temperature' : msg.payload.measurements[1].temperature }; // Define the insert action store.on('open', function () { console.log('Store opened.'); store.put('devices.measurements', childRow, function (err) { if (err) throw err; else console.log("Child row inserted."); }); }).on('error', function(error) { console.log('Error in the store.'); console.log(error); }); // Perform the row insert store.open(); // for diagnostics in Node-RED only: msg.payload = childRow; return msg;
A nice feature of the Node.js driver (currently missing from the Python one) is that we don’t need to manually start the proxy server; it suffices to set configuration.proxy.startProxy = true
in the code, as in line 5 of the above snippet.
We are now ready to start inserting measurements from our Raspberry Pi into Oracle NoSQL; deploying the flow from the Node-RED web UI, we get the following info at the console:
9 Feb 16:49:18 - [info] Starting flows 9 Feb 16:49:18 - [info] Started flows Store opened. Child row inserted. Store opened. Child row inserted. Store opened. Child row inserted. Store opened. Child row inserted.
At which point we can confirm that we have already 4 records in the devices.measurements
table, with 5 seconds separation between them (the trigger time interval in our flow, defined in the inject
node):
kv-> execute 'SELECT * FROM devices.measurements'; +-------------------+---------------------+-------------+ | device_id | time | temperature | +-------------------+---------------------+-------------+ | b8:27:eb:0a:0e:e8 | 2017-02-09T14:49:23 | 36.9 | | b8:27:eb:0a:0e:e8 | 2017-02-09T14:49:28 | 40.1 | | b8:27:eb:0a:0e:e8 | 2017-02-09T14:49:33 | 38.5 | | b8:27:eb:0a:0e:e8 | 2017-02-09T14:49:38 | 37.9 | +-------------------+---------------------+-------------+ 4 rows returned
Deploy the flow to more devices
It seems we are set. But there is one final step: we have shown how to connect to and get measurements from just one Raspberry Pi device; what if we had 50 (or 500…) such devices? Would we have to connect manually to each one of them with a browser, load the flow in Node-RED GUI, and press ‘Deploy’?
Fortunately no. Although rather buried and obscured in the documentation, there is a straightforward way for deploying a flow to a remote machine running Node-RED via a REST API.
So, as a last demonstration, here is how we can deploy the flow remotely to a second Raspberry Pi located, say, in the roof; first, we get its MAC address, and we use the Python driver as above to register it to our parent table devices
, so now it looks like that:
kv-> execute 'SELECT * FROM devices'; +-------------------+----------------+----------+ | device_id | type | location | +-------------------+----------------+----------+ | b8:27:eb:00:98:1b | Raspberry Pi 3 | roof | | b8:27:eb:0a:0e:e8 | Raspberry Pi 3 | office | +-------------------+----------------+----------+ 2 rows returned
Then, after we have saved our flow in a file nosqldb_demo.json in a remote machine (here we use the Oracle Big Data Lite VM), we send a POST request to the Node-RED service of our second Pi like this:
[oracle@bigdatalite scripts]$ curl -v -X POST http://192.168.70.77:1880/flows -H "Content-Type: application/json" --data @nosqldb_demo.json * About to connect() to 192.168.70.77 port 1880 (#0) * Trying 192.168.70.77... connected * Connected to 192.168.70.77 (192.168.70.77) port 1880 (#0) > POST /flows HTTP/1.1 > User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2 > Host: 192.168.70.77:1880 > Accept: */* > Content-Type: application/json > Content-Length: 3517 > Expect: 100-continue > < HTTP/1.1 100 Continue < HTTP/1.1 204 No Content < X-Powered-By: Express < Date: Mon, 13 Feb 2017 10:49:10 GMT < Connection: keep-alive < * Connection #0 to host 192.168.70.77 left intact * Closing connection #0
We see that we get a 204 response (highlighted) meaning success. Here is the Node-RED console output in this remote Pi:
13 Feb 12:49:10 - [info] Starting flows 13 Feb 12:49:10 - [info] Started flows Store opened. Child row inserted. Store opened. Child row inserted. Store opened. Child row inserted.
i.e. 3 new records added to our devices.measurements
table, which we can easily verify in the kvlite console:
kv-> execute 'SELECT * FROM devices.measurements'; +-------------------+---------------------+-------------+ | device_id | time | temperature | +-------------------+---------------------+-------------+ | b8:27:eb:00:98:1b | 2017-02-13T10:49:15 | 39.7 | | b8:27:eb:00:98:1b | 2017-02-13T10:49:20 | 42.9 | | b8:27:eb:00:98:1b | 2017-02-13T10:49:25 | 40.2 | | b8:27:eb:0a:0e:e8 | 2017-02-09T14:49:23 | 36.9 | | b8:27:eb:0a:0e:e8 | 2017-02-09T14:49:28 | 40.1 | | b8:27:eb:0a:0e:e8 | 2017-02-09T14:49:33 | 38.5 | | b8:27:eb:0a:0e:e8 | 2017-02-09T14:49:38 | 37.9 | +-------------------+---------------------+-------------+ 7 rows returned
Summary
We have demonstrated a complete and scalable end-to-end application for storing data from multiple Raspberry Pi devices into Oracle NoSQL via Node-RED; to the best of our knowledge, this is the first demonstration of a connection to Oracle NoSQL using Node-RED. Despite the fact that the Node.js driver for Oracle NoSQL was probably not built with Node-RED in mind, we can happily report that it indeed works smoothly with Node-RED.
As always, comments and remarks most welcome.-
- Streaming data from Raspberry Pi to Oracle NoSQL via Node-RED - February 13, 2017
- Dynamically switch Keras backend in Jupyter notebooks - January 10, 2017
- sparklyr: a test drive on YARN - November 7, 2016