OS X El Capitan Upgrade Hangs

Just as it did when I upgraded from OS X Mavericks to OS X Yosemite a year ago, my upgrade from OS X Yosemite to OS X El Capitan hung in the same nondescript manner: a blank screen showing only the pointer, for a seemingly infinite period.

I tried to wait it out for a few hours (surely a 30%-filled SSD shouldn’t make it need longer?), but that made no difference. The good bit is that this state is easy to recover from: just hold your power button down until the machine powers off, and then restart it. It will boot back into OS X Yosemite (or whatever older OS you had running) without issues.

I then recalled the fix I used the last time: moving /usr/local out of the way, since I’m a heavy Homebrew user. I tried it again, and the next upgrade attempt succeeded in under 20 minutes.

Before the upgrade, I ran in the Terminal.app:

sudo mv /usr/local ~/local

And after the upgrade (with a brew update command just to be sure it is not broken after the move-back):

sudo rmdir /usr/local && sudo mv ~/local /usr/
brew update
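
The same move-aside and move-back steps can be wrapped with a few safety checks. This is only a sketch: the function names stash_dir and restore_dir are made up for illustration and are not part of OS X or Homebrew.

```shell
#!/bin/sh
# Hypothetical helpers (illustrative names, not part of OS X or Homebrew).

# stash_dir SRC DEST: move SRC aside, refusing to overwrite an existing DEST.
stash_dir() {
    src=$1 dest=$2
    [ -d "$src" ] || { echo "nothing to move: $src" >&2; return 1; }
    [ ! -e "$dest" ] || { echo "refusing to overwrite: $dest" >&2; return 1; }
    mv "$src" "$dest"
}

# restore_dir DEST SRC: move the stashed copy at DEST back to SRC.
# If the upgrade recreated SRC as an empty directory, remove it first;
# rmdir fails on a non-empty directory, so nothing is discarded silently.
restore_dir() {
    dest=$1 src=$2
    [ -d "$dest" ] || { echo "no stashed copy: $dest" >&2; return 1; }
    if [ -d "$src" ]; then
        rmdir "$src" || return 1
    fi
    mv "$dest" "$src"
}
```

Since /usr/local needs root access, these would be run from a root shell: stash_dir with /usr/local and a stash path of your choosing before the upgrade, and restore_dir with the same two paths swapped after it.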

Writing a simple Kudu Java API Program

This post is about using Cloudera’s new Kudu service, via its Java API interface, in a Maven-based Java program.

Kudu ships with a ready-to-use Java client library, and its jars are also published on Cloudera’s Maven repositories. To use it in your Maven project, add the following to the appropriate sections of your pom.xml:

The Repository:

        <name>Cloudera Repositories</name>

The Library:


To start off, you will need to create a KuduClient object, like so:

KuduClient kuduClient =
   new KuduClientBuilder("kudu-master-hostname").build();

To create a table, a schema needs to be built first, and the table is then created via the client. Say we have a simple schema for a table “users”, with the columns “username” (string, key) and “age” (8-bit signed int, value). We can create it as shown below, using ColumnSchema and Schema objects together with the previously created kuduClient object:

// "username" is the key column
ColumnSchema usernameCol =
    new ColumnSchemaBuilder("username", Type.STRING)
      .key(true).build();
ColumnSchema ageCol =
    new ColumnSchemaBuilder("age", Type.INT8)
      .build();

List<ColumnSchema> columns = new ArrayList<ColumnSchema>();
columns.add(usernameCol);
columns.add(ageCol);

Schema schema = new Schema(columns);
String tableName = "users";

if ( ! kuduClient.tableExists(tableName) ) {
    kuduClient.createTable(tableName, schema);
}

Data work (such as inserts, updates or deletes) in Kudu is done within Sessions. Continuing the example, the snippet below inserts a row: it creates a Session and a KuduTable for the table “users”, and then applies the insert object through the session:

KuduSession session = kuduClient.newSession();
KuduTable table = kuduClient.openTable(tableName);

Insert insert = table.newInsert();
insert.getRow().addString("username", "harshj");
// "age" is an INT8 column, so it takes a byte
insert.getRow().addByte("age", (byte) 25);
session.apply(insert);

Likewise, you can update existing rows:

Update update = table.newUpdate();
update.getRow().addString("username", "harshj");
// Change from 25 previously written
update.getRow().addByte("age", (byte) 26);
session.apply(update);

Or even delete them by key column (“username”):

Delete delete = table.newDelete();
delete.getRow().addString("username", "harshj");
session.apply(delete);

Reading rows can be done via the KuduScanner class. The below example shows how to fetch only the key column data (all of it):

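List<String> columnNames = new ArrayList<String>();
columnNames.add("username");

KuduScanner scanner = kuduClient.newScannerBuilder(table)
    .setProjectedColumnNames(columnNames).build();

while (scanner.hasMoreRows()) {
    for (RowResult row : scanner.nextRows()) {
        System.out.println(row.getString("username"));
    }
}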

A fully runnable example can also be found in Kudu’s kudu-examples repository.

Checking out a git branch when multiple remotes (repos) are involved

When you are dealing with a local git repository that has multiple remotes defined (i.e. multiple remote repositories possibly carrying the same branches, such as multiple forks), the right way to check out a working branch from one of them is to use the syntax below:

git checkout -b master origin/master

Where origin is the name of the remote.

You can also delete a local branch before switching to another repository’s copy of it:

git checkout origin/master
git branch -D master
git checkout -b master origin/master
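
When two remotes both carry a master branch, it can help to fetch the remote first and give the local branch its own name, so the copies do not collide. Below is a small sketch of that idea. The function name checkout_from_remote and the remote name upstream in the usage note are assumptions for illustration, not something git defines.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name, not a git built-in).
# checkout_from_remote REMOTE BRANCH [LOCAL_NAME]
# Fetches REMOTE, then creates and switches to a local branch that starts
# at REMOTE/BRANCH. LOCAL_NAME defaults to BRANCH when omitted.
checkout_from_remote() {
    remote=$1 branch=$2 local_name=${3:-$2}
    git fetch "$remote" || return 1
    git checkout -b "$local_name" "$remote/$branch"
}
```

For example, with a second remote named upstream, checkout_from_remote upstream master upstream-master leaves your existing local master untouched and puts the other repository’s copy on a branch called upstream-master.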

Purging away unnecessary local maven repository jars

As someone who works with a lot of Java (or other JVM-related) projects, I often notice my $HOME/.m2 local Maven cache growing quite big over time. This is a natural result of running a lot of mvn install and equivalent commands, which build and install the project jars locally (some large projects, such as Apache Hadoop, end up requiring the local repository instead of built targets to resolve their inter-dependencies).

More often than not, when working on such projects, the version is set to a SNAPSHOT-styled one, for example spark-streaming-flume_2.10/1.5.0-SNAPSHOT, and these can be deleted when you are no longer working on the project. Here’s the simple command I use to erase them:

find $HOME/.m2 -type d | grep SNAPSHOT\$ | xargs rm -rf
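
Before deleting anything, it can be reassuring to list what would be removed. The sketch below prints the matching directories without touching them. The function name list_snapshot_dirs is made up for illustration, and it matches on the directory name with find’s -name test rather than grep, which also copes with paths containing spaces.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name): a dry run for the cleanup.
# list_snapshot_dirs [REPO_DIR]
# Prints every directory under REPO_DIR (default: $HOME/.m2) whose name
# ends in SNAPSHOT. -prune stops find from descending into a match.
list_snapshot_dirs() {
    repo=${1:-$HOME/.m2}
    find "$repo" -type d -name '*SNAPSHOT' -prune -print
}
```

Running list_snapshot_dirs with no arguments shows the candidates in your own cache, and nothing is deleted.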

Why your mobile data connection may not work on Mi4 (or other MIUI devices)

As an Airtel user in India with a new Xiaomi Mi4, I ran into an issue as soon as I began to use the phone: the mobile data connection (the Mobile Office APN in Airtel’s case) would connect instantly, but simply not work in applications.

This is evident in Google Maps, for example, which would never get a location fix over the internet connection once you switch from WiFi to mobile data while moving about.

The likely reason is an optimisation that MIUI does, combined with a small bug in its APN management module. When you use a new SIM, the interface tries to auto-configure your APNs for immediate connectivity, but ends up creating duplicate APN entries with the same name.

To fix this, open your Settings application and go to the Mobile Networks option.

Mobile Networks under Settings application

Within this, go to Access Point Names, where you may see duplicates of every entry in the list.

Access Point Names under Mobile Networks Settings

Duplicated entries inside Access Point Names list

The solution is to delete the duplicates so that only one entry of each kind remains (i.e. only one each of Airtel Live!, Mobile Office and even Airtel MMS). To do this, tap the arrow icon on the chosen duplicate, and then use the More button to delete the APN via the presented option.

Delete the APN via the More button inside the duplicated entry details view

Hope this helps you get back your mobile data connectivity!

About a moto-scooter I own

I purchased a Honda Dio moto-scooter in August 2014, mainly for commuting to work, and I’ve been monitoring its mileage since then. This post is about my observations, along with some mileage data that may help others evaluate their own purchases. So far I’ve consistently gotten an average of 52+ kilometres per litre of petrol, which is very satisfactory.

Below are some of the mileage charts from data I’ve collected via the Fuelio app from March/April 2014 to August 2015.

View post on imgur.com

Some notes:

  • I try to brake as little as I can, preferring to slow down early when I can see a signal or jam coming up, and coming to a gradual halt at such points when possible.
  • The lower two points of the Fuel Consumption charts represent a period of driving where I had to brake a lot more (a more erratic driving style than usual).
  • I do my best to stay within speed limits, and in the economy range (35-45 km/h).

Trusting self-signed SSL certificates in Cloudera Hue

Using self-signed certificates is an easy process. It’s very useful in local test environments, something we often rely on at my work.

However, it has its cons when the width of the ecosystem weighs in (software written in languages other than Java, such as Python, C++, etc.), as the support and configuration for trusting self-signed certificates varies from platform to platform.

Cloudera Hue runs on Python (and uses PyOpenSSL plus the Python Requests library). Running Hue itself over a self-signed certificate is as easy as configuring it, but making it trust the other services it talks to (such as HDFS or YARN) gets difficult when those services also use self-signed certificates. You may run into the following error, or similar forms of it, in Hue’s File Browser and other parts:

Processing exception: [Errno 1] _ssl.c:492: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed: Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/core/handlers/base.py", line 112, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/db/transaction.py", line 371, in inner
    return func(*args, **kwargs)
  File "/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p0.4/lib/hue/apps/jobbrowser/src/jobbrowser/views.py", line 119, in jobs
    raise ex
RestException: [Errno 1] _ssl.c:492: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

While not something you’d want to do on a production grade cluster, you can still get Hue to work in such an environment by applying a small change in the Hue process environment. If you use Cloudera Manager, you could append the below into the CM -> Hue -> Configuration -> Hue Server Environment Advanced Configuration Snippet (Safety Valve) field, save and restart the Hue service:


This utilises the Python Requests library feature of using a custom CA Certificate bundle instead of the default one (which PyOpenSSL appears to self-bundle, rather than reuse the system default certs). You only need to make sure that the pointed PEM file carries all the necessary host certificates (and chain certificates) to allow Hue to talk to every applicable service in the cluster.
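
One way to assemble such a PEM file is to concatenate the individual certificates. The sketch below is an illustration only: the function name build_ca_bundle and all file paths are hypothetical. The Requests library reads a custom bundle path from the REQUESTS_CA_BUNDLE environment variable, though whether that is the exact setting used in the snippet above is an assumption on my part.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name and paths).
# build_ca_bundle OUTPUT CERT...
# Concatenates PEM-encoded certificates into a single bundle file,
# rejecting any input that does not look like a PEM certificate.
build_ca_bundle() {
    out=$1
    shift
    : > "$out"    # start from an empty bundle
    for cert in "$@"; do
        grep -q 'BEGIN CERTIFICATE' "$cert" || {
            echo "not a PEM certificate: $cert" >&2
            return 1
        }
        cat "$cert" >> "$out"
        echo >> "$out"    # keep certificates apart if a file lacks a final newline
    done
}

# Example environment entry (assumed form, hypothetical path):
# REQUESTS_CA_BUNDLE=/path/to/bundle.pem
```

For instance, build_ca_bundle /path/to/bundle.pem namenode.pem resourcemanager.pem ca-chain.pem would gather three (hypothetical) certificate files into one bundle.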

Hat tip to my most excellent colleague, Chris Conner, for pointing me to this feature.

A fresh start

Welcome, whoever still visits here!

I’ve just changed hosting (thank you, Surya, for hosting me all these past years), and as you may notice, I’ve cleared away all the old content for a fresh start.

A lot has changed in my life over the ten years I’ve run this personal website, so it’s time I also changed the content to focus more on the present.