Thursday, May 24, 2018

Gradle OutOfMemoryError



OutOfMemoryError – it’s possible that Gradle is running in 32-bit mode when it should be running in 64-bit mode. To check whether Gradle is running in 32-bit or 64-bit mode, dump out a few system properties in the build.gradle file as follows:
println System.properties['os.arch']
println System.properties['sun.arch.data.model']

If sun.arch.data.model has a value of 32, then it’s running in 32-bit mode.
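A quicker sanity check from the command prompt is java -version; a 64-bit HotSpot JVM identifies itself as a "64-Bit Server VM" in its last line, while a 32-bit JVM typically prints just "Client VM" or "Server VM" without the 64-bit marker:

java -version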

Double-check that the JAVA_HOME environment variable points to a 64-bit JDK, i.e. a path similar to
C:\Program Files\Java\jdk1.8.0_171
If the path points somewhere under C:\Program Files (x86), then it is most likely a 32-bit JVM. In that case, install the 64-bit JDK and point JAVA_HOME at it.
Another symptom of running in 32-bit mode: if you try to increase the memory allocation beyond roughly 1 GB, you may get the following error:

C:\Users\Philip\git\lims>gradle clean
Error occurred during initialization of VM
Could not reserve enough space for 2097152KB object heap
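Once a 64-bit JDK is in place, pointing JAVA_HOME at it should let the build through. A rough sketch from the same prompt (the JDK path is just the example from above; adjust for your install):

C:\Users\Philip\git\lims>set JAVA_HOME=C:\Program Files\Java\jdk1.8.0_171
C:\Users\Philip\git\lims>gradle clean

If more heap is still needed on top of that, org.gradle.jvmargs=-Xmx2g in gradle.properties is the usual place to set it.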

Sunday, May 14, 2017

ElasticSearch: Issues Adding a new node

The Issue

The new node was visible in the cluster, but existing shards were not relocating to it.

The steps

I had a pre-existing elasticsearch cluster of 3 nodes, and I went about adding a new node. In a round-robin fashion, I updated the elasticsearch.yml configuration of the pre-existing nodes to include the new node by updating the list of hosts and the minimum number of master nodes:

elasticsearch.yml
 discovery.zen.ping.unicast.hosts: ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]  
 discovery.zen.minimum_master_nodes: 3  

I then restarted each node and checked the cluster health as follows:


 [root@mongo-elastic-node-1 centos]# curl -XGET 10.0.0.1:9200/_cluster/health?pretty  
 {  
  "cluster_name" : "cpi",  
  "status" : "green",  
  "timed_out" : false,  
  "number_of_nodes" : 4,  
  "number_of_data_nodes" : 4,  
  "active_primary_shards" : 40,  
  "active_shards" : 71,  
  "relocating_shards" : 2,  
  "initializing_shards" : 0,  
  "unassigned_shards" : 0,  
  "delayed_unassigned_shards" : 0,  
  "number_of_pending_tasks" : 0,  
  "number_of_in_flight_fetch" : 0,  
  "task_max_waiting_in_queue_millis" : 0,  
  "active_shards_percent_as_number" : 100.0  
 }  

The important item to notice above is "relocating_shards": it's saying that the cluster is relocating 2 shards. To find out which shards are going where, you can check with this command:

 [root@mongo-elastic-node-1 centos]# curl -XGET http://10.0.0.9:9200/_cat/shards | grep RELO  
  % Total  % Received % Xferd Average Speed  Time  Time   Time Current  
                  Dload Upload  Total  Spent  Left Speed  
 100 7881 100 7881  0   0  318k   0 --:--:-- --:--:-- --:--:-- 334k  
 cpi12         2 p RELOCATING 6953804  5.8gb 10.0.0.2 cpi2 -> 10.0.0.4 fBmdkD2gT6-jTJ6k_bEF0w cpi4  
 cpi12         0 r RELOCATING 6958611  5.5gb 10.0.0.3 cpi3 -> 10.0.0.4 fBmdkD2gT6-jTJ6k_bEF0w cpi4  

Here it's saying that the cluster is trying to send shards belonging to the index called cpi12 from nodes cpi2 and cpi3 to node cpi4. More specifically, it's RELOCATING shard #2 and shard #0 to cpi4. To monitor its progress, I logged into cpi4 to see whether the disk space usage was going up. And here is where I noticed my first problem:


 [root@elastic-node-4 elasticsearch]# df -h  
 Filesystem   Size Used Avail Use% Mounted on  
 /dev/vdb     69G  52M  66G  1% /mnt  

The mounted folder where I expected to find my elasticsearch data remained unchanged at 52 MB.
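In hindsight, watching disk usage is an indirect measure; the _cat/recovery endpoint shows relocation progress more directly. A sketch of the kind of check I could have run:

 # in-flight recoveries/relocations; finished ones show a stage of "done"
 curl -s '10.0.0.4:9200/_cat/recovery?v' | grep -v done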

Debugging

I was stumped on this one for a long time and worked through the following checks:

  • The elasticsearch.yml config file on every node, ensuring that discovery.zen.ping.unicast.hosts was set correctly.
  • Every node could ping the new node and vice versa.
  • Every node could access ports 9200 and 9300 on the new node and vice-versa using the telnet command.
  • Every node had sufficient disk space for the shard relocation.
  • The new node had the right permissions to write to its elasticsearch data folder.
  • Checked the cluster settings (curl 'http://localhost:9200/_cluster/settings?pretty') for any restrictive cluster.routing settings (see the example after this list).
  • Restarted elasticsearch on each node 3 times over
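For the cluster-settings check, something along these lines both shows the current routing settings and re-enables shard allocation in case something had disabled it; a sketch, assuming nothing else relies on the transient settings:

 # show current transient/persistent cluster settings
 curl -XGET 'http://localhost:9200/_cluster/settings?pretty'
 # re-enable shard allocation in case it had been disabled
 curl -XPUT 'http://localhost:9200/_cluster/settings?pretty' -d '{ "transient" : { "cluster.routing.allocation.enable" : "all" } }'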
However, none of the above solved the issue. Even worse, the repeated restarts of each node managed to get my cluster into an even worse state, where some of the shards became UNASSIGNED:

 [root@mongo-elastic-node-1 bin]# curl -XGET http://10.0.0.1:9200/_cat/shards | grep UNASS  
  % Total  % Received % Xferd Average Speed  Time  Time   Time Current  
                  Dload Upload  Total  Spent  Left Speed  
 100 5250 100 5250  0   0  143k   0 --:--:-- --:--:-- --:--:-- 146k  
 .marvel-es-2017.05.13 0 p UNASSIGNED  
 .marvel-es-2017.05.13 0 r UNASSIGNED  
 .marvel-es-2017.05.14 0 p UNASSIGNED  
 .marvel-es-2017.05.14 0 r UNASSIGNED  
 cpi14         1 p UNASSIGNED  
 cpi13         1 p UNASSIGNED  
 cpi13         4 p UNASSIGNED  

After much browsing on the web, I found a post mentioning that the set of installed plugins must be exactly the same on all nodes: http://stackoverflow.com/questions/28473687/elasticsearch-cluster-no-known-master-node-scheduling-a-retry

The solution

The mention of plugins jogged my memory: I had previously installed the Marvel plugin on the original nodes. To see which plugins are installed on a node, run the plugin command from the command line:

 [root@elastic-node-3 elasticsearch]# cd /usr/share/elasticsearch/bin  
 [root@elastic-node-3 bin]# ./plugin list  
 Installed plugins in /usr/share/elasticsearch/plugins:  
   - license  
   - marvel-agent  

It turned out my three pre-existing nodes each had the license and marvel-agent plugins installed, whereas the fresh install on the 4th node had no plugins at all. Because of this, the nodes were able to acknowledge each other, but refused to talk. To fix this, I manually removed the plugins from each of the original nodes:

 [root@elastic-node-3 bin]# ./plugin remove license  
 -> Removing license...  
 Removed license  
 [root@elastic-node-3 bin]# ./plugin remove marvel-agent  
 -> Removing marvel-agent...  
 Removed marvel-agent  
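The other way around would presumably have worked too: installing the same two plugins on the new node instead of removing them from the original nodes (untested in my case):

 ./plugin install license
 ./plugin install marvel-agent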

Before I could see whether shard relocation would work, I first had to allocate the UNASSIGNED shards:

 [root@mongo-elastic-node-1 elasticsearch]# curl -XPOST -d '{ "commands" : [{ "allocate" : { "index": "cpi14", "shard":1, "node":"cpi4", "allow_primary":true } }]}' localhost:9200/_cluster/reroute?pretty  
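This has to be repeated for every UNASSIGNED shard; with more than a handful of them, a small loop saves some typing. A rough, untested sketch, assuming the same ES 2.x-style allocate command and that everything should land on cpi4:

 # reroute every UNASSIGNED shard reported by _cat/shards onto the new node (cpi4)
 curl -s 'http://localhost:9200/_cat/shards' | awk '$4 == "UNASSIGNED" { print $1, $2 }' |
 while read index shard; do
   curl -XPOST 'http://localhost:9200/_cluster/reroute?pretty' -d "{
     \"commands\" : [ { \"allocate\" : { \"index\" : \"$index\", \"shard\" : $shard, \"node\" : \"cpi4\", \"allow_primary\" : true } } ]
   }"
 done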

Once every UNASSIGNED shard had been rerouted, checking the cluster health showed that there were no more unassigned shards and that 2 shards were currently relocating:

 [root@elastic-node-4 elasticsearch]# curl -XGET localhost:9200/_cluster/health?pretty  
 {  
  "cluster_name" : "cpi",  
  "status" : "green",  
  "timed_out" : false,  
  "number_of_nodes" : 4,  
  "number_of_data_nodes" : 4,  
  "active_primary_shards" : 40,  
  "active_shards" : 71,  
  "relocating_shards" : 2,  
  "initializing_shards" : 0,  
  "unassigned_shards" : 0,  
  "delayed_unassigned_shards" : 0,  
  "number_of_pending_tasks" : 0,  
  "number_of_in_flight_fetch" : 0,  
  "task_max_waiting_in_queue_millis" : 0,  
  "active_shards_percent_as_number" : 100.0  
 }  

Checking the disk space usage on the new node again showed that shards were indeed relocating this time. Yay!

References

http://stackoverflow.com/questions/23656458/elasticsearch-what-to-do-with-unassigned-shards

http://stackoverflow.com/questions/28473687/elasticsearch-cluster-no-known-master-node-scheduling-a-retry

https://www.elastic.co/guide/en/elasticsearch/plugins/2.2/listing-removing.html

Wednesday, May 3, 2017

MongoDB: switching to the WiredTiger storage engine

We were already running a replicated MongoDB cluster of 3 nodes in production which was running out of disk space, each node having access to a 750 GB drive at 77% usage. The obvious solution was to expand the disk space, but at the same time I wanted to be more efficient with the disk space usage itself.

Previously we were using the storage engine called MMAPv1, which has no support for compression, and I wanted to switch over to the WiredTiger storage engine, which does support compression options.

Here I describe the strategy I used:


Since my MongoDB cluster was replicated, I was able to take down one node at a time to perform the switch to WiredTiger. Once I was finished with one node, I could bring it back up, take down the next node, and so on until all nodes were upgraded. Doing it this way, there was no downtime whatsoever from the user's perspective.


For each node I did the following (a condensed shell sketch follows the list):



  • Shut down the mongod service.
  • Moved the mongo data folder, which in my case was /var/lib/mongo, to attached storage on another volume as a backup in case the procedure failed.
  • Recreated the mongo data folder and assigned the appropriate permissions: chown mongod:mongod /var/lib/mongo
  • Modified the /etc/mongod.conf configuration file to include: storageEngine=wiredTiger
  • Restarted the mongod service.
  • Checked that wiredTiger was configured correctly from the mongo command line:
 db.serverStatus().storageEngine  
 { "name" : "wiredTiger", "supportsCommittedReads" : true }  

Now that the node is back up and running, replication will happen in the background. If you head over to your primary mongo node and run rs.status(), you should see the rebuilt node in the STARTUP2 state.
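A quick way to watch the member states from the command line; a sketch, assuming the mongo client is on the PATH and pointed at the primary:

 # prints each replica-set member with its state (PRIMARY, SECONDARY, STARTUP2, ...)
 mongo --quiet --eval 'rs.status().members.forEach(function (m) { print(m.name + "  " + m.stateStr) })'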
Once the node has replicated successfully, repeat the same procedure for the next node.

Reference:


https://docs.mongodb.com/v3.0/release-notes/3.0-upgrade/?_ga=1.86531032.1131483509.1428671022#change-replica-set-storage-engine-to-wiredtiger

https://askubuntu.com/questions/643252/how-to-migrate-mongodb-2-6-to-3-0-with-wiredtiger

Tuesday, March 14, 2017

mongodb won't start: Data directory /data/db not found

Our VM provider suffered a hardware failure and one of our mongo nodes failed to start up. Using the command:

 sudo mongod 

 I had the following error message:
 2017-03-15T09:05:27.963+1100 I CONTROL [initandlisten] MongoDB starting : pid=1347 port=27017 dbpath=/data/db 64-bit host=mongodb-node-3.novalocal  
 2017-03-15T09:05:27.963+1100 I CONTROL [initandlisten] db version v3.2.5  
 2017-03-15T09:05:27.963+1100 I CONTROL [initandlisten] git version: 34e65e5383f7ea1726332cb175b73077ec4a1b02  
 2017-03-15T09:05:27.963+1100 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013  
 2017-03-15T09:05:27.963+1100 I CONTROL [initandlisten] allocator: tcmalloc  
 2017-03-15T09:05:27.963+1100 I CONTROL [initandlisten] modules: none  
 2017-03-15T09:05:27.963+1100 I CONTROL [initandlisten] build environment:  
 2017-03-15T09:05:27.963+1100 I CONTROL [initandlisten]   distmod: rhel70  
 2017-03-15T09:05:27.963+1100 I CONTROL [initandlisten]   distarch: x86_64  
 2017-03-15T09:05:27.963+1100 I CONTROL [initandlisten]   target_arch: x86_64  
 2017-03-15T09:05:27.963+1100 I CONTROL [initandlisten] options: {}  
 2017-03-15T09:05:27.985+1100 I STORAGE [initandlisten] exception in initAndListen: 29 Data directory /data/db not found., terminating  
 2017-03-15T09:05:27.985+1100 I CONTROL [initandlisten] dbexit: rc: 100  
The important error is the STORAGE line near the end: "Data directory /data/db not found."

At first glance it looked like mongo was completely ignoring the dbpath configured in /etc/mongod.conf. In fact, running mongod by hand like this never reads /etc/mongod.conf at all (you would need to pass -f /etc/mongod.conf), which is why it fell back to the default /data/db.
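For reference, starting it by hand with the same settings as the service means passing the config file explicitly:

 sudo mongod -f /etc/mongod.conf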

So what was going on?

It turned out that mongo's journal folder had been corrupted when the server shut down abruptly. Removing the /var/lib/mongo/journal folder solved the problem.
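If I were doing it again I'd move the journal aside rather than delete it outright; a minimal sketch, assuming the dbpath from /etc/mongod.conf is /var/lib/mongo and mongod runs as a systemd service:

 sudo systemctl stop mongod
 sudo mv /var/lib/mongo/journal /var/lib/mongo/journal.bak   # keep a copy until the node is healthy again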


Then I restarted mongod and that got everything back up and running again!

Reference: http://stackoverflow.com/questions/20729155/mongod-shell-doesnt-start-data-db-doesnt-exsist

Tuesday, February 21, 2017

Fineuploader with Grails

Fineuploader is an excellent frontend JavaScript library with full-featured uploading capabilities such as concurrent file chunking and file resume. Our use case involves uploading extremely large files such as genomic DNA sequencing data, including FASTQs, BAMs and VCFs. The JavaScript library itself runs out of the box, but you will still need to implement some server-side code to handle the file uploading.

There's already a GitHub repository with examples of server-side implementations for popular programming languages like Java, Python, PHP and Node.js, which can be found here:

https://github.com/FineUploader/server-examples

However, I could not find any example of a Grails implementation, so once again I rolled my own solution, which can be found below. For this implementation I've focused only on concurrent chunking and file resume. Other features such as file deletion were omitted on purpose, but that's not to say you couldn't modify it to support the rest of Fineuploader's features.



And for the GSP I have something like this:


Monday, February 6, 2017

Gradle intellij


Commands used to get IntelliJ to recognize the Gradle project:

  • Run gradle idea from the project directory.
  • From the IntelliJ GUI: File -> Invalidate Caches/Restart.
  • Import the project from Gradle.

Monday, January 23, 2017

Grails 3 assets duplicated

For some reason, in Grails 3, the asset (JavaScript, CSS) declarations were being duplicated in the HTML source, as shown below:

Generated HTML:
   <script type="text/javascript" src="/assets/jquery-2.2.0.min.js?compile=false" ></script>  
   <script type="text/javascript" src="/assets/jquery-ui.min.js?compile=false" ></script>  
   <link rel="stylesheet" href="/assets/jquery-ui.min.css?compile=false" />  
   <link rel="stylesheet" href="/assets/jquery-ui.theme.min.css?compile=false" />  
   <link rel="stylesheet" href="/assets/bootstrap.css?compile=false" />  
 <link rel="stylesheet" href="/assets/grails.css?compile=false" />  
 <link rel="stylesheet" href="/assets/main.css?compile=false" />  
 <link rel="stylesheet" href="/assets/mobile.css?compile=false" />  
 <link rel="stylesheet" href="/assets/application.css?compile=false" />  
   <script type="text/javascript" src="/assets/igv-1.0.6.js?compile=false" ></script>  
   <link rel="stylesheet" href="/assets/igv-1.0.6.css?compile=false" />  
   <script type="text/javascript" src="/assets/jquery-2.2.0.min.js?compile=false" ></script>  
 <script type="text/javascript" src="/assets/bootstrap.js?compile=false" ></script>  
 <script type="text/javascript" src="/assets/igv-1.0.6.js?compile=false" ></script>  
 <script type="text/javascript" src="/assets/jquery-ui.min.js?compile=false" ></script>  
 <script type="text/javascript" src="/assets/application.js?compile=false" ></script>  

As you can see, several assets are being defined twice, for example jquery-2.2.0.min.js, jquery-ui.min.js and igv-1.0.6.js!

My GSP code looked like this:

GSP:
   <asset:javascript src="jquery-2.2.0.min.js"/>  
   <asset:javascript src="jquery-ui.min.js"/>  
   <asset:stylesheet src="jquery-ui.min.css" />  
   <asset:stylesheet src="jquery-ui.theme.min.css" />  
   <asset:stylesheet src="application.css"/>  
   <asset:javascript src="igv-1.0.6.js" />  
   <asset:stylesheet src="igv-1.0.6.css"/>  
   <asset:javascript src="application.js"/>  

Strangely enough, if I remove the following line, the duplicates disappear:

 <asset:javascript src="application.js"/>  

Another problem I had was that in production (it worked fine in development mode) my JavaScript files were not being minified properly, even after I configured the minifier to use ES6 in my build.gradle file:


 assets {  
   minifyJs = false  
   minifyCss = true  
   enableSourceMaps = true  
   minifyOptions = [  
       languageMode: 'ES6',  
       targetLanguage: 'ES6'  
   ]  
 }  

To work around the issue, I gave up on minification and set minifyJs = false, as shown above.

At the moment the asset pipeline just feels buggy and unstable, and it's probably better to disable some of its features in the configuration to get the basics working.

Not sure if I've misunderstood how assets were meant to be used (my best guess is that application.js is an asset-pipeline manifest whose //= require directives already pull in the same files I was declaring individually in the GSP, which would explain the duplicates), but if somebody can explain this, please enlighten me!