Sunday, June 30, 2019

Docker Containers Missing!

For many months we've been trying to track down the cause of the sudden termination of our Docker containers.

Running "docker container ls" shows nothing, because by default it lists only running containers. It almost looks like they're missing.

  docker container ls  
 CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES  

However, running "docker ps -a", which also lists stopped containers, shows that our containers had exited:

 [root@patient-archive-cpi centos]# docker ps -a  
 CONTAINER ID    IMAGE            COMMAND         CREATED       STATUS            PORTS        NAMES  
 4a9b66fdd624    aehrc/ontoserver:ctsa-5.2  "/run.sh run"      7 weeks ago     Exited (143) 2 days ago            ontoserver  
 9e8314f52de2    postgres          "docker-entrypoint.s…"  7 weeks ago     Exited (0) 2 days ago         5432/tcp      docker_db_1  
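
The exit code 143 on the ontoserver container is a clue in itself. By shell convention, an exit status above 128 means the process was terminated by signal number (status - 128), so 143 points at signal 15, SIGTERM: something asked the container to stop, it did not crash. A minimal sketch of that arithmetic (the function name exit_signal is made up for illustration):

```shell
# Hypothetical helper: map a container exit code to the signal behind it.
exit_signal() {
  code=$1
  if [ "$code" -gt 128 ]; then
    # Exit codes above 128 conventionally mean "terminated by signal (code - 128)".
    kill -l "$((code - 128))"
  else
    # 128 or below: the process exited on its own with that status.
    echo "none"
  fi
}

exit_signal 143   # the ontoserver container
exit_signal 0     # the postgres container
```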


The "docker logs" command showed that the container was being shut down, but gave no errors and no reason for the shutdown.


 2019-06-28 03:41:48.302Z INFO 1 --- [ Thread-4] s.b.w.s.c.AnnotationConfigServletWebServerApplicationContext : Closing org.springframework.boot.web.servlet.context.AnnotationConfigServletWebServerApplicationContext@63d4e2ba: startup date [Fri May 10 04:11:42 GMT 2019]; root of context hierarchy  
 2019-06-28 03:41:48.313Z INFO 1 --- [ Thread-4] o.s.jmx.export.annotation.AnnotationMBeanExporter : Unregistering JMX-exposed beans on shutdown  
 2019-06-28 03:41:48.314Z INFO 1 --- [ Thread-4] o.s.jmx.export.annotation.AnnotationMBeanExporter : Unregistering JMX-exposed beans  
 2019-06-28 03:41:48.316Z INFO 1 --- [ Thread-4] o.s.scheduling.concurrent.ThreadPoolTaskExecutor : Shutting down ExecutorService 'batchRunner'  
 2019-06-28 03:41:48.319Z INFO 1 --- [ Thread-4] o.s.scheduling.concurrent.ThreadPoolTaskExecutor : Shutting down ExecutorService 'jobRunner'  
 2019-06-28 03:41:48.319Z INFO 1 --- [ Thread-4] o.s.scheduling.concurrent.ThreadPoolTaskExecutor : Shutting down ExecutorService 'auditReportJobRunner'  
 2019-06-28 03:41:48.329Z INFO 1 --- [ Thread-4] o.s.orm.jpa.LocalContainerEntityManagerFactoryBean : Closing JPA EntityManagerFactory for persistence unit 'default'  
 2019-06-28 03:41:48.331Z INFO 1 --- [ Thread-4] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown initiated...  
 2019-06-28 03:41:48.343Z INFO 1 --- [ Thread-4] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Shutdown completed.  

Running the "docker events" command showed nothing of interest:

 docker events --filter container=ontoserver --since '2019-06-27'  

After some digging around, it turns out that /var/log/yum.log recorded an automated yum update of the Docker packages, with timestamps matching the shutdown messages in the container log (03:41:48):

 Jun 28 03:41:44 Updated: docker-ce-cli.x86_64 1:18.09.7-3.el7  
 Jun 28 03:41:48 Updated: containerd.io.x86_64 1.2.6-3.3.el7  
 Jun 28 03:41:52 Updated: docker-ce.x86_64 3:18.09.7-3.el7  
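
To check whether a package update lines up with a container stopping, the yum log can be filtered for Docker-related packages on the day in question. This is only a sketch: the function name docker_updates_on is invented here, and it assumes the log uses the "Mon DD HH:MM:SS" prefix shown above and that the host clock is in the same timezone as the container log.

```shell
# Hypothetical helper: list Docker-related package changes recorded by yum.
# $1 = path to a yum log, $2 = date prefix as it appears in the log, e.g. "Jun 28"
docker_updates_on() {
  grep -E "^$2 .*(docker|containerd)" "$1"
}

# Only run against the real log when it is readable on this host.
if [ -r /var/log/yum.log ]; then
  docker_updates_on /var/log/yum.log "Jun 28"
fi
```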

It turns out Docker has a feature called 'live-restore' that keeps containers running while the Docker daemon is down, for example during a Docker update.
I added a file /etc/docker/daemon.json with the following contents:
{
  "live-restore": true
}
Then reloaded the Docker daemon so it picks up the new setting:
systemctl reload docker
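
After the reload it is worth confirming that the setting took effect. The sketch below does a rough text check of the config file (the function name live_restore_configured is made up, and grep is not a real JSON parser), then asks the running daemon via "docker info", where the LiveRestoreEnabled template field is assumed to be available in your Docker version.

```shell
# Hypothetical helper: succeed if the given daemon.json sets "live-restore": true.
# This is a plain text match, not a JSON parse.
live_restore_configured() {
  grep -Eq '"live-restore"[[:space:]]*:[[:space:]]*true' "$1"
}

if [ -r /etc/docker/daemon.json ] && live_restore_configured /etc/docker/daemon.json; then
  echo "live-restore is set in daemon.json"
fi

# Ask the running daemon what it actually picked up (needs a running Docker).
if command -v docker >/dev/null 2>&1; then
  docker info --format '{{.LiveRestoreEnabled}}' || true
fi
```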