Software Development Lifecycle Trends in 2020

2020 is here and it feels like we are watching a sci-fi movie from the future 😉 But the future is now, no doubt about it. Over the last 5 years I have heard from many companies that SDLC trends have changed and that many more tools are being used to find bugs faster, build faster, and deploy faster, for both new and legacy products. The DevOps mentality plays a vital role in this effort, because the word “faster” in the previous statement applies only when the development, DevOps, and operations teams work closely together.

Of course it’s hard to adopt new trends in legacy products, but if you have time to redesign and a can-do attitude, nothing is impossible. I try to adopt such trends in my teams as well. I’m pretty sure that in 2020 we will see new products and tools that improve the software development lifecycle with more automation, AI/ML, testing, and increased observability.

Let’s look at the top 9 trends I picked up during my research and experiments.

1. Software Development Workflow

Everybody uses such tools. Git is the most popular source code management tool of all, and every developer should be familiar with it. Many startups use cloud-based Git products like GitLab, Bitbucket, or GitHub. On the other hand, established companies, or companies that believe their code is safer on premises, may run the same products in their own data centers. As the years pass, more and more companies have distributed teams and engineers working remotely who need a way to collaborate, discuss, and share their notes and ideas. Slack and Microsoft Teams are examples of such tools, and Zoom and Jitsi Meet are also great for collaboration.

2. Continuous Delivery / Continuous Integration

You should definitely use CI/CD tools to increase automation and improve your software reliability by getting feedback from every step of your development pipeline. Shipping code faster is the number one priority of every company.
We use Jenkins a lot, but there are other great CI/CD platforms like CircleCI, GitLab CI, and Travis CI.
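The feedback-from-every-step idea can be sketched as a tiny shell pipeline skeleton. This is a toy, not our actual Jenkins setup; the stage names are placeholders, and real stages would call mvn, docker, kubectl, and so on:

```shell
#!/bin/sh
# Toy CI pipeline skeleton: each stage must succeed before the next one
# runs, which is exactly what gives you feedback from every step.
set -e

run_stage() {
  name=$1; shift
  echo "stage: $name started"
  if "$@"; then
    echo "stage: $name ok"
  else
    echo "stage: $name FAILED"
    exit 1
  fi
}

run_stage build  true   # placeholder for e.g. 'mvn package'
run_stage test   true   # placeholder for the test suite
run_stage deploy true   # placeholder for the deploy step
```

A failing stage stops everything after it, so a broken build can never reach the deploy step.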

3. Testing

CI/CD platforms are also used to execute the automated tests that the development or QA teams have implemented, to ensure that nothing breaks during the development pipelines. Each time we find a production issue, we implement a new test case in our automated integration tests (whenever possible), so we can make sure the bug won’t reappear in a new release. We use SonarQube, Selenium, JUnit, FindBugs, and JMeter a lot.
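As a toy illustration of the kind of gate a CI job can apply after the test run, here is a small shell function that sums the failures attributes of a JUnit XML report. The report path is an assumption; real pipelines usually let the CI server's JUnit plugin do this, but the idea is the same:

```shell
#!/bin/sh
# Toy CI gate: sum the failures="N" attributes across the <testsuite>
# elements of a JUnit XML report. A non-zero total should fail the build.
count_failures() {
  grep -o 'failures="[0-9]*"' "$1" | grep -o '[0-9]*' | awk '{s+=$1} END {print s+0}'
}
```

A zero total lets the pipeline continue; anything else should fail the job, e.g. `test "$(count_failures report.xml)" -eq 0` (where `report.xml` stands in for Surefire's output file).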

4. DevSecOps

Everybody agrees that security is a very important chapter in the SDLC, but DevOps teams have not put a lot of effort into it. In 2020, DevSecOps is one of the trends that will play a major role in the development pipeline. But what exactly are the DevSecOps tasks you should be dealing with? Some examples are:

a. checking for vulnerabilities in the open-source libraries your applications depend on

b. scanning for security vulnerabilities your apps may be exposed to, like:

  • SQL Injection
  • Cross Site Scripting
  • Broken Authentication and Session Management
  • Insecure Direct Object References
  • Cross Site Request Forgery
  • Security Misconfiguration
  • Insecure Cryptographic Storage
  • Failure to restrict URL Access

c. training your development teams to write secure code

d. executing threat modelling and risk assessments

Fortunately, there are now several tools out there that DevOps teams can use: Checkmarx, Snyk, ShiftLeft, Continuum Security.
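The dependency-checking task (a) can be sketched in a few lines of shell. This is only a toy stand-in for what tools like Snyk or OWASP Dependency-Check do against real vulnerability databases; the flat-file formats below are assumptions for illustration:

```shell
#!/bin/sh
# Toy dependency scan: print any "name version" pair from a dependency
# list that also appears in a denylist of known-vulnerable versions.
# Real tools match against CVE databases instead of a flat file.
scan_deps() {
  # -F: fixed strings, -x: whole-line match, -f: patterns from the denylist file
  grep -F -x -f "$2" "$1"
}
```

Usage would be `scan_deps deps.txt vulnerable.txt` (both file names hypothetical); a non-empty result means a denylisted dependency slipped into the build.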

5. Orchestration

Simplifying the deployment of complex distributed software is definitely part of the development pipeline nowadays. Kubernetes, Docker Swarm, Amazon EKS, Azure AKS, Docker, and Cloud Foundry are some of the tools/platforms that give us the power to deploy faster, easier, and more securely. Of course you have to climb the learning curve of such tools, but once you do, you will be ready for large-scale deployments in private or public clouds. One way to play around and experiment with them is to set them up on premises and deploy the apps of your staging environments. Learn the best practices for writing Dockerfiles, and make sure you know how to debug such environments when something goes wrong.
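One Dockerfile best practice you can even check mechanically is pinning the base-image tag. Here is a toy shell check for unpinned tags; real linters such as hadolint cover far more, so treat this purely as an illustration:

```shell
#!/bin/sh
# Toy Dockerfile check: an image built FROM an unpinned tag (":latest" or
# no tag at all) can silently change underneath you between builds.
base_tag_pinned() {
  if grep -qE '^FROM +[^: ]+( |$)|^FROM +[^: ]+:latest( |$)' "$1"; then
    echo "unpinned"
  else
    echo "pinned"
  fi
}
```

Running it against a Dockerfile that starts with `FROM sonatype/nexus3:3.18.1` reports "pinned", while `FROM something:latest` reports "unpinned".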

6. Log Management

No more grepping and searching through application log files. Log management tools are a kind of oasis in the desert of huge, never-ending log files, log databases, etc. Of course, these tools cannot help with troubleshooting if the log messages do not contain the exact errors, so it is up to the development team to decide what information is exposed through the logs. Several tools exist for log management and analytics: Splunk, Elastic, Sumo Logic, Loggly.
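At its core, what these tools industrialize is the kind of query below — extracting and aggregating the interesting lines. The log layout is an assumption (the "<timestamp> LEVEL message" shape seen in the Nexus log excerpts later in this post):

```shell
#!/bin/sh
# Toy version of a log-management query: count log lines per level,
# assuming the level is the third whitespace-separated field.
level_histogram() {
  awk '{print $3}' "$1" | sort | uniq -c | sort -rn
}
```

On a real system the same aggregation runs across millions of lines and many hosts, which is exactly why grepping by hand stops scaling.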

7. Monitoring

New Relic, Datadog, Dynatrace, and other companies provide log aggregation and consolidation. More and more products are moving to microservices, where troubleshooting is harder: you need to read logs from different services, or to know which service was overloaded while asynchronous requests came in, and you can’t figure out what’s going on from the log messages alone. For this reason, the application performance management features these monitoring tools offer are very useful for debugging.

8. Alerting

Fine-tuned alerts ensure you learn about software issues in real time and that the right people get notified at the right time. Sometimes email alerts are not enough, especially when something happens while your operations team sleeps. Phone or pager calls are ideal for such situations. Some of the tools that already exist: PagerDuty, Opsgenie, SolarWinds Pingdom, VictorOps, xMatters.

9. Visualization

There is so much data coming from the monitoring tools that you need a way to visualize it in order to understand what’s going on with your systems. Thanks to the following tools, you can make sense of high volumes of data: Kibana, Grafana, Prometheus, Datadog.

WG Developers Meetup S03E02 – Cloud Talks


This meetup was dedicated to cloud technologies, with hands-on examples deployed to AWS.

“Getting started with Serverless” by Vasilis Dourdounis, CTO ExitIntelligence, Patras branch. The presentation is located here.

Vasilis presented examples from serverless.com showing how to deploy a simple Node function on AWS. The interaction with the attendees raised some interesting questions, like how long the container stays alive for such serverless functions. According to Vasilis’s experience, it’s not standard; sometimes it takes 10 minutes of inactivity before the container shuts down, but this is something AWS is working on.

“Install kubernetes in AWS through Terraform and Ansible” by Stamatis Panorgios, DevOps Engineer, ZuluTrade Technologies, Remote. The presentation is located here.

One more hands-on presentation, on how to install a custom Kubernetes cluster on AWS using Terraform and Ansible. Stamatis pointed out the terraform-inventory tool, which finds the public IPs that AWS assigned to the cluster instances during creation.

Since we’re very close to Christmas, I wish you a Merry Christmas, and don’t forget: we’ll be back next month with new, interesting topics!

WG Developers Meetup S03E01

New season, a fresh start for everyone! The first meetup of the new season has just ended, and I feel all the participants enjoyed it and have started working on ideas for the next one 😉 We had two interesting talks by software engineers from companies located in Patras.

“Clean code principles” by Apostolos Stamatis, Intrasoft International, Patras branch. The presentation is located here.

Apostolos explained very well how the SOLID principles help software development teams write clean code and prevent difficult production issues from happening after releases. A question followed from the participants: “What do you do when you join a project as a new member and you see dirty legacy code?”

This question started an interesting discussion, with answers from many different angles. The outcome was that we should always file the issues we see and include them in the roadmap, as long as we have done the appropriate risk management, weighing all the relevant factors.

“Augmented reality” by Eleftheria Marinou, Citrix Innovation Labs, Patras branch. The presentation is located here.

One more interesting presentation by Eleftheria, based on part of her thesis. Eleftheria explained very well the differences between VR, AR, and AV, and answered questions about the experiments she did for her thesis.

Many thanks again to the Stavropoulou Foundation for the venue, and Dimitris Karakasilis, Marios Karagiannopoulos and Christos Aridas for their passion to grow this wonderful community.

Stay tuned for the next one, in one month!

How a Nexus Repository Manager corruption led to a mini Odyssey


Last week (the worst Friday of all ;-)) we had a very serious incident with our Nexus Repository Manager service, which affected the release lifecycle of our products. Unfortunately something bad happened on our NFS server, and our Nexus docker container complained about several corruptions like:

2019-09-05 19:38:17,321+0000 ERROR [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=config}} Error on creating record in cluster: plocal cluster: quartz_trigger
com.orientechnologies.orient.core.exception.OPaginatedClusterException: Error during record creation
DB name="config"
Component Name="quartz_trigger"
at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createSinglePageRecord(OPaginatedCluster.java:687)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createDataRecord(OPaginatedCluster.java:564)

While searching for how others fixed it, I found that most of them dropped the specific OrientDB table: config.quartz_trigger. This table actually holds the scheduled tasks, so it was not a big deal to drop and recreate it.

You should connect to the OrientDB console. Here are the commands to connect to the config db and drop the table:

java -jar /opt/sonatype/nexus/lib/support/nexus-orient-console.jar
connect plocal:/opt/sonatype/sonatype-work/nexus3/db/config/ admin admin
drop class quartz_trigger

Then repair the database, disconnect, and restart Nexus.

REPAIR DATABASE component
DISCONNECT

Nevertheless, after the restart I experienced some other errors, and it was very strange how they occurred.

Return code is: 500 , ReasonPhrase:javax.servlet.ServletException: com.orientechnologies.orient.core.exception.OCommandExecutionException: Error on execution of command: sql.select from asset where bucket = :bucket and name = :propValue??	DB name="component". -> [Help 1]

This time I didn’t have Google or Stack Overflow with me, so I was trying to understand what had happened. After the first corruption I had decided to upgrade Nexus, just in case there was a bug in the release before the upgrade. The update, though, did not use the old nexus.vmoptions, and the following VM options were missing:

-Xms2703m -Xmx2703m -XX:MaxDirectMemorySize=2703m

I realized that OrientDB did not have enough MaxDirectMemorySize allocated (this option sets a limit on the amount of memory that can be reserved for all direct byte buffers). Basically, no new blob stores, no new settings, and no uploads were possible. I did an update to my Ansible script for the docker container creation:

    - name: Create nexus container
      docker_container:
        name: nexus
        image: sonatype/nexus3:3.18.1
        state: started
        restart_policy: unless-stopped
        env:
          INSTALL4J_ADD_VM_PARAMS: "-Xms2703m -Xmx2703m -XX:MaxDirectMemorySize=2703m"
        volumes:
          - /data/nexus/sonartype-work:/opt/sonatype/sonatype-work:rw
        ports:
          - "8081:8081"
          - "9000:9000"

In the meantime, developers ran several maven deploys, leading to the following error:

2019-09-06 11:21:20,573+0000 WARN [qtp1097449578-137] deployment org.sonatype.nexus.transaction.RetryController - Exceeded retry limit: 8/8 (com.orientechnologies.orient.core.storage.ORecordDuplicatedException: Cannot index record #31:309836: found duplicated key 'OCompositeKey{keys=[#22:6, null, gr/aaafx/backend/aaafx-dealer/maven-metadata.xml]}' in index 'asset_bucket_component_name_idx' previously assigned to the record #30:307921
DB name="component" INDEX=asset_bucket_component_name_idx RID=#30:307921)

Fortunately, the help command of the OrientDB console listed a command called truncate record. Voilà! That did the trick, and after a Nexus restart everything has worked as expected since.

Here are the commands I applied:

java -jar /opt/sonatype/nexus/lib/support/nexus-orient-console.jar
connect plocal:/opt/sonatype/sonatype-work/nexus3/db/component/ admin admin

load record #30:307921
truncate record #30:307921
rebuild index asset_bucket_component_name_idx
REPAIR DATABASE component

Make sure you always have a backup of the database you’re going to touch. Nexus basically has 3 databases: config, component, and security. I’ve added a scheduled task that creates a backup daily.

Here are the commands for backup and restore through the OrientDB console:

export database component-export
drop database
create database plocal:/nexus-data/db/component admin admin
import database component-export.json.gz
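Since the export command is the same for every database, a small wrapper can generate one per database for a daily backup job. This is a minimal sketch; the backup directory is an assumption:

```shell
#!/bin/sh
# Hypothetical backup wrapper: emit one OrientDB 'export database' command
# per Nexus database, stamped with today's date. Each emitted line would
# then be run in nexus-orient-console after connecting to the matching db.
BACKUP_DIR=${BACKUP_DIR:-/data/nexus/backups}

backup_commands() {
  stamp=$(date +%Y%m%d)
  for db in config component security; do
    echo "export database $BACKUP_DIR/$db-$stamp"
  done
}
```

A daily cron entry could call this and pipe each command into the console session for the corresponding database.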

Enjoyed the weekend after all 😉

Overriding docker ENTRYPOINT of a base image


Recently my DevOps team and I decided to bring all the dev tools of all engineering teams (backend, frontend, mobile, operations) into a Kubernetes cluster. We have many reasons to do so. One of them is that we need centralized management of all those tools, which are configured and upgraded manually, sometimes by our IT department and sometimes by our DevOps department. Another important reason is that we need to scale Jenkins jobs, especially when many releases or automated procedures are run by many different teams. Last but not least, we need a playground with real-life issues and problems to cope with before we use a k8s cluster in production.

We’re very lucky to have a very experienced Kubernetes DevOps engineer, Stamatis Panorgios, on this journey. He joined us recently, but it feels like we’ve been working together for years 😉 There are several things we need to do before moving our apps to Kubernetes, like dockerizing components, applications, etc.

One of the latest experiments was migrating our HornetQ servers to ActiveMQ ones. You may find more details here. In order to move the ActiveMQ servers to Kubernetes, we wanted to create a custom Dockerfile with specific configuration files. Unfortunately, we do not have one set of configuration files but many. Therefore, we had to invent a way to copy configuration files conditionally, based on docker runtime environment variables.

To be more precise, let’s look at a basic Dockerfile that copies the configuration files we need into specific folders in the docker image:

FROM vromero/activemq-artemis:latest

MAINTAINER Marios Karagiannopoulos <mkaragiannopoulos@zulutrade.com>

ENV MODE_PARAMS_FOLDER /var/lib/artemis/

RUN mkdir -p ${MODE_PARAMS_FOLDER}
ADD int ${MODE_PARAMS_FOLDER}/etc-override-int
ADD ext ${MODE_PARAMS_FOLDER}/etc-override-ext

We could build one image with the etc-override-int configuration and another with etc-override-ext. But why waste disk space with many images, one per configuration set? There is no reason to, since we can create one image and select the configuration based on a runtime environment variable.

Problem

We have a problem here, though. The base image, “vromero/activemq-artemis:latest”, has an ENTRYPOINT that does not take our configuration files into account at all.

Resolution

One idea is to override the ENTRYPOINT by adding your CMD command followed by ENTRYPOINT ["/usr/bin/env"], like:

# trick to override base image's ENTRYPOINT
ENTRYPOINT ["/usr/bin/env"]
CMD ["bash", "/sed_broker_files.sh", "/var/lib/artemis/etc/broker-05.xml"]

Nevertheless, if you want to use the container’s runtime environment variables, this is not going to work.

So, what I’ve been thinking of is to look at how the base image’s ENTRYPOINT is written and try to inject other scripts into it, or modify its code, during docker image build time. The base image’s ENTRYPOINT script is located here. If you look carefully, the variable we’re interested in is:

OVERRIDE_PATH=$BROKER_HOME/etc-override

Based on the configuration selected at container runtime, it should point to etc-override-int or etc-override-ext.

So a sed command could do the job during the build of the docker image. However, we also need to run some other commands before the application starts. We can put these commands in another script and inject the execution of that script into the ENTRYPOINT of the base image. Let’s see the final Dockerfile:

FROM vromero/activemq-artemis:latest

MAINTAINER Marios Karagiannopoulos <mkaragiannopoulos@zulutrade.com>

# trick to override artemis user when entering the container
USER root

ENV MODE_PARAMS_FOLDER /var/lib/artemis/

RUN mkdir -p ${MODE_PARAMS_FOLDER}
ADD int ${MODE_PARAMS_FOLDER}/etc-override-int
ADD ext ${MODE_PARAMS_FOLDER}/etc-override-ext

# trick to override base image's ENTRYPOINT
COPY activate_mode_param.sh /
RUN head -2 /docker-entrypoint.sh > /docker-entrypoint.sh.tmp
RUN echo "/activate_mode_param.sh" >> /docker-entrypoint.sh.tmp
RUN all_lines=`wc -l /docker-entrypoint.sh | cut -d' ' -f1` && \
  new_lines=`expr $all_lines - 3` && \
  tail -$new_lines /docker-entrypoint.sh >> /docker-entrypoint.sh.tmp && \
  mv /docker-entrypoint.sh.tmp /docker-entrypoint.sh
RUN sed -i "s/etc-override/etc-override-\${MODE_PARAM}/g" /docker-entrypoint.sh

RUN chmod 777 /docker-entrypoint.sh
RUN chown -R artemis:artemis ${MODE_PARAMS_FOLDER}

USER artemis

As you can see above, we copy the activate_mode_param.sh script under the root folder and then inject its call inside /docker-entrypoint.sh. We also change the etc-override occurrences to etc-override-${MODE_PARAM}, where $MODE_PARAM is a runtime environment variable. Take a look here:

docker build -t 10.0.8.171:5000/amq:latest -f Dockerfile .

docker run -it --name='amq-master-int-206' \
-v /opt/amq/sharedstore:/var/lib/artemis/data \
-e 'MODE_PARAM=int' \
-e 'AMQ_MASTER_IP=10.0.9.206' \
-e 'AMQ_MASTER_PORT=61616' \
-e 'AMQ_SLAVE_IP=10.0.9.206' \
-e 'AMQ_SLAVE_PORT=61617' \
-e 'ARTEMIS_PERF_JOURNAL=ALWAYS' \
-e 'ARTEMIS_USERNAME=admin' \
-e 'ARTEMIS_PASSWORD=admin' \
-e 'ARTEMIS_MIN_MEMORY=512M' \
-e 'ARTEMIS_MAX_MEMORY=1024M' \
-e 'ENABLE_JMX=true' \
-e 'JAVA_OPTS=-Dorg.apache.activemq.SERIALIZABLE_PACKAGES=*' \
-p 8161:8161 \
-p 61616:61616 \
-d 10.0.8.171:5000/amq:latest
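The post does not show activate_mode_param.sh itself, so here is only a hypothetical sketch of the kind of pre-start validation such a script could perform, driven by the MODE_PARAM runtime variable — nothing below is the actual script:

```shell
#!/bin/sh
# Hypothetical sketch of what activate_mode_param.sh could do: validate
# MODE_PARAM before the broker starts, so a container launched without it
# fails fast instead of silently picking no configuration.
check_mode_param() {
  case "$1" in
    int|ext)
      echo "activating configuration etc-override-$1"
      ;;
    *)
      echo "MODE_PARAM must be 'int' or 'ext' (got '$1')" >&2
      return 1
      ;;
  esac
}
```

The injected call at the top of /docker-entrypoint.sh would then invoke this with "$MODE_PARAM" before the rest of the entrypoint runs.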

Deploying to kubernetes

After building the image with the command above, we need to push it to our local registry:

docker login 10.0.8.171:5000 --username zulu --password ********
docker push 10.0.8.171:5000/amq:latest

alias k='kubectl'
k create namespace jms
k -n jms apply -f regced.yaml
k -n jms apply -f deployment-amq-master-int-206.yaml
k -n jms apply -f svc-amq-master-int-206.yaml

With the following contents:

regced.yaml (secret file to read images from our private registry)

apiVersion: v1
kind: Secret
metadata:
  namespace: jms
  name: regcred
data:
  .dockerconfigjson: "HIDDEN_HASH"
type: kubernetes.io/dockerconfigjson

deployment-amq-master-int-206.yaml 

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: amq-master-int-206
  namespace: jms
spec:
  replicas: 1
  template:
    metadata:
      labels:
        k8s-app: amq-master-int-206
    spec:
      imagePullSecrets:
      - name: regcred
      containers:
      - name: amq-master-int-206
        image: 10.0.8.171:5000/amq:latest
        ports:
        - name: http
          containerPort: 8161
        - name: jnp
          containerPort: 61616
        env:
        - name: MODE_PARAM
          value: "int"
        - name: AMQ_MASTER_IP
          value: "10.0.8.170"
        - name: AMQ_MASTER_PORT
          value: "6116"
        - name: AMQ_SLAVE_IP
          value: "10.0.8.170"
        - name: AMQ_SLAVE_PORT
          value: "6117"
        - name: ARTEMIS_PERF_JOURNAL
          value: always
        - name: ARTEMIS_USERNAME
          value: "admin"
        - name: ARTEMIS_PASSWORD
          value: "admin"
        - name: ARTEMIS_MIN_MEMORY
          value: "512M"
        - name: ARTEMIS_MAX_MEMORY
          value: "1024M"
        - name: ENABLE_JMX
          value: "true"
        - name: JAVA_OPTS
          value: "-Dorg.apache.activemq.SERIALIZABLE_PACKAGES=*"
        volumeMounts:
          - name: nfs-amq
            mountPath: /var/lib/artemis/data
      volumes:
      - name: nfs-amq
        nfs:
          server: 10.0.8.64
          path: /volume1/Storage/YB/k8s/amq206

svc-amq-master-int-206.yaml 

apiVersion: v1
kind: Service
metadata:
  namespace: jms
  name: amq-master-int-206
spec:
  type: NodePort
  ports:
    - port: 8161
      name: http
      targetPort: 8161
      nodePort: 8161
    - port: 6116
      name: jnp
      targetPort: 61616
      nodePort: 6116
  selector:
    k8s-app: amq-master-int-206