Last week (the worst Friday of all ;-)) we had a very serious incident with our Nexus Repository Manager service, which affected the release lifecycle of our products. Unfortunately, something bad happened on our NFS server, and our Nexus Docker container complained about several corruptions like:
2019-09-05 19:38:17,321+0000 ERROR [FelixStartLevel] *SYSTEM com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage - $ANSI{green {db=config}} Error on creating record in cluster: plocal cluster: quartz_trigger
com.orientechnologies.orient.core.exception.OPaginatedClusterException: Error during record creation
	DB name="config"
	Component Name="quartz_trigger"
	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createSinglePageRecord(OPaginatedCluster.java:687)
	at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.createDataRecord(OPaginatedCluster.java:564)
While searching for how others had fixed this, I found that most of them dropped the affected OrientDB table, config.quartz_trigger. This table holds the scheduled tasks, so dropping and recreating it was not a big deal.
To do this, connect to the OrientDB console. Here are the commands to connect to the config database and drop the table:
java -jar /opt/sonatype/nexus/lib/support/nexus-orient-console.jar
connect plocal:/opt/sonatype/sonatype-work/nexus3/db/config/ admin admin
drop class quartz_trigger
Then repair the database, disconnect, and restart Nexus.
REPAIR DATABASE component
DISCONNECT
Nevertheless, after the restart I ran into other errors, and it was not at all obvious where they came from:
Return code is: 500 , ReasonPhrase:javax.servlet.ServletException: com.orientechnologies.orient.core.exception.OCommandExecutionException: Error on execution of command: sql.select from asset where bucket = :bucket and name = :propValue?? DB name="component". -> [Help 1]
This time neither Google nor Stack Overflow could help me, so I had to work out what had happened myself. After the first corruption I had decided to upgrade Nexus, in case a bug in the previous release was to blame. The upgrade, however, did not carry over the old nexus.vmoptions, so the following VM options were missing:
-Xms2703m
-Xmx2703m
-XX:MaxDirectMemorySize=2703m
I realized that OrientDB did not have enough direct memory: MaxDirectMemorySize sets a limit on the amount of memory that can be reserved for all direct byte buffers. As a result, no new blob stores, no new settings, and no uploads were possible. I updated my Ansible script for creating the Docker container:
- name: Create nexus container
  docker_container:
    name: nexus
    image: sonatype/nexus3:3.18.1
    state: started
    restart_policy: unless-stopped
    env:
      INSTALL4J_ADD_VM_PARAMS: "-Xms2703m -Xmx2703m -XX:MaxDirectMemorySize=2703m"
    volumes:
      - /data/nexus/sonartype-work:/opt/sonatype/sonatype-work:rw
    ports:
      - "8081:8081"
      - "9000:9000"
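Since the missing options went unnoticed until things broke, a small check after each upgrade might have saved me some time. Here is a minimal sketch in Python, assuming the options are available as text (either the contents of nexus.vmoptions or the value of INSTALL4J_ADD_VM_PARAMS). The function name `missing_vm_options` and the list of required prefixes are my own choices for illustration, not anything Nexus provides.

```python
# Option prefixes taken from the three VM options listed above.
REQUIRED_PREFIXES = ("-Xms", "-Xmx", "-XX:MaxDirectMemorySize=")


def missing_vm_options(options_text, required=REQUIRED_PREFIXES):
    """Return the required option prefixes that do not appear in options_text.

    options_text may hold one option per line (vmoptions file style) or
    several options separated by spaces (environment variable style).
    Lines starting with '#' are treated as comments and skipped.
    """
    tokens = []
    for line in options_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tokens.extend(line.split())
    return [
        prefix
        for prefix in required
        if not any(token.startswith(prefix) for token in tokens)
    ]


if __name__ == "__main__":
    # Hypothetical file contents where the direct memory limit was lost.
    sample = "-Xms2703m\n-Xmx2703m\n"
    print(missing_vm_options(sample))
```

An empty list means all three options are present; anything else names what the upgrade dropped.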
In the meantime, developers had run several Maven deploys, which led to the following error:
2019-09-06 11:21:20,573+0000 WARN [qtp1097449578-137] deployment org.sonatype.nexus.transaction.RetryController - Exceeded retry limit: 8/8 (com.orientechnologies.orient.core.storage.ORecordDuplicatedException: Cannot index record #31:309836: found duplicated key 'OCompositeKey{keys=[#22:6, null, gr/aaafx/backend/aaafx-dealer/maven-metadata.xml]}' in index 'asset_bucket_component_name_idx' previously assigned to the record #30:307921 DB name="component" INDEX=asset_bucket_component_name_idx RID=#30:307921)
Fortunately, the help command of the OrientDB console listed a command called truncate record. Voilà! That did the trick, and after a Nexus restart everything has worked as expected ever since.
Here are the commands I applied:
java -jar /opt/sonatype/nexus/lib/support/nexus-orient-console.jar
connect plocal:/opt/sonatype/sonatype-work/nexus3/db/component/ admin admin
load record #30:307921
truncate record #30:307921
rebuild index asset_bucket_component_name_idx
REPAIR DATABASE component
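The record ID and index name used above come straight from the ORecordDuplicatedException message: the record to truncate is the one the key was "previously assigned to". If the log contains many such warnings, pulling these values out by hand gets tedious, so here is a rough Python sketch that extracts them. The regular expression is only based on the single log line shown in this post, so treat the message format as an assumption; `parse_duplicate_key` is a name I made up.

```python
import re

# Pattern derived from the one ORecordDuplicatedException message in this post.
DUPLICATE_KEY = re.compile(
    r"Cannot index record (?P<new_rid>#\d+:\d+): found duplicated key "
    r"'(?P<key>.*?)' in index '(?P<index>[^']+)' "
    r"previously assigned to the record (?P<old_rid>#\d+:\d+)"
)

# The relevant part of the warning from the Nexus log above.
SAMPLE = (
    "com.orientechnologies.orient.core.storage.ORecordDuplicatedException: "
    "Cannot index record #31:309836: found duplicated key "
    "'OCompositeKey{keys=[#22:6, null, "
    "gr/aaafx/backend/aaafx-dealer/maven-metadata.xml]}' in index "
    "'asset_bucket_component_name_idx' previously assigned to the record "
    '#30:307921 DB name="component"'
)


def parse_duplicate_key(log_line):
    """Return a dict with new_rid, key, index and old_rid, or None if no match."""
    match = DUPLICATE_KEY.search(log_line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    info = parse_duplicate_key(SAMPLE)
    # Print the console commands that correspond to the ones applied above.
    print(f"load record {info['old_rid']}")
    print(f"truncate record {info['old_rid']}")
    print(f"rebuild index {info['index']}")
```

The script only prints the commands; I would still review each record with load record before truncating anything.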
Make sure that you always have a backup of the database you’re going to touch. Basically Nexus has 3 databases: config, component and security. I’ve added a scheduled task that creates a backup daily.
Here are the commands for backup and restore through orientdb console:
export database component-export
drop database
create database plocal:/nexus-data/db/component admin admin
import database component-export.json.gz
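The commands above cover only the component database. Since there are three databases, the export part can be spelled out for each of them; the following Python sketch just generates the console commands as text, reusing the connect and export syntax shown in this post. The function name `export_commands`, the `<name>-export` file naming for the other two databases, and the assumption that all three live under the same db directory are mine.

```python
# The three Nexus databases named in this post.
DATABASES = ("config", "component", "security")


def export_commands(db_root, databases=DATABASES, user="admin", password="admin"):
    """Build the OrientDB console commands that export each database.

    db_root is the directory that contains the per-database folders,
    for example /nexus-data/db (assumed layout).
    """
    root = db_root.rstrip("/")
    commands = []
    for name in databases:
        commands.append(f"connect plocal:{root}/{name} {user} {password}")
        commands.append(f"export database {name}-export")
        commands.append("disconnect")
    return commands


if __name__ == "__main__":
    for command in export_commands("/nexus-data/db"):
        print(command)
```

The output is meant to be typed or pasted into the console session; it deliberately leaves out the drop, create, and import steps, which I would only ever run by hand.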
Enjoyed the weekend after all 😉