A customer complained that their database is growing at the rate of 1GB per month. We suspected that the archive and purge activity is not happening for customer. We verified that the Clean-up services(Schedule_BackupService, Schedule_IndexBusinessProcessService, Schedule_PurgeService, Schedule_BPLinkagePurgeService, Schedule_BPRecovery) are working fine.
We verified the following:
1. dbstat report created from SI UI , Operations , Reports in pdf format.
2. datasweeper.sh -reportOnly from command line.
./dataSweeper.sh -detailedReport > detailedReport.txt
3. Screen shot of Troubleshooter - BP usage screen.
4. Screen shot of Troubleshooter - Database usage screen.
5. Status report of Backup service (from last occurrence of the BP) showing cutoff date.
6. Screen shot of archive manager config page (Configure Archive Settings and click on Go). This will indicate what the retention settings are in this environment.
7. The mailbox messages retention time in the system for before the Mailbox Delete Service gets rid of them.
8. jdbc.properties file.
9. Output of below SQLs:
a) Document statistics by month (Oracle and DB2):
SELECT to_char(create_time,'MM-YYYY'),count(*) from DOCUMENT group by to_char(create_time,'MM-YYYY')
b) SQL for purge counts
SELECT ARCHIVE_FLAG , count(*) from ARCHIVE_INFO group by ARCHIVE_FLAG
SELECT ARCHIVE_FLAG,GROUP_ID, MIN(ARCHIVE_DATE) , MAX(ARCHIVE_DATE) FROM ARCHIVE_INFO where ARCHIVE_FLAG >=0 GROUP BY ARCHIVE_FLAG, GROUP_ID
c) BP index count
Select count(*) from ARCHIVE_INFO where ARCHIVE_FLAG in (-1,-2,-5)
d) select * from MBX_MESSAGE order by CREATED_DATETIME asc
select * FROM ARCHIVE_INFO WHERE ARCHIVE_DATE > CURRENT_DATE + 365 DAYS AND ARCHIVE_FLAG=0
select WF_ID FROM ARCHIVE_INFO WHERE ARCHIVE_DATE > CURRENT_DATE + 365 DAYS AND ARCHIVE_FLAG=1
To get a count on the number of BPs in backup
select WF_ID FROM ARCHIVE_INFO WHERE ARCHIVE_DATE > CURRENT_DATE + 365 DAYS AND ARCHIVE_FLAG=2
select count(*) from ARCHIVE_INFO where ARCHIVE_FLAG = 0 and GROUP_ID in (1,2)
We later found that the BP Schedule_MailboxScheduledDeleteService is never ending and it is active for long time. Since, the customer has huge number of mailbox messages, we scheduled it to run every 15 minutes to delete 10000 messages from mailbox.
In mailbox.log, we see the following lock message :
[2017-12-01 16:57:28.87] ALL Refraining from deleting message:8631 due to message lock.
[2017-12-01 16:57:28.87] ALL Refraining from deleting message:8662 due to message lock.
[2017-12-01 16:57:28.87] ALL Refraining from deleting message:8663 due to message lock.
[2017-12-01 16:57:28.87] ALL Refraining from deleting message:8697 due to message lock.
[2017-12-01 16:57:28.87] ALL Refraining from deleting message:8715 due to message lock.
The problem still persist even we set the delete mailbox size to 100. The BP never finish executing. There seems to be a locks to the message.
MBX_MESSAGE_HOLD was having 5 year old records of size 15k having EXTRACTABLE column value set to 'No'. This was the reason for the lock in Mailbox delete message service execution.
We checked the Creation time and Extractability of those message which were on mbx_message_hold table, all of them were not ex-tractable and extract count was set to zero.
We suggested customer to take the back up of the table and delete the old entries before running Mailbox Message Delete service. The service should be run during low peak processing hours with reduced data set on each Mailbox processing hours. This solved the customer case.