No announcement yet.

How to gauge Mirth database growth?

  • Filter
  • Time
  • Show
Clear All
new posts

  • How to gauge Mirth database growth?

    We've been developing a Mirth Connect system for a few months and are now about ready to install it in production. Can anyone point me to good resources to help us how to set up our Mirth database with adequate space and to anticipate its probable data growth, and/or how best to prune it? I've looked thru the Community, the wiki, the User Guide, and Google searches on Mirth Connect, but I haven't found anything especially informative. In particular I can't seem to find much that explains the purpose of the various d_* tables that get created and how they are populated.

    We are on Mirth Connect 3.2.2 using SQL-Server 2014.

  • #2
    Looks like a number of folks have read my prior post, yet no responses. Surely someone out there must have an idea, or at least can talk about how they made their own growth estimates. Are there no takers?

    I'm particularly mystified by the creation of successively numbered d_* tables in our dev environment. I see tables such as d_<xxx><N> not getting populated while table d_<xxx><N+1> created after it will be full of records. How does Mirth Connect decide that it no longer can subsist on N copies of a table and needs to build <N+1>, <N+2>, etc., and how does it distribute the records it inserts into them?


    • #3
      If you look at the column names, it's pretty obvious
      • d_m(x) is some metadata regarding the messages
      • d_ma(x) is a table for attachments
      • d_mc(x) is the message itself
      • d_mcm(x) looks like some kind of cross-reference
      • d_mm(x) is another table with data regarding the message (transactional)
      • d_ms(x) contains stats for the channel

      Each channel creates these tables. They're size will depend on your retention settings. (pruning... Development, Production, Raw, etc..)


      • #4
        Depending on your environment, you will need to decide how long you need to keep the message data and meta data. Once that is decided, configure the message pruning appropriately.

        If you decide you need to keep the messages for two weeks.
        In Message Pruning on the destination
        Check Prune messages older than 14
        Check Prune when message metadata is removed

        After fourteen days, your database should quit growing. Obviously, if you add more interfaces or increase the retention of messages the db will grow. But you should be able to get a baseline and run your calculations from there.


        Mirth Certified|Epic Bridges Certified|Cloverleaf Level 2 Certified

        Appliance Version 3.11.4
        Mirth Connect Version 3.8.0
        Java Version 1.6.0_45-b06
        Java (64 bit) Version 1.6.0_45-b06
        Java 7 (64 bit) Version 1.7.0_151-b15
        Java 8 (64 bit) Version 1.8.0_181-b13
        PostgreSQL Version 9.6.8


        • #5
          Thanks. What I wasn't picking up on is that for the d_<x><n> tables the <n> value represents the Local Id of the channel. I had assumed that it was just a sequentially built copy of the d_<x> type of table. Therefore I was thinking that I had to plan not only for pruning and growth in the number of messages, but second-guessing the table creations without understanding where they came from. Now I get it.


          • #6
            does anybody not prune/purge their database?

            i need to keep records for years.

            I have 4 channels running in development mode, and the database has grown to 50 gigs within a few months..


            • #7
              Originally posted by sbkeown View Post
              does anybody not prune/purge their database?

              i need to keep records for years.

              I have 4 channels running in development mode, and the database has grown to 50 gigs within a few months..
              Have you considered letting the prune job work, but archive the messages to a compressed file when pruned?


              • #8
                I'm in the same boat as the original user.

                In a particular installation, we are receiving 10k+ messages per day, and we should store them for a couple of months. Since the upgrade from Mirth 2.X to 3.X, the DB is growing like 1Gb per day. We are currently modifying and optimizing the channels in order to avoid storing more data than we should, but the DB management of the version 3 looks incredibly inefficient and reduntant. Maybe it's needed for enabling all the new features and everything, but it uses way a lot of space, which honestly, I think is unneeded.

                Some ideas for saving space:
                - If the messages are sent through a series of channels, store the messages only in the 1st channel, and keep them only 1 day or less in the rest. This works, but I don't like this idea a lot.
                - Set the transformers in the correct scope, i.e. create a transformer used only for one destination in that destination and store in in the connector map, instead of creating it in the source and storing it in the channel map. This is suggested in the forums and it works, but the space saving is quite small.
                - Instead of using filters in the destinations, set the resultSet in the source, for avoiding copies of the messages in all the destinations. This saves space, but the result set is barely documented, it's prone to errors and bugs (you have to use the name of the destination), and it can be very complicated when the filters of the destination are not always mutually exclusive.
                - Play around with the slider of message storage of each channel. It looks that even the "Production" setting still stores too much. Maybe set it to raw or less in some channels, remove content, etc., but this is again not very documented. I don't know which consequences this can have with queues, temporal shut-downs, etc. e.g. what will happen in a shut-down if a channel which stores everything resends the messages to a channel with "Raw" mode? are the messages queued in the 1st, are they lost? and if the content has been removed, does that mean that they cannot be reprocessed and the data is lost forever?

                About the data pruning, for some reason, in my case, it doesn't work good enough. I'm probably confused with the "Block Size" setting. Does that mean that it only removes 1000 messages when the pruner is launched, or it removes all of them which have to be pruned in blocks of 1000? The documentation, forums, and logic suggest the latter, but my experience the former, since there is barely no data saving with this (checking the real disk usage, not the one shown in the file system).

                And another idea: we are using MySQL 5.7. What about setting the biggest tables and columns of the database as compressed? Will this help, will it be utterly slow, won't work at all...?

                Any other idea about the DB size problem?


                • #9
                  Just following up this thread after one of my colleague has mentioned earlier this afternoon, and hope this is useful

                  Pruning works in general and it is regular maintenance task.

                  If you are setting this up then i would suggest batch/block pruning as removing all old messages in single go may require longer query time and not always supported. You will also need to shrink database afterwords if auto sizing is set.

                  As many experts already suggested here, store messages only where you need them - not in all channels, and use storage duration sensibly.

                  If longer period of storage is required then it may be archived.

                  Hope this helps.