Parse HTML tags inside Mirth

  • Parse HTML tags inside Mirth

    Hi Guys,

    I'm facing a problem: I receive an HTML file by email, and I want to remove all the HTML tags and keep only the text. Is there any way to strip the tags and parse the content inside Mirth? I'm using Mirth 3.0.3. My sample incoming data is below.
    Code:
    <dl class="dl-vertical">
    					<dt>Name</dt>
    					<dd>xxxxxx</dd>
    					<dt>Event</dt>
    					<dd>yyyyyy</dd>
    					<dt>MRN</dt>
    					<dd>xxxx</dd>
    					<dt>Date of Birth</dt>
    					<dd>xxxxxxx</dd>
    					<dt>Gender</dt>
    					<dd>y</dd>
    					<dt>Address</dt>
    					<dd>1yyyyy, , yyyyyy, yyyy, yyyyyy</dd>
    				</dl>
    I want to get only the values (i.e. the xxxx and yyyy values) with the tags removed. Is that possible?

    Mirth Interface Engineer
    AWS DevOps

  • #2
    Hi,

    You can use the jsoup library (download it from the jsoup website).

    Below is a working solution for your problem.

    importPackage(org.jsoup); // import the jsoup package (the jsoup JAR must be on Mirth's classpath)
    var contents = FileUtil.read('E:\\1.txt'); // read the HTML file into a string
    var doc = Jsoup.parse(contents); // parse the HTML into a jsoup Document
    var values = doc.select("dd"); // select every <dd> element
    logger.info(values.get(0).text()); // log the text of the first <dd>
    logger.info(values.get(1).text()); // log the text of the second <dd>

    Let me know if you run into any issues.

    -
    Arvind
    HIT Security Professional



    • #3
      Parse the value and strip the tags with a regular-expression replace:

      Code:
      <.\w+>|<*.*">
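
The regex approach can be sketched in plain JavaScript as follows. This is a minimal sketch, not the poster's exact method: the pattern `<[^>]+>` is a stricter substitute for the one above, and the names `TAG_PATTERN`, `stripTags`, and `extractDdValues` are hypothetical helpers introduced for illustration. It assumes no attribute value contains a literal `>`; for HTML that is not this regular, a real parser such as jsoup is the safer choice.

```javascript
// Hypothetical helpers for illustration; no external library needed.
var TAG_PATTERN = /<[^>]+>/g; // anything from "<" up to the next ">"

// Remove every tag and collapse the leftover whitespace.
function stripTags(html) {
    return html
        .replace(TAG_PATTERN, ' ')     // replace each tag with a space
        .replace(/\s+/g, ' ')          // collapse runs of whitespace
        .replace(/^\s+|\s+$/g, '');    // trim both ends
}

// Collect the text of each <dd>...</dd> element, in document order.
function extractDdValues(html) {
    var ddPattern = /<dd[^>]*>([\s\S]*?)<\/dd>/gi;
    var values = [];
    var match;
    while ((match = ddPattern.exec(html)) !== null) {
        values.push(stripTags(match[1]));
    }
    return values;
}

// Shortened version of the sample data from the first post.
var sample = '<dl class="dl-vertical">\n' +
    '\t<dt>Name</dt>\n\t<dd>xxxxxx</dd>\n' +
    '\t<dt>Event</dt>\n\t<dd>yyyyyy</dd>\n' +
    '</dl>';

var allText = stripTags(sample);        // labels and values together
var ddValues = extractDdValues(sample); // values only
```

`stripTags` keeps the `<dt>` labels as well as the values, so `extractDdValues` is closer to what the original question asks for. In a Mirth transformer the input would come from the message rather than a hard-coded string.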



      • #4
        Working Fine

        Hi Arvind,

        Thanks for the script; this sequence of code solved my issue.

        Mirth Interface Engineer
        AWS DevOps
