Tag Content Extractor | HackerRank Solution

Hello coders, today we are going to solve the Tag Content Extractor problem from HackerRank in Java.

Tag Content Extractor

Problem

In a tag-based language like XML or HTML, contents are enclosed between a start tag and an end tag like <tag>contents</tag>. Note that the corresponding end tag starts with a /.

Given a string of text in a tag-based language, parse this text and retrieve the contents enclosed within sequences of well-organized tags meeting the following criteria:

  1. The names of the start and end tags must be the same. The HTML code <h1>Hello World</h2> is not valid, because the text starts with an h1 tag and ends with a non-matching h2 tag.
  2. Tags can be nested, but content between nested tags is considered not valid. For example, in <h1><a>contents</a>invalid</h1>, contents is valid but invalid is not valid.
  3. Tags can consist of any printable characters.
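
The criteria above can be sketched as a single regular expression with a backreference. This is a minimal sketch, assuming a regex-based approach; the class name TagRulesSketch and the helper extract are hypothetical names chosen for illustration. It runs the two examples from the criteria through the pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagRulesSketch {
    // Group 1: the tag name, any characters except '>' (criterion 3).
    // \1: the end tag must repeat group 1 exactly (criterion 1).
    // Group 2: content that contains no '<', so text that wraps a nested
    // tag is never captured as content (criterion 2).
    static final Pattern VALID = Pattern.compile("<([^>]+)>([^<]+)</\\1>");

    // Hypothetical helper: collects every valid content found in one line.
    static List<String> extract(String line) {
        List<String> contents = new ArrayList<>();
        Matcher m = VALID.matcher(line);
        while (m.find()) {
            contents.add(m.group(2));
        }
        return contents;
    }

    public static void main(String[] args) {
        // Mismatched start and end tags: nothing is extracted.
        System.out.println(extract("<h1>Hello World</h2>"));            // prints []
        // Only the innermost, tag-free content is extracted.
        System.out.println(extract("<h1><a>contents</a>invalid</h1>")); // prints [contents]
    }
}
```

Because the content group excludes '<', the outer h1 pair in the second example can never match as a whole, so only the inner pair yields content.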

Input Format

The first line of input contains a single integer, N (the number of lines).
The N subsequent lines each contain a line of text.

Constraints

  • 1 <= N <= 100
  • Each line contains a maximum of 10^4 printable characters.
  • The total number of characters in all test cases will not exceed 10^6.

Output Format

For each line, print the content enclosed within valid tags.
If a line contains multiple instances of valid content, print out each instance of valid content on a new line; if no valid content is found, print None.

Sample Input

 4
 <h1>Nayeem loves counseling</h1>
 <h1><h1>Sanjay has no watch</h1></h1><par>So wait for a while</par>
 <Amee>safat codes like a ninja</amee>
 <SA premium>Imtiaz has a secret crush</SA premium>

Sample Output

 Nayeem loves counseling
 Sanjay has no watch
 So wait for a while
 None
 Imtiaz has a secret crush
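
To check the sample against the output rules, here is a small sketch that maps each input line to the lines that should be printed for it. It assumes a regex-based extraction; SampleCheck and outputFor are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SampleCheck {
    // Start tag, tag-free content, and an end tag that repeats the start tag name.
    static final Pattern VALID = Pattern.compile("<([^>]+)>([^<]+)</\\1>");

    // Hypothetical helper: returns the lines to print for one input line,
    // i.e. each valid content in order, or the single entry "None".
    static List<String> outputFor(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = VALID.matcher(line);
        while (m.find()) {
            out.add(m.group(2));
        }
        if (out.isEmpty()) {
            out.add("None");
        }
        return out;
    }

    public static void main(String[] args) {
        String[] sample = {
            "<h1>Nayeem loves counseling</h1>",
            "<h1><h1>Sanjay has no watch</h1></h1><par>So wait for a while</par>",
            "<Amee>safat codes like a ninja</amee>",
            "<SA premium>Imtiaz has a secret crush</SA premium>"
        };
        for (String line : sample) {
            outputFor(line).forEach(System.out::println);
        }
    }
}
```

The third line yields None because the backreference comparison is case-sensitive, so Amee and amee are treated as different tag names.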

Solution – Tag Content Extractor in Java

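import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Solution {
    public static void main(String[] args) {
        // Group 1 is the tag name, group 2 is the content (no '<' allowed),
        // and \1 requires the end tag to repeat the start tag name exactly.
        Pattern pattern = Pattern.compile("<([^>]+)>([^<]+)</\\1>");

        Scanner in = new Scanner(System.in);
        // The first line holds the number of lines to process.
        int testCases = Integer.parseInt(in.nextLine().trim());
        while (testCases > 0) {
            String line = in.nextLine();
            Matcher m = pattern.matcher(line);
            int matches = 0;
            // Print every valid content found on this line, one per output line.
            while (m.find()) {
                matches++;
                System.out.println(m.group(2));
            }
            // No valid tag pair on this line.
            if (matches == 0) {
                System.out.println("None");
            }
            testCases--;
        }
        in.close();
    }
}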

Disclaimer: The above problem (Tag Content Extractor) was created by HackerRank, but the solution is provided by CodingBroz. This tutorial is for educational and learning purposes only.
