JSP/Java - strip_tags() PHP-like function

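Another frequently used PHP function is strip_tags.
This function tries to return the given string with all HTML tags stripped, except for the tags listed in allowedTags (a comma-separated list of tag names), which are kept. Each stripped tag is replaced with a single space.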

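import java.util.Arrays;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String strip_tags(String text, String allowedTags) {
  // lower-case the allowed tag names so the lookup below ignores case
  String[] tag_list = allowedTags.toLowerCase(Locale.ROOT).split(",");
  Arrays.sort(tag_list); // binarySearch requires a sorted array

  // group 1 captures the tag name, e.g. "b" for both <b> and </b>
  final Pattern p = Pattern.compile("<[/!]?([^\\s>]*)\\s*[^>]*>", Pattern.CASE_INSENSITIVE);
  Matcher m = p.matcher(text);

  StringBuilder out = new StringBuilder();
  int lastPos = 0;
  while (m.find()) {
    String tag = m.group(1).toLowerCase(Locale.ROOT);
    if (Arrays.binarySearch(tag_list, tag) < 0) {
      // tag not allowed: copy the text before it and replace the tag with a space
      out.append(text.substring(lastPos, m.start()))
        .append(" ");
    } else {
      // tag allowed: copy the text and the tag unchanged
      out.append(text.substring(lastPos, m.end()));
    }
    lastPos = m.end();
  }
  if (lastPos > 0) {
    // at least one tag was found: append the rest of the text
    out.append(text.substring(lastPos));
    return out.toString().trim();
  } else {
    // no tags found: return the input unchanged
    return text;
  }
}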
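
To try the function out, here is a minimal self-contained sketch. It is a variant, not the exact code above: the class name StripTagsDemo and the method name stripTags are assumptions made for illustration, the allowed tags are held in a HashSet instead of a sorted array, and the copying is done with Matcher.appendReplacement instead of manual substring bookkeeping. Unlike the function above, it always trims the result, even when no tag was found.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; the name StripTagsDemo is an assumption.
class StripTagsDemo {
  // Same tag pattern as the function above: group 1 captures the tag name.
  private static final Pattern TAG = Pattern.compile("<[/!]?([^\\s>]*)\\s*[^>]*>");

  public static String stripTags(String text, String allowedTags) {
    // Normalize the allowed list so that "B" and " b" both allow <b>.
    Set<String> allowed = new HashSet<>();
    for (String name : allowedTags.split(",")) {
      String normalized = name.trim().toLowerCase(Locale.ROOT);
      if (!normalized.isEmpty()) {
        allowed.add(normalized);
      }
    }

    Matcher m = TAG.matcher(text);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String tag = m.group(1).toLowerCase(Locale.ROOT);
      // Keep allowed tags verbatim; replace every other tag with a space.
      String replacement = allowed.contains(tag) ? m.group() : " ";
      // quoteReplacement stops '$' and '\' in the tag from being interpreted.
      m.appendReplacement(out, Matcher.quoteReplacement(replacement));
    }
    m.appendTail(out); // copy whatever follows the last tag
    return out.toString().trim();
  }

  public static void main(String[] args) {
    // <p> is stripped, <B> is kept because "b" is allowed (case ignored)
    System.out.println(stripTags("<p>Hello <B>world</B></p>", "b"));
    // both tags are allowed, so the text comes back unchanged
    System.out.println(stripTags("<a href=\"x\">link</a> and <i>text</i>", "a,i"));
  }
}
```

Like the original, this is a regex-based approximation: text such as "1 < 2 and 3 > 2" is treated as containing a tag, so it should not be relied on to sanitize untrusted HTML.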

3 comments so far

  1. You have a mistake with a variable name on line 13. It should be changed to this:
    if(Arrays.binarySearch(tag_list, tag) < 0) {

    It’s also worth noting that this implementation is case sensitive.

  2. Additionally, two backslashes (\\) need to be added before each occurrence of “s” in the regex pattern.

  3. Remus Stratulat [Member] June 22, 2006 11:05 am

    Thank you for your observations. I’ve corrected them.
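
The escaping point raised in the comments can be checked with a short sketch. In Java source, the regex token \s (any whitespace) has to be written as "\\s" inside a string literal; a bare "s" matches the literal letter instead. The class name EscapeDemo and the helper matches are assumptions made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name EscapeDemo is an assumption.
class EscapeDemo {
  // Returns true if the whole input matches the given regex.
  public static boolean matches(String regex, String input) {
    return Pattern.compile(regex).matcher(input).matches();
  }

  public static void main(String[] args) {
    // "\\s" in Java source is the regex \s, so the space inside the tag matches.
    System.out.println(matches("<a\\s*>", "<a >")); // true
    // Without the backslashes, "s*" means "zero or more letters s".
    System.out.println(matches("<as*>", "<a >"));   // false
    System.out.println(matches("<as*>", "<ass>"));  // true
  }
}
```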
