JSP/Java – a PHP-like strip_tags() function
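Another widely used PHP function is strip_tags.
The Java version below tries to return the given string with all HTML tags stripped, except those named in the comma-separated allowedTags argument. It needs java.util.Arrays, java.util.regex.Matcher and java.util.regex.Pattern to be imported.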
1: public static String strip_tags(String text, String allowedTags) {
2:     String[] tag_list = allowedTags.split(",");
3:     Arrays.sort(tag_list); // binarySearch requires a sorted array
4:
5:     final Pattern p = Pattern.compile("<[/!]?([^\\s>]*)\\s*[^>]*>",
6:             Pattern.CASE_INSENSITIVE);
7:     Matcher m = p.matcher(text);
8:
9:     StringBuffer out = new StringBuffer();
10:     int lastPos = 0;
11:     while (m.find()) {
12:         String tag = m.group(1); // tag name, e.g. "b" for <b> or </b>
13:         // if tag not allowed: skip it
14:         if (Arrays.binarySearch(tag_list, tag) < 0) {
15:             out.append(text.substring(lastPos, m.start())).append(" ");
16:
17:         } else {
18:             out.append(text.substring(lastPos, m.end()));
19:         }
20:         lastPos = m.end();
21:     }
22:     if (lastPos > 0) {
23:         out.append(text.substring(lastPos));
24:         return out.toString().trim();
25:     } else {
26:         return text; // no tags found
27:     }
28: }

You have a mistake with a variable name on line 13. It should be changed to this:
if(Arrays.binarySearch(tag_list, tag) < 0) {
It’s also worth noting that this implementation is case sensitive.
Additionally, two backslashes (\\) need to be added before each occurrence of “s” in the regex pattern.
Thank you for your observations. I’ve corrected them.
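The case-sensitivity point raised above could be handled by lowercasing tag names before the lookup. The following is only a minimal sketch of that idea, not the author's code: the class name StripTagsSketch and the method name stripTagsIgnoreCase are hypothetical, and it assumes that ignoring spaces around the commas in allowedTags is acceptable.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class StripTagsSketch {

    // Same pattern as in the post: group 1 captures the tag name.
    private static final Pattern TAG = Pattern.compile("<[/!]?([^\\s>]*)\\s*[^>]*>");

    // Hypothetical variant: tag names are compared in lower case,
    // and spaces around the commas in allowedTags are ignored.
    public static String stripTagsIgnoreCase(String text, String allowedTags) {
        Set<String> allowed = Arrays.stream(allowedTags.split(","))
                .map(s -> s.trim().toLowerCase(Locale.ROOT))
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toSet());

        Matcher m = TAG.matcher(text);
        StringBuilder out = new StringBuilder();
        int lastPos = 0;
        while (m.find()) {
            String tag = m.group(1).toLowerCase(Locale.ROOT);
            if (allowed.contains(tag)) {
                // allowed tag: copy the text before it and the tag itself
                out.append(text, lastPos, m.end());
            } else {
                // disallowed tag: copy the text before it, replace the tag with a space
                out.append(text, lastPos, m.start()).append(' ');
            }
            lastPos = m.end();
        }
        if (lastPos == 0) {
            return text; // no tags found
        }
        out.append(text.substring(lastPos));
        return out.toString().trim();
    }

    public static void main(String[] args) {
        // <B> and </B> are kept even though the allowed list says "b"
        System.out.println(stripTagsIgnoreCase("<B>bold</B> and <I>italic</I>", "b"));
    }
}
```

With this variant, an allowed list of "b" also keeps upper-case B tags. As in the original, each removed tag is replaced by a space.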
This should work as well.
6: sb.append(allow[0]);
9: sb.append(allow[i]);
10: }
14: }
16: }

At first glance, it seems this method removes all the allowed tags and keeps the others, which is the opposite of what it should do.
Please tell me if I am wrong.
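One way to check the question above is to run the method on a small input. The sketch below copies the method from the post, with the regex written as "\\s", into a hypothetical class named StripTagsCheck. The sample HTML string is made up for illustration. Arrays.binarySearch returns a negative value when the tag is not in the allowed list, so the "< 0" branch is the one that drops a tag.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StripTagsCheck {

    // The method from the post, with the regex written as "\\s".
    public static String strip_tags(String text, String allowedTags) {
        String[] tag_list = allowedTags.split(",");
        Arrays.sort(tag_list);

        final Pattern p = Pattern.compile("<[/!]?([^\\s>]*)\\s*[^>]*>",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);

        StringBuffer out = new StringBuffer();
        int lastPos = 0;
        while (m.find()) {
            String tag = m.group(1);
            if (Arrays.binarySearch(tag_list, tag) < 0) {
                // negative result: tag is NOT in the allowed list, so drop it
                out.append(text.substring(lastPos, m.start())).append(" ");
            } else {
                // tag is in the allowed list: keep it
                out.append(text.substring(lastPos, m.end()));
            }
            lastPos = m.end();
        }
        if (lastPos > 0) {
            out.append(text.substring(lastPos));
            return out.toString().trim();
        } else {
            return text;
        }
    }

    public static void main(String[] args) {
        String html = "<p>Hello <b>world</b></p>";
        System.out.println(strip_tags(html, "b")); // prints: Hello <b>world</b>
        System.out.println(strip_tags(html, "p")); // keeps <p>, drops <b>
    }
}
```

In this run the allowed tags are kept and the others are replaced by a space, so the method behaves as the post describes.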