    February 2008

Java regular expression: Replace and negate some punctuation

Suppose we want to replace punctuation in a text string using the POSIX character class \p{Punct}, but we do not want to cover every character in that class. For example, in an XML text string we want to replace all characters in \p{Punct} except ‘.’, ‘/’, ‘<’ and ‘>’. We can construct a regular expression as follows:

<code>

String doc = "THE CONTENT OF YOUR XML DOCUMENT IS HERE.";

String regex = "[\\p{Punct}&&[^<>./]]";

doc = doc.replaceAll(regex, "");

</code>

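The idea is to combine character-class intersection, ‘&&’ (a boolean conjunction), with negation, ‘^’: the expression matches any character that is in \p{Punct} and is not one of ‘<’, ‘>’, ‘.’ or ‘/’.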
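Here is a minimal, self-contained sketch of the same idea that you can compile and run. The class name PunctuationStripper, the method name strip and the sample string are made up for illustration; only the regex comes from the post.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class PunctuationStripper {
    // Matches any \p{Punct} character that is NOT '<', '>', '.' or '/'.
    // Inside a character class, '.' and '/' are literals and need no escaping.
    private static final Pattern UNWANTED =
            Pattern.compile("[\\p{Punct}&&[^<>./]]");

    // Returns the text with every unwanted punctuation character removed.
    static String strip(String text) {
        return UNWANTED.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String doc = "<p>Hello, world! (a/b) costs $5.00.</p>";
        // prints: <p>Hello world a/b costs 5.00.</p>
        System.out.println(strip(doc));
    }
}
```

Note that \p{Punct} also contains ‘"’, ‘'’ and ‘=’, so with this exact pattern the quotes and equals signs in XML attributes are removed as well. If you need to keep them, add them to the negated set.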

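[Update: March 21, 2008] Well, how about negating a whole string rather than single characters? It’s also not difficult. The key is using a lookahead in your regex: a negative lookahead, (?!...), or a positive lookahead, (?=...).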
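As a rough sketch of what negating a string with a negative lookahead can look like, the example below replaces every word except one chosen word. The class name NegateString, the method maskOthers and the word "keep" are assumptions picked for illustration, not something from the original update.

```java
import java.util.regex.Pattern;

// Hypothetical example class; the names and the word "keep" are illustrative only.
class NegateString {
    // \b(?!keep\b)\w+ : a whole word, provided that word is not exactly "keep".
    // The lookahead (?!keep\b) checks the upcoming text without consuming it.
    private static final Pattern NOT_KEEP =
            Pattern.compile("\\b(?!keep\\b)\\w+");

    // Replaces every word other than "keep" with an underscore.
    static String maskOthers(String text) {
        return NOT_KEEP.matcher(text).replaceAll("_");
    }

    public static void main(String[] args) {
        // prints: keep _ _ keep
        System.out.println(maskOthers("keep this keeper keep"));
    }
}
```

The trailing \b inside the lookahead matters: without it, words that merely start with "keep", such as "keeper", would be excluded from the match too.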

2 Responses

  1. Just stumbled across this post whilst trying to figure out how to replace all punctuation in a string except for some allowed values. Exactly what I was looking for, you saved me a lot of work – I guess I owe you a beer.

    :-)

  2. Thanks for the post, I was searching for the same thing!
