http://stackoverflow.com/questions/38408645/swift-regex-to-match-unicodes
The Unicode code point of the emoji you have shown is U+1F600.
(Unicode 9.0 Character Code Charts – Emoticons)
And your regex pattern [\uD800-\uDBFF\uDC00-\uDFFF] (which may work on a UTF-16 representation)
matches all non-BMP characters, U+10000…U+10FFFF. That range contains most emoji, but it also contains a huge number of non-emoji characters.
So, since you say "[\uD800-\uDBFF\uDC00-\uDFFF]" was working, the equivalent pattern for NSRegularExpression, written as a Swift string literal, is "[\\U00010000-\\U0010FFFF]".
    import Foundation

    let s = "😀 emoji 😀"
    let regex = try! NSRegularExpression(pattern: "[\\U00010000-\\U0010FFFF]", options: [])
    let replaced = regex.stringByReplacingMatches(in: s, options: [], range: NSRange(0..<s.utf16.count), withTemplate: "*")
    //->"* emoji *"
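As a minimal sketch of the caveat above (the pattern covers every non-BMP character, not only emoji), the snippet below runs the same pattern over three short strings. The helper name `mask` and the sample strings are made up for illustration; U+1D11E is a musical symbol rather than an emoji, and U+2764 is a heart that sits inside the BMP.

```swift
import Foundation

// Same pattern as in the answer: any code point outside the BMP.
let nonBMP = try! NSRegularExpression(pattern: "[\\U00010000-\\U0010FFFF]", options: [])

// Hypothetical helper for this sketch: replaces every match with "*".
func mask(_ text: String) -> String {
    let range = NSRange(text.startIndex..., in: text)
    return nonBMP.stringByReplacingMatches(in: text, options: [], range: range, withTemplate: "*")
}

let emoji = mask("\u{1F600} emoji")  // U+1F600 GRINNING FACE (non-BMP)
let clef = mask("\u{1D11E} clef")    // U+1D11E MUSICAL SYMBOL G CLEF (non-BMP, not an emoji)
let heart = mask("\u{2764} heart")   // U+2764 HEAVY BLACK HEART (inside the BMP)

print(emoji)  // * emoji
print(clef)   // * clef
print(heart)  // unchanged: the heart is not in the matched range
```

So the pattern both over-matches (the clef) and under-matches (the heart). If only emoji should be replaced, a narrower character class would be needed.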
(Addition) To see Unicode code points in your string literal:
    s.unicodeScalars.forEach {
        print(String(format: "U+%04X", $0.value))
    }
For your example string, I get:
    U+1F600
    U+0020
    U+0065
    U+006D
    U+006F
    U+006A
    U+0069
    U+0020
    U+1F600
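To see why the surrogate-range pattern lines up with U+10000…U+10FFFF, it may help to print the UTF-16 code units of U+1F600 as well. This is only a sketch; the variable names are illustrative, and the recombination formula is the standard UTF-16 surrogate-pair decoding.

```swift
let face: Unicode.Scalar = "\u{1F600}"

// UTF-16 stores a non-BMP scalar as a high/low surrogate pair.
let units = Array(String(face).utf16)
let hex = units.map { String($0, radix: 16, uppercase: true) }
print(hex)  // ["D83D", "DE00"]

// Recombine the pair by hand:
// 0x10000 + (high - 0xD800) * 0x400 + (low - 0xDC00)
let high = UInt32(units[0])
let low = UInt32(units[1])
let scalarValue = 0x10000 + (high - 0xD800) * 0x400 + (low - 0xDC00)
print(String(scalarValue, radix: 16, uppercase: true))  // 1F600
```

The high unit falls in D800…DBFF and the low unit in DC00…DFFF, which are exactly the two ranges in the original character class.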