url in text handling problem
Not all URLs in messages are converted to clickable form (with libsexy). See the example in the screenshot.
Unfortunately, the problem still exists. I tried to find out which kinds of URLs are not handled, but it does not seem to be tied to any particular kind of URL. I received a message (XML code is at paste.pocoo.org/show/83401/ because of the spam protection mechanism) in which these two URLs:
(1.) h t t p://rss.slashdot.org/~r/Slashdot/slashdot/~3/375217335/article.pl and (2.) h t t p://rss.slashdot.org/~r/Slashdot/slashdot/~3/375253688/article.pl weren't clickable; all the others were. (I added spaces after '//' here to work around the spam protection mechanism.)
When I sent it again, the same URLs weren't clickable. But when I resent only part of it (the second Slashdot news item), URL (1.) was clickable. XML code: paste.pocoo.org/show/83400/
The same happened when the third message was sent alone: all URLs became clickable. With the second and third sent together, the first URL of the third news item (California's Wireless...) was not clickable.
NB: it doesn't recognize strings that don't start with 'www' or 'http://' as URLs at all.
Can you tell me what is used for URL handling/recognition in the gajim code?
Thanks in advance
The formatting regex has to be updated to detect "/test/ /test/" but not " /test//test/".
I think a space (or beginning of line) is needed before the first / and a space (or end of line) is needed after the second /.
this is the regex:
r'(?<!\w|\<)' r'/[^\s/]' r'([^/]*[^\s/])?' r'/(?!\w)|'
but I'm not familiar enough with regex
The first part is a lookbehind for \w or \<. If I recall correctly, \< means word boundary, and it cannot appear before /. So the first part effectively means that a word-constituent character cannot appear before the first /. If we wanted to forbid / as well, we would change it to r'(?<![\w/])'. If we want to allow space only, we would use r'(?<!\S)'.
The last component contains a similar construct: a lookahead. Again, we can replace \w with \S. Then there is a vertical bar; I think it means that if no other match is found, an empty match at the beginning of the string is returned instead. I would suggest removing the vertical bar, but that would mean fixing the code handling the return value...
To sum up, try this:
r'(?<!\S)' r'/[^\s/]' r'([^/]*[^\s/])?' r'/(?!\S)|'
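A quick way to check the suggestion against the two strings from the report is a small standalone script. This is only a sketch: the names OLD_ITALIC, NEW_ITALIC and spans are made up for illustration and are not taken from the gajim code, and the trailing vertical bar is dropped because the pattern is tested in isolation here. One thing worth knowing when reading the old pattern: in Python's re module, \< matches a literal < character rather than a word boundary.

```python
import re

# Hypothetical standalone copies of the slash-formatting rule. The trailing '|'
# from the thread is omitted because nothing is joined after it here.
OLD_ITALIC = re.compile(r'(?<!\w|\<)' r'/[^\s/]' r'([^/]*[^\s/])?' r'/(?!\w)')
NEW_ITALIC = re.compile(r'(?<!\S)' r'/[^\s/]' r'([^/]*[^\s/])?' r'/(?!\S)')


def spans(pattern, text):
    """Return the text of every non-overlapping match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


# The two sample strings quoted in the report.
for sample in ('/test/ /test/', ' /test//test/'):
    print(repr(sample), 'old:', spans(OLD_ITALIC, sample),
          'new:', spans(NEW_ITALIC, sample))
```

With the old lookarounds both strings produce two matches; with the \S version the first string still produces two matches and the second produces none, which is the behaviour asked for above.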
Feel free to contact me if this approximation is not what you wanted.
About identity: me is not him. Stepan Kasal speaking, abusing Matej Cepl's login. I hope the Snake will forgive me.