Random old messages appear in open chat windows on reconnect (without smacks)
When gajim reconnects, tries to resume via smacks and the smacks session is unknown something weird happens.
It correctly uses mam to load its history from the server and that seems to be correct (I looked at the xml console output).
But in my open chat windows old messages from days ago (8, 9 days old or even older) appear.
Those messages are not in the mam responses I see in the xml console.
And I even restarted my computer 3 days ago, so it seems to be something persistent across gajim restarts.
Only incoming messages appear that way, not even a single outgoing message. And not all incoming messages since x days ago appear, but only random "blocks" of continous messages.
All stanzas sent and received in the last 2 months use uuids as ids. So it cannot be a simple id clash of some sort (my first suspect when I saw this).
Steps to reproduce
See above (but I'm not 100% sure if this is always the case).
I guess you don't have a way to reproduce the bug for sure? in History, are they with the correct date time? What's the values of restore_lines / restore_timeout options?
No, unfortunately not.
Yes, in history they are correct. And not even doubled or something like this.
restore_lines and restore_timeout are very high (restore_lines = 1024; restore_timeout = 172800).
But the chat window is already open for many hours when this bug happens. Closing and reopening the chat window causes the "wrong" messages to disappear and only the correct history is displayed again.
It just happened again and now I think I know how to reproduce this:
I apparently have to restart my prosody v0.10 server to trigger this bug.
But I currently don't have time to investigate further.
Could someone apply this patch to know if printed messages come from local logs or maybe not correctly cleaned notification events.
it will show a call stack each time you open a chat window (that's not the interesting thing), but when those strange logs are shown, is the call stack printed too? Could you copy/paste it here?
diff -r 936e4ca17ba5 src/chat_control.py --- a/src/chat_control.py Sun Apr 03 16:42:21 2016 +0200 +++ b/src/chat_control.py Sun Apr 03 16:56:45 2016 +0200 @@ -3000,6 +3000,9 @@ self.contact.get_full_jid(), ['chat', 'pm'])) try: + print 'restoring from logs' + import traceback + traceback.print_stack() rows = gajim.logger.get_last_conversation_lines(jid, restore_how_many, pending_how_many, timeout, self.account) except exceptions.DatabaseMalformed:
I have very similar issue (writed to mail lists). I applied patch and compile Gajim, than run it and see "new" message arrived 1 day ago. Where should I see call stack? There is nothing new information appears in chat window and XML console.
Gajim 2016.04.04, Ejabberd 16.02, Debian 8 x64, 3.16.7.
It's printed on stderr. So if you run it from a console, you'll see everything.
There is no call stack on stderr when I open chat window. And there is no call stack when "new" messages "arrives".
then you're not running the version with the patch. The patch prints thing in the console as soon as you open a chat window.
Hm. I added this 3 lines and compile Gajim. Thats how this segment in path_to_folder/src/chat_control.py looks:
self.contact.get_full_jid(), ['chat', 'pm'])) try: print 'restoring from logs' import traceback traceback.print_stack() rows = gajim.logger.get_last_conversation_lines(jid, restore_how_many, pending_how_many, timeout, self.account) except exceptions.DatabaseMalformed:
Does I missed something?
That's enough. If nothing is printed in console when you run gajim from console, you maybe don't run the correct version of Gajim. I don't know where you installed it, but run launch.sh from gajim folder (./launch.sh)
I deleted all previous Gajim versions, download daily snapshot (2016-04-04), change path_to_folder/src/chat_control.py, compile and install it with:
./configure --prefix=/opt/gajim make make install
No errors appear. Then run it from terminal emulator with:
There is no launch.sh script in source code or in /opt/gajim.
Traceback does not printed if I open chat window, and printed when I send message. They looks like this:
12.04.2016 18:35:05 (E) gajim.c.ged Error while running an even handler: \<bound method OtrPlugin.handle_outgoing_msg_stanza of \<gotr.otrmodule.OtrPlugin object at 0x7f762c049bd0>> Traceback (most recent call last): File "/opt/gajim/share/gajim/src/common/ged.py", line 93, in raise_event if handler(*args, **kwargs): File "/home/ilia/.local/share/gajim/plugins/gotr/otrmodule.py", line 628, in handle_outgoing_msg_stanza getContext(event.msg_iq.getTo()).\ File "/home/ilia/.local/share/gajim/plugins/gotr/potr/context.py", line 536, in getContext self.ctxs[uid] = self.contextclass(self, uid) File "/home/ilia/.local/share/gajim/plugins/gotr/otrmodule.py", line 122, in __init__ self.jid = get_jid_from_fjid(peer) File "/home/ilia/.local/share/gajim/plugins/gotr/otrmodule.py", line 113, in get_jid_from_fjid return gajim.get_room_and_nick_from_fjid(fjid) File "/opt/gajim/share/gajim/src/common/gajim.py", line 269, in get_room_and_nick_from_fjid l = jid.split('/', 1) AttributeError: JID instance has no attribute 'split'
I does not detect any other changes.
Debug output of gajim when this bug triggered
It happened again!
For about 14 days I had mam disabled, but I reenabled it today and just a few hours later this bug was triggered again (I'm not sure if this is just a coincidence or not).
I also applied your debug patch, but no "restoring from logs" shows up and there is no backtrace, too.
I attached the complete debug log of gajim when this bug triggered.
The reconnect was caused by my prosody server being restarted.
Thanks tmolitor. So the tb isn't printed when the bug occurs. So here is a new patch (you can revert the other one) that will print a traceback for every line printed in chat, showing the call stack to understand where it comes from.
This will add a lot of debug info in the console (a traceback for each printed line)
diff -r 7eb2b43c233d src/chat_control.py --- a/src/chat_control.py Sun Apr 03 21:32:53 2016 +0200 +++ b/src/chat_control.py Fri Apr 15 22:17:59 2016 +0200 @@ -2535,6 +2535,10 @@ kind = 'info' name = '' else: + print 'print_conversation' + print text + import traceback + traceback.print_stack() if self.session and self.session.enable_encryption: # ESessions if not encrypted:
I could reproduce the bug again and your new debug output shows some interesting things...
All 4 messages that got displayed again where delayed ones, meaning, that they had the delay tag set when they arrived at gajim and the timestamp in the delay tag was different to the system time.
I received them at 14:33:47 and manually reconnected my account at 18:23:34 (I set it to offline and a few seconds later I set it to online again). Nothing happened at this time, but the MAM archive was fetched. After this, these 4 messages are in the log.db twice. Each message once with the correct timestamp as given in the delay tag and once with a wrong timestamp (the time when the message was received).
Now (after server restart and the associated gajim reconnect), these 4 messages are still doubled in the log, but this time they get displayed because the are marked as unread.
Here is an example from my logs when the message was received (I show you only one message out of 4):
17.04.2016 14:33:47 (D) gajim.c.connection_handlers MessageCB 17.04.2016 14:33:47 (D) gajim.c.ged raw-message-received Args: (\<common.nec.NetworkEvent object at 0xf6ba86ac>,) 17.04.2016 14:33:47 (D) gajim.c.ged message-received Args: (\<common.connection_handlers_events.MessageReceivedEvent object at 0xf6ba84ac>,) 17.04.2016 14:33:47 (D) gajim.c.connection_handlers_events subhashes: msgtxt, id_, jid = ('4145025d1fc1b0a5f90275c4e1686bd9a06258dafee7e8b37cf1d0b00af4a159', 'febc8397970347cd1111d14e2116146c3817ee127909d5c278bc236ff092649f', '24ca6bb89b9ca2440ce9a2809983283e576fd72ae20723dd432ad708e46a1ebf') 17.04.2016 14:33:47 (D) gajim.c.ged decrypted-message-received Args: (\<common.connection_handlers_events.DecryptedMessageReceivedEvent object at 0xf6ba8c2c>,) marking message as unread: 57276 (64, None, 1460884584, 4, None, u'Hey :) ', None) 17.04.2016 14:33:48 (D) gajim.c.ged notification Args: (\<common.connection_handlers_events.NotificationEvent object at 0xf4fe2f4c>,)
... later I read my messages and gajim marks them accordingly (relevant id from above: 57276):
17.04.2016 16:22:39 (D) gajim.c.ged stanza-sent Args: (\<common.connection_handlers_events.StanzaSentEvent object at 0xf15bbd8c>,) marking message as read: [57275, 57276, 57277, 57285, 57292, 57294] 17.04.2016 16:22:40 (D) gajim.plugin_system ClientsIconsPlugin.connect_with_roster_draw_contact() \<entered>
... then I reconnect my account manually (log off, wait, log on again) and receive all missing MAM messages since the last mam query, including the "Hey :) " message which gets log id 57327 and the wrong timestamp of 14:33:46:
17.04.2016 18:23:44 (D) gajim.c.connection_handlers MessageCB 17.04.2016 18:23:44 (D) gajim.c.ged raw-message-received Args: (\<common.nec.NetworkEvent object at 0xf18947cc>,) 17.04.2016 18:23:44 (D) gajim.c.ged mam-message-received Args: (\<common.connection_handlers_events.MamMessageReceivedEvent object at 0xf18942ec>,) 17.04.2016 18:23:44 (D) gajim.c.ged mam-decrypted-message-received Args: (\<common.connection_handlers_events.MamDecryptedMessageReceivedEvent object at 0xf18946ac>,) 17.04.2016 18:23:44 (D) gajim.c.logger New log received from server archives, storing it marking message as unread: 57327 (64, None, 1460896426, 4, None, u'Hey :) ', None)
... and NOW, after restarting the server and reconnecting gajim displays these 4 messages because it thinks they weren't displayed yet:
17.04.2016 23:39:02 (D) gajim.plugin_system ClientsIconsPlugin.connect_with_roster_draw_contact() \<left> marking message as read:  marking message as show: 57326 marking message as read:  marking message as show: 57327 marking message as read:  marking message as show: 57328 marking message as read:  marking message as show: 57329 17.04.2016 23:39:03 (D) gajim.plugin_system ClientsIconsPlugin.connect_with_roster_draw_contact() \<entered>
Here, message 57327 is the unread duplicate of 57276.
The Message stanza in question is this:
\<message id='eef5f75f-0441-4b04-b433-76297d531230' type='chat' firstname.lastname@example.org' from='userA@example.org/mobile'> \<body>Hey :) \</body> \<markable xmlns='urn:xmpp:chat-markers:0'/> \<request xmlns='urn:xmpp:receipts'/> \<delay stamp='2016-04-17T09:16:24.919Z' xmlns='urn:xmpp:delay'/> \</message>
The incoming MAM XML looks like this:
\<message email@example.com/Gajim'> \<result id='4b1fafc5-6cec-4e31-a56a-acab2a93f2fc' queryid='8557a58b-2ec4-4063-9564-609ca6a84b3a' xmlns='urn:xmpp:mam:0'> \<forwarded xmlns='urn:xmpp:forward:0'> \<delay stamp='2016-04-17T12:33:46Z' xmlns='urn:xmpp:delay'/> \<message type='chat' firstname.lastname@example.org' from='userA@example.org/mobile' id='eef5f75f-0441-4b04-b433-76297d531230' xmlns='jabber:client'> \<body>Hey :) \</body> \<markable xmlns='urn:xmpp:chat-markers:0'/> \<request xmlns='urn:xmpp:receipts'/> \<delay stamp='2016-04-17T09:16:24.919Z' xmlns='urn:xmpp:delay'/> \</message> \</forwarded> \</result> \</message>
--> The error is in common/logger.py save_if_not_exists() where the wrong timestamp is used (the delay tag contained in the forward tag and not the one contained in the forwarded message itself).
I'll try to find the point where this wrong timestamp is selected and create a patch :)
Well, seems I can provide some statistics and observations. If this message will go to spam, I'll repost it to maillist.
When old "new" message appears, there is notification before every message: "This message IS NOT encrypted" (actually all my communications except transports are encrypted with gpg both on Gajim and Conversations). Maybe it is because messages taken from local history.
When "new" message appears, it dublicated in Gajim history with another time, i.e. history is clogged. Dublicated messages inserts in random time (not necessarily in same day).
In some cases "new" messages arrived not in the correct order. There are other messages in days from "new" messages arrives, i.e. not every message arrives twice.
I did not found any match with "new" messages when see XML console log.
"New" messages arrives independently from server and online clients. I have 2 accounts (1st on jabber.ru, 2nd on my local server), and "new" messages arrives from both accounts.
19.05.2016 Gajim goes online in 14:07, arrived 1 message from 18.05.2015 08:53.
24.05.2016 Gajim goes online in 14:10, arrived 28 messages from last 4 days, 20-23.05.2016
25.05.2016 Gajim goes online in 17:56, arrived 1 message from 24.05.2016 11:07
26.05.2016 Gajim goes online in 14:40, arrived 2 messages from 25.05.2016
27.05.2016 Gajim goes online in 10:54, arrived 2 messages from 26.05.2016
27.05.2016 Gajim goes online in 14:26, arrived 7 messages from 27.05.2016
27.05.2016 Gajim goes online in 15:11, arrived 6 messages from 27.05.2016
29.05.2016 Gajim goes online in 09:06, arrived 1 message from 13.05.2016
31.05.2016 Gajim goes online in 21:14, arrived 1 message from 13.05.2016
04.06.2016 Gajim goes online in 03:25, arrived 94 messages. 8 messages from 31.05.2016 from 1st contact, 4 messages from 2nd contact from 24.05.2016, 35 messages from 3rd contact from 17-18.05.2016, 46 messages from 4th contact (13-22 days ago), 1 message from 5th contact.
It affects from 0.16.5 to 2016.05.10 snapshot (I did not test previous and next versions), Debian 8 and Windows 7.
Is there any data I can provide or tests I can perform?
Thanks for your comment and observations. It really seems to be a date/time thing.
If I understand you correctly, you checked XML console when you connect, and haven't found the false new messages in the logs? Then I'm quite sure they are in the unread_messages table. printing the content of the table before connecting would probably show the messages that you'll get show as new when connecting. That also explains why they are not encrypted. They are stored unencrypted in local DB.
This means that Gajim thinks that those messages were not read when you received them.
Why? No idea for the moment. Probably getting the XML logs of the previous days would help. I think those fake new messages are received twice. Once while online (via carbons of directly) and once later via MAM. That is the normal behaviour, but Gajim handles that badly.
Could you store XML logs by running gajim -l .nbxmpp.client_nb=INFO 2> Gajim_logs_DATE.log
Then when the bug occurs, could you look in the previous logs for the problematic message text? But while writing that, I'm thinking that it could work only if you use non-encrypted chat ...
I runned Gajim on with your command, and some minuted later got 2 messages. One of them is (this conversation was not encrypted):
You cancelled the transfer of file
You can see it in attached log twice. Also there are 2 records in logs (for this message the same time).
Original message arrived 2016.06.09, and I could not find it in log from this date.
gajim -l .nbxmpp.client_nb=INFO 2> Gajim_logs_
I found similar problem.. I am using Gajim on 3 OSes..
1) Win 7 on my desktop
2) Fedora 24 on my laptop
3) Win 7 on my laptop
Everywhere I have Gajim 0.16.6 (GTK+ Version: 2.24.31; PyGTK Version: 2.24.0)
I was connected from Win 7 last week and didn't touch laptop. Then I was travelling, so I shutdown my desktop and turn on laptop in Win 7 and I got all old messages from MAM. Then I restart to Linux and got also all old messages from MAM. Any ideas how to solve it or debug it better?
thats the purpose of MAM, so all clients are in sync
for the restart bug, i think the problem lies with not correctly deleting unread messages from the unread sql table.
for example i use gajim daily, i still have 90 unread messages in the unread sql table, it actually grows.
i can reproduce the problem if i switch between python3 branch and python2 when im using the same logs.db for both, then on start i get many notifications of old messages.
But I think client should sync it in background or load it on demand not show them as new messages (and if you are not using client for week, you can have 1000+ unread messages)
i dont think that has something to do with MAM, its a bug where old messages show as unread
But this bugreport is about it.. and it's still happening in last version, so shouldn't this bug be reopened? Or should I create a new ticket for that?
i think it would be better to open a new issue, i dont think this is the same issue.
also please try to give more information
do you see a notification in the roster about these old messages? are there only some of the old messages, or all?