My swarm has a serious message delay issue

The message delay is very severe, and messages are often only received after restarting Jami. My swarm has around 10 people, but even with 5 people the problem can still be triggered once the swarm carries enough messages.
I have seen this in more than one swarm, so it may not be uncommon. It still occurs even when the devices are confirmed to be online. The affected platforms are Windows and Android, both running the latest stable version of Jami.

It’s not uncommon, but I really have been unable to figure out what combination of factors causes it.

For chat platforms, my process is to test it with my wife, and if she doesn’t completely veto it, then bring in a couple of other more willing family members. After a rough start, Jami became reliable enough between my wife and me that I brought on a couple of other people. We’ve since seen a number of issues, and this is one of them.

  1. Even just 1:1, sometimes messages won’t be delivered until 8-10 hours later. Thankfully this is pretty rare, and oddly it seems far less frequent when my wife and I are on different networks. Jami was super responsive and reliable when my wife was traveling in the Netherlands (with me in the US); it’s pretty reliable and fast when one of us is out driving around. It most frequently fails to deliver when we are on the same LAN (both behind the same NAT). We use a mix of linked Android, Linux, and iOS devices.
  2. The new family members, both exclusively on iOS and Darwin, frequently complain that messages are received out of order and frequently with delays.
  3. We have a few swarms. Three of us have no apparent issues with the swarms; in one swarm, which has all of us, one person (who accepted the invite) has yet to receive any messages to the swarm.
  4. Multimedia sent from Android devices is frequently not viewable by the iOS folks.

This is obviously more than just “swarm delay” issues, but some of the behavior is clearly the same sort of issue. I don’t understand the ordering issue, or whether it’s related to the delay issue, but if the client is ordering messages by device-local sent/received times it easily could be. In any case, the behavior makes me suspect Jami isn’t employing a vector clock for message ordering; I’d guess it’s just looking at local received time.
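To make the vector-clock idea concrete, here is a minimal sketch in Python. It is not Jami's code, and I don't know how Jami actually orders messages; the function names (`tick`, `merge`, `happened_before`, `concurrent`) and the device names are made up for illustration.

```python
from typing import Dict

# A vector clock maps a device id to the number of events that device has produced.
VectorClock = Dict[str, int]


def tick(clock: VectorClock, device: str) -> VectorClock:
    """Return a copy of `clock` with `device`'s counter advanced by one (a send event)."""
    new = dict(clock)
    new[device] = new.get(device, 0) + 1
    return new


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Element-wise maximum: what a device knows after receiving a message stamped `b`."""
    return {d: max(a.get(d, 0), b.get(d, 0)) for d in a.keys() | b.keys()}


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    """True if `a` causally precedes `b` (every counter <=, at least one strictly <)."""
    devices = a.keys() | b.keys()
    return (all(a.get(d, 0) <= b.get(d, 0) for d in devices)
            and any(a.get(d, 0) < b.get(d, 0) for d in devices))


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """True if neither clock precedes the other; the UI must pick a tie-break order."""
    devices = a.keys() | b.keys()
    return (any(a.get(d, 0) < b.get(d, 0) for d in devices)
            and any(a.get(d, 0) > b.get(d, 0) for d in devices))


# Hypothetical devices: alice sends, bob replies after seeing it, carol sends independently.
alice_msg = tick({}, "alice")
bob_reply = tick(merge({}, alice_msg), "bob")
carol_msg = tick({}, "carol")
```

Here `happened_before(alice_msg, bob_reply)` holds, so the reply can never be shown above the message it answers, no matter when either one arrives. `carol_msg` is concurrent with both, which is the case where a client would still need some secondary rule (sent time, message id) to place it.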

Yes, when the group size exceeds three people, a series of problems may appear, probably because of the complexity of the swarm network. Because I am using IPv6, latency issues are relatively rare for me in 1:1 swarms, although occasionally the latency can still be very severe. I guess this may be related to Jami’s connection mechanism and the number of contacts an account has.
I hope Jami can maintain its connections more aggressively while it is in the foreground, especially for the swarm that is currently in use, to keep the connection stable. Although this may increase power consumption, I think it’s worth it, especially on desktop computers where we don’t need to worry too much about power consumption.

As for the question you mentioned about the iPhone, I think it is related to the power saving strategy of the iPhone. The battery capacity of the iPhone is relatively low. Therefore, when Jami is in the background, iOS may choose to put Jami into a power saving state, which will make Jami unable to maintain the connection with the contact person.

The problem of sorting has caused discussion. If we strictly sort by time, we may miss some delayed messages. Therefore, Jami may have adopted some strategies to enable us to see unread messages, but this has also caused other sorting problems. This problem is complex, and my point of view is just my personal guess.

Jami is public-interest, non-commercial software, which slows down its development. I can’t do much more for Jami than hope that it will become better. Recently, there are some foundations that may be able to help Jami.

which will make Jami unable to maintain the connection with the contact person

Sure; I could see that. If Jami isn’t configured right on Android (or Android isn’t told to let Jami off the leash), the same issues could happen there. What I don’t understand are the frequent complaints about out-of-order message delivery from the iPhone people. I haven’t yet seen one of those, regardless of the permutation of Linux/Android sender/receiver. I doubt this has anything to do with delivery time delay, though.

(replying separately because this should probably be a completely separate thread)

The problem of sorting has caused discussion. If we strictly sort by time, we may miss some delayed messages.

Hm. “By time” is the tricky question. By the sender’s sent-time? Or the receiver’s received-time? IMO, because it’s impossible for the receiver to know whether it hasn’t yet received a message that’s been sent, the best solution would be to weave late messages into place by sent-time, with a special “newly received” annotation, so that the conversation still makes sense when read in isolation. But, honestly, if Jami is dealing with a sort of “eventual consistency” pattern, it should probably borrow from established eventually consistent systems, which have already had to address this; it is a well-trodden field. Yes, Jami is a messaging system, but ultimately the UI is a view of a distributed, (hopefully) eventually consistent database.

Jami already has some nice handling of edited messages, and one could think of out-of-order message arrival as a sort of “edit” event, where the edit was “∅ → {something}”.
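As a rough illustration of the "late arrival is an insert event" idea, here is a small Python sketch. It assumes each message carries a sender-side timestamp and a unique id; `Message`, `Timeline`, and the `late` flag are hypothetical names, not anything from Jami.

```python
import bisect
from dataclasses import dataclass, field
from typing import List


@dataclass(order=True)
class Message:
    # Ordering uses (sent_at, msg_id) only; msg_id breaks ties between equal timestamps.
    sent_at: float
    msg_id: str
    body: str = field(compare=False)
    late: bool = field(default=False, compare=False)  # the "newly received" annotation


class Timeline:
    """Conversation view kept sorted by the sender's sent-time."""

    def __init__(self) -> None:
        self.messages: List[Message] = []

    def receive(self, msg: Message) -> int:
        """Weave `msg` into place and return its index.

        A message that lands anywhere except the end arrived out of order, so it
        is flagged `late` for the UI to highlight. Duplicates are ignored.
        """
        index = bisect.bisect_left(self.messages, msg)
        if index < len(self.messages) and self.messages[index] == msg:
            return index  # already have this (sent_at, msg_id)
        msg.late = index < len(self.messages)
        self.messages.insert(index, msg)
        return index


timeline = Timeline()
timeline.receive(Message(10.0, "a", "hello"))
timeline.receive(Message(30.0, "c", "see you"))
late_index = timeline.receive(Message(20.0, "b", "delayed reply"))
```

Sender timestamps can be skewed between devices, so sorting on them alone is an assumption of this sketch; a real client would likely combine it with some causal ordering.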

This problem is complex, and my point of view is just my personal guess

It is, I agree… which is why it would be good to not try to re-invent a solution and instead look at how other eventually-consistent systems handle these sorts of occurrences.

IMHO

The UI of some software in China is done well, and I think this may be one of the few advantages of Chinese software.
This software works like this: when a message’s sent time and received time both place it at the end of the message list, the software does nothing special. However, when a message was sent much earlier than it was received, and earlier than other messages that have already been received, the software still sorts it by sent time but shows the user an unread-message prompt until they click on the message and confirm that it has been read.
The advantage of this mechanism is that sorting by time can make contextual reading smooth, while not missing messages received too late.

Put simply:
When a newly received message, sorted by sent time, lands within the part of the list currently shown on the screen, it is not specially marked.
When a newly received message, sorted by sent time, lands outside the part of the list shown on the user’s screen, it is specially marked, and the user needs to click ‘read’ to clear this mark.
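A minimal Python sketch of that rule, under the assumption that the client knows the index range of the messages currently on screen. `needs_unread_mark` and `UnreadMarks` are made-up names for illustration, not the software's actual implementation.

```python
from typing import Set


def needs_unread_mark(index: int, first_visible: int, last_visible: int) -> bool:
    """True if a message inserted at `index` (in sent-time order) is off screen."""
    return not (first_visible <= index <= last_visible)


class UnreadMarks:
    """Tracks messages that were woven in off screen until the user confirms them."""

    def __init__(self) -> None:
        self.marked: Set[str] = set()

    def on_insert(self, msg_id: str, index: int,
                  first_visible: int, last_visible: int) -> None:
        # Only messages the user could not have seen arrive get the mark.
        if needs_unread_mark(index, first_visible, last_visible):
            self.marked.add(msg_id)

    def acknowledge(self, msg_id: str) -> None:
        # The user clicked 'read' on the marked message.
        self.marked.discard(msg_id)


marks = UnreadMarks()
# The screen currently shows indices 90..99 of the conversation.
marks.on_insert("late-msg", index=42, first_visible=90, last_visible=99)
marks.on_insert("fresh-msg", index=99, first_visible=90, last_visible=99)
```

After these two calls only `late-msg` is marked; calling `marks.acknowledge("late-msg")` clears it.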

Is the team aware of any iOS users reporting chats where the messages seem all out of order? I’m hearing this second-hand from my siblings-in-law, and I don’t see this issue on Android, so it’s hard for me to describe it or provide any details. The algorithm you describe seems sound and wouldn’t, IMO, cause noticeable out-of-order UX.

The only issue my wife and I see is the one where occasionally (a day here and there), like the OP, it can take several hours for messages to be delivered; sometimes, killing Jami and restarting it helps. I’ve also caused a flood of messages to arrive just by dicking around with the various DHT settings: enabling/disabling proxy, use/don’t use proxy list, etc. But I do think that’s separate from the OOO chat issue I’m hearing about, and it’d be comforting to know that there are devs with iOS devices that are seeing it, too :slight_smile:

Jami often has more problems on iOS devices. The people around me use Android devices, so I have not run into this fault myself :people_hugging: