Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked
 
PerlMonks  

Re: [OT] Merging messages between users on different machines

by bitingduck (Chaplain)
on Jun 21, 2015 at 18:57 UTC ( [id://1131363]=note: print w/replies, xml ) Need Help??


in reply to [OT] Merging messages between users on different machines

Each message could also have stored with it a digest like MD5(Timestamp+message). Then you sync by comparing lists of the hash, which is conveniently fixed length and easily ordered.
  • Comment on Re: [OT] Merging messages between users on different machines

Replies are listed 'Best First'.
Re^2: [OT] Merging messages between users on different machines
by Anonymous Monk on Jun 23, 2015 at 10:28 UTC

    MD5(Timestamp+message)
    That's an idea, thanks. If both timestamp and contents of two messages coincide, then the messages are, indeed, same. (Though I think that SHA1 is better in terms of collisions, and I won't employ SHA2 there because cryptographic integrity will be protected by SSL.)

    Also, I still need a way to distinguish the messages as they are delivered to different users (I need some time to prepare an answer for you, BrowserUK).

    Then you sync by comparing lists of the hash, which is conveniently fixed length and easily ordered.
    So, part of the question is comparing two sorted arrays of 20-byte numbers. Thank you, now I understand better how tools like rsync work. The open part of the question is: which shortcuts I can apply to avoid exchanging full lists of messages? (for me, the worst case would be (1e5 * 20) = 2M of data per sync attempt, which is okay if performed rarely, but quite a lot - and may require resuming - on a EGDE connection)

      MD5(Timestamp+message)

      I strongly suggest that you keep the timestamps outside of (separate from) whichever digest function you use. Digests are inherently unorderable with respect to time.

      The open part of the question is: which shortcuts I can apply to avoid exchanging full lists of messages?

      In the following assume: messageNo is a monotonically increase number from oldest to newest; messageId is timestamp+digest[+deviceId+deviceId].

      1. Swap counts; same and you're done.
      2. Otherwise; swap messageId of the message at messageNo (smaller count/2) from the beginning.
      3. Same; add (smaller count/4) to previous messageNo (smaller count/2) and exchange messageIds. goto step 3.
      4. Different; subtract (smaller count/4) from previous messageNo and exchange messageIds. goto step 3.

      Basically; do a binary search (oldest->newest) to locate the first difference; and then move forward from that point; exchanging first messageIds; then full messages if needed.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
      I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!

        I strongly suggest that you keep the timestamps outside of (separate from) whichever digest function you use. Digests are inherently unorderable with respect to time.

        I've been a little busy this week so didn't bother looking back at this thread, but my (unstated) reasoning behind putting the timestamps inside the hash was for easy sortability on a single field, at the expense of easy time ordering. It also guarantees that short messages that may be have dupes in the past or future get unique digests. It might be reasonable to include the timestamp both inside the digest and outside as a separate column, depending on the application.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1131363]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others goofing around in the Monastery: (9)
As of 2024-04-16 08:46 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found