Punching above Your Weight

tiwaryshailesh
6 min read · Jul 3, 2021

What’s up? WhatsApp

Originally published at https://techtrots.medium.com/punching-above-your-weight-515f6ea8c152 on 27 May 2021

The last few hours have seen a lot of furore over WhatsApp suing the Indian Government over user privacy issues; they claim they want to defend it for a change. Knowing WhatsApp’s Facebook connection, that’s surprising. It’s been quite some time since Facebook and Privacy have been used together in the same sentence! TRUMPets…

The main issue, as laid out in the lawsuit, concerns traceability. Simply put, traceability in this context means the technical feasibility of, and the intent behind, identifying the originator of specific content (text/image/video/file) that gets shared several times on an end-to-end encrypted messaging platform. WhatsApp claims that providing traceability would end up breaking end-to-end encryption. Ouch! That must hurt any privacy-respecting firm!

The claim that end-to-end encryption would break if WhatsApp were to provision traceability does not fully stand up to logic. This writeup by no means amounts to an expert opinion; it just wants to get a debate going amongst the readers. (Hello, is anybody there?)

The Forwarded Labels

The forwarded labels on WhatsApp messages have always piqued one’s interest. Even with all the cryptographic wizardry, how on earth could someone track message repetitions without breaking confidentiality, aka the end-to-end encryption?

While WhatsApp’s official WhatsApp Encryption Overview is full of cryptographic gobbledygook, it fails to give out the privacy-protecting secret sauce behind the forwarded labels.

DIY — Forwarded Label

Having failed to find a scratch to this itch, one was forced to think through this mystery. This article lays out a possible blueprint for setting up the forwarded label. While the exact method used by WhatsApp may vary, the underlying approach would mostly be the same. Since most people are aware of the WhatsApp UX, let’s jump right into the forward labelling mechanism blueprint. The entire process can be understood in the phases shown below.

Methods to Message Labelling

Let’s consider a scenario where Bob clicks a unicorn picture and sends it to his bae Elisa. Elisa loves the picture and decides to send it to her niece Sophia. One thing leads to another, and the unicorn’s photo goes viral. Since the police in Lalaland did not like the unicorn (unicorns can be hated), they want to take to task the person who first brought the photo into the ecosystem. Faced with the issue, how can traceability help track the originator without (almost) breaking the encryption?

Metadata Stripping and Compression (MSC) Stage

Bob clicks the unicorn photo so that he can send it to Elisa. The photo clicked from the multi-megapixel iPhone camera has much more than just the photo of the unicorn. Termed metadata, it is data about the data and may contain artifacts like camera model, camera settings, username, location, etc.

Image Metadata

The photo right out of the camera is typically somewhere in the 2–8 MB range. When Bob selects the image to send to Elisa, the messenger’s metadata stripping and compression module kicks in. It strips off all the metadata and further reduces the size of the photo so that it is optimized to travel the wire. For better understanding, it can be assumed that WhatsApp has this box into which one can put an image/video and take out a privacy-safe miniature version of it, ready to go. At this stage, the MSC module may save the metadata, including the hash of the original file married to the identity of Bob. (WhatsApp must not be doing it; they respect privacy. EFF would agree.) The skeletal database used for storing this information may look something like this.
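A minimal sketch of what such an MSC box and its skeletal record could look like is given here. Everything in it is an assumption made for illustration: the `Photo` and `OriginRecord` types, the `strip_and_compress` function, the choice of SHA-256, and the use of `zlib` as a stand-in for real image compression. It is not a description of what WhatsApp actually does.

```python
import hashlib
import zlib
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Photo:
    pixels: bytes                                           # raw image payload
    metadata: Dict[str, str] = field(default_factory=dict)  # camera model, location, ...


@dataclass
class OriginRecord:
    """Hypothetical row in the skeletal database."""
    original_hash: str         # SHA-256 of the untouched file (assumed algorithm)
    sender_id: str             # hypothetical account identifier
    metadata: Dict[str, str]   # whatever the module chose to retain


def strip_and_compress(photo: Photo, sender_id: str) -> Tuple[bytes, OriginRecord]:
    """Return the wire-ready payload plus the record the MSC module could keep."""
    # Hash pixels and metadata together to fingerprint the original file.
    original = photo.pixels + repr(sorted(photo.metadata.items())).encode()
    record = OriginRecord(
        original_hash=hashlib.sha256(original).hexdigest(),
        sender_id=sender_id,
        metadata=dict(photo.metadata),
    )
    # Metadata is simply not forwarded; zlib stands in for lossy image compression.
    return zlib.compress(photo.pixels), record
```

Calling `strip_and_compress(Photo(b"...", {"camera": "iPhone"}), "bob")` would return the slimmed-down payload together with a record tying the original file's hash to Bob.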

Content Bookkeeping

TRIVIA: “bookkeeping” (along with its sibling “bookkeeper”) is often cited as the only unhyphenated English word with three consecutive pairs of double letters. With that out of the way, this phase involves keeping a log of content to be matched later for repetitions. This can be done in a number of ways, but the most obvious and privacy-respecting one is to use cryptographic hashes. Simply put, hashing is another magical box into which one can put any digital content (text/image/video/file) and take out a fixed-length alphanumeric gibberish that is, for all practical purposes, a unique signature of that piece. Also, there is no practical way to get the content back from the gibberish. It is like fingerprints for humans. Everyone has a different fingerprint, and there is no way to reconstruct the person from the fingerprint alone.
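To make the magical box a little less magical, here is a small sketch using Python's standard `hashlib`. SHA-256 is an assumption picked for illustration (the messenger's actual hash function, if any, is not public knowledge here), and `fingerprint` is a hypothetical helper name.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a fixed-length hex digest for content of any size."""
    return hashlib.sha256(content).hexdigest()


unicorn = b"pretend these bytes are the compressed unicorn photo"

# Same content in, same gibberish out, every time.
same_again = fingerprint(unicorn) == fingerprint(bytes(unicorn))

# Change a single byte and the gibberish changes.
tampered = fingerprint(unicorn + b"!") != fingerprint(unicorn)

# The digest length does not depend on the input size.
digest_length = len(fingerprint(unicorn * 10_000))
```

The digest is always 64 hexadecimal characters for SHA-256, whether the input is a one-line text or a long video.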

So, as soon as the metadata-stripped, compressed unicorn picture comes to this phase, its hash is calculated and stored in the content database along with the details as shown below.

Content Tracker Table

Message Labelling

All elated, when Elisa decides to forward the image to her niece, she presses the forward button (that’s a cue to check the labelling routine). The moment she presses forward, the hash of the unicorn image is checked in the master database for a match. In case of a match, a forwarded label shows up on the unicorn’s image on her niece Sophia’s phone. The match count and the details of Elisa are also updated against the hash of the unicorn file to further track things. So, the entry in the master database now looks like this:

Master Database

Hereafter, depending on a count threshold, the message may show a ‘forwarded’ or ‘forwarded multiple times’ label.
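The labelling routine described above can be sketched roughly as follows. This is only a blueprint under stated assumptions: the in-memory `master_db`, the `ContentEntry` fields, the `send` and `forward` functions, and the `MANY_THRESHOLD` value are all hypothetical, and the hash is assumed to be computed on the sender's device so that the service only ever sees the digest.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional

MANY_THRESHOLD = 5  # hypothetical cut-off for the stronger label


@dataclass
class ContentEntry:
    originator: str                                       # first account seen with this hash
    forward_count: int = 0
    forwarders: List[str] = field(default_factory=list)   # who pressed forward


master_db: Dict[str, ContentEntry] = {}  # content hash -> entry


def content_hash(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()


def send(sender: str, payload: bytes) -> None:
    """First-time send: register the hash against the sender if it is new."""
    master_db.setdefault(content_hash(payload), ContentEntry(originator=sender))


def forward(forwarder: str, payload: bytes) -> Optional[str]:
    """Forward pressed: update the entry and return the label to display."""
    entry = master_db.get(content_hash(payload))
    if entry is None:
        return None  # unknown content, so no label
    entry.forward_count += 1
    entry.forwarders.append(forwarder)
    if entry.forward_count >= MANY_THRESHOLD:
        return "forwarded multiple times"
    return "forwarded"
```

In the unicorn story, `send("bob", photo)` creates the entry, and `forward("elisa", photo)` bumps the count to one and returns `"forwarded"` for Sophia's phone.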

Unicorn Menace in Lalaland and Traceability

Alarmed by the unicorn menace, the Lalaland police reach out to the messenger office. They just bring the unicorn photo received on Inspector Duggati’s messenger. The cooperative people at the messenger office compute the hash of the unicorn photo and pull the matching data from the database. Bob gets a visit from Inspector Duggati. End of the story.
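The lookup at the messenger office could be as small as the sketch below. The `content_db` mapping and the `trace_originator` function are hypothetical names, and SHA-256 is again an assumed choice.

```python
import hashlib
from typing import Dict, Optional

# Hypothetical content database: hash of the compressed file -> first sender.
content_db: Dict[str, str] = {
    hashlib.sha256(b"unicorn photo bytes").hexdigest(): "bob",
}


def trace_originator(received_file: bytes) -> Optional[str]:
    """Hash the file the police brought in and look up who sent it first."""
    return content_db.get(hashlib.sha256(received_file).hexdigest())
```

One caveat of this design: the match works only if the inspector's copy is byte-for-byte identical to what was logged. A screenshot, crop, or re-encode produces a different hash, and the lookup returns nothing.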

Traceability — A Question of Intent

Since WhatsApp already has a setup to label messages as forwarded/forwarded many times, they can easily provision traceability with little to no change to that setup.

Well, this is just a thought!
