Project blog

Fixing EXIF date taken field for WhatsApp image backups

January 17, 2019 | 8 Minute Read

If you use WhatsApp on Android and recently got a new phone, you might have noticed that after restoring the WhatsApp backup all the file metadata is gone! This can be extremely annoying if you, like me, back up your photos to Google Photos or some other cloud service, since it messes up the order in which the images are shown in the application. In this post I will explain how you can at least partially fix this problem.

WhatsApp stripping the EXIF data

I am not exactly sure why WhatsApp does this. Some people have mentioned privacy as a reason, and that is most likely the case. WhatsApp is also known for optimizing their app to the extreme, so it may also be an attempt to save a few kB per image. Sadly, the app removes all EXIF data that could give any indication of when the photos were taken, and this is causing a lot of issues for those of us who back up our photos to the cloud.

The solution

UPDATE: I have finally completed the web app that I briefly mentioned below. Check out BATCHIFIER if you just want a simple and easy way to fix your images without having to deal with Python and all that. Works for both images and videos (Android only, sorry iOS 😕 - see comments below). Please do not hesitate to leave a comment if you find it useful or if there should be any issues.

Luckily, there is a solution (kinda)! There is no way to truly recover the date taken because WhatsApp strips all EXIF data before the photo is sent and this information is lost forever - assuming that you do not have access to the original photo. The only real solution to this issue is to convince WhatsApp to change how the app works, something I see as rather unlikely to happen in the near future. Until that happens our best option is to estimate the date taken based on the only reliable data we have, which is the time that the message was sent.

When you view a photo in WhatsApp you can actually see when the photo was sent, meaning the information has to be stored somewhere. So I started looking into it and found that the date information is stored in the file name. Now, sadly, the file name only contains the date and not the time of day at which the image was sent. But at least this gives us enough information to restore the photos so that we can get them to show up in the right order (with daily resolution). Note: This is not the case for iOS, as it uses random characters as file names when storing photos and therefore the date cannot be recovered.

We can modify the EXIF data of the photos by running a simple script. I tried creating a script that could run purely in the browser, but ran into some problems. Therefore, I fell back to writing a Python script instead, which requires some technical knowledge to get running (and that part will not be covered here).

I will give the quick solution first and then I will go into the details afterwards:

(NOTE: MAKE SURE YOU HAVE A BACKUP OF YOUR PHOTOS. I do not guarantee that this works correctly, so double check that things look right before moving stuff around)

  1. Install Python if you don’t have it installed already (Anaconda is recommended for Windows).
  2. Run pip install piexif to install the required library for editing EXIF image fields.
  3. Download the images that are missing the date information. This will probably be all images that you have recovered from the backup storage.
  4. Go to this link and click on the green button saying “clone or download”, then click “Download ZIP”.
  5. When the zip-file is downloaded, open it and move the file called “fix_exif.py” into the folder containing your images.
  6. Open a terminal view where you can run python and navigate to the folder containing your images and the script.
  7. From here run python fix_exif.py to fill in all the EXIF date taken values.

The script now works on videos too thanks to Mr.Sheep (the GitHub code has been updated). This should hopefully give you a set of images containing the correct date at least. What you do from here is really up to you and depends on which cloud solution you have and whether the images were backed up there before you restored them from the WhatsApp backup. I hope this was useful for making your photo timeline pretty again!

How it works

The script is fairly short. To understand how it works, we have to take a quick look at the filename format that WhatsApp works with. The date is stored in the filename (for Android at least) with the following format:

IMG-YYYYMMDD-WAXXXX.jpg

Where YYYY is the year, MM is the month and DD is the day. The WAXXXX part just increments by one for every image from the same day, e.g. WA0000, WA0001, etc. We can split the date out from the filename and convert it into the format used for EXIF. EXIF dates are represented in the following form:

YYYY:MM:DD HH:mm:ss

The code below shows how this formatting can be done in Python.

def get_date(filename):
    date_str = filename.split('-')[1]
    return datetime.strptime(date_str, '%Y%m%d').strftime("%Y:%m:%d %H:%M:%S")

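On the second line, we split the filename on '-' so that we only get the date part. On the next line, we parse this date string with strptime and format it into the form we want using strftime. Since the filename carries no time of day, the time part always comes out as 00:00:00.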
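If your folder also contains files with other names, a stricter version of the date function can help. The snippet below is only a sketch of mine and not part of the script on GitHub: the name get_date_safe, the regular expression and the use_counter option are my own assumptions. It returns None for names that do not match the pattern, and it can optionally use the WA counter as a seconds offset so that images from the same day keep their relative order.

```python
import re
from datetime import datetime, timedelta

# Assumed pattern for Android WhatsApp image names: IMG-YYYYMMDD-WAXXXX.<ext>
WHATSAPP_NAME = re.compile(r'^IMG-(\d{8})-WA(\d+)\.\w+$', re.IGNORECASE)


def get_date_safe(filename, use_counter=False):
    """Return an EXIF date string, or None if the name does not match."""
    match = WHATSAPP_NAME.match(filename)
    if match is None:
        return None
    try:
        day = datetime.strptime(match.group(1), '%Y%m%d')
    except ValueError:
        # Eight digits that are not a real calendar date, e.g. 20191345
        return None
    if use_counter:
        # Hypothetical tweak: treat the WA counter as seconds after midnight,
        # capped so the result never rolls over into the next day.
        day += timedelta(seconds=min(int(match.group(2)), 86399))
    return day.strftime('%Y:%m:%d %H:%M:%S')


print(get_date_safe('IMG-20190117-WA0003.jpg'))                    # 2019:01:17 00:00:00
print(get_date_safe('IMG-20190117-WA0003.jpg', use_counter=True))  # 2019:01:17 00:00:03
print(get_date_safe('holiday.jpg'))                                # None
```

Keep in mind that the counter only reflects the order of the images within a day. The resulting times are made up and are not the real times the images were sent.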

Now that we have a way to get the date, we need to find all jpg files in the folder. This can be done in one line as seen below.

filenames = [fn for fn in os.listdir(folder) if fn.split('.')[-1] == 'jpg']

Here we get a list of all files in a given folder using os.listdir, and then we split each filename on '.' and check whether it ends with 'jpg'. If it does, we add the filename to our list of files we want to process. This is a fairly simple check that assumes there are no images with some other naming convention in the folder.
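If that assumption does not hold for your folder, the filter can be made stricter. Here is a minimal sketch, assuming the Android naming scheme described above. The function name select_whatsapp_images and the pattern are hypothetical and not part of the original script. It also accepts upper-case and .jpeg extensions, which the one-liner above would skip.

```python
import re

# Assumed Android WhatsApp naming scheme, with a case-insensitive jpg/jpeg extension
JPEG_NAME = re.compile(r'^IMG-\d{8}-WA\d+\.jpe?g$', re.IGNORECASE)


def select_whatsapp_images(names):
    """Keep only names that look like WhatsApp JPEGs, in sorted order."""
    return sorted(name for name in names if JPEG_NAME.match(name))


# A made-up folder listing to show which names survive the filter
sample = [
    'IMG-20190117-WA0001.JPG',
    'notes.txt',
    'IMG-20190117-WA0000.jpeg',
    'VID-20190117-WA0002.mp4',
    'cat.jpg',
]
print(select_whatsapp_images(sample))
# ['IMG-20190117-WA0000.jpeg', 'IMG-20190117-WA0001.JPG']
```

For a real folder you would pass in os.listdir(folder) instead of the sample list.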

With our list of filenames, we can now loop through the filenames, extract the date and convert it into an EXIF field value. Using piexif, the field value can then be inserted into the image file, without even opening it.

for i, filename in enumerate(filenames):
    exif_dict = {'Exif': { piexif.ExifIFD.DateTimeOriginal: get_date(filename) }}
    exif_bytes = piexif.dump(exif_dict)
    piexif.insert(exif_bytes, folder + filename)

On our first line, we start looping through the filenames, and the first thing we do in our loop is to create the EXIF DateTimeOriginal field (the "date taken") using our date function. The field is dumped as bytes and then finally inserted into our image based on the filename we have. If we combine all the code and include some handy print-outs and imports, this gives us the final result of:
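Since the loop modifies the image files directly, it can be worth doing a dry run first. The sketch below is my own addition and not part of the script on GitHub, and the helper name plan_updates is hypothetical. It reuses the get_date function from above and only reports which date each file would get and which files would be skipped, without touching any file.

```python
from datetime import datetime


def get_date(filename):
    date_str = filename.split('-')[1]
    return datetime.strptime(date_str, '%Y%m%d').strftime('%Y:%m:%d %H:%M:%S')


def plan_updates(filenames):
    """Return (planned, skipped) without opening or changing any file."""
    planned, skipped = {}, []
    for filename in filenames:
        try:
            planned[filename] = get_date(filename)
        except (IndexError, ValueError):
            # No '-' in the name, or the middle part is not a valid date
            skipped.append(filename)
    return planned, skipped


planned, skipped = plan_updates(['IMG-20190117-WA0001.jpg', 'cat.jpg', 'my-photo.jpg'])
for name, exif_date in planned.items():
    print('{} -> {}'.format(name, exif_date))
print('Skipped:', skipped)
```

If the planned dates look right and nothing unexpected ends up in the skipped list, you can run the real loop with more confidence.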

import os
from datetime import datetime
import piexif

folder = './'

def get_date(filename):
    date_str = filename.split('-')[1]
    return datetime.strptime(date_str, '%Y%m%d').strftime("%Y:%m:%d %H:%M:%S")

filenames = [fn for fn in os.listdir(folder) if fn.split('.')[-1] == 'jpg']
for i, filename in enumerate(filenames):
    exif_dict = {'Exif': { piexif.ExifIFD.DateTimeOriginal: get_date(filename) }}
    exif_bytes = piexif.dump(exif_dict)
    piexif.insert(exif_bytes, folder + filename)
    print('{}/{}'.format(i + 1, len(filenames)))
print('\nDone!')

I hope this was helpful. If you found this useful, please leave a comment below! :)