The Exchangeable Image File Format (EXIF) is a standard that specifies formats for images and sounds. It stores technical details through metadata, data that describes other data, such as the camera make and model and the date and time the image was taken.
Initially, EXIF was used for two image formats, JPEG and TIFF. But today, other file formats such as PNG, WEBP, or HEIC also support EXIF for metadata.
This article will focus on the JPEG format. In the first part, we will explore its structure before seeing how to read and update associated metadata through Javascript in a browser environment.
Before moving on, it is essential to review some key concepts:
📌 What is the 0x
notation?
0x
indicates that the following number is in hexadecimal format, which uses a base-16 number system (as opposed to the base-10 decimal system). This notation is case-insensitive, meaning that 0XFF
and 0xff
are exactly the same.
📌 What is a bit or a byte? In computer science, a bit is the smallest and the most basic unit of information. It is a binary digit (base 2) representing 0 or 1. A byte (or octet) is a group of eight bits. Since there are 256 possible combinations of 8 bits, a byte can be expressed as a hexadecimal number. For example:
0x00
represents 0
in decimal and corresponds to 0000 0000
in binary, which is the minimum 8-bit value.0xD8
represents 216
and corresponds to 1101 1000
.0xFF
represents 255
and corresponds to 1111 1111
, which is the maximum 8-bit value.For multiple-byte words, the hex numbers are just combined: 0xFFD8
is a two-byte word, and 0x45786966
is a four-byte word.
📌 What is Endianness? This is how a set of bytes is stored in memory. In big-endian, the most significant byte (leftmost) comes first, while in little-endian, the least significant byte (rightmost) comes first.
For example, let’s consider the two-byte word 0x0124
. In a big-endian system, it will be written as 01 24
, whereas in a little-endian one, it will be written as 24 01
. Knowing whether an image has been written on a big or little-endian device is essential to read its data correctly.
The structure of a JPEG image is divided into parts marked by two-byte markers, always starting with a 0xFF
byte. Below is a list of key markers found in the pages 20/21 of the JPEG compression specification:
0xFFD8
: SOI (Start of Image); indicates the beginning of the image structure.0xFFE*n*
: APPn (Application-related tags); following the SOI marker, with n
between 0
and F
(full list). For example, APP11 (or 0xFFEB
) is for HDR data, APP13 (or 0xFFED
) for Photoshop and APP1 (or 0xFFE1
) for EXIF.0xFFDA
: SOS (Start of Scan); indicates the beginning of the image-related data.0xFFD9
: EOI (End of Image); indicates the end of the image.The first four file bytes, here FF D8 FF E0
for JPEG, are also known as magic numbers and are used by software to identify the file type.
The size of a segment can be determined by reading the two bytes following its marker. For example, if the segment starts with FFE1 0124 XXXXXXX
, then the APP1 segment size is 292 bytes, with 0124
being the size’s hexadecimal representation.
Data in JPEG structure is grouped into directories called IFDs. For example, IDF0
is located in the APP1
segment, and IFDExif
is a sub-IFD of IDF0
.
The IFD dataset includes a two-byte word indicating the number of tags, followed by the tags data and ending with the four-byte offset of the next IFD (or 0 if none).
A tag, like all EXIF tags, is a twelve-byte length sequence made up of:
1
for a BYTE
(one-byte integer), 3
for a SHORT
(two-byte integer), or 4
for a LONG
(four-byte integer). For further details, see the pages 25 and 26 of the JPEG compression specification.SHORT
values, two bytes are read; for LONG
values, four bytes are read. If the value is longer than four bytes (e.g., RATIONAL
type), these four bytes store the offset needed to reach the actual value.Time to code! The FileReader
API is here used to read the image as a buffer
. Then it is transformed into a DataView
for easier byte manipulation.
The next step is to examine the start of the JPEG structure, which should be the SOI marker:
Note: The getUint16
function in Javascript is used to read two bytes (2*8 = 16bits), and there is a similar function for four bytes, getUint32
.
From here can begin the loop through the image data to locate the EXIF section. The EXIF segment uses the APP1
marker followed by a special six-byte ASCII code Exif
(0x457869660000
) immediately following the APP1
size data.
Reaching SOS marker is reached means reaching the start of the image data so the end of the metadata section.
The last thing to do here is to determine which is the endianness used to encode that image. In the JPEG structure, the endianness is provided thanks to the two-bytes word following the Exif
special word. If the word is 0x4949
, it means it’s little endian, otherwise it is 0x4D4D
for big endian. This endianness data must be followed by the two bytes 0x002A
(42 in decimal).
Note: From now on, always provide the endianness to the getUint16
/getUint32
functions to correctly read the bytes.
If APP1 appears at the very beginning of the image structure (which is usually the case), then the structure should be as follows:
All the necessary information are now known to search for the EXIF tags:
Orientation
, located in the IFD IFD0
ExifImageWidth
or PixelXDimension
tag, located in the IFD IFDExif
, provided by the ExifOffset
tag of IFD0
ExifImageHeight
or PixelYDimension
tag, also located in IFDExif
To locate the IFD0
, its offset is given by the 4-byte word immediately after the endianness 42
number.
This sequence that includes the endianness two-byte word, 42
, and the IFD0
offset four-byte word is commonly referred to as the “TIFF (Tagged Image File Format) Header”:
At this point, there are two tags that need to be found through the IFD0
data:
Orientation
tag (hex 0x0112
) which is a SHORT
value that must be replaced by 1
ExifOffset
tag (hex 0x8769
) which is a LONG
value allowing to find the EXIF IFD tagsAs mentioned earlier, the first two-byte word of the IFD indicates the number of tags in the IFD. Since each tag is 12 bytes long, multiplying the number of tags by 12 gives the size of all the IFD tags, allowing for looping through them.
Note: Following the same logic as for reading, the setUint16
/setUint32
functions are used to respectively write two or four bytes.
Once the offset of the EXIF sub-IFD is found, a new loop must be executed through that IFD’s data to find the remaining height and width tags.
Here is information about the two tags that need to be replaced:
ExifImageWidth
or PixelXDimension
tag (hex 0xa002
), a LONG
value that must be replaced by 1920
ExifImageHeight
or PixelYDimension
tag (hex 0xa003
), a LONG
value that must be replaced by 1080
As a reminder of what was previously stated, the IFD tag is composed of 2 bytes for the type, 4 for the count, and 4 for the value.
Getting the final image is as simple as building a new Blob from the updated buffer data:
In the end, the updated blob can be converted to a file or downloaded, depending on the application’s needs.
This article covers the basics of reading and updating tags, but you can expand the code by adding more tags. All the hex codes for the tags can be found on exiftools.org or this tags reference site.
It’s worth noting that there are existing libraries like exif-js
or piexifjs
for manipulating EXIF data, but they may be larger than what is needed here and seems not being actively maintained.
If you want to see the full code used to write this article, feel free to check out this gist.