How-To: Unzip a zip file in the browser, natively, no dependencies
14th January 2025
Looking for JavaScript code to unzip a file from a zip archive file? Want it to run in the browser? Don't want to include additional libraries or dependencies? See below 😀
The code
Here's some TypeScript code which understands the internal format of a zip file and uses the recently available Compression Streams API to unzip one or more files from a zip file. See below for an example of how to use this code.
// zip.ts
export function getZipFileHeaders(buffer: ArrayBuffer): ZipFileHeader[] {
  const endOfCentralDirectoryValues = getEndOfCentralDirectoryValues(buffer);
  if (!endOfCentralDirectoryValues) {
    throw new Error("end of central directory not found");
  }
  const centralDirectoryFileHeaders = getCentralDirectoryFileHeaders(
    buffer,
    endOfCentralDirectoryValues,
  );
  return [...centralDirectoryFileHeaders];
}
function getEndOfCentralDirectoryValues(
  buffer: ArrayBuffer,
): EndOfCentralDirectoryValues | undefined {
  // The EOCD record sits at the end of the file, followed only by an optional
  // comment, so scan backwards from the last offset where it could start.
  const firstPossibleOffset = buffer.byteLength - EOCD_FIXED_SIZE;
  // ">= 0" so that an archive with no entries (EOCD at offset 0) is still found
  for (let eocdOffset = firstPossibleOffset; eocdOffset >= 0; eocdOffset--) {
    const eocd = new DataView(buffer, eocdOffset);
    if (eocd.getUint32(EOCD_OFFSET_SIGNATURE, true) !== EOCD_SIGNATURE) {
      continue;
    }
    // a genuine record plus its comment must reach exactly to the end of the file
    const recordSize =
      EOCD_FIXED_SIZE + eocd.getUint16(EOCD_OFFSET_COMMENT_SIZE, true);
    if (recordSize !== eocd.byteLength) continue;
    return {
      numRecords: eocd.getUint16(EOCD_OFFSET_NUM_RECORDS, true),
      centralDirectoryStart: eocd.getUint32(EOCD_OFFSET_CD_START, true),
    };
  }
}
function* getCentralDirectoryFileHeaders(
  buffer: ArrayBuffer,
  { centralDirectoryStart, numRecords }: EndOfCentralDirectoryValues,
): IterableIterator<CentralDirectoryFileHeaderValues> {
  // cdfh = CentralDirectoryFileHeader
  let cdfhOffset = centralDirectoryStart;
  for (let record = 0; record < numRecords; record++) {
    const cdfh = new DataView(buffer, cdfhOffset);
    if (cdfh.getUint32(CDFH_OFFSET_SIGNATURE, true) !== CDFH_SIGNATURE) {
      throw new Error("unexpected central directory file header signature");
    }
    yield {
      fileHeaderOffset: cdfh.getUint32(CDFH_OFFSET_FILE_HEADER, true),
      filename: decodeText(cdfh, CDFH_FIXED_SIZE, CDFH_OFFSET_FILENAME_SIZE),
      uncompressedSize: cdfh.getUint32(CDFH_OFFSET_UNCOMPRESSED_SIZE, true),
    };
    // each header is variable length: fixed part + filename + extra field + comment
    const headerSize =
      CDFH_FIXED_SIZE +
      cdfh.getUint16(CDFH_OFFSET_FILENAME_SIZE, true) +
      cdfh.getUint16(CDFH_OFFSET_EXTRA_FIELD_SIZE, true) +
      cdfh.getUint16(CDFH_OFFSET_COMMENT_SIZE, true);
    cdfhOffset += headerSize;
  }
}
export function unzipFile(
  buffer: ArrayBuffer,
  { fileHeaderOffset }: ZipFileHeader,
) {
  // fh = FileHeader
  const fh = new DataView(buffer, fileHeaderOffset);
  if (fh.getUint32(FH_OFFSET_SIGNATURE, true) !== FH_SIGNATURE) {
    throw new Error("unexpected file header signature");
  }
  const compressedDataStart =
    FH_FIXED_SIZE +
    fh.getUint16(FH_OFFSET_FILENAME_SIZE, true) +
    fh.getUint16(FH_OFFSET_EXTRA_FIELD_SIZE, true);
  // Note: archives written in streaming mode (general purpose flag bit 3) store
  // 0 here and put the real size in the central directory; not handled here.
  const compressedSize = fh.getUint32(FH_OFFSET_COMPRESSED_SIZE, true);
  const compressedData = getData(fh, compressedDataStart, compressedSize);
  switch (fh.getUint16(FH_OFFSET_COMPRESSION_METHOD, true)) {
    case COMPRESSION_NONE:
      // stored entry: returned synchronously as an ArrayBuffer
      return compressedData;
    case COMPRESSION_DEFLATE:
      // deflated entry: returned as a Promise of an ArrayBuffer
      return inflate(compressedData);
    default:
      throw new Error("compression method not supported");
  }
}
// helpers
function getData(dataView: DataView, offset: number, size: number) {
  // copy `size` bytes starting `offset` bytes into the view
  const start = dataView.byteOffset + offset;
  return dataView.buffer.slice(start, start + size);
}
function decodeText(dataView: DataView, offset: number, sizeOffset: number) {
  const size = dataView.getUint16(sizeOffset, true);
  const data = getData(dataView, offset, size);
  // "ascii" is a label for windows-1252 in TextDecoder, so filenames with
  // non-ASCII characters (e.g. UTF-8 names) may not decode correctly
  return new TextDecoder("ascii").decode(data);
}
// decompresses raw DEFLATE data (no zlib or gzip wrapper)
function inflate(data: ArrayBuffer) {
  const decodedStream = new Blob([data])
    .stream()
    .pipeThrough(new DecompressionStream("deflate-raw"));
  return new Response(decodedStream).arrayBuffer();
}
// constants
const COMPRESSION_NONE = 0;
const COMPRESSION_DEFLATE = 8;
const FH_SIGNATURE = 0x04034b50;
const FH_OFFSET_SIGNATURE = 0;
const FH_OFFSET_COMPRESSION_METHOD = 8;
const FH_OFFSET_COMPRESSED_SIZE = 18;
const FH_OFFSET_FILENAME_SIZE = 26;
const FH_OFFSET_EXTRA_FIELD_SIZE = 28;
const FH_FIXED_SIZE = 30;
const CDFH_SIGNATURE = 0x02014b50;
const CDFH_OFFSET_SIGNATURE = 0;
const CDFH_OFFSET_UNCOMPRESSED_SIZE = 24;
const CDFH_OFFSET_FILENAME_SIZE = 28;
const CDFH_OFFSET_EXTRA_FIELD_SIZE = 30;
const CDFH_OFFSET_COMMENT_SIZE = 32;
const CDFH_OFFSET_FILE_HEADER = 42;
const CDFH_FIXED_SIZE = 46;
const EOCD_SIGNATURE = 0x06054b50;
const EOCD_OFFSET_SIGNATURE = 0;
const EOCD_OFFSET_NUM_RECORDS = 10;
const EOCD_OFFSET_CD_START = 16;
const EOCD_OFFSET_COMMENT_SIZE = 20;
const EOCD_FIXED_SIZE = 22;
// types
type CentralDirectoryFileHeaderValues = {
  fileHeaderOffset: number;
  filename: string;
  uncompressedSize: number;
};
type EndOfCentralDirectoryValues = {
  numRecords: number;
  centralDirectoryStart: number;
};
export type ZipFileHeader = CentralDirectoryFileHeaderValues;
How to use
There are two functions exported from the above code:
getZipFileHeaders(buffer: ArrayBuffer): ZipFileHeader[]
- takes the zip file, checks its contents and returns a list of the files inside. The zip file is passed as an ArrayBuffer, which could have come from a fetch call, for example.
unzipFile(buffer: ArrayBuffer, fileHeader: ZipFileHeader)
- takes the zip file and one of the file headers returned by getZipFileHeaders, and extracts that file from the zip. Stored (uncompressed) entries come back as an ArrayBuffer and deflated entries as a Promise of an ArrayBuffer, so await the result.
Example of how these functions are used:
import { getZipFileHeaders, unzipFile } from "./zip";
// ...
// load file using fetch
const res = await fetch(zipFileName);
// ...check response is ok, handle errors (omitted)
// check it's a zip
if (res.headers.get("content-type") === "application/zip") {
  // get raw file data
  const responseBuffer = await res.arrayBuffer();
  try {
    // get the headers of all the files in the zip
    const zipFileHeaders = getZipFileHeaders(responseBuffer);
    // find the file header for the file we want to unzip
    // this example looks for a readme file inside the zip
    const selectedFileHeader = zipFileHeaders.find(
      header => header.filename.endsWith("README.md"),
    );
    // if file is found in the zip, unzip and return the data
    if (selectedFileHeader) {
      const fileData = await unzipFile(responseBuffer, selectedFileHeader);
      // do something with the unzipped file
      // fileData will be an ArrayBuffer (to convert to a string see below)
      return fileData;
    }
  } catch (error) {
    // ... handle error reading zip
  }
}
Hint: to convert the fileData from an ArrayBuffer to a string use something like this:
// See https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder/encoding#value
// for possible values of the text encoding type
const textDecoder = new TextDecoder("utf-8");
const fileDataAsString = textDecoder.decode(fileData);
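If the unzipped file might not be text at all, a stricter decode can help. Here's a minimal sketch, assuming a made-up helper name (decodeUtf8OrNull is not part of zip.ts): the fatal option makes TextDecoder throw on invalid bytes rather than quietly inserting replacement characters.

```typescript
// Hypothetical helper: returns the decoded text, or null if the bytes are not valid UTF-8.
function decodeUtf8OrNull(data: ArrayBuffer): string | null {
  try {
    // fatal: true makes decode() throw a TypeError on malformed input
    return new TextDecoder("utf-8", { fatal: true }).decode(data);
  } catch {
    return null;
  }
}
```

A null result is a reasonable hint that the file is binary, or that it uses a different text encoding.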
How does it work?
getZipFileHeaders
- this function traverses the internal file format of the .zip file (see ZIP file format Structure on Wikipedia). The code uses DataView to check that the various headers and byte signatures are as expected, then returns an array of file headers for the files found within the zip.
unzipFile
- takes a file header as an argument and locates the compressed data for that file in the zip archive. The data is then unzipped using the relatively new Compression Streams API (using the deflate-raw algorithm).
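To make the traversal more concrete, here's a small standalone sketch (separate from zip.ts, with made-up function names). It builds a tiny archive by hand containing one stored, uncompressed file, then follows the same chain with DataView: end of central directory, then central directory file header, then local file header, then the data. The hard-coded numbers are the same offsets as the constants in the code above.

```typescript
// Hypothetical test fixture: one stored entry "a.txt" containing "hi".
// CRC-32 fields are left as 0 because this sketch never checks them.
function buildTinyZip(): ArrayBuffer {
  const name = new TextEncoder().encode("a.txt");
  const data = new TextEncoder().encode("hi");
  const cdStart = 30 + name.length + data.length; // after local header + data
  const eocdStart = cdStart + 46 + name.length; // after central directory
  const buffer = new ArrayBuffer(eocdStart + 22);
  const bytes = new Uint8Array(buffer);
  const view = new DataView(buffer);

  // local file header (compression method at offset 8 stays 0 = stored)
  view.setUint32(0, 0x04034b50, true);
  view.setUint32(18, data.length, true); // compressed size
  view.setUint32(22, data.length, true); // uncompressed size
  view.setUint16(26, name.length, true); // filename size
  bytes.set(name, 30);
  bytes.set(data, 30 + name.length);

  // central directory file header
  view.setUint32(cdStart, 0x02014b50, true);
  view.setUint32(cdStart + 20, data.length, true); // compressed size
  view.setUint32(cdStart + 24, data.length, true); // uncompressed size
  view.setUint16(cdStart + 28, name.length, true); // filename size
  view.setUint32(cdStart + 42, 0, true); // offset of the local file header
  bytes.set(name, cdStart + 46);

  // end of central directory record (no comment)
  view.setUint32(eocdStart, 0x06054b50, true);
  view.setUint16(eocdStart + 8, 1, true); // records on this disk
  view.setUint16(eocdStart + 10, 1, true); // total records
  view.setUint32(eocdStart + 12, 46 + name.length, true); // central directory size
  view.setUint32(eocdStart + 16, cdStart, true); // central directory start
  return buffer;
}

// Follows EOCD -> central directory -> local header for the first entry only.
function readFirstStoredEntry(buffer: ArrayBuffer) {
  const view = new DataView(buffer);
  // 1. scan backwards for the end of central directory signature
  let eocd = buffer.byteLength - 22;
  while (eocd >= 0 && view.getUint32(eocd, true) !== 0x06054b50) eocd--;
  if (eocd < 0) throw new Error("end of central directory not found");
  // 2. the EOCD says where the central directory starts
  const cd = view.getUint32(eocd + 16, true);
  // 3. the central directory header gives the filename and local header offset
  const nameSize = view.getUint16(cd + 28, true);
  const filename = new TextDecoder().decode(
    new Uint8Array(buffer, cd + 46, nameSize),
  );
  const fh = view.getUint32(cd + 42, true);
  // 4. the file data follows the local header, its filename and extra field
  const dataStart =
    fh + 30 + view.getUint16(fh + 26, true) + view.getUint16(fh + 28, true);
  const size = view.getUint32(fh + 18, true);
  const content = new TextDecoder().decode(
    new Uint8Array(buffer, dataStart, size),
  );
  return { filename, content };
}
```

Calling readFirstStoredEntry(buildTinyZip()) walks all four steps. For a deflated entry, the bytes found in step 4 would additionally need to go through DecompressionStream("deflate-raw").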
Disclaimer ⚠️
As mentioned, the Compression Streams API is a relatively new browser API (May 2023), so check Can I use figures to verify browser support is acceptable.
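A runtime check can complement the support tables. This is only a sketch, and supportsDeflateRaw is a made-up name: it looks for the DecompressionStream global and then tries to construct a stream with the deflate-raw format, because as far as I know some browser versions shipped the API before that particular format was added.

```typescript
// Hypothetical helper: true if this environment can decompress "deflate-raw" data.
function supportsDeflateRaw(): boolean {
  if (typeof DecompressionStream === "undefined") return false;
  try {
    // the constructor throws if the format is not supported
    new DecompressionStream("deflate-raw");
    return true;
  } catch {
    return false;
  }
}
```

If it returns false, you could fall back to a library, or only offer files that are stored uncompressed in the zip.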
Also note: the code above should work for most zip files but it's not exhaustive - 'your mileage may vary'! If there's something unexpected in the zip file or the zip file format is unsupported then an exception may be thrown - so it's important to call this code in a try/catch block.
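One way to keep the try/catch in a single place is a small wrapper. This is a sketch with made-up names (Result and attempt are not part of zip.ts). It turns anything thrown into a message, whether that's an Error, a plain string, or the RangeError that DataView raises when a truncated file makes it read past the end of the buffer.

```typescript
// Hypothetical result type: either a value or an error message.
type Result<T> = { ok: true; value: T } | { ok: false; message: string };

// Runs fn and converts any thrown value into a failed Result.
function attempt<T>(fn: () => T): Result<T> {
  try {
    return { ok: true, value: fn() };
  } catch (error) {
    // thrown values are not guaranteed to be Error objects
    const message = error instanceof Error ? error.message : String(error);
    return { ok: false, message };
  }
}
```

For example, attempt(() => getZipFileHeaders(responseBuffer)) would give back either the headers or a message. Note that this wrapper is synchronous, so it won't catch a rejected promise from unzipping a deflated file; that call still needs its own await inside a try/catch.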
Thanks for reading! 🙂