The png Module

Pure Python PNG Reader/Writer

This Python module implements support for PNG images (see PNG specification at http://www.w3.org/TR/2003/REC-PNG-20031110/ ). It reads and writes PNG files with all allowable bit depths (1/2/4/8/16/24/32/48/64 bits per pixel) and colour combinations: greyscale (1/2/4/8/16 bit); RGB, RGBA, LA (greyscale with alpha) with 8/16 bits per channel; colour mapped images (1/2/4/8 bit). Adam7 interlacing is supported for reading and writing. A number of optional chunks can be specified (when writing) and understood (when reading): tRNS, bKGD, gAMA.

For help, type import png; help(png) in your Python interpreter.

A good place to start is the Reader and Writer classes.

This file can also be used as a command-line utility to convert Netpbm PNM files to PNG, and the reverse conversion from PNG to PNM. The interface is similar to that of the pnmtopng program from Netpbm. Type python png.py --help at the shell prompt for usage and a list of options.

A note on spelling and terminology

Generally British English spelling is used in the documentation. So that’s “greyscale” and “colour”. This not only matches the author’s native language, it’s also used by the PNG specification.

The major colour formats supported by PNG (and hence by PyPNG) are: greyscale, colour, greyscale–alpha, colour–alpha. These are sometimes referred to using the abbreviations: L, RGB, LA, RGBA. In this case each letter abbreviates a single channel: L is for Luminance or Luma or Lightness which is the channel used in greyscale images; R, G, B stand for Red, Green, Blue, the components of a colour image; A stands for Alpha, the opacity channel (used for transparency effects, but higher values are more opaque, so it makes sense to call it opacity).

A note on formats

When getting pixel data out of this module (reading) and presenting data to this module (writing) there are a number of ways the data could be represented as a Python value. Generally this module uses one of three formats called “flat row flat pixel”, “boxed row flat pixel”, and “boxed row boxed pixel”. Basically the concern is whether each pixel and each row comes in its own little tuple (box), or not.

Consider an image that is 3 pixels wide by 2 pixels high, and each pixel has RGB components:

Boxed row flat pixel:

[[R,G,B, R,G,B, R,G,B],
 [R,G,B, R,G,B, R,G,B]]

Each row appears as its own list, but the pixels are flattened so that three values for one pixel simply follow the three values for the previous pixel. This is the most common format used, because it provides a good compromise between space and convenience. The module regards itself as at liberty to replace any sequence type with any sufficiently compatible other sequence type; in practice each row is an array (from the array module), and the outer list is sometimes an iterator rather than an explicit list (so that streaming is possible).

Flat row flat pixel:

[R,G,B, R,G,B, R,G,B,
 R,G,B, R,G,B, R,G,B]

The entire image is one single giant sequence of colour values. Generally an array will be used (to save space), not a list.

Boxed row boxed pixel:

[[ (R,G,B), (R,G,B), (R,G,B) ],
 [ (R,G,B), (R,G,B), (R,G,B) ]]

Each row appears in its own list, but each pixel also appears in its own tuple. A serious memory burn in Python.

In all cases the top row comes first, and for each row the pixels are ordered from left-to-right. Within a pixel the values appear in the order, R-G-B-A (or L-A for greyscale–alpha).
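
The three formats differ only in how the same values are grouped, so converting between them is mechanical. The following is a minimal sketch and not part of the module; the helper names flat_to_boxed_rows and box_pixels are invented for this illustration, and plain lists are used where the module would normally use arrays.

```python
def flat_to_boxed_rows(flat, width, planes):
    """Split flat row flat pixel data into boxed row flat pixel rows."""
    stride = width * planes  # number of values in one row
    return [list(flat[i:i + stride]) for i in range(0, len(flat), stride)]


def box_pixels(rows, planes):
    """Turn boxed row flat pixel rows into boxed row boxed pixel rows."""
    return [
        [tuple(row[i:i + planes]) for i in range(0, len(row), planes)]
        for row in rows
    ]


# A 3x2 RGB image in flat row flat pixel format, top row first.
flat = [255, 0, 0,  0, 255, 0,  0, 0, 255,
        0, 0, 0,  128, 128, 128,  255, 255, 255]
rows = flat_to_boxed_rows(flat, width=3, planes=3)
pixels = box_pixels(rows, planes=3)
```

Here rows holds two lists of nine values each, and pixels holds two lists of three tuples each.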

There is a fourth format, mentioned because it is used internally, is close to what lies inside a PNG file itself, and may one day have a public API exposed for it. This format is called serialised. When serialised an image is a sequence of bytes (integers from 0 to 255). Each row is packed into bytes (if bit depth < 8) or decomposed into bytes (big-endian, if bit depth is 16). This isn’t a particularly convenient format, but it is produced (in part) as a necessary step for decoding and encoding PNG files. There are some sorts of PNG to PNG recoding where this might be the most efficient format to use.
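
To make the serialised format concrete, here is a rough sketch of how one row of sample values could be turned into bytes. It follows the packing order given in the PNG specification (leftmost sample in the high-order bits, unused bits of the last byte set to zero); serialise_row is a hypothetical helper, not a function the module exposes.

```python
def serialise_row(row, bitdepth):
    """Pack one row of sample values into bytes (illustrative helper)."""
    if bitdepth == 8:
        return bytes(row)
    if bitdepth == 16:
        out = bytearray()
        for value in row:
            out.append(value >> 8)     # high byte first (big-endian)
            out.append(value & 0xFF)
        return bytes(out)
    # Bit depth 1, 2 or 4: several samples share each byte.
    per_byte = 8 // bitdepth
    out = bytearray()
    for i in range(0, len(row), per_byte):
        group = list(row[i:i + per_byte])
        group += [0] * (per_byte - len(group))   # pad the final byte with zero bits
        byte = 0
        for value in group:
            byte = (byte << bitdepth) | value
        out.append(byte)
    return bytes(out)
```

For example, nine 1-bit samples occupy two bytes, with seven padding bits at the end of the second byte.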

And now, my famous members

class png.Reader(_guess=None, **kw)

PNG decoder in pure Python.

Create a PNG decoder object.

The constructor expects exactly one keyword argument. If you supply a positional argument instead, it will guess the input type. You can choose among the following keyword arguments:

filename
Name of input file (a PNG file).
file
A file-like object (object with a read() method).
bytes
array or string with PNG data.
asDirect()

Returns the image data as a direct representation of an x * y * planes array. This method is intended to remove the need for callers to deal with palettes and transparency themselves. Images with a palette (colour type 3) are converted to RGB or RGBA; images with transparency (a tRNS chunk) are converted to LA or RGBA as appropriate. When returned in this format the pixel values represent the colour value directly without needing to refer to palettes or transparency information.

As with the read() method, this method returns a 4-tuple:

(x, y, pixels, meta)

The meta dictionary that is returned reflects the direct format and not the original source image. For example, an RGB source image that uses a tRNS chunk to mark one colour as transparent has planes=3 and alpha=False, but the meta dictionary returned by this method will have planes=4 and alpha=True because an alpha channel is synthesized and added.

pixels is the pixel data in boxed row flat pixel format (just like the read() method).

All the other aspects of the image data (bit depth for example) are not changed.

Note

When the source image is greyscale, has bit depth < 8, and has a tRNS chunk, then an alpha channel will be added, but the bit depth does not change. This results in pixel data which is 2-channel (greyscale+alpha) but bit depth < 8. Whilst this is perfectly sensible, it is not a pixel format supported by PNG so you cannot write it out unmodified to another PNG file. This is not regarded as a bug in this method. It is not the job of this method to rescale pixel values.
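
The way an alpha channel can be synthesized from a single transparent colour is sketched below. This is an illustration under assumptions, not the module's code: the function name add_alpha_from_transparent and its parameters are invented here. Note that the bit depth is left alone, which is how a greyscale image of bit depth < 8 ends up as 2-channel data at the same depth.

```python
def add_alpha_from_transparent(row, transparent, planes, bitdepth):
    """Append an alpha sample to every pixel of a flat-pixel row.

    `transparent` is the colour named by the tRNS chunk, as a tuple with
    `planes` values. Matching pixels become fully transparent (alpha 0);
    all others become fully opaque (the maximum value for the bit depth).
    """
    opaque = (1 << bitdepth) - 1
    key = tuple(transparent)
    out = []
    for i in range(0, len(row), planes):
        pixel = tuple(row[i:i + planes])
        out.extend(pixel)
        out.append(0 if pixel == key else opaque)
    return out
```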

asRGB()

Return image as RGB pixels. Greyscales are expanded into RGB triplets. An alpha channel in the source image will raise an exception. The return values are as for the read() method except that the metadata reflect the returned pixels, not the source image. In particular, for this method metadata['greyscale'] will be False.

Note

Like the asDirect() method, this method can return pixels in “non PNG” formats. For example, a greyscale image of bit depth 4 will be returned as a colour image with bit depth 4 by this method. That format is not supported by PNG (but makes sense in other formats).

asRGB8()

Return the image data as RGB pixels with 8 bits per sample. This is like the asRGB() method except that this method additionally rescales the values so that they are all between 0 and 255 (8-bit). Where the source image has a bit depth < 8, the transformation preserves all the information; where the source image has a bit depth > 8, rescaling to 8-bit values loses precision. No dithering is performed. Like asRGB(), an alpha channel in the source image will raise an exception.

This function returns a 4-tuple: (width, height, pixels, metadata). width, height, metadata are as per the read() method.

pixels is the pixel data in boxed row flat pixel format.

Note that unlike asRGB() this method always returns pixels in a format that can be represented in a PNG; that’s because it forces data to be 8-bit.
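
The rescaling step can be pictured as a linear map from the source range onto 0 to 255. The sketch below is one way to do it with integer arithmetic and rounding to nearest; the module's exact rounding may differ, and rescale_to_8bit is a name made up for this example.

```python
def rescale_to_8bit(value, bitdepth):
    """Rescale a sample of the given bit depth to the range 0..255."""
    source_max = (1 << bitdepth) - 1
    # Multiply first, then divide, adding half the divisor to round to nearest.
    return (value * 255 + source_max // 2) // source_max


# Bit depth 4: the 16 source values stay distinct, so nothing is lost.
four_bit = [rescale_to_8bit(v, 4) for v in range(16)]
# Bit depth 16: 65536 source values share 256 outputs, so precision is lost.
sixteen_bit = [rescale_to_8bit(v, 16) for v in (0, 257, 32768, 65535)]
```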

asRGBA()

Return image as RGBA pixels. Greyscales are expanded into RGB triplets; an alpha channel is synthesized if necessary. The return values are as for the read() method except that the metadata reflect the returned pixels, not the source image. In particular, for this method metadata['greyscale'] will be False, and metadata['alpha'] will be True.

Note

Like the asDirect() method, this method can return pixels in “non PNG” formats. For example, a greyscale image of bit depth 4 will be returned as an RGBA image with bit depth 4. That format is not supported by PNG (but makes sense in other formats).
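
As a sketch of the expansion described for asRGBA(), the function below converts one boxed row flat pixel row from any of L, LA, RGB or RGBA to RGBA. It is illustrative only; row_to_rgba and its parameters are not part of the module. The synthesized alpha uses the maximum value for the bit depth, which is why a 4-bit greyscale source yields 4-bit RGBA.

```python
def row_to_rgba(row, greyscale, alpha, bitdepth):
    """Expand one flat-pixel row (L, LA, RGB or RGBA) to RGBA."""
    planes = (1 if greyscale else 3) + (1 if alpha else 0)
    opaque = (1 << bitdepth) - 1   # maximum sample value means fully opaque
    out = []
    for i in range(0, len(row), planes):
        pixel = list(row[i:i + planes])
        # A grey value is repeated three times to form an RGB triplet.
        colour = pixel[:1] * 3 if greyscale else pixel[:3]
        a = pixel[-1] if alpha else opaque
        out.extend(colour + [a])
    return out
```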

asRGBA8()

Return the image data as RGBA pixels with 8 bits per sample. This method is similar to asRGB8() and asRGBA(): the result pixels have an alpha channel, and values are rescaled to the range 0 to 255. The alpha channel is synthesized if necessary.

Note that unlike asRGBA() this method always returns pixels in a format that can be represented in a PNG; that’s because it forces data to be 8-bit.

chunk(seek=None)
Read the next PNG chunk from the input file; returns type (as a 4 character string) and data. If the optional seek argument is specified then it will keep reading chunks until it either runs out of file or finds the type specified by the argument. Note that in general the order of chunks in PNGs is unspecified, so using seek can cause you to miss chunks.
deinterlace(raw)
Read raw pixel data, undo filters, deinterlace, and flatten. Return in flat row flat pixel format.
iterboxed(rows)
Iterator that yields each scanline in boxed row flat pixel format. rows should be an iterator that yields the bytes of each row in turn.
iterstraight(raw)
Iterator that undoes the effect of filtering, and yields each row in serialised format (as a sequence of bytes). Assumes input is straightlaced. raw should be an iterable that yields the raw bytes in chunks of arbitrary size.
palette(alpha='natural')

Returns a palette that is a sequence of 3-tuples or 4-tuples, synthesizing it from the PLTE and tRNS chunks. These chunks should have already been processed (for example, by calling the preamble() method). All the tuples are the same size, 3-tuples if there is no tRNS chunk, 4-tuples when there is a tRNS chunk. Assumes that the image is colour type 3 and therefore a PLTE chunk is required.

If the alpha argument is 'force' then an alpha channel is always added, forcing the result to be a sequence of 4-tuples.
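
The synthesis of a palette from the two chunks can be sketched as follows. This assumes the layout given in the PNG specification (PLTE holds consecutive RGB triples; tRNS holds one alpha byte per entry and may be shorter than the palette, in which case the remaining entries are opaque). build_palette is a hypothetical stand-alone function taking the raw chunk data, not the method itself.

```python
def build_palette(plte, trns=None, alpha="natural"):
    """Combine raw PLTE and tRNS chunk data into a list of tuples."""
    rgb = [tuple(plte[i:i + 3]) for i in range(0, len(plte), 3)]
    if trns is None and alpha != "force":
        return rgb                      # no tRNS chunk: plain 3-tuples
    trns = trns or b""
    # Entries not covered by tRNS are fully opaque.
    alphas = list(trns) + [255] * (len(rgb) - len(trns))
    return [entry + (a,) for entry, a in zip(rgb, alphas)]
```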

preamble()
Extract the image metadata by reading the initial part of the PNG file up to the start of the IDAT chunk. All the chunks that precede the IDAT chunk are read and either processed for metadata or discarded.
process_chunk()
Process the next chunk and its data. Only the following chunk types are processed: IHDR, PLTE, bKGD, tRNS, gAMA. All others are ignored.
read()

Read the PNG file and decode it. Returns (width, height, pixels, metadata).

May use excessive memory.

pixels are returned in boxed row flat pixel format.

read_flat()

Read a PNG file and decode it into flat row flat pixel format. Returns (width, height, pixels, metadata).

May use excessive memory.

pixels are returned in flat row flat pixel format.

See also the read() method which returns pixels in the more stream-friendly boxed row flat pixel format.

serialtoflat(bytes, width=None)
Convert serial format (byte stream) pixel data to flat row flat pixel.
undo_filter(filter_type, scanline, previous)

Undo the filter for a scanline. scanline is a sequence of bytes that does not include the initial filter type byte. previous is the decoded previous scanline (for straightlaced images this is the previous pixel row, but for interlaced images it is the previous scanline in the reduced image, which in general is not the previous pixel row in the final image). When there is no previous scanline (the first row of a straightlaced image, or the first row in one of the passes in an interlaced image), this argument should be None.

The scanline will have the effects of filtering removed, and the result will be returned as a fresh sequence of bytes.
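
For readers unfamiliar with PNG filtering, the sketch below undoes the five filter types defined by the PNG specification (None, Sub, Up, Average, Paeth). It is written from the specification rather than taken from the module, and it takes the number of bytes per complete pixel as an extra bpp argument, which the method above does not have in its signature.

```python
def undo_filter(filter_type, scanline, previous, bpp):
    """Return the unfiltered bytes of `scanline` as a fresh bytearray.

    `bpp` (bytes per complete pixel, at least 1) is a parameter of this
    sketch only.
    """
    if previous is None:
        previous = bytes(len(scanline))   # a missing row counts as all zeros
    out = bytearray(scanline)
    for i in range(len(out)):
        a = out[i - bpp] if i >= bpp else 0        # decoded byte to the left
        b = previous[i]                            # decoded byte above
        c = previous[i - bpp] if i >= bpp else 0   # byte above and to the left
        if filter_type == 0:      # None
            predictor = 0
        elif filter_type == 1:    # Sub
            predictor = a
        elif filter_type == 2:    # Up
            predictor = b
        elif filter_type == 3:    # Average
            predictor = (a + b) // 2
        elif filter_type == 4:    # Paeth: pick the neighbour nearest to a + b - c
            p = a + b - c
            pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
            predictor = a if pa <= pb and pa <= pc else (b if pb <= pc else c)
        else:
            raise ValueError("unknown filter type %d" % filter_type)
        out[i] = (out[i] + predictor) & 0xFF       # arithmetic is modulo 256
    return out
```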

validate_signature()
If signature (header) has not been read then read and validate it; otherwise do nothing.
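
The container format that chunk() and validate_signature() deal with is simple enough to sketch with the standard library alone. The code below is not the module's implementation: iter_chunks is an invented name, and it works on a complete byte string rather than a file. It checks the 8-byte signature, then walks the chunks, each of which is a 4-byte big-endian length, a 4-byte type, the data, and a CRC-32 of type plus data.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data):
    """Yield (type, data) for each chunk in a complete PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file: bad signature")
    pos = 8
    while pos < len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(ctype + body) & 0xFFFFFFFF != crc:
            raise ValueError("checksum mismatch in %r chunk" % ctype)
        yield ctype.decode("ascii"), body
        pos += 12 + length


# Demonstration on a hand-built stream holding only an empty IEND chunk.
iend = struct.pack(">I4s", 0, b"IEND")
iend += struct.pack(">I", zlib.crc32(b"IEND") & 0xFFFFFFFF)
chunks = list(iter_chunks(PNG_SIGNATURE + iend))
```
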
class png.Writer(width, height, greyscale=False, alpha=False, bitdepth=8, palette=None, transparent=None, background=None, gamma=None, compression=None, interlace=False, bytes_per_sample=None, chunk_limit=1048576)

PNG encoder in pure Python.

Create a PNG encoder object.

Arguments:

width, height
Size of the image in pixels.
greyscale
Input data is greyscale, not RGB.
alpha
Input data has alpha channel (RGBA or LA).
bitdepth
Bit depth: 1, 2, 4, 8, or 16.
palette
Create a palette for a colour mapped image (colour type 3).
transparent
Specify a transparent colour (create a tRNS chunk).
background
Specify a default background colour (create a bKGD chunk).
gamma
Specify a gamma value (create a gAMA chunk).
compression
zlib compression level (1-9).
interlace
Create an interlaced image.
chunk_limit
Write multiple IDAT chunks to save memory.

greyscale and alpha are booleans that specify whether an image is greyscale (or colour), and whether it has an alpha channel (or not).

bitdepth specifies the bit depth of the PNG image. This is the number of bits used to specify the value of each colour channel (or index, in the case of a palette). PNG allows this to be 1, 2, 4, 8, or 16, but there are some restrictions on some values.

For greyscale and palette images the PNG specification allows the bit depth to be less than 8. For other types (including greyscale+alpha), bit depths less than 8 are rejected.

The palette option, when specified, causes a colour mapped image to be created: the PNG colour type is set to 3; greyscale must not be set; alpha must not be set; transparent must not be set; the bit depth must be 1, 2, 4, or 8.

The palette argument value should be a sequence of 3- or 4-tuples. 3-tuples specify RGB palette entries; 4-tuples specify RGBA palette entries. If both 4-tuples and 3-tuples appear in the sequence then all the 4-tuples must come before all the 3-tuples. A PLTE chunk is created; if there are 4-tuples then a tRNS chunk is created as well. The PLTE chunk will contain all the RGB triples in the same sequence; the tRNS chunk will contain the alpha channel for all the 4-tuples, in the same sequence. Palette entries are always 8-bit.
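
The split of a palette into PLTE and tRNS data, as just described, might look like the sketch below. The function name make_palette_chunks is hypothetical, and the code assumes the caller has already placed all 4-tuples before all 3-tuples.

```python
def make_palette_chunks(palette):
    """Return (plte, trns) byte strings; trns is None if there are no 4-tuples."""
    plte = bytearray()
    trns = bytearray()
    for entry in palette:
        plte.extend(entry[:3])        # the RGB triple goes into PLTE
        if len(entry) == 4:
            trns.append(entry[3])     # the alpha value goes into tRNS
    return bytes(plte), (bytes(trns) if trns else None)
```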

If specified, the transparent and background parameters must be a tuple with three integer values for red, green, blue, or a simple integer (or singleton tuple) for a greyscale image.

If specified, the gamma parameter must be a positive number (generally, a float). A gAMA chunk will be created. Note that this will not change the values of the pixels as they appear in the PNG file; they are assumed to have already been converted appropriately for the gamma specified.

The compression argument specifies the compression level to be used by the zlib module. Higher values are likely to compress better, but will be slower to compress. The default for this argument is None; this does not mean no compression, rather it means that the default from the zlib module is used (which is generally acceptable).

If interlace is true then an interlaced image is created (using Adam7, so far PNG’s only interlace method). This does not affect how the pixels should be presented to the encoder, rather it changes how they are arranged into the PNG file. On slow connexions interlaced images can be partially decoded by the browser to give a rough view of the image that is successively refined as more image data appears.

Note

Enabling the interlace option requires the entire image to be processed in working memory.
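
To show why interlacing needs the whole image at once, here is a sketch of how Adam7 rearranges pixels into seven reduced images. The start and step values come from the PNG specification; the generator adam7_scanlines and its parameters are invented for this example, and it takes the source in flat row flat pixel format. The very first pass already reaches down the full height of the image, so no row can be emitted until all rows are available.

```python
# (xstart, ystart, xstep, ystep) for the seven Adam7 passes.
ADAM7 = ((0, 0, 8, 8), (4, 0, 8, 8), (0, 4, 4, 8), (2, 0, 4, 4),
         (0, 2, 2, 4), (1, 0, 2, 2), (0, 1, 1, 2))


def adam7_scanlines(pixels, width, height, planes):
    """Yield the rows of each reduced image, pass by pass."""
    for xstart, ystart, xstep, ystep in ADAM7:
        if xstart >= width:
            continue   # this pass holds no pixels for so narrow an image
        for y in range(ystart, height, ystep):
            row = []
            for x in range(xstart, width, xstep):
                offset = (y * width + x) * planes
                row.extend(pixels[offset:offset + planes])
            yield row


# A 4x2 greyscale image whose pixel values are simply 0..7.
passes = list(adam7_scanlines(list(range(8)), width=4, height=2, planes=1))
```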

chunk_limit is used to limit the amount of memory used whilst compressing the image. In order to avoid using large amounts of memory, multiple IDAT chunks may be created.
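
One way to picture the effect of chunk_limit is a compressor that is fed a bounded amount of data at a time, with each non-empty result becoming its own IDAT chunk. The sketch below uses zlib.compressobj for this. It is an assumption that the limit is measured on the uncompressed data gathered so far; compressed_blocks is a made-up name, and the rows are assumed to be already serialised (filter byte included).

```python
import zlib


def compressed_blocks(rows, chunk_limit=2 ** 20, level=None):
    """Yield compressed byte strings, one per would-be IDAT chunk."""
    compressor = zlib.compressobj() if level is None else zlib.compressobj(level)
    pending = bytearray()
    for row in rows:
        pending.extend(row)
        if len(pending) > chunk_limit:
            block = compressor.compress(bytes(pending))
            pending.clear()
            if block:                  # zlib may buffer and return nothing yet
                yield block
    tail = compressor.compress(bytes(pending)) + compressor.flush()
    if tail:
        yield tail


rows = [bytes([i % 256]) * 100 for i in range(50)]
blocks = list(compressed_blocks(rows, chunk_limit=1000))
```

Concatenating the blocks gives a single valid zlib stream, which is what a PNG decoder sees after joining the data of consecutive IDAT chunks.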

array_scanlines(pixels)
Generates boxed rows (flat pixels) from flat rows (flat pixels) in an array.
array_scanlines_interlace(pixels)
Generator for interlaced scanlines from an array. pixels is the full source image in flat row flat pixel format. The generator yields each scanline of the reduced passes in turn, in boxed row flat pixel format.
convert_pnm(infile, outfile)
Convert a PNM file containing raw pixel data into a PNG file with the parameters set in the writer object. Works for (binary) PGM, PPM, and PAM formats.
convert_ppm_and_pgm(ppmfile, pgmfile, outfile)
Convert a PPM and PGM file containing raw pixel data into a PNG outfile with the parameters set in the writer object.
file_scanlines(infile)
Generates boxed rows in flat pixel format, from the input file infile. It assumes that the input file is in a “Netpbm-like” binary format, and is positioned at the beginning of the first pixel. The number of pixels to read is taken from the image dimensions (width, height, planes) and the number of bytes per value is implied by the image bitdepth.
make_palette()
Create the byte sequences for a PLTE and if necessary a tRNS chunk. Returned as a pair (p, t). t will be None if no tRNS chunk is necessary.
write(outfile, rows)
Write a PNG image to the output file. rows should be an iterable that yields each row in boxed row flat pixel format. The rows should be the rows of the original image, so there should be self.height rows of self.width * self.planes values. If interlace is specified (when creating the instance), then an interlaced PNG file will be written. Supply the rows in the normal image order; the interlacing is carried out internally. Interlacing will require the entire image to be in working memory.
write_array(outfile, pixels)
Write an array in flat row flat pixel format as a PNG file on the output file.
write_chunk(outfile, tag, data='')
Write a PNG chunk to the output file, including length and checksum.
write_passes(outfile, rows)
Write a PNG image to the output file. The rows should be given to this method in the order that they appear in the output file. For straightlaced images, this is the usual top to bottom ordering, but for interlaced images the rows should have already been interlaced before passing them to this function. Most users are expected to find the write() or write_array() method more convenient. rows should be an iterable that yields each row in boxed row flat pixel format.
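
To tie write_chunk() and write_passes() together, the sketch below writes a complete, if minimal, PNG file using only the standard library. It is not the module's code: it handles only 8-bit greyscale, non-interlaced images, always uses filter type 0, and puts all the data in a single IDAT chunk. The function write_grey8 is invented for the example, and in Python 3 the tag and data arguments are byte strings.

```python
import io
import struct
import zlib


def write_chunk(outfile, tag, data=b""):
    """Write one chunk: length, type, data, then CRC-32 of type + data."""
    outfile.write(struct.pack(">I", len(data)))
    outfile.write(tag)
    outfile.write(data)
    outfile.write(struct.pack(">I", zlib.crc32(tag + data) & 0xFFFFFFFF))


def write_grey8(outfile, width, height, rows):
    """Write 8-bit greyscale rows as a straightlaced PNG."""
    outfile.write(b"\x89PNG\r\n\x1a\n")
    # IHDR: width, height, bit depth 8, colour type 0 (greyscale),
    # compression 0, filter method 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    write_chunk(outfile, b"IHDR", ihdr)
    # Each scanline is prefixed with its filter type byte (0 = None).
    raw = b"".join(b"\x00" + bytes(row) for row in rows)
    write_chunk(outfile, b"IDAT", zlib.compress(raw))
    write_chunk(outfile, b"IEND")


buf = io.BytesIO()
write_grey8(buf, 3, 2, [[0, 128, 255], [255, 128, 0]])
png_bytes = buf.getvalue()
```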