For example, each pixel of a picture file could consist of three 32-bit fields. Knowing that each field is 32 bits wide is up to you. A header at the beginning of the file may provide clues about interpreting the file, but even so, it’s up to you to know how to interact with the file.
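To make the idea concrete, here is a minimal sketch of imposing a pixel layout on raw bytes. The 2-by-2 size, the little-endian unsigned encoding, and the variable names are assumptions chosen for illustration; a real file format defines its own layout.

```python
import numpy as np

# Hypothetical raw payload: 2 x 2 pixels, each pixel made of three
# 32-bit unsigned integers (little-endian). This layout is an assumption
# for illustration; a real format documents its own.
width, height = 2, 2
raw = np.arange(width * height * 3, dtype="<u4").tobytes()

# The bytes carry no structure of their own; the reader supplies it.
pixels = np.frombuffer(raw, dtype="<u4").reshape(height, width, 3)

print(len(raw))       # number of bytes in the payload
print(pixels.shape)   # (rows, columns, fields per pixel)
print(pixels[1, 0])   # the three fields of the pixel at row 1, column 0
```

Reading the same bytes with a different dtype or shape would produce different, equally "valid" numbers, which is why the layout has to come from outside the data.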
This example shows how to work with a picture as an unstructured file. The example image is a public domain offering from commons.wikimedia.org. To work with images, you need to access the scikit-image library, which is a free-of-charge collection of algorithms used for image processing. Here’s a tutorial for this library.
The first task is to display the image on-screen using the following code. (This code can require a little time to run. The image is ready when the busy indicator disappears from the IPython Notebook tab.)
from skimage.io import imread
from skimage.transform import resize
from matplotlib import pyplot as plt
import matplotlib.cm as cm
example_file = ("http://upload.wikimedia.org/" +
"wikipedia/commons/7/7d/Dog_face.png")
image = imread(example_file, as_grey=True)
plt.imshow(image, cmap=cm.gray)
plt.show()
The code begins by importing a number of libraries. It then creates a string that points to the example file online and places it in example_file. This string is passed to the imread() function, along with as_grey, which is set to True. The as_grey argument tells imread() to turn color images into grayscale. Any images that are already in grayscale remain that way. (Newer scikit-image releases spell this argument as_gray, so use that spelling if as_grey is rejected.)
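As a rough illustration of what a grayscale conversion involves, the sketch below collapses the three color channels into one using a weighted sum. The weights are an assumption here (they are the luminance weights I believe scikit-image documents for rgb2gray), and to_grayscale is a hypothetical helper, not part of the library.

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an (H, W, 3) RGB array into an (H, W) grayscale array."""
    # Assumed luminance weights for red, green, and blue.
    weights = np.array([0.2125, 0.7154, 0.0721])
    return rgb @ weights

# A tiny synthetic image with values in the 0.0 to 1.0 range.
rgb = np.zeros((2, 2, 3))
rgb[0, 0] = [1.0, 1.0, 1.0]   # white
rgb[0, 1] = [1.0, 0.0, 0.0]   # pure red

gray = to_grayscale(rgb)
print(gray.shape)   # the color axis is gone
print(gray)
```

The result has one value per pixel instead of three, which is why the loaded image in this example is a two-dimensional array.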
Now that you have an image loaded, it’s time to render it (make it ready to display on-screen). The imshow() function performs the rendering and uses a grayscale color map. The show() function actually displays image for you.
Close the image when you’re finished viewing it. (The asterisk in the In [*]: entry tells you that the code is still running and you can’t move on to the next step.) The act of closing the image ends the code segment. You now have an image in memory, and you may want to find out more about it. When you run the following code, you discover the image type and size:
print("data type: %s, shape: %s" %
(type(image), image.shape))
The output from this call tells you that the image type is a numpy.ndarray and that the image size is 90 pixels by 90 pixels. The image is actually an array of pixels that you can manipulate in various ways. For example, if you want to crop the image, you can use the following code to manipulate the image array:
image2 = image[5:70,0:70]
plt.imshow(image2, cmap=cm.gray)
plt.show()
The numpy.ndarray in image2 is smaller than the one in image, so the output is smaller as well. Typical results are shown below. The purpose of cropping the image is to make it a specific size. Images must share the same dimensions before you can analyze them together, and cropping is one way to ensure that they do.
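Because cropping is plain array slicing, the resulting shape follows directly from the slice bounds. The sketch below uses a synthetic 90-by-90 array in place of the downloaded picture and adds a hypothetical center_crop() helper (not part of NumPy or scikit-image) for cropping to an exact target size.

```python
import numpy as np

def center_crop(image, height, width):
    """Return the central height x width window of a 2-D array."""
    rows, cols = image.shape
    if height > rows or width > cols:
        raise ValueError("crop window is larger than the image")
    top = (rows - height) // 2
    left = (cols - width) // 2
    return image[top:top + height, left:left + width]

# Synthetic stand-in for the 90 x 90 example image.
image = np.arange(90 * 90, dtype=float).reshape(90, 90)

manual = image[5:70, 0:70]           # rows 5..69, columns 0..69
square = center_crop(image, 70, 70)  # exact 70 x 70 window

print(manual.shape)
print(square.shape)
```

Note that a slice such as [5:70, 0:70] keeps 65 rows and 70 columns, so the result is not square. When every image has to match a fixed size, computing the bounds from the target size, as the helper does, avoids that kind of mismatch.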
Another method that you can use to change the image size is to resize it. The following code resizes the image to a specific size for analysis:
image3 = resize(image2, (30, 30), mode='nearest')
plt.imshow(image3, cmap=cm.gray)
print("data type: %s, shape: %s" %
(type(image3), image3.shape))
The output from the print() function tells you that the image is now 30 pixels by 30 pixels in size. You can compare it to any image with the same dimensions.
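To show what resizing does to the array, here is a NumPy-only sketch that shrinks an image by picking evenly spaced rows and columns. This is a simple index-sampling stand-in, not the method scikit-image's resize() uses (which interpolates), and resize_nearest() is a hypothetical helper name.

```python
import numpy as np

def resize_nearest(image, new_rows, new_cols):
    """Resample a 2-D array by picking evenly spaced source pixels."""
    rows, cols = image.shape
    # Map each output index back to a source index with integer math.
    row_idx = np.arange(new_rows) * rows // new_rows
    col_idx = np.arange(new_cols) * cols // new_cols
    return image[np.ix_(row_idx, col_idx)]

# Synthetic stand-in for the cropped 65 x 70 image.
image2 = np.arange(65 * 70, dtype=float).reshape(65, 70)

image3 = resize_nearest(image2, 30, 30)
print(image3.shape)
```

Unlike cropping, this keeps the whole scene but discards detail, and a non-square source such as this one is squeezed by different amounts along each axis.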
After all the images are the right size, you need to flatten them. A data set row is always a single dimension, not two dimensions. The image is currently an array of 30 pixels by 30 pixels, so you can’t make it part of a data set. The following code flattens image3 so that it becomes an array of 900 elements that is stored in image_row:
image_row = image3.flatten()
print("data type: %s, shape: %s" %
(type(image_row), image_row.shape))
Notice that the type is still a numpy.ndarray. You can add this array to a data set and then use the data set for analysis purposes. The size is 900 elements, as anticipated.
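As a sketch of the next step, the code below flattens several same-sized images and stacks them into a two-dimensional data set with one row per image. The random arrays stand in for real pictures, and the names images, rows, and dataset are illustrative assumptions.

```python
import numpy as np

# Three synthetic 30 x 30 "images" standing in for real pictures.
rng = np.random.default_rng(0)
images = [rng.random((30, 30)) for _ in range(3)]

# Flatten each image to 900 values, then stack the rows.
rows = [img.flatten() for img in images]
dataset = np.vstack(rows)
print(dataset.shape)   # one row per image, one column per pixel

# Flattening loses no information: reshape recovers the original.
restored = dataset[0].reshape(30, 30)
print(np.array_equal(restored, images[0]))
```

The stacking only works because every image has the same dimensions, which is the reason for the cropping and resizing steps earlier.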