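In this tutorial you will learn 18 image preprocessing transforms that ship with PyTorch (in torchvision), and see how to write a custom transform that is compatible with them. Note that all of these operate on the image only and do not touch its labels. That makes them unsuitable for tasks such as object detection: when a transform changes where things sit in the image, the detection labels have to be adjusted to match. Transforms of that kind are covered in "PyTorch: Common Image Preprocessing (2)".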
# Download an image used to visualize the results throughout this tutorial
! wget http://www.ruanyifeng.com/blogimg/asset/2017/bg2017121301.jpg
import matplotlib.pyplot as plt
import numpy as np
import torchvision.transforms as T
from PIL import Image

img = Image.open('bg2017121301.jpg')
plt.imshow(np.array(img))
# Center-crop the given PIL.Image to the given size. size can be a tuple (H, W)
# or an int; with an int, the crop is a square.
transform = T.CenterCrop(160)
result = transform(img)
plt.imshow(np.array(result))
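# Randomly change brightness, contrast, saturation, and hue. For the first three,
# the factor is drawn from [max(0, 1 - value), 1 + value] or from a given [min, max];
# for hue, it is drawn from [-value, value] or from a given [min, max].
transform = T.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3, hue=0.4)
result = transform(img)
plt.imshow(np.array(result))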
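To make the ranges in the comment above concrete, here is a minimal pure-Python sketch that turns a jitter value into the interval a factor is drawn from. The helper name `jitter_range` and its `center` and `lower_bound` parameters are made up for this illustration; they are not part of torchvision.

```python
import random


def jitter_range(value, center=1.0, lower_bound=0.0):
    """Return the (low, high) interval a jitter factor is sampled from.

    Hypothetical helper for illustration only. A single number means
    [center - value, center + value]; a pair is taken as an explicit
    [min, max]. The lower end is clipped at lower_bound.
    """
    if isinstance(value, (tuple, list)):
        low, high = value
    else:
        low, high = center - value, center + value
    return max(lower_bound, low), high


# Brightness, contrast, and saturation are centered on 1 and clipped at 0.
brightness_range = jitter_range(0.3)
# Hue is centered on 0, so the interval is symmetric around it.
hue_range = jitter_range(0.4, center=0.0, lower_bound=-0.5)

# One random draw from each interval, as a transform would do per call.
brightness_factor = random.uniform(*brightness_range)
hue_shift = random.uniform(*hue_range)
```

With the values used in this tutorial, brightness is scaled by a factor of roughly 0.7 to 1.3, while the hue is shifted by up to 0.4 in either direction.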
# Crop the given PIL.Image into 5 patches (the 4 corners plus the center)
transform = T.FiveCrop(112)
results = transform(img)
for result in results:
    plt.figure()
    plt.imshow(np.array(result))
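The five patches can be described by their crop boxes. The sketch below computes them for a square crop, using the `(left, top, right, bottom)` convention of PIL's `Image.crop`. The function `five_crop_boxes` and the 400x300 image size are assumptions for illustration, and the exact rounding of the center box in torchvision may differ by a pixel.

```python
def five_crop_boxes(width, height, size):
    """Return (left, top, right, bottom) boxes for 4 corners plus the center.

    Hypothetical helper; assumes a square crop of side `size` that fits
    inside the image.
    """
    if size > width or size > height:
        raise ValueError("crop size must not exceed the image size")
    left = (width - size) // 2
    top = (height - size) // 2
    return {
        "top_left": (0, 0, size, size),
        "top_right": (width - size, 0, width, size),
        "bottom_left": (0, height - size, size, height),
        "bottom_right": (width - size, height - size, width, height),
        "center": (left, top, left + size, top + size),
    }


# Assumed image size of 400x300 pixels, with the crop size used above.
boxes = five_crop_boxes(400, 300, 112)
```

Each box could be passed to `img.crop(box)` to reproduce one patch by hand.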
# Convert the image to grayscale. num_output_channels is 1 or 3;
# with 3, the R, G, and B channels hold identical values.
transform = T.Grayscale(num_output_channels=1)
result = transform(img)
plt.imshow(np.array(result), cmap='gray')
# Pad the four borders of the image. padding can be an int or a tuple;
# fill can be an int or an (R, G, B) tuple.
transform = T.Pad(padding=(2, 4, 6, 8), fill=(255, 255, 255), padding_mode='constant')
result = transform(img)
plt.imshow(np.array(result))
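How the padding argument maps to the four borders is easy to get wrong, so here is a small sketch of the resulting image size. A 4-tuple is read as (left, top, right, bottom), a 2-tuple as (left/right, top/bottom), and an int applies to all sides. The helper `padded_size` and the 100x50 input size are hypothetical.

```python
def padded_size(width, height, padding):
    """Return (new_width, new_height) after padding.

    Hypothetical helper for illustration; it only does the size arithmetic.
    """
    if isinstance(padding, int):
        left = top = right = bottom = padding
    elif len(padding) == 2:
        left = right = padding[0]
        top = bottom = padding[1]
    elif len(padding) == 4:
        left, top, right, bottom = padding
    else:
        raise ValueError("padding must be an int, a 2-tuple, or a 4-tuple")
    return width + left + right, height + top + bottom


# padding=(2, 4, 6, 8) adds 2 + 6 pixels of width and 4 + 8 pixels of height.
new_size = padded_size(100, 50, (2, 4, 6, 8))  # (108, 62)
```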
# Random affine transform: rotate by an angle in [-20, 20] degrees and
# shear along the x axis by an angle in [-0.3, 0.3] degrees
transform = T.RandomAffine(degrees=20, translate=None, scale=None, shear=0.3)
result = transform(img)
plt.imshow(np.array(result))
# With probability p, apply the whole list of transforms in order;
# otherwise return the image unchanged
transforms = [
    T.CenterCrop(160),
    T.Pad(padding=(2, 4, 6, 8), fill=(255, 255, 255), padding_mode='constant')
]
transform = T.RandomApply(transforms, p=0.5)
result = transform(img)
plt.imshow(np.array(result))
# Pick one transform from the list at random and apply it
transforms = [
    T.CenterCrop(160),
    T.Pad(padding=(2, 4, 6, 8), fill=(255, 255, 255), padding_mode='constant')
]
transform = T.RandomChoice(transforms)
result = transform(img)
plt.imshow(np.array(result))
# Random crop; size is given as (H, W)
transform = T.RandomCrop((100, 200), padding=None, pad_if_needed=False, fill=0, padding_mode='constant')
result = transform(img)
plt.imshow(np.array(result))
# Convert to grayscale with probability p
transform = T.RandomGrayscale(p=0.8)
result = transform(img)
plt.imshow(np.array(result))
# Flip the image horizontally with probability p
transform = T.RandomHorizontalFlip(p=0.9)
result = transform(img)
plt.imshow(np.array(result))
# Apply all transforms in the list, in a random order
transforms = [
    T.CenterCrop(160),
    T.Pad(padding=(2, 4, 6, 8), fill=(255, 255, 255), padding_mode='constant')
]
transform = T.RandomOrder(transforms)
result = transform(img)
plt.imshow(np.array(result))
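RandomApply, RandomChoice, and RandomOrder are easy to mix up, so the following pure-Python sketch spells out how they differ. It uses plain functions on integers instead of image transforms; the names `random_apply`, `random_choice`, `random_order`, `add_one`, and `double` are made up for this illustration and are not torchvision code.

```python
import random


def random_apply(ops, x, p=0.5):
    """With probability p run every op in order; otherwise return x as is."""
    if random.random() >= p:
        return x
    for op in ops:
        x = op(x)
    return x


def random_choice(ops, x):
    """Run exactly one op, picked uniformly at random."""
    return random.choice(ops)(x)


def random_order(ops, x):
    """Run every op once, in a randomly shuffled order."""
    shuffled = list(ops)
    random.shuffle(shuffled)
    for op in shuffled:
        x = op(x)
    return x


def add_one(v):
    return v + 1


def double(v):
    return v * 2


ops = [add_one, double]
applied = random_apply(ops, 3)   # 3 (skipped) or 8 (both ops, in list order)
chosen = random_choice(ops, 3)   # 4 (add_one) or 6 (double)
ordered = random_order(ops, 3)   # 8 (add_one first) or 7 (double first)
```

In short: RandomApply is all or nothing, RandomChoice picks one, and RandomOrder always runs everything but shuffles the sequence.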
# Random perspective transform, applied with probability p;
# distortion_scale lies in [0, 1]
transform = T.RandomPerspective(distortion_scale=0.5, p=0.5, interpolation=3)
result = transform(img)
plt.imshow(np.array(result))
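# Randomly crop the given PIL.Image, then resize the crop to the given size.
# scale bounds the crop area as a fraction of the original image;
# ratio bounds the crop's aspect ratio.
transform = T.RandomResizedCrop(256, scale=(0.08, 1.0), ratio=(0.75, 1.33), interpolation=2)
result = transform(img)
plt.imshow(np.array(result))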
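To show what the scale and ratio arguments control, here is a rough sketch of how a crop size could be sampled before the resize step. It is a simplified illustration under stated assumptions, not torchvision's code: the helper `sample_crop_size`, the log-uniform draw of the aspect ratio, the retry count of 10, the whole-image fallback, and the 400x300 image size are all choices made for this example.

```python
import math
import random


def sample_crop_size(width, height, scale=(0.08, 1.0), ratio=(0.75, 1.33), attempts=10):
    """Sample a (crop_width, crop_height) that fits inside the image.

    Hypothetical helper. The area is a random fraction of the image area
    within `scale`; the aspect ratio (width / height) is drawn within `ratio`.
    """
    area = width * height
    log_low, log_high = math.log(ratio[0]), math.log(ratio[1])
    for _ in range(attempts):
        target_area = area * random.uniform(scale[0], scale[1])
        aspect = math.exp(random.uniform(log_low, log_high))
        crop_w = int(round(math.sqrt(target_area * aspect)))
        crop_h = int(round(math.sqrt(target_area / aspect)))
        if 0 < crop_w <= width and 0 < crop_h <= height:
            return crop_w, crop_h
    # Simplified fallback for this sketch: use the whole image.
    return width, height


# Assumed image size of 400x300 pixels.
crop_w, crop_h = sample_crop_size(400, 300)
```

Whatever size is sampled, the crop is then resized to the requested output (256x256 in the cell above), which is why small crops come out looking zoomed in.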
# Random rotation; the angle is drawn from [-degrees, degrees]
transform = T.RandomRotation(degrees=20, expand=False, center=None, fill=0)
result = transform(img)
plt.imshow(np.array(result))
# Flip the image vertically with probability p
transform = T.RandomVerticalFlip(p=0.9)
result = transform(img)
plt.imshow(np.array(result))
# Resize the image; size is given as (H, W)
transform = T.Resize((200, 300), interpolation=2)
result = transform(img)
plt.imshow(np.array(result))
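Resize also accepts a single int, in which case the shorter edge is scaled to that length and the aspect ratio is kept. The sketch below works out the output size for both forms. The helper `resized_size` and the 400x300 input are assumptions for illustration, and the truncation to int is a simplification.

```python
def resized_size(width, height, size):
    """Return the (width, height) of the output for a given size argument.

    Hypothetical helper. An int matches the shorter edge to `size`;
    a pair is read as (H, W), the order the transform expects.
    """
    if isinstance(size, int):
        if width <= height:
            return size, int(size * height / width)
        return int(size * width / height), size
    target_h, target_w = size
    return target_w, target_h


# Assumed input of 400x300 pixels (width x height).
keep_aspect = resized_size(400, 300, 150)       # shorter edge 300 -> 150, so (200, 150)
explicit = resized_size(400, 300, (200, 300))   # (H, W) = (200, 300), so (300, 200)
```

Note that PIL reports sizes as (width, height), while the transform takes (H, W), so the two orders are easy to swap by accident.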
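# Random erasing; the image must be converted to a tensor first
transform = T.Compose([
    T.ToTensor(),
    T.RandomErasing(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0, inplace=False)
])
result = transform(img)
# ToTensor returns a (C, H, W) tensor; matplotlib expects (H, W, C)
plt.imshow(result.permute(1, 2, 0).numpy())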
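Next we write a GaussianMask to show how to define a custom transform that is compatible with PyTorch's built-in ones. It generates a Gaussian mask centered on the image and multiplies the image by it element-wise, so pixels darken toward the edges.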
import random

class GaussianMask(object):
    def __init__(self, p=0.5, scale=1):
        self.probability = p  # probability of applying the mask
        self.scale = scale    # larger values give a flatter, wider mask

    def __call__(self, img):
        # Leave the image unchanged with probability 1 - p
        if random.uniform(0, 1) >= self.probability:
            return img
        width, height = img.size
        # Horizontal offset of every pixel from the image center
        mask_h = np.zeros((height, width))
        mask_h += np.arange(0, width) - width / 2
        # Vertical offset of every pixel from the image center
        mask_v = np.zeros((width, height))
        mask_v += np.arange(0, height) - height / 2
        mask_v = mask_v.T
        numerator = np.power(mask_h, 2) + np.power(mask_v, 2)
        denominator = 2 * (height * height + width * width) * self.scale
        mask = np.exp(-(numerator / denominator))
        # Multiply each channel by the mask (assumes a 3-channel RGB image)
        img = np.asarray(img)
        new_img = np.zeros_like(img)
        new_img[:, :, 0] = np.multiply(mask, img[:, :, 0])
        new_img[:, :, 1] = np.multiply(mask, img[:, :, 1])
        new_img[:, :, 2] = np.multiply(mask, img[:, :, 2])
        return Image.fromarray(new_img)
# Apply the custom Gaussian mask with probability 0.9
transform = GaussianMask(p=0.9, scale=0.1)
result = transform(img)
plt.imshow(np.array(result))
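To get a feel for what scale does in GaussianMask, the mask value at a single pixel can be computed by hand with the same numerator and denominator the class uses. The standalone function `gaussian_mask_value` and the 400x300 image size below are assumptions made for this illustration.

```python
import math


def gaussian_mask_value(x, y, width, height, scale=1.0):
    """Mask value at pixel (x, y), using the same formula as GaussianMask.

    Hypothetical helper for illustration only.
    """
    dx = x - width / 2
    dy = y - height / 2
    numerator = dx ** 2 + dy ** 2
    denominator = 2 * (height ** 2 + width ** 2) * scale
    return math.exp(-numerator / denominator)


# Assumed image size of 400x300 pixels, with scale=0.1 as in the cell above.
center = gaussian_mask_value(200, 150, 400, 300, scale=0.1)  # exp(0) = 1.0
corner = gaussian_mask_value(0, 0, 400, 300, scale=0.1)      # exp(-1 / (8 * 0.1))
```

At the corner the squared distance is (W^2 + H^2) / 4, so the value reduces to exp(-1 / (8 * scale)) regardless of the image size: about 0.29 for scale=0.1 and about 0.88 for scale=1. A smaller scale therefore gives a stronger vignette.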