Augmentations (albumentations.augmentations)¶

Transforms¶

class albumentations.augmentations.transforms.Blur(blur_limit=7, always_apply=False, p=0.5)[source]¶

Blur the input image using a random-sized kernel.

Parameters:	blur_limit (int) – maximum kernel size for blurring the input image. Default: 7. p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8, float32

class albumentations.augmentations.transforms.VerticalFlip(always_apply=False, p=0.5)[source]¶

Flip the input vertically around the x-axis.

Parameters:	p (float) – probability of applying the transform. Default: 0.5.

Targets:: image, mask, bboxes
Image types:: uint8, float32

class albumentations.augmentations.transforms.HorizontalFlip(always_apply=False, p=0.5)[source]¶

Flip the input horizontally around the y-axis.

Parameters:	p (float) – probability of applying the transform. Default: 0.5.

Targets:: image, mask, bboxes
Image types:: uint8, float32

class albumentations.augmentations.transforms.Flip(always_apply=False, p=0.5)[source]¶

Flip the input either horizontally, vertically or both horizontally and vertically.

Parameters:	p (float) – probability of applying the transform. Default: 0.5.

Targets:: image, mask, bboxes
Image types:: uint8, float32

apply(img, d=0, **params)[source]¶

Parameters:	d (int) – code that specifies how to flip the input: 0 for vertical flipping, 1 for horizontal flipping, -1 for both vertical and horizontal flipping (which can also be seen as rotating the input by 180 degrees).
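The flip codes can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: `flip_sketch` is a hypothetical helper, and a nested list stands in for the image array.

```python
def flip_sketch(img, d=0):
    """Flip a 2-D image given as a list of rows (hypothetical helper).

    d = 0  -> vertical flip (reverse the row order)
    d = 1  -> horizontal flip (reverse each row)
    d = -1 -> both, equivalent to a 180-degree rotation
    """
    if d == 0:
        return [row[:] for row in reversed(img)]
    if d == 1:
        return [row[::-1] for row in img]
    if d == -1:
        return [row[::-1] for row in reversed(img)]
    raise ValueError("d must be -1, 0 or 1, got {!r}".format(d))


img = [[1, 2, 3],
       [4, 5, 6]]

print(flip_sketch(img, 0))   # vertical flip
print(flip_sketch(img, 1))   # horizontal flip
print(flip_sketch(img, -1))  # both
```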

class albumentations.augmentations.transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, always_apply=False, p=1.0)[source]¶

Divide pixel values by 255 = 2**8 - 1, subtract mean per channel and divide by std per channel.

Parameters:	mean (float, float, float) – mean values. std (float, float, float) – std values. max_pixel_value (float) – maximum possible pixel value.

Targets:: image
Image types:: uint8, float32
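The arithmetic described for Normalize can be sketched for a single RGB pixel as follows. This is an illustrative sketch, not the library's code: `normalize_pixel` is a hypothetical name, and dividing by `max_pixel_value` rather than a hard-coded 255 is an assumption based on the constructor signature.

```python
def normalize_pixel(pixel, mean=(0.485, 0.456, 0.406),
                    std=(0.229, 0.224, 0.225), max_pixel_value=255.0):
    """Scale each channel to [0, 1], subtract its mean, divide by its std."""
    return tuple(
        (value / max_pixel_value - m) / s
        for value, m, s in zip(pixel, mean, std)
    )


white = normalize_pixel((255, 255, 255))
print(white)
```

A pixel whose scaled channels equal the per-channel means maps to approximately zero in every channel.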

class albumentations.augmentations.transforms.Transpose(always_apply=False, p=0.5)[source]¶

Transpose the input by swapping rows and columns.

Parameters:	p (float) – probability of applying the transform. Default: 0.5.

Targets:: image, mask, bboxes
Image types:: uint8, float32

class albumentations.augmentations.transforms.RandomCrop(height, width, always_apply=False, p=1.0)[source]¶

Crop a random part of the input.

Parameters:	height (int) – height of the crop. width (int) – width of the crop. p (float) – probability of applying the transform. Default: 1.

Targets:: image, mask, bboxes
Image types:: uint8, float32

class albumentations.augmentations.transforms.RandomGamma(gamma_limit=(80, 120), always_apply=False, p=0.5)[source]¶

Targets:: image
Image types:: uint8, float32

class albumentations.augmentations.transforms.RandomRotate90(always_apply=False, p=0.5)[source]¶

Randomly rotate the input by 90 degrees zero or more times.

Parameters:	p (float) – probability of applying the transform. Default: 0.5.

Targets:: image, mask, bboxes
Image types:: uint8, float32

apply(img, factor=0, **params)[source]¶

Parameters:	factor (int) – number of times the input will be rotated by 90 degrees.
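A minimal sketch of what a rotation by `factor` quarter turns looks like on a nested list. It assumes counter-clockwise rotation, following the np.rot90 convention that this page mentions for bbox_rot90; `rot90_sketch` is a hypothetical helper, not the library's function.

```python
def rot90_sketch(img, factor=0):
    """Rotate a 2-D list counter-clockwise by 90 degrees, `factor` times."""
    for _ in range(factor % 4):
        # Transpose, then reverse the row order: one CCW quarter turn.
        img = [list(col) for col in zip(*img)][::-1]
    return img


img = [[1, 2],
       [3, 4]]

print(rot90_sketch(img, 1))
print(rot90_sketch(img, 2))
```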

class albumentations.augmentations.transforms.Rotate(limit=90, interpolation=1, border_mode=4, always_apply=False, p=0.5)[source]¶

Rotate the input by an angle selected randomly from the uniform distribution.

Parameters:

limit ((int, int) or int) – range from which a random angle is picked. If limit is a single int an angle is picked from (-limit, limit). Default: 90
interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
border_mode (OpenCV flag) – flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101
p (float) – probability of applying the transform. Default: 0.5.

Targets:: image, mask, bboxes
Image types:: uint8, float32
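Several transforms on this page accept either a single number or a pair as a limit. The sketch below shows one way the `limit` argument of Rotate could be expanded into a range and sampled; `to_range` and `sample_angle` are hypothetical names used only for illustration.

```python
import random


def to_range(limit):
    """Expand a scalar limit into (-limit, limit); pass a pair through unchanged."""
    if isinstance(limit, (int, float)):
        return (-limit, limit)
    low, high = limit
    return (low, high)


def sample_angle(limit=90, rng=random):
    """Draw an angle from the uniform distribution over the limit range."""
    low, high = to_range(limit)
    return rng.uniform(low, high)


print(to_range(90))
print(to_range((10, 30)))
print(sample_angle(45))
```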

class albumentations.augmentations.transforms.ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.1, rotate_limit=45, interpolation=1, border_mode=4, always_apply=False, p=0.5)[source]¶

Randomly apply affine transforms: translate, scale and rotate the input.

Parameters:

shift_limit ((float, float) or float) – shift factor range for both height and width. If shift_limit is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and upper bounds should lie in range [0, 1]. Default: 0.0625.
scale_limit ((float, float) or float) – scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Default: 0.1.
rotate_limit ((int, int) or int) – rotation range. If rotate_limit is a single int value, the range will be (-rotate_limit, rotate_limit). Default: 45.
interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
border_mode (OpenCV flag) – flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101
p (float) – probability of applying the transform. Default: 0.5.

Targets:: image, mask
Image types:: uint8, float32

class albumentations.augmentations.transforms.CenterCrop(height, width, always_apply=False, p=1.0)[source]¶

Crop the central part of the input.

Parameters:	height (int) – height of the crop. width (int) – width of the crop. p (float) – probability of applying the transform. Default: 1.

Targets:: image, mask, bboxes
Image types:: uint8, float32

Note

It is recommended to use uint8 images as input. Otherwise the operation will require internal conversion float32 -> uint8 -> float32 that causes worse performance.
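The central crop described for CenterCrop comes down to simple index arithmetic. The following is a sketch on a nested list, with `center_crop_sketch` as a hypothetical helper; how the library handles odd margins is not stated here, so flooring the offsets is an assumption.

```python
def center_crop_sketch(img, height, width):
    """Return the central height x width window of a 2-D list."""
    rows, cols = len(img), len(img[0])
    if height > rows or width > cols:
        raise ValueError("crop size must not exceed the image size")
    # Assumption: odd margins are split by flooring the top/left offset.
    y1 = (rows - height) // 2
    x1 = (cols - width) // 2
    return [row[x1:x1 + width] for row in img[y1:y1 + height]]


img = [[0, 1, 2, 3],
       [4, 5, 6, 7],
       [8, 9, 10, 11],
       [12, 13, 14, 15]]

print(center_crop_sketch(img, 2, 2))
```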

class albumentations.augmentations.transforms.OpticalDistortion(distort_limit=0.05, shift_limit=0.05, interpolation=1, border_mode=4, always_apply=False, p=0.5)[source]¶

Targets:: image, mask
Image types:: uint8, float32

class albumentations.augmentations.transforms.GridDistortion(num_steps=5, distort_limit=0.3, interpolation=1, border_mode=4, always_apply=False, p=0.5)[source]¶

Targets:: image, mask
Image types:: uint8, float32

class albumentations.augmentations.transforms.ElasticTransform(alpha=1, sigma=50, alpha_affine=50, interpolation=1, border_mode=4, always_apply=False, p=0.5)[source]¶

Targets:: image, mask
Image types:: uint8, float32

class albumentations.augmentations.transforms.HueSaturationValue(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, always_apply=False, p=0.5)[source]¶

Randomly change hue, saturation and value of the input image.

Parameters:

hue_shift_limit ((int, int) or int) – range for changing hue. If hue_shift_limit is a single int, the range will be (-hue_shift_limit, hue_shift_limit). Default: 20.
sat_shift_limit ((int, int) or int) – range for changing saturation. If sat_shift_limit is a single int, the range will be (-sat_shift_limit, sat_shift_limit). Default: 30.
val_shift_limit ((int, int) or int) – range for changing value. If val_shift_limit is a single int, the range will be (-val_shift_limit, val_shift_limit). Default: 20.
p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8, float32

class albumentations.augmentations.transforms.PadIfNeeded(min_height=1024, min_width=1024, border_mode=4, value=[0, 0, 0], always_apply=False, p=1.0)[source]¶

Pad the sides of the image if a side is smaller than the desired size.

Parameters:	p (float) – probability of applying the transform. Default: 1.0. value (list of ints [r, g, b]) – padding value if border_mode is cv2.BORDER_CONSTANT.

Targets:: image, mask
Image types:: uint8, float32
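As a rough sketch of the padding arithmetic, the helper below computes how many pixels would be added on each side to reach `min_height` and `min_width`. `padding_amounts` is a hypothetical name, and splitting the padding evenly between opposite sides is an assumption rather than documented behavior.

```python
def padding_amounts(rows, cols, min_height=1024, min_width=1024):
    """Return (top, bottom, left, right) padding needed to reach the minimum size."""
    pad_h = max(0, min_height - rows)
    pad_w = max(0, min_width - cols)
    # Assumption: padding is split evenly, with any odd pixel going to bottom/right.
    top = pad_h // 2
    left = pad_w // 2
    return top, pad_h - top, left, pad_w - left


print(padding_amounts(1000, 1024))
print(padding_amounts(1021, 2000))
```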

class albumentations.augmentations.transforms.RGBShift(r_shift_limit=20, g_shift_limit=20, b_shift_limit=20, always_apply=False, p=0.5)[source]¶

Randomly shift values for each channel of the input RGB image.

Parameters:

r_shift_limit ((int, int) or int) – range for changing values for the red channel. If r_shift_limit is a single int, the range will be (-r_shift_limit, r_shift_limit). Default: 20.
g_shift_limit ((int, int) or int) – range for changing values for the green channel. If g_shift_limit is a single int, the range will be (-g_shift_limit, g_shift_limit). Default: 20.
b_shift_limit ((int, int) or int) – range for changing values for the blue channel. If b_shift_limit is a single int, the range will be (-b_shift_limit, b_shift_limit). Default: 20.
p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8, float32

class albumentations.augmentations.transforms.RandomBrightness(limit=0.2, always_apply=False, p=0.5)[source]¶

class albumentations.augmentations.transforms.RandomContrast(limit=0.2, always_apply=False, p=0.5)[source]¶

class albumentations.augmentations.transforms.MotionBlur(blur_limit=7, always_apply=False, p=0.5)[source]¶

Apply motion blur to the input image using a random-sized kernel.

Parameters:	blur_limit (int) – maximum kernel size for blurring the input image. Default: 7. p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8, float32

class albumentations.augmentations.transforms.MedianBlur(blur_limit=7, always_apply=False, p=0.5)[source]¶

Blur the input image using a median filter with a random aperture linear size.

Parameters:	blur_limit (int) – maximum aperture linear size for blurring the input image. Default: 7. p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8, float32

class albumentations.augmentations.transforms.GaussNoise(var_limit=(10, 50), always_apply=False, p=0.5)[source]¶

Apply gaussian noise to the input image.

Parameters:	var_limit ((int, int) or int) – variance range for noise. If var_limit is a single int, the range will be (-var_limit, var_limit). Default: (10, 50). p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8

class albumentations.augmentations.transforms.CLAHE(clip_limit=4.0, tile_grid_size=(8, 8), always_apply=False, p=0.5)[source]¶

Apply Contrast Limited Adaptive Histogram Equalization to the input image.

Parameters:	clip_limit (float) – upper threshold value for contrast limiting. Default: 4.0. tile_grid_size ((int, int)) – size of grid for histogram equalization. Default: (8, 8). p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8

class albumentations.augmentations.transforms.ChannelShuffle(always_apply=False, p=0.5)[source]¶

Randomly rearrange channels of the input RGB image.

Parameters:	p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8, float32

class albumentations.augmentations.transforms.InvertImg(always_apply=False, p=0.5)[source]¶

Invert the input image by subtracting pixel values from 255.

Parameters:	p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8

class albumentations.augmentations.transforms.ToGray(always_apply=False, p=0.5)[source]¶

Convert the input RGB image to grayscale. If the mean pixel value for the resulting image is greater than 127, invert the resulting grayscale image.

Parameters:	p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8, float32

class albumentations.augmentations.transforms.JpegCompression(quality_lower=99, quality_upper=100, always_apply=False, p=0.5)[source]¶

Decrease the quality of an image by applying JPEG compression.

Parameters:	quality_lower (float) – lower bound on the JPEG quality. Should be in the range [0, 100]. quality_upper (float) – upper bound on the JPEG quality. Should be in the range [0, 100].

Targets:: image
Image types:: uint8, float32

class albumentations.augmentations.transforms.Cutout(num_holes=8, max_h_size=8, max_w_size=8, always_apply=False, p=0.5)[source]¶

Apply coarse dropout to the image by zeroing out rectangular regions.

Parameters:	num_holes (int) – number of regions to zero out. max_h_size (int) – maximum height of the hole. max_w_size (int) – maximum width of the hole.

Targets:: image
Image types:: uint8, float32

Reference:: https://arxiv.org/abs/1708.04552

https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py

https://github.com/aleju/imgaug/blob/master/imgaug/augmenters/arithmetic.py
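The idea behind Cutout can be sketched in pure Python. This is only an illustration under stated assumptions: `cutout_sketch` is a hypothetical helper, holes are centered on random pixels and clipped at the image border (in the spirit of the referenced cutout.py), and the library's own sampling may differ.

```python
import random


def cutout_sketch(img, num_holes=8, max_h_size=8, max_w_size=8, rng=None):
    """Return a copy of a 2-D list with `num_holes` rectangular regions set to 0."""
    rng = rng or random.Random()
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]  # leave the input untouched
    for _ in range(num_holes):
        # Pick a random center, then clip the hole to the image bounds.
        y = rng.randint(0, height - 1)
        x = rng.randint(0, width - 1)
        y1, y2 = max(0, y - max_h_size // 2), min(height, y + max_h_size // 2)
        x1, x2 = max(0, x - max_w_size // 2), min(width, x + max_w_size // 2)
        for r in range(y1, y2):
            for c in range(x1, x2):
                out[r][c] = 0
    return out


img = [[1] * 32 for _ in range(32)]
out = cutout_sketch(img, num_holes=3, rng=random.Random(0))
print(sum(v == 0 for row in out for v in row), "pixels zeroed")
```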

class albumentations.augmentations.transforms.ToFloat(max_value=None, always_apply=False, p=1.0)[source]¶

Divide pixel values by max_value to get a float32 output array where all values lie in the range [0, 1.0]. If max_value is None the transform will try to infer the maximum value by inspecting the data type of the input image.

See also

FromFloat

Parameters:	max_value (float) – maximum possible input value. Default: None. p (float) – probability of applying the transform. Default: 1.0.

Targets:: image
Image types:: any type

class albumentations.augmentations.transforms.FromFloat(dtype='uint16', max_value=None, always_apply=False, p=1.0)[source]¶

Take an input array where all values should lie in the range [0, 1.0], multiply them by max_value and then cast the resulting values to the type specified by dtype. If max_value is None the transform will try to infer the maximum value for the data type from the dtype argument.

This is the inverse transform for ToFloat.

Parameters:	max_value (float) – maximum possible input value. Default: None. dtype (string or numpy data type) – data type of the output. See the ‘Data types’ page from the NumPy docs. Default: ‘uint16’. p (float) – probability of applying the transform. Default: 1.0.

Targets:: image
Image types:: float32
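The relationship between ToFloat and FromFloat can be sketched as a round trip on a flat list of values. This is a hedged illustration: the `MAX_VALUES` table and both helper names are hypothetical, the dtype is passed as a string because plain lists carry no dtype, and rounding before the integer cast is an assumption (a plain cast would truncate).

```python
# Hypothetical lookup of the maximum value for a few unsigned integer types.
MAX_VALUES = {"uint8": 255.0, "uint16": 65535.0, "uint32": 4294967295.0}


def to_float_sketch(values, dtype, max_value=None):
    """Divide by max_value (inferred from dtype if None) to land in [0, 1.0]."""
    if max_value is None:
        max_value = MAX_VALUES[dtype]
    return [v / max_value for v in values]


def from_float_sketch(values, dtype="uint16", max_value=None):
    """Multiply by max_value (inferred from dtype if None) and cast to int."""
    if max_value is None:
        max_value = MAX_VALUES[dtype]
    # Assumption: round to the nearest integer before casting.
    return [int(round(v * max_value)) for v in values]


pixels = [0, 128, 255]
as_float = to_float_sketch(pixels, "uint8")
print(as_float)
print(from_float_sketch(as_float, "uint8"))
```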

class albumentations.augmentations.transforms.Crop(x_min=0, y_min=0, x_max=1024, y_max=1024, always_apply=False, p=1.0)[source]¶

Crop region from image.

Parameters:	x_min (int) – minimum upper left x coordinate. y_min (int) – minimum upper left y coordinate. x_max (int) – maximum lower right x coordinate. y_max (int) – maximum lower right y coordinate.

Targets:: image, mask, bboxes
Image types:: uint8, float32

class albumentations.augmentations.transforms.RandomScale(scale_limit=0.1, interpolation=1, always_apply=False, p=0.5)[source]¶

Randomly resize the input. Output image size is different from the input image size.

Parameters:

scale_limit ((float, float) or float) – scaling factor range. If scale_limit is a single float value, the range will be (1 - scale_limit, 1 + scale_limit). Default: 0.1.
interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
p (float) – probability of applying the transform. Default: 0.5.

Targets:: image, mask, bboxes
Image types:: uint8, float32

class albumentations.augmentations.transforms.LongestMaxSize(max_size=1024, interpolation=1, always_apply=False, p=1)[source]¶

Rescale an image so that maximum side is equal to max_size, keeping the aspect ratio of the initial image.

Parameters:	p (float) – probability of applying the transform. Default: 1. max_size (int) – maximum size of the image after the transformation

Targets:: image, mask, bboxes
Image types:: uint8, float32

class albumentations.augmentations.transforms.SmallestMaxSize(max_size=1024, interpolation=1, always_apply=False, p=1)[source]¶

Rescale an image so that minimum side is equal to max_size, keeping the aspect ratio of the initial image.

Parameters:	p (float) – probability of applying the transform. Default: 1. max_size (int) – maximum size of smallest side of the image after the transformation

Targets:: image, mask, bboxes
Image types:: uint8, float32
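The size arithmetic behind LongestMaxSize and SmallestMaxSize can be sketched with one helper. `scaled_size` is a hypothetical function, and rounding the scaled dimensions to the nearest integer is an assumption.

```python
def scaled_size(height, width, max_size, side="longest"):
    """Return (new_height, new_width) after rescaling so that the chosen
    side equals max_size while the aspect ratio is preserved."""
    reference = max(height, width) if side == "longest" else min(height, width)
    scale = max_size / reference
    # Assumption: dimensions are rounded to the nearest integer.
    return int(round(height * scale)), int(round(width * scale))


print(scaled_size(500, 1000, 500, side="longest"))
print(scaled_size(600, 800, 300, side="smallest"))
```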

class albumentations.augmentations.transforms.Resize(height, width, interpolation=1, always_apply=False, p=1)[source]¶

Resize the input to the given height and width.

Parameters:

p (float) – probability of applying the transform. Default: 1.
height (int) – desired height of the output.
width (int) – desired width of the output.
interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.

Targets:: image, mask, bboxes
Image types:: uint8, float32

class albumentations.augmentations.transforms.RandomSizedCrop(min_max_height, height, width, w2h_ratio=1.0, interpolation=1, always_apply=False, p=1.0)[source]¶

Crop a random part of the input and rescale it to some size.

Parameters:

min_max_height ((int, int)) – crop size limits.
height (int) – height after crop and resize.
width (int) – width after crop and resize.
w2h_ratio (float) – aspect ratio of crop.
interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
p (float) – probability of applying the transform. Default: 1.

Targets:: image, mask, bboxes
Image types:: uint8, float32

class albumentations.augmentations.transforms.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, always_apply=False, p=0.5)[source]¶

Randomly change brightness and contrast of the input image.

Parameters:

brightness_limit ((float, float) or float) – factor range for changing brightness. If brightness_limit is a single float, the range will be (-brightness_limit, brightness_limit). Default: 0.2.
contrast_limit ((float, float) or float) – factor range for changing contrast. If contrast_limit is a single float, the range will be (-contrast_limit, contrast_limit). Default: 0.2.
p (float) – probability of applying the transform. Default: 0.5.

Targets:: image
Image types:: uint8, float32

class albumentations.augmentations.transforms.RandomCropNearBBox(max_part_shift=0.3, always_apply=False, p=1.0)[source]¶

Crop a bounding box from the image with a random shift along the x and y coordinates.

Parameters:	max_part_shift (float) – float value in the range (0.0, 1.0). Default: 0.3. p (float) – probability of applying the transform. Default: 1.

Targets:: image
Image types:: uint8, float32

Functional transforms¶

albumentations.augmentations.functional.bbox_flip(bbox, d, rows, cols)[source]¶

Flip a bounding box either vertically, horizontally or both depending on the value of d.

Raises:	`ValueError` – if value of d is not -1, 0 or 1.
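A minimal sketch of how the flip code `d` could map onto bounding-box coordinates. It assumes boxes are already normalized to [0, 1], so `rows` and `cols` are not needed; the `_sketch` helpers are hypothetical and are not the library's functions.

```python
def bbox_vflip_sketch(bbox):
    """Flip a normalized (x_min, y_min, x_max, y_max) box vertically."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min, 1 - y_max, x_max, 1 - y_min)


def bbox_hflip_sketch(bbox):
    """Flip a normalized (x_min, y_min, x_max, y_max) box horizontally."""
    x_min, y_min, x_max, y_max = bbox
    return (1 - x_max, y_min, 1 - x_min, y_max)


def bbox_flip_sketch(bbox, d):
    """Dispatch on d: 0 vertical, 1 horizontal, -1 both."""
    if d == 0:
        return bbox_vflip_sketch(bbox)
    if d == 1:
        return bbox_hflip_sketch(bbox)
    if d == -1:
        return bbox_vflip_sketch(bbox_hflip_sketch(bbox))
    raise ValueError("Invalid d value {}. Valid values are -1, 0 and 1".format(d))


box = (0.0, 0.5, 0.25, 1.0)
print(bbox_flip_sketch(box, 0))
print(bbox_flip_sketch(box, 1))
print(bbox_flip_sketch(box, -1))
```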

albumentations.augmentations.functional.bbox_hflip(bbox, rows, cols)[source]¶: Flip a bounding box horizontally around the y-axis.

albumentations.augmentations.functional.bbox_rot90(bbox, factor, rows, cols)[source]¶

Rotates a bounding box by 90 degrees CCW (see np.rot90)

Parameters:	bbox (tuple) – A tuple (x_min, y_min, x_max, y_max). factor (int) – Number of CCW rotations. Must be in the range [0, 3]. See np.rot90. rows (int) – Image rows. cols (int) – Image cols.
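The coordinate mapping for a counter-clockwise quarter turn can be sketched as below. The sketch assumes normalized coordinates with the origin at the top-left corner of the image, which is why `rows` and `cols` are omitted; `bbox_rot90_sketch` is a hypothetical helper.

```python
def bbox_rot90_sketch(bbox, factor):
    """Rotate a normalized (x_min, y_min, x_max, y_max) box CCW by
    90 degrees, `factor` times, matching the direction of np.rot90."""
    if factor not in (0, 1, 2, 3):
        raise ValueError("factor must be in the range [0, 3]")
    x_min, y_min, x_max, y_max = bbox
    for _ in range(factor):
        # One CCW quarter turn sends a point (x, y) to (y, 1 - x).
        x_min, y_min, x_max, y_max = y_min, 1 - x_max, y_max, 1 - x_min
    return (x_min, y_min, x_max, y_max)


box = (0.0, 0.0, 0.25, 0.5)
print(bbox_rot90_sketch(box, 1))
print(bbox_rot90_sketch(box, 2))
```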

albumentations.augmentations.functional.bbox_rotate(bbox, angle, rows, cols, interpolation)[source]¶

Rotates a bounding box by angle degrees.

Parameters:	bbox (tuple) – A tuple (x_min, y_min, x_max, y_max). angle (int) – Angle of rotation in degrees. rows (int) – Image rows. cols (int) – Image cols. interpolation (int) – Interpolation method.

Returns:	A tuple.

albumentations.augmentations.functional.bbox_transpose(bbox, axis, rows, cols)[source]¶

Transposes a bounding box along given axis.

Parameters:	bbox (tuple) – A tuple (x_min, y_min, x_max, y_max). axis (int) – 0 - main axis, 1 - secondary axis. rows (int) – Image rows. cols (int) – Image cols.

albumentations.augmentations.functional.bbox_vflip(bbox, rows, cols)[source]¶: Flip a bounding box vertically around the x-axis.

albumentations.augmentations.functional.crop_bbox_by_coords(bbox, crop_coords, crop_height, crop_width, rows, cols)[source]¶: Crop a bounding box using the provided coordinates of the top-left and bottom-right corners in pixels and the required height and width of the crop.

albumentations.augmentations.functional.elastic_transform_fast(image, alpha, sigma, alpha_affine, interpolation=1, border_mode=4, random_state=None)[source]¶

Elastic deformation of images as described in [Simard2003] (with modifications). Based on https://gist.github.com/erniejunior/601cdf56d2b424757de5

[Simard2003]

Simard, Steinkraus and Platt, “Best Practices for Convolutional Neural Networks applied to Visual Document Analysis”, in Proc. of the International Conference on Document Analysis and Recognition, 2003.

albumentations.augmentations.functional.grid_distortion(img, num_steps=10, xsteps=[], ysteps=[], interpolation=1, border_mode=4)[source]¶

Reference:: http://pythology.blogspot.sg/2014/03/interpolation-on-regular-distorted-grid.html

albumentations.augmentations.functional.optical_distortion(img, k=0, dx=0, dy=0, interpolation=1, border_mode=4)[source]¶

Barrel / pincushion distortion. Unconventional augment.

Reference:: https://stackoverflow.com/questions/6199636/formulas-for-barrel-pincushion-distortion

https://stackoverflow.com/questions/10364201/image-transformation-in-opencv

https://stackoverflow.com/questions/2477774/correcting-fisheye-distortion-programmatically

http://www.coldvision.io/2017/03/02/advanced-lane-finding-using-opencv/

albumentations.augmentations.functional.preserve_channel_dim(func)[source]¶: Preserve dummy channel dim.

albumentations.augmentations.functional.preserve_shape(func)[source]¶: Preserve shape of the image.

Helper functions for working with bounding boxes¶

albumentations.augmentations.bbox_utils.normalize_bbox(bbox, rows, cols)[source]¶: Normalize coordinates of a bounding box. Divide x-coordinates by image width and y-coordinates by image height.

albumentations.augmentations.bbox_utils.denormalize_bbox(bbox, rows, cols)[source]¶: Denormalize coordinates of a bounding box. Multiply x-coordinates by image width and y-coordinates by image height. This is an inverse operation for normalize_bbox().
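The two operations described above are inverses of each other, which a short sketch makes concrete. The `_sketch` helpers are hypothetical stand-ins that follow the descriptions on this page, not the library's code.

```python
def normalize_bbox_sketch(bbox, rows, cols):
    """Divide x-coordinates by image width and y-coordinates by image height."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min / cols, y_min / rows, x_max / cols, y_max / rows)


def denormalize_bbox_sketch(bbox, rows, cols):
    """Multiply x-coordinates by image width and y-coordinates by image height."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min * cols, y_min * rows, x_max * cols, y_max * rows)


pixel_box = (100, 50, 300, 150)
normalized = normalize_bbox_sketch(pixel_box, rows=200, cols=400)
print(normalized)
print(denormalize_bbox_sketch(normalized, rows=200, cols=400))
```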

albumentations.augmentations.bbox_utils.normalize_bboxes(bboxes, rows, cols)[source]¶: Normalize a list of bounding boxes.

albumentations.augmentations.bbox_utils.denormalize_bboxes(bboxes, rows, cols)[source]¶: Denormalize a list of bounding boxes.

albumentations.augmentations.bbox_utils.calculate_bbox_area(bbox, rows, cols)[source]¶: Calculate the area of a bounding box in pixels.

albumentations.augmentations.bbox_utils.filter_bboxes_by_visibility(original_shape, bboxes, transformed_shape, transformed_bboxes, threshold=0.0, min_area=0.0)[source]¶

Filter bounding boxes and return only those boxes whose visibility after transformation is above the threshold and whose area in pixels is greater than min_area.

Parameters:	original_shape (tuple) – original image shape. bboxes (list) – original bounding boxes. transformed_shape (tuple) – transformed image shape. transformed_bboxes (list) – transformed bounding boxes. threshold (float) – visibility threshold. Should be a value in the range [0.0, 1.0]. min_area (float) – minimal area threshold.
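One way to read the filtering rule is sketched below. Several points are assumptions rather than documented behavior: visibility is taken to be the ratio of the transformed box area to the original box area (both in pixels), boxes are normalized, and both comparisons are treated as inclusive. The helper names are hypothetical.

```python
def bbox_area_sketch(bbox, rows, cols):
    """Area in pixels of a normalized (x_min, y_min, x_max, y_max) box."""
    x_min, y_min, x_max, y_max = bbox
    return (x_max - x_min) * cols * (y_max - y_min) * rows


def filter_by_visibility_sketch(original_shape, bboxes, transformed_shape,
                                transformed_bboxes, threshold=0.0, min_area=0.0):
    """Keep transformed boxes that are large enough and visible enough."""
    rows, cols = original_shape[:2]
    t_rows, t_cols = transformed_shape[:2]
    kept = []
    for bbox, t_bbox in zip(bboxes, transformed_bboxes):
        area = bbox_area_sketch(bbox, rows, cols)
        t_area = bbox_area_sketch(t_bbox, t_rows, t_cols)
        if t_area < min_area:
            continue
        # Assumption: visibility = transformed area / original area.
        if area > 0 and t_area / area >= threshold:
            kept.append(t_bbox)
    return kept


before = [(0.0, 0.0, 0.5, 0.5), (0.5, 0.5, 1.0, 1.0)]
after = [(0.0, 0.0, 0.5, 0.5), (0.5, 0.5, 0.75, 0.625)]
print(filter_by_visibility_sketch((100, 100), before, (100, 100), after, threshold=0.5))
```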

albumentations.augmentations.bbox_utils.convert_bbox_to_albumentations(bbox, source_format, rows, cols, check_validity=False)[source]¶

Convert a bounding box from a format specified in source_format to the format used by albumentations: normalized coordinates of the top-left and bottom-right corners of the bounding box in the form [x_min, y_min, x_max, y_max], e.g. [0.15, 0.27, 0.67, 0.5].

Parameters:	bbox (list) – bounding box. source_format (str) – format of the bounding box. Should be ‘coco’ or ‘pascal_voc’. check_validity (bool) – check if all boxes are valid boxes. rows (int) – image height. cols (int) – image width.

Note

The coco format of a bounding box looks like [x_min, y_min, width, height], e.g. [97, 12, 150, 200]. The pascal_voc format of a bounding box looks like [x_min, y_min, x_max, y_max], e.g. [97, 12, 247, 212].

Raises:	`ValueError` – if source_format is not equal to coco or pascal_voc.
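Using the example boxes from the note above, the conversion into normalized [x_min, y_min, x_max, y_max] form can be sketched as follows. `to_albumentations_sketch` is a hypothetical helper, validity checking is omitted, and the image size of 400 x 500 is made up for the example.

```python
def to_albumentations_sketch(bbox, source_format, rows, cols):
    """Convert a coco or pascal_voc box to normalized corner coordinates."""
    if source_format not in ("coco", "pascal_voc"):
        raise ValueError(
            "Unknown source_format {}. Supported formats are: "
            "'coco' and 'pascal_voc'".format(source_format)
        )
    if source_format == "coco":
        # coco stores [x_min, y_min, width, height].
        x_min, y_min, width, height = bbox
        x_max, y_max = x_min + width, y_min + height
    else:
        # pascal_voc stores [x_min, y_min, x_max, y_max].
        x_min, y_min, x_max, y_max = bbox
    return [x_min / cols, y_min / rows, x_max / cols, y_max / rows]


rows, cols = 400, 500  # hypothetical image size
from_coco = to_albumentations_sketch([97, 12, 150, 200], "coco", rows, cols)
from_voc = to_albumentations_sketch([97, 12, 247, 212], "pascal_voc", rows, cols)
print(from_coco)
print(from_voc)
```

Both calls describe the same box, so they produce the same normalized coordinates.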

albumentations.augmentations.bbox_utils.convert_bbox_from_albumentations(bbox, target_format, rows, cols, check_validity=False)[source]¶

Convert a bounding box from the format used by albumentations to the format specified in target_format.

Parameters:	bbox (list) – bounding box with coordinates in the format used by albumentations. target_format (str) – required format of the output bounding box. Should be ‘coco’ or ‘pascal_voc’. rows (int) – image height. cols (int) – image width. check_validity (bool) – check if all boxes are valid boxes.

Note

The coco format of a bounding box looks like [x_min, y_min, width, height], e.g. [97, 12, 150, 200]. The pascal_voc format of a bounding box looks like [x_min, y_min, x_max, y_max], e.g. [97, 12, 247, 212].

Raises:	`ValueError` – if target_format is not equal to coco or pascal_voc.
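The reverse direction, from normalized corner coordinates back to pixel-based formats, might look like the sketch below. `from_albumentations_sketch` is a hypothetical helper and validity checking is omitted.

```python
def from_albumentations_sketch(bbox, target_format, rows, cols):
    """Convert normalized [x_min, y_min, x_max, y_max] to coco or pascal_voc."""
    if target_format not in ("coco", "pascal_voc"):
        raise ValueError(
            "Unknown target_format {}. Supported formats are: "
            "'coco' and 'pascal_voc'".format(target_format)
        )
    x_min, y_min, x_max, y_max = bbox
    # Scale back to pixel coordinates.
    x_min, x_max = x_min * cols, x_max * cols
    y_min, y_max = y_min * rows, y_max * rows
    if target_format == "coco":
        return [x_min, y_min, x_max - x_min, y_max - y_min]
    return [x_min, y_min, x_max, y_max]


box = [0.25, 0.5, 0.75, 1.0]
print(from_albumentations_sketch(box, "pascal_voc", rows=200, cols=400))
print(from_albumentations_sketch(box, "coco", rows=200, cols=400))
```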

albumentations.augmentations.bbox_utils.convert_bboxes_to_albumentations(bboxes, source_format, rows, cols, check_validity=False)[source]¶: Convert a list of bounding boxes from a format specified in source_format to the format used by albumentations.

albumentations.augmentations.bbox_utils.convert_bboxes_from_albumentations(bboxes, target_format, rows, cols, check_validity=False)[source]¶

Convert a list of bounding boxes from the format used by albumentations to the format specified in target_format.

Parameters:	bboxes (list) – list of bounding boxes with coordinates in the format used by albumentations. target_format (str) – required format of the output bounding boxes. Should be ‘coco’ or ‘pascal_voc’. rows (int) – image height. cols (int) – image width. check_validity (bool) – check if all boxes are valid boxes.