Augmentations (albumentations.augmentations)¶
Transforms¶
-
class
albumentations.augmentations.transforms.
Blur
(blur_limit=7, always_apply=False, p=0.5)[source]¶ Blur the input image using a random-sized kernel.
Parameters: - blur_limit (int) – maximum kernel size for blurring the input image. Default: 7.
- p (float) – probability of applying the transform. Default: 0.5.
- Targets:
- image
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
VerticalFlip
(always_apply=False, p=0.5)[source]¶ Flip the input vertically around the x-axis.
Parameters: p (float) – probability of applying the transform. Default: 0.5. - Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
HorizontalFlip
(always_apply=False, p=0.5)[source]¶ Flip the input horizontally around the y-axis.
Parameters: p (float) – probability of applying the transform. Default: 0.5. - Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
Flip
(always_apply=False, p=0.5)[source]¶ Flip the input either horizontally, vertically or both horizontally and vertically.
Parameters: p (float) – probability of applying the transform. Default: 0.5. - Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
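As a rough illustration of what the three flip transforms above do to pixel data, the sketch below flips a small image stored as a nested list. The helper names vflip, hflip and flip_both are hypothetical and are not part of albumentations, which operates on NumPy arrays.

```python
# Hypothetical helpers that illustrate the flip semantics on a nested-list image.
# This is a sketch, not the library's implementation.

def vflip(img):
    """Flip vertically (around the x-axis): reverse the order of the rows."""
    return [list(row) for row in reversed(img)]


def hflip(img):
    """Flip horizontally (around the y-axis): reverse every row."""
    return [list(reversed(row)) for row in img]


def flip_both(img):
    """Flip both vertically and horizontally."""
    return hflip(vflip(img))


image = [[1, 2, 3],
         [4, 5, 6]]

print(vflip(image))      # rows swapped
print(hflip(image))      # columns reversed
print(flip_both(image))  # both at once
```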
-
class
albumentations.augmentations.transforms.
Normalize
(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), max_pixel_value=255.0, always_apply=False, p=1.0)[source]¶ Divide pixel values by 255 = 2**8 - 1, subtract mean per channel and divide by std per channel.
Parameters: - mean (float, float, float) – mean values
- std (float, float, float) – std values
- max_pixel_value (float) – maximum possible pixel value
- Targets:
- image
- Image types:
- uint8, float32
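The arithmetic described for Normalize can be sketched for a single pixel as follows. The function normalize_pixel is a hypothetical helper written for this illustration; the mean and std values in the usage lines are chosen only to keep the numbers easy to follow.

```python
# Sketch of the Normalize arithmetic for one pixel (hypothetical helper, plain Python).

def normalize_pixel(pixel, mean, std, max_pixel_value=255.0):
    """Scale each channel to [0, 1], subtract the channel mean, divide by the channel std."""
    return tuple(
        (value / max_pixel_value - m) / s
        for value, m, s in zip(pixel, mean, std)
    )


# Illustrative values: with mean = std = 0.5 the output range is [-1, 1].
result = normalize_pixel(
    (255, 0, 127.5),
    mean=(0.5, 0.5, 0.5),
    std=(0.5, 0.5, 0.5),
)
print(result)
```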
-
class
albumentations.augmentations.transforms.
Transpose
(always_apply=False, p=0.5)[source]¶ Transpose the input by swapping rows and columns.
Parameters: p (float) – probability of applying the transform. Default: 0.5. - Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
RandomCrop
(height, width, always_apply=False, p=1.0)[source]¶ Crop a random part of the input.
Parameters: - height (int) – height of the crop.
- width (int) – width of the crop.
- p (float) – probability of applying the transform. Default: 1.
- Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
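A minimal sketch of random cropping, assuming the crop position is drawn uniformly from all positions that keep the crop inside the image. The helper random_crop and its rng argument are hypothetical; the library's own sampling may differ.

```python
import random


def random_crop(img, height, width, rng=None):
    """Return a height x width window of a nested-list image at a random position."""
    rng = rng or random.Random()
    rows, cols = len(img), len(img[0])
    if height > rows or width > cols:
        raise ValueError("requested crop is larger than the image")
    # Top-left corner, chosen so the window stays inside the image.
    y_min = rng.randint(0, rows - height)
    x_min = rng.randint(0, cols - width)
    return [row[x_min:x_min + width] for row in img[y_min:y_min + height]]


# A 4 x 6 image whose pixel value encodes its position (row * 10 + column).
image = [[r * 10 + c for c in range(6)] for r in range(4)]
crop = random_crop(image, height=2, width=3, rng=random.Random(0))
print(crop)
```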
-
class
albumentations.augmentations.transforms.
RandomGamma
(gamma_limit=(80, 120), always_apply=False, p=0.5)[source]¶ - Targets:
- image
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
RandomRotate90
(always_apply=False, p=0.5)[source]¶ Randomly rotate the input by 90 degrees zero or more times.
Parameters: p (float) – probability of applying the transform. Default: 0.5. - Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
Rotate
(limit=90, interpolation=1, border_mode=4, always_apply=False, p=0.5)[source]¶ Rotate the input by an angle selected randomly from the uniform distribution.
Parameters: - limit ((int, int) or int) – range from which a random angle is picked. If limit is a single int an angle is picked from (-limit, limit). Default: 90
- interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
- border_mode (OpenCV flag) – flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101
- p (float) – probability of applying the transform. Default: 0.5.
- Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
ShiftScaleRotate
(shift_limit=0.0625, scale_limit=0.1, rotate_limit=45, interpolation=1, border_mode=4, always_apply=False, p=0.5)[source]¶ Randomly apply affine transforms: translate, scale and rotate the input.
Parameters: - shift_limit ((float, float) or float) – shift factor range for both height and width. If shift_limit is a single float value, the range will be (-shift_limit, shift_limit). Absolute values for lower and upper bounds should lie in range [0, 1]. Default: 0.0625.
- scale_limit ((float, float) or float) – scaling factor range. If scale_limit is a single float value, the range will be (-scale_limit, scale_limit). Default: 0.1.
- rotate_limit ((int, int) or int) – rotation range. If rotate_limit is a single int value, the range will be (-rotate_limit, rotate_limit). Default: 45.
- interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
- border_mode (OpenCV flag) – flag that is used to specify the pixel extrapolation method. Should be one of: cv2.BORDER_CONSTANT, cv2.BORDER_REPLICATE, cv2.BORDER_REFLECT, cv2.BORDER_WRAP, cv2.BORDER_REFLECT_101. Default: cv2.BORDER_REFLECT_101
- p (float) – probability of applying the transform. Default: 0.5.
- Targets:
- image, mask
- Image types:
- uint8, float32
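Several parameters above (limit, shift_limit, scale_limit, rotate_limit) accept either a pair or a single number that is expanded to a symmetric range. The sketch below shows that convention; to_range is a hypothetical helper name, and drawing the value with random.uniform is an assumption made for the illustration.

```python
import random


def to_range(limit):
    """Expand a single number to (-limit, limit); pass a (low, high) pair through."""
    if isinstance(limit, (int, float)):
        return (-limit, limit)
    low, high = limit
    return (low, high)


print(to_range(45))        # single int, as in rotate_limit=45
print(to_range(0.0625))    # single float, as in shift_limit=0.0625
print(to_range((10, 30)))  # explicit pair is kept as is

# Drawing one rotation angle from the expanded range.
rng = random.Random(0)
angle = rng.uniform(*to_range(45))
```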
-
class
albumentations.augmentations.transforms.
CenterCrop
(height, width, always_apply=False, p=1.0)[source]¶ Crop the central part of the input.
Parameters: - height (int) – height of the crop.
- width (int) – width of the crop.
- p (float) – probability of applying the transform. Default: 1.
- Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
Note
It is recommended to use uint8 images as input. Otherwise the operation will require internal conversion float32 -> uint8 -> float32 that causes worse performance.
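The window taken by a central crop can be sketched as below. The helper center_crop_coords is hypothetical, and placing the extra pixel on the bottom/right side when the margin is odd is an assumption of this sketch.

```python
def center_crop_coords(rows, cols, height, width):
    """Return (x_min, y_min, x_max, y_max) of a centered height x width window."""
    if height > rows or width > cols:
        raise ValueError("requested crop is larger than the image")
    y_min = (rows - height) // 2
    x_min = (cols - width) // 2
    return (x_min, y_min, x_min + width, y_min + height)


# A 100 x 200 image (rows x cols) cropped to 50 x 100.
coords = center_crop_coords(rows=100, cols=200, height=50, width=100)
print(coords)
```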
-
class
albumentations.augmentations.transforms.
OpticalDistortion
(distort_limit=0.05, shift_limit=0.05, interpolation=1, border_mode=4, always_apply=False, p=0.5)[source]¶ - Targets:
- image, mask
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
GridDistortion
(num_steps=5, distort_limit=0.3, interpolation=1, border_mode=4, always_apply=False, p=0.5)[source]¶ - Targets:
- image, mask
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
ElasticTransform
(alpha=1, sigma=50, alpha_affine=50, interpolation=1, border_mode=4, always_apply=False, p=0.5)[source]¶ - Targets:
- image, mask
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
HueSaturationValue
(hue_shift_limit=20, sat_shift_limit=30, val_shift_limit=20, always_apply=False, p=0.5)[source]¶ Randomly change hue, saturation and value of the input image.
Parameters: - hue_shift_limit ((int, int) or int) – range for changing hue. If hue_shift_limit is a single int, the range will be (-hue_shift_limit, hue_shift_limit). Default: 20.
- sat_shift_limit ((int, int) or int) – range for changing saturation. If sat_shift_limit is a single int, the range will be (-sat_shift_limit, sat_shift_limit). Default: 30.
- val_shift_limit ((int, int) or int) – range for changing value. If val_shift_limit is a single int, the range will be (-val_shift_limit, val_shift_limit). Default: 20.
- p (float) – probability of applying the transform. Default: 0.5.
- Targets:
- image
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
PadIfNeeded
(min_height=1024, min_width=1024, border_mode=4, value=[0, 0, 0], always_apply=False, p=1.0)[source]¶ Pad the sides of the image if its height or width is less than the desired minimum (min_height, min_width).
Parameters: - p (float) – probability of applying the transform. Default: 1.0.
- value (list of ints [r, g, b]) – padding value if border_mode is cv2.BORDER_CONSTANT.
- Targets:
- image, mask
- Image types:
- uint8, float32
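How much padding such a transform needs can be worked out as in the sketch below. The helper pad_amounts is hypothetical, and splitting the padding evenly with any odd pixel going to the bottom/right side is an assumption, not something the text above specifies.

```python
def pad_amounts(rows, cols, min_height=1024, min_width=1024):
    """Return (top, bottom, left, right) padding needed to reach the minimum size."""
    pad_h = max(min_height - rows, 0)
    pad_w = max(min_width - cols, 0)
    top = pad_h // 2
    left = pad_w // 2
    # Assumption: any odd leftover pixel goes to the bottom / right side.
    return (top, pad_h - top, left, pad_w - left)


print(pad_amounts(1000, 1100))    # only the height is too small
print(pad_amounts(10, 7, 16, 16)) # both sides are padded
```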
-
class
albumentations.augmentations.transforms.
RGBShift
(r_shift_limit=20, g_shift_limit=20, b_shift_limit=20, always_apply=False, p=0.5)[source]¶ Randomly shift values for each channel of the input RGB image.
Parameters: - r_shift_limit ((int, int) or int) – range for changing values for the red channel. If r_shift_limit is a single int, the range will be (-r_shift_limit, r_shift_limit). Default: 20.
- g_shift_limit ((int, int) or int) – range for changing values for the green channel. If g_shift_limit is a single int, the range will be (-g_shift_limit, g_shift_limit). Default: 20.
- b_shift_limit ((int, int) or int) – range for changing values for the blue channel. If b_shift_limit is a single int, the range will be (-b_shift_limit, b_shift_limit). Default: 20.
- p (float) – probability of applying the transform. Default: 0.5.
- Targets:
- image
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
RandomBrightness
(limit=0.2, always_apply=False, p=0.5)[source]¶
-
class
albumentations.augmentations.transforms.
RandomContrast
(limit=0.2, always_apply=False, p=0.5)[source]¶
-
class
albumentations.augmentations.transforms.
MotionBlur
(blur_limit=7, always_apply=False, p=0.5)[source]¶ Apply motion blur to the input image using a random-sized kernel.
Parameters: - blur_limit (int) – maximum kernel size for blurring the input image. Default: 7.
- p (float) – probability of applying the transform. Default: 0.5.
- Targets:
- image
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
MedianBlur
(blur_limit=7, always_apply=False, p=0.5)[source]¶ Blur the input image using a median filter with a random aperture linear size.
Parameters: - blur_limit (int) – maximum aperture linear size for blurring the input image. Default: 7.
- p (float) – probability of applying the transform. Default: 0.5.
- Targets:
- image
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
GaussNoise
(var_limit=(10, 50), always_apply=False, p=0.5)[source]¶ Apply gaussian noise to the input image.
Parameters: - var_limit ((int, int) or int) – variance range for noise. If var_limit is a single int, the range will be (-var_limit, var_limit). Default: (10, 50).
- p (float) – probability of applying the transform. Default: 0.5.
- Targets:
- image
- Image types:
- uint8
-
class
albumentations.augmentations.transforms.
CLAHE
(clip_limit=4.0, tile_grid_size=(8, 8), always_apply=False, p=0.5)[source]¶ Apply Contrast Limited Adaptive Histogram Equalization to the input image.
Parameters: - clip_limit (float) – upper threshold value for contrast limiting. Default: 4.0.
- tile_grid_size ((int, int)) – size of grid for histogram equalization. Default: (8, 8).
- p (float) – probability of applying the transform. Default: 0.5.
- Targets:
- image
- Image types:
- uint8
-
class
albumentations.augmentations.transforms.
ChannelShuffle
(always_apply=False, p=0.5)[source]¶ Randomly rearrange channels of the input RGB image.
Parameters: p (float) – probability of applying the transform. Default: 0.5. - Targets:
- image
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
InvertImg
(always_apply=False, p=0.5)[source]¶ Invert the input image by subtracting pixel values from 255.
Parameters: p (float) – probability of applying the transform. Default: 0.5. - Targets:
- image
- Image types:
- uint8
-
class
albumentations.augmentations.transforms.
ToGray
(always_apply=False, p=0.5)[source]¶ Convert the input RGB image to grayscale. If the mean pixel value for the resulting image is greater than 127, invert the resulting grayscale image.
Parameters: p (float) – probability of applying the transform. Default: 0.5. - Targets:
- image
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
JpegCompression
(quality_lower=99, quality_upper=100, always_apply=False, p=0.5)[source]¶ Decrease the quality of an image by applying JPEG compression.
Parameters: - quality_lower (float) – lower bound on the jpeg quality. Should be in [0, 100] range
- quality_upper (float) – upper bound on the jpeg quality. Should be in [0, 100] range
- Targets:
- image
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
Cutout
(num_holes=8, max_h_size=8, max_w_size=8, always_apply=False, p=0.5)[source]¶ CoarseDropout of the square regions in the image.
Parameters: - num_holes (int) – number of regions to zero out
- max_h_size (int) – maximum height of the hole
- max_w_size (int) – maximum width of the hole
- Targets:
- image
- Image types:
- uint8, float32
Reference: | https://arxiv.org/abs/1708.04552 | https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py | https://github.com/aleju/imgaug/blob/master/imgaug/augmenters/arithmetic.py
-
class
albumentations.augmentations.transforms.
ToFloat
(max_value=None, always_apply=False, p=1.0)[source]¶ Divide pixel values by max_value to get a float32 output array where all values lie in the range [0, 1.0]. If max_value is None the transform will try to infer the maximum value by inspecting the data type of the input image.
See also: FromFloat
Parameters: - max_value (float) – maximum possible input value. Default: None.
- p (float) – probability of applying the transform. Default: 1.0.
- Targets:
- image
- Image types:
- any type
-
class
albumentations.augmentations.transforms.
FromFloat
(dtype='uint16', max_value=None, always_apply=False, p=1.0)[source]¶ Take an input array where all values should lie in the range [0, 1.0], multiply them by max_value and then cast the resulted value to a type specified by dtype. If max_value is None the transform will try to infer the maximum value for the data type from the dtype argument.
This is the inverse transform for ToFloat.
Parameters: - max_value (float) – maximum possible input value. Default: None.
- dtype (string or numpy data type) – data type of the output. See the ‘Data types’ page from the NumPy docs. Default: ‘uint16’.
- p (float) – probability of applying the transform. Default: 1.0.
- Targets:
- image
- Image types:
- float32
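The relationship between ToFloat and FromFloat can be sketched in plain Python as below. The MAX_VALUES table and the helpers to_float and from_float are hypothetical stand-ins for inferring the maximum value from a data type; rounding before the integer cast is a choice made for this sketch and may not match the library.

```python
# Hypothetical lookup used to infer max_value from a dtype name.
MAX_VALUES = {"uint8": 255, "uint16": 65535, "uint32": 4294967295}


def to_float(values, dtype, max_value=None):
    """Divide by max_value so that all outputs lie in [0, 1.0]."""
    if max_value is None:
        max_value = MAX_VALUES[dtype]
    return [v / max_value for v in values]


def from_float(values, dtype="uint16", max_value=None):
    """Multiply by max_value and cast back to integers (inverse of to_float)."""
    if max_value is None:
        max_value = MAX_VALUES[dtype]
    # Assumption: round to the nearest integer before casting.
    return [int(round(v * max_value)) for v in values]


pixels = [0, 1234, 65535]
as_float = to_float(pixels, "uint16")
restored = from_float(as_float, "uint16")
print(as_float)
print(restored)
```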
-
class
albumentations.augmentations.transforms.
Crop
(x_min=0, y_min=0, x_max=1024, y_max=1024, always_apply=False, p=1.0)[source]¶ Crop region from image.
Parameters: - x_min (int) – minimum upper left x coordinate
- y_min (int) – minimum upper left y coordinate
- x_max (int) – maximum lower right x coordinate
- y_max (int) – maximum lower right y coordinate
- Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
RandomScale
(scale_limit=0.1, interpolation=1, always_apply=False, p=0.5)[source]¶ Randomly resize the input. Output image size is different from the input image size.
Parameters: - scale_limit ((float, float) or float) – scaling factor range. If scale_limit is a single float value, the range will be (1 - scale_limit, 1 + scale_limit). Default: 0.1.
- interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
- p (float) – probability of applying the transform. Default: 0.5.
- Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
LongestMaxSize
(max_size=1024, interpolation=1, always_apply=False, p=1)[source]¶ Rescale an image so that maximum side is equal to max_size, keeping the aspect ratio of the initial image.
Parameters: - p (float) – probability of applying the transform. Default: 1.
- max_size (int) – maximum size of the image after the transformation
- Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
SmallestMaxSize
(max_size=1024, interpolation=1, always_apply=False, p=1)[source]¶ Rescale an image so that minimum side is equal to max_size, keeping the aspect ratio of the initial image.
Parameters: - p (float) – probability of applying the transform. Default: 1.
- max_size (int) – maximum size of smallest side of the image after the transformation
- Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
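The output size produced by LongestMaxSize and SmallestMaxSize follows from a single scale factor, as the sketch below shows. Both helper functions are hypothetical, and rounding the scaled sides to the nearest integer is an assumption.

```python
def longest_max_size(rows, cols, max_size):
    """Scale so that the longer side equals max_size, keeping the aspect ratio."""
    scale = max_size / max(rows, cols)
    return (int(round(rows * scale)), int(round(cols * scale)))


def smallest_max_size(rows, cols, max_size):
    """Scale so that the shorter side equals max_size, keeping the aspect ratio."""
    scale = max_size / min(rows, cols)
    return (int(round(rows * scale)), int(round(cols * scale)))


# A 300 x 600 image (rows x cols).
print(longest_max_size(300, 600, max_size=1200))
print(smallest_max_size(300, 600, max_size=150))
```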
-
class
albumentations.augmentations.transforms.
Resize
(height, width, interpolation=1, always_apply=False, p=1)[source]¶ Resize the input to the given height and width.
Parameters: - p (float) – probability of applying the transform. Default: 1.
- height (int) – desired height of the output.
- width (int) – desired width of the output.
- interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
- Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
RandomSizedCrop
(min_max_height, height, width, w2h_ratio=1.0, interpolation=1, always_apply=False, p=1.0)[source]¶ Crop a random part of the input and rescale it to some size.
Parameters: - min_max_height ((int, int)) – crop size limits.
- height (int) – height after crop and resize.
- width (int) – width after crop and resize.
- w2h_ratio (float) – aspect ratio of crop.
- interpolation (OpenCV flag) – flag that is used to specify the interpolation algorithm. Should be one of: cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_CUBIC, cv2.INTER_AREA, cv2.INTER_LANCZOS4. Default: cv2.INTER_LINEAR.
- p (float) – probability of applying the transform. Default: 1.
- Targets:
- image, mask, bboxes
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
RandomBrightnessContrast
(brightness_limit=0.2, contrast_limit=0.2, always_apply=False, p=0.5)[source]¶ Randomly change brightness and contrast of the input image.
Parameters: - brightness_limit ((float, float) or float) – factor range for changing brightness. If limit is a single float, the range will be (-limit, limit). Default: 0.2.
- contrast_limit ((float, float) or float) – factor range for changing contrast. If limit is a single float, the range will be (-limit, limit). Default: 0.2.
- p (float) – probability of applying the transform. Default: 0.5.
- Targets:
- image
- Image types:
- uint8, float32
-
class
albumentations.augmentations.transforms.
RandomCropNearBBox
(max_part_shift=0.3, always_apply=False, p=1.0)[source]¶ Crop bbox from image with random shift by x,y coordinates
Parameters: - max_part_shift (float) – float value in (0.0, 1.0) range. Default: 0.3.
- p (float) – probability of applying the transform. Default: 1.
- Targets:
- image
- Image types:
- uint8, float32
Functional transforms¶
-
albumentations.augmentations.functional.
bbox_flip
(bbox, d, rows, cols)[source]¶ Flip a bounding box either vertically, horizontally or both depending on the value of d.
Raises: ValueError – if value of d is not -1, 0 or 1.
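A sketch of how flipping a bounding box could work for normalized (x_min, y_min, x_max, y_max) coordinates. The helper names are hypothetical. The mapping of d to a direction (0 vertical, 1 horizontal, -1 both) is an assumption borrowed from the flip codes of cv2.flip; the text above only states which values of d are valid.

```python
def hflip_bbox(bbox):
    """Mirror a normalized bbox around the vertical center line."""
    x_min, y_min, x_max, y_max = bbox
    return (1 - x_max, y_min, 1 - x_min, y_max)


def vflip_bbox(bbox):
    """Mirror a normalized bbox around the horizontal center line."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min, 1 - y_max, x_max, 1 - y_min)


def flip_bbox(bbox, d):
    """Assumed convention: d=0 vertical, d=1 horizontal, d=-1 both."""
    if d == 0:
        return vflip_bbox(bbox)
    if d == 1:
        return hflip_bbox(bbox)
    if d == -1:
        return hflip_bbox(vflip_bbox(bbox))
    raise ValueError("d must be -1, 0 or 1, got {}".format(d))


box = (0.25, 0.125, 0.5, 0.75)
print(flip_bbox(box, 1))
print(flip_bbox(box, 0))
print(flip_bbox(box, -1))
```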
-
albumentations.augmentations.functional.
bbox_hflip
(bbox, rows, cols)[source]¶ Flip a bounding box horizontally around the y-axis.
-
albumentations.augmentations.functional.
bbox_rot90
(bbox, factor, rows, cols)[source]¶ Rotates a bounding box by 90 degrees CCW (see np.rot90)
Parameters: - bbox (tuple) – A tuple (x_min, y_min, x_max, y_max).
- factor (int) – Number of CCW rotations. Must be in range [0, 3]. See np.rot90.
- rows (int) – Image rows.
- cols (int) – Image cols.
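For normalized coordinates, one counter-clockwise quarter turn maps a point (x, y) to (y, 1 - x), which is the same pixel movement np.rot90 performs on an array. The sketch below applies that mapping factor times; rot90_bbox is a hypothetical helper and assumes the bbox is normalized to [0, 1].

```python
def rot90_bbox(bbox, factor):
    """Rotate a normalized (x_min, y_min, x_max, y_max) bbox by factor * 90 degrees CCW."""
    if factor not in (0, 1, 2, 3):
        raise ValueError("factor must be in range [0, 3]")
    x_min, y_min, x_max, y_max = bbox
    for _ in range(factor):
        # One CCW quarter turn: (x, y) -> (y, 1 - x), applied to both corners.
        x_min, y_min, x_max, y_max = y_min, 1 - x_max, y_max, 1 - x_min
    return (x_min, y_min, x_max, y_max)


box = (0.0, 0.25, 0.5, 0.75)
for k in range(4):
    print(k, rot90_bbox(box, k))
```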
-
albumentations.augmentations.functional.
bbox_rotate
(bbox, angle, rows, cols, interpolation)[source]¶ Rotates a bounding box by angle degrees
Parameters: - bbox (tuple) – A tuple (x_min, y_min, x_max, y_max).
- angle (int) – Angle of rotation in degrees
- rows (int) – Image rows.
- cols (int) – Image cols.
- interpolation (int) – interpolation method.
Returns: a tuple.
-
albumentations.augmentations.functional.
bbox_transpose
(bbox, axis, rows, cols)[source]¶ Transposes a bounding box along given axis.
Parameters: - bbox (tuple) – A tuple (x_min, y_min, x_max, y_max).
- axis (int) – 0 - main axis, 1 - secondary axis.
- rows (int) – Image rows.
- cols (int) – Image cols.
-
albumentations.augmentations.functional.
bbox_vflip
(bbox, rows, cols)[source]¶ Flip a bounding box vertically around the x-axis.
-
albumentations.augmentations.functional.
crop_bbox_by_coords
(bbox, crop_coords, crop_height, crop_width, rows, cols)[source]¶ Crop a bounding box using the provided coordinates of bottom-left and top-right corners in pixels and the required height and width of the crop.
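One way to picture cropping a bounding box: convert it to pixels, shift it by the crop origin, then normalize it against the crop size. The sketch below does that under the assumptions that the bbox is normalized and that crop_coords is (x1, y1, x2, y2) in pixels; crop_bbox is a hypothetical helper and does not clip boxes that extend past the crop.

```python
def crop_bbox(bbox, crop_coords, crop_height, crop_width, rows, cols):
    """Re-express a normalized bbox relative to a crop window given in pixels."""
    x_min, y_min, x_max, y_max = bbox
    x1, y1, _x2, _y2 = crop_coords
    # Normalized -> pixel coordinates in the original image.
    x_min, x_max = x_min * cols, x_max * cols
    y_min, y_max = y_min * rows, y_max * rows
    # Shift by the crop origin, then normalize by the crop size.
    return (
        (x_min - x1) / crop_width,
        (y_min - y1) / crop_height,
        (x_max - x1) / crop_width,
        (y_max - y1) / crop_height,
    )


# Image of 128 x 256 (rows x cols), crop window of 64 x 128 starting at (64, 32).
cropped = crop_bbox(
    (0.5, 0.5, 0.75, 0.75),
    crop_coords=(64, 32, 192, 96),
    crop_height=64,
    crop_width=128,
    rows=128,
    cols=256,
)
print(cropped)
```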
-
albumentations.augmentations.functional.
elastic_transform_fast
(image, alpha, sigma, alpha_affine, interpolation=1, border_mode=4, random_state=None)[source]¶ Elastic deformation of images as described in [Simard2003] (with modifications). Based on https://gist.github.com/erniejunior/601cdf56d2b424757de5
[Simard2003] Simard, Steinkraus and Platt, “Best Practices for Convolutional Neural Networks applied to Visual Document Analysis”, in Proc. of the International Conference on Document Analysis and Recognition, 2003.
-
albumentations.augmentations.functional.
grid_distortion
(img, num_steps=10, xsteps=[], ysteps=[], interpolation=1, border_mode=4)[source]¶
-
albumentations.augmentations.functional.
optical_distortion
(img, k=0, dx=0, dy=0, interpolation=1, border_mode=4)[source]¶ Barrel / pincushion distortion. Unconventional augment.
Helper functions for working with bounding boxes¶
-
albumentations.augmentations.bbox_utils.
normalize_bbox
(bbox, rows, cols)[source]¶ Normalize coordinates of a bounding box. Divide x-coordinates by image width and y-coordinates by image height.
-
albumentations.augmentations.bbox_utils.
denormalize_bbox
(bbox, rows, cols)[source]¶ Denormalize coordinates of a bounding box. Multiply x-coordinates by image width and y-coordinates by image height. This is the inverse operation of normalize_bbox().
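The two operations described above can be sketched as follows, with rows as the image height and cols as the image width. The helper names normalize and denormalize are hypothetical and only mirror the arithmetic in the descriptions.

```python
def normalize(bbox, rows, cols):
    """Divide x-coordinates by image width (cols) and y-coordinates by image height (rows)."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min / cols, y_min / rows, x_max / cols, y_max / rows)


def denormalize(bbox, rows, cols):
    """Multiply x-coordinates by image width (cols) and y-coordinates by image height (rows)."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min * cols, y_min * rows, x_max * cols, y_max * rows)


pixel_box = (64, 32, 192, 96)
unit_box = normalize(pixel_box, rows=128, cols=256)
print(unit_box)
print(denormalize(unit_box, rows=128, cols=256))
```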
-
albumentations.augmentations.bbox_utils.
normalize_bboxes
(bboxes, rows, cols)[source]¶ Normalize a list of bounding boxes.
-
albumentations.augmentations.bbox_utils.
denormalize_bboxes
(bboxes, rows, cols)[source]¶ Denormalize a list of bounding boxes.
-
albumentations.augmentations.bbox_utils.
calculate_bbox_area
(bbox, rows, cols)[source]¶ Calculate the area of a bounding box in pixels.
-
albumentations.augmentations.bbox_utils.
filter_bboxes_by_visibility
(original_shape, bboxes, transformed_shape, transformed_bboxes, threshold=0.0, min_area=0.0)[source]¶ Filter bounding boxes and return only those whose visibility after transformation is above the threshold and whose area in pixels is greater than min_area.
Parameters: - original_shape (tuple) – original image shape
- bboxes (list) – original bounding boxes
- transformed_shape (tuple) – transformed image shape
- transformed_bboxes (list) – transformed bounding boxes
- threshold (float) – visibility threshold. Should be a value in the range [0.0, 1.0].
- min_area (float) – Minimal area threshold.
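A sketch of the filtering logic, assuming that visibility means the ratio of a box's area after the transformation to its area before it, and that boxes are normalized (x_min, y_min, x_max, y_max) tuples. The helpers bbox_area and filter_by_visibility are hypothetical, and the handling of values exactly on a boundary is a choice made for this sketch.

```python
def bbox_area(bbox, rows, cols):
    """Area in pixels of a normalized bbox for an image of rows x cols."""
    x_min, y_min, x_max, y_max = bbox[:4]
    return (x_max - x_min) * cols * (y_max - y_min) * rows


def filter_by_visibility(original_shape, bboxes, transformed_shape,
                         transformed_bboxes, threshold=0.0, min_area=0.0):
    """Keep transformed boxes that are visible enough and large enough."""
    rows, cols = original_shape[:2]
    t_rows, t_cols = transformed_shape[:2]
    kept = []
    for before, after in zip(bboxes, transformed_bboxes):
        area_before = bbox_area(before, rows, cols)
        area_after = bbox_area(after, t_rows, t_cols)
        # Assumption: the area must be strictly greater than min_area.
        if area_before <= 0 or area_after <= min_area:
            continue
        # Assumption: visibility is the area ratio; equality with threshold passes.
        if area_after / area_before >= threshold:
            kept.append(after)
    return kept


shape = (100, 100)
before = [(0.0, 0.0, 0.5, 0.5), (0.5, 0.5, 1.0, 1.0)]
after = [(0.0, 0.0, 0.5, 0.25), (0.5, 0.5, 0.75, 0.75)]
print(filter_by_visibility(shape, before, shape, after, threshold=0.4))
```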
-
albumentations.augmentations.bbox_utils.
convert_bbox_to_albumentations
(bbox, source_format, rows, cols, check_validity=False)[source]¶ Convert a bounding box from a format specified in source_format to the format used by albumentations: normalized coordinates of bottom-left and top-right corners of the bounding box in a form of [x_min, y_min, x_max, y_max] e.g. [0.15, 0.27, 0.67, 0.5].
Parameters: - bbox (list) – bounding box
- source_format (str) – format of the bounding box. Should be ‘coco’ or ‘pascal_voc’.
- check_validity (bool) – check if all boxes are valid boxes
- rows (int) – image height
- cols (int) – image width
Note
The coco format of a bounding box looks like [x_min, y_min, width, height], e.g. [97, 12, 150, 200]. The pascal_voc format of a bounding box looks like [x_min, y_min, x_max, y_max], e.g. [97, 12, 247, 212].
Raises: ValueError – if source_format is not equal to coco or pascal_voc.
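The conversion described above can be sketched in plain Python, using the example box from the note. The function to_albumentations is a hypothetical helper that covers only the arithmetic; it performs no validity checks, and the image size in the usage lines is made up for the illustration.

```python
def to_albumentations(bbox, source_format, rows, cols):
    """Convert a coco or pascal_voc box to normalized [x_min, y_min, x_max, y_max]."""
    if source_format == "coco":
        x_min, y_min, width, height = bbox[:4]
        x_max, y_max = x_min + width, y_min + height
    elif source_format == "pascal_voc":
        x_min, y_min, x_max, y_max = bbox[:4]
    else:
        raise ValueError("unknown source_format: {}".format(source_format))
    # Normalize x by image width (cols) and y by image height (rows).
    return [x_min / cols, y_min / rows, x_max / cols, y_max / rows]


# The same box in both formats, on an assumed 400 x 500 (rows x cols) image.
from_coco = to_albumentations([97, 12, 150, 200], "coco", rows=400, cols=500)
from_voc = to_albumentations([97, 12, 247, 212], "pascal_voc", rows=400, cols=500)
print(from_coco)
print(from_voc)
```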
-
albumentations.augmentations.bbox_utils.
convert_bbox_from_albumentations
(bbox, target_format, rows, cols, check_validity=False)[source]¶ Convert a bounding box from the format used by albumentations to a format, specified in target_format.
Parameters: - bbox (list) – bounding box with coordinates in the format used by albumentations
- target_format (str) – required format of the output bounding box. Should be ‘coco’ or ‘pascal_voc’.
- rows (int) – image height
- cols (int) – image width
- check_validity (bool) – check if all boxes are valid boxes
Note
The coco format of a bounding box looks like [x_min, y_min, width, height], e.g. [97, 12, 150, 200]. The pascal_voc format of a bounding box looks like [x_min, y_min, x_max, y_max], e.g. [97, 12, 247, 212].
Raises: ValueError – if target_format is not equal to coco or pascal_voc.
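The reverse conversion can be sketched in the same way. The function from_albumentations is a hypothetical helper showing only the arithmetic, without validity checks; the box and image size in the usage lines are illustrative.

```python
def from_albumentations(bbox, target_format, rows, cols):
    """Convert a normalized [x_min, y_min, x_max, y_max] box to coco or pascal_voc pixels."""
    if target_format not in ("coco", "pascal_voc"):
        raise ValueError("unknown target_format: {}".format(target_format))
    x_min, y_min, x_max, y_max = bbox[:4]
    # Denormalize x by image width (cols) and y by image height (rows).
    x_min, x_max = x_min * cols, x_max * cols
    y_min, y_max = y_min * rows, y_max * rows
    if target_format == "coco":
        return [x_min, y_min, x_max - x_min, y_max - y_min]
    return [x_min, y_min, x_max, y_max]


unit_box = [0.25, 0.125, 0.75, 0.625]
print(from_albumentations(unit_box, "pascal_voc", rows=400, cols=800))
print(from_albumentations(unit_box, "coco", rows=400, cols=800))
```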
-
albumentations.augmentations.bbox_utils.
convert_bboxes_to_albumentations
(bboxes, source_format, rows, cols, check_validity=False)[source]¶ Convert a list of bounding boxes from a format specified in source_format to the format used by albumentations.
-
albumentations.augmentations.bbox_utils.
convert_bboxes_from_albumentations
(bboxes, target_format, rows, cols, check_validity=False)[source]¶ Convert a list of bounding boxes from the format used by albumentations to a format, specified in target_format.
Parameters: - bboxes (list) – list of bounding boxes with coordinates in the format used by albumentations
- target_format (str) – required format of the output bounding box. Should be ‘coco’ or ‘pascal_voc’.
- rows (int) – image height
- cols (int) – image width
- check_validity (bool) – check if all boxes are valid boxes