This involves introducing a new feature, derived from existing features, in order to separate data that is not linearly separable. Commonly, the existing features are transformed with a non-linear function (e.g. raising a feature to a power). A separating hyperplane can then be found in the resulting higher-dimensional space.

The name 'kernel trick' is not explicitly mentioned in the study design. Refer to the theory as 'introducing a new feature' instead.

1D to 2D

This is the only case covered in the study design. Common transformations include raising the feature to a power or taking its logarithm.
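A minimal sketch of the 1D to 2D case, assuming squaring as the transformation and using made-up data. The values in `x` and `labels`, and the helper `separable_by_threshold`, are hypothetical and only serve to illustrate the idea.

```python
# Hypothetical 1D data: class 1 sits on both sides of class 0,
# so no single threshold on x can split the two classes.
x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [1, 1, 0, 0, 0, 1, 1]


def separable_by_threshold(values, labels):
    """Return True if one threshold on `values` splits class 0 from class 1."""
    zeros = [v for v, c in zip(values, labels) if c == 0]
    ones = [v for v, c in zip(values, labels) if c == 1]
    return max(zeros) < min(ones) or max(ones) < min(zeros)


# New feature derived from the existing one (assumed transformation: squaring).
x_squared = [v ** 2 for v in x]

# Each point now has two coordinates: (x, x^2).
points_2d = list(zip(x, x_squared))

separable_before = separable_by_threshold(x, labels)
separable_after = separable_by_threshold(x_squared, labels)
print(separable_before, separable_after)  # False True
```

In the new 2D space, any horizontal line between the largest class 0 value of x² (here 1) and the smallest class 1 value (here 4) acts as the separating hyperplane.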

2D to 3D

A common transformation is $z = (x - a)^2 + (y - b)^2$, where $x$ and $y$ are two different features (axes) and $a$ and $b$ are constants. $z$ represents the squared distance from any point to the coordinates $(a, b)$. It can be used to represent information such as resemblance to a certain instance. Source
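A minimal sketch of the 2D to 3D case, assuming the new feature is the squared distance from each point to a fixed point (a, b). The centre, the radii and the helper names `ring` and `new_feature` are hypothetical, chosen only to illustrate the idea.

```python
import math

# Hypothetical reference point (a, b) used by the transformation.
a, b = 1.0, 1.0


def ring(radius, n):
    """Return n points evenly spaced on a circle of `radius` around (a, b)."""
    return [
        (a + radius * math.cos(2 * math.pi * k / n),
         b + radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


inner = ring(0.5, 8)  # class 0: close to (a, b)
outer = ring(2.0, 8)  # class 1: surrounds class 0, so no straight line separates them


def new_feature(x, y):
    """Squared distance from (x, y) to (a, b): the third coordinate."""
    return (x - a) ** 2 + (y - b) ** 2


inner_z = [new_feature(x, y) for x, y in inner]
outer_z = [new_feature(x, y) for x, y in outer]

# Any value between the two groups defines a separating plane z = threshold.
threshold = (max(inner_z) + min(outer_z)) / 2
print(max(inner_z) < threshold < min(outer_z))  # True
```

Points that resemble the instance at (a, b) get a small value of the new feature, so a flat plane in the third dimension separates them from the surrounding class.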