Involves introducing a new feature derived from pre-existing features in order to separate data that is not linearly separable. Commonly, features are transformed using a non-linear function (e.g. raising a feature to a power). Then, in the resulting higher-dimensional space, a separating hyperplane can be found.
The name 'kernel trick' is not explicitly mentioned in the study design. Refer to the theory as 'introducing a new feature' instead.
1D to 2D
This is the only case covered in the study design. Common transformations include raising the feature to an exponent or applying a logarithm.
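As a minimal sketch of the 1D to 2D case, the snippet below adds a squared feature to some made-up 1D data. The data, the function name `add_squared_feature`, and the threshold 2.5 are all assumptions chosen for illustration; they do not come from the study design.

```python
# Hypothetical 1D data: class 1 sits in the middle, class 0 on both sides,
# so no single threshold on x can separate the two classes.
xs = [-3, -2, -1, 0, 1, 2, 3]
labels = [0, 0, 1, 1, 1, 0, 0]


def add_squared_feature(x):
    """Map a 1D point x to the 2D point (x, x**2)."""
    return (x, x ** 2)


points_2d = [add_squared_feature(x) for x in xs]

# In the new 2D space the horizontal line x2 = 2.5 separates the classes.
# 2.5 is an illustrative threshold picked by eye, not a fitted value.
THRESHOLD = 2.5
predictions = [1 if x2 < THRESHOLD else 0 for (_, x2) in points_2d]

print(points_2d)
print(predictions == labels)
```

The new axis only has to make the classes separable by a straight line; any other transformation that achieves this (for example a different exponent) would serve the same purpose.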
2D to 3D
A common transformation is $z = (x - a)^2 + (y - b)^2$,
where $x$ and $y$ are two different features (axes) and $a$, $b$ are constants.
$z$ represents the square of the distance from any point to the coordinates $(a, b)$.
This new feature can be used to represent information such as a point's resemblance to a certain instance.
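A rough sketch of the 2D to 3D case follows, assuming the new feature is the squared distance to a fixed point, written here as z = (x - a)^2 + (y - b)^2. The centre (A, B), the sample points, and the function name `squared_distance_feature` are hypothetical values picked for illustration.

```python
# Hypothetical 2D data: class 1 clusters around (A, B); class 0 surrounds it.
A, B = 1.0, 2.0  # assumed centre, chosen for illustration only


def squared_distance_feature(x, y, a=A, b=B):
    """Return z = (x - a)**2 + (y - b)**2, the squared distance to (a, b)."""
    return (x - a) ** 2 + (y - b) ** 2


inner = [(1.0, 2.0), (1.5, 2.0), (1.0, 1.5), (0.5, 2.5)]   # class 1
outer = [(4.0, 2.0), (1.0, 5.0), (-2.0, 2.0), (3.0, 4.0)]  # class 0

z_inner = [squared_distance_feature(x, y) for x, y in inner]
z_outer = [squared_distance_feature(x, y) for x, y in outer]

# Any plane z = c with max(z_inner) < c < min(z_outer) separates the classes in 3D.
print(max(z_inner), min(z_outer))
```

A small z means the point lies close to (A, B), which is one way the new feature can encode resemblance to a chosen instance.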