
How to Find and Remove Anomalies with Local Outlier Factor (LOF)
What Are Outliers?

Outliers are data points that differ significantly from the rest of the dataset.

Global outlier: a point that falls outside the normal range of the dataset.
Local outlier: a point that lies within the normal range of the dataset but differs from its neighbours.

Many machine learning algorithms perform poorly when outliers are present, so outlier detection is important in many applications.

Why Local Outlier Factor (LOF)?

LOF is a density-based, unsupervised approach that identifies outliers relative to their local neighbourhood.

LOF score > 1 → outlier: a score close to 1 means the point is about as dense as its neighbours, while a score well above 1 marks an outlier.
Fast and robust for clusters with varying densities.

Implementation in Python

Import Libraries

    from sklearn.neighbors import LocalOutlierFactor
    import pandas as pd
    import matplotlib.pyplot as plt

Load Dataset

    data = pd.read_csv("fraud_lof_example.csv")

Define LOF Model

    lof = LocalOutlierFactor(n_neighbors=20, contamination=0.1)
    # Fit the model and label each row: 1 = inlier, -1 = outlier
    yhat = lof.fit_predict(data)
    # Negated LOF scores (set during fitting); values well below -1 indicate outliers
    scores = lof.negative_outlier_factor_
    # Keep only the rows labelled as inliers
    clean_data = data[yhat != -1]

Visualise Outliers

out
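The fit-and-filter step can be tried without the CSV file. The sketch below is a minimal, self-contained illustration and not the article's own code: it assumes a synthetic two-column dataset (the names `inliers`, `planted`, `x` and `y` are made up for the example) with five points placed far from a single cluster, and reuses the same `n_neighbors=20` and `contamination=0.1` settings.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import LocalOutlierFactor

# Synthetic stand-in for the article's CSV (hypothetical data, not the real file).
rng = np.random.default_rng(0)
inliers = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
# Five points placed far from the cluster so they are obvious anomalies.
planted = np.array([[8.0, 8.0], [-9.0, 7.0], [10.0, -8.0], [-8.0, -9.0], [12.0, 0.0]])
data = pd.DataFrame(np.vstack([inliers, planted]), columns=["x", "y"])

lof = LocalOutlierFactor(n_neighbors=20, contamination=0.1)
yhat = lof.fit_predict(data)             # 1 = inlier, -1 = outlier
scores = -lof.negative_outlier_factor_   # flip the sign to recover the LOF score

clean_data = data[yhat != -1]            # rows kept for downstream modelling
outliers = data[yhat == -1]              # rows flagged as anomalies

print(f"flagged {len(outliers)} of {len(data)} rows")
print("LOF scores of the planted points:", np.round(scores[-5:], 2))
```

Note that `contamination=0.1` tells the model to flag roughly 10% of the rows no matter how many real anomalies exist, so some ordinary points near the edge of the cluster are flagged alongside the five planted ones. Inspecting `scores` directly is one way to judge which flagged rows are clearly anomalous.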
Continue reading on Dev.to Python




