Keywords:
Breakdown point
l(1)-norm minimization
outliers
robust regression
Abstract:
The advantages of using the l(1)-norm rather than the l(2)-norm, in terms of robustness, for signal processing and other data analysis procedures are widely recognized across the scientific literature. However, from the viewpoint of robust statistics, at least the one based on the concept of breakdown point, l(1)-norm regression is no more resistant to outliers than least squares, and its performance is believed to degrade further in higher dimensions. We explain this seeming contradiction between theory and practice by the different contamination models used to assess robustness to outliers. After a brief review of the existing notions of robustness, we adopt a model where the carriers are not subject to contamination and only the response variable can be contaminated with outliers. We prove two new positive results on the breakdown-point robustness of l(1)-norm regression under this model. First, we show that l(1)-norm regression can have a positive breakdown point in any dimension, and that this is in fact rather common. We elaborate further in a second result, showing that, with very large probability, random designs with unit normal rows yield a high breakdown point: around 30% for moderate dimensions, growing asymptotically to 50%. These results provide theoretical support for the practical success of l(1)-norm-based procedures and are, at the same time, consistent with the theory of robust statistical regression.
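The contamination model of the abstract (clean carriers, outliers only in the response) is easy to simulate. The sketch below is illustrative only and is not the paper's own construction: it fits least squares and l(1)-norm (least absolute deviations) regression to a random design with unit normal rows, contaminates 20% of the responses, and compares estimation errors. The LAD fit uses iteratively reweighted least squares, a standard textbook approximation to the l(1) minimizer; the sample sizes, noise level, and outlier magnitude are arbitrary choices for the demonstration.

```python
import numpy as np

def lad_irls(X, y, n_iter=100, eps=1e-8):
    """Approximate l(1)-norm (least absolute deviations) regression
    via iteratively reweighted least squares. A generic approximation,
    not an algorithm from the paper."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from the OLS fit
    for _ in range(n_iter):
        r = y - X @ beta
        w = 1.0 / np.maximum(np.abs(r), eps)     # l1 weights: 1/|residual|
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.standard_normal((n, p))                  # unit normal rows, as in the abstract
beta_true = np.arange(1.0, p + 1.0)
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# Contaminate 20% of the *responses* only; the carriers X stay clean.
bad = rng.choice(n, size=n // 5, replace=False)
y[bad] += 50.0

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_lad = lad_irls(X, y)

err_ols = np.linalg.norm(beta_ols - beta_true)
err_lad = np.linalg.norm(beta_lad - beta_true)
print(f"OLS error: {err_ols:.3f}, LAD error: {err_lad:.3f}")
```

Under this response-only contamination, the l(1) fit stays close to the true coefficients while the least-squares fit is pulled toward the outliers, which is the qualitative behavior the breakdown-point results above formalize.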