Regularization and trade-off parameter

How many classes of regularization are there, and what is the difference between them? In the file Regularization.py there seem to be a lot: Simple Small Regularization, Simple Smooth Regularization, Tikhonov Regularization, Sparse Regularization. How should I choose the beta ratio, cooling factor, and cooling rate?

Hi jcbarreto,

You are right, there are quite a few. They matter because they strongly influence what the final model will look like.

The most commonly used in geophysics are least-squares regularizations on the model values (smallness) and on the model gradients (smoothness). A good reference for those is Oldenburg, D. W., and Y. Li, 2005, Inversion for applied geophysics: A tutorial, in D. K. Butler, ed., Near-Surface Geophysics: Society of Exploration Geophysicists, 5, 89-150.
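As a rough illustration of what those two terms measure, here is a minimal pure-Python sketch on a 1D model. The function names and the alpha_s / alpha_x weights are made up for this example and are not SimPEG's API; SimPEG builds the equivalent terms from the mesh operators.

```python
def smallness(m, m_ref):
    """Sum of squared deviations of the model from a reference model."""
    return sum((mi - ri) ** 2 for mi, ri in zip(m, m_ref))


def smoothness(m):
    """Sum of squared differences between neighbouring cells (1D gradient)."""
    return sum((m[i + 1] - m[i]) ** 2 for i in range(len(m) - 1))


def regularization(m, m_ref, alpha_s=1.0, alpha_x=1.0):
    """Weighted sum of the smallness and smoothness terms (hypothetical helper)."""
    return alpha_s * smallness(m, m_ref) + alpha_x * smoothness(m)


m_ref = [0.0, 0.0, 0.0, 0.0]
blocky = [0.0, 1.0, 1.0, 0.0]  # anomaly spread over two cells
spiky = [0.0, 2.0, 0.0, 0.0]   # same total anomaly in a single cell

print(regularization(blocky, m_ref))  # smallness 2 + smoothness 2
print(regularization(spiky, m_ref))   # smallness 4 + smoothness 8
```

Both least-squares terms penalize the concentrated model more heavily, which is why these regularizations tend to produce smooth, spread-out models.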

Those are implemented in two regularization classes. Simple is the plain, unweighted version; this is the one to use if you call any distance- or sensitivity-weighting directives. The second is Tikhonov, which includes the cell volumes as weights, in an attempt to approximate their sensitivity.
Those two are the ones to start with as a user. When you see Small or Smooth in a name, it refers to that specific part. For example, Simple contains both a Simple Small and a Simple Smooth term. You should not have to call any Small or Smooth class on its own.
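To make the difference between unweighted and volume-weighted terms concrete, here is a small sketch of a smallness term with optional per-cell weights. It only illustrates the idea of weighting by cell volume; the function and variable names are hypothetical, and how SimPEG applies the volumes internally is not shown here.

```python
def smallness(m, m_ref, weights=None):
    """Weighted sum of squared deviations; weights default to 1 (unweighted)."""
    if weights is None:
        weights = [1.0] * len(m)
    return sum(w * (mi - ri) ** 2 for w, mi, ri in zip(weights, m, m_ref))


# Hypothetical mesh: two small cells and one large padding cell
cell_volumes = [1.0, 1.0, 8.0]
m = [1.0, 1.0, 1.0]
m_ref = [0.0, 0.0, 0.0]

unweighted = smallness(m, m_ref)                             # every cell counts equally
volume_weighted = smallness(m, m_ref, weights=cell_volumes)  # large cells count more

print(unweighted, volume_weighted)
```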

There are no hard rules for choosing the beta ratio, cooling factor or cooling rate. The rule of thumb is: “start relatively high, and cool moderately”. Starting too high or too low, or cooling too fast or too slow, will of course affect your convergence.

In “Exploring nonlinear inversions: A 1D magnetotelluric example, Seogi Kang, Lindsey J. Heagy, Rowan Cockett, and Douglas W. Oldenburg, The Leading Edge 2017 36:8, 696-699”, the authors published notebooks you can play with to get a sense of the effect of beta and cooling rate.

There are useful directives for beta. Directives.BetaEstimate_ByEig estimates the relative importance of the regularization versus the data misfit and uses it to set the initial beta. A beta0_ratio between 1 and 10 is a good first try. Cooling parameters such as those below are also a good first try.

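from SimPEG import Directives
# Set the initial beta from an eigenvalue estimate, then divide beta by 5 every 3 iterations
beta = Directives.BetaEstimate_ByEig(beta0_ratio=1.)
betaSched = Directives.BetaSchedule(coolingFactor=5., coolingRate=3)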
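To see what a given cooling factor and cooling rate do to beta over the iterations, here is a standalone sketch of the schedule as I understand it: beta is divided by the cooling factor once every cooling-rate iterations. The function name and the exact iteration indexing are assumptions for illustration, not SimPEG code.

```python
def beta_schedule(beta0, cooling_factor=5.0, cooling_rate=3, n_iter=7):
    """Return the beta value used at each iteration (hypothetical helper)."""
    betas = []
    beta = beta0
    for it in range(n_iter):
        # Cool only every `cooling_rate` iterations, never at the first one
        if it > 0 and it % cooling_rate == 0:
            beta /= cooling_factor
        betas.append(beta)
    return betas


betas = beta_schedule(100.0, cooling_factor=5.0, cooling_rate=3, n_iter=7)
print(betas)  # [100.0, 100.0, 100.0, 20.0, 20.0, 20.0, 4.0]
```

A larger cooling factor or a smaller cooling rate brings beta down faster, which is the "cooling too fast" case mentioned above.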

The Sparse regularization is more involved. It encourages models that change at only a limited number of locations. See these examples in the documentation:

  1. http://docs.simpeg.xyz/content/examples/01-basic/plot_inversion_linear_irls.html#sphx-glr-content-examples-01-basic-plot-inversion-linear-irls-py
  2. http://docs.simpeg.xyz/content/examples/05-mag/plot_nonLinear_Amplitude.html#sphx-glr-content-examples-05-mag-plot-nonlinear-amplitude-py
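For intuition about what "sparse" means, the sketch below uses the usual iteratively reweighted least-squares (IRLS) approximation of an lp-norm, where each squared model value is scaled by a weight 1 / (m_i^2 + eps^2)^(1 - p/2). This is a generic illustration with made-up names and parameter values, not the SimPEG implementation shown in the linked examples.

```python
def irls_weights(m, p=0.0, eps=1e-3):
    """IRLS weights that make a weighted l2 measure behave like an lp measure."""
    return [1.0 / (mi ** 2 + eps ** 2) ** (1.0 - p / 2.0) for mi in m]


def sparse_measure(m, p=0.0, eps=1e-3):
    """Approximate sum of |m_i|^p using the reweighted squared values."""
    return sum(w * mi ** 2 for w, mi in zip(irls_weights(m, p, eps), m))


compact = [0.0, 0.0, 3.0, 0.0]      # one non-zero cell
smeared = [0.75, 0.75, 0.75, 0.75]  # same total, spread over all cells

# p = 2 is plain least squares: the smeared model is cheaper
print(sparse_measure(compact, p=2.0), sparse_measure(smeared, p=2.0))

# p = 0 roughly counts non-zero cells: the compact model is cheaper
print(sparse_measure(compact, p=0.0), sparse_measure(smeared, p=0.0))
```

With p = 2 the weights are all 1 and the measure reduces to the ordinary smallness term; lowering p shifts the preference toward models with few non-zero values.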

Finally, not a SimPEG resource, but here is a very basic section about the various parts of an objective function and their respective influence:

https://giftoolscookbook.readthedocs.io/en/latest/content/fundamentals/index.html

Thank you, it is much clearer now, and the last link you posted is great!